name: Philadelphia Inquirer Sitemap Crawl
description: >-
  Capability that walks the Inquirer.com sitemap index, expands child URL
  sitemaps for a given date range, and emits a stream of article URLs with
  lastmod and image metadata for downstream crawling or freshness checks.
category: Discovery
trigger: scheduled
inputs:
  - name: dateRange
    description: Inclusive YYYY-MM-DD start and end.
    type: object
    properties:
      start:
        type: string
        format: date
      end:
        type: string
        format: date
outputs:
  - name: urls
    description: Article URL entries with sitemap metadata.
    type: array
    items:
      $ref: ../json-schema/sitemap-url-schema.json
steps:
  - name: fetch-sitemap-index
    call:
      api: philadelphia-inquirer:sitemaps
      operationId: getSitemapIndex
  - name: filter-children-by-date
  - name: fetch-daily-sitemaps
    forEach: childSitemaps
    call:
      api: philadelphia-inquirer:sitemaps
      operationId: getDailySitemap
      pathParams:
        date: "{{ item.date }}"
  - name: flatten-urls
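
# Illustrative only: a possible dateRange input for this capability, assuming
# the runner accepts inputs as a YAML mapping keyed by input name. The dates
# below are hypothetical placeholders and do not come from the source.
#
# dateRange:
#   start: "2024-01-01"
#   end: "2024-01-07"
#
# Because the range is inclusive, a value like this would be expected to
# select seven daily child sitemaps, one per calendar day, provided the
# sitemap index lists one child per day (also an assumption).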