--- name: checking-freshness description: Quick data freshness check. Use when the user asks if data is up to date, when a table was last updated, if data is stale, or needs to verify data currency before using it. --- # Data Freshness Check Quickly determine if data is fresh enough to use. ## Freshness Check Process For each table to check: ### 1. Find the Timestamp Column Look for columns that indicate when data was loaded or updated: - `_loaded_at`, `_updated_at`, `_created_at` (common ETL patterns) - `updated_at`, `created_at`, `modified_at` (application timestamps) - `load_date`, `etl_timestamp`, `ingestion_time` - `date`, `event_date`, `transaction_date` (business dates) Query INFORMATION_SCHEMA.COLUMNS if you need to see column names. ### 2. Query Last Update Time ```sql SELECT MAX() as last_update, CURRENT_TIMESTAMP() as current_time, TIMESTAMPDIFF('hour', MAX(), CURRENT_TIMESTAMP()) as hours_ago, TIMESTAMPDIFF('minute', MAX(), CURRENT_TIMESTAMP()) as minutes_ago FROM ``` ### 3. Check Row Counts by Time For tables with regular updates, check recent activity: ```sql SELECT DATE_TRUNC('day', ) as day, COUNT(*) as row_count FROM
WHERE >= DATEADD('day', -7, CURRENT_DATE()) GROUP BY 1 ORDER BY 1 DESC ``` ## Freshness Status Report status using this scale: | Status | Age | Meaning | |--------|-----|---------| | **Fresh** | < 4 hours | Data is current | | **Stale** | 4-24 hours | May be outdated, check if expected | | **Very Stale** | > 24 hours | Likely a problem unless batch job | | **Unknown** | No timestamp | Can't determine freshness | ## If Data is Stale Check Airflow for the source pipeline: 1. **Find the DAG**: Which DAG populates this table? Use `list_dags` and look for matching names. 2. **Check DAG status**: - Is the DAG paused? Use `get_dag_details` - Did the last run fail? Use `get_dag_stats` - Is a run currently in progress? 3. **Diagnose if needed**: If the DAG failed, use the **debugging-dags** skill to investigate. ## Output Format Provide a clear, scannable report: ``` FRESHNESS REPORT ================ TABLE: database.schema.table_name Last Update: 2024-01-15 14:32:00 UTC Age: 2 hours 15 minutes Status: Fresh TABLE: database.schema.other_table Last Update: 2024-01-14 03:00:00 UTC Age: 37 hours Status: Very Stale Source DAG: daily_etl_pipeline (FAILED) Action: Investigate with **debugging-dags** skill ``` ## Quick Checks If user just wants a yes/no answer: - "Is X fresh?" -> Check and respond with status + one line - "Can I use X for my 9am meeting?" -> Check and give clear yes/no with context