--- name: discover-data description: Automatically discover data pipeline and ETL skills when working with ETL. Activates for data development tasks. --- # Data Skills Discovery Provides automatic access to comprehensive data skills. ## When This Skill Activates This skill auto-activates when you're working with: - ETL - data pipelines - batch processing - stream processing - data validation - orchestration - Airflow - timely dataflow - differential dataflow - streaming aggregations - windowing - real-time analytics ## Available Skills ### Quick Reference The Data category contains 9 skills: 1. **batch-processing** - Orchestrating complex data pipelines with dependencies 2. **data-validation** - Validating data schema before processing 3. **dataflow-coordination** - Coordination patterns for distributed dataflow systems 4. **differential-dataflow** - Differential computation for incremental updates and efficient joins 5. **etl-patterns** - Designing data extraction from multiple sources 6. **pipeline-orchestration** - Coordinating complex multi-step data workflows 7. **stream-processing** - Processing real-time event streams (Kafka, Flink) 8. **streaming-aggregations** - Windowing, sessionization, time-series aggregation 9. **timely-dataflow** - Low-latency streaming computation with progress tracking ### Load Full Category Details For complete descriptions and workflows: ```bash cat ~/.claude/skills/data/INDEX.md ``` This loads the full Data category index with: - Detailed skill descriptions - Usage triggers for each skill - Common workflow combinations - Cross-references to related skills ### Load Specific Skills Load individual skills as needed: ```bash # Traditional ETL/Batch cat ~/.claude/skills/data/batch-processing.md cat ~/.claude/skills/data/data-validation.md cat ~/.claude/skills/data/etl-patterns.md cat ~/.claude/skills/data/pipeline-orchestration.md # Stream Processing cat ~/.claude/skills/data/stream-processing.md cat ~/.claude/skills/data/streaming-aggregations.md # Advanced Dataflow Systems cat ~/.claude/skills/data/timely-dataflow.md cat ~/.claude/skills/data/differential-dataflow.md cat ~/.claude/skills/data/dataflow-coordination.md ``` ## Common Workflow Combinations ### Real-Time Analytics Pipeline ```bash # Load these skills together: cat ~/.claude/skills/data/stream-processing.md # Kafka setup cat ~/.claude/skills/data/streaming-aggregations.md # Windowing patterns cat ~/.claude/skills/data/dataflow-coordination.md # Coordination ``` ### Incremental Computation System ```bash # Load these skills together: cat ~/.claude/skills/data/timely-dataflow.md # Foundation cat ~/.claude/skills/data/differential-dataflow.md # Incremental updates cat ~/.claude/skills/data/dataflow-coordination.md # Distributed coordination ``` ### Hybrid Batch + Stream ```bash # Load these skills together: cat ~/.claude/skills/data/batch-processing.md # Batch jobs cat ~/.claude/skills/data/stream-processing.md # Stream processing cat ~/.claude/skills/data/pipeline-orchestration.md # Overall coordination ``` ## Progressive Loading This gateway skill enables progressive loading: - **Level 1**: Gateway loads automatically (you're here now) - **Level 2**: Load category INDEX.md for full overview - **Level 3**: Load specific skills as needed ## Usage Instructions 1. **Auto-activation**: This skill loads automatically when Claude Code detects data work 2. **Browse skills**: Run `cat ~/.claude/skills/data/INDEX.md` for full category overview 3. **Load specific skills**: Use bash commands above to load individual skills --- **Next Steps**: Run `cat ~/.claude/skills/data/INDEX.md` to see full category details.