---
name: data-analysis
description: Analyze datasets and create visualizations
version: 1.0.0
author: Minion Team
tags: [data, analysis, visualization, pandas]
requirements:
  - pandas>=2.0.0
  - matplotlib>=3.7.0
  - numpy>=1.24.0
---

# Data Analysis Skill

## Description

This skill helps analyze datasets and create meaningful visualizations. It can handle CSV files, perform statistical analysis, and generate various types of plots.

## Usage Instructions

When a user requests data analysis:

1. **Load the dataset**: Use pandas to read the data file
2. **Inspect the data**: Check shape, columns, data types, and basic statistics
3. **Clean the data**: Handle missing values and outliers if necessary
4. **Perform analysis**: Calculate relevant statistics based on user's question
5. **Create visualizations**: Generate appropriate plots (line, bar, scatter, etc.)
6. **Save results**: Export results and visualizations

## Available Resources

### Scripts

- **scripts/analyze.py**: Core analysis functions
  - `load_dataset(filepath)`: Load data from various formats
  - `basic_statistics(df)`: Calculate descriptive statistics
  - `detect_outliers(df, column)`: Identify outliers
  - `correlation_analysis(df)`: Compute correlations

- **scripts/visualize.py**: Visualization utilities
  - `plot_distribution(df, column)`: Create distribution plots
  - `plot_correlation_matrix(df)`: Visualize correlation heatmap
  - `plot_time_series(df, date_col, value_col)`: Time series plots
  - `save_plot(fig, filename)`: Save figure to file

### References

- **references/examples.md**: Usage examples and common patterns
- **references/best_practices.md**: Data analysis best practices

## Example Prompts

- "Analyze this CSV file and show me the trends"
- "Create a visualization of the sales data by month"
- "Find correlations in this dataset"
- "Identify outliers in the price column"
- "Generate a statistical summary of the data"

## Output Format

Analysis results should include:

1. Data overview (shape, columns, types)
2. Statistical summary
3. Key insights and findings
4. Visualizations (saved as PNG files)
5. Recommendations or next steps

## Notes

- Always inspect data before analysis
- Handle missing values appropriately
- Choose visualizations that match the data type
- Provide clear explanations of findings
- Save all outputs for user reference
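
The six steps under Usage Instructions can be sketched end to end with pandas and matplotlib directly. This is a minimal illustration, not the contents of `scripts/analyze.py` or `scripts/visualize.py`: the sample data, the `iqr_outliers` and `run_analysis` helpers, the 1.5 × IQR outlier rule, and the linear interpolation of missing values are all assumptions made for the example.

```python
import io
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # headless backend: figures are written to disk, never shown
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data standing in for a user's CSV file.
SAMPLE_CSV = """month,sales,price
2024-01-01,120,9.5
2024-02-01,135,9.7
2024-03-01,,9.6
2024-04-01,150,9.8
2024-05-01,160,45.0
2024-06-01,172,9.9
"""


def iqr_outliers(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Return rows whose value falls outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    mask = (df[column] < q1 - k * iqr) | (df[column] > q3 + k * iqr)
    return df[mask]


def run_analysis(source, plot_path: Path) -> dict:
    """Illustrative pipeline: load, inspect, clean, analyze, visualize, save."""
    # 1. Load the dataset.
    df = pd.read_csv(source, parse_dates=["month"])

    # 2. Inspect: shape, dtypes, and missing-value counts before any cleaning.
    overview = {
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isna().sum().to_dict(),
    }

    # 3. Clean: fill gaps in `sales` by linear interpolation (an assumed choice).
    df["sales"] = df["sales"].interpolate()

    # 4. Analyze: descriptive statistics, outliers, and correlations.
    summary = df[["sales", "price"]].describe()
    outliers = iqr_outliers(df, "price")
    correlations = df[["sales", "price"]].corr()

    # 5. Visualize: a line plot suits a monthly time series.
    fig, ax = plt.subplots(figsize=(6, 3))
    ax.plot(df["month"], df["sales"], marker="o")
    ax.set_xlabel("Month")
    ax.set_ylabel("Sales")
    ax.set_title("Sales by month")
    fig.autofmt_xdate()

    # 6. Save the figure as a PNG and release its memory.
    fig.savefig(plot_path, dpi=150, bbox_inches="tight")
    plt.close(fig)

    return {
        "overview": overview,
        "summary": summary,
        "outliers": outliers,
        "correlations": correlations,
        "plot_path": plot_path,
    }


out_dir = Path(tempfile.mkdtemp())
result = run_analysis(io.StringIO(SAMPLE_CSV), out_dir / "sales_by_month.png")
print(result["overview"])
print(result["summary"])
print(result["outliers"])
```

The returned dictionary follows the Output Format list: an overview, a statistical summary, findings (here the outlier rows and correlations), and the path of the saved PNG. Whether to interpolate, drop, or flag missing values depends on the dataset, so step 3 in particular should be adapted rather than copied.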