{ "cells": [ { "cell_type": "markdown", "id": "bce99b4a", "metadata": {}, "source": [ "---\n", "title: \"Data Visualization with Matplotlib\"\n", "description: \"Create professional charts and visualizations from weather data using Python's Matplotlib library\"\n", "date: 2025-01-27\n", "lastmod: 2025-01-27\n", "author: \"Zer0-Mistakes Team\"\n", "layout: notebook\n", "difficulty: intermediate\n", "tags: [python, matplotlib, visualization, data-science, charts]\n", "categories: [Notebooks, Tutorials]\n", "toc: true\n", "comments: true\n", "---\n", "\n", "# Data Visualization with Matplotlib\n", "\n", "Learn to create professional-quality charts and visualizations using Python's Matplotlib library. This tutorial uses real weather data to demonstrate various chart types including line plots, bar charts, scatter plots, and multi-panel figures.\n", "\n", "**What you'll learn:**\n", "- Basic line and bar charts\n", "- Customizing colors, labels, and legends\n", "- Creating subplots and multi-panel figures\n", "- Scatter plots with color mapping\n", "- Saving high-quality images for publication" ] }, { "cell_type": "markdown", "id": "80693725", "metadata": {}, "source": [ "## Setup and Imports" ] }, { "cell_type": "code", "execution_count": null, "id": "7566a988", "metadata": {}, "outputs": [], "source": [ "# Import visualization libraries\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import numpy as np\n", "\n", "# Configure matplotlib for better display\n", "plt.rcParams['figure.figsize'] = [10, 6] # Default figure size\n", "plt.rcParams['figure.dpi'] = 100 # Display resolution\n", "plt.rcParams['savefig.dpi'] = 150 # Save resolution\n", "plt.rcParams['axes.grid'] = True\n", "plt.rcParams['grid.alpha'] = 0.3\n", "\n", "print(\"✅ Libraries imported successfully!\")\n", "print(f\"Matplotlib version: {plt.matplotlib.__version__}\")" ] }, { "cell_type": "markdown", "id": "c6a7c930", "metadata": {}, "source": [ "## Load Weather Data" ] }, { "cell_type": 
"code", "execution_count": null, "id": "fd368284", "metadata": {}, "outputs": [], "source": [ "# Load the weather dataset\n", "weather = pd.read_csv('/Users/bamr87/github/zer0-mistakes/assets/data/notebooks/weather_data.csv', \n", " parse_dates=['date'])\n", "\n", "print(\"🌤️ Weather Data Preview:\")\n", "print(f\"Shape: {weather.shape[0]} days × {weather.shape[1]} columns\\n\")\n", "weather.head(10)" ] }, { "cell_type": "markdown", "id": "f80f9fc2", "metadata": {}, "source": [ "## Basic Line Chart: Temperature Over Time" ] }, { "cell_type": "code", "execution_count": null, "id": "2a3f184c", "metadata": {}, "outputs": [], "source": [ "# Create a simple line chart showing temperature trends\n", "fig, ax = plt.subplots(figsize=(12, 5))\n", "\n", "# Plot each city as a separate line\n", "cities = weather['city'].unique()\n", "colors = ['#e63946', '#457b9d', '#2a9d8f', '#e9c46a', '#264653']\n", "\n", "for i, city in enumerate(cities):\n", " city_data = weather[weather['city'] == city]\n", " ax.plot(city_data['date'], city_data['temperature_f'], \n", " label=city, color=colors[i], linewidth=2, marker='o', markersize=4)\n", "\n", "# Customize the chart\n", "ax.set_xlabel('Date', fontsize=12)\n", "ax.set_ylabel('Temperature (°F)', fontsize=12)\n", "ax.set_title('Daily Temperature by City', fontsize=14, fontweight='bold')\n", "ax.legend(loc='upper right', frameon=True)\n", "ax.grid(True, alpha=0.3)\n", "\n", "# Rotate date labels for readability\n", "plt.xticks(rotation=45)\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "22909398", "metadata": {}, "source": [ "## Bar Chart: Average Temperature by City" ] }, { "cell_type": "code", "execution_count": null, "id": "0f638b7b", "metadata": {}, "outputs": [], "source": [ "# Calculate average temperature by city\n", "avg_temp = weather.groupby('city')['temperature_f'].mean().sort_values(ascending=False)\n", "\n", "# Create horizontal bar chart\n", "fig, ax = plt.subplots(figsize=(10, 6))\n", "bars = 
ax.barh(avg_temp.index, avg_temp.values, color=colors[:len(avg_temp)])\n", "\n", "# Add value labels on bars\n", "for bar, temp in zip(bars, avg_temp.values):\n", "    ax.text(bar.get_width() + 1, bar.get_y() + bar.get_height()/2,\n", "            f'{temp:.1f}°F', va='center', fontsize=11)\n", "\n", "ax.set_xlabel('Average Temperature (°F)', fontsize=12)\n", "ax.set_title('Average Temperature by City', fontsize=14, fontweight='bold')\n", "ax.set_xlim(0, max(avg_temp.values) + 15)\n", "\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "fb54b5d4", "metadata": {}, "source": [ "## Scatter Plot: Temperature vs. Humidity" ] }, { "cell_type": "code", "execution_count": null, "id": "851781f8", "metadata": {}, "outputs": [], "source": [ "# Create scatter plot with color-coded precipitation\n", "fig, ax = plt.subplots(figsize=(10, 7))\n", "\n", "scatter = ax.scatter(weather['humidity'], weather['temperature_f'],\n", "                     c=weather['precipitation'], cmap='Blues',\n", "                     s=80, alpha=0.7, edgecolors='white', linewidth=0.5)\n", "\n", "# Add colorbar\n", "cbar = plt.colorbar(scatter, ax=ax)\n", "cbar.set_label('Precipitation (inches)', fontsize=11)\n", "\n", "ax.set_xlabel('Humidity (%)', fontsize=12)\n", "ax.set_ylabel('Temperature (°F)', fontsize=12)\n", "ax.set_title('Temperature vs. Humidity\\n(Color = Precipitation)', fontsize=14, fontweight='bold')\n", "\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "e70141fd", "metadata": {}, "source": [ "## Multi-Panel Figure: Weather Dashboard" ] }, { "cell_type": "code", "execution_count": null, "id": "9c8284ea", "metadata": {}, "outputs": [], "source": [ "# Create a 2x2 dashboard with multiple visualizations\n", "fig, axes = plt.subplots(2, 2, figsize=(14, 10))\n", "fig.suptitle('Weather Analysis Dashboard', fontsize=16, fontweight='bold', y=1.02)\n", "\n", "# 1. 
Temperature distribution (histogram)\n", "ax1 = axes[0, 0]\n", "ax1.hist(weather['temperature_f'], bins=15, color='#e63946', edgecolor='white', alpha=0.7)\n", "ax1.axvline(weather['temperature_f'].mean(), color='#264653', linestyle='--',\n", "            linewidth=2, label=f'Mean: {weather[\"temperature_f\"].mean():.1f}°F')\n", "ax1.set_xlabel('Temperature (°F)')\n", "ax1.set_ylabel('Frequency')\n", "ax1.set_title('Temperature Distribution')\n", "ax1.legend()\n", "\n", "# 2. Weather conditions pie chart\n", "ax2 = axes[0, 1]\n", "condition_counts = weather['condition'].value_counts()\n", "colors_pie = ['#ffd166', '#06d6a0', '#118ab2', '#073b4c', '#ef476f']\n", "wedges, texts, autotexts = ax2.pie(condition_counts.values, labels=condition_counts.index,\n", "                                   autopct='%1.1f%%', colors=colors_pie[:len(condition_counts)],\n", "                                   explode=[0.05] * len(condition_counts))\n", "ax2.set_title('Weather Conditions')\n", "\n", "# 3. Wind speed by city (box plot)\n", "ax3 = axes[1, 0]\n", "city_wind_data = [weather[weather['city'] == city]['wind_speed'].values for city in cities]\n", "bp = ax3.boxplot(city_wind_data, labels=cities, patch_artist=True)\n", "for patch, color in zip(bp['boxes'], colors):\n", "    patch.set_facecolor(color)\n", "    patch.set_alpha(0.7)\n", "ax3.set_xlabel('City')\n", "ax3.set_ylabel('Wind Speed (mph)')\n", "ax3.set_title('Wind Speed Distribution by City')\n", "\n", "# 4. 
Precipitation by city (grouped bar)\n", "ax4 = axes[1, 1]\n", "city_precip = weather.groupby('city')['precipitation'].agg(['mean', 'max']).reset_index()\n", "x = np.arange(len(city_precip))\n", "width = 0.35\n", "bars1 = ax4.bar(x - width/2, city_precip['mean'], width, label='Average', color='#457b9d')\n", "bars2 = ax4.bar(x + width/2, city_precip['max'], width, label='Maximum', color='#1d3557')\n", "ax4.set_xlabel('City')\n", "ax4.set_ylabel('Precipitation (inches)')\n", "ax4.set_title('Precipitation by City')\n", "ax4.set_xticks(x)\n", "ax4.set_xticklabels(city_precip['city'], rotation=45, ha='right')\n", "ax4.legend()\n", "\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "93ae7e39", "metadata": {}, "source": [ "## Saving Figures\n", "\n", "Save your visualizations as high-quality images for reports and presentations:" ] }, { "cell_type": "code", "execution_count": null, "id": "f581f7a7", "metadata": {}, "outputs": [], "source": [ "# Create a publication-ready figure and save it\n", "import os\n", "\n", "# Create output directory if it doesn't exist\n", "output_dir = '/Users/bamr87/github/zer0-mistakes/assets/images/notebooks/matplotlib-visualization_files'\n", "os.makedirs(output_dir, exist_ok=True)\n", "\n", "# Create a clean, professional chart\n", "fig, ax = plt.subplots(figsize=(10, 6))\n", "\n", "# Plot each city's temperature in °C as a separate line\n", "for i, city in enumerate(cities):\n", "    city_data = weather[weather['city'] == city]\n", "    ax.plot(city_data['date'], city_data['temperature_c'],\n", "            label=city, color=colors[i], linewidth=2.5)\n", "\n", "ax.set_xlabel('Date', fontsize=12)\n", "ax.set_ylabel('Temperature (°C)', fontsize=12)\n", "ax.set_title('Temperature Trends Across Cities', fontsize=14, fontweight='bold')\n", "ax.legend(loc='upper right', frameon=True, fancybox=True, shadow=True)\n", "ax.grid(True, alpha=0.3)\n", "plt.xticks(rotation=45)\n", "\n", "# Save in multiple formats\n", 
"fig.savefig(f'{output_dir}/temperature_trends.png', dpi=150, bbox_inches='tight',\n", " facecolor='white', edgecolor='none')\n", "fig.savefig(f'{output_dir}/temperature_trends.svg', bbox_inches='tight',\n", " facecolor='white', edgecolor='none')\n", "\n", "print(f\"✅ Figures saved to: {output_dir}\")\n", "print(\" - temperature_trends.png (150 DPI)\")\n", "print(\" - temperature_trends.svg (vector format)\")\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "33d6ceb7", "metadata": {}, "source": [ "## Summary\n", "\n", "In this tutorial, you learned:\n", "\n", "1. **Line charts** - Show trends over time with `plt.plot()`\n", "2. **Bar charts** - Compare categories with `plt.bar()` and `plt.barh()`\n", "3. **Scatter plots** - Visualize relationships with color mapping using `plt.scatter()`\n", "4. **Subplots** - Create multi-panel dashboards with `plt.subplots()`\n", "5. **Saving figures** - Export high-quality images with `fig.savefig()`\n", "\n", "**Next steps:**\n", "- Try [Seaborn](https://seaborn.pydata.org/) for statistical visualizations\n", "- Explore [Plotly](https://plotly.com/python/) for interactive charts\n", "- Check out our [Python Statistics](/notebooks/python-statistics/) tutorial" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 5 }