{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "be6d9f42",
   "metadata": {},
   "source": [
    "# Lesson 10 activity: Pandas"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8d35e167",
   "metadata": {},
   "source": [
    "## Learning objectives\n",
    "\n",
    "By the end of this activity, you will be able to:\n",
    "- Create Pandas Series and DataFrames\n",
    "- Load data from CSV files\n",
    "- Perform basic data exploration and analysis\n",
    "- Calculate descriptive statistics\n",
    "- Filter and manipulate DataFrame data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f4f4250f",
   "metadata": {},
   "source": [
    "## Tips\n",
    "\n",
    "- **Creating DataFrames:** Use `pd.DataFrame(dictionary)` where dictionary keys become column names\n",
    "- **Loading CSV files:** Use `pd.read_csv('filename.csv')`\n",
    "- **Basic exploration:** Use `.head()`, `.tail()`, `.info()`, `.describe()`, and `.shape`\n",
    "- **Filtering data:** Use conditions like `df[df['column'] > value]`\n",
    "- **Column selection:** Use `df['column_name']` or `df[['col1', 'col2']]`\n",
    "- **Adding columns:** Use `df['new_column'] = calculation`\n",
    "- **Statistics:** Use `.mean()`, `.max()`, `.min()`, `.sum()` methods\n",
    "\n",
    "**Remember:** Take your time with each step and test your code frequently!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "8a821472",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b7a3b120",
   "metadata": {},
   "source": [
    "---\n",
    "## Problem 1: creating your first DataFrame\n",
    "\n",
    "**Scenario:** You're working at a bookstore and need to create a simple inventory system.\n",
    "\n",
    "**Your task:**\n",
    "1. Create a DataFrame called `books_df` with the following data:\n",
    "   - Book titles: [\"Python Basics\", \"Data Science Handbook\", \"Web Development Guide\"]\n",
    "   - Authors: [\"John Smith\", \"Jane Doe\", \"Mike Johnson\"]\n",
    "   - Prices: [29.99, 45.50, 35.00]\n",
    "   - Stock: [15, 8, 12]\n",
    "\n",
    "2. Display the DataFrame\n",
    "3. Print the shape of the DataFrame\n",
    "4. Display basic information about the DataFrame using `.info()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "37891ada",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 1: Create the DataFrame\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1d26a67e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 2: Display the DataFrame\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6de21323",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 3: Print the shape\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "142c46ea",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 4: Display info\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b44fc045",
   "metadata": {},
   "source": [
    "---\n",
    "## Problem 2: loading and exploring student data\n",
    "\n",
    "**Scenario:** You're a teacher analyzing student performance data.\n",
    "\n",
    "**Your task:**\n",
    "1. Load the `students.csv` file into a DataFrame called `students_df`\n",
    "2. Display the first 3 rows using `.head()`\n",
    "3. Display the last 2 rows using `.tail()`\n",
    "4. Show descriptive statistics for numerical columns using `.describe()`\n",
    "5. Find the average grade of all students\n",
    "\n",
    "Find `students.csv` on the **Datasets** page of the course site, or download it directly: [students.csv](https://media.githubusercontent.com/media/gperdrizet/fullstack-2605/refs/heads/main/data/students.csv)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d953fa64",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 1: Load the CSV file\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "042d5db0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 2: Display first 3 rows\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ac6010a1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 3: Display last 2 rows\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6c0fe4f4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 4: Show descriptive statistics\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9d012dc5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 5: Calculate average grade\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8670b6ff",
   "metadata": {},
   "source": [
    "---\n",
    "## Problem 3: Data filtering and selection\n",
    "\n",
    "**Scenario:** Continue working with the student data to find specific information.\n",
    "\n",
    "**Your task:**\n",
    "1. Display only the 'name' and 'grade' columns from `students_df`\n",
    "2. Find all students who scored above 85\n",
    "3. Find all students studying 'Math'\n",
    "4. Find the highest grade in the dataset\n",
    "5. Count how many students are in each subject"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9d91f2d8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 1: Display only name and grade columns\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f157592f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 2: Students with grades above 85\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fd058aeb",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 3: Students studying Math\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c139a22b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 4: Highest grade\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2a96dcba",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 5: Count students by subject\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3d5b4d72",
   "metadata": {},
   "source": [
    "---\n",
    "## Problem 4: sales data analysis\n",
    "\n",
    "**Scenario:** You're analyzing sales data for an electronics store.\n",
    "\n",
    "**Your task:**\n",
    "1. Load the `sales.csv` file into a DataFrame called `sales_df`\n",
    "2. Calculate the total value for each product (price × quantity)\n",
    "3. Add this as a new column called 'total_value' to the DataFrame\n",
    "4. Find the product with the highest total value\n",
    "5. Calculate the grand total of all sales\n",
    "\n",
    "Find `sales.csv` on the **Datasets** page of the course site, or download it directly: [sales.csv](https://media.githubusercontent.com/media/gperdrizet/fullstack-2605/refs/heads/main/data/sales.csv)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0ea00e85",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 1: Load the sales data\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "759aaa1e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 2 & 3: Calculate total value and add as new column\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c3457bdd",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 4: Find product with highest total value\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b70405b9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 5: Calculate grand total of all sales\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4b46ec82",
   "metadata": {},
   "source": [
    "---\n",
    "## Problem 5: series creation and manipulation\n",
    "\n",
    "**Scenario:** Create and work with Pandas Series for daily temperature data.\n",
    "\n",
    "**Your task:**\n",
    "1. Create a Pandas Series called `temperatures` with the following data:\n",
    "   - Values: [22, 25, 23, 26, 24, 27, 25]\n",
    "   - Index: ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']\n",
    "2. Find the temperature for Wednesday\n",
    "3. Find days with temperature above 24 degrees\n",
    "4. Calculate the average temperature for the week\n",
    "5. Find the day with the highest temperature"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8a5e217e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 1: Create the temperature series\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4bfa4772",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 2: Temperature for Wednesday\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3f801c36",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 3: Days with temperature above 24\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3df5c281",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 4: Average temperature\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "efe22f36",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Step 5: Day with highest temperature\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8dfe6711",
   "metadata": {},
   "source": [
    "---\n",
    "## Reflection questions\n",
    "\n",
    "Please answer these questions after completing the activity:\n",
    "\n",
    "1. **What is the difference between a Pandas Series and a DataFrame?**\n",
    "   \n",
    "   *Your answer:*\n",
    "\n",
    "2. **What are the advantages of using Pandas over working with plain Python lists and dictionaries?**\n",
    "   \n",
    "   *Your answer:*\n",
    "\n",
    "3. **Describe a real-world scenario where you might use the filtering techniques you learned in Problem 3.**\n",
    "   \n",
    "   *Your answer:*\n",
    "\n",
    "4. **What did you find most challenging about working with Pandas in this activity?**\n",
    "   \n",
    "   *Your answer:*"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}