{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "127b2c4d",
   "metadata": {},
   "source": [
    "# Lesson 09 activity solution\n",
    "\n",
    "This notebook contains complete solutions for the lesson 03 NumPy activities.\n",
    "\n",
    "## Instructions:\n",
    "- Review each solution carefully\n",
    "- Compare with your own approach\n",
    "- Run the cells to see the output\n",
    "- Note that there are often multiple valid ways to solve each problem!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "c8a3813c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import NumPy\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b63cfde",
   "metadata": {},
   "source": [
    "---\n",
    "## Problem 1: The grade book analyzer\n",
    "\n",
    "**Your task:**\n",
    "Write a function called `analyze_grades(grades)` that:\n",
    "- Takes a 2D NumPy array where each row represents a student and each column represents a test score\n",
    "- Calculates and returns a dictionary containing:\n",
    "  - `'class_average'`: The overall average of all grades\n",
    "  - `'student_averages'`: Array of average scores for each student\n",
    "  - `'test_averages'`: Array of average scores for each test\n",
    "  - `'highest_score'`: The highest individual score\n",
    "  - `'lowest_score'`: The lowest individual score\n",
    "  - `'passing_rate'`: Percentage of grades >= 60\n",
    "\n",
    "**Test case:**\n",
    "```python\n",
    "grades = np.array([[85, 90, 78, 92],\n",
    "                   [76, 88, 81, 79],\n",
    "                   [93, 95, 89, 97],\n",
    "                   [67, 72, 65, 70]])\n",
    "```\n",
    "\n",
    "**Hints:**\n",
    "- Use `np.mean()` with `axis` parameter for averages\n",
    "- Use `np.max()` and `np.min()` for highest/lowest\n",
    "- Use boolean indexing to find passing grades"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "89a0f317",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Grade analysis results:\n",
      "class_average: 82.31\n",
      "student_averages: [86.25 81.   93.5  68.5 ]\n",
      "test_averages: [80.25 86.25 78.25 84.5 ]\n",
      "highest_score: 97\n",
      "lowest_score: 65\n",
      "passing_rate: 100.00%\n"
     ]
    }
   ],
   "source": [
    "# Problem 1: Solution\n",
    "\n",
    "def analyze_grades(grades):\n",
    "    '''\n",
    "    Analyzes a grade book and returns various statistics.\n",
    "    \n",
    "    Args:\n",
    "        grades (np.ndarray): 2D array of grades (students x tests)\n",
    "    \n",
    "    Returns:\n",
    "        dict: Dictionary containing various grade statistics\n",
    "    '''\n",
    "\n",
    "    # Calculate overall class average (mean of all grades)\n",
    "    class_average = np.mean(grades)\n",
    "\n",
    "    # Calculate average for each student (mean along axis=1, across columns)\n",
    "    student_averages = np.mean(grades, axis=1)\n",
    "\n",
    "    # Calculate average for each test (mean along axis=0, down rows)\n",
    "    test_averages = np.mean(grades, axis=0)\n",
    "\n",
    "    # Find highest and lowest scores\n",
    "    highest_score = np.max(grades)\n",
    "    lowest_score = np.min(grades)\n",
    "\n",
    "    # Calculate passing rate (percentage of grades >= 60)\n",
    "    passing_grades = grades >= 60\n",
    "    passing_rate = (np.sum(passing_grades) / grades.size) * 100\n",
    "\n",
    "    # Return all statistics in a dictionary\n",
    "    return {\n",
    "        'class_average': class_average,\n",
    "        'student_averages': student_averages,\n",
    "        'test_averages': test_averages,\n",
    "        'highest_score': highest_score,\n",
    "        'lowest_score': lowest_score,\n",
    "        'passing_rate': passing_rate\n",
    "    }\n",
    "\n",
    "# Test case\n",
    "grades = np.array([[85, 90, 78, 92],\n",
    "                   [76, 88, 81, 79],\n",
    "                   [93, 95, 89, 97],\n",
    "                   [67, 72, 65, 70]])\n",
    "\n",
    "results = analyze_grades(grades)\n",
    "\n",
    "print('Grade analysis results:')\n",
    "print(f\"class_average: {results['class_average']:.2f}\")\n",
    "print(f\"student_averages: {results['student_averages']}\")\n",
    "print(f\"test_averages: {results['test_averages']}\")\n",
    "print(f\"highest_score: {results['highest_score']}\")\n",
    "print(f\"lowest_score: {results['lowest_score']}\")\n",
    "print(f\"passing_rate: {results['passing_rate']:.2f}%\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4365def4",
   "metadata": {},
   "source": [
    "---\n",
    "## Problem 2: The array transformer\n",
    "\n",
    "**Your task:**\n",
    "Write a function called `transform_array(arr, operation='normalize', axis=None)` that:\n",
    "- Accepts a NumPy array of any dimension\n",
    "- Performs different transformations based on the `operation` parameter:\n",
    "  - `'normalize'`: Scale values to range [0, 1] using min-max normalization\n",
    "  - `'standardize'`: Apply z-score standardization (subtract mean, divide by std)\n",
    "  - `'square'`: Square all values\n",
    "  - `'sqrt'`: Take square root of all values\n",
    "- Supports an optional `axis` parameter for operations that can be done along specific axes\n",
    "- Returns the transformed array\n",
    "\n",
    "**Formulas:**\n",
    "- Min-max normalization: `(x - min) / (max - min)`\n",
    "- Z-score standardization: `(x - mean) / std`\n",
    "\n",
    "**Test cases:**\n",
    "- `transform_array(np.array([1, 2, 3, 4, 5]), 'normalize')` → array from 0 to 1\n",
    "- `transform_array(np.array([1, 4, 9, 16]), 'sqrt')` → [1, 2, 3, 4]\n",
    "\n",
    "**Hints:**\n",
    "- Use `np.min()`, `np.max()`, `np.mean()`, `np.std()`\n",
    "- Use `np.sqrt()` for square root\n",
    "- Remember to handle division by zero for edge cases"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bc366e51",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Test 1 (normalize):\n",
      "Original: [[1 2 3 4 5]\n",
      " [5 4 3 2 1]]\n",
      "Normalized: [[0.   0.25 0.5  0.75 1.  ]\n",
      " [1.   0.75 0.5  0.25 0.  ]]\n",
      "\n",
      "Test 2 (sqrt):\n",
      "Original: [ 1  4  9 16]\n",
      "Square root: [1. 2. 3. 4.]\n",
      "\n",
      "Test 3 (standardize):\n",
      "Original: [10 20 30 40 50]\n",
      "Standardized: [-1.41421356 -0.70710678  0.          0.70710678  1.41421356]\n",
      "\n",
      "Test 4 (square):\n",
      "Original: [1 2 3 4]\n",
      "Squared: [ 1  4  9 16]\n"
     ]
    }
   ],
   "source": [
    "# Problem 2: Solution\n",
    "\n",
    "def transform_array(arr, operation='normalize', axis=None):\n",
    "    '''\n",
    "    Transforms an array using various mathematical operations.\n",
    "    \n",
    "    Args:\n",
    "        arr (np.ndarray): Input array\n",
    "        operation (str): Type of transformation to apply\n",
    "        axis (int): Axis along which to apply the operation (if applicable)\n",
    "    \n",
    "    Returns:\n",
    "        np.ndarray: Transformed array\n",
    "    '''\n",
    "\n",
    "    if operation == 'normalize':\n",
    "\n",
    "        # Min-max normalization: (x - min) / (max - min)\n",
    "        min_val = np.min(arr, axis=axis, keepdims=True)\n",
    "        max_val = np.max(arr, axis=axis, keepdims=True)\n",
    "\n",
    "        return (arr - min_val) / (max_val - min_val)\n",
    "\n",
    "    elif operation == 'standardize':\n",
    "\n",
    "        # Z-score standardization: (x - mean) / std\n",
    "        mean = np.mean(arr, axis=axis, keepdims=True)\n",
    "        std = np.std(arr, axis=axis, keepdims=True)\n",
    "\n",
    "        # Handle edge case where std is 0\n",
    "        if std == 0:\n",
    "            return np.zeros_like(arr)\n",
    "\n",
    "        return (arr - mean) / std\n",
    "\n",
    "    elif operation == 'square':\n",
    "\n",
    "        # Square all values\n",
    "        return arr ** 2\n",
    "\n",
    "    elif operation == 'sqrt':\n",
    "\n",
    "        # Take square root of all values\n",
    "        return np.sqrt(arr)\n",
    "\n",
    "    else:\n",
    "        return f\"Error: Unknown operation '{operation}'\"\n",
    "\n",
    "# Test cases\n",
    "print('Test 1 (normalize):')\n",
    "arr1 = np.array([[1, 2, 3, 4, 5], [5, 4, 3, 2, 1]])\n",
    "print(f'Original: {arr1}')\n",
    "print(f'Normalized: {transform_array(arr1, \"normalize\")}')\n",
    "\n",
    "print('\\nTest 2 (sqrt):')\n",
    "arr2 = np.array([1, 4, 9, 16])\n",
    "print(f'Original: {arr2}')\n",
    "print(f'Square root: {transform_array(arr2, \"sqrt\")}')\n",
    "\n",
    "print('\\nTest 3 (standardize):')\n",
    "arr3 = np.array([10, 20, 30, 40, 50])\n",
    "print(f'Original: {arr3}')\n",
    "print(f'Standardized: {transform_array(arr3, \"standardize\")}')\n",
    "\n",
    "print('\\nTest 4 (square):')\n",
    "arr4 = np.array([1, 2, 3, 4])\n",
    "print(f'Original: {arr4}')\n",
    "print(f'Squared: {transform_array(arr4, \"square\")}')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3e37c017",
   "metadata": {},
   "source": [
    "---\n",
    "## Problem 3: The matrix operations toolkit\n",
    "\n",
    "**Your task:**\n",
    "Write a function called `matrix_operations(matrix1, matrix2=None, operation='transpose')` that:\n",
    "- Performs different matrix operations based on the `operation` parameter:\n",
    "  - `'transpose'`: Return the transpose of matrix1 (only needs matrix1)\n",
    "  - `'multiply'`: Element-wise multiplication of matrix1 and matrix2\n",
    "  - `'matmul'`: Matrix multiplication of matrix1 and matrix2\n",
    "  - `'add'`: Add matrix1 and matrix2\n",
    "  - `'flatten'`: Flatten matrix1 to 1D array\n",
    "- Returns the result of the operation\n",
    "- Handles incompatible matrix shapes by returning an error message\n",
    "\n",
    "**Test cases:**\n",
    "```python\n",
    "A = np.array([[1, 2], [3, 4]])\n",
    "B = np.array([[5, 6], [7, 8]])\n",
    "```\n",
    "\n",
    "**Hints:**\n",
    "- Use `.T` or `np.transpose()` for transpose\n",
    "- Use `@` or `np.matmul()` for matrix multiplication\n",
    "- Use `.flatten()` to flatten arrays\n",
    "- Check shapes before operations using `.shape`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bec3ba8a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Matrix A:\n",
      "[[1 2]\n",
      " [3 4]]\n",
      "\n",
      "Matrix B:\n",
      "[[5 6]\n",
      " [7 8]]\n",
      "\n",
      "Transpose of A:\n",
      "[[1 3]\n",
      " [2 4]]\n",
      "\n",
      "Element-wise multiplication (A * B):\n",
      "[[ 5 12]\n",
      " [21 32]]\n",
      "\n",
      "Matrix multiplication (A @ B):\n",
      "[[19 22]\n",
      " [43 50]]\n",
      "\n",
      "Addition (A + B):\n",
      "[[ 6  8]\n",
      " [10 12]]\n",
      "\n",
      "Flatten A:\n",
      "[1 2 3 4]\n"
     ]
    }
   ],
   "source": [
    "# Problem 3: Solution\n",
    "\n",
    "def matrix_operations(matrix1, matrix2=None, operation='transpose'):\n",
    "    '''\n",
    "    Performs various matrix operations.\n",
    "    \n",
    "    Args:\n",
    "        matrix1 (np.ndarray): First matrix\n",
    "        matrix2 (np.ndarray): Second matrix (optional)\n",
    "        operation (str): The operation to perform\n",
    "    \n",
    "    Returns:\n",
    "        np.ndarray or str: Result of the operation or error message\n",
    "    '''\n",
    "\n",
    "    if operation == 'transpose':\n",
    "        # Return the transpose of matrix1\n",
    "        return matrix1.T\n",
    "\n",
    "    elif operation == 'flatten':\n",
    "        # Flatten matrix1 to 1D array\n",
    "        return matrix1.flatten()\n",
    "\n",
    "    # For operations requiring two matrices, check if matrix2 is provided\n",
    "    if matrix2 is None:\n",
    "        return f\"Error: Operation '{operation}' requires two matrices\"\n",
    "\n",
    "    if operation == 'multiply':\n",
    "\n",
    "        # Element-wise multiplication\n",
    "        if matrix1.shape != matrix2.shape:\n",
    "            return 'Error: Matrices must have the same shape for element-wise multiplication'\n",
    "\n",
    "        return matrix1 * matrix2\n",
    "\n",
    "    elif operation == 'add':\n",
    "\n",
    "        # Matrix addition\n",
    "        if matrix1.shape != matrix2.shape:\n",
    "            return 'Error: Matrices must have the same shape for addition'\n",
    "\n",
    "        return matrix1 + matrix2\n",
    "\n",
    "    elif operation == 'matmul':\n",
    "\n",
    "        # Matrix multiplication\n",
    "        # Check if dimensions are compatible: (m x n) @ (n x p) = (m x p)\n",
    "        if matrix1.shape[1] != matrix2.shape[0]:\n",
    "            error_str = f'Error: Incompatible shapes for matrix multiplication: {matrix1.shape} and {matrix2.shape}'\n",
    "            return error_str\n",
    "\n",
    "        return matrix1 @ matrix2\n",
    "\n",
    "    else:\n",
    "        return f\"Error: Unknown operation '{operation}'\"\n",
    "\n",
    "# Test cases\n",
    "A = np.array([[1, 2], [3, 4]])\n",
    "B = np.array([[5, 6], [7, 8]])\n",
    "\n",
    "print('Matrix A:')\n",
    "print(A)\n",
    "print('\\nMatrix B:')\n",
    "print(B)\n",
    "\n",
    "print('\\nTranspose of A:')\n",
    "print(matrix_operations(A, operation='transpose'))\n",
    "\n",
    "print('\\nElement-wise multiplication (A * B):')\n",
    "print(matrix_operations(A, B, operation='multiply'))\n",
    "\n",
    "print('\\nMatrix multiplication (A @ B):')\n",
    "print(matrix_operations(A, B, operation='matmul'))\n",
    "\n",
    "print('\\nAddition (A + B):')\n",
    "print(matrix_operations(A, B, operation='add'))\n",
    "\n",
    "print('\\nFlatten A:')\n",
    "print(matrix_operations(A, operation='flatten'))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5965257b",
   "metadata": {},
   "source": [
    "---\n",
    "## Problem 4: Fixing NumPy bugs"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8434dcf7",
   "metadata": {},
   "source": [
    "### Bug 1\n",
    "\n",
    "**Expected output:** A 3x3 array with values [1, 2, 3, 4, 5, 6, 7, 8, 9] reshaped\n",
    "\n",
    "**Current error:** ValueError\n",
    "\n",
    "**Problem:** Array has 8 elements but trying to reshape to 3x3 (9 elements)\n",
    "\n",
    "**Solution:** Need 9 elements or reshape to compatible dimensions (e.g., 2x4, 4x2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "533fff24",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1 2 3]\n",
      " [4 5 6]\n",
      " [7 8 9]]\n",
      "\n",
      "Alternative (2x4):\n",
      "[[1 2 3 4]\n",
      " [5 6 7 8]]\n"
     ]
    }
   ],
   "source": [
    "# Bug 1: Fixed code\n",
    "\n",
    "# Fix: Need 9 elements for a 3x3 array\n",
    "arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])  # Added 9 to make it 9 elements\n",
    "reshaped = arr.reshape(3, 3)\n",
    "\n",
    "print(reshaped)\n",
    "\n",
    "# Alternative fix: Reshape to compatible dimensions with 8 elements\n",
    "arr2 = np.array([1, 2, 3, 4, 5, 6, 7, 8])\n",
    "reshaped2 = arr2.reshape(2, 4)  # or (4, 2)\n",
    "print('\\nAlternative (2x4):')\n",
    "print(reshaped2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5536fc71",
   "metadata": {},
   "source": [
    "### Bug 2\n",
    "\n",
    "**Expected output:** Should extract the second row from the matrix\n",
    "\n",
    "**Current error:** Wrong output (extracts column instead of row)\n",
    "\n",
    "**Problem:** Using `[:, 1]` gets the second column, not the second row\n",
    "\n",
    "**Solution:** Use `[1, :]` to get the second row (row index 1, all columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "beb0da63",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Matrix:\n",
      "[[1 2 3]\n",
      " [4 5 6]\n",
      " [7 8 9]]\n",
      "\n",
      "Second row: [4 5 6]\n",
      "Second column: [2 5 8]\n"
     ]
    }
   ],
   "source": [
    "# Bug 2: Fixed code\n",
    "\n",
    "matrix = np.array([[1, 2, 3],\n",
    "                   [4, 5, 6],\n",
    "                   [7, 8, 9]])\n",
    "\n",
    "print('Matrix:')\n",
    "print(matrix)\n",
    "\n",
    "# Fix: Use [1, :] to get the second row (row index 1, all columns)\n",
    "# The original code [:, 1] gets the second column instead\n",
    "second_row = matrix[1, :]  # or simply matrix[1]\n",
    "print('\\nSecond row:', second_row)\n",
    "\n",
    "# For comparison, here's how to get the second column:\n",
    "second_column = matrix[:, 1]\n",
    "print('Second column:', second_column)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "59f63b2a",
   "metadata": {},
   "source": [
    "### Bug 3\n",
    "\n",
    "**Expected output:** Array should contain decimal values for division results\n",
    "\n",
    "**Current (wrong) output:** Integer division (truncated results)\n",
    "\n",
    "**Problem:** Array is integer type by default, causing integer division in older Python/NumPy\n",
    "\n",
    "**Solution:** Convert array to float type before division, or use float in array creation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "9e133843",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Option 1: using dtype=float:\n",
      "Numbers: [10. 15. 20. 25. 30.]\n",
      "Divided by 3: [ 3.33333333  5.          6.66666667  8.33333333 10.        ]\n",
      "Data type: float64\n",
      "\n",
      "Option 2: using astype(float):\n",
      "Divided by 3: [ 3.33333333  5.          6.66666667  8.33333333 10.        ]\n",
      "Data type: float64\n"
     ]
    }
   ],
   "source": [
    "# Bug 3: Fixed code\n",
    "\n",
    "# Fix option 1: Explicitly specify float dtype when creating array\n",
    "numbers = np.array([10, 15, 20, 25, 30], dtype=float)\n",
    "divisor = 3\n",
    "\n",
    "result = numbers / divisor\n",
    "\n",
    "print('Option 1: using dtype=float:')\n",
    "print(f'Numbers: {numbers}')\n",
    "print(f'Divided by {divisor}: {result}')\n",
    "print(f'Data type: {result.dtype}')\n",
    "\n",
    "# Fix option 2: Convert to float after creation\n",
    "numbers2 = np.array([10, 15, 20, 25, 30])\n",
    "result2 = numbers2.astype(float) / divisor\n",
    "\n",
    "print('\\nOption 2: using astype(float):')\n",
    "print(f'Divided by {divisor}: {result2}')\n",
    "print(f'Data type: {result2.dtype}')\n",
    "\n",
    "# Note: In Python 3 and modern NumPy, division with / always produces floats,\n",
    "# but explicitly setting the dtype is still good practice for clarity"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cfd4e21d",
   "metadata": {},
   "source": [
    "---\n",
    "## Bonus challenge: The data filter\n",
    "\n",
    "**Your task:**\n",
    "Write a function called `filter_data(data, condition='positive', threshold=0)` that:\n",
    "- Takes a 1D or 2D NumPy array\n",
    "- Filters data based on the condition:\n",
    "  - `'positive'`: Keep only values > 0\n",
    "  - `'negative'`: Keep only values < 0\n",
    "  - `'threshold'`: Keep only values > threshold\n",
    "  - `'range'`: Keep only values between -threshold and +threshold\n",
    "- Returns the filtered array (1D result)\n",
    "- Also returns the count of filtered elements\n",
    "\n",
    "**Test case:**\n",
    "```python\n",
    "data = np.array([-5, 10, -3, 15, 0, -8, 20, 3])\n",
    "```\n",
    "\n",
    "**Hints:**\n",
    "- Use boolean indexing with conditions\n",
    "- Combine conditions using `&` (and) or `|` (or)\n",
    "- Use `.flatten()` if input is 2D\n",
    "- Return multiple values as a tuple"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad64e1b5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Original data: [-5 10 -3 15  0 -8 20  3]\n",
      "\n",
      "Positive values: [10 15 20  3]\n",
      "Count: 4\n",
      "\n",
      "Negative values: [-5 -3 -8]\n",
      "Count: 3\n",
      "\n",
      "Values > 10: [15 20]\n",
      "Count: 2\n",
      "\n",
      "Values in range [-5, 5]: [-5 -3  0  3]\n",
      "Count: 4\n"
     ]
    }
   ],
   "source": [
    "# Bonus challenge: Solution\n",
    "\n",
    "def filter_data(data, condition='positive', threshold=0):\n",
    "    '''\n",
    "    Filters data based on various conditions.\n",
    "    \n",
    "    Args:\n",
    "        data (np.ndarray): Input data array\n",
    "        condition (str): The filtering condition to apply\n",
    "        threshold (float): Threshold value for certain conditions\n",
    "    \n",
    "    Returns:\n",
    "        tuple: (filtered_data, count)\n",
    "    '''\n",
    "\n",
    "    # Flatten the array if it's multi-dimensional\n",
    "    flat_data = data.flatten()\n",
    "\n",
    "    # Apply the appropriate filter based on condition\n",
    "    if condition == 'positive':\n",
    "\n",
    "        # Keep only positive values\n",
    "        mask = flat_data > 0\n",
    "\n",
    "    elif condition == 'negative':\n",
    "\n",
    "        # Keep only negative values\n",
    "        mask = flat_data < 0\n",
    "\n",
    "    elif condition == 'threshold':\n",
    "\n",
    "        # Keep only values greater than threshold\n",
    "        mask = flat_data > threshold\n",
    "\n",
    "    elif condition == 'range':\n",
    "\n",
    "        # Keep only values between -threshold and +threshold (inclusive)\n",
    "        mask = (flat_data >= -threshold) & (flat_data <= threshold)\n",
    "\n",
    "    else:\n",
    "        return (np.array([]), 0)\n",
    "\n",
    "    # Apply the mask to filter the data\n",
    "    filtered_data = flat_data[mask]\n",
    "    data_count = len(filtered_data)\n",
    "\n",
    "    return (filtered_data, data_count)\n",
    "\n",
    "# Test cases\n",
    "data = np.array([-5, 10, -3, 15, 0, -8, 20, 3])\n",
    "print(f'Original data: {data}\\n')\n",
    "\n",
    "filtered, count = filter_data(data, condition='positive')\n",
    "print(f'Positive values: {filtered}')\n",
    "print(f'Count: {count}\\n')\n",
    "\n",
    "filtered, count = filter_data(data, condition='negative')\n",
    "print(f'Negative values: {filtered}')\n",
    "print(f'Count: {count}\\n')\n",
    "\n",
    "filtered, count = filter_data(data, condition='threshold', threshold=10)\n",
    "print(f'Values > 10: {filtered}')\n",
    "print(f'Count: {count}\\n')\n",
    "\n",
    "filtered, count = filter_data(data, condition='range', threshold=5)\n",
    "print(f'Values in range [-5, 5]: {filtered}')\n",
    "print(f'Count: {count}')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "43e8e52c",
   "metadata": {},
   "source": [
    "---\n",
    "## Reflection questions: sample answers"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "61988ef9",
   "metadata": {},
   "source": [
    "1. **How does NumPy's vectorization make operations faster compared to Python loops?**\n",
    "\n",
    "   NumPy's vectorization performs operations on entire arrays at once using optimized C code, rather than iterating through elements one by one in Python. For example, in problem 2, `arr ** 2` squares all elements simultaneously, which is much faster than using a for loop like `[x**2 for x in arr]`. NumPy operations are implemented in C and use CPU-level optimizations like SIMD (Single Instruction, Multiple Data), making them 10-100x faster for large datasets.\n",
    "\n",
    "2. **What is the purpose of the `axis` parameter in NumPy functions?**\n",
    "\n",
    "   The `axis` parameter specifies which dimension to perform an operation along. In problem 1, `axis=0` calculates along rows (down columns) to get test averages, while `axis=1` calculates along columns (across rows) to get student averages. Think of it as: `axis=0` collapses rows (vertical), `axis=1` collapses columns (horizontal). Without an axis parameter, the operation applies to the entire flattened array.\n",
    "\n",
    "3. **Explain the difference between element-wise multiplication and matrix multiplication.**\n",
    "\n",
    "   Element-wise multiplication (`*`) multiplies corresponding elements: `[[1,2],[3,4]] * [[5,6],[7,8]] = [[5,12],[21,32]]`. Matrix multiplication (`@` or `np.matmul()`) performs the dot product of rows and columns: for 2x2 matrices, result[i,j] = sum of (row i of A * column j of B). Matrix multiplication requires compatible shapes (m×n @ n×p = m×p), while element-wise requires identical shapes. Matrix multiplication is fundamental to linear algebra and machine learning transformations.\n",
    "\n",
    "4. **Why is it important to pay attention to array shapes?**\n",
    "\n",
    "   Array shapes determine whether operations are valid and affect the results. Bug 1 showed that reshape requires compatible dimensions (8 elements can't become 3×3). Bug 2 demonstrated how indexing depends on understanding row vs column dimensions. Shape mismatches cause ValueError in operations like matrix multiplication or addition. In data science, shape errors often indicate conceptual mistakes, like trying to multiply features with the wrong number of samples. Always check shapes with `.shape` before operations.\n",
    "\n",
    "5. **How can boolean indexing be used for data filtering?**\n",
    "\n",
    "   Boolean indexing creates a mask of True/False values based on conditions, then uses it to select elements. In problem 1, `grades >= 60` created a boolean array which we used to count passing grades. The bonus challenge showed more complex filtering with conditions like `(data >= -threshold) & (data <= threshold)`. Real-world example: filtering customer data to find high-value purchases over $100: `expensive_items = prices[prices > 100]`. This is much cleaner and faster than loops, and it's essential for data cleaning, outlier detection, and feature engineering in data science workflows."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}