{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": {}, "source": [ "---\n", "title: \"Part 2: Language Core (Control Flow & Comprehensions)\"\n", "---" ] }, { "cell_type": "markdown", "id": "1", "metadata": {}, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/sambaiga/ds-mlops-path/blob/main/tutorials/01-python-basics/02-control-flow.ipynb) [![Download Notebook](https://img.shields.io/badge/Download-Notebook-blue.svg?logo=jupyter&logoColor=white)](https://raw.githubusercontent.com/sambaiga/ds-mlops-path/main/tutorials/01-python-basics/02-control-flow.ipynb)" ] }, { "cell_type": "markdown", "id": "2", "metadata": {}, "source": [ "**DS-MLOps Python Foundations**\n", "\n", "**Python 3.12+ | Author: Anthony Faustine**\n", "\n", "## Before you begin\n", "\n", "This notebook assumes you have completed Part 1 (`01-python-core.ipynb`). If you have not, start there. Part 2 picks up immediately where Part 1 left off, using the same **university analytics platform** scenario, and covers everything that decides *what runs and how many times*: conditionals, pattern matching, loops, and comprehensions.\n", "\n", "> Callout markers used throughout this notebook are explained on the [book cover page](../../index.qmd#callout-guide)." ] }, { "cell_type": "markdown", "id": "3", "metadata": {}, "source": [ "::: {.callout-note collapse=\"true\" icon=false}\n", "## Learning Objectives\n", "\n", "By the end of Part 2 you will be able to:\n", "\n", "| # | Skill | Covered in |\n", "|---|---|---|\n", "| 1 | Write `if` / `elif` / `else` and structural `match` / `case` | Sec. 1 |\n", "| 2 | Replace manual index counters with `for`, `enumerate`, and `zip` | Sec. 2 |\n", "| 3 | Use `while`, `break`, and `continue` for indefinite loops | Sec. 3 |\n", "| 4 | Replace loops with list, dict, set, and generator comprehensions | Sec. 
4 |\n", ":::\n" ] }, { "cell_type": "markdown", "id": "4", "metadata": {}, "source": [ "## 1. Control Flow: if / elif / else & match / case\n", "\n", "So far every cell runs its lines from top to bottom, once, in order. **Control flow** lets you change that:\n", "- `if / elif / else`: run one branch based on a condition\n", "- `match / case`: route structured data to different handlers (Python 3.10+)\n", "\n", "
\n", " Key Concept: match / case (Python 3.10+)

\n", "Structural pattern matching goes beyond simple equality checks. It can match on the shape of data, destructuring dicts, lists, and class instances in the same step. Use it when branching on the shape or value of structured data, not just numeric thresholds.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "5", "metadata": {}, "outputs": [], "source": [ "def classify_grade(score: float) -> str:\n", "    \"\"\"Return letter grade for a numeric score.\"\"\"\n", "    if score >= 90:\n", "        return \"A: Excellent\"\n", "    elif score >= 80:\n", "        return \"B: Good\"\n", "    elif score >= 70:\n", "        return \"C: Satisfactory\"\n", "    elif score >= 60:\n", "        return \"D: Needs improvement\"\n", "    else:\n", "        return \"F: See instructor\"" ] }, { "cell_type": "markdown", "id": "6", "metadata": {}, "source": [ "Test the function across the full grade range:" ] }, { "cell_type": "code", "execution_count": null, "id": "7", "metadata": {}, "outputs": [], "source": [ "for s in [95.0, 83.5, 71.0, 62.0, 45.0]:\n", "    print(f\" {s:5.1f} -> {classify_grade(s)}\")\n", "\n", "# Ternary expression: one-liner for simple binary choices\n", "score = 87.0\n", "status = \"pass\" if score >= 70 else \"fail\"\n", "band = \"high\" if score >= 90 else (\"mid\" if score >= 70 else \"low\")\n", "print(f\"\\n{score} -> {status}, {band}\")" ] }, { "cell_type": "raw", "id": "8", "metadata": { "raw_mimetype": "text/markdown" }, "source": [ "> **Decision flow: if / elif / else**\n", "\n", "```{mermaid}\n", "flowchart TD\n", " A[\"evaluate condition\"] --> B{if condition1}\n", " B -->|True| C[\"execute if block\"]\n", " B -->|False| D{elif condition2}\n", " D -->|True| E[\"execute elif block\"]\n", " D -->|False| F{else?}\n", " F -->|present| G[\"execute else block\"]\n", " F -->|absent| H[\"skip all\"]\n", " C & E & G & H --> I[\"continue program\"]\n", "\n", " style C fill:#EBF5F0,stroke:#059669,color:#065F46\n", " style E fill:#EAF3FA,stroke:#0369A1,color:#0C4A6E\n", " style G fill:#F5F3FF,stroke:#7C3AED,color:#3B0764\n", "```\n" ] }, { "cell_type": "markdown", "id": "9", "metadata": {}, "source": [ "### match / case: Structural Pattern Matching (Python 3.10+)\n", "`match` goes beyond simple equality checks. 
It can match on **the shape of data**, destructuring dicts and lists to extract values in one step. Define the routing function first:" ] }, { "cell_type": "code", "execution_count": null, "id": "10", "metadata": {}, "outputs": [], "source": [ "# match / case: structural pattern matching (Python 3.10+)\n", "def process_event(event: dict[str, object]) -> str:\n", "    \"\"\"Route a training event to the right handler.\"\"\"\n", "    match event:\n", "        case {\"type\": \"epoch\", \"epoch\": e, \"loss\": l} if float(str(l)) < 0.05:\n", "            return f\"Epoch {e}: converged (loss={l:.3f})\"\n", "        case {\"type\": \"epoch\", \"epoch\": e, \"loss\": l}:\n", "            return f\"Epoch {e}: loss={l:.3f}\"\n", "        case {\"type\": \"error\", \"message\": msg}:\n", "            return f\"ERROR: {msg}\"\n", "        case {\"type\": t}:\n", "            return f\"Unhandled event type: {t!r}\"\n", "        case _:\n", "            return \"Malformed event\"" ] }, { "cell_type": "markdown", "id": "11", "metadata": {}, "source": [ "Run a variety of event shapes through the dispatcher to see each `case` arm triggered. Arms are tried from top to bottom, so the guarded `epoch` arm must come before the general one. The `case _:` arm is a catch-all that always matches, so it goes last:" ] }, { "cell_type": "code", "execution_count": null, "id": "12", "metadata": {}, "outputs": [], "source": [ "events: list[dict[str, object]] = [\n", "    {\"type\": \"epoch\", \"epoch\": 1, \"loss\": 0.823},\n", "    {\"type\": \"epoch\", \"epoch\": 20, \"loss\": 0.041},\n", "    {\"type\": \"error\", \"message\": \"OOM on GPU 0\"},\n", "    {\"type\": \"checkpoint\"},\n", "    {\"status\": \"idle\"},\n", "]\n", "\n", "for ev in events:\n", "    print(process_event(ev))" ] }, { "cell_type": "markdown", "id": "13", "metadata": {}, "source": [ "
\n", " Activity 6 - Match on HTTP-style Status Codes

\n", "Goal: Write a describe_status(code) function using match/case that returns a short description.\n", "
describe_status(200)  -> '200 OK'\n",
    "describe_status(404)  -> '404 Not Found'\n",
    "describe_status(500)  -> '500 Server Error'\n",
    "describe_status(301)  -> '3xx Redirect'\n",
    "describe_status(999)  -> 'Unknown code'
\n", "Hint: Wildcard patterns such as case 2xx are not valid Python. Use a guard condition instead: case c if 200 <= c < 300.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "14", "metadata": {}, "outputs": [], "source": [ "def describe_status(code: int) -> str:\n", "    \"\"\"Return a short description for an HTTP-style status code.\"\"\"\n", "    match code:\n", "        case _:\n", "            return \"unknown\" # TODO: replace with specific case patterns\n", "\n", "\n", "for c in [200, 404, 500, 301, 999]:\n", "    print(describe_status(c))" ] }, { "cell_type": "markdown", "id": "15", "metadata": {}, "source": [ "## 2. Control Flow: for Loops\n", "\n", "A **`for` loop** repeats a block of code once for each item in a collection. It is the primary tool for processing datasets, running training epochs, and iterating over files.\n", "\n", "```python\n", "for score in [78, 85, 92]: # repeat once per score\n", "    print(score) # output: 78, then 85, then 92\n", "```\n", "\n", "The indented block (4 spaces) is the **loop body**: it runs once per item.\n", "\n", "Python `for` loops iterate over any **iterable**. The built-ins `range()`, `enumerate()`, and `zip()` cover the most common patterns in data work." 
] }, { "cell_type": "code", "execution_count": null, "id": "16", "metadata": {}, "outputs": [], "source": [ "# range(start, stop, step): generates integers lazily (no list in memory)\n", "MAX_EPOCHS: int = 5\n", "loss: float = 1.0\n", "\n", "for epoch in range(1, MAX_EPOCHS + 1):\n", "    loss *= 0.75\n", "    print(f\" Epoch {epoch}/{MAX_EPOCHS} loss={loss:.4f}\")" ] }, { "cell_type": "markdown", "id": "17", "metadata": {}, "source": [ "`enumerate()` pairs each element with its index, counting from `0` by default (or from any integer you pass as `start`), eliminating the need for manual `i += 1` counters:" ] }, { "cell_type": "code", "execution_count": null, "id": "18", "metadata": {}, "outputs": [], "source": [ "# enumerate(): loop with automatic index; avoids manual counter variables\n", "students: list[str] = [\"Alice\", \"Carol\", \"Dan\", \"Bob\"]\n", "\n", "print(\"Leaderboard:\")\n", "for rank, name in enumerate(students, start=1):\n", "    print(f\" #{rank} {name}\")" ] }, { "cell_type": "markdown", "id": "19", "metadata": {}, "source": [ "`zip()` stitches two or more iterables together element-by-element. By default, pairing stops when the **shortest** input is exhausted; pass `strict=True` (Python 3.10+) to raise `ValueError` on a length mismatch instead. 
Build a `dict` from two parallel lists using `dict(zip(keys, values))`:" ] }, { "cell_type": "code", "execution_count": null, "id": "20", "metadata": {}, "outputs": [], "source": [ "# zip(): iterate two or more iterables in lockstep\n", "# strict=True raises ValueError if the iterables have different lengths\n", "names: list[str] = [\"Alice\", \"Bob\", \"Carol\"]\n", "scores: list[float] = [92.0, 74.5, 88.0]\n", "\n", "print(\"Score sheet:\")\n", "for name, score in zip(names, scores, strict=True):\n", " grade = \"pass\" if score >= 70 else \"fail\"\n", " print(f\" {name:<8} {score:5.1f} {grade}\")\n", "\n", "# Build a dict from two parallel lists\n", "metric_names: list[str] = [\"accuracy\", \"precision\", \"recall\"]\n", "metric_vals: list[float] = [0.923, 0.911, 0.934]\n", "report: dict[str, float] = dict(zip(metric_names, metric_vals, strict=True))\n", "print()\n", "print(f\"Report: {report}\")" ] }, { "cell_type": "markdown", "id": "21", "metadata": {}, "source": [ "### tqdm: Progress Bars for Long Loops\n", "\n", "When a loop processes thousands of files or training examples, you need to know how long it will take. 
`tqdm` wraps any iterable and displays a live progress bar with elapsed time, rate, and ETA, with zero code changes to the loop body:\n", "\n", "```bash\n", "pip install tqdm # if not already installed\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "22", "metadata": {}, "outputs": [], "source": [ "from tqdm import tqdm\n", "\n", "# Wrap any iterable with tqdm() - the loop body is unchanged\n", "scores: list[float] = []\n", "for i in tqdm(range(1_000), desc=\"Simulating scores\", unit=\"rec\"):\n", "    scores.append(50.0 + (i % 50)) # dummy computation\n", "\n", "print(f\"Generated {len(scores)} scores, mean = {sum(scores) / len(scores):.1f}\")\n", "\n", "# tqdm also works inside a comprehension\n", "labels: list[str] = [\"pass\" if s >= 70 else \"fail\" for s in tqdm(scores, desc=\"Labelling\", leave=False)]\n", "print(f\"pass rate: {labels.count('pass') / len(labels):.1%}\")" ] }, { "cell_type": "markdown", "id": "23", "metadata": {}, "source": [ "## 3. Control Flow: while, break, continue\n", "\n", "A **`while` loop** repeats a block as long as a condition is `True`. 
Unlike `for` (which visits each item of an iterable once), `while` runs an **indefinite** number of times until either the condition becomes `False` or a `break` statement is hit.\n", "\n", "```python\n", "loss = 1.0\n", "while loss > 0.05: # keep running until loss is small enough\n", "    loss *= 0.7 # shrink loss by 30% each iteration\n", "```\n", "\n", "Use `while` when you do not know in advance how many iterations are needed: waiting for convergence, retrying a failing operation, or consuming a data stream.\n", "\n", "- `break`: exit the loop immediately\n", "- `continue`: skip the rest of this iteration\n", "- `else` on a loop: runs **only** if no `break` was hit" ] }, { "cell_type": "code", "execution_count": null, "id": "24", "metadata": {}, "outputs": [], "source": [ "# while: train until convergence or budget exhausted\n", "loss: float = 1.0\n", "epoch: int = 0\n", "MAX_EPOCHS: int = 30\n", "THRESHOLD: float = 0.05\n", "\n", "while loss > THRESHOLD and epoch < MAX_EPOCHS:\n", "    loss *= 0.7\n", "    epoch += 1\n", "\n", "print(f\"Stopped at epoch {epoch}: loss={loss:.4f}\")\n", "print(f\"Converged: {loss <= THRESHOLD}\")" ] }, { "cell_type": "markdown", "id": "25", "metadata": {}, "source": [ "### break and continue\n", "`break` exits the innermost loop immediately. Use it when a sentinel value or error condition means further iteration is pointless:" ] }, { "cell_type": "code", "execution_count": null, "id": "26", "metadata": {}, "outputs": [], "source": [ "# break: exit the loop immediately when a sentinel is found\n", "readings: list[float | None] = [36.5, 36.9, 37.4, None, 38.1, 37.8]\n", "clean: list[float] = []\n", "\n", "for r in readings:\n", "    if r is None:\n", "        print(\"Sensor error: stopping collection\")\n", "        break\n", "    clean.append(r)\n", "\n", "print(f\"Clean readings: {clean}\")" ] }, { "cell_type": "markdown", "id": "27", "metadata": {}, "source": [ "`continue` skips the rest of the current iteration and jumps to the next one. 
It is ideal for filtering bad data without a nested `if/else`. The `else` clause on a loop runs only if no `break` occurred:" ] }, { "cell_type": "code", "execution_count": null, "id": "28", "metadata": {}, "outputs": [], "source": [ "# continue: skip the rest of this iteration and move to the next\n", "raw: list[object] = [85.0, \"n/a\", None, 92.0, \"\", 78.5, -1.0, 95.0]\n", "valid: list[float] = []\n", "\n", "for item in raw:\n", "    if not isinstance(item, int | float) or float(str(item)) < 0:\n", "        continue # skip bad items\n", "    valid.append(float(str(item)))\n", "\n", "print(f\"Valid scores: {valid}\")\n", "\n", "# loop else: runs only when the loop was NOT exited via break\n", "required_fields: list[str] = [\"name\", \"gpa\", \"major\"]\n", "record: dict[str, str] = {\"name\": \"Alice\", \"gpa\": \"3.95\", \"major\": \"CS\"}\n", "\n", "for field in required_fields:\n", "    if field not in record:\n", "        print(f\"Missing required field: {field!r}\")\n", "        break\n", "else:\n", "    print(\"All required fields present\")" ] }, { "cell_type": "markdown", "id": "29", "metadata": {}, "source": [ "## 4. Comprehensions\n", "\n", "A **comprehension** builds a new collection by transforming or filtering an existing one, all in a single expression. It replaces the verbose `for` + `.append()` pattern:\n", "\n", "```python\n", "# Loop version (3 lines):\n", "squares = []\n", "for n in range(5):\n", "    squares.append(n ** 2) # [0, 1, 4, 9, 16]\n", "\n", "# Comprehension (1 line, identical result):\n", "squares = [n ** 2 for n in range(5)]\n", "```\n", "\n", "Comprehensions are usually faster than the equivalent loop and are considered idiomatic Python.\n", "\n", "
\n", " Key Concept: Concise, Readable Collection Construction

\n", "Comprehensions build new collections by transforming or filtering an iterable in a single expression. They are usually faster than the equivalent for + .append() loop and are idiomatic Python.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
[expr for x in it if cond]list
{k: v for x in it if cond}dict
{expr for x in it if cond}set
(expr for x in it if cond)generator (lazy, no list in memory)
\n", "
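The speed claim is easy to check on your own machine. This is a rough sketch using the standard-library `timeit` module; the function names and the input size are arbitrary choices for illustration, and the timings will vary by machine and Python version:

```python
import timeit

N = 10_000  # arbitrary input size for the comparison


def with_loop() -> list[int]:
    """Build the list with an explicit for + .append() loop."""
    out: list[int] = []
    for n in range(N):
        out.append(n**2)
    return out


def with_comprehension() -> list[int]:
    """Build the same list with a list comprehension."""
    return [n**2 for n in range(N)]


# Both approaches must produce exactly the same list
assert with_loop() == with_comprehension()

loop_s = timeit.timeit(with_loop, number=200)
comp_s = timeit.timeit(with_comprehension, number=200)
print(f"loop          : {loop_s:.3f} s")
print(f"comprehension : {comp_s:.3f} s")
```

The gap is typically modest, so readability should decide: if a comprehension needs more than one condition or nested clause to follow, an explicit loop is often the better choice.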
" ] }, { "cell_type": "code", "execution_count": null, "id": "30", "metadata": {}, "outputs": [], "source": [ "raw_scores: list[float] = [78.0, 85.5, 92.0, 88.5, 95.0, 67.0, 81.0]\n", "\n", "# Transform: min-max normalise to [0, 1]\n", "lo, hi = min(raw_scores), max(raw_scores)\n", "normed: list[float] = [(s - lo) / (hi - lo) for s in raw_scores]\n", "print(f\"Normalised: {[round(n, 2) for n in normed]}\")\n", "\n", "# Filter: keep only passing scores\n", "passing: list[float] = [s for s in raw_scores if s >= 70]\n", "print(f\"Passing : {passing}\")\n", "\n", "# Filter + transform: label each score\n", "labels: list[str] = [f\"{s:.0f} (pass)\" if s >= 70 else f\"{s:.0f} (FAIL)\" for s in raw_scores]\n", "print(f\"Labelled : {labels}\")" ] }, { "cell_type": "markdown", "id": "31", "metadata": {}, "source": [ "A two-clause comprehension flattens a nested collection. Read `[s for batch in batches for s in batch]` left-to-right: \"outer loop, inner loop, collect `s`\":" ] }, { "cell_type": "code", "execution_count": null, "id": "32", "metadata": {}, "outputs": [], "source": [ "# Flatten a nested structure with a two-clause comprehension\n", "batches: list[list[float]] = [[85.0, 91.0], [74.0, 88.5], [95.0, 79.0]]\n", "flat: list[float] = [s for batch in batches for s in batch]\n", "print(f\"Flattened : {flat}\")" ] }, { "cell_type": "markdown", "id": "33", "metadata": {}, "source": [ "### Dict, Set, and Generator Comprehensions\n", "The `[...]` syntax extends to dicts (`{k: v for ...}`), sets (`{expr for ...}`), and lazy generators (`(expr for ...)`):" ] }, { "cell_type": "code", "execution_count": null, "id": "34", "metadata": {}, "outputs": [], "source": [ "students: list[dict[str, object]] = [\n", " {\"name\": \"Alice\", \"score\": 92.0, \"major\": \"CS\"},\n", " {\"name\": \"Bob\", \"score\": 74.5, \"major\": \"Math\"},\n", " {\"name\": \"Carol\", \"score\": 88.0, \"major\": \"CS\"},\n", " {\"name\": \"Dan\", \"score\": 61.0, \"major\": \"Physics\"},\n", "]\n", 
"\n", "# Dict comprehension: build a name -> score lookup\n", "score_lookup: dict[str, float] = {str(s[\"name\"]): float(str(s[\"score\"])) for s in students}\n", "print(f\"Lookup : {score_lookup}\")\n", "\n", "# Dict comprehension with filter: honours students only\n", "honours: dict[str, float] = {str(s[\"name\"]): float(str(s[\"score\"])) for s in students if float(str(s[\"score\"])) >= 80}\n", "print(f\"Honours: {honours}\")" ] }, { "cell_type": "markdown", "id": "35", "metadata": {}, "source": [ "Set comprehensions deduplicate automatically. Generator expressions compute values **lazily**: they use O(1) memory regardless of input size, making them ideal inside `sum()`, `any()`, and `all()`:" ] }, { "cell_type": "code", "execution_count": null, "id": "36", "metadata": {}, "outputs": [], "source": [ "students: list[dict[str, object]] = [\n", " {\"name\": \"Alice\", \"score\": 92.0, \"major\": \"CS\"},\n", " {\"name\": \"Bob\", \"score\": 74.5, \"major\": \"Math\"},\n", " {\"name\": \"Carol\", \"score\": 88.0, \"major\": \"CS\"},\n", " {\"name\": \"Dan\", \"score\": 61.0, \"major\": \"Physics\"},\n", "]\n", "\n", "# Set comprehension: unique majors\n", "majors: set[str] = {str(s[\"major\"]) for s in students}\n", "print(f\"Majors : {sorted(majors)}\")\n", "\n", "# Generator expression: lazy evaluation; ideal inside sum/any/all\n", "total: float = sum(float(str(s[\"score\"])) for s in students)\n", "any_fail: bool = any(float(str(s[\"score\"])) < 70 for s in students)\n", "all_pass: bool = all(float(str(s[\"score\"])) >= 60 for s in students)\n", "\n", "print(f\"Mean : {total / len(students):.1f}\")\n", "print(f\"Any fail (<70): {any_fail}\")\n", "print(f\"All pass (>=60): {all_pass}\")" ] }, { "cell_type": "markdown", "id": "37", "metadata": {}, "source": [ "
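To see the memory difference between a list and a generator expression, compare the size of the two objects. This is a small sketch using `sys.getsizeof`, which reports only the container's own size and not the elements it references:

```python
import sys

n = 100_000
as_list = [i * 2 for i in range(n)]  # all values materialised at once
as_gen = (i * 2 for i in range(n))  # values produced one at a time, on demand

print(f"list      : {sys.getsizeof(as_list):,} bytes")
print(f"generator : {sys.getsizeof(as_gen):,} bytes")

# A generator can be consumed only once
first_pass = sum(as_gen)  # consumes every value
second_pass = sum(as_gen)  # already exhausted, so nothing is left to add
print(f"first pass : {first_pass:,}")
print(f"second pass: {second_pass}")
```

The trade-off for the small footprint is that a generator is single-use: if you need the values twice, build a list instead.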
\n", " Activity 7 - Cohort Score Report

\n", "Goal: Using a single comprehension for each, produce the outputs below from records.\n", "
records = [\n",
    "    {'name': 'Alice', 'scores': [88, 92, 85]},\n",
    "    {'name': 'Bob',   'scores': [62, 70, 58]},\n",
    "    {'name': 'Carol', 'scores': [91, 95, 89]},\n",
    "]\n",
    "\n",
    "# 1. List of averages (one float per student)\n",
    "averages = [88.33, 63.33, 91.67]\n",
    "\n",
    "# 2. Dict mapping name -> average (rounded to 2 dp)\n",
    "avg_map = {'Alice': 88.33, 'Bob': 63.33, 'Carol': 91.67}\n",
    "\n",
    "# 3. Set of names of students whose average is >= 80\n",
    "top = {'Alice', 'Carol'}
\n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "38", "metadata": {}, "outputs": [], "source": [ "records: list[dict[str, object]] = [\n", " {\"name\": \"Alice\", \"scores\": [88, 92, 85]},\n", " {\"name\": \"Bob\", \"scores\": [62, 70, 58]},\n", " {\"name\": \"Carol\", \"scores\": [91, 95, 89]},\n", "]\n", "\n", "# TODO: 1. list of averages\n", "averages: list[float] = ...\n", "\n", "# TODO: 2. name -> average dict\n", "avg_map: dict[str, float] = ...\n", "\n", "# TODO: 3. set of names with average >= 80\n", "top: set[str] = ...\n", "\n", "print(f\"averages: {averages}\")\n", "print(f\"avg_map : {avg_map}\")\n", "print(f\"top : {top}\")" ] }, { "cell_type": "markdown", "id": "39", "metadata": {}, "source": [ "## Capstone: Monte Carlo Pi Estimation\n", "\n", "This activity ties together everything from Part 1 and Part 2: variables, lists, for loops, random numbers, functions, and comprehensions, to estimate the value of π using a simulation technique called **Monte Carlo integration**.\n", "\n", "### The idea\n", "\n", "Imagine a unit circle (radius = 1) inscribed in a 2×2 square. A random point `(x, y)` with `x, y ∈ [−1, 1]` falls inside the circle if `x² + y² ≤ 1`.\n", "\n", "The ratio of the circle's area to the square's area is π/4. If we throw millions of random points and count how many land inside the circle, the proportion converges to π/4, so `π ≈ 4 × (hits / total)`.\n", "\n", "```\n", " ┌──────────────┐\n", " │ · ● · │ ● inside circle → hit\n", " │ ● ● │ · outside → miss\n", " │ circle │\n", " │ ● ● │\n", " │ · ● · │\n", " └──────────────┘\n", " π/4 ≈ hits/total\n", "```\n", "\n", "This is a real technique used in finance, physics, and ML for problems that are too complex to solve analytically." 
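Written out, the area argument is as follows, with $r$ the circle's radius (here $r = 1$) and the square's side equal to $2r$:

$$
\frac{A_{\text{circle}}}{A_{\text{square}}}
  = \frac{\pi r^{2}}{(2r)^{2}}
  = \frac{\pi}{4}
\qquad\Longrightarrow\qquad
\pi \approx 4 \times \frac{\text{hits}}{\text{total}}
$$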
] }, { "cell_type": "markdown", "id": "40", "metadata": {}, "source": [ "**Step 1:** Write a helper that checks whether a point is inside the unit circle:" ] }, { "cell_type": "code", "execution_count": null, "id": "41", "metadata": {}, "outputs": [], "source": [ "import math\n", "\n", "\n", "def in_unit_circle(x: float, y: float) -> bool:\n", "    \"\"\"Return True if (x, y) lies inside or on the unit circle (radius = 1).\"\"\"\n", "    return x**2 + y**2 <= 1.0" ] }, { "cell_type": "markdown", "id": "42", "metadata": {}, "source": [ "**Step 2:** Simulate random points and count how many land inside the circle. `random.seed()` makes results reproducible, so set a seed before any simulation you want to be able to repeat:" ] }, { "cell_type": "code", "execution_count": null, "id": "43", "metadata": {}, "outputs": [], "source": [ "import random\n", "\n", "random.seed(42) # fix seed for reproducibility\n", "\n", "N_POINTS: int = 1_000_000\n", "inside: int = sum(\n", "    1\n", "    for _ in range(N_POINTS)\n", "    if in_unit_circle(random.uniform(-1, 1), random.uniform(-1, 1)) # noqa: S311\n", ")\n", "\n", "pi_estimate: float = 4 * inside / N_POINTS\n", "print(f\"Points : {N_POINTS:,}\")\n", "print(f\"Hits (inside): {inside:,}\")\n", "print(f\"pi estimate : {pi_estimate:.5f}\")\n", "print(f\"math.pi : {math.pi:.5f}\")\n", "print(f\"Error : {abs(pi_estimate - math.pi):.5f}\")" ] }, { "cell_type": "markdown", "id": "44", "metadata": {}, "source": [ "**Step 3:** See how the estimate improves as `n` grows, the law of large numbers at work:" ] }, { "cell_type": "code", "execution_count": null, "id": "45", "metadata": {}, "outputs": [], "source": [ "import math\n", "import random\n", "\n", "random.seed(0)\n", "\n", "for n in [100, 1_000, 10_000, 100_000, 1_000_000]:\n", "    hits = sum(\n", "        1\n", "        for _ in range(n)\n", "        if in_unit_circle(random.uniform(-1, 1), random.uniform(-1, 1)) # noqa: S311\n", "    )\n", "    est = 4 * hits / n\n", "    error = abs(est - math.pi)\n", "    print(f\" n={n:>9,} pi={est:.5f} 
error={error:.5f}\")" ] }, { "cell_type": "markdown", "id": "46", "metadata": {}, "source": [ "
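How fast should the error in Step 3 shrink? Each point is a hit with probability p = π/4, so the estimate `4 * hits / n` has a standard error of `4 * sqrt(p * (1 - p) / n)`. The short sketch below tabulates that theoretical value; it uses `math.pi` only to compute p for the comparison, and the variable names are chosen for this illustration:

```python
import math

p = math.pi / 4  # probability that a uniform point in the square lands in the circle

for n in [100, 1_000, 10_000, 100_000, 1_000_000]:
    # Standard error of 4 * (hits / n) when hits follows a Binomial(n, p) distribution
    std_err = 4 * math.sqrt(p * (1 - p) / n)
    print(f"  n={n:>9,}  expected error ~ {std_err:.5f}")
```

The error scales as 1/sqrt(n): each 100-fold increase in `n` buys roughly one more correct decimal digit, which is why Monte Carlo estimates converge slowly.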
\n", " What you just used

\n", "\n", "This exact pattern (sample randomly, count outcomes, estimate a ratio) appears in A/B testing, Bayesian inference, and reinforcement learning.\n", "
" ] }, { "cell_type": "markdown", "id": "47", "metadata": {}, "source": [ "## Further Reading\n", "\n", "| Resource | Why it matters |\n", "|---|---|\n", "| [PEP 636 — Structural Pattern Matching](https://peps.python.org/pep-0636/) | Official tutorial for `match`/`case`, with worked examples from the Python core team |\n", "| Ramalho, L. (2022). *Fluent Python*, 2nd ed. O'Reilly. | Chapter 10 covers pattern matching in depth, including class patterns and guards |\n", "| [Real Python — Python `for` Loops](https://realpython.com/python-for-loop/) | Clear treatment of `enumerate`, `zip`, and the iterator protocol behind every loop |\n", "| [Real Python — List Comprehensions](https://realpython.com/list-comprehension-python/) | When to use comprehensions vs explicit loops, and how to avoid making them unreadable |\n" ] }, { "cell_type": "markdown", "id": "48", "metadata": {}, "source": [ "## Summary\n", "\n", "| Concept | Key rule |\n", "|---|---|\n", "| `match`/`case` | Structural pattern matching on values, dicts, lists (3.10+) |\n", "| `enumerate` / `zip` | Always prefer these over manual index counters |\n", "| `while` / `break` / `continue` | For indefinite loops, early exit, and skipping bad data |\n", "| Comprehensions | `[expr for x in it if cond]`; use generators `(...)` inside `sum()` / `any()` / `all()` |\n", "\n", "**Next:** `03-python-patterns.ipynb`, covering functions, lambdas, `*args`/`**kwargs`, dataclasses, modules, exception handling, and file I/O with `pathlib`." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.12" } }, "nbformat": 4, "nbformat_minor": 5 }