{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "introduction",
   "metadata": {},
   "source": [
    "# arXiv Paper Curator - Week 1: Infrastructure Setup\n",
    "\n",
    "Build a production-grade RAG system using Docker, PostgreSQL, OpenSearch, FastAPI, Airflow, and Ollama.\n",
    "\n",
    "## Technology Stack\n",
    "| Component | Purpose | Port |\n",
    "|-----------|---------|------|\n",
    "| **FastAPI** | REST API | 8000 |\n",
    "| **PostgreSQL** | Paper metadata storage | 5432 |\n",
    "| **OpenSearch** | Hybrid search engine | 9200/5601 |\n",
    "| **Apache Airflow** | Workflow automation | 8080 |\n",
    "| **Ollama** | Local LLM inference | 11434 |"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "jbe3c0gz31c",
   "metadata": {},
   "source": [
    "## Learning Materials\n",
    "\n",
    "**Core Technologies:**\n",
    "- **Docker**: [Tutorial Video](https://www.youtube.com/watch?v=pg19Z8LL06w) | [Docker Compose](https://www.youtube.com/watch?v=SXwC9fSwct8)\n",
    "- **FastAPI**: [YouTube Series](https://www.youtube.com/playlist?list=PLK8U0kF0E_D6l19LhOGWhVZ3sQ6ujJKq_) | [Documentation](https://fastapi.tiangolo.com/tutorial/)\n",
    "- **PostgreSQL**: [Beginners Guide](https://www.youtube.com/watch?v=SpfIwlAYaKk) | [FastAPI + PostgreSQL](https://www.youtube.com/watch?v=398DuQbQJq0)\n",
    "- **OpenSearch**: [Getting Started](https://docs.opensearch.org/latest/getting-started/)\n",
    "- **Apache Airflow**: [Tutorial Video](https://www.youtube.com/watch?v=Y_vQyMljDsE)\n",
    "\n",
    "**Development Tools:**\n",
    "- **VS Code Setup**: [Video Guide](https://www.youtube.com/watch?v=mpk4Q5feWaw)\n",
    "- **Git Basics**: [Tutorial](https://www.youtube.com/watch?v=zTjRZNkhiEU)\n",
    "- **UV Package Manager**: [Setup Video](https://www.youtube.com/watch?v=AMdG7IjgSPM)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "prerequisites",
   "metadata": {},
   "source": [
    "## Prerequisites\n",
    "\n",
    "**Required Software:**\n",
    "- Python 3.12+ ([Download](https://www.python.org/downloads/))\n",
    "- UV Package Manager ([Install Guide](https://docs.astral.sh/uv/getting-started/installation/))\n",
    "- Docker Desktop ([Download](https://docs.docker.com/get-docker/))\n",
    "- Git ([Download](https://git-scm.com/downloads))\n",
    "\n",
    "**System Requirements:**\n",
    "- 8GB+ RAM (16GB recommended)\n",
    "- 20GB+ free disk space"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "step1-intro",
   "metadata": {},
   "source": [
    "## Setup Instructions\n",
    "\n",
    "**Before running cells:**\n",
    "1. Extract/clone project to your system\n",
    "2. Open terminal in project root (contains `compose.yml`)\n",
    "3. Run: `uv sync`\n",
    "4. Start Jupyter: `uv run jupyter notebook`\n",
    "5. Verify kernel shows project environment (.venv)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 198,
   "id": "74qckgs5icl",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Python Version: 3.12.11\n",
      "Environment: /Users/Shared/Projects/MOAI/zero_to_RAG/.venv/bin/python\n",
      "✓ Python version compatible\n"
     ]
    }
   ],
   "source": [
    "# Environment Check\n",
    "import sys\n",
    "from pathlib import Path\n",
    "\n",
    "python_version = sys.version_info\n",
    "print(f\"Python Version: {python_version.major}.{python_version.minor}.{python_version.micro}\")\n",
    "print(f\"Environment: {sys.executable}\")\n",
    "\n",
    "if python_version >= (3, 12):\n",
    "    print(\"✓ Python version compatible\")\n",
    "else:\n",
    "    print(\"✗ Need Python 3.12+\")\n",
    "    exit()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 199,
   "id": "12izzsax7tmq",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ Project root: /Users/Shared/Projects/MOAI/zero_to_RAG\n"
     ]
    }
   ],
   "source": [
    "# Find Project Root\n",
    "current_dir = Path.cwd()\n",
    "\n",
    "if current_dir.name == \"week1\" and current_dir.parent.name == \"notebooks\":\n",
    "    project_root = current_dir.parent.parent\n",
    "elif (current_dir / \"compose.yml\").exists():\n",
    "    project_root = current_dir\n",
    "else:\n",
    "    project_root = None\n",
    "\n",
    "if project_root and (project_root / \"compose.yml\").exists():\n",
    "    print(f\"✓ Project root: {project_root}\")\n",
    "else:\n",
    "    print(\"✗ Missing compose.yml - check directory\")\n",
    "    exit()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 203,
   "id": "step2-intro",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ Docker: Docker version 28.1.1, build 4eba377\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Check Docker\n",
    "import subprocess\n",
    "\n",
    "try:\n",
    "    result = subprocess.run([\"docker\", \"--version\"], capture_output=True, text=True, timeout=5)\n",
    "    if result.returncode == 0:\n",
    "        print(f\"✓ Docker: {result.stdout}\")\n",
    "    else:\n",
    "        print(\"✗ Docker: Not working\")\n",
    "        exit()\n",
    "except:\n",
    "    print(\"✗ Docker: Not found\")\n",
    "    exit()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 204,
   "id": "cue2b9j33ho",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ Docker Compose: v2.35.1-desktop.1\n"
     ]
    }
   ],
   "source": [
    "# Check Docker Compose\n",
    "try:\n",
    "    result = subprocess.run([\"docker\", \"compose\", \"version\"], capture_output=True, text=True, timeout=5)\n",
    "    if result.returncode == 0:\n",
    "        print(f\"✓ Docker Compose: {result.stdout.split()[3]}\")\n",
    "    else:\n",
    "        print(\"✗ Docker Compose: Not working\")\n",
    "        exit()\n",
    "except:\n",
    "    print(\"✗ Docker Compose: Not found\")\n",
    "    exit()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 205,
   "id": "k6oz19mcke8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ UV: uv 0.7.13 (Homebrew 2025-06-12)\n",
      "\n",
      "✓ All required software ready!\n"
     ]
    }
   ],
   "source": [
    "# Check UV Package Manager\n",
    "try:\n",
    "    result = subprocess.run([\"uv\", \"--version\"], capture_output=True, text=True, timeout=5)\n",
    "    if result.returncode == 0:\n",
    "        print(f\"✓ UV: {result.stdout.strip()}\")\n",
    "        print(\"\\n✓ All required software ready!\")\n",
    "    else:\n",
    "        print(\"✗ UV: Not working\")\n",
    "        exit()\n",
    "except:\n",
    "    print(\"✗ UV: Not found\")\n",
    "    exit()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "98bk2oh59la",
   "metadata": {},
   "source": [
    "## Start Services\n",
    "\n",
    "**Command to run (in terminal):**\n",
    "```bash\n",
    "cd [project-root]\n",
    "docker compose up -d\n",
    "```\n",
    "\n",
    "**What this does:** Downloads images (first time) and starts all services in background."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 206,
   "id": "l4vhkj6bl7h",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ Docker is running\n"
     ]
    }
   ],
   "source": [
    "# Check Docker Running\n",
    "try:\n",
    "    result = subprocess.run([\"docker\", \"info\"], capture_output=True, timeout=5)\n",
    "    if result.returncode == 0:\n",
    "        print(\"✓ Docker is running\")\n",
    "    else:\n",
    "        print(\"✗ Docker not running - start Docker Desktop\")\n",
    "        exit()\n",
    "except:\n",
    "    print(\"✗ Docker daemon not accessible\")\n",
    "    exit()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 207,
   "id": "1yuulcv2wqe",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Current containers:\n",
      "  • airflow: running\n",
      "  • api: running\n",
      "  • opensearch-dashboards: running\n",
      "  • ollama: running\n",
      "  • opensearch: running\n",
      "  • postgres: running\n"
     ]
    }
   ],
   "source": [
    "# Check Current Containers\n",
    "import json\n",
    "\n",
    "try:\n",
    "    result = subprocess.run(\n",
    "        [\"docker\", \"compose\", \"ps\", \"--format\", \"json\"],\n",
    "        cwd=str(project_root),\n",
    "        capture_output=True,\n",
    "        text=True,\n",
    "        timeout=10\n",
    "    )\n",
    "    \n",
    "    if result.returncode == 0 and result.stdout.strip():\n",
    "        print(\"Current containers:\")\n",
    "        for line in result.stdout.strip().split('\\n'):\n",
    "            if line.strip():\n",
    "                try:\n",
    "                    container = json.loads(line)\n",
    "                    service = container.get('Service', 'unknown')\n",
    "                    state = container.get('State', 'unknown')\n",
    "                    print(f\"  • {service}: {state}\")\n",
    "                except:\n",
    "                    pass\n",
    "    else:\n",
    "        print(\"No containers running\")\n",
    "        \n",
    "except Exception as e:\n",
    "    print(\"Could not check containers\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4ql8vfnm1iq",
   "metadata": {},
   "source": [
    "## Service Health Verification\n",
    "\n",
    "All services start automatically. Check their health status:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 212,
   "id": "77j1d8uyv9j",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SERVICE STATUS\n",
      "======================================================================\n",
      "Service              State           Status          Notes\n",
      "----------------------------------------------------------------------\n",
      "✓ airflow            running        healthy        Ready\n",
      "✓ api                running        healthy        Ready\n",
      "✓ opensearch-dashboards running        healthy        Ready\n",
      "✓ ollama             running        healthy        Ready\n",
      "✓ opensearch         running        healthy        Ready\n",
      "✓ postgres           running        healthy        Ready\n"
     ]
    }
   ],
   "source": [
    "# Service Health Check\n",
    "EXPECTED_SERVICES = {\n",
    "    'api': 'FastAPI REST API server',\n",
    "    'postgres': 'PostgreSQL database',\n",
    "    'opensearch': 'OpenSearch search engine', \n",
    "    'opensearch-dashboards': 'OpenSearch web dashboard',\n",
    "    'ollama': 'Local LLM inference server',\n",
    "    'airflow': 'Workflow automation (optional - may be off)'\n",
    "}\n",
    "\n",
    "try:\n",
    "    result = subprocess.run(\n",
    "        [\"docker\", \"compose\", \"ps\", \"--format\", \"json\"],\n",
    "        cwd=str(project_root),\n",
    "        capture_output=True,\n",
    "        text=True,\n",
    "        timeout=15\n",
    "    )\n",
    "    \n",
    "    if result.returncode == 0:\n",
    "        print(\"SERVICE STATUS\")\n",
    "        print(\"=\" * 70)\n",
    "        print(f\"{'Service':<20} {'State':<15} {'Status':<15} {'Notes'}\")\n",
    "        print(\"-\" * 70)\n",
    "    else:\n",
    "        print(\"Could not get service status\")\n",
    "        exit()\n",
    "        \n",
    "except Exception as e:\n",
    "    print(f\"Error checking services: {e}\")\n",
    "    exit()\n",
    "\n",
    "# Parse Service Status\n",
    "found_services = set()\n",
    "service_states = {}\n",
    "\n",
    "if result.stdout.strip():\n",
    "    for line in result.stdout.strip().split('\\n'):\n",
    "        if line.strip():\n",
    "            try:\n",
    "                container = json.loads(line)\n",
    "                service = container.get('Service', 'unknown')\n",
    "                state = container.get('State', 'unknown')\n",
    "                health = container.get('Health', 'no check')\n",
    "                \n",
    "                found_services.add(service)\n",
    "                service_states[service] = {'state': state, 'health': health}\n",
    "                \n",
    "                if state == 'running' and health in ['healthy', 'no check']:\n",
    "                    indicator = \"✓\"\n",
    "                    notes = \"Ready\"\n",
    "                elif state == 'running' and health == 'unhealthy':\n",
    "                    indicator = \"⚠\"\n",
    "                    notes = \"Starting up...\"\n",
    "                elif state == 'exited':\n",
    "                    indicator = \"✗\"\n",
    "                    notes = \"Failed to start\"\n",
    "                else:\n",
    "                    indicator = \"?\"\n",
    "                    notes = f\"Status: {state}\"\n",
    "                \n",
    "                print(f\"{indicator} {service:<18} {state:<14} {health:<14} {notes}\")\n",
    "                \n",
    "            except json.JSONDecodeError:\n",
    "                pass"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 213,
   "id": "393qfrwg7h",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Check Missing Services\n",
    "missing_services = set(EXPECTED_SERVICES.keys()) - found_services\n",
    "\n",
    "if missing_services:\n",
    "    print(\"\\nMISSING SERVICES:\")\n",
    "    print(\"-\" * 70)\n",
    "    for service in missing_services:\n",
    "        description = EXPECTED_SERVICES[service]\n",
    "        if service == 'airflow':\n",
    "            print(f\"⚠ {service:<18} not running    {'(Optional)':<14} {description}\")\n",
    "        else:\n",
    "            print(f\"✗ {service:<18} not running    {'Required':<14} {description}\")\n",
    "\n",
    "failed_services = [s for s, info in service_states.items() \n",
    "                  if info['state'] in ['exited', 'restarting'] or info['health'] == 'unhealthy']\n",
    "\n",
    "if failed_services:\n",
    "    print(f\"\\nTROUBLESHOOTING:\")\n",
    "    for service in failed_services:\n",
    "        print(f\"   docker compose logs {service}\")\n",
    "elif missing_services and 'airflow' not in missing_services:\n",
    "    print(f\"\\nACTION NEEDED:\")\n",
    "    print(\"Start missing services: docker compose up -d\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3646795",
   "metadata": {},
   "source": [
    "### 1. FastAPI - REST API Service\n",
    "\n",
    "**Interactive Exploration:**\n",
    "\n",
    "You can explore and test the FastAPI service in several ways:\n",
    "- **API Documentation**: http://localhost:8000/docs (Interactive Swagger UI)\n",
    "- **Alternative Docs**: http://localhost:8000/redoc (ReDoc interface)\n",
    "- **Source Code**: Located in `src/routers/` directory\n",
    "\n",
    "Let's test the API endpoints and explore the documentation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 214,
   "id": "rhnz43uolf",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ FastAPI is responding\n",
      "Status: ok\n"
     ]
    }
   ],
   "source": [
    "# Test FastAPI Health\n",
    "import requests\n",
    "\n",
    "try:\n",
    "    response = requests.get(\"http://localhost:8000/api/v1/health\", timeout=5)\n",
    "    if response.status_code == 200:\n",
    "        data = response.json()\n",
    "        print(\"✓ FastAPI is responding\")\n",
    "        print(f\"Status: {data.get('status', 'unknown')}\")\n",
    "    else:\n",
    "        print(f\"⚠ API returned status: {response.status_code}\")\n",
    "except requests.exceptions.ConnectionError:\n",
    "    print(\"✗ API not responding - wait 1-2 minutes\")\n",
    "except Exception as e:\n",
    "    print(f\"✗ API test error: {e}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 215,
   "id": "vt3k6opg0e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "============================================================\n",
      "  PRODUCTION INSIGHT (Online Sessions Only)\n",
      "============================================================\n",
      "❓ How are they scaled?\n",
      "❓ What are the bottlenecks?\n",
      "❓ How are they monitored and managed?\n",
      "❓ How are they integrated with other systems?\n",
      "❓ What are the best practices for using these systems?\n",
      "❓ How are these systems used and deployed in production?\n",
      "❓ How are they tested? in terms of load and performance?\n",
      "→ Learn these production secrets in our online walkthrough sessions!\n",
      "============================================================\n"
     ]
    }
   ],
   "source": [
    "# PRODUCTION INSIGHTS\n",
    "print(\"\\n\" + \"=\"*60)\n",
    "print(\"  PRODUCTION INSIGHT (Online Sessions Only)\")\n",
    "print(\"=\"*60)\n",
    "print(\"❓ How are they scaled?\")\n",
    "print(\"❓ What are the bottlenecks?\")\n",
    "print(\"❓ How are they monitored and managed?\")\n",
    "print(\"❓ How are they integrated with other systems?\")\n",
    "print(\"❓ What are the best practices for using these systems?\")\n",
    "print(\"❓ How are these systems used and deployed in production?\")\n",
    "print(\"❓ How are they tested? in terms of load and performance?\")\n",
    "print(\"→ Learn these production secrets in our online walkthrough sessions!\")\n",
    "print(\"=\"*60)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0n2lfch6fjzn",
   "metadata": {},
   "source": [
    "### 2. Apache Airflow - Workflow Automation\n",
    "\n",
    "**Interactive Exploration:**\n",
    "\n",
    "Apache Airflow manages data pipelines and automated workflows. You can explore it through:\n",
    "- **Web Dashboard**: http://localhost:8080 \n",
    "- **Login**: Username: `admin`, Password: Found in container (see test below)\n",
    "- **Source Code**: Located in `airflow/dags/` directory\n",
    "\n",
    "**Simple Password Location:**\n",
    "Airflow 3.0 stores the admin password in a predictable file:\n",
    "```\n",
    "/opt/airflow/simple_auth_manager_passwords.json.generated\n",
    "```\n",
    "\n",
    "The test below automatically reads this file for you!\n",
    "\n",
    "Let's test Airflow and get the password:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 216,
   "id": "7uu7h40rutn",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ Airflow password: sBtDW9ffYBgETMqR\n"
     ]
    }
   ],
   "source": [
    "# Get Airflow Password\n",
    "import json\n",
    "from pathlib import Path\n",
    "\n",
    "password_file = project_root / \"airflow\" / \"simple_auth_manager_passwords.json.generated\"\n",
    "\n",
    "try:\n",
    "    if password_file.exists():\n",
    "        with open(password_file, 'r') as f:\n",
    "            data = json.load(f)\n",
    "            password = data.get(\"admin\")\n",
    "        print(f\"✓ Airflow password: {password}\")\n",
    "    else:\n",
    "        print(f\"⚠ Password file not found\")\n",
    "        password = None\n",
    "except Exception as e:\n",
    "    print(f\"✗ Could not read password: {e}\")\n",
    "    password = None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 217,
   "id": "950853cd",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ Airflow is healthy\n",
      "\n",
      "Airflow Login:\n",
      "URL: http://localhost:8080\n",
      "Username: admin\n",
      "Password: sBtDW9ffYBgETMqR\n"
     ]
    }
   ],
   "source": [
    "# Test Airflow Health\n",
    "try:\n",
    "    response = requests.get(\"http://localhost:8080/api/v1/health\", timeout=5)\n",
    "    if response.status_code == 200:\n",
    "        print(\"✓ Airflow is healthy\")\n",
    "        \n",
    "        if password:\n",
    "            print(f\"\\nAirflow Login:\")\n",
    "            print(f\"URL: http://localhost:8080\")\n",
    "            print(f\"Username: admin\")\n",
    "            print(f\"Password: {password}\")\n",
    "    else:\n",
    "        print(f\"⚠ Airflow returned: {response.status_code}\")\n",
    "        \n",
    "except requests.exceptions.ConnectionError:\n",
    "    print(\"✗ Airflow not responding - wait 2-3 minutes\")\n",
    "except Exception as e:\n",
    "    print(f\"✗ Airflow test error: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "45pe694qafu",
   "metadata": {},
   "source": [
    "### 3. OpenSearch - Hybrid database\n",
    "\n",
    "**Interactive Exploration:**\n",
    "\n",
    "OpenSearch provides full-text search and analytics capabilities:\n",
    "- **API Endpoint**: http://localhost:9200 \n",
    "- **Dashboards UI**: http://localhost:5601 (Web interface)\n",
    "- **Source Code**: Located in `src/services/opensearch/` directory\n",
    "\n",
    "**Important for Students:** \n",
    "- ✅ Use http://localhost:5601 for web interface\n",
    "- ✅ Use Dev Tools in Dashboards for API queries\n",
    "\n",
    "Let's test OpenSearch and explore its capabilities:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 218,
   "id": "lie8ph4ilsb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ OpenSearch Dashboards is accessible!\n",
      "✓ Web interface is ready for exploration\n",
      "\n",
      " Web Interface Access:\n",
      "========================================\n",
      "Main Dashboard: http://localhost:5601\n",
      "Dev Tools: http://localhost:5601/app/dev_tools\n",
      "========================================\n",
      "\n",
      " Student Learning Activities:\n",
      "1. Explore the Dashboard:\n",
      "   • Visit http://localhost:5601\n",
      "   • Navigate through the interface\n",
      "   • Check out the 'Discover' tab\n",
      "\n",
      "2. Use Dev Tools for API Queries:\n",
      "   • Go to Dev Tools\n",
      "   • Try: GET /_cluster/health\n",
      "   • Try: GET /_cat/indices?v\n",
      "   • Try: GET /_cluster/stats\n",
      "   • Check the learning material for more information\n"
     ]
    }
   ],
   "source": [
    "# Test 1: Check OpenSearch Dashboards Web Interface\n",
    "# This is the proper way for students to interact with OpenSearch\n",
    "\n",
    "dashboards_url = \"http://localhost:5601\"\n",
    "\n",
    "try:\n",
    "    # Test if Dashboards is accessible\n",
    "    response = requests.get(f\"{dashboards_url}/api/status\", timeout=10, allow_redirects=True)\n",
    "    if response.status_code == 200:\n",
    "        print(\"✓ OpenSearch Dashboards is accessible!\")\n",
    "        print(\"✓ Web interface is ready for exploration\")\n",
    "        \n",
    "        print(\"\\n Web Interface Access:\")\n",
    "        print(\"=\" * 40)\n",
    "        print(f\"Main Dashboard: {dashboards_url}\")\n",
    "        print(f\"Dev Tools: {dashboards_url}/app/dev_tools\")\n",
    "        print(\"=\" * 40)\n",
    "        \n",
    "        print(\"\\n Student Learning Activities:\")\n",
    "        print(\"1. Explore the Dashboard:\")\n",
    "        print(\"   • Visit http://localhost:5601\")\n",
    "        print(\"   • Navigate through the interface\")\n",
    "        print(\"   • Check out the 'Discover' tab\")\n",
    "        \n",
    "        print(\"\\n2. Use Dev Tools for API Queries:\")\n",
    "        print(\"   • Go to Dev Tools\")\n",
    "        print(\"   • Try: GET /_cluster/health\")\n",
    "        print(\"   • Try: GET /_cat/indices?v\")\n",
    "        print(\"   • Try: GET /_cluster/stats\")\n",
    "        print(\"   • Check the learning material for more information\")\n",
    "        \n",
    "    else:\n",
    "        print(f\"⚠ Dashboards returned status: {response.status_code}\")\n",
    "        print(\"Interface may still be starting up\")\n",
    "        \n",
    "except requests.exceptions.ConnectionError:\n",
    "    print(\"✗ OpenSearch Dashboards not accessible yet\")\n",
    "    print(\"Wait 2-3 minutes for full startup\")\n",
    "    \n",
    "except requests.exceptions.Timeout:\n",
    "    print(\"⚠ Dashboards request timed out\")\n",
    "    print(\"This is normal during startup - try again in a few minutes\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"✗ Error accessing Dashboards: {e}\")\n",
    "    print(\"Check container status: docker compose ps\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 219,
   "id": "dnzxw9vmo6v",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "============================================================\n",
      "🎯 PRODUCTION INSIGHT (Online Sessions Only)\n",
      "============================================================\n",
      "❓ Why companies use OpenSearch?\n",
      "❓ What all is achievable with OpenSearch?\n",
      "❓ How does OpenSearch handle billions of documents?\n",
      "❓ How do companies search through billions of documents?\n",
      "❓ How do e-commerce giants search millions of products instantly?\n",
      "→ Learn these production secrets in our online walkthrough sessions!\n",
      "============================================================\n"
     ]
    }
   ],
   "source": [
    "# PRODUCTION DEPLOYMENT INSIGHT\n",
    "print(\"\\n\" + \"=\"*60)\n",
    "print(\"🎯 PRODUCTION INSIGHT (Online Sessions Only)\")\n",
    "print(\"=\"*60)\n",
    "print(\"❓ Why companies use OpenSearch?\")\n",
    "print(\"❓ What all is achievable with OpenSearch?\")\n",
    "print(\"❓ How does OpenSearch handle billions of documents?\")\n",
    "print(\"❓ How do companies search through billions of documents?\")\n",
    "print(\"❓ How do e-commerce giants search millions of products instantly?\")\n",
    "print(\"→ Learn these production secrets in our online walkthrough sessions!\")\n",
    "print(\"=\"*60)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "my7ctahs1bi",
   "metadata": {},
   "source": [
    "### 4. Ollama - Local LLM Inference Engine\n",
    "\n",
    "**Interactive Exploration:**\n",
    "\n",
    "Ollama runs large language models locally on your machine:\n",
    "- **API Endpoint**: http://localhost:11434\n",
    "- **Command Line**: Available inside the container\n",
    "- **Privacy**: All AI processing happens locally (no external APIs)\n",
    "\n",
    "Let's test Ollama and see what models are available:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 220,
   "id": "olb9gxpkomk",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ Ollama is running!\n",
      "Available models: 1\n",
      "\n",
      "Installed Models:\n",
      "  • llama3.2:1b (1.2 GB)\n",
      "\n",
      "  Try This Later (Week 4):\n",
      "1. docker exec -it rag-ollama ollama pull llama3.2\n",
      "2. docker exec -it rag-ollama ollama list\n",
      "3. docker exec -it rag-ollama ollama run llama3.2\n"
     ]
    }
   ],
   "source": [
    "# Test 1: Check Ollama Service Status\n",
    "# Let's see if Ollama is running and what models are available\n",
    "\n",
    "import requests\n",
    "import json\n",
    "\n",
    "ollama_url = \"http://localhost:11434/api/tags\"\n",
    "\n",
    "try:\n",
    "    response = requests.get(ollama_url, timeout=5)\n",
    "    if response.status_code == 200:\n",
    "        models_data = response.json()\n",
    "        models = models_data.get('models', [])\n",
    "        \n",
    "        print(\"✓ Ollama is running!\")\n",
    "        print(f\"Available models: {len(models)}\")\n",
    "        \n",
    "        if models:\n",
    "            print(\"\\nInstalled Models:\")\n",
    "            for model in models:\n",
    "                name = model.get('name', 'unknown')\n",
    "                size = model.get('size', 0)\n",
    "                size_gb = round(size / (1024**3), 1)\n",
    "                print(f\"  • {name} ({size_gb} GB)\")\n",
    "        else:\n",
    "            print(\"\\n  No models installed yet\")\n",
    "            print(\"   This is normal - models are large files (3-7 GB each)\")\n",
    "            print(\"   In Week 4, we'll install a model like llama3.2\")\n",
    "            \n",
    "        print(\"\\n  Try This Later (Week 4):\")\n",
    "        print(\"1. docker exec -it rag-ollama ollama pull llama3.2\")\n",
    "        print(\"2. docker exec -it rag-ollama ollama list\")\n",
    "        print(\"3. docker exec -it rag-ollama ollama run llama3.2\")\n",
    "        \n",
    "    else:\n",
    "        print(f\"⚠ Ollama returned status: {response.status_code}\")\n",
    "        \n",
    "except requests.exceptions.ConnectionError:\n",
    "    print(\"✗ Ollama is not responding yet\")\n",
    "    print(\"Ollama service might still be starting\")\n",
    "    \n",
    "except requests.exceptions.Timeout:\n",
    "    print(\"✗ Ollama request timed out\")\n",
    "    print(\"Service might still be initializing\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"✗ Unexpected error testing Ollama: {e}\")\n",
    "    print(\"Try again in a few minutes\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 221,
   "id": "rgr25nmpdx",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ Ollama API is healthy!\n",
      "Version: 0.11.2\n",
      "\n",
      "  What is Ollama?\n",
      "• Runs AI models completely on your local machine\n",
      "• No data sent to external services (privacy-first)\n",
      "• No API fees or rate limits\n",
      "• Supports models like Llama, Mistral, Phi, etc.\n",
      "\n",
      "  Coming in Week 4:\n",
      "• Install and run a local language model\n",
      "• Generate answers to research questions\n",
      "• Summarize academic papers\n",
      "• All processing stays on your computer!\n"
     ]
    }
   ],
   "source": [
    "# Test 2: Check Ollama Version and Health\n",
    "# Let's verify Ollama is properly configured\n",
    "\n",
    "import requests\n",
    "import json\n",
    "\n",
    "ollama_version_url = \"http://localhost:11434/api/version\"\n",
    "\n",
    "try:\n",
    "    response = requests.get(ollama_version_url, timeout=5)\n",
    "    if response.status_code == 200:\n",
    "        version_data = response.json()\n",
    "        version = version_data.get('version', 'unknown')\n",
    "        \n",
    "        print(\"✓ Ollama API is healthy!\")\n",
    "        print(f\"Version: {version}\")\n",
    "        \n",
    "        print(\"\\n  What is Ollama?\")\n",
    "        print(\"• Runs AI models completely on your local machine\")\n",
    "        print(\"• No data sent to external services (privacy-first)\")\n",
    "        print(\"• No API fees or rate limits\")\n",
    "        print(\"• Supports models like Llama, Mistral, Phi, etc.\")\n",
    "        \n",
    "        print(\"\\n  Coming in Week 4:\")\n",
    "        print(\"• Install and run a local language model\")\n",
    "        print(\"• Generate answers to research questions\")\n",
    "        print(\"• Summarize academic papers\")\n",
    "        print(\"• All processing stays on your computer!\")\n",
    "        \n",
    "    else:\n",
    "        print(f\"⚠ Ollama version check returned: {response.status_code}\")\n",
    "        \n",
    "except requests.exceptions.ConnectionError:\n",
    "    print(\"✗ Could not check Ollama version\")\n",
    "    print(\"Service might still be starting up\")\n",
    "    \n",
    "except requests.exceptions.Timeout:\n",
    "    print(\"✗ Ollama request timed out\")\n",
    "    print(\"Service might still be initializing\")\n",
    "    \n",
    "except Exception as e:\n",
    "    print(f\"✗ Unexpected error checking version: {e}\")\n",
    "    print(\"Try again in a few minutes\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 222,
   "id": "xofrh13d2rj",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "============================================================\n",
      "🎯 PRODUCTION INSIGHT (Online Sessions Only)\n",
      "============================================================\n",
      "❓ What are the real issues with LLMs when in production?\n",
      "❓ What is the difference between fine-tuned LLM and RAG?\n",
      "❓ How do companies serve LLMs without burning through cash?\n",
      "→ Learn these production secrets in our online walkthrough sessions!\n",
      "============================================================\n"
     ]
    }
   ],
   "source": [
    "# PRODUCTION DEPLOYMENT INSIGHT\n",
    "print(\"\\n\" + \"=\"*60)\n",
    "print(\"🎯 PRODUCTION INSIGHT (Online Sessions Only)\")\n",
    "print(\"=\"*60)\n",
    "print(\"❓ What are the real issues with LLMs when in production?\")\n",
    "print(\"❓ What is the difference between fine-tuned LLM and RAG?\")\n",
    "print(\"❓ How do companies serve LLMs without burning through cash?\")\n",
    "print(\"→ Learn these production secrets in our online walkthrough sessions!\")\n",
    "print(\"=\"*60)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 223,
   "id": "1w3j46f69ge",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DOWNLOADING LLAMA 3.2:1B MODEL\n",
      "==================================================\n",
      "This is a small 1.3GB model - perfect for testing!\n",
      "Download will take 2-5 minutes depending on your internet speed...\n",
      "Llama 3.2:1b model downloaded successfully!\n"
     ]
    }
   ],
   "source": [
    "# HANDS-ON: Pull and Test Llama 3.2 (Small Model)\n",
    "\n",
    "import requests\n",
    "import subprocess\n",
    "import time\n",
    "\n",
    "print(\"DOWNLOADING LLAMA 3.2:1B MODEL\")\n",
    "print(\"=\" * 50)\n",
    "print(\"This is a small 1.3GB model - perfect for testing!\")\n",
    "print(\"Download will take 2-5 minutes depending on your internet speed...\")\n",
    "\n",
    "try:\n",
    "    result = subprocess.run(\n",
    "        [\"docker\", \"exec\", \"rag-ollama\", \"ollama\", \"pull\", \"llama3.2:1b\"],\n",
    "        cwd=str(project_root),\n",
    "        capture_output=True,\n",
    "        text=True,\n",
    "        timeout=600\n",
    "    )\n",
    "    \n",
    "    if result.returncode == 0:\n",
    "        print(\"Llama 3.2:1b model downloaded successfully!\")\n",
    "    else:\n",
    "        print(f\"Download issue: {result.stderr}\")\n",
    "        \n",
    "except subprocess.TimeoutExpired:\n",
    "    print(\"Download timed out - this is normal for slow connections\")\n",
    "    print(\"The download continues in the background\")\n",
    "except Exception as e:\n",
    "    print(f\"Error downloading model: {e}\")\n",
    "    print(\"Make sure Ollama container is running: docker compose ps\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 224,
   "id": "4qa8r07k01v",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Testing llama3.2:1b with prompt: 'What is machine learning in one sentence?'\n",
      "------------------------------------------------------------\n",
      "Generating response (this may take 10-30 seconds)...\n",
      "Response generated in 2.9 seconds\n",
      "\n",
      "RESPONSE:\n",
      "========================================\n",
      "Machine learning is a subfield of artificial intelligence that enables computers to learn from data, make predictions or decisions without being explicitly programmed, by analyzing patterns and relationships within the data.\n",
      "========================================\n",
      "\n",
      "Model: llama3.2:1b\n",
      "Generation time: 2929ms\n",
      "\n",
      "SUCCESS! Your local AI model is working!\n",
      "\n",
      "Try more prompts:\n",
      "• test_ollama_model(\"llama3.2:1b\", \"Explain neural networks simply\")\n",
      "• test_ollama_model(\"llama3.2:1b\", \"Write a Python function to sort a list\")\n"
     ]
    }
   ],
   "source": [
    "# Test Llama 3.2:1b API\n",
    "\n",
    "def test_ollama_model(model_name, prompt, max_wait_time=60):\n",
    "    \"\"\"Test an Ollama model with a prompt.\"\"\"\n",
    "    print(f\"Testing {model_name} with prompt: '{prompt}'\")\n",
    "    print(\"-\" * 60)\n",
    "    \n",
    "    url = \"http://localhost:11434/api/generate\"\n",
    "    data = {\n",
    "        \"model\": model_name,\n",
    "        \"prompt\": prompt,\n",
    "        \"stream\": False\n",
    "    }\n",
    "    \n",
    "    try:\n",
    "        print(\"Generating response (this may take 10-30 seconds)...\")\n",
    "        start_time = time.time()\n",
    "        \n",
    "        response = requests.post(url, json=data, timeout=max_wait_time)\n",
    "        \n",
    "        if response.status_code == 200:\n",
    "            result = response.json()\n",
    "            response_text = result.get('response', '').strip()\n",
    "            \n",
    "            elapsed_time = time.time() - start_time\n",
    "            print(f\"Response generated in {elapsed_time:.1f} seconds\")\n",
    "            print(\"\\nRESPONSE:\")\n",
    "            print(\"=\" * 40)\n",
    "            print(response_text)\n",
    "            print(\"=\" * 40)\n",
    "            \n",
    "            if 'model' in result:\n",
    "                print(f\"\\nModel: {result['model']}\")\n",
    "            if 'total_duration' in result:\n",
    "                duration_ms = result['total_duration'] / 1000000\n",
    "                print(f\"Generation time: {duration_ms:.0f}ms\")\n",
    "                \n",
    "            return True\n",
    "            \n",
    "        else:\n",
    "            print(f\"API error: {response.status_code}\")\n",
    "            print(f\"Response: {response.text}\")\n",
    "            return False\n",
    "            \n",
    "    except requests.exceptions.ConnectionError:\n",
    "        print(\"Could not connect to Ollama API\")\n",
    "        print(\"Make sure Ollama is running: docker compose ps\")\n",
    "        return False\n",
    "    except requests.exceptions.Timeout:\n",
    "        print(\"Request timed out\")\n",
    "        print(\"Model might be loading for the first time (this is normal)\")\n",
    "        return False\n",
    "    except Exception as e:\n",
    "        print(f\"Unexpected error: {e}\")\n",
    "        return False\n",
    "\n",
    "test_prompt = \"What is machine learning in one sentence?\"\n",
    "success = test_ollama_model(\"llama3.2:1b\", test_prompt)\n",
    "\n",
    "if success:\n",
    "    print(\"\\nSUCCESS! Your local AI model is working!\")\n",
    "    print(\"\\nTry more prompts:\")\n",
    "    print('• test_ollama_model(\"llama3.2:1b\", \"Explain neural networks simply\")')\n",
    "    print('• test_ollama_model(\"llama3.2:1b\", \"Write a Python function to sort a list\")')\n",
    "else:\n",
    "    print(\"\\nTroubleshooting:\")\n",
    "    print(\"1. Make sure model downloaded: docker exec rag-ollama ollama list\")\n",
    "    print(\"2. Check Ollama logs: docker compose logs ollama\")\n",
    "    print(\"3. Try again - first run takes longer to load model into memory\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bh1ic7xxxrq",
   "metadata": {},
   "source": [
    "### 5. PostgreSQL - Database Storage\n",
    "\n",
    "**Interactive Exploration:**\n",
    "\n",
    "PostgreSQL stores all structured data for our application:\n",
    "- **Connection**: localhost:5432\n",
    "- **Database**: rag_db\n",
    "- **Username/Password**: rag_user / rag_password\n",
    "- **GUI Tool Recommendation**: DBeaver (free database client)\n",
    "\n",
    "Let's test the database connection and explore the schema:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 225,
   "id": "qjyjq3s023m",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ PostgreSQL is accepting connections on port 5432!\n",
      "\n",
      "  Database Connection Details:\n",
      "• Host: localhost\n",
      "• Port: 5432\n",
      "• Database: rag_db\n",
      "• Username: rag_user\n",
      "• Password: rag_password\n",
      "\n",
      "  Recommended GUI Tools:\n",
      "• DBeaver (Free): https://dbeaver.io/download/\n",
      "• pgAdmin: https://www.pgadmin.org/download/\n"
     ]
    }
   ],
   "source": [
    "# Test 1: Check PostgreSQL Connection (Basic)\n",
    "# Let's verify PostgreSQL is accepting connections\n",
    "\n",
    "def test_postgres_connection():\n",
    "    \"\"\"Test PostgreSQL connection using simple socket check.\"\"\"\n",
    "    import socket\n",
    "    \n",
    "    try:\n",
    "        # Test if PostgreSQL port is open\n",
    "        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n",
    "        sock.settimeout(3)\n",
    "        result = sock.connect_ex(('localhost', 5432))\n",
    "        sock.close()\n",
    "        \n",
    "        if result == 0:\n",
    "            print(\"✓ PostgreSQL is accepting connections on port 5432!\")\n",
    "            return True\n",
    "        else:\n",
    "            print(\"✗ PostgreSQL port is not accessible\")\n",
    "            return False\n",
    "            \n",
    "    except Exception as e:\n",
    "        print(f\"✗ Could not test PostgreSQL: {e}\")\n",
    "        return False\n",
    "\n",
    "postgres_available = test_postgres_connection()\n",
    "\n",
    "if postgres_available:\n",
    "    print(\"\\n  Database Connection Details:\")\n",
    "    print(\"• Host: localhost\")\n",
    "    print(\"• Port: 5432\") \n",
    "    print(\"• Database: rag_db\")\n",
    "    print(\"• Username: rag_user\")\n",
    "    print(\"• Password: rag_password\")\n",
    "    \n",
    "    print(\"\\n  Recommended GUI Tools:\")\n",
    "    print(\"• DBeaver (Free): https://dbeaver.io/download/\")\n",
    "    print(\"• pgAdmin: https://www.pgadmin.org/download/\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 226,
   "id": "ok33sipeapa",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "✓ PostgreSQL connected\n"
     ]
    }
   ],
   "source": [
    "# Test PostgreSQL Connection\n",
    "try:\n",
    "    import psycopg2\n",
    "    \n",
    "    conn = psycopg2.connect(\n",
    "        host=\"localhost\",\n",
    "        port=5432,\n",
    "        database=\"rag_db\", \n",
    "        user=\"rag_user\",\n",
    "        password=\"rag_password\"\n",
    "    )\n",
    "    \n",
    "    print(\"✓ PostgreSQL connected\")\n",
    "    cursor = conn.cursor()\n",
    "    \n",
    "except ImportError:\n",
    "    print(\"⚠ psycopg2 not installed - basic connection only\")\n",
    "    exit()\n",
    "except Exception as e:\n",
    "    print(f\"✗ Database connection failed: {e}\")\n",
    "    exit()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 227,
   "id": "db723wrw0x",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Found 46 total tables\n",
      "Application tables: 1\n",
      "Airflow tables: 45\n",
      "  • papers\n"
     ]
    }
   ],
   "source": [
    "# Check Database Tables\n",
    "cursor.execute(\"\"\"\n",
    "    SELECT table_name \n",
    "    FROM information_schema.tables \n",
    "    WHERE table_schema = 'public'\n",
    "    ORDER BY table_name;\n",
    "\"\"\")\n",
    "\n",
    "all_tables = cursor.fetchall()\n",
    "\n",
    "app_tables = []\n",
    "airflow_tables = []\n",
    "\n",
    "for (table_name,) in all_tables:\n",
    "    if table_name in ['papers', 'users', 'embeddings']:\n",
    "        app_tables.append(table_name)\n",
    "    else:\n",
    "        airflow_tables.append(table_name)\n",
    "\n",
    "print(f\"Found {len(all_tables)} total tables\")\n",
    "print(f\"Application tables: {len(app_tables)}\")\n",
    "print(f\"Airflow tables: {len(airflow_tables)}\")\n",
    "\n",
    "for table in app_tables:\n",
    "    print(f\"  • {table}\")\n",
    "\n",
    "if not app_tables:\n",
    "    print(\"  No application tables yet (expected in Week 1)\")\n",
    "    \n",
    "cursor.close()\n",
    "conn.close()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 228,
   "id": "uu7g2qsxjun",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "============================================================\n",
      "🎯 PRODUCTION INSIGHT (Online Sessions Only)\n",
      "============================================================\n",
      "❓ How do companies handle millions of transactions with PostgreSQL?\n",
      "❓ What's the secret to zero-downtime database migrations?\n",
      "→ Learn these production secrets in our online walkthrough sessions!\n",
      "============================================================\n"
     ]
    }
   ],
   "source": [
    "# PRODUCTION DEPLOYMENT INSIGHT\n",
    "print(\"\\n\" + \"=\"*60)\n",
    "print(\"🎯 PRODUCTION INSIGHT (Online Sessions Only)\")\n",
    "print(\"=\"*60)\n",
    "print(\"❓ How do companies handle millions of transactions with PostgreSQL?\")\n",
    "print(\"❓ What's the secret to zero-downtime database migrations?\")\n",
    "print(\"→ Learn these production secrets in our online walkthrough sessions!\")\n",
    "print(\"=\"*60)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "r9r8vebw5vl",
   "metadata": {},
   "source": [
    "### Service Health Summary and Next Steps\n",
    "\n",
    "Based on the interactive tests above:\n",
    "\n",
    "**If all services show ✓**: \n",
    "- 🎉 Congratulations! Your infrastructure is ready\n",
    "- All services are healthy and responding correctly\n",
    "- You can explore each service using the links and instructions provided\n",
    "\n",
    "**If some services show ✗**:\n",
    "- Don't worry! Services take time to start\n",
    "- Wait 2-3 minutes and re-run the test cells\n",
    "- OpenSearch and Airflow take the longest (up to 5 minutes)\n",
    "\n",
    "**Service Access Points:**\n",
    "- **FastAPI Documentation**: http://localhost:8000/docs - Interactive API testing\n",
    "- **Airflow Dashboard**: http://localhost:8080 (admin/admin) - Workflow management\n",
    "- **OpenSearch Dashboards**: http://localhost:5601 - Dashboard and user interface + analytics\n",
    "- **OpenSearch API**: http://localhost:9200 - Direct API access\n",
    "- **Ollama API**: http://localhost:11434 - Local LLM inference\n",
    "- **PostgreSQL**: http://localhost:5432 - Use DBeaver or similar tools\n",
    "\n",
    "**Hands-On Learning Activities:**\n",
    "\n",
    "1. **FastAPI**: Test endpoints in the interactive documentation\n",
    "2. **Airflow**: Login and trigger a DAG manually  \n",
    "3. **OpenSearch**: Try queries in the Dev Tools\n",
    "4. **Ollama**: Prepare for Week 6 model installation\n",
    "5. **PostgreSQL**: Install DBeaver and explore the database structure\n",
    "\n",
    "**Common Issues:**\n",
    "- \"Connection refused\" → Service still starting\n",
    "- \"Port in use\" → Another application using the port  \n",
    "- Container restarting → Check logs with `docker compose logs [service-name]`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "vv0a9iu3rk",
   "metadata": {},
   "source": [
    "## Troubleshooting\n",
    "\n",
    "**Common Issues:**\n",
    "- **Connection refused** → Service still starting (wait 2-3 minutes)\n",
    "- **Port in use** → Stop conflicting application or change ports\n",
    "- **Container restarting** → Check logs: `docker compose logs [service-name]`\n",
    "- **Out of memory** → Increase Docker Desktop memory allocation\n",
    "\n",
    "**Reset everything:** `docker compose down && docker compose up -d`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1y3f29mcuhw",
   "metadata": {},
   "source": [
    "## Week 1 Complete\n",
    "\n",
    "**Service Access Points:**\n",
    "- **API**: http://localhost:8000/docs\n",
    "- **Airflow**: http://localhost:8080 (admin/sBtDW9ffYBgETMqR)  \n",
    "- **OpenSearch**: http://localhost:5601\n",
    "- **PostgreSQL**: localhost:5432 (rag_user/rag_password)\n",
    "\n",
    "**Success Criteria:**\n",
    "- [ ] All services healthy in status check\n",
    "- [ ] API documentation accessible\n",
    "- [ ] Airflow dashboard loads\n",
    "- [ ] OpenSearch interface works\n",
    "\n",
    "**Next:** Keep services running or restart with `docker compose up -d`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "s58tqz40v58",
   "metadata": {},
   "source": [
    "## Project Commands\n",
    "\n",
    "**Makefile shortcuts:**\n",
    "```bash\n",
    "make start    # Start all services  \n",
    "make status   # Check service status\n",
    "make logs     # View logs\n",
    "make health   # Check service health\n",
    "make stop     # Stop all services\n",
    "make help     # View all commands\n",
    "```\n",
    "\n",
    "**Next:** Read the main `README.md` for complete project documentation."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}