{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Week 7: Agentic RAG with LangGraph\n",
    "\n",
    "**What We're Testing This Week:**\n",
    "\n",
    "Week 7 extends our RAG system with **intelligent, adaptive retrieval** using LangGraph's agentic architecture with guardrail validation and iterative query refinement.\n",
    "\n",
    "## Agentic RAG Features\n",
    "\n",
    "### Traditional RAG vs. Agentic RAG\n",
    "\n",
    "**Traditional RAG (Week 5-6)**:\n",
    "```\n",
    "Query → Always Retrieve → Generate Answer\n",
    "```\n",
    "\n",
    "**Agentic RAG (Week 7)**:\n",
    "```\n",
    "Query → Guardrail Validation (Score 0-100)\n",
    "  ├─ Score < 60 → Out of Scope (reject with helpful message)\n",
    "  └─ Score >= 60 → Retrieve Documents\n",
    "       ↓\n",
    "     Grade Documents\n",
    "       ├─ Relevant → Generate Answer\n",
    "       └─ Not Relevant → Rewrite Query → Retry (max 2 attempts)\n",
    "```\n",
    "\n",
    "### Key Capabilities\n",
    "\n",
    "1. **Guardrail Validation** - LLM validates query scope (0-100 score) before retrieval\n",
    "   - Score < 60: Query is out-of-scope (e.g., \"What is a dog?\")\n",
    "   - Score >= 60: Query is relevant to ML/NLP research papers\n",
    "2. **Out-of-Scope Handling** - Automatically rejects queries outside ML/NLP domain\n",
    "3. **Document Grading** - Validates that retrieved papers are relevant\n",
    "4. **Query Refinement** - Rewrites vague queries for better results\n",
    "5. **Reasoning Transparency** - Shows the agent's decision-making steps\n",
    "6. **Iterative Improvement** - Can retry with better queries if needed (max 2 attempts)\n",
    "\n",
    "### Architecture: LangGraph Workflow\n",
    "\n",
    "![LangGraph Agentic RAG Workflow](../../static/langgraph-mermaid.png)\n",
    "\n",
    "**Workflow Nodes:**\n",
    "- **start** → **guardrail** (LLM scoring 0-100)\n",
    "- **retrieve** → **tool_retrieve** (executes search)\n",
    "- **grade_documents** (LLM relevance check)\n",
    "- **rewrite_query** (query refinement if documents not relevant)\n",
    "- **end** (terminates with answer or rejection)\n",
    "\n",
    "### New Response Fields\n",
    "\n",
    "- `reasoning_steps`: Detailed decision-making trace\n",
    "- `retrieval_attempts`: Number of search attempts (0-2)\n",
    "- `rewritten_query`: Query after refinement (if rewritten)\n",
    "\n",
    "### Configuration (GraphConfig)\n",
    "\n",
    "- `max_retrieval_attempts`: 2\n",
    "- `guardrail_threshold`: 60/100\n",
    "- `model`: \"llama3.2:1b\"\n",
    "- `temperature`: 0.0\n",
    "- `top_k`: 3\n",
    "\n",
    "---\n",
    "\n",
    "## 1. Prerequisites\n",
    "\n",
    "### 1. Environment Variables Setup\n",
    "\n",
    "**Copy the example file and add your API keys:**\n",
    "\n",
    "```bash\n",
    "cp .env.example .env\n",
    "```\n",
    "\n",
    "Then edit `.env` and add your:\n",
    "- `JINA_API_KEY` - Get from [Jina AI](https://jina.ai/) for hybrid search\n",
    "- `LANGFUSE_PUBLIC_KEY` - Get from Langfuse UI after setup (see step 2 below)\n",
    "- `LANGFUSE_SECRET_KEY` - Get from Langfuse UI after setup (see step 2 below)\n",
    "\n",
    "The other values in `.env.example` can be kept as-is for now.\n",
    "\n",
    "### 2. Langfuse v3 Self-Hosted Setup\n",
    "\n",
    "This project uses **Langfuse v3** (self-hosted) which includes:\n",
    "- **langfuse-web**: Web UI at http://localhost:3001\n",
    "- **langfuse-worker**: Background job processor\n",
    "- **langfuse-postgres**: Database for traces\n",
    "- **langfuse-redis**: Cache and queue management\n",
    "- **langfuse-minio**: S3-compatible object storage\n",
    "- **clickhouse**: Analytics database\n",
    "\n",
    "**First-time setup:**\n",
    "1. Make sure `.env` has all the auto-generated secrets from `.env.example`\n",
    "2. Start services: `docker compose up langfuse-web langfuse-worker langfuse-postgres langfuse-redis langfuse-minio clickhouse -d`\n",
    "3. Visit http://localhost:3001 and create your first user\n",
    "4. Go to Settings → API Keys to get your `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY`\n",
    "5. Copy these keys to your `.env` file\n",
    "\n",
    "**Note:** If Langfuse keys are missing, tracing will be disabled but the API will still work.\n",
    "\n",
    "### 3. Ollama Model Setup\n",
    "\n",
    "**The `llama3.2:1b` model is automatically pulled when you start the Docker services.**\n",
    "\n",
    "If you need to manually pull it:\n",
    "```bash\n",
    "# Pull model in the Ollama container\n",
    "docker exec rag-ollama ollama pull llama3.2:1b\n",
    "\n",
    "# Or if running Ollama locally\n",
    "ollama pull llama3.2:1b\n",
    "```\n",
    "\n",
    "**Verify model is available:**\n",
    "```bash\n",
    "docker exec rag-ollama ollama list\n",
    "```\n",
    "\n",
    "### 4. Start All Services\n",
    "\n",
    "**Ensure all services are running:**\n",
    "```bash\n",
    "docker compose up --build -d\n",
    "```\n",
    "\n",
    "**Service Access Points:**\n",
    "- **FastAPI**: http://localhost:8000/docs\n",
    "- **OpenSearch**: http://localhost:9200\n",
    "- **Ollama**: http://localhost:11434\n",
    "- **Langfuse UI**: http://localhost:3001\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Service Health Check"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "import os\n",
    "from pathlib import Path\n",
    "import requests\n",
    "import time\n",
    "\n",
    "print(f\"Python Version: {sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}\")\n",
    "\n",
    "# Find project root\n",
    "current_dir = Path.cwd()\n",
    "if current_dir.name == \"week7\" and current_dir.parent.name == \"notebooks\":\n",
    "    project_root = current_dir.parent.parent\n",
    "elif (current_dir / \"compose.yml\").exists():\n",
    "    project_root = current_dir\n",
    "else:\n",
    "    project_root = current_dir.parent.parent\n",
    "\n",
    "if project_root.exists():\n",
    "    print(f\"Project root: {project_root}\")\n",
    "    sys.path.insert(0, str(project_root))\n",
    "else:\n",
    "    print(\"⚠ Project root not found - check directory structure\")\n",
    "\n",
    "# Load .env file if it exists\n",
    "env_file = project_root / \".env\"\n",
    "if env_file.exists():\n",
    "    print(f\"\\n✓ Loading environment from: {env_file}\")\n",
    "    with open(env_file) as f:\n",
    "        for line in f:\n",
    "            line = line.strip()\n",
    "            if line and not line.startswith('#') and '=' in line:\n",
    "                key, value = line.split('=', 1)\n",
    "                if key not in os.environ:\n",
    "                    os.environ[key] = value\n",
    "    print(\"✓ Environment variables loaded\")\n",
    "else:\n",
    "    print(f\"\\n⚠ No .env file found at: {env_file}\")\n",
    "    print(\"  Run: cp .env.example .env\")\n",
    "    print(\"  Then add your JINA_API_KEY, LANGFUSE_PUBLIC_KEY, and LANGFUSE_SECRET_KEY\")\n",
    "\n",
    "# Configuration for notebook tests\n",
    "REQUEST_TIMEOUT = 300\n",
    "TRUNCATE_ANSWERS = True\n",
    "TRUNCATE_LENGTH = 200\n",
    "\n",
    "print(\"\\n✓ Setup complete\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"WEEK 7 SERVICE HEALTH CHECK\")\n",
    "print(\"=\" * 40)\n",
    "\n",
    "services = {\n",
    "    \"FastAPI\": \"http://localhost:8000/api/v1/health\",\n",
    "    \"Ollama\": \"http://localhost:11434/api/version\"\n",
    "}\n",
    "\n",
    "all_healthy = True\n",
    "for service_name, url in services.items():\n",
    "    try:\n",
    "        response = requests.get(url, timeout=5)\n",
    "        if response.status_code == 200:\n",
    "            print(f\"✓ {service_name}: Healthy\")\n",
    "        else:\n",
    "            print(f\"✗ {service_name}: HTTP {response.status_code}\")\n",
    "            all_healthy = False\n",
    "    except:\n",
    "        print(f\"✗ {service_name}: Not accessible\")\n",
    "        all_healthy = False\n",
    "\n",
    "# Check if Ollama model is available\n",
    "print(\"\\nChecking Ollama model availability...\")\n",
    "try:\n",
    "    response = requests.get(\"http://localhost:11434/api/tags\", timeout=5)\n",
    "    if response.status_code == 200:\n",
    "        models = [m['name'] for m in response.json().get('models', [])]\n",
    "        if 'llama3.2:1b' in models:\n",
    "            print(\"✓ llama3.2:1b model is available\")\n",
    "        else:\n",
    "            print(\"⚠ llama3.2:1b not found. Run: docker exec rag-ollama ollama pull llama3.2:1b\")\n",
    "            all_healthy = False\n",
    "except:\n",
    "    print(\"⚠ Could not check Ollama models\")\n",
    "\n",
    "if all_healthy:\n",
    "    print(\"\\n✓ All services ready for Week 7!\")\n",
    "else:\n",
    "    print(\"\\n⚠ Some services need attention. Run: docker compose up --build -d\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Test Traditional RAG (Baseline)\n",
    "\n",
    "First, let's test the traditional RAG endpoint to establish a baseline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"TRADITIONAL RAG TEST (Baseline)\")\n",
    "print(\"=\" * 40)\n",
    "\n",
    "question = \"What are attention mechanisms?\"\n",
    "print(f\"Question: {question}\\n\")\n",
    "\n",
    "start_time = time.time()\n",
    "\n",
    "try:\n",
    "    response = requests.post(\n",
    "        \"http://localhost:8000/api/v1/ask\",\n",
    "        json={\n",
    "            \"query\": question,\n",
    "            \"top_k\": 3,\n",
    "            \"use_hybrid\": True,\n",
    "            \"model\": \"llama3.2:3b\"\n",
    "        },\n",
    "        timeout=REQUEST_TIMEOUT\n",
    "    )\n",
    "    \n",
    "    elapsed = time.time() - start_time\n",
    "    \n",
    "    if response.status_code == 200:\n",
    "        data = response.json()\n",
    "        print(f\"✓ Traditional RAG ({elapsed:.1f}s)\")\n",
    "        \n",
    "        # Display answer with configurable truncation\n",
    "        answer = data['answer']\n",
    "        if TRUNCATE_ANSWERS and len(answer) > TRUNCATE_LENGTH:\n",
    "            print(f\"\\nAnswer: {answer[:TRUNCATE_LENGTH]}...\")\n",
    "            print(f\"(truncated, full length: {len(answer)} chars)\")\n",
    "        else:\n",
    "            print(f\"\\nAnswer: {answer}\")\n",
    "        \n",
    "        # Display sources with validation\n",
    "        sources = data.get('sources', [])\n",
    "        print(f\"\\nSources: {len(sources)} papers\")\n",
    "        if sources:\n",
    "            for i, source in enumerate(sources[:3], 1):  # Show first 3\n",
    "                if isinstance(source, dict):\n",
    "                    print(f\"  {i}. {source.get('title', 'Unknown')}\")\n",
    "                else:\n",
    "                    print(f\"  {i}. {source}\")\n",
    "        \n",
    "        print(f\"Search mode: {data.get('search_mode', 'unknown')}\")\n",
    "    else:\n",
    "        print(f\"✗ Request failed: {response.status_code}\")\n",
    "        \n",
    "except Exception as e:\n",
    "    print(f\"✗ Error: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Test Agentic RAG - Scenario 1: Out-of-Scope Rejection\n",
    "\n",
    "Test if the guardrail correctly rejects queries outside the ML/NLP domain."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"AGENTIC RAG - SCENARIO 1: Out-of-Scope Rejection\")\n",
    "print(\"=\" * 50)\n",
    "\n",
    "question = \"What is a dog?\"\n",
    "print(f\"Question: {question}\")\n",
    "print(\"Expected: Guardrail should reject (score < 60) and explain scope\\n\")\n",
    "\n",
    "start_time = time.time()\n",
    "\n",
    "try:\n",
    "    response = requests.post(\n",
    "        \"http://localhost:8000/api/v1/ask-agentic\",\n",
    "        json={\n",
    "            \"query\": question,\n",
    "            \"top_k\": 3,\n",
    "            \"use_hybrid\": True,\n",
    "        },\n",
    "        timeout=REQUEST_TIMEOUT\n",
    "    )\n",
    "    \n",
    "    elapsed = time.time() - start_time\n",
    "    \n",
    "    if response.status_code == 200:\n",
    "        data = response.json()\n",
    "        print(f\"✓ Agentic RAG ({elapsed:.1f}s)\")\n",
    "        print(f\"\\nAnswer: {data['answer']}\")\n",
    "        print(f\"\\nRetrieval attempts: {data.get('retrieval_attempts', 0)}\")\n",
    "        print(f\"\\nReasoning steps:\")\n",
    "        for i, step in enumerate(data.get('reasoning_steps', []), 1):\n",
    "            print(f\"  {i}. {step}\")\n",
    "        \n",
    "        # Check if guardrail score is in reasoning steps\n",
    "        guardrail_step = next(\n",
    "            (s for s in data.get('reasoning_steps', []) if 'validated' in s.lower() and 'score' in s.lower()),\n",
    "            None\n",
    "        )\n",
    "        if guardrail_step:\n",
    "            print(f\"\\nGuardrail validation: {guardrail_step}\")\n",
    "        \n",
    "        if data.get('retrieval_attempts', 0) == 0:\n",
    "            print(\"\\n✓ SUCCESS: Query correctly rejected by guardrail (no retrieval)!\")\n",
    "        else:\n",
    "            print(\"\\n⚠ UNEXPECTED: Query should have been rejected without retrieval\")\n",
    "    else:\n",
    "        print(f\"✗ Request failed: {response.status_code}\")\n",
    "        print(f\"Response: {response.text}\")\n",
    "        \n",
    "except Exception as e:\n",
    "    print(f\"✗ Error: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Test Agentic RAG - Scenario 2: Successful Retrieval\n",
    "\n",
    "Test if the agent correctly retrieves and grades documents for research questions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"AGENTIC RAG - SCENARIO 2: Successful Retrieval\")\n",
    "print(\"=\" * 50)\n",
    "\n",
    "question = \"What are transformers in machine learning?\"\n",
    "print(f\"Question: {question}\")\n",
    "print(\"Expected: Agent should pass guardrail, retrieve documents and generate answer\\n\")\n",
    "\n",
    "start_time = time.time()\n",
    "\n",
    "try:\n",
    "    response = requests.post(\n",
    "        \"http://localhost:8000/api/v1/ask-agentic\",\n",
    "        json={\n",
    "            \"query\": question,\n",
    "            \"top_k\": 3,\n",
    "            \"use_hybrid\": True,\n",
    "            \"model\": \"llama3.2:3b\"\n",
    "        },\n",
    "        timeout=REQUEST_TIMEOUT\n",
    "    )\n",
    "    \n",
    "    elapsed = time.time() - start_time\n",
    "    \n",
    "    if response.status_code == 200:\n",
    "        data = response.json()\n",
    "        print(f\"✓ Agentic RAG ({elapsed:.1f}s)\")\n",
    "        \n",
    "        # Display answer with better formatting\n",
    "        answer = data.get('answer', '')\n",
    "        print(f\"\\nAnswer:\\n{'-'*50}\")\n",
    "        if TRUNCATE_ANSWERS and len(answer) > 500:  # Use longer limit for detailed answers\n",
    "            print(answer[:500] + \"...\")\n",
    "            print(f\"(truncated, full length: {len(answer)} chars)\")\n",
    "        else:\n",
    "            print(answer)\n",
    "        print('-'*50)\n",
    "        \n",
    "        # Display sources with validation\n",
    "        sources = data.get('sources', [])\n",
    "        print(f\"\\nSources: {len(sources)} papers\")\n",
    "        if sources:\n",
    "            for i, source in enumerate(sources, 1):\n",
    "                if isinstance(source, dict):\n",
    "                    print(f\"  {i}. {source.get('title', source.get('id', 'Unknown'))}\")\n",
    "                elif isinstance(source, str):\n",
    "                    print(f\"  {i}. {source}\")\n",
    "                else:\n",
    "                    print(f\"  {i}. {str(source)}\")\n",
    "        \n",
    "        print(f\"\\nRetrieval attempts: {data.get('retrieval_attempts', 0)}\")\n",
    "        print(f\"\\nReasoning steps:\")\n",
    "        for i, step in enumerate(data.get('reasoning_steps', []), 1):\n",
    "            print(f\"  {i}. {step}\")\n",
    "        \n",
    "\n",
    "        # Check rewritten_query field\n",
    "        if data.get('rewritten_query') is None:\n",
    "            print(\"\\n✓ Query was not rewritten (worked on first attempt)\")\n",
    "        else:\n",
    "            print(f\"\\n→ Query was rewritten to: {data['rewritten_query']}\")\n",
    "        \n",
    "        if data.get('retrieval_attempts', 0) >= 1:\n",
    "            print(\"\\n✓ SUCCESS: Agent retrieved and used documents!\")\n",
    "        else:\n",
    "            print(\"\\n⚠ UNEXPECTED: Agent didn't retrieve for research question\")\n",
    "    else:\n",
    "        print(f\"✗ Request failed: {response.status_code}\")\n",
    "        print(f\"Response: {response.text}\")\n",
    "        \n",
    "except Exception as e:\n",
    "    print(f\"✗ Error: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Test Agentic RAG - Scenario 3: Query Rewriting\n",
    "\n",
    "Test if the agent rewrites vague queries for better results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"AGENTIC RAG - SCENARIO 3: Query Rewriting\")\n",
    "print(\"=\" * 50)\n",
    "\n",
    "question = \"Tell me about ML stuff\"\n",
    "print(f\"Question: {question}\")\n",
    "print(\"Expected: Agent may rewrite query if documents aren't relevant\\n\")\n",
    "\n",
    "start_time = time.time()\n",
    "\n",
    "try:\n",
    "    response = requests.post(\n",
    "        \"http://localhost:8000/api/v1/ask-agentic\",\n",
    "        json={\n",
    "            \"query\": question,\n",
    "            \"top_k\": 3,\n",
    "            \"use_hybrid\": True,\n",
    "            \"model\": \"llama3.2:3b\"\n",
    "        },\n",
    "        timeout=REQUEST_TIMEOUT\n",
    "    )\n",
    "    \n",
    "    elapsed = time.time() - start_time\n",
    "    \n",
    "    if response.status_code == 200:\n",
    "        data = response.json()\n",
    "        print(f\"✓ Agentic RAG ({elapsed:.1f}s)\")\n",
    "        \n",
    "        # Display answer with better formatting\n",
    "        answer = data.get('answer', '')\n",
    "        print(f\"\\nAnswer:\\n{'-'*50}\")\n",
    "        if TRUNCATE_ANSWERS and len(answer) > 500:\n",
    "            print(answer[:500] + \"...\")\n",
    "            print(f\"(truncated, full length: {len(answer)} chars)\")\n",
    "        else:\n",
    "            print(answer)\n",
    "        print('-'*50)\n",
    "        \n",
    "        print(f\"\\nRetrieval attempts: {data.get('retrieval_attempts', 0)}\")\n",
    "        print(f\"\\nReasoning steps:\")\n",
    "        for i, step in enumerate(data.get('reasoning_steps', []), 1):\n",
    "            print(f\"  {i}. {step}\")\n",
    "        \n",
    "        # Check for guardrail validation step\n",
    "        print(\"\\nValidating guardrail and rewrite steps:\")\n",
    "        reasoning_steps = data.get('reasoning_steps', [])\n",
    "        if any(\"validated\" in step.lower() for step in reasoning_steps):\n",
    "            guardrail_step = next(s for s in reasoning_steps if \"validated\" in s.lower())\n",
    "            print(f\"  ✓ Guardrail validation: {guardrail_step}\")\n",
    "        else:\n",
    "            print(\"  ⚠ Guardrail validation step missing\")\n",
    "        \n",
    "        # Check for query rewriting\n",
    "        if data.get('rewritten_query'):\n",
    "            print(f\"\\n✓ Query was rewritten!\")\n",
    "            print(f\"  Original: {question}\")\n",
    "            print(f\"  Rewritten: {data['rewritten_query']}\")\n",
    "        elif data.get('retrieval_attempts', 0) > 1:\n",
    "            print(\"\\n→ Multiple retrieval attempts detected\")\n",
    "            if any(\"rewritten\" in step.lower() for step in reasoning_steps):\n",
    "                print(\"  ✓ Rewrite step found in reasoning\")\n",
    "            else:\n",
    "                print(\"  ⚠ Multiple attempts but no rewrite info\")\n",
    "        else:\n",
    "            print(\"\\n→ Query worked on first attempt (no rewrite needed)\")\n",
    "        \n",
    "        if data.get('retrieval_attempts', 0) > 1:\n",
    "            print(f\"\\n✓ Agent performed {data['retrieval_attempts']} retrieval attempts\")\n",
    "    else:\n",
    "        print(f\"✗ Request failed: {response.status_code}\")\n",
    "        print(f\"Response: {response.text}\")\n",
    "        \n",
    "except Exception as e:\n",
    "    print(f\"✗ Error: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 7. Test Agentic RAG - Scenario 4: Multiple Out-of-Scope Queries\n",
    "\n",
    "Test if the guardrail consistently rejects several kinds of non-ML/NLP queries."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"AGENTIC RAG - SCENARIO 4: Multiple Out-of-Scope Queries\")\n",
    "print(\"=\" * 50)\n",
    "\n",
    "test_queries = [\n",
    "    (\"What is a dog?\", \"Biology question\"),\n",
    "    (\"What's the weather today?\", \"Weather question\"),\n",
    "    (\"Hello, how are you?\", \"Greeting\"),\n",
    "]\n",
    "\n",
    "print(\"Testing guardrail rejection with various non-ML/NLP queries:\\n\")\n",
    "\n",
    "for query, description in test_queries:\n",
    "    print(f\"Query: {query}\")\n",
    "    print(f\"Type: {description}\")\n",
    "    \n",
    "    try:\n",
    "        response = requests.post(\n",
    "            \"http://localhost:8000/api/v1/ask-agentic\",\n",
    "            json={\"query\": query, \"top_k\": 3, \"use_hybrid\": True},\n",
    "            timeout=30\n",
    "        )\n",
    "        \n",
    "        if response.status_code == 200:\n",
    "            data = response.json()\n",
    "            \n",
    "            # Check if rejected (no retrieval)\n",
    "            is_rejected = data['retrieval_attempts'] == 0\n",
    "            \n",
    "            # Get guardrail score from reasoning if available\n",
    "            guardrail_step = next(\n",
    "                (s for s in data['reasoning_steps'] if 'validated' in s.lower() and 'score' in s.lower()),\n",
    "                None\n",
    "            )\n",
    "            \n",
    "            print(f\"Result: {'✓ REJECTED' if is_rejected else '✗ ACCEPTED'} (attempts: {data['retrieval_attempts']})\")\n",
    "            if guardrail_step:\n",
    "                print(f\"Guardrail: {guardrail_step}\")\n",
    "        else:\n",
    "            print(f\"✗ Request failed: {response.status_code}\")\n",
    "    except Exception as e:\n",
    "        print(f\"✗ Error: {e}\")\n",
    "    \n",
    "    print(\"-\" * 50)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 8. Interactive Testing\n",
    "\n",
    "Try your own questions!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def ask_agentic(question: str, show_full_answer: bool = False):\n",
    "    \"\"\"Helper function to test agentic RAG.\n",
    "    \n",
    "    Args:\n",
    "        question: The question to ask\n",
    "        show_full_answer: If True, show full answer regardless of TRUNCATE_ANSWERS setting\n",
    "    \"\"\"\n",
    "    print(f\"Question: {question}\\n\")\n",
    "    \n",
    "    start = time.time()\n",
    "    \n",
    "    try:\n",
    "        response = requests.post(\n",
    "            \"http://localhost:8000/api/v1/ask-agentic\",\n",
    "            json={\"query\": question, \"top_k\": 3, \"use_hybrid\": True},\n",
    "            timeout=REQUEST_TIMEOUT\n",
    "        )\n",
    "        \n",
    "        elapsed = time.time() - start\n",
    "        \n",
    "        if response.status_code == 200:\n",
    "            data = response.json()\n",
    "            print(f\"✓ Response in {elapsed:.1f}s\\n\")\n",
    "            \n",
    "            # Display answer\n",
    "            answer = data.get('answer', '')\n",
    "            print(f\"Answer:\\n{'-'*50}\")\n",
    "            if not show_full_answer and TRUNCATE_ANSWERS and len(answer) > 500:\n",
    "                print(answer[:500] + \"...\")\n",
    "                print(f\"(truncated, full length: {len(answer)} chars)\")\n",
    "            else:\n",
    "                print(answer)\n",
    "            print('-'*50)\n",
    "            \n",
    "            # Display metadata\n",
    "            print(f\"\\nRetrieval attempts: {data.get('retrieval_attempts', 0)}\")\n",
    "            \n",
    "            # Display sources with validation\n",
    "            sources = data.get('sources', [])\n",
    "            print(f\"Sources: {len(sources)}\")\n",
    "            if sources:\n",
    "                for i, source in enumerate(sources[:3], 1):  # Show first 3\n",
    "                    if isinstance(source, dict):\n",
    "                        print(f\"  {i}. {source.get('title', source.get('id', 'Unknown'))}\")\n",
    "                    elif isinstance(source, str):\n",
    "                        print(f\"  {i}. {source}\")\n",
    "            \n",
    "            # Display reasoning\n",
    "            print(f\"\\nReasoning:\")\n",
    "            for step in data.get('reasoning_steps', []):\n",
    "                print(f\"  • {step}\")\n",
    "        else:\n",
    "            print(f\"✗ Error: {response.status_code}\")\n",
    "            print(response.text)\n",
    "    except Exception as e:\n",
    "        print(f\"✗ Exception: {e}\")\n",
    "\n",
    "# Try it!\n",
    "ask_agentic(\"How does BERT differ from GPT?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Try more questions\n",
    "ask_agentic(\"What is the capital of France?\")  # Should reject as out-of-scope"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ask_agentic(\"Explain self-attention mechanisms\")  # Should retrieve papers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "### What We Tested in Week 7:\n",
    "\n",
    "**Agentic RAG Capabilities**:\n",
    "1. ✅ **Guardrail Validation** - LLM validates query scope (0-100 score) before retrieval\n",
    "2. ✅ **Out-of-Scope Handling** - Automatically rejects queries outside ML/NLP domain\n",
    "3. ✅ **Document Grading** - Validates retrieved papers for relevance\n",
    "4. ✅ **Query Rewriting** - Improves queries if needed\n",
    "5. ✅ **Reasoning Transparency** - Shows decision-making steps\n",
    "6. ✅ **Iterative Improvement** - Can retry with better queries (max 2 attempts)\n",
    "\n",
    "### Key Improvements Over Traditional RAG:\n",
    "\n",
    "| Feature | Traditional RAG | Agentic RAG |\n",
    "|---------|----------------|-------------|\n",
    "| **Query Validation** | None | Guardrail scoring (0-100) |\n",
    "| **Out-of-Scope Handling** | None | Automatic rejection with helpful message |\n",
    "| **Retrieval Decision** | Always retrieves | Only if guardrail passes (score >= 60) |\n",
    "| **Relevance Check** | None | LLM-based document grading |\n",
    "| **Query Refinement** | None | LLM-based rewriting |\n",
    "| **Iterations** | Single pass | Up to 2 retrieval attempts |\n",
    "| **Transparency** | Black box | Detailed reasoning steps |\n",
    "| **Configuration** | Hardcoded | GraphConfig with thresholds |\n",
    "\n",
    "### Architecture: 7-Node LangGraph Workflow\n",
    "\n",
    "```\n",
    "LangGraph Workflow:\n",
    "  START\n",
    "    ↓\n",
    "  guardrail (LLM scoring 0-100)\n",
    "    ├─ score < 60 → out_of_scope → END (rejection message)\n",
    "    └─ score >= 60 → retrieve\n",
    "         ↓\n",
    "       tool_retrieve (ToolNode - executes search)\n",
    "         ↓\n",
    "       grade_documents (LLM relevance check)\n",
    "         ├─ Relevant → generate_answer → END\n",
    "         └─ Not relevant → rewrite_query → retrieve (retry, max 2 attempts)\n",
    "```\n",
    "\n",
    "### Reasoning Step Format:\n",
    "\n",
    "The new agentic RAG returns structured reasoning steps:\n",
    "\n",
    "1. **\"Validated query scope (score: X/100)\"** - Guardrail validation result\n",
    "2. **\"Retrieved documents (N attempt(s))\"** - Number of retrieval attempts\n",
    "3. **\"Graded documents (N relevant)\"** - Document relevance check\n",
    "4. **\"Rewritten query for better results\"** - Query refinement (if needed)\n",
    "5. **\"Generated answer from context\"** - Final answer generation\n",
    "\n",
    "### Configuration Parameters (GraphConfig):\n",
    "\n",
    "- `max_retrieval_attempts`: 2 - Maximum retry attempts\n",
    "- `guardrail_threshold`: 60/100 - Minimum score to proceed\n",
    "- `model`: \"llama3.2:1b\" - Default LLM model\n",
    "- `temperature`: 0.0 - Deterministic generation\n",
    "- `top_k`: 3 - Documents to retrieve\n",
    "\n",
    "### Next Steps:\n",
    "\n",
    "- **Experiment** with different question types and query complexity\n",
    "- **Monitor** reasoning steps to understand agent decision-making\n",
    "- **Compare** performance and accuracy with traditional RAG\n",
    "- **Adjust** guardrail threshold based on your domain requirements\n",
    "- **Extend** with additional tools (web search, calculations, code execution)\n",
    "\n",
    "**Week 7 Complete! You now have an intelligent, adaptive RAG system with guardrail validation! 🎉**"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}