{ "cells": [ { "cell_type": "markdown", "id": "f6bea003", "metadata": {}, "source": [ "# Oracle RAG with Retrieval and Generation Evaluations\n", "\n", "--------" ] }, { "cell_type": "markdown", "id": "6b223d47", "metadata": {}, "source": [ "[![Open in Colab](https://img.shields.io/badge/Open%20in-Colab-F9AB00?style=flat-square&logo=googlecolab)](https://colab.research.google.com/github/oracle-devrel/oracle-ai-developer-hub/blob/main/notebooks/oracle_rag_with_evals.ipynb)\n", "\n", "This notebook shows how to build and evaluate an Oracle AI Database RAG pipeline using BEIR retrieval benchmarks and answer-level RAG evaluation metrics." ] }, { "cell_type": "markdown", "id": "3cfd3258", "metadata": {}, "source": [ "## What You'll Learn\n", "\n", "- How to run Oracle AI Database 26ai locally and connect from Python.\n", "- How to load BEIR data and prepare evaluation queries and documents.\n", "- How to generate embeddings and ingest vectors into Oracle AI Database.\n", "- How to evaluate keyword, vector, and hybrid retrieval strategies.\n", "- How to evaluate complete RAG pipelines and compare quality metrics." ] }, { "cell_type": "code", "execution_count": null, "id": "e322ad7c", "metadata": {}, "outputs": [], "source": [ "! pip install -Uq oracledb pandas sentence-transformers datasets einops \"numpy<2.0\" beir matplotlib" ] }, { "cell_type": "markdown", "id": "010cf528", "metadata": {}, "source": [ "## 1. Oracle AI Database (26ai) Local Installation" ] }, { "cell_type": "markdown", "id": "0308b61d", "metadata": {}, "source": [ "1. Install oracle via docker\n", "2. Ensure that docker engine is runnning\n", "3. Pull docker image \n", "4. 
Run a container from the Oracle image\n", " ```\n", " docker run -d \\\n", " --name oracle-full \\\n", " -p 1521:1521 -p 5500:5500 \\\n", " -e ORACLE_PWD=OraclePwd_2025 \\\n", " -e ORACLE_SID=FREE \\\n", " -e ORACLE_PDB=FREEPDB1 \\\n", " -v ~/oracle/full_data:/opt/oracle/oradata \\\n", " container-registry.oracle.com/database/free:latest\n", " ```\n", "\n", "The connection below uses a `VECTOR` application user. Once the container is up, create this user in `FREEPDB1` (for example: `CREATE USER vector IDENTIFIED BY VectorPwd_2025`, then grant `DB_DEVELOPER_ROLE` and a quota on the `USERS` tablespace).\n" ] }, { "cell_type": "code", "execution_count": null, "id": "de8defa7", "metadata": {}, "outputs": [], "source": [ "import oracledb\n", "\n", "conn = oracledb.connect(\n", " user=\"VECTOR\",\n", " password=\"VectorPwd_2025\", # password of the VECTOR user created above (not ORACLE_PWD)\n", " dsn=\"localhost:1521/FREEPDB1\"\n", ")\n", "\n", "with conn.cursor() as cur:\n", " cur.execute(\"SELECT banner FROM v$version WHERE banner LIKE 'Oracle%'\")\n", " print(cur.fetchone()[0])" ] }, { "cell_type": "markdown", "id": "02032d60", "metadata": {}, "source": [ "## 2. Data Loading: Import BEIR and Set Up Evaluation" ] }, { "cell_type": "markdown", "id": "07392848", "metadata": {}, "source": [ "BEIR (Benchmarking Information Retrieval) provides standardized evaluation datasets and metrics for retrieval models and RAG pipelines." ] }, { "cell_type": "code", "execution_count": null, "id": "7d347503", "metadata": {}, "outputs": [], "source": [ "import logging\n", "from beir import util, LoggingHandler\n", "\n", "# Setup logging\n", "logging.basicConfig(format='%(asctime)s - %(message)s',\n", " datefmt='%Y-%m-%d %H:%M:%S',\n", " level=logging.INFO,\n", " handlers=[LoggingHandler()])" ] }, { "cell_type": "code", "execution_count": null, "id": "b2fc4747", "metadata": {}, "outputs": [], "source": [ "import pathlib\n", "import numpy as np\n", "from beir.datasets.data_loader import GenericDataLoader\n", "from beir.retrieval.evaluation import EvaluateRetrieval\n", "\n", "\n", "# Download and load a BEIR dataset (e.g., scifact - scientific papers)\n", "dataset = \"scifact\"\n", "url = f\"https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{dataset}.zip\"\n", "data_path = 
util.download_and_unzip(url, \"datasets\")\n", "\n", "# Load the dataset\n", "corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split=\"test\")\n", "\n", "print(f\" Loaded {len(corpus)} documents, {len(queries)} queries\")\n", "print(f\"Sample corpus document: {list(corpus.values())[0]}\")\n", "print(f\"Sample query: {list(queries.values())[0]}\")" ] }, { "cell_type": "markdown", "id": "1fc32d3a", "metadata": {}, "source": [ "| Component | Description | Example shown below |\n", "| ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------- |\n", "| **Corpus** | The collection of documents you can retrieve from. Each has an ID (`doc_id`), text body, and optional title. | “Microstructural development of human newborn...” |\n", "| **Queries** | The search inputs or questions used to test your retriever. Each has a unique `query_id` and text. | “A deficiency of vitamin B12 increases blood...” |\n", "| **Qrels (Query Relevance Judgments)** | Ground-truth labels that indicate which documents are relevant for each query. Each entry maps a `query_id` to a `doc_id` with a relevance score. | `query_id=1, doc_id=31715818, relevance=1.0` |\n" ] }, { "cell_type": "code", "execution_count": null, "id": "3eacc415", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "# Corpus\n", "corpus_df = pd.DataFrame.from_dict(corpus, orient=\"index\")\n", "corpus_df.reset_index(inplace=True)\n", "corpus_df.rename(columns={\"index\": \"doc_id\"}, inplace=True)\n", "\n", "# Queries\n", "queries_df = pd.DataFrame(list(queries.items()), columns=[\"query_id\", \"query\"])\n", "\n", "# Qrels\n", "qrels_df = pd.DataFrame.from_dict(qrels, orient=\"index\").stack().reset_index()\n", "qrels_df.columns = [\"query_id\", \"doc_id\", \"relevance\"]\n", "\n", "# --- 4. 
Display a few rows from each ---\n", "print(\"\\n Corpus sample:\")\n", "display(corpus_df.head())\n", "\n", "print(\"\\n Queries sample:\")\n", "display(queries_df.head())\n", "\n", "print(\"\\n Qrels sample:\")\n", "display(qrels_df.head())" ] }, { "cell_type": "markdown", "id": "77ac879d", "metadata": {}, "source": [ "## 3. Embedding Generation" ] }, { "cell_type": "code", "execution_count": null, "id": "939be3b7", "metadata": {}, "outputs": [], "source": [ "from sentence_transformers import SentenceTransformer\n", "embedding_model = SentenceTransformer(\"nomic-ai/nomic-embed-text-v1.5\", trust_remote_code=True)" ] }, { "cell_type": "code", "execution_count": null, "id": "be75832a", "metadata": {}, "outputs": [], "source": [ "# Cell 3: Generate embeddings for BEIR corpus documents (one-by-one with single progress bar)\n", "from tqdm import tqdm\n", "import pandas as pd\n", "\n", "print(f\" Generating embeddings for {len(corpus)} BEIR documents...\")\n", "\n", "# Prepare corpus data for embedding\n", "corpus_data = []\n", "for doc_id, doc_content in corpus.items():\n", " title = doc_content.get('title', '')\n", " text = doc_content.get('text', '')\n", " \n", " # Combine title and text for embedding\n", " combined_text = f\"{title} {text}\".strip()\n", " \n", " corpus_data.append({\n", " 'doc_id': doc_id,\n", " 'title': title,\n", " 'text': text,\n", " 'combined_text': combined_text\n", " })\n", "\n", "corpus_df = pd.DataFrame(corpus_data)\n", "\n", "# Add prefix for retrieval-style embeddings\n", "corpus_df[\"text_prefixed\"] = corpus_df[\"combined_text\"].apply(\n", " lambda x: f\"search_document: {x}\"\n", ")\n", "\n", "print(f\" Corpus prepared: {len(corpus_df)} documents\")\n", "\n", "# Generate embeddings one-by-one with single progress bar\n", "print(\" Encoding embeddings one-by-one (this will take a few minutes)...\")\n", "embeddings = []\n", "\n", "for text in tqdm(corpus_df[\"text_prefixed\"], desc=\"Generating embeddings\", unit=\"doc\"):\n", " # 
Generate embedding for single document\n", " embedding = embedding_model.encode(\n", " [text], # Pass as single-item list\n", " convert_to_numpy=True,\n", " normalize_embeddings=True,\n", " show_progress_bar=False # Disable internal progress bar\n", " )[0] # Extract first (and only) embedding\n", " \n", " # Convert to float32 and store as list\n", " embeddings.append(embedding.astype(np.float32).tolist())\n", "\n", "# Add embeddings to dataframe\n", "corpus_df[\"embedding\"] = embeddings\n", "\n", "embedding_dim = len(corpus_df[\"embedding\"].iloc[0])\n", "print(f\" Embeddings generated! Dimension: {embedding_dim}\")\n", "\n", "corpus_df.head(2)" ] }, { "cell_type": "code", "execution_count": null, "id": "5dadff8a", "metadata": {}, "outputs": [], "source": [ "corpus_df.head(2)" ] }, { "cell_type": "markdown", "id": "7072118f", "metadata": {}, "source": [ "# 4. Data Ingestion into Oracle AI Database" ] }, { "cell_type": "code", "execution_count": null, "id": "640fc616", "metadata": {}, "outputs": [], "source": [ "# Cell 4: Create BEIR evaluation table in Oracle\n", "ddl = f\"\"\"\n", "BEGIN\n", " EXECUTE IMMEDIATE 'DROP TABLE beir_corpus';\n", "EXCEPTION WHEN OTHERS THEN\n", " IF SQLCODE != -942 THEN RAISE; END IF;\n", "END;\n", "/\n", "CREATE TABLE beir_corpus (\n", " doc_id VARCHAR2(255) PRIMARY KEY,\n", " title VARCHAR2(4000),\n", " text CLOB,\n", " embedding VECTOR({embedding_dim}, FLOAT32)\n", ")\n", "TABLESPACE USERS\n", "\"\"\"\n", "\n", "with conn.cursor() as cur:\n", " for stmt in ddl.split(\"/\"):\n", " if stmt.strip():\n", " cur.execute(stmt)\n", "\n", "conn.commit()\n", "print(f\" Table BEIR_CORPUS created with VECTOR dimension: {embedding_dim}\")" ] }, { "cell_type": "markdown", "id": "80aef076", "metadata": {}, "source": [ "Create vector index on BEIR corpus" ] }, { "cell_type": "code", "execution_count": null, "id": "b95f61e5", "metadata": {}, "outputs": [], "source": [ "with conn.cursor() as cur:\n", " cur.execute(\"\"\"\n", " CREATE VECTOR INDEX 
BEIR_VEC_IVF\n", " ON beir_corpus(embedding)\n", " ORGANIZATION NEIGHBOR PARTITIONS\n", " DISTANCE COSINE\n", " WITH TARGET ACCURACY 90\n", " TABLESPACE USERS\n", " \"\"\")\n", "\n", "conn.commit()\n", "print(\" Vector Index BEIR_VEC_IVF created\")" ] }, { "cell_type": "markdown", "id": "8d5a03b7", "metadata": {}, "source": [ "Create Oracle Text Indexes" ] }, { "cell_type": "code", "execution_count": null, "id": "723744f2", "metadata": {}, "outputs": [], "source": [ "print(\" Setting up Oracle Text for proper keyword search...\")\n", "\n", "try:\n", " with conn.cursor() as cur:\n", " # Drop the existing index if it exists\n", " try:\n", " cur.execute(\"DROP INDEX beir_text_idx\")\n", " except Exception:\n", " pass\n", " \n", " # Create CONTEXT index on text column\n", " cur.execute(\"\"\"\n", " CREATE INDEX beir_text_idx \n", " ON beir_corpus(text) \n", " INDEXTYPE IS CTXSYS.CONTEXT\n", " PARAMETERS('SYNC (ON COMMIT)')\n", " \"\"\")\n", " \n", " # Also index title\n", " try:\n", " cur.execute(\"DROP INDEX beir_title_idx\")\n", " except Exception:\n", " pass\n", " \n", " cur.execute(\"\"\"\n", " CREATE INDEX beir_title_idx \n", " ON beir_corpus(title) \n", " INDEXTYPE IS CTXSYS.CONTEXT\n", " PARAMETERS('SYNC (ON COMMIT)')\n", " \"\"\")\n", " \n", " conn.commit()\n", " print(\" Oracle Text indexes created successfully!\")\n", " \n", "except Exception as e:\n", " print(f\" Oracle Text not available or error: {e}\")\n", " print(\" Falling back to LIKE-based search\")" ] }, { "cell_type": "code", "execution_count": null, "id": "7b4d5d04", "metadata": {}, "outputs": [], "source": [ "from tqdm import tqdm\n", "import array\n", "\n", "rows = []\n", "for _, row in corpus_df.iterrows():\n", " # Convert embedding list to array.array for proper VECTOR binding\n", " embedding_array = array.array('f', row.get(\"embedding\"))\n", " \n", " rows.append((\n", " row.get(\"doc_id\"),\n", " row.get(\"title\"),\n", " row.get(\"text\"),\n", " embedding_array\n", " ))\n", "\n", "print(f\" Inserting {len(rows)} documents 
into BEIR_CORPUS...\")\n", "\n", "with conn.cursor() as cur:\n", " # executemany is far faster than inserting row by row for a corpus this size\n", " cur.executemany(\n", " \"\"\"\n", " INSERT INTO beir_corpus (doc_id, title, text, embedding)\n", " VALUES (:1, :2, :3, :4)\n", " \"\"\",\n", " rows\n", " )\n", "\n", "conn.commit()\n", "print(\" BEIR corpus inserted successfully!\")\n", "\n", "# Verify insertion\n", "with conn.cursor() as cur:\n", " cur.execute(\"SELECT COUNT(*) FROM beir_corpus\")\n", " count = cur.fetchone()[0]\n", " print(f\" Total documents in BEIR_CORPUS: {count}\")" ] }, { "cell_type": "markdown", "id": "6398b1ea", "metadata": {}, "source": [ "# Part 1: Evaluating Information Retrieval Pipeline" ] }, { "cell_type": "code", "execution_count": null, "id": "80bf5a32", "metadata": {}, "outputs": [], "source": [ "from typing import Dict\n", "import re\n", "\n", "class BEIRKeywordRetriever:\n", " \"\"\"Keyword retriever using Oracle Text CONTAINS, with error logging\"\"\"\n", " \n", " def __init__(self, conn, table_name=\"beir_corpus\"):\n", " self.conn = conn\n", " self.table_name = table_name\n", " \n", " def search(self, query: str, top_k: int = 100) -> Dict[str, float]:\n", " \"\"\"Perform keyword search using CONTAINS operator\"\"\"\n", " \n", " # Extract key terms and keep mixed alphanumeric tokens (e.g., CCL19)\n", " words = re.findall(r'\\b[a-zA-Z][a-zA-Z0-9]{3,}\\b', query.lower())\n", " \n", " if not words:\n", " print(f\" No words extracted from query: '{query}'\")\n", " return {}\n", " \n", " # Build Oracle Text query with escaped literals (prevents reserved-word parser errors)\n", " unique_words = list(dict.fromkeys(words))\n", " oracle_text_query = ' OR '.join(f'{{{w}}}' for w in unique_words[:5])\n", " \n", " sql = f\"\"\"\n", " SELECT doc_id, SCORE(1) as relevance_score\n", " FROM {self.table_name}\n", " WHERE CONTAINS(text, :query, 1) > 0\n", " ORDER BY SCORE(1) DESC\n", " FETCH FIRST {top_k} ROWS ONLY\n", " \"\"\"\n", " \n", " try:\n", " with self.conn.cursor() as cur:\n", " cur.execute(sql, 
query=oracle_text_query)\n", " rows = cur.fetchall()\n", " \n", " # Return normalized scores\n", " if rows:\n", " max_score = max(row[1] for row in rows) if rows else 1\n", " if max_score == 0:\n", " max_score = 1\n", " results = {row[0]: float(row[1]) / max_score for row in rows}\n", " return results\n", " else:\n", " return {}\n", " \n", " except Exception as e:\n", " # Log the actual error instead of silently failing\n", " print(f\" CONTAINS error for query '{query[:50]}...': {e}\")\n", " import traceback\n", " traceback.print_exc()\n", " return {}\n", " \n", " def batch_search(self, queries: Dict[str, str], top_k: int = 100) -> Dict[str, Dict[str, float]]:\n", " \"\"\"Batch search for multiple queries\"\"\"\n", " from tqdm import tqdm\n", " results = {}\n", " error_count = 0\n", " \n", " for query_id, query_text in tqdm(queries.items(), desc=\"Keyword search\", unit=\"query\"):\n", " try:\n", " result = self.search(query_text, top_k)\n", " results[query_id] = result\n", " if not result:\n", " error_count += 1\n", " except Exception as e:\n", " print(f\" Error on query {query_id}: {e}\")\n", " results[query_id] = {}\n", " error_count += 1\n", " \n", " if error_count > 0:\n", " print(f\"\\n {error_count}/{len(queries)} queries returned empty results\")\n", " \n", " return results" ] }, { "cell_type": "code", "execution_count": null, "id": "38dddf57", "metadata": {}, "outputs": [], "source": [ "# Cell: Update Vector Retriever to hide embedding progress bars\n", "class BEIRVectorRetriever:\n", " \"\"\"Vector-based retriever using Oracle vector search on BEIR corpus\"\"\"\n", " \n", " def __init__(self, conn, embedding_model, table_name=\"beir_corpus\"):\n", " self.conn = conn\n", " self.embedding_model = embedding_model\n", " self.table_name = table_name\n", " \n", " def search(self, query: str, top_k: int = 100) -> Dict[str, float]:\n", " \"\"\"Perform vector search\"\"\"\n", " \n", " # Generate query embedding (disable progress bar)\n", " query_embedding = 
self.embedding_model.encode(\n", " [f\"search_query: {query}\"],\n", " convert_to_numpy=True,\n", " normalize_embeddings=True,\n", " show_progress_bar=False # ← This is the key fix!\n", " )[0].astype(np.float32).tolist()\n", " \n", " query_embedding_array = array.array('f', query_embedding)\n", " \n", " # Execute vector search (convert distance to similarity score)\n", " sql = f\"\"\"\n", " SELECT \n", " doc_id,\n", " ROUND(1.0 - VECTOR_DISTANCE(embedding, :q, COSINE), 4) AS score\n", " FROM {self.table_name}\n", " ORDER BY VECTOR_DISTANCE(embedding, :q, COSINE)\n", " FETCH APPROX FIRST {top_k} ROWS ONLY WITH TARGET ACCURACY 90\n", " \"\"\"\n", " \n", " with self.conn.cursor() as cur:\n", " cur.execute(sql, q=query_embedding_array)\n", " rows = cur.fetchall()\n", " \n", " # Return as dict {doc_id: score}\n", " results = {row[0]: float(row[1]) for row in rows}\n", " return results\n", " \n", " def batch_search(self, queries: Dict[str, str], top_k: int = 100) -> Dict[str, Dict[str, float]]:\n", " \"\"\"Batch search for multiple queries\"\"\"\n", " from tqdm import tqdm\n", " results = {}\n", " for query_id, query_text in tqdm(queries.items(), desc=\"Vector search\", unit=\"query\"):\n", " results[query_id] = self.search(query_text, top_k)\n", " return results" ] }, { "cell_type": "code", "execution_count": null, "id": "0f88fbdf", "metadata": {}, "outputs": [], "source": [ "class BEIRHybridRetriever:\n", " \"\"\"Hybrid retriever combining proper keyword filtering with vector search\"\"\"\n", " \n", " def __init__(self, conn, embedding_model, table_name=\"beir_corpus\"):\n", " self.conn = conn\n", " self.embedding_model = embedding_model\n", " self.table_name = table_name\n", " \n", " def search(self, query: str, top_k: int = 100) -> Dict[str, float]:\n", " \"\"\"Perform hybrid search (keyword prefilter + vector ranking)\"\"\"\n", " \n", " # Generate query embedding (disable progress bar)\n", " query_embedding = self.embedding_model.encode(\n", " [f\"search_query: 
{query}\"],\n", " convert_to_numpy=True,\n", " normalize_embeddings=True,\n", " show_progress_bar=False\n", " )[0].astype(np.float32).tolist()\n", " \n", " query_embedding_array = array.array('f', query_embedding)\n", " \n", " # Extract key words for keyword filter (4+ letters)\n", " words = re.findall(r'\\b[a-zA-Z]{4,}\\b', query.lower())\n", " \n", " if not words:\n", " # No keywords, fall back to pure vector search\n", " sql = f\"\"\"\n", " SELECT \n", " doc_id,\n", " ROUND(1.0 - VECTOR_DISTANCE(embedding, :q, COSINE), 4) AS score\n", " FROM {self.table_name}\n", " ORDER BY VECTOR_DISTANCE(embedding, :q, COSINE)\n", " FETCH APPROX FIRST {top_k} ROWS ONLY WITH TARGET ACCURACY 90\n", " \"\"\"\n", " with self.conn.cursor() as cur:\n", " cur.execute(sql, q=query_embedding_array)\n", " rows = cur.fetchall()\n", " else:\n", " # Build keyword filter with OR conditions (less restrictive)\n", " search_words = words[:3] # Use top 3 words\n", " or_conditions = []\n", " params = {'q': query_embedding_array}\n", " \n", " for i, word in enumerate(search_words):\n", " or_conditions.append(f\"LOWER(text) LIKE :word{i}\")\n", " params[f'word{i}'] = f'%{word}%'\n", " \n", " where_clause = \" OR \".join(or_conditions)\n", " \n", " # Hybrid: keyword prefilter + vector ranking\n", " sql = f\"\"\"\n", " SELECT \n", " doc_id,\n", " ROUND(1.0 - VECTOR_DISTANCE(embedding, :q, COSINE), 4) AS score\n", " FROM {self.table_name}\n", " WHERE {where_clause}\n", " ORDER BY VECTOR_DISTANCE(embedding, :q, COSINE)\n", " FETCH APPROX FIRST {top_k} ROWS ONLY WITH TARGET ACCURACY 90\n", " \"\"\"\n", " \n", " with self.conn.cursor() as cur:\n", " cur.execute(sql, **params)\n", " rows = cur.fetchall()\n", " \n", " # Return as dict {doc_id: score}\n", " results = {row[0]: float(row[1]) for row in rows}\n", " return results\n", " \n", " def batch_search(self, queries: Dict[str, str], top_k: int = 100) -> Dict[str, Dict[str, float]]:\n", " \"\"\"Batch search for multiple queries\"\"\"\n", " from tqdm 
import tqdm\n", " results = {}\n", " for query_id, query_text in tqdm(queries.items(), desc=\"Hybrid search\", unit=\"query\"):\n", " results[query_id] = self.search(query_text, top_k)\n", " return results\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "3a991e66", "metadata": {}, "outputs": [], "source": [ "# Initialize all three retrievers\n", "print(\"\\n Initializing BEIR retrievers...\")\n", "beir_keyword_retriever = BEIRKeywordRetriever(conn)\n", "beir_vector_retriever = BEIRVectorRetriever(conn, embedding_model)\n", "beir_hybrid_retriever = BEIRHybridRetriever(conn, embedding_model)\n", "\n", "print(\" All BEIR retrievers initialized successfully!\")" ] }, { "cell_type": "markdown", "id": "0d2e71a8", "metadata": {}, "source": [ "## Run evaluation for all three retrieval methods" ] }, { "cell_type": "code", "execution_count": null, "id": "ce839775", "metadata": {}, "outputs": [], "source": [ "from beir.retrieval.evaluation import EvaluateRetrieval\n", "\n", "# Initialize evaluator\n", "evaluator = EvaluateRetrieval()\n", "\n", "# Dictionary to store all results\n", "retrieval_results = {}\n", "\n", "# ------------------------------------------------------------------\n", "# 1. 
KEYWORD-BASED RETRIEVAL EVALUATION\n", "# ------------------------------------------------------------------\n", "print(\"\\n [1/3] Evaluating KEYWORD-BASED retrieval...\")\n", "keyword_results = beir_keyword_retriever.batch_search(queries, top_k=100)\n", "keyword_ndcg, keyword_map, keyword_recall, keyword_precision = evaluator.evaluate(\n", " qrels, keyword_results, [1, 3, 5, 10, 100]\n", ")\n", "\n", "retrieval_results['keyword'] = {\n", " 'ndcg': keyword_ndcg,\n", " 'map': keyword_map,\n", " 'recall': keyword_recall,\n", " 'precision': keyword_precision,\n", " 'raw_results': keyword_results\n", "}\n", "\n", "print(f\" Keyword retrieval complete - NDCG@10: {keyword_ndcg.get('NDCG@10', 0):.4f}\")\n" ] }, { "cell_type": "code", "execution_count": null, "id": "39f2a78f", "metadata": {}, "outputs": [], "source": [ "# ------------------------------------------------------------------\n", "# 2. VECTOR-BASED RETRIEVAL EVALUATION\n", "# ------------------------------------------------------------------\n", "print(\"\\n [2/3] Evaluating VECTOR-BASED retrieval...\")\n", "vector_results = beir_vector_retriever.batch_search(queries, top_k=100)\n", "vector_ndcg, vector_map, vector_recall, vector_precision = evaluator.evaluate(\n", " qrels, vector_results, [1, 3, 5, 10, 100]\n", ")\n", "\n", "retrieval_results['vector'] = {\n", " 'ndcg': vector_ndcg,\n", " 'map': vector_map,\n", " 'recall': vector_recall,\n", " 'precision': vector_precision,\n", " 'raw_results': vector_results\n", "}\n", "\n", "print(f\" Vector retrieval complete - NDCG@10: {vector_ndcg.get('NDCG@10', 0):.4f}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "ed1dbf3d", "metadata": {}, "outputs": [], "source": [ "# ------------------------------------------------------------------\n", "# 3. 
HYBRID RETRIEVAL EVALUATION\n", "# ------------------------------------------------------------------\n", "print(\"\\n [3/3] Evaluating HYBRID retrieval...\")\n", "hybrid_results = beir_hybrid_retriever.batch_search(queries, top_k=100)\n", "hybrid_ndcg, hybrid_map, hybrid_recall, hybrid_precision = evaluator.evaluate(\n", " qrels, hybrid_results, [1, 3, 5, 10, 100]\n", ")\n", "\n", "retrieval_results['hybrid'] = {\n", " 'ndcg': hybrid_ndcg,\n", " 'map': hybrid_map,\n", " 'recall': hybrid_recall,\n", " 'precision': hybrid_precision,\n", " 'raw_results': hybrid_results\n", "}\n", "\n", "print(f\" Hybrid retrieval complete - NDCG@10: {hybrid_ndcg.get('NDCG@10', 0):.4f}\")\n", "\n", "print(\"\\n\" + \"=\"*80)\n", "print(\" ALL EVALUATIONS COMPLETE!\")\n", "print(\"=\"*80)" ] }, { "cell_type": "markdown", "id": "6ce1ca61", "metadata": {}, "source": [ "## Create comparison tables for all metrics" ] }, { "cell_type": "code", "execution_count": null, "id": "152a84a1", "metadata": {}, "outputs": [], "source": [ "def create_comparison_table(metric_name, metric_dict_key):\n", " \"\"\"Helper function to create comparison tables\"\"\"\n", " data = []\n", " k_values = [1, 3, 5, 10, 100]\n", " \n", " for k in k_values:\n", " row = {'k': k}\n", " for method in ['keyword', 'vector', 'hybrid']:\n", " metric_key = f\"{metric_name}@{k}\"\n", " row[method] = retrieval_results[method][metric_dict_key].get(metric_key, 0)\n", " data.append(row)\n", " \n", " return pd.DataFrame(data)\n", "\n", "# Create comparison tables\n", "print(\"\\n\" + \"=\"*80)\n", "print(\" RETRIEVAL METHODS COMPARISON - DETAILED METRICS\")\n", "print(\"=\"*80)\n", "\n", "# NDCG Comparison\n", "print(\"\\n NDCG (Normalized Discounted Cumulative Gain) Comparison:\")\n", "print(\"-\" * 80)\n", "ndcg_comparison = create_comparison_table(\"NDCG\", \"ndcg\")\n", "print(ndcg_comparison.to_string(index=False))\n", "\n", "# MAP Comparison\n", "print(\"\\n MAP (Mean Average Precision) Comparison:\")\n", "print(\"-\" * 
80)\n", "map_comparison = create_comparison_table(\"MAP\", \"map\")\n", "print(map_comparison.to_string(index=False))\n", "\n", "# Recall Comparison\n", "print(\"\\n Recall Comparison:\")\n", "print(\"-\" * 80)\n", "recall_comparison = create_comparison_table(\"Recall\", \"recall\")\n", "print(recall_comparison.to_string(index=False))\n", "\n", "# Precision Comparison\n", "print(\"\\n Precision Comparison:\")\n", "print(\"-\" * 80)\n", "precision_comparison = create_comparison_table(\"Precision\", \"precision\")\n", "print(precision_comparison.to_string(index=False))\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "markdown", "id": "81a54220", "metadata": {}, "source": [ "## Create comprehensive comparison charts" ] }, { "cell_type": "code", "execution_count": null, "id": "065a6a88", "metadata": {}, "outputs": [], "source": [ "# Cell 10: Create comprehensive comparison charts\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "# Set up the plotting style\n", "plt.style.use('seaborn-v0_8-darkgrid')\n", "colors = {'keyword': '#FF6B6B', 'vector': '#4ECDC4', 'hybrid': '#95E1D3'}\n", "\n", "# Create a 2x2 subplot figure\n", "fig, axes = plt.subplots(2, 2, figsize=(16, 12))\n", "fig.suptitle('Retrieval Methods Comparison - BEIR Evaluation', fontsize=18, fontweight='bold', y=0.995)\n", "\n", "k_values = [1, 3, 5, 10, 100]\n", "x_pos = np.arange(len(k_values))\n", "bar_width = 0.25\n", "\n", "# ------------------------------------------------------------------\n", "# Plot 1: NDCG Comparison\n", "# ------------------------------------------------------------------\n", "ax1 = axes[0, 0]\n", "for i, method in enumerate(['keyword', 'vector', 'hybrid']):\n", " ndcg_values = [retrieval_results[method]['ndcg'].get(f\"NDCG@{k}\", 0) for k in k_values]\n", " ax1.bar(x_pos + i*bar_width, ndcg_values, bar_width, \n", " label=method.upper(), color=colors[method], alpha=0.8)\n", "\n", "ax1.set_xlabel('k', fontsize=12, fontweight='bold')\n", 
"ax1.set_ylabel('NDCG Score', fontsize=12, fontweight='bold')\n", "ax1.set_title('NDCG@k - Normalized Discounted Cumulative Gain', fontsize=14, fontweight='bold')\n", "ax1.set_xticks(x_pos + bar_width)\n", "ax1.set_xticklabels(k_values)\n", "ax1.legend(loc='lower right')\n", "ax1.grid(axis='y', alpha=0.3)\n", "ax1.set_ylim([0, 1])\n", "\n", "# ------------------------------------------------------------------\n", "# Plot 2: MAP Comparison\n", "# ------------------------------------------------------------------\n", "ax2 = axes[0, 1]\n", "for i, method in enumerate(['keyword', 'vector', 'hybrid']):\n", " map_values = [retrieval_results[method]['map'].get(f\"MAP@{k}\", 0) for k in k_values]\n", " ax2.bar(x_pos + i*bar_width, map_values, bar_width, \n", " label=method.upper(), color=colors[method], alpha=0.8)\n", "\n", "ax2.set_xlabel('k', fontsize=12, fontweight='bold')\n", "ax2.set_ylabel('MAP Score', fontsize=12, fontweight='bold')\n", "ax2.set_title('MAP@k - Mean Average Precision', fontsize=14, fontweight='bold')\n", "ax2.set_xticks(x_pos + bar_width)\n", "ax2.set_xticklabels(k_values)\n", "ax2.legend(loc='lower right')\n", "ax2.grid(axis='y', alpha=0.3)\n", "ax2.set_ylim([0, 1])\n", "\n", "# ------------------------------------------------------------------\n", "# Plot 3: Recall Comparison\n", "# ------------------------------------------------------------------\n", "ax3 = axes[1, 0]\n", "for i, method in enumerate(['keyword', 'vector', 'hybrid']):\n", " recall_values = [retrieval_results[method]['recall'].get(f\"Recall@{k}\", 0) for k in k_values]\n", " ax3.bar(x_pos + i*bar_width, recall_values, bar_width, \n", " label=method.upper(), color=colors[method], alpha=0.8)\n", "\n", "ax3.set_xlabel('k', fontsize=12, fontweight='bold')\n", "ax3.set_ylabel('Recall Score', fontsize=12, fontweight='bold')\n", "ax3.set_title('Recall@k - Proportion of Relevant Docs Retrieved', fontsize=14, fontweight='bold')\n", "ax3.set_xticks(x_pos + bar_width)\n", 
"ax3.set_xticklabels(k_values)\n", "ax3.legend(loc='lower right')\n", "ax3.grid(axis='y', alpha=0.3)\n", "ax3.set_ylim([0, 1])\n", "\n", "# ------------------------------------------------------------------\n", "# Plot 4: Precision Comparison\n", "# ------------------------------------------------------------------\n", "ax4 = axes[1, 1]\n", "for i, method in enumerate(['keyword', 'vector', 'hybrid']):\n", " precision_values = [retrieval_results[method]['precision'].get(f\"P@{k}\", 0) for k in k_values]\n", " ax4.bar(x_pos + i*bar_width, precision_values, bar_width, \n", " label=method.upper(), color=colors[method], alpha=0.8)\n", "\n", "ax4.set_xlabel('k', fontsize=12, fontweight='bold')\n", "ax4.set_ylabel('Precision Score', fontsize=12, fontweight='bold')\n", "ax4.set_title('Precision@k - Proportion of Retrieved Docs Relevant', fontsize=14, fontweight='bold')\n", "ax4.set_xticks(x_pos + bar_width)\n", "ax4.set_xticklabels(k_values)\n", "ax4.legend(loc='upper right')\n", "ax4.grid(axis='y', alpha=0.3)\n", "ax4.set_ylim([0, 1])\n", "\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "id": "3ef3ea56", "metadata": {}, "outputs": [], "source": [ "print(\"\\n\" + \"=\"*80)\n", "print(\" RETRIEVAL METHODS SUMMARY & WINNER ANALYSIS\")\n", "print(\"=\"*80)\n", "\n", "# Calculate average scores across all k values\n", "summary_data = []\n", "\n", "for method in ['keyword', 'vector', 'hybrid']:\n", " avg_ndcg = np.mean([retrieval_results[method]['ndcg'].get(f\"NDCG@{k}\", 0) for k in k_values])\n", " avg_map = np.mean([retrieval_results[method]['map'].get(f\"MAP@{k}\", 0) for k in k_values])\n", " avg_recall = np.mean([retrieval_results[method]['recall'].get(f\"Recall@{k}\", 0) for k in k_values])\n", " avg_precision = np.mean([retrieval_results[method]['precision'].get(f\"P@{k}\", 0) for k in k_values])\n", " \n", " summary_data.append({\n", " 'Method': method.upper(),\n", " 'Avg NDCG': avg_ndcg,\n", " 'Avg MAP': avg_map,\n", " 
'Avg Recall': avg_recall,\n", " 'Avg Precision': avg_precision,\n", " 'Overall Score': (avg_ndcg + avg_map + avg_recall + avg_precision) / 4\n", " })\n", "\n", "summary_df = pd.DataFrame(summary_data)\n", "summary_df = summary_df.round(4)\n", "\n", "print(\"\\n Average Performance Across All k Values:\")\n", "print(\"-\" * 80)\n", "print(summary_df.to_string(index=False))\n", "\n", "# Determine winners for each metric\n", "print(\"\\n Winners by Metric:\")\n", "print(\"-\" * 80)\n", "for metric in ['Avg NDCG', 'Avg MAP', 'Avg Recall', 'Avg Precision', 'Overall Score']:\n", " winner_idx = summary_df[metric].idxmax()\n", " winner = summary_df.loc[winner_idx, 'Method']\n", " score = summary_df.loc[winner_idx, metric]\n", " print(f\"{metric:20s}: {winner:10s} (Score: {score:.4f})\")\n", "\n", "print(\"\\n\" + \"=\"*80)\n", "\n", "# Create a radar chart for overall comparison\n", "fig, ax = plt.subplots(figsize=(10, 10), subplot_kw=dict(projection='polar'))\n", "\n", "categories = ['NDCG@10', 'MAP@10', 'Recall@100', 'Precision@10']\n", "angles = np.linspace(0, 2 * np.pi, len(categories), endpoint=False).tolist()\n", "angles += angles[:1] # Complete the circle\n", "\n", "for method in ['keyword', 'vector', 'hybrid']:\n", " values = [\n", " retrieval_results[method]['ndcg'].get('NDCG@10', 0),\n", " retrieval_results[method]['map'].get('MAP@10', 0),\n", " retrieval_results[method]['recall'].get('Recall@100', 0),\n", " retrieval_results[method]['precision'].get('P@10', 0)\n", " ]\n", " values += values[:1] # Complete the circle\n", " \n", " ax.plot(angles, values, 'o-', linewidth=2.5, label=method.upper(), color=colors[method])\n", " ax.fill(angles, values, alpha=0.15, color=colors[method])\n", "\n", "ax.set_xticks(angles[:-1])\n", "ax.set_xticklabels(categories, size=12)\n", "ax.set_ylim(0, 1)\n", "ax.set_title('Retrieval Methods - Radar Comparison', size=16, fontweight='bold', pad=20)\n", "ax.legend(loc='upper right', bbox_to_anchor=(1.3, 1.1))\n", "ax.grid(True)\n", 
"\n", "plt.tight_layout()\n", "plt.show()\n", "\n", "print(\"\\n Phase 1 Evaluation Complete - All metrics stored in 'retrieval_results' variable\")" ] }, { "cell_type": "markdown", "id": "7f13ab7e", "metadata": {}, "source": [ "# Part 2: Evaluating RAG Pipelines" ] }, { "cell_type": "code", "execution_count": null, "id": "4d9734b1", "metadata": {}, "outputs": [], "source": [ "! pip install -qU \"galileo[openai]\"" ] }, { "cell_type": "code", "execution_count": null, "id": "9188eb05", "metadata": {}, "outputs": [], "source": [ "import getpass\n", "import os\n", "\n", "# Function to securely get and set environment variables\n", "def set_env_securely(var_name, prompt):\n", " value = getpass.getpass(prompt)\n", " os.environ[var_name] = value" ] }, { "cell_type": "code", "execution_count": null, "id": "e8470521", "metadata": {}, "outputs": [], "source": [ "set_env_securely(\"GALILEO_API_KEY\", \"Enter your Galileo API key: \")" ] }, { "cell_type": "code", "execution_count": null, "id": "6b6d70d9", "metadata": {}, "outputs": [], "source": [ "set_env_securely(\"OPENAI_API_KEY\", \"Enter your OpenAI API key: \")" ] }, { "cell_type": "code", "execution_count": null, "id": "53ade417", "metadata": {}, "outputs": [], "source": [ "import os\n", "from galileo.openai import openai\n", "\n", "# Initialize the Galileo wrapped OpenAI client\n", "openai_client = openai.OpenAI(api_key=os.environ.get(\"OPENAI_API_KEY\"))" ] }, { "cell_type": "code", "execution_count": null, "id": "1aed083f", "metadata": {}, "outputs": [], "source": [ "from galileo import log\n", "\n", "def hybrid_search_beir_corpus(conn, embedding_model, search_phrase: str, top_k: int = 10, show_explain: bool = False):\n", " \"\"\"\n", " Hybrid search on the beir_corpus table\n", " Combines keyword filtering with vector similarity search.\n", " \n", " Returns:\n", " tuple: (rows, columns, exec_plan_text)\n", " \n", " NOTE: This function is decorated with @log to capture retrieval metrics in Galileo\n", " \"\"\"\n", " 
# Generate query embedding\n", " query_embedding = embedding_model.encode(\n", " [f\"search_query: {search_phrase}\"],\n", " convert_to_numpy=True,\n", " normalize_embeddings=True,\n", " show_progress_bar=False\n", " )[0].astype(np.float32).tolist()\n", " \n", " query_embedding_array = array.array('f', query_embedding)\n", " \n", " # Extract keywords for filtering (4+ letter words).\n", " # The regex admits only ASCII letters, so interpolating these words into\n", " # the LIKE conditions below cannot inject SQL.\n", " words = re.findall(r'\\b[a-zA-Z]{4,}\\b', search_phrase.lower())\n", " \n", " # Build keyword filter conditions\n", " if words:\n", " search_words = words[:3] # Use the first 3 keywords\n", " or_conditions = []\n", " \n", " for word in search_words:\n", " or_conditions.append(f\"(LOWER(title) LIKE '%{word}%' OR LOWER(text) LIKE '%{word}%')\")\n", " \n", " where_clause = \" OR \".join(or_conditions)\n", " else:\n", " # No keywords, search everything\n", " where_clause = \"1=1\"\n", " \n", " # Hybrid search SQL\n", " sql = f\"\"\"\n", " SELECT {\"/*+ GATHER_PLAN_STATISTICS */\" if show_explain else \"\"}\n", " doc_id,\n", " title,\n", " SUBSTR(text, 1, 500) AS text_snippet,\n", " ROUND(1.0 - VECTOR_DISTANCE(embedding, :q, COSINE), 4) AS similarity_score\n", " FROM beir_corpus\n", " WHERE {where_clause}\n", " ORDER BY VECTOR_DISTANCE(embedding, :q, COSINE)\n", " FETCH APPROX FIRST {top_k} ROWS ONLY WITH TARGET ACCURACY 90\n", " \"\"\"\n", " \n", " with conn.cursor() as cur:\n", " cur.execute(sql, q=query_embedding_array)\n", " rows = cur.fetchall()\n", " columns = [desc[0] for desc in cur.description]\n", " \n", " # Get execution plan if requested\n", " exec_plan_text = None\n", " if show_explain:\n", " cur.execute(\"SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT => 'TYPICAL'))\")\n", " exec_plan_text = \"\\n\".join([row[0] for row in cur.fetchall() if row[0]])\n", " \n", " # Format output for Galileo retriever span (return list of documents)\n", " @log(span_type=\"retriever\", name=\"Hybrid Search - BEIR Corpus\")\n", " def get_retrieved_docs(rows):\n", " 
retrieved_docs = []\n", " for row in rows:\n", " row_data = dict(zip(columns, row))\n", " retrieved_docs.append({\n", " \"doc_id\": row_data.get(\"DOC_ID\"),\n", " \"title\": row_data.get(\"TITLE\"),\n", " \"snippet\": row_data.get(\"TEXT_SNIPPET\"),\n", " \"score\": float(row_data.get(\"SIMILARITY_SCORE\", 0))\n", " })\n", " return retrieved_docs\n", "\n", "\n", " retrieved_docs = get_retrieved_docs(rows)\n", " \n", " return rows, columns, exec_plan_text, retrieved_docs\n", "\n", "\n", "print(\" Hybrid search function for beir_corpus table created!\")" ] }, { "cell_type": "code", "execution_count": null, "id": "ad82fd15", "metadata": {}, "outputs": [], "source": [ "@log(span_type=\"workflow\", name=\"Research Paper RAG Pipeline\")\n", "def research_paper_assistant_rag_pipeline(\n", " conn,\n", " embedding_model,\n", " user_query: str,\n", " top_k: int = 10,\n", " retrieval_mode: str = \"hybrid\",\n", " show_explain: bool = False\n", "):\n", " \"\"\"\n", " Research Paper Assistant — Retrieval-Augmented Generation (RAG) pipeline\n", " built on SQL-based retrieval functions and powered by the OpenAI Chat Completions API.\n", "\n", " Retrieval techniques available:\n", " - 'keyword' → uses keyword_search_beir_corpus()\n", " - 'vector' → uses vector_search_beir_corpus()\n", " - 'hybrid' → uses hybrid_search_beir_corpus() [default]\n", "\n", " Args:\n", " conn: Oracle database connection.\n", " embedding_model: Embedding model (e.g., SentenceTransformer, Voyage).\n", " user_query (str): Research question from the user.\n", " top_k (int): Number of top documents to retrieve.\n", " retrieval_mode (str): Retrieval method ('keyword', 'vector', 'hybrid').\n", " show_explain (bool): Whether to show the SQL execution plan.\n", "\n", " Returns:\n", " str: LLM-generated research synthesis with citations.\n", " \n", " NOTE: This function is decorated with @log to create a workflow span containing\n", " the retrieval span (from hybrid_search_beir_corpus) and LLM span (from 
OpenAI).\n", " \"\"\"\n", "\n", " # ----------------------------------------------------------------------\n", " # 1. Retrieve relevant research papers using the selected retrieval mode\n", " # ----------------------------------------------------------------------\n", " if retrieval_mode == \"keyword\":\n", " # Keyword-only retrieval helper from the Phase 1 evaluation\n", " rows, columns, exec_plan_text, retrieved_docs = keyword_search_beir_corpus(conn, user_query)\n", "\n", " elif retrieval_mode == \"vector\":\n", " # Vector-only retrieval helper from the Phase 1 evaluation\n", " rows, columns, exec_plan_text, retrieved_docs = vector_search_beir_corpus(conn, user_query)\n", "\n", " else: # default: hybrid retrieval\n", " rows, columns, exec_plan_text, retrieved_docs = hybrid_search_beir_corpus(\n", " conn=conn,\n", " embedding_model=embedding_model,\n", " search_phrase=user_query,\n", " top_k=top_k,\n", " show_explain=show_explain\n", " )\n", "\n", " retrieved_count = len(rows) if rows else 0\n", " print(f\" Retrieved {retrieved_count} papers using {retrieval_mode.upper()} retrieval.\")\n", "\n", "\n", " # ----------------------------------------------------------------------\n", " # 2. 
Convert retrieved rows to formatted LLM context\n", " # ----------------------------------------------------------------------\n", " formatted_context = \"\"\n", " if retrieved_count > 0:\n", " formatted_context += f\"\\n\\n📚 {retrieved_count} relevant research papers retrieved:\\n\\n\"\n", " for i, row in enumerate(rows):\n", " row_data = dict(zip(columns, row))\n", " title = row_data.get(\"TITLE\", \"Untitled Paper\")\n", " # The hybrid search query selects no ABSTRACT column, so add an\n", " # abstract line only when the retrieval helper supplies one.\n", " abstract = row_data.get(\"ABSTRACT\")\n", " snippet = row_data.get(\"TEXT_SNIPPET\", \"\")\n", " score = (\n", " row_data.get(\"SIMILARITY_SCORE\")\n", " or row_data.get(\"TEXT_RELEVANCE_SCORE\")\n", " or \"N/A\"\n", " )\n", " formatted_context += f\"[{i+1}] **{title}**\\n\"\n", " if abstract:\n", " formatted_context += f\"Abstract: {abstract}\\n\"\n", " formatted_context += (\n", " f\"Snippet: {snippet}\\n\"\n", " f\"Relevance Score: {score}\\n\\n\"\n", " )\n", " else:\n", " formatted_context = \"\\n\\n⚠️ No relevant papers were retrieved from the database.\\n\"\n", "\n", " # ----------------------------------------------------------------------\n", " # 3. Construct the prompt for the Chat Completions API\n", " # ----------------------------------------------------------------------\n", " prompt = f\"\"\"\n", " You are a **Research Paper Assistant** that synthesizes academic literature to help answer user questions.\n", "\n", " User Query: {user_query}\n", "\n", " Number of retrieved papers: {retrieved_count}\n", " {formatted_context}\n", "\n", " Please:\n", " - Summarize the findings most relevant to the query.\n", " - Use citation numbers [X] to support claims.\n", " - Highlight consensus, innovation, or research gaps.\n", " - If there is insufficient context, clearly say so.\n", " \"\"\"\n", "\n", " # ----------------------------------------------------------------------\n", " # 4. 
Call the OpenAI Chat Completions API\n", " # ----------------------------------------------------------------------\n", " response = openai_client.chat.completions.create(\n", " model=\"gpt-4o\",\n", " messages=[\n", " {\"role\": \"system\", \"content\": \"You are a scientific research assistant. Use only the provided context to answer. Always cite papers [1], [2], etc.\"},\n", " {\"role\": \"user\", \"content\": prompt}\n", " ]\n", " )\n", "\n", " # ----------------------------------------------------------------------\n", " # 5. Optionally print SQL execution plan (if hybrid)\n", " # ----------------------------------------------------------------------\n", " if show_explain and exec_plan_text:\n", " print(\"\\n====== SQL Execution Plan ======\")\n", " print(exec_plan_text)\n", " print(\"================================\\n\")\n", "\n", " # ----------------------------------------------------------------------\n", " # 6. Return the LLM’s output text\n", " # ----------------------------------------------------------------------\n", " return response.choices[0].message.content\n" ] }, { "cell_type": "code", "execution_count": null, "id": "733a8ee7", "metadata": {}, "outputs": [], "source": [ "from galileo import galileo_context\n", "\n", "galileo_context.init(\n", " project=\"ai_system_evaluation_project\",\n", " log_stream=\"ai_system_evaluation_showcase\"\n", ")\n", "\n", "summary = research_paper_assistant_rag_pipeline(\n", " conn=conn,\n", " embedding_model=embedding_model,\n", " user_query=\"Can you get me some information on the research in the field of AI?\",\n", " top_k=5,\n", " retrieval_mode=\"hybrid\",\n", " show_explain=False\n", ")\n", "\n", "# Flush the logger to ensure all traces are uploaded to Galileo\n", "# Note: The @log decorator automatically flushes when the decorated function exits,\n", "# but in notebooks it's good practice to explicitly flush to ensure data is sent\n", "galileo_context.flush()\n", "print(\" RAG pipeline traces flushed to 
Galileo\")" ] }, { "cell_type": "code", "execution_count": null, "id": "e0e674b7", "metadata": {}, "outputs": [], "source": [ "print(summary)" ] } ], "metadata": { "kernelspec": { "display_name": "playground", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.14" } }, "nbformat": 4, "nbformat_minor": 5 }