{ "cells": [ { "cell_type": "markdown", "id": "f6bea003", "metadata": {}, "source": [ "# Oracle RAG with Retrieval and Generation Evaluations\n", "\n", "--------" ] }, { "cell_type": "markdown", "id": "6b223d47", "metadata": {}, "source": [ "[![Open in Colab](https://img.shields.io/badge/Open%20in-Colab-F9AB00?style=flat-square&logo=googlecolab)](https://colab.research.google.com/github/oracle-devrel/oracle-ai-developer-hub/blob/main/notebooks/oracle_rag_with_evals.ipynb)\n", "\n", "This notebook shows how to build and evaluate an Oracle AI Database RAG pipeline using BEIR retrieval benchmarks and answer-level RAG evaluation metrics." ] }, { "cell_type": "markdown", "id": "3cfd3258", "metadata": {}, "source": [ "## What You'll Learn\n", "\n", "- How to run Oracle AI Database 26ai locally and connect from Python.\n", "- How to load BEIR data and prepare evaluation queries and documents.\n", "- How to generate embeddings and ingest vectors into Oracle AI Database.\n", "- How to evaluate keyword, vector, and hybrid retrieval strategies.\n", "- How to evaluate complete RAG pipelines and compare quality metrics." ] }, { "cell_type": "code", "execution_count": null, "id": "e322ad7c", "metadata": {}, "outputs": [], "source": [ "! pip install -Uq oracledb pandas sentence-transformers datasets einops \"numpy<2.0\" beir matplotlib" ] }, { "cell_type": "markdown", "id": "010cf528", "metadata": {}, "source": [ "## 1. Oracle AI Database (26ai) Local Installation" ] }, { "cell_type": "markdown", "id": "0308b61d", "metadata": {}, "source": [ "1. Install oracle via docker\n", "2. Ensure that docker engine is runnning\n", "3. Pull docker image \n", "4. 
Run a container from the Oracle image\n", " ```\n", " docker run -d \\\n", " --name oracle-full \\\n", " -p 1521:1521 -p 5500:5500 \\\n", " -e ORACLE_PWD=OraclePwd_2025 \\\n", " -e ORACLE_SID=FREE \\\n", " -e ORACLE_PDB=FREEPDB1 \\\n", " -v ~/oracle/full_data:/opt/oracle/oradata \\\n", " container-registry.oracle.com/database/free:latest\n", " ```\n", "\n", "The connection below uses a `VECTOR` application user. Once the container is up, create this user in `FREEPDB1` (for example: `CREATE USER vector IDENTIFIED BY VectorPwd_2025`, then grant `DB_DEVELOPER_ROLE` and a quota on the `USERS` tablespace).\n" ] }, { "cell_type": "code", "execution_count": null, "id": "de8defa7", "metadata": {}, "outputs": [], "source": [ "import oracledb\n", "\n", "conn = oracledb.connect(\n", " user=\"VECTOR\",\n", " password=\"VectorPwd_2025\", # password of the VECTOR user created above (not ORACLE_PWD)\n", " dsn=\"localhost:1521/FREEPDB1\"\n", ")\n", "\n", "with conn.cursor() as cur:\n", " cur.execute(\"SELECT banner FROM v$version WHERE banner LIKE 'Oracle%'\")\n", " print(cur.fetchone()[0])" ] }, { "cell_type": "markdown", "id": "02032d60", "metadata": {}, "source": [ "## 2. Data Loading: Import BEIR and Set Up Evaluation" ] }, { "cell_type": "markdown", "id": "07392848", "metadata": {}, "source": [ "BEIR (Benchmarking Information Retrieval) provides standardized evaluation datasets and metrics for retrieval models and RAG pipelines." ] }, { "cell_type": "code", "execution_count": null, "id": "7d347503", "metadata": {}, "outputs": [], "source": [ "import logging\n", "from beir import util, LoggingHandler\n", "\n", "# Setup logging\n", "logging.basicConfig(format='%(asctime)s - %(message)s',\n", " datefmt='%Y-%m-%d %H:%M:%S',\n", " level=logging.INFO,\n", " handlers=[LoggingHandler()])" ] }, { "cell_type": "code", "execution_count": null, "id": "b2fc4747", "metadata": {}, "outputs": [], "source": [ "import pathlib\n", "import numpy as np\n", "from beir.datasets.data_loader import GenericDataLoader\n", "from beir.retrieval.evaluation import EvaluateRetrieval\n", "\n", "\n", "# Download and load a BEIR dataset (e.g., scifact - scientific papers)\n", "dataset = \"scifact\"\n", "url = f\"https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{dataset}.zip\"\n", "data_path = 
util.download_and_unzip(url, \"datasets\")\n", "\n", "# Load the dataset\n", "corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split=\"test\")\n", "\n", "print(f\" Loaded {len(corpus)} documents, {len(queries)} queries\")\n", "print(f\"Sample corpus document: {list(corpus.values())[0]}\")\n", "print(f\"Sample query: {list(queries.values())[0]}\")" ] }, { "cell_type": "markdown", "id": "1fc32d3a", "metadata": {}, "source": [ "| Component | Description | Example shown below |\n", "| ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------- |\n", "| **Corpus** | The collection of documents you can retrieve from. Each has an ID (`doc_id`), text body, and optional title. | “Microstructural development of human newborn...” |\n", "| **Queries** | The search inputs or questions used to test your retriever. Each has a unique `query_id` and text. | “A deficiency of vitamin B12 increases blood...” |\n", "| **Qrels (Query Relevance Judgments)** | Ground-truth labels that indicate which documents are relevant for each query. Each entry maps a `query_id` to a `doc_id` with a relevance score. | `query_id=1, doc_id=31715818, relevance=1.0` |\n" ] }, { "cell_type": "code", "execution_count": null, "id": "3eacc415", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "# Corpus\n", "corpus_df = pd.DataFrame.from_dict(corpus, orient=\"index\")\n", "corpus_df.reset_index(inplace=True)\n", "corpus_df.rename(columns={\"index\": \"doc_id\"}, inplace=True)\n", "\n", "# Queries\n", "queries_df = pd.DataFrame(list(queries.items()), columns=[\"query_id\", \"query\"])\n", "\n", "# Qrels\n", "qrels_df = pd.DataFrame.from_dict(qrels, orient=\"index\").stack().reset_index()\n", "qrels_df.columns = [\"query_id\", \"doc_id\", \"relevance\"]\n", "\n", "# --- 4. 
Display a few rows from each ---\n", "print(\"\\n Corpus sample:\")\n", "display(corpus_df.head())\n", "\n", "print(\"\\n Queries sample:\")\n", "display(queries_df.head())\n", "\n", "print(\"\\n Qrels sample:\")\n", "display(qrels_df.head())" ] }, { "cell_type": "markdown", "id": "77ac879d", "metadata": {}, "source": [ "## 3. Embedding Generation" ] }, { "cell_type": "code", "execution_count": null, "id": "939be3b7", "metadata": {}, "outputs": [], "source": [ "from sentence_transformers import SentenceTransformer\n", "embedding_model = SentenceTransformer(\"nomic-ai/nomic-embed-text-v1.5\", trust_remote_code=True)" ] }, { "cell_type": "code", "execution_count": null, "id": "be75832a", "metadata": {}, "outputs": [], "source": [ "# Cell 3: Generate embeddings for BEIR corpus documents (one-by-one with single progress bar)\n", "from tqdm import tqdm\n", "import pandas as pd\n", "\n", "print(f\" Generating embeddings for {len(corpus)} BEIR documents...\")\n", "\n", "# Prepare corpus data for embedding\n", "corpus_data = []\n", "for doc_id, doc_content in corpus.items():\n", " title = doc_content.get('title', '')\n", " text = doc_content.get('text', '')\n", " \n", " # Combine title and text for embedding\n", " combined_text = f\"{title} {text}\".strip()\n", " \n", " corpus_data.append({\n", " 'doc_id': doc_id,\n", " 'title': title,\n", " 'text': text,\n", " 'combined_text': combined_text\n", " })\n", "\n", "corpus_df = pd.DataFrame(corpus_data)\n", "\n", "# Add prefix for retrieval-style embeddings\n", "corpus_df[\"text_prefixed\"] = corpus_df[\"combined_text\"].apply(\n", " lambda x: f\"search_document: {x}\"\n", ")\n", "\n", "print(f\" Corpus prepared: {len(corpus_df)} documents\")\n", "\n", "# Generate embeddings one-by-one with single progress bar\n", "print(\" Encoding embeddings one-by-one (this will take a few minutes)...\")\n", "embeddings = []\n", "\n", "for text in tqdm(corpus_df[\"text_prefixed\"], desc=\"Generating embeddings\", unit=\"doc\"):\n", " # 
Generate embedding for single document\n", " embedding = embedding_model.encode(\n", " [text], # Pass as single-item list\n", " convert_to_numpy=True,\n", " normalize_embeddings=True,\n", " show_progress_bar=False # Disable internal progress bar\n", " )[0] # Extract first (and only) embedding\n", " \n", " # Convert to float32 and store as list\n", " embeddings.append(embedding.astype(np.float32).tolist())\n", "\n", "# Add embeddings to dataframe\n", "corpus_df[\"embedding\"] = embeddings\n", "\n", "embedding_dim = len(corpus_df[\"embedding\"].iloc[0])\n", "print(f\" Embeddings generated! Dimension: {embedding_dim}\")\n", "\n", "corpus_df.head(2)" ] }, { "cell_type": "code", "execution_count": null, "id": "5dadff8a", "metadata": {}, "outputs": [], "source": [ "corpus_df.head(2)" ] }, { "cell_type": "markdown", "id": "7072118f", "metadata": {}, "source": [ "# 4. Data Ingestion into Oracle AI Database" ] }, { "cell_type": "code", "execution_count": null, "id": "640fc616", "metadata": {}, "outputs": [], "source": [ "# Cell 4: Create BEIR evaluation table in Oracle\n", "ddl = f\"\"\"\n", "BEGIN\n", " EXECUTE IMMEDIATE 'DROP TABLE beir_corpus';\n", "EXCEPTION WHEN OTHERS THEN\n", " IF SQLCODE != -942 THEN RAISE; END IF;\n", "END;\n", "/\n", "CREATE TABLE beir_corpus (\n", " doc_id VARCHAR2(255) PRIMARY KEY,\n", " title VARCHAR2(4000),\n", " text CLOB,\n", " embedding VECTOR({embedding_dim}, FLOAT32)\n", ")\n", "TABLESPACE USERS\n", "\"\"\"\n", "\n", "with conn.cursor() as cur:\n", " for stmt in ddl.split(\"/\"):\n", " if stmt.strip():\n", " cur.execute(stmt)\n", "\n", "conn.commit()\n", "print(f\" Table BEIR_CORPUS created with VECTOR dimension: {embedding_dim}\")" ] }, { "cell_type": "markdown", "id": "80aef076", "metadata": {}, "source": [ "Create vector index on BEIR corpus" ] }, { "cell_type": "code", "execution_count": null, "id": "b95f61e5", "metadata": {}, "outputs": [], "source": [ "with conn.cursor() as cur:\n", " cur.execute(\"\"\"\n", " CREATE VECTOR INDEX 
BEIR_VEC_IVF\n", " ON beir_corpus(embedding)\n", " ORGANIZATION NEIGHBOR PARTITIONS\n", " DISTANCE COSINE\n", " WITH TARGET ACCURACY 90\n", " TABLESPACE USERS\n", " \"\"\")\n", "\n", "conn.commit()\n", "print(\" Vector Index BEIR_VEC_IVF created\")" ] }, { "cell_type": "markdown", "id": "8d5a03b7", "metadata": {}, "source": [ "Create Oracle Text Indexes" ] }, { "cell_type": "code", "execution_count": null, "id": "723744f2", "metadata": {}, "outputs": [], "source": [ "print(\" Setting up Oracle Text for proper keyword search...\")\n", "\n", "try:\n", " with conn.cursor() as cur:\n", " # Drop the existing index if it exists\n", " try:\n", " cur.execute(\"DROP INDEX beir_text_idx\")\n", " except Exception:\n", " pass\n", " \n", " # Create CONTEXT index on text column\n", " cur.execute(\"\"\"\n", " CREATE INDEX beir_text_idx \n", " ON beir_corpus(text) \n", " INDEXTYPE IS CTXSYS.CONTEXT\n", " PARAMETERS('SYNC (ON COMMIT)')\n", " \"\"\")\n", " \n", " # Also index title\n", " try:\n", " cur.execute(\"DROP INDEX beir_title_idx\")\n", " except Exception:\n", " pass\n", " \n", " cur.execute(\"\"\"\n", " CREATE INDEX beir_title_idx \n", " ON beir_corpus(title) \n", " INDEXTYPE IS CTXSYS.CONTEXT\n", " PARAMETERS('SYNC (ON COMMIT)')\n", " \"\"\")\n", " \n", " conn.commit()\n", " print(\" Oracle Text indexes created successfully!\")\n", " \n", "except Exception as e:\n", " print(f\" Oracle Text not available or error: {e}\")\n", " print(\" Falling back to LIKE-based search\")" ] }, { "cell_type": "code", "execution_count": null, "id": "7b4d5d04", "metadata": {}, "outputs": [], "source": [ "from tqdm import tqdm\n", "import array\n", "\n", "rows = []\n", "for _, row in corpus_df.iterrows():\n", " # Convert embedding list to array.array for proper VECTOR binding\n", " embedding_array = array.array('f', row.get(\"embedding\"))\n", " \n", " rows.append((\n", " row.get(\"doc_id\"),\n", " row.get(\"title\"),\n", " row.get(\"text\"),\n", " embedding_array\n", " ))\n", "\n", "print(f\" Inserting {len(rows)} documents 
into BEIR_CORPUS...\")\n", "\n", "with conn.cursor() as cur:\n", " # executemany is far faster than inserting row by row for a corpus this size\n", " cur.executemany(\n", " \"\"\"\n", " INSERT INTO beir_corpus (doc_id, title, text, embedding)\n", " VALUES (:1, :2, :3, :4)\n", " \"\"\",\n", " rows\n", " )\n", "\n", "conn.commit()\n", "print(\" BEIR corpus inserted successfully!\")\n", "\n", "# Verify insertion\n", "with conn.cursor() as cur:\n", " cur.execute(\"SELECT COUNT(*) FROM beir_corpus\")\n", " count = cur.fetchone()[0]\n", " print(f\" Total documents in BEIR_CORPUS: {count}\")" ] }, { "cell_type": "markdown", "id": "6398b1ea", "metadata": {}, "source": [ "# Part 1: Evaluating Information Retrieval Pipeline" ] }, { "cell_type": "code", "execution_count": null, "id": "80bf5a32", "metadata": {}, "outputs": [], "source": [ "from typing import Dict\n", "import re\n", "\n", "class BEIRKeywordRetriever:\n", " \"\"\"Keyword retriever using Oracle Text CONTAINS, with error logging\"\"\"\n", " \n", " def __init__(self, conn, table_name=\"beir_corpus\"):\n", " self.conn = conn\n", " self.table_name = table_name\n", " \n", " def search(self, query: str, top_k: int = 100) -> Dict[str, float]:\n", " \"\"\"Perform keyword search using CONTAINS operator\"\"\"\n", " \n", " # Extract key terms and keep mixed alphanumeric tokens (e.g., CCL19)\n", " words = re.findall(r'\\b[a-zA-Z][a-zA-Z0-9]{3,}\\b', query.lower())\n", " \n", " if not words:\n", " print(f\" No words extracted from query: '{query}'\")\n", " return {}\n", " \n", " # Build Oracle Text query with escaped literals (prevents reserved-word parser errors)\n", " unique_words = list(dict.fromkeys(words))\n", " oracle_text_query = ' OR '.join(f'{{{w}}}' for w in unique_words[:5])\n", " \n", " sql = f\"\"\"\n", " SELECT doc_id, SCORE(1) as relevance_score\n", " FROM {self.table_name}\n", " WHERE CONTAINS(text, :query, 1) > 0\n", " ORDER BY SCORE(1) DESC\n", " FETCH FIRST {top_k} ROWS ONLY\n", " \"\"\"\n", " \n", " try:\n", " with self.conn.cursor() as cur:\n", " cur.execute(sql, 
query=oracle_text_query)\n", " rows = cur.fetchall()\n", " \n", " # Return normalized scores\n", " if rows:\n", " max_score = max(row[1] for row in rows) if rows else 1\n", " if max_score == 0:\n", " max_score = 1\n", " results = {row[0]: float(row[1]) / max_score for row in rows}\n", " return results\n", " else:\n", " return {}\n", " \n", " except Exception as e:\n", " # Log the actual error instead of silently failing\n", " print(f\" CONTAINS error for query '{query[:50]}...': {e}\")\n", " import traceback\n", " traceback.print_exc()\n", " return {}\n", " \n", " def batch_search(self, queries: Dict[str, str], top_k: int = 100) -> Dict[str, Dict[str, float]]:\n", " \"\"\"Batch search for multiple queries\"\"\"\n", " from tqdm import tqdm\n", " results = {}\n", " error_count = 0\n", " \n", " for query_id, query_text in tqdm(queries.items(), desc=\"Keyword search\", unit=\"query\"):\n", " try:\n", " result = self.search(query_text, top_k)\n", " results[query_id] = result\n", " if not result:\n", " error_count += 1\n", " except Exception as e:\n", " print(f\" Error on query {query_id}: {e}\")\n", " results[query_id] = {}\n", " error_count += 1\n", " \n", " if error_count > 0:\n", " print(f\"\\n {error_count}/{len(queries)} queries returned empty results\")\n", " \n", " return results" ] }, { "cell_type": "code", "execution_count": null, "id": "38dddf57", "metadata": {}, "outputs": [], "source": [ "# Cell: Update Vector Retriever to hide embedding progress bars\n", "class BEIRVectorRetriever:\n", " \"\"\"Vector-based retriever using Oracle vector search on BEIR corpus\"\"\"\n", " \n", " def __init__(self, conn, embedding_model, table_name=\"beir_corpus\"):\n", " self.conn = conn\n", " self.embedding_model = embedding_model\n", " self.table_name = table_name\n", " \n", " def search(self, query: str, top_k: int = 100) -> Dict[str, float]:\n", " \"\"\"Perform vector search\"\"\"\n", " \n", " # Generate query embedding (disable progress bar)\n", " query_embedding = 
self.embedding_model.encode(\n", " [f\"search_query: {query}\"],\n", " convert_to_numpy=True,\n", " normalize_embeddings=True,\n", " show_progress_bar=False # ← This is the key fix!\n", " )[0].astype(np.float32).tolist()\n", " \n", " query_embedding_array = array.array('f', query_embedding)\n", " \n", " # Execute vector search (convert distance to similarity score)\n", " sql = f\"\"\"\n", " SELECT \n", " doc_id,\n", " ROUND(1.0 - VECTOR_DISTANCE(embedding, :q, COSINE), 4) AS score\n", " FROM {self.table_name}\n", " ORDER BY VECTOR_DISTANCE(embedding, :q, COSINE)\n", " FETCH APPROX FIRST {top_k} ROWS ONLY WITH TARGET ACCURACY 90\n", " \"\"\"\n", " \n", " with self.conn.cursor() as cur:\n", " cur.execute(sql, q=query_embedding_array)\n", " rows = cur.fetchall()\n", " \n", " # Return as dict {doc_id: score}\n", " results = {row[0]: float(row[1]) for row in rows}\n", " return results\n", " \n", " def batch_search(self, queries: Dict[str, str], top_k: int = 100) -> Dict[str, Dict[str, float]]:\n", " \"\"\"Batch search for multiple queries\"\"\"\n", " from tqdm import tqdm\n", " results = {}\n", " for query_id, query_text in tqdm(queries.items(), desc=\"Vector search\", unit=\"query\"):\n", " results[query_id] = self.search(query_text, top_k)\n", " return results" ] }, { "cell_type": "code", "execution_count": null, "id": "0f88fbdf", "metadata": {}, "outputs": [], "source": [ "class BEIRHybridRetriever:\n", " \"\"\"Hybrid retriever combining proper keyword filtering with vector search\"\"\"\n", " \n", " def __init__(self, conn, embedding_model, table_name=\"beir_corpus\"):\n", " self.conn = conn\n", " self.embedding_model = embedding_model\n", " self.table_name = table_name\n", " \n", " def search(self, query: str, top_k: int = 100) -> Dict[str, float]:\n", " \"\"\"Perform hybrid search (keyword prefilter + vector ranking)\"\"\"\n", " \n", " # Generate query embedding (disable progress bar)\n", " query_embedding = self.embedding_model.encode(\n", " [f\"search_query: 
{query}\"],\n", " convert_to_numpy=True,\n", " normalize_embeddings=True,\n", " show_progress_bar=False\n", " )[0].astype(np.float32).tolist()\n", " \n", " query_embedding_array = array.array('f', query_embedding)\n", " \n", " # Extract key words for keyword filter (4+ letters)\n", " words = re.findall(r'\\b[a-zA-Z]{4,}\\b', query.lower())\n", " \n", " if not words:\n", " # No keywords, fall back to pure vector search\n", " sql = f\"\"\"\n", " SELECT \n", " doc_id,\n", " ROUND(1.0 - VECTOR_DISTANCE(embedding, :q, COSINE), 4) AS score\n", " FROM {self.table_name}\n", " ORDER BY VECTOR_DISTANCE(embedding, :q, COSINE)\n", " FETCH APPROX FIRST {top_k} ROWS ONLY WITH TARGET ACCURACY 90\n", " \"\"\"\n", " with self.conn.cursor() as cur:\n", " cur.execute(sql, q=query_embedding_array)\n", " rows = cur.fetchall()\n", " else:\n", " # Build keyword filter with OR conditions (less restrictive)\n", " search_words = words[:3] # Use top 3 words\n", " or_conditions = []\n", " params = {'q': query_embedding_array}\n", " \n", " for i, word in enumerate(search_words):\n", " or_conditions.append(f\"LOWER(text) LIKE :word{i}\")\n", " params[f'word{i}'] = f'%{word}%'\n", " \n", " where_clause = \" OR \".join(or_conditions)\n", " \n", " # Hybrid: keyword prefilter + vector ranking\n", " sql = f\"\"\"\n", " SELECT \n", " doc_id,\n", " ROUND(1.0 - VECTOR_DISTANCE(embedding, :q, COSINE), 4) AS score\n", " FROM {self.table_name}\n", " WHERE {where_clause}\n", " ORDER BY VECTOR_DISTANCE(embedding, :q, COSINE)\n", " FETCH APPROX FIRST {top_k} ROWS ONLY WITH TARGET ACCURACY 90\n", " \"\"\"\n", " \n", " with self.conn.cursor() as cur:\n", " cur.execute(sql, **params)\n", " rows = cur.fetchall()\n", " \n", " # Return as dict {doc_id: score}\n", " results = {row[0]: float(row[1]) for row in rows}\n", " return results\n", " \n", " def batch_search(self, queries: Dict[str, str], top_k: int = 100) -> Dict[str, Dict[str, float]]:\n", " \"\"\"Batch search for multiple queries\"\"\"\n", " from tqdm 
import tqdm\n", " results = {}\n", " for query_id, query_text in tqdm(queries.items(), desc=\"Hybrid search\", unit=\"query\"):\n", " results[query_id] = self.search(query_text, top_k)\n", " return results\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "3a991e66", "metadata": {}, "outputs": [], "source": [ "# Initialize all three retrievers\n", "print(\"\\n Initializing BEIR retrievers...\")\n", "beir_keyword_retriever = BEIRKeywordRetriever(conn)\n", "beir_vector_retriever = BEIRVectorRetriever(conn, embedding_model)\n", "beir_hybrid_retriever = BEIRHybridRetriever(conn, embedding_model)\n", "\n", "print(\" All BEIR retrievers initialized successfully!\")" ] }, { "cell_type": "markdown", "id": "0d2e71a8", "metadata": {}, "source": [ "## Run evaluation for all three retrieval methods" ] }, { "cell_type": "code", "execution_count": null, "id": "ce839775", "metadata": {}, "outputs": [], "source": [ "from beir.retrieval.evaluation import EvaluateRetrieval\n", "\n", "# Initialize evaluator\n", "evaluator = EvaluateRetrieval()\n", "\n", "# Dictionary to store all results\n", "retrieval_results = {}\n", "\n", "# ------------------------------------------------------------------\n", "# 1. 
KEYWORD-BASED RETRIEVAL EVALUATION\n", "# ------------------------------------------------------------------\n", "print(\"\\n [1/3] Evaluating KEYWORD-BASED retrieval...\")\n", "keyword_results = beir_keyword_retriever.batch_search(queries, top_k=100)\n", "keyword_ndcg, keyword_map, keyword_recall, keyword_precision = evaluator.evaluate(\n", " qrels, keyword_results, [1, 3, 5, 10, 100]\n", ")\n", "\n", "retrieval_results['keyword'] = {\n", " 'ndcg': keyword_ndcg,\n", " 'map': keyword_map,\n", " 'recall': keyword_recall,\n", " 'precision': keyword_precision,\n", " 'raw_results': keyword_results\n", "}\n", "\n", "print(f\" Keyword retrieval complete - NDCG@10: {keyword_ndcg.get('NDCG@10', 0):.4f}\")\n" ] }, { "cell_type": "code", "execution_count": null, "id": "39f2a78f", "metadata": {}, "outputs": [], "source": [ "# ------------------------------------------------------------------\n", "# 2. VECTOR-BASED RETRIEVAL EVALUATION\n", "# ------------------------------------------------------------------\n", "print(\"\\n [2/3] Evaluating VECTOR-BASED retrieval...\")\n", "vector_results = beir_vector_retriever.batch_search(queries, top_k=100)\n", "vector_ndcg, vector_map, vector_recall, vector_precision = evaluator.evaluate(\n", " qrels, vector_results, [1, 3, 5, 10, 100]\n", ")\n", "\n", "retrieval_results['vector'] = {\n", " 'ndcg': vector_ndcg,\n", " 'map': vector_map,\n", " 'recall': vector_recall,\n", " 'precision': vector_precision,\n", " 'raw_results': vector_results\n", "}\n", "\n", "print(f\" Vector retrieval complete - NDCG@10: {vector_ndcg.get('NDCG@10', 0):.4f}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "ed1dbf3d", "metadata": {}, "outputs": [], "source": [ "# ------------------------------------------------------------------\n", "# 3. 
HYBRID RETRIEVAL EVALUATION\n", "# ------------------------------------------------------------------\n", "print(\"\\n [3/3] Evaluating HYBRID retrieval...\")\n", "hybrid_results = beir_hybrid_retriever.batch_search(queries, top_k=100)\n", "hybrid_ndcg, hybrid_map, hybrid_recall, hybrid_precision = evaluator.evaluate(\n", " qrels, hybrid_results, [1, 3, 5, 10, 100]\n", ")\n", "\n", "retrieval_results['hybrid'] = {\n", " 'ndcg': hybrid_ndcg,\n", " 'map': hybrid_map,\n", " 'recall': hybrid_recall,\n", " 'precision': hybrid_precision,\n", " 'raw_results': hybrid_results\n", "}\n", "\n", "print(f\" Hybrid retrieval complete - NDCG@10: {hybrid_ndcg.get('NDCG@10', 0):.4f}\")\n", "\n", "print(\"\\n\" + \"=\"*80)\n", "print(\" ALL EVALUATIONS COMPLETE!\")\n", "print(\"=\"*80)" ] }, { "cell_type": "markdown", "id": "6ce1ca61", "metadata": {}, "source": [ "## Create comparison tables for all metrics" ] }, { "cell_type": "code", "execution_count": null, "id": "152a84a1", "metadata": {}, "outputs": [], "source": [ "def create_comparison_table(metric_name, metric_dict_key):\n", " \"\"\"Helper function to create comparison tables\"\"\"\n", " data = []\n", " k_values = [1, 3, 5, 10, 100]\n", " \n", " for k in k_values:\n", " row = {'k': k}\n", " for method in ['keyword', 'vector', 'hybrid']:\n", " metric_key = f\"{metric_name}@{k}\"\n", " row[method] = retrieval_results[method][metric_dict_key].get(metric_key, 0)\n", " data.append(row)\n", " \n", " return pd.DataFrame(data)\n", "\n", "# Create comparison tables\n", "print(\"\\n\" + \"=\"*80)\n", "print(\" RETRIEVAL METHODS COMPARISON - DETAILED METRICS\")\n", "print(\"=\"*80)\n", "\n", "# NDCG Comparison\n", "print(\"\\n NDCG (Normalized Discounted Cumulative Gain) Comparison:\")\n", "print(\"-\" * 80)\n", "ndcg_comparison = create_comparison_table(\"NDCG\", \"ndcg\")\n", "print(ndcg_comparison.to_string(index=False))\n", "\n", "# MAP Comparison\n", "print(\"\\n MAP (Mean Average Precision) Comparison:\")\n", "print(\"-\" * 
80)\n", "map_comparison = create_comparison_table(\"MAP\", \"map\")\n", "print(map_comparison.to_string(index=False))\n", "\n", "# Recall Comparison\n", "print(\"\\n Recall Comparison:\")\n", "print(\"-\" * 80)\n", "recall_comparison = create_comparison_table(\"Recall\", \"recall\")\n", "print(recall_comparison.to_string(index=False))\n", "\n", "# Precision Comparison\n", "print(\"\\n Precision Comparison:\")\n", "print(\"-\" * 80)\n", "precision_comparison = create_comparison_table(\"Precision\", \"precision\")\n", "print(precision_comparison.to_string(index=False))\n", "\n", "print(\"\\n\" + \"=\"*80)" ] }, { "cell_type": "markdown", "id": "81a54220", "metadata": {}, "source": [ "## Create comprehensive comparison charts" ] }, { "cell_type": "code", "execution_count": null, "id": "065a6a88", "metadata": {}, "outputs": [], "source": [ "# Cell 10: Create comprehensive comparison charts\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "# Set up the plotting style\n", "plt.style.use('seaborn-v0_8-darkgrid')\n", "colors = {'keyword': '#FF6B6B', 'vector': '#4ECDC4', 'hybrid': '#95E1D3'}\n", "\n", "# Create a 2x2 subplot figure\n", "fig, axes = plt.subplots(2, 2, figsize=(16, 12))\n", "fig.suptitle('Retrieval Methods Comparison - BEIR Evaluation', fontsize=18, fontweight='bold', y=0.995)\n", "\n", "k_values = [1, 3, 5, 10, 100]\n", "x_pos = np.arange(len(k_values))\n", "bar_width = 0.25\n", "\n", "# ------------------------------------------------------------------\n", "# Plot 1: NDCG Comparison\n", "# ------------------------------------------------------------------\n", "ax1 = axes[0, 0]\n", "for i, method in enumerate(['keyword', 'vector', 'hybrid']):\n", " ndcg_values = [retrieval_results[method]['ndcg'].get(f\"NDCG@{k}\", 0) for k in k_values]\n", " ax1.bar(x_pos + i*bar_width, ndcg_values, bar_width, \n", " label=method.upper(), color=colors[method], alpha=0.8)\n", "\n", "ax1.set_xlabel('k', fontsize=12, fontweight='bold')\n", 
"ax1.set_ylabel('NDCG Score', fontsize=12, fontweight='bold')\n", "ax1.set_title('NDCG@k - Normalized Discounted Cumulative Gain', fontsize=14, fontweight='bold')\n", "ax1.set_xticks(x_pos + bar_width)\n", "ax1.set_xticklabels(k_values)\n", "ax1.legend(loc='lower right')\n", "ax1.grid(axis='y', alpha=0.3)\n", "ax1.set_ylim([0, 1])\n", "\n", "# ------------------------------------------------------------------\n", "# Plot 2: MAP Comparison\n", "# ------------------------------------------------------------------\n", "ax2 = axes[0, 1]\n", "for i, method in enumerate(['keyword', 'vector', 'hybrid']):\n", " map_values = [retrieval_results[method]['map'].get(f\"MAP@{k}\", 0) for k in k_values]\n", " ax2.bar(x_pos + i*bar_width, map_values, bar_width, \n", " label=method.upper(), color=colors[method], alpha=0.8)\n", "\n", "ax2.set_xlabel('k', fontsize=12, fontweight='bold')\n", "ax2.set_ylabel('MAP Score', fontsize=12, fontweight='bold')\n", "ax2.set_title('MAP@k - Mean Average Precision', fontsize=14, fontweight='bold')\n", "ax2.set_xticks(x_pos + bar_width)\n", "ax2.set_xticklabels(k_values)\n", "ax2.legend(loc='lower right')\n", "ax2.grid(axis='y', alpha=0.3)\n", "ax2.set_ylim([0, 1])\n", "\n", "# ------------------------------------------------------------------\n", "# Plot 3: Recall Comparison\n", "# ------------------------------------------------------------------\n", "ax3 = axes[1, 0]\n", "for i, method in enumerate(['keyword', 'vector', 'hybrid']):\n", " recall_values = [retrieval_results[method]['recall'].get(f\"Recall@{k}\", 0) for k in k_values]\n", " ax3.bar(x_pos + i*bar_width, recall_values, bar_width, \n", " label=method.upper(), color=colors[method], alpha=0.8)\n", "\n", "ax3.set_xlabel('k', fontsize=12, fontweight='bold')\n", "ax3.set_ylabel('Recall Score', fontsize=12, fontweight='bold')\n", "ax3.set_title('Recall@k - Proportion of Relevant Docs Retrieved', fontsize=14, fontweight='bold')\n", "ax3.set_xticks(x_pos + bar_width)\n", 
"ax3.set_xticklabels(k_values)\n", "ax3.legend(loc='lower right')\n", "ax3.grid(axis='y', alpha=0.3)\n", "ax3.set_ylim([0, 1])\n", "\n", "# ------------------------------------------------------------------\n", "# Plot 4: Precision Comparison\n", "# ------------------------------------------------------------------\n", "ax4 = axes[1, 1]\n", "for i, method in enumerate(['keyword', 'vector', 'hybrid']):\n", " precision_values = [retrieval_results[method]['precision'].get(f\"P@{k}\", 0) for k in k_values]\n", " ax4.bar(x_pos + i*bar_width, precision_values, bar_width, \n", " label=method.upper(), color=colors[method], alpha=0.8)\n", "\n", "ax4.set_xlabel('k', fontsize=12, fontweight='bold')\n", "ax4.set_ylabel('Precision Score', fontsize=12, fontweight='bold')\n", "ax4.set_title('Precision@k - Proportion of Retrieved Docs Relevant', fontsize=14, fontweight='bold')\n", "ax4.set_xticks(x_pos + bar_width)\n", "ax4.set_xticklabels(k_values)\n", "ax4.legend(loc='upper right')\n", "ax4.grid(axis='y', alpha=0.3)\n", "ax4.set_ylim([0, 1])\n", "\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "id": "3ef3ea56", "metadata": {}, "outputs": [], "source": [ "print(\"\\n\" + \"=\"*80)\n", "print(\" RETRIEVAL METHODS SUMMARY & WINNER ANALYSIS\")\n", "print(\"=\"*80)\n", "\n", "# Calculate average scores across all k values\n", "summary_data = []\n", "\n", "for method in ['keyword', 'vector', 'hybrid']:\n", " avg_ndcg = np.mean([retrieval_results[method]['ndcg'].get(f\"NDCG@{k}\", 0) for k in k_values])\n", " avg_map = np.mean([retrieval_results[method]['map'].get(f\"MAP@{k}\", 0) for k in k_values])\n", " avg_recall = np.mean([retrieval_results[method]['recall'].get(f\"Recall@{k}\", 0) for k in k_values])\n", " avg_precision = np.mean([retrieval_results[method]['precision'].get(f\"P@{k}\", 0) for k in k_values])\n", " \n", " summary_data.append({\n", " 'Method': method.upper(),\n", " 'Avg NDCG': avg_ndcg,\n", " 'Avg MAP': avg_map,\n", " 
'Avg Recall': avg_recall,\n", " 'Avg Precision': avg_precision,\n", " 'Overall Score': (avg_ndcg + avg_map + avg_recall + avg_precision) / 4\n", " })\n", "\n", "summary_df = pd.DataFrame(summary_data)\n", "summary_df = summary_df.round(4)\n", "\n", "print(\"\\n Average Performance Across All k Values:\")\n", "print(\"-\" * 80)\n", "print(summary_df.to_string(index=False))\n", "\n", "# Determine winners for each metric\n", "print(\"\\n Winners by Metric:\")\n", "print(\"-\" * 80)\n", "for metric in ['Avg NDCG', 'Avg MAP', 'Avg Recall', 'Avg Precision', 'Overall Score']:\n", " winner_idx = summary_df[metric].idxmax()\n", " winner = summary_df.loc[winner_idx, 'Method']\n", " score = summary_df.loc[winner_idx, metric]\n", " print(f\"{metric:20s}: {winner:10s} (Score: {score:.4f})\")\n", "\n", "print(\"\\n\" + \"=\"*80)\n", "\n", "# Create a radar chart for overall comparison\n", "fig, ax = plt.subplots(figsize=(10, 10), subplot_kw=dict(projection='polar'))\n", "\n", "categories = ['NDCG@10', 'MAP@10', 'Recall@100', 'Precision@10']\n", "angles = np.linspace(0, 2 * np.pi, len(categories), endpoint=False).tolist()\n", "angles += angles[:1] # Complete the circle\n", "\n", "for method in ['keyword', 'vector', 'hybrid']:\n", " values = [\n", " retrieval_results[method]['ndcg'].get('NDCG@10', 0),\n", " retrieval_results[method]['map'].get('MAP@10', 0),\n", " retrieval_results[method]['recall'].get('Recall@100', 0),\n", " retrieval_results[method]['precision'].get('P@10', 0)\n", " ]\n", " values += values[:1] # Complete the circle\n", " \n", " ax.plot(angles, values, 'o-', linewidth=2.5, label=method.upper(), color=colors[method])\n", " ax.fill(angles, values, alpha=0.15, color=colors[method])\n", "\n", "ax.set_xticks(angles[:-1])\n", "ax.set_xticklabels(categories, size=12)\n", "ax.set_ylim(0, 1)\n", "ax.set_title('Retrieval Methods - Radar Comparison', size=16, fontweight='bold', pad=20)\n", "ax.legend(loc='upper right', bbox_to_anchor=(1.3, 1.1))\n", "ax.grid(True)\n", 
"\n", "plt.tight_layout()\n", "plt.show()\n", "\n", "print(\"\\n Phase 1 Evaluation Complete - All metrics stored in 'retrieval_results' variable\")" ] }, { "cell_type": "markdown", "id": "7f13ab7e", "metadata": {}, "source": [ "# Part 2: Evaluating RAG Pipelines" ] }, { "cell_type": "code", "execution_count": null, "id": "4d9734b1", "metadata": {}, "outputs": [], "source": [ "! pip install -qU \"galileo[openai]\"" ] }, { "cell_type": "code", "execution_count": null, "id": "9188eb05", "metadata": {}, "outputs": [], "source": [ "import getpass\n", "import os\n", "\n", "# Function to securely get and set environment variables\n", "def set_env_securely(var_name, prompt):\n", " value = getpass.getpass(prompt)\n", " os.environ[var_name] = value" ] }, { "cell_type": "code", "execution_count": null, "id": "e8470521", "metadata": {}, "outputs": [], "source": [ "set_env_securely(\"GALILEO_API_KEY\", \"Enter your Galileo API key: \")" ] }, { "cell_type": "code", "execution_count": null, "id": "6b6d70d9", "metadata": {}, "outputs": [], "source": [ "set_env_securely(\"OPENAI_API_KEY\", \"Enter your OpenAI API key: \")" ] }, { "cell_type": "code", "execution_count": null, "id": "53ade417", "metadata": {}, "outputs": [], "source": [ "import os\n", "from galileo.openai import openai\n", "\n", "# Initialize the Galileo wrapped OpenAI client\n", "openai_client = openai.OpenAI(api_key=os.environ.get(\"OPENAI_API_KEY\"))" ] }, { "cell_type": "code", "execution_count": null, "id": "1aed083f", "metadata": {}, "outputs": [], "source": [ "from galileo import log\n", "\n", "def hybrid_search_beir_corpus(conn, embedding_model, search_phrase: str, top_k: int = 10, show_explain: bool = False):\n", " \"\"\"\n", " Hybrid search on the beir_corpus table\n", " Combines keyword filtering with vector similarity search.\n", " \n", " Returns:\n", " tuple: (rows, columns, exec_plan_text)\n", " \n", " NOTE: This function is decorated with @log to capture retrieval metrics in Galileo\n", " \"\"\"\n", " 
# Generate query embedding\n", " query_embedding = embedding_model.encode(\n", " [f\"search_query: {search_phrase}\"],\n", " convert_to_numpy=True,\n", " normalize_embeddings=True,\n", " show_progress_bar=False\n", " )[0].astype(np.float32).tolist()\n", " \n", " query_embedding_array = array.array('f', query_embedding)\n", " \n", " # Extract keywords for filtering (4+ letter words).\n", " # The regex admits only ASCII letters, so interpolating these words into\n", " # the LIKE conditions below cannot inject SQL.\n", " words = re.findall(r'\\b[a-zA-Z]{4,}\\b', search_phrase.lower())\n", " \n", " # Build keyword filter conditions\n", " if words:\n", " search_words = words[:3] # Use the first 3 keywords\n", " or_conditions = []\n", " \n", " for word in search_words:\n", " or_conditions.append(f\"(LOWER(title) LIKE '%{word}%' OR LOWER(text) LIKE '%{word}%')\")\n", " \n", " where_clause = \" OR \".join(or_conditions)\n", " else:\n", " # No keywords, search everything\n", " where_clause = \"1=1\"\n", " \n", " # Hybrid search SQL\n", " sql = f\"\"\"\n", " SELECT {\"/*+ GATHER_PLAN_STATISTICS */\" if show_explain else \"\"}\n", " doc_id,\n", " title,\n", " SUBSTR(text, 1, 500) AS text_snippet,\n", " ROUND(1.0 - VECTOR_DISTANCE(embedding, :q, COSINE), 4) AS similarity_score\n", " FROM beir_corpus\n", " WHERE {where_clause}\n", " ORDER BY VECTOR_DISTANCE(embedding, :q, COSINE)\n", " FETCH APPROX FIRST {top_k} ROWS ONLY WITH TARGET ACCURACY 90\n", " \"\"\"\n", " \n", " with conn.cursor() as cur:\n", " cur.execute(sql, q=query_embedding_array)\n", " rows = cur.fetchall()\n", " columns = [desc[0] for desc in cur.description]\n", " \n", " # Get execution plan if requested\n", " exec_plan_text = None\n", " if show_explain:\n", " cur.execute(\"SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT => 'TYPICAL'))\")\n", " exec_plan_text = \"\\n\".join([row[0] for row in cur.fetchall() if row[0]])\n", " \n", " # Format output for Galileo retriever span (return list of documents)\n", " @log(span_type=\"retriever\", name=\"Hybrid Search - BEIR Corpus\")\n", " def get_retrieved_docs(rows):\n", " 
retrieved_docs = []\n", " for row in rows:\n", " row_data = dict(zip(columns, row))\n", " retrieved_docs.append({\n", " \"doc_id\": row_data.get(\"DOC_ID\"),\n", " \"title\": row_data.get(\"TITLE\"),\n", " \"snippet\": row_data.get(\"TEXT_SNIPPET\"),\n", " \"score\": float(row_data.get(\"SIMILARITY_SCORE\", 0))\n", " })\n", " return retrieved_docs\n", "\n", "\n", " retrieved_docs = get_retrieved_docs(rows)\n", " \n", " return rows, columns, exec_plan_text, retrieved_docs\n", "\n", "\n", "print(\" Hybrid search function for beir_corpus table created!\")" ] }, { "cell_type": "code", "execution_count": null, "id": "ad82fd15", "metadata": {}, "outputs": [], "source": [ "@log(span_type=\"workflow\", name=\"Research Paper RAG Pipeline\")\n", "def research_paper_assistant_rag_pipeline(\n", " conn,\n", " embedding_model,\n", " user_query: str,\n", " top_k: int = 10,\n", " retrieval_mode: str = \"hybrid\",\n", " show_explain: bool = False\n", "):\n", " \"\"\"\n", " Research Paper Assistant — Retrieval-Augmented Generation (RAG) pipeline\n", " built on SQL-based retrieval functions and powered by the OpenAI Chat Completions API.\n", "\n", " Retrieval techniques available:\n", " - 'keyword' → uses keyword_search_beir_corpus()\n", " - 'vector' → uses vector_search_beir_corpus()\n", " - 'hybrid' → uses hybrid_search_beir_corpus() [default]\n", "\n", " Args:\n", " conn: Oracle database connection.\n", " embedding_model: Embedding model (e.g., SentenceTransformer, Voyage).\n", " user_query (str): Research question from the user.\n", " top_k (int): Number of top documents to retrieve.\n", " retrieval_mode (str): Retrieval method ('keyword', 'vector', 'hybrid').\n", " show_explain (bool): Whether to show the SQL execution plan.\n", "\n", " Returns:\n", " str: LLM-generated research synthesis with citations.\n", " \n", " NOTE: This function is decorated with @log to create a workflow span containing\n", " the retrieval span (from hybrid_search_beir_corpus) and LLM span (from 
OpenAI).\n", " \"\"\"\n", "\n", " # ----------------------------------------------------------------------\n", " # 1. Retrieve relevant research papers using the selected retrieval mode\n", " # ----------------------------------------------------------------------\n", " if retrieval_mode == \"keyword\":\n", " # Keyword-only retrieval helper from the Phase 1 evaluation\n", " rows, columns, exec_plan_text, retrieved_docs = keyword_search_beir_corpus(conn, user_query)\n", "\n", " elif retrieval_mode == \"vector\":\n", " # Vector-only retrieval helper from the Phase 1 evaluation\n", " rows, columns, exec_plan_text, retrieved_docs = vector_search_beir_corpus(conn, user_query)\n", "\n", " else: # default: hybrid retrieval\n", " rows, columns, exec_plan_text, retrieved_docs = hybrid_search_beir_corpus(\n", " conn=conn,\n", " embedding_model=embedding_model,\n", " search_phrase=user_query,\n", " top_k=top_k,\n", " show_explain=show_explain\n", " )\n", "\n", " retrieved_count = len(rows) if rows else 0\n", " print(f\" Retrieved {retrieved_count} papers using {retrieval_mode.upper()} retrieval.\")\n", "\n", "\n", " # ----------------------------------------------------------------------\n", " # 2. 
Convert retrieved rows to formatted LLM context\n", " # ----------------------------------------------------------------------\n", " formatted_context = \"\"\n", " if retrieved_count > 0:\n", " formatted_context += f\"\\n\\n📚 {retrieved_count} relevant research papers retrieved:\\n\\n\"\n", " for i, row in enumerate(rows):\n", " row_data = dict(zip(columns, row))\n", " title = row_data.get(\"TITLE\", \"Untitled Paper\")\n", " # The hybrid search query selects no ABSTRACT column, so add an\n", " # abstract line only when the retrieval helper supplies one.\n", " abstract = row_data.get(\"ABSTRACT\")\n", " snippet = row_data.get(\"TEXT_SNIPPET\", \"\")\n", " score = (\n", " row_data.get(\"SIMILARITY_SCORE\")\n", " or row_data.get(\"TEXT_RELEVANCE_SCORE\")\n", " or \"N/A\"\n", " )\n", " formatted_context += f\"[{i+1}] **{title}**\\n\"\n", " if abstract:\n", " formatted_context += f\"Abstract: {abstract}\\n\"\n", " formatted_context += (\n", " f\"Snippet: {snippet}\\n\"\n", " f\"Relevance Score: {score}\\n\\n\"\n", " )\n", " else:\n", " formatted_context = \"\\n\\n⚠️ No relevant papers were retrieved from the database.\\n\"\n", "\n", " # ----------------------------------------------------------------------\n", " # 3. Construct the prompt for the Chat Completions API\n", " # ----------------------------------------------------------------------\n", " prompt = f\"\"\"\n", " You are a **Research Paper Assistant** that synthesizes academic literature to help answer user questions.\n", "\n", " User Query: {user_query}\n", "\n", " Number of retrieved papers: {retrieved_count}\n", " {formatted_context}\n", "\n", " Please:\n", " - Summarize the findings most relevant to the query.\n", " - Use citation numbers [X] to support claims.\n", " - Highlight consensus, innovation, or research gaps.\n", " - If there is insufficient context, clearly say so.\n", " \"\"\"\n", "\n", " # ----------------------------------------------------------------------\n", " # 4. 
Call the OpenAI Chat Completions API\n", " # ----------------------------------------------------------------------\n", " response = openai_client.chat.completions.create(\n", " model=\"gpt-4o\",\n", " messages=[\n", " {\"role\": \"system\", \"content\": \"You are a scientific research assistant. Use only the provided context to answer. Always cite papers [1], [2], etc.\"},\n", " {\"role\": \"user\", \"content\": prompt}\n", " ]\n", " )\n", "\n", " # ----------------------------------------------------------------------\n", " # 5. Optionally print SQL execution plan (if hybrid)\n", " # ----------------------------------------------------------------------\n", " if show_explain and exec_plan_text:\n", " print(\"\\n====== SQL Execution Plan ======\")\n", " print(exec_plan_text)\n", " print(\"================================\\n\")\n", "\n", " # ----------------------------------------------------------------------\n", " # 6. Return the LLM’s output text\n", " # ----------------------------------------------------------------------\n", " return response.choices[0].message.content\n" ] }, { "cell_type": "code", "execution_count": null, "id": "733a8ee7", "metadata": {}, "outputs": [], "source": [ "from galileo import galileo_context\n", "\n", "galileo_context.init(\n", " project=\"ai_system_evaluation_project\",\n", " log_stream=\"ai_system_evaluation_showcase\"\n", ")\n", "\n", "summary = research_paper_assistant_rag_pipeline(\n", " conn=conn,\n", " embedding_model=embedding_model,\n", " user_query=\"Can you get me some information on the research in the field of AI?\",\n", " top_k=5,\n", " retrieval_mode=\"hybrid\",\n", " show_explain=False\n", ")\n", "\n", "# Flush the logger to ensure all traces are uploaded to Galileo\n", "# Note: The @log decorator automatically flushes when the decorated function exits,\n", "# but in notebooks it's good practice to explicitly flush to ensure data is sent\n", "galileo_context.flush()\n", "print(\" RAG pipeline traces flushed to 
Galileo\")" ] }, { "cell_type": "code", "execution_count": null, "id": "e0e674b7", "metadata": {}, "outputs": [], "source": [ "print(summary)" ] } ], "metadata": { "kernelspec": { "display_name": "playground", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.14" } }, "nbformat": 4, "nbformat_minor": 5 }