{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/aurelio-labs/semantic-router/blob/main/docs/examples/hybrid-chat-guardrails.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/aurelio-labs/semantic-router/blob/main/docs/examples/hybrid-chat-guardrails.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sparse Encoder" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Install Prerequisites" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "!pip install -qU \"semantic-router>=0.1.6\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating Hybrid Router for Sparse Encoder Detection" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To begin, we import the `Route` class from the `semantic_router` package.\n", "\n", "Then we define the routes for our semantic router. For this example we will use routes for BYD, Tesla, Polestar, and Rivian, giving each route a name and a list of utterances that represent it.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/jamesbriggs/Documents/aurelio/semantic-router/.venv/lib/python3.13/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n" ] } ], "source": [ "from semantic_router import Route\n", "\n", "# Route for BYD-related queries (allowed)\n", "byd = Route(\n", " name=\"byd\",\n", " utterances=[\n", " \"Tell me about the BYD Seal.\",\n", " \"What is the battery capacity of the BYD Dolphin?\",\n", " \"How does BYD's Blade Battery work?\",\n", " \"Is the BYD Atto 3 a good EV?\",\n", " \"Can I sell my BYD?\",\n", " \"How much is my BYD worth?\",\n", " \"What is the resale value of my BYD?\",\n", " \"How much can I get for my BYD?\",\n", " \"How much can I sell my BYD for?\",\n", " ],\n", ")\n", "\n", "# Route for Tesla-related queries (blocked or redirected)\n", "tesla = Route(\n", " name=\"tesla\",\n", " utterances=[\n", " \"Is Tesla better than BYD?\",\n", " \"Tell me about the Tesla Model 3.\",\n", " \"How does Tesla's autopilot compare to other EVs?\",\n", " \"What's new in the Tesla Cybertruck?\",\n", " \"Can I sell my Tesla?\",\n", " \"How much is my Tesla worth?\",\n", " \"What is the resale value of my Tesla?\",\n", " \"How much can I get for my Tesla?\",\n", " \"How much can I sell my Tesla for?\",\n", " ],\n", ")\n", "\n", "# Route for Polestar-related queries (blocked or redirected)\n", "polestar = Route(\n", " name=\"polestar\",\n", " utterances=[\n", " \"What's the range of the Polestar 2?\",\n", " \"Is Polestar a good alternative to other EVs?\",\n", " \"How does Polestar compare to other EVs?\",\n", " \"Can I sell my Polestar?\",\n", " \"How much is my Polestar worth?\",\n", " \"What is the resale value of my Polestar?\",\n", " \"How much can I get for my Polestar?\",\n", " \"How much can I sell my Polestar for?\",\n", " ],\n", ")\n", "\n", "# Route for Rivian-related queries (blocked or redirected)\n", "rivian = Route(\n", " name=\"rivian\",\n", " utterances=[\n", " \"Tell me about the Rivian R1T.\",\n", " \"How does Rivian's off-road capability compare to other 
EVs?\",\n", " \"Is Rivian's charging network better than other EVs?\",\n", " \"Can I sell my Rivian?\",\n", " \"How much is my Rivian worth?\",\n", " \"What is the resale value of my Rivian?\",\n", " \"How much can I get for my Rivian?\",\n", " \"How much can I sell my Rivian for?\",\n", " ],\n", ")\n", "\n", "# Combine all routes\n", "routes = [byd, tesla, polestar, rivian]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Relying solely on dense embedding models to tell these queries apart is _very_ difficult: in semantic space, queries like `\"can I sell my Tesla?\"` and `\"can I sell my Polestar?\"` are extremely similar. We can test this with OpenAI's dense embedding model.\n", "\n", "We will need an [OpenAI API key](https://platform.openai.com/api-keys) for this." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/jamesbriggs/Documents/aurelio/semantic-router/.venv/lib/python3.13/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n" ] } ], "source": [ "import os\n", "from getpass import getpass\n", "from semantic_router.encoders import OpenAIEncoder\n", "\n", "os.environ[\"OPENAI_API_KEY\"] = os.getenv(\"OPENAI_API_KEY\") or getpass(\n", " \"Enter your OpenAI API key: \"\n", ")\n", "# dense encoder for semantic meaning\n", "encoder = OpenAIEncoder(name=\"text-embedding-3-small\", score_threshold=0.3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next let's compare the similarity between some vectors:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2025-03-25 11:45:11 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] }, { "data": { "text/plain": [ "array([[1. , 0.65354249, 0.67416076, 0.69256556],\n", " [0.65354249, 1. , 0.57430814, 0.59140332],\n", " [0.67416076, 0.57430814, 1. , 0.60840109],\n", " [0.69256556, 0.59140332, 0.60840109, 1. 
]])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "from numpy.linalg import norm\n", "\n", "vectors = encoder(\n", " docs=[\n", " \"can I sell my Tesla?\",\n", " \"can I sell my Polestar?\",\n", " \"can I sell my BYD?\",\n", " \"can I sell my Rivian?\",\n", " ]\n", ")\n", "\n", "# normalize our vectors\n", "vector_norms = norm(vectors, axis=1, keepdims=True)\n", "normalized_vectors = vectors / vector_norms\n", "\n", "# calculate the dot product similarity between the vectors\n", "dot_products = np.dot(normalized_vectors, normalized_vectors.T)\n", "dot_products" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's compare this to similarities between utterances of a single route:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2025-03-25 11:50:27 - httpx - INFO - _client.py:1013 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] }, { "data": { "text/plain": [ "array([[1. , 0.52624727, 0.48299403, 0.57280113, 0.55299787],\n", " [0.52624727, 1. , 0.5188066 , 0.56618672, 0.55230486],\n", " [0.48299403, 0.5188066 , 1. , 0.60667738, 0.58912712],\n", " [0.57280113, 0.56618672, 0.60667738, 1. , 0.8838391 ],\n", " [0.55299787, 0.55230486, 0.58912712, 0.8838391 , 1. 
]])" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vectors = encoder(\n", " docs=[\n", " \"Tell me about the BYD Seal.\",\n", " \"How does BYD's Blade Battery work?\",\n", " \"Is the BYD Atto 3 a good EV?\",\n", " \"Can I sell my BYD?\",\n", " \"How much can I sell my BYD for?\",\n", " ]\n", ")\n", "\n", "# normalize our vectors\n", "vector_norms = norm(vectors, axis=1, keepdims=True)\n", "normalized_vectors = vectors / vector_norms\n", "\n", "# calculate the dot product similarity between the vectors\n", "dot_products = np.dot(normalized_vectors, normalized_vectors.T)\n", "dot_products" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In some cases, utterances from different routes are more similar to each other than utterances within the same route. For example, the cross-route \"can I sell my ...\" queries score between roughly 0.57 and 0.69, while \"Tell me about the BYD Seal.\" and \"Is the BYD Atto 3 a good EV?\" score only about 0.48. That is because dense encoders excel at capturing the general semantic meaning of a phrase, but in many cases (like this one) we also need to give weight to matching specific terms, such as \"BYD\" or \"Tesla\".\n", "\n", "Traditional sparse encoders perform very well at _term matching_, and by merging dense and sparse methods into a _hybrid_ approach we can get the best of both worlds, scoring on both semantic meaning and term matching. Semantic Router supports this via the `HybridRouter`. To use the hybrid approach we first need to initialize a sparse encoder. We would typically need to \"fit\" (i.e., train) a sparse encoder on our dataset, but we can use the pretrained `AurelioSparseEncoder` instead. For that we need an [Aurelio API key](https://platform.aurelio.ai/settings/api-keys)."
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from semantic_router.encoders.aurelio import AurelioSparseEncoder\n", "\n", "os.environ[\"AURELIO_API_KEY\"] = os.getenv(\"AURELIO_API_KEY\") or getpass(\n", " \"Enter your Aurelio API key: \"\n", ")\n", "# sparse encoder for term matching\n", "sparse_encoder = AurelioSparseEncoder(name=\"bm25\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have all the components needed to initialize our `HybridRouter`. We provide the `HybridRouter` with a dense `encoder`, `sparse_encoder`, our predefined `routes`, and we also set `auto_sync` to `\"local\"`:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2025-03-21 15:12:39 - semantic_router.utils.logger - WARNING - hybrid.py:54 - __init__() - No index provided. Using default HybridLocalIndex.\n", "2025-03-21 15:12:40 - httpx - INFO - _client.py:1025 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2025-03-21 15:12:42 - httpx - INFO - _client.py:1025 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n", "2025-03-21 15:12:43 - semantic_router.utils.logger - WARNING - hybrid_local.py:47 - add() - Function schemas are not supported for HybridLocalIndex.\n", "2025-03-21 15:12:43 - semantic_router.utils.logger - WARNING - hybrid_local.py:49 - add() - Metadata is not supported for HybridLocalIndex.\n", "2025-03-21 15:12:43 - semantic_router.utils.logger - WARNING - hybrid_local.py:210 - _write_config() - No config is written for HybridLocalIndex.\n" ] } ], "source": [ "from semantic_router.routers import HybridRouter\n", "\n", "router = HybridRouter(\n", " encoder=encoder, sparse_encoder=sparse_encoder, routes=routes, auto_sync=\"local\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To check the 
current route thresholds we can use the `get_thresholds` method, which returns a dictionary mapping each route name to its threshold as a float." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Default route thresholds: {'byd': 0.09, 'tesla': 0.09, 'polestar': 0.09, 'rivian': 0.09}\n" ] } ], "source": [ "route_thresholds = router.get_thresholds()\n", "print(\"Default route thresholds:\", route_thresholds)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can already test our router by passing in a few utterances and seeing which route each one is assigned to." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2025-03-21 15:12:43 - httpx - INFO - _client.py:1025 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Tell me about BYD's Blade Battery. -> byd\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2025-03-21 15:12:44 - httpx - INFO - _client.py:1025 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Does the Tesla Model 3 have better range? -> tesla\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2025-03-21 15:12:45 - httpx - INFO - _client.py:1025 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "What are the key features of the Polestar 2? 
-> polestar\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2025-03-21 15:12:49 - httpx - INFO - _client.py:1025 - _send_single_request() - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Is Rivian's R1T better for off-roading? -> rivian\n" ] } ], "source": [ "for utterance in [\n", " \"Tell me about BYD's Blade Battery.\",\n", " \"Does the Tesla Model 3 have better range?\",\n", " \"What are the key features of the Polestar 2?\",\n", " \"Is Rivian's R1T better for off-roading?\",\n", "]:\n", " print(f\"{utterance} -> {router(utterance).name}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `HybridRouter` is already performing reasonably well. We can use the `evaluate` method to measure the router's accuracy across a larger set of test data." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Generating embeddings: 0%| | 0/1 [00:00