{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/aurelio-labs/semantic-router/blob/main/docs/07-ollama-local-execution.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/aurelio-labs/semantic-router/blob/main/docs/07-ollama-local-execution.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Local Dynamic Routes - With Ollama" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fully local Semantic Router with Ollama and HuggingFace Encoder\n", "\n", "There are many reasons users might choose to roll their own LLMs rather than use a third-party service. Whether it's due to cost, privacy or compliance, Semantic Router supports the use of \"local\" LLMs through `llama.cpp`.\n", "\n", "Below is an example of using semantic router which leverages Ollama in order to utilize the **OpenHermes** LLM. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installing the Library and Dependencies" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# !pip install -qU \"semantic_router[local]==0.0.28\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from semantic_router.encoders import HuggingFaceEncoder\n", "\n", "encoder = HuggingFaceEncoder()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define Static Routes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from semantic_router import Route\n", "\n", "# we could use this as a guide for our chatbot to avoid political conversations\n", "politics = Route(\n", " name=\"politics\",\n", " utterances=[\n", " \"isn't politics the best thing ever\",\n", " \"why don't you tell me about your political opinions\",\n", " \"don't you just love the president\" \"don't you just hate the president\",\n", " \"they're going to destroy this country!\",\n", " \"they will save the country!\",\n", " ],\n", ")\n", "\n", "# this could be used as an indicator to our chatbot to switch to a more\n", "# conversational prompt\n", "chitchat = Route(\n", " name=\"chitchat\",\n", " utterances=[\n", " \"how's the weather today?\",\n", " \"how are things going?\",\n", " \"lovely weather today\",\n", " \"the weather is horrendous\",\n", " \"let's go to the chippy\",\n", " ],\n", ")\n", "\n", "# we place both of our decisions together into single list\n", "routes = [politics, chitchat]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define Route Layer with Ollama" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from semantic_router.layer import RouteLayer\n", "from semantic_router.llms.ollama import OllamaLLM\n", "\n", "\n", "llm = OllamaLLM(\n", " llm_name=\"openhermes\"\n", ") # Change llm_name if you want to use a different LLM with dynamic routes.\n", "rl = RouteLayer(encoder=encoder, routes=routes, llm=llm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test Static Routes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rl(\"don't you love politics?\").name" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rl(\"how's the weather today?\").name" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rl(\"I'm interested in learning about llama 2\").name" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test Dynamic Routes\n", "\n", "Dynamic routes work by associating a function with a route. If the input utterance is similar enough to the utterances of the route, such that route is chosen by the semantic router, then this triggers a secondary process: \n", "\n", "The LLM we specified in the `RouteLayer` (we specified Ollama, which isn't strictly an LLM, but which defaults to using the `OpenHermes` LLM), is then usde to take a `function_schema`, and the input utterance, and extract values from the input utterance which can be used as arguments for `function` described by the the `funcion_schema`. The returned values can then be used in the `function` to obtain an output.\n", "\n", "So, in short, it's a way of generating `function` inputs from an utterance, if that utterance matches the route utterances closely enough.\n", "\n", "In the below example the utterance **\"what is the time in new york city?\"** is used to trigger the \"get_time\" route, which has the `function_schema` of a likewise named `get_time()` function associated with it. Then Ollama is used to run `OpenHermes` locally, which extracts the correctly formatted IANA timezone (`\"America/New York\"`), based on this utterance and information we provide it about the `function` in the `function_schema`. The returned stirng \"America/New York\" can then be used directly in the `get_time()` function to return the actual time in New York city." ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from datetime import datetime\n", "from zoneinfo import ZoneInfo\n", "\n", "\n", "def get_time(timezone: str) -> str:\n", " \"\"\"\n", " Finds the current time in a specific timezone.\n", "\n", " :param timezone: The timezone to find the current time in, should\n", " be a valid timezone from the IANA Time Zone Database like\n", " \"America/New_York\" or \"Europe/London\". Do NOT put the place\n", " name itself like \"rome\", or \"new york\", you must provide\n", " the IANA format.\n", " :type timezone: str\n", " :return: The current time in the specified timezone.\n", " \"\"\"\n", " now = datetime.now(ZoneInfo(timezone))\n", " return now.strftime(\"%H:%M\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "get_time(\"America/New_York\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from semantic_router.utils.function_call import get_schema\n", "\n", "schema = get_schema(get_time)\n", "schema" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time_route = Route(\n", " name=\"get_time\",\n", " utterances=[\n", " \"what is the time in new york city?\",\n", " \"what is the time in london?\",\n", " \"I live in Rome, what time is it?\",\n", " ],\n", " function_schemas=[schema],\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rl.add(time_route)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "out = rl(\"what is the time in new york city?\")\n", "print(out)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "get_time(**out.function_call)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "semantic_splitter_1", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" } }, "nbformat": 4, "nbformat_minor": 2 }