{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook shows how to replace the `VolatileMemoryStore` memory storage used in a [previous notebook](./06-memory-and-embeddings.ipynb) with a `WeaviateMemoryStore`.\n",
    "\n",
    "`WeaviateMemoryStore` is an example of a persistent (i.e. long-term) memory store backed by the Weaviate vector database.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Configuring the Kernel\n",
    "\n",
    "Let's get started with the necessary configuration to run Semantic Kernel. For Notebooks, we require a `.env` file with the proper settings for the model you use. Create a new file named `.env` and place it in this directory. Copy the contents of the `.env.example` file from this directory and paste it into the `.env` file that you just created.\n",
    "\n",
    "**NOTE: Please make sure to include `GLOBAL_LLM_SERVICE` set to either OpenAI, AzureOpenAI, or HuggingFace in your .env file. If this setting is not included, the Service will default to AzureOpenAI.**\n",
    "\n",
    "#### Option 1: using OpenAI\n",
    "\n",
    "Add your [OpenAI Key](https://platform.openai.com/docs/overview) key to your `.env` file (org Id only if you have multiple orgs):\n",
    "\n",
    "```\n",
    "GLOBAL_LLM_SERVICE=\"OpenAI\"\n",
    "OPENAI_API_KEY=\"sk-...\"\n",
    "OPENAI_ORG_ID=\"\"\n",
    "OPENAI_CHAT_MODEL_ID=\"\"\n",
    "OPENAI_TEXT_MODEL_ID=\"\"\n",
    "OPENAI_EMBEDDING_MODEL_ID=\"\"\n",
    "```\n",
    "The names should match the names used in the `.env` file, as shown above.\n",
    "\n",
    "#### Option 2: using Azure OpenAI\n",
    "\n",
    "Add your [Azure Open AI Service key](https://learn.microsoft.com/azure/cognitive-services/openai/quickstart?pivots=programming-language-studio) settings to the `.env` file in the same folder:\n",
    "\n",
    "```\n",
    "GLOBAL_LLM_SERVICE=\"AzureOpenAI\"\n",
    "AZURE_OPENAI_API_KEY=\"...\"\n",
    "AZURE_OPENAI_ENDPOINT=\"https://...\"\n",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME=\"...\"\n",
    "AZURE_OPENAI_TEXT_DEPLOYMENT_NAME=\"...\"\n",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME=\"...\"\n",
    "AZURE_OPENAI_API_VERSION=\"...\"\n",
    "```\n",
    "The names should match the names used in the `.env` file, as shown above.\n",
    "\n",
    "For more advanced configuration, please follow the steps outlined in the [setup guide](./CONFIGURING_THE_KERNEL.md)."
   ]
  },
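  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before moving on, you can optionally confirm that the settings will be picked up. This is a minimal sketch assuming the `python-dotenv` package is installed in your environment:\n",
    "\n",
    "```python\n",
    "import os\n",
    "\n",
    "from dotenv import load_dotenv  # requires the python-dotenv package\n",
    "\n",
    "load_dotenv()  # reads the .env file from the current directory\n",
    "print(os.getenv(\"GLOBAL_LLM_SERVICE\"))  # should print the service you configured\n",
    "```\n"
   ]
  },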
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# About Weaviate\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[Weaviate](https://weaviate.io/) is an open-source vector database designed to scale seamlessly into billions of data objects. This implementation supports hybrid search out-of-the-box (meaning it will perform better for keyword searches).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can run Weaviate in 5 ways:\n",
    "\n",
    "- **SaaS** – with [Weaviate Cloud Services (WCS)](https://weaviate.io/pricing).\n",
    "\n",
    "  WCS is a fully managed service that takes care of hosting, scaling, and updating your Weaviate instance. You can try it out for free with a sandbox that lasts for 14 days.\n",
    "\n",
    "  To set up a SaaS Weaviate instance with WCS:\n",
    "\n",
    "  1.  Navigate to [Weaviate Cloud Console](https://console.weaviate.cloud/).\n",
    "  2.  Register or sign in to your WCS account.\n",
    "  3.  Create a new cluster with the following settings:\n",
    "      - `Subscription Tier` – Free sandbox for a free trial, or contact [hello@weaviate.io](mailto:hello@weaviate.io) for other options.\n",
    "      - `Cluster name` – a unique name for your cluster. The name will become part of the URL used to access this instance.\n",
    "      - `Enable Authentication?` – Enabled by default. This will generate a static API key that you can use to authenticate.\n",
    "  4.  Wait for a few minutes until your cluster is ready. You will see a green tick ✔️ when it's done. Copy your cluster URL.\n",
    "\n",
    "- **Hybrid SaaS**\n",
    "\n",
    "  > If you need to keep your data on-premise for security or compliance reasons, Weaviate also offers a Hybrid SaaS option: Weaviate runs within your cloud instances, but the cluster is managed remotely by Weaviate. This gives you the benefits of a managed service without sending data to an external party.\n",
    "\n",
    "  The Weaviate Hybrid SaaS is a custom solution. If you are interested in this option, please reach out to [hello@weaviate.io](mailto:hello@weaviate.io).\n",
    "\n",
    "- **Self-hosted** – with a Docker container\n",
    "\n",
    "  To set up a Weaviate instance with Docker:\n",
    "\n",
    "  1. [Install Docker](https://docs.docker.com/engine/install/) on your local machine if it is not already installed.\n",
    "  2. [Install the Docker Compose Plugin](https://docs.docker.com/compose/install/)\n",
    "  3. Download a `docker-compose.yml` file with this `curl` command:\n",
    "\n",
    "     ```\n",
    "     curl -o docker-compose.yml \"https://configuration.weaviate.io/v2/docker-compose/docker-compose.yml?modules=standalone&runtime=docker-compose&weaviate_version=v1.19.6\"\n",
    "     ```\n",
    "\n",
    "     Alternatively, you can use Weaviate's docker compose [configuration tool](https://weaviate.io/developers/weaviate/installation/docker-compose) to generate your own `docker-compose.yml` file.\n",
    "\n",
    "  4. Run `docker compose up -d` to spin up a Weaviate instance.\n",
    "\n",
    "     > To shut it down, run `docker compose down`.\n",
    "\n",
    "- **Self-hosted** – with a Kubernetes cluster\n",
    "\n",
    "  To configure a self-hosted instance with Kubernetes, follow Weaviate's [documentation](https://weaviate.io/developers/weaviate/installation/kubernetes).|\n",
    "\n",
    "- **Embedded** - start a weaviate instance right from your application code using the client library\n",
    "\n",
    "  This code snippet shows how to instantiate an embedded weaviate instance and upload a document:\n",
    "\n",
    "  ```python\n",
    "  import weaviate\n",
    "  from weaviate.embedded import EmbeddedOptions\n",
    "\n",
    "  client = weaviate.Client(\n",
    "    embedded_options=EmbeddedOptions()\n",
    "  )\n",
    "\n",
    "  data_obj = {\n",
    "    \"name\": \"Chardonnay\",\n",
    "    \"description\": \"Goes with fish\"\n",
    "  }\n",
    "\n",
    "  client.data_object.create(data_obj, \"Wine\")\n",
    "  ```\n",
    "\n",
    "  Refer to the [documentation](https://weaviate.io/developers/weaviate/installation/embedded) for more details about this deployment method.\n"
   ]
  },
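  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Whichever deployment option you choose, it is worth confirming that the instance is reachable before wiring it into the kernel. Below is a minimal sketch using the Weaviate Python client v3 API (the same API exposed by the connector's `store.client` handle later in this notebook); swap in your WCS cluster URL and API key if you are not running locally:\n",
    "\n",
    "```python\n",
    "import weaviate\n",
    "\n",
    "# Assumes a local Docker deployment on the default port\n",
    "client = weaviate.Client(\"http://localhost:8080\")\n",
    "print(client.is_ready())  # True once the instance is up and accepting requests\n",
    "```\n"
   ]
  },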
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Setup\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Note: if using a virtual environment, do not run this cell\n",
    "%pip install -U semantic-kernel[weaviate]\n",
    "from semantic_kernel import __version__\n",
    "\n",
    "__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## OS-specific notes:\n",
    "\n",
    "- if you run into SSL errors when connecting to OpenAI on macOS, see this issue for a [potential solution](https://github.com/microsoft/semantic-kernel/issues/627#issuecomment-1580912248)\n",
    "- on Windows, you may need to run Docker Desktop as administrator\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, we instantiate the Weaviate memory store. Uncomment ONE of the options below, depending on how you want to use Weaviate:\n",
    "\n",
    "- from a Docker instance\n",
    "- from WCS\n",
    "- directly from the client (embedded Weaviate), which works on Linux only at the moment\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from semantic_kernel.connectors.memory.weaviate import WeaviateMemoryStore\n",
    "\n",
    "# Note the Weaviate Config values need to be either configured as environment variables\n",
    "# or in the .env file, as a back up. When creating the instance of the `weaviate_memory_store`\n",
    "# pass in `env_file_path=<path_to_file>` to read the config values from the `.env` file, otherwise\n",
    "# the values will be read from environment variables.\n",
    "# Env variables or .env file config should look like:\n",
    "# WEAVIATE_URL=\"http://localhost:8080\"\n",
    "# WEAVIATE_API_KEY=\"\"\n",
    "# WEAVIATE_USE_EMBED=True|False\n",
    "\n",
    "store = WeaviateMemoryStore()\n",
    "store.client.schema.delete_all()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then, we register the memory store to the kernel:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from services import Service\n",
    "\n",
    "# Select a service to use for this notebook (available services: OpenAI, AzureOpenAI, HuggingFace)\n",
    "selectedService = Service.OpenAI"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion, OpenAITextEmbedding\n",
    "from semantic_kernel.core_plugins.text_memory_plugin import TextMemoryPlugin\n",
    "from semantic_kernel.kernel import Kernel\n",
    "from semantic_kernel.memory.semantic_text_memory import SemanticTextMemory\n",
    "from semantic_kernel.memory.volatile_memory_store import VolatileMemoryStore\n",
    "\n",
    "kernel = Kernel()\n",
    "\n",
    "chat_service_id = \"chat\"\n",
    "if selectedService == Service.OpenAI:\n",
    "    oai_chat_service = OpenAIChatCompletion(\n",
    "        service_id=chat_service_id,\n",
    "        ai_model_id=\"gpt-3.5-turbo\",\n",
    "    )\n",
    "    embedding_gen = OpenAITextEmbedding(ai_model_id=\"text-embedding-ada-002\")\n",
    "    kernel.add_service(oai_chat_service)\n",
    "    kernel.add_service(embedding_gen)\n",
    "\n",
    "memory = SemanticTextMemory(storage=VolatileMemoryStore(), embeddings_generator=embedding_gen)\n",
    "kernel.add_plugin(TextMemoryPlugin(memory), \"TextMemoryPlugin\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Manually adding memories\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's create some initial memories \"About Me\". We can add memories to our weaviate memory store by using `save_information`\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "collection_id = \"generic\"\n",
    "\n",
    "\n",
    "async def populate_memory(memory: SemanticTextMemory) -> None:\n",
    "    # Add some documents to the semantic memory\n",
    "    await memory.save_information(collection=collection_id, id=\"info1\", text=\"Your budget for 2024 is $100,000\")\n",
    "    await memory.save_information(collection=collection_id, id=\"info2\", text=\"Your savings from 2023 are $50,000\")\n",
    "    await memory.save_information(collection=collection_id, id=\"info3\", text=\"Your investments are $80,000\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "await populate_memory(memory)"
   ]
  },
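  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because Weaviate is a persistent store, these records now live in the database itself rather than in the notebook process. As a quick sanity check, you can inspect the schema through the same raw client handle (`store.client`) used earlier; the collection created by `save_information` should appear in the output:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# List the classes currently defined in the Weaviate schema\n",
    "print(store.client.schema.get())"
   ]
  },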
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Searching is done through `search`:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "async def search_memory_examples(memory: SemanticTextMemory) -> None:\n",
    "    questions = [\"What is my budget for 2024?\", \"What are my savings from 2023?\", \"What are my investments?\"]\n",
    "\n",
    "    for question in questions:\n",
    "        print(f\"Question: {question}\")\n",
    "        result = await memory.search(collection_id, question)\n",
    "        print(f\"Answer: {result[0].text}\\n\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "await search_memory_examples(memory)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here's how to use the weaviate memory store in a chat application:\n"
   ]
  },
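  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The prompt template below relies on the `recall` function registered by `TextMemoryPlugin`: at render time, each `{{recall '...'}}` call searches the memory store and inlines the most relevant result. To see what a single call resolves to, you can invoke the function directly. This is a sketch, and the parameter names (`ask`, `collection`) are assumptions about the plugin's signature:\n",
    "\n",
    "```python\n",
    "result = await kernel.invoke(\n",
    "    function_name=\"recall\",\n",
    "    plugin_name=\"TextMemoryPlugin\",\n",
    "    ask=\"What is my budget for 2024?\",\n",
    "    collection=collection_id,\n",
    ")\n",
    "print(result)\n",
    "```\n"
   ]
  },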
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from semantic_kernel.functions.kernel_function import KernelFunction\n",
    "from semantic_kernel.prompt_template.prompt_template_config import PromptTemplateConfig\n",
    "\n",
    "\n",
    "async def setup_chat_with_memory(\n",
    "    kernel: Kernel,\n",
    "    service_id: str,\n",
    ") -> KernelFunction:\n",
    "    prompt = \"\"\"\n",
    "    ChatBot can have a conversation with you about any topic.\n",
    "    It can give explicit instructions or say 'I don't know' if\n",
    "    it does not have an answer.\n",
    "\n",
    "    Information about me, from previous conversations:\n",
    "    - {{recall 'budget by year'}} What is my budget for 2024?\n",
    "    - {{recall 'savings from previous year'}} What are my savings from 2023?\n",
    "    - {{recall 'investments'}} What are my investments?\n",
    "\n",
    "    {{$request}}\n",
    "    \"\"\".strip()\n",
    "\n",
    "    prompt_template_config = PromptTemplateConfig(\n",
    "        template=prompt,\n",
    "        execution_settings={\n",
    "            service_id: kernel.get_service(service_id).get_prompt_execution_settings_class()(service_id=service_id)\n",
    "        },\n",
    "    )\n",
    "\n",
    "    return kernel.add_function(\n",
    "        function_name=\"chat_with_memory\",\n",
    "        plugin_name=\"TextMemoryPlugin\",\n",
    "        prompt_template_config=prompt_template_config,\n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "async def chat(kernel: Kernel, chat_func: KernelFunction) -> bool:\n",
    "    try:\n",
    "        user_input = input(\"User:> \")\n",
    "    except KeyboardInterrupt:\n",
    "        print(\"\\n\\nExiting chat...\")\n",
    "        return False\n",
    "    except EOFError:\n",
    "        print(\"\\n\\nExiting chat...\")\n",
    "        return False\n",
    "\n",
    "    if user_input == \"exit\":\n",
    "        print(\"\\n\\nExiting chat...\")\n",
    "        return False\n",
    "\n",
    "    answer = await kernel.invoke(chat_func, request=user_input)\n",
    "\n",
    "    print(f\"ChatBot:> {answer}\")\n",
    "    return True"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Populating memory...\")\n",
    "await populate_memory(memory)\n",
    "\n",
    "print(\"Asking questions... (manually)\")\n",
    "await search_memory_examples(memory)\n",
    "\n",
    "print(\"Setting up a chat (with memory!)\")\n",
    "chat_func = await setup_chat_with_memory(kernel, chat_service_id)\n",
    "\n",
    "print(\"Begin chatting (type 'exit' to exit):\\n\")\n",
    "print(\n",
    "    \"Welcome to the chat bot!\\\n",
    "    \\n  Type 'exit' to exit.\\\n",
    "    \\n  Try asking a question about your finances (i.e. \\\"talk to me about my finances\\\").\"\n",
    ")\n",
    "chatting = True\n",
    "while chatting:\n",
    "    chatting = await chat(kernel, chat_func)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Adding documents to your memory\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a dictionary to hold some files. The key is the hyperlink to the file and the value is the file's content:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "github_files = {}\n",
    "github_files[\"https://github.com/microsoft/semantic-kernel/blob/main/README.md\"] = (\n",
    "    \"README: Installation, getting started, and how to contribute\"\n",
    ")\n",
    "github_files[\n",
    "    \"https://github.com/microsoft/semantic-kernel/blob/main/dotnet/notebooks/02-running-prompts-from-file.ipynb\"\n",
    "] = \"Jupyter notebook describing how to pass prompts from a file to a semantic plugin or function\"\n",
    "github_files[\"https://github.com/microsoft/semantic-kernel/blob/main/dotnet/notebooks/00-getting-started.ipynb\"] = (\n",
    "    \"Jupyter notebook describing how to get started with the Semantic Kernel\"\n",
    ")\n",
    "github_files[\"https://github.com/microsoft/semantic-kernel/tree/main/samples/plugins/ChatPlugin/ChatGPT\"] = (\n",
    "    \"Sample demonstrating how to create a chat plugin interfacing with ChatGPT\"\n",
    ")\n",
    "github_files[\n",
    "    \"https://github.com/microsoft/semantic-kernel/blob/main/dotnet/src/SemanticKernel/Memory/Volatile/VolatileMemoryStore.cs\"\n",
    "] = \"C# class that defines a volatile embedding store\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use `save_reference` to save the file:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "COLLECTION = \"SKGitHub\"\n",
    "\n",
    "print(\"Adding some GitHub file URLs and their descriptions to a volatile Semantic Memory.\")\n",
    "for index, (entry, value) in enumerate(github_files.items()):\n",
    "    await memory.save_reference(\n",
    "        collection=COLLECTION,\n",
    "        description=value,\n",
    "        text=value,\n",
    "        external_id=entry,\n",
    "        external_source_name=\"GitHub\",\n",
    "    )\n",
    "    print(\"  URL {} saved\".format(index))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use `search` to ask a question:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ask = \"I love Jupyter notebooks, how should I get started?\"\n",
    "print(\"===========================\\n\" + \"Query: \" + ask + \"\\n\")\n",
    "\n",
    "memories = await memory.search(COLLECTION, ask, limit=5, min_relevance_score=0.77)\n",
    "\n",
    "for index, memory in enumerate(memories):\n",
    "    print(f\"Result {index}:\")\n",
    "    print(\"  URL:     : \" + memory.id)\n",
    "    print(\"  Title    : \" + memory.description)\n",
    "    print(\"  Relevance: \" + str(memory.relevance))\n",
    "    print()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}