{ "cells": [ { "cell_type": "markdown", "id": "e410b6c4", "metadata": {}, "source": [ "# Build a Question Answering Engine in Minutes" ] }, { "cell_type": "markdown", "id": "9bed6f24", "metadata": {}, "source": [ "This notebook illustrates how to build a question answering engine from scratch using [Milvus](https://milvus.io/) and [Towhee](https://towhee.io/). Milvus is an open-source vector database built for AI applications that supports nearest-neighbor embedding search across tens of millions of entries, and Towhee is a framework that provides ETL for unstructured data using state-of-the-art machine learning models.\n", "\n", "We will walk through the question answering procedure and evaluate its performance. With Towhee, the core functionality comes down to about 10 lines of code, so you can start hacking on your own question answering engine." ] }, { "cell_type": "markdown", "id": "4883e577", "metadata": {}, "source": [ "## Preparations" ] }, { "cell_type": "markdown", "id": "49110b91", "metadata": {}, "source": [ "### Install Dependencies" ] }, { "cell_type": "markdown", "id": "0117995a", "metadata": {}, "source": [ "First, we need to install the dependencies: towhee, towhee.models, and gradio." ] }, { "cell_type": "code", "execution_count": 41, "id": "c9ba3850", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "[notice] A new release of pip available: 22.3.1 -> 23.0\n", "[notice] To update, run: pip install --upgrade pip\n" ] } ], "source": [ "! 
python -m pip install -q towhee towhee.models gradio" ] }, { "cell_type": "markdown", "id": "a90db0c5", "metadata": {}, "source": [ "### Prepare the Data" ] }, { "cell_type": "markdown", "id": "d1eceb58", "metadata": {}, "source": [ "This demo uses a subset of the [InsuranceQA Corpus](https://github.com/shuzi/insuranceQA) (1000 question-answer pairs), which you can download from [GitHub](https://github.com/towhee-io/examples/releases/download/data/question_answer.csv)." ] }, { "cell_type": "code", "execution_count": 1, "id": "d1436a9c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " % Total % Received % Xferd Average Speed Time Time Time Current\n", " Dload Upload Total Spent Left Speed\n", " 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\n", "100 595k 100 595k 0 0 286k 0 0:00:02 0:00:02 --:--:-- 666k\n" ] } ], "source": [ "! curl -L https://github.com/towhee-io/examples/releases/download/data/question_answer.csv -O" ] }, { "cell_type": "markdown", "id": "c4abdc0a", "metadata": {}, "source": [ "**question_answer.csv**: a file containing the questions and their answers.\n", "\n", "Let's take a quick look:" ] }, { "cell_type": "code", "execution_count": 2, "id": "d652efea", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
 | id | question | answer |
---|---|---|---|
0 | 0 | Is Disability Insurance Required By Law? | Not generally. There are five states that requ... |
1 | 1 | Can Creditors Take Life Insurance After ... | If the person who passed away was the one with... |
2 | 2 | Does Travelers Insurance Have Renters Ins... | One of the insurance carriers I represent is T... |
3 | 3 | Can I Drive A New Car Home Without Ins... | Most auto dealers will not let you drive the c... |
4 | 4 | Is The Cash Surrender Value Of Life Ins... | Cash surrender value comes only with Whole Lif... |
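
A preview like the one above can be reproduced with a few lines of Python. The following is a minimal sketch, not necessarily how this notebook loads the data: it uses only the standard library `csv` module and assumes the file has the `id`, `question`, and `answer` columns shown in the preview. The helper name `preview_rows` and its `n` and `width` parameters are hypothetical, introduced here for illustration.

```python
import csv
import os
from itertools import islice


def _truncate(value, width):
    """Shorten long strings so each row stays readable; leave other values alone."""
    if isinstance(value, str) and len(value) > width:
        return value[: width - 3] + "..."
    return value


def preview_rows(csv_file, n=5, width=40):
    """Return the first n rows of a CSV file object as dicts with truncated text.

    Hypothetical helper: assumes the first line of the file is a header row,
    e.g. id,question,answer as in the preview above.
    """
    reader = csv.DictReader(csv_file)
    return [
        {key: _truncate(value, width) for key, value in row.items()}
        for row in islice(reader, n)
    ]


# Only runs if the dataset has already been downloaded to the working directory.
if os.path.exists("question_answer.csv"):
    with open("question_answer.csv", newline="", encoding="utf-8") as f:
        for row in preview_rows(f):
            print(row)
```

The function takes an open file object rather than a path, so it can be tried on any file-like source (for example an `io.StringIO`) before pointing it at the downloaded CSV.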
question | answer |
---|---|
Is Disability Insurance Required By Law? | Not generally. There are five states that require most all employers carry short term disability insurance on their employees. T... |