{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "BhHJ1MNYyYME" }, "source": [ "# Speech-to-Text with faster-whisper (Whisper large v3) for large audio files in any language" ] }, { "cell_type": "markdown", "metadata": { "id": "U1FdQpBXyiFF" }, "source": [ "- Author: Pierre Guillou\n", "- Date: 04/12/2023\n", "- Blog post: [Speech-to-Text | Quickly get a transcription of a large audio file in any language with \"Faster-Whisper\"](https://medium.com/@pierre_guillou/speech-to-text-quickly-get-a-transcription-of-a-large-audio-file-in-any-language-with-e4d4d2daf0cd)\n", "- Sources\n", " - GitHub: https://github.com/guillaumekln/faster-whisper\n", " - [Whisper large v3](https://huggingface.co/openai/whisper-large-v3)\n", " - blog: [Making OpenAI Whisper faster](https://github.com/guillaumekln/faster-whisper#faster-whisper-transcription-with-ctranslate2)" ] }, { "cell_type": "code", "source": [ "# check if there is a GPU\n", "!nvidia-smi" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "pALb2uZFkDgA", "outputId": "207f6ba2-e81d-4be1-a946-ad8da335bb2a" }, "execution_count": 6, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Mon Dec 4 16:01:38 2023 \n", "+-----------------------------------------------------------------------------+\n", "| NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0 |\n", "|-------------------------------+----------------------+----------------------+\n", "| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n", "| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n", "| | | MIG M. 
|\n", "|===============================+======================+======================|\n", "| 0 Tesla T4 Off | 00000000:00:04.0 Off | 0 |\n", "| N/A 31C P8 8W / 70W | 0MiB / 15360MiB | 0% Default |\n", "| | | N/A |\n", "+-------------------------------+----------------------+----------------------+\n", " \n", "+-----------------------------------------------------------------------------+\n", "| Processes: |\n", "| GPU GI CI PID Type Process name GPU Memory |\n", "| ID ID Usage |\n", "|=============================================================================|\n", "| No running processes found |\n", "+-----------------------------------------------------------------------------+\n" ] } ] }, { "cell_type": "markdown", "metadata": { "id": "IMl3c6dQjXGO" }, "source": [ "## About faster-whisper" ] }, { "cell_type": "markdown", "metadata": { "id": "eKpJXbQzjT-k" }, "source": [ "faster-whisper is a reimplementation of the OpenAI Whisper model in CTranslate2, a library for efficient inference with Transformer models. CTranslate2 gets its efficiency from techniques such as weight quantization, layer fusion, and batch reordering.\n", "\n", "As a result, faster-whisper gives a noticeable speed-up over the original OpenAI implementation.\n", "\n", "**Method**: we only need to provide the path to the audio file (mp3 or wav); the faster-whisper library itself resamples the audio to 16 kHz." ] }, { "cell_type": "markdown", "metadata": { "id": "Uh6vBNpPwUoM" }, "source": [ "## Setup" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "vfr0sPzer-uq", "outputId": "798ac8ee-9151-4e45-f142-d6af22aa2566" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.5/1.5 MB\u001b[0m \u001b[31m12.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25h Preparing metadata (setup.py) ... 
\u001b[?25l\u001b[?25hdone\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m31.0/31.0 MB\u001b[0m \u001b[31m40.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m36.8/36.8 MB\u001b[0m \u001b[31m19.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m6.4/6.4 MB\u001b[0m \u001b[31m47.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m46.0/46.0 kB\u001b[0m \u001b[31m6.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m86.8/86.8 kB\u001b[0m \u001b[31m10.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25h Building wheel for faster-whisper (setup.py) ... \u001b[?25l\u001b[?25hdone\n" ] } ], "source": [ "!pip install -q faster-whisper" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "IW9fre5pFWR0" }, "outputs": [], "source": [ "# audio library\n", "!pip install -q pydub\n", "import pydub" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "id": "Z4ogN2m7I_4C" }, "outputs": [], "source": [ "import pathlib\n", "from pathlib import Path\n", "\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": { "id": "pvW3IQkzrM39" }, "source": [ "## Path to audio files" ] }, { "cell_type": "markdown", "source": [ "### mp3" ], "metadata": { "id": "hErWhj_ohVXe" } }, { "cell_type": "code", "execution_count": 4, "metadata": { "id": "gRU41nLnrHmx" }, "outputs": [], "source": [ "# path to mp3 audio files\n", "path_to_main = \"/content/audio_files/\"\n", "path_to_mp3_audio_folder = path_to_main + \"mp3_audio_files/\"\n", "\n", "# path to transcripts\n", "path_to_transcripts_folder = path_to_mp3_audio_folder + \"transcripts/\"\n", "\n", "if not 
Path(path_to_transcripts_folder).is_dir(): Path(path_to_transcripts_folder).mkdir(parents=True, exist_ok=True)" ] }, { "cell_type": "markdown", "source": [ "Upload your mp3 file to the `path_to_mp3_audio_folder` folder.\n", "\n", "Here, we use the [audio file](https://github.com/piegu/language-models/blob/master/audio/lesson1_of_RAG_course_with_DeepLearningAI.mp3) from the lesson 1 video of the course [Building and Evaluating Advanced RAG Applications](https://www.deeplearning.ai/short-courses/building-evaluating-advanced-rag/) (DeepLearning.AI)." ], "metadata": { "id": "zOty4GlQhAq8" } }, { "cell_type": "code", "source": [ "# list every mp3 file in the folder (and its subfolders)\n", "p = Path(path_to_mp3_audio_folder).glob('**/*')\n", "mp3_audio_files = [x for x in p if x.is_file() and x.suffix == \".mp3\"]\n", "len(mp3_audio_files), mp3_audio_files[0]" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "vBH-DrZdyk6l", "outputId": "dc02bc77-52d3-48b6-9a7e-48eb57033394" }, "execution_count": 10, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(1,\n", " PosixPath('/content/audio_files/mp3_audio_files/introduction_to_RAG_course_with_DeepLearningAI.mp3'))" ] }, "metadata": {}, "execution_count": 10 } ] }, { "cell_type": "markdown", "source": [ "### wav" ], "metadata": { "id": "yjmOhe2ThZoU" } }, { "cell_type": "markdown", "source": [ "If your file is in wav format, you can convert it to mp3 with the code below. The conversion is shown for reference only: **(faster) Whisper does not need mp3 input**.\n", "\n", "It works very well (and faster!) with a wav file directly." 
], "metadata": { "id": "b8yT-HMSheJU" } }, { "cell_type": "code", "source": [ "# path to wav audio files\n", "path_to_wav_audio_folder = path_to_main + \"wav_audio_files/\"\n", "Path(path_to_wav_audio_folder).mkdir(parents=True, exist_ok=True)" ], "metadata": { "id": "DFdDCaHIh2G0" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "source": [ "Upload your wav file to the `path_to_wav_audio_folder` folder." ], "metadata": { "id": "5Q8wEQ_oiAdU" } }, { "cell_type": "code", "source": [ "# list every wav file in the folder (and its subfolders)\n", "p = Path(path_to_wav_audio_folder).glob('**/*')\n", "wav_audio_files = [x for x in p if x.is_file() and x.suffix == \".wav\"]\n", "print(len(wav_audio_files), wav_audio_files[0])\n", "\n", "# keep the path as a str: wave.open() expects a str or a file object\n", "path_to_audio_file_wav = str(wav_audio_files[0])" ], "metadata": { "id": "g9jnYxhriTv1" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "JKVC3zYPwajq" }, "source": [ "#### (optional) Analysis and playback of the wav audio file" ] }, { "cell_type": "markdown", "metadata": { "id": "JDijv3IvqBN5" }, "source": [ "`Audio()` from `IPython.display` fails to play wav files in this notebook (apparently a bug), so we use `pydub` to load and play the file instead." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "tOzPX8HMkgNy", "outputId": "5bc21da3-638a-4ae8-fd0c-8f798668e09e" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Number of channels 2\n", "Sample width 2\n", "Frame rate. 
44100\n", "Number of frames 12071052\n", "parameters: _wave_params(nchannels=2, sampwidth=2, framerate=44100, nframes=12071052, comptype='NONE', compname='not compressed')\n" ] } ], "source": [ "# Analysis of the wav audio file\n", "\n", "import wave\n", "obj = wave.open(path_to_audio_file_wav,'r')\n", "print( \"Number of channels\",obj.getnchannels())\n", "print ( \"Sample width\",obj.getsampwidth())\n", "print ( \"Frame rate.\",obj.getframerate())\n", "print (\"Number of frames\",obj.getnframes())\n", "print ( \"parameters:\",obj.getparams())\n", "obj.close()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 75 }, "id": "Pj_TExdYv9h2", "outputId": "aeecf0bc-035f-4635-8247-9bee585e0d53" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", " " ] }, "metadata": {}, "execution_count": 10 } ], "source": [ "# Display and read wav audio file\n", "\n", "sound = pydub.AudioSegment.from_wav(path_to_audio_file_wav)\n", "sound = sound.set_frame_rate(16000) # allow a faster display\n", "\n", "sound" ] }, { "cell_type": "markdown", "metadata": { "id": "79TRSsvNGiun" }, "source": [ "#### Conversion wav to mp3" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 75 }, "id": "0SmAus71GRv-", "outputId": "10bc4fe6-d4c1-40ed-8bc3-b16bc1d4791f" }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", " " ] }, "metadata": {}, "execution_count": 11 } ], "source": [ "# create mp3 folder if does not exist\n", "path_to_main = \"/content/audio_files/\"\n", "path_to_mp3_audio_folder = path_to_main + \"mp3_audio_files/\"\n", "if not Path(path_to_mp3_audio_folder).is_dir(): Path(path_to_mp3_audio_folder).mkdir(parents=True, exist_ok=True)\n", "\n", "# path to mp3 audio file\n", "path_to_audio_file_mp3 = path_to_mp3_audio_folder + 
Path(path_to_audio_file_wav).name.replace(\".wav\", \".mp3\") # file name only, so it can be appended to the mp3 folder\n", "\n", "# conversion to mp3\n", "sound.export(path_to_audio_file_mp3, format=\"mp3\")\n", "# print(f\"frame rate (mp3 audio file): {sound.frame_rate}\")\n", "sound" ] }, { "cell_type": "markdown", "metadata": { "id": "hV2UugdWf_Hs" }, "source": [ "**Note**: the usual Python way to display and play an audio file in a notebook is:\n", "\n", "```\n", "from IPython.display import Audio\n", "Audio(path_to_audio_file_mp3)\n", "```\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "kCtw062_wX0t" }, "source": [ "## Model (faster) Whisper" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "referenced_widgets": [ "868b9f20277543ab9f3c5ff961f49ea7", "bc95b41243b94db6ba779fd1f744362d", "f8a64046040a4f73924355833c31002f", "cd68ef3a3c194f3fb15b841b7d087988", "7d6182943e12405bb41541d19b73597c", "23ea7b6f426d493e9aec208411cf619d", "98897d8c2e5347fc8b4f77e082e1cc86", "02e51352eacf4c22a9c21ac0486708c9", "bd06b216e2334db2a0d7ccc2274b98cf", "3651bcba50f6432a96e9120e66a822f6", "4d648c38f0c643daa2fc7ce94ab335b9", "fd10664fb56244b98758f8240adb9f22", "2b037483a42f4d15be233997afd8856f", "66c645482f6b453ea94be3adc2e28cca", "71a699acfccc451c8d97014b38e036b2", "497810dff68447e09558fabc75ae09b8", "41893727a98246eabc172b423894957e", "2d6d6790abbe4017b8b368cfd3620935", "860b19ce05e7487e981f6a33e0615085", "04ab837b5cad4f73bf871050c9287d4f", "cfdbc970ece447ceaaf2551493702495", "86a9f41e2ba74d1b8d07bc13d9ae20e3", "c46a0a47c351407b89a65d750509e66e", "5a5756c717334947a0ab71c820e53718", "bc991543a83a4f7586e3309d9fab5579", "fadf0fc2f8bc40febf43556ed0114601", "5fe7813225d74e63ad59299ca0382aae", "9ade777651c14ec0bf69b9c92620491a", "e9602cdef15c448d84c077b5e06833d1", "986bc6ee52754f18a02948bb2e0f3a6e", "aa14d821cd124fd18c5af25500bc9fd8", "d9c7f5b7cbaf4a23a2213b82b5e9fd15", "170aaf4dcd4744bd955ed909f4b06921", "21b6a9ee567840e7ac38804048b8175c", 
"0d1bbacf381a40dd948533cee2216241", "3b791e1aa5434c1596c5139dcf5545a2", "807d3db5d30242b9a408235800ca02e3", "08ce5b8f220942b2a8063012698d52dd", "292d58ed6c744052ba36be0140be66fb", "f175883288d34ffb88e406b65380171c", "eaf9aca102784e278e71dabf913d2ffc", "12ec634e6f1e404aa91e5d1ce53f6760", "0634f9677fc84a47b234d2a7896401e5", "9b1ee122c584485da03994d5bd08809e", "be529ee0f81046569146099d95aa8d0a", "0d1e9270814e4662bfd148f1d8a83b23", "2505cd6ab53d489aad37806bffa915e2", "ee6c8fa1f0c54dfdae9fba8669f91ccc", "2d3ed864e688457db9759977ed0d063e", "fb3ac3ef0b4049e8a546a3f3f78ed4de", "1a05f388b2444ec1b7e2979a9963880a", "c81450bd7f1649b7acd0bb6e793986a8", "79e9dc04394646a6979774eff376ff12", "45d128ef6a424a4fa2c0676b45c30029", "58f16a7120744474a8cccbadee903536" ], "height": 177 }, "id": "kRpZlrZgsBJx", "outputId": "e32f3234-6e20-4a0d-8dbb-cf5bae15a491" }, "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "config.json: 0%| | 0.00/2.39k [00:00 %.2fs] %s\" % (segment.start, segment.end, segment.text))\n", " start, end, text = segment.start, segment.end, segment.text\n", " start_segments.append(start)\n", " end_segments.append(end)\n", " text_segments.append(text)\n", "\n", " # save transcript into csv\n", " df = pd.DataFrame()\n", " df[\"start\"] = start_segments\n", " df[\"end\"] = end_segments\n", " df[\"text\"] = text_segments\n", " path_to_audio_file_transcript = path_to_transcripts_folder + path_to_audio_file.name.replace(\".mp3\", \".csv\").replace(\".wav\", \".csv\")\n", " df.to_csv(path_to_audio_file_transcript, encoding='utf-8', index=False)\n", "\n", " if i % 2 == 0: print(i)" ] }, { "cell_type": "code", "source": [ "df" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 687 }, "id": "cgUWQXgZk1u9", "outputId": "bfbf0f08-77ff-4254-a0a8-da5a20e38f01" }, "execution_count": 12, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " start end text\n", "0 1.97 6.75 Retrieval Augmented Generation, or RAG, has 
b...\n", "1 6.75 13.25 answered questions over a user's own data. Bu...\n", "2 13.25 18.91 RAG system, it costs a lot to have effective ...\n", "3 18.91 24.05 relevant context to generate his answer, and ...\n", "4 24.05 30.21 to help you efficiently iterate and improve y...\n", "5 30.21 35.47 during post-deployment maintenance. This cour...\n", "6 35.99 41.03 sentence window retrieval and auto-merging re...\n", "7 41.03 47.45 context to the LM than simpler methods. It al...\n", "8 47.45 53.21 system with three evaluation metrics, context...\n", "9 53.73 59.65 I'm excited to introduce Jerry Liu, co-founde...\n", "10 60.11 60.19 co-founder and CEO of LarmRatex.\n", "11 60.21 68.85 For a long time, I've enjoyed following Jerry...\n", "12 68.85 74.01 evolving RAG practices, so I'm looking forwar...\n", "13 74.01 79.19 systematically here. And Anupam has been a pr...\n", "14 79.19 86.07 over a decade on trustworthy AI and how to mo...\n", "15 86.65 88.37 Thanks, Andrew. It's great to be here.\n", "16 89.07 90.11 Great to be with you, Andrew.\n", "17 90.11 95.75 Sentence window retrieval gives an LLN better...\n", "18 95.75 100.27 sentence, but the window of sentences that oc...\n", "19 101.39 106.13 Auto-merging retrieval organizes the document...\n", "20 106.13 111.61 node's text is divided among its child nodes....\n", "21 111.61 116.25 a user's question, then the entire text of th...\n", "22 116.67 120.09 I know this sounds like a lot of steps, but d...\n", "23 120.11 126.23 The main takeaway is that this provides a way...\n", "24 126.23 133.49 than simpler methods. 
To evaluate RAG-based L...\n", "25 133.49 140.77 three main steps of a RAG's execution, is qui...\n", "26 140.77 148.43 how to compute context relevance, which measu...\n", "27 148.43 149.47 the user's question.\n", "28 150.11 154.91 This helps you identify and debug possible is...\n", "29 154.91 158.59 is retrieving context for the LLN in the QA s...\n", "30 159.15 165.07 But that's only part of the overall QA system...\n", "31 165.07 171.55 such as groundedness and answer relevance, th...\n", "32 171.55 179.55 system are or are not yet working well, so th...\n", "33 179.55 180.05 part needs to be improved.\n", "34 180.11 186.27 If you're familiar with the concept of error ...\n", "35 186.27 192.43 this has similarities. And I've found that ta...\n", "36 192.43 197.71 helps you be much more efficient in building ...\n", "37 197.71 203.63 The goal of this course is to help you build ...\n", "38 204.35 209.95 An important part of getting production ready...\n", "39 210.67 215.95 In the later half of this course, you'll gain...\n", "40 215.95 222.91 methods and evaluation methods. And you'll al...\n", "41 222.91 226.11 to establish a baseline and then quickly impr...\n", "42 226.11 230.51 We'll also share some suggestions for tuning ...\n", "43 230.51 233.87 based on our experience assisting partners wh...\n", "44 233.87 239.47 Many people have worked to create this course...\n", "45 240.11 247.87 Logan Machowicz, and on the Truera side, Shai...\n", "46 247.87 253.31 From deeplearning.ai, Eddie Xu and Tialla Ezz...\n", "47 253.31 257.15 The next lesson will give you an overview of ...\n", "48 257.15 262.27 You'll try out question answering systems tha...\n", "49 262.27 267.23 and compare their performance on the RAG tria...\n", "50 267.23 267.71 relevance.\n", "51 267.71 269.95 Sounds great. Let's get started.\n", "52 269.95 273.71 And I think you'll be able to really clean up...\n", "53 275.07 275.71 Laughed on it." ], "text/html": [ "\n", "
\n" ] }, "metadata": {}, "execution_count": 12 } ] }, { "cell_type": "markdown", "source": [ "## Display transcript" ], "metadata": { "id": "hrOtnMLxuStG" } }, { "cell_type": "code", "source": [ "import nltk\n", "\n", "# Download the Punkt tokenizer\n", "nltk.download('punkt')" ], "metadata": { "id": "MXYiGvx-8TMY" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "# Join the segments into one paragraph and collapse repeated whitespace\n", "paragraph = ' '.join(' '.join(df[\"text\"].tolist()).split())\n", "\n", "# Tokenize the paragraph into sentences\n", "sentences = nltk.sent_tokenize(paragraph)" ], "metadata": { "id": "miqsp_PclzXi" }, "execution_count": 43, "outputs": [] }, { "cell_type": "code", "source": [ "css = '''\n", " \n", " '''\n", "\n", "from IPython.display import display, HTML\n", "for sentence in sentences:\n", " display(HTML(f'{css}

{sentence}

'))" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "G2CXvfRMpFBW", "outputId": "5217f62e-b474-41af-b6d6-7690cd0fc493" }, "execution_count": 42, "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

Retrieval Augmented Generation, or RAG, has become a key method for getting LMs answered questions over a user's own data.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

But to actually build and productionize a high-quality RAG system, it costs a lot to have effective retrieval techniques, to give the LM highly relevant context to generate his answer, and also to have an effective evaluation framework to help you efficiently iterate and improve your RAG system, both during initial development and during post-deployment maintenance.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

This course covers two advanced retrieval methods, sentence window retrieval and auto-merging retrieval, that deliver a significantly better context to the LM than simpler methods.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

It also covers how to evaluate your LM question-answering system with three evaluation metrics, context relevance, drowdeness, and answer relevance.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

I'm excited to introduce Jerry Liu, co-founder and CEO of LarmRatex and Anupam Data, co-founder and CEO of LarmRatex.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

For a long time, I've enjoyed following Jerry and LarmRatex on social media and getting tips on evolving RAG practices, so I'm looking forward to him teaching this body of knowledge more systematically here.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

And Anupam has been a professor at CMU and has done research for over a decade on trustworthy AI and how to monitor, evaluate, and optimize AI app effectiveness.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

Thanks, Andrew.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

It's great to be here.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

Great to be with you, Andrew.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

Sentence window retrieval gives an LLN better context by retrieving not just the most relevant sentence, but the window of sentences that occur before and after it in the document.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

Auto-merging retrieval organizes the document into a tree-like structure where each parent node's text is divided among its child nodes.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

When enough child nodes are identified as relevant to a user's question, then the entire text of the parent node is provided as context for the LLN.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

I know this sounds like a lot of steps, but don't worry, we'll go over it in a minute.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

The main takeaway is that this provides a way to dynamically retrieve more coherent chunks of text than simpler methods.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

To evaluate RAG-based LLN apps, the RAG triad, a triad of metrics for the three main steps of a RAG's execution, is quite effective.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

For example, we'll cover in detail how to compute context relevance, which measures how relevant the retrieved chunks of text are to the user's question.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

This helps you identify and debug possible issues with how your system is retrieving context for the LLN in the QA system.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

But that's only part of the overall QA system.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

We'll also cover additional evaluation metrics, such as groundedness and answer relevance, that let you systematically analyze what parts of your system are or are not yet working well, so that you can go in in a targeted way to improve whatever part needs to be improved.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

If you're familiar with the concept of error analysis and machine learning, this has similarities.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

And I've found that taking this sort of systematic approach helps you be much more efficient in building a reliable QA system.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

The goal of this course is to help you build production-ready RAG-based LLN apps.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

An important part of getting production ready is to iterate in a systematic way on the system.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

In the later half of this course, you'll gain hands-on practice iterating using these retrieval methods and evaluation methods.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

And you'll also see how to use systematic experiment tracking to establish a baseline and then quickly improve on that.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

We'll also share some suggestions for tuning these two retrieval methods based on our experience assisting partners who are building RAG apps.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

Many people have worked to create this course.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

I'd like to thank, on the LlamaIndex side, Logan Markewich, and on the TruEra side, Shayak Sen, Joshua Reini, and Barbara Lewis.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

From DeepLearning.AI, Eddy Shyu and Diala Ezzeddine also contributed to this course.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

The next lesson will give you an overview of what you'll see in the rest of the course.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

You'll try out question answering systems that use sentence window retrieval or auto-merging retrieval and compare their performance on the RAG triad, context relevance, groundedness, and answer relevance.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

Sounds great.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

Let's get started.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

And I think you'll be able to really clean up with this RAG stuff.

" ] }, "metadata": {} }, { "output_type": "display_data", "data": { "text/plain": [ "" ], "text/html": [ "\n", " \n", "

Laughed on it.

" ] }, "metadata": {} } ] }, { "cell_type": "markdown", "metadata": { "id": "jUAmLdIbyNMZ" }, "source": [ "# END" ] } ], "metadata": { "accelerator": "GPU", "colab": { "collapsed_sections": [ "yjmOhe2ThZoU", "JKVC3zYPwajq", "79TRSsvNGiun" ], "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "868b9f20277543ab9f3c5ff961f49ea7": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_bc95b41243b94db6ba779fd1f744362d", "IPY_MODEL_f8a64046040a4f73924355833c31002f", "IPY_MODEL_cd68ef3a3c194f3fb15b841b7d087988" ], "layout": "IPY_MODEL_7d6182943e12405bb41541d19b73597c" } }, "bc95b41243b94db6ba779fd1f744362d": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_23ea7b6f426d493e9aec208411cf619d", "placeholder": "​", "style": "IPY_MODEL_98897d8c2e5347fc8b4f77e082e1cc86", "value": "config.json: 100%" } }, "f8a64046040a4f73924355833c31002f": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": 
null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_02e51352eacf4c22a9c21ac0486708c9", "max": 2394, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_bd06b216e2334db2a0d7ccc2274b98cf", "value": 2394 } }, "cd68ef3a3c194f3fb15b841b7d087988": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_3651bcba50f6432a96e9120e66a822f6", "placeholder": "​", "style": "IPY_MODEL_4d648c38f0c643daa2fc7ce94ab335b9", "value": " 2.39k/2.39k [00:00<00:00, 37.7kB/s]" } }, "7d6182943e12405bb41541d19b73597c": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": 
null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "23ea7b6f426d493e9aec208411cf619d": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "98897d8c2e5347fc8b4f77e082e1cc86": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "02e51352eacf4c22a9c21ac0486708c9": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": 
null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "bd06b216e2334db2a0d7ccc2274b98cf": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "3651bcba50f6432a96e9120e66a822f6": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, 
"grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "4d648c38f0c643daa2fc7ce94ab335b9": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "fd10664fb56244b98758f8240adb9f22": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_2b037483a42f4d15be233997afd8856f", "IPY_MODEL_66c645482f6b453ea94be3adc2e28cca", "IPY_MODEL_71a699acfccc451c8d97014b38e036b2" ], "layout": "IPY_MODEL_497810dff68447e09558fabc75ae09b8" } }, "2b037483a42f4d15be233997afd8856f": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, 
"layout": "IPY_MODEL_41893727a98246eabc172b423894957e", "placeholder": "​", "style": "IPY_MODEL_2d6d6790abbe4017b8b368cfd3620935", "value": "preprocessor_config.json: 100%" } }, "66c645482f6b453ea94be3adc2e28cca": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_860b19ce05e7487e981f6a33e0615085", "max": 340, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_04ab837b5cad4f73bf871050c9287d4f", "value": 340 } }, "71a699acfccc451c8d97014b38e036b2": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_cfdbc970ece447ceaaf2551493702495", "placeholder": "​", "style": "IPY_MODEL_86a9f41e2ba74d1b8d07bc13d9ae20e3", "value": " 340/340 [00:00<00:00, 4.94kB/s]" } }, "497810dff68447e09558fabc75ae09b8": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": 
null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "41893727a98246eabc172b423894957e": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "2d6d6790abbe4017b8b368cfd3620935": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", 
"_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "860b19ce05e7487e981f6a33e0615085": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "04ab837b5cad4f73bf871050c9287d4f": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "cfdbc970ece447ceaaf2551493702495": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": 
"@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "86a9f41e2ba74d1b8d07bc13d9ae20e3": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "c46a0a47c351407b89a65d750509e66e": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_5a5756c717334947a0ab71c820e53718", "IPY_MODEL_bc991543a83a4f7586e3309d9fab5579", "IPY_MODEL_fadf0fc2f8bc40febf43556ed0114601" ], "layout": 
"IPY_MODEL_5fe7813225d74e63ad59299ca0382aae" } }, "5a5756c717334947a0ab71c820e53718": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_9ade777651c14ec0bf69b9c92620491a", "placeholder": "​", "style": "IPY_MODEL_e9602cdef15c448d84c077b5e06833d1", "value": "vocabulary.json: 100%" } }, "bc991543a83a4f7586e3309d9fab5579": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_986bc6ee52754f18a02948bb2e0f3a6e", "max": 1068114, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_aa14d821cd124fd18c5af25500bc9fd8", "value": 1068114 } }, "fadf0fc2f8bc40febf43556ed0114601": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_d9c7f5b7cbaf4a23a2213b82b5e9fd15", "placeholder": "​", "style": "IPY_MODEL_170aaf4dcd4744bd955ed909f4b06921", "value": " 1.07M/1.07M [00:00<00:00, 3.30MB/s]" } 
}, "5fe7813225d74e63ad59299ca0382aae": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "9ade777651c14ec0bf69b9c92620491a": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": 
null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "e9602cdef15c448d84c077b5e06833d1": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "986bc6ee52754f18a02948bb2e0f3a6e": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "aa14d821cd124fd18c5af25500bc9fd8": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", 
"model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "d9c7f5b7cbaf4a23a2213b82b5e9fd15": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "170aaf4dcd4744bd955ed909f4b06921": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "21b6a9ee567840e7ac38804048b8175c": { "model_module": "@jupyter-widgets/controls", 
"model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_0d1bbacf381a40dd948533cee2216241", "IPY_MODEL_3b791e1aa5434c1596c5139dcf5545a2", "IPY_MODEL_807d3db5d30242b9a408235800ca02e3" ], "layout": "IPY_MODEL_08ce5b8f220942b2a8063012698d52dd" } }, "0d1bbacf381a40dd948533cee2216241": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_292d58ed6c744052ba36be0140be66fb", "placeholder": "​", "style": "IPY_MODEL_f175883288d34ffb88e406b65380171c", "value": "tokenizer.json: 100%" } }, "3b791e1aa5434c1596c5139dcf5545a2": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_eaf9aca102784e278e71dabf913d2ffc", "max": 2480617, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_12ec634e6f1e404aa91e5d1ce53f6760", "value": 2480617 } }, "807d3db5d30242b9a408235800ca02e3": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", 
"model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_0634f9677fc84a47b234d2a7896401e5", "placeholder": "​", "style": "IPY_MODEL_9b1ee122c584485da03994d5bd08809e", "value": " 2.48M/2.48M [00:00<00:00, 7.52MB/s]" } }, "08ce5b8f220942b2a8063012698d52dd": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "292d58ed6c744052ba36be0140be66fb": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", 
"_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "f175883288d34ffb88e406b65380171c": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "eaf9aca102784e278e71dabf913d2ffc": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, 
"grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "12ec634e6f1e404aa91e5d1ce53f6760": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "0634f9677fc84a47b234d2a7896401e5": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, 
"9b1ee122c584485da03994d5bd08809e": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "be529ee0f81046569146099d95aa8d0a": { "model_module": "@jupyter-widgets/controls", "model_name": "HBoxModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_0d1e9270814e4662bfd148f1d8a83b23", "IPY_MODEL_2505cd6ab53d489aad37806bffa915e2", "IPY_MODEL_ee6c8fa1f0c54dfdae9fba8669f91ccc" ], "layout": "IPY_MODEL_2d3ed864e688457db9759977ed0d063e" } }, "0d1e9270814e4662bfd148f1d8a83b23": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_fb3ac3ef0b4049e8a546a3f3f78ed4de", "placeholder": "​", "style": "IPY_MODEL_1a05f388b2444ec1b7e2979a9963880a", "value": "model.bin: 100%" } }, "2505cd6ab53d489aad37806bffa915e2": { "model_module": "@jupyter-widgets/controls", "model_name": "FloatProgressModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "FloatProgressModel", 
"_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ProgressView", "bar_style": "success", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_c81450bd7f1649b7acd0bb6e793986a8", "max": 3087284237, "min": 0, "orientation": "horizontal", "style": "IPY_MODEL_79e9dc04394646a6979774eff376ff12", "value": 3087284237 } }, "ee6c8fa1f0c54dfdae9fba8669f91ccc": { "model_module": "@jupyter-widgets/controls", "model_name": "HTMLModel", "model_module_version": "1.5.0", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_45d128ef6a424a4fa2c0676b45c30029", "placeholder": "​", "style": "IPY_MODEL_58f16a7120744474a8cccbadee903536", "value": " 3.09G/3.09G [00:15<00:00, 256MB/s]" } }, "2d3ed864e688457db9759977ed0d063e": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, 
"object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "fb3ac3ef0b4049e8a546a3f3f78ed4de": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "1a05f388b2444ec1b7e2979a9963880a": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "c81450bd7f1649b7acd0bb6e793986a8": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", 
"_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "79e9dc04394646a6979774eff376ff12": { "model_module": "@jupyter-widgets/controls", "model_name": "ProgressStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ProgressStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "bar_color": null, "description_width": "" } }, "45d128ef6a424a4fa2c0676b45c30029": { "model_module": "@jupyter-widgets/base", "model_name": "LayoutModel", "model_module_version": "1.2.0", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": 
null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "58f16a7120744474a8cccbadee903536": { "model_module": "@jupyter-widgets/controls", "model_name": "DescriptionStyleModel", "model_module_version": "1.5.0", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } } } } }, "nbformat": 4, "nbformat_minor": 0 }