{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "7gOI3IlanPQU" }, "source": [ "**This notebook requires farm-haystack 0.9.0 and torch 1.8.1 and is not compatible with the current Colab instance.**" ] }, { "cell_type": "markdown", "metadata": { "id": "0yd2u-tZv357" }, "source": [ "Running this notebook on Colab may require the Pro version." ] }, { "cell_type": "markdown", "metadata": { "id": "MzeRO7j5z41N" }, "source": [ "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "i3nPMmDHz41U", "outputId": "97a6ebf5-fd8d-4978-f400-1f3d638c3178" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cloning into 'nlp-with-transformers'...\n", "remote: Enumerating objects: 563, done.\u001b[K\n", "remote: Counting objects: 100% (297/297), done.\u001b[K\n", "remote: Compressing objects: 100% (190/190), done.\u001b[K\n", "remote: Total 563 (delta 179), reused 184 (delta 107), pack-reused 266\u001b[K\n", "Receiving objects: 100% (563/563), 48.83 MiB | 10.77 MiB/s, done.\n", "Resolving deltas: 100% (278/278), done.\n", "/content/nlp-with-transformers\n", "⏳ Installing base requirements ...\n", "✅ Base requirements installed!\n", "Using transformers v4.6.1\n", "Using datasets v1.11.0\n", "Using haystack\n" ] } ], "source": [ "# 코랩이나 캐글을 사용한다면 이 셀의 주석을 제거하고 실행하세요.\n", "!git clone https://github.com/rickiepark/nlp-with-transformers.git\n", "%cd nlp-with-transformers\n", "from install import *\n", "install_requirements(chapter=7)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "8FvoJMWGz41Z", "outputId": "6b552593-0487-4474-f569-628196bf87ca" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "env: TOKENIZERS_PARALLELISM=false\n" ] } ], "source": [ "%env TOKENIZERS_PARALLELISM=false" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "lFMHGCzwz41a" }, "outputs": [], "source": [ "# haystack의 로깅을 끕니다.\n", "import logging\n", "for module in [\"farm.utils\", \"farm.infer\", \"haystack.reader.farm.FARMReader\",\n", " \"farm.modeling.prediction_head\", \"elasticsearch\", \"haystack.eval\",\n", " \"haystack.document_store.base\", \"haystack.retriever.base\",\n", " \"farm.data_handler.dataset\"]:\n", " module_logger = logging.getLogger(module)\n", " module_logger.setLevel(logging.ERROR)" ] }, { 
"cell_type": "markdown", "metadata": { "id": "2bsecm0Rz41b" }, "source": [ "# 질문 답변" ] }, { "cell_type": "markdown", "metadata": { "id": "4e238IyQz41b" }, "source": [ "\"Marie" ] }, { "cell_type": "markdown", "metadata": { "id": "I2DxvBM0z41d" }, "source": [ "## 리뷰 기반 QA 시스템 구축하기" ] }, { "cell_type": "markdown", "metadata": { "id": "1-rCKMXJz41e" }, "source": [ "### 데이터셋" ] }, { "cell_type": "markdown", "metadata": { "id": "QhStaRsGz41f" }, "source": [ "\"Phone" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 99, "referenced_widgets": [ "dd6eefee0b604dc4ad26235d6650f819", "88e6e4bb9bad4c778884761f73abe971", "b527fdc36a964d3091a759631ad4c076", "747e599ed26444568eb963ae9931b639", "f19540eca4bb4b1b98d4c714ca9575be", "f6823486a3a84c50b59f1e76d8017f99", "441787b1af6145ae83d72f058a160a2b", "18abb84c6a4145f0b1bf4fa2dae96b5f", "1381dfaa69d943deab27ca802b39910b", "3969b4ea396e495691294eeee8c1a8ae", "8fdb8370e9ec4307883a1b72c5eb694f", "e3453b02063b4e4c943f685731456e71", "b185a07760f84dcbba6eb01fb92c8b23", "2b2f3f2f7be04309903bc5e211011801", "7a6e5f51e7874b4f8b18afc1392315ef", "e241b3f75d4f4af2989f946643babcf6", "3cddbdc3dd13443b85a41fe1ccf9f76f", "327a6d88210b492fa281630eb7b2bf08", "8a621c01ed7e4072a6c9733792df4669", "2696479e59804459b9501732cb11f42a", "22e46c837abd4f91bdad7aa2077d1842", "a758c96d91dd4adbb308bf9a342f0c8b" ] }, "id": "tHcoy_Rtz41f", "outputId": "2543c957-c065-42d1-d4e6-29eb675a9bf1" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "dd6eefee0b604dc4ad26235d6650f819", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading: 0%| | 0.00/2.65k [00:00\n", "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>title</th>\n",
"      <th>question</th>\n",
"      <th>answers.text</th>\n",
"      <th>answers.answer_start</th>\n",
"      <th>context</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>791</th>\n",
"      <td>B005DKZTMG</td>\n",
"      <td>Does the keyboard lightweight?</td>\n",
"      <td>[this keyboard is compact]</td>\n",
"      <td>[215]</td>\n",
"      <td>I really like this keyboard. I give it 4 stars because it doesn't have a CA...</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1159</th>\n",
"      <td>B00AAIPT76</td>\n",
"      <td>How is the battery?</td>\n",
"      <td>[]</td>\n",
"      <td>[]</td>\n",
"      <td>I bought this after the first spare gopro battery I bought wouldn't hold a c...</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>\n",
"
\n", " \n", " " ], "text/plain": [ " title question answers.text \\\n", "791 B005DKZTMG Does the keyboard lightweight? [this keyboard is compact] \n", "1159 B00AAIPT76 How is the battery? [] \n", "\n", " answers.answer_start \\\n", "791 [215] \n", "1159 [] \n", "\n", " context \n", "791 I really like this keyboard. I give it 4 stars because it doesn't have a CA... \n", "1159 I bought this after the first spare gopro battery I bought wouldn't hold a c... " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "qa_cols = [\"title\", \"question\", \"answers.text\",\n", " \"answers.answer_start\", \"context\"]\n", "sample_df = dfs[\"train\"][qa_cols].sample(2, random_state=7)\n", "sample_df" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 36 }, "id": "z2QRigbUz41i", "outputId": "26f56f7b-23e7-4bac-d86b-ddb3eefb4cc3" }, "outputs": [ { "data": { "application/vnd.google.colaboratory.intrinsic+json": { "type": "string" }, "text/plain": [ "'this keyboard is compact'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "start_idx = sample_df[\"answers.answer_start\"].iloc[0][0]\n", "end_idx = start_idx + len(sample_df[\"answers.text\"].iloc[0][0])\n", "sample_df[\"context\"].iloc[0][start_idx:end_idx]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 378 }, "id": "KJmPL7Jiz41j", "outputId": "f194f387-4a64-470f-8a4d-d6cc62bc090d" }, "outputs": [ { "data": { "application/pdf": 
"JVBERi0xLjQKJazcIKu6CjEgMCBvYmoKPDwgL1BhZ2VzIDIgMCBSIC9UeXBlIC9DYXRhbG9nID4+CmVuZG9iago4IDAgb2JqCjw8IC9FeHRHU3RhdGUgNCAwIFIgL0ZvbnQgMyAwIFIgL1BhdHRlcm4gNSAwIFIKL1Byb2NTZXQgWyAvUERGIC9UZXh0IC9JbWFnZUIgL0ltYWdlQyAvSW1hZ2VJIF0gL1NoYWRpbmcgNiAwIFIKL1hPYmplY3QgNyAwIFIgPj4KZW5kb2JqCjExIDAgb2JqCjw8IC9Bbm5vdHMgMTAgMCBSIC9Db250ZW50cyA5IDAgUgovTWVkaWFCb3ggWyAwIDAgMzk4LjgzMTQwNDUzMyAyNjcuMTA1NjI1IF0gL1BhcmVudCAyIDAgUiAvUmVzb3VyY2VzIDggMCBSCi9UeXBlIC9QYWdlID4+CmVuZG9iago5IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMTIgMCBSID4+CnN0cmVhbQp4nL1Xy3IbNxDcM74CR+uQIWbwPlp2rMQ3xazywZWDS6Fou1ZKJDlR6e/TWO4uQWcfOZFVZBFNLLo5L8xs3u7++Xqz++3qUr/5oDbH1c2TYv1NbV6z3j9po7/h/axZX+nTTQb4nbI5UbLsjMeyrZcSIrHxQTxw8+OybP6i1D249lhcgWCvvCW2EY96StiH01OkXCFthThLEgo0PjUiOPhWPegfjrPWUdLCkZzTjzv9Ud+D2WBlsU28y1jEwzcj+hG61MKvI68VCgKxPpFxkvOAtBXiAglXWod1p/Ran02rzzj2TgWmKMLcA20FREs2V0qH9bmVpkC5k+rIJxu865G2QtgY8rESOwLnVsvsiAPkRiZvLRs3QG0NSaRYh+wInF2vY7IOepFOJkSOA9JWiHdItVrtAJxdbUjkS4plR4ZDQOr0UFtDSYjrJBuBs+vNniLksssUWEQGqK0hMZmkTrUROJfeI7MIpZJtFsGaswvODlh7gtlArk64ERglb17L4d7Yo6Tj7oDg567Am6kCfxRAtqvv6hI3wrN6wKfRPxmcAYNZjpywK8OEN3fqcqs37+Bv0dvb7irZ/qE+6VeNudC/6+179fNWXauOuIqOkbKC5kkROd77kKyNmdd4uTFT3Fy8lX10ciSvsXl2FkvOuoTLwydZo5cZ+oiIiy4FW9FX2AI9KixCJouHb9wavZ2mFw4UPPoBd6SvsXl6MZ44W+MC/LTqdDdD7+NQdY/0FbZAj7vZREgNcFRYo/cz9OiDXJ/mI32FLdCXa86zpAhHxTX6ME1vbakkCLZQ9VIVNk9vcSMl2N4kOCqt0ccZ+mSIIxwY61buiC3Qo6xHk71kOGqNPU2zj31ZhrkLdSm5FTBpdcZPpcSU/PQQt0D78UIHS8mIQXnAC9Dn5mleRwjkc62jByZTL6PudP1tEDJ2RUjzpXmZ580WuV7z9sAUbyTRCfHB64y75vECOUqHP98bYDcvg7E3+1rHgEwawFPuHMGoFaUELcl52/wJOQumZ5fw5YS7RyaDYDQ+Y5TwcZV8gRiF1p4E34BMRz1aL1NGI2QIGsJl4l+X/rEYIXcSbQMyaW1H4cCby1i17vzPzfcFbhgwnETcgMyEei47UOxXou4XuPm5phX9/jCYds3F6Vg632KM46H68J/B8m5qsMS+/zGR1rv6R+dOM53uQ2PEXVu0P7EIGjvpWx2BCctzlVkcucownWXeXWg0Z4GdH0tRSU4Mc4ZtKq+skJzNQ/M3Pu+bm+al0bDmLT6vO+yp+d58BXIPZHuB9IM7gsPtW9ofHPfS/NWnmBqsf63+BZifOMUKZW5kc3RyZWFtCmVuZG9iagoxMiAwIG9iago5OTAKZW5kb2JqCjEwIDAgb2JqClsgXQplbmRvYmoKMTcgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAyMCAwIFI
gL0xlbmd0aDEgMTE3NTYgPj4Kc3RyZWFtCnic1XoLeFTVtf/aZ51zZs6Z92Qm78fkMYTwSpwY3soYAXmJAQICFptAEt4kEBAx0gCWBIQ2oUAQREgVEANiQCoJRAqaighULdBbb7VWwVcbkXqp2pDs3LXPJDxs+/+333fv9//+58zaZ+999mO99m+vPTPAAMBNiQy++4YOGw4DYDQA60W1sfflPDAhvq3b11QeASDx+yZMzN48cu94akxl+Pj+e3JHaEMyfwygiPbfPDAhPTD3u5eeAFBTqDxpxvz8Ehzkoqz6OI1RMuORxT6YHTcAwNRCZV5UMnP+wjsfmQOgURn2zcwvLQET3aA/RWXrzHnLilyX069TeRdAdMqswvwC87TXKgBS3qT3fWdRhW2naROAn4qQMmv+4kePHHPT3H4x//Z5xTPyZ+8r6UflE1TOmZ//aIm8SV0I0M1JZd+C/PmFqV/e1UTlDOLnQklx6eKdK+8NB+gu0/sVJYsKSwaZ/kLZ7kLGWSB0ZYXQJdGNkEx1KdCN8uKdDn1gMEhDh4/JBfu8/MULIBLESNDRAXAjJ1qyuYWLFoBZ5AySaQTxNIMkBURLKUdaBTbwwr99dbxPNNTIVRvpXKIjIg3VhN79b11idv4iJ3sJovJJkePn+Yv0PE1vTt7WermRrjHS8yLXEep1JDTS/yxfHe//z475T2Y6T/OcvzHj+f/1ORn5iQ3s4AAXuCEMwiECoiEG4iHJ8E871TDDv5A8SzH6qGKV0WUGzXjqnWNZyL/FeBFGj5t9RHuNWon3Yi7R0wHOzhk9AJ3+vwP2g8fw/8fyF+VPh+r8RfMXQPX0RfmzoXpG/oJSSmcVLqJ02aJ5UD2zsJjyMxcVzoXqWfkLqM2swulUMzd/QT5Uz8sv9omU1tGP5+cvngXVC+aKmuKZ+fOhetGSBdRycdGCmZTOEuP/k7Vm6Gje7Jn5t603GULrTaxf8UQiJ2mqG+UiIIogRaI0hVKhC0Zpl4x/hM8hBwZbwfSBGJrWKch5hA8XQ1OJ/K1XV5nN6sxP/BdMShqWHvkXbD811FY8b+RvGeO2+qm3lBfdzEv9Aa5XUX7wja7dScqQR4T0x2Q7qxaeo2QqW6kYH3rif0CR5KYRLCqiWZYk+UaPziunaFgBBMEHy1QP97BtpvnsUh6wjg87OhtgJ8VCyAf3U4kZZRkEevnolo1nfxgKw2EkjINceBDyYQYUwWxYAkthmWFxH2QYLUbc0qIQZsGCUIuOSx0XOk53vNlxquNEx/GOpo6XOw51HOyo79jfsa+j7nau/+EVwv3mzpLDmDFEgn/aP4jDkN8N7SQb0fBOEvoXe+bIzr7jOklsjrmdRJokvkMURpTfSWJ1zegksR8UEhUR0T5F8gFpAWifAZIUSB+0URIt7aQkomWdlMyyoAHO0H0S6mA720MlMc5CqqmVDsFq6t0Ar7MzbK3Um+r2wFU4Ty0r4QzWycBGQSbVArynSHCN5cJhGmMA87ABJpWW01j5sDxebpA/k89BP7lUPifnyaUsE59VJil7iAbgr8hXTkMCNLAPoRSO4heYiU3yUNkOH+I5rINPaBahvzNQBbugjHjxsGIol8qk8VRzSjkH2+gupvfn2A52nrg7yp6Ai/AUytII2MEuklxn4Bt4AnOlcjJLplRE/J+isc5R/21QSkB2kenApZ5UR9zTXNONNA57KxeN+yqU08y5sEttUD2mZJpFaGwPe521qBuhFs7jD3Ah/p6tlpPlvRQFVYU0gHlQRWNvE33UIraMZBd3mRhdWirnsTr4Qs4zTaexfyUkojkPS+NJoiJoIlqqOkmmQWw1riVOxds4OGcaJadTfxrBtJykBijGLJhDuTI4AIegN9ZAFY1kyKv2U76hntvlj0jmKvYT6Rs4h0MhDYrkK6Rr4UI1AEdMqiKjxKCXz1kv+UcW1AfHTfa9OSWxd6/vFX1Ok68ecupty3wNHR05k+UYZUq9EluPfnO97E/
+6J+9/Kh3r9E5k3317cOGdo46LG8o1U2YTFlRomqqHzbUeCcmrVf89BmZV++bMcv3pPPJ5IFPOgsH9hZrTBLREK1tpFxlxyW5iqxjIRxODoaptW6otW5wr4/UYh3xGOuNiXS2tVxruQOcl6+1OK9ksCTJ5XRnBtwup5QaAJcTkpNEKq3b/swz9HnmmetM499ev86/ZZqSw8/xs0TnWCbdd7LMWl7KK3glL2U/YcvYY+wnAtc+ouU9lVBeh2DQm421slSrrDRBrWZOUGMREpjFeWF0vSN3ciM1Dvaf0tLcRgyltwSutVxoySAdTElihx3okKVp/RJdSpY/05XoTeRsFN/KCt9io9p21cmlIxpGtF4UeET2kkeRxLGwI5gaFR2DkbEuRQaXosjZzp+7NtlqPRtkWrfg1CWmx0Y4UY1zto2u9+aOrg/PfWh0vSf3IeIEO070n9J8oeXECZd7QCc31wxuTE7lS5PyJauPdboiBhBvwcBEeZIyyfSY/JjySExllIlWdZQcTeaNXQyPqEuiS2MWx66CiqhV0atiVsXuhb0xrmkwzU9CZPWFfnezrDu7JSeppqy7WWZA9npUkwoEJSfbxpAaM/Pvf77ih+cffezC5M+ZZ9hDUfxaXV3dUrZh4PwtI5fWZN979o7A56/9YHdJHP8zSb+d7F1K0neHkmAf8IbpFVpChS+s1mur1TaqsbW+jckb1PXe59LCY8MAPVGx3XzOWPQkaGqaUEJ4bpf8miE/KeBaC0lJGnC2XL52ucX56RWncZNWMlhQK4jPT8j3FSTKMI3FM69HTkzqlpoVT4L0Jal6sqxQ5jbxcMiG5/g7/POHT83JfXP+8VONuw+8snnHc09NOL6o9PSUT5n1p+hPaK7+4Gu///U7AjVVP968Z2lJaVlKt8M+37uHHt8nPLyArLyLfEqi3WJlMI7Z0AaItmxAi6lWYbhSY1YdYlWzbLU73x9dbyHBbIZgViHYhcHNLQGXsOvlC4NbAiSLYVj5NBn3tDBpDwv0oI1nCm0SS+FJMIWzntCN9cS+bCx7wPqAbRIrYkvYY7ia2ciUGkvETFemN9mV7ErMQpVLjGfxixdPtz+s+Nsu4bm2zL28luW9ThbaQRYqIM7j4OFgshxtclU446JrTZ5a51qbVAsrbetNu+IjYpmOsaA71XhnG7vVLk7BfudqcYrVQiZyNl8RC1isYDIPbw5ZJ4z8yyV0Dl4P3GYWYY0PMKq9ttfkXq0shV/gXz38+qypJ+a++NZbL477ea5ysY7/zOHgV/70F/5Xn+/MHRmvbN/+Soo4rVUR9zUGnqTA5GBKmAq2CivUhqu1seG7nbXWtUkbYtf7rUlabFR8WCwmJsT4CWDIiS4bEHO57fJN9wl6aE9l56RzeE4+o5xRSe5D8dI0No0lqV5PeIhX5u3DkpMk7BIk2SfgKDEQLu1as3PnGiKmjXl6zJvnHYMOzf2IKfzqx7ydX2E5LGbM0zjo6LM/P3bs588elZY1pHTjX/OvHpzGv/rzp/xPBkBNZ7vjBULtJW+aRTZRYUYwUnFJKKFLJrxQyB6oIJMZqCZn29lml1gJ6bfgABG5izDQ5Fcp1A/SYHQQJxu5+vWfEnRPlpiK0coAZYQyE+uhXjWRt5BhWDJL3Isn2j8+z3h7pnJxUutKpaeIiNaRftcZ+k2GdLg36I8k7aaqtfG9a90b4tenPpcRaU3pEetNiXVohN4E4Y7EmAxnW3PLteYWQ7Fda9UoDaBFeosy/X0Ia1IyA+ECZIzlmpyUknVn37CuBuQZ0rrq3burq/fs5rtXbYCOP3zIN6z82XP822+/5d/uGrHhiVUbN656YoP0q22VlduerqjcNsl3aMXL77zz8opDvqQ3qt77/PP3qt5g+YtXrVpMRB5zmqLfetKtmWK2PkEvbNZWss1Os+TUQYmyBSBWk93GHkScG9oVMHsoL4yRH7gyQ9b2JxrPNMY2XmNZLIF/xM/wbLaTHWI1fBbP4fl
K+vWlLJL1Yb1YxB6+ha/gP+I1ZAyaXU6m2TVIC7rVzbK0GVbKL5oVZqLlJes0ceBCc2hS2v0OOSw0rbFwBCWfxqntJVJOe/1bYkmMqGvvB6ExlUE0pgV6BT3mzdKLMqzUVRpQ6a8xGtUqRqUPDTv4MmUyDuXYxLC0bbkyBULQwG9Jf3jrrfYkGrd9u1TQ2lM61T6gc2y2kcZG6HMEXpLEcLIAZqdY72DsjoJbgl5bhhJUcpQ8pUrZqahidBqVxmvtKfCxlH+nvGf4URyMD/ag3dzBrDarndls1mxHvFXd7IbNkRQI2OJtMQ4ralExmRjldRLctLQJUwhwdDYbbjSgCx5D/m7seUZQkGkECGEsOdXOOtelSCVMfZ+3M3z/fcZ4xxCmX/sgMdnJf8UrackNYgPZo+8qY3gD/4R/yhvYCBbNYtiI1rf5B19JEtvN8sWy5A/xHbyN/5TU0dFKNvyCdGKCUUG7aliQBcl8QcXsvECIYqzIQAatQl2sQrOxCs1g7lqFYaAlgJM5pQSTUwtqJdpOTZuGhoFdiar8VfuVM+1XyMCtF8UaZOIMYtpD81lZWXCkEq0qmq7J0bqG0bpFl6KZZLHoqstkNikEEmazSXKhZKXWdCaxZuuKhCrCSxaz1aJr5lDgaDGBzXnhbIRwb9puAhG3KNTcScrNZyg7JekQ2Mlvgl+rkqpI4mSpu/XuSoru0++W7lbu1DP0MdL9SrYe1KdIc6S5ykw9Ty+TyqXHlXJlhV4jbVbiTKBJZgRZVcSXEcwkk15MGmiyrlvBHo1e2WuOsjrtPjlR8ak+k8+crKXofovP7rMPlgZilpypZJj7agMsQ6wZ9uEwnI2SgvIwcr1sNduUbQ6ag9pQ/X5r0B60T5YmmadYc+xF0kzMl6creWqeKc9coBXoBZal8Agrkx7FpfJiZZm6zLTUXGJ+1FpuLbdXSJW4Rl6rrNaetFTZt8g77S/ZHxIomakx8WHJGkseepbcZsAlkZzjazn50mtcuXjdLV8R1NpTcbZeJa8vI/TsTacHHfzQRNFfgiVCs8MLEWqj3eWrSDga25jc4FofYYUIjLRpZksCmj3DupHLn71AVgmBO/l8G3n9GwaWuoTvBxdkxGXEZyRk+DISM5KGpAbjgvHBhKAvmBhMyonLic9JyPHlJOYk5aSWpK6Oq4yvTKj0VSauTqpOrU29mhrf1bWrU1eHvPi8hDxfXmJJfElCia8kcUX8ioQVvhWJkbfugHexfq7kLFpeSd0IpzMTb42lwqXjH+5fWby1saFhSNOa/WfarzPp+S15r+QWHp/6X1elzKKy6aXvHU4b076yrij/5LOvnnCXr+vTpy41tU14+cKOS/gZ6SoKhgRjoIKtke0VtjV6o0tujCAlRZvcNhjhGRbtbLsc6Dof8GtXnH+9khG0OGKcMStiqmNqYxRi1tiiOxnu5xXMdu7R+NnYZ3JefuONl3OeGXv/7mnt/LesN1MnPitn7e/Z89K5c5d69qxLSWF3Mztzs4HJZEHiSi4nrjwQQxFsCniZVmFeo3hfYEqjlR2LbHQ3WNfHxngls9cMoyW3Y1issXk0G1G6YDIUYFwLRRhpQ+JK4mrj3om7GqcMgSFsiDTEOyRG6WVKN6drvfRiKGbFUrG3OEabtlAIkmiESIYMvk6EMxnCmeTytkPWc0fmnJo+4525/Bo/xdLaPmamBmn3mm2NdunhqcdP3XnngR69WH+mszB2L/+gecvhAzuEptMJkL9TPRAGU4KxipNZzS+orBK22NUmXQqjcEFTzDaHZYxHAL0ugN4igH50vd3Ii4BvcHPb4OZmt+GgYktxXqGYlfb2V4LeHG+tF4l1YjKOhUAtOStTOIv0Xf2M+1k6f7exvv7Aq6pna86sGVVt6fhu1dhj+4ivo4SrBQZfdwajUAO0M7XS7mqwbtGZZIaxYoUM94jNTBwQCbgoxHC5CbsO5XmN3TnZZXijlzKZRqQZLhc0PP745v2NjdkvLzn
5hrSr/QfSjp07ju9qr1Q97TsKC74Se9NJUsoymhcJ0XsSoh+XD0KTpDCzDMPNzrbBxs55mfbNoEXgdY6WR5itGBtosgiyTzbQJeddr1U9X0CXHOQzYdD3djma/rEcl7vkOJznfdsrfV8S7/9FEjnvgBAk5KtLaF5xth5FZ+tGNzRaG8TZ2u0Yh27vsO+drYPJQ6LKoEwtN5Wby7VyvdxSZi23ldvLHeXOcleZuzbqapTr9uj3tiN46ab9+zZv3L9/41Xm5leu/oV/xVz44WenT3/2+ZunvtjO3+Qt/EtyzAHkfx7Wnzg8yifJu4hDJ52G7w7GRDeC3dOomBvs69mr2BTnclvui5DBLA2PE9oJhHi9LPRDR4qMoJYXvyK+Nv6P8TKb5r+hGrHDExyxkMYMTllpY+PAg2VnoaPjbNlBqf/zP/vZ84L2th9Q9bqCfN7Ev6O7KZ/9+cxnn50hEn6wkLibaujP4A4brQ6tMdLb4Fgf82pUU1ykVTVH3wdu97A4Q4+BwE3u3vged+yWAJYJAJKMxRsekUU+w9KJl717iR+pdODBx890QMeZxw8ObGiQ0ju5aT9QkM+GMjPdQ/ML6v72TcircBRx5wI6qKi0g7ssWGlv0JpMukqb6HC3CLoNTyX0uXBWwM3hnLCdYcKfQnh405kicFTCyF7bnyctHV0d1icWD7tdZ463HyJXKpqhKDRbMaHxKZotFT4LDrZZJbtlQkK8WZNM+oSEhPhs3RKfIHsJpdfKngrv2kiB0n5C6e7xuiUhxgTjY8x2k9mTNKy74OpCy2Vix4imQ7D9VwHb7q6ow/4lnehNRipijFQRY8yP1WMtsdY+BIq9LL2sg7RB+iDLIKvFBz6WInXXu1t6hKV70r09wrvHd09I86UlpqRW6BWWCmuFTXz/yiRJ1VULWtGGdnSgE6MwGmMwVo7TUtPThqT9MK08bUVadVpt2tW0SNreF97cNRKMbw7U5FuPqOlMHFX6ku5w3di9U9eunb5pSPPub3839fV5RW/kr1pfuC+476k//rrosDzkQPfuubnBkYn2HlvXbn8lOfl4VtaUcaNz/I6Uzat27DfOdxS2S18rOwgh+gej7IrZgS+AizWZK3UL6ZhWgNNtFwhhwGyg8xQS+oqA0O6lENoJbPWEDxJI2y1LYKyLLWVlfPXo0ldfvfhsZaWyg79W1V67duy2nb+R8qrY3cLHDxBGTDawyQODgrE30Wm9zpo8DVbCJo9lLKHUcK9w8gEhj7ocuAFRxd4TAqLCCGFDPi6OEKGAgB0QEPViQ8O9B5ecfJO9zY5Ke9rzd+48vksqu167v2jGVdwrpJ8EoGbJeRSh/pni2YmheHYixbMTRTw78V+KZ3/5D+JZsVmNrneJ78zcIgkTicX4lsI4qVBAbg197SLOL0a12Mv+7TCYBTsUKVwKV5L0LH2kNFIZThHvQ9JDykQ9R18gLVCK9GUU9S6jqLdS2io9pWzSm6Qm5dfSKXxbiVMkDVXZouhmi0YPq1eKwnA5Wokxx2gei9fqBz9LllIxUfYrSWqSyW9OpQg40ZJsHYB95b7mASLulUbgcDkoZytBNWgKmodSzDvUImLeSTCJTZJy5HHKeHW8Kcc8QcvVJ1pmQAErlOZgoTxHmaPOMS3Q8i0zrcX2JbCELZOW46PycuUxtVx9zFRuetS8TCvXyvRHLMutldIahWJg2MI2SRtxu/y08pT6lGmrOZheY91p3wN72C5pF+6T9ykvqC+Y9pl3WV+y/0I6iK/Kx5QG7Zf2Zul1PCu/pSwz4ucYJj4s2cKSJzV8+sl7n37SwH//3l++fk/Oa6vBOYKu12JN2xzykbsIi8vJRxxsXfBek1nSXODQXRYdwGF3OcBhc1ltIB52GzmN1UUuk22zaE6wKJX4qt3S5LTbrLpGnmJ2yA6Ls8s7zIbdO+MZz414prnZFWGgU0Cc+123nTC/5wl04AwIH7iqgmJWNbSF6xE2py3ZlmUbqT+gj7V
N1abqc/RK2wrbRpubzkiaSla22C2OCOaVnLJTidA9Fo812h7tSIUUQjOf7FPSzN01v55iSbGm2nrYezh8rn6QxbKkDDlD6a/3tfS19rcNsA9wZLjugSALSkEMysFO62drw/T7bCPtIx1BVy6MY+OkiZgj5yiT1ImmSeYHtQfJAyZap9inOHJcRaxImqXPts925LnKzI/aH3WshSe11ZbV1rW2tfa1jq3aZstm6zb7Nscuyy7rPvs+R73rbdeHrg5XIdlQsbPQV6hDmHEWkjaO3fT4xnljcjMT+aDX2cPs4ddnvfnYthEVufLYtk04rzPaVOooukqBEcGwbkZwaU2MtMWbXdZEp2eMX6BcQOCcc7CIKZvvgKBLs7lecEvRlRC5RU1wN1kc6YM/DQT44CsBCjTpZH1rcHkzwDSg0CReCKRW6rqiTW4yAs4D9TNSu7G/3RZ5dkWfW7t3nzVDRKGS4FceRfx66SQyPuiKGQ4R5nCHRzabMVxXx0Tf5JcPpvgk6DYTeDsr7ZHHww/at2jQpDDB7RVufK8YoKCxg44l1XQ8cRqHk79nmThmFEbKo0Kcvvhyo+D8b42NIqbs4vHIS4JpdugLCPGothGPadAbpgYj04dH9DT3cMZ4zdE9NEhQzSnxWlK3MX1ustocEGmbwXBETELyCykuivh7H+9x0Albwk0pTVFxiemDB18OBMQKcLYE6BPSc6c++/Xtd4PZLq3fEtorxL8I74WCH4xNHfsEKft+6ZAQqFP7SPKQ4kN6nhgblirE6RKvywzSF6HfnaUpW6X3frzoh47Bf4UEs/FD7bs/sl/uen7727Yx9ima+D+G+cbvutTPNJ/HAdj5t79tHWef8ne/BafK54zfSUGqI1pHfhkBlUQfEdUQbScqINpBVEW0l2idmgan5W1wWpXhtPQDeo6BUvlSR6vpCxgqfwZlchEslFtgofQupCtuOCoNgJPiKX8CCxUPHCVaiKOo3BOKMRn60bsD6jqYJDfBXcoT1EeHdFOTsOiNKxuWwXdsHjstdZc2Sr/De3AF1uHvsE2eLm9Tlqo2dbL6msluGmVaZDpg6jCPNT9jvqA9oZ3QJYvH8jfrOOtzFAMJ6VMxF3rCLLCSzzhhq9CW7JXC6Sl+nzXBVPErvayR8jKM379FnkE4lUJ5CcxseGceb6mXb8krEMnGduZV8LAiuBeKoYRkWASzYSbNvhh80B1mkLf6IAAZdGdSbjq18JGss+l9KdEiKIR8mA+9qHYkLKD2fSh3D8yj2wfjb4xVapQK6VlIfR6htIBa6v/CrH1vzJpLMz1Cc4lfYRdQa8FHPvX592YcSrk51G8SLKEWM6htvjFaodEj35DIR6MsoLSE2kyncWdTOx/1L6bZ84133x9ngjFKKXFUTPdcqhWzllLbYmOkAM2dCVm39erqI4UcqONHxn9C/v5KNdaJ+L+biP+8xj+MIgnnxL+MUklTPUn7d1J0OrDzHxgjYTSMgXEk/QTS2USa60GYTGP/gElwgqFpyYLZw7MzMjqfmQ3SimDHdY6tHvybH78L4Lc1+I0d/8rxGsf/8uPXdvxLDV7141dP3qN8xfFKDX5Zgy2t+OdW/BPHLwbi59n4GcdPA/jJ5QnKJzV4mRpenoCXPk5XLrXix+n4Ecc/cvwwgH/w4Ac1+D7H37vxP5fje8fwdxx/S81/uxwvXrhPubgcL9yH538To5zn+JsYfJfjOxzf5vhrjudq8OyZeOUsxzPx+FYAT3N8Y7VLeSMWfxWOzRxf5/gax5McT3D8JcfjHF/l2MTxGMejLmys8CuNHBuOUIzD8cgr05Qjx/DICvmVX/iVV6YFO/CVoPwLPx7m+HINHuJ4kGM9x5c4HijAF+24f59f2V+A++rcyj4/1rnxBWL6hVbcy/F5jns47nbjLo7PPWtXngvgs3b8eQHWUpPaGtzJccczVorr8Rkrbn86StlegE9vcypPR+E2J27V8SmOW2psyhaONTbcTJ021+C
mjXZlU3fcaMefteKG6mPKBo7VVdOU6mNYvUKu+qlfqZqGVUH5p378Ccf16/oo6zmu64NPkphP3oNr11iUtR5cQ4dNqqgswArSVIUfV7vwxxyfWOVSnuC4yoUrOa7gWM4x2PGj5cuVH3FcvhwfL8CyXK9S5sfHOC7j+Kgdl1rxER2XcFzciqWtuKgVF7ZiCcdijgs4zkvEuRznuLKVORNwNsdZy3EmFYo4FnIs4DiD43SO+QMxrxUftuI0jg9xnMpxymRdmdKKk3V8MDxKeTCAkzhOpJknZmOuFycwpzIhEsd7cNyoMGUcxxwLPsBx7P1OZSzH+504huNoejOa46iRTmVUGI6MsykjnTjChvdxHF6Dw2pwKMd7pd7Kva2YfQzvGY1BjkM43n2XW7nbg3cNdih3uXHwIJsyONjhwEE2HMhxAMf+/TxK/1bs19ep9PNg3yyL0teJWRa8Mx4zbRi4w6IEON5hwYx0i5Jhw3QL9umtKX2c2FvDXgHs2cOv9CzAHmlupYcf09zYPdWvdL8HU/3YzW9RujnQb8EUjskckxyYSHImutFXgAmtGE8ixBdgnA1jSYOxHGNaMTobo6gQxTGyACNIUxEcw6lTeBR6OXo4hnF0UwM3RxfJ6spG53J0FKCdo80artg4Wqm1NRwtHHUnahzN1MzM0eRBtQBleimTB3iRapHTwcypSL2RORE4sgZWsPonrOf/Dxf8v2bg/3jF/Te5mlpkCmVuZHN0cmVhbQplbmRvYmoKMjAgMCBvYmoKODAyNwplbmRvYmoKMTYgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCA4NCA+PgpzdHJlYW0KeJylizkOhFAMxSwBwwDDvu/b/e/I0y8oEGgKItlFnMDLsf50G4cPLl88fAJ+D3ehiERMcimpcWac33wWohQVtdzQnqWjZ2BkYmZh1WYT+wF0pwJ0CmVuZHN0cmVhbQplbmRvYmoKMTkgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAzNjQgPj4Kc3RyZWFtCnicXZK7boMwFIZ3nsJjOkQEEuxEQkhVujD0otJOqAPBhwipGMuQgbev7d8hUpGST+c/dzjxuXwpVT+z+MOMbUUz63olDU3jzbTELnTtVZSkTPbtHCz/3w6NjmKbXC3TTEOpujHKcxZ/Wuc0m4VtnuV4oaeIMRa/G0mmV1e2+T5XkKqb1r80kJrZLioKJqmz5V4b/dYMxGKfvC2l9ffzsrVpj4ivRRNLvZ1gpHaUNOmmJdOoK0X5zj4Fyzv7FBEp+c+fZEi7dGt86uKBGvxx8h7y/gj5biZACuyBA5ABHBD3fF/ugBiHGoSMYIcahHyEfAryanpvhiEcahAyWmShRRZaZBjEoQa9zFGEhyI8FOFYyaEGIWM3HuZ8mPBiPh7GfZjwkjdFFrwweQcV71VgGoH3KjCEONwTfR2BRURYRIRFxAlyaC5cV/vh71/Y3YA72PXA2psx9rb8VfujcufUK1oPX4/aZbnfHxL20GIKZW5kc3RyZWFtCmVuZG9iagoxNCAwIG9iago8PCAvQmFzZUZvbnQgL0JNUVFEVitEZWphVnVTYW5zCi9DSURTeXN0ZW1JbmZvIDw8IC9PcmRlcmluZyAoSWRlbnRpdHkpIC9SZWdpc3RyeSAoQWRvYmUpIC9TdXBwbGVtZW50IDAgPj4KL0NJRFRvR0lETWFwIDE2IDAgUiAvRm9udERlc2NyaXB0b3IgMTMgMCBSIC9TdWJ0eXBlIC9DSURGb250VHlwZTIKL1R5cGUgL0ZvbnQgL1cgMTggMCBSID4+CmVuZG9iagoxNSAwIG9iago8PCAvQmFzZUZvbnQgL0JNUVFEVitEZWphVnVTYW5zIC9EZXNjZW5kYW50Rm9udHMgWyAxNCAwIFIgXQovRW5jb2RpbmcgL0lkZW50aXR5LUggL1N1YnR5cGUgL1R
5cGUwIC9Ub1VuaWNvZGUgMTkgMCBSIC9UeXBlIC9Gb250ID4+CmVuZG9iagoxMyAwIG9iago8PCAvQXNjZW50IDkyOSAvQ2FwSGVpZ2h0IDAgL0Rlc2NlbnQgLTIzNiAvRmxhZ3MgMzIKL0ZvbnRCQm94IFsgLTEwMjEgLTQ2MyAxNzk0IDEyMzMgXSAvRm9udEZpbGUyIDE3IDAgUgovRm9udE5hbWUgL0JNUVFEVitEZWphVnVTYW5zIC9JdGFsaWNBbmdsZSAwIC9NYXhXaWR0aCA5ODkgL1N0ZW1WIDAKL1R5cGUgL0ZvbnREZXNjcmlwdG9yIC9YSGVpZ2h0IDAgPj4KZW5kb2JqCjE4IDAgb2JqClsgMzIgWyAzMTggXSA0OCBbIDYzNiA2MzYgNjM2IDYzNiA2MzYgNjM2IDYzNiA2MzYgNjM2IF0gNjggWyA3NzAgXSA3MApbIDU3NSBdIDcyIFsgNzUyIDI5NSBdIDgxIFsgNzg3IF0gODQgWyA2MTEgXSA4NyBbIDk4OSBdIDk3IFsgNjEzIF0gOTkKWyA1NTAgXSAxMDEgWyA2MTUgMzUyIF0gMTA0IFsgNjM0IDI3OCBdIDExMApbIDYzNCA2MTIgNjM1IDYzNSA0MTEgNTIxIDM5MiA2MzQgXSAxMTkgWyA4MTggXSAxMjEgWyA1OTIgXSBdCmVuZG9iagozIDAgb2JqCjw8IC9GMSAxNSAwIFIgPj4KZW5kb2JqCjQgMCBvYmoKPDwgL0ExIDw8IC9DQSAwIC9UeXBlIC9FeHRHU3RhdGUgL2NhIDEgPj4KL0EyIDw8IC9DQSAxIC9UeXBlIC9FeHRHU3RhdGUgL2NhIDEgPj4gPj4KZW5kb2JqCjUgMCBvYmoKPDwgPj4KZW5kb2JqCjYgMCBvYmoKPDwgPj4KZW5kb2JqCjcgMCBvYmoKPDwgPj4KZW5kb2JqCjIgMCBvYmoKPDwgL0NvdW50IDEgL0tpZHMgWyAxMSAwIFIgXSAvVHlwZSAvUGFnZXMgPj4KZW5kb2JqCjIxIDAgb2JqCjw8IC9DcmVhdGlvbkRhdGUgKEQ6MjAyMzAzMDYwNjQyNTJaKQovQ3JlYXRvciAoTWF0cGxvdGxpYiB2My41LjMsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcpCi9Qcm9kdWNlciAoTWF0cGxvdGxpYiBwZGYgYmFja2VuZCB2My41LjMpID4+CmVuZG9iagp4cmVmCjAgMjIKMDAwMDAwMDAwMCA2NTUzNSBmIAowMDAwMDAwMDE2IDAwMDAwIG4gCjAwMDAwMTEyMjQgMDAwMDAgbiAKMDAwMDAxMTAzMCAwMDAwMCBuIAowMDAwMDExMDYyIDAwMDAwIG4gCjAwMDAwMTExNjEgMDAwMDAgbiAKMDAwMDAxMTE4MiAwMDAwMCBuIAowMDAwMDExMjAzIDAwMDAwIG4gCjAwMDAwMDAwNjUgMDAwMDAgbiAKMDAwMDAwMDM0NyAwMDAwMCBuIAowMDAwMDAxNDMyIDAwMDAwIG4gCjAwMDAwMDAyMDggMDAwMDAgbiAKMDAwMDAwMTQxMiAwMDAwMCBuIAowMDAwMDEwNTQ0IDAwMDAwIG4gCjAwMDAwMTAxODQgMDAwMDAgbiAKMDAwMDAxMDM5NyAwMDAwMCBuIAowMDAwMDA5NTkxIDAwMDAwIG4gCjAwMDAwMDE0NTIgMDAwMDAgbiAKMDAwMDAxMDc2OCAwMDAwMCBuIAowMDAwMDA5NzQ3IDAwMDAwIG4gCjAwMDAwMDk1NzAgMDAwMDAgbiAKMDAwMDAxMTI4NCAwMDAwMCBuIAp0cmFpbGVyCjw8IC9JbmZvIDIxIDAgUiAvUm9vdCAxIDAgUiAvU2l6ZSAyMiA+PgpzdGFydHhyZWYKMTE0MzUKJSVFT0YK\n", "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " 
\n", " 2023-03-06T06:42:51.548948\n", " image/svg+xml\n", " Matplotlib v3.5.3, https://matplotlib.org/\n", "\n" ], "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "counts = {}\n", "question_types = [\"What\", \"How\", \"Is\", \"Does\", \"Do\", \"Was\", \"Where\", \"Why\"]\n", "\n", "for q in question_types:\n", " counts[q] = dfs[\"train\"][\"question\"].str.startswith(q).value_counts()[True]\n", "\n", "pd.Series(counts).sort_values().plot.barh()\n", "plt.title(\"Frequency of Question Types\")\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "MkcPl64xz41k", "outputId": "4a0fdde0-21ef-4773-e1c6-2e7a82856f76" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "How is the camera?\n", "How do you like the control?\n", "How fast is the charger?\n", "What is direction?\n", "What is the quality of the construction of the bag?\n", "What is your impression of the product?\n", "Is this how zoom works?\n", "Is sound clear?\n", "Is it a wireless keyboard?\n" ] } ], "source": [ "for question_type in [\"How\", \"What\", \"Is\"]:\n", " for question in (\n", " dfs[\"train\"][dfs[\"train\"].question.str.startswith(question_type)]\n", " .sample(n=3, random_state=42)['question']):\n", " print(question)" ] }, { "cell_type": "markdown", "metadata": { "id": "dLTCyg6Uz41l" }, "source": [ "### 사이드바: 스탠포드 질문 답변 데이터셋" ] }, { "cell_type": "markdown", "metadata": { "id": "IbKi9yYyz41l" }, "source": [ "\"SQuAD" ] }, { "cell_type": "markdown", "metadata": { "id": "V56fOUQRz41l" }, "source": [ "### 사이드바 끝" ] }, { "cell_type": "markdown", "metadata": { "id": "WWtK_6STz41m" }, "source": [ "### 텍스트에서 답 추출하기" ] }, { "cell_type": "markdown", "metadata": { "id": "JKaueCUUz41m" }, "source": [ "#### 범위 분류" ] }, { "cell_type": "markdown", "metadata": { "id": "w7FgPqwtz41m" }, "source": [ "\"QA" ] }, { "cell_type": "markdown", "metadata": { "id": "VGWPDO8Dz41n" }, "source": [ "\"SQuAD" ] }, { "cell_type": "markdown", "metadata": { "id": "4Wj8cnXJz41n" }, 
"source": [ "#### QA를 위한 텍스트 토큰화" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 145, "referenced_widgets": [ "a5aa206de3b243ec9ed2a6e914a448fa", "b8536ec5c895460ebfa32a103fd80591", "6d5046a2cb704b3ca5758c10a86288bf", "5307c50f07cd421db587496c39e4d175", "6d9e00830be4431bb2bbeeff660dd6eb", "b5afa237918a4f7e918730dd78f23724", "1d8743ebb1664e818055d55f846725fa", "00c99c2a8fd94b9a8d992fdcf11f6e61", "50939b3e6f914edc94c7568b5cc47e22", "58aa92721d354129a8d31ae34f2cf855", "a86b2402ffbc46fd9d92325cf1cfe3aa", "06e3d8ab6e374e5ba9f4c4f0f0424b04", "7d0fed82d1db4d558c67c75c8378b412", "fb64a118ff924db68c44208bb1e1f7bb", "7f635a3e72cf4298b1ff4dbdfbb634c3", "5c5590848528421ab7c081610f8dfc58", "cb8a67217c974ad9a04690da613cdda7", "066df9e253f042e2ae242fa1d5a03596", "9628c68b4f2041a8ae8e56a59930359f", "5940bb21f0cf496f9ddb7b40c3b6e353", "64224a316b1348a1803cd793e4f27cb7", "4c326dcae0ca4ebc8b19008ceae6a93c", "f709228b2006482da1050f30604bbb11", "6d606e63dae34cc9820a345f969b4ff4", "daf33cfaaa604cdaa44d60690068555f", "1fa630961cdb4f83a96b08e3db2e1ce3", "5fbd5d0edc6d41678a5c3c155cde5279", "fc1aa6d56f004f6fb2265225c8517f65", "3079624f39d3437489e46aa5e7783301", "847cb3ec626e4c18b3fd93af01736368", "5bb6c9761c034d50b967c22edbddb736", "268e2d65ee0d4675b364e514cc6039a1", "f2b73e98882e477daff3a56798012368", "e20d8ade0bd347d39f0eaedffbd10ba0", "018a5325476c45b28613895fd0df9d0d", "ac1d74729df34a1fac1f530c3c6b39cb", "f4d64cf610cd4c3fb62124908a8c791d", "b5893539ee084860ae1cdda6debd8dd6", "0edd8fc74cb34fcd9a92480b87bd8244", "687a73a209a942a5b1bba2e0490f7c1d", "6cdab1e4b4204a8286d4c210b76144b1", "d997fc3e56a3461790bee7b7a589d95d", "b63d8d35559c47099546d9ebfce22783", "58531126fa0f48f090de938365ee83bd" ] }, "id": "9qGdwgLQz41o", "outputId": "254517bb-3b5d-4e18-98cc-5c32befe555c" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a5aa206de3b243ec9ed2a6e914a448fa", "version_major": 2, 
"version_minor": 0 }, "text/plain": [ "Downloading: 0%| | 0.00/477 [00:00\n", "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th><th>8</th><th>9</th><th>...</th><th>18</th><th>19</th><th>20</th><th>21</th><th>22</th><th>23</th><th>24</th><th>25</th><th>26</th><th>27</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>input_ids</th><td>101</td><td>2129</td><td>2172</td><td>2189</td><td>2064</td><td>2023</td><td>2907</td><td>1029</td><td>102</td><td>2019</td><td>...</td><td>2061</td><td>2055</td><td>25961</td><td>2847</td><td>5834</td><td>2006</td><td>5371</td><td>2946</td><td>1012</td><td>102</td></tr>\n",
"    <tr><th>token_type_ids</th><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>...</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
"    <tr><th>attention_mask</th><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>...</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>3 rows × 28 columns</p>\n",
"</div>\n",
"
\n", " \n", " " ], "text/plain": [ " 0 1 2 3 4 5 6 7 8 9 ... \\\n", "input_ids 101 2129 2172 2189 2064 2023 2907 1029 102 2019 ... \n", "token_type_ids 0 0 0 0 0 0 0 0 0 1 ... \n", "attention_mask 1 1 1 1 1 1 1 1 1 1 ... \n", "\n", " 18 19 20 21 22 23 24 25 26 27 \n", "input_ids 2061 2055 25961 2847 5834 2006 5371 2946 1012 102 \n", "token_type_ids 1 1 1 1 1 1 1 1 1 1 \n", "attention_mask 1 1 1 1 1 1 1 1 1 1 \n", "\n", "[3 rows x 28 columns]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "input_df = pd.DataFrame.from_dict(tokenizer(question, context), orient=\"index\")\n", "input_df" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "nLLHfM4kz41q", "outputId": "dab12499-28e4-4283-960f-aefca104ff36" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[CLS] how much music can this hold? [SEP] an mp3 is about 1 mb / minute, so\n", "about 6000 hours depending on file size. 
[SEP]\n" ] } ], "source": [ "print(tokenizer.decode(inputs[\"input_ids\"][0]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 227, "referenced_widgets": [ "f4b32bc8d7ab4a5cb06299a244563cd2", "4b564b527b714d16a8fc7ecdd72b9a07", "643d8588ebe4468da769ee6ce999c5a0", "46630d148ba84cdaa060d412cb5fb404", "0904d3a7594541d2bc562905bc406dbf", "c046141e86194b0f812d62e3d198b7ce", "fa29e2ade5d84c659d26e373dff96206", "c6ca86b04665411f8f3b884450c90f48", "7c4676d723874fb99b8fd208e7979f9a", "af89f79780214eb4bb9bb8c14cf77e41", "7ccb044924da45678d2980d0c5ac0131" ] }, "id": "M3ItsK1fz41q", "outputId": "b790e1c9-1630-4aeb-faf2-ca9abc74420f" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f4b32bc8d7ab4a5cb06299a244563cd2", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading: 0%| | 0.00/133M [00:00\n", "[Figure: bar charts of the start and end logits for each token; the highest-scoring token is shown in orange]" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Predicted logits for the start and end tokens; the highest-scoring token is shown in orange.\n", "# This plot is adapted from https://mccormickml.com/2020/03/10/question-answering-with-a-fine-tuned-BERT\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "s_scores = start_logits.detach().numpy().flatten()\n", "e_scores = end_logits.detach().numpy().flatten()\n", "tokens = tokenizer.convert_ids_to_tokens(inputs[\"input_ids\"][0])\n", "\n", "fig, (ax1, ax2) = plt.subplots(nrows=2, sharex=True)\n", "colors = [\"C0\" if s != np.max(s_scores) else \"C1\" for s in s_scores]\n", "ax1.bar(x=tokens, height=s_scores, color=colors)\n", "ax1.set_ylabel(\"Start Scores\")\n", "colors = [\"C0\" if s != np.max(e_scores) else \"C1\" for s in e_scores]\n", "ax2.bar(x=tokens, height=e_scores, color=colors)\n", "ax2.set_ylabel(\"End Scores\")\n", "plt.xticks(rotation=\"vertical\")\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "P46m0SViz41s", "outputId": "20498b4e-671c-4982-8fea-306012ea5c5d" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Question: How much music can this hold?\n", "Answer: 6000 hours\n" ] } ], "source": [ "import torch\n", "\n", "start_idx = torch.argmax(start_logits)\n", "end_idx = torch.argmax(end_logits) + 1\n", "answer_span = inputs[\"input_ids\"][0][start_idx:end_idx]\n", "answer = tokenizer.decode(answer_span)\n", "print(f\"Question: {question}\")\n", "print(f\"Answer: {answer}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "_F0ajZ6Rz41s", "outputId": "ac31650e-aa26-4592-b0d2-bc310244e76b" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/usr/local/lib/python3.8/dist-packages/transformers/pipelines/question_answering.py:316: UserWarning: Creating a tensor from a list of numpy.ndarrays 
is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:230.)\n", " fw_args = {k: torch.tensor(v, device=self.device) for (k, v) in fw_args.items()}\n" ] }, { "data": { "text/plain": [ "[{'score': 0.26516157388687134,\n", " 'start': 38,\n", " 'end': 48,\n", " 'answer': '6000 hours'},\n", " {'score': 0.220829576253891,\n", " 'start': 16,\n", " 'end': 48,\n", " 'answer': '1 MB/minute, so about 6000 hours'},\n", " {'score': 0.102535180747509, 'start': 16, 'end': 27, 'answer': '1 MB/minute'}]" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from transformers import pipeline\n", "\n", "pipe = pipeline(\"question-answering\", model=model, tokenizer=tokenizer)\n", "pipe(question=question, context=context, topk=3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "tQ_wM4Mtz41s", "outputId": "21cc31d0-576c-4bc8-d303-e72793a2f1d5" }, "outputs": [ { "data": { "text/plain": [ "{'score': 0.9068412780761719, 'start': 0, 'end': 0, 'answer': ''}" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pipe(question=\"Why is there no data?\", context=context,\n", "     handle_impossible_answer=True)" ] }, { "cell_type": "markdown", "metadata": { "id": "8lxFOqUFz41t" }, "source": [ "#### Dealing with Long Passages" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 376 }, "id": "D-fOa8kFz41t", "outputId": "f8e5a94e-9e79-4580-89b4-1e93b548779d" }, "outputs": [ { "data": { "text/plain": [ "[Figure: histogram of the number of tokens in each question-context pair of the SubjQA training set, with a dashed line at the 512-token maximum sequence length]" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Distribution of token counts for the question-context pairs in the SubjQA training set\n", "def compute_input_length(row):\n", "    inputs = tokenizer(row[\"question\"], row[\"context\"])\n", "    return len(inputs[\"input_ids\"])\n", "\n", "dfs[\"train\"][\"n_tokens\"] = dfs[\"train\"].apply(compute_input_length, axis=1)\n", "\n", "fig, ax = plt.subplots()\n", "dfs[\"train\"][\"n_tokens\"].hist(bins=100, grid=False, ec=\"C0\", ax=ax)\n", "plt.xlabel(\"Number of tokens in question-context pair\")\n", "ax.axvline(x=512, ymin=0, ymax=1, linestyle=\"--\", color=\"C1\",\n", "           label=\"Maximum sequence length\")\n", "plt.legend()\n", "plt.ylabel(\"Count\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "yh1MaHrdz41t" }, "source": [ "[Figure: sliding window]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "8emrQ8U0z41u" }, "outputs": [], "source": [ "example = dfs[\"train\"].iloc[0][[\"question\", \"context\"]]\n", "tokenized_example = tokenizer(example[\"question\"], example[\"context\"],\n", "                              return_overflowing_tokens=True, max_length=100,\n", "                              stride=25)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "IcGeOmO6z41u", "outputId": "783c9a4a-84a1-42e6-cdb6-fdcdf4f39924" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Window #0 has 100 tokens\n", "Window #1 has 88 tokens\n" ] } ], "source": [ "for idx, window in enumerate(tokenized_example[\"input_ids\"]):\n", "    print(f\"Window #{idx} has {len(window)} tokens\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "zOFXgi7Bz41v", "outputId": "dd702de4-f9e5-4a2e-f763-c3d5b4c45aeb" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[CLS] how is the bass? [SEP] i have had koss headphones in the past, pro 4aa and\n", "qz - 99. 
the koss portapro is portable and has great bass response. the work\n", "great with my android phone and can be \" rolled up \" to be carried in my\n", "motorcycle jacket or computer bag without getting crunched. they are very light\n", "and don't feel heavy or bear down on your ears even after listening to music\n", "with them on all day. the sound is [SEP]\n", "\n", "[CLS] how is the bass? [SEP] and don't feel heavy or bear down on your ears even\n", "after listening to music with them on all day. the sound is night and day better\n", "than any ear - bud could be and are almost as good as the pro 4aa. they are \"\n", "open air \" headphones so you cannot match the bass to the sealed types, but it\n", "comes close. for $ 32, you cannot go wrong. [SEP]\n", "\n" ] } ], "source": [ "for window in tokenized_example[\"input_ids\"]:\n", "    print(f\"{tokenizer.decode(window)} \\n\")" ] }, { "cell_type": "markdown", "metadata": { "id": "M50OhvNGz41v" }, "source": [ "### Building a QA Pipeline with Haystack" ] }, { "cell_type": "markdown", "metadata": { "id": "2IWPWQDaz41v" }, "source": [ "\"QA" ] }, { "cell_type": "markdown", "metadata": { "id": "PH0ytdBNz41v" }, "source": [ "#### Initializing a Document Store" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "INnrsUNLz41v" }, "outputs": [], "source": [ "url = \"\"\"https://artifacts.elastic.co/downloads/elasticsearch/\\\n", "elasticsearch-7.9.3-linux-x86_64.tar.gz\"\"\"\n", "!wget -nc -q {url}\n", "!tar -xzf elasticsearch-7.9.3-linux-x86_64.tar.gz" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "T1zkGdtPz41w" }, "outputs": [], "source": [ "import os\n", "from subprocess import Popen, PIPE, STDOUT\n", "\n", "# Run Elasticsearch as a background process\n", "!chown -R daemon:daemon elasticsearch-7.9.3\n", "es_server = Popen(args=['elasticsearch-7.9.3/bin/elasticsearch'],\n", "                  stdout=PIPE, stderr=STDOUT, preexec_fn=lambda: os.setuid(1))\n", "# Wait until Elasticsearch has started\n", "!sleep 30" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "wL-1x19Nz41w", "outputId": "937d5965-c4f2-4c21-8059-d2121fb102e3" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "WARNING:haystack.utils:Tried to start Elasticsearch through Docker but this failed. It is likely that there is already an existing Elasticsearch instance running. \n" ] } ], "source": [ "# Alternatively, if Docker is installed\n", "from haystack.utils import launch_es\n", "\n", "launch_es()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "wl5rLHzOz41w", "outputId": "db87b213-f253-4f39-f470-ed040cd4d7dd" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", "  \"name\" : \"5e41be2d74e7\",\n", "  \"cluster_name\" : \"elasticsearch\",\n", "  \"cluster_uuid\" : \"OG6mo2jGQWmDRmm-VOOAHA\",\n", "  \"version\" : {\n", "    \"number\" : \"7.9.3\",\n", "    \"build_flavor\" : \"default\",\n", "    \"build_type\" : \"tar\",\n", "    \"build_hash\" : \"c4138e51121ef06a6404866cddc601906fe5c868\",\n", "    \"build_date\" : \"2020-10-16T10:36:16.141335Z\",\n", "    \"build_snapshot\" : false,\n", "    \"lucene_version\" : \"8.6.2\",\n", "    \"minimum_wire_compatibility_version\" : \"6.8.0\",\n", "    \"minimum_index_compatibility_version\" : \"6.0.0-beta1\"\n", "  },\n", "  \"tagline\" : \"You Know, for Search\"\n", "}\n" ] } ], "source": [ "!curl -X GET \"localhost:9200/?pretty\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "AzpVTzauz41x" }, "outputs": [], "source": [ "from haystack.document_store.elasticsearch import ElasticsearchDocumentStore\n", "\n", "# Return the document embedding for later use with a dense retriever\n", "document_store = ElasticsearchDocumentStore(return_embedding=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ity-bUF7z41x" }, "outputs": [], "source": [ "# It's a good idea to flush the Elasticsearch store each time the notebook restarts\n", "if len(document_store.get_all_documents()) or 
len(document_store.get_all_labels()) > 0:\n", "    document_store.delete_documents(\"document\")\n", "    document_store.delete_documents(\"label\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "qbyi1qD-z41x", "outputId": "9a19509e-1ef6-483f-9a41-e47f5e341f44" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loaded 1615 documents\n" ] } ], "source": [ "for split, df in dfs.items():\n", "    # Exclude duplicate reviews\n", "    docs = [{\"text\": row[\"context\"],\n", "             \"meta\":{\"item_id\": row[\"title\"], \"question_id\": row[\"id\"],\n", "                     \"split\": split}}\n", "        for _,row in df.drop_duplicates(subset=\"context\").iterrows()]\n", "    document_store.write_documents(docs, index=\"document\")\n", "\n", "print(f\"Loaded {document_store.get_document_count()} documents\")" ] }, { "cell_type": "markdown", "metadata": { "id": "BRVKygWTz41y" }, "source": [ "#### Initializing a Retriever" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "G-BmiSOGz41y" }, "outputs": [], "source": [ "from haystack.retriever.sparse import ElasticsearchRetriever\n", "\n", "es_retriever = ElasticsearchRetriever(document_store=document_store)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "fdnSMdOIz41y" }, "outputs": [], "source": [ "item_id = \"B0074BW614\"\n", "query = \"Is it good for reading?\"\n", "retrieved_docs = es_retriever.retrieve(\n", "    query=query, top_k=3, filters={\"item_id\":[item_id], \"split\":[\"train\"]})" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "HrUYAVVGz410", "outputId": "36bc13d7-384f-4a37-edd9-1e7a1cde0b9c" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'text': 'This is a gift to myself. I have been a kindle user for 4 years and\n", "this is my third one. I never thought I would want a fire for I mainly use it\n", 
"for book reading. I decided to try the fire for when I travel I take my laptop,\n", "my phone and my iPod classic. I love my iPod but watching movies on the plane\n", "with it can be challenging because it is so small. Laptops battery life is not\n", "as good as the Kindle. So the Fire combines for me what I needed all three to\n", "do. So far so good.', 'score': 6.243799, 'probability': 0.6857824513476455,\n", "'question': None, 'meta': {'item_id': 'B0074BW614', 'question_id':\n", "'868e311275e26dbafe5af70774a300f3', 'split': 'train'}, 'embedding': None, 'id':\n", "'252e83e25d52df7311d597dc89eef9f6'}\n" ] } ], "source": [ "print(retrieved_docs[0])" ] }, { "cell_type": "markdown", "metadata": { "id": "-Lvpd41rz411" }, "source": [ "#### Initializing a Reader" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ddfnri_bz411" }, "outputs": [], "source": [ "from haystack.reader.farm import FARMReader\n", "\n", "model_ckpt = \"deepset/minilm-uncased-squad2\"\n", "max_seq_length, doc_stride = 384, 128\n", "reader = FARMReader(model_name_or_path=model_ckpt, progress_bar=False,\n", "                    max_seq_len=max_seq_length, doc_stride=doc_stride,\n", "                    return_no_answer=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "qXMSYzsXz411", "outputId": "4172f189-1b62-4824-e26b-41b24c7b3ffa" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'query': 'How much music can this hold?', 'no_ans_gap': 12.64809501171112,\n", "'answers': [{'answer': '6000 hours', 'score': 10.699626922607422, 'probability':\n", "0.3988155424594879, 'context': 'An MP3 is about 1 MB/minute, so about 6000 hours\n", "depending on file size.', 'offset_start': 38, 'offset_end': 48,\n", "'offset_start_in_doc': 38, 'offset_end_in_doc': 48, 'document_id':\n", "'e344757014e804eff50faa3ecf1c9c75'}]}\n" ] } ], "source": [ "print(reader.predict_on_texts(question=question, texts=[context], top_k=1))" ] }, { "cell_type": "markdown", 
"metadata": { "id": "P91oPhTqz411" }, "source": [ "#### Putting It All Together" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "UB2q1x6Jz412" }, "outputs": [], "source": [ "from haystack.pipeline import ExtractiveQAPipeline\n", "\n", "pipe = ExtractiveQAPipeline(reader, es_retriever)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "yVLnBu9Rz41_", "outputId": "5cd2bffa-2f43-4ba7-f295-f0cfc06881cd" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Question: Is it good for reading?\n", "\n", "Answer 1: I mainly use it for book reading\n", "Review snippet: ... is my third one. I never thought I would want a fire for I\n", "mainly use it for book reading. I decided to try the fire for when I travel I\n", "take my la...\n", "\n", "\n", "\n", "Answer 2: the larger screen compared to the Kindle makes for easier reading\n", "Review snippet: ...ght enough that I can hold it to read, but the larger screen\n", "compared to the Kindle makes for easier reading. I love the color, something I\n", "never thou...\n", "\n", "\n", "\n", "Answer 3: it is great for reading books when no light is available\n", "Review snippet: ...ecoming addicted to hers! Our son LOVES it and it is great for\n", 
"reading books when no light is available. Amazing sound but I suggest good\n", "headphones t...\n", "\n", "\n", "\n" ] } ], "source": [ "n_answers = 3\n", "preds = pipe.run(query=query, top_k_retriever=3, top_k_reader=n_answers,\n", "                 filters={\"item_id\": [item_id], \"split\":[\"train\"]})\n", "\n", "print(f\"Question: {preds['query']} \\n\")\n", "for idx in range(n_answers):\n", "    print(f\"Answer {idx+1}: {preds['answers'][idx]['answer']}\")\n", "    print(f\"Review snippet: ...{preds['answers'][idx]['context']}...\")\n", "    print(\"\\n\\n\")" ] }, { "cell_type": "markdown", "metadata": { "id": "XtWniguTz42A" }, "source": [ "## Improving Our QA Pipeline" ] }, { "cell_type": "markdown", "metadata": { "id": "lVGUcNKgz42A" }, "source": [ "### Evaluating the Retriever" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "gYCgemwjz42A" }, "outputs": [], "source": [ "from haystack.pipeline import Pipeline\n", "from haystack.eval import EvalDocuments\n", "\n", "class EvalRetrieverPipeline:\n", "    def __init__(self, retriever):\n", "        self.retriever = retriever\n", "        self.eval_retriever = EvalDocuments()\n", "        pipe = Pipeline()\n", "        pipe.add_node(component=self.retriever, name=\"ESRetriever\",\n", "                      inputs=[\"Query\"])\n", "        pipe.add_node(component=self.eval_retriever, name=\"EvalRetriever\",\n", "                      inputs=[\"ESRetriever\"])\n", "        self.pipeline = pipe\n", "\n", "\n", "pipe = EvalRetrieverPipeline(es_retriever)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "amX_CSHdz42A" }, "outputs": [], "source": [ "from haystack import Label\n", "\n", "labels = []\n", "for i, row in dfs[\"test\"].iterrows():\n", "    # Metadata used for filtering in the retriever\n", "    meta = {\"item_id\": row[\"title\"], \"question_id\": row[\"id\"]}\n", "    # Add labels for questions that have answers\n", "    if len(row[\"answers.text\"]):\n", "        for answer in row[\"answers.text\"]:\n", "            label = Label(\n", "                question=row[\"question\"], answer=answer, id=i, origin=row[\"id\"],\n", "                meta=meta, is_correct_answer=True, is_correct_document=True,\n", "                no_answer=False)\n", 
"            labels.append(label)\n", "    # Add labels for questions that have no answer\n", "    else:\n", "        label = Label(\n", "            question=row[\"question\"], answer=\"\", id=i, origin=row[\"id\"],\n", "            meta=meta, is_correct_answer=True, is_correct_document=True,\n", "            no_answer=True)\n", "        labels.append(label)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "FrhGuFugz42B", "outputId": "9ff252fa-948f-4e96-e846-7fa26ee0784c" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'id': '336690e2-4398-4ba8-9744-b6363b373427', 'created_at': None, 'updated_at':\n", "None, 'question': 'What is the tonal balance of these headphones?', 'answer': 'I\n", "have been a headphone fanatic for thirty years', 'is_correct_answer': True,\n", "'is_correct_document': True, 'origin': 'd0781d13200014aa25860e44da9d5ea7',\n", "'document_id': None, 'offset_start_in_doc': None, 'no_answer': False,\n", "'model_id': None, 'meta': {'item_id': 'B00001WRSJ', 'question_id':\n", "'d0781d13200014aa25860e44da9d5ea7'}}\n" ] } ], "source": [ "print(labels[0])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "C8eWcYU5z42B", "outputId": "64d47f55-8bd8-47f6-9b43-6c647b8d1aba" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loaded 358 question-answer pairs\n" ] } ], "source": [ "document_store.write_labels(labels, index=\"label\")\n", "print(f\"\"\"Loaded {document_store.get_label_count(index=\"label\")} \\\n", "question-answer pairs\"\"\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "8_ezogm9z42B", "outputId": "adbe2e45-8cea-4e6d-c100-c089fa06f3b6" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "330\n" ] } ], "source": [ "labels_agg = document_store.get_all_labels_aggregated(\n", "    index=\"label\",\n", "    open_domain=True,\n", "    aggregate_by_meta=[\"item_id\"]\n", 
")\n", "print(len(labels_agg))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "-IBkcanfz42C" }, "outputs": [], "source": [ "def run_pipeline(pipeline, top_k_retriever=10, top_k_reader=4):\n", "    for l in labels_agg:\n", "        _ = pipeline.pipeline.run(\n", "            query=l.question,\n", "            top_k_retriever=top_k_retriever,\n", "            top_k_reader=top_k_reader,\n", "            top_k_eval_documents=top_k_retriever,\n", "            labels=l,\n", "            filters={\"item_id\": [l.meta[\"item_id\"]], \"split\": [\"test\"]})" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Q7SeC8XWz42C", "outputId": "6878d3b0-71ed-47ee-fc00-a5ebdc5d5ad0" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Recall@3: 0.95\n" ] } ], "source": [ "run_pipeline(pipe, top_k_retriever=3)\n", "print(f\"Recall@3: {pipe.eval_retriever.recall:.2f}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "UPZC7be_z42C" }, "outputs": [], "source": [ "def evaluate_retriever(retriever, topk_values = [1,3,5,10,20]):\n", "    topk_results = {}\n", "\n", "    for topk in topk_values:\n", "        # Create the pipeline\n", "        p = EvalRetrieverPipeline(retriever)\n", "        # Loop over each question-answer pair in the test set\n", "        run_pipeline(p, top_k_retriever=topk)\n", "        # Store the recall\n", "        topk_results[topk] = {\"recall\": p.eval_retriever.recall}\n", "\n", "    return pd.DataFrame.from_dict(topk_results, orient=\"index\")\n", "\n", "\n", "es_topk_df = evaluate_retriever(es_retriever)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 376 }, "id": "eGkM8RXLz42C", "outputId": "a0b809c7-47f1-495c-ffac-263d617d7d08" }, "outputs": [ { "data": { "application/pdf": 
"JVBERi0xLjQKJazcIKu6CjEgMCBvYmoKPDwgL1BhZ2VzIDIgMCBSIC9UeXBlIC9DYXRhbG9nID4+CmVuZG9iago4IDAgb2JqCjw8IC9FeHRHU3RhdGUgNCAwIFIgL0ZvbnQgMyAwIFIgL1BhdHRlcm4gNSAwIFIKL1Byb2NTZXQgWyAvUERGIC9UZXh0IC9JbWFnZUIgL0ltYWdlQyAvSW1hZ2VJIF0gL1NoYWRpbmcgNiAwIFIKL1hPYmplY3QgNyAwIFIgPj4KZW5kb2JqCjExIDAgb2JqCjw8IC9Bbm5vdHMgMTAgMCBSIC9Db250ZW50cyA5IDAgUiAvTWVkaWFCb3ggWyAwIDAgMzk4LjU1OTM3NSAyNjYuMDkgXQovUGFyZW50IDIgMCBSIC9SZXNvdXJjZXMgOCAwIFIgL1R5cGUgL1BhZ2UgPj4KZW5kb2JqCjkgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAxMiAwIFIgPj4Kc3RyZWFtCnicjVbLbtswENSZX8FjcwjN5fJ5jPsIEKCHJAZ6KHooXMdt4qRNAzS/36GkSKQfsm04scbkzOxytavZh9W/X8vVzeVcvr8Vs/Fq+SJI3ovZBcn1i9TyHp9XSfJS1os08EfBKSrnEgeHy015abxXOgHT5de86KcQT9BY4+ISxGshnH/bZUlZ1/KS4hLalJBxUcVMOG4cILDfwb3p3K8hgAhURAxZDogIpEIIzoVBbUQ4qJTFxBwOX8Uz/mp5rsHi8y/BWMsUDeQUGyeXj2K+kLNPJMnIxV0b4OKH+CrfNXQmv8nFlfi4ENeiNSBIs4rkIttBuYAOS6ekUnLOUGRNJ0jzHml2KjpLWDtIj9BhaWKjNHNIia05RdvtahtyKtkUtR+0C+iwttFRsU6EVeakjDd6V5yDV4QgE4+1NUKHxdmjkrV2PhLeJ6ibWr2Mw2hlo4kEgqTCFMnDbgBjifuEdHCMFjE4Vnob3ZtDoyJkvVXBEkdrUpiMQjeqSfvyONogHEsEFVc2RnTCBmmrTHApu0A1HfVhJn1YlEdoA6qMjPCUE5vvO23hQ3t/3ImddBJRRroNqXIywlNOIiu2TmcfED/qxE85MSYoa9uQSicFPOHE5P8hOfggN1mmnZNY13tmOs+c5Nt7hcjCVlf2E1yLM2xIKAsduX0B+938ac6bh0Y2N2fS2vrHVbNsvjebZlPJy+0ZwmyRHEM5cvl3Jb/IJ2nklSTl2lmAexKtHJXibMJF6L5pI262J10xHhxGELNnX7duCkEhTFAgv0VfNUzonkhnznvR84xNSmvQ5Ju2aEcVLm7FtXxuPethhtXOdufmnoEInp1p+rh3mmLlSZO4WtfvPsg4u+BuFF/lWY/PaxtKP/mZtPLoGpk99ILRoCmnEUKKoh1stRBHN5iKytRAQvrlUhQQnj9M6IhGLKDY88aBGh3ft8ig3yPL0maPgUvjsCqM0QNNj/TsBdJ5ANeIdVY3FdSFM3L3AW/naZkfdOZvDzptYaNIjhW33ClurHpTwqGl9gAw7CoIBjEotyAcrW5Lsz/T/uGqHKHWv3l2mD+TXWDefMYErR4ersV/GO4dpAplbmRzdHJlYW0KZW5kb2JqCjEyIDAgb2JqCjgxNgplbmRvYmoKMTAgMCBvYmoKWyBdCmVuZG9iagoxNyAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDIwIDAgUiAvTGVuZ3RoMSA4MzE2ID4+CnN0cmVhbQp4nNVZe1xVVfZf+6yz7wsunHu5FwQul8vjgoYPAvGV6Y3UTM1IyREbGq4Cki9QfKRk4As0NcQHlqndfFVkDpmjoFaWlBry+zXjY6apxkxLK3w0Q9mPYN/fOgdwrPnNb36/v36f3zl3nbPXPmvvvb5r7b32OucCAwArXWRwPTBs+AgIgQgA1pNqQx/IeHh87OMJkcQPIxr7wPhH0xN
3DT0OIG2k59UP3Zc50jg0dTkAqvI/PDy+T8r0H3+7iDqLJ37ClJneImmeOZf4p6j9yCnz57rgiaiBALyBeJFfNHXm7L7zpwHoiYfXpnqLi0BPJxiKiA+cOmNhPntp9nTiqX1wRkGeN9eQ/V45QNhpet6vgCrML+orALrJxMcXzJz7ZHGcuS/x3YkvmFE4xZtUlXiE+HWq/Ezvk0VyhW428SeJd83yzsxLvHbvUeKvkj5niwqL57KNUi+AiKOq/kVz8oru0X9HxUgb6VwAqq0CoeOQiEMI1+pUMkFvGAzSsBFjMiFohnfuLOgGqlbg9wPcLmnS0/PmzAJDZztGzyTtbqB7jSZ5FM6TFQI6BvIfIhoGXUcC/JPD/7oqSffp/jN03dVJw/xmrfZT8brYJahGpX91+D/tHFftMUHlfnEw0k+GYLCAFewQSlgZcSYNkUR2AQ0zp6tOQ2inq/RfDISdcjrqT7WIkfrosud62Ag2zZ6LvHO8k2G5d87MWbB88hzvE7B8indWMV0L8ubQdeGcGbB8al4hlafOyZsOywu8s0imIG8y1Uz3zvLC8hneQpd6Jb8sn+mdWwDLZ01XawqnemfC8jnzZpHk3PxZU+laoPZ/h+/uRK3q9TlchQwYHAj6z9RKaSlByKH5ca5DSC3feXTxrKCz/Oi/tj8Lon7n/2s5mNQhq95vl+/o42f1k+7g5/y9LA0A+KmSyoNvN+1OKE23EdNVDmLrVG/yVP48sc6OO/4R8iUr9RCgQzTIkiTfbtF5ZOQPzwUPuKBIZxM2tkU/k13KAea/0GVX7CQHgDbiVuKYxstQpt0VqgmkHnpDPxgI6fAQjINM8MIUyIMZUKR5yAW9IBkG/OLpdCj0+/2X/H/0n/ef8zf5j/vf8b/t3+9/w1/rf83/6s81/YejY53v7eTM2igdpOpLMYI06ljXyUT9QJ3pQDoAadkhn95JwUQPdZJCNK6TLESZnURWJK07KIRoSidR3CEkHWQnolhIqCk2ExUSqXGyG0uDOmik812KH1vZHuLyqX421fik/bAC5lHNcdbIVlFsa4Q9cBPOkGQFNGKNDGwUpFItwMdcghaWCQeoj4HMxgbqdTLIY+UD8ji5Tr4iN0F/uVhuknPkYpaKO/gEvodoIL5Pc+AUREMduwDFcBi/xlQ8Kg+Tg+ACNmENfEmjqHZqhErYBSWki40VQqlUIo2jmhO8CbbQWUjPm9h2doa0O8yWwTl4DmVpJGxn5whXI/wAyzBTKiXzp0r5pP8J6quJ2m+BYgog55gJhJREdaQ9jTVZu0ZhL35OO29CKY2cCbt0dTqbPo5GUS22hx1nzboN4IMz+GucjZ+wFXKc/Io8Eio7LIA5UEl9b1Hb6PLZQsKuniVq79ICOYfVwNdyjn4y9f2+iojGPCCNI0T5FMPzYYFOIUz3sBW4ijRVn0ZBk36U3IfaUw/6xZoPCzENplGpBPbBfuiF1VBJPWl4df35D9Ryq3yRMFeytdIP0ITDoAfky9fJ1ur0qAY4pNdxGSUGPV1KreR+MLfW88hE18msmF49f8G6FL2rFjJqzQtddX5/xkQ5kmfVckctug21sjvu4j97eLFXz9EZE1217cOHdfY6PGcY1Y2fSEWVo2qqHz5Me6YOWsvd9Hswp9Y1pcD1jPJM3KBnlLxB6rKhfYB2Olq36krPZRekUmkZlS0HYask016hfHr6blDONieHxNhjcqXI9i+lZbvUuPIJXfaBUGUPwTKJhdO2r7Q13w19mpP7p9rjPjlzRgjqvcJ/Sa4kjwdAGMR5QnQ+K/gCq6xruhkdwU502CO7UasWaqdcbmlWriezWMmiWFNTrBZFSkwBiwJxsepVWr112zb6bdv2EzOKWz/9JG4xI88QTeI0URNLpbMvS/WJYlEuKkQxW8sWskVsrarrRQoNk2hHMIHHY09Hnyz5+BI9+IyGaJ0DIZoFKGdH1wZnTqwnYc+ArOaGDiApLc2EnOyaFcsOBGOwLGX
3j7HwNHeqhawh2CjxPMv7kI1q21UjF4+sG9l6roY6oDkgjyLEDtjuSQyPiMRuDguXwcK5nK68ZNlo9tmqZIoFoJgkZnKEKaiLUtpG19ozR9eGZj42utaW+Rhpgv5jA7IazjYfO2axDuzUpkXTRq/wa3p+jdU6FEvYQNLNk/KoPIFP0C+SF/H5kRXheooU4XIETRnHXJivmxdRHDnXsRTKw5dGLI1c6ngFXom0ZEO2m0Ck9YP+Q1ha34S4WJ0+bQhLTZHtNp2e8oRV0rttY8iMqd6HXi7/zZknF52deJXZhj8WLlpqamoWsKpBMzc/uKA6/f7Td6dcfe/Xu4uixLeEfiv5u5jQd4ciT2+wh5jKjdHlrhCf3ewzbtA5fK4NcVW6NfadPUIdIYC2cEeCS3GgLdqo66EaITSzC79Rw08GaGkmlGQBpflyy+Vm5avrinaSVZKZx5jr9EZ7XbkxMmQzJ7Pb5JjYhMQ0JwHpR6iSWFpH4WfwcGjVTvGRuPr4iWmZJ2e+faJ+976Dm7bvfG7823OKT2V9xQKfRXd0w7rP/up2H787pbpy+aY9C4qKS+ITDrhcv9//1GvqqlHz6F00pyTaXZZ4opgZzYBoTgcM0Ps4wyVGFmgCh84gBwYpn46uDSBgZg1YoArs7OCG5hSL6tfLZwc3pxAWzbHyKXLuKdWldwXAXTASsuAJWADPgD6UJUECS8J+bCx7OPBh8wSWz+axRbiCmcmVRhaDqRZadpY4S0wa6oTERJo4d+5U++Pc3XYJm9pSXxE+lnOcPLSdPJRLmkfB4544OUJvKVeiInx6m09ZZZZ8sMS8Rr/LGeZgJnSASdE5lTZ2p18UVf3O1aKoq4VcpDRcVxewuoLJPaKhwztqyLCoNge7DX7mFtUbn2F4u6/nxJ6tLF6cFTceP14w6dj01z/88PVHXsrk52rE+uBgcf2b78T3Llfj3ckHt249GJ9A1q4k7au1eBIPEz3xITowlweCL1Tnc4TuVnyBq2KrHGvcgbFGR7gzxIEx0ZFuCjA0iS5rIeZy2+W/Tx+PjfZp1iQ1YZPcyBt1hHu/U8pm2SxWZ7eFdujK7L1ZXKyEXUDiXGo4ikkJlXatfPHFlUTMOOaFMSfPBN+zf/pFxsXNL0S7uM4yWOSYF/CewzteOnLkpR2HpYV18Qnir+LGr7LFjW+/Et9oAWoy2+1Uc5jVhGq1hioO+sD9Hnc3wpSo8zl7+axVzjWJO5O7Bcbf5bDHO4KNFDMpcAbHRCYrbQ3NLQ3NGpyuFaJxA2lp3AHB3ZtWeHxqSqi6tLVFEhcbn9a3X0iXAPlDWr1u9+516/bsFruXVoH/LxdE1ZL1O8WtW7fErV0jq5Yt3bBh6bIq6f0tFRVbXiiv2DLBtb/szY8+erNsvyv2g8qPr179uPID5p27dOlcIvLTEkJUQYi6aX6K00eHs3II95l2yz5YFRrtU6pC17j1DkdMiBNiYx1mzU2kftdO8JX4vstLoQ3h70UcizzmOBb1nrMhWl9jPWr92orkp/7ajLKGBJGHIK0vpHb4JjaBdcEiG1wcs3U0eWfQ/hmfi5+Y8gVDZhFviC/HbGVDOj0YTb5hZmad8GsW/O1XLFTbRF4UjzmlzV3+U710itZ8M60cA2WLLvJSXDhsMhk3WZewTabXoy0BBikkPJpDkCOUhzt6G8FhlWMIVUobbSNqDNPCeLPqooHJ+4NjGU00S0yHprcL7pg73RTDNrBhO7dt2ymOsqSNVVUbRYAkX2kte2rTbnHzp/ar0qn2zypWr1kh5YshhXNmF+059saqHTZX43Mn/0wrnfTVRZO+gfCOpy9a9Aa9ZGGSQb2hZDQZmcVkMqab9BIaEH5rCOBGAyUu3KRzyENMlPSbSfk2VXPSW4tXYQNv7z+KoZP4tazY/UVBBMYzAfXBhmCjZLJLNn2IKUFK0Lv0CSaXqa8+zfSE9JRUol9oKpOW6pea1kmhMgvAEBaJcawnJhq6G/uywTjBkGX
MM0wzzjcsNC5ha3ETewFtFN9CYijCUWxjZKi4U6wXW8xKWa/3RWmjKG3g59oM+GNrEo9uo1eA1os0+wg5pyhC2bsD3vCkGQ16NOksKCO3yDKmUyptR9m+yWjbZF4SIHMdWshboUHcFB4uW4baTI5AOUrzXIMK39LhusFqOmAdqJ5/NwO/pu3B+z1O1QLpi0IYB864pEO9bAc7s0mhGCa7wc3cUgIm6hL0CYYEo8vZj/WTRrARUgGfJ8/jC0JW6lbqn9M9p4/O1jaysJA47M2SmBrPXeqEYJbOpYpr7ysZ0vTxO6NWP/nph+wkg7Zl7avE+k2b1ktHQ9c9LQpYafXk9lX83Pk/rT0sPdx+vWLZshWUBflbafZ+TTbRwyhPkE7aBEtk5qEY7+EG5SzFRULY0pySTMHelDnxLZrkHu0ziIHivaX/gCxPCBijQWGKFK1XjB5jkfFFozEb1RhPOurkG+3XG9uvU+RuPceT1B2yhCJAL8rOTeCGo5QJRQeEGYPg1TBdfZDFVR592FEfV2dZExYIYdjNbDQERKPBNjyBrH76bHNKihbQ+jRcbmmj9fKBFuEsquE9s5Kjkp3J0cmu5Jjk2KGJniiP0xPtcXliPLEZURnOjOgMV0ZMRmxGYlHiiqgKZ0V0hasiZkXsukRf4s1EZ1fTrkZdDXKcOdE5rpyYImdRdJGrKKbMWRZd5iqL6XbnbnAv62+JS1ODTQJFz9SYO/OKUOntC3uXFD5fX1c39OjKvY3tPzHp5c05BzPz3p70t5tSan7J5OKPD/QY076kJt/77o63jllLV/fuXZOY2KZmqLP9l/AK2SochnoioZytlIPKzStN9Ra5PoyMFKG3mmGkbXiE0nY5pStCipbryvfXkz0BwZFKZFnkukhfJCdlte2qU+H+di0yduxXeGXstow3P/jgzYxtYx/and0uztM60j26Q07bm5R0qanpUlJSTXw8G8KCmJUNiiMPklZyKWllg0jK5uJpNhvLDSu5/VXG6wPZkW711rrANY5Iu2SwG2C0ZA0e7tCS+QYtY1WV7NhsWzrieI+hUUVRvqiPom5G8aEwlA2VhtqHRvKe+j6GPsaepkIoZIVSob0w0pg9WwUSo6ULGgYtrtP+q9fA6eXStv2BTYemnZg85aPpokWcYD3avmD6Omn3yi31QdLjk94+0bfvvrt6sgHMxELY/eKzhs0H9m1XLU2BXN6qs9GsPOrpYbBwHddbdDqOFpnLkpajG7iEeBQ2G3WcyToDM8CIADUWnG3Q8rXBzZe7wqB8rSsGqGUKAkqgGgR+I0smQ6iUKPXgSYYJUr401VAsLeBLpVX8WcMGqZpvNuyUrEZu1EkBaNJ3x0S5O0/SJek9gQWYE7gKV8ir+FpdpX4LbtbX4Mv8oP59/Xn9LbyJt+SbckT2bAqIqRaWatTC4eF6yf1t+z5p+s32E/U6W9sT7FJ7S/teKa79M8L7LoFeSHgRYg/BZklFc/s9zWNWuIdn8BxexG9yHVN7jXu3jl7K/6MZOv0/j/yvvruNone3eivUB9ap727W4EfQah/+i3c3T9zQ8BIo0ZXqSw2lxlJTaUBJYKm5NKg0uFQptZRYfeE3wy0/z65+9opXvHHva5s27N274Sazius3vxM3mAUvXDl16srVkye+3ipOimZxjZw9kHxqYwNIw8NigryLNFQozg/xREbUQ5Ctnhvqgtawt/BolMUa8ECYDAZphBbNUzp0pTy1QU1Zkz3GHGeZ0+f83CmzbDeFsVS7luXRqyctcdaVQJCmrLi+ftAbJafB7z9d8oY04OX1619W6ZX2fTpTTa5XHBU/0nnUy75tvHKlkajj25uU9fw9YWWG3wQP/h6iDdrHq98/HXS5637rfNuYoCyj+kXXcPtbF7XTzxRRAEHi1vnWR4Ky/uHbWITcpH1TAqmGaDXRJcil+yc8DCqILhJVE20lyiXaTlRJtJpoCV8Mp3RX4BSvhFPyJX+rfAVK5HyYLTcT7YD
D0kB4V/4SZnObulZuH6H0OjIDdrJYVs3+Jo2T3sIEXIdX5Tz5Bk/kb+kMujm6byjDyNK/0oEBIjATkqCAsg961YXnVYSyXQqlu/r9SQ+T1G8OspGEk7XveGqZQShxHWUJDGxEZxnvqJfvKHPoxsZ2lnVgY/lwPxRCESyEOfTiNJVGn0uZWneYAj3ongLJdKZSaTJJuCCdZOZCMdEcyAMvzISeVPsgzCL53lS6jxDPoPu4230Va1we3fOozXy65pKk6X8war/bo6pfROfTWOpXplkkrerhpTb/uxGHUWkatZsA80hiCsl6td7ytBZeDZGLepmlfu8lmcnU7xMk56L2hTS6V3v2y37Ga70Uk0YUjWE61aqjFpNsodZTCo2dCmk/a9XVpvMfBf/T2rfsfzwitHmh/h+hUP4cQruK+l9FGL0hhNP+4qAsIZl67k99j6AX37HwMOEeD1AnlXn8PwlsteF/uPHHFLxVjT8E4fcCWwT+zY1/DcLvqvGmG288cx+/IfB6NV6rxuZW/LYVvxH49SC8mo5XBH6Vgl9eHs+/rMbLJHh5PF76og+/1Ipf9MGLAj8XeCEF/2LDz6rxU4GfWPHPi/HjI/gngedJ/PxiPHf2AX5uMZ59AM/8IZKfEfiHSPy9wI8E/rvAfxPYVI2nG538tMBGJ36YgqcEfrDCwj9w4Puh2CDwuMD3BL4r8JjAdwS+LfAtgUcFHhF42IL15W5eL7Du0BFeJ/DQwWx+6AgeKpMP/s7ND2Z7/HjQI//OjQcEvlmN+wW+IbBW4G8F7svF14Nw72tuvjcXX6ux8tfcWGPFV0npV1vxFYEvC9wjcLcVdwncuSOI70zBHUH4Ui76SMRXjS8K3L4tkG8XuC0Qt74Qzrfm4gtbFP5COG5R8HkTPidwc7WZbxZYbcZN1GhTNW7cEMQ3dscNQbi+FavWHeFVAtdVZvN1R3BdmVz5rJtXZmOlR37WjWsFrlndm68RuLo3PkMwn7kPV60M4KtsuDIAK6iiIhfLyVLlblxhweUCly218GUCl1pwicAygaUCPf6nFy/mTwtcvBifysWSTDsvceMigQsFPhmECwJxvgnnCZzbisWtOKcVZ7dikcBCgbMEzojB6QKnWdL5tPH4hMCCxTiVmHyBeQJzBU4ROFmgdxDmtOLjgZgt8DGBkwRmTTTxrFacaMJfhYbzX6XgBIGP0siPpmOmHcczhY/vhuNs+MioEP6IwIwAfFjg2IcUPlbgQwqOETianowWOOpBhY8KwQejzPxBBUea8QGBI6pxeDUOE3i/1Ivf34rpR/C+0egROFTgkHutfIgN7x0czO+14uB7zHywxx+M95hxkMCBAgf0t/EBrdi/n8L727BfWgDvp2BaAPZ1YqoZU+4O4CkC7w7A5D4BPNmMfQKwdy8j761gLyP2TMGku9w8KRfv6mHld7mxhxW7J7p59/sw0Y0J7gCeEIzuAIwXGCcwNhhjCGeMFV25GN2KToLgzMUoMzrIgg6Bka0YkY7hxIQL7JaLYWSpMIGh1Cg0HO0CbQJDBFpJwCrQQlgt6agsxuBcDBJoDgzlZoGBJB0YigECTQoaBRpIzCBQb0NdLsr0UKYZYEeqRUEvqAqXeiFTEASyOpa7Yi1L+v9wwP+1Av/tEfWf1zRPngplbmRzdHJlYW0KZW5kb2JqCjIwIDAgb2JqCjU4OTcKZW5kb2JqCjE2IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggNjcgPj4Kc3RyZWFtCnicY2CgEDDjlGFhYAWSbAzsDBwMnAxcDNwMPEA+LwMfhkp+rPoF4CxBIBbCaY8wEIsAsSiSmBiDOJiWYJAEAEu2AUMKZW5kc3RyZWFtCmVuZG9iagoxOSAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDMyNyA+PgpzdHJlYW0KeJxdkktrhDAUhff+irucLgZHHactiFCmGxd
9UNuVzEKTqwg1hugs/PdNchwHKujHOffhJTfhuXgtVD9T+GlGUfJMba+k4Wm8GsHUcNerIIpJ9mJelf+KodZBaIvLZZp5KFQ7BllG4ZcNTrNZaPcix4YfAiIKP4xk06uOdj/nElZ51fqXB1YzHYI8J8mtbfdW6/d6YAp98b6QNt7Py96W3TO+F80Uex1hJDFKnnQt2NSq4yA72CenrLVPHrCS/+JRgrKm3fJjlw9U4MXbEjav9iZ9NEFRckL0JiMgBhLgCKS3CjR4gnxeG2zSR4/o4FCBsDGEQwV6O0V2umana3aKHztUoLdPGNKhAmFjWocKhI2xHSoQdgNbrPYmEW29fFyP9C4vbiW3s3fbcVdpW724GmO37u+bX7dbdK94u5J61K7KvX/EY70iCmVuZHN0cmVhbQplbmRvYmoKMTQgMCBvYmoKPDwgL0Jhc2VGb250IC9CTVFRRFYrRGVqYVZ1U2FucwovQ0lEU3lzdGVtSW5mbyA8PCAvT3JkZXJpbmcgKElkZW50aXR5KSAvUmVnaXN0cnkgKEFkb2JlKSAvU3VwcGxlbWVudCAwID4+Ci9DSURUb0dJRE1hcCAxNiAwIFIgL0ZvbnREZXNjcmlwdG9yIDEzIDAgUiAvU3VidHlwZSAvQ0lERm9udFR5cGUyCi9UeXBlIC9Gb250IC9XIDE4IDAgUiA+PgplbmRvYmoKMTUgMCBvYmoKPDwgL0Jhc2VGb250IC9CTVFRRFYrRGVqYVZ1U2FucyAvRGVzY2VuZGFudEZvbnRzIFsgMTQgMCBSIF0KL0VuY29kaW5nIC9JZGVudGl0eS1IIC9TdWJ0eXBlIC9UeXBlMCAvVG9Vbmljb2RlIDE5IDAgUiAvVHlwZSAvRm9udCA+PgplbmRvYmoKMTMgMCBvYmoKPDwgL0FzY2VudCA5MjkgL0NhcEhlaWdodCAwIC9EZXNjZW50IC0yMzYgL0ZsYWdzIDMyCi9Gb250QkJveCBbIC0xMDIxIC00NjMgMTc5NCAxMjMzIF0gL0ZvbnRGaWxlMiAxNyAwIFIKL0ZvbnROYW1lIC9CTVFRRFYrRGVqYVZ1U2FucyAvSXRhbGljQW5nbGUgMCAvTWF4V2lkdGggODYzIC9TdGVtViAwCi9UeXBlIC9Gb250RGVzY3JpcHRvciAvWEhlaWdodCAwID4+CmVuZG9iagoxOCAwIG9iagpbIDMyIFsgMzE4IF0gNDUgWyAzNjEgMzE4IF0gNDggWyA2MzYgNjM2IDYzNiA2MzYgNjM2IDYzNiA2MzYgXSA1NgpbIDYzNiA2MzYgXSA2NiBbIDY4NiBdIDc3IFsgODYzIF0gODIgWyA2OTUgXSA4NCBbIDYxMSBdIDk3IFsgNjEzIF0gOTkKWyA1NTAgXSAxMDEgWyA2MTUgXSAxMDcgWyA1NzkgMjc4IF0gMTExIFsgNjEyIDYzNSBdIF0KZW5kb2JqCjMgMCBvYmoKPDwgL0YxIDE1IDAgUiA+PgplbmRvYmoKNCAwIG9iago8PCAvQTEgPDwgL0NBIDAgL1R5cGUgL0V4dEdTdGF0ZSAvY2EgMSA+PgovQTIgPDwgL0NBIDEgL1R5cGUgL0V4dEdTdGF0ZSAvY2EgMSA+PgovQTMgPDwgL0NBIDAuOCAvVHlwZSAvRXh0R1N0YXRlIC9jYSAwLjggPj4gPj4KZW5kb2JqCjUgMCBvYmoKPDwgPj4KZW5kb2JqCjYgMCBvYmoKPDwgPj4KZW5kb2JqCjcgMCBvYmoKPDwgPj4KZW5kb2JqCjIgMCBvYmoKPDwgL0NvdW50IDEgL0tpZHMgWyAxMSAwIFIgXSAvVHlwZSAvUGFnZXMgPj4KZW5kb2JqCjIxIDAgb2JqCjw8IC9DcmVhdGlvbkRhdGUgKEQ6MjAyMzAzMDYwNjQ1MDBaKQovQ3JlYXRvciAoTWF0cGxvdGxpYiB2My41LjMsIGh0dHBzOi8vbWF0cGxvdGxpYi5
vcmcpCi9Qcm9kdWNlciAoTWF0cGxvdGxpYiBwZGYgYmFja2VuZCB2My41LjMpID4+CmVuZG9iagp4cmVmCjAgMjIKMDAwMDAwMDAwMCA2NTUzNSBmIAowMDAwMDAwMDE2IDAwMDAwIG4gCjAwMDAwMDg4NDUgMDAwMDAgbiAKMDAwMDAwODYwOCAwMDAwMCBuIAowMDAwMDA4NjQwIDAwMDAwIG4gCjAwMDAwMDg3ODIgMDAwMDAgbiAKMDAwMDAwODgwMyAwMDAwMCBuIAowMDAwMDA4ODI0IDAwMDAwIG4gCjAwMDAwMDAwNjUgMDAwMDAgbiAKMDAwMDAwMDM0MCAwMDAwMCBuIAowMDAwMDAxMjUxIDAwMDAwIG4gCjAwMDAwMDAyMDggMDAwMDAgbiAKMDAwMDAwMTIzMSAwMDAwMCBuIAowMDAwMDA4MTc4IDAwMDAwIG4gCjAwMDAwMDc4MTggMDAwMDAgbiAKMDAwMDAwODAzMSAwMDAwMCBuIAowMDAwMDA3Mjc5IDAwMDAwIG4gCjAwMDAwMDEyNzEgMDAwMDAgbiAKMDAwMDAwODQwMiAwMDAwMCBuIAowMDAwMDA3NDE4IDAwMDAwIG4gCjAwMDAwMDcyNTggMDAwMDAgbiAKMDAwMDAwODkwNSAwMDAwMCBuIAp0cmFpbGVyCjw8IC9JbmZvIDIxIDAgUiAvUm9vdCAxIDAgUiAvU2l6ZSAyMiA+PgpzdGFydHhyZWYKOTA1NgolJUVPRgo=\n", "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2023-03-06T06:45:00.187561\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.5.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n" ], "text/plain": [ "" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "def plot_retriever_eval(dfs, retriever_names):\n", "    fig, ax = plt.subplots()\n", "    for df, retriever_name in zip(dfs, retriever_names):\n", "        df.plot(y=\"recall\", ax=ax, label=retriever_name)\n", "    plt.xticks(df.index)\n", "    plt.ylabel(\"Top-k Recall\")\n", "    plt.xlabel(\"k\")\n", "    plt.show()\n", "\n", "plot_retriever_eval([es_topk_df], [\"BM25\"])" ] }, { "cell_type": "markdown", "metadata": { "id": "BwKX_F0fz42D" }, "source": [ "#### DPR" ] }, { "cell_type": "markdown", "metadata": { "id": "8fkHEPiAz42D" }, "source": [ "\"DPR" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 337, "referenced_widgets": [ "054fed8fdf48494db227ec97d40e7d50", "0da52647fc5548b2b0f8f590bbd426fd", "2214892e47cf43d5b187fa82b5ae1242", "39f1b72c37444491aaf32fe24cd47487", "f681cee7a15b450cbc1e7d7c600333ff", "a748a6c9a63f42f8b05ea1579ea8e7ef", "3153dba1224a4ed9ae9ddf358863e5e4", "79459a18de4a47adbdfcf74d4f030167", "6aa03d64c26b42ffb45412864401161c", "4f855da5e66b4787a626805cef2422b2", "09557a3024794fac85f3acae1f71a83d", "f5341f6caba04e9bb7508a2e0fc317fc", "a4eb739c64bf4508be68e0e881222a8b", "89ac08e825494a43a7d6c071585127fa", "0670fa053dc445e3831b771aae0f9aba", "45f72e1ba86148d795227af866ae63c6", "50d62ad707ce497eac491e36a7b8ba62", "7a8a540d5b7d48e6943c4763e448ded5", "0d61042d3303482c9c182714e8ba0787", "32c66cf82d7145d1912fc007bfd85b2c", "5e546217bc47462088c4f0edea85f8cf", "38166990fa8145fbaccaf1449bcda745", "8de3b97a338f4d2589f7eca645c042d8", "7b19861cf8974f629d5b76f33eccafc9", "10851a43255b47fc8a320c29afc750eb", "07896cb770304c19b96f3328cd0e6a05", "6401d537b91846c1b5910d5d4976b891", "7bfc49f8f1d54b87b74f857445394b17", "1a3ad624f8f84fae85b400746140561e", "73a598f8b6e54ad58edb2cb35b84a06d", "7d7d2ae292cb482792bea5877d2827e3", "3f8a5983ff9146fc92d54b5f91d6dec1", "cba72b421a434c38ba4ee8216c71726f", 
"5f228abb79574183b2a6429ce137568b", "579b7086ecf04fd6a64f4926cfa27ede", "15c7e84cbe5a4a93956ed24795e20b98", "0f98d03776ff419ebde75eacd5a66e9a", "880e98851cd040a594cb186830c74ac5", "2762e545eac8496eb2ef3faaf9341529", "10f1d896b1ec4c55ac0bb3f1ab18a3ab", "e92ae5560d804aef90d41bfc8f563e50", "ac8ff19ec10d4ad08556f80ec009d0d9", "43924d5dc07a4f09a10203263871929c", "6f7f82ebb8024360a6e6ee613f3083fc", "84c0d8ed2346467d933c690396d9b8ed", "8903ff45785942379e05bbe826631ef0", "9e47fefdef41443fa897add035a46835", "4dd13fd377bd45c6bbe090c142134b60", "04f6caee9e684ed3bdb0032c9e0cff02", "52046842098e46c2ad0ad26aae21df5d", "74a67cf72d3c47ccb78849399806bb81", "94b2e2c478fd480eae6369ad602ac734", "ae4a61e58c0346449e61515ae85d42e1", "49ea6bea756a46d4a8fc280479d69478", "b43e12739ec6452b89704fc7789462ed", "5a6b82e900144a2bbc60f7d173acd1fe", "c4764c0f3a0b4c5090fd5ca6057b792b", "d13d3df595874e5e8015d782baf2ba02", "c5af7e58db404bfc8c6b4a47a1957f1f", "422ef12185db47d9879f6fd716b08b20", "6551aaa6d7ff45dfb56d3ec0c588c442", "2cb2fb24787f4f9fa8383df806cb3cfd", "0668943d9d464fe5bde5ce3a332cd1a2", "e1b003b8e0e94f22b9258e503c1b093d", "7da41ceef70e42caabf414ec54ad58e1", "3e07697f1bbe414ab02144779ce060b2", "efeafc8d24e2454d86f2cc840d5b8c7d", "c3584c6d901849c4abcf664748fd0ace", "056dd79a76fa4e6aa08ccda715b4988e", "824bd3f969e744b1b032bcebe86d12de", "d454994e61bd47169dfde9a0d9645dd9", "f39ae404fba94241aa40b11ffba5f681", "a354f24998874e4887fff4c3aa1c43f7", "52184bae537946faa068d0ce469a8b2b", "921be580aefd4c90ab6942f94e956a8f", "ced15d6b54444cba878af934094f4eb4", "8e25dfd6e28e46f5835caecb932db111", "1e1dc199299048b4ba8241261caf39c9", "945833f08a46463d89e15419a8ec28ba", "a58e70a4776842ebb1000f8d50eb2080", "288efc2a53e8484a9b1f9ca7a1ef3ca7", "54f05d93434947fe9e601b992cbe3619", "46308fb8060a477b9259b450ee02b9fc", "1a6ad8fce08a42e08aa267d9a935c93f", "45a98350b39046fd94fe6650e46ee21f", "ef2814aabfc04157b526e58295ef1f09", "9a46c093570b4e33a32287a1da117cca", "3031cd0717d8455aba0e35d01a27c5fe", 
"b0cc841385c9468787653f229a07bbe2", "262be61e3850460da51188b521dfeadb", "26b60259fbc34e79a05080857deea519", "507d323d148242baa6f2561d5ff4e002", "13b1807e63a2495d93bc9808c9da3cc5", "b270a828f8714abf8c125321b14ea07f", "8169189b38e74b218c62782eafbf9ebb", "606f82d23ab64a209c621c04e702b4db", "3df00e02523e4d478a6a1e3761b05130", "fdf00a7d26ba4131ae96da90986b7779", "73655c00d83846e8a8bfab0fde5d822d", "4369bba69270488ca1a9ed6e1c06f693", "59c5854da5bd4971ae72c2601a371830", "3abe8cf41fdf47078815b03a7df8ea90", "4bfcdccdc5544405b4788871b3fdeb3a", "a836b9ff68dd4d5e8cd25f2622e067a0", "1eb63083884f4f8dbd78200f50be8204", "f08e7f6394754d40bd70ffcae226667e", "1b1f9306777b4fedb6306b1c89273615", "5a8346dc00df4777bf7566db52a3a50f", "9eee1fdf30954be7a7b6da68846a96f1", "46ca21a58ea0489a9c5c678825477b5f" ] }, "id": "jyqMIRf7z42D", "outputId": "3992b072-5d43-436c-9065-83f564c15eb8" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "054fed8fdf48494db227ec97d40e7d50", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading: 0%| | 0.00/232k [00:00\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2023-03-06T06:47:49.029680\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.5.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n" ], "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "dpr_topk_df = evaluate_retriever(dpr_retriever)\n", "plot_retriever_eval([es_topk_df, dpr_topk_df], [\"BM25\", \"DPR\"])" ] }, { "cell_type": "markdown", "metadata": { "id": "I51Rd5zxz42E" }, "source": [ "### 리더 평가하기" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "HGbqNTyFz42E", "outputId": "765d0151-bcdb-43c8-8964-c15e3ae2cab0" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "EM: 0\n", "F1: 0.8\n" ] } ], "source": [ "from farm.evaluation.squad_evaluation import compute_f1, compute_exact\n", "\n", "pred = \"about 6000 hours\"\n", "label = \"6000 hours\"\n", "print(f\"EM: {compute_exact(label, pred)}\")\n", "print(f\"F1: {compute_f1(label, pred)}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "UqYrcHcvz42E", "outputId": "0fe50625-4637-4007-f565-c562c8d20dc0" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "EM: 0\n", "F1: 0.4\n" ] } ], "source": [ "pred = \"about 6000 dollars\"\n", "print(f\"EM: {compute_exact(label, pred)}\")\n", "print(f\"F1: {compute_f1(label, pred)}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "zfvr-vjaz42E" }, "outputs": [], "source": [ "from time import sleep\n", "from haystack.eval import EvalAnswers\n", "\n", "def evaluate_reader(reader):\n", " score_keys = ['top_1_em', 'top_1_f1']\n", " eval_reader = EvalAnswers(skip_incorrect_retrieval=False)\n", " pipe = Pipeline()\n", " pipe.add_node(component=reader, name=\"QAReader\", inputs=[\"Query\"])\n", " pipe.add_node(component=eval_reader, name=\"EvalReader\", inputs=[\"QAReader\"])\n", "\n", " i = 0\n", " for l in labels_agg:\n", " doc = document_store.query(l.question,\n", " filters={\"question_id\":[l.origin]})\n", " _ = pipe.run(query=l.question, 
documents=doc, labels=l)\n", " i += 1\n", " sleep(0.01) # 쿼리 속도를 조절하기 위해\n", "\n", " return {k:v for k,v in eval_reader.__dict__.items() if k in score_keys}\n", "\n", "reader_eval = {}\n", "reader_eval[\"Fine-tune on SQuAD\"] = evaluate_reader(reader)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 356 }, "id": "5aqbwb2Qz42E", "outputId": "fa9ce07d-926e-4da0-adbc-a609b496beab" }, "outputs": [ { "data": { "application/pdf": "JVBERi0xLjQKJazcIKu6CjEgMCBvYmoKPDwgL1BhZ2VzIDIgMCBSIC9UeXBlIC9DYXRhbG9nID4+CmVuZG9iago4IDAgb2JqCjw8IC9FeHRHU3RhdGUgNCAwIFIgL0ZvbnQgMyAwIFIgL1BhdHRlcm4gNSAwIFIKL1Byb2NTZXQgWyAvUERGIC9UZXh0IC9JbWFnZUIgL0ltYWdlQyAvSW1hZ2VJIF0gL1NoYWRpbmcgNiAwIFIKL1hPYmplY3QgNyAwIFIgPj4KZW5kb2JqCjExIDAgb2JqCjw8IC9Bbm5vdHMgMTAgMCBSIC9Db250ZW50cyA5IDAgUiAvTWVkaWFCb3ggWyAwIDAgMzk4LjU1OTM3NSAyNTAuNDY1IF0KL1BhcmVudCAyIDAgUiAvUmVzb3VyY2VzIDggMCBSIC9UeXBlIC9QYWdlID4+CmVuZG9iago5IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMTIgMCBSID4+CnN0cmVhbQp4nL1WTW/bMAz1Wb9Cx/VQRqS+j8n6ARTYIUuAHYadvDRrkWZouy1/f7TsxlKSKjstgA3rSSIfKYovk6vVn4d29fl2Jj8uxGQcta8C5aOYTFGuX6WSj/zsJMpbWS5SjD8JHQNYG7W3PNzkQ7IKjLMMquK7W/ZDiC17WfPglk2vhbBu3AeBbDKNoAtsU2BGAyWb2d49xh7uxbM8Mqu1gSAJPRgjX1byi9wyCWZnNGpP1kQe+P5LkXxhiqIyKwQHbFRJHAMBqpJ4hnkN6DwxOO4dsUR8Lv8Hdc4UhAPu2kSwB9xzTBNEVB4N4/n+HN+HMJlSX0NrPmSuI6a/S0fOiECjgA6OPMcIdPIuZlwpO/HMbyUvFVtCjUBROZ6O4GX7JGZLObnhCZLL+1Riy+/iq/zQXDefLuQ3ubwT10sxF4mA0MqDO6y1DHvfsVYKdOiXnfF80+Cx5xNFbjWogxo/9suUuinisLugK25VA/yommvPKVZB89bc+4hWCPB5O2MDEkZy52nYGg1EAzaiI1PwyOAKEUQFiiLyfQ9Wn2OC9YSgDRC94e+SyQjXmFgDpuPhrPfxPJNqTkhp7m6B+K4WlTHCNSbcTXy03gTv1dkioTInz6KzdNnZRJfqH7nPeDIu+KqtRdM2P5uXCyYJQb/9eGKVmyd51ytJ6gCljpy6FCcaulicEISn04LAa/9RT8qVg4H3rU6mum9od51q8bNLIQ0a5gzEYRMnUXWFTWQgNaoB4abJvl2GEDmWxmEcinHoGn0rRqQTgEQlg1wS1tGq9onu6HgA2pzegG2E0/s4B8zRPh2D7RzpCbCtEetpbgooRZLZ7kM9yk/bScVs/KtR1bK9dFQ1zfm9U2L/3RlkZ9lDmxxiLfGJvj+Cko4daBhmCpZLEitDLJzUtOFCegSH3ErfbstDs21WzWXzq/mdviRfqi2/F82ckWlzld+mufgL45ru
NgplbmRzdHJlYW0KZW5kb2JqCjEyIDAgb2JqCjcyMgplbmRvYmoKMTAgMCBvYmoKWyBdCmVuZG9iagoxNyAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDIwIDAgUiAvTGVuZ3RoMSA3NzE2ID4+CnN0cmVhbQp4nNU4aXgUVbbn1qnqfanudCdpsnVomi0JxA4BAghNBGQ3SEDAiSZkIWxJSMIaMewBxQFHEgdkiQqMRsAMMpAAMqARRMjTEdDnCKMIDDoTAWdweSG5Paeqm0XnPee9H+9736vqc+ueW+ee/Z57q4EBgJ0aEdwPDhk6DMLABcASaTT8wYyHxntcCX8nfAiAAA+On5DeZfugtwGQcKgZMzhzuH5QygrCzxP+3UPje/pm/vB6JTE7S/jE3Nk5JcLn1jUAEvFjJbnzyt0wPSYNQLOZcF5QMm32nF7zZgDoCIfXpuWUlYCWbtAPJ9w0bdbCgsRRj3xMeCah3Qvzc/J0WW+tAnAuofe9C2nAvE2r4A2EdyqcXb5g1mMm0sWp6JM3qzg3J/ZPHReSKX7Ce83OWVAirtLMIbyccHdRzuz8Ll/ff5jwDaTP2ZLisnK2UVgLEFlI7zNLSvNL+mu/oW4k0UiFoPjKBMFLIAwhUh1TwAA9YAAIQ4aNzgTLrJzyInonKoSBAMCdnko9M7+0CHSheYzeCepTBwLLUChZIVtN/OSgoMAZ9TEK7l6db3cCZwLnIU3tnVdAaW/379Cch5+5+IHAkLt0gfWBmdQeUNv1NJj2c3N/qs3/wtX557UPLL6tvYqdUXxFXjgTmsXIiyKBEcxgJX/awQFOCFfHzdQqfkdQoiNRqyHQUhT0akycoffBC9UIBqmCNApXE3GxEOdg/H8FG8Chxn9RTmnOVFiRUzq7CFZMLc2ZDityc4rKqC3ML6V2YeksWDEtv5j600rzZ8KKwpwioinMn0ojM3OKcmDFrJxit9JSHq2YnVNeCCuKZiojxdNyZsOK0rlFRFleUDSN2kKF/z25dvcK6vU5fAkZMMAE2gvKoLCMDM4mY84FiZT+vddtnBWG+hP+dZiYhfjO+9d0MCVIqzzv9O/h8aPxKffgpXf7Ql+AW+uoP+DO1K5kpeGOxdSKFrZeiZaUIm0kNDb4xI+hQLATB6MGUScKgnhnRujKKBiaB35ww1yNgzvYJu1sdikbWOCz237FEEQDqBJ3EcZUXARlvbjpFtVnD/BBKgyGYTAGxsF4yIV8mA7FUApz1Si5IQmSVYohP6IoIoryQCBwKfBx4KPA2cDJwInAscCRwJuBPYHdgbrAzsD2wMs/1vo/vYJ1qimEWVWJQVD0TyLoAcG6lEzgg2D+p4ZATzA4BIqdyiobRmAkGBMCJUbjQqDwHx8CpW7lhsBGkB8C8jzZF4QwgiKCYgIHQWkIwgmU2jyXIIKlQgOcovsY1MFmtpOwAhqfQyO1wl5YSVQN8DY7xdYISTS2E27AGaKsglNYJwIbCSk0CvCJJMBNlgn7iEcac7A0rUYEcay4T3xYbBCvis3QRywTm8VssYyl4EvSRGknQRq+Q7lyEuKggX0GZXAQv8IUPCwOES3wGTZjHVwhKYr/TsE62A4VpIuDFUOlUCE8TCMnpGbYRHcxvW9mW9kZ0u4gWw7n4NcoCsNhKztHdp2C72A5Zgq0Z2KKUED6nyBezTR/E5RRWTrHDMCFBBoj7UnWVLWNwSTpnHrfgEqSnAnbNQ0ah9ZDUhSP7WRvsxbNc1ALZ/AXOAc/ZStFj/iKOBzWBT2A2bCOeG9S5mgK2EKyXbkrFO7CfDGb1cFXYrZ2KvF+R7GIZO4THiaLCuAwwXyNTDb1ZytxDWmqvI2BZu1IsSfNJw7axWpcizEVZlCvAvbAXkjCGlhHnFR7NX2k72jmZvEi2byOPSN8B810pugGBeI18rWSEjUAB7QaSUSBQaJbrhe8I/Lq/eMmud+dHJ+U+BPULWvd9ZBRb17obggE
MiaJUdLkeim6Hr26etHrufhfvbyYlDgqY5K7vn3okBDXodlDaGz8JOoqGA3T+NAh6jtFaL3kpd+I7Hp3bqH7KfkpT7+n5Px+ynKi3YJ2cFrbSkXIY58JlcJy6tv2w2ZBpB1FPn/6PpDPtiSHxTvj84So9ivC8u1K/fmUmj3AFdoDsFxgLogU5baW+6BnS3KfFKfn0zNnOCfuVYFL4jqKuBEiwOMP09Taodb0rH1tpD7aGovRzqhImnWT5smXb7bI15JZR8Em21N8dpssdPGBTQZPR6UVnt68ZQv9tmy5xfT8+1u3+PdML2XwZn6aoJml0N2LpdTyMr6KV/Ey9gxbyBaxZxRdL1LJmEI7hwH8fmc61opCrbRUC7V6XZwmGiGOGeWzo+qtmZMaidjfd3JLU9AQ380Wspz8Orkj22dFqyhk9Ym3SaneFBt5g7ORfCPLf4+NbNteJ5YNbxjeeq6OGFAOiCPJ4mjY6u/i6hCFkdE2SQSbJInp8ou2DeZax7Mi1QKQDQIzREfIqImR20bVOzNH1YdnPjqq3pH5KGmCgaN9JzedbTl61GZPC2lzU9VGK0tfa6WvWX20bItII938vgniRGmidpG4SJoXVeXSUqVwiR0oZaLLYZ5mboeyqPLoZbDKtazDsqhl0a/AK1G2LMjykhGpvaHPQJbaq7Ono0abOpCl+ESnQ6PVAJWnY22jyY0pOWN+s+rxMwsWnZ30JXMMfdTFb9bV1c1nz/ab/fyI+TXpD5y+z/flW7/YURLD/0rWb6V455GvY+Axv0fsoLWtkmM61GodtfIas1ALS81rtdtjI6KZAaPBIGti5TammH3balmxOhQLWYkFeUBuuqakh5Iff74m8yb5GnlETUibojA4HRDfsXOX1FhCeodMuYCu9trESYmtrBM/y68/9nbhlKMzd7/33u5xL2ZK5+r4r6xWfu0v3/Bv3e5T9yXv37x5f6fOlK2ycmIm7QXKlbf8g8FGS0ASBSV2zGZAA9gEQDRoKaAaZVBvQ4NOeZEOqK1muFQv6XVUqZX1r5cMlFdNEUrwBlw+20I63w3enYcu2JWD/ckd97pNjGX5063MKli1Vp0VJsE8KIG1oNcynaBBvRjOXMJENknIME1jhcICNk94AkvF+doFuiq2Wlhi+rWwEWvECIqwnsXTsvBgPHqEw/ya4OUVV4S0D1e3P776nGRpd+Ge1gRWyZeSvSfJ8nqyXEe7Xw+/E6r1S1m1rBNkA0gusw+i9aJdXam2NLqUfKRk3JsdRsoyW4pbWaLx3nj12Y2x526yVBbHL/JTPJ1tY3tZDS/kGTxH6nlrPotkPVgii9jJn+dL+JO8hrKGpItPk3SjIltTLQrVsFRXLe42SEyvpUwRTSTbd7apSRHcohaKvXFmkq1mQQhOYn17B+FEe5rwQ9tAJcjD6tov1YW4e4i7Hrr57SHu4m6dxFTWhiDroEkKY6vxXsaekzilvUTIaK9/T+E5vK69DwR5auKIpwl+7++FNq1OK9iYoFMeKOgNekoXgz7doBVQh/C6zkh5QUkhGTTR4kADSTWT1LYWWzA7mlp81LuTHmpSBBODEqLEoiTERFSSQS8YnIJDG2boLHTWurWdDW5DL22qYbrwhFChXWhYIizTLjOsF8JFZsQwFoUelohddF31vdgAnKibrM/XzdDP0y2k2D6D1ewFdFCShMVTmtg8NuZhZCxLYotZJUt6h1ee4pVN0rk2Hf7QmiDFtdGxq/UiZUoZ/0H6RK3oMfCwvzvVaSszmU0WZjab0q2xJk21HaojqcSbY81RVhPqXVEp6HLKsUGL09JsZLPcpJb7tHsWBYFazdRyn6KW/jDm6WJhnmB2Ka2AXc7zdobnzzPGA4OY4eaFeI/M3+FVVO37s35swR+k0byBX+F/5g1sOOvAotjw1vf5heuCwHawHDaV7eCP8q28jf9SiaKyO71EtnSBJ/0DzCbBYhRi42J1ekFrEOLiYtMNxtg40cnA+aJjQ2S1
TayGDd5nbWu7xhqMcVFa6BjlsiRpXY6OXeXzTWTcZSpM6uqgzewymffdNfn4ncBavqZu6EFhtdIyydof161nt4e6YRZz9mBUt5yO8DgWy5wO8UcVrSetmNRenVJ84eLwstOP73hj/s5FX3zML/CrM64vqWgp3X24alPFF++xiG+n/1Ha/k6f3kvm5ebHuRI+2f/J58k9Pxg6bPWTRU/ERSYdfe345c6K3XMCl/AqnZxcMMgfBavYatGyyrza0GgTGyMabGs7aO1mGO4Y2kFuu+y7vTfzm9fkb68l+43WKDlqSdT6qNooid2jOKnax0nRUuMX7wvHq2O3ZLxx/PgbGVvGjtmR1c4/otzSTHhJTN2VkHCpuflSQkJdp05sILMwO+vnocwircRK0soBUVDi7wROpl+lWy05X2VSo4kdimy0N5jWRkc5BZ1TB6MEu3VotFqSmtQdUlHysnyN7pvqBuHvNiimJKY25oOYGzHSIBjEBgmDnIOipERtT11PfaKhGIpZsVDsLI7SZ81RDIlX3a3a4A7loFY1TitWtu01NR+YcWJq7gcz+U1+gnVr+4JpG4Qdqzc1WoTHphw50avXnu6JrC8zsDD2AL/Q9Py+PVuV09UxcvdCjYPOSVpI8Fs0R8TfwmFBYjoRhunktgEtPioAl9tayK+y3q/P0GfrS/Tk17AUWpJ0jjrWQJeYfatW4/hKidxBqmV55KMw6O13oR7QwjRVFluD6bCByg+MNet1xmEOpaQphyraeG42tdjslIP7sp3vOwWlWnts6h7pVAUoO2i4mNfwxBPVuxob09+Ye+y4sL39F8LWbVuPbG+vErP35OddD8VmLslVznEj6RzXaIdGU4NyjrNbx6HdOfQn5zi/Z5CrAio0ldpKXaW+0lBprDBVmistldZKudJWYa913XDZSJvbqUNZ/qPjXtmGXa9VP7dr13M3mJ1fu/ENv85s+NnVkyevfvnuia8283d5C/+aApFG/nawvkHP4EjS0AbJfofGqAWbEassDfrDWoNGB7phtH01Bb1NGXP2tJIi+zLCtoUpPgnm8F2HRODIuBGJm3/T2Njv4MqwHtG4z247daR9L7mjIFeSSBptAsLfpK0Uh75+l0XSWfFVOi0c1lUZjFQ+RNDJdosShwFN9POFdky10KelJe993anum8r+4gjvz5yejp1TbZ7UFBubzyr4ylFlb7557qWqKmkrf2tde+2asZu2fShkr2MDlYzaQ5GYpGaAA/r7o+/mwFoDO+xoMFEGOIxjKReGOZWQpAVtvuy7kwjFzqNKIoTZUmxB11PHk6qs3M5sj5IIuxsaHvjt3GPvsvfZQWFne862bUe2CxW3ancV5N7AVyD0/4MweaPh8cNTH7cO+BbidOoH+x+etFy+/fz+o7bRlsl65T8t3Z3ve5qnnc1j6DOcf/9R6zjL5H/6T8AlNqvfyyDUETxNcAny6PmpFAFVBBcJagi2SgtB1nSDk5IFToqb4KTmKsFoKJMcUCUWwByxBeYIaXBMssNB8QrMwZH0NeyBPoTvuUdeOHSHWfAy68iWMi7cL8wSajAMF+K3ol+sEz+ULNLjUo10RuNUNXVhJiRAIe36yllxo2KZ6BTC6al8U2thivIdJerJ0GT1PwulzyCcsGCfvrbYsFAf7xkX7+lLEMnGhvoa+jovgAfoe7QEFkIpTIdpJL0c3NAVcunL0w0+SKY7hXpTicIN6URTTl+j5USdDzkwGxJpdAQUEX0P6g0ma2fR8+E7vMpULJ+e+TRnHrV5RGn4b0jtfUdqJkmaR7KUL+ciolb0yKE5/zOJQ6g3g+ZNhLlEkUu0OSq3fHVGjmqRW/23yU2azSXZswjLJSyP5M4mCuXdT/mMV7mUkUZU8WEmjSpSy9T/tYpUW3qQ/1J/NOv2nNB/qYEn1f/x/vlyqTlN3wSUDTZ1RTohEjrReaIr+SkZepGPhsFwGA3jyOIJ8AhMahCW+AO3OLY68D+8+IMPv6/B7yz4
LcebHP/uxb9Z8JsavOHF608Nlq5zvFaDX9dgSyv+tRX/wvGrfvhlOl7l+GcfXrk8XrpSg5eJ8PJ4vPRFT+lSK37REy9y/JzjZz78kwMv1OB5jp/a8Y+L8ZND+O8cPyLyjxbjubMPSucW49kH8cyHUdIZjh9G4R84fsDxfY7/xrG5Bk+fipVOczwVi+/58CTH4ytt0vFofCccmzi+zfEtjsc4HuX4e45HOL7J8TDHQxwP2rBxlVdq5Nhw4JDUwPHA/izpwCE8sETc/zuvtD/LH8D9fvF3XtzH8Y0a3MvxtxzrOb7OcU8e7rbgrte80q48fK3OLr3mxTo7vkpKv9qKr3D8DcedHHfYcTvHl1+ySC/78CULvpiHtURSW4PbOG7dYqLKiVtMuPkFl7Q5D1/YJEsvuHCTjBsN+GuOz9eYpec51pixmiZV1+CG5yzShq74nAV/1YrPrj8kPctx/bosaf0hXL9EXPdLr7QuC9f5xV968RmOa5/uIa3l+HQPfIrMfGowrlltlNY4cDVtODRQlYeryFOrvLjShis4Ll9mk5ZzXGbDpRyXcKzk6A88uXix9CTHxYvxiTysyHRKFV5cxHEhxwUWnG/CeQacy7G8FctasbQV57RiCcdijkUcZ8XjTI4zbOnSjPE4nWPhYpxGSAHHfI55HHM5TuWY0w+zW/ExE2ZxfJTjFI6TJxmkya04yYCPhLukR3w4keMEkjwhHTOdOJ7J0vhIfNiB40aGSeM4ZhjxIY5jx8jSWI5jZBzNcRS9GcVx5AhZGhmGI2LM0ggZh5vxQY7DanBoDQ7h+ICQJD3QiumHcPAo9HMcxHHg/XZpoAPvH2CV7rfjgP5maYA/YMX+ZuzHMY1j3z4OqW8r9uktS30c2DvVKPWWMdWIvWIxxYy++4ySj+N9RkzuaZSSzdjTiD2S9FIPGZP0mOjDhO5eKSEPu3ezS9292M2OXbt4pa6DsYsXO3uNUmcreo3YiaOHY0crxpOd8XZ052FcK8aSCbF5GGPGaPJgNMeoVuyQji5CXBwj8zCCPBXBMZwmhbvQydHBMYyjnQjsHG1kqy0d5cVozUMLR7MpXDJzNBG1KRyNHA0y6jnqiEzHUetATR6K9FKkDHAijSKnj0JZEpKQyQgcWQPLW/kMS/j/cMH/tQI/e8X8A9lbR18KZW5kc3RyZWFtCmVuZG9iagoyMCAwIG9iago1MjQ3CmVuZG9iagoxNiAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDY2ID4+CnN0cmVhbQp4nGNgoBAw45RhYWAFkmwM7AwcYD4nDnVcYJKbgYeBF0WcD0rzA7EAAVcIArEQlC0MFxVhEAXTYkAsziABAEBTASoKZW5kc3RyZWFtCmVuZG9iagoxOSAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDM0MCA+PgpzdHJlYW0KeJxdkk1vgzAMhu/8Ch+7Q0WBBlQJIU3dhcM+NLZTtQNNTIU0QhTogX+/JDZUWqRgvY8/YhLH5/ql1v0M8YcdZYMzdL1WFqfxbiXCFW+9jpIUVC9nVuErh9ZEsUtulmnGodbdGJUlxJ/OOc12gd2zGq/4FAFA/G4V2l7fYPd9bgg1d2N+cUA9wyGqKlDYuXKvrXlrB4Q4JO9r5fz9vOxd2iPiazEIadAJtSRHhZNpJdpW3zAqD25VUHZuVRFq9c+fCEq7dlt86uPJXMj+BKwII+NNBm9GSVlK3lUmK6UgQVJwkCB8pChvLmQJHwnnjFmKlVIQteHNhWzAgkoKLim4pMgIZ4wzwjnhnHG+Yjoq53Zzbjc/ET4xPjFGwh3jTQZvkQZZ8P0UfCMF/VTBRzzkj3+r9VH8s/kZ22ZC3q114xAGMcyBn4Be4zarZjQ+y+8/ldvCSQplbmRzdHJlYW0KZW5kb2JqCjE0IDAgb2JqCjw8IC9CYXNlRm9udCAvQk1RUURWK0Rl
amFWdVNhbnMKL0NJRFN5c3RlbUluZm8gPDwgL09yZGVyaW5nIChJZGVudGl0eSkgL1JlZ2lzdHJ5IChBZG9iZSkgL1N1cHBsZW1lbnQgMCA+PgovQ0lEVG9HSURNYXAgMTYgMCBSIC9Gb250RGVzY3JpcHRvciAxMyAwIFIgL1N1YnR5cGUgL0NJREZvbnRUeXBlMgovVHlwZSAvRm9udCAvVyAxOCAwIFIgPj4KZW5kb2JqCjE1IDAgb2JqCjw8IC9CYXNlRm9udCAvQk1RUURWK0RlamFWdVNhbnMgL0Rlc2NlbmRhbnRGb250cyBbIDE0IDAgUiBdCi9FbmNvZGluZyAvSWRlbnRpdHktSCAvU3VidHlwZSAvVHlwZTAgL1RvVW5pY29kZSAxOSAwIFIgL1R5cGUgL0ZvbnQgPj4KZW5kb2JqCjEzIDAgb2JqCjw8IC9Bc2NlbnQgOTI5IC9DYXBIZWlnaHQgMCAvRGVzY2VudCAtMjM2IC9GbGFncyAzMgovRm9udEJCb3ggWyAtMTAyMSAtNDYzIDE3OTQgMTIzMyBdIC9Gb250RmlsZTIgMTcgMCBSCi9Gb250TmFtZSAvQk1RUURWK0RlamFWdVNhbnMgL0l0YWxpY0FuZ2xlIDAgL01heFdpZHRoIDg2MyAvU3RlbVYgMAovVHlwZSAvRm9udERlc2NyaXB0b3IgL1hIZWlnaHQgMCA+PgplbmRvYmoKMTggMCBvYmoKWyAzMiBbIDMxOCBdIDQ1IFsgMzYxIDMxOCBdIDQ4IFsgNjM2IDYzNiA2MzYgXSA1MyBbIDYzNiBdIDY1IFsgNjg0IF0gNjgKWyA3NzAgNjMyIDU3NSBdIDc3IFsgODYzIF0gODEgWyA3ODcgXSA4MyBbIDYzNSBdIDk5IFsgNTUwIF0gMTAxIFsgNjE1IF0gMTA1ClsgMjc4IF0gMTEwIFsgNjM0IDYxMiBdIDExNCBbIDQxMSBdIDExNiBbIDM5MiA2MzQgXSBdCmVuZG9iagozIDAgb2JqCjw8IC9GMSAxNSAwIFIgPj4KZW5kb2JqCjQgMCBvYmoKPDwgL0ExIDw8IC9DQSAwIC9UeXBlIC9FeHRHU3RhdGUgL2NhIDEgPj4KL0EyIDw8IC9DQSAxIC9UeXBlIC9FeHRHU3RhdGUgL2NhIDEgPj4KL0EzIDw8IC9DQSAwLjggL1R5cGUgL0V4dEdTdGF0ZSAvY2EgMC44ID4+ID4+CmVuZG9iago1IDAgb2JqCjw8ID4+CmVuZG9iago2IDAgb2JqCjw8ID4+CmVuZG9iago3IDAgb2JqCjw8ID4+CmVuZG9iagoyIDAgb2JqCjw8IC9Db3VudCAxIC9LaWRzIFsgMTEgMCBSIF0gL1R5cGUgL1BhZ2VzID4+CmVuZG9iagoyMSAwIG9iago8PCAvQ3JlYXRpb25EYXRlIChEOjIwMjMwMzA2MDY0ODI0WikKL0NyZWF0b3IgKE1hdHBsb3RsaWIgdjMuNS4zLCBodHRwczovL21hdHBsb3RsaWIub3JnKQovUHJvZHVjZXIgKE1hdHBsb3RsaWIgcGRmIGJhY2tlbmQgdjMuNS4zKSA+PgplbmRvYmoKeHJlZgowIDIyCjAwMDAwMDAwMDAgNjU1MzUgZiAKMDAwMDAwMDAxNiAwMDAwMCBuIAowMDAwMDA4MTI2IDAwMDAwIG4gCjAwMDAwMDc4ODkgMDAwMDAgbiAKMDAwMDAwNzkyMSAwMDAwMCBuIAowMDAwMDA4MDYzIDAwMDAwIG4gCjAwMDAwMDgwODQgMDAwMDAgbiAKMDAwMDAwODEwNSAwMDAwMCBuIAowMDAwMDAwMDY1IDAwMDAwIG4gCjAwMDAwMDAzNDEgMDAwMDAgbiAKMDAwMDAwMTE1OCAwMDAwMCBuIAowMDAwMDAwMjA4IDAwMDAwIG4gCjAwMDAwMDExMzggMDAwMDAgbiAKMDAwMDAwNzQ0NyAw
MDAwMCBuIAowMDAwMDA3MDg3IDAwMDAwIG4gCjAwMDAwMDczMDAgMDAwMDAgbiAKMDAwMDAwNjUzNiAwMDAwMCBuIAowMDAwMDAxMTc4IDAwMDAwIG4gCjAwMDAwMDc2NzEgMDAwMDAgbiAKMDAwMDAwNjY3NCAwMDAwMCBuIAowMDAwMDA2NTE1IDAwMDAwIG4gCjAwMDAwMDgxODYgMDAwMDAgbiAKdHJhaWxlcgo8PCAvSW5mbyAyMSAwIFIgL1Jvb3QgMSAwIFIgL1NpemUgMjIgPj4Kc3RhcnR4cmVmCjgzMzcKJSVFT0YK\n", "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2023-03-06T06:48:23.928088\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.5.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n" ], "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "def plot_reader_eval(reader_eval):\n", " fig, ax = plt.subplots()\n", " df = pd.DataFrame.from_dict(reader_eval)\n", " df.plot(kind=\"bar\", ylabel=\"Score\", rot=0, ax=ax)\n", " ax.set_xticklabels([\"EM\", \"F1\"])\n", " plt.legend(loc='upper left')\n", " plt.show()\n", "\n", "plot_reader_eval(reader_eval)" ] }, { "cell_type": "markdown", "metadata": { "id": "RRixr1wQz42F" }, "source": [ "### 도메인 적응" ] }, { "cell_type": "markdown", "metadata": { "id": "c23IbLhvz42F" }, "source": [ "\"SQuAD" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "fhfruh1kz42F" }, "outputs": [], "source": [ "def create_paragraphs(df):\n", " paragraphs = []\n", " id2context = dict(zip(df[\"review_id\"], df[\"context\"]))\n", " for review_id, review in id2context.items():\n", " qas = []\n", " # Filter for all question-answer pairs about a specific context\n", " review_df = df.query(f\"review_id == '{review_id}'\")\n", " id2question = dict(zip(review_df[\"id\"], review_df[\"question\"]))\n", " # Build up the qas array\n", " for qid, question in id2question.items():\n", " # 하나의 질문 ID에 대해 필터링합니다\n", " question_df = df.query(f\"id == '{qid}'\").to_dict(orient=\"list\")\n", " ans_start_idxs = question_df[\"answers.answer_start\"][0].tolist()\n", " ans_text = question_df[\"answers.text\"][0].tolist()\n", " # 답변 가능한 질문을 추가합니다\n", " if len(ans_start_idxs):\n", " answers = [\n", " {\"text\": text, \"answer_start\": answer_start}\n", " for text, answer_start in zip(ans_text, ans_start_idxs)]\n", " is_impossible = False\n", " else:\n", " answers = []\n", " is_impossible = True\n", " # 질문-답 쌍을 qas에 추가합니다\n", " qas.append({\"question\": question, \"id\": qid,\n", " \"is_impossible\": is_impossible, \"answers\": answers})\n", " # 문맥과 질문-답 쌍을 paragraphs에 추가합니다\n", " paragraphs.append({\"qas\": qas, \"context\": review})\n", " return paragraphs" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "OIRacuduz42F", "outputId": "4dbbe468-b6d0-4bea-886a-5871e8303abc" }, "outputs": [ { "data": { "text/plain": [ "[{'qas': [{'question': 'How is the bass?',\n", " 'id': '2543d296da9766d8d17d040ecc781699',\n", " 'is_impossible': True,\n", " 'answers': []}],\n", " 'context': 'I have had Koss headphones in the past, Pro 4AA and QZ-99. The Koss Portapro is portable AND has great bass response. The work great with my Android phone and can be \"rolled up\" to be carried in my motorcycle jacket or computer bag without getting crunched. They are very light and do not feel heavy or bear down on your ears even after listening to music with them on all day. The sound is night and day better than any ear-bud could be and are almost as good as the Pro 4AA. They are \"open air\" headphones so you cannot match the bass to the sealed types, but it comes close. For $32, you cannot go wrong.'},\n", " {'qas': [{'question': 'Is this music song have a goo bass?',\n", " 'id': 'd476830bf9282e2b9033e2bb44bbb995',\n", " 'is_impossible': False,\n", " 'answers': [{'text': 'Bass is weak as expected', 'answer_start': 1302},\n", " {'text': 'Bass is weak as expected, even with EQ adjusted up',\n", " 'answer_start': 1302}]}],\n", " 'context': 'To anyone who hasn\\'t tried all the various types of headphones, it is important to remember exactly what these are: cheap portable on-ear headphones. They give a totally different sound then in-ears or closed design phones, but for what they are I would say they\\'re good. I currently own six pairs of phones, from stock apple earbuds to Sennheiser HD 518s. Gave my Portapros a run on both my computer\\'s sound card and mp3 player, using 256 kbps mp3s or better. The clarity is good and they\\'re very lightweight. The folding design is simple but effective. The look is certainly retro and unique, although I didn\\'t find it as comfortable as many have claimed. 
Earpads are *very* thin and made my ears sore after 30 minutes of listening, although this can be remedied to a point by adjusting the \"comfort zone\" feature (tightening the temple pads while loosening the ear pads). The cord seems to be an average thickness, but I wouldn\\'t get too rough with these. The steel headband adjusts smoothly and easily, just watch out that the slider doesn\\'t catch your hair. Despite the sore ears, the phones are very lightweight overall.Back to the sound: as you would expect, it\\'s good for a portable phone, but hardly earth shattering. At flat EQ the clarity is good, although the highs can sometimes be harsh. Bass is weak as expected, even with EQ adjusted up. To be fair, a portable on-ear would have a tough time comparing to the bass of an in-ear with a good seal or a pair with larger drivers. No sound isolation offered if you\\'re into that sort of thing. Cool 80s phones, though I\\'ve certainly owned better portable on-ears (Sony makes excellent phones in this category). Soundstage is very narrow and lacks body. A good value if you can get them for under thirty, otherwise I\\'d rather invest in a nicer pair of phones. If we\\'re talking about value, they\\'re a good buy compared to new stock apple buds. If you\\'re trying to compare the sound quality of this product to serious headphones, there\\'s really no comparison at all.Update: After 100 hours of burn-in time the sound has not been affected in any appreciable way. Highs are still harsh, and bass is still underwhelming. 
I sometimes use these as a convenience but they have been largely replaced in my collection.'},\n", " {'qas': [{'question': 'How is the bass?',\n", " 'id': '455575557886d6dfeea5aa19577e5de4',\n", " 'is_impossible': False,\n", " 'answers': [{'text': 'The only fault in the sound is the bass',\n", " 'answer_start': 650}]}],\n", " 'context': \"I have had many sub-$100 headphones from $5 Panasonic to $100 Sony, with Sennheiser HD 433, 202, PX100 II (I really wanted to like these PX100-II, they were so very well designed), and even a Grado SR60 for awhile. And what it basically comes down to is value. I have never heard sound as good as these headphones in the $35 range, easily the best under $75. I can't believe they're over 25 years old.It's hard to describe how much detail these headphones bring out without making it too harsh or dull. I listen to every type of music from classical to hip hop to electronic to country, and these headphones are suitable for all types of music. The only fault in the sound is the bass. It's just a *slight* bit boomy, but you get to like it after a while to be honest.The design is from the 80s as you all have probably figured out. It could use a update but it seems like Koss has tried to perfect this formula and failed in the past. I don't really care about the looks or the way it folds up or the fact that my hair gets caught up in it (I have very short hair, even for a male).But despite it's design flaws, it's the most comfortable headphones I have ever worn, and the best part is that it's also the best sounding pair of headphones I have ever heard under $75.If you can get over the design flaws or if sound is the most important feature of headphones for you, there is nothing even close to this at this price range.This one is an absolute GEM. I loved these so much I ordered two of the 25th Anniversary ones for a bit more.Update: I read some reviews about the PX100-II being much improved and better sounding than the PortaPro. 
Since the PX100-II is relatively new, I thought I'd give it another listen. This time I noticed something different. The sound is warm, mellow, and neutral, but it loses a lot of detail at the expense of these attributes. I still prefer higher-detail Portapro, but some may prefer the more mellow sound of the PX100-II.Oh by the way the Portapro comes in the straight plug now, not the angled plug anymore. It's supposed to be for better compatibility with the iPods and iPhones out there.\"}]" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "product = dfs[\"train\"].query(\"title == 'B00001P4ZH'\")\n", "create_paragraphs(product)" ] },
{ "cell_type": "markdown", "metadata": { "id": "sbLyB3WNz42F" }, "source": [ "```python\n", "[{'qas': [{'question': 'How is the bass?',\n", "    'id': '2543d296da9766d8d17d040ecc781699',\n", "    'is_impossible': True,\n", "    'answers': []}],\n", "  'context': 'I have had Koss headphones ...'},\n", " {'qas': [{'question': 'Is this music song have a goo bass?',\n", "    'id': 'd476830bf9282e2b9033e2bb44bbb995',\n", "    'is_impossible': False,\n", "    'answers': [{'text': 'Bass is weak as expected', 'answer_start': 1302},\n", "     {'text': 'Bass is weak as expected, even with EQ adjusted up',\n", "      'answer_start': 1302}]}],\n", "  'context': 'To anyone who hasn\\'t tried all ...'},\n", " {'qas': [{'question': 'How is the bass?',\n", "    'id': '455575557886d6dfeea5aa19577e5de4',\n", "    'is_impossible': False,\n", "    'answers': [{'text': 'The only fault in the sound is the bass',\n", "      'answer_start': 650}]}],\n", "  'context': \"I have had many sub-$100 headphones ...\"}]\n", "```" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "id": "fHRaCmFrz42G" }, "outputs": [], "source": [ "import json\n", "\n", "def convert_to_squad(dfs):\n", "    for split, df in dfs.items():\n", "        subjqa_data = {}\n", "        # Create `paragraphs` for each product ID\n", "        groups = (df.groupby(\"title\").apply(create_paragraphs)\n", "            .to_frame(name=\"paragraphs\").reset_index())\n", "        subjqa_data[\"data\"] = groups.to_dict(orient=\"records\")\n", "        # Save the result to disk\n", "        with open(f\"electronics-{split}.json\", \"w+\", encoding=\"utf-8\") as f:\n", "            json.dump(subjqa_data, f)\n", "\n", "convert_to_squad(dfs)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "flhLYefGz42G", "outputId": "0624715e-14a1-45b5-d73c-a02f0411dd42" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\r", "Preprocessing Dataset electronics-train.json: 0%| | 0/1265 [00:00\n" ], "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "reader_eval[\"Fine-tune on SQuAD + SubjQA\"] = evaluate_reader(reader)\n", "plot_reader_eval(reader_eval)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 177, "referenced_widgets": [ "fecd4604e55a41ceb161f5f645efc861", "fda4b84c3ad34af9990a5994b7c33a1d", "50b0778d89204457a7d8dd32869498ef", "b0d96d75132b4dc8b32d03e913ae2453", "4949ca1475c4416eb801a38d56153997", "777903112b84452d981ccac350084980", "5ff70e11eba74bddbe50a9bf5547709c", "7da568c85f4e456f9e1f02805ce041b8", "4b592f341c554143b2d0d30abacecedd", "4b9774538ee441be9901278d7dbad38f", "ce94114fd0cf42238bc35cea101fecae", "c1e7120d8c304c319ad101eca64a23d0", "a35323b5bd2b48698f19bb19ab319914", "69e11085da814a12884dfb316654b757", "2b2f331915a14d1eba77e4cba77e541e", "d8eab6029bc44e2f95a1703fd4c43b66", "576d948710a74e959e2cc456771728ef", "28f3fd390a4340ed99ee95bda647238f", "ca45b166a30b4680934f5c6cf9ec582c", "8b271c32415d4a7294602aead353ae37", "1c6453e16e184b2a9c5646a479de17cb", "70664ce45d4043e0824a38cec4e55d34", "d54c3b80739b4cc48c7645319ee1b61e", "cef60d2a9ccb4c86996145f5dfa55459", "6fbbef00b44d472eaec96c0f141e5d80", "ec1604c31df146128f7bef859dae11ce", "13d6a998244d4ae4b5e78dfdfe155ae9", "bf9e2e52576c4e1f9ec934859c3f8d8a", "a9f8a1e6b3c0428db932328dcc48adb7", "9bc6b5363de14556ac64a1754656e3cd", "b7a8fa2091bd4d48a3d3684ddac2ca78", "49147c6f3df34b22bae3ec7f9372a998", "c0bae1a3738640c6aa7006c2c7905dac", "a6363e9a153a4a388b4e2824ad4aadd2", "7b1c888022c549b0840989540794f7b0", "59c46f6e382d4eaab78f17239850c245", "0ce9690f926f42aaa31f23d09332058b", "1f8e66f6f44545b7bdff954644dd21db", "91db0347838d439fbc487d1c7aedf15a", "06bf9033275744bf9424cb998f162686", "66d708aa514d4b5792dff7bb9ea19fb6", "9d0e50338ac0406ea692770945685eda", "7266357bceee49fe82e239b3e15a63b8", "a36d0df9fa094bf4826fb02c0ca5e7bc", "7850efc077774e4d96062d159287c5e5", 
"3d7b8a05e35b42b1ae69367f33ba44e1", "e8977b00bee046aab9fee97a2e6a41f8", "64b29b23cca2427b99f68ce2692d4fce", "3b11f65a69764caca1208379f71a8182", "841d20c0744a43919fa22ada0a03f94c", "ac79d0c45fc647ad98bc565320f02a5d", "5e53bb00e1c94f2d85d23fc11cd2282e", "ebc5a66053f345e48812aac3f6d3f650", "2318d68fef964fba87aa3cd25a066574", "bd6c6d4f873948f0a77c83ec15a420c3" ] }, "id": "btv1iLiYz42G", "outputId": "8b7657ba-2006-4b86-c076-a345ff83ecaf" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "fecd4604e55a41ceb161f5f645efc861", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading: 0%| | 0.00/385 [00:00\n", "\n", "\n", " \n", " \n", " \n", " \n", " 2023-03-06T06:53:00.176355\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.5.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n" ], "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "reader_eval[\"Fine-tune on SubjQA\"] = evaluate_reader(minilm_reader)\n", "plot_reader_eval(reader_eval)" ] }, { "cell_type": "markdown", "metadata": { "id": "72u2jX9vz42H" }, "source": [ "### 전체 QA 파이프라인 평가하기" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "fzEaUoYRz42H" }, "outputs": [], "source": [ "# 리트리버 파이프라인을 초기화합니다\n", "pipe = EvalRetrieverPipeline(es_retriever)\n", "# 리더 관련 노드를 추가합니다\n", "eval_reader = EvalAnswers()\n", "pipe.pipeline.add_node(component=reader, name=\"QAReader\",\n", " inputs=[\"EvalRetriever\"])\n", "pipe.pipeline.add_node(component=eval_reader, name=\"EvalReader\",\n", " inputs=[\"QAReader\"])\n", "# 평가합니다!\n", "run_pipeline(pipe)\n", "# 리더에서 결과를 추출합니다\n", "reader_eval[\"QA Pipeline (top-1)\"] = {\n", " k:v for k,v in eval_reader.__dict__.items()\n", " if k in [\"top_1_em\", \"top_1_f1\"]}" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 356 }, "id": "I8MGwwZmz42I", "outputId": "baf8db3d-3cca-4d8d-8f68-4ad3165396b0" }, "outputs": [ { "data": { "application/pdf": 
"JVBERi0xLjQKJazcIKu6CjEgMCBvYmoKPDwgL1BhZ2VzIDIgMCBSIC9UeXBlIC9DYXRhbG9nID4+CmVuZG9iago4IDAgb2JqCjw8IC9FeHRHU3RhdGUgNCAwIFIgL0ZvbnQgMyAwIFIgL1BhdHRlcm4gNSAwIFIKL1Byb2NTZXQgWyAvUERGIC9UZXh0IC9JbWFnZUIgL0ltYWdlQyAvSW1hZ2VJIF0gL1NoYWRpbmcgNiAwIFIKL1hPYmplY3QgNyAwIFIgPj4KZW5kb2JqCjExIDAgb2JqCjw8IC9Bbm5vdHMgMTAgMCBSIC9Db250ZW50cyA5IDAgUiAvTWVkaWFCb3ggWyAwIDAgMzkwLjkxODc1IDI1MC40NjUgXQovUGFyZW50IDIgMCBSIC9SZXNvdXJjZXMgOCAwIFIgL1R5cGUgL1BhZ2UgPj4KZW5kb2JqCjkgMCBvYmoKPDwgL0ZpbHRlciAvRmxhdGVEZWNvZGUgL0xlbmd0aCAxMiAwIFIgPj4Kc3RyZWFtCnicxVbBUhsxDPXZX+EjOeBIsr22j6QFZpjpAchMD6WHTggpTKAQOs3vV7ub3bWT4NAe2sxsvH5rS0+SJWv8cf7rfja/Op+oD9dyPMxmrxLVgxyfoFq8KlAP/KwVqnOVLwLGH6WJoCMG73i2TGbkQNvKMQbZe73qu5RPrGPBk3MWvJDShn6bDuRqucFon0LLFLJGUyNw2NhDLP1OvqhtkcZYHRSh19aq1Vx9Vk+sn4lZg8aTs5Envn0DUitmJwtfpWRTfZVxRkO6yjkPELqgeQQTGe73ZmjD/FL9A+7EajEnbwA0bDl8gNiMiODRMpxsTuE/Y69jFSry1iG/O19ZDy4EYtqI3awx5I2Famuh3PU+B8RbPnh5QHoITdQQvTGYxymF/7dNO0HhOFnUIbcpgRC1sUDR5uFL4d6m8Qm1Gb7gJOQsZ4PWTUoystebA0TaNJrlhPN4LV/4H9QxsBwkozkNo/Zq9ignUzU+q0E1vWuSf3orv6gjcSo+jdRXNb2Qp1N5KRvVe40doLdVUiTNvjRIB/WeCdzVu1t8rNM2T4U9Wqn5RMhlh8eCUhBaQEFt5Zg+AO9LNA9gQXmFGr13iAHgIIWS5QiOo+aAbMohQQskEJAThhyZCiAeYkElFqzZeQCTxSBBSyw4BwC8M84Cl6MDLEyJRWDNwYH1GYsBLbEInNvEhdYT4EFf2NJZ5KPH8QcXs/M4oKUzSSzMeOci1LE7wMKlLF5kLee4lohVk3FckbSntmQWJF2LmfghViPV6O5+/GGeiid10XYUTa3J+4k9abh7ucvr3c7gcW9nwCvf1VVk6za735Q4PjFtybyo+xZ+1o0pmy7GeW3awxIDB6yWQxB1ZaGuEB3IFyiXY8pBQtLQQVFTDhGXwcDQTKZgfQVVrbwErJpWK1FBxjcGJFQ6aDZQ7qCldK6zfAM527lnIzwBWgosp4c2VJcD1BnUC+6M3vHXrL6cJkPrWexw+suq1Om42BnD95Kpq9sQ2BZZJohBLro19biNNJfm1oWJyXWZJGE0nUtbBYWkuRopazV5CEO6iG/ilv9Xadp0Dnln39A37Q27YZMsNxuD1VBHymXO2kCpt9CyBJe5q4f+yl8bHQWHXYoTocSzuOdnLpY8PvGoxM2R+Mnl51kcCxQ3o9R3l/I3x6mRkAplbmRzdHJlYW0KZW5kb2JqCjEyIDAgb2JqCjg3MwplbmRvYmoKMTAgMCBvYmoKWyBdCmVuZG9iagoxNyAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDIwIDAgUiAvTGVuZ3RoMSA5NTc2ID4+CnN0cmVhbQp4nNV5e1xU1b74d+3v3rPnzcwwAyIwMzgOaIoSiIqZjiRoWob5OGqHYhQQX4DgI0XDN1qapoFlZlOhGZmHzKOg5skkTZHf6Ry1mze7Vmr2IPOcY49DsOZ+957RqHPuub/fH/d
zP7+957v3enzXd33fa609wADARg8R3COGZ2VDJLgBWG9qjRqR88A4D9xRRPXhAELiiHETMpNqhx4HwNnUX3P/sPEj4z+csp8Gi4Tz4APj+qbO+vF3jwFIk6h/4rQ5/lIULRqq11N/6bQF89wwIz4DQFbm4IWl0+fM7bdgJoCO6vDadH95Kch0g+4M1Y3TZy8q1F84NYjqHxKTk4oK/Pna3HfWAMSlUH//ImowvSA/SXXiEboXzZn36JML7B9QfT3Vn5xdMs0/pf+EAEB8HNXHzPE/Wipu0cylusK/u9g/pyDpm7uPUJ3w2bnSkvJ5P5U98g2AK4/6d5SWFZTeJf9FQSWQikDRlRFCl0A1BJfapoAe+sBgEIZn3zcezLP984qhC+mUrmAQ4HZJxZ5VUFYM2vA4Rn2C+taCwPIVTLaE1RI9a2ii4Fl6jIZfX4m3eoMXgTRKT6V2UbnDPRdvtfKDweHh3tB7U3CW+j4Yeivj//UVfJ1wX+/EDwRrw7Xaf+Tpf/BKvCXdf3UFl4ZkDNdIO2F+L6qaUvSsBw2YwQI28nU7WSkW4sBJ7WbSuGIRxRpIdpDCRDTqU1YtpiMMDXSlp/CLaRFCNpboqVFxdTSPgbzFRHQBImi+W36yGZ4Gu+oni/1l/qmwyl82pxhWTS3zz4BV0/zF5fQsKiij56Ky2bBqekEJlaeXFcyCVUX+YsIpKphKLbP8xX5YNdtf4lae5G+r5vjnFcGq4llKS8l0/xxYVTa/mDDnFRZPp2eRQr+TT/58hfj6BL6AHBhsBPljpVFYQSLlkUjnQ0hKufN1q86KwuUJ/9oyKi7pQljw3+PBlBCu8r5d7kTjF+1TOtXLfi4LAwF+2kjlwbeH9iBbRKmW6wqZqg0VKyvSi2EMhG4Knm4+PRMIv4duA/W7qQS6ml/geW/jdb+N57mNR6yIZrZJ8SEpTXqWqs7QG/8NCgUbcWfQIGpFQVAoss6i5xRm5YOP5pynsXM72ybPYZfzgAUvBW/PHQIlp+kJ9lCNqXURNoGSrNwqn27oSTkpHYZBNtwP48APBTADZkMplME81QMUuZIhhTCyCGMsYUxTMYpDGMHLFDUXgh8GW4Kng03BY8H9wTeD9cG9lA32BF/7Jdf/9ArlyqZwLQJCidQd5p90RRyGdJpM0AdCsabk9nQCJd6GhUGxcxZBdpjO/WFQcuRYgnEEpFWSIAQOgmkEBQRRBDPCEEMwOwxdCYoJSgliCcrC4CSYFwYXS4cGaKb7GNTBdraLaoXUPpdaAsI+WA3zqeU4a2brhGRq2wU34CxhVkEz1onARkEatQJckAS4ycbDfqKRwewsQ9aIII4R94sPig3iNbEFBojlYouYJ5azNHxJmijtIsjAd8lXTtEq08AuQTkcwi8xDY+Iw0UzXMIWrIOrNIuiv2bYCLVQQbzYWQlUChXCg9RyUmqBbXSXUH8L28HOEneH2Eo4D8+gKIyEHew8ydUM38NKHC9UklnShELi/yTRaqHx26CcAuQ80wMXelEbcU9zTVWf8ZgsnVfvG1BJM4+HWk2Dxi57aBZFY7vYcdaq2QIBOIu/xbn4EVstesTd4kjYGNIA5sFGor1NGaMpZItIduWuUKgLC8U8VgdfinnyVKL9riIRzblfeJAkKoQjBAs1FpLpLrYa1xGnSm88tMijxL40nijIS0lqgBJMh5lUqoC9sA+SsQY2EiVVXs0A6XsauV38lGTeyDYI30MLDiePLBSvk65paQCK5IOyRhJRYNDbbakXvPfm1/vGTnK/Nzkhufevqm6L7K6HnHrTIndDMJgzSYyVJtdLcfXo1daLXs+n/1Xnp8m9R+dMctd3ZA0PU83KG05t4yZRUalRM7VnDVf7lEnrJS/97s2rd08rcj9uedwz6HFLwSAlhCij0S6CYlvJCPX8r0KFxkYr0ABfhOYZ2Go2yYA2DUTqzZaLo+sjx09qBH3w7YGTR9dHqGXwDZx8JbXVmpFxJ1jaW1OYRnDYbdG
eRCG9n22AULFmxcrVgZrqp7dqbJ/zIdeu8buufs1OfHKJNbXSfLU0X4k6n8sXISvzyQwMNjFSCzTf4Js/041Mi7I57ILs6W9L7yfUEsnqmsDqlSs1tlY++NInfNDXV9m7166xd4hqPrskVAorSSLrAdguiLRfslw8Q3TOEZ0ER0K+ENtxVVhZq2TRj+ixF7iCexBWCiwGuog0353QtzVlQJrD89HZs5yTjqqCl8WN5LcGiAaPL1ITsEHA+JRtfRddXIQT4xyxXWjUTRpnuXKz1XI9hXUTrBZbWqrNahGSUsFqAU835Sk8sf355+n3/PM/MR3/4aef+A9MJ+XwFn6GoIWl0d2PpQV4OV/Dq3g528AWscVsg8Lrp5T4ptDaSkr3OTIxIAoBabkMAZ3WpYmjjSUzWM6F7cIUu7Q2hQRJvdlKkpN3TO7G9kdghCjkDkiwSuneNCtpg7NR/FlWcJqNaq+tE8tHNoxsO19HBMiTxVEkcRzs8CXFdI3FLnFWSQSrJImZlhetT5sC9qdEymhg0QtMHxdtQU28pX10vWP86Pqo8Q+NrrePf4g4QcVbms61vv221ZYR5uamyo1skb6RpW9YfZzFGp1BvPlSJ4gTpYnyYnGxtCC2KkamfBcjdiXHj5sHCzTzu5bHzotbAWtiVnRdEbsibjfsjrXmQq6XhEjvDwOGsPR+iZ5uGjl9CEtLFR12jawBSrLH2u8jNab5739lzSNnH118btIXzJ71UAy/WVdXt5A9NWjO1nsX1mTec+bO1C/e+e3O0nj+NUm/nexdTtL3gFJfH3BE6tfoXGvckQGHKaDbookLuLd4ntKsd7zcMyouEtAeE5fotsSh3aXT9FSUEDX+lvw6VX5SADlztOrNrVduXmm1fH7dot6klRTm0+U7/S6/Oz9BhFzmZA67mNAtMSndSYL0J6l6sfRQ4Rfi4dCnXubv8y8ePjlz/Htzjp5s3Ln3QPWOl58Zd7Ss/NTkz5nxSfS6mjZ9/Fev9/idqTUbV1XvWlhaXtE9cb/b/ad9S15TYj+frFxLPiVQDC73xTMTmgDRlAlokAMSw+U6ZtRDnEYrGtUMYCDBTKpgRkWwc4ObWlOtil2vnBvcmkqyqIYVT5FxTykmvcMAd8BImExL6UJ4HOQo1gsSWS/sz8awB4wPmCayQjafLcbVzESm1LEETLNS2Fk91oR01HCB8XR+/vypjoclb/tlbGlP280DLO84WWgHWSifOI+Hh30esatsXWOJ7xqQ7QHLOpMQgOWm9XKtMzqO6TEO9BaN09LOOtvF0imLWZRoIRNZmq4rAaxEMJmHN4Wso6QMq6JzcNjhF2ZRrPExxnQEek/q3ca683P824ePF015e9brp0+/PvbF8dL5Or45IoJf/+ov/Du3u/nOlAPbtx/onkjatiinQ1XvenjHNwyslKQkUVCii1n1SEc5geyglynkNEqjzop6rdJBlpGrFbtIOi3tCJR1RifpKfKbohUzDL5yrvWWFdTwuv3ShoqWUHlyt31uI2O5vswIFiFEyBHaCJgEC2hnsx50MtMKGtSJUSxGmMgmCTnG6axIeJQtEJZgmbhQflRbxdYKy4zPCM9ijRgdMhwlLg8moEc4wq8LXl5xVcj489qOR9ael8wdMbi3rRer5MvJbqfI454gyQ3Qx+fQVItCNSzXVouv6yWmk8lWopEyaeq5piYlWbSqyXSfy0SsqnYIwyms7+gqnOzIEH5sH6KoObuu43JdmLqHqOugp88Wpi6+rpWYSlofIq1QDhGOMHQm7DmFUzpKhZyO+tMKzZF1HQMgRFPjIppG+IOvH1plrSxYmaBVXijo9DoymF6XqZcF1CL8Tmsgy5BZJL0mThyip1lNNGu7Ev2KfZR4if7ZQKpZQqYhk5SaFZNMRMUcOkHvEOxypD5RSJTdcqLere8np+tnCEuECnmRfpmwQl6h3yREicyAkSwWPaw3Jml76PqxwThRO1lXoJ2pW6BdpFvONmA1ew7
tZKbIBDIUxRbzMBKWJbOlrJIlv8srm3llk3S+XYs/tvWSXO20wW77lHyznP8oXVBXvXh40HcHrWURzGgympnJZMyMcBo11Tao7kLLoMlpio0woi4mNg1jHBZnSOKMDCU1WJrUJTGjk1sSqBlfXRLT1OUxknmSzMzjVsoJylPApIu8g+HFi4zx4FCmv/lxgsfC3+VVtCLexQaxR/8k3ccb+FX+OW9gI1lXFstGtv2Rf/ytILCdzM+msp38Ib6Dt/MnlUxHVpQoIukEHwdv+NIpelCvsaKIklUUMZM21w4UHdU6e7VpuUGUNGjVQVyUWdLHxIjWoXZ9nFGMV9ynvUkxJcmmmlNZWm0Ztl8Jp65n+3xONcAWRzIJJCZRSMmiAxzMLkRhtOgFL/MKiZikSZQTtYk6t7M/6y9ks2yhSJovzpcWRq7VrJWf0Twju3LVRSE60oN9WC+m5EZ3FOV/Zg2nIdwwrGJIy4U/jHri0Yun2XsM2ld2rOObq6s3C0eiNj3Gi1hlzdSOddL5Dz7ccEh4oON61cqVqxXPVnY1L5F9k+Ax32CTUTAbBKfLqdUJsl5wuZyZeoPTJToYOF60P92l2ipWw9Pep6zrezj1BlesDN1iY8zJcoy9Ww/LxSYy+BVKl6peLOoSd/3765YTt53drGgl/CLlRPQk5Rxw9ezb84GemMscfRhlU4c9yvVPlr++rA+t7d3TUqPEkeVnHtn55sJdiz/7N/4xvzbz22UVrWWvH6naVvHZaRb93Yx/l2rfHdB/2YJpBa6YXhcOXPgkpe/7WdlrHyte4uqS/PZrJ64kKr5QQXIn07lBT2fyI7S7cRmidWZ4NVrTaLa617gOxTV6Gqzro40QjV1MOq3BhVp7ViJZ/8y51tRUdVPal7y6nfz6hLKGkzuQA/iKU+JTnCmuFHdKQkq3oUm+eJ/T5/K5fQm+bjnxOc4cV447JyGnW05SadLq+CpnlavKXZWwutumpEDSjSTnraG3Bt0akOfMc+W58xJKnaWuUndpwjLnMtcy97KELrksl6k6Iy3dzQZYPekUQN0S0/v1T0vovFeIEo5e2rO85NnGhoahR9buae74iQmvbM07ML7g6JS/3RDSCiumll/Y3/O+juV1hf5jL731tq3yiT596pKS2hUfmRu8jNdIVzEw1BcLa9ha0bzGtFbfaBUbo0lJXWWbCUbas7pa2ukcEN7/8pvXLd9dT/EZImItsctiN8UGYiXWycjE8ACHwqwa7alReG3M8zlvnjjxZs7zY+7fmdvBP6DcpJnwkpi+p1evyy0tl3v1quvenQ1hZmZjgzxkQeJKnKKx0zoaBwN9MV0bwWxvlLTrzQ1sKwUXaIURVpshK17dlaemKhn/ys0mZYVPOZDnXOYMOJEYsqaFlSaoeSiKqayFdIcvNTQMemNJcxCCzUve6Dj5yubNu3dv3vwKHhAe/nvr7nw/G860dA/3c0fztWvNBGG+KklbdoilnWN3inbdGu1ayfEqkxqN7HCXRluDcX1crEPQOrQwWrBFZMWpLDapu2NFeVfUfeFNdevh6zk0vjQ+EP9+/I14aSgMZUOFoY6hsVJvua+2r663vgRKWIlQ4iiJ1eXOVRScoIaMqlt3OLfKqtJlsbJ9n7Hl4MyTU6e9P4vf5CdZz/bPmNwg7Fy7rdEsPDzl6Ml+/fbe0ZsNZHoWye7hHzdt3b93hxItx8gNFpGuEWTo5TNrjopvwBFBYloRsrWWdtr3Kfs/OqT5DBadT5ejy9OV6sjekaRdZTN3rIEuMe+ngMb+peJRP9PrdhC2CkwL2bfPXD6TRfJJOVKeVCrdkDQhIkRAY/97qzL2EK3v+aTfSOjvi0EdoJlpqszWBuMRPS3JMEaJ1my7kqeVwxhth8joVhvloP15jj86BMXmHqtqX4fKnBofYn7DkiXVexobM9+cf+yEUNvxW2HHCzuO1nZUiXl7C/K/Ddt1Ps2rnP9G0fmv0QaNxgbl/GeLGIs2R9avzn8
+z9CYCqjQVMqV2kpdpb7SUGGsNFWaKyMqLZXWClsg5kaMtVP8ks/94phY/vSe16q37Nmz5Qaz8es3/sK/ZVa8dO3UqWtfvHfyy+38Pd7KvyEjZpCt7GwgcXiITxRriUMlIob4Ym9FRIN5PXsLj8RTNIxQ4yJbXcVSQ7xeuRUUPl0oKj5xiizXe1s1yjpMKYWFNKZyysobGwe9UXEGgsEzFW8IAykuXlFgd8dejb4u38+P8B/pPuJnX98Ki5DdcBRxZ4UUn11jkMFqwCpzg+6IrNdoQZtts9C6qvoRxcK5M4rz78+JfCFSsVgoa/xsrmgc5bq39/ZXiI9DqyP7xOF+m7X5aMc+MlbhNEn5OkvbNuGv0g7yEsoNZkkbga/SDvuItkpvoMWNtGCxmRUvGdxEv1Q1nbfeDB1lMlL2/c7BFD9RdoT2qLuYQ8mplCfSrGwhq+CrR5e/9db5l6qqpB38nY0dgXVjtr3wZyFvIxuiyFnf0Y458mbyE2CedEukLTINkLaL5ccDh3YFePuUyo72r3Ar+0xIYdjxN76m43r7N6FxrFxep3zlVDy+/vhxed335SpFbieK61SKDoWgRyCC6ay8cktz4P1NAXndV+07+W+4nRewLHaddhSpX/2ak3QLbUjTbJE2q0PYUDmFtwd2HQoojEQKNrZIMPOOjve5s33GV8CEDeq4deq/SqTzFMxRWVG/VnfmJc1BRNGjEBU2HA9sej/QvKVSYaal4yNu44fZc6yVvYoPhb5VC5OfnX9/18WPRAz+Dlxa9ePunx4zX7n1/uGD9vvMk3UXw99wQxeNk+fweAAz/+GDtrHmyf/w/ThBbFG/rYJQR/AE+ZgL6glqhcuQT/WPpGioIviUoIZgO0E+wQ5pEVgkM5wSt8EpzTWC+6Bc2ginJDtUidegQiyEuVSeK7bCXCEDjikg2eCQeFVtP4SjCDwwgJRUH7qFDfTsfA2EabAINsAfmYatZn8XJgq/x1T04xOiViwWf5TmafSaCZoXZYOcJo+VN8j/oU3W5mtf1L6lPa39XCfoLLreIR1AIo6HXlBE5w/l3PisoiHRIUTRW/mOK8MU5dudqCPkFPU7uVJmEEW1UFkALcsOl7FTu9ipLEEXNiZc1oCdFcI9UEKnwUVQRkf36TT7PPXb/zToSe9USKE7jUpTCcMNmYQzD8oJyqAA/DAHelPrvVBM+H2oNAxm0+2GB2/TKldrBfQuoDEL6JlPmPr/i1n73551PM20gOZSvtYWE7bCh5/G/L/NOJxKM2ncRJhPGNMI169SK1BH+FWJ3ESlmJ6lhDOV6M4gPDeNL6HZ/Wrfr+mMU6mUE0e0RsMsalVmLSfcEpVSKs2dBum/GHVrTPh/wuBj6v9S/3glqn4hkLWUfwstlFOVfyUdEEVrUxf1v6WepKd+RL0/0c+GETAS7oMHYCzJPw4mwG+YwJCJTCLPlOX5xTNS0zIyw+97wu/h4XdW+J0deg8jD2sQlvmCP3Fss+PfvfhjKv5Qg9+b8TuONzn+zYt/NeNfavCGF799fJj0LcfrNfhNDba24ddt+BXHLwfhF5l4jePnqXj1yjjpag1eIcQr4/DyZ32ly234WV/8lOMnHC+l4n/Y8eMavMjxIxv++1K8cBg/5PgBoX+wFM+fGyGdX4rnRuDZP8dKZzn+ORb/xPF9jn/k+H84ttTgmWandIZjsxNPp+IpjidWW6UTcfhuFDZxPM7xHY7HOL7N8Q8cj3J8i+MRjoc5HrJi4xqv1Mix4eBhqYHjwQO50sHDeHCZeOD3XulAri+IB3zi7724n+ObNbiP4xsc6zn+juPefHzdjHte80p78vG1Opv0mhfrbPgqMf1qG+7m+ArHXRx32rCW48svmaWXU/ElM76YjwFCCdTgCxx3PG+k1QWfN+L252Kk7fn43DaL9FwMbrPgs3p8huPWGpO0lWONCatpUHUNPr3FLD3dA7eYcXMbPrXpsPQUx00bc6VNh3HTMnHjk15pYy5
u9IlPenEDx/VP9JHWc3yiDz5OYj4+DNetNUjr7LiWFmVqqMrHNaSpNV5cbcVVHFeusEorOa6w4nKOyzhWcvQFH1u6VHqM49KluCQfK8Y7pAovLua4iOOjZlxoxAV6nM9xXhuWt2FZG85tw1KOJRyLOc5OwFkcZ1ozpZnjcAbHoqU4nSqFHAs45nOcxnEqR/8gzGvDh42Yy/EhjlM4Tp6klya34SQ9/iYqRvpNKk7kOIFmnpCJ4x04jlmkcV3wQTuOHRUpjeWYY8AHOI653yKN4Xi/Be/jOJp6RnMcda9FGhWJ98abpHstONKEIzhm12BWDQ7neI+QLN3ThpmHcdho9HEcynHI3TZpiB3vHhwh3W3DwXeZpMG+YATeZcJBHDM4Dhxglwa24YD+FmmAHfunG6T+Fkw3YD8nppkw9U6DlMrxTgOm9DVIKSbsa8A+yTqpjwWTddg7FXvd4ZV65eMdPW3SHV7sacMeSV6pxzBM8mKi1yAlRqDXgN05ejh2i8AEkjPBhu58dLWhk0Rw5mO8CeNIg3EcY9uwaybGUCWGY5d8jCZNRXOMokFRMejgaOcYydFGCDaOVpLVmomWpRiRj2aOJmOUZOJoJGxjFBo46i2o46glNC1H2Y6afBSpUyQPcCC1IqedhUUSkpFZEDiyBpa/egPr9f/DBf/bDPzLK/4/ARcUYOEKZW5kc3RyZWFtCmVuZG9iagoyMCAwIG9iago2NjE4CmVuZG9iagoxNiAwIG9iago8PCAvRmlsdGVyIC9GbGF0ZURlY29kZSAvTGVuZ3RoIDc3ID4+CnN0cmVhbQp4nKWOxwqAAAxDH7j33uv//9IgvSjoxcJLmh5C4ec4j+zi2eYTSEMiYhJSspeG3LygvN0r85qG9uODTvQMjJanS2exsLLJd3Gcab8B7gplbmRzdHJlYW0KZW5kb2JqCjE5IDAgb2JqCjw8IC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggMzUyID4+CnN0cmVhbQp4nF2SP2+DMBDFdz6Fx3aICBCcRkJIVbow9I9KO6EMxD4ipGIsQwa+fbGfTaUiwdP73Z3P2Befq5dK9TOLP8woappZ1ytpaBrvRhC70q1XUZIy2YvZO/cVQ6ujeC2ul2mmoVLdGBUFiz/X4DSbhT08y/FKjxFjLH43kkyvbuzh+1wD1Xetf2ggNbN9VJZMUrcu99rqt3YgFrviXSXXeD8vu7XsL+Nr0cRS5xNsSYySJt0KMq26UVTs16dkRbc+ZURK/osnOcqu3Zaf2nxIA704/AR88niziEpY8tHNumiGJbMc0WATSArJIIeQ6ioPyLHSQIFzYO7xZhFFdysN1OEca+UZcLBpoC6Jg3LfkPuGHLvj/g+CPQSKpBOsPyLuz4YLYOGx8JgcPvpz9pZ3gbqkI3ZnpYECo7GVBnqxlxtu0d6zHcptiMTdmHV+3OS6wbEj0yvahluP2lbZ9xcMfMxWCmVuZHN0cmVhbQplbmRvYmoKMTQgMCBvYmoKPDwgL0Jhc2VGb250IC9CTVFRRFYrRGVqYVZ1U2FucwovQ0lEU3lzdGVtSW5mbyA8PCAvT3JkZXJpbmcgKElkZW50aXR5KSAvUmVnaXN0cnkgKEFkb2JlKSAvU3VwcGxlbWVudCAwID4+Ci9DSURUb0dJRE1hcCAxNiAwIFIgL0ZvbnREZXNjcmlwdG9yIDEzIDAgUiAvU3VidHlwZSAvQ0lERm9udFR5cGUyCi9UeXBlIC9Gb250IC9XIDE4IDAgUiA+PgplbmRvYmoKMTUgMCBvYmoKPDwgL0Jhc2VGb250IC9CTVFRRFYrRGVqYVZ1U2FucyAvRGVzY2VuZGFudEZvbnRzIFsgMTQgMCBSIF0KL0VuY29kaW5nIC9JZGVudGl0eS1IIC9TdWJ0eXBlIC9UeXBlMCAvVG9Vbmljb2RlIDE5IDAgUiAvVHlwZSAvRm9udCA+PgplbmRvYmoKMTMgMCBvYmo
KPDwgL0FzY2VudCA5MjkgL0NhcEhlaWdodCAwIC9EZXNjZW50IC0yMzYgL0ZsYWdzIDMyCi9Gb250QkJveCBbIC0xMDIxIC00NjMgMTc5NCAxMjMzIF0gL0ZvbnRGaWxlMiAxNyAwIFIKL0ZvbnROYW1lIC9CTVFRRFYrRGVqYVZ1U2FucyAvSXRhbGljQW5nbGUgMCAvTWF4V2lkdGggODYzIC9TdGVtViAwCi9UeXBlIC9Gb250RGVzY3JpcHRvciAvWEhlaWdodCAwID4+CmVuZG9iagoxOCAwIG9iagpbIDMyIFsgMzE4IF0gNDAgWyAzOTAgMzkwIF0gNDUgWyAzNjEgMzE4IF0gNDggWyA2MzYgNjM2IDYzNiA2MzYgNjM2IDYzNiBdCjY1IFsgNjg0IF0gNjkgWyA2MzIgNTc1IF0gNzcgWyA4NjMgXSA4MSBbIDc4NyA2OTUgNjM1IF0gOTcgWyA2MTMgXSA5OQpbIDU1MCA2MzUgNjE1IF0gMTA1IFsgMjc4IF0gMTA4IFsgMjc4IF0gMTEwIFsgNjM0IDYxMiA2MzUgXSAxMTQgWyA0MTEgXSAxMTYKWyAzOTIgXSBdCmVuZG9iagozIDAgb2JqCjw8IC9GMSAxNSAwIFIgPj4KZW5kb2JqCjQgMCBvYmoKPDwgL0ExIDw8IC9DQSAwIC9UeXBlIC9FeHRHU3RhdGUgL2NhIDEgPj4KL0EyIDw8IC9DQSAxIC9UeXBlIC9FeHRHU3RhdGUgL2NhIDEgPj4KL0EzIDw8IC9DQSAwLjggL1R5cGUgL0V4dEdTdGF0ZSAvY2EgMC44ID4+ID4+CmVuZG9iago1IDAgb2JqCjw8ID4+CmVuZG9iago2IDAgb2JqCjw8ID4+CmVuZG9iago3IDAgb2JqCjw8ID4+CmVuZG9iagoyIDAgb2JqCjw8IC9Db3VudCAxIC9LaWRzIFsgMTEgMCBSIF0gL1R5cGUgL1BhZ2VzID4+CmVuZG9iagoyMSAwIG9iago8PCAvQ3JlYXRpb25EYXRlIChEOjIwMjMwMzA2MDY1NDM4WikKL0NyZWF0b3IgKE1hdHBsb3RsaWIgdjMuNS4zLCBodHRwczovL21hdHBsb3RsaWIub3JnKQovUHJvZHVjZXIgKE1hdHBsb3RsaWIgcGRmIGJhY2tlbmQgdjMuNS4zKSA+PgplbmRvYmoKeHJlZgowIDIyCjAwMDAwMDAwMDAgNjU1MzUgZiAKMDAwMDAwMDAxNiAwMDAwMCBuIAowMDAwMDA5Njk4IDAwMDAwIG4gCjAwMDAwMDk0NjEgMDAwMDAgbiAKMDAwMDAwOTQ5MyAwMDAwMCBuIAowMDAwMDA5NjM1IDAwMDAwIG4gCjAwMDAwMDk2NTYgMDAwMDAgbiAKMDAwMDAwOTY3NyAwMDAwMCBuIAowMDAwMDAwMDY1IDAwMDAwIG4gCjAwMDAwMDAzNDAgMDAwMDAgbiAKMDAwMDAwMTMwOCAwMDAwMCBuIAowMDAwMDAwMjA4IDAwMDAwIG4gCjAwMDAwMDEyODggMDAwMDAgbiAKMDAwMDAwODk5MSAwMDAwMCBuIAowMDAwMDA4NjMxIDAwMDAwIG4gCjAwMDAwMDg4NDQgMDAwMDAgbiAKMDAwMDAwODA1NyAwMDAwMCBuIAowMDAwMDAxMzI4IDAwMDAwIG4gCjAwMDAwMDkyMTUgMDAwMDAgbiAKMDAwMDAwODIwNiAwMDAwMCBuIAowMDAwMDA4MDM2IDAwMDAwIG4gCjAwMDAwMDk3NTggMDAwMDAgbiAKdHJhaWxlcgo8PCAvSW5mbyAyMSAwIFIgL1Jvb3QgMSAwIFIgL1NpemUgMjIgPj4Kc3RhcnR4cmVmCjk5MDkKJSVFT0YK\n", "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " \n", " \n", " 
2023-03-06T06:54:37.997163\n", " image/svg+xml\n", " \n", " \n", " Matplotlib v3.5.3, https://matplotlib.org/\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "\n" ], "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# 리더와 전체 QA 파이프라인의 EM과 F1-점수 비교\n", "plot_reader_eval({\"Reader\": reader_eval[\"Fine-tune on SQuAD + SubjQA\"],\n", " \"QA pipeline (top-1)\": reader_eval[\"QA Pipeline (top-1)\"]})" ] }, { "cell_type": "markdown", "metadata": { "id": "7jpYpLCez42I" }, "source": [ "## 추출적 QA를 넘어서" ] }, { "cell_type": "markdown", "metadata": { "id": "dCYOjNaZz42I" }, "source": [ "\"RAG" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 361, "referenced_widgets": [ "24201c4e037c4521b0eddbc5acf14691", "6aa552d8122b4f1d900b2b0e1bbce2b3", "5368a0d8431a4f78aa8177e435a02044", "2cb6cd55021d4df786e9be9cf0fd1b6d", "33d160d0403e4752b8ccbc21771a6982", "91ed881e47304d19a43ca418f365d8d6", "597ba649de0347bdbd6bb206f5b9ffae", "324997655bb048908642cf935d5051b0", "35d6d5b4796146f29d537e6a66fa959e", "35fbba550fbe4b32afc9d37616b99de0", "1dda099c6e214b5aa17086d6334524c2", "97e1653fecbf4fe6b09cefc998dda8f6", "2303540a318a41cab13728254bdd93c8", "b2b7957de17c4a98bfaae26addf28860", "a96f7ee2514b43279c12713160afaa78", "24ea75b4190449c7ba3eaebc1e3ed920", "90458bdcaafb4c9a9a58ce9f68596ad5", "0b4490a071da4343927f1de85077fbf9", "5f6a69a0c7fb40639baff3d21330c5bc", "410484c5dc674bae851a59855b1e64ca", "6c42b3d197cd43b4819d7b54519fe828", "7df205b130f9448da4ad935d01487f2a", "3b1ae7324a1449308a91efac7920a217", "f3615131d88240afaa1ac0720aede7c4", "d0967753213f42abac4741b7bb2fd08e", "0caa719ea1084f178ae38f8a7df61881", "cdd53d905acb4405acd0c338d36376ba", "ab68f3fe206f4ca999bfc699f3fc26dd", "699d545cd351410c94d7a2b9919073f9", "3c531571972c453184084c188351196e", "a14f56f3525e4244988aa23522c29dc4", "ab41baffed8549f3a2d10780522ec6d2", "fb3fbb598e2b4a2b946f8e38537f9ecd", "da0771fd53424bfea4813287d7596ba2", "9d608880b7074d118f06ba4580cc830e", "e6d8d1c99b124815a677287f3a23d855", "a176bb76c2224caa9ed5b6eae65f1564", "a07930d28d774f0fbb6b49baa18e4197", 
"f9b9185d995f42ecab490619ca56f869", "8669ecdf967b49f3b1228ac2796fb7a9", "50964575a89f41968d1e12256c5042e1", "bf84ed5045984fe98eb4ace77e0b4d67", "ef436542e0a744ac983d89656390c1f5", "ada018dd49974dc58324a31802455be4", "4dde3fbfcb2944c3ab12694c87a4a7ee", "23f40f453b794fd6b3bdde3e2b4a590e", "9dc2a12ad0c8492dadefcb4df8c11727", "965384baefcd41a6a8f84bc1595c586a", "c234eec2de7c4398bbba485dfd546b8d", "1424e0fbd853472c9ba2b10fb1ee7053", "d7a99d0c739f4ae797281c7f6f1692d1", "9e69fe0c6c684ae6951c6e7e25167b88", "68227be14ac44ee9926ae5e38325610a", "e8ecf9f62d3041c4a38927c6882620cb", "579a7f13bf63462bab5d37fc33915448", "03e428e2e73049bb905969eff9580a44", "b60acd0208864b22a27fbef1de7df37f", "b14e5833789b48cfa0184bb1f96cb42b", "b4aab01faeca457ca5753de0cbf7adde", "35de3d53e98f4542ba00af58a2755f8c", "b5e861339e54462c933b21894cc2d8a6", "7bce87515ea6443298a392574a71f932", "dac07ee151724d6e9a35ba4ce914ee05", "d8f84a5f1c2f4994961cd018314e54e0", "ecfe3ef8abcd43e2992abfa188a7c30e", "e9383fa4554749c4a7c20acf2e47f288", "7279382f89ab43e5b8b06ebf11e1d3b3", "8107637926474da4922e13d33698a1c1", "2843bab33dcc4b9eae01a3661f884489", "8f9a4093735d4b468c7842f828838791", "02893b7c50d4483ba01adeb66e3415cb", "4ea98f5f71ea45d990b7e5eaf0313ca9", "2d6af5d950df40199a89a7acf6052ad5", "5586941950c24573b73466007cf30510", "234fa67a0c554eae837ff0805f6a6238", "32eb73fcab60469bbe7614bd62df8d4e", "84c13f0bf353402b8e0f044216692a52", "c774b980469840708af26b970509a643", "44d8ab80464a4d2c846e5815b0e109e9", "b07e0284bd454d679cc832c75f406266", "a4f847c2ed714642a9e5fd860ae1b9f3", "ecec445e15a440acadd23469083cbf81", "e06482a4537a409ab529ffc64ad59535", "4571c4ac68d94071a758bed8c0bbaddf", "e246ae2b88ec4806ad5dbe82094a1d1b", "3a49d109ece24d7687f77df7a1ddae18", "e367cef0cc944c9ab138f81e5128b215", "fef4af0a4bc74982976c6cbf82d2e277", "d7e566bc809f456c8bec5eb1db2b34d6", "54bb7257f2bf46d5855aef925fe59fb3", "f741f6f629514e6e9b8a63be5f0cf2b7", "7ccd1d52c0524d4499965ba69e6a9530", "834c458c2c5347f29d0b24eecf319861", 
"e3716150c4064df9a3d288fb509635d8", "e92e4c674bee4db08053d584bdeef3e8", "6222d158304540ba85d43952f5b00022", "3d82104ae4d1473792669d781c812bc6", "d2e1799132f2428a849647aa4a46f83a", "0e5cf1e295fb47d2aed2ab3a4f8c9f31" ] }, "id": "VGryx5itz42I", "outputId": "ca7492e0-feea-4897-abd4-52fb00195518" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "24201c4e037c4521b0eddbc5acf14691", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading: 0%| | 0.00/4.60k [00:00" ] } ], "metadata": { "accelerator": "GPU", "colab": { "machine_shape": "hm", "provenance": [] }, "gpuClass": "standard", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.11" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }