{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"intent_classification_airlines_ATIS.ipynb","provenance":[],"collapsed_sections":[]},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"markdown","metadata":{"id":"ZE4c3HMSkGGu"},"source":["![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)\n","\n","\n","[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/intent_classification_airlines_ATIS.ipynb)\n","\n","# Intent Classification with NLU\n","\n","|Type | \tDescription |\n","|------|--------------|\n"," | atis_airfare|Air fares, e.g. **500 $**\n"," | atis_ground_service|Ground services, e.g. **transportation**\n"," | atis_flight|Flights, e.g. **6B12**\n"," | atis_airline|Airlines, e.g. **Emirates**\n"," | atis_abbreviation|Abbreviations, e.g. **air fare q**\n"," \n","# What is the ATIS Dataset?\n","The ATIS (Airline Travel Information System) dataset provides a large number of messages and their associated intents, which can be used to train a classifier. Within a chatbot, intent refers to the goal the customer has in mind when typing a question or comment. While an entity is the modifier the customer uses to describe their issue, the intent is what they really mean. For example, when a user says, ‘I need new shoes,’ the intent behind the message is to browse the footwear on offer. 
Understanding the customer's intent is key to delivering a successful chatbot experience for end users.\n","\n","Dataset source: https://www.kaggle.com/hassanamin/atis-airlinetravelinformationsystem\n","\n"]},{"cell_type":"code","metadata":{"id":"SF5-Z-U4jukd","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1649990147602,"user_tz":-300,"elapsed":111768,"user":{"displayName":"Gammer Otaku","userId":"18042713576744284398"}},"outputId":"e2ab06a7-7811-4539-8212-41c21af55334"},"source":["!wget https://setup.johnsnowlabs.com/nlu/colab.sh -O - | bash\n"," \n","\n","import nlu"],"execution_count":null,"outputs":[{"output_type":"stream","name":"stdout","text":["--2022-04-15 02:33:54-- https://setup.johnsnowlabs.com/nlu/colab.sh\n","Resolving setup.johnsnowlabs.com (setup.johnsnowlabs.com)... 51.158.130.125\n","Connecting to setup.johnsnowlabs.com (setup.johnsnowlabs.com)|51.158.130.125|:443... connected.\n","HTTP request sent, awaiting response... 302 Moved Temporarily\n","Location: https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/scripts/colab_setup.sh [following]\n","--2022-04-15 02:33:54-- https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/scripts/colab_setup.sh\n","Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n","Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n","HTTP request sent, awaiting response... 
200 OK\n","Length: 1665 (1.6K) [text/plain]\n","Saving to: ‘STDOUT’\n","\n","- 100%[===================>] 1.63K --.-KB/s in 0s \n","\n","2022-04-15 02:33:55 (28.6 MB/s) - written to stdout [1665/1665]\n","\n","Installing NLU 3.4.3rc2 with PySpark 3.0.3 and Spark NLP 3.4.2 for Google Colab ...\n","Get:1 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/ InRelease [3,626 B]\n","Ign:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 InRelease\n","Ign:3 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 InRelease\n","Get:4 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Release [696 B]\n","Hit:5 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release\n","Get:6 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Release.gpg [836 B]\n","Get:7 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu bionic InRelease [15.9 kB]\n","Hit:8 http://archive.ubuntu.com/ubuntu bionic InRelease\n","Get:9 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]\n","Get:10 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]\n","Hit:11 http://ppa.launchpad.net/cran/libgit2/ubuntu bionic InRelease\n","Get:13 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]\n","Get:14 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu bionic InRelease [15.9 kB]\n","Get:15 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Packages [953 kB]\n","Hit:16 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease\n","Get:17 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu bionic/main Sources [1,947 kB]\n","Get:18 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [1,490 kB]\n","Get:19 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [3,134 kB]\n","Get:20 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [2,695 
kB]\n","Get:21 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu bionic/main amd64 Packages [996 kB]\n","Get:22 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [2,268 kB]\n","Get:23 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu bionic/main amd64 Packages [45.3 kB]\n","Fetched 13.8 MB in 4s (3,725 kB/s)\n","Reading package lists... Done\n","tar: spark-3.0.2-bin-hadoop2.7.tgz: Cannot open: No such file or directory\n","tar: Error is not recoverable: exiting now\n","\u001b[K |████████████████████████████████| 209.1 MB 55 kB/s \n","\u001b[K |████████████████████████████████| 142 kB 48.8 MB/s \n","\u001b[K |████████████████████████████████| 505 kB 45.5 MB/s \n","\u001b[K |████████████████████████████████| 198 kB 56.6 MB/s \n","\u001b[?25h Building wheel for pyspark (setup.py) ... \u001b[?25l\u001b[?25hdone\n","Collecting nlu_tmp==3.4.3rc10\n"," Downloading nlu_tmp-3.4.3rc10-py3-none-any.whl (510 kB)\n","\u001b[K |████████████████████████████████| 510 kB 5.0 MB/s \n","\u001b[?25hRequirement already satisfied: pandas>=1.3.5 in /usr/local/lib/python3.7/dist-packages (from nlu_tmp==3.4.3rc10) (1.3.5)\n","Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from nlu_tmp==3.4.3rc10) (1.21.5)\n","Requirement already satisfied: pyarrow>=0.16.0 in /usr/local/lib/python3.7/dist-packages (from nlu_tmp==3.4.3rc10) (6.0.1)\n","Requirement already satisfied: dataclasses in /usr/local/lib/python3.7/dist-packages (from nlu_tmp==3.4.3rc10) (0.6)\n","Requirement already satisfied: spark-nlp<3.5.0,>=3.4.2 in /usr/local/lib/python3.7/dist-packages (from nlu_tmp==3.4.3rc10) (3.4.2)\n","Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas>=1.3.5->nlu_tmp==3.4.3rc10) (2018.9)\n","Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/dist-packages (from pandas>=1.3.5->nlu_tmp==3.4.3rc10) (2.8.2)\n","Requirement already satisfied: six>=1.5 in 
/usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.7.3->pandas>=1.3.5->nlu_tmp==3.4.3rc10) (1.15.0)\n","Installing collected packages: nlu-tmp\n","Successfully installed nlu-tmp-3.4.3rc10\n"]}]},{"cell_type":"code","metadata":{"id":"51Zr-JvU4xEg","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1649990148234,"user_tz":-300,"elapsed":665,"user":{"displayName":"Gammer Otaku","userId":"18042713576744284398"}},"outputId":"9edca62f-a258-4563-fa37-791f1ce4b77e"},"source":["# Download the dataset \n","! wget http://ckl-it.de/wp-content/uploads/2021/01/atis_intents.csv\n"],"execution_count":null,"outputs":[{"output_type":"stream","name":"stdout","text":["--2022-04-15 02:35:46-- http://ckl-it.de/wp-content/uploads/2021/01/atis_intents.csv\n","Resolving ckl-it.de (ckl-it.de)... 217.160.0.108, 2001:8d8:100f:f000::209\n","Connecting to ckl-it.de (ckl-it.de)|217.160.0.108|:80... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 391936 (383K) [text/csv]\n","Saving to: ‘atis_intents.csv’\n","\n","atis_intents.csv 100%[===================>] 382.75K 669KB/s in 0.6s \n","\n","2022-04-15 02:35:47 (669 KB/s) - ‘atis_intents.csv’ saved [391936/391936]\n","\n"]}]},{"cell_type":"markdown","metadata":{"id":"yZ7R8bho2kXI"},"source":["# Predict Intent of Airline messages"]},{"cell_type":"code","metadata":{"id":"7GJX5d6mjk5j","colab":{"base_uri":"https://localhost:8080/","height":580},"executionInfo":{"status":"ok","timestamp":1649990290799,"user_tz":-300,"elapsed":142581,"user":{"displayName":"Gammer Otaku","userId":"18042713576744284398"}},"outputId":"384404ea-9b81-48a3-eb65-024b784020c8"},"source":["import nlu \n","import pandas as pd\n","\n","df = pd.read_csv(\"atis_intents.csv\")\n","df.columns = [\"flight\",\"text\"]\n","\n","preds = 
nlu.load('en.classify.intent.airline').predict(df[\"text\"],output_level='sentence')\n","preds"],"execution_count":null,"outputs":[{"output_type":"stream","name":"stdout","text":["classifierdl_use_atis download started this may take some time.\n","Approximate size to download 21.1 MB\n","[OK!]\n","tfhub_use download started this may take some time.\n","Approximate size to download 923.7 MB\n","[OK!]\n","sentence_detector_dl download started this may take some time.\n","Approximate size to download 354.6 KB\n","[OK!]\n"]},{"output_type":"execute_result","data":{"text/plain":[" intent intent_confidence_confidence \\\n","0 atis_flight 0.999994 \n","1 atis_flight 0.999997 \n","2 atis_airfare 0.997928 \n","3 atis_airfare 1.0 \n","4 atis_flight 0.999996 \n","... ... ... \n","4972 atis_airfare 0.999503 \n","4973 atis_flight 0.999994 \n","4974 atis_airline 1.0 \n","4975 atis_flight 0.994565 \n","4976 atis_flight 0.999779 \n","\n"," sentence \\\n","0 what flights are available from pittsburgh to ... \n","1 what is the arrival time in san francisco for ... \n","2 cheapest airfare from tacoma to orlando \n","3 round trip fares from pittsburgh to philadelph... \n","4 i need a flight tomorrow from columbus to minn... \n","... ... \n","4972 what is the airfare for flights from denver to... \n","4973 do you have any flights from denver to baltimo... \n","4974 which airlines fly into and out of denver \n","4975 does continental fly from boston to san franci... \n","4976 is there a delta flight from denver to san fra... \n","\n"," sentence_embedding_use \n","0 [0.037106938660144806, 0.0727505013346672, -0.... \n","1 [0.020266082137823105, 0.044293809682130814, -... \n","2 [0.05529679358005524, 0.0694049745798111, -0.0... \n","3 [0.044724948704242706, 0.07032939791679382, -0... \n","4 [-0.0009330636239610612, 0.0720256119966507, -... \n","... ... \n","4972 [0.015531656332314014, 0.06927467882633209, -0... \n","4973 [0.03598876670002937, 0.06490834802389145, -0.... 
\n","4974 [0.0314473956823349, 0.0699605792760849, -0.06... \n","4975 [0.01851840876042843, 0.07567648589611053, -0.... \n","4976 [0.026785779744386673, 0.06964033842086792, -0... \n","\n","[4977 rows x 4 columns]"],"text/html":["\n","
<table>\n",
"<thead>\n",
"<tr><th></th><th>intent</th><th>intent_confidence_confidence</th><th>sentence</th><th>sentence_embedding_use</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>atis_flight</td><td>0.999994</td><td>what flights are available from pittsburgh to ...</td><td>[0.037106938660144806, 0.0727505013346672, -0....</td></tr>\n",
"<tr><th>1</th><td>atis_flight</td><td>0.999997</td><td>what is the arrival time in san francisco for ...</td><td>[0.020266082137823105, 0.044293809682130814, -...</td></tr>\n",
"<tr><th>2</th><td>atis_airfare</td><td>0.997928</td><td>cheapest airfare from tacoma to orlando</td><td>[0.05529679358005524, 0.0694049745798111, -0.0...</td></tr>\n",
"<tr><th>3</th><td>atis_airfare</td><td>1.0</td><td>round trip fares from pittsburgh to philadelph...</td><td>[0.044724948704242706, 0.07032939791679382, -0...</td></tr>\n",
"<tr><th>4</th><td>atis_flight</td><td>0.999996</td><td>i need a flight tomorrow from columbus to minn...</td><td>[-0.0009330636239610612, 0.0720256119966507, -...</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>4972</th><td>atis_airfare</td><td>0.999503</td><td>what is the airfare for flights from denver to...</td><td>[0.015531656332314014, 0.06927467882633209, -0...</td></tr>\n",
"<tr><th>4973</th><td>atis_flight</td><td>0.999994</td><td>do you have any flights from denver to baltimo...</td><td>[0.03598876670002937, 0.06490834802389145, -0....</td></tr>\n",
"<tr><th>4974</th><td>atis_airline</td><td>1.0</td><td>which airlines fly into and out of denver</td><td>[0.0314473956823349, 0.0699605792760849, -0.06...</td></tr>\n",
"<tr><th>4975</th><td>atis_flight</td><td>0.994565</td><td>does continental fly from boston to san franci...</td><td>[0.01851840876042843, 0.07567648589611053, -0....</td></tr>\n",
"<tr><th>4976</th><td>atis_flight</td><td>0.999779</td><td>is there a delta flight from denver to san fra...</td><td>[0.026785779744386673, 0.06964033842086792, -0...</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>4977 rows × 4 columns</p>
\n"," "]},"metadata":{},"execution_count":3}]},{"cell_type":"markdown","metadata":{"id":"HH6KBffB2pY_"},"source":["# Plot Distribution of Intent in Messages"]},{"cell_type":"code","metadata":{"id":"WdnY9n1LTmed","colab":{"base_uri":"https://localhost:8080/","height":388},"executionInfo":{"status":"ok","timestamp":1649990290804,"user_tz":-300,"elapsed":22,"user":{"displayName":"Gammer Otaku","userId":"18042713576744284398"}},"outputId":"36c79ee1-9285-4d11-a8ab-ff030e1e1710"},"source":["preds.intent.value_counts().plot.bar(title='Distribution of message intents')"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":[""]},"metadata":{},"execution_count":4},{"output_type":"display_data","data":{"text/plain":["
"],"image/png":"\n"},"metadata":{"needs_background":"light"}}]}]}