{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"intent_classification_airlines_ATIS.ipynb","provenance":[],"collapsed_sections":[]},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"markdown","metadata":{"id":"ZE4c3HMSkGGu"},"source":["![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)\n","\n","\n","[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/classifiers/intent_classification_airlines_ATIS.ipynb)\n","\n","# Intent Classification with NLU\n","\n","|Type | Description |\n","|------|--------------|\n"," | atis_airfare|air fares, like **500 $**\n"," | atis_ground_service|ground services, like **Transportation**\n"," | atis_flight|flights, like **6B12**\n"," | atis_airline|airlines, like **Emirates**\n"," | atis_abbreviation|abbreviations, like **air fare q**\n"," \n","# What is the ATIS Dataset?\n","The ATIS dataset provides a large number of messages and their associated intents that can be used to train a classifier. Within a chatbot, intent refers to the goal the customer has in mind when typing in a question or comment. While entity refers to the modifier the customer uses to describe their issue, the intent is what they really mean. For example, a user says, ‘I need new shoes.’ The intent behind the message is to browse the footwear on offer. 
Understanding the customer's intent is key to implementing a successful chatbot experience for the end user.\n","https://www.kaggle.com/hassanamin/atis-airlinetravelinformationsystem\n","\n"]},{"cell_type":"code","metadata":{"id":"SF5-Z-U4jukd","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1649990147602,"user_tz":-300,"elapsed":111768,"user":{"displayName":"Gammer Otaku","userId":"18042713576744284398"}},"outputId":"e2ab06a7-7811-4539-8212-41c21af55334"},"source":["!wget https://setup.johnsnowlabs.com/nlu/colab.sh -O - | bash\n"," \n","\n","import nlu"],"execution_count":null,"outputs":[{"output_type":"stream","name":"stdout","text":["--2022-04-15 02:33:54-- https://setup.johnsnowlabs.com/nlu/colab.sh\n","Resolving setup.johnsnowlabs.com (setup.johnsnowlabs.com)... 51.158.130.125\n","Connecting to setup.johnsnowlabs.com (setup.johnsnowlabs.com)|51.158.130.125|:443... connected.\n","HTTP request sent, awaiting response... 302 Moved Temporarily\n","Location: https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/scripts/colab_setup.sh [following]\n","--2022-04-15 02:33:54-- https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/scripts/colab_setup.sh\n","Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n","Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n","HTTP request sent, awaiting response... 
200 OK\n","Length: 1665 (1.6K) [text/plain]\n","Saving to: ‘STDOUT’\n","\n","- 100%[===================>] 1.63K --.-KB/s in 0s \n","\n","2022-04-15 02:33:55 (28.6 MB/s) - written to stdout [1665/1665]\n","\n","Installing NLU 3.4.3rc2 with PySpark 3.0.3 and Spark NLP 3.4.2 for Google Colab ...\n","Get:1 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/ InRelease [3,626 B]\n","Ign:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 InRelease\n","Ign:3 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 InRelease\n","Get:4 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Release [696 B]\n","Hit:5 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release\n","Get:6 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Release.gpg [836 B]\n","Get:7 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu bionic InRelease [15.9 kB]\n","Hit:8 http://archive.ubuntu.com/ubuntu bionic InRelease\n","Get:9 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]\n","Get:10 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]\n","Hit:11 http://ppa.launchpad.net/cran/libgit2/ubuntu bionic InRelease\n","Get:13 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]\n","Get:14 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu bionic InRelease [15.9 kB]\n","Get:15 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Packages [953 kB]\n","Hit:16 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease\n","Get:17 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu bionic/main Sources [1,947 kB]\n","Get:18 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [1,490 kB]\n","Get:19 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [3,134 kB]\n","Get:20 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [2,695 
kB]\n","Get:21 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu bionic/main amd64 Packages [996 kB]\n","Get:22 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [2,268 kB]\n","Get:23 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu bionic/main amd64 Packages [45.3 kB]\n","Fetched 13.8 MB in 4s (3,725 kB/s)\n","Reading package lists... Done\n","tar: spark-3.0.2-bin-hadoop2.7.tgz: Cannot open: No such file or directory\n","tar: Error is not recoverable: exiting now\n","\u001b[K |████████████████████████████████| 209.1 MB 55 kB/s \n","\u001b[K |████████████████████████████████| 142 kB 48.8 MB/s \n","\u001b[K |████████████████████████████████| 505 kB 45.5 MB/s \n","\u001b[K |████████████████████████████████| 198 kB 56.6 MB/s \n","\u001b[?25h Building wheel for pyspark (setup.py) ... \u001b[?25l\u001b[?25hdone\n","Collecting nlu_tmp==3.4.3rc10\n"," Downloading nlu_tmp-3.4.3rc10-py3-none-any.whl (510 kB)\n","\u001b[K |████████████████████████████████| 510 kB 5.0 MB/s \n","\u001b[?25hRequirement already satisfied: pandas>=1.3.5 in /usr/local/lib/python3.7/dist-packages (from nlu_tmp==3.4.3rc10) (1.3.5)\n","Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from nlu_tmp==3.4.3rc10) (1.21.5)\n","Requirement already satisfied: pyarrow>=0.16.0 in /usr/local/lib/python3.7/dist-packages (from nlu_tmp==3.4.3rc10) (6.0.1)\n","Requirement already satisfied: dataclasses in /usr/local/lib/python3.7/dist-packages (from nlu_tmp==3.4.3rc10) (0.6)\n","Requirement already satisfied: spark-nlp<3.5.0,>=3.4.2 in /usr/local/lib/python3.7/dist-packages (from nlu_tmp==3.4.3rc10) (3.4.2)\n","Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas>=1.3.5->nlu_tmp==3.4.3rc10) (2018.9)\n","Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/dist-packages (from pandas>=1.3.5->nlu_tmp==3.4.3rc10) (2.8.2)\n","Requirement already satisfied: six>=1.5 in 
/usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.7.3->pandas>=1.3.5->nlu_tmp==3.4.3rc10) (1.15.0)\n","Installing collected packages: nlu-tmp\n","Successfully installed nlu-tmp-3.4.3rc10\n"]}]},{"cell_type":"code","metadata":{"id":"51Zr-JvU4xEg","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1649990148234,"user_tz":-300,"elapsed":665,"user":{"displayName":"Gammer Otaku","userId":"18042713576744284398"}},"outputId":"9edca62f-a258-4563-fa37-791f1ce4b77e"},"source":["# Download the dataset \n","! wget http://ckl-it.de/wp-content/uploads/2021/01/atis_intents.csv\n"],"execution_count":null,"outputs":[{"output_type":"stream","name":"stdout","text":["--2022-04-15 02:35:46-- http://ckl-it.de/wp-content/uploads/2021/01/atis_intents.csv\n","Resolving ckl-it.de (ckl-it.de)... 217.160.0.108, 2001:8d8:100f:f000::209\n","Connecting to ckl-it.de (ckl-it.de)|217.160.0.108|:80... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 391936 (383K) [text/csv]\n","Saving to: ‘atis_intents.csv’\n","\n","atis_intents.csv 100%[===================>] 382.75K 669KB/s in 0.6s \n","\n","2022-04-15 02:35:47 (669 KB/s) - ‘atis_intents.csv’ saved [391936/391936]\n","\n"]}]},{"cell_type":"markdown","metadata":{"id":"yZ7R8bho2kXI"},"source":["# Predict Intent of Airline messages"]},{"cell_type":"code","metadata":{"id":"7GJX5d6mjk5j","colab":{"base_uri":"https://localhost:8080/","height":580},"executionInfo":{"status":"ok","timestamp":1649990290799,"user_tz":-300,"elapsed":142581,"user":{"displayName":"Gammer Otaku","userId":"18042713576744284398"}},"outputId":"384404ea-9b81-48a3-eb65-024b784020c8"},"source":["import nlu \n","import pandas as pd\n","\n","df = pd.read_csv(\"atis_intents.csv\")\n","df.columns = [\"flight\",\"text\"]\n","\n","preds = 
nlu.load('en.classify.intent.airline').predict(df[\"text\"],output_level='sentence')\n","preds"],"execution_count":null,"outputs":[{"output_type":"stream","name":"stdout","text":["classifierdl_use_atis download started this may take some time.\n","Approximate size to download 21.1 MB\n","[OK!]\n","tfhub_use download started this may take some time.\n","Approximate size to download 923.7 MB\n","[OK!]\n","sentence_detector_dl download started this may take some time.\n","Approximate size to download 354.6 KB\n","[OK!]\n"]},{"output_type":"execute_result","data":{"text/plain":[" intent intent_confidence_confidence \\\n","0 atis_flight 0.999994 \n","1 atis_flight 0.999997 \n","2 atis_airfare 0.997928 \n","3 atis_airfare 1.0 \n","4 atis_flight 0.999996 \n","... ... ... \n","4972 atis_airfare 0.999503 \n","4973 atis_flight 0.999994 \n","4974 atis_airline 1.0 \n","4975 atis_flight 0.994565 \n","4976 atis_flight 0.999779 \n","\n"," sentence \\\n","0 what flights are available from pittsburgh to ... \n","1 what is the arrival time in san francisco for ... \n","2 cheapest airfare from tacoma to orlando \n","3 round trip fares from pittsburgh to philadelph... \n","4 i need a flight tomorrow from columbus to minn... \n","... ... \n","4972 what is the airfare for flights from denver to... \n","4973 do you have any flights from denver to baltimo... \n","4974 which airlines fly into and out of denver \n","4975 does continental fly from boston to san franci... \n","4976 is there a delta flight from denver to san fra... \n","\n"," sentence_embedding_use \n","0 [0.037106938660144806, 0.0727505013346672, -0.... \n","1 [0.020266082137823105, 0.044293809682130814, -... \n","2 [0.05529679358005524, 0.0694049745798111, -0.0... \n","3 [0.044724948704242706, 0.07032939791679382, -0... \n","4 [-0.0009330636239610612, 0.0720256119966507, -... \n","... ... \n","4972 [0.015531656332314014, 0.06927467882633209, -0... \n","4973 [0.03598876670002937, 0.06490834802389145, -0.... 
\n","4974 [0.0314473956823349, 0.0699605792760849, -0.06... \n","4975 [0.01851840876042843, 0.07567648589611053, -0.... \n","4976 [0.026785779744386673, 0.06964033842086792, -0... \n","\n","[4977 rows x 4 columns]"],"text/html":["\n","
\n","| | intent | intent_confidence_confidence | sentence | sentence_embedding_use |
|---|---|---|---|---|
| 0 | atis_flight | 0.999994 | what flights are available from pittsburgh to ... | [0.037106938660144806, 0.0727505013346672, -0.... |
| 1 | atis_flight | 0.999997 | what is the arrival time in san francisco for ... | [0.020266082137823105, 0.044293809682130814, -... |
| 2 | atis_airfare | 0.997928 | cheapest airfare from tacoma to orlando | [0.05529679358005524, 0.0694049745798111, -0.0... |
| 3 | atis_airfare | 1.0 | round trip fares from pittsburgh to philadelph... | [0.044724948704242706, 0.07032939791679382, -0... |
| 4 | atis_flight | 0.999996 | i need a flight tomorrow from columbus to minn... | [-0.0009330636239610612, 0.0720256119966507, -... |
| ... | ... | ... | ... | ... |
| 4972 | atis_airfare | 0.999503 | what is the airfare for flights from denver to... | [0.015531656332314014, 0.06927467882633209, -0... |
| 4973 | atis_flight | 0.999994 | do you have any flights from denver to baltimo... | [0.03598876670002937, 0.06490834802389145, -0.... |
| 4974 | atis_airline | 1.0 | which airlines fly into and out of denver | [0.0314473956823349, 0.0699605792760849, -0.06... |
| 4975 | atis_flight | 0.994565 | does continental fly from boston to san franci... | [0.01851840876042843, 0.07567648589611053, -0.... |
| 4976 | atis_flight | 0.999779 | is there a delta flight from denver to san fra... | [0.026785779744386673, 0.06964033842086792, -0... |

4977 rows × 4 columns
\n","