{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/nguyen/projects/flask-rest-setup/sentiment-clf\n" ] } ], "source": [ "cd ../sentiment-clf/" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from model import NLPModel\n", "import pandas as pd\n", "from sklearn.model_selection import train_test_split" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create the model object\n", "The NLP model object uses a Naive Bayes classifier and a TF-IDF vectorizer:\n", "```\n", "self.clf = MultinomialNB()\n", "self.vectorizer = TfidfVectorizer()\n", "```" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "model = NLPModel()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get the data" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "with open('lib/data/train.tsv') as f:\n", "    data = pd.read_csv(f, sep='\\t')" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | PhraseId | SentenceId | Phrase | Sentiment |
|---|---|---|---|---|
| 0 | 1 | 1 | A series of escapades demonstrating the adage ... | 1 |
| 1 | 2 | 1 | A series of escapades demonstrating the adage ... | 2 |
| 2 | 3 | 1 | A series | 2 |
| 3 | 4 | 1 | A | 2 |
| 4 | 5 | 1 | series | 2 |
| 5 | 6 | 1 | of escapades demonstrating the adage that what... | 2 |
| 6 | 7 | 1 | of | 2 |
| 7 | 8 | 1 | escapades demonstrating the adage that what is... | 2 |
| 8 | 9 | 1 | escapades | 2 |
| 9 | 10 | 1 | demonstrating the adage that what is good for ... | 2 |