{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Sentiment Analysis\n",
"We will use the IMDB sentiment analysis database in this tutorial. The main idea that is used in this tutorial is that certain words are enough to establish the sentiment of a given sentence. The word order is discarded in this particular tutorial.\n",
"
\n",
"\n",
"## References\n",
"1. http://ai.stanford.edu/~amaas/data/sentiment/"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import os\n",
"import urllib\n",
"\n",
"from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer\n",
"\n",
"import matplotlib.pyplot as plt\n",
"%matplotlib inline\n",
"\n",
"np.random.seed(1)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
|  | Reviews | Sentiment |
|---|---|---|
| 0 | Bromwell High is a cartoon comedy. It ran at t... | 1 |
| 1 | Homelessness (or Houselessness as George Carli... | 1 |
| 2 | Brilliant over-acting by Lesley Ann Warren. Be... | 1 |
| 3 | This is easily the most underrated film inn th... | 1 |
| 4 | This is not the typical Mel Brooks film. It wa... | 1 |