{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Multinomial Naive Bayes" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", "from sklearn.feature_extraction.text import CountVectorizer\n", "from sklearn.metrics import classification_report\n", "from sklearn.naive_bayes import MultinomialNB\n", "from sklearn.model_selection import train_test_split\n", "\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load the spam vs. non-spam (ham) dataset\n", "\n", "The goal is to predict whether an incoming message is spam or not (a non-spam message is called *ham*)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | target | text |
|---|---|---|
| 0 | ham | Go until jurong point, crazy.. Available only ... |
| 1 | ham | Ok lar... Joking wif u oni... |
| 2 | spam | Free entry in 2 a wkly comp to win FA Cup fina... |
| 3 | ham | U dun say so early hor... U c already then say... |
| 4 | ham | Nah I don't think he goes to usf, he lives aro... |