{
"cells": [
{
"cell_type": "markdown",
"id": "a29a237f",
"metadata": {},
"source": [
"## Name : ADVAIT GURUNATH CHAVAN\n",
"## Contact Number : +91 70214 55852\n",
"## Mail ID : advaitchavan135@gmail.com\n",
"\n",
"\n",
"## Oasis Infobyte Data Science Intern\n",
"## Task 4 : Email Spam Detection with Machine Learning"
]
},
{
"cell_type": "markdown",
"id": "b4a25e6b",
"metadata": {},
"source": [
]
},
{
"cell_type": "markdown",
"id": "04b36837",
"metadata": {},
"source": [
"### 1. Importing the necessary dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "9ba9aceb",
"metadata": {},
"outputs": [],
"source": [
"# Standard library\n",
"import re\n",
"from warnings import filterwarnings\n",
"\n",
"# Data handling, visualization, and model persistence\n",
"import joblib\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"import plotly.express as px\n",
"import plotly.io as plio\n",
"import seaborn as sns\n",
"\n",
"# Text vectorization, classifiers, and evaluation metrics\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.feature_extraction.text import CountVectorizer\n",
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.metrics import (\n",
"    accuracy_score,\n",
"    average_precision_score,\n",
"    classification_report,\n",
"    confusion_matrix,\n",
"    precision_recall_curve,\n",
")\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.naive_bayes import MultinomialNB\n",
"from sklearn.svm import SVC\n",
"from sklearn.tree import DecisionTreeClassifier\n",
"\n",
"# Suppress library warnings to keep the notebook output readable\n",
"filterwarnings(action='ignore')"
]
},
{
"cell_type": "markdown",
"id": "c9decead",
"metadata": {},
"source": [
"### 2. Exploring the dataset"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "2fe76627",
"metadata": {},
"outputs": [],
"source": [
"data = pd.read_csv('spam.csv', encoding='ISO-8859-1')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "526a314a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
|  | v1 | v2 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 |
|---|---|---|---|---|---|
| 0 | ham | Go until jurong point, crazy.. Available only ... | NaN | NaN | NaN |
| 1 | ham | Ok lar... Joking wif u oni... | NaN | NaN | NaN |
| 2 | spam | Free entry in 2 a wkly comp to win FA Cup fina... | NaN | NaN | NaN |
| 3 | ham | U dun say so early hor... U c already then say... | NaN | NaN | NaN |
| 4 | ham | Nah I don't think he goes to usf, he lives aro... | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... |
| 5567 | spam | This is the 2nd time we have tried 2 contact u... | NaN | NaN | NaN |
| 5568 | ham | Will Ì_ b going to esplanade fr home? | NaN | NaN | NaN |
| 5569 | ham | Pity, * was in mood for that. So...any other s... | NaN | NaN | NaN |
| 5570 | ham | The guy did some bitching but I acted like i'd... | NaN | NaN | NaN |
| 5571 | ham | Rofl. Its true to its name | NaN | NaN | NaN |

5572 rows × 5 columns

|  | v1 | v2 |
|---|---|---|
| 0 | ham | Go until jurong point, crazy.. Available only ... |
| 1 | ham | Ok lar... Joking wif u oni... |
| 2 | spam | Free entry in 2 a wkly comp to win FA Cup fina... |
| 3 | ham | U dun say so early hor... U c already then say... |
| 4 | ham | Nah I don't think he goes to usf, he lives aro... |
| ... | ... | ... |
| 5567 | spam | This is the 2nd time we have tried 2 contact u... |
| 5568 | ham | Will Ì_ b going to esplanade fr home? |
| 5569 | ham | Pity, * was in mood for that. So...any other s... |
| 5570 | ham | The guy did some bitching but I acted like i'd... |
| 5571 | ham | Rofl. Its true to its name |

5572 rows × 2 columns

|  | class | text |
|---|---|---|
| 0 | ham | Go until jurong point, crazy.. Available only ... |
| 1 | ham | Ok lar... Joking wif u oni... |
| 2 | spam | Free entry in 2 a wkly comp to win FA Cup fina... |
| 3 | ham | U dun say so early hor... U c already then say... |
| 4 | ham | Nah I don't think he goes to usf, he lives aro... |
| ... | ... | ... |
| 5567 | spam | This is the 2nd time we have tried 2 contact u... |
| 5568 | ham | Will Ì_ b going to esplanade fr home? |
| 5569 | ham | Pity, * was in mood for that. So...any other s... |
| 5570 | ham | The guy did some bitching but I acted like i'd... |
| 5571 | ham | Rofl. Its true to its name |

5572 rows × 2 columns

|  | class | text |
|---|---|---|
| 0 | 0 | Go until jurong point, crazy.. Available only ... |
| 1 | 0 | Ok lar... Joking wif u oni... |
| 2 | 1 | Free entry in 2 a wkly comp to win FA Cup fina... |
| 3 | 0 | U dun say so early hor... U c already then say... |
| 4 | 0 | Nah I don't think he goes to usf, he lives aro... |
| ... | ... | ... |
| 5567 | 1 | This is the 2nd time we have tried 2 contact u... |
| 5568 | 0 | Will Ì_ b going to esplanade fr home? |
| 5569 | 0 | Pity, * was in mood for that. So...any other s... |
| 5570 | 0 | The guy did some bitching but I acted like i'd... |
| 5571 | 0 | Rofl. Its true to its name |

5572 rows × 2 columns
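The four tables above appear to trace three cleaning steps: dropping the three `Unnamed` columns, renaming `v1`/`v2` to `class`/`text`, and encoding `ham`/`spam` as `0`/`1`. The notebook cells that produced them are not shown here, so the following is only a minimal sketch of one way to get the same shapes with pandas. The small `raw` frame is a hypothetical stand-in for `spam.csv`, built from rows visible in the tables, and the exact calls (`drop`, `rename`, `map`) are an assumption rather than the author's confirmed code.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for spam.csv, using rows visible in the tables above.
raw = pd.DataFrame({
    "v1": ["ham", "ham", "spam"],
    "v2": [
        "Ok lar... Joking wif u oni...",
        "Rofl. Its true to its name",
        "Free entry in 2 a wkly comp to win FA Cup fina...",
    ],
    "Unnamed: 2": [np.nan] * 3,
    "Unnamed: 3": [np.nan] * 3,
    "Unnamed: 4": [np.nan] * 3,
})

# Step 1: keep only the label and message columns.
data = raw.drop(columns=["Unnamed: 2", "Unnamed: 3", "Unnamed: 4"])

# Step 2: give the remaining columns descriptive names.
data = data.rename(columns={"v1": "class", "v2": "text"})

# Step 3: encode the label as an integer (ham -> 0, spam -> 1).
data["class"] = data["class"].map({"ham": 0, "spam": 1})

print(data)
```

Mapping with an explicit dictionary keeps the encoding visible in the code, and any label other than `ham` or `spam` would surface as `NaN` instead of being silently assigned a number.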