{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## SPAM Classifier\n", "
\n",
"\n",
"\n",
"\n",
"\n",
"Steps
\n",
"Read in data
\n",
"Feature Engineering
\n",
"-- Simple Bins
\n",
"-- TF-IDF
\n",
"-- NLP
\n",
"Sparse Representation
\n",
"Training
\n",
"-- Naive Bayes
\n",
"-- SGD
\n",
"\n",
"\n",
"\n",
"\n",
"
The data used in this module is from the CSDMC2010 SPAM corpus. If you want to follow along with your own data, or modify the examples or data, first do the following in a Python-compatible environment:
\n",
"