{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "##SPAM Classifier\n", "

\n", "\n", "\n", "\n", "\n", "Steps
\n", "Read in data
\n", "Feature Engineering
\n", "-- Simple Bins
\n", "-- TF-IDF
\n", "-- NLP
\n", "Sparse Representation
\n", "Training
\n", "-- Naive Bayes
\n", "-- SGD
\n", "\n", "\n", "\n", "\n", "
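The steps above can be sketched end to end on a toy example. This is a minimal illustration that uses only the Python standard library and assumes Python 3; it is not the code this notebook runs. The four messages, the helper names (tokenize, tfidf, train_nb, predict_nb) and the add-one smoothing are all assumptions made for the sketch. It covers reading data, TF-IDF features stored as sparse dictionaries, and a multinomial Naive Bayes classifier; binning, NLP features and SGD are left out.

```python
import math
from collections import Counter, defaultdict

# Toy messages invented for illustration; they are NOT from the CSDMC2010 corpus.
TRAIN = [
    ('win cash prize now', 'spam'),
    ('cheap pills win big', 'spam'),
    ('meeting agenda for monday', 'ham'),
    ('lunch on monday with the team', 'ham'),
]


def tokenize(text):
    # Lowercase and split on whitespace; a real pipeline would clean more.
    return text.lower().split()


def tfidf(docs):
    # Sparse representation: one {term: weight} dict per tokenized document.
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors


def train_nb(samples):
    # Multinomial Naive Bayes statistics gathered from raw term counts.
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        tokens = tokenize(text)
        class_docs[label] += 1
        term_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, term_counts, vocab


def predict_nb(model, text):
    # Pick the label with the highest log posterior (add-one smoothing).
    class_docs, term_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label in class_docs:
        total_terms = sum(term_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for token in tokenize(text):
            if token in vocab:  # ignore words never seen in training
                smoothed = term_counts[label][token] + 1
                score += math.log(smoothed / (total_terms + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


vectors = tfidf([tokenize(text) for text, _ in TRAIN])
model = train_nb(TRAIN)
```

Calling predict_nb(model, 'win a cash prize') on this toy model returns the spam label. In vectors[0], the term cash gets a higher TF-IDF weight than win, because win occurs in two of the four messages and cash in only one.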

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "###Reading and Preprocessing the Data\n", "

The data used in this module comes from the CSDMC2010 SPAM corpus. To follow along with your own data, or to modify the examples or data, first do the following in a Python-compatible environment:
\n", "

\n", "\n", "\n", "

\n" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }