I have developed a spam detector program in Python which classifies given emails as spam or ham using the Naive Bayes approach.
This project includes 3 executable files, 3 text files, and 2 directories.
In machine learning, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features. Abstractly, naive Bayes is a conditional probability model: given a problem instance to be classified, represented by a vector

$$\mathbf{x} = (x_1, \ldots, x_n)$$

representing some n features (independent variables), it assigns to this instance probabilities

$$p(C_k \mid x_1, \ldots, x_n)$$

for each of the K possible outcomes or classes $C_k$.

The problem with the above formulation is that if the number of features n is large or if a feature can take on a large number of values, then basing such a model on probability tables is infeasible. We therefore reformulate the model to make it more tractable. Using Bayes' theorem, the conditional probability can be decomposed as

$$p(C_k \mid \mathbf{x}) = \frac{p(C_k)\, p(\mathbf{x} \mid C_k)}{p(\mathbf{x})}$$
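As a rough illustration of how Bayes' theorem combined with the naive independence assumption becomes a spam/ham classifier, here is a minimal sketch in Python. It is not the code from spam_detector.py: the function names (`train`, `log_posteriors`, `classify`), the toy data in `DOCS`, the use of Laplace (add-one) smoothing, and the use of log-probabilities are all assumptions made for this example.

```python
import math
from collections import Counter

# Toy training data (hypothetical): (label, list of words in the email).
DOCS = [
    ("spam", ["win", "money", "now"]),
    ("spam", ["free", "money"]),
    ("ham", ["meeting", "tomorrow"]),
    ("ham", ["lunch", "tomorrow", "now"]),
]


def train(docs):
    """Estimate class priors p(C_k) and per-class word counts from labelled docs."""
    class_counts = Counter(label for label, _ in docs)
    word_counts = {label: Counter() for label in class_counts}
    for label, words in docs:
        word_counts[label].update(words)
    vocab = {w for _, words in docs for w in words}
    priors = {c: n / len(docs) for c, n in class_counts.items()}
    return priors, word_counts, vocab


def log_posteriors(words, priors, word_counts, vocab, alpha=1.0):
    """Return log p(C_k) + sum_i log p(x_i | C_k) for each class.

    The evidence p(x) is the same for every class, so it is dropped.
    alpha is the Laplace smoothing constant (an assumption of this sketch).
    """
    scores = {}
    for c, prior in priors.items():
        total = sum(word_counts[c].values())
        score = math.log(prior)
        for w in words:
            if w not in vocab:
                continue  # ignore words never seen during training
            likelihood = (word_counts[c][w] + alpha) / (total + alpha * len(vocab))
            score += math.log(likelihood)
        scores[c] = score
    return scores


def classify(words, priors, word_counts, vocab):
    """Pick the class with the highest (unnormalised) log posterior."""
    scores = log_posteriors(words, priors, word_counts, vocab)
    return max(scores, key=scores.get)


model = train(DOCS)
print(classify(["free", "money"], *model))
print(classify(["meeting", "tomorrow"], *model))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and smoothing keeps a single unseen word from forcing a class probability to zero.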
![-----------------------------------------------------](https://raw.githubusercontent.com/andreasbm/readme/master/assets/lines/rainbow.png)

The order of execution of the program files is as follows:
1) spam_detector.py
First, the spam_detector.py file must be executed to define all the functions and variables required for classification operations.
2) train.py
Then, the train.py file must be executed, which produces the model.txt file. This file imports spam_detector at the top so that the functions defined there can be used.
3) test.py
Finally, the test.py file must be executed to create the result.txt and evaluation.txt files. Like train.py, this file imports spam_detector at the top so that the functions defined there can be used.
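The exact contents of evaluation.txt are not described here. Assuming it reports standard confusion-matrix metrics for the spam class, the evaluation step could look roughly like the sketch below. The function names (`confusion_counts`, `metrics`) and the sample labels are hypothetical and are not taken from test.py.

```python
def confusion_counts(actual, predicted, positive="spam"):
    """Count true/false positives and negatives, treating `positive` as the target class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1, returning 0.0 where a ratio is undefined."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical true labels and classifier outputs for five emails.
actual = ["spam", "spam", "ham", "ham", "spam"]
predicted = ["spam", "ham", "ham", "spam", "spam"]

counts = confusion_counts(actual, predicted)
report = metrics(*counts)
print(counts)
print(report)
```

For a spam filter, precision matters because a false positive means a legitimate email was flagged as spam, while recall measures how much spam was actually caught.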
![-----------------------------------------------------](https://raw.githubusercontent.com/andreasbm/readme/master/assets/lines/rainbow.png)

Jonathan Lee, 'Notes on Naive Bayes Classifiers for Spam Filtering'. [Online].
Available: https://courses.cs.washington.edu/courses/cse312/18sp/lectures/naive-bayes/naivebayesnotes.pdf
Wikipedia.org, 'Naive Bayes Classifier'. [Online].
Available: https://en.wikipedia.org/wiki/Naive_Bayes_classifier
Youtube.com, 'Naive Bayes for Spam Detection'. [Online].
Available: https://www.youtube.com/watch?v=8aZNAmWKGfs
Youtube.com, 'Text Classification Using Naive Bayes'. [Online].
Available: https://www.youtube.com/watch?v=EGKeC2S44Rs
Manisha-sirsat.blogspot.com, 'What is Confusion Matrix and Advanced Classification Metrics?'. [Online].
Available: https://manisha-sirsat.blogspot.com/2019/04/confusion-matrix.html
Pythonforengineers.com, 'Build a Spam Filter'. [Online].
Available: https://www.pythonforengineers.com/build-a-spam-filter/