{ "cells": [ { "cell_type": "markdown", "id": "cfebebd9", "metadata": {}, "source": [ "![brainome logo](./images/brainome_logo.png)\n", "# 301 Using Predictors in Python\n", "Brainome's predictors can be easily integrated into your python application.\n", "1. Importing the predictor\n", "2. Validating a data set\n", "3. Batch Classification (DataFrame)\n", "4. Realtime Classification (Single instance)" ] }, { "cell_type": "markdown", "id": "0d121210", "metadata": {}, "source": [ "## Prerequisites\n", "This notebook assumes brainome is installed as per notebook [brainome_101_Quick_Start](brainome_101_Quick_Start.ipynb)\n", "\n", "The data sets are:\n", "* [titanic_train.csv](https://download.brainome.ai/data/public/titanic_train.csv) for training data\n", "* [titanic_validate.csv](https://download.brainome.ai/data/public/titanic_validate.csv) for validation\n", "* [titanic_predict.csv](https://download.brainome.ai/data/public/titanic_predict.csv) for predictions\n", "\n", "Predictors require numpy and optionally scipy to generate a confusion matrix." ] }, { "cell_type": "code", "execution_count": null, "id": "655a3e12", "metadata": {}, "outputs": [], "source": [ "!python3 -m pip install brainome --quiet\n", "!brainome --version\n", "# download data sets\n", "import urllib.request as request\n", "response1 = request.urlretrieve('https://download.brainome.ai/data/public/titanic_validate.csv', 'titanic_validate.csv')\n", "response2 = request.urlretrieve('https://download.brainome.ai/data/public/titanic_predict.csv', 'titanic_predict.csv')\n", "%ls -lh titanic_validate.csv titanic_predict.csv" ] }, { "cell_type": "markdown", "id": "f6f9cbda", "metadata": {}, "source": [ "## Generate a predictor\n", "The predictor filename is `predictor_301.py`" ] }, { "cell_type": "code", "execution_count": null, "id": "376ccf6e", "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "!brainome https://download.brainome.ai/data/public/titanic_train.csv -rank -y -o predictor_301.py -modelonly -q\n", "print('\\nCreated predictor_301.py')\n", "!ls -lh predictor_301.py" ] }, { "cell_type": "markdown", "id": "33177319", "metadata": {}, "source": [ "## 1. Importing the predictor\n", "Start with importing the predictor_301.py source file into your program. It also requires numpy.\n", "Calling `help(predictor)` will display the pydoc for it." ] }, { "cell_type": "code", "execution_count": null, "id": "f4dc7f42", "metadata": {}, "outputs": [], "source": [ "import numpy as np # predictors require numpy\n", "import predictor_301 as predictor\n", "help(predictor)" ] }, { "cell_type": "markdown", "id": "8f889a13", "metadata": {}, "source": [ "## 2. Validating a data set\n", "Given a test data set, the predictor will compare predictions with expected outcomes.\n", "\n", "For this exercise, we are reading the data set into a pandas data frame, your method may differ." 
] }, { "cell_type": "code", "execution_count": null, "id": "de2aa21d", "metadata": {}, "outputs": [], "source": [ "%pip install pandas --quiet\n", "import pandas as pd\n", "\n", "validate_data = pd.read_csv('titanic_validate.csv', na_values=[], na_filter=False)\n", "validate_values = validate_data.values\n", "clean_values = predictor.__preprocess_and_clean_in_memory(validate_values)\n", "count, correct_count, num_TP, num_TN, num_FP, num_FN, num_class_1, num_class_0, preds = predictor.validate(clean_values)\n", "print(' Test Predictions '.center(80, '-'))\n", "print(preds)\n", "true_labels = clean_values[:, -1]\n", "mtrx, stats = predictor.__confusion_matrix(np.array(true_labels).reshape(-1), np.array(preds).reshape(-1), True)\n", "print(' Confusion Matrix '.center(80, '-'))\n", "print(mtrx)\n", "print(' Statistics '.center(80, '-'))\n", "print(stats)" ] }, { "cell_type": "markdown", "id": "37a171af", "metadata": {}, "source": [ "## 3. Single Instance Classification\n", "Demonstrating classification of a single passenger" ] }, { "cell_type": "code", "execution_count": null, "id": "1a4d12b8", "metadata": {}, "outputs": [], "source": [ "passenger = [881,2,\"Shelley, Mrs. William (Imanita Parrish Hall)\",\"female\",25,0,1,230433,26,\"\",\"S\"]\n", "prediction = predictor.predict([passenger])[0]\n", "print(passenger[2], prediction)" ] }, { "cell_type": "markdown", "id": "72975533", "metadata": {}, "source": [ "## 3. Batch Classification\n", "Given a chunk of data, the predictor will return classification predictions for each record." ] }, { "cell_type": "code", "execution_count": null, "id": "53f7c9b4", "metadata": {}, "outputs": [], "source": [ "predict_data = pd.read_csv('titanic_predict.csv', na_values=[], na_filter=False)\n", "predict_values = predict_data.values\n", "predictions_output = predictor.predict(predict_values)\n", "print(' Batch Predictions '.center(80, '-'))\n", "print(predictions_output)" ] }, { "cell_type": "markdown", "id": "715fee81", "metadata": {}, "source": [ "## 4. Large Data Set Classification\n", "Not all data sets can be fully loaded into memory but rather must be streamed instance by instance." ] }, { "cell_type": "code", "execution_count": null, "id": "3a727df3", "metadata": {}, "outputs": [], "source": [ "import csv\n", "with open(\"./titanic_predict.csv\", \"r\") as csv_file:\n", " data_reader = csv.reader(csv_file)\n", " header = next(data_reader)\n", " first = True\n", " for row in data_reader:\n", " prediction = predictor.predict([row])[0]\n", " probability = predictor.predict([row], return_probabilities=True)\n", " if first:\n", " first = False\n", " print(header[0], 'Prediction', \"\\t\", probability[0])\n", " print(row[0], prediction, \"\\t\", probability[1])" ] }, { "cell_type": "markdown", "id": "9e36fb88", "metadata": {}, "source": [ "## Next Steps\n", "- Check out [302_Generating_Probabilities](./brainome_302_Generating_Probabilities.ipynb)\n", "- Check out [300 Put your model to work](./brainome_300_Integrating_Predictors.ipynb)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 5 }