{ "cells": [ { "cell_type": "markdown", "metadata": { "run_control": {} }, "source": [ "# MOA Notebook Example\n", "\n", "This is an example of a MOA Notebook in Java.\n", "\n", "\n", "## Prequential Evaluation Example\n", "Let’s run a very simple experiment: training a decision tree (Hoeffding Tree) on data generated by an artificial stream generator (RandomRBFGenerator).\n", "\n", "We start by importing the classes that we need and defining the stream and the learner." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false } }, "outputs": [], "source": [ "%maven nz.ac.waikato.cms.moa:moa:2018.6.0\n", "\n", "import moa.classifiers.trees.HoeffdingTree;\n", "import moa.streams.generators.RandomRBFGenerator;\n", "\n", "HoeffdingTree learner = new HoeffdingTree();\n", "RandomRBFGenerator stream = new RandomRBFGenerator();" ] }, { "cell_type": "markdown", "metadata": { "run_control": {} }, "source": [ "Now, we need to initialize the stream and the classifier:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false } }, "outputs": [], "source": [ "// Prepare the generator so that it can produce instances.\n", "stream.prepareForUse();\n", "// Give the learner the stream header (attribute and class structure).\n", "learner.setModelContext(stream.getHeader());\n", "learner.prepareForUse();" ] }, { "cell_type": "markdown", "metadata": { "run_control": {} }, "source": [ "Finally, let’s run a prequential evaluation, as in Tutorial 2 (Introduction to the API of MOA). Each instance is first used to test the model and then to train it."
] }, { "cell_type": "code", "execution_count": 21, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1000000 instances processed with 91.0458% accuracy in -2.857338446 seconds.\n" ] }, { "data": { "image/png": "", "text/plain": [ "BufferedImage@5720626f: type = 1 DirectColorModel: rmask=ff0000 gmask=ff00 bmask=ff amask=0 IntegerInterleavedRaster: width = 600 height = 400 #Bands = 3 xOff = 0 yOff = 0 dataOffset[0] 0" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%maven org.knowm.xchart:xchart:3.5.2\n", "import org.knowm.xchart.*;\n", "import moa.core.TimingUtils;\n", "import com.yahoo.labs.samoa.instances.Instance;\n", "\n", "int numInstances = 1000000;\n", "int sampleSize = 1000;\n", "boolean isTesting = true;\n", "double[] xData = new double[numInstances/sampleSize];\n", "double[] yData = new double[numInstances/sampleSize];\n", "\n", "int numberSamplesCorrect = 0;\n", "int numberSamples = 0;\n", "boolean preciseCPUTiming = TimingUtils.enablePreciseTiming();\n", "long evaluateStartTime = TimingUtils.getNanoCPUTimeOfCurrentThread();\n", "while (stream.hasMoreInstances() && numberSamples < numInstances) {\n", "    Instance trainInst = stream.nextInstance().getData();\n", "    // Test first: the learner has not yet been trained on this instance.\n", "    if (isTesting) {\n", "        if (learner.correctlyClassifies(trainInst)){\n", "            numberSamplesCorrect++;\n", "        }\n", "    }\n", "    numberSamples++;\n", "    // Record the running accuracy once every sampleSize instances.\n", "    // Counting the instance first avoids a 0/0 (NaN) value at the first sample.\n", "    if (numberSamples % sampleSize == 0){\n", "        int sampleIndex = numberSamples / sampleSize - 1;\n", "        xData[sampleIndex] = numberSamples;\n", "        yData[sampleIndex] = 100.0 * (double) numberSamplesCorrect/ (double) numberSamples;\n", "    }\n", "    // Then train on the same instance.\n", "    learner.trainOnInstance(trainInst);\n", "}\n", "double accuracy = 100.0 * (double) numberSamplesCorrect/ (double) numberSamples;\n", "double time = TimingUtils.nanoTimeToSeconds(TimingUtils.getNanoCPUTimeOfCurrentThread()- evaluateStartTime);\n", "System.out.println(numberSamples + 
\" instances processed with \" + accuracy + \"% accuracy in \"+time+\" seconds.\");\n", "\n", "XYChart chart = QuickChart.getChart(\"Prequential Evaluation\", \"#Instances\", \"Accuracy\", \"y(x)\", xData, yData);\n", "BitmapEncoder.getBufferedImage(chart);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And, we can also run a prequential Evaluation task directly." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\n", "{M}assive {O}nline {A}nalysis\n", "Version: 18.06 June 2018\n", "Copyright: (C) 2007-2018 University of Waikato, Hamilton, New Zealand\n", "Web: http://moa.cms.waikato.ac.nz/\n", "\n", " \n", "Task completed in 4.50s (CPU time)\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "learning evaluation instances,evaluation time (cpu seconds),model cost (RAM-Hours),classified instances,classifications correct (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent),Kappa M Statistic (percent),model training instances,model serialized size (bytes),tree size (nodes),tree size (leaves),active learning leaves,tree depth,active leaf byte size estimate,inactive leaf byte size estimate,byte size estimate overhead\n", "100000.0,0.514551649,0.0,100000.0,92.10000000000001,84.09118369648397,82.93736501079914,82.63736263736264,100000.0,0.0,187.0,118.0,118.0,5.0,0.0,0.0,1.0\n", "200000.0,0.898160024,0.0,200000.0,93.2,86.13619960610498,85.15283842794761,84.29561200923789,200000.0,0.0,290.0,180.0,180.0,6.0,0.0,0.0,1.0\n", "300000.0,1.260566831,0.0,300000.0,93.7,87.0415165128104,86.76470588235296,85.14150943396228,300000.0,0.0,368.0,228.0,228.0,6.0,0.0,0.0,1.0\n", "400000.0,1.676657529,0.0,400000.0,95.1,90.00701548300785,90.18036072144288,88.57808857808857,400000.0,0.0,489.0,311.0,311.0,7.0,0.0,0.0,1.0\n", 
"500000.0,2.12164797,0.0,500000.0,94.8,89.19332313626387,88.9596602972399,87.12871287128712,500000.0,0.0,598.0,370.0,370.0,7.0,0.0,0.0,1.0\n", "600000.0,2.569584639,0.0,600000.0,95.6,91.0510537384223,91.6030534351145,89.69555035128805,600000.0,0.0,687.0,428.0,428.0,7.0,0.0,0.0,1.0\n", "700000.0,2.984601928,0.0,700000.0,96.1,91.99382498090834,91.82389937106917,90.8450704225352,700000.0,0.0,792.0,497.0,497.0,8.0,0.0,0.0,1.0\n", "800000.0,3.430721144,0.0,800000.0,95.39999999999999,90.62666835114945,90.49586776859503,89.37644341801385,800000.0,0.0,924.0,584.0,584.0,8.0,0.0,0.0,1.0\n", "900000.0,3.861493675,0.0,900000.0,96.8,93.50965438909621,93.73776908023483,92.72727272727272,900000.0,0.0,1020.0,647.0,647.0,8.0,0.0,0.0,1.0\n", "1000000.0,4.413296784,0.0,1000000.0,96.7,93.26222599719055,93.25153374233128,92.28971962616822,1000000.0,0.0,1124.0,720.0,720.0,9.0,0.0,0.0,?\n" ] } ], "source": [ "import moa.DoTask;\n", "DoTask.main(\"EvaluatePrequential -l trees.HoeffdingTree -i 1000000\".split(\" \"));" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Java", "language": "java", "name": "java" }, "language_info": { "codemirror_mode": "java", "file_extension": ".java", "mimetype": "text/x-java-source", "name": "Java", "pygments_lexer": "java", "version": "10.0.1+10-Debian-4" } }, "nbformat": 4, "nbformat_minor": 2 }