{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", " \n", "
\n", "
\n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Visualization of neural networks and decision trees\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import ROOT\n", "from ROOT import TFile, TMVA, TCut" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Enable JS visualization" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "application/javascript": [ "\n", "require(['notebook'],\n", " function() {\n", " IPython.CodeCell.config_defaults.highlight_modes['magic_text/x-c++src'] = {'reg':[/^%%cpp/]};\n", " console.log(\"JupyROOT - %%cpp magic configured\");\n", " }\n", ");\n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Welcome to JupyROOT 6.09/01\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%jsmva on" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## Declarations, building training and testing trees " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more details please see this notebook." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
DataSetInfo
Dataset: tmva_class_exampleAdded class \"Signal\"
Add Tree TreeS of type Signal with 6000 events
DataSetInfo
Dataset: tmva_class_exampleAdded class \"Background\"
Add Tree TreeB of type Background with 6000 events
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "outputFile = TFile( \"TMVA.root\", 'RECREATE' )\n", "\n", "TMVA.Tools.Instance()\n", "\n", "factory = TMVA.Factory(JobName=\"TMVAClassification\", TargetFile=outputFile,\n", " V=False, Color=True, DrawProgressBar=True, Transformations=[\"I\", \"D\", \"P\", \"G\",\"D\"],\n", " AnalysisType=\"Classification\")\n", "\n", "dataset = \"tmva_class_example\"\n", "loader = TMVA.DataLoader(dataset)\n", "\n", "loader.AddVariable( \"myvar1 := var1+var2\", 'F' )\n", "loader.AddVariable( \"myvar2 := var1-var2\", \"Expression 2\", 'F' )\n", "loader.AddVariable( \"var3\", \"Variable 3\", 'F' )\n", "loader.AddVariable( \"var4\", \"Variable 4\", 'F' )\n", "\n", "loader.AddSpectator( \"spec1:=var1*2\", \"Spectator 1\", 'F' )\n", "loader.AddSpectator( \"spec2:=var1*3\", \"Spectator 2\", 'F' )\n", "\n", "if ROOT.gSystem.AccessPathName( \"./tmva_class_example.root\" ) != 0: \n", " ROOT.gSystem.Exec( \"wget https://root.cern.ch/files/tmva_class_example.root\")\n", " \n", "input = TFile.Open( \"./tmva_class_example.root\" )\n", "\n", "# Get the signal and background trees for training\n", "signal = input.Get( \"TreeS\" )\n", "background = input.Get( \"TreeB\" )\n", " \n", "# Global event weights (see below for setting event-wise weights)\n", "signalWeight = 1.0\n", "backgroundWeight = 1.0\n", "\n", "mycuts = TCut(\"\")\n", "mycutb = TCut(\"\")\n", "\n", "loader.AddSignalTree(signal, signalWeight)\n", "loader.AddBackgroundTree(background, backgroundWeight)\n", "loader.fSignalWeight = signalWeight\n", "loader.fBackgroundWeight = backgroundWeight\n", "loader.fTreeS = signal\n", "loader.fTreeB = background\n", "\n", "loader.PrepareTrainingAndTestTree(SigCut=mycuts, BkgCut=mycutb,\n", " nTrain_Signal=1000, nTrain_Background=1000, nTest_Signal=2000, nTest_Background=2000,\n", " SplitMode=\"Random\", 
NormMode=\"NumEvents\", V=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Booking methods" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more details please see this notebook." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/html": [ "
Factory Booking method: MLP
MLP
Dataset: tmva_class_exampleCreate Transformation \"N\" with events from all classes.
Transformation, Variable selection :
Input : variable 'myvar1' <---> Output : variable 'myvar1'
Input : variable 'myvar2' <---> Output : variable 'myvar2'
Input : variable 'var3' <---> Output : variable 'var3'
Input : variable 'var4' <---> Output : variable 'var4'
MLP Building Network.
Initializing weights
Factory Booking method: DNN
DNN
Dataset: tmva_class_exampleCreate Transformation \"Normalize\" with events from all classes.
Transformation, Variable selection :
Input : variable 'myvar1' <---> Output : variable 'myvar1'
Input : variable 'myvar2' <---> Output : variable 'myvar2'
Input : variable 'var3' <---> Output : variable 'var3'
Input : variable 'var4' <---> Output : variable 'var4'
Factory Booking method: BDT
DataSetFactory
Dataset: tmva_class_exampleNumber of events in input trees
Number of training and testing events
Signaltraining events1000
testing events2000
training and testing events3000
Backgroundtraining events1000
testing events2000
training and testing events3000
DataSetInfo Correlation matrix (Signal)
DataSetInfo Correlation matrix (Background)
DataSetFactory
Dataset: tmva_class_example
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "factory.BookMethod( DataLoader=loader, Method=TMVA.Types.kMLP, MethodTitle=\"MLP\", \n", " H=False, V=False, NeuronType=\"tanh\", VarTransform=\"N\", NCycles=600, HiddenLayers=\"N+5\",\n", " TestRate=5, UseRegulator=False )\n", "\n", "trainingStrategy = [{\n", " \"LearningRate\": 1e-1,\n", " \"Momentum\": 0.0,\n", " \"Repetitions\": 1,\n", " \"ConvergenceSteps\": 300,\n", " \"BatchSize\": 20,\n", " \"TestRepetitions\": 15,\n", " \"WeightDecay\": 0.001,\n", " \"Regularization\": \"NONE\",\n", " \"DropConfig\": \"0.0+0.5+0.5+0.5\",\n", " \"DropRepetitions\": 1,\n", " \"Multithreading\": True\n", " \n", " }, {\n", " \"LearningRate\": 1e-2,\n", " \"Momentum\": 0.5,\n", " \"Repetitions\": 1,\n", " \"ConvergenceSteps\": 300,\n", " \"BatchSize\": 30,\n", " \"TestRepetitions\": 7,\n", " \"WeightDecay\": 0.001,\n", " \"Regularization\": \"L2\",\n", " \"DropConfig\": \"0.0+0.1+0.1+0.1\",\n", " \"DropRepetitions\": 1,\n", " \"Multithreading\": True\n", " \n", " }, {\n", " \"LearningRate\": 1e-2,\n", " \"Momentum\": 0.3,\n", " \"Repetitions\": 1,\n", " \"ConvergenceSteps\": 300,\n", " \"BatchSize\": 40,\n", " \"TestRepetitions\": 7,\n", " \"WeightDecay\": 0.001,\n", " \"Regularization\": \"L2\",\n", " \"Multithreading\": True\n", " \n", " },{\n", " \"LearningRate\": 1e-3,\n", " \"Momentum\": 0.1,\n", " \"Repetitions\": 1,\n", " \"ConvergenceSteps\": 200,\n", " \"BatchSize\": 70,\n", " \"TestRepetitions\": 7,\n", " \"WeightDecay\": 0.001,\n", " \"Regularization\": \"NONE\",\n", " \"Multithreading\": True\n", " \n", "}]\n", "\n", "factory.BookMethod(DataLoader=loader, Method=TMVA.Types.kDNN, MethodTitle=\"DNN\", \n", " H = False, V=False, VarTransform=\"Normalize\", ErrorStrategy=\"CROSSENTROPY\",\n", " Layout=[\"TANH|100\", \"TANH|50\", \"TANH|10\", \"LINEAR\"],\n", " 
TrainingStrategy=trainingStrategy, Architecture=\"CPU\")\n", "\n", "factory.BookMethod(DataLoader=loader, Method=TMVA.Types.kBDT, MethodTitle=\"BDT\",\n", " H=False, V=False, NTrees=850, MinNodeSize=\"2.5%\", MaxDepth=3, BoostType=\"AdaBoost\", AdaBoostBeta=0.5,\n", " UseBaggedBoost=True, BaggedSampleFraction=0.5, SeparationType=\"GiniIndex\", nCuts=20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Draw Neural Network" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once a neural network has been trained, its weights are saved to an XML file and a C file. We can read the XML file back and visualize the network with the Factory.DrawNeuralNetwork function.\n", "\n", "The arguments of this function:\n", "\n", "
| Keyword | Can be used as positional argument | Default | Predefined values | Description |
|---|---|---|---|---|
| datasetName | yes, 1. | - | - | The name of the dataset |
| methodName | yes, 2. | - | - | The name of the method |
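
In the drawing, each synapse line encodes its weight: the color gives the sign and the thickness gives the magnitude. The scaling JsMVA applies is not spelled out in this notebook, so the following is only a minimal sketch, assuming a linear mapping of the absolute weight onto a line width. The helper name `synapse_style`, the two color names and the width range are all hypothetical.

```python
def synapse_style(weight, max_abs_weight, min_width=0.5, max_width=6.0):
    '''Map one synapse weight to a (color, width) pair.

    Hypothetical helper for illustration only: the colors and the linear
    width scaling are assumptions, not the values JsMVA uses.
    '''
    # One color per sign of the weight (color names are assumed).
    color = 'blue' if weight >= 0 else 'red'
    if max_abs_weight <= 0:
        # Degenerate case (all weights zero): draw the thinnest line.
        return color, min_width
    # Scale |weight| into [0, 1], then into [min_width, max_width].
    fraction = min(abs(weight) / max_abs_weight, 1.0)
    return color, min_width + fraction * (max_width - min_width)


# Example weights of four synapses (made-up numbers).
weights = [0.8, -0.2, -1.6, 0.0]
largest = max(abs(w) for w in weights)
styles = [synapse_style(w, largest) for w in weights]
```

Normalizing by the largest absolute weight keeps the thickest line at a fixed width regardless of how large the trained weights are.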
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This visualization will be interactive, and we can do the following with it:\n", "* Mouseover (node, weight): focusing\n", "* Zooming and grab and move supported\n", "* Reset: double click\n", "\n", "The synapses are drawn with 2 colors, one for positive weight and one for negative weight. The absolute value of the synapses are scaled and transformed to thickness of line between to node." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "factory.DrawNeuralNetwork(dataset, \"MLP\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Draw Deep Neural Network" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The DrawNeuralNetwork function can also visualize deep neural networks; we just have to pass \"DNN\" as the method name. For a very large network with many thousands of neurons, drawing will be slow and will need a lot of RAM, so use this function with care.\n", "\n", "This visualization is also interactive and supports the following:\n", "\n", "* Zooming and grab-and-move\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "factory.DrawNeuralNetwork(dataset, \"DNN\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Draw Decision Tree" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The trained decision trees are saved to an XML file too, so we can read the XML file back and visualize the trees. This is the purpose of the Factory.DrawDecisionTree function.\n", "\n", "The arguments of this function:\n", "\n", "
| Keyword | Can be used as positional argument | Default | Predefined values | Description |
|---|---|---|---|---|
| datasetName | yes, 1. | - | - | The name of the dataset |
| methodName | yes, 2. | - | - | The name of the method |
\n", "\n", "This function produces a small input box where you can enter the index of the tree you want to see (the number of trees is shown before the input box). After choosing the index, press the Draw button. The nodes of the tree are colored; the color is associated with signal efficiency." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The tree visualization is interactive and supports the following:\n", "\n", "* Mouseover (node, weight): shows the decision path\n", "* Zooming and grab-and-move\n", "* Reset zoomed tree: double click\n", "* Expand all closed subtrees, turn off zoom: button at the bottom of the picture\n", "* Click on a node: \n", "\n", " * hides the subtree; a node whose children are hidden gets a green border\n", " * rescaling: bigger nodes, bigger text\n", " * click again to show the subtree" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "factory.DrawDNNWeights(dataset, \"DNN\")" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "factory.DrawDecisionTree(dataset, \"BDT\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Close the factory's output file" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [], "source": [ "outputFile.Close()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python [Root]", "language": "python", "name": "Python [Root]" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.11" } }, "nbformat": 4, "nbformat_minor": 0 }