{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Inheritance\n", "\n", "ZnTrack Nodes can inherit from a shared Node base class.\n", "This is useful, for example, when you want to try out different methods of the same kind.\n", "The following example shows this with Nodes that share the same inputs and outputs but use different functions in their `run` methods." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "from zntrack import config\n", "\n", "config.nb_name = \"02_Inheritance.ipynb\"" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initialized empty Git repository in /tmp/tmp4o5jiism/.git/\n", "Initialized DVC repository.\n", "\n", "You can now commit the changes to git.\n", "\n", "+---------------------------------------------------------------------+\n", "| |\n", "| DVC has enabled anonymous aggregate usage analytics. |\n", "| Read the analytics documentation (and how to opt-out) here: |\n", "| |\n", "| |\n", "+---------------------------------------------------------------------+\n", "\n", "What's next?\n", "------------\n", "- Check out the documentation: \n", "- Get help and share ideas: \n", "- Star us on GitHub: \n" ] } ], "source": [ "# Work in a temporary directory\n", "from zntrack.utils import cwd_temp_dir\n", "temp_dir = cwd_temp_dir()\n", "\n", "!git init\n", "!dvc init" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "import zntrack" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Let us define a ``NodeBase`` which has a single input and output. 
We will use this as a base class for the following Nodes.\n", "\n", "- ``AddNumber``: Shift input by an offset\n", "- ``MultiplyNumber``: Multiply input by a factor\n", "\n", "Both of these Nodes extend the ``NodeBase`` by additional parameters." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "class NodeBase(zntrack.Node):\n", "    _name_ = \"basic_number\"\n", "\n", "    inputs: float = zntrack.zn.params()\n", "    output: float = zntrack.zn.outs()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "class AddNumber(NodeBase):\n", "    \"\"\"Shift input by an offset\"\"\"\n", "    offset: float = zntrack.zn.params()\n", "\n", "    def run(self):\n", "        self.output = self.inputs + self.offset\n", "\n", "class MultiplyNumber(NodeBase):\n", "    \"\"\"Multiply input by a factor\"\"\"\n", "    factor: float = zntrack.zn.params()\n", "\n", "    def run(self):\n", "        self.output = self.inputs * self.factor" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Running DVC command: 'stage add --name basic_number --force ...'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Creating 'dvc.yaml'\n", "Adding stage 'basic_number' in 'dvc.yaml'\n", "\n", "To track the changes with git, run:\n", "\n", "\tgit add dvc.yaml nodes/basic_number/.gitignore\n", "\n", "To enable auto staging, run:\n", "\n", "\tdvc config core.autostage true\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Jupyter support is an experimental feature! 
Please save your notebook before running this command!\n", "Submit issues to https://github.com/zincware/ZnTrack.\n", "[NbConvertApp] Converting notebook 02_Inheritance.ipynb to script\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Running stage 'basic_number':\n", "> zntrack run src.AddNumber.AddNumber --name basic_number\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[NbConvertApp] Writing 4756 bytes to 02_Inheritance.py\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Generating lock file 'dvc.lock'\n", "Updating lock file 'dvc.lock'\n", "\n", "To track the changes with git, run:\n", "\n", "\tgit add dvc.lock\n", "\n", "To enable auto staging, run:\n", "\n", "\tdvc config core.autostage true\n", "Use `dvc push` to send your updates to remote storage.\n" ] } ], "source": [ "with zntrack.Project() as project:\n", "    add_number = AddNumber(inputs=10.0, offset=15.0)\n", "project.run()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because both Nodes inherit from `NodeBase` and the node name is defined in the parent class (via `_name_`), any of these classes can be used to load the shared outputs.\n", "Keep this in mind when working with inheritance: the output was not necessarily created by the Node class it is loaded with.\n", "On the other hand, this can be handy for dependency handling.\n", "A subsequent Node can, for example, depend on the parent Node without needing to know where the values actually come from.\n", "For instance, an ML model might implement a `predict` method in the parent Node, while its children have entirely different structures.\n", "An evaluation Node that only needs the `predict` method can therefore be used with all children of the model class." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "25.0" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "NodeBase.from_rev().output" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+--------------+ \n", "| basic_number | \n", "+--------------+ \n" ] } ], "source": [ "!dvc dag" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Running DVC command: 'stage add --name basic_number --force ...'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Modifying stage 'basic_number' in 'dvc.yaml'\n", "\n", "To track the changes with git, run:\n", "\n", "\tgit add dvc.yaml\n", "\n", "To enable auto staging, run:\n", "\n", "\tdvc config core.autostage true\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[NbConvertApp] Converting notebook 02_Inheritance.ipynb to script\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Running stage 'basic_number':\n", "> zntrack run src.MultiplyNumber.MultiplyNumber --name basic_number\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[NbConvertApp] Writing 4756 bytes to 02_Inheritance.py\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Updating lock file 'dvc.lock'\n", "\n", "To track the changes with git, run:\n", "\n", "\tgit add dvc.lock\n", "\n", "To enable auto staging, run:\n", "\n", "\tdvc config core.autostage true\n", "Use `dvc push` to send your updates to remote storage.\n" ] } ], "source": [ "with zntrack.Project() as project:\n", "    multiply_number = MultiplyNumber(inputs=6.0, factor=6.0)\n", "project.run()" ] }, { 
"cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "36.0" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "NodeBase.from_rev().output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As expected, the node name remains the same, and therefore the existing Node is replaced by the new one." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+--------------+ \n", "| basic_number | \n", "+--------------+ \n" ] } ], "source": [ "!dvc dag" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Nodes as parameters\n", "\n", "Sometimes it is useful to pass a Node as a parameter, or to use the `run` method of a given Node while storing the outputs somewhere else.\n", "For example, an active learning cycle might use the model and evaluation classes, while the outputs are stored in the active learning Node.\n", "You might still want to reuse the other Nodes to avoid overhead.\n", "\n", "In the following, we use the `run` method of a `NodeBase` Node and add a dataclass-like Node that only stores parameters.\n", "Internally, ZnTrack disables all outputs of the given Node.\n", "To keep the DAG working, a `_hash = zn.Hash()` attribute is introduced.\n", "This value is computed from the parameters as well as the current timestamp and only serves as a file dependency for DVC.\n", "Adding `zn.Hash()` to any Node adds an output file but has no other effect." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "class DivideNumber(NodeBase):\n", "    \"\"\"Multiply input by ``divider`` (note: multiplies despite the class name)\"\"\"\n", "    divider: float = zntrack.zn.params()\n", "\n", "    def run(self):\n", "        # Multiplication, not division: the recorded result\n", "        # 360.0 = 60 + 10 * (10 * 3) depends on it.\n", "        self.output = self.inputs * self.divider\n", "\n", "\n", "class Polynomial(zntrack.Node):\n", "    a0: float = zntrack.zn.params()\n", "    a1: float = zntrack.zn.params()\n", "\n", "class ManipulateNumber(zntrack.Node):\n", "    inputs: float = zntrack.zn.params()\n", "    output: float = zntrack.zn.outs()\n", "    value_handler: NodeBase = zntrack.zn.nodes()\n", "    polynomial: Polynomial = zntrack.zn.nodes()\n", "\n", "    def run(self):\n", "        # use the run method of the passed Node\n", "        self.value_handler.inputs = self.inputs\n", "        self.value_handler.run()\n", "        self.output = self.value_handler.output\n", "        # apply the first-order polynomial a0 + a1 * x\n", "        self.output = self.polynomial.a0 + self.polynomial.a1 * self.output" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Running DVC command: 'stage add --name ManipulateNumber --force ...'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Adding stage 'ManipulateNumber' in 'dvc.yaml'\n", "\n", "To track the changes with git, run:\n", "\n", "\tgit add nodes/ManipulateNumber/.gitignore dvc.yaml\n", "\n", "To enable auto staging, run:\n", "\n", "\tdvc config core.autostage true\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Running DVC command: 'stage add --name ManipulateNumber_polynomial --outs ...'\n" ] }, { "name": "stdout", 
"output_type": "stream", "text": [ "Adding stage 'ManipulateNumber_polynomial' in 'dvc.yaml'\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Could not create .gitignore entry in /tmp/tmp4o5jiism/nodes/ManipulateNumber_polynomial/.gitignore. 
DVC will attempt to create .gitignore entry again when the stage is run.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "To track the changes with git, run:\n", "\n", "\tgit add dvc.yaml\n", "\n", "To enable auto staging, run:\n", "\n", "\tdvc config core.autostage true\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Running DVC command: 'stage add --name ManipulateNumber_value_handler --outs ...'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Adding stage 'ManipulateNumber_value_handler' in 'dvc.yaml'\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Could not create .gitignore entry in /tmp/tmp4o5jiism/nodes/ManipulateNumber_value_handler/.gitignore. DVC will attempt to create .gitignore entry again when the stage is run.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "To track the changes with git, run:\n", "\n", "\tgit add dvc.yaml\n", "\n", "To enable auto staging, run:\n", "\n", "\tdvc config core.autostage true\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[NbConvertApp] Converting notebook 02_Inheritance.ipynb to script\n", "[NbConvertApp] Writing 4756 bytes to 02_Inheritance.py\n", "[NbConvertApp] Converting notebook 02_Inheritance.ipynb to script\n", "[NbConvertApp] Writing 4756 bytes to 02_Inheritance.py\n", "[NbConvertApp] Converting notebook 02_Inheritance.ipynb to script\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Running stage 'ManipulateNumber_polynomial':\n", "> zntrack run src.Polynomial.Polynomial --name ManipulateNumber_polynomial --hash-only\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[NbConvertApp] Writing 4756 bytes to 02_Inheritance.py\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Updating lock file 'dvc.lock'\n", "\n", "Running stage 'ManipulateNumber_value_handler':\n", "> zntrack run src.DivideNumber.DivideNumber --name ManipulateNumber_value_handler --hash-only\n", 
"Updating lock file 'dvc.lock'\n", "\n", "Running stage 'ManipulateNumber':\n", "> zntrack run src.ManipulateNumber.ManipulateNumber --name ManipulateNumber\n", "Updating lock file 'dvc.lock'\n", "\n", "Stage 'basic_number' didn't change, skipping\n", "\n", "To track the changes with git, run:\n", "\n", "\tgit add nodes/ManipulateNumber_value_handler/.gitignore nodes/ManipulateNumber_polynomial/.gitignore dvc.lock\n", "\n", "To enable auto staging, run:\n", "\n", "\tdvc config core.autostage true\n", "Use `dvc push` to send your updates to remote storage.\n" ] } ], "source": [ "value_handler=DivideNumber(divider=3.0, inputs=None)\n", "polynomial=Polynomial(a0=60.0, a1=10.0)\n", "\n", "with zntrack.Project() as project:\n", " manipulate_number = ManipulateNumber(\n", " inputs=10.0,\n", " value_handler=value_handler,\n", " polynomial=polynomial,\n", " )\n", "project.run()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "manipulate_number.load()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "360.0" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "manipulate_number.output" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+--------------+ \n", "| basic_number | \n", "+--------------+ \n", "+-----------------------------+ +--------------------------------+\n", "| ManipulateNumber_polynomial | | ManipulateNumber_value_handler |\n", "+-----------------------------+ +--------------------------------+\n", " **** ***** \n", " **** **** \n", " *** *** \n", " +------------------+ \n", " | ManipulateNumber | \n", " +------------------+ \n" ] } ], "source": [ "!dvc 
dag" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "temp_dir.cleanup()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.9" } }, "nbformat": 4, "nbformat_minor": 4 }