{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to working with NIX" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For an online introduction please see \n", "- the [nixpy tutorial page](http://g-node.github.io/nixpy/tutorial.html) and\n", "- the [nixio readthedocs page](https://nixio.readthedocs.io) (c++ implementation of NIX)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook is setup to be used with Python 3(.6+). Also to properly run this notebook the following libraries need to be installed:\n", "- `pip install numpy`\n", "- `pip install matplotlib`\n", "- `pip install nixio==1.5.0b3`\n", "\n", "Note: nixio 1.5.0b3 is a beta release with many new exciting features of NIX. As of the time of the presentation (24.07.2019) these features have not made it into the main NIX release. So if you are using this notebook at a later point in time, installing via `pip install nixio` should be enough." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Storing data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When storing data, we have two main requirements:\n", "\n", "1. We want to be able to store n-dimensional data structures.\n", "2. The data structures must be self-explanatory, that is, they must contain sufficient information to draw a basic plot of the data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](https://nixio.readthedocs.io/en/latest/_images/regular_sampled.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Considering the simple plot above, we can list all information that it shows and by extension, that needs to be stored in order to reproduce it.\n", "\n", "- the data (voltage measurements)\n", "- the y-axis labeling, i.e. label (voltage) and unit (mV)\n", "- the x-axis labeling, i.e. 
label (time) and unit (s)\n", "- the x-position for each data point\n", "- a title/legend\n", "\n", "In this case, as in most others, it would be inefficient to store the x- and y-position of each data point. The voltage was measured at regular (time) intervals. Thus, we only need to store the measured values and a definition of the x-axis consisting of an offset, the sampling interval, a label, and a unit.\n", "\n", "This is exactly the approach chosen in NIX. For each dimension of the data a dimension descriptor must be given. In NIX we define three (and a half) dimension descriptors:\n", "\n", "1. SampledDimension: Used if a dimension is sampled at regular intervals.\n", "2. RangeDimension: Used if a dimension is sampled at irregular intervals. A special case of the RangeDimension, the AliasRangeDimension, is used when e.g. event times are stored.\n", "3. SetDimension: Used for dimensions that represent categories rather than physical quantities.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Some data to store" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we can store any data, we need to have it lying around somewhere. Let's re-create the example data for the figure we saw above and then see how we can store this data in a NIX file." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's create some example data:\n", "import numpy as np\n", "\n", "freq = 5.0\n", "samples = 1000\n", "sample_interval = 0.001\n", "time = np.arange(samples)\n", "voltage = np.sin(2 * np.pi * time * freq/samples)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's quickly check what the data we will store actually looks like\n", "# The next line is a Jupyter magic command and will allow us to see plots. 
It only works inside a notebook environment.\n", "%matplotlib notebook\n", "\n", "import matplotlib.pyplot as plot\n", "\n", "plot.plot(time*sample_interval, voltage)\n", "plot.xlabel('Time [s]')\n", "plot.ylabel('Voltage [mV]')\n", "\n", "plot.show()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is perfect data; we would like to keep it and store it in a file. So let's persist this wonderful data in a NIX file." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The DataArray" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The DataArray is the most central entity of the NIX data model. Like almost all other NIX entities, it requires a name and a type. Neither is restricted, but names must be unique inside a Block. Type information can be used to introduce semantic meaning and domain-specificity. Upon creation, a unique ID will be assigned to the DataArray.\n", "\n", "The DataArray stores the actual data together with label and unit. In addition, the DataArray needs a dimension descriptor for each dimension. The following snippet shows how to create a DataArray and store data in it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import nixio\n", "\n", "# First create a file we'll use to work with\n", "\n", "# Files can be opened in FileMode \"ReadOnly\", \"ReadWrite\" and \"Overwrite\"\n", "# ReadOnly ... Opens an existing file for reading\n", "# ReadWrite ... Opens an existing file for editing or creates a new file\n", "# Overwrite ... 
Truncates and opens an existing file or creates a new file\n", "f = nixio.File.open('Tutorial.nix', nixio.FileMode.Overwrite)\n", "\n", "# Please note that NIX works on an open file and reads and writes directly from and to this file.\n", "# Always close the file using 'f.close()' when you are done.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see in the [NIX data model](https://github.com/G-Node/nix/wiki/Model-Definition), NIX files are hierarchically structured. Data is stored in 'DataArrays'. DataArrays are contained in 'Blocks'. When we want to create a DataArray, we first need to create at least one Block, which will contain the DataArray." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's check the blocks currently defined in our file; the list should be empty\n", "f.blocks\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's see how we can create a block in our file; we'll use the handy Python help function to get more information\n", "help(f.create_block)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# \"name\" and \"type\" of a block can be used to filter and find our blocks later on when the file contains more content\n", "block = f.create_block(name=\"basic_examples\", type_=\"examples\")\n", "\n", "# Please note at this point that the 'name' of any NIX entity, e.g. Blocks, DataArrays, etc., has to be unique\n", "# since it can be used to find and return this exact entity via the 'name'.\n", "# The 'type' can also be used to find entities, but it does not need to be unique. 
You can use 'name' to uniquely\n", "# identify a particular entity and use 'type' to find groups of related entities." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Great, we have an empty block\n", "block" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# And this block resides within our file\n", "f.blocks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we are finally set up to put our data in our file!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# First let's check how we can actually create a DataArray\n", "help(block.create_data_array)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Now we create the DataArray within the Block created above and with the data\n", "# We also add the appropriate labels immediately.\n", "\n", "da = block.create_data_array(name=\"data_regular\", array_type=\"sine\", data=voltage)\n", "da.label = \"voltage\"\n", "da.unit = \"mV\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Now we will also add the appropriate Dimension to this DataArray, so it can be correctly interpreted for\n", "# later plotting. We will look into the different Dimensions in a second.\n", "\n", "# Note that we should always add dimensions in the order x, y, z ... 
when thinking in plot terms\n", "# This is necessary to later properly interpret data without knowing the actual structure of a DataArray.\n", "\n", "# First we check how to properly create the Dimension we need\n", "help(da.append_sampled_dimension)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# And let's add the Dimension of our X axis to our DataArray:\n", "dim = da.append_sampled_dimension(sample_interval)\n", "dim.label = \"time\"\n", "dim.unit = \"s\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We also want to add a Dimension to our Y axis to make the DataArray consistent even if we do not add\n", "# any additional annotations. We will see what a set dimension is later on.\n", "dim_set = da.append_set_dimension()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the example shown above, the NIX library will figure out the dimensionality of the data, the shape of the data and its type. The data type and the dimensionality (i.e. the number of dimensions) are fixed once the DataArray has been created. The actual size of the DataArray can be changed during the lifetime of the entity.\n", "\n", "In case you need more control, DataArrays can be created empty for later filling, e.g. during data acquisition." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Now let's see if we can access our data and do something useful with it, e.g. 
plot it:\n", "plot_data = f.blocks['basic_examples'].data_arrays['data_regular']\n", "plot_data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot_data[:5]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's check the dimensionality of our data\n", "plot_data.dimensions" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Since we only stored the sampling interval with the first dimension, we save quite a bit of space\n", "\n", "dim = plot_data.dimensions[0]\n", "dim.sampling_interval" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compared to the original time array:\n", "time" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's plot all data from the file using only the information provided by the file\n", "\n", "y = plot_data[:]\n", "# The sampled dimension's axis function applies its interval to the requested number of samples to recreate the original time axis\n", "x = plot_data.dimensions[0].axis(y.shape[0])\n", "\n", "plot.figure(figsize=(10,5))\n", "plot.plot(x, y, '-')\n", "plot.xlabel(\"%s [%s]\" % (dim.label, dim.unit))\n", "plot.ylabel(\"%s [%s]\" % (plot_data.label, plot_data.unit))\n", "plot.title(\"%s/%s\" % (plot_data.name, plot_data.type))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Well, that was already nice. As you have seen, in the example we dealt with regularly sampled data. What do we do if we have data that is not regularly sampled? 
As mentioned at the beginning, NIX supports\n", "- regularly sampled data\n", "- irregularly sampled data\n", "- set (event) data\n", "- one dimensional data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's create some irregularly sampled data and store it\n", "\n", "duration = 1.0\n", "interval = 0.02\n", "time_points = np.around(np.cumsum(np.random.poisson(interval*1000, int(1.5*duration/interval)))/1000., 3)\n", "time_points = time_points[time_points <= duration]\n", "\n", "data_points = np.sin(5 * np.arange(0, time_points[-1] * 2 * np.pi, 0.001))\n", "data_points = data_points[np.asarray(time_points / 0.001 * 2 * np.pi, dtype=int)]\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check the block we want to save this data in:\n", "block" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_irr = block.create_data_array(name=\"data_irregular\", array_type=\"sine\", data=data_points)\n", "data_irr.label = \"Voltage\"\n", "data_irr.unit = \"mV\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's add our x dimension\n", "dim = data_irr.append_range_dimension(time_points)\n", "dim.label = \"time\"\n", "dim.unit = \"s\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# And our y dimension\n", "dim_set = data_irr.append_set_dimension()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's plot our data again\n", "plot_data = f.blocks['basic_examples'].data_arrays['data_irregular']\n", "\n", "x_dim = plot_data.dimensions[0]\n", "x = list(x_dim.ticks)\n", "\n", "y = plot_data[:]\n", "\n", "plot.figure(figsize=(10,5))\n", "plot.plot(x, y, '-o')\n", "plot.xlabel(\"%s [%s]\" % (x_dim.label, x_dim.unit))\n", "plot.ylabel(\"%s [%s]\" % (plot_data.label, 
plot_data.unit))\n", "plot.title(\"%s/%s\" % (plot_data.name, plot_data.type))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Next we will store some basic set or \"event\" data\n", "\n", "data_points = [281, 293, 271, 300, 285, 150]\n", "\n", "data_event = block.create_data_array(name=\"data_event\", array_type=\"event\", data=data_points)\n", "data_event.label = \"temperature\"\n", "data_event.unit = \"K\"\n", "\n", "# Add x dimension\n", "dim = data_event.append_set_dimension()\n", "dim.labels = [\"Response A\", \"Response B\", \"Response C\", \"Response D\", \"Response E\", \"Response F\"]\n", "\n", "# Add y dimension\n", "dim_set = data_event.append_set_dimension()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# And let's see how we can plot this\n", "plot_data = f.blocks['basic_examples'].data_arrays['data_event']\n", "\n", "x_dim = plot_data.dimensions[0]\n", "y = plot_data[:]\n", "index = np.arange(len(y))\n", "\n", "plot.figure(figsize=(10,5))\n", "plot.bar(index, y)\n", "plot.xticks(index, x_dim.labels)\n", "plot.ylabel(\"%s [%s]\" % (plot_data.label, plot_data.unit))\n", "plot.title(\"%s/%s\" % (plot_data.name, plot_data.type))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Multiple related signals in one DataArray - Multidimensional data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we know how to save two-dimensional data in a DataArray. Since we can add as many dimension descriptors as the data requires, NIX also supports multidimensional data and is able to properly describe it. As an example, one could save a 2D image including its different color channels in one DataArray." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another use case would be to store different time series data together in one DataArray." 
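] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When several signals share a common axis they can be stacked into one 2-D array: the first dimension then indexes the signal (described by a SetDimension) and the second indexes the common time axis (a SampledDimension). The following plain-numpy sketch only illustrates the indexing this implies; the values are made up and no NIX calls are involved." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "# two hypothetical signals sharing one time axis (values are made up)\n", "trace_a = np.array([0.0, 1.0, 2.0])\n", "trace_b = np.array([5.0, 6.0, 7.0])\n", "stacked = np.vstack((trace_a, trace_b))  # shape (2, 3): (signal, time)\n", "\n", "sample_interval = 0.001\n", "i, j = 1, 2  # signal index, sample index\n", "value = stacked[i, j]  # trace_b at time j * sample_interval\n", "value"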
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's create data for two related time series and store them together\n", "# ---- MOCK DATA; the code can be safely ignored --------\n", "freq = 5.0\n", "samples = 1000\n", "sample_interval = 0.001\n", "time = np.arange(samples)\n", "voltage_trace_A = np.sin(2 * np.pi * time * freq/samples)\n", "voltage_trace_B = np.cos(2 * np.pi * time * freq/samples)\n", "\n", "# We use a numpy function that will stack both signals\n", "voltage_stacked = np.vstack((voltage_trace_A, voltage_trace_B))\n", "# ---- MOCK DATA end --------\n", "\n", "# Let's create a new DataArray with our multi-dimensional data\n", "data_related = block.create_data_array(name=\"data_multi_dimension\", array_type=\"multi\", data=voltage_stacked)\n", "data_related.label = \"voltage\"\n", "data_related.unit = \"mV\"\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# To properly describe the DataArray we need to add two dimensions\n", "# First we describe the depth of the stacked arrays\n", "dim_set = data_related.append_set_dimension()\n", "# Take care to add the labels in the order the arrays were stacked above.\n", "dim_set.labels = [\"Trace_A\", \"Trace_B\"]\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Second we add the dimension that is common to both stacked arrays of data and describes time\n", "dim_sample = data_related.append_sampled_dimension(sample_interval)\n", "dim_sample.label = \"time\"\n", "dim_sample.unit = \"s\"\n", "\n", "# And add the y dimension\n", "dim_set = data_related.append_set_dimension()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's harvest the fruits of our labour\n", "plot_data = f.blocks['basic_examples'].data_arrays['data_multi_dimension']\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": 
{ "scrolled": false }, "outputs": [], "source": [ "dim_set = plot_data.dimensions[0]\n", "dim_sampled = plot_data.dimensions[1]\n", "\n", "# We need to know the dimension of the x-axis, so we compute the \n", "# timepoints from one of the stored arrays and the sampled dimension interval\n", "data_points_A = plot_data[0, :] # Here we access the first of the multidimensional arrays\n", "\n", "time_points = dim_sampled.axis(data_points_A.shape[0])\n", "\n", "plot.figure(figsize=(10,5))\n", "\n", "# Now we add as many plots as we have set dimensions\n", "for i, label in enumerate(dim_set.labels):\n", " plot.plot(time_points, plot_data[i, :], label=label)\n", "\n", "plot.xlabel(\"%s [%s]\" % (dim_sampled.label, dim_sampled.unit))\n", "plot.ylabel(\"%s [%s]\" % (plot_data.label, plot_data.unit))\n", "plot.title(\"%s/%s\" % (plot_data.name, plot_data.type))\n", "plot.legend()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What we have seen so far:\n", "- we can save different DataArrays that belong to the same experiment in one file in a structured fashion\n", "- we can describe and save different kinds of data to file\n", "- we can add labels and units directly to the data\n", "- we can save multidimensional data\n", "- we can save a bit of space in case of sampled data\n", "- we can better understand the dimensionality of the stored data since we spell out the kind of dimensions which\n", " makes it easier to interpret it." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Working with multiple data in the same file - tagging points and regions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Saving data together with annotations is something you can easily do with Matlab, and with additional work in Python as well.\n", "NIX does that too, and it provides additional features to continue working on this initial data and save the analyzed data in relation to the initial data in a meaningful way.\n", "- \"Tag\" regions of interest in a DataArray\n", "- Use the same tag in multiple related DataArrays, e.g. in MultiElectrodeArrays\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The DataArrays store data, but this is not all that is needed to store scientific data. We may want to highlight points or regions in the data and link them to further information.\n", "\n", "This is done using the Tag and the MultiTag, for tagging single or multiple points or regions, respectively.\n", "\n", "The basic idea is that the Tag defines the position (and extent) with which it refers to points (or regions) in the data. A tag can point to several DataArrays at once. These are mere links that are stored in the list of references. The following figure illustrates how a MultiTag links two DataArrays to create a new construct." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](https://nixio.readthedocs.io/en/latest/_images/mtag_concept.png)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let us create a new block to illustrate tagged data\n", "block_tag = f.create_block(name=\"tag_examples\", type_=\"examples\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Referencing a single point or region in a DataArray" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To reference only a single point or region, we can use a NIX tag. 
The NIX tag is a simpler form of the MultiTag that we will cover in a moment." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We will create some more elaborate example data to make a point\n", "# For this we need some equally elaborate code, which can be safely ignored.\n", "\n", "# This code will create some mock membrane voltage traces for us\n", "\n", "# ---- MOCK CODE AND DATA; the code can be safely ignored --------\n", "\n", "class LIF(object):\n", "    def __init__(self, stepsize=0.0001, offset=1.6, tau_m=0.025, tau_a=0.02, da=0.0, D=3.5):\n", "        self.stepsize = stepsize  # simulation stepsize [s]\n", "        self.offset = offset  # offset current [nA]\n", "        self.tau_m = tau_m  # membrane time constant [s]\n", "        self.tau_a = tau_a  # adaptation time constant [s]\n", "        self.da = da  # increment in adaptation current [nA]\n", "        self.D = D  # noise intensity\n", "        self.v_threshold = 1.0  # spiking threshold\n", "        self.v_reset = 0.0  # reset voltage after spiking\n", "        self.i_a = 0.0  # current adaptation current\n", "        self.v = self.v_reset  # current membrane voltage\n", "        self.t = 0.0  # current time [s]\n", "        self.membrane_voltage = []\n", "        self.spike_times = []\n", "\n", "    def _reset(self):\n", "        self.i_a = 0.0\n", "        self.v = self.v_reset\n", "        self.t = 0.0\n", "        self.membrane_voltage = []\n", "        self.spike_times = []\n", "\n", "    def _lif(self, stimulus, noise):\n", "        \"\"\"\n", "        Euler solution of the membrane equation with adaptation current and noise\n", "        \"\"\"\n", "        self.i_a -= self.stepsize/self.tau_a * self.i_a\n", "        self.v += self.stepsize * (-self.v + stimulus + noise + self.offset - self.i_a)/self.tau_m\n", "        self.membrane_voltage.append(self.v)\n", "\n", "    def _next(self, stimulus):\n", "        \"\"\"\n", "        workhorse which delegates to the Euler step and collects the spike times\n", "        \"\"\"\n", "        noise = self.D * (np.random.rand() - 0.5)  # uniform noise in [-D/2, D/2)\n", "        self._lif(stimulus, noise)\n", "        self.t += 
self.stepsize\n", "        if self.v > self.v_threshold and len(self.membrane_voltage) > 1:\n", "            self.v = self.v_reset\n", "            self.membrane_voltage[-1] = 2.0\n", "            self.spike_times.append(self.t)\n", "            self.i_a += self.da\n", "\n", "    def run_const_stim(self, steps, stimulus):\n", "        \"\"\"\n", "        lif simulation with constant stimulus.\n", "        \"\"\"\n", "        self._reset()\n", "        for i in range(steps):\n", "            self._next(stimulus)\n", "        time = np.arange(len(self.membrane_voltage))*self.stepsize\n", "        return time, np.array(self.membrane_voltage), np.array(self.spike_times)\n", "\n", "    def run_stimulus(self, stimulus):\n", "        \"\"\"\n", "        lif simulation with a predefined stimulus trace.\n", "        \"\"\"\n", "        self._reset()\n", "        for s in stimulus:\n", "            self._next(s)\n", "        time = np.arange(len(self.membrane_voltage))*self.stepsize\n", "        return time, np.array(self.membrane_voltage), np.array(self.spike_times)\n", "\n", "    def __str__(self):\n", "        out = '\\n'.join([\"stepsize: \\t\" + str(self.stepsize),\n", "                         \"offset:\\t\\t\" + str(self.offset),\n", "                         \"tau_m:\\t\\t\" + str(self.tau_m),\n", "                         \"tau_a:\\t\\t\" + str(self.tau_a),\n", "                         \"da:\\t\\t\" + str(self.da),\n", "                         \"D:\\t\\t\" + str(self.D),\n", "                         \"v_threshold:\\t\" + str(self.v_threshold),\n", "                         \"v_reset:\\t\" + str(self.v_reset)])\n", "        return out\n", "\n", "    def __repr__(self):\n", "        return self.__str__()\n", "\n", "lif_model = LIF()\n", "time, voltage, spike_times = lif_model.run_const_stim(10000, 0.005)\n", "\n", "# ---- MOCK CODE AND DATA end --------\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This is what our time data looks like:\n", "time[:10]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This is what our voltage data looks like:\n", "voltage[:10]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Our assumption is that we analysed the 
voltage traces and identified the times when neurons were spiking.\n", "# The spike times are found in the third piece of mock data and look like:\n", "spike_times" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# With the mock membrane voltage traces we can now create a new DataArray on our Block\n", "data = block_tag.create_data_array(name=\"membrane_voltage_A\", array_type=\"regular_sampled\", data=voltage)\n", "data.label = \"membrane voltage\"\n", "data.unit = \"mV\"\n", "\n", "# As we are used to by now, we add the time dimension as a sampled dimension with the sample interval\n", "dim = data.append_sampled_dimension(time[1]-time[0])\n", "dim.label = \"time\"\n", "dim.unit = \"s\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Now we want to store the data from our analysis step, the identified spike times. We store them in a separate\n", "# DataArray on the same Block, right next to our initial data.\n", "spike_data = block_tag.create_data_array(name=\"spike_times_A\", array_type=\"set\", data=spike_times)\n", "# The analysed data set needs to have the same dimensionality as the initial data set so it can be linked\n", "# via a tag or multi tag. Therefore we add two dimensions; they don't need to contain data, since it is\n", "# assumed that the analysed data will map to the x-axis of the initial data.\n", "spike_data.append_set_dimension()\n", "spike_data.append_set_dimension()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We want to make sure that anyone opening this file will know that \"spike_times_A\"\n", "# was derived from the DataArray \"membrane_voltage_A\".\n", "# We can do that by connecting them via a \"MultiTag\"\n", "\n", "# We first create the MultiTag on the same Block right next to our two DataArrays. 
Let's see how we can do that:\n", "help(block_tag.create_multi_tag)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We create the multi tag with the derived spike data\n", "multi_tag = block_tag.create_multi_tag(name=\"tag_A\", type_=\"spike_times\", positions=spike_data)\n", "\n", "# Now we hook the spike data up to the original data\n", "multi_tag.references.append(data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# And now we see how these two data sets can be plotted together\n", "# To interpret and plot tagged data we only need the tag; we do not even need to know the DataArrays themselves.\n", "plot_tag = f.blocks['tag_examples'].multi_tags['tag_A']\n", "\n", "# We read in the initial data from the multi tag\n", "init_data = plot_tag.references[0]\n", "# Note that \"plot_tag.references\" returns a list, since a tag could reference multiple original DataArrays\n", "\n", "# We read in the spike times from the multi tag\n", "spike_times = plot_tag.positions" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "# Now we prepare both initial and analysed data for plotting\n", "dim_sampled = init_data.dimensions[0] # We again reconstruct the time axis\n", "time_points = dim_sampled.axis(init_data.shape[0])\n", "\n", "plot.figure(figsize=(10,5))\n", "plot.plot(time_points, init_data[:], label=init_data.name)\n", "plot.scatter(spike_times[:], np.ones(spike_times[:].shape)*np.max(init_data), color='red', label=spike_times.name)\n", "\n", "plot.xlabel(\"%s [%s]\" % (dim_sampled.label, dim_sampled.unit))\n", "plot.ylabel(\"%s [%s]\" % (init_data.label, init_data.unit))\n", "plot.title(\"%s/%s\" % (init_data.name, init_data.type))\n", "plot.legend()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can extract information from multiple steps of analysis and are able to plot the original data and 
the analysis results without having to know or directly access the DataArrays that contain the actual data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Data and data annotation in the same file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NIX not only allows saving initial data and analysed data within the same file. It also allows creating structured annotations of the experiments that were conducted, and connects this information directly to the data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Metadata in NIX files is stored in the odML format and is saved in parallel to the actual \"DataTree\" in a \"MetadataTree\", but can easily be connected to data in the DataTree." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "odML is a hierarchically structured data format that provides grouping in nestable 'Sections' and stores information in 'Property'-'Value' pairs." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let us annotate the DataArray in our last example.\n", "\n", "# As we can see, we have not stored any metadata in our current file yet.\n", "f.sections" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's check how we can create a new section:\n", "help(f.create_section)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# First we need to create a Section that can contain our annotations\n", "section = f.create_section(name=\"tag_examples\", type_=\"general_section\")\n", "\n", "f.sections" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This section can contain further sections as well as properties:\n", "section.sections" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "section.props" ] }, { "cell_type": "code", "execution_count": null, 
"metadata": {}, "outputs": [], "source": [ "# Lets store additional information about the initial data in our tag example.\n", "sub_sec = section.create_section(name=\"subject\", type_=\"experiment_A\")\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Lets add some properties to this section\n", "help(sub_sec.create_property)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "prop = sub_sec.create_property(name=\"species\", values_or_dtype=\"Mus Musculus\")\n", "prop = sub_sec.create_property(name=\"age\", values_or_dtype=\"4\")\n", "prop.unit = \"weeks\"\n", "prop = sub_sec.create_property(name=\"subjectID\", values_or_dtype=\"78376446-f096-47b9-8bfe-ce1eb43a48dc\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Lets check what we have so far:\n", "f.sections" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# One section that will describe our tag_examples\n", "f.sections['tag_examples'].sections" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A subsection that contains subject related information\n", "f.sections['tag_examples'].sections['subject'].props" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We can now connect the section describing our tag_example directly to the MultiTag that references both \n", "# the initial as well as the analysed data.\n", "\n", "multi_tag = f.blocks['tag_examples'].multi_tags['tag_A']\n", "multi_tag.metadata = f.sections['tag_examples']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Now when we look at the data via a MultiTag we can directly access all metadata that has been attached to it.\n", "# E.g. 
get information about the subject the experiment was conducted with\n", "multi_tag.metadata.sections['subject']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We can also attach the same section to the initial DataArray\n", "init_data = f.blocks['tag_examples'].data_arrays['membrane_voltage_A']\n", "init_data.metadata = f.sections['tag_examples']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# And we can also find it in reverse: we can select a section and find all data that is connected to it\n", "sec = f.sections['tag_examples']\n", "sec.referring_data_arrays" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sec.referring_multi_tags" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# And finally we close our file.\n", "f.close()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }