{ "cells": [ { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "# Rearranging and Filtering Binned Data\n", "\n", "## Introduction\n", "\n", "Event filtering refers to the process of removing or extracting a subset of events based on some criterion, such as the temperature of the measured sample at the time an event was detected.\n", "Instead of extracting based on a single parameter value or interval, we may also want to rearrange data based on the parameter value, providing quick and convenient access to the parameter-dependence of our data.\n", "Scipp's binned data can be used for both of these purposes.\n", "\n", "The [Quick Reference](#Quick-Reference) below provides a brief overview of the options.\n", "A more detailed walkthrough based on actual data can be found in the [Full example](#Full-example)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Quick Reference\n", "\n", "### Extract events matching parameter value\n", "\n", "Use [label-based indexing on the bins property](../../generated/classes/scipp.Bins.rst#scipp.Bins).\n", "This works similarly to regular [label-based indexing](../slicing.ipynb#Label-based-indexing) but operates on the unordered bin contents.\n", "Example:\n", "\n", "```python\n", "param_value = sc.scalar(1.2, unit='m')\n", "filtered = da.bins['param', param_value]\n", "```\n", "\n", "- The output data array has the same dimensions as the input `da`.\n", "- `filtered` contains a *copy* of the filtered events.\n", "\n", "### Extract events falling into a parameter interval\n", "\n", "Use [label-based indexing on the bins property](../../generated/classes/scipp.Bins.rst#scipp.Bins).\n", "This works similarly to regular [label-based indexing](../slicing.ipynb#Label-based-indexing) but operates on the unordered bin contents.\n", "Example:\n", "\n", "```python\n", "start = sc.scalar(1.2, unit='m')\n", "stop = sc.scalar(1.3, unit='m')\n", "filtered = da.bins['param', start:stop]\n", "```\n", "\n", "- The output 
data array has the same dimensions as the input `da`.\n", "- `filtered` contains a *copy* of the filtered events.\n", "- Note that, as usual, the upper bound of the interval (here $1.3~\\text{m}$) is *not* included.\n", "\n", "### Split into bins based on a discrete event parameter\n", "\n", "Use [scipp.group](../../generated/functions/scipp.group.rst).\n", "Example:\n", "\n", "```python\n", "split = da.group('param')\n", "```\n", "\n", "- The output data array has a new dimension `'param'` in addition to the dimensions of the input.\n", "- `split` contains a *copy* of the reordered events.\n", "- Pass an explicit variable to `group` listing desired groups to limit what is included in the output.\n", "\n", "### Split into bins based on a continuous event parameter\n", "\n", "Use [scipp.bin](../../generated/functions/scipp.bin.rst).\n", "Example:\n", "\n", "```python\n", "split = da.bin(param=10)\n", "```\n", "\n", "- The output data array has a new dimension `'param'` in addition to the dimensions of the input.\n", "- `split` contains a *copy* of the reordered events.\n", "- Provide an explicit variable to `bin` to limit the parameter interval that is included in the output, or for fine-grained control over the sub-intervals.\n", "\n", "### Compute derived event parameters for subsequent extracting or splitting\n", "\n", "Use [scipp.transform_coords](../../generated/functions/scipp.transform_coords.rst).\n", "Example:\n", "\n", "```python\n", "da2 = da.transform_coords(derived_param=lambda p1, p2: p1 + p2)\n", "```\n", "\n", "`da2` can now be used with any of the methods for extracting or splitting data described above.\n", "The intermediate variable can also be omitted, and we can directly extract or split the result:\n", "\n", "```python\n", "filtered = da.transform_coords(derived_param=lambda p1, p2: p1 + p2) \\\n", " .bin(derived_param=10)\n", "```\n", "\n", "### Compute derived event parameters from time-series or other metadata\n", "\n", "In practice, events are 
often tagged with a timestamp, which can be used to look up parameter values from, e.g., a time-series log given by a data array with a single dimension and a coordinate whose name matches that of the timestamp coordinate.\n", "Use [scipp.lookup](../../generated/functions/scipp.lookup.rst) with [scipp.transform_coords](../../generated/functions/scipp.transform_coords.rst). Example:\n", "\n", "```python\n", "temperature = da.attrs['sample_temperature'].value # temperature value time-series\n", "interp_temperature = sc.lookup(temperature, mode='previous')\n", "filtered = da.transform_coords(temperature=interp_temperature) \\\n", " .bin(temperature=10)\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Full example\n", "\n", "### Input data\n", "\n", "In the following, we use neutron diffraction data for a stainless steel tensile bar in a loadframe measured at the [VULCAN Engineering Materials Diffractometer](https://neutrons.ornl.gov/vulcan), kindly provided by the SNS.\n", "Scipp's sample data includes an excerpt from the full dataset:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import scipp as sc\n", "\n", "dg = sc.data.vulcan_steel_strain_data()\n", "dg" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da = dg['data']\n", "da" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `dspacing` dimension is the interplanar lattice spacing (the spacing between planes in a crystal). Plotting this data, we see two diffraction peaks:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "da.hist().plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Extract time interval\n", "\n", "The [mechanical strain](https://en.wikipedia.org/wiki/Strain_(mechanics)) of the steel sample in the loadframe is recorded in the metadata:" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": {}, "outputs": [], "source": [ "strain = dg['loadframe.strain']\n", "strain.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that the strain drops off for some reason at the end.\n", "We can filter out those events, by extracting the rest as outlined in [Extract events matching parameter value](#Extract-events-matching-parameter-value):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "start = strain.coords['time'][0]\n", "stop = strain.coords['time'][np.argmax(strain.values)]\n", "da = da.bins['time', start:stop]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "