{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n\n# Rejecting bad data spans and breaks\n\nThis tutorial covers:\n\n- manual marking of bad spans of data,\n- automated rejection of data spans based on signal amplitude, and\n- automated detection of breaks during an experiment.\n\nWe begin as always by importing the necessary Python modules and loading some\n`example data `; to save memory we'll use a pre-filtered\nand downsampled version of the example data, and we'll also load an events\narray to use when converting the continuous data to epochs:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Authors: The MNE-Python contributors.\n# License: BSD-3-Clause\n# Copyright the MNE-Python contributors." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import os\n\nimport numpy as np\n\nimport mne\n\nsample_data_folder = mne.datasets.sample.data_path()\nsample_data_raw_file = os.path.join(\n sample_data_folder, \"MEG\", \"sample\", \"sample_audvis_filt-0-40_raw.fif\"\n)\nraw = mne.io.read_raw_fif(sample_data_raw_file, verbose=False)\nevents_file = os.path.join(\n sample_data_folder, \"MEG\", \"sample\", \"sample_audvis_filt-0-40_raw-eve.fif\"\n)\nevents = mne.read_events(events_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Annotating bad spans of data\n\nThe tutorial `tut-events-vs-annotations` describes how\n:class:`~mne.Annotations` can be read from embedded events in the raw\nrecording file, and `tut-annotate-raw` describes in detail how to\ninteractively annotate a :class:`~mne.io.Raw` data object. Here, we focus on\nbest practices for annotating *bad* data spans so that they will be excluded\nfrom your analysis pipeline.\n\n\n### The ``reject_by_annotation`` parameter\n\nIn the interactive ``raw.plot()`` window, the annotation controls can be\nopened by pressing :kbd:`a`. Here, new annotation labels can be created or\nexisting annotation labels can be selected for use.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fig = raw.plot()\nfig.fake_keypress(\"a\") # Simulates user pressing 'a' on the keyboard." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can see that you need to add a description first to start with\nmarking spans (Push the button \"Add Description\" and enter the description).\nYou can use any description you like, but annotations marking spans that\nshould be excluded from the analysis pipeline should all begin with \"BAD\" or\n\"bad\" (e.g., \"bad_cough\", \"bad-eyes-closed\", \"bad door slamming\", etc). 
When\nthis practice is followed, many processing steps in MNE-Python will\nautomatically exclude the \"bad\"-labelled spans of data; this behavior is\ncontrolled by a parameter ``reject_by_annotation`` that can be found in many\nMNE-Python functions or class constructors, including:\n\n- creation of epoched data from continuous data (:class:`mne.Epochs`)\n- many methods of the independent components analysis class\n  (:class:`mne.preprocessing.ICA`)\n- functions for finding heartbeat and blink artifacts\n  (:func:`~mne.preprocessing.find_ecg_events`,\n  :func:`~mne.preprocessing.find_eog_events`)\n- covariance computations (:func:`mne.compute_raw_covariance`)\n- power spectral density computation (:meth:`mne.io.Raw.compute_psd`)\n\nFor example, when creating epochs from continuous data, if\n``reject_by_annotation=True`` the :class:`~mne.Epochs` constructor will drop\nany epoch that partially or fully overlaps with an annotated span that begins\nwith \"bad\".\n\n\n### Generating annotations programmatically\n\nThe `tut-artifact-overview` tutorial introduced the artifact detection\nfunctions :func:`~mne.preprocessing.find_eog_events` and\n:func:`~mne.preprocessing.find_ecg_events` (although that tutorial mostly\nrelied on their higher-level wrappers\n:func:`~mne.preprocessing.create_eog_epochs` and\n:func:`~mne.preprocessing.create_ecg_epochs`). Here, for demonstration\npurposes, we make use of the lower-level artifact detection function to get\nan events array telling us where the blinks are, then automatically add\n\"bad blink\" annotations around them (this is not necessary when using\n:func:`~mne.preprocessing.create_eog_epochs`; it is done here just to show\nhow annotations are added non-interactively). We'll start the annotations\n250 ms before the blink and end them 250 ms after it:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "eog_events = mne.preprocessing.find_eog_events(raw)\nonsets = eog_events[:, 0] / raw.info[\"sfreq\"] - 0.25\ndurations = [0.5] * len(eog_events)\ndescriptions = [\"bad blink\"] * len(eog_events)\nblink_annot = mne.Annotations(\n    onsets, durations, descriptions, orig_time=raw.info[\"meas_date\"]\n)\nraw.set_annotations(blink_annot)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can confirm that the annotations are centered on the EOG events. Since\nblinks are usually easiest to see in the EEG channels, we'll only plot EEG\nhere:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "eeg_picks = mne.pick_types(raw.info, meg=False, eeg=True)\nraw.plot(events=eog_events, order=eeg_picks)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See the section `tut-section-programmatic-annotations` for more details\non creating annotations programmatically.\n\n### Detecting and annotating breaks\nAnother useful function, albeit not related to artifact detection *per se*,\nis `mne.preprocessing.annotate_break`: It will generate annotations for\nsegments of the data where no existing annotations (or, alternatively,\nevents) can be found. It can therefore be used to automatically detect and\nmark breaks, e.g. between experimental blocks, during which the recording\ncontinued.\n\nFor the sake of this example, let's assume an experiment consisting of two\nblocks, the first one stretching from 30 to 90 seconds, and the second from\n180 to 240 seconds. 
We'll mark these blocks by annotations, and then use\n`mne.preprocessing.annotate_break` to detect and annotate any breaks.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>We need to take ``raw.first_time`` into account, otherwise the onsets will be incorrect!</p></div>
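\n\nThe note above matters because annotations created with\n``orig_time=raw.info[\"meas_date\"]`` are interpreted relative to the start of\nthe acquisition, whereas the data in this :class:`~mne.io.Raw` object begin\n``raw.first_samp`` samples (i.e., ``raw.first_time`` seconds) later. As a\nminimal sketch of that relationship (added here for illustration, not part of\nthe original tutorial code, and using only attributes of ``raw``):\n\n```python\n# raw.first_time is the offset, in seconds, between the acquisition start\n# (meas_date) and the first sample actually present in this Raw object\nprint(raw.first_samp / raw.info[\"sfreq\"])  # same value as raw.first_time\nprint(raw.first_time)\n```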
\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "onsets = [raw.first_time + 30, raw.first_time + 180]\ndurations = [60, 60]\ndescriptions = [\"block_1\", \"block_2\"]\n\nblock_annots = mne.Annotations(\n onset=onsets,\n duration=durations,\n description=descriptions,\n orig_time=raw.info[\"meas_date\"],\n)\nraw.set_annotations(raw.annotations + block_annots) # add to existing\nraw.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now detect break periods. We can control how far the break annotations shall\nexpand toward both ends of each break.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "break_annots = mne.preprocessing.annotate_break(\n raw=raw,\n min_break_duration=20, # consider segments of at least 20 s duration\n t_start_after_previous=5, # start annotation 5 s after end of previous one\n t_stop_before_next=2, # stop annotation 2 s before beginning of next one\n)\n\nraw.set_annotations(raw.annotations + break_annots) # add to existing\nraw.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can see that 3 segments have been annotated as ``BAD_break``:\n\n- the first one starting with the beginning of the recording and ending 2\n seconds before the beginning of block 1 (due to ``t_stop_before_next=2``),\n- the second one starting 5 seconds after block 1 has ended, and ending 2\n seconds before the beginning of block 2 (``t_start_after_previous=5``,\n ``t_stop_before_next=2``),\n- and the last one starting 5 seconds after block 2 has ended\n (``t_start_after_previous=5``) and continuing until the end of the\n recording.\n\nYou can also see that only the ``block_1`` and ``block_2`` annotations\nwere considered in the detection of the break periods \u2013 the EOG annotations\nwere simply ignored. This is because, by default,\n`~mne.preprocessing.annotate_break` ignores all annotations starting with\n``'bad'``. You can control this behavior via the ``ignore`` parameter.\n\nIt is also possible to perform break period detection based on an array\nof events: simply pass the array via the ``events`` parameter. Existing\nannotations in the raw data will be ignored in this case:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# only keep some button press events (code 32) for this demonstration\nevents_subset = events[events[:, -1] == 32]\n# drop the first and last few events\nevents_subset = events_subset[3:-3]\n\nbreak_annots = mne.preprocessing.annotate_break(\n raw=raw,\n events=events_subset, # passing events will ignore existing annotations\n min_break_duration=25, # pick a longer break duration this time\n)\n\n# replace existing annotations (otherwise it becomes difficult to see any\n# effects in the plot!)\nraw.set_annotations(break_annots)\nraw.plot(events=events_subset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n## Rejecting Epochs based on peak-to-peak channel amplitude\n\nBesides \"bad\" annotations, the :class:`mne.Epochs` class constructor has\nanother means of rejecting epochs, based on signal amplitude thresholds for\neach channel type. In the `overview tutorial\n` we saw an example of this: setting maximum\nacceptable peak-to-peak amplitudes for each channel type in an epoch, using\nthe ``reject`` parameter. 
There is also a related parameter, ``flat``, that\ncan be used to set *minimum* acceptable peak-to-peak amplitudes for each\nchannel type in an epoch:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "reject_criteria = dict(\n mag=3000e-15, # 3000 fT\n grad=3000e-13, # 3000 fT/cm\n eeg=100e-6, # 100 \u00b5V\n eog=200e-6,\n) # 200 \u00b5V\n\nflat_criteria = dict(mag=1e-15, grad=1e-13, eeg=1e-6) # 1 fT # 1 fT/cm # 1 \u00b5V" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The values that are appropriate are dataset- and hardware-dependent, so some\ntrial-and-error may be necessary to find the correct balance between data\nquality and loss of power due to too many dropped epochs. Here, we've set the\nrejection criteria to be fairly stringent, for illustration purposes.\n\nTwo additional parameters, ``reject_tmin`` and ``reject_tmax``, are used to\nset the temporal window in which to calculate peak-to-peak amplitude for the\npurposes of epoch rejection. These default to the same ``tmin`` and ``tmax``\nof the entire epoch. As one example, if you wanted to only apply the\nrejection thresholds to the portion of the epoch that occurs *before* the\nevent marker around which the epoch is created, you could set\n``reject_tmax=0``. A summary of the causes of rejected epochs can be\ngenerated with the :meth:`~mne.Epochs.plot_drop_log` method:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "raw.set_annotations(blink_annot) # restore the EOG annotations\nepochs = mne.Epochs(\n raw,\n events,\n tmin=-0.2,\n tmax=0.5,\n reject_tmax=0,\n reject=reject_criteria,\n flat=flat_criteria,\n reject_by_annotation=False,\n preload=True,\n)\nepochs.plot_drop_log()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that we've passed ``reject_by_annotation=False`` above, in order to\nisolate the effects of the rejection thresholds. If we re-run the epoching\nwith ``reject_by_annotation=True`` (the default) we see that the rejections\ndue to EEG and EOG channels have disappeared (suggesting that those channel\nfluctuations were probably blink-related, and were subsumed by rejections\nbased on the \"bad blink\" label).\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "epochs = mne.Epochs(\n raw,\n events,\n tmin=-0.2,\n tmax=0.5,\n reject_tmax=0,\n reject=reject_criteria,\n flat=flat_criteria,\n preload=True,\n)\nepochs.plot_drop_log()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "More importantly, note that *many* more epochs are rejected (~12.2% instead\nof ~2.5%) when rejecting based on the blink labels, underscoring why it is\nusually desirable to repair artifacts rather than exclude them.\n\nThe :meth:`~mne.Epochs.plot_drop_log` method is a visualization of an\n:class:`~mne.Epochs` attribute, namely ``epochs.drop_log``, which stores\nempty lists for retained epochs and lists of strings for dropped epochs, with\nthe strings indicating the reason(s) why the epoch was dropped. For example:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "print(epochs.drop_log)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, it should be noted that \"dropped\" epochs are not necessarily deleted\nfrom the :class:`~mne.Epochs` object right away. 
Above, we forced the\ndropping to happen when we created the :class:`~mne.Epochs` object by using\nthe ``preload=True`` parameter. If we had not done that, the\n:class:`~mne.Epochs` object would have been `memory-mapped`_ (not loaded into\nRAM), in which case the criteria for dropping epochs are stored, and the\nactual dropping happens when the :class:`~mne.Epochs` data are finally loaded\nand used. There are several ways this can get triggered, such as:\n\n- explicitly loading the data into RAM with the :meth:`~mne.Epochs.load_data`\n method\n- plotting the data (:meth:`~mne.Epochs.plot`,\n :meth:`~mne.Epochs.plot_image`, etc)\n- using the :meth:`~mne.Epochs.average` method to create an\n :class:`~mne.Evoked` object\n\nYou can also trigger dropping with the :meth:`~mne.Epochs.drop_bad` method;\nif ``reject`` and/or ``flat`` criteria have already been provided to the\nepochs constructor, :meth:`~mne.Epochs.drop_bad` can be used without\narguments to simply delete the epochs already marked for removal (if the\nepochs have already been dropped, nothing further will happen):\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "epochs.drop_bad()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, if rejection thresholds were not originally given to the\n:class:`~mne.Epochs` constructor, they can be passed to\n:meth:`~mne.Epochs.drop_bad` later instead; this can also be a way of\nimposing progressively more stringent rejection criteria:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "stronger_reject_criteria = dict(\n mag=2000e-15, # 2000 fT\n grad=2000e-13, # 2000 fT/cm\n eeg=100e-6, # 100 \u00b5V\n eog=100e-6,\n) # 100 \u00b5V\n\nepochs.drop_bad(reject=stronger_reject_criteria)\nprint(epochs.drop_log)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n## Rejecting Epochs using callables (functions)\nSometimes it is useful to reject epochs based criteria other than\npeak-to-peak amplitudes. For example, we might want to reject epochs\nbased on the maximum or minimum amplitude of a channel.\nIn this case, the `mne.Epochs.drop_bad` function also accepts\ncallables (functions) in the ``reject`` and ``flat`` parameters. This\nallows us to define functions to reject epochs based on our desired criteria.\n\nLet's begin by generating Epoch data with large artifacts in one eeg channel\nin order to demonstrate the versatility of this approach.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "raw.crop(0, 5)\nraw.del_proj()\nchans = raw.info[\"ch_names\"][-5:-1]\nraw.pick(chans)\ndata = raw.get_data()\n\nnew_data = data\nnew_data[0, 180:200] *= 1e3\nnew_data[0, 460:580] += 1e-3\nedit_raw = mne.io.RawArray(new_data, raw.info)\n\n# Create fixed length epochs of 1 second\nevents = mne.make_fixed_length_events(edit_raw, id=1, duration=1.0, start=0)\nepochs = mne.Epochs(edit_raw, events, tmin=0, tmax=1, baseline=None)\nepochs.plot(scalings=dict(eeg=50e-5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, we have two large artifacts in the first channel. 
One large\nspike in amplitude and one large increase in amplitude.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Let's try to reject the epoch containing the spike in amplitude based on the\n# maximum amplitude of the first channel. Please note that the callable in\n# ``reject`` must return a (good, reason) tuple. Where the good must be bool\n# and reason must be a str, list, or tuple where each entry is a str.\n\nepochs = mne.Epochs(\n edit_raw,\n events,\n tmin=0,\n tmax=1,\n baseline=None,\n preload=True,\n)\n\nepochs.drop_bad(\n reject=dict(eeg=lambda x: ((np.max(x, axis=1) > 1e-2).any(), \"max amp\"))\n)\nepochs.plot(scalings=dict(eeg=50e-5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, the epoch containing the spike in amplitude was rejected for having a\nmaximum amplitude greater than 1e-2 Volts. Notice the use of the ``any()``\nfunction to check if any of the channels exceeded the threshold. We could\nhave also used the ``all()`` function to check if all channels exceeded the\nthreshold.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Next, let's try to reject the epoch containing the increase in amplitude\n# using the median.\n\nepochs = mne.Epochs(\n edit_raw,\n events,\n tmin=0,\n tmax=1,\n baseline=None,\n preload=True,\n)\n\nepochs.drop_bad(\n reject=dict(eeg=lambda x: ((np.median(x, axis=1) > 1e-4).any(), \"median amp\"))\n)\nepochs.plot(scalings=dict(eeg=50e-5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, let's try to reject both epochs using a combination of the maximum\nand median. We'll define a custom function and use boolean operators to\ncombine the two criteria.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def reject_criteria(x):\n max_condition = np.max(x, axis=1) > 1e-2\n median_condition = np.median(x, axis=1) > 1e-4\n return ((max_condition.any() or median_condition.any()), [\"max amp\", \"median amp\"])\n\n\nepochs = mne.Epochs(\n edit_raw,\n events,\n tmin=0,\n tmax=1,\n baseline=None,\n preload=True,\n)\n\nepochs.drop_bad(reject=dict(eeg=reject_criteria))\nepochs.plot(events=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that a complementary Python module, the `autoreject package`_, uses\nmachine learning to find optimal rejection criteria, and is designed to\nintegrate smoothly with MNE-Python workflows. This can be a considerable\ntime-saver when working with heterogeneous datasets.\n\n\n.. LINKS\n\n\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" } }, "nbformat": 4, "nbformat_minor": 0 }