{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n\n# Non-parametric 1 sample cluster statistic on single trial power\n\nThis script shows how to estimate significant clusters in time-frequency power\nestimates. It uses a non-parametric statistical procedure based on permutations and\ncluster level statistics.\n\nThe procedure consists of:\n\n  - extracting epochs\n  - compute single trial power estimates\n  - baseline line correct the power estimates (power ratios)\n  - compute stats to see if ratio deviates from 1.\n\nHere, the unit of observation is epochs from a specific study subject.\nHowever, the same logic applies when the unit of observation is\na number of study subjects each of whom contribute their own averaged\ndata (i.e., an average of their epochs). This would then be considered\nan analysis at the \"2nd level\".\n\nFor more information on cluster-based permutation testing in MNE-Python,\nsee also: `tut-cluster-spatiotemporal-sensor`.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Authors: Alexandre Gramfort <alexandre.gramfort@inria.fr>\n#          Stefan Appelhoff <stefan.appelhoff@mailbox.org>\n#\n# License: BSD-3-Clause\n# Copyright the MNE-Python contributors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import matplotlib.pyplot as plt\nimport numpy as np\nimport scipy.stats\n\nimport mne\nfrom mne.datasets import sample\nfrom mne.stats import permutation_cluster_1samp_test"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Set parameters\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "data_path = sample.data_path()\nmeg_path = data_path / \"MEG\" / \"sample\"\nraw_fname = meg_path / \"sample_audvis_raw.fif\"\ntmin, tmax, event_id = -0.3, 0.6, 1\n\n# Setup for reading the raw data\nraw = mne.io.read_raw_fif(raw_fname)\nevents = mne.find_events(raw, stim_channel=\"STI 014\")\n\ninclude = []\nraw.info[\"bads\"] += [\"MEG 2443\", \"EEG 053\"]  # bads + 2 more\n\n# for speed, we'll only look at right-temporal gradiometers (and EOG)\npicks_eog = mne.pick_types(raw.info, eog=True)\npicks_grad = mne.pick_types(raw.info, meg=\"grad\", exclude=\"bads\")\npicks_rtemp = mne.pick_channels(\n    raw.info[\"ch_names\"], mne.read_vectorview_selection(\"Right-temporal\"), ordered=True\n)\npicks = list((set(picks_rtemp) & set(picks_grad)) | set(picks_eog))\n\n# Load condition 1\nevent_id = 1\nepochs = mne.Epochs(\n    raw,\n    events,\n    event_id,\n    tmin,\n    tmax,\n    picks=picks,\n    baseline=(None, 0),\n    preload=True,\n    reject=dict(grad=4000e-13, eog=150e-6),\n)\n\nevoked = epochs.average()\n\n# Factor to down-sample the temporal dimension of the TFR computed by\n# tfr_morlet. Decimation occurs after frequency decomposition and can\n# be used to reduce memory usage (and possibly computational time of downstream\n# operations such as nonparametric statistics) if you don't need high\n# spectrotemporal resolution.\ndecim = 5\n\n# define frequencies of interest\nfreqs = np.arange(8, 40, 2)\n\n# run the TFR decomposition\ntfr_epochs = epochs.compute_tfr(\n    \"morlet\",\n    freqs,\n    n_cycles=4.0,\n    decim=decim,\n    average=False,\n    return_itc=False,\n    n_jobs=None,\n)\n\n# Baseline power\ntfr_epochs.apply_baseline(mode=\"logratio\", baseline=(-0.100, 0))\n\n# Crop in time to keep only what is between 0 and 400 ms\nevoked.crop(-0.1, 0.4)\ntfr_epochs.crop(-0.1, 0.4)\n\nepochs_power = tfr_epochs.data"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Define adjacency for statistics\nTo perform a cluster-based permutation test, we need a suitable definition\nfor the adjacency of sensors, time points, and frequency bins.\nThe adjacency matrix will be used to form clusters.\n\nWe first compute the sensor adjacency, and then combine that with a\n\"lattice\" adjacency for the time-frequency plane, which assumes\nthat elements at index \"N\" are adjacent to elements at indices\n\"N + 1\" and \"N - 1\" (forming a \"grid\" on the time-frequency plane).\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# find_ch_adjacency first attempts to find an existing \"neighbor\"\n# (adjacency) file for given sensor layout.\n# If such a file doesn't exist, an adjacency matrix is computed on the fly,\n# using Delaunay triangulations.\nsensor_adjacency, ch_names = mne.channels.find_ch_adjacency(tfr_epochs.info, \"grad\")\n\n# In this case, find_ch_adjacency finds an appropriate file and\n# reads it (see log output: \"neuromag306planar\").\n# However, we need to subselect the channels we are actually using\nuse_idx = [ch_names.index(ch_name) for ch_name in tfr_epochs.ch_names]\nsensor_adjacency = sensor_adjacency[use_idx][:, use_idx]\n\n# Our sensor adjacency matrix is of shape n_chs \u00d7 n_chs\nassert sensor_adjacency.shape == (len(tfr_epochs.ch_names), len(tfr_epochs.ch_names))\n\n# Now we need to prepare adjacency information for the time-frequency\n# plane. For that, we use \"combine_adjacency\", and pass dimensions\n# as in the data we want to test (excluding observations). Here:\n# channels \u00d7 frequencies \u00d7 times\nassert epochs_power.data.shape == (\n    len(epochs),\n    len(tfr_epochs.ch_names),\n    len(tfr_epochs.freqs),\n    len(tfr_epochs.times),\n)\nadjacency = mne.stats.combine_adjacency(\n    sensor_adjacency, len(tfr_epochs.freqs), len(tfr_epochs.times)\n)\n\n# The overall adjacency we end up with is a square matrix with each\n# dimension matching the data size (excluding observations) in an\n# \"unrolled\" format, so: len(channels \u00d7 frequencies \u00d7 times)\nassert (\n    adjacency.shape[0]\n    == adjacency.shape[1]\n    == len(tfr_epochs.ch_names) * len(tfr_epochs.freqs) * len(tfr_epochs.times)\n)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Compute statistic\nFor forming clusters, we need to specify a critical test statistic threshold.\nOnly data bins exceeding this threshold will be used to form clusters.\n\nHere, we\nuse a t-test and can make use of Scipy's percent point function of the t\ndistribution to get a t-value that corresponds to a specific alpha level\nfor significance. This threshold is often called the\n\"cluster forming threshold\".\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>The choice of the threshold is more or less arbitrary. Choosing\n   a t-value corresponding to p=0.05, p=0.01, or p=0.001 may often provide\n   a good starting point. Depending on the specific dataset you are working\n   with, you may need to adjust the threshold.</p></div>\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# We want a two-tailed test\ntail = 0\n\n# In this example, we wish to set the threshold for including data bins in\n# the cluster forming process to the t-value corresponding to p=0.001 for the\n# given data.\n#\n# Because we conduct a two-tailed test, we divide the p-value by 2 (which means\n# we're making use of both tails of the distribution).\n# As the degrees of freedom, we specify the number of observations\n# (here: epochs) minus 1.\n# Finally, we subtract 0.001 / 2 from 1, to get the critical t-value\n# on the right tail (this is needed for MNE-Python internals)\ndegrees_of_freedom = len(epochs) - 1\nt_thresh = scipy.stats.t.ppf(1 - 0.001 / 2, df=degrees_of_freedom)\n\n# Set the number of permutations to run.\n# Warning: 50 is way too small for a real-world analysis (where values of 5000\n# or higher are used), but here we use it to increase the computation speed.\nn_permutations = 50\n\n# Run the analysis\nT_obs, clusters, cluster_p_values, H0 = permutation_cluster_1samp_test(\n    epochs_power,\n    n_permutations=n_permutations,\n    threshold=t_thresh,\n    tail=tail,\n    adjacency=adjacency,\n    out_type=\"mask\",\n    verbose=True,\n)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## View time-frequency plots\nWe now visualize the observed clusters that are statistically significant\nunder our permutation distribution.\n\n<div class=\"alert alert-danger\"><h4>Warning</h4><p>Talking about \"significant clusters\" can be convenient, but\n             you must be aware of all associated caveats! For example, it\n             is **invalid** to interpret the cluster p value as being\n             spatially or temporally specific. A cluster with sufficiently\n             low (for example < 0.05) p value at specific location does not\n             allow you to say that the significant effect is at that\n             particular location. The p value only tells you about the\n             probability of obtaining similar or stronger/larger cluster\n             anywhere in the data if there were no differences between the\n             compared conditions. So it only allows you to draw conclusions\n             about the differences in the data \"in general\", not at specific\n             locations. See the comprehensive\n             [FieldTrip tutorial](ft_cluster_) for more information.\n             [FieldTrip tutorial](ft_cluster_) for more information.</p></div>\n\n.. include:: ../../links.inc\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "evoked_data = evoked.data\ntimes = 1e3 * evoked.times\n\nfig, (ax, ax2) = plt.subplots(2, layout=\"constrained\")\n\nT_obs_plot = np.nan * np.ones_like(T_obs)\nfor c, p_val in zip(clusters, cluster_p_values):\n    if p_val <= 0.05:\n        T_obs_plot[c] = T_obs[c]\n\n# Just plot one channel's data\n# use the following to show a specific one:\n# ch_idx = tfr_epochs.ch_names.index('MEG 1332')\nch_idx, f_idx, t_idx = np.unravel_index(\n    np.nanargmax(np.abs(T_obs_plot)), epochs_power.shape[1:]\n)\n\nvmax = np.max(np.abs(T_obs))\nvmin = -vmax\nax.imshow(\n    T_obs[ch_idx],\n    cmap=plt.cm.gray,\n    extent=[times[0], times[-1], freqs[0], freqs[-1]],\n    aspect=\"auto\",\n    origin=\"lower\",\n    vmin=vmin,\n    vmax=vmax,\n)\nax.imshow(\n    T_obs_plot[ch_idx],\n    cmap=plt.cm.RdBu_r,\n    extent=[times[0], times[-1], freqs[0], freqs[-1]],\n    aspect=\"auto\",\n    origin=\"lower\",\n    vmin=vmin,\n    vmax=vmax,\n)\nfig.colorbar(ax.images[0])\nax.set(xlabel=\"Time (ms)\", ylabel=\"Frequency (Hz)\")\nax.set(title=f\"Induced power ({tfr_epochs.ch_names[ch_idx]})\")\n\nevoked.plot(axes=[ax2], time_unit=\"s\")"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.12.2"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}