{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n\n# Representational Similarity Analysis\n\nRepresentational Similarity Analysis is used to perform summary statistics\non supervised classifications where the number of classes is relatively high.\nIt consists in characterizing the structure of the confusion matrix to infer\nthe similarity between brain responses and serves as a proxy for characterizing\nthe space of mental representations\n:footcite:`Shepard1980,LaaksoCottrell2000,KriegeskorteEtAl2008`.\n\nIn this example, we perform RSA on responses to 24 object images (among\na list of 92 images). Subjects were presented with images of human, animal\nand inanimate objects :footcite:`CichyEtAl2014`. Here we use the 24 unique\nimages of faces and body parts.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>this example will download a very large (~6GB) file, so we will not\n          build the images below.</p></div>\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Authors: Jean-R\u00e9mi King <jeanremi.king@gmail.com>\n#          Jaakko Leppakangas <jaeilepp@student.jyu.fi>\n#          Alexandre Gramfort <alexandre.gramfort@inria.fr>\n#\n# License: BSD-3-Clause\n# Copyright the MNE-Python contributors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import matplotlib.pyplot as plt\nimport numpy as np\nfrom pandas import read_csv\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.manifold import MDS\nfrom sklearn.metrics import roc_auc_score\nfrom sklearn.model_selection import StratifiedKFold\nfrom sklearn.multiclass import OneVsRestClassifier\nfrom sklearn.pipeline import make_pipeline\nfrom sklearn.preprocessing import StandardScaler\n\nimport mne\nfrom mne.datasets import visual_92_categories\nfrom mne.io import concatenate_raws, read_raw_fif\n\nprint(__doc__)\n\ndata_path = visual_92_categories.data_path()\n\n# Define stimulus - trigger mapping\nfname = data_path / \"visual_stimuli.csv\"\nconds = read_csv(fname)\nprint(conds.head(5))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Let's restrict the number of conditions to speed up computation\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "max_trigger = 24\nconds = conds[:max_trigger]  # take only the first 24 rows"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Define stimulus - trigger mapping\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "conditions = []\nfor c in conds.values:\n    cond_tags = list(c[:2])\n    cond_tags += [\n        (\"not-\" if i == 0 else \"\") + conds.columns[k] for k, i in enumerate(c[2:], 2)\n    ]\n    conditions.append(\"/\".join(map(str, cond_tags)))\nprint(conditions[:10])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Let's make the event_id dictionary\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "event_id = dict(zip(conditions, conds.trigger + 1))\nevent_id[\"0/human bodypart/human/not-face/animal/natural\"]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Read MEG data\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "n_runs = 4  # 4 for full data (use less to speed up computations)\nfnames = [data_path / f\"sample_subject_{b}_tsss_mc.fif\" for b in range(n_runs)]\nraws = [\n    read_raw_fif(fname, verbose=\"error\", on_split_missing=\"ignore\") for fname in fnames\n]  # ignore filename warnings\nraw = concatenate_raws(raws)\n\nevents = mne.find_events(raw, min_duration=0.002)\n\nevents = events[events[:, 2] <= max_trigger]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Epoch data\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "picks = mne.pick_types(raw.info, meg=True)\nepochs = mne.Epochs(\n    raw,\n    events=events,\n    event_id=event_id,\n    baseline=None,\n    picks=picks,\n    tmin=-0.1,\n    tmax=0.500,\n    preload=True,\n)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Let's plot some conditions\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "epochs[\"face\"].average().plot()\nepochs[\"not-face\"].average().plot()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Representational Similarity Analysis (RSA) is a neuroimaging-specific\nappelation to refer to statistics applied to the confusion matrix\nalso referred to as the representational dissimilarity matrices (RDM).\n\nCompared to the approach from Cichy et al. we'll use a multiclass\nclassifier (Multinomial Logistic Regression) while the paper uses\nall pairwise binary classification task to make the RDM.\nAlso we use here the ROC-AUC as performance metric while the\npaper uses accuracy. Finally here for the sake of time we use\nRSA on a window of data while Cichy et al. did it for all time\ninstants separately.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Classify using the average signal in the window 50ms to 300ms\n# to focus the classifier on the time interval with best SNR.\nclf = make_pipeline(\n    StandardScaler(),\n    OneVsRestClassifier(LogisticRegression(C=1)),\n)\nX = epochs.copy().crop(0.05, 0.3).get_data().mean(axis=2)\ny = epochs.events[:, 2]\n\nclasses = set(y)\ncv = StratifiedKFold(n_splits=5, random_state=0, shuffle=True)\n\n# Compute confusion matrix for each cross-validation fold\ny_pred = np.zeros((len(y), len(classes)))\nfor train, test in cv.split(X, y):\n    # Fit\n    clf.fit(X[train], y[train])\n    # Probabilistic prediction (necessary for ROC-AUC scoring metric)\n    y_pred[test] = clf.predict_proba(X[test])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Compute confusion matrix using ROC-AUC\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "confusion = np.zeros((len(classes), len(classes)))\nfor ii, train_class in enumerate(classes):\n    for jj in range(ii, len(classes)):\n        confusion[ii, jj] = roc_auc_score(y == train_class, y_pred[:, jj])\n        confusion[jj, ii] = confusion[ii, jj]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Plot\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "labels = [\"\"] * 5 + [\"face\"] + [\"\"] * 11 + [\"bodypart\"] + [\"\"] * 6\nfig, ax = plt.subplots(1, layout=\"constrained\")\nim = ax.matshow(confusion, cmap=\"RdBu_r\", clim=[0.3, 0.7])\nax.set_yticks(range(len(classes)))\nax.set_yticklabels(labels)\nax.set_xticks(range(len(classes)))\nax.set_xticklabels(labels, rotation=40, ha=\"left\")\nax.axhline(11.5, color=\"k\")\nax.axvline(11.5, color=\"k\")\nplt.colorbar(im)\nplt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Confusion matrix related to mental representations have been historically\nsummarized with dimensionality reduction using multi-dimensional scaling [1].\nSee how the face samples cluster together.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "fig, ax = plt.subplots(1, layout=\"constrained\")\nmds = MDS(2, random_state=0, dissimilarity=\"precomputed\")\nchance = 0.5\nsummary = mds.fit_transform(chance - confusion)\ncmap = plt.colormaps[\"rainbow\"]\ncolors = [\"r\", \"b\"]\nnames = list(conds[\"condition\"].values)\nfor color, name in zip(colors, set(names)):\n    sel = np.where([this_name == name for this_name in names])[0]\n    size = 500 if name == \"human face\" else 100\n    ax.scatter(\n        summary[sel, 0],\n        summary[sel, 1],\n        s=size,\n        facecolors=color,\n        label=name,\n        edgecolors=\"k\",\n    )\nax.axis(\"off\")\nax.legend(loc=\"lower right\", scatterpoints=1, ncol=2)\nplt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## References\n.. footbibliography::\n\n"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.12.2"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}