{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "68b92712-4df4-4762-8eeb-1b7a9ec6ca56",
   "metadata": {},
   "source": [
    "# Working with messy data with Python and pandas"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "952c0972-0434-44e6-99c3-2594c5158ba1",
   "metadata": {},
   "source": [
    "The goal of this workshop is to provide you with some tools that will help you work with real-world data.\n",
    "\n",
    "There are many **excellent** textbooks and online tutorials that will teach you how to analyze and visualize data.\n",
    "\n",
    "However, these tutorials rely on data that are pre-formatted in a very specific way (i.e., example datasets are ready to analyze).\n",
    "\n",
    "There is good reason for this: statistics and plotting libraries contain functions that are designed to have data organized with a precise structure.\n",
    "- A great resource that describes how data *should* be structured can be found here: https://vita.had.co.nz/papers/tidy-data.pdf\n",
    "\n",
    "If data are not arranged appropriately:\n",
    "- bad: it will be difficult to get your analysis working properly\n",
    "- worse: you might not be able to get the analysis to work at all\n",
    "- worst: the analysis will work, but not as you expected!!!\n",
    "\n",
    "## Problem\n",
    "\n",
    "Real-world data are not usually pre-formatted for analyses.\n",
    "\n",
    "Few textbooks or tutorials focus on the process of cleaning up data---that's where this workshop comes in!\n",
    "\n",
    "Here, you will learn:\n",
    "- About the Python library called `pandas`\n",
    "- How to use Python/pandas to\n",
    "  - Read csv files\n",
    "  - Filter/select data\n",
    "  - Create a clean dataset\n",
    "\n",
    "We will work with a small set of example files stored in the `raw_data` folder.\n",
    "- These files can be downloaded using this link: https://minhaskamal.github.io/DownGit/#/home?url=https://github.com/scds/dash-webinars/tree/main/assets/data\n",
    "\n",
    "These files represent an experiment scenario where participants completed a questionnaire and a separate reaction time task.\n",
    "\n",
    "Each participant has their data stored in two separate csv files.\n",
    "\n",
    "We need to combine all of these files into one clean data set for analysis."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "15b7ac87-876a-474d-a829-aabbfc03f1c3",
   "metadata": {},
   "source": [
    "## Overview\n",
    "\n",
    "How will we approach this problem?\n",
    "\n",
    "1. Select all the survey and trial raw data files\n",
    "2. Read each file\n",
    "3. Select the appropriate data in each file and store in a DataFrame\n",
    "4. Combine all individual data into a single clean data file\n",
    "5. Write the clean file\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "82cd08e7-85d2-41f7-ba98-a6259e5eb238",
   "metadata": {},
   "source": [
    "## Organizing Files\n",
    "The `os` module provides functions that allow you to interact with some basic parts of the operating system.\n",
    "A very useful function is the `listdir()` function, which can read the contents of a folder on your computer!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "96c6ba92-565e-485e-8711-e1bb91b5ae5a",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "import os"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "49def607-ec20-4baf-9400-b993f183bbc2",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['id_007_trial.csv',\n",
       " 'id_005_survey.csv',\n",
       " 'id_008_survey.csv',\n",
       " '.DS_Store',\n",
       " 'id_001_trial.csv',\n",
       " 'id_006_trial.csv',\n",
       " 'id_007_survey.csv',\n",
       " 'id_002_survey.csv',\n",
       " 'id_010_survey.csv',\n",
       " 'id_005_trial.csv',\n",
       " 'id_008_trial.csv',\n",
       " 'id_001_survey.csv',\n",
       " 'id_003_trial.csv',\n",
       " 'id_004_survey.csv',\n",
       " 'id_009_survey.csv',\n",
       " '.ipynb_checkpoints',\n",
       " 'id_004_trial.csv',\n",
       " 'id_010_trial.csv',\n",
       " 'id_003_survey.csv',\n",
       " 'id_009_trial.csv',\n",
       " 'id_002_trial.csv',\n",
       " 'id_006_survey.csv']"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dir_raw = \"raw_data/\"\n",
    "files_raw = os.listdir(dir_raw)\n",
    "files_raw"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8b89b803-822d-46cb-80ff-1d12a319e11e",
   "metadata": {},
   "source": [
    "There is a problem if .ipynb files are included. We need to exclude non-csv files.\n",
    "A list comprehension can quickly check each filename for a string pattern."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "160c331f-5b46-41a7-8c66-ffc643acf5b9",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['id_007_trial.csv',\n",
       " 'id_005_survey.csv',\n",
       " 'id_008_survey.csv',\n",
       " 'id_001_trial.csv',\n",
       " 'id_006_trial.csv',\n",
       " 'id_007_survey.csv',\n",
       " 'id_002_survey.csv',\n",
       " 'id_010_survey.csv',\n",
       " 'id_005_trial.csv',\n",
       " 'id_008_trial.csv',\n",
       " 'id_001_survey.csv',\n",
       " 'id_003_trial.csv',\n",
       " 'id_004_survey.csv',\n",
       " 'id_009_survey.csv',\n",
       " 'id_004_trial.csv',\n",
       " 'id_010_trial.csv',\n",
       " 'id_003_survey.csv',\n",
       " 'id_009_trial.csv',\n",
       " 'id_002_trial.csv',\n",
       " 'id_006_survey.csv']"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "files_raw = [file for file in files_raw if \".csv\" in file]\n",
    "files_raw"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c37dbea8-4a59-4813-be6c-8320ac0d3d34",
   "metadata": {},
   "source": [
    "We can do this to separate 'survey' files from 'trial' files as well.\n",
    "This is important because survey and trial files have different structure and will require different selection and filtering tools."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "92ccf2a7-c59a-424a-b0f0-ad06071797ad",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['id_005_survey.csv', 'id_008_survey.csv', 'id_007_survey.csv', 'id_002_survey.csv', 'id_010_survey.csv', 'id_001_survey.csv', 'id_004_survey.csv', 'id_009_survey.csv', 'id_003_survey.csv', 'id_006_survey.csv']\n",
      "['id_007_trial.csv', 'id_001_trial.csv', 'id_006_trial.csv', 'id_005_trial.csv', 'id_008_trial.csv', 'id_003_trial.csv', 'id_004_trial.csv', 'id_010_trial.csv', 'id_009_trial.csv', 'id_002_trial.csv']\n"
     ]
    }
   ],
   "source": [
    "files_raw_survey = [f for f in files_raw if \"survey\" in f]\n",
    "files_raw_trials = [f for f in files_raw if \"trial\" in f]\n",
    "\n",
    "print(files_raw_survey)\n",
    "print(files_raw_trials)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1c03f624-cea7-4908-9037-5d5e78eb672f",
   "metadata": {},
   "source": [
    "Now we have two lists of file names. One that holds all of the survey files and one that holds all of the experiment files."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2b9b3ef8-a218-471b-b1f8-8e1bf4aa9a83",
   "metadata": {},
   "source": [
    "## pandas\n",
    "\n",
    "The `pandas` library lets us easily work with data structures.\n",
    "\n",
    "The pandas documentation is a great resource!\n",
    "\n",
    "Check out the Intro to pandas: https://pandas.pydata.org/pandas-docs/stable/getting_started/index.html#intro-to-pandas"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "afb39251-c315-4c99-80de-c70ab0e3df88",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0a996d97-fe8c-44eb-a3ea-a23778eb2d2b",
   "metadata": {},
   "source": [
    "### Creating DataFrames"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "a8e01259-8594-4099-963e-3a53f8742d7d",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col_A</th>\n",
       "      <th>col_B</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>A</td>\n",
       "      <td>20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>B</td>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  col_A  col_B\n",
       "0     A     20\n",
       "1     B     30"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df01 = pd.DataFrame({\n",
    "    \"col_A\": [\"A\", \"B\"],\n",
    "    \"col_B\": [20, 30]\n",
    "})\n",
    "df01.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "df14b49e-b0db-4580-b93a-893c3affba33",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    A\n",
       "1    B\n",
       "Name: col_A, dtype: object"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df01[\"col_A\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "beb4d072-6ee3-4cbe-ae0c-82d5978d9bce",
   "metadata": {},
   "source": [
    "### DataFrames from Files"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "8c481ede-90ed-4d48-b43f-597cf9257f8d",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Participant ID</td>\n",
       "      <td>NaN</td>\n",
       "      <td>001</td>\n",
       "      <td>21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Consent</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Complete</td>\n",
       "      <td>47</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>Trial 1</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>67</td>\n",
       "      <td>63</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>Trial 2</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>91</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>Trial 3</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>116</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   trial_index            task              stimulus     value  time_elapsed\n",
       "0            1  Participant ID                   NaN       001            21\n",
       "1            2         Consent                   NaN  Complete            47\n",
       "2            3         Trial 1  blue_red_Incongruent        67            63\n",
       "3            4         Trial 2   blue_blue_Congruent        51            91\n",
       "4            5         Trial 3     red_red_Congruent        51           116"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example = pd.read_csv(\"raw_data/id_001_trial.csv\")\n",
    "data_read_example.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "79218861-92f0-4ac7-9609-7a73adc54f0e",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>20</td>\n",
       "      <td>Trial 18</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>26</td>\n",
       "      <td>538</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>21</td>\n",
       "      <td>Trial 19</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>1</td>\n",
       "      <td>585</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>22</td>\n",
       "      <td>Trial 20</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>37</td>\n",
       "      <td>613</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>23</td>\n",
       "      <td>Debrief</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Complete</td>\n",
       "      <td>642</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>24</td>\n",
       "      <td>Complete</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Done</td>\n",
       "      <td>666</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    trial_index      task             stimulus     value  time_elapsed\n",
       "19           20  Trial 18  blue_blue_Congruent        26           538\n",
       "20           21  Trial 19    red_red_Congruent         1           585\n",
       "21           22  Trial 20    red_red_Congruent        37           613\n",
       "22           23   Debrief                  NaN  Complete           642\n",
       "23           24  Complete                  NaN      Done           666"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example.tail()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "1ac01988-b54c-4232-850f-e3a1dd9dbc0b",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 24 entries, 0 to 23\n",
      "Data columns (total 5 columns):\n",
      " #   Column        Non-Null Count  Dtype \n",
      "---  ------        --------------  ----- \n",
      " 0   trial_index   24 non-null     int64 \n",
      " 1   task          24 non-null     object\n",
      " 2   stimulus      20 non-null     object\n",
      " 3   value         24 non-null     object\n",
      " 4   time_elapsed  24 non-null     int64 \n",
      "dtypes: int64(2), object(3)\n",
      "memory usage: 1.1+ KB\n"
     ]
    }
   ],
   "source": [
    "data_read_example.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "a382e46f-00ab-48e1-b82e-f0a0299ed00f",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>time_elapsed</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>24.000000</td>\n",
       "      <td>24.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>12.500000</td>\n",
       "      <td>341.666667</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>7.071068</td>\n",
       "      <td>205.208837</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>21.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>6.750000</td>\n",
       "      <td>147.250000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>12.500000</td>\n",
       "      <td>355.500000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>18.250000</td>\n",
       "      <td>501.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>24.000000</td>\n",
       "      <td>666.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       trial_index  time_elapsed\n",
       "count    24.000000     24.000000\n",
       "mean     12.500000    341.666667\n",
       "std       7.071068    205.208837\n",
       "min       1.000000     21.000000\n",
       "25%       6.750000    147.250000\n",
       "50%      12.500000    355.500000\n",
       "75%      18.250000    501.000000\n",
       "max      24.000000    666.000000"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d229c9de-9ce0-4b46-94dd-978deb94fdb0",
   "metadata": {},
   "source": [
    "### Subsetting DataFrames\n",
    "\n",
    "#### Columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "37c7fcea-eecf-499b-9a8d-01d31718fce8",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      21\n",
       "1      47\n",
       "2      63\n",
       "3      91\n",
       "4     116\n",
       "5     130\n",
       "6     153\n",
       "7     200\n",
       "8     229\n",
       "9     275\n",
       "10    321\n",
       "11    345\n",
       "12    366\n",
       "13    408\n",
       "14    430\n",
       "15    461\n",
       "16    488\n",
       "17    496\n",
       "18    516\n",
       "19    538\n",
       "20    585\n",
       "21    613\n",
       "22    642\n",
       "23    666\n",
       "Name: time_elapsed, dtype: int64"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[\"time_elapsed\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "dc17cacc-ffc0-4a66-9df4-b5e243ca750f",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>time_elapsed</th>\n",
       "      <th>task</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>21</td>\n",
       "      <td>Participant ID</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>47</td>\n",
       "      <td>Consent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>63</td>\n",
       "      <td>Trial 1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>91</td>\n",
       "      <td>Trial 2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>116</td>\n",
       "      <td>Trial 3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>130</td>\n",
       "      <td>Trial 4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>153</td>\n",
       "      <td>Trial 5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>200</td>\n",
       "      <td>Trial 6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>229</td>\n",
       "      <td>Trial 7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>275</td>\n",
       "      <td>Trial 8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>321</td>\n",
       "      <td>Trial 9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>345</td>\n",
       "      <td>Trial 10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>366</td>\n",
       "      <td>Trial 11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>408</td>\n",
       "      <td>Trial 12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>430</td>\n",
       "      <td>Trial 13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>461</td>\n",
       "      <td>Trial 14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>488</td>\n",
       "      <td>Trial 15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>496</td>\n",
       "      <td>Trial 16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>516</td>\n",
       "      <td>Trial 17</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>538</td>\n",
       "      <td>Trial 18</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>585</td>\n",
       "      <td>Trial 19</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>613</td>\n",
       "      <td>Trial 20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>642</td>\n",
       "      <td>Debrief</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>666</td>\n",
       "      <td>Complete</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    time_elapsed            task\n",
       "0             21  Participant ID\n",
       "1             47         Consent\n",
       "2             63         Trial 1\n",
       "3             91         Trial 2\n",
       "4            116         Trial 3\n",
       "5            130         Trial 4\n",
       "6            153         Trial 5\n",
       "7            200         Trial 6\n",
       "8            229         Trial 7\n",
       "9            275         Trial 8\n",
       "10           321         Trial 9\n",
       "11           345        Trial 10\n",
       "12           366        Trial 11\n",
       "13           408        Trial 12\n",
       "14           430        Trial 13\n",
       "15           461        Trial 14\n",
       "16           488        Trial 15\n",
       "17           496        Trial 16\n",
       "18           516        Trial 17\n",
       "19           538        Trial 18\n",
       "20           585        Trial 19\n",
       "21           613        Trial 20\n",
       "22           642         Debrief\n",
       "23           666        Complete"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[[\"time_elapsed\", \"task\"]]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c48a0f3d-1bbd-4d74-868a-c79b38147b0b",
   "metadata": {},
   "source": [
    "#### Rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "bb43eace-9c27-40a4-a39a-263378ae8bdb",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>14</td>\n",
       "      <td>Trial 12</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>15</td>\n",
       "      <td>408</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>15</td>\n",
       "      <td>Trial 13</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>22</td>\n",
       "      <td>430</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>16</td>\n",
       "      <td>Trial 14</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>55</td>\n",
       "      <td>461</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>17</td>\n",
       "      <td>Trial 15</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>66</td>\n",
       "      <td>488</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>18</td>\n",
       "      <td>Trial 16</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>48</td>\n",
       "      <td>496</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>19</td>\n",
       "      <td>Trial 17</td>\n",
       "      <td>red_blue_Incongruent</td>\n",
       "      <td>75</td>\n",
       "      <td>516</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>20</td>\n",
       "      <td>Trial 18</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>26</td>\n",
       "      <td>538</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>21</td>\n",
       "      <td>Trial 19</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>1</td>\n",
       "      <td>585</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>22</td>\n",
       "      <td>Trial 20</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>37</td>\n",
       "      <td>613</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>23</td>\n",
       "      <td>Debrief</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Complete</td>\n",
       "      <td>642</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>24</td>\n",
       "      <td>Complete</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Done</td>\n",
       "      <td>666</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    trial_index      task              stimulus     value  time_elapsed\n",
       "13           14  Trial 12     red_red_Congruent        15           408\n",
       "14           15  Trial 13     red_red_Congruent        22           430\n",
       "15           16  Trial 14  blue_red_Incongruent        55           461\n",
       "16           17  Trial 15  blue_red_Incongruent        66           488\n",
       "17           18  Trial 16     red_red_Congruent        48           496\n",
       "18           19  Trial 17  red_blue_Incongruent        75           516\n",
       "19           20  Trial 18   blue_blue_Congruent        26           538\n",
       "20           21  Trial 19     red_red_Congruent         1           585\n",
       "21           22  Trial 20     red_red_Congruent        37           613\n",
       "22           23   Debrief                   NaN  Complete           642\n",
       "23           24  Complete                   NaN      Done           666"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[data_read_example[\"time_elapsed\"] > 400]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "2499275d-cb5b-4f9c-8f7f-8de239b89b17",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     False\n",
       "1     False\n",
       "2     False\n",
       "3     False\n",
       "4     False\n",
       "5     False\n",
       "6     False\n",
       "7     False\n",
       "8     False\n",
       "9     False\n",
       "10    False\n",
       "11    False\n",
       "12    False\n",
       "13     True\n",
       "14     True\n",
       "15     True\n",
       "16     True\n",
       "17     True\n",
       "18     True\n",
       "19     True\n",
       "20     True\n",
       "21     True\n",
       "22     True\n",
       "23     True\n",
       "Name: time_elapsed, dtype: bool"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[\"time_elapsed\"] > 400"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "7a43bbed-24b6-4c82-beba-e95c55e90d6c",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     False\n",
       "1     False\n",
       "2     False\n",
       "3     False\n",
       "4     False\n",
       "5     False\n",
       "6     False\n",
       "7     False\n",
       "8     False\n",
       "9     False\n",
       "10    False\n",
       "11    False\n",
       "12    False\n",
       "13    False\n",
       "14     True\n",
       "15    False\n",
       "16    False\n",
       "17    False\n",
       "18     True\n",
       "19    False\n",
       "20    False\n",
       "21    False\n",
       "22    False\n",
       "23    False\n",
       "Name: task, dtype: bool"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[\"task\"].isin([\"Trial 13\", \"Trial 17\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "c09ca6b8-5897-4671-91b4-16cb52da2a9e",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>15</td>\n",
       "      <td>Trial 13</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>22</td>\n",
       "      <td>430</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>19</td>\n",
       "      <td>Trial 17</td>\n",
       "      <td>red_blue_Incongruent</td>\n",
       "      <td>75</td>\n",
       "      <td>516</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    trial_index      task              stimulus value  time_elapsed\n",
       "14           15  Trial 13     red_red_Congruent    22           430\n",
       "18           19  Trial 17  red_blue_Incongruent    75           516"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[data_read_example[\"task\"].isin([\"Trial 13\", \"Trial 17\"])]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ae2349ad-6edd-4503-8a91-44c01f359bc7",
   "metadata": {},
   "source": [
    "#### Rows and Columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "0750cb15-8cdf-4401-966b-b4ed6db8ff78",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>22</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             stimulus value\n",
       "14  red_red_Congruent    22"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example.loc[data_read_example[\"task\"].isin([\"Trial 13\"]), [\"stimulus\", \"value\"]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "a5c04042-474b-482b-a706-c70bbd806ccb",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>81</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "               stimulus value\n",
       "3   blue_blue_Congruent    51\n",
       "4     red_red_Congruent    51\n",
       "5   blue_blue_Congruent    16\n",
       "6  blue_red_Incongruent    81"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example.iloc[3:7,2:4]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "39dfa042-0abc-4641-a7f8-c7b991e1966d",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Participant ID</td>\n",
       "      <td>NaN</td>\n",
       "      <td>001</td>\n",
       "      <td>21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Consent</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Complete</td>\n",
       "      <td>222</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>Trial 1</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>67</td>\n",
       "      <td>222</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>Trial 2</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>91</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>Trial 3</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>116</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   trial_index            task              stimulus     value  time_elapsed\n",
       "0            1  Participant ID                   NaN       001            21\n",
       "1            2         Consent                   NaN  Complete           222\n",
       "2            3         Trial 1  blue_red_Incongruent        67           222\n",
       "3            4         Trial 2   blue_blue_Congruent        51            91\n",
       "4            5         Trial 3     red_red_Congruent        51           116"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example.iloc[1:3, 4] = 222\n",
    "data_read_example.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f721fec7-1ecf-4c7a-aea1-3b11a8745cb2",
   "metadata": {},
   "source": [
    "### Creating New Columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "88adb964-fed6-4515-b4a4-ba6232229718",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "      <th>time_seconds</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Participant ID</td>\n",
       "      <td>NaN</td>\n",
       "      <td>001</td>\n",
       "      <td>21</td>\n",
       "      <td>0.021</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Consent</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Complete</td>\n",
       "      <td>222</td>\n",
       "      <td>0.222</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>Trial 1</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>67</td>\n",
       "      <td>222</td>\n",
       "      <td>0.222</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>Trial 2</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>91</td>\n",
       "      <td>0.091</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>Trial 3</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>116</td>\n",
       "      <td>0.116</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   trial_index            task              stimulus     value  time_elapsed  \\\n",
       "0            1  Participant ID                   NaN       001            21   \n",
       "1            2         Consent                   NaN  Complete           222   \n",
       "2            3         Trial 1  blue_red_Incongruent        67           222   \n",
       "3            4         Trial 2   blue_blue_Congruent        51            91   \n",
       "4            5         Trial 3     red_red_Congruent        51           116   \n",
       "\n",
       "   time_seconds  \n",
       "0         0.021  \n",
       "1         0.222  \n",
       "2         0.222  \n",
       "3         0.091  \n",
       "4         0.116  "
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[\"time_seconds\"] = data_read_example[\"time_elapsed\"] / 1000\n",
    "data_read_example.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d6ef3065-8f4d-46e7-9648-dfcd0cbc9d1d",
   "metadata": {},
   "source": [
    "#### Renaming Columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "42960600-9ef7-46ba-bf4a-ace024f9f2f4",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "      <th>time_s</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Participant ID</td>\n",
       "      <td>NaN</td>\n",
       "      <td>001</td>\n",
       "      <td>21</td>\n",
       "      <td>0.021</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Consent</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Complete</td>\n",
       "      <td>222</td>\n",
       "      <td>0.222</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>Trial 1</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>67</td>\n",
       "      <td>222</td>\n",
       "      <td>0.222</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>Trial 2</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>91</td>\n",
       "      <td>0.091</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>Trial 3</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>116</td>\n",
       "      <td>0.116</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   trial_index            task              stimulus     value  time_elapsed  \\\n",
       "0            1  Participant ID                   NaN       001            21   \n",
       "1            2         Consent                   NaN  Complete           222   \n",
       "2            3         Trial 1  blue_red_Incongruent        67           222   \n",
       "3            4         Trial 2   blue_blue_Congruent        51            91   \n",
       "4            5         Trial 3     red_red_Congruent        51           116   \n",
       "\n",
       "   time_s  \n",
       "0   0.021  \n",
       "1   0.222  \n",
       "2   0.222  \n",
       "3   0.091  \n",
       "4   0.116  "
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example.rename(columns={\"time_seconds\": \"time_s\"}, inplace=True)\n",
    "data_read_example.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "031e90c1-528d-4d46-86b5-be189da55264",
   "metadata": {},
   "source": [
    "### Summarizing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "f1ac2866-34c3-46c3-9b78-bc14556e639c",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.3555833333333333"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[\"time_s\"].mean()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "14c5479a-84f4-489c-bd19-f9005a7f922d",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>time_s</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>0.355583</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>0.189743</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        time_s\n",
       "mean  0.355583\n",
       "std   0.189743"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example.agg(\n",
    "    {\n",
    "        \"time_s\": [\"mean\", \"std\"]\n",
    "    }\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dcd6ac3e-5e34-4a81-91bb-83b8aa54214b",
   "metadata": {},
   "source": [
    "#### Grouping"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "3742041f-f799-4b40-a0f1-b5d5e5119bdf",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>time_elapsed</th>\n",
       "      <th>time_s</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>stimulus</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>blue_blue_Congruent</th>\n",
       "      <td>253.000000</td>\n",
       "      <td>0.253000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>blue_red_Incongruent</th>\n",
       "      <td>331.000000</td>\n",
       "      <td>0.331000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>red_blue_Incongruent</th>\n",
       "      <td>337.166667</td>\n",
       "      <td>0.337167</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>red_red_Congruent</th>\n",
       "      <td>411.000000</td>\n",
       "      <td>0.411000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                      time_elapsed    time_s\n",
       "stimulus                                    \n",
       "blue_blue_Congruent     253.000000  0.253000\n",
       "blue_red_Incongruent    331.000000  0.331000\n",
       "red_blue_Incongruent    337.166667  0.337167\n",
       "red_red_Congruent       411.000000  0.411000"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example.groupby(\"stimulus\").mean(numeric_only=True)[[\"time_elapsed\", \"time_s\"]]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "192963fb-36f8-47e1-971d-649b87da3e96",
   "metadata": {},
   "source": [
    "#### Counting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "95926985-8970-4eb9-93c7-ed8bebf04521",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "51          2\n",
       "75          2\n",
       "48          2\n",
       "Complete    2\n",
       "001         1\n",
       "15          1\n",
       "37          1\n",
       "1           1\n",
       "26          1\n",
       "66          1\n",
       "55          1\n",
       "22          1\n",
       "31          1\n",
       "58          1\n",
       "34          1\n",
       "61          1\n",
       "81          1\n",
       "16          1\n",
       "67          1\n",
       "Done        1\n",
       "Name: value, dtype: int64"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[\"value\"].value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f061ad1-6516-4f1a-b3eb-64f6101e6404",
   "metadata": {},
   "source": [
    "### Sorting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "b349fe3c-492d-478c-8282-96571f9a3de4",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "      <th>time_s</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>24</td>\n",
       "      <td>Complete</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Done</td>\n",
       "      <td>666</td>\n",
       "      <td>0.666</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Consent</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Complete</td>\n",
       "      <td>222</td>\n",
       "      <td>0.222</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>23</td>\n",
       "      <td>Debrief</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Complete</td>\n",
       "      <td>642</td>\n",
       "      <td>0.642</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Participant ID</td>\n",
       "      <td>NaN</td>\n",
       "      <td>001</td>\n",
       "      <td>21</td>\n",
       "      <td>0.021</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>Trial 1</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>67</td>\n",
       "      <td>222</td>\n",
       "      <td>0.222</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    trial_index            task              stimulus     value  time_elapsed  \\\n",
       "23           24        Complete                   NaN      Done           666   \n",
       "1             2         Consent                   NaN  Complete           222   \n",
       "22           23         Debrief                   NaN  Complete           642   \n",
       "0             1  Participant ID                   NaN       001            21   \n",
       "2             3         Trial 1  blue_red_Incongruent        67           222   \n",
       "\n",
       "    time_s  \n",
       "23   0.666  \n",
       "1    0.222  \n",
       "22   0.642  \n",
       "0    0.021  \n",
       "2    0.222  "
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example.sort_values(by=\"task\").head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "1c91333d-9fba-4ebc-a963-e79f5dd6ef24",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "      <th>time_s</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>22</td>\n",
       "      <td>Trial 20</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>37</td>\n",
       "      <td>613</td>\n",
       "      <td>0.613</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>21</td>\n",
       "      <td>Trial 19</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>1</td>\n",
       "      <td>585</td>\n",
       "      <td>0.585</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>18</td>\n",
       "      <td>Trial 16</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>48</td>\n",
       "      <td>496</td>\n",
       "      <td>0.496</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>15</td>\n",
       "      <td>Trial 13</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>22</td>\n",
       "      <td>430</td>\n",
       "      <td>0.430</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>14</td>\n",
       "      <td>Trial 12</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>15</td>\n",
       "      <td>408</td>\n",
       "      <td>0.408</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    trial_index      task           stimulus value  time_elapsed  time_s\n",
       "21           22  Trial 20  red_red_Congruent    37           613   0.613\n",
       "20           21  Trial 19  red_red_Congruent     1           585   0.585\n",
       "17           18  Trial 16  red_red_Congruent    48           496   0.496\n",
       "14           15  Trial 13  red_red_Congruent    22           430   0.430\n",
       "13           14  Trial 12  red_red_Congruent    15           408   0.408"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example.sort_values(by=[\"stimulus\", \"time_elapsed\"], ascending=False).head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d1611f84-4018-4497-a6bb-a833cd5a0859",
   "metadata": {},
   "source": [
    "### Formatting: Long to Wide"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "1a0839d7-ae90-41cd-a38b-7127f8afa8e5",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Type</th>\n",
       "      <th>Value</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>A</td>\n",
       "      <td>20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>B</td>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>C</td>\n",
       "      <td>40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>A</td>\n",
       "      <td>50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>B</td>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Type  Value\n",
       "0    A     20\n",
       "1    B     30\n",
       "2    C     40\n",
       "3    A     50\n",
       "4    B     30"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df02 = pd.DataFrame({\n",
    "    \"Type\": [\"A\", \"B\", \"C\", \"A\", \"B\"],\n",
    "    \"Value\": [20, 30, 40, 50, 30]\n",
    "})\n",
    "df02"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "a49b64bd-84d8-4cf0-80fa-0e69e12a7efd",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Type</th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>20.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>NaN</td>\n",
       "      <td>30.0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>40.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>50.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>NaN</td>\n",
       "      <td>30.0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Type     A     B     C\n",
       "0     20.0   NaN   NaN\n",
       "1      NaN  30.0   NaN\n",
       "2      NaN   NaN  40.0\n",
       "3     50.0   NaN   NaN\n",
       "4      NaN  30.0   NaN"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df02.pivot(columns=\"Type\", values=\"Value\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "9665983a-b1a9-4ced-8dd2-5d623a43ac1d",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>Type</th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Value</th>\n",
       "      <td>35</td>\n",
       "      <td>30</td>\n",
       "      <td>40</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "Type    A   B   C\n",
       "Value  35  30  40"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df02.pivot_table(columns=\"Type\", values=\"Value\", aggfunc=np.mean)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5868d644-f52d-4fbb-b0d4-dda6b727fce2",
   "metadata": {},
   "source": [
    "### Formatting: Wide to Long"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "1c410c4a-8ce3-4b55-b7f7-8922c0c3f3e5",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Participant ID</th>\n",
       "      <th>Task 1</th>\n",
       "      <th>Task 2</th>\n",
       "      <th>Task 3</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>001</td>\n",
       "      <td>20</td>\n",
       "      <td>22</td>\n",
       "      <td>18</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Participant ID  Task 1  Task 2  Task 3\n",
       "0            001      20      22      18"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df03 = pd.DataFrame({\n",
    "    \"Participant ID\": [\"001\"],\n",
    "    \"Task 1\": [20],\n",
    "    \"Task 2\": [22],\n",
    "    \"Task 3\": [18],\n",
    "})\n",
    "df03"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "95a92d0c-f894-45bf-81e4-9d774191d4e9",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Participant ID</th>\n",
       "      <th>task</th>\n",
       "      <th>value</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>001</td>\n",
       "      <td>Task 1</td>\n",
       "      <td>20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>001</td>\n",
       "      <td>Task 2</td>\n",
       "      <td>22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>001</td>\n",
       "      <td>Task 3</td>\n",
       "      <td>18</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Participant ID    task  value\n",
       "0            001  Task 1     20\n",
       "1            001  Task 2     22\n",
       "2            001  Task 3     18"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df03.melt(id_vars=\"Participant ID\", var_name=\"task\", value_name=\"value\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "7b41dd32-e019-485f-aaed-646287927016",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Participant ID</th>\n",
       "      <th>Task 3</th>\n",
       "      <th>task</th>\n",
       "      <th>value</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>001</td>\n",
       "      <td>18</td>\n",
       "      <td>Task 1</td>\n",
       "      <td>20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>001</td>\n",
       "      <td>18</td>\n",
       "      <td>Task 2</td>\n",
       "      <td>22</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Participant ID  Task 3    task  value\n",
       "0            001      18  Task 1     20\n",
       "1            001      18  Task 2     22"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df03.melt(\n",
    "    id_vars=[\"Participant ID\", \"Task 3\"], \n",
    "    value_vars=[\"Task 1\", \"Task 2\"],\n",
    "    var_name=\"task\", \n",
    "    value_name=\"value\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73828442-8818-4159-b545-45ee08454541",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Combining Data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "429b6be8-9031-4d32-a828-2644c2d47266",
   "metadata": {},
   "source": [
    "#### Concatenation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "36e3bba1-fe72-4b22-874e-f9bc1b13f06d",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col_A</th>\n",
       "      <th>col_B</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>A</td>\n",
       "      <td>22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>B</td>\n",
       "      <td>33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>C</td>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>D</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  col_A  col_B\n",
       "0     A     22\n",
       "1     B     33\n",
       "2     C     30\n",
       "3     D     10"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df01 = pd.DataFrame({\n",
    "    \"col_A\": [\"A\", \"B\"],\n",
    "    \"col_B\": [22, 33],\n",
    "})\n",
    "\n",
    "df02 = pd.DataFrame({\n",
    "    \"col_A\": [\"C\", \"D\"],\n",
    "    \"col_B\": [30, 10],\n",
    "})\n",
    "\n",
    "pd.concat([df01, df02], ignore_index=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "bb7f99e7-e97a-4e20-a5d3-5af0288a3abb",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>A</td>\n",
       "      <td>22</td>\n",
       "      <td>C</td>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>B</td>\n",
       "      <td>33</td>\n",
       "      <td>D</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   0   1  2   3\n",
       "0  A  22  C  30\n",
       "1  B  33  D  10"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.concat([df01, df02], ignore_index=True, axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "75238f80-13ab-4263-94dd-777168bb4796",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col_A</th>\n",
       "      <th>col_B</th>\n",
       "      <th>col_C</th>\n",
       "      <th>col_D</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>A</td>\n",
       "      <td>22.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>B</td>\n",
       "      <td>33.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>C</td>\n",
       "      <td>30.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>D</td>\n",
       "      <td>10.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  col_A  col_B col_C  col_D\n",
       "0     A   22.0   NaN    NaN\n",
       "1     B   33.0   NaN    NaN\n",
       "2   NaN    NaN     C   30.0\n",
       "3   NaN    NaN     D   10.0"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df01 = pd.DataFrame({\n",
    "    \"col_A\": [\"A\", \"B\"],\n",
    "    \"col_B\": [22, 33],\n",
    "})\n",
    "\n",
    "df02 = pd.DataFrame({\n",
    "    \"col_C\": [\"C\", \"D\"],\n",
    "    \"col_D\": [30, 10],\n",
    "})\n",
    "\n",
    "pd.concat([df01, df02], ignore_index=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "e524efba-627a-4d36-9809-49d10289cbce",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col_A</th>\n",
       "      <th>col_B</th>\n",
       "      <th>col_C</th>\n",
       "      <th>col_D</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>A</td>\n",
       "      <td>22.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>B</td>\n",
       "      <td>33.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>C</td>\n",
       "      <td>30.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>D</td>\n",
       "      <td>10.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  col_A  col_B col_C  col_D\n",
       "0     A   22.0   NaN    NaN\n",
       "1     B   33.0   NaN    NaN\n",
       "2   NaN    NaN     C   30.0\n",
       "3   NaN    NaN     D   10.0"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.concat([df01, df02], ignore_index=True, keys=[\"col_A\", \"col_C\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac3dc7ac-8989-4c69-b5d4-fd1216ceccb7",
   "metadata": {},
   "source": [
    "#### Merging"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "ded5917c-4eaa-4d34-a5f9-481cca0ed87d",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col_A</th>\n",
       "      <th>col_B</th>\n",
       "      <th>col_C</th>\n",
       "      <th>col_D</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>A</td>\n",
       "      <td>22</td>\n",
       "      <td>A</td>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  col_A  col_B col_C  col_D\n",
       "0     A     22     A     30"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df01 = pd.DataFrame({\n",
    "    \"col_A\": [\"A\", \"B\"],\n",
    "    \"col_B\": [22, 33],\n",
    "})\n",
    "\n",
    "df02 = pd.DataFrame({\n",
    "    \"col_C\": [\"A\", \"C\"],\n",
    "    \"col_D\": [30, 10],\n",
    "})\n",
    "\n",
    "pd.merge(df01, df02, left_on=\"col_A\", right_on=\"col_C\") # how=outer/inner - left/right/cross"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "973e99d2-fa64-4938-b39d-e3f30c1ab21b",
   "metadata": {},
   "source": [
    "### Manipulating Text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "15984090-5725-4af3-af66-3cfa6806460c",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "      <th>time_s</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>participant id</td>\n",
       "      <td>NaN</td>\n",
       "      <td>001</td>\n",
       "      <td>21</td>\n",
       "      <td>0.021</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>consent</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Complete</td>\n",
       "      <td>222</td>\n",
       "      <td>0.222</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>trial 1</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>67</td>\n",
       "      <td>222</td>\n",
       "      <td>0.222</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>trial 2</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>91</td>\n",
       "      <td>0.091</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>trial 3</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>116</td>\n",
       "      <td>0.116</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   trial_index            task              stimulus     value  time_elapsed  \\\n",
       "0            1  participant id                   NaN       001            21   \n",
       "1            2         consent                   NaN  Complete           222   \n",
       "2            3         trial 1  blue_red_Incongruent        67           222   \n",
       "3            4         trial 2   blue_blue_Congruent        51            91   \n",
       "4            5         trial 3     red_red_Congruent        51           116   \n",
       "\n",
       "   time_s  \n",
       "0   0.021  \n",
       "1   0.222  \n",
       "2   0.222  \n",
       "3   0.091  \n",
       "4   0.116  "
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[\"task\"] = data_read_example[\"task\"].str.lower()\n",
    "data_read_example.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "5175ace1-634e-483f-b4dd-463c85e3ed5a",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "      <th>time_s</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>trial 1</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>67</td>\n",
       "      <td>222</td>\n",
       "      <td>0.222</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>trial 2</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>91</td>\n",
       "      <td>0.091</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>trial 3</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>51</td>\n",
       "      <td>116</td>\n",
       "      <td>0.116</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>6</td>\n",
       "      <td>trial 4</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>16</td>\n",
       "      <td>130</td>\n",
       "      <td>0.130</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>7</td>\n",
       "      <td>trial 5</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>81</td>\n",
       "      <td>153</td>\n",
       "      <td>0.153</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   trial_index     task              stimulus value  time_elapsed  time_s\n",
       "2            3  trial 1  blue_red_Incongruent    67           222   0.222\n",
       "3            4  trial 2   blue_blue_Congruent    51            91   0.091\n",
       "4            5  trial 3     red_red_Congruent    51           116   0.116\n",
       "5            6  trial 4   blue_blue_Congruent    16           130   0.130\n",
       "6            7  trial 5  blue_red_Incongruent    81           153   0.153"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[data_read_example[\"task\"].str.contains(\"trial\")].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "142af218-72ec-4f43-8af6-3ddf20361ed0",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0                          NaN\n",
       "1                          NaN\n",
       "2     [blue, red, Incongruent]\n",
       "3      [blue, blue, Congruent]\n",
       "4        [red, red, Congruent]\n",
       "5      [blue, blue, Congruent]\n",
       "6     [blue, red, Incongruent]\n",
       "7     [red, blue, Incongruent]\n",
       "8        [red, red, Congruent]\n",
       "9     [red, blue, Incongruent]\n",
       "10    [red, blue, Incongruent]\n",
       "11    [red, blue, Incongruent]\n",
       "12    [red, blue, Incongruent]\n",
       "13       [red, red, Congruent]\n",
       "14       [red, red, Congruent]\n",
       "15    [blue, red, Incongruent]\n",
       "16    [blue, red, Incongruent]\n",
       "17       [red, red, Congruent]\n",
       "18    [red, blue, Incongruent]\n",
       "19     [blue, blue, Congruent]\n",
       "20       [red, red, Congruent]\n",
       "21       [red, red, Congruent]\n",
       "22                         NaN\n",
       "23                         NaN\n",
       "Name: stimulus, dtype: object"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example[\"stimulus\"].str.split(\"_\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "33a3110c-a532-4580-826b-cd70e7f7d109",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Word</th>\n",
       "      <th>Colour</th>\n",
       "      <th>Condition</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>blue</td>\n",
       "      <td>red</td>\n",
       "      <td>Incongruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>blue</td>\n",
       "      <td>red</td>\n",
       "      <td>Incongruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>blue</td>\n",
       "      <td>red</td>\n",
       "      <td>Incongruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>blue</td>\n",
       "      <td>red</td>\n",
       "      <td>Incongruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Word Colour    Condition\n",
       "2   blue    red  Incongruent\n",
       "3   blue   blue    Congruent\n",
       "4    red    red    Congruent\n",
       "5   blue   blue    Congruent\n",
       "6   blue    red  Incongruent\n",
       "7    red   blue  Incongruent\n",
       "8    red    red    Congruent\n",
       "9    red   blue  Incongruent\n",
       "10   red   blue  Incongruent\n",
       "11   red   blue  Incongruent\n",
       "12   red   blue  Incongruent\n",
       "13   red    red    Congruent\n",
       "14   red    red    Congruent\n",
       "15  blue    red  Incongruent\n",
       "16  blue    red  Incongruent\n",
       "17   red    red    Congruent\n",
       "18   red   blue  Incongruent\n",
       "19  blue   blue    Congruent\n",
       "20   red    red    Congruent\n",
       "21   red    red    Congruent"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_read_example.loc[data_read_example[\"task\"].str.contains(\n",
    "    \"trial\"), \"stimulus\"].str.split(\"_\", expand=True).rename(\n",
    "    columns={0: \"Word\", 1: \"Colour\", 2: \"Condition\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cd89d8fd-3fcd-4783-80ec-fcfbd6fdb74a",
   "metadata": {
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "source": [
    "## Putting it all Together\n",
    "\n",
    "We can use pandas along with the list of files to create a clean set of data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "id": "e8a0e1ab-1171-4fab-8374-3046902953d0",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['id_005_survey.csv', 'id_008_survey.csv', 'id_007_survey.csv', 'id_002_survey.csv', 'id_010_survey.csv', 'id_001_survey.csv', 'id_004_survey.csv', 'id_009_survey.csv', 'id_003_survey.csv', 'id_006_survey.csv']\n",
      "['id_007_trial.csv', 'id_001_trial.csv', 'id_006_trial.csv', 'id_005_trial.csv', 'id_008_trial.csv', 'id_003_trial.csv', 'id_004_trial.csv', 'id_010_trial.csv', 'id_009_trial.csv', 'id_002_trial.csv']\n"
     ]
    }
   ],
   "source": [
    "print(files_raw_survey)\n",
    "print(files_raw_trials)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f02ae94-a4ab-4f74-8752-f583685ee48c",
   "metadata": {
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "source": [
    "### Create Survey Data\n",
    "\n",
    "Start by creating an empty dataframe to store all of the survey data.\n",
    "\n",
    "Then, read each file in `files_raw_survey` into a temporary dataframe and concatenate to the empty survey data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "id": "198f5fec-2b7d-4393-a761-c55e6ab3d6b4",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Participant ID</th>\n",
       "      <th>Age</th>\n",
       "      <th>Gender</th>\n",
       "      <th>Income</th>\n",
       "      <th>Education</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>5</td>\n",
       "      <td>38</td>\n",
       "      <td>Male</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>8</td>\n",
       "      <td>23</td>\n",
       "      <td>Female</td>\n",
       "      <td>High</td>\n",
       "      <td>High school</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>7</td>\n",
       "      <td>25</td>\n",
       "      <td>Male</td>\n",
       "      <td>High</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2</td>\n",
       "      <td>33</td>\n",
       "      <td>Male</td>\n",
       "      <td>High</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>10</td>\n",
       "      <td>23</td>\n",
       "      <td>Female</td>\n",
       "      <td>Medium</td>\n",
       "      <td>High school</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>20</td>\n",
       "      <td>Female</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>4</td>\n",
       "      <td>21</td>\n",
       "      <td>Male</td>\n",
       "      <td>Low</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>9</td>\n",
       "      <td>26</td>\n",
       "      <td>Male</td>\n",
       "      <td>Low</td>\n",
       "      <td>College</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>3</td>\n",
       "      <td>31</td>\n",
       "      <td>Female</td>\n",
       "      <td>Low</td>\n",
       "      <td>College</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>6</td>\n",
       "      <td>24</td>\n",
       "      <td>Female</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Participant ID  Age  Gender  Income    Education\n",
       "0               5   38    Male  Medium   University\n",
       "0               8   23  Female    High  High school\n",
       "0               7   25    Male    High   University\n",
       "0               2   33    Male    High   University\n",
       "0              10   23  Female  Medium  High school\n",
       "0               1   20  Female  Medium   University\n",
       "0               4   21    Male     Low   University\n",
       "0               9   26    Male     Low      College\n",
       "0               3   31  Female     Low      College\n",
       "0               6   24  Female  Medium   University"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_survey = pd.DataFrame()\n",
    "\n",
    "for file in files_raw_survey:\n",
    "    tmp_df = pd.read_csv(dir_raw + file)\n",
    "    data_survey = pd.concat([data_survey, tmp_df])\n",
    "    \n",
    "data_survey"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bfd74b02-5b94-4071-aec4-ecf84ed81caa",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Create Trials data\n",
    "\n",
    "We can read the trial data in a similar way, but we will need to select and filter data as we go.\n",
    "\n",
    "Lets just start with one file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "id": "ea8cf893-5312-4c15-8a07-22bb52c3d791",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>trial_index</th>\n",
       "      <th>task</th>\n",
       "      <th>stimulus</th>\n",
       "      <th>value</th>\n",
       "      <th>time_elapsed</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Participant ID</td>\n",
       "      <td>NaN</td>\n",
       "      <td>007</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Consent</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Complete</td>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>Trial 1</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>32</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>Trial 2</td>\n",
       "      <td>red_blue_Incongruent</td>\n",
       "      <td>76</td>\n",
       "      <td>105</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>Trial 3</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>43</td>\n",
       "      <td>116</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>6</td>\n",
       "      <td>Trial 4</td>\n",
       "      <td>red_blue_Incongruent</td>\n",
       "      <td>85</td>\n",
       "      <td>130</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>7</td>\n",
       "      <td>Trial 5</td>\n",
       "      <td>red_blue_Incongruent</td>\n",
       "      <td>49</td>\n",
       "      <td>166</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>8</td>\n",
       "      <td>Trial 6</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>2</td>\n",
       "      <td>171</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>9</td>\n",
       "      <td>Trial 7</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>55</td>\n",
       "      <td>201</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>10</td>\n",
       "      <td>Trial 8</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>34</td>\n",
       "      <td>241</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>11</td>\n",
       "      <td>Trial 9</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>35</td>\n",
       "      <td>246</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>12</td>\n",
       "      <td>Trial 10</td>\n",
       "      <td>red_blue_Incongruent</td>\n",
       "      <td>71</td>\n",
       "      <td>288</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>13</td>\n",
       "      <td>Trial 11</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>60</td>\n",
       "      <td>294</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>14</td>\n",
       "      <td>Trial 12</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>49</td>\n",
       "      <td>341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>15</td>\n",
       "      <td>Trial 13</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>20</td>\n",
       "      <td>360</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>16</td>\n",
       "      <td>Trial 14</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>8</td>\n",
       "      <td>365</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>17</td>\n",
       "      <td>Trial 15</td>\n",
       "      <td>red_blue_Incongruent</td>\n",
       "      <td>72</td>\n",
       "      <td>381</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>18</td>\n",
       "      <td>Trial 16</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>35</td>\n",
       "      <td>387</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>19</td>\n",
       "      <td>Trial 17</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>56</td>\n",
       "      <td>436</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>20</td>\n",
       "      <td>Trial 18</td>\n",
       "      <td>blue_blue_Congruent</td>\n",
       "      <td>53</td>\n",
       "      <td>449</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>21</td>\n",
       "      <td>Trial 19</td>\n",
       "      <td>red_red_Congruent</td>\n",
       "      <td>58</td>\n",
       "      <td>465</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>22</td>\n",
       "      <td>Trial 20</td>\n",
       "      <td>blue_red_Incongruent</td>\n",
       "      <td>35</td>\n",
       "      <td>514</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>23</td>\n",
       "      <td>Debrief</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Complete</td>\n",
       "      <td>549</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>24</td>\n",
       "      <td>Complete</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Done</td>\n",
       "      <td>562</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    trial_index            task              stimulus     value  time_elapsed\n",
       "0             1  Participant ID                   NaN       007            27\n",
       "1             2         Consent                   NaN  Complete            56\n",
       "2             3         Trial 1     red_red_Congruent        32            80\n",
       "3             4         Trial 2  red_blue_Incongruent        76           105\n",
       "4             5         Trial 3     red_red_Congruent        43           116\n",
       "5             6         Trial 4  red_blue_Incongruent        85           130\n",
       "6             7         Trial 5  red_blue_Incongruent        49           166\n",
       "7             8         Trial 6   blue_blue_Congruent         2           171\n",
       "8             9         Trial 7   blue_blue_Congruent        55           201\n",
       "9            10         Trial 8   blue_blue_Congruent        34           241\n",
       "10           11         Trial 9  blue_red_Incongruent        35           246\n",
       "11           12        Trial 10  red_blue_Incongruent        71           288\n",
       "12           13        Trial 11  blue_red_Incongruent        60           294\n",
       "13           14        Trial 12     red_red_Congruent        49           341\n",
       "14           15        Trial 13   blue_blue_Congruent        20           360\n",
       "15           16        Trial 14   blue_blue_Congruent         8           365\n",
       "16           17        Trial 15  red_blue_Incongruent        72           381\n",
       "17           18        Trial 16   blue_blue_Congruent        35           387\n",
       "18           19        Trial 17   blue_blue_Congruent        56           436\n",
       "19           20        Trial 18   blue_blue_Congruent        53           449\n",
       "20           21        Trial 19     red_red_Congruent        58           465\n",
       "21           22        Trial 20  blue_red_Incongruent        35           514\n",
       "22           23         Debrief                   NaN  Complete           549\n",
       "23           24        Complete                   NaN      Done           562"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_trials = pd.DataFrame()\n",
    "\n",
    "for file in files_raw_trials[0:1]:\n",
    "    tmp_df = pd.read_csv(dir_raw + file)\n",
    "\n",
    "tmp_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "319f9e17-746e-4050-9c2a-e5e15e0fda70",
   "metadata": {},
   "source": [
    "'Participant ID', 'Consent', 'Debrief', and 'Complete' should be in their own columns\n",
    "\n",
    "'stimulus' should be broken into its component factors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "id": "4aa99378-3f86-42ab-98a5-2e1aa7427c45",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Word</th>\n",
       "      <th>Colour</th>\n",
       "      <th>Condition</th>\n",
       "      <th>Participant ID</th>\n",
       "      <th>Consent</th>\n",
       "      <th>Debrief</th>\n",
       "      <th>Complete</th>\n",
       "      <th>Reaction Time</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>43</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>85</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>49</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>55</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>34</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>blue</td>\n",
       "      <td>red</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>35</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>71</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>blue</td>\n",
       "      <td>red</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>60</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>49</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>35</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>53</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>blue</td>\n",
       "      <td>red</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>35</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Word Colour    Condition Participant ID   Consent   Debrief  Complete  \\\n",
       "2    red    red    Congruent            007  Complete  Complete       562   \n",
       "3    red   blue  Incongruent            007  Complete  Complete       562   \n",
       "4    red    red    Congruent            007  Complete  Complete       562   \n",
       "5    red   blue  Incongruent            007  Complete  Complete       562   \n",
       "6    red   blue  Incongruent            007  Complete  Complete       562   \n",
       "7   blue   blue    Congruent            007  Complete  Complete       562   \n",
       "8   blue   blue    Congruent            007  Complete  Complete       562   \n",
       "9   blue   blue    Congruent            007  Complete  Complete       562   \n",
       "10  blue    red  Incongruent            007  Complete  Complete       562   \n",
       "11   red   blue  Incongruent            007  Complete  Complete       562   \n",
       "12  blue    red  Incongruent            007  Complete  Complete       562   \n",
       "13   red    red    Congruent            007  Complete  Complete       562   \n",
       "14  blue   blue    Congruent            007  Complete  Complete       562   \n",
       "15  blue   blue    Congruent            007  Complete  Complete       562   \n",
       "16   red   blue  Incongruent            007  Complete  Complete       562   \n",
       "17  blue   blue    Congruent            007  Complete  Complete       562   \n",
       "18  blue   blue    Congruent            007  Complete  Complete       562   \n",
       "19  blue   blue    Congruent            007  Complete  Complete       562   \n",
       "20   red    red    Congruent            007  Complete  Complete       562   \n",
       "21  blue    red  Incongruent            007  Complete  Complete       562   \n",
       "\n",
       "   Reaction Time  \n",
       "2             32  \n",
       "3             76  \n",
       "4             43  \n",
       "5             85  \n",
       "6             49  \n",
       "7              2  \n",
       "8             55  \n",
       "9             34  \n",
       "10            35  \n",
       "11            71  \n",
       "12            60  \n",
       "13            49  \n",
       "14            20  \n",
       "15             8  \n",
       "16            72  \n",
       "17            35  \n",
       "18            56  \n",
       "19            53  \n",
       "20            58  \n",
       "21            35  "
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_trials = pd.DataFrame()\n",
    "\n",
    "for file in files_raw_trials[0:1]:\n",
    "    tmp_df = pd.read_csv(dir_raw + file)\n",
    "    tmp_trials = tmp_df.loc[tmp_df[\"task\"].str.contains(\n",
    "        \"Trial\"), \"stimulus\"].str.split(\"_\", expand=True).rename(\n",
    "        columns={0: \"Word\", 1: \"Colour\", 2: \"Condition\"})\n",
    "    tmp_trials[\"Participant ID\"] = tmp_df[tmp_df[\"task\"] == \"Participant ID\"][\"value\"].values[0]\n",
    "    tmp_trials[\"Consent\"] = tmp_df[tmp_df[\"task\"] == \"Consent\"][\"value\"].values[0]\n",
    "    tmp_trials[\"Debrief\"] = tmp_df[tmp_df[\"task\"] == \"Debrief\"][\"value\"].values[0]\n",
    "    tmp_trials[\"Complete\"] = tmp_df[tmp_df[\"task\"] == \"Complete\"][\"time_elapsed\"].values[0]\n",
    "    tmp_trials[\"Reaction Time\"] = tmp_df.loc[tmp_df[\"task\"].str.contains(\"Trial\"), \"value\"]\n",
    "\n",
    "tmp_trials"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "id": "956d4db8-191c-40b3-b49f-0692f22ede9a",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Word</th>\n",
       "      <th>Colour</th>\n",
       "      <th>Condition</th>\n",
       "      <th>Participant ID</th>\n",
       "      <th>Consent</th>\n",
       "      <th>Debrief</th>\n",
       "      <th>Complete</th>\n",
       "      <th>Reaction Time</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>76</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>43</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>85</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>007</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>562</td>\n",
       "      <td>49</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>002</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>662</td>\n",
       "      <td>38</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>002</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>662</td>\n",
       "      <td>15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>blue</td>\n",
       "      <td>red</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>002</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>662</td>\n",
       "      <td>38</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>blue</td>\n",
       "      <td>red</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>002</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>662</td>\n",
       "      <td>47</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>002</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>662</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>200 rows × 8 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    Word Colour    Condition Participant ID   Consent   Debrief  Complete  \\\n",
       "2    red    red    Congruent            007  Complete  Complete       562   \n",
       "3    red   blue  Incongruent            007  Complete  Complete       562   \n",
       "4    red    red    Congruent            007  Complete  Complete       562   \n",
       "5    red   blue  Incongruent            007  Complete  Complete       562   \n",
       "6    red   blue  Incongruent            007  Complete  Complete       562   \n",
       "..   ...    ...          ...            ...       ...       ...       ...   \n",
       "17   red    red    Congruent            002  Complete  Complete       662   \n",
       "18  blue   blue    Congruent            002  Complete  Complete       662   \n",
       "19  blue    red  Incongruent            002  Complete  Complete       662   \n",
       "20  blue    red  Incongruent            002  Complete  Complete       662   \n",
       "21   red    red    Congruent            002  Complete  Complete       662   \n",
       "\n",
       "   Reaction Time  \n",
       "2             32  \n",
       "3             76  \n",
       "4             43  \n",
       "5             85  \n",
       "6             49  \n",
       "..           ...  \n",
       "17            38  \n",
       "18            15  \n",
       "19            38  \n",
       "20            47  \n",
       "21            13  \n",
       "\n",
       "[200 rows x 8 columns]"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_trials = pd.DataFrame()\n",
    "\n",
    "for file in files_raw_trials:  # Loop through all trials now\n",
    "    tmp_df = pd.read_csv(dir_raw + file)\n",
    "    tmp_trials = tmp_df.loc[tmp_df[\"task\"].str.contains(\n",
    "        \"Trial\"), \"stimulus\"].str.split(\"_\", expand=True).rename(\n",
    "        columns={0: \"Word\", 1: \"Colour\", 2: \"Condition\"})\n",
    "    tmp_trials[\"Participant ID\"] = tmp_df[tmp_df[\"task\"] == \"Participant ID\"][\"value\"].values[0]\n",
    "    tmp_trials[\"Consent\"] = tmp_df[tmp_df[\"task\"] == \"Consent\"][\"value\"].values[0]\n",
    "    tmp_trials[\"Debrief\"] = tmp_df[tmp_df[\"task\"] == \"Debrief\"][\"value\"].values[0]\n",
    "    tmp_trials[\"Complete\"] = tmp_df[tmp_df[\"task\"] == \"Complete\"][\"time_elapsed\"].values[0]\n",
    "    tmp_trials[\"Reaction Time\"] = tmp_df.loc[tmp_df[\"task\"].str.contains(\"Trial\"), \"value\"]\n",
    "    \n",
    "    data_trials = pd.concat([data_trials, tmp_trials])\n",
    "\n",
    "data_trials"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "61b874e0-e372-476b-93ea-d4023f721f2d",
   "metadata": {},
   "source": [
    "### Merge Survey and Trials Data\n",
    "\n",
    "We want to merge on 'Participant ID', notice that in `survey_data` 'Participant ID' is an integer:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "id": "06420905-2262-43de-89ff-d4c902f464b4",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "Int64Index: 10 entries, 0 to 0\n",
      "Data columns (total 5 columns):\n",
      " #   Column          Non-Null Count  Dtype \n",
      "---  ------          --------------  ----- \n",
      " 0   Participant ID  10 non-null     int64 \n",
      " 1   Age             10 non-null     int64 \n",
      " 2   Gender          10 non-null     object\n",
      " 3   Income          10 non-null     object\n",
      " 4   Education       10 non-null     object\n",
      "dtypes: int64(2), object(3)\n",
      "memory usage: 480.0+ bytes\n"
     ]
    }
   ],
   "source": [
    "data_survey.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a84277ab-e90e-405b-85de-36af07ef296b",
   "metadata": {},
   "source": [
    "Merge will give an error since it doesn't match the data type of 'Participant ID' in `data_trials`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "id": "10a7acd5-fc54-48ff-966e-75d253530ab9",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Participant ID</th>\n",
       "      <th>Age</th>\n",
       "      <th>Gender</th>\n",
       "      <th>Income</th>\n",
       "      <th>Education</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>005</td>\n",
       "      <td>38</td>\n",
       "      <td>Male</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>008</td>\n",
       "      <td>23</td>\n",
       "      <td>Female</td>\n",
       "      <td>High</td>\n",
       "      <td>High school</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>007</td>\n",
       "      <td>25</td>\n",
       "      <td>Male</td>\n",
       "      <td>High</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>002</td>\n",
       "      <td>33</td>\n",
       "      <td>Male</td>\n",
       "      <td>High</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>010</td>\n",
       "      <td>23</td>\n",
       "      <td>Female</td>\n",
       "      <td>Medium</td>\n",
       "      <td>High school</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>001</td>\n",
       "      <td>20</td>\n",
       "      <td>Female</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>004</td>\n",
       "      <td>21</td>\n",
       "      <td>Male</td>\n",
       "      <td>Low</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>009</td>\n",
       "      <td>26</td>\n",
       "      <td>Male</td>\n",
       "      <td>Low</td>\n",
       "      <td>College</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>003</td>\n",
       "      <td>31</td>\n",
       "      <td>Female</td>\n",
       "      <td>Low</td>\n",
       "      <td>College</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>006</td>\n",
       "      <td>24</td>\n",
       "      <td>Female</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Participant ID  Age  Gender  Income    Education\n",
       "0            005   38    Male  Medium   University\n",
       "0            008   23  Female    High  High school\n",
       "0            007   25    Male    High   University\n",
       "0            002   33    Male    High   University\n",
       "0            010   23  Female  Medium  High school\n",
       "0            001   20  Female  Medium   University\n",
       "0            004   21    Male     Low   University\n",
       "0            009   26    Male     Low      College\n",
       "0            003   31  Female     Low      College\n",
       "0            006   24  Female  Medium   University"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_survey[\"Participant ID\"] = data_survey[\"Participant ID\"].astype(str).str.zfill(3)\n",
    "data_survey"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "24d77ccd-99e3-4a75-8594-1b80d9a725c1",
   "metadata": {},
   "source": [
    "Now we can merge!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "id": "12b5943f-20b2-4181-b248-907b672db867",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Participant ID</th>\n",
       "      <th>Age</th>\n",
       "      <th>Gender</th>\n",
       "      <th>Income</th>\n",
       "      <th>Education</th>\n",
       "      <th>Word</th>\n",
       "      <th>Colour</th>\n",
       "      <th>Condition</th>\n",
       "      <th>Consent</th>\n",
       "      <th>Debrief</th>\n",
       "      <th>Complete</th>\n",
       "      <th>Reaction Time</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>005</td>\n",
       "      <td>38</td>\n",
       "      <td>Male</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>614</td>\n",
       "      <td>41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>005</td>\n",
       "      <td>38</td>\n",
       "      <td>Male</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>614</td>\n",
       "      <td>19</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>005</td>\n",
       "      <td>38</td>\n",
       "      <td>Male</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>614</td>\n",
       "      <td>42</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>005</td>\n",
       "      <td>38</td>\n",
       "      <td>Male</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "      <td>red</td>\n",
       "      <td>red</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>614</td>\n",
       "      <td>54</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>005</td>\n",
       "      <td>38</td>\n",
       "      <td>Male</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>614</td>\n",
       "      <td>31</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>195</th>\n",
       "      <td>006</td>\n",
       "      <td>24</td>\n",
       "      <td>Female</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>558</td>\n",
       "      <td>40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>196</th>\n",
       "      <td>006</td>\n",
       "      <td>24</td>\n",
       "      <td>Female</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>558</td>\n",
       "      <td>83</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>197</th>\n",
       "      <td>006</td>\n",
       "      <td>24</td>\n",
       "      <td>Female</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "      <td>blue</td>\n",
       "      <td>red</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>558</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>198</th>\n",
       "      <td>006</td>\n",
       "      <td>24</td>\n",
       "      <td>Female</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "      <td>blue</td>\n",
       "      <td>blue</td>\n",
       "      <td>Congruent</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>558</td>\n",
       "      <td>26</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>199</th>\n",
       "      <td>006</td>\n",
       "      <td>24</td>\n",
       "      <td>Female</td>\n",
       "      <td>Medium</td>\n",
       "      <td>University</td>\n",
       "      <td>red</td>\n",
       "      <td>blue</td>\n",
       "      <td>Incongruent</td>\n",
       "      <td>Complete</td>\n",
       "      <td>Complete</td>\n",
       "      <td>558</td>\n",
       "      <td>83</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>200 rows × 12 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    Participant ID  Age  Gender  Income   Education  Word Colour    Condition  \\\n",
       "0              005   38    Male  Medium  University   red    red    Congruent   \n",
       "1              005   38    Male  Medium  University  blue   blue    Congruent   \n",
       "2              005   38    Male  Medium  University   red    red    Congruent   \n",
       "3              005   38    Male  Medium  University   red    red    Congruent   \n",
       "4              005   38    Male  Medium  University   red   blue  Incongruent   \n",
       "..             ...  ...     ...     ...         ...   ...    ...          ...   \n",
       "195            006   24  Female  Medium  University  blue   blue    Congruent   \n",
       "196            006   24  Female  Medium  University   red   blue  Incongruent   \n",
       "197            006   24  Female  Medium  University  blue    red  Incongruent   \n",
       "198            006   24  Female  Medium  University  blue   blue    Congruent   \n",
       "199            006   24  Female  Medium  University   red   blue  Incongruent   \n",
       "\n",
       "      Consent   Debrief  Complete Reaction Time  \n",
       "0    Complete  Complete       614            41  \n",
       "1    Complete  Complete       614            19  \n",
       "2    Complete  Complete       614            42  \n",
       "3    Complete  Complete       614            54  \n",
       "4    Complete  Complete       614            31  \n",
       "..        ...       ...       ...           ...  \n",
       "195  Complete  Complete       558            40  \n",
       "196  Complete  Complete       558            83  \n",
       "197  Complete  Complete       558            70  \n",
       "198  Complete  Complete       558            26  \n",
       "199  Complete  Complete       558            83  \n",
       "\n",
       "[200 rows x 12 columns]"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_clean = pd.merge(data_survey, data_trials, on=\"Participant ID\")\n",
    "data_clean"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "05544e21-ee1f-4e0b-b05b-d4567ea30b96",
   "metadata": {},
   "source": [
    "## Congratulations!!!\n",
    "\n",
    "We have now combined each of the raw data files into one clean DataFrame for analysis!\n",
    "\n",
    "We can either continue our anlaysis in Python, or write this DataFrame to a file and continue an analysis in another language or software."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "fa4ac710-ef5f-415c-803c-71548c774c41",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "data_clean.to_csv(\"clean_data.csv\", index=None)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "81c418f3-a59d-4f65-97fc-466e3a90c9f9",
   "metadata": {},
   "source": [
    "## Thank you for attending this workshop!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}