{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "# An investigation of Social Class Inequalities in General Cognitive Ability in Two British Birth Cohorts"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Roxanne Connelly (R.Connelly@warwick.ac.uk)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Vernon Gayle (vernon.gayle@ed.ac.uk)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Abstract\n",
    "\n",
    "The ‘Flynn effect’ describes the substantial and long-standing increase in average cognitive ability test scores, which has been observed in numerous psychological studies. Flynn makes an appeal for researchers to move beyond psychology’s standard disciplinary boundaries and to consider sociological contexts, in order to develop a more comprehensive understanding of cognitive inequalities. In this article we respond to this appeal and investigate social class inequalities in general cognitive ability test scores over time. We analyse data from the National Child Development Study (1958) and the British Cohort Study (1970). These two British birth cohorts are suitable nationally representative large-scale data resources for studying inequalities in general cognitive ability.\n",
    "\n",
    "We observe a large parental social class effect, net of parental education and gender in both cohorts. The overall finding is that large social class divisions in cognitive ability can be observed when children are still at primary school, and similar patterns are observed in each cohort. Notably, pupils with fathers at the lower end of the class structure are at a distinct disadvantage. This is a disturbing finding and it is especially important because cognitive ability is known to influence individuals later in the lifecourse.\n",
    "\n",
    "\n",
    "\n",
    "### Keywords\n",
    "\n",
    "Social Class, Cognitive Ability, Longitudinal, Cohort Studies, Social Stratification, Inequality.\n",
    "\n",
    "### Acknowledgements\n",
    "\n",
    "We are indebted to the National Child Development Study and 1970 British Cohort Study participants. We are grateful to The Centre for Longitudinal Studies, UCL Institute of Education for the use of these data and to the UK Data Archive and Economic and Social Data Service for making them available. These organizations bear no responsibility for the analysis or interpretation of these data.\n",
    "\n",
    "### Funding\n",
    "\n",
    "This work was funded by the Economic and Social Research Council [Grant Number: ES/N011783/1].\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Introduction to the Notebook\n",
    "\n",
    "There is an increasing desire and requirement to make sociological research more transparent, and to actively render it reproducible. <br>\n",
    "\n",
    "\n",
    "Jupyter notebooks are increasingly used in high-profile big science applications (e.g. see [here](https://losc.ligo.org/s/events/GW150914/GW150914_tutorial.html)). Using Jupyter notebooks for large-scale social science data analysis in sociology is zygotic. <br>\n",
    "\n",
    "Publishing a Jupyter notebook allows third parties to fully reproduce the complete workflow for the article and to duplicate the empirical results. In addition to increasing transparency, this approach greatly extends the possibility for other researchers to build on the work, for example with alternative measures or additional data. This is an attractive feature and likely to make a major contribution to quantitative sociology. <br>\n",
    "\n",
    "This is a very early example of undertaking a complete analytical workflow within a Jupyter notebook. <br>\n",
    "\n",
    "As the practice of using Jupyter notebooks becomes more ubiquitos it is likely that there will be improvements to how the notebooks are used and best practices will become much more evident.<br>\n",
    "\n",
    "Rendering a complete workflow open and accessible is a new departure. Therefore we would ask that you consider the amount of extra work that has gone into rendering our workflow open and accessible and developing this notebook. As a safeguard against being overcritical, we also invite you to reflect on how much of your own work is transparent. <br>\n",
    "\n",
    "There are hundreds of analytical decisions that are made in the process of data enabling (e.g. which measure of social class to use, how to code education). When dealing with complex, messy real world data there is often no single 'correct' way to organise the data. Researchers are unable to describe these analytical decisions within the confines of a standard journal article. This is one of the reasons why improved transparency is required in quantitative sociology.\n",
    "\n",
    "An overarching goal when producing this notebook was to ensure that a third party could follow the workflow. In developing an open and published workflow we have drawn upon ideas advanced in computer science especially the concept ‘literate computing’, which is the weaving of a narrative directly into live computation, interleaving text with code and results in order to construct a complete piece that achieves the goals of communicating results (see [here](http://blog.fperez.org/)). <br>\n",
    "\n",
    "Data analysis software can be operated in various ways. In some parts of this notebook we have deliberately chosen simpler forms of code rather than more complex programming in order to assist the reader. But as far as is practicable we have tried to annotate the work in a fashion that is least likely to obstruct the reader. <br>\n",
    "\n",
    "A further innovation within this work has been the adoption of ‘pair programming’ which is a technique from software development in which two programmers work together in the development of code. In addition we have also used ‘code peer review’ and each author has run the complete workflow independently using different computers and different software set-ups. This has enabled us to undertake an in-depth test of the reproducibility of the work. These practices are currently unknown in sociological research. <br>\n",
    "\n",
    "__Please remember that this approach to transparent research is very exploratory.__ <br>\n",
    "\n",
    "Positive comments are always appreciated, but brickbats improve work.<br>\n",
    "\n",
    "Here is how to contact us: <br>\n",
    "Roxanne Connelly (R.Connelly@warwick.ac.uk)<br>\n",
    "Vernon Gayle (vernon.gayle@ed.ac.uk)<br>\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__Using Stata__\n",
    "\n",
    "The Jupyter notebook is an open-source web application that allows researchers to create and share documents that contain live code, equations, visualizations and explanatory text. \n",
    "\n",
    "An introduction to using Jupyter notebooks in social science research is available [here](https://youtu.be/Os3s1jwLAEI).\n",
    "\n",
    "Jupyter is 'language agnostic' and at the current time over forty languages are supported including those popular in data science such as Python, Stata, _R_ and Julia.\n",
    "\n",
    "In this notebook we use Stata. Stata is a proprietary software and researchers MUST have access to Stata in order to undertake data analyses within the Jupyter notebook.\n",
    "\n",
    "There are currently two approaches to undertaking analyses using Stata within a Jupyter notebook.\n",
    "\n",
    "__1. The Stata Kernel__\n",
    "\n",
    "The first approach is using a Stata kernel. The Stata kernel can be downloaded and installed from [this github repository](https://github.com/jrfiedler/stata-kernel).\n",
    "\n",
    "This kernel currently only works in Windows.\n",
    "\n",
    "You need a recent version of Stata, and if you have not already used Stata automation, register its type library by following [these instructions](http://www.stata.com/automation/#createmsapp).\n",
    "\n",
    "Once your have registered Stata you can install the kernel.\n",
    "\n",
    "At the command prompt you need to type:\n",
    "\n",
    "pip install git+https://github.com/jrfiedler/stata-kernel <br>\n",
    "python -m stata_kernel.install <br>\n",
    "\n",
    "Now when you open a new Jupyter notebook you should be able to switch to the Stata kernel from the *kernel* menu option at the top of the notebook.\n",
    "\n",
    "\n",
    "__2. Using Stata via Magic Cells__\n",
    "\n",
    "The second approach is using a Stata via magic cells. This facility can be downloaded and installed from [this github repository](https://github.com/TiesdeKok/ipystata).\n",
    "\n",
    "At the command prompt you need to type:\n",
    "\n",
    "_pip install ipystata_\n",
    "\n",
    "In a _code_ cell before using Stata you must type:\n",
    "\n",
    "_import ipystata_\n",
    "\n",
    "and then run the cell.\n",
    "\n",
    "Each cell will now be a Stata code cell as long as you start your syntax with:\n",
    "\n",
    "_%%stata_\n",
    "\n",
    "For example to get a summary of the variables in Stata the cell should include the following code:\n",
    "\n",
    "_%%stata_ <br>\n",
    "_summarize_\n",
    "\n",
    "\n",
    "further information on using Stata via magic is available [here](http://dev-ii-seminar.readthedocs.io/en/latest/notebooks/Stata_in_jupyter.html).\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Table of Contents:\n",
    "\n",
    "* [Introduction](#introduction)\n",
    "* [Background](#background)\n",
    "* [Data](#Data)\n",
    "* [Preparation of Stata](#stataprep)\n",
    "* [Preparation of NCDS Datasets](#ncdsprep)\n",
    "* [Preparation of BCS Datasets](#bcssprep)\n",
    "* [General Ability Test Scores](#generalabilitytestscores)\n",
    "* [Parental Social Class](#parentalsocialclass)\n",
    "* [Further Explanatory Variables](#explanatoryvariables)\n",
    "* [Missing Data](#missingdata)\n",
    "* [Reproducibility](#Repo)\n",
    "* [Descriptive Results](#descriptiveresults)\n",
    "* [Modelling Results](#modellingresults)\n",
    "* [Discussion of Social Class Effect](#socialclasseffect)\n",
    "* [Conclusions](#conclusions)\n",
    "* [Notes](#notes)\n",
    "* [Supplementary Materials](#supplement)\n",
    "* [References](#references)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Introduction <a class=\"anchor\" id=\"introduction\"></a>\n",
    "\n",
    "The ‘Flynn effect’ describes the substantial and long-standing increase in average cognitive test scores, which has been observed in numerous psychological studies [1](#note1) (Flynn, 2012). Flynn makes an appeal for researchers to move beyond psychology’s standard disciplinary boundaries and to consider sociological contexts, in order to develop a more comprehensive understanding of the influence of the social on cognitive inequalities. In this article we investigate social class inequalities in general cognitive ability through the examination of data from two British birth cohort studies.\n",
    "\n",
    "The focus of this article is general cognitive ability in childhood, which is understood to be socially stratified from a very young age (Feinstein, 2003; Sullivan et al., 2013; Cunha and Heckman, 2009; Duncan et al., 1998; Gottfried et al., 2003). Childhood general cognitive ability is important because it is associated with later educational attainment, occupational attainment, and health and wellbeing across the lifecourse (Deary et al., 2007; Nettle, 2003; Vanhanen, 2011). Understanding social class inequalities in childhood cognitive test scores can therefore contribute to the wider sociological understanding of the reproduction of social inequalities.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Background <a class=\"anchor\" id=\"background\"></a>\n",
    "\n",
    "Neisser et al. (1995: 77) describes cognitive ability as the ‘ability to understand complex ideas, to adapt effectively to the environment, to learn from experience, to engage in various forms of reasoning, to overcome obstacles by taking thought.’ Cognitive ability tests are well validated measures of individual differences of cognitive capability (Deary et al., 2007; Sternberg et al., 2001). The association between parental social class and children’s cognitive test performance has been consistently documented, and a wealth of empirical evidence demonstrates that children from more advantaged families generally have better cognitive test scores (McCulloch and Joshi, 2001; Feinstein, 2003; Goodman and Gregg, 2010; Blanden et al., 2007; Schoon et al., 2011; Schoon et al., 2010; Dickerson and Popli, 2016; Sullivan et al., 2013). Shenkin et al. (2001) describe social class inequalities in the cognitive ability test performance of 11 year olds born in 1921. Lawlor et al. (2005) found that father’s social class was an important predictor of cognitive ability test scores at ages 7, 9 and 11 for a cohort of children born between 1950 and 1956. Feinstein (2003) demonstrated socio-economic inequalities in cognitive skills at as young as 22 months for a cohort of children born in 1970. Similar inequalities were also found at ages 42 months, and at 5 and 10 years (Feinstein, 2003). Using data from the UK Millennium Cohort Study (MCS) a series of more recent investigations have shown that children from less advantaged social backgrounds perform worse on cognitive ability tests than their more advantaged peers throughout childhood (see Blanden and Machin, 2010; Blanden et al., 2007; Schoon et al., 2011; Schoon et al., 2010; Dickerson and Popli, 2012; Sullivan et al., 2013).\n",
    "\n",
    "The overall motivation for this article is to directly respond to Flynn’s appeal for researchers to move beyond psychology’s standard disciplinary boundaries, and to consider sociological contexts with the aim of developing a more comprehensive understanding of cognitive inequalities. There has been a dearth of research investigating the extent to which social class inequalities in childhood cognitive test scores have changed between birth cohorts. This stands in stark contrast to the vast quantity of research that has investigated trends in educational test scores, and the formal educational outcomes of children and young people (see for example Bradbury et al., 2015; Blanden and Gregg, 2004; Erikson et al., 2005).\n",
    "\n",
    "The analyses within this article use data from two long running British birth cohort studies, the National Child Development Study (NCDS) and the 1970 British Cohort Study (BCS). These large-scale longitudinal surveys are ongoing and follow infants born in 1958 and 1970 respectively (Power and Elliott, 2006; Elliott and Shepherd, 2006). These two studies have proven to be invaluable sociological data resources. A sizable cannon of research regarding social mobility trends in the UK is based on comparisons between these two birth cohorts (e.g. Blanden and Machin, 2004; Blanden et al., 2005; Blanden et al., 2004; Machin and Vignoles, 2004; Goldthorpe and Jackson, 2007; Tampubolon and Savage, 2012; Blanden et al., 2013). A key concern in these projects is measuring changes between birth cohorts. For example studies have investigated changes in educational inequalities (Breen et al., 2010; Shavit and Blossfeld, 1991; Shavit et al., 2007), and changes in inequalities in access to advantaged occupational positions (Erikson and Goldthorpe, 1992; Breen, 2004). Building on the tradition of cross-cohort comparisons, this work compares social class inequalities in childhood cognitive ability test scores in these two cohorts.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data <a class=\"anchor\" id=\"Data\"></a>\n",
    "\n",
    "The UK data portfolio is well endowed with large-scale nationally representative birth cohort datasets. The [National Child Development Study](http://www.cls.ioe.ac.uk/page.aspx?&sitesectionid=724&sitesectiontitle=National+Child+Development+Study) (NCDS) follows the lives of babies born in England, Scotland and Wales from the 3rd to the 9th of March 1958 (see Power and Elliott, 2006). The [British Cohort Study](http://www.cls.ioe.ac.uk/page.aspx?&sitesectionid=795&sitesectiontitle=Welcome+to+the+1970+British+Cohort+Study) (BCS) follows babies born in England, Scotland and Wales from the 5th to the 11th of April 1970  (see Elliott and Shepherd, 2006) [2](#note2). Childhood data were collected at birth, age 7 and age 11 in the NCDS (SN5565, University of London, 2014), and at birth, age 5 and age 10 in the BCS (SN2666, SN2699, SN3723, University of London, 2013; University of London, 2016a; University of London, 2016b).\n",
    "\n",
    "The UK also has a more recent nationally representative birth cohort, the [Millennium Cohort Study](http://www.cls.ioe.ac.uk/page.aspx?sitesectionid=851)  (MCS) (see Connelly and Platt, 2014). The overall design, the selection strategy, and the content of the MCS differs substantially from the previous British birth cohorts. The MCS 5th sweep (age 11) only contains one subtest of the British Ability Scales, the Verbal Similarities Test. This single test would not be sufficient to compute an overall general ability test score that is suitably comparable with the tests included in the NCDS and BCS. The MCS 5th sweep (age 11) also contains two cognitive tests drawn from the Cambridge Neuropsychological Test Automated Battery, however these tests are very different in nature to the tests completed in the NCDS and BCS (see Atkinson, 2015).\n",
    "\n",
    "Goisis et al. (2017) undertook a comparative analysis of the effects of low birth weight in the NCDS, BCS and MCS. They operationalised a measure by using only the verbal test scores within the NCDS and the BCS, and then compared them with the single Verbal Similarities Test in the MCS. We do not adopt this strategy because psychometricians have warned against the use of isolated subtests for the measurement of general cognitive ability (McDermott et al., 1990). Ensuring the comparability of cognitive tests is challenging, especially when studying test scores over time (see Must et al., 2009). Flynn (2012) highlights that performances on different cognitive ability subtests have improved at different rates. In particular, the similarities subtest has shown some of the largest increases. Therefore, the use of the similarities subtest from the MCS cohort in isolation is likely to result in misleading comparisons.\n",
    "\n",
    "The NCDS and BCS data were downloaded from the [UK Data Archive](http://www.data-archive.ac.uk/). The dates and times of download are provided below. New versions of the data are uploaded periodically. If you are using a different version of the data, it is possible that slight variations in the results will occur. To identify changes between data versions you can consult the documentation provided with the datasets.\n",
    "\n",
    "\n",
    "##### National Child Development Study (NCDS)\n",
    "\n",
    "* [National Child Development Study: Childhood Data, Sweeps 0-3, 1958-1974 SN5565](https://discover.ukdataservice.ac.uk/catalogue/?sn=5565&type=Data%20catalogue) (Downloaded Date: 27/07/2017 Time: 1706)\n",
    "\n",
    "\n",
    "* [National Child Development Study Response and Outcomes Dataset, 1958-2013 SN5560](https://discover.ukdataservice.ac.uk/catalogue/?sn=5560&type=Data%20catalogue) (Downloaded Date: 27/07/2017 Time: 1706)\n",
    "\n",
    "##### British Cohort Study (BCS)\t\n",
    "\t\n",
    "* [British Cohort Study: Birth and 22-Month Subsample, 1970-1972 SN2666](https://discover.ukdataservice.ac.uk/catalogue/?sn=2666&type=Data%20catalogue) (Downloaded Date: 27/07/2017 Time: 1706) \n",
    "\n",
    "\n",
    "* [British Cohort Study: Five-Year Follow-Up, 1975 SN2699](https://discover.ukdataservice.ac.uk/catalogue/?sn=2699&type=Data%20catalogue) (Downloaded Date: 27/07/2017 Time: 1706)\n",
    "\n",
    "\n",
    "* [British Cohort Study: Ten-Year Follow-Up, 1980 SN3723](https://discover.ukdataservice.ac.uk/catalogue/?sn=3723&type=Data%20catalogue) (Downloaded Date: 27/07/2017 Time: 1706)\n",
    "\n",
    "\n",
    "* [British Cohort Study Response Dataset, 1970-2005 SN5641](https://discover.ukdataservice.ac.uk/catalogue/?sn=5641&type=Data%20catalogue) (Downloaded Date: 27/07/2017 Time: 1706)\n",
    "\n",
    "\n",
    "##### Parental Occupational Information\n",
    "The detailed parental occupational information is provided in a joint file for the NCDS and BCS:\n",
    "\n",
    "* [Occupational Coding for the National Child Development Study (1969, 1991-2008) and the 1970 British Cohort Study (1980, 2000-2008) SN7023](https://discover.ukdataservice.ac.uk/catalogue/?sn=7023&type=Data%20catalogue) (Downloaded Date: 27/07/2017 Time: 1706)\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Preparation of Stata <a class=\"anchor\" id=\"stataprep\"></a>\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". global path1 \"F:\\Data\\RAWDATA\"\n",
      "\n",
      ". global path2 \"F:\\Data\\MYDATA\\WORK\"\n",
      "\n",
      ". global path3 \"F:\\Data\\MYDATA\\TEMP\"\n",
      "\n",
      ". global path4 \"F:\\Data\\MYDATA\\FINAL\"\n",
      "\n",
      ". \n",
      ". clear\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "global path1 \"F:\\Data\\RAWDATA\"\n",
    "global path2 \"F:\\Data\\MYDATA\\WORK\"\n",
    "global path3 \"F:\\Data\\MYDATA\\TEMP\"\n",
    "global path4 \"F:\\Data\\MYDATA\\FINAL\"\n",
    "\n",
    "clear\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You will need the following user written comments installed in Stata:\n",
    "* fitstat\n",
    "* estpost\n",
    "* mibeta\n",
    "\n",
    "You can check if these are already installed on your machine using the 'which' command (i.e. which fitstat)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". which fitstat\n",
      "c:\\ado\\plus\\f\\fitstat.ado\n",
      "*! version 1.6.4 2/22/01 add warning messages\n",
      "\n",
      ". which estout\n",
      "c:\\ado\\plus\\e\\estout.ado\n",
      "*! version 3.17  02jun2014  Ben Jann\n",
      "\n",
      ". which mibeta\n",
      "c:\\ado\\plus\\m\\mibeta.ado\n",
      "*! version 1.0.2  19jun2014\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "which fitstat\n",
    "which estout\n",
    "which mibeta\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If these programs are not installed you will have to install them using 'ssc install' (e.g. ssc install fitstat). For more details on how to install programs from ssc see [here](http://www.stata.com/support/ssc-installation/).\n",
    "\n",
    "We extend our thanks for these programs to their authors:\n",
    "\n",
    "Long, J. S., & Freese, J. (2001). FITSTAT: Stata module to compute fit statistics for single equation regression models. Statistical Software Components.\n",
    "\n",
    "Jann, B. (2017). ESTOUT: Stata module to make regression tables. Statistical Software Components.\n",
    "\n",
    "mibeta - Yulia Marchenko, StataCorp.\n"
   ]
  },
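  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The check can also be run in one pass. The loop below is a minimal sketch rather than part of the original workflow: it asks Stata to locate each command and reports the ones that are missing, without installing anything.\n",
    "\n",
    "```stata\n",
    "* Report, for each required command, whether Stata can find it\n",
    "foreach cmd in fitstat estout mibeta {\n",
    "    * capture suppresses the error that which raises for a missing command\n",
    "    capture which `cmd'\n",
    "    if _rc == 0 {\n",
    "        display as text \"`cmd' is installed\"\n",
    "    }\n",
    "    else {\n",
    "        display as error \"`cmd' was not found and needs to be installed\"\n",
    "    }\n",
    "}\n",
    "```"
   ]
  },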
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Preparation of NCDS Datasets  <a class=\"anchor\" id=\"ncdsprep\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Open raw NCDS data file. This file contains information from the first four sweeps (age 0, age 7, age 11 and age 16) of the NCDS."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path1\\ARCHIVE\\NCDS\\S1-3\\ncds0123.dta, clear\n",
      "\n",
      ". count\n",
      "  18,558\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path1\\ARCHIVE\\NCDS\\S1-3\\ncds0123.dta, clear\n",
    "count\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Identify the missing values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". quietly mvdecode _all, mv(-9=. \\-8=. \\-2=. \\-1=. \\-7=. \\-3=.)\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "quietly mvdecode _all, mv(-9=. \\-8=. \\-2=. \\-1=. \\-7=. \\-3=.)\n",
    "\n",
    "*return to jupyter"
   ]
  },
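  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To illustrate what `mvdecode` does, here is a small sketch on made-up data. The variable names `id` and `score` and the values are hypothetical; only the negative codes mirror those used above.\n",
    "\n",
    "```stata\n",
    "* Toy data: -1 and -9 stand for reasons a value was not recorded\n",
    "clear\n",
    "input id score\n",
    "1 12\n",
    "2 -1\n",
    "3 -9\n",
    "4 30\n",
    "end\n",
    "\n",
    "* Turn the negative codes into Stata's system missing value (.)\n",
    "mvdecode score, mv(-9=. \\ -1=.)\n",
    "\n",
    "* The recoded values are now excluded from summaries\n",
    "count if missing(score)\n",
    "summarize score\n",
    "```\n",
    "\n",
    "Without this step the negative codes would be treated as genuine scores and would distort means and model estimates."
   ]
  },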
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "###### Cohort member's gender\n",
    "\n",
    "Gender is derived from variable n622.\n",
    "\n",
    "This variable comes from the age 0 (birth) survey (question 53). This question asks: Sex of infant - Male/Female.\n",
    "Variable n622 also appears in other sweeps of the survey so it is possible that this is variable includes information collected in multiple surveys.\n",
    "\n",
    "This variable is coded (1) Male (2) Female. We recode the variable into a 1/0 dummy variable for male."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". numlabel n622, add\n",
      "\n",
      ". tab n622, mi\n",
      "\n",
      "  0-3D Sex of |\n",
      "        child |      Freq.     Percent        Cum.\n",
      "--------------+-----------------------------------\n",
      "      1. Male |      9,595       51.70       51.70\n",
      "    2. Female |      8,959       48.28       99.98\n",
      "            . |          4        0.02      100.00\n",
      "--------------+-----------------------------------\n",
      "        Total |     18,558      100.00\n",
      "\n",
      ". codebook n622\n",
      "\n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "n622                                                                                                                       0-3D Sex of child\n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "\n",
      "                  type:  numeric (byte)\n",
      "                 label:  n622\n",
      "\n",
      "                 range:  [1,2]                        units:  1\n",
      "         unique values:  2                        missing .:  4/18,558\n",
      "\n",
      "            tabulation:  Freq.   Numeric  Label\n",
      "                         9,595         1  1. Male\n",
      "                         8,959         2  2. Female\n",
      "                             4         .  \n",
      "\n",
      ". capture drop ncds_male\n",
      "\n",
      ".     gen ncds_male = .\n",
      "(18,558 missing values generated)\n",
      "\n",
      ".     replace ncds_male = 1 if (n622==1)\n",
      "(9,595 real changes made)\n",
      "\n",
      ".     replace ncds_male = 0 if (n622==2)\n",
      "(8,959 real changes made)\n",
      "\n",
      ".     label variable ncds_male \"NCDS Cohort member Male\"\n",
      "\n",
      ".     label define yesno 1 \"Yes\" 0 \"No\", replace\n",
      "\n",
      ".     label values ncds_male yesno\n",
      "\n",
      ".     tab ncds_male, mi\n",
      "\n",
      "NCDS Cohort |\n",
      "member Male |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "         No |      8,959       48.28       48.28\n",
      "        Yes |      9,595       51.70       99.98\n",
      "          . |          4        0.02      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". tab n622 ncds_male\n",
      "\n",
      "              |  NCDS Cohort member\n",
      "  0-3D Sex of |         Male\n",
      "        child |        No        Yes |     Total\n",
      "--------------+----------------------+----------\n",
      "      1. Male |         0      9,595 |     9,595 \n",
      "    2. Female |     8,959          0 |     8,959 \n",
      "--------------+----------------------+----------\n",
      "        Total |     8,959      9,595 |    18,554 \n",
      "\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "numlabel n622, add\n",
    "tab n622, mi\n",
    "codebook n622\n",
    "capture drop ncds_male\n",
    "    gen ncds_male = .\n",
    "    replace ncds_male = 1 if (n622==1)\n",
    "    replace ncds_male = 0 if (n622==2)\n",
    "    label variable ncds_male \"NCDS Cohort member Male\"\n",
    "    label define yesno 1 \"Yes\" 0 \"No\", replace\n",
    "    label values ncds_male yesno\n",
    "    tab ncds_male, mi\n",
    "\n",
    "tab n622 ncds_male\n",
    "\n",
    "*return to jupyter"
   ]
  },
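  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same 1/0 dummy can be built in a single step and checked with `assert`. This is a sketch on made-up data rather than the code used for the analysis; `sex` and `male` are hypothetical stand-ins for n622 and ncds_male.\n",
    "\n",
    "```stata\n",
    "* Toy data coded like n622: 1 = male, 2 = female, . = not recorded\n",
    "clear\n",
    "input sex\n",
    "1\n",
    "2\n",
    ".\n",
    "1\n",
    "end\n",
    "\n",
    "* (sex == 1) evaluates to 1 or 0; the if qualifier keeps missing as missing\n",
    "generate byte male = (sex == 1) if !missing(sex)\n",
    "label define yesno 1 \"Yes\" 0 \"No\", replace\n",
    "label values male yesno\n",
    "\n",
    "* Each assert stops with an error if the recode went wrong\n",
    "assert male == 1 if sex == 1\n",
    "assert male == 0 if sex == 2\n",
    "assert missing(male) if missing(sex)\n",
    "\n",
    "tabulate sex male, missing\n",
    "```"
   ]
  },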
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "###### Parents Education\n",
    "\n",
    "Information on the parents educational qualifications was collected in the age 10 survey of the BCS, however this information is not available in the NCDS so we have chosen to use parents years of education in both cohorts to facilitate comparability. \n",
    "\n",
    "We code parental education using the method described in:\n",
    "\n",
    "Cheung, S.Y. & Egerton, M. (2007). Great Britain: Higher Education Expansion \n",
    "and Reform: Changing Educational Inequalities. Stratification in higher \n",
    "education: A comparative study, 195-219.\n",
    "\n",
    "Page 206 to 207\n",
    "\n",
    "'Information on parent's education in the NCDS was available only as the age at \n",
    "which the respondent's father and mother left full-time education. We have no\n",
    "informaiton on whether they left school with any qualifications. As most \n",
    "parents left school at the age of 14 to 15, we coded them as having completed \n",
    "intermediate secondary qualifications. Those who left school at age 13 or below \n",
    "would have only had school minimum, low-level school qualifications or basic \n",
    "vocational qualifications. However, the number in this category is very low\n",
    "and was therefore combined with those who left school at age 14 to 15. We took\n",
    "the highest of the parents' education and recoded it into four categories\n",
    "following the CASMIN schema'\n",
    "\n",
    "1. Left at age 15 or below\n",
    "2. Left at age 16 to 18\n",
    "3. Left at age 19 to 20\n",
    "4. Left at age 21 or above\n",
    "\n",
    "For fathers we use variable n195 that comes from the age 7 survey.\n",
    "\n",
    "The survey respondent is asked: Did the father stay at school after the minimum school leaving age? Yes/No\n",
    "\n",
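    "Then they are asked the follow-up question: If yes, at what age did he finish full-time education? (n195 comes from this question.)\n",
    "\n",
    "Fathers who did not stay on beyond the minimum leaving age appear to be recorded as 0 on n195 (see the tabulation below), so they fall into the lowest category.\n"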
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Father's Education\n",
      "\n",
      ". \n",
      ". numlabel n195, add\n",
      "\n",
      ". tab n195, mi\n",
      "\n",
      "    1P Dads age |\n",
      "      finishing |\n",
      " school if aftr |\n",
      "            min |      Freq.     Percent        Cum.\n",
      "----------------+-----------------------------------\n",
      "              0 |     11,068       59.64       59.64\n",
      "              2 |          2        0.01       59.65\n",
      "              3 |          4        0.02       59.67\n",
      "              6 |          3        0.02       59.69\n",
      "             10 |          1        0.01       59.69\n",
      "             11 |          3        0.02       59.71\n",
      "             12 |          4        0.02       59.73\n",
      "             13 |          2        0.01       59.74\n",
      "             14 |         91        0.49       60.23\n",
      "             15 |        380        2.05       62.28\n",
      "             16 |      1,153        6.21       68.49\n",
      "             17 |        641        3.45       71.95\n",
      "             18 |        419        2.26       74.21\n",
      "             19 |         83        0.45       74.65\n",
      "             20 |         70        0.38       75.03\n",
      "             21 |        102        0.55       75.58\n",
      "             22 |         90        0.48       76.06\n",
      "             23 |         75        0.40       76.47\n",
      "             24 |         98        0.53       77.00\n",
      "             25 |         58        0.31       77.31\n",
      "             26 |         32        0.17       77.48\n",
      "             27 |         16        0.09       77.57\n",
      "             28 |         12        0.06       77.63\n",
      "             29 |          7        0.04       77.67\n",
      "             30 |          8        0.04       77.71\n",
      "             31 |          2        0.01       77.72\n",
      "             32 |          1        0.01       77.73\n",
      "             33 |          2        0.01       77.74\n",
      "             35 |          2        0.01       77.75\n",
      "             39 |          1        0.01       77.76\n",
      "              . |      4,128       22.24      100.00\n",
      "----------------+-----------------------------------\n",
      "          Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". capture drop ncds_paed_cat\n",
      "\n",
      ".     gen ncds_paed_cat = .\n",
      "(18,558 missing values generated)\n",
      "\n",
      ". *Category 1 if they left education at age 15 or below\n",
      "\n",
      ".     replace ncds_paed_cat = 1 if (n195<=15)\n",
      "(11,558 real changes made)\n",
      "\n",
      ". *Category 2 if they left education from at age 16, 17 or 18\n",
      "\n",
      ".     replace ncds_paed_cat = 2 if ((n195>=16)&(n195<=18))\n",
      "(2,213 real changes made)\n",
      "\n",
      ". *Category 3 if they left education at age 19 or 20\n",
      "\n",
      ".     replace ncds_paed_cat = 3 if ((n195>=19)&(n195<=20))\n",
      "(153 real changes made)\n",
      "\n",
      ". *Category 4 if they left education at age 21+ (39 is highest year observed)\n",
      "\n",
      ".     replace ncds_paed_cat = 4 if ((n195>=21)&(n195<=39))\n",
      "(506 real changes made)\n",
      "\n",
      ".     tab ncds_paed_cat\n",
      "\n",
      "ncds_paed_c |\n",
      "         at |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |     11,558       80.10       80.10\n",
      "          2 |      2,213       15.34       95.43\n",
      "          3 |        153        1.06       96.49\n",
      "          4 |        506        3.51      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,430      100.00\n",
      "\n",
      ".     label define ed_cat 1 \"Comp\" 2 \"Comp+1-3\" 3 \"Comp+4-5\" 4 \"Comp+6+\"\n",
      "\n",
      ".     label values ncds_paed_cat ed_cat\n",
      "\n",
      ".     label variable ncds_paed_cat \"NCDS Father's Education Categories\"\n",
      "\n",
      ".     tab ncds_paed_cat\n",
      "\n",
      "       NCDS |\n",
      "   Father's |\n",
      "  Education |\n",
      " Categories |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "       Comp |     11,558       80.10       80.10\n",
      "   Comp+1-3 |      2,213       15.34       95.43\n",
      "   Comp+4-5 |        153        1.06       96.49\n",
      "    Comp+6+ |        506        3.51      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,430      100.00\n",
      "\n",
      ".     tab n195 ncds_paed_cat \n",
      "\n",
      "    1P Dads age |\n",
      "      finishing |\n",
      " school if aftr |     NCDS Father's Education Categories\n",
      "            min |      Comp   Comp+1-3   Comp+4-5    Comp+6+ |     Total\n",
      "----------------+--------------------------------------------+----------\n",
      "              0 |    11,068          0          0          0 |    11,068 \n",
      "              2 |         2          0          0          0 |         2 \n",
      "              3 |         4          0          0          0 |         4 \n",
      "              6 |         3          0          0          0 |         3 \n",
      "             10 |         1          0          0          0 |         1 \n",
      "             11 |         3          0          0          0 |         3 \n",
      "             12 |         4          0          0          0 |         4 \n",
      "             13 |         2          0          0          0 |         2 \n",
      "             14 |        91          0          0          0 |        91 \n",
      "             15 |       380          0          0          0 |       380 \n",
      "             16 |         0      1,153          0          0 |     1,153 \n",
      "             17 |         0        641          0          0 |       641 \n",
      "             18 |         0        419          0          0 |       419 \n",
      "             19 |         0          0         83          0 |        83 \n",
      "             20 |         0          0         70          0 |        70 \n",
      "             21 |         0          0          0        102 |       102 \n",
      "             22 |         0          0          0         90 |        90 \n",
      "             23 |         0          0          0         75 |        75 \n",
      "             24 |         0          0          0         98 |        98 \n",
      "             25 |         0          0          0         58 |        58 \n",
      "             26 |         0          0          0         32 |        32 \n",
      "             27 |         0          0          0         16 |        16 \n",
      "             28 |         0          0          0         12 |        12 \n",
      "             29 |         0          0          0          7 |         7 \n",
      "             30 |         0          0          0          8 |         8 \n",
      "             31 |         0          0          0          2 |         2 \n",
      "             32 |         0          0          0          1 |         1 \n",
      "             33 |         0          0          0          2 |         2 \n",
      "             35 |         0          0          0          2 |         2 \n",
      "             39 |         0          0          0          1 |         1 \n",
      "----------------+--------------------------------------------+----------\n",
      "          Total |    11,558      2,213        153        506 |    14,430 \n",
      "\n",
      "\n",
      ".     tab n195 ncds_paed_cat, mi\n",
      "\n",
      "    1P Dads age |\n",
      "      finishing |\n",
      " school if aftr |           NCDS Father's Education Categories\n",
      "            min |      Comp   Comp+1-3   Comp+4-5    Comp+6+          . |     Total\n",
      "----------------+-------------------------------------------------------+----------\n",
      "              0 |    11,068          0          0          0          0 |    11,068 \n",
      "              2 |         2          0          0          0          0 |         2 \n",
      "              3 |         4          0          0          0          0 |         4 \n",
      "              6 |         3          0          0          0          0 |         3 \n",
      "             10 |         1          0          0          0          0 |         1 \n",
      "             11 |         3          0          0          0          0 |         3 \n",
      "             12 |         4          0          0          0          0 |         4 \n",
      "             13 |         2          0          0          0          0 |         2 \n",
      "             14 |        91          0          0          0          0 |        91 \n",
      "             15 |       380          0          0          0          0 |       380 \n",
      "             16 |         0      1,153          0          0          0 |     1,153 \n",
      "             17 |         0        641          0          0          0 |       641 \n",
      "             18 |         0        419          0          0          0 |       419 \n",
      "             19 |         0          0         83          0          0 |        83 \n",
      "             20 |         0          0         70          0          0 |        70 \n",
      "             21 |         0          0          0        102          0 |       102 \n",
      "             22 |         0          0          0         90          0 |        90 \n",
      "             23 |         0          0          0         75          0 |        75 \n",
      "             24 |         0          0          0         98          0 |        98 \n",
      "             25 |         0          0          0         58          0 |        58 \n",
      "             26 |         0          0          0         32          0 |        32 \n",
      "             27 |         0          0          0         16          0 |        16 \n",
      "             28 |         0          0          0         12          0 |        12 \n",
      "             29 |         0          0          0          7          0 |         7 \n",
      "             30 |         0          0          0          8          0 |         8 \n",
      "             31 |         0          0          0          2          0 |         2 \n",
      "             32 |         0          0          0          1          0 |         1 \n",
      "             33 |         0          0          0          2          0 |         2 \n",
      "             35 |         0          0          0          2          0 |         2 \n",
      "             39 |         0          0          0          1          0 |         1 \n",
      "              . |         0          0          0          0      4,128 |     4,128 \n",
      "----------------+-------------------------------------------------------+----------\n",
      "          Total |    11,558      2,213        153        506      4,128 |    18,558 \n",
      "\n",
      "\n",
      ".     \n"
     ]
    }
   ],
   "source": [
    "*Father's Education\n",
    "\n",
    "numlabel n195, add\n",
    "tab n195, mi\n",
    "\n",
    "capture drop ncds_paed_cat\n",
    "    gen ncds_paed_cat = .\n",
    "*Category 1 if they left education at age 15 or below\n",
    "    replace ncds_paed_cat = 1 if (n195<=15)\n",
    "*Category 2 if they left education at age 16, 17 or 18\n",
    "    replace ncds_paed_cat = 2 if ((n195>=16)&(n195<=18))\n",
    "*Category 3 if they left education at age 19 or 20\n",
    "    replace ncds_paed_cat = 3 if ((n195>=19)&(n195<=20))\n",
    "*Category 4 if they left education at age 21+ (39 is highest year observed)\n",
    "    replace ncds_paed_cat = 4 if ((n195>=21)&(n195<=39))\n",
    "    tab ncds_paed_cat\n",
    "    label define ed_cat 1 \"Comp\" 2 \"Comp+1-3\" 3 \"Comp+4-5\" 4 \"Comp+6+\"\n",
    "    label values ncds_paed_cat ed_cat\n",
    "    label variable ncds_paed_cat \"NCDS Father's Education Categories\"\n",
    "    tab ncds_paed_cat\n",
    "    tab n195 ncds_paed_cat \n",
    "    tab n195 ncds_paed_cat, mi\n",
    "    \n",
    "*return to jupyter\n"
   ]
  },
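  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same four-category coding of n195 can be sketched more compactly with `recode`. This is an illustrative alternative rather than the code used in the analysis; `ncds_paed_alt` is a hypothetical variable name introduced only for this check, and the sketch assumes the cell above has already been run.\n",
    "\n",
    "```stata\n",
    "* Sketch only: ncds_paed_alt is a hypothetical name, not used elsewhere\n",
    "capture drop ncds_paed_alt\n",
    "* min and max refer to non-missing values, so missing n195 stays missing\n",
    "recode n195 (min/15 = 1) (16/18 = 2) (19/20 = 3) (21/max = 4), gen(ncds_paed_alt)\n",
    "label values ncds_paed_alt ed_cat\n",
    "* The two codings should agree for every cohort member\n",
    "assert ncds_paed_alt == ncds_paed_cat\n",
    "```\n",
    "\n",
    "Using `21/max` rather than an explicit upper bound of 39 avoids hard-coding the highest observed value.\n"
   ]
  },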
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For mothers' years of education we use variable n2397, which comes from the age 16 survey.\n",
    "\n",
    "From the age 0 survey we have variable n537 available to us. This comes from the question: Did the patient stay at school after the minimum school-leaving age? Yes/No\n",
    "\n",
    "In the survey this question is followed up with the question: At what age did she finish her full-time education? But the answer to this question does not appear to be deposited with the data, so we have to gather this information from a later sweep of the survey (the age 16 survey). This is suboptimal, but this information does not appear to be available from the earlier survey sweeps.\n",
    "\n",
    "Variable n2397 comes from the question: At what age did mother or mother figure leave full-time education?\n",
    "\n",
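    "This is not a continuous variable; it is grouped into age bands. We categorise the mothers' years of education using the same method outlined above. Because the bands do not line up exactly with the four categories, each band is assigned according to its lower bound (for example, '15 to 16 years' is placed in the lowest category), and 'Not known' is left as missing.\n"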
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Mother's Education\n",
      "\n",
      ". \n",
      ". numlabel n2397, add\n",
      "\n",
      ". tab n2397, mi\n",
      "\n",
      "  3P Age mother figr |\n",
      " left full-time educ |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "     1. under 13 yrs |        157        0.85        0.85\n",
      "   2. 13 to 14 years |        102        0.55        1.40\n",
      "   3. 14 to 15 years |      5,258       28.33       29.73\n",
      "   4. 15 to 16 years |      3,438       18.53       48.25\n",
      "   5. 16 to 17 years |      1,307        7.04       55.30\n",
      "   6. 17 to 18 years |        504        2.72       58.01\n",
      "   7. 18 to 19 years |        278        1.50       59.51\n",
      "   8. 19 to 21 years |        153        0.82       60.34\n",
      "   9. 21 to 23 years |        187        1.01       61.34\n",
      "10. 23 or more years |         47        0.25       61.60\n",
      "       11. Not known |         43        0.23       61.83\n",
      "                   . |      7,084       38.17      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". capture drop ncds_moed_cat\n",
      "\n",
      ".     gen ncds_moed_cat = .\n",
      "(18,558 missing values generated)\n",
      "\n",
      ". *Category 1 if they left education at age 15 or below\n",
      "\n",
      ".     replace ncds_moed_cat = 1 if (n2397<=4)\n",
      "(8,955 real changes made)\n",
      "\n",
      ". *Category 2 if they left education at age 16, 17 or 18\n",
      "\n",
      ".     replace ncds_moed_cat = 2 if ((n2397>=5)&(n2397<=7))\n",
      "(2,089 real changes made)\n",
      "\n",
      ". *Category 3 if they left education at age 19 or 20\n",
      "\n",
      ".     replace ncds_moed_cat = 3 if (n2397==8)\n",
      "(153 real changes made)\n",
      "\n",
      ". *Category 4 if they left education at age 21+ \n",
      "\n",
      ".     replace ncds_moed_cat = 4 if ((n2397>=9)&(n2397<=10))\n",
      "(234 real changes made)\n",
      "\n",
      ".     tab ncds_moed_cat\n",
      "\n",
      "ncds_moed_c |\n",
      "         at |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |      8,955       78.34       78.34\n",
      "          2 |      2,089       18.27       96.61\n",
      "          3 |        153        1.34       97.95\n",
      "          4 |        234        2.05      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     11,431      100.00\n",
      "\n",
      ".     label values ncds_moed_cat ed_cat\n",
      "\n",
      ".     label variable ncds_moed_cat \"NCDS Mother's Education Categories\"\n",
      "\n",
      ".     tab ncds_moed_cat\n",
      "\n",
      "       NCDS |\n",
      "   Mother's |\n",
      "  Education |\n",
      " Categories |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "       Comp |      8,955       78.34       78.34\n",
      "   Comp+1-3 |      2,089       18.27       96.61\n",
      "   Comp+4-5 |        153        1.34       97.95\n",
      "    Comp+6+ |        234        2.05      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     11,431      100.00\n",
      "\n",
      ".     tab n2397 ncds_moed_cat \n",
      "\n",
      "  3P Age mother figr |     NCDS Mother's Education Categories\n",
      " left full-time educ |      Comp   Comp+1-3   Comp+4-5    Comp+6+ |     Total\n",
      "---------------------+--------------------------------------------+----------\n",
      "     1. under 13 yrs |       157          0          0          0 |       157 \n",
      "   2. 13 to 14 years |       102          0          0          0 |       102 \n",
      "   3. 14 to 15 years |     5,258          0          0          0 |     5,258 \n",
      "   4. 15 to 16 years |     3,438          0          0          0 |     3,438 \n",
      "   5. 16 to 17 years |         0      1,307          0          0 |     1,307 \n",
      "   6. 17 to 18 years |         0        504          0          0 |       504 \n",
      "   7. 18 to 19 years |         0        278          0          0 |       278 \n",
      "   8. 19 to 21 years |         0          0        153          0 |       153 \n",
      "   9. 21 to 23 years |         0          0          0        187 |       187 \n",
      "10. 23 or more years |         0          0          0         47 |        47 \n",
      "---------------------+--------------------------------------------+----------\n",
      "               Total |     8,955      2,089        153        234 |    11,431 \n",
      "\n",
      "\n",
      ".     tab n2397 ncds_moed_cat, mi\n",
      "\n",
      "  3P Age mother figr |           NCDS Mother's Education Categories\n",
      " left full-time educ |      Comp   Comp+1-3   Comp+4-5    Comp+6+          . |     Total\n",
      "---------------------+-------------------------------------------------------+----------\n",
      "     1. under 13 yrs |       157          0          0          0          0 |       157 \n",
      "   2. 13 to 14 years |       102          0          0          0          0 |       102 \n",
      "   3. 14 to 15 years |     5,258          0          0          0          0 |     5,258 \n",
      "   4. 15 to 16 years |     3,438          0          0          0          0 |     3,438 \n",
      "   5. 16 to 17 years |         0      1,307          0          0          0 |     1,307 \n",
      "   6. 17 to 18 years |         0        504          0          0          0 |       504 \n",
      "   7. 18 to 19 years |         0        278          0          0          0 |       278 \n",
      "   8. 19 to 21 years |         0          0        153          0          0 |       153 \n",
      "   9. 21 to 23 years |         0          0          0        187          0 |       187 \n",
      "10. 23 or more years |         0          0          0         47          0 |        47 \n",
      "       11. Not known |         0          0          0          0         43 |        43 \n",
      "                   . |         0          0          0          0      7,084 |     7,084 \n",
      "---------------------+-------------------------------------------------------+----------\n",
      "               Total |     8,955      2,089        153        234      7,127 |    18,558 \n",
      "\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Mother's Education\n",
    "\n",
    "numlabel n2397, add\n",
    "tab n2397, mi\n",
    "\n",
    "capture drop ncds_moed_cat\n",
    "    gen ncds_moed_cat = .\n",
    "*Category 1 if they left education at age 15 or below\n",
    "    replace ncds_moed_cat = 1 if (n2397<=4)\n",
    "*Category 2 if they left education at age 16, 17 or 18\n",
    "    replace ncds_moed_cat = 2 if ((n2397>=5)&(n2397<=7))\n",
    "*Category 3 if they left education at age 19 or 20\n",
    "    replace ncds_moed_cat = 3 if (n2397==8)\n",
    "*Category 4 if they left education at age 21+ \n",
    "    replace ncds_moed_cat = 4 if ((n2397>=9)&(n2397<=10))\n",
    "    tab ncds_moed_cat\n",
    "    label values ncds_moed_cat ed_cat\n",
    "    label variable ncds_moed_cat \"NCDS Mother's Education Categories\"\n",
    "    tab ncds_moed_cat\n",
    "    tab n2397 ncds_moed_cat \n",
    "    tab n2397 ncds_moed_cat, mi\n",
    "\n",
    "*return to jupyter"
   ]
  },
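  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A brief consistency check on the mother's education coding can be sketched as follows. It assumes, as the tabulation above suggests, that codes 1 to 10 of n2397 are the substantive age bands and that code 11 ('Not known') is the only non-substantive code.\n",
    "\n",
    "```stata\n",
    "* Code 11 (Not known) and system missing should both remain missing\n",
    "assert missing(ncds_moed_cat) if n2397 == 11 | missing(n2397)\n",
    "* Every substantive band (codes 1 to 10) should have received a category\n",
    "assert !missing(ncds_moed_cat) if inrange(n2397, 1, 10)\n",
    "```\n",
    "\n",
    "If either `assert` fails, Stata stops with an error, which flags a band that has been left uncoded or a non-substantive code that has been assigned a category.\n"
   ]
  },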
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Again in line with Cheung and Egerton (2007, pp. 206-207), we take the higher of the two parents' education categories to create a parental education variable.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Highest of the parent's education \n",
      "\n",
      ". \n",
      ". capture drop ncds_parented\n",
      "\n",
      ". *Highest of father's education and mother's education\n",
      "\n",
      ". egen ncds_parented = rmax(ncds_paed_cat ncds_moed_cat)\n",
      "(2631 missing values generated)\n",
      "\n",
      ". tab ncds_parented\n",
      "\n",
      "ncds_parent |\n",
      "         ed |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |     11,659       73.20       73.20\n",
      "          2 |      3,384       21.25       94.45\n",
      "          3 |        246        1.54       95.99\n",
      "          4 |        638        4.01      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     15,927      100.00\n",
      "\n",
      ". label values ncds_parented ed_cat\n",
      "\n",
      ". label variable ncds_parented \"NCDS Parent's Highest Education\"\n",
      "\n",
      ". tab ncds_parented\n",
      "\n",
      "       NCDS |\n",
      "   Parent's |\n",
      "    Highest |\n",
      "  Education |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "       Comp |     11,659       73.20       73.20\n",
      "   Comp+1-3 |      3,384       21.25       94.45\n",
      "   Comp+4-5 |        246        1.54       95.99\n",
      "    Comp+6+ |        638        4.01      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     15,927      100.00\n",
      "\n",
      ". \n",
      ". tab ncds_parented ncds_paed_cat\n",
      "\n",
      "      NCDS |\n",
      "  Parent's |\n",
      "   Highest |     NCDS Father's Education Categories\n",
      " Education |      Comp   Comp+1-3   Comp+4-5    Comp+6+ |     Total\n",
      "-----------+--------------------------------------------+----------\n",
      "      Comp |    10,505          0          0          0 |    10,505 \n",
      "  Comp+1-3 |       972      2,111          0          0 |     3,083 \n",
      "  Comp+4-5 |        36         48        143          0 |       227 \n",
      "   Comp+6+ |        45         54         10        506 |       615 \n",
      "-----------+--------------------------------------------+----------\n",
      "     Total |    11,558      2,213        153        506 |    14,430 \n",
      "\n",
      "\n",
      ". tab ncds_parented ncds_moed_cat\n",
      "\n",
      "      NCDS |\n",
      "  Parent's |\n",
      "   Highest |     NCDS Mother's Education Categories\n",
      " Education |      Comp   Comp+1-3   Comp+4-5    Comp+6+ |     Total\n",
      "-----------+--------------------------------------------+----------\n",
      "      Comp |     8,041          0          0          0 |     8,041 \n",
      "  Comp+1-3 |       807      1,907          0          0 |     2,714 \n",
      "  Comp+4-5 |        39         43        115          0 |       197 \n",
      "   Comp+6+ |        68        139         38        234 |       479 \n",
      "-----------+--------------------------------------------+----------\n",
      "     Total |     8,955      2,089        153        234 |    11,431 \n",
      "\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Highest of the parent's education \n",
    "\n",
    "capture drop ncds_parented\n",
    "*Highest of father's education and mother's education\n",
    "egen ncds_parented = rmax(ncds_paed_cat ncds_moed_cat)\n",
    "tab ncds_parented\n",
    "label values ncds_parented ed_cat\n",
    "label variable ncds_parented \"NCDS Parent's Highest Education\"\n",
    "tab ncds_parented\n",
    "\n",
    "tab ncds_parented ncds_paed_cat\n",
    "tab ncds_parented ncds_moed_cat\n",
    "\n",
    "* return to jupyter"
   ]
  },
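  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The row maximum ignores missing values, so a cohort member with information on only one parent takes that parent's category, and the result is missing only when both parents' categories are missing. The sketch below illustrates this with the `max()` function; `ncds_parented_alt` is a hypothetical variable name introduced only for this check. In current Stata documentation the `egen` function used above appears under the name `rowmax()`.\n",
    "\n",
    "```stata\n",
    "* Sketch only: ncds_parented_alt is a hypothetical name used for this check\n",
    "capture drop ncds_parented_alt\n",
    "* max() ignores missing arguments and returns missing only if all are missing\n",
    "gen ncds_parented_alt = max(ncds_paed_cat, ncds_moed_cat)\n",
    "assert ncds_parented_alt == ncds_parented\n",
    "* Cases with no information on either parent\n",
    "count if missing(ncds_paed_cat) & missing(ncds_moed_cat)\n",
    "```\n",
    "\n",
    "The final count should match the number of missing values that `egen` reports when ncds_parented is created.\n"
   ]
  },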
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
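    "Here we identify the cohort member's country (England, Wales or Scotland) at each of the first three sweeps of the survey: birth, age 7 and age 11. Note that the variable derived from the age 7 sweep (n1region) is named ncds5_country in the code below."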
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". tab n0region\n",
      "\n",
      " Region at PMS |\n",
      "(1958) - Birth |      Freq.     Percent        Cum.\n",
      "---------------+-----------------------------------\n",
      "         North |      1,234        7.09        7.09\n",
      "    North West |      2,295       13.18       20.26\n",
      "  E & W.Riding |      1,433        8.23       28.49\n",
      "North Midlands |      1,299        7.46       35.95\n",
      "      Midlands |      1,648        9.46       45.41\n",
      "          East |      1,242        7.13       52.54\n",
      "    South East |      3,445       19.78       72.32\n",
      "         South |        955        5.48       77.81\n",
      "    South West |        966        5.55       83.35\n",
      "         Wales |        914        5.25       88.60\n",
      "      Scotland |      1,985       11.40      100.00\n",
      "---------------+-----------------------------------\n",
      "         Total |     17,416      100.00\n",
      "\n",
      ". capture drop ncds0_country\n",
      "\n",
      ".     gen ncds0_country = .\n",
      "(18,558 missing values generated)\n",
      "\n",
      ".     replace ncds0_country = 1 if (n0region>=1)&(n0region<=9)\n",
      "(14,517 real changes made)\n",
      "\n",
      ".     replace ncds0_country = 2 if (n0region==10)\n",
      "(914 real changes made)\n",
      "\n",
      ".     replace ncds0_country = 3 if (n0region==11)\n",
      "(1,985 real changes made)\n",
      "\n",
      ".     label define country 1 \"England\" 2 \"Wales\" 3 \"Scotland\"\n",
      "\n",
      ".     label values ncds0_country country\n",
      "\n",
      ".     label variable ncds0_country \"NCDS Age 0 Country\"\n",
      "\n",
      ".     tab ncds0_country\n",
      "\n",
      " NCDS Age 0 |\n",
      "    Country |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "    England |     14,517       83.35       83.35\n",
      "      Wales |        914        5.25       88.60\n",
      "   Scotland |      1,985       11.40      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,416      100.00\n",
      "\n",
      ". \n",
      ". tab n1region\n",
      "\n",
      "     Region at |\n",
      "NCDS1 (1965) - |\n",
      "       7 years |      Freq.     Percent        Cum.\n",
      "---------------+-----------------------------------\n",
      "         North |      1,126        7.31        7.31\n",
      "    North West |      1,980       12.85       20.16\n",
      "  E & W.Riding |      1,286        8.35       28.51\n",
      "North Midlands |      1,180        7.66       36.17\n",
      "      Midlands |      1,499        9.73       45.89\n",
      "          East |      1,181        7.67       53.56\n",
      "    South East |      2,815       18.27       71.83\n",
      "         South |        948        6.15       77.98\n",
      "    South West |        930        6.04       84.02\n",
      "         Wales |        822        5.34       89.36\n",
      "      Scotland |      1,640       10.64      100.00\n",
      "---------------+-----------------------------------\n",
      "         Total |     15,407      100.00\n",
      "\n",
      ". capture drop ncds5_country\n",
      "\n",
      ".     gen ncds5_country = .\n",
      "(18,558 missing values generated)\n",
      "\n",
      ".     replace ncds5_country = 1 if (n1region>=1)&(n1region<=9)\n",
      "(12,945 real changes made)\n",
      "\n",
      ".     replace ncds5_country = 2 if (n1region==10)\n",
      "(822 real changes made)\n",
      "\n",
      ".     replace ncds5_country = 3 if (n1region==11)\n",
      "(1,640 real changes made)\n",
      "\n",
      ".     label values ncds5_country country\n",
      "\n",
      ".     label variable ncds5_country \"NCDS Age 5 Country\"\n",
      "\n",
      ".     tab ncds5_country\n",
      "\n",
      " NCDS Age 5 |\n",
      "    Country |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "    England |     12,945       84.02       84.02\n",
      "      Wales |        822        5.34       89.36\n",
      "   Scotland |      1,640       10.64      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     15,407      100.00\n",
      "\n",
      ". \n",
      ". tab n2region\n",
      "\n",
      "     Region at |\n",
      "NCDS2 (1969) - |\n",
      "      11 years |      Freq.     Percent        Cum.\n",
      "---------------+-----------------------------------\n",
      "         North |      1,063        6.92        6.92\n",
      "    North West |      1,943       12.65       19.58\n",
      "  E & W.Riding |      1,303        8.49       28.06\n",
      "North Midlands |      1,181        7.69       35.75\n",
      "      Midlands |      1,438        9.36       45.12\n",
      "          East |      1,310        8.53       53.65\n",
      "    South East |      2,793       18.19       71.84\n",
      "         South |        962        6.26       78.10\n",
      "    South West |        962        6.26       84.36\n",
      "         Wales |        817        5.32       89.68\n",
      "      Scotland |      1,584       10.32      100.00\n",
      "---------------+-----------------------------------\n",
      "         Total |     15,356      100.00\n",
      "\n",
      ". capture drop ncds11_country\n",
      "\n",
      ".     gen ncds11_country = .\n",
      "(18,558 missing values generated)\n",
      "\n",
      ".     replace ncds11_country = 1 if (n2region>=1)&(n2region<=9)\n",
      "(12,955 real changes made)\n",
      "\n",
      ".     replace ncds11_country = 2 if (n2region==10)\n",
      "(817 real changes made)\n",
      "\n",
      ".     replace ncds11_country = 3 if (n2region==11)\n",
      "(1,584 real changes made)\n",
      "\n",
      ".     label values ncds11_country country\n",
      "\n",
      ".     label variable ncds11_country \"NCDS Age 11 Country\"\n",
      "\n",
      ".     tab ncds11_country\n",
      "\n",
      "NCDS Age 11 |\n",
      "    Country |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "    England |     12,955       84.36       84.36\n",
      "      Wales |        817        5.32       89.68\n",
      "   Scotland |      1,584       10.32      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     15,356      100.00\n",
      "\n",
      ".  \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "tab n0region\n",
    "capture drop ncds0_country\n",
    "    gen ncds0_country = .\n",
    "    replace ncds0_country = 1 if (n0region>=1)&(n0region<=9)\n",
    "    replace ncds0_country = 2 if (n0region==10)\n",
    "    replace ncds0_country = 3 if (n0region==11)\n",
    "    label define country 1 \"England\" 2 \"Wales\" 3 \"Scotland\"\n",
    "    label values ncds0_country country\n",
    "    label variable ncds0_country \"NCDS Age 0 Country\"\n",
    "    tab ncds0_country\n",
    "\n",
    "tab n1region\n",
    "capture drop ncds5_country\n",
    "    gen ncds5_country = .\n",
    "    replace ncds5_country = 1 if (n1region>=1)&(n1region<=9)\n",
    "    replace ncds5_country = 2 if (n1region==10)\n",
    "    replace ncds5_country = 3 if (n1region==11)\n",
    "    label values ncds5_country country\n",
    "    label variable ncds5_country \"NCDS Age 5 Country\"\n",
    "    tab ncds5_country\n",
    "\n",
    "tab n2region\n",
    "capture drop ncds11_country\n",
    "    gen ncds11_country = .\n",
    "    replace ncds11_country = 1 if (n2region>=1)&(n2region<=9)\n",
    "    replace ncds11_country = 2 if (n2region==10)\n",
    "    replace ncds11_country = 3 if (n2region==11)\n",
    "    label values ncds11_country country\n",
    "    label variable ncds11_country \"NCDS Age 11 Country\"\n",
    "    tab ncds11_country\n",
    " \n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we clean the general ability test variable. We use variable n920, taken from the age 11 sweep (NCDS2, 1969). This variable is described in the NCDS documentation [here](http://www.cls.ioe.ac.uk/library-media%5Cdocuments%5CNCDS%20user%20guide%20-%20NCDS1-3%20Measures%20of%20ability%20-%20P%20Shepherd%20-%20December%202012.pdf). The original test materials can be viewed on the NCDS [webpage](http://www.cls.ioe.ac.uk/page.aspx?&sitesectionid=769&sitesectiontitle=Questionnaires).\n",
    "\n",
    "This ability test score variable has been used in previous social stratification studies. For example, see Breen, R., & Yaish, M. (2006). [Testing the Breen-Goldthorpe model of educational decision making.](https://www.researchgate.net/profile/Meir_Yaish/publication/228380994_Testing_the_Breen-Goldthorpe_model_of_educational_decision_making/links/0912f511eaa8d3d23b000000.pdf) Mobility and inequality, 232-258.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". tab n920\n",
      "\n",
      "   2T Total |\n",
      "   score on |\n",
      "    general |\n",
      "    ability |\n",
      "       test |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |         66        0.47        0.47\n",
      "          1 |          4        0.03        0.50\n",
      "          2 |          7        0.05        0.54\n",
      "          3 |          9        0.06        0.61\n",
      "          4 |          7        0.05        0.66\n",
      "          5 |          6        0.04        0.70\n",
      "          6 |         16        0.11        0.81\n",
      "          7 |         18        0.13        0.94\n",
      "          8 |         23        0.16        1.10\n",
      "          9 |         30        0.21        1.32\n",
      "         10 |         50        0.35        1.67\n",
      "         11 |         47        0.33        2.00\n",
      "         12 |         56        0.40        2.40\n",
      "         13 |         81        0.57        2.97\n",
      "         14 |        107        0.76        3.73\n",
      "         15 |        113        0.80        4.53\n",
      "         16 |        124        0.88        5.41\n",
      "         17 |        149        1.05        6.46\n",
      "         18 |        131        0.93        7.39\n",
      "         19 |        154        1.09        8.48\n",
      "         20 |        182        1.29        9.77\n",
      "         21 |        174        1.23       11.00\n",
      "         22 |        181        1.28       12.28\n",
      "         23 |        194        1.37       13.65\n",
      "         24 |        196        1.39       15.04\n",
      "         25 |        203        1.44       16.47\n",
      "         26 |        228        1.61       18.09\n",
      "         27 |        220        1.56       19.64\n",
      "         28 |        214        1.51       21.16\n",
      "         29 |        223        1.58       22.74\n",
      "         30 |        255        1.80       24.54\n",
      "         31 |        263        1.86       26.40\n",
      "         32 |        250        1.77       28.17\n",
      "         33 |        250        1.77       29.94\n",
      "         34 |        218        1.54       31.48\n",
      "         35 |        290        2.05       33.54\n",
      "         36 |        284        2.01       35.55\n",
      "         37 |        273        1.93       37.48\n",
      "         38 |        287        2.03       39.51\n",
      "         39 |        279        1.97       41.48\n",
      "         40 |        272        1.92       43.41\n",
      "         41 |        283        2.00       45.41\n",
      "         42 |        290        2.05       47.46\n",
      "         43 |        246        1.74       49.20\n",
      "         44 |        299        2.12       51.32\n",
      "         45 |        328        2.32       53.64\n",
      "         46 |        280        1.98       55.62\n",
      "         47 |        301        2.13       57.75\n",
      "         48 |        332        2.35       60.10\n",
      "         49 |        324        2.29       62.39\n",
      "         50 |        295        2.09       64.48\n",
      "         51 |        306        2.17       66.65\n",
      "         52 |        282        2.00       68.64\n",
      "         53 |        271        1.92       70.56\n",
      "         54 |        286        2.02       72.59\n",
      "         55 |        289        2.05       74.63\n",
      "         56 |        302        2.14       76.77\n",
      "         57 |        306        2.17       78.93\n",
      "         58 |        276        1.95       80.89\n",
      "         59 |        278        1.97       82.85\n",
      "         60 |        247        1.75       84.60\n",
      "         61 |        233        1.65       86.25\n",
      "         62 |        220        1.56       87.81\n",
      "         63 |        203        1.44       89.24\n",
      "         64 |        216        1.53       90.77\n",
      "         65 |        200        1.42       92.19\n",
      "         66 |        175        1.24       93.43\n",
      "         67 |        176        1.25       94.67\n",
      "         68 |        143        1.01       95.68\n",
      "         69 |        115        0.81       96.50\n",
      "         70 |        110        0.78       97.28\n",
      "         71 |         81        0.57       97.85\n",
      "         72 |         89        0.63       98.48\n",
      "         73 |         57        0.40       98.88\n",
      "         74 |         60        0.42       99.31\n",
      "         75 |         34        0.24       99.55\n",
      "         76 |         27        0.19       99.74\n",
      "         77 |         20        0.14       99.88\n",
      "         78 |         10        0.07       99.95\n",
      "         79 |          6        0.04       99.99\n",
      "         80 |          1        0.01      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,131      100.00\n",
      "\n",
      ". rename n920 ncds11_bastotalscore\n",
      "\n",
      ". summ ncds11_bastotalscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "ncds11_bas~e |     14,131    42.94041    16.14388          0         80\n",
      "\n",
      ". \n",
      ". *standardise to mean 0 sd 1\n",
      "\n",
      ". tab ncds11_bastotalscore\n",
      "\n",
      "   2T Total |\n",
      "   score on |\n",
      "    general |\n",
      "    ability |\n",
      "       test |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |         66        0.47        0.47\n",
      "          1 |          4        0.03        0.50\n",
      "          2 |          7        0.05        0.54\n",
      "          3 |          9        0.06        0.61\n",
      "          4 |          7        0.05        0.66\n",
      "          5 |          6        0.04        0.70\n",
      "          6 |         16        0.11        0.81\n",
      "          7 |         18        0.13        0.94\n",
      "          8 |         23        0.16        1.10\n",
      "          9 |         30        0.21        1.32\n",
      "         10 |         50        0.35        1.67\n",
      "         11 |         47        0.33        2.00\n",
      "         12 |         56        0.40        2.40\n",
      "         13 |         81        0.57        2.97\n",
      "         14 |        107        0.76        3.73\n",
      "         15 |        113        0.80        4.53\n",
      "         16 |        124        0.88        5.41\n",
      "         17 |        149        1.05        6.46\n",
      "         18 |        131        0.93        7.39\n",
      "         19 |        154        1.09        8.48\n",
      "         20 |        182        1.29        9.77\n",
      "         21 |        174        1.23       11.00\n",
      "         22 |        181        1.28       12.28\n",
      "         23 |        194        1.37       13.65\n",
      "         24 |        196        1.39       15.04\n",
      "         25 |        203        1.44       16.47\n",
      "         26 |        228        1.61       18.09\n",
      "         27 |        220        1.56       19.64\n",
      "         28 |        214        1.51       21.16\n",
      "         29 |        223        1.58       22.74\n",
      "         30 |        255        1.80       24.54\n",
      "         31 |        263        1.86       26.40\n",
      "         32 |        250        1.77       28.17\n",
      "         33 |        250        1.77       29.94\n",
      "         34 |        218        1.54       31.48\n",
      "         35 |        290        2.05       33.54\n",
      "         36 |        284        2.01       35.55\n",
      "         37 |        273        1.93       37.48\n",
      "         38 |        287        2.03       39.51\n",
      "         39 |        279        1.97       41.48\n",
      "         40 |        272        1.92       43.41\n",
      "         41 |        283        2.00       45.41\n",
      "         42 |        290        2.05       47.46\n",
      "         43 |        246        1.74       49.20\n",
      "         44 |        299        2.12       51.32\n",
      "         45 |        328        2.32       53.64\n",
      "         46 |        280        1.98       55.62\n",
      "         47 |        301        2.13       57.75\n",
      "         48 |        332        2.35       60.10\n",
      "         49 |        324        2.29       62.39\n",
      "         50 |        295        2.09       64.48\n",
      "         51 |        306        2.17       66.65\n",
      "         52 |        282        2.00       68.64\n",
      "         53 |        271        1.92       70.56\n",
      "         54 |        286        2.02       72.59\n",
      "         55 |        289        2.05       74.63\n",
      "         56 |        302        2.14       76.77\n",
      "         57 |        306        2.17       78.93\n",
      "         58 |        276        1.95       80.89\n",
      "         59 |        278        1.97       82.85\n",
      "         60 |        247        1.75       84.60\n",
      "         61 |        233        1.65       86.25\n",
      "         62 |        220        1.56       87.81\n",
      "         63 |        203        1.44       89.24\n",
      "         64 |        216        1.53       90.77\n",
      "         65 |        200        1.42       92.19\n",
      "         66 |        175        1.24       93.43\n",
      "         67 |        176        1.25       94.67\n",
      "         68 |        143        1.01       95.68\n",
      "         69 |        115        0.81       96.50\n",
      "         70 |        110        0.78       97.28\n",
      "         71 |         81        0.57       97.85\n",
      "         72 |         89        0.63       98.48\n",
      "         73 |         57        0.40       98.88\n",
      "         74 |         60        0.42       99.31\n",
      "         75 |         34        0.24       99.55\n",
      "         76 |         27        0.19       99.74\n",
      "         77 |         20        0.14       99.88\n",
      "         78 |         10        0.07       99.95\n",
      "         79 |          6        0.04       99.99\n",
      "         80 |          1        0.01      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,131      100.00\n",
      "\n",
      ". capture drop sncds11_bastotalscore\n",
      "\n",
      ". egen sncds11_bastotalscore = std(ncds11_bastotalscore)\n",
      "(4427 missing values generated)\n",
      "\n",
      ". tab sncds11_bastotalscore\n",
      "\n",
      "Standardize |\n",
      "d values of |\n",
      "(ncds11_bas |\n",
      "totalscore) |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "  -2.659858 |         66        0.47        0.47\n",
      "  -2.597915 |          4        0.03        0.50\n",
      "  -2.535972 |          7        0.05        0.54\n",
      "  -2.474029 |          9        0.06        0.61\n",
      "  -2.412086 |          7        0.05        0.66\n",
      "  -2.350143 |          6        0.04        0.70\n",
      "    -2.2882 |         16        0.11        0.81\n",
      "  -2.226257 |         18        0.13        0.94\n",
      "  -2.164314 |         23        0.16        1.10\n",
      "  -2.102371 |         30        0.21        1.32\n",
      "  -2.040428 |         50        0.35        1.67\n",
      "  -1.978485 |         47        0.33        2.00\n",
      "  -1.916542 |         56        0.40        2.40\n",
      "  -1.854599 |         81        0.57        2.97\n",
      "  -1.792656 |        107        0.76        3.73\n",
      "  -1.730713 |        113        0.80        4.53\n",
      "   -1.66877 |        124        0.88        5.41\n",
      "  -1.606827 |        149        1.05        6.46\n",
      "  -1.544884 |        131        0.93        7.39\n",
      "  -1.482941 |        154        1.09        8.48\n",
      "  -1.420998 |        182        1.29        9.77\n",
      "  -1.359055 |        174        1.23       11.00\n",
      "  -1.297112 |        181        1.28       12.28\n",
      "  -1.235169 |        194        1.37       13.65\n",
      "  -1.173226 |        196        1.39       15.04\n",
      "  -1.111283 |        203        1.44       16.47\n",
      "   -1.04934 |        228        1.61       18.09\n",
      "   -.987397 |        220        1.56       19.64\n",
      "   -.925454 |        214        1.51       21.16\n",
      "   -.863511 |        223        1.58       22.74\n",
      "   -.801568 |        255        1.80       24.54\n",
      "   -.739625 |        263        1.86       26.40\n",
      "   -.677682 |        250        1.77       28.17\n",
      "   -.615739 |        250        1.77       29.94\n",
      "   -.553796 |        218        1.54       31.48\n",
      "   -.491853 |        290        2.05       33.54\n",
      "    -.42991 |        284        2.01       35.55\n",
      "   -.367967 |        273        1.93       37.48\n",
      "   -.306024 |        287        2.03       39.51\n",
      "  -.2440811 |        279        1.97       41.48\n",
      "  -.1821381 |        272        1.92       43.41\n",
      "  -.1201951 |        283        2.00       45.41\n",
      "  -.0582521 |        290        2.05       47.46\n",
      "   .0036909 |        246        1.74       49.20\n",
      "   .0656339 |        299        2.12       51.32\n",
      "   .1275769 |        328        2.32       53.64\n",
      "   .1895199 |        280        1.98       55.62\n",
      "   .2514628 |        301        2.13       57.75\n",
      "   .3134058 |        332        2.35       60.10\n",
      "   .3753488 |        324        2.29       62.39\n",
      "   .4372918 |        295        2.09       64.48\n",
      "   .4992348 |        306        2.17       66.65\n",
      "   .5611778 |        282        2.00       68.64\n",
      "   .6231208 |        271        1.92       70.56\n",
      "   .6850638 |        286        2.02       72.59\n",
      "   .7470068 |        289        2.05       74.63\n",
      "   .8089498 |        302        2.14       76.77\n",
      "   .8708928 |        306        2.17       78.93\n",
      "   .9328358 |        276        1.95       80.89\n",
      "   .9947788 |        278        1.97       82.85\n",
      "   1.056722 |        247        1.75       84.60\n",
      "   1.118665 |        233        1.65       86.25\n",
      "   1.180608 |        220        1.56       87.81\n",
      "   1.242551 |        203        1.44       89.24\n",
      "   1.304494 |        216        1.53       90.77\n",
      "   1.366437 |        200        1.42       92.19\n",
      "    1.42838 |        175        1.24       93.43\n",
      "   1.490323 |        176        1.25       94.67\n",
      "   1.552266 |        143        1.01       95.68\n",
      "   1.614209 |        115        0.81       96.50\n",
      "   1.676152 |        110        0.78       97.28\n",
      "   1.738095 |         81        0.57       97.85\n",
      "   1.800038 |         89        0.63       98.48\n",
      "   1.861981 |         57        0.40       98.88\n",
      "   1.923924 |         60        0.42       99.31\n",
      "   1.985867 |         34        0.24       99.55\n",
      "    2.04781 |         27        0.19       99.74\n",
      "   2.109753 |         20        0.14       99.88\n",
      "   2.171695 |         10        0.07       99.95\n",
      "   2.233639 |          6        0.04       99.99\n",
      "   2.295582 |          1        0.01      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,131      100.00\n",
      "\n",
      ". summ sncds11_bastotalscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "sncds11_ba~e |     14,131    4.08e-09           1  -2.659858   2.295582\n",
      "\n",
      ". \n",
      ". *standardise to mean 100 sd 15\n",
      "\n",
      ". capture drop ncds11_stdbastotalscore\n",
      "\n",
      ". gen ncds11_stdbastotalscore = (sncds11_bastotalscore*15)+100\n",
      "(4,427 missing values generated)\n",
      "\n",
      ". tab ncds11_stdbastotalscore\n",
      "\n",
      "ncds11_stdb |\n",
      "astotalscor |\n",
      "          e |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "   60.10213 |         66        0.47        0.47\n",
      "   61.03128 |          4        0.03        0.50\n",
      "   61.96043 |          7        0.05        0.54\n",
      "   62.88957 |          9        0.06        0.61\n",
      "   63.81871 |          7        0.05        0.66\n",
      "   64.74786 |          6        0.04        0.70\n",
      "     65.677 |         16        0.11        0.81\n",
      "   66.60615 |         18        0.13        0.94\n",
      "   67.53529 |         23        0.16        1.10\n",
      "   68.46444 |         30        0.21        1.32\n",
      "   69.39359 |         50        0.35        1.67\n",
      "   70.32273 |         47        0.33        2.00\n",
      "   71.25187 |         56        0.40        2.40\n",
      "   72.18102 |         81        0.57        2.97\n",
      "   73.11016 |        107        0.76        3.73\n",
      "   74.03931 |        113        0.80        4.53\n",
      "   74.96845 |        124        0.88        5.41\n",
      "    75.8976 |        149        1.05        6.46\n",
      "   76.82674 |        131        0.93        7.39\n",
      "   77.75589 |        154        1.09        8.48\n",
      "   78.68504 |        182        1.29        9.77\n",
      "   79.61417 |        174        1.23       11.00\n",
      "   80.54332 |        181        1.28       12.28\n",
      "   81.47247 |        194        1.37       13.65\n",
      "   82.40161 |        196        1.39       15.04\n",
      "   83.33076 |        203        1.44       16.47\n",
      "    84.2599 |        228        1.61       18.09\n",
      "   85.18905 |        220        1.56       19.64\n",
      "   86.11819 |        214        1.51       21.16\n",
      "   87.04733 |        223        1.58       22.74\n",
      "   87.97648 |        255        1.80       24.54\n",
      "   88.90562 |        263        1.86       26.40\n",
      "   89.83477 |        250        1.77       28.17\n",
      "   90.76392 |        250        1.77       29.94\n",
      "   91.69306 |        218        1.54       31.48\n",
      "   92.62221 |        290        2.05       33.54\n",
      "   93.55135 |        284        2.01       35.55\n",
      "   94.48049 |        273        1.93       37.48\n",
      "   95.40964 |        287        2.03       39.51\n",
      "   96.33878 |        279        1.97       41.48\n",
      "   97.26793 |        272        1.92       43.41\n",
      "   98.19707 |        283        2.00       45.41\n",
      "   99.12622 |        290        2.05       47.46\n",
      "   100.0554 |        246        1.74       49.20\n",
      "   100.9845 |        299        2.12       51.32\n",
      "   101.9137 |        328        2.32       53.64\n",
      "   102.8428 |        280        1.98       55.62\n",
      "   103.7719 |        301        2.13       57.75\n",
      "   104.7011 |        332        2.35       60.10\n",
      "   105.6302 |        324        2.29       62.39\n",
      "   106.5594 |        295        2.09       64.48\n",
      "   107.4885 |        306        2.17       66.65\n",
      "   108.4177 |        282        2.00       68.64\n",
      "   109.3468 |        271        1.92       70.56\n",
      "    110.276 |        286        2.02       72.59\n",
      "   111.2051 |        289        2.05       74.63\n",
      "   112.1342 |        302        2.14       76.77\n",
      "   113.0634 |        306        2.17       78.93\n",
      "   113.9925 |        276        1.95       80.89\n",
      "   114.9217 |        278        1.97       82.85\n",
      "   115.8508 |        247        1.75       84.60\n",
      "     116.78 |        233        1.65       86.25\n",
      "   117.7091 |        220        1.56       87.81\n",
      "   118.6383 |        203        1.44       89.24\n",
      "   119.5674 |        216        1.53       90.77\n",
      "   120.4966 |        200        1.42       92.19\n",
      "   121.4257 |        175        1.24       93.43\n",
      "   122.3548 |        176        1.25       94.67\n",
      "    123.284 |        143        1.01       95.68\n",
      "   124.2131 |        115        0.81       96.50\n",
      "   125.1423 |        110        0.78       97.28\n",
      "   126.0714 |         81        0.57       97.85\n",
      "   127.0006 |         89        0.63       98.48\n",
      "   127.9297 |         57        0.40       98.88\n",
      "   128.8589 |         60        0.42       99.31\n",
      "    129.788 |         34        0.24       99.55\n",
      "   130.7171 |         27        0.19       99.74\n",
      "   131.6463 |         20        0.14       99.88\n",
      "   132.5754 |         10        0.07       99.95\n",
      "   133.5046 |          6        0.04       99.99\n",
      "   134.4337 |          1        0.01      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,131      100.00\n",
      "\n",
      ". summ ncds11_stdbastotalscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "ncds11_std~e |     14,131         100          15   60.10213   134.4337\n",
      "\n",
      ". \n",
      ". label variable ncds11_stdbastotalscore \"NCDS Age 11 BAS Total Score std\"\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "tab n920\n",
    "rename n920 ncds11_bastotalscore\n",
    "summ ncds11_bastotalscore\n",
    "\n",
    "*standardise to mean 0 sd 1\n",
    "tab ncds11_bastotalscore\n",
    "capture drop sncds11_bastotalscore\n",
    "egen sncds11_bastotalscore = std(ncds11_bastotalscore)\n",
    "tab sncds11_bastotalscore\n",
    "summ sncds11_bastotalscore\n",
    "\n",
    "*standardise to mean 100 sd 15\n",
    "capture drop ncds11_stdbastotalscore\n",
    "gen ncds11_stdbastotalscore = (sncds11_bastotalscore*15)+100\n",
    "tab ncds11_stdbastotalscore\n",
    "summ ncds11_stdbastotalscore\n",
    "\n",
    "label variable ncds11_stdbastotalscore \"NCDS Age 11 BAS Total Score std\"\n",
    "\n",
    "*return to jupyter"
   ]
  },
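  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two standardisation steps in the cell above can be written out as a short worked equation. This is only a sketch of the arithmetic: the symbols $x_i$, $\\bar{x}$, $s$, $z_i$ and $y_i$ are notation introduced here for illustration and do not appear in the Stata code.\n",
    "\n",
    "$$z_i = \\frac{x_i - \\bar{x}}{s}, \\qquad y_i = 100 + 15\\,z_i$$\n",
    "\n",
    "Here $x_i$ is the raw score (ncds11_bastotalscore), $\\bar{x}$ and $s$ are its mean and standard deviation as reported by `summ`, $z_i$ is the standardised score (sncds11_bastotalscore) and $y_i$ is the rescaled score (ncds11_stdbastotalscore).\n",
    "\n",
    "Using the summary statistics in the output above ($\\bar{x} = 42.94041$, $s = 16.14388$), a raw score of 0 gives\n",
    "\n",
    "$$z = \\frac{0 - 42.94041}{16.14388} \\approx -2.6599, \\qquad y \\approx 100 + 15 \\times (-2.6599) \\approx 60.10$$\n",
    "\n",
    "which agrees with the minimum values shown in the tabulations of the two derived variables."
   ]
  },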
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We could also produce an overall cognitive ability test score using principal components analysis (see, for example, Schoon 2010). Here we compute a general ability test score using the method described in Schoon (2010)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *return to jupyter\n",
      "\n",
      ". * We use the two cognitive ability sub-test which make up the general abilty \n",
      "\n",
      ". * test described above.\n",
      "\n",
      ". \n",
      ". tab n914, mi \n",
      "\n",
      "  2T Verbal |\n",
      "   score on |\n",
      "    general |\n",
      "    ability |\n",
      "       test |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |         97        0.52        0.52\n",
      "          1 |         17        0.09        0.61\n",
      "          2 |         38        0.20        0.82\n",
      "          3 |         57        0.31        1.13\n",
      "          4 |         87        0.47        1.59\n",
      "          5 |        132        0.71        2.31\n",
      "          6 |        207        1.12        3.42\n",
      "          7 |        279        1.50        4.93\n",
      "          8 |        327        1.76        6.69\n",
      "          9 |        338        1.82        8.51\n",
      "         10 |        409        2.20       10.71\n",
      "         11 |        412        2.22       12.93\n",
      "         12 |        372        2.00       14.94\n",
      "         13 |        361        1.95       16.88\n",
      "         14 |        380        2.05       18.93\n",
      "         15 |        400        2.16       21.09\n",
      "         16 |        412        2.22       23.31\n",
      "         17 |        402        2.17       25.47\n",
      "         18 |        401        2.16       27.63\n",
      "         19 |        456        2.46       30.09\n",
      "         20 |        428        2.31       32.40\n",
      "         21 |        436        2.35       34.75\n",
      "         22 |        502        2.71       37.45\n",
      "         23 |        515        2.78       40.23\n",
      "         24 |        497        2.68       42.90\n",
      "         25 |        520        2.80       45.71\n",
      "         26 |        513        2.76       48.47\n",
      "         27 |        470        2.53       51.00\n",
      "         28 |        516        2.78       53.78\n",
      "         29 |        485        2.61       56.40\n",
      "         30 |        480        2.59       58.98\n",
      "         31 |        514        2.77       61.75\n",
      "         32 |        485        2.61       64.37\n",
      "         33 |        486        2.62       66.98\n",
      "         34 |        440        2.37       69.36\n",
      "         35 |        336        1.81       71.17\n",
      "         36 |        300        1.62       72.78\n",
      "         37 |        255        1.37       74.16\n",
      "         38 |        183        0.99       75.14\n",
      "         39 |        131        0.71       75.85\n",
      "         40 |         55        0.30       76.15\n",
      "          . |      4,427       23.85      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     18,558      100.00\n",
      "\n",
      ". tab n917, mi\n",
      "\n",
      "     2T Non |\n",
      "     verbal |\n",
      "   score on |\n",
      "gen ability |\n",
      "       test |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |         76        0.41        0.41\n",
      "          1 |         10        0.05        0.46\n",
      "          2 |         31        0.17        0.63\n",
      "          3 |         42        0.23        0.86\n",
      "          4 |         60        0.32        1.18\n",
      "          5 |         94        0.51        1.69\n",
      "          6 |        139        0.75        2.44\n",
      "          7 |        177        0.95        3.39\n",
      "          8 |        216        1.16        4.55\n",
      "          9 |        242        1.30        5.86\n",
      "         10 |        298        1.61        7.46\n",
      "         11 |        363        1.96        9.42\n",
      "         12 |        371        2.00       11.42\n",
      "         13 |        429        2.31       13.73\n",
      "         14 |        456        2.46       16.19\n",
      "         15 |        485        2.61       18.80\n",
      "         16 |        575        3.10       21.90\n",
      "         17 |        577        3.11       25.01\n",
      "         18 |        593        3.20       28.20\n",
      "         19 |        644        3.47       31.67\n",
      "         20 |        671        3.62       35.29\n",
      "         21 |        711        3.83       39.12\n",
      "         22 |        706        3.80       42.92\n",
      "         23 |        698        3.76       46.69\n",
      "         24 |        691        3.72       50.41\n",
      "         25 |        626        3.37       53.78\n",
      "         26 |        641        3.45       57.24\n",
      "         27 |        571        3.08       60.31\n",
      "         28 |        529        2.85       63.16\n",
      "         29 |        480        2.59       65.75\n",
      "         30 |        431        2.32       68.07\n",
      "         31 |        375        2.02       70.09\n",
      "         32 |        332        1.79       71.88\n",
      "         33 |        245        1.32       73.20\n",
      "         34 |        178        0.96       74.16\n",
      "         35 |        143        0.77       74.93\n",
      "         36 |        110        0.59       75.53\n",
      "         37 |         57        0.31       75.83\n",
      "         38 |         33        0.18       76.01\n",
      "         39 |         22        0.12       76.13\n",
      "         40 |          3        0.02       76.15\n",
      "          . |      4,427       23.85      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". \n",
      ". \n",
      ". * We only want to include those cohort members who completed both tests.\n",
      "\n",
      ". * We create a variable which indicates how many of the sub-tests\n",
      "\n",
      ". * we have information on.\n",
      "\n",
      ". \n",
      ". capture drop rmiss\n",
      "\n",
      ". egen rmiss = rmiss(n914 n917)\n",
      "\n",
      ". tab rmiss\n",
      "\n",
      "      rmiss |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |     14,131       76.15       76.15\n",
      "          2 |      4,427       23.85      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". * We examine the correlation between these two test scores:\n",
      "\n",
      ". \n",
      ". pwcorr n914 n917 if (rmiss==0), sig\n",
      "\n",
      "             |     n914     n917\n",
      "-------------+------------------\n",
      "        n914 |   1.0000 \n",
      "             |\n",
      "             |\n",
      "        n917 |   0.8074   1.0000 \n",
      "             |   0.0000\n",
      "             |\n",
      "\n",
      ". \n",
      ". * Principal components analysis of the tests that make up the \n",
      "\n",
      ". * general ability test\n",
      "\n",
      ". \n",
      ". pca n914 n917 if (rmiss==0)\n",
      "\n",
      "Principal components/correlation                 Number of obs    =     14,131\n",
      "                                                 Number of comp.  =          2\n",
      "                                                 Trace            =          2\n",
      "    Rotation: (unrotated = principal)            Rho              =     1.0000\n",
      "\n",
      "    --------------------------------------------------------------------------\n",
      "       Component |   Eigenvalue   Difference         Proportion   Cumulative\n",
      "    -------------+------------------------------------------------------------\n",
      "           Comp1 |      1.80738      1.61476             0.9037       0.9037\n",
      "           Comp2 |      .192618            .             0.0963       1.0000\n",
      "    --------------------------------------------------------------------------\n",
      "\n",
      "Principal components (eigenvectors) \n",
      "\n",
      "    ------------------------------------------------\n",
      "        Variable |    Comp1     Comp2 | Unexplained \n",
      "    -------------+--------------------+-------------\n",
      "            n914 |   0.7071    0.7071 |           0 \n",
      "            n917 |   0.7071   -0.7071 |           0 \n",
      "    ------------------------------------------------\n",
      "\n",
      ". \n",
      ". * Only the first component has an eigenvalue greater than 1.\n",
      "\n",
      ". \n",
      ". screeplot\n",
      "\n",
      ". \n",
      ". * The screeplot leads to the same conclusion\n",
      "\n",
      ". \n",
      ". * Here we predict the score for each individual on the first principal\n",
      "\n",
      ". * component. This score is obtained by applying the elements of the \n",
      "\n",
      ". * corresponding eigenvector to the standardised values of the original\n",
      "\n",
      ". * observations for an individual.\n",
      "\n",
      ". \n",
      ". predict ncds11_pc1 if (rmiss==0), score\n",
      "(1 components skipped)\n",
      "\n",
      "Scoring coefficients \n",
      "    sum of squares(column-loading) = 1\n",
      "\n",
      "    ----------------------------------\n",
      "        Variable |    Comp1     Comp2 \n",
      "    -------------+--------------------\n",
      "            n914 |   0.7071           \n",
      "            n917 |   0.7071   -0.7071 \n",
      "    ----------------------------------\n",
      "\n",
      ". label variable ncds11_pc1 \"NCDS Age 11 PCA Score\"\n",
      "\n",
      ". \n",
      ". summ ncds11_pc1\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "  ncds11_pc1 |     14,131    2.37e-09    1.344389  -3.606081   3.131255\n",
      "\n",
      ". \n",
      ". * We standardise this variable:\n",
      "\n",
      ". \n",
      ". capture drop ncds11_stdpc1\n",
      "\n",
      ". egen ncds11_stdpc1 = std(ncds11_pc1)\n",
      "(4427 missing values generated)\n",
      "\n",
      ". summ ncds11_stdpc1\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "ncds11_std~1 |     14,131   -1.45e-10           1   -2.68232   2.329129\n",
      "\n",
      ". label variable ncds11_stdpc1 \"NCDS Age 11 standardised PCA Score\"\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "* We use the two cognitive ability sub-test which make up the general abilty \n",
    "* test described above.\n",
    "\n",
    "tab n914, mi \n",
    "tab n917, mi\n",
    "\n",
    "\n",
    "\n",
    "* We only want to include those cohort members who completed both tests.\n",
    "* We create a variable which indicates how many of the sub-tests\n",
    "* we have information on.\n",
    "\n",
    "capture drop rmiss\n",
    "egen rmiss = rmiss(n914 n917)\n",
    "tab rmiss\n",
    "\n",
    "* We examine the correlation between these two test scores:\n",
    "\n",
    "pwcorr n914 n917 if (rmiss==0), sig\n",
    "\n",
    "* Principal components analysis of the tests that make up the \n",
    "* general ability test\n",
    "\n",
    "pca n914 n917 if (rmiss==0)\n",
    "\n",
    "* Only the first component has an eigenvalue greater than 1.\n",
    "\n",
    "screeplot\n",
    "\n",
    "* The screeplot leads to the same conclusion\n",
    "\n",
    "* Here we predict the score for each individual on the first principal\n",
    "* component. This score is obtained by applying the elements of the \n",
    "* corresponding eigenvector to the standardised values of the original\n",
    "* observations for an individual.\n",
    "\n",
    "predict ncds11_pc1 if (rmiss==0), score\n",
    "label variable ncds11_pc1 \"NCDS Age 11 PCA Score\"\n",
    "\n",
    "summ ncds11_pc1\n",
    "\n",
    "* We standardise this variable:\n",
    "\n",
    "capture drop ncds11_stdpc1\n",
    "egen ncds11_stdpc1 = std(ncds11_pc1)\n",
    "summ ncds11_stdpc1\n",
    "label variable ncds11_stdpc1 \"NCDS Age 11 standardised PCA Score\"\n",
    "\n",
    "*return to jupyter"
   ]
  },
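  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With only two standardised sub-tests, the principal components have a simple closed form, which gives a quick check on the output above. The eigenvalues of a 2 x 2 correlation matrix are 1 + r and 1 - r; with r = 0.8074 these are 1.8074 and 0.1926, matching the `pca` table. Both loadings on the first component are 0.7071 (that is, 1/sqrt(2)), so the first component score is the sum of the two standardised scores divided by sqrt(2), and its standard deviation is sqrt(1.80738), approximately 1.344, as reported by `summ ncds11_pc1`.\n",
    "\n",
    "The following is a minimal sketch of that check and is not part of the original analysis. It assumes `n914`, `n917`, `rmiss` and `ncds11_pc1` exist as created in the cell above; `z914`, `z917` and `pc1_check` are hypothetical names introduced only for illustration.\n",
    "\n",
    "```stata\n",
    "* Standardise each sub-test within the complete-case sample\n",
    "capture drop z914 z917 pc1_check\n",
    "egen z914 = std(n914) if (rmiss==0)\n",
    "egen z917 = std(n917) if (rmiss==0)\n",
    "\n",
    "* Apply the first eigenvector (0.7071, 0.7071) by hand\n",
    "gen pc1_check = (z914 + z917)/sqrt(2)\n",
    "\n",
    "* Compare the hand-built score with the score from predict\n",
    "summ ncds11_pc1 pc1_check\n",
    "pwcorr ncds11_pc1 pc1_check\n",
    "```\n",
    "\n",
    "If the hand-built score matches, the two summaries should agree and the correlation should be 1 up to rounding error."
   ]
  },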
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". keep ncdsid ncds_male ncds_paed ncds_paed_cat ncds_moed ncds_parented ncds_moed_cat ncds0_country ncds5_country ncds11_country ncds11_bast\n",
      "> otalscore ncds11_stdbastotalscore ncds11_stdpc1\n",
      "\n",
      ". \n",
      ". sort ncdsid\n",
      "\n",
      ". \n",
      ". save $path3\\temp1.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp1.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp1.dta saved\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "keep ncdsid ncds_male ncds_paed ncds_paed_cat ncds_moed ncds_parented ncds_moed_cat ncds0_country ncds5_country ncds11_country ncds11_bastotalscore ncds11_stdbastotalscore ncds11_stdpc1\n",
    "\n",
    "sort ncdsid\n",
    "\n",
    "save $path3\\temp1.dta, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We intend to use father's NS-SEC based on the new occupational coding for our parental social class measure. However, here we prepare some of the older parental occupation-based social class measures which are available in the deposited datasets. These were used in intial sensitivity analyses and were considered when preparing the inverse probability weights."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path1\\ARCHIVE\\NCDS\\S1-3\\ncds0123.dta, clear\n",
      "\n",
      ". \n",
      ". quietly mvdecode _all, mv(-9=. \\-8=. \\-2=. \\-1=. \\-7=. \\-3=.)\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path1\\ARCHIVE\\NCDS\\S1-3\\ncds0123.dta, clear\n",
    "\n",
    "quietly mvdecode _all, mv(-9=. \\-8=. \\-2=. \\-1=. \\-7=. \\-3=.)\n",
    "\n",
    "*return to jupyter"
   ]
  },
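  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cell below harmonises each sweep's measure onto a six-category RGSC coding (I, II, III NM, III M, IV, V) by setting non-class codes to missing, merging the IV non-manual and IV manual codes where they are recorded separately, and shifting the remaining codes. As a hedged illustration only, the same mapping for the age 7 variable `n190` could be written as a single `recode` with the `generate()` option. The code values are taken from the numbered labels shown in the output below (1 = no male head, 2 = I, through to 8 = V); `rgsc7_check` is a hypothetical name and is not used elsewhere in the analysis.\n",
    "\n",
    "```stata\n",
    "* One-step mapping of n190 onto the six-category RGSC coding\n",
    "capture drop rgsc7_check\n",
    "recode n190 (1=.) (2=1) (3=2) (4=3) (5=4) (6 7=5) (8=6), generate(rgsc7_check)\n",
    "\n",
    "* Cross-tabulate against the source variable to verify the mapping\n",
    "tab rgsc7_check n190, mi\n",
    "```\n",
    "\n",
    "Writing the mapping in one step leaves `n190` untouched and makes the treatment of each source code explicit."
   ]
  },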
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Social Class of mother's husband - Age 0 (birth)\n",
      "\n",
      ". numlabel n236 n492, add\n",
      "\n",
      ". \n",
      ". tab n236\n",
      "\n",
      "0P Social class of |\n",
      "  mother's husband |\n",
      "        (GRO 1951) |      Freq.     Percent        Cum.\n",
      "-------------------+-----------------------------------\n",
      "1. Unemployed,sick |          5        0.03        0.03\n",
      "              2. I |        746        4.53        4.56\n",
      "             3. II |      2,133       12.96       17.52\n",
      " 4. III non-manual |      1,592        9.67       27.19\n",
      "     5. III manual |      8,376       50.88       78.07\n",
      "             6. IV |      1,995       12.12       90.18\n",
      "              7. V |      1,616        9.82      100.00\n",
      "-------------------+-----------------------------------\n",
      "             Total |     16,463      100.00\n",
      "\n",
      ". tab n492\n",
      "\n",
      "      0 Social class |\n",
      "    mother's husband |\n",
      "          (GRO 1951) |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "  1. Unemployed,sick |          5        0.03        0.03\n",
      "                2. I |        746        4.38        4.41\n",
      "               3. II |      2,133       12.53       16.94\n",
      "              4. III |      9,981       58.62       75.56\n",
      "               5. IV |      1,995       11.72       87.28\n",
      "                6. V |      1,616        9.49       96.77\n",
      "         9. Students |         35        0.21       96.98\n",
      "    10. Dead or away |          3        0.02       96.99\n",
      "         11. Retired |          2        0.01       97.00\n",
      "12. Single,no husbnd |        510        3.00      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     17,026      100.00\n",
      "\n",
      ". \n",
      ". tab n236 n492\n",
      "\n",
      "0P Social class of |\n",
      "  mother's husband |            0 Social class mother's husband (GRO 1951)\n",
      "        (GRO 1951) | 1. Unempl       2. I      3. II     4. III      5. IV       6. V |     Total\n",
      "-------------------+------------------------------------------------------------------+----------\n",
      "1. Unemployed,sick |         5          0          0          0          0          0 |         5 \n",
      "              2. I |         0        746          0          0          0          0 |       746 \n",
      "             3. II |         0          0      2,133          0          0          0 |     2,133 \n",
      " 4. III non-manual |         0          0          0      1,592          0          0 |     1,592 \n",
      "     5. III manual |         0          0          0      8,375          0          0 |     8,375 \n",
      "             6. IV |         0          0          0          0      1,995          0 |     1,995 \n",
      "              7. V |         0          0          0          0          0      1,616 |     1,616 \n",
      "-------------------+------------------------------------------------------------------+----------\n",
      "             Total |         5        746      2,133      9,967      1,995      1,616 |    16,462 \n",
      "\n",
      "\n",
      ". *The difference bewteen n236 and n492, seems to be that:\n",
      "\n",
      ". *n236 includes the manual non-manual distinction in class III\n",
      "\n",
      ". *but n492 does not.\n",
      "\n",
      ". \n",
      ". *Create father's RGSC at birth survey using variable n236\n",
      "\n",
      ". capture drop ncds0_olddadrgsc\n",
      "\n",
      ". gen ncds0_olddadrgsc = n236\n",
      "(2,095 missing values generated)\n",
      "\n",
      ". recode ncds0_olddadrgsc (1=.)\n",
      "(ncds0_olddadrgsc: 5 changes made)\n",
      "\n",
      ". replace ncds0_olddadrgsc = (ncds0_olddadrgsc-1)\n",
      "(16,458 real changes made)\n",
      "\n",
      ". label variable ncds0_olddadrgsc \"NCDS Birth Dad RGSC Old Coding\"\n",
      "\n",
      ". label define rgsc 1 \"I\" 2 \"II\" 3 \"III NM\" 4 \"III M\" 5 \"IV\" 6 \"V\"\n",
      "\n",
      ". label values ncds0_olddadrgsc rgsc\n",
      "\n",
      ". \n",
      ". tab ncds0_olddadrgsc\n",
      "\n",
      " NCDS Birth |\n",
      "   Dad RGSC |\n",
      " Old Coding |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          I |        746        4.53        4.53\n",
      "         II |      2,133       12.96       17.49\n",
      "     III NM |      1,592        9.67       27.17\n",
      "      III M |      8,376       50.89       78.06\n",
      "         IV |      1,995       12.12       90.18\n",
      "          V |      1,616        9.82      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     16,458      100.00\n",
      "\n",
      ". tab ncds0_olddadrgsc n236, mi\n",
      "\n",
      "NCDS Birth |\n",
      "  Dad RGSC |                     0P Social class of mother's husband (GRO 1951)\n",
      "Old Coding | 1. Unempl       2. I      3. II  4. III no  5. III ma      6. IV       7. V          . |     Total\n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "         I |         0        746          0          0          0          0          0          0 |       746 \n",
      "        II |         0          0      2,133          0          0          0          0          0 |     2,133 \n",
      "    III NM |         0          0          0      1,592          0          0          0          0 |     1,592 \n",
      "     III M |         0          0          0          0      8,376          0          0          0 |     8,376 \n",
      "        IV |         0          0          0          0          0      1,995          0          0 |     1,995 \n",
      "         V |         0          0          0          0          0          0      1,616          0 |     1,616 \n",
      "         . |         5          0          0          0          0          0          0      2,095 |     2,100 \n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "     Total |         5        746      2,133      1,592      8,376      1,995      1,616      2,095 |    18,558 \n",
      "\n",
      "\n",
      ". \n",
      ". \n",
      ". *Social class of father or male head - Age 7\n",
      "\n",
      ". \n",
      ". numlabel n190, add\n",
      "\n",
      ". tab n190\n",
      "\n",
      "  1P Social class |\n",
      "   of father,male |\n",
      "  head (GRO 1960) |      Freq.     Percent        Cum.\n",
      "------------------+-----------------------------------\n",
      "  1. No male head |        421        2.90        2.90\n",
      "             2. I |        750        5.16        8.06\n",
      "            3. II |      2,079       14.30       22.36\n",
      "4. III non-manual |      1,408        9.69       32.05\n",
      "    5. III manual |      6,416       44.14       76.19\n",
      " 6. IV non-manual |        258        1.78       77.96\n",
      "     7. IV manual |      2,272       15.63       93.59\n",
      "             8. V |        931        6.41      100.00\n",
      "------------------+-----------------------------------\n",
      "            Total |     14,535      100.00\n",
      "\n",
      ". \n",
      ". capture drop ncds7_olddadrgsc\n",
      "\n",
      ". gen ncds7_olddadrgsc = n190\n",
      "(4,023 missing values generated)\n",
      "\n",
      ". recode ncds7_olddadrgsc (1=.) (7=6) (8=7) \n",
      "(ncds7_olddadrgsc: 3624 changes made)\n",
      "\n",
      ". replace ncds7_olddadrgsc = (ncds7_olddadrgsc-1)\n",
      "(14,114 real changes made)\n",
      "\n",
      ". label variable ncds7_olddadrgsc \"NCDS Age 7 Dad RGSC Old Coding\"\n",
      "\n",
      ". label values ncds7_olddadrgsc rgsc\n",
      "\n",
      ". \n",
      ". tab ncds7_olddadrgsc n190, mi\n",
      "\n",
      "NCDS Age 7 |\n",
      "  Dad RGSC |                           1P Social class of father,male head (GRO 1960)\n",
      "Old Coding | 1. No mal       2. I      3. II  4. III no  5. III ma  6. IV non  7. IV man       8. V          . |     Total\n",
      "-----------+---------------------------------------------------------------------------------------------------+----------\n",
      "         I |         0        750          0          0          0          0          0          0          0 |       750 \n",
      "        II |         0          0      2,079          0          0          0          0          0          0 |     2,079 \n",
      "    III NM |         0          0          0      1,408          0          0          0          0          0 |     1,408 \n",
      "     III M |         0          0          0          0      6,416          0          0          0          0 |     6,416 \n",
      "        IV |         0          0          0          0          0        258      2,272          0          0 |     2,530 \n",
      "         V |         0          0          0          0          0          0          0        931          0 |       931 \n",
      "         . |       421          0          0          0          0          0          0          0      4,023 |     4,444 \n",
      "-----------+---------------------------------------------------------------------------------------------------+----------\n",
      "     Total |       421        750      2,079      1,408      6,416        258      2,272        931      4,023 |    18,558 \n",
      "\n",
      "\n",
      ". tab ncds7_olddadrgsc, mi\n",
      "\n",
      " NCDS Age 7 |\n",
      "   Dad RGSC |\n",
      " Old Coding |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          I |        750        4.04        4.04\n",
      "         II |      2,079       11.20       15.24\n",
      "     III NM |      1,408        7.59       22.83\n",
      "      III M |      6,416       34.57       57.40\n",
      "         IV |      2,530       13.63       71.04\n",
      "          V |        931        5.02       76.05\n",
      "          . |      4,444       23.95      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". *Social class of father or male head - Age 11\n",
      "\n",
      ". \n",
      ". tab n1171\n",
      "\n",
      " 2P Social Class |\n",
      "    of father or |\n",
      "  male head (GRO |\n",
      "           1966) |      Freq.     Percent        Cum.\n",
      "-----------------+-----------------------------------\n",
      "  Social class I |        738        5.52        5.52\n",
      " Social class II |      2,432       18.19       23.71\n",
      " SC III non-man. |      1,245        9.31       33.02\n",
      "   SC III manual |      5,721       42.78       75.80\n",
      "SC IV non-manual |        285        2.13       77.93\n",
      "    SC IV manual |      2,064       15.44       93.37\n",
      "  Social class V |        827        6.18       99.55\n",
      "  Unclassifiable |         60        0.45      100.00\n",
      "-----------------+-----------------------------------\n",
      "           Total |     13,372      100.00\n",
      "\n",
      ". \n",
      ". tab n1687\n",
      "\n",
      "    2PD Social |\n",
      "      class of |\n",
      "father or male |\n",
      "     head (GRO |\n",
      "         1966) |      Freq.     Percent        Cum.\n",
      "---------------+-----------------------------------\n",
      "             I |        726        5.34        5.34\n",
      "            II |      2,363       17.39       22.73\n",
      "III non manual |      1,202        8.84       31.57\n",
      "    III manual |      5,564       40.94       72.52\n",
      "            IV |      2,257       16.61       89.12\n",
      "             V |        776        5.71       94.83\n",
      "  No male head |        702        5.17      100.00\n",
      "---------------+-----------------------------------\n",
      "         Total |     13,590      100.00\n",
      "\n",
      ". \n",
      ". tab n1687 n1171\n",
      "\n",
      "    2PD Social |\n",
      "      class of |\n",
      "father or male |\n",
      "     head (GRO |                    2P Social Class of father or male head (GRO 1966)\n",
      "         1966) | Social cl  Social cl  SC III no  SC III ma  SC IV non  SC IV man  Social cl  Unclassif |     Total\n",
      "---------------+----------------------------------------------------------------------------------------+----------\n",
      "             I |       726          0          0          0          0          0          0          0 |       726 \n",
      "            II |         0      2,363          0          0          0          0          0          0 |     2,363 \n",
      "III non manual |         0          0      1,202          0          0          0          0          0 |     1,202 \n",
      "    III manual |         0          0          0      5,564          0          0          0          0 |     5,564 \n",
      "            IV |         0          0          0          0        266      1,991          0          0 |     2,257 \n",
      "             V |         0          0          0          0          0          0        776          0 |       776 \n",
      "  No male head |        12         69         43        157         19         73         51          9 |       433 \n",
      "---------------+----------------------------------------------------------------------------------------+----------\n",
      "         Total |       738      2,432      1,245      5,721        285      2,064        827          9 |    13,321 \n",
      "\n",
      "\n",
      ". \n",
      ". numlabel n1687, add\n",
      "\n",
      ". \n",
      ". capture drop ncds11_olddadrgsc\n",
      "\n",
      ". gen ncds11_olddadrgsc = n1687\n",
      "(4,968 missing values generated)\n",
      "\n",
      ". recode ncds11_olddadrgsc (7=.)\n",
      "(ncds11_olddadrgsc: 702 changes made)\n",
      "\n",
      ". label variable ncds11_olddadrgsc \"NCDS Age 11 Dad RGSC Old Coding\"\n",
      "\n",
      ". label values ncds11_olddadrgsc rgsc\n",
      "\n",
      ". \n",
      ". tab ncds11_olddadrgsc n1687, mi\n",
      "\n",
      "  NCDS Age |\n",
      "    11 Dad |\n",
      "  RGSC Old |                   2PD Social class of father or male head (GRO 1966)\n",
      "    Coding |      1. I      2. II  3. III no  4. III ma      5. IV       6. V  7. No mal          . |     Total\n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "         I |       726          0          0          0          0          0          0          0 |       726 \n",
      "        II |         0      2,363          0          0          0          0          0          0 |     2,363 \n",
      "    III NM |         0          0      1,202          0          0          0          0          0 |     1,202 \n",
      "     III M |         0          0          0      5,564          0          0          0          0 |     5,564 \n",
      "        IV |         0          0          0          0      2,257          0          0          0 |     2,257 \n",
      "         V |         0          0          0          0          0        776          0          0 |       776 \n",
      "         . |         0          0          0          0          0          0        702      4,968 |     5,670 \n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "     Total |       726      2,363      1,202      5,564      2,257        776        702      4,968 |    18,558 \n",
      "\n",
      "\n",
      ". tab ncds11_olddadrgsc, mi\n",
      "\n",
      "NCDS Age 11 |\n",
      "   Dad RGSC |\n",
      " Old Coding |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          I |        726        3.91        3.91\n",
      "         II |      2,363       12.73       16.65\n",
      "     III NM |      1,202        6.48       23.12\n",
      "      III M |      5,564       29.98       53.10\n",
      "         IV |      2,257       12.16       65.27\n",
      "          V |        776        4.18       69.45\n",
      "          . |      5,670       30.55      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". *Age 16 Social class of father or male head\n",
      "\n",
      ". \n",
      ". tab n2384\n",
      "\n",
      "     3P Social |\n",
      "         class |\n",
      "   father,male |\n",
      "     head (GRO |\n",
      "         1970) |      Freq.     Percent        Cum.\n",
      "---------------+-----------------------------------\n",
      "             I |        569        5.36        5.36\n",
      "            II |      2,100       19.77       25.13\n",
      "III non-manual |      1,004        9.45       34.58\n",
      "    III manual |      4,661       43.88       78.47\n",
      " IV non-manual |        155        1.46       79.93\n",
      "     IV manual |      1,403       13.21       93.14\n",
      "             V |        608        5.72       98.86\n",
      "       Unclear |        121        1.14      100.00\n",
      "---------------+-----------------------------------\n",
      "         Total |     10,621      100.00\n",
      "\n",
      ". \n",
      ". numlabel n2384, add\n",
      "\n",
      ". \n",
      ". capture drop ncds16_olddadrgsc\n",
      "\n",
      ". gen ncds16_olddadrgsc = n2384\n",
      "(7,937 missing values generated)\n",
      "\n",
      ". recode ncds16_olddadrgsc (8=.) \n",
      "(ncds16_olddadrgsc: 121 changes made)\n",
      "\n",
      ". recode ncds16_olddadrgsc (6=5) \n",
      "(ncds16_olddadrgsc: 1403 changes made)\n",
      "\n",
      ". recode ncds16_olddadrgsc (7=6) \n",
      "(ncds16_olddadrgsc: 608 changes made)\n",
      "\n",
      ". label variable ncds16_olddadrgsc \"NCDS Age 16 Dad RGSC Old Coding\"\n",
      "\n",
      ". label values ncds16_olddadrgsc rgsc\n",
      "\n",
      ". \n",
      ". tab ncds16_olddadrgsc n2384, mi\n",
      "\n",
      "  NCDS Age |\n",
      "    16 Dad |\n",
      "  RGSC Old |                            3P Social class father,male head (GRO 1970)\n",
      "    Coding |      1. I      2. II  3. III no  4. III ma  5. IV non  6. IV man       7. V  8. Unclea          . |     Total\n",
      "-----------+---------------------------------------------------------------------------------------------------+----------\n",
      "         I |       569          0          0          0          0          0          0          0          0 |       569 \n",
      "        II |         0      2,100          0          0          0          0          0          0          0 |     2,100 \n",
      "    III NM |         0          0      1,004          0          0          0          0          0          0 |     1,004 \n",
      "     III M |         0          0          0      4,661          0          0          0          0          0 |     4,661 \n",
      "        IV |         0          0          0          0        155      1,403          0          0          0 |     1,558 \n",
      "         V |         0          0          0          0          0          0        608          0          0 |       608 \n",
      "         . |         0          0          0          0          0          0          0        121      7,937 |     8,058 \n",
      "-----------+---------------------------------------------------------------------------------------------------+----------\n",
      "     Total |       569      2,100      1,004      4,661        155      1,403        608        121      7,937 |    18,558 \n",
      "\n",
      "\n",
      ". tab ncds16_olddadrgsc, mi\n",
      "\n",
      "NCDS Age 16 |\n",
      "   Dad RGSC |\n",
      " Old Coding |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          I |        569        3.07        3.07\n",
      "         II |      2,100       11.32       14.38\n",
      "     III NM |      1,004        5.41       19.79\n",
      "      III M |      4,661       25.12       44.91\n",
      "         IV |      1,558        8.40       53.30\n",
      "          V |        608        3.28       56.58\n",
      "          . |      8,058       43.42      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Social Class of mother's husband - Age 0 (birth)\n",
    "numlabel n236 n492, add\n",
    "\n",
    "tab n236\n",
    "tab n492\n",
    "\n",
    "tab n236 n492\n",
    "*The difference between n236 and n492 seems to be that:\n",
    "*n236 includes the manual non-manual distinction in class III\n",
    "*but n492 does not.\n",
    "\n",
    "*Create father's RGSC at birth survey using variable n236\n",
    "capture drop ncds0_olddadrgsc\n",
    "gen ncds0_olddadrgsc = n236\n",
    "recode ncds0_olddadrgsc (1=.)\n",
    "replace ncds0_olddadrgsc = (ncds0_olddadrgsc-1)\n",
    "label variable ncds0_olddadrgsc \"NCDS Birth Dad RGSC Old Coding\"\n",
    "label define rgsc 1 \"I\" 2 \"II\" 3 \"III NM\" 4 \"III M\" 5 \"IV\" 6 \"V\"\n",
    "label values ncds0_olddadrgsc rgsc\n",
    "\n",
    "tab ncds0_olddadrgsc\n",
    "tab ncds0_olddadrgsc n236, mi\n",
    "\n",
    "\n",
    "*Social class of father or male head - Age 7\n",
    "\n",
    "numlabel n190, add\n",
    "tab n190\n",
    "\n",
    "capture drop ncds7_olddadrgsc\n",
    "gen ncds7_olddadrgsc = n190\n",
    "recode ncds7_olddadrgsc (1=.) (7=6) (8=7) \n",
    "replace ncds7_olddadrgsc = (ncds7_olddadrgsc-1)\n",
    "label variable ncds7_olddadrgsc \"NCDS Age 7 Dad RGSC Old Coding\"\n",
    "label values ncds7_olddadrgsc rgsc\n",
    "\n",
    "tab ncds7_olddadrgsc n190, mi\n",
    "tab ncds7_olddadrgsc, mi\n",
    "\n",
    "*Social class of father or male head - Age 11\n",
    "\n",
    "tab n1171\n",
    "\n",
    "tab n1687\n",
    "\n",
    "tab n1687 n1171\n",
    "\n",
    "numlabel n1687, add\n",
    "\n",
    "capture drop ncds11_olddadrgsc\n",
    "gen ncds11_olddadrgsc = n1687\n",
    "recode ncds11_olddadrgsc (7=.)\n",
    "label variable ncds11_olddadrgsc \"NCDS Age 11 Dad RGSC Old Coding\"\n",
    "label values ncds11_olddadrgsc rgsc\n",
    "\n",
    "tab ncds11_olddadrgsc n1687, mi\n",
    "tab ncds11_olddadrgsc, mi\n",
    "\n",
    "*Age 16 Social class of father or male head\n",
    "\n",
    "tab n2384\n",
    "\n",
    "numlabel n2384, add\n",
    "\n",
    "capture drop ncds16_olddadrgsc\n",
    "gen ncds16_olddadrgsc = n2384\n",
    "recode ncds16_olddadrgsc (8=.) \n",
    "recode ncds16_olddadrgsc (6=5) \n",
    "recode ncds16_olddadrgsc (7=6) \n",
    "label variable ncds16_olddadrgsc \"NCDS Age 16 Dad RGSC Old Coding\"\n",
    "label values ncds16_olddadrgsc rgsc\n",
    "\n",
    "tab ncds16_olddadrgsc n2384, mi\n",
    "tab ncds16_olddadrgsc, mi\n",
    "\n",
    "*return to jupyter"
   ]
  },
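  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cross-tabulations in the cell above check each recode by eye. A minimal sketch of an equivalent automated check for the age 16 measure, assuming `ncds16_olddadrgsc` and `n2384` exist as created and tabulated above. `assert` stops with an error if any observation fails the condition, so these lines are an illustration rather than part of the original analysis:\n",
    "\n",
    "```stata\n",
    "* Illustrative check (not part of the original workflow):\n",
    "* the derived class should be missing exactly when the source is missing or unclassified (code 8)\n",
    "assert missing(ncds16_olddadrgsc) == (missing(n2384) | n2384 == 8)\n",
    "\n",
    "* IV non-manual (5) and IV manual (6) should both be allocated to class IV (5)\n",
    "assert ncds16_olddadrgsc == 5 if inlist(n2384, 5, 6)\n",
    "```\n",
    "\n",
    "The same pattern could be adapted to the other sweeps, with the conditions adjusted to each source variable's coding."
   ]
  },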
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Mothers' occupational information is available in a much more limited format than that of fathers.\n",
    "\n",
    "Elliott and Lawrence (2014) discuss the availability of mother's occupational information in the NCDS. \n",
    "\n",
    "Elliott and Lawrence (2014) [Refining childhood social class measures \n",
    "in the 1958 British Cohort Study](http://tinyurl.com/jfn6nmv). London: CLS.\n",
    "\n",
    "Further discussion of our decision not to use mother's occupational information is provided in [endnote 7](#note7).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Mother's job when pregnant\n",
      "\n",
      ". tab n540, mi\n",
      "\n",
      " 0 Mums paid job |\n",
      "during pregnancy |\n",
      "      (GRO 1951) |      Freq.     Percent        Cum.\n",
      "-----------------+-----------------------------------\n",
      "        Teachers |        261        1.41        1.41\n",
      "Nurses qualified |         92        0.50        1.90\n",
      " Bank clerks etc |        239        1.29        3.19\n",
      " Shopkeepers etc |         60        0.32        3.51\n",
      "Others in SCI,II |        100        0.54        4.05\n",
      "Nurses- not qual |        107        0.58        4.63\n",
      "  Clerks,typists |      1,538        8.29       12.92\n",
      "Shop asst,hairdr |        786        4.24       17.15\n",
      " Garment workers |        149        0.80       17.95\n",
      "Textile wkr skld |        281        1.51       19.47\n",
      "Personal service |        222        1.20       20.66\n",
      "Others in SC III |        545        2.94       23.60\n",
      "      Machinists |        286        1.54       25.14\n",
      "Textile wkr SCIV |        102        0.55       25.69\n",
      "   Personal-SCIV |        374        2.02       27.71\n",
      " Others in SC IV |        982        5.29       33.00\n",
      "Textile-labourer |        349        1.88       34.88\n",
      "   Personal-SC V |        118        0.64       35.52\n",
      " No job dur preg |     10,621       57.23       92.75\n",
      "               . |      1,346        7.25      100.00\n",
      "-----------------+-----------------------------------\n",
      "           Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". *Mother's job (when starting this baby) age 0 (birth)\n",
      "\n",
      ". tab n539, mi\n",
      "\n",
      " 0 Mums paid job |\n",
      "   when starting |\n",
      "  this baby (GRO |\n",
      "           1951) |      Freq.     Percent        Cum.\n",
      "-----------------+-----------------------------------\n",
      "        Teachers |        269        1.45        1.45\n",
      "Nurses qualified |         92        0.50        1.95\n",
      " Bank clerks etc |        246        1.33        3.27\n",
      " Shopkeepers etc |         60        0.32        3.59\n",
      "Others in SCI,II |        101        0.54        4.14\n",
      "Nurses- not qual |        109        0.59        4.73\n",
      "  Clerks,typists |      1,559        8.40       13.13\n",
      "Shop asst,hairdr |        799        4.31       17.43\n",
      " Garment workers |        152        0.82       18.25\n",
      "Textile wkr skld |        281        1.51       19.77\n",
      "Personal service |        224        1.21       20.97\n",
      "Others in SC III |        553        2.98       23.95\n",
      "      Machinists |        287        1.55       25.50\n",
      "Textile wkr SCIV |        104        0.56       26.06\n",
      "   Personal-SCIV |        379        2.04       28.10\n",
      " Others in SC IV |        988        5.32       33.42\n",
      "Textile-labourer |        356        1.92       35.34\n",
      "   Personal-SC V |        122        0.66       36.00\n",
      "               . |     11,877       64.00      100.00\n",
      "-----------------+-----------------------------------\n",
      "           Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". *Mother's occupation aged 11\n",
      "\n",
      ". tab n1225, mi\n",
      "\n",
      "    2P Mothers's |\n",
      "most recent work |\n",
      "    and SEG (GRO |\n",
      "           1966) |      Freq.     Percent        Cum.\n",
      "-----------------+-----------------------------------\n",
      "      Prof,manag |        268        1.44        1.44\n",
      "Intermed non-man |        798        4.30        5.74\n",
      " Typist,clerical |      1,308        7.05       12.79\n",
      "  Shop assistant |        839        4.52       17.31\n",
      "Telephonists etc |        183        0.99       18.30\n",
      "Personal service |      1,784        9.61       27.91\n",
      "Forewomen,manual |        124        0.67       28.58\n",
      "  Manual workers |      2,839       15.30       43.88\n",
      "     Own account |         70        0.38       44.26\n",
      "    Farm workers |        148        0.80       45.05\n",
      " Inadequate info |         49        0.26       45.32\n",
      "               . |     10,148       54.68      100.00\n",
      "-----------------+-----------------------------------\n",
      "           Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". \n",
      ". *Age 16 Social class of mother \n",
      "\n",
      ". numlabel n2393, add\n",
      "\n",
      ". tab n2393, mi\n",
      "\n",
      "  3P Mother-s social |\n",
      " class,if works (GRO |\n",
      "               1970) |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "                1. I |         37        0.20        0.20\n",
      "               2. II |      1,196        6.44        6.64\n",
      "      3. III non-man |      2,320       12.50       19.15\n",
      "       4. III manual |        545        2.94       22.08\n",
      "       5. IV non-man |      1,344        7.24       29.32\n",
      "        6. IV manual |      1,192        6.42       35.75\n",
      "                7. V |        771        4.15       39.90\n",
      "     8. Unclassified |        113        0.61       40.51\n",
      "                   . |     11,040       59.49      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". capture drop ncds16_oldmumrgsc\n",
      "\n",
      ". gen ncds16_oldmumrgsc = n2393\n",
      "(11,040 missing values generated)\n",
      "\n",
      ". recode ncds16_oldmumrgsc (8=.) \n",
      "(ncds16_oldmumrgsc: 113 changes made)\n",
      "\n",
      ". recode ncds16_oldmumrgsc (6=5) \n",
      "(ncds16_oldmumrgsc: 1192 changes made)\n",
      "\n",
      ". recode ncds16_oldmumrgsc (7=6) \n",
      "(ncds16_oldmumrgsc: 771 changes made)\n",
      "\n",
      ". label variable ncds16_oldmumrgsc \"NCDS Age 16 Mum RGSC Old Coding\"\n",
      "\n",
      ". label values ncds16_oldmumrgsc rgsc\n",
      "\n",
      ". \n",
      ". tab ncds16_oldmumrgsc n2384, mi\n",
      "\n",
      "  NCDS Age |\n",
      "    16 Mum |\n",
      "  RGSC Old |                            3P Social class father,male head (GRO 1970)\n",
      "    Coding |      1. I      2. II  3. III no  4. III ma  5. IV non  6. IV man       7. V  8. Unclea          . |     Total\n",
      "-----------+---------------------------------------------------------------------------------------------------+----------\n",
      "         I |        20         11          0          5          0          0          0          0          1 |        37 \n",
      "        II |       101        436        120        321         18         69         22          8        101 |     1,196 \n",
      "    III NM |       132        523        346        867         20        179         56         21        176 |     2,320 \n",
      "     III M |         5         42         36        312          4         71         26          4         45 |       545 \n",
      "        IV |        30        213        180      1,236         38        462        133         15        229 |     2,536 \n",
      "         V |         3         19         41        408         15        134         81          3         67 |       771 \n",
      "         . |       278        856        281      1,512         60        488        290         70      7,318 |    11,153 \n",
      "-----------+---------------------------------------------------------------------------------------------------+----------\n",
      "     Total |       569      2,100      1,004      4,661        155      1,403        608        121      7,937 |    18,558 \n",
      "\n",
      "\n",
      ". tab ncds16_oldmumrgsc, mi\n",
      "\n",
      "NCDS Age 16 |\n",
      "   Mum RGSC |\n",
      " Old Coding |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          I |         37        0.20        0.20\n",
      "         II |      1,196        6.44        6.64\n",
      "     III NM |      2,320       12.50       19.15\n",
      "      III M |        545        2.94       22.08\n",
      "         IV |      2,536       13.67       35.75\n",
      "          V |        771        4.15       39.90\n",
      "          . |     11,153       60.10      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Mother's job when pregnant\n",
    "tab n540, mi\n",
    "\n",
    "*Mother's job (when starting this baby) age 0 (birth)\n",
    "tab n539, mi\n",
    "\n",
    "*Mother's occupation aged 11\n",
    "tab n1225, mi\n",
    "\n",
    "\n",
    "*Age 16 Social class of mother \n",
    "numlabel n2393, add\n",
    "tab n2393, mi\n",
    "\n",
    "capture drop ncds16_oldmumrgsc\n",
    "gen ncds16_oldmumrgsc = n2393\n",
    "recode ncds16_oldmumrgsc (8=.) \n",
    "recode ncds16_oldmumrgsc (6=5) \n",
    "recode ncds16_oldmumrgsc (7=6) \n",
    "label variable ncds16_oldmumrgsc \"NCDS Age 16 Mum RGSC Old Coding\"\n",
    "label values ncds16_oldmumrgsc rgsc\n",
    "\n",
    "tab ncds16_oldmumrgsc n2384, mi\n",
    "tab ncds16_oldmumrgsc, mi\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "We convert the available Socio-Economic Group information to an approximation of the Goldthorpe Schema using the method outlined in Goldthorpe and Jackson (2007).\n",
    "\n",
    "Goldthorpe, J. H., & Jackson, M. (2007). [Intergenerational class mobility in contemporary Britain: political concerns and empirical findings.](http://onlinelibrary.wiley.com/doi/10.1111/j.1468-4446.2007.00165.x/full) The British journal of sociology, 58(4), 525-546.\n",
    "Chicago.\n",
    "\n",
    "This method builds on an approximation developed by Health and McDonald (1987).\n",
    "\n",
    "Heath, A., & McDonald, S. K. (1987). [Social change and the future of the left.](http://onlinelibrary.wiley.com/doi/10.1111/j.1467-923X.1987.tb02624.x/full) The Political Quarterly, 58(4), 364-377."
   ]
  },
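  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The allocation of SEG categories to approximate Goldthorpe classes can be read as a single lookup. A minimal sketch, assuming the 17-category SEG coding of `n2385` (father's SEG at age 16). The variable name `example_dadegp` is hypothetical and is not used elsewhere in this notebook:\n",
    "\n",
    "```stata\n",
    "* Illustrative only: SEG codes on the left of each rule, approximate class code on the right\n",
    "* Armed forces (16) and inadequate information (17) are set to missing\n",
    "capture drop example_dadegp\n",
    "recode n2385 (1 3 4 = 1) (2 5 = 2) (6 7 = 3) (12 13 14 = 4) (8 = 5) (9 = 6) (10 11 15 = 7) (16 17 = .), gen(example_dadegp)\n",
    "```\n",
    "\n",
    "Class codes 1 to 7 correspond to I, II+IVa, III, IVb+c, V, VI and VII respectively."
   ]
  },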
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Father's SEG\n",
      "\n",
      ". numlabel n2385 n1175, add\n",
      "\n",
      ". \n",
      ". *Father's SEG measured age 16\n",
      "\n",
      ". tab n2385\n",
      "\n",
      "3P Father,male heads |\n",
      "  socio-economic grp |\n",
      "          (GRO 1970) |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "   1. Emp,mana,large |        584        5.45        5.45\n",
      "  2. Emp,manag,small |      1,155       10.78       16.24\n",
      "    3. Prof-self-emp |         89        0.83       17.07\n",
      "  4. Prof. employees |        483        4.51       21.58\n",
      " 5. Intermed non-man |        698        6.52       28.10\n",
      "   6. Junior non-man |        723        6.75       34.85\n",
      " 7. Personal service |         62        0.58       35.42\n",
      "   8. Foremen-manual |      1,045        9.76       45.18\n",
      "   9. Skilled manual |      2,996       27.97       73.16\n",
      "10. Semi skld manual |      1,268       11.84       85.00\n",
      "11. Unskilled manual |        586        5.47       90.47\n",
      "12. Work own account |        491        4.58       95.05\n",
      "  13. Farm emp,manag |        105        0.98       96.03\n",
      "14. Farm-own account |        100        0.93       96.97\n",
      "    15. Agric worker |        114        1.06       98.03\n",
      "    16. Armed forces |         89        0.83       98.86\n",
      " 17. Inadequate info |        122        1.14      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     10,710      100.00\n",
      "\n",
      ". capture drop ncds16_dadseg2egp\n",
      "\n",
      ". gen ncds16_dadseg2egp = .\n",
      "(18,558 missing values generated)\n",
      "\n",
      ". replace ncds16_dadseg2egp = 1 if (n2385==1)|(n2385==3)|(n2385==4)\n",
      "(1,156 real changes made)\n",
      "\n",
      ". replace ncds16_dadseg2egp = 2 if (n2385==2)|(n2385==5)\n",
      "(1,853 real changes made)\n",
      "\n",
      ". replace ncds16_dadseg2egp = 3 if (n2385==6)|(n2385==7)\n",
      "(785 real changes made)\n",
      "\n",
      ". replace ncds16_dadseg2egp = 4 if (n2385==12)|(n2385==13)|(n2385==14)\n",
      "(696 real changes made)\n",
      "\n",
      ". replace ncds16_dadseg2egp = 5 if (n2385==8)\n",
      "(1,045 real changes made)\n",
      "\n",
      ". replace ncds16_dadseg2egp = 6 if (n2385==9)\n",
      "(2,996 real changes made)\n",
      "\n",
      ". replace ncds16_dadseg2egp = 7 if (n2385==10)|(n2385==11)|(n2385==15)\n",
      "(1,968 real changes made)\n",
      "\n",
      ". replace ncds16_dadseg2egp = . if (n2385==17)|(n2385==16)\n",
      "(0 real changes made)\n",
      "\n",
      ". label define egp 1 \"I\" 2 \"II+IVa\" 3 \"III\" 4 \"IVb+c\" 5 \"V\" 6 \"VI\" 7 \"VII\"\n",
      "\n",
      ". label values ncds16_dadseg2egp egp\n",
      "\n",
      ". label variable ncds16_dadseg2egp \"NCDS Age 16 Dad's EGP from SEG\"\n",
      "\n",
      ". \n",
      ". *armed forces are coded as missing\n",
      "\n",
      ". tab ncds16_dadseg2egp n2385, mi\n",
      "\n",
      "  NCDS Age |\n",
      "  16 Dad's |\n",
      "  EGP from |                              3P Father,male heads socio-economic grp (GRO 1970)\n",
      "       SEG | 1. Emp,ma  2. Emp,ma  3. Prof-s  4. Prof.   5. Interm  6. Junior  7. Person  8. Foreme  9. Skille  10. Semi  |     Total\n",
      "-----------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "         I |       584          0         89        483          0          0          0          0          0          0 |     1,156 \n",
      "    II+IVa |         0      1,155          0          0        698          0          0          0          0          0 |     1,853 \n",
      "       III |         0          0          0          0          0        723         62          0          0          0 |       785 \n",
      "     IVb+c |         0          0          0          0          0          0          0          0          0          0 |       696 \n",
      "         V |         0          0          0          0          0          0          0      1,045          0          0 |     1,045 \n",
      "        VI |         0          0          0          0          0          0          0          0      2,996          0 |     2,996 \n",
      "       VII |         0          0          0          0          0          0          0          0          0      1,268 |     1,968 \n",
      "         . |         0          0          0          0          0          0          0          0          0          0 |     8,059 \n",
      "-----------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "     Total |       584      1,155         89        483        698        723         62      1,045      2,996      1,268 |    18,558 \n",
      "\n",
      "\n",
      "  NCDS Age |\n",
      "  16 Dad's |\n",
      "  EGP from |                   3P Father,male heads socio-economic grp (GRO 1970)\n",
      "       SEG | 11. Unski  12. Work   13. Farm   14. Farm-  15. Agric  16. Armed  17. Inade          . |     Total\n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "         I |         0          0          0          0          0          0          0          0 |     1,156 \n",
      "    II+IVa |         0          0          0          0          0          0          0          0 |     1,853 \n",
      "       III |         0          0          0          0          0          0          0          0 |       785 \n",
      "     IVb+c |         0        491        105        100          0          0          0          0 |       696 \n",
      "         V |         0          0          0          0          0          0          0          0 |     1,045 \n",
      "        VI |         0          0          0          0          0          0          0          0 |     2,996 \n",
      "       VII |       586          0          0          0        114          0          0          0 |     1,968 \n",
      "         . |         0          0          0          0          0         89        122      7,848 |     8,059 \n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "     Total |       586        491        105        100        114         89        122      7,848 |    18,558 \n",
      "\n",
      "\n",
      ". \n",
      ". *Father's SEG measured age 11\n",
      "\n",
      ". tab n1175\n",
      "\n",
      "      2P Father,male |\n",
      "              head's |\n",
      "  socio-economic grp |\n",
      "          (GRO 1966) |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "  1. Emp,manag,large |        582        4.32        4.32\n",
      "  2. Emp,manag,small |      1,365       10.12       14.44\n",
      "    3. Prof-self-emp |        123        0.91       15.35\n",
      "   4. Prof-employees |        615        4.56       19.91\n",
      " 5. Intermed non-man |        720        5.34       25.25\n",
      "   6. Junior non-man |      1,214        9.00       34.26\n",
      " 7. Personal service |         78        0.58       34.84\n",
      "   8. Foremen-manual |        820        6.08       40.92\n",
      "   9. Skilled manual |      4,273       31.69       72.61\n",
      "10. Semi skld manual |      1,808       13.41       86.02\n",
      "11. Unskilled manual |        783        5.81       91.83\n",
      "12. Work-own account |        480        3.56       95.39\n",
      "13. Farmer-emp,manag |        150        1.11       96.50\n",
      "14. Farm-own account |        113        0.84       97.34\n",
      "    15. Agric worker |        179        1.33       98.66\n",
      "    16. Armed forces |        180        1.34      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     13,483      100.00\n",
      "\n",
      ". capture drop ncds11_dadseg2egp\n",
      "\n",
      ". gen ncds11_dadseg2egp = .\n",
      "(18,558 missing values generated)\n",
      "\n",
      ". replace ncds11_dadseg2egp = 1 if (n1175==1)|(n1175==3)|(n1175==4)\n",
      "(1,320 real changes made)\n",
      "\n",
      ". replace ncds11_dadseg2egp = 2 if (n1175==2)|(n1175==5)\n",
      "(2,085 real changes made)\n",
      "\n",
      ". replace ncds11_dadseg2egp = 3 if (n1175==6)|(n1175==7)\n",
      "(1,292 real changes made)\n",
      "\n",
      ". replace ncds11_dadseg2egp = 4 if (n1175==12)|(n1175==13)|(n1175==14)\n",
      "(743 real changes made)\n",
      "\n",
      ". replace ncds11_dadseg2egp = 5 if (n1175==8)\n",
      "(820 real changes made)\n",
      "\n",
      ". replace ncds11_dadseg2egp = 6 if (n1175==9)\n",
      "(4,273 real changes made)\n",
      "\n",
      ". replace ncds11_dadseg2egp = 7 if (n1175==10)|(n1175==11)|(n1175==15)\n",
      "(2,770 real changes made)\n",
      "\n",
      ". replace ncds11_dadseg2egp = . if (n1175==16)\n",
      "(0 real changes made)\n",
      "\n",
      ". label values ncds11_dadseg2egp egp\n",
      "\n",
      ". label variable ncds11_dadseg2egp \"NCDS Age 11 Dad's EGP from SEG\"\n",
      "\n",
      ". tab ncds11_dadseg2egp n1175, mi\n",
      "\n",
      "  NCDS Age |\n",
      "  11 Dad's |\n",
      "  EGP from |                              2P Father,male head's socio-economic grp (GRO 1966)\n",
      "       SEG | 1. Emp,ma  2. Emp,ma  3. Prof-s  4. Prof-e  5. Interm  6. Junior  7. Person  8. Foreme  9. Skille  10. Semi  |     Total\n",
      "-----------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "         I |       582          0        123        615          0          0          0          0          0          0 |     1,320 \n",
      "    II+IVa |         0      1,365          0          0        720          0          0          0          0          0 |     2,085 \n",
      "       III |         0          0          0          0          0      1,214         78          0          0          0 |     1,292 \n",
      "     IVb+c |         0          0          0          0          0          0          0          0          0          0 |       743 \n",
      "         V |         0          0          0          0          0          0          0        820          0          0 |       820 \n",
      "        VI |         0          0          0          0          0          0          0          0      4,273          0 |     4,273 \n",
      "       VII |         0          0          0          0          0          0          0          0          0      1,808 |     2,770 \n",
      "         . |         0          0          0          0          0          0          0          0          0          0 |     5,255 \n",
      "-----------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "     Total |       582      1,365        123        615        720      1,214         78        820      4,273      1,808 |    18,558 \n",
      "\n",
      "\n",
      "  NCDS Age |\n",
      "  11 Dad's |\n",
      "  EGP from |             2P Father,male head's socio-economic grp (GRO 1966)\n",
      "       SEG | 11. Unski  12. Work-  13. Farme  14. Farm-  15. Agric  16. Armed          . |     Total\n",
      "-----------+-----------------------------------------------------------------------------+----------\n",
      "         I |         0          0          0          0          0          0          0 |     1,320 \n",
      "    II+IVa |         0          0          0          0          0          0          0 |     2,085 \n",
      "       III |         0          0          0          0          0          0          0 |     1,292 \n",
      "     IVb+c |         0        480        150        113          0          0          0 |       743 \n",
      "         V |         0          0          0          0          0          0          0 |       820 \n",
      "        VI |         0          0          0          0          0          0          0 |     4,273 \n",
      "       VII |       783          0          0          0        179          0          0 |     2,770 \n",
      "         . |         0          0          0          0          0        180      5,075 |     5,255 \n",
      "-----------+-----------------------------------------------------------------------------+----------\n",
      "     Total |       783        480        150        113        179        180      5,075 |    18,558 \n",
      "\n",
      "\n",
      ". \n",
      ". *Mother's SEG (Age 16)\n",
      "\n",
      ". tab n2394\n",
      "\n",
      "      3P Mothers |\n",
      "  Socio-economic |\n",
      "  group,if works |\n",
      "      (GRO 1970) |      Freq.     Percent        Cum.\n",
      "-----------------+-----------------------------------\n",
      " Emp,manag large |         46        0.61        0.61\n",
      " Emp,manag small |        220        2.93        3.54\n",
      "   Prof-self-emp |          7        0.09        3.63\n",
      "  Prof-employees |         42        0.56        4.19\n",
      "Intermed non-man |        987       13.13       17.32\n",
      "  Junior non-man |      2,327       30.96       48.28\n",
      "Personal service |      1,347       17.92       66.20\n",
      "    Foremen-man. |         55        0.73       66.93\n",
      "  Skilled manual |        294        3.91       70.84\n",
      "Semi skld manual |      1,121       14.91       85.75\n",
      "Unskilled manual |        772       10.27       96.02\n",
      "Work own account |        115        1.53       97.55\n",
      "Farmer-emp,manag |          2        0.03       97.58\n",
      "Farm-own account |          6        0.08       97.66\n",
      "    Agric worker |         62        0.82       98.48\n",
      "    Armed forces |          1        0.01       98.50\n",
      " Inadequate info |        113        1.50      100.00\n",
      "-----------------+-----------------------------------\n",
      "           Total |      7,517      100.00\n",
      "\n",
      ". capture drop ncds16_mumseg2egp\n",
      "\n",
      ". gen ncds16_mumseg2egp = .\n",
      "(18,558 missing values generated)\n",
      "\n",
      ". replace ncds16_mumseg2egp = 1 if (n2394==1)|(n2394==3)|(n2394==4)\n",
      "(95 real changes made)\n",
      "\n",
      ". replace ncds16_mumseg2egp = 2 if (n2394==2)|(n2394==5)\n",
      "(1,207 real changes made)\n",
      "\n",
      ". replace ncds16_mumseg2egp = 3 if (n2394==6)|(n2394==7)\n",
      "(3,674 real changes made)\n",
      "\n",
      ". replace ncds16_mumseg2egp = 4 if (n2394==12)|(n2394==13)|(n2394==14)\n",
      "(123 real changes made)\n",
      "\n",
      ". replace ncds16_mumseg2egp = 5 if (n2394==8)\n",
      "(55 real changes made)\n",
      "\n",
      ". replace ncds16_mumseg2egp = 6 if (n2394==9)\n",
      "(294 real changes made)\n",
      "\n",
      ". replace ncds16_mumseg2egp = 7 if (n2394==10)|(n2394==11)|(n2394==15)\n",
      "(1,955 real changes made)\n",
      "\n",
      ". replace ncds16_mumseg2egp = . if (n2394==16)\n",
      "(0 real changes made)\n",
      "\n",
      ". label values ncds16_mumseg2egp egp\n",
      "\n",
      ". label variable ncds16_mumseg2egp \"NCDS Age 16 Mum's EGP from SEG\"\n",
      "\n",
      ". tab ncds16_mumseg2egp n2394, mi\n",
      "\n",
      "  NCDS Age |\n",
      "  16 Mum's |\n",
      "  EGP from |                              3P Mothers Socio-economic group,if works (GRO 1970)\n",
      "       SEG | Emp,manag  Emp,manag  Prof-self  Prof-empl  Intermed   Junior no  Personal   Foremen-m  Skilled m  Semi skld |     Total\n",
      "-----------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "         I |        46          0          7         42          0          0          0          0          0          0 |        95 \n",
      "    II+IVa |         0        220          0          0        987          0          0          0          0          0 |     1,207 \n",
      "       III |         0          0          0          0          0      2,327      1,347          0          0          0 |     3,674 \n",
      "     IVb+c |         0          0          0          0          0          0          0          0          0          0 |       123 \n",
      "         V |         0          0          0          0          0          0          0         55          0          0 |        55 \n",
      "        VI |         0          0          0          0          0          0          0          0        294          0 |       294 \n",
      "       VII |         0          0          0          0          0          0          0          0          0      1,121 |     1,955 \n",
      "         . |         0          0          0          0          0          0          0          0          0          0 |    11,155 \n",
      "-----------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "     Total |        46        220          7         42        987      2,327      1,347         55        294      1,121 |    18,558 \n",
      "\n",
      "\n",
      "  NCDS Age |\n",
      "  16 Mum's |\n",
      "  EGP from |                   3P Mothers Socio-economic group,if works (GRO 1970)\n",
      "       SEG | Unskilled  Work own   Farmer-em  Farm-own   Agric wor  Armed for  Inadequat          . |     Total\n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "         I |         0          0          0          0          0          0          0          0 |        95 \n",
      "    II+IVa |         0          0          0          0          0          0          0          0 |     1,207 \n",
      "       III |         0          0          0          0          0          0          0          0 |     3,674 \n",
      "     IVb+c |         0        115          2          6          0          0          0          0 |       123 \n",
      "         V |         0          0          0          0          0          0          0          0 |        55 \n",
      "        VI |         0          0          0          0          0          0          0          0 |       294 \n",
      "       VII |       772          0          0          0         62          0          0          0 |     1,955 \n",
      "         . |         0          0          0          0          0          1        113     11,041 |    11,155 \n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "     Total |       772        115          2          6         62          1        113     11,041 |    18,558 \n",
      "\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Father's SEG\n",
    "numlabel n2385 n1175, add\n",
    "\n",
    "*Father's SEG measured age 16\n",
    "tab n2385\n",
    "capture drop ncds16_dadseg2egp\n",
    "gen ncds16_dadseg2egp = .\n",
    "replace ncds16_dadseg2egp = 1 if (n2385==1)|(n2385==3)|(n2385==4)\n",
    "replace ncds16_dadseg2egp = 2 if (n2385==2)|(n2385==5)\n",
    "replace ncds16_dadseg2egp = 3 if (n2385==6)|(n2385==7)\n",
    "replace ncds16_dadseg2egp = 4 if (n2385==12)|(n2385==13)|(n2385==14)\n",
    "replace ncds16_dadseg2egp = 5 if (n2385==8)\n",
    "replace ncds16_dadseg2egp = 6 if (n2385==9)\n",
    "replace ncds16_dadseg2egp = 7 if (n2385==10)|(n2385==11)|(n2385==15)\n",
    "replace ncds16_dadseg2egp = . if (n2385==17)|(n2385==16)\n",
    "label define egp 1 \"I\" 2 \"II+IVa\" 3 \"III\" 4 \"IVb+c\" 5 \"V\" 6 \"VI\" 7 \"VII\"\n",
    "label values ncds16_dadseg2egp egp\n",
    "label variable ncds16_dadseg2egp \"NCDS Age 16 Dad's EGP from SEG\"\n",
    "\n",
    "*armed forces are coded as missing\n",
    "tab ncds16_dadseg2egp n2385, mi\n",
    "\n",
    "*Father's SEG measured age 11\n",
    "tab n1175\n",
    "capture drop ncds11_dadseg2egp\n",
    "gen ncds11_dadseg2egp = .\n",
    "replace ncds11_dadseg2egp = 1 if (n1175==1)|(n1175==3)|(n1175==4)\n",
    "replace ncds11_dadseg2egp = 2 if (n1175==2)|(n1175==5)\n",
    "replace ncds11_dadseg2egp = 3 if (n1175==6)|(n1175==7)\n",
    "replace ncds11_dadseg2egp = 4 if (n1175==12)|(n1175==13)|(n1175==14)\n",
    "replace ncds11_dadseg2egp = 5 if (n1175==8)\n",
    "replace ncds11_dadseg2egp = 6 if (n1175==9)\n",
    "replace ncds11_dadseg2egp = 7 if (n1175==10)|(n1175==11)|(n1175==15)\n",
    "replace ncds11_dadseg2egp = . if (n1175==16)\n",
    "label values ncds11_dadseg2egp egp\n",
    "label variable ncds11_dadseg2egp \"NCDS Age 11 Dad's EGP from SEG\"\n",
    "tab ncds11_dadseg2egp n1175, mi\n",
    "\n",
    "*Mother's SEG (Age 16)\n",
    "tab n2394\n",
    "capture drop ncds16_mumseg2egp\n",
    "gen ncds16_mumseg2egp = .\n",
    "replace ncds16_mumseg2egp = 1 if (n2394==1)|(n2394==3)|(n2394==4)\n",
    "replace ncds16_mumseg2egp = 2 if (n2394==2)|(n2394==5)\n",
    "replace ncds16_mumseg2egp = 3 if (n2394==6)|(n2394==7)\n",
    "replace ncds16_mumseg2egp = 4 if (n2394==12)|(n2394==13)|(n2394==14)\n",
    "replace ncds16_mumseg2egp = 5 if (n2394==8)\n",
    "replace ncds16_mumseg2egp = 6 if (n2394==9)\n",
    "replace ncds16_mumseg2egp = 7 if (n2394==10)|(n2394==11)|(n2394==15)\n",
    "replace ncds16_mumseg2egp = . if (n2394==16)\n",
    "label values ncds16_mumseg2egp egp\n",
    "label variable ncds16_mumseg2egp \"NCDS Age 16 Mum's EGP from SEG\"\n",
    "tab ncds16_mumseg2egp n2394, mi\n",
    "\n",
    "*return to jupyter"
   ]
  },
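  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three SEG to EGP blocks above apply the same mapping to different source variables. As a minimal sketch (not the code used to produce the results in this notebook), the mapping could be written once in a loop and checked against the variables created above. The `_chk` suffix is a hypothetical name introduced here so that nothing is overwritten, and the sketch assumes that the `egp` value label has already been defined.\n",
    "\n",
    "```stata\n",
    "* Sketch only: one loop in place of the three near-identical blocks\n",
    "foreach pair in \"n2385 ncds16_dadseg2egp\" \"n1175 ncds11_dadseg2egp\" \"n2394 ncds16_mumseg2egp\" {\n",
    "    * First word is the SEG source variable, second is the EGP variable\n",
    "    local seg : word 1 of `pair'\n",
    "    local egp : word 2 of `pair'\n",
    "    capture drop `egp'_chk\n",
    "    * Codes not listed (including missing) are set to missing\n",
    "    recode `seg' (1 3 4 = 1) (2 5 = 2) (6 7 = 3) (12 13 14 = 4) (8 = 5) (9 = 6) (10 11 15 = 7) (else = .), gen(`egp'_chk)\n",
    "    label values `egp'_chk egp\n",
    "    * Stops with an error if the loop and the original block disagree\n",
    "    assert `egp'_chk == `egp'\n",
    "}\n",
    "```\n",
    "\n",
    "Any SEG code not named in the `recode` rules is left missing, which is also what the original blocks do because each variable starts as missing. If the `assert` lines pass silently, the loop reproduces the three variables."
   ]
  },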
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". keep ncdsid ncds0_olddadrgsc ncds7_olddadrgsc ncds11_olddadrgsc ncds16_olddadrgsc ncds16_oldmumrgsc ncds16_dadseg2egp ncds11_dadseg2egp nc\n",
      "> ds16_mumseg2egp n539 n1225 n2393 n2394\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "keep ncdsid ncds0_olddadrgsc ncds7_olddadrgsc ncds11_olddadrgsc ncds16_olddadrgsc ncds16_oldmumrgsc ncds16_dadseg2egp ncds11_dadseg2egp ncds16_mumseg2egp n539 n1225 n2393 n2394\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". save $path3\\temp2.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp2.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp2.dta saved\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "save $path3\\temp2.dta, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The occupational information we are going to use for our parental social class measure comes from the new occupational coding files [(SN7023)](https://discover.ukdataservice.ac.uk/catalogue/?sn=7023&type=Data%20catalogue).\n",
    "\n",
    "Gregg, P. (2012). [Occupational Coding for the National Child Development Study (1969, 1991-2008) and the 1970 British Cohort Study (1980, 2000-2008)](https://discover.ukdataservice.ac.uk/catalogue/?sn=7023&type=Data%20catalogue). [data collection]. University of London. Institute of Education. Centre for Longitudinal Studies, [original data producer(s)]. UK Data Service. SN: 7023.\n",
    "\n",
    "\"Researchers from the Avon Longitudinal Study of Parents and Children (ALSPAC), based at the University of Bristol, worked on data from selected waves of the NCDS and BCS70. To create occupational code classifications, the computerised questionnaire response text strings were converted into comma separated value (CSV) files and processed using the CASCOT (Computer Assisted Structured COding Tool) software programme, which used automatic and semi-automatic processing to assign Standard Occupational Classification 2000 (SOC2000) codes (SOC2000) to entries.\"\n",
    "\n",
    "The NS-SEC Full Version is recoded to the 8 category version using the NS-SEC documentation available [here](http://webarchive.nationalarchives.gov.uk/20160108055058/http://www.ons.gov.uk/ons/guide-method/classifications/archived-standard-classifications/soc-and-sec-archive/index.html). See the classes and collapses of NS-SEC [here](http://webarchive.nationalarchives.gov.uk/20160106042025/http://www.ons.gov.uk/ons/guide-method/classifications/current-standard-classifications/soc2010/soc2010-volume-3-ns-sec--rebased-on-soc2010--user-manual/index.html#7).\n",
    " "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path1\\ARCHIVE\\NCDSBCS_OCCS\\ncds2_occupation_coding_father.dta, clear\n",
      "\n",
      ". \n",
      ". describe\n",
      "\n",
      "Contains data from F:\\Data\\RAWDATA\\ARCHIVE\\NCDSBCS_OCCS\\ncds2_occupation_coding_father.dta\n",
      "  obs:        15,337                          \n",
      " vars:            22                          \n",
      " size:     1,901,788                          \n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "              storage   display    value\n",
      "variable name   type    format     label      variable label\n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "NCDSID          str7    %7s                   ncdsid\n",
      "N2ASOCC         str4    %4s                   NCDS 1969 Father: FULL automatic SOC2000\n",
      "N2ASOCS         byte    %8.0g                 NCDS 1969 Father: FULL automatic Score\n",
      "N2ASOC90        int     %8.0g                 NCDS 1969 Father: Full auto SOC90\n",
      "N2SSOCC         str4    %4s                   NCDS 1969 Father: SEMI auto SOC2000\n",
      "N2SSOCS         byte    %8.0g                 NCDS 1969 Father: SEMI auto Score\n",
      "N2SSOC90        int     %8.0g                 NCDS 1969 Father: Semi auto SOC90\n",
      "N2VSOCC         str4    %4s                   NCDS 1969 Father: VERIFICATION SOC2000\n",
      "N2VSOCS         byte    %8.0g                 NCDS 1969 Father: VERIFICATION Score\n",
      "N2VSOC90        int     %8.0g                 NCDS 1969 Father: VERIFICATION SOC90\n",
      "N2ANSSEC        double  %10.0g     N2ANSSEC   NCDS 1969 Father: NS-SEC social class code AUTO processing\n",
      "N2SNSSEC        double  %10.0g     N2SNSSEC   NCDS 1969 Father: NS-SEC social class code SEMI processing\n",
      "N2VNSSEC        double  %10.0g     N2VNSSEC   NCDS 1969 Father: NS-SEC social class code VERIFICATION\n",
      "N2ACMSIS        double  %10.0g                NCDS 1969 Father: CAMSIS code AUTO processing\n",
      "N2SCMSIS        double  %10.0g                NCDS 1969 Father: CAMSIS code SEMI processing\n",
      "N2VCMSIS        double  %10.0g                NCDS 1969 Father: CAMSIS code VERIFICATION processing\n",
      "N2ARGSC         double  %10.0g     N2ARGSC    NCDS 1969 Father: RGSC social class code AUTO processing\n",
      "N2SRGSC         double  %10.0g     N2SRGSC    NCDS 1969 Father: RGSC social class code SEMI processing\n",
      "N2VRGSC         double  %10.0g     N2VRGSC    NCDS 1969 Father: RGSC social class code VERIFICATION processing\n",
      "N2ASEG          double  %10.0g     N2ASEG     NCDS 1969 Father: SEG social class code AUTO processing\n",
      "N2SSEG          double  %10.0g     N2SSEG     NCDS 1969 Father: SEG social class code SEMI processing\n",
      "N2VSEG          double  %10.0g     N2VSEG     NCDS 1969 Father: SEG social class code VERIFICATION processing\n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "Sorted by: \n",
      "\n",
      ". \n",
      ". *Age 11 Father's Occupational Information\n",
      "\n",
      ". \n",
      ". keep NCDSID N2SNSSEC N2SSOCC \n",
      "\n",
      ". \n",
      ". *N2SSOCC is father's SOC2000\n",
      "\n",
      ". \n",
      ". *N2SNSSEC if father's NSSEC(simplified version, no employment status\n",
      "\n",
      ". *information used in its preparation.)\n",
      "\n",
      ". \n",
      ". tab N2SNSSEC\n",
      "\n",
      "  NCDS 1969 Father: NS-SEC social class |\n",
      "                 code SEMI processing   |      Freq.     Percent        Cum.\n",
      "----------------------------------------+-----------------------------------\n",
      "                      Higher managerial |        367        3.29        3.29\n",
      "                                    3.1 |        461        4.13        7.42\n",
      "                                    3.2 |         63        0.56        7.99\n",
      "                                    3.3 |         12        0.11        8.09\n",
      "                                    4.1 |        674        6.04       14.14\n",
      "                                    4.2 |        147        1.32       15.45\n",
      "                                    4.3 |         26        0.23       15.69\n",
      "                       Lower managerial |        476        4.27       19.95\n",
      "                                    7.1 |        347        3.11       23.06\n",
      "                                    7.2 |        467        4.19       27.25\n",
      "                                    7.3 |        122        1.09       28.34\n",
      "                                    7.4 |        122        1.09       29.44\n",
      "                                    8.1 |        229        2.05       31.49\n",
      "                                    9.1 |        882        7.91       39.40\n",
      "                                    9.2 |        263        2.36       41.75\n",
      "                      Lower supervisory |        185        1.66       43.41\n",
      "                                   11.1 |      1,302       11.67       55.08\n",
      "                                   11.2 |        330        2.96       58.04\n",
      "                                   12.1 |        155        1.39       59.43\n",
      "                                   12.2 |        267        2.39       61.82\n",
      "                                   12.3 |        696        6.24       68.06\n",
      "                                   12.4 |        661        5.93       73.99\n",
      "                                   12.5 |        130        1.17       75.15\n",
      "                                   12.6 |         61        0.55       75.70\n",
      "                                   12.7 |          2        0.02       75.72\n",
      "                                   13.1 |         35        0.31       76.03\n",
      "                                   13.2 |        193        1.73       77.76\n",
      "                                   13.3 |      1,469       13.17       90.93\n",
      "                                   13.4 |        991        8.88       99.81\n",
      "                                   13.5 |         21        0.19      100.00\n",
      "----------------------------------------+-----------------------------------\n",
      "                                  Total |     11,156      100.00\n",
      "\n",
      ". capture drop ncds_panssec\n",
      "\n",
      ". gen ncds_panssec = .\n",
      "(15,337 missing values generated)\n",
      "\n",
      ".     replace ncds_panssec = 1 if (N2SNSSEC>=1)&(N2SNSSEC<=2) \n",
      "(367 real changes made)\n",
      "\n",
      ".         *1.1 Large Employers and Higher Managerial\n",
      "\n",
      ".     replace ncds_panssec = 2 if (N2SNSSEC>=3.1)&(N2SNSSEC<=3.4) \n",
      "(536 real changes made)\n",
      "\n",
      ".         *1.2 Higher Professional\n",
      "\n",
      ".     replace ncds_panssec = 3 if (N2SNSSEC>=4.1)&(N2SNSSEC<=6) \n",
      "(1,323 real changes made)\n",
      "\n",
      ".         *lower managerial and professional\n",
      "\n",
      ".     replace ncds_panssec = 4 if (N2SNSSEC>=7.1)&(N2SNSSEC<=7.4) \n",
      "(1,058 real changes made)\n",
      "\n",
      ".         *intermediate\n",
      "\n",
      ".     replace ncds_panssec = 5 if (N2SNSSEC>=8.1)&(N2SNSSEC<=9.2) \n",
      "(1,374 real changes made)\n",
      "\n",
      ".         *small employers and own account\n",
      "\n",
      ".     replace ncds_panssec = 6 if (N2SNSSEC>=10)&(N2SNSSEC<=11.2) \n",
      "(1,817 real changes made)\n",
      "\n",
      ".         *lower supervisory and technical\n",
      "\n",
      ".     replace ncds_panssec = 7 if (N2SNSSEC>=12.1)&(N2SNSSEC<=12.7) \n",
      "(1,972 real changes made)\n",
      "\n",
      ".         *semiroutine\n",
      "\n",
      ".     replace ncds_panssec = 8 if (N2SNSSEC>=13.1)&(N2SNSSEC<=13.5) \n",
      "(2,709 real changes made)\n",
      "\n",
      ".         *routine\n",
      "\n",
      ". tab ncds_panssec\n",
      "\n",
      "ncds_pansse |\n",
      "          c |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |        367        3.29        3.29\n",
      "          2 |        536        4.80        8.09\n",
      "          3 |      1,323       11.86       19.95\n",
      "          4 |      1,058        9.48       29.44\n",
      "          5 |      1,374       12.32       41.75\n",
      "          6 |      1,817       16.29       58.04\n",
      "          7 |      1,972       17.68       75.72\n",
      "          8 |      2,709       24.28      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     11,156      100.00\n",
      "\n",
      ". label variable ncds_panssec \"NCDS Age 11 Father's NSSEC\"\n",
      "\n",
      ". label define nssec 1 \"Large Employers and Higher Managerial\" 2 \"Higher Professional\" 3 \"Lower managerial and professional\" 4 \"Intermediate\n",
      "> \" 5 \"Small employers and own account\" 6 \"Lower Supervisory and Technical\" 7 \"Semi-Routine\" 8 \"Routine\" \n",
      "\n",
      ". label values ncds_panssec nssec\n",
      "\n",
      ". tab ncds_panssec, mi\n",
      "\n",
      "           NCDS Age 11 Father's NSSEC |      Freq.     Percent        Cum.\n",
      "--------------------------------------+-----------------------------------\n",
      "Large Employers and Higher Managerial |        367        2.39        2.39\n",
      "                  Higher Professional |        536        3.49        5.89\n",
      "    Lower managerial and professional |      1,323        8.63       14.51\n",
      "                         Intermediate |      1,058        6.90       21.41\n",
      "      Small employers and own account |      1,374        8.96       30.37\n",
      "      Lower Supervisory and Technical |      1,817       11.85       42.22\n",
      "                         Semi-Routine |      1,972       12.86       55.08\n",
      "                              Routine |      2,709       17.66       72.74\n",
      "                                    . |      4,181       27.26      100.00\n",
      "--------------------------------------+-----------------------------------\n",
      "                                Total |     15,337      100.00\n",
      "\n",
      ". \n",
      ". \n",
      ". rename NCDSID ncdsid\n",
      "\n",
      ". sort ncdsid\n",
      "\n",
      ". \n",
      ". duplicates report ncdsid\n",
      "\n",
      "Duplicates in terms of ncdsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        15337             0\n",
      "--------------------------------------\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path1\\ARCHIVE\\NCDSBCS_OCCS\\ncds2_occupation_coding_father.dta, clear\n",
    "\n",
    "describe\n",
    "\n",
    "*Age 11 Father's Occupational Information\n",
    "\n",
    "keep NCDSID N2SNSSEC N2SSOCC \n",
    "\n",
    "*N2SSOCC is father's SOC2000\n",
    "\n",
    "*N2SNSSEC if father's NSSEC(simplified version, no employment status\n",
    "*information used in its preparation.)\n",
    "\n",
    "tab N2SNSSEC\n",
    "capture drop ncds_panssec\n",
    "gen ncds_panssec = .\n",
    "    replace ncds_panssec = 1 if (N2SNSSEC>=1)&(N2SNSSEC<=2) \n",
    "        *1.1 Large Employers and Higher Managerial\n",
    "    replace ncds_panssec = 2 if (N2SNSSEC>=3.1)&(N2SNSSEC<=3.4) \n",
    "        *1.2 Higher Professional\n",
    "    replace ncds_panssec = 3 if (N2SNSSEC>=4.1)&(N2SNSSEC<=6) \n",
    "        *lower managerial and professional\n",
    "    replace ncds_panssec = 4 if (N2SNSSEC>=7.1)&(N2SNSSEC<=7.4) \n",
    "        *intermediate\n",
    "    replace ncds_panssec = 5 if (N2SNSSEC>=8.1)&(N2SNSSEC<=9.2) \n",
    "        *small employers and own account\n",
    "    replace ncds_panssec = 6 if (N2SNSSEC>=10)&(N2SNSSEC<=11.2) \n",
    "        *lower supervisory and technical\n",
    "    replace ncds_panssec = 7 if (N2SNSSEC>=12.1)&(N2SNSSEC<=12.7) \n",
    "        *semiroutine\n",
    "    replace ncds_panssec = 8 if (N2SNSSEC>=13.1)&(N2SNSSEC<=13.5) \n",
    "        *routine\n",
    "tab ncds_panssec\n",
    "label variable ncds_panssec \"NCDS Age 11 Father's NSSEC\"\n",
    "label define nssec 1 \"Large Employers and Higher Managerial\" 2 \"Higher Professional\" 3 \"Lower managerial and professional\" 4 \"Intermediate\" 5 \"Small employers and own account\" 6 \"Lower Supervisory and Technical\" 7 \"Semi-Routine\" 8 \"Routine\" \n",
    "label values ncds_panssec nssec\n",
    "tab ncds_panssec, mi\n",
    "\n",
    "\n",
    "rename NCDSID ncdsid\n",
    "sort ncdsid\n",
    "\n",
    "duplicates report ncdsid\n",
    "\n",
    "*return to jupyter"
   ]
  },
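  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The collapse from the full NS-SEC codes to eight classes in the cell above can also be expressed with `recode`, which keeps the whole mapping on one line. This is a sketch for checking purposes only, assuming the full codes are stored as shown in the tabulation above. `ncds_panssec_chk` is a hypothetical variable name that is not used elsewhere in the notebook.\n",
    "\n",
    "```stata\n",
    "* Sketch only: the same eight-class collapse written with recode\n",
    "capture drop ncds_panssec_chk\n",
    "recode N2SNSSEC (1/2 = 1) (3.1/3.4 = 2) (4.1/6 = 3) (7.1/7.4 = 4) (8.1/9.2 = 5) (10/11.2 = 6) (12.1/12.7 = 7) (13.1/13.5 = 8) (else = .), gen(ncds_panssec_chk)\n",
    "* Stops with an error if the two versions differ for any case\n",
    "assert ncds_panssec_chk == ncds_panssec\n",
    "```\n",
    "\n",
    "Stata treats two missing values as equal, so cases with no NS-SEC code pass the `assert` as long as they are missing in both variables."
   ]
  },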
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Just to double check we are going to recode the SOC2000 codes in this file to NS-SEC. \n",
    "\n",
    "A look-up table to code NS-SEC from soc2000 is available [here](https://github.com/RoxanneConnelly/SOC2000toNSSEC), this is produced using the NS-SEC [documentation](http://webarchive.nationalarchives.gov.uk/20070109013547/http://www.statistics.gov.uk/methods_quality/ns_sec/soc2000.asp).\n",
    "\n",
    "NS-SEC is ideally computed using both occupational and employment status information. In this analysis we compute NS-SEC using the simplified method (i.e. occupational information only). \n",
    "\n",
    "There is some employment status information available in the datasets, however there is not the required information to produce the standardised employment status variable required to compute the full version of NS-SEC fully in the prescribed manner in a comparable manner across the two cohorts. \n",
    "\n",
    "For example in the NCDS first survey we have information on:\n",
    "* Father self-employed\n",
    "* If whether they employ more than 10 people\n",
    "* If not self-employed whether he supervises others (e.g. foreman, manager, charge hand)\n",
    "\n",
    "The NS-SEC required the following information:\n",
    "* Whether self-employed with no employees (i.e. own account worker)\n",
    "* If employer whether employs less then 25, or 25 or more employees\n",
    "* If an employee whether they are a supervisor or not\n",
    "* and if a supervisor how many employees they supervise\n",
    "\n",
    "The NCDS question on whether employees supervise others includes foremen and managers in the same response. However the NS-SEC documentation explicitly defines managers as separate from supervisors.\n",
    "\n",
    "Due to these differences we have been cautious and not used the employment status information to ensure that the coding of our social class measure can be as standardised as possible.\n",
    "\n",
    "Furthermore, there are different employment status questions used in the BCS which would provide slightly different employment status information. This would potentially hinder fair comparisons between the two cohorts. As a result, we use the simplified NS-SEC coding method in both cohorts.\n",
    "\n",
    "More information on NS-SEC and the simplified method are available in the NS-SEC [documentation.](http://webarchive.nationalarchives.gov.uk/20160106042025/http://www.ons.gov.uk/ons/guide-method/classifications/current-standard-classifications/soc2010/soc2010-volume-3-ns-sec--rebased-on-soc2010--user-manual/index.html)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *We create an employment status variable (==0) \"missing\" as we\n",
      "\n",
      ". *are using the simplified coding method.\n",
      "\n",
      ". capture drop ukempst\n",
      "\n",
      ". gen ukempst = 0\n",
      "\n",
      ". \n",
      ". *The soc2000 variable is a string and contains non-numeric values\n",
      "\n",
      ". *so i need to turn this into a numeric variable.  \n",
      "\n",
      ". describe \n",
      "\n",
      "Contains data from F:\\Data\\RAWDATA\\ARCHIVE\\NCDSBCS_OCCS\\ncds2_occupation_coding_father.dta\n",
      "  obs:        15,337                          \n",
      " vars:             5                          \n",
      " size:       414,099                          \n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "              storage   display    value\n",
      "variable name   type    format     label      variable label\n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "ncdsid          str7    %7s                   ncdsid\n",
      "N2SSOCC         str4    %4s                   NCDS 1969 Father: SEMI auto SOC2000\n",
      "N2SNSSEC        double  %10.0g     N2SNSSEC   NCDS 1969 Father: NS-SEC social class code SEMI processing\n",
      "ncds_panssec    float   %37.0g     nssec      NCDS Age 11 Father's NSSEC\n",
      "ukempst         float   %9.0g                 \n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "Sorted by: ncdsid\n",
      "     Note: Dataset has changed since last saved.\n",
      "\n",
      ". capture drop soc2000\n",
      "\n",
      ". gen soc2000 = real(N2SSOCC)\n",
      "(4,179 missing values generated)\n",
      "\n",
      ". \n",
      ". duplicates report ncdsid\n",
      "\n",
      "Duplicates in terms of ncdsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        15337             0\n",
      "--------------------------------------\n",
      "\n",
      ". \n",
      ". sort soc2000 ukempst\n",
      "\n",
      ". merge m:m soc2000 ukempst using $path1\\OTHER\\SOC2000_to_NSSEC_20160527_RC_V1.dta\n",
      "(label nssec already defined)\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         6,692\n",
      "        from master                     4,179  (_merge==1)\n",
      "        from using                      2,513  (_merge==2)\n",
      "\n",
      "    matched                            11,158  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ". \n",
      ". drop if _merge==2\n",
      "(2,513 observations deleted)\n",
      "\n",
      ". \n",
      ". tab nssec\n",
      "\n",
      "                                nssec |      Freq.     Percent        Cum.\n",
      "--------------------------------------+-----------------------------------\n",
      "Large Employers and Higher Managerial |        369        3.31        3.31\n",
      "                  Higher Professional |        536        4.80        8.11\n",
      "    Lower managerial and professional |      1,323       11.86       19.97\n",
      "                         Intermediate |      1,058        9.48       29.45\n",
      "      Small employers and own account |      1,374       12.31       41.76\n",
      "      Lower Supervisory and Technical |      1,817       16.28       58.05\n",
      "                         Semi-Routine |      1,972       17.67       75.72\n",
      "                              Routine |      2,709       24.28      100.00\n",
      "--------------------------------------+-----------------------------------\n",
      "                                Total |     11,158      100.00\n",
      "\n",
      ". \n",
      ". tab nssec ncds_panssec\n",
      "\n",
      "                      |                               NCDS Age 11 Father's NSSEC\n",
      "                nssec | Large Emp  Higher Pr  Lower man  Intermedi  Small emp  Lower Sup  Semi-Rout    Routine |     Total\n",
      "----------------------+----------------------------------------------------------------------------------------+----------\n",
      "Large Employers and H |       367          0          0          0          0          0          0          0 |       367 \n",
      "  Higher Professional |         0        536          0          0          0          0          0          0 |       536 \n",
      "Lower managerial and  |         0          0      1,323          0          0          0          0          0 |     1,323 \n",
      "         Intermediate |         0          0          0      1,058          0          0          0          0 |     1,058 \n",
      "Small employers and o |         0          0          0          0      1,374          0          0          0 |     1,374 \n",
      "Lower Supervisory and |         0          0          0          0          0      1,817          0          0 |     1,817 \n",
      "         Semi-Routine |         0          0          0          0          0          0      1,972          0 |     1,972 \n",
      "              Routine |         0          0          0          0          0          0          0      2,709 |     2,709 \n",
      "----------------------+----------------------------------------------------------------------------------------+----------\n",
      "                Total |       367        536      1,323      1,058      1,374      1,817      1,972      2,709 |    11,156 \n",
      "\n",
      "\n",
      ". \n",
      ". *There is a perfect match between the NS-SEC variable in the data, and the version coded by us.\n",
      "\n",
      ". \n",
      ". rename soc2000 ncds_dadsoc2000 \n",
      "\n",
      ". \n",
      ". drop ukempst nssec _merge\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "*We create an employment status variable (==0) \"missing\" as we\n",
    "*are using the simplified coding method.\n",
    "capture drop ukempst\n",
    "gen ukempst = 0\n",
    "\n",
    "*The soc2000 variable is a string and contains non-numeric values\n",
    "*so i need to turn this into a numeric variable.  \n",
    "describe \n",
    "capture drop soc2000\n",
    "gen soc2000 = real(N2SSOCC)\n",
    "\n",
    "duplicates report ncdsid\n",
    "\n",
    "sort soc2000 ukempst\n",
    "merge m:m soc2000 ukempst using $path1\\OTHER\\SOC2000_to_NSSEC_20160527_RC_V1.dta\n",
    "\n",
    "drop if _merge==2\n",
    "\n",
    "tab nssec\n",
    "\n",
    "tab nssec ncds_panssec\n",
    "\n",
    "*There is a perfect match between the NS-SEC variable in the data, and the version coded by us.\n",
    "\n",
    "rename soc2000 ncds_dadsoc2000 \n",
    "\n",
    "drop ukempst nssec _merge\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *return to jupyter\n",
      "\n",
      ". save $path3\\temp3.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp3.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp3.dta saved\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "save $path3\\temp3.dta, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We use the response datasets to include information on the outcome at each sweep of the survey (e.g. productive or not productive)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path1\\ARCHIVE\\NCDS\\response\\ncds_response.dta, clear\n",
      "\n",
      ". \n",
      ". keep NCDSID OUTCME00 OUTCME01 OUTCME02\n",
      "\n",
      ".     numlabel, add\n",
      "\n",
      ". \n",
      ". rename NCDSID ncdsid\n",
      "\n",
      ". \n",
      ". *NCDS AGE 0  - This variable indicates the outcome of the first survey (i.e. productive or other outcome)\n",
      "\n",
      ". tab OUTCME00\n",
      "\n",
      "Outcome to PMS (1958)    |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     17,415       93.84       93.84\n",
      "          3. Non-contact |        218        1.17       95.02\n",
      "           6. Not Issued |        925        4.98      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,558      100.00\n",
      "\n",
      ". rename OUTCME00 ncds_0outcome\n",
      "\n",
      ".     label variable ncds_0outcome \"NCDS response outcome 1958 (age 0)\"\n",
      "\n",
      ".     tab ncds_0outcome\n",
      "\n",
      "   NCDS response outcome |\n",
      "            1958 (age 0) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     17,415       93.84       93.84\n",
      "          3. Non-contact |        218        1.17       95.02\n",
      "           6. Not Issued |        925        4.98      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,558      100.00\n",
      "\n",
      ".     \n",
      ". *NCDS AGE 7 - This variable indicates the outcome of the age 7 survey\n",
      "\n",
      ". tab OUTCME01\n",
      "\n",
      "Outcome to NCDS1 (1965)  |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     15,425       83.12       83.12\n",
      "              2. Refusal |         80        0.43       83.55\n",
      "          3. Non-contact |      1,036        5.58       89.13\n",
      "   4. Other unproductive |        173        0.93       90.06\n",
      "           6. Not Issued |        548        2.95       93.02\n",
      "7. Not Issued - Emigrant |        475        2.56       95.58\n",
      "    8. Not Issued - Dead |        821        4.42      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,558      100.00\n",
      "\n",
      ". rename OUTCME01 ncds_7outcome\n",
      "\n",
      ".     label variable ncds_7outcome \"NCDS response outcome 1965 (age 7)\"\n",
      "\n",
      ".     tab ncds_7outcome\n",
      "\n",
      "   NCDS response outcome |\n",
      "            1965 (age 7) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     15,425       83.12       83.12\n",
      "              2. Refusal |         80        0.43       83.55\n",
      "          3. Non-contact |      1,036        5.58       89.13\n",
      "   4. Other unproductive |        173        0.93       90.06\n",
      "           6. Not Issued |        548        2.95       93.02\n",
      "7. Not Issued - Emigrant |        475        2.56       95.58\n",
      "    8. Not Issued - Dead |        821        4.42      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,558      100.00\n",
      "\n",
      ".     \n",
      ". *NCDS AGE 11 - This variable indicates the outcome of the age 11 survey\n",
      "\n",
      ". tab OUTCME02\n",
      "\n",
      "Outcome to NCDS2 (1969)  |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     15,337       82.64       82.64\n",
      "              2. Refusal |        797        4.29       86.94\n",
      "          3. Non-contact |        406        2.19       89.13\n",
      "   4. Other unproductive |        202        1.09       90.21\n",
      "           6. Not Issued |        275        1.48       91.70\n",
      "7. Not Issued - Emigrant |        701        3.78       95.47\n",
      "    8. Not Issued - Dead |        840        4.53      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,558      100.00\n",
      "\n",
      ". rename OUTCME02 ncds_11outcome\n",
      "\n",
      ".     label variable ncds_11outcome \"NCDS response outcome 1969 (age 11)\"\n",
      "\n",
      ".     tab ncds_11outcome\n",
      "\n",
      "   NCDS response outcome |\n",
      "           1969 (age 11) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     15,337       82.64       82.64\n",
      "              2. Refusal |        797        4.29       86.94\n",
      "          3. Non-contact |        406        2.19       89.13\n",
      "   4. Other unproductive |        202        1.09       90.21\n",
      "           6. Not Issued |        275        1.48       91.70\n",
      "7. Not Issued - Emigrant |        701        3.78       95.47\n",
      "    8. Not Issued - Dead |        840        4.53      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,558      100.00\n",
      "\n",
      ".     \n",
      ". *Here we create a simple dummy variable indicating if the cohort member had a productive\n",
      "\n",
      ". *interview at the age 11 survey (or not)\n",
      "\n",
      ". tab ncds_11outcome\n",
      "\n",
      "   NCDS response outcome |\n",
      "           1969 (age 11) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     15,337       82.64       82.64\n",
      "              2. Refusal |        797        4.29       86.94\n",
      "          3. Non-contact |        406        2.19       89.13\n",
      "   4. Other unproductive |        202        1.09       90.21\n",
      "           6. Not Issued |        275        1.48       91.70\n",
      "7. Not Issued - Emigrant |        701        3.78       95.47\n",
      "    8. Not Issued - Dead |        840        4.53      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,558      100.00\n",
      "\n",
      ".     gen sweeptestoutcome = 0\n",
      "\n",
      ".     replace sweeptestoutcome = 1 if (ncds_11outcome==1)\n",
      "(15,337 real changes made)\n",
      "\n",
      ".     label define yesno 1 \"Yes\" 0 \"No\"\n",
      "\n",
      ".     label values sweeptestoutcome yesno\n",
      "\n",
      ".     tab ncds_11outcome sweeptestoutcome\n",
      "\n",
      "NCDS response outcome |   sweeptestoutcome\n",
      "        1969 (age 11) |        No        Yes |     Total\n",
      "----------------------+----------------------+----------\n",
      "        1. Productive |         0     15,337 |    15,337 \n",
      "           2. Refusal |       797          0 |       797 \n",
      "       3. Non-contact |       406          0 |       406 \n",
      "4. Other unproductive |       202          0 |       202 \n",
      "        6. Not Issued |       275          0 |       275 \n",
      "7. Not Issued - Emigr |       701          0 |       701 \n",
      " 8. Not Issued - Dead |       840          0 |       840 \n",
      "----------------------+----------------------+----------\n",
      "                Total |     3,221     15,337 |    18,558 \n",
      "\n",
      "\n",
      ".     label variable sweeptestoutcome \"Productive at age 11 survey\"\n",
      "\n",
      ". \n",
      ". sort ncdsid\n",
      "\n",
      ". save $path3\\temp4.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp4.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp4.dta saved\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path1\\ARCHIVE\\NCDS\\response\\ncds_response.dta, clear\n",
    "\n",
    "keep NCDSID OUTCME00 OUTCME01 OUTCME02\n",
    "    numlabel, add\n",
    "\n",
    "rename NCDSID ncdsid\n",
    "\n",
    "*NCDS AGE 0  - This variable indicates the outcome of the first survey (i.e. productive or other outcome)\n",
    "tab OUTCME00\n",
    "rename OUTCME00 ncds_0outcome\n",
    "    label variable ncds_0outcome \"NCDS response outcome 1958 (age 0)\"\n",
    "    tab ncds_0outcome\n",
    "    \n",
    "*NCDS AGE 7 - This variable indicates the outcome of the age 7 survey\n",
    "tab OUTCME01\n",
    "rename OUTCME01 ncds_7outcome\n",
    "    label variable ncds_7outcome \"NCDS response outcome 1965 (age 7)\"\n",
    "    tab ncds_7outcome\n",
    "    \n",
    "*NCDS AGE 11 - This variable indicates the outcome of the age 11 survey\n",
    "tab OUTCME02\n",
    "rename OUTCME02 ncds_11outcome\n",
    "    label variable ncds_11outcome \"NCDS response outcome 1969 (age 11)\"\n",
    "    tab ncds_11outcome\n",
    "    \n",
    "*Here we create a simple dummy variable indicating if the cohort member had a productive\n",
    "*interview at the age 11 survey (or not)\n",
    "tab ncds_11outcome\n",
    "    gen sweeptestoutcome = 0\n",
    "    replace sweeptestoutcome = 1 if (ncds_11outcome==1)\n",
    "    label define yesno 1 \"Yes\" 0 \"No\"\n",
    "    label values sweeptestoutcome yesno\n",
    "    tab ncds_11outcome sweeptestoutcome\n",
    "    label variable sweeptestoutcome \"Productive at age 11 survey\"\n",
    "\n",
    "sort ncdsid\n",
    "save $path3\\temp4.dta, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have also brought together some additional variables that are not used in the main data analysis, but which could potentially be used to produce the inverse probability weights and in the multiple imputation.\n",
    "\n",
    "These are the types of variables that have previously been used in models of missing data in the cohort studies (see Mostafa & Wiggins, 2015; Plewis, Calderwood, Hawkes, & Nathan, 2004).\n"
   ]
  },
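  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of how these auxiliary variables could feed a response model, assuming the merged NCDS working file is in memory. It is illustrative only and is not necessarily the weighting model used in this analysis; `p_respond` and `ipw` are hypothetical variable names.\n",
    "\n",
    "```stata\n",
    "* Sketch only: model the probability of a productive age 11 interview\n",
    "logit sweeptestoutcome i.ncds_married c.ncds_mumagebirth c.ncds_parity i.ncds_region\n",
    "\n",
    "* Predicted probability of response for cases in the estimation sample\n",
    "predict double p_respond if e(sample), pr\n",
    "\n",
    "* Inverse probability weight, defined for productive cases only\n",
    "generate double ipw = 1 / p_respond if sweeptestoutcome == 1\n",
    "summarize ipw, detail\n",
    "```\n",
    "\n",
    "Cases with a low predicted probability of response receive larger weights, so it is worth inspecting the distribution of `ipw` for extreme values before using it."
   ]
  },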
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path1\\ARCHIVE\\NCDS\\S1-3\\ncds0123.dta, clear\n",
      "\n",
      ". \n",
      ". keep ncdsid n0region n553 n545 n504 \n",
      "\n",
      ". \n",
      ". numlabel n0region n553 n545 n504 , add\n",
      "\n",
      ". \n",
      ". *Whether the cohort member's mother is married.\n",
      "\n",
      ". tab n545\n",
      "\n",
      "   0 Mother's present |\n",
      "       marital status |      Freq.     Percent        Cum.\n",
      "----------------------+-----------------------------------\n",
      "-1. NA, incomplt info |         10        0.06        0.06\n",
      "   1. Sep,Div,Widowed |        161        0.92        0.98\n",
      "      2. Stable union |         39        0.22        1.21\n",
      "     3. Twice married |         33        0.19        1.40\n",
      "           4. Married |     16,662       95.68       97.07\n",
      "         5. Unmarried |        510        2.93      100.00\n",
      "----------------------+-----------------------------------\n",
      "                Total |     17,415      100.00\n",
      "\n",
      ". capture drop ncds_married\n",
      "\n",
      ".     gen ncds_married = .\n",
      "(18,558 missing values generated)\n",
      "\n",
      ".     replace ncds_married = 1 if (n545==4)\n",
      "(16,662 real changes made)\n",
      "\n",
      ".     replace ncds_married = 0 if (n545==1)|(n545==2)|(n545==3)|(n545==5)\n",
      "(743 real changes made)\n",
      "\n",
      ".     label variable ncds_married \"NCDS Mother married at Cohort Member's Birth\"\n",
      "\n",
      ".     label define yesno 1 \"Yes\" 0 \"No\"\n",
      "\n",
      ".     label values ncds_married yesno\n",
      "\n",
      ".     tab ncds_married\n",
      "\n",
      "NCDS Mother |\n",
      " married at |\n",
      "     Cohort |\n",
      "   Member's |\n",
      "      Birth |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "         No |        743        4.27        4.27\n",
      "        Yes |     16,662       95.73      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,405      100.00\n",
      "\n",
      ".     drop n545\n",
      "\n",
      ". \n",
      ". *The cohort member's mother's age at the cohort member's birth\n",
      "\n",
      ". tab n553\n",
      "\n",
      " 0 Mother's |\n",
      "   age last |\n",
      "birthday,in |\n",
      "      years |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "     -1. NA |         13        0.07        0.07\n",
      "          8 |          1        0.01        0.08\n",
      "         14 |          1        0.01        0.09\n",
      "         15 |          8        0.05        0.13\n",
      "         16 |         41        0.24        0.37\n",
      "         17 |        142        0.82        1.18\n",
      "         18 |        306        1.76        2.94\n",
      "         19 |        494        2.84        5.78\n",
      "         20 |        772        4.43       10.21\n",
      "         21 |        951        5.46       15.67\n",
      "         22 |      1,033        5.93       21.60\n",
      "         23 |      1,122        6.44       28.04\n",
      "         24 |      1,139        6.54       34.59\n",
      "         25 |      1,141        6.55       41.14\n",
      "         26 |      1,118        6.42       47.56\n",
      "         27 |      1,255        7.21       54.76\n",
      "         28 |      1,025        5.89       60.65\n",
      "         29 |      1,046        6.01       66.66\n",
      "         30 |        915        5.25       71.91\n",
      "         31 |        724        4.16       76.07\n",
      "         32 |        700        4.02       80.09\n",
      "         33 |        633        3.63       83.72\n",
      "         34 |        560        3.22       86.94\n",
      "         35 |        452        2.60       89.53\n",
      "         36 |        407        2.34       91.87\n",
      "         37 |        469        2.69       94.56\n",
      "         38 |        295        1.69       96.26\n",
      "         39 |        202        1.16       97.42\n",
      "         40 |        122        0.70       98.12\n",
      "         41 |        120        0.69       98.81\n",
      "         42 |         85        0.49       99.29\n",
      "         43 |         63        0.36       99.66\n",
      "         44 |         27        0.16       99.81\n",
      "         45 |         17        0.10       99.91\n",
      "         46 |         11        0.06       99.97\n",
      "         47 |          4        0.02       99.99\n",
      "         48 |          1        0.01      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,415      100.00\n",
      "\n",
      ". recode n553 (-1=.)\n",
      "(n553: 13 changes made)\n",
      "\n",
      ". rename n553 ncds_mumagebirth\n",
      "\n",
      ". label variable ncds_mumagebirth \"NCDS Mother's Age at Cohort Member's Birth\"\n",
      "\n",
      ". tab ncds_mumagebirth\n",
      "\n",
      "       NCDS |\n",
      "   Mother's |\n",
      "     Age at |\n",
      "     Cohort |\n",
      "   Member's |\n",
      "      Birth |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          8 |          1        0.01        0.01\n",
      "         14 |          1        0.01        0.01\n",
      "         15 |          8        0.05        0.06\n",
      "         16 |         41        0.24        0.29\n",
      "         17 |        142        0.82        1.11\n",
      "         18 |        306        1.76        2.87\n",
      "         19 |        494        2.84        5.71\n",
      "         20 |        772        4.44       10.14\n",
      "         21 |        951        5.46       15.61\n",
      "         22 |      1,033        5.94       21.54\n",
      "         23 |      1,122        6.45       27.99\n",
      "         24 |      1,139        6.55       34.54\n",
      "         25 |      1,141        6.56       41.09\n",
      "         26 |      1,118        6.42       47.52\n",
      "         27 |      1,255        7.21       54.73\n",
      "         28 |      1,025        5.89       60.62\n",
      "         29 |      1,046        6.01       66.63\n",
      "         30 |        915        5.26       71.89\n",
      "         31 |        724        4.16       76.05\n",
      "         32 |        700        4.02       80.07\n",
      "         33 |        633        3.64       83.71\n",
      "         34 |        560        3.22       86.93\n",
      "         35 |        452        2.60       89.52\n",
      "         36 |        407        2.34       91.86\n",
      "         37 |        469        2.70       94.56\n",
      "         38 |        295        1.70       96.25\n",
      "         39 |        202        1.16       97.41\n",
      "         40 |        122        0.70       98.12\n",
      "         41 |        120        0.69       98.80\n",
      "         42 |         85        0.49       99.29\n",
      "         43 |         63        0.36       99.66\n",
      "         44 |         27        0.16       99.81\n",
      "         45 |         17        0.10       99.91\n",
      "         46 |         11        0.06       99.97\n",
      "         47 |          4        0.02       99.99\n",
      "         48 |          1        0.01      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,402      100.00\n",
      "\n",
      ". \n",
      ". *Parity at the cohort member's birth\n",
      "\n",
      ". tab n504\n",
      "\n",
      "            0 Parity |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "              -1. NA |          3        0.02        0.02\n",
      "1. No prev aft 28wks |      6,396       36.73       36.74\n",
      "    2. 1 after 28wks |      5,364       30.80       67.55\n",
      "    3. 2 after 28wks |      2,730       15.68       83.22\n",
      "    4. 3 after 28wks |      1,357        7.79       91.01\n",
      "    5. 4 after 28wks |        705        4.05       95.06\n",
      "    6. 5 after 28wks |        391        2.25       97.31\n",
      "    7. 6 after 28wks |        216        1.24       98.55\n",
      "    8. 7 after 28wks |        129        0.74       99.29\n",
      "    9. 8 after 28wks |         67        0.38       99.67\n",
      "   10. 9 after 28wks |         57        0.33      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     17,415      100.00\n",
      "\n",
      ". recode n504 (-1=.)\n",
      "(n504: 3 changes made)\n",
      "\n",
      ".     rename n504 ncds_parity\n",
      "\n",
      ".     label variable ncds_parity \"NCDS Parity at Birth\"\n",
      "\n",
      ".     _strip_labels ncds_parity\n",
      "\n",
      ".     replace ncds_parity = (ncds_parity-1)\n",
      "(17,412 real changes made)\n",
      "\n",
      ".     tab ncds_parity\n",
      "\n",
      "NCDS Parity |\n",
      "   at Birth |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |      6,396       36.73       36.73\n",
      "          1 |      5,364       30.81       67.54\n",
      "          2 |      2,730       15.68       83.22\n",
      "          3 |      1,357        7.79       91.01\n",
      "          4 |        705        4.05       95.06\n",
      "          5 |        391        2.25       97.31\n",
      "          6 |        216        1.24       98.55\n",
      "          7 |        129        0.74       99.29\n",
      "          8 |         67        0.38       99.67\n",
      "          9 |         57        0.33      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,412      100.00\n",
      "\n",
      ". \n",
      ". *Region at the first survey\n",
      "\n",
      ". tab n0region\n",
      "\n",
      "     Region at PMS |\n",
      "    (1958) - Birth |      Freq.     Percent        Cum.\n",
      "-------------------+-----------------------------------\n",
      "    -2. Not in PMS |      1,141        6.15        6.15\n",
      "          1. North |      1,234        6.65       12.80\n",
      "     2. North West |      2,295       12.37       25.17\n",
      "   3. E & W.Riding |      1,433        7.72       32.89\n",
      " 4. North Midlands |      1,299        7.00       39.89\n",
      "       5. Midlands |      1,648        8.88       48.77\n",
      "           6. East |      1,242        6.69       55.46\n",
      "     7. South East |      3,445       18.56       74.03\n",
      "          8. South |        955        5.15       79.17\n",
      "     9. South West |        966        5.21       84.38\n",
      "         10. Wales |        914        4.93       89.30\n",
      "      11. Scotland |      1,985       10.70      100.00\n",
      "-------------------+-----------------------------------\n",
      "             Total |     18,557      100.00\n",
      "\n",
      ".     rename n0region ncds_region\n",
      "\n",
      ".     recode ncds_region (-2=.)\n",
      "(ncds_region: 1141 changes made)\n",
      "\n",
      ".     tab ncds_region\n",
      "\n",
      "     Region at PMS |\n",
      "    (1958) - Birth |      Freq.     Percent        Cum.\n",
      "-------------------+-----------------------------------\n",
      "          1. North |      1,234        7.09        7.09\n",
      "     2. North West |      2,295       13.18       20.26\n",
      "   3. E & W.Riding |      1,433        8.23       28.49\n",
      " 4. North Midlands |      1,299        7.46       35.95\n",
      "       5. Midlands |      1,648        9.46       45.41\n",
      "           6. East |      1,242        7.13       52.54\n",
      "     7. South East |      3,445       19.78       72.32\n",
      "          8. South |        955        5.48       77.81\n",
      "     9. South West |        966        5.55       83.35\n",
      "         10. Wales |        914        5.25       88.60\n",
      "      11. Scotland |      1,985       11.40      100.00\n",
      "-------------------+-----------------------------------\n",
      "             Total |     17,416      100.00\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path1\\ARCHIVE\\NCDS\\S1-3\\ncds0123.dta, clear\n",
    "\n",
    "keep ncdsid n0region n553 n545 n504 \n",
    "\n",
    "numlabel n0region n553 n545 n504 , add\n",
    "\n",
    "*Whether the cohort member's mother is married.\n",
    "tab n545\n",
    "capture drop ncds_married\n",
    "    gen ncds_married = .\n",
    "    replace ncds_married = 1 if (n545==4)\n",
    "    replace ncds_married = 0 if (n545==1)|(n545==2)|(n545==3)|(n545==5)\n",
    "    label variable ncds_married \"NCDS Mother married at Cohort Member's Birth\"\n",
    "    label define yesno 1 \"Yes\" 0 \"No\"\n",
    "    label values ncds_married yesno\n",
    "    tab ncds_married\n",
    "    drop n545\n",
    "\n",
    "*The cohort member's mother's age at the cohort member's birth\n",
    "tab n553\n",
    "recode n553 (-1=.)\n",
    "rename n553 ncds_mumagebirth\n",
    "label variable ncds_mumagebirth \"NCDS Mother's Age at Cohort Member's Birth\"\n",
    "tab ncds_mumagebirth\n",
    "\n",
    "*Parity at the cohort member's birth\n",
    "tab n504\n",
    "recode n504 (-1=.)\n",
    "    rename n504 ncds_parity\n",
    "    label variable ncds_parity \"NCDS Parity at Birth\"\n",
    "    _strip_labels ncds_parity\n",
    "    replace ncds_parity = (ncds_parity-1)\n",
    "    tab ncds_parity\n",
    "\n",
    "*Region at the first survey\n",
    "tab n0region\n",
    "    rename n0region ncds_region\n",
    "    recode ncds_region (-2=.)\n",
    "    tab ncds_region\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". sort ncdsid\n",
      "\n",
      ". \n",
      ". save $path3\\temp5.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp5.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp5.dta saved\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "sort ncdsid\n",
    "\n",
    "save $path3\\temp5.dta, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Merge all these pieces of information together to make a single NCDS working data file."
   ]
  },
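  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The merges in this step are checked by eye with `duplicates report`. As a hedged alternative sketch (not the code run in this notebook), the expected match pattern could be asserted directly with the standard `assert()` and `nogenerate` options of `merge`, so that the merge stops with an error if the pattern is violated:\n",
    "\n",
    "```stata\n",
    "* Sketch only: fail loudly if the match pattern is not the one expected\n",
    "use $path3\\temp1.dta, clear\n",
    "\n",
    "* Every case is expected to match\n",
    "merge 1:1 ncdsid using $path3\\temp2.dta, assert(match) nogenerate\n",
    "\n",
    "* Some master cases are expected to have no record in the using file\n",
    "merge 1:1 ncdsid using $path3\\temp3.dta, assert(master match) nogenerate\n",
    "\n",
    "* Confirm that ncdsid uniquely identifies observations\n",
    "isid ncdsid\n",
    "```\n",
    "\n",
    "Here `nogenerate` suppresses the `_merge` variable, which removes the need for a separate `drop _merge` after each merge."
   ]
  },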
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path3\\temp1.dta, clear\n",
      "\n",
      ".     sort ncdsid\n",
      "\n",
      ".     merge 1:1 ncdsid using $path3\\temp2.dta\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                             0\n",
      "    matched                            18,558  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     duplicates report ncdsid\n",
      "\n",
      "Duplicates in terms of ncdsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        18558             0\n",
      "--------------------------------------\n",
      "\n",
      ".     sort ncdsid\n",
      "\n",
      ".     merge 1:1 ncdsid using $path3\\temp3.dta\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         3,221\n",
      "        from master                     3,221  (_merge==1)\n",
      "        from using                          0  (_merge==2)\n",
      "\n",
      "    matched                            15,337  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     duplicates report ncdsid\n",
      "\n",
      "Duplicates in terms of ncdsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        18558             0\n",
      "--------------------------------------\n",
      "\n",
      ".     sort ncdsid\n",
      "\n",
      ".     merge 1:1 ncdsid using $path3\\temp4.dta\n",
      "(label yesno already defined)\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                             0\n",
      "    matched                            18,558  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     duplicates report ncdsid\n",
      "\n",
      "Duplicates in terms of ncdsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        18558             0\n",
      "--------------------------------------\n",
      "\n",
      ".     sort ncdsid\n",
      "\n",
      ".     merge 1:1 ncdsid using $path3\\temp5.dta\n",
      "(label yesno already defined)\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                             0\n",
      "    matched                            18,558  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     duplicates report ncdsid\n",
      "\n",
      "Duplicates in terms of ncdsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        18558             0\n",
      "--------------------------------------\n",
      "\n",
      ".     sort ncdsid \n",
      "\n",
      ".     \n",
      ". capture drop cohort\n",
      "\n",
      ".     gen cohort=1\n",
      "\n",
      ".     label variable cohort \"Cohort\"\n",
      "\n",
      ".     label define cohort 1 \"NCDS\" 2 \"BCS\", replace\n",
      "\n",
      ".     label values cohort cohort\n",
      "\n",
      ".     tab cohort, mi\n",
      "\n",
      "     Cohort |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "       NCDS |     18,558      100.00      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     18,558      100.00\n",
      "\n",
      ". \n",
      ". save $path2\\NCDS_MAIN.dta, replace\n",
      "file F:\\Data\\MYDATA\\WORK\\NCDS_MAIN.dta saved\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path3\\temp1.dta, clear\n",
    "    sort ncdsid\n",
    "    merge 1:1 ncdsid using $path3\\temp2.dta\n",
    "    drop _merge\n",
    "    duplicates report ncdsid\n",
    "    sort ncdsid\n",
    "    merge 1:1 ncdsid using $path3\\temp3.dta\n",
    "    drop _merge\n",
    "    duplicates report ncdsid\n",
    "    sort ncdsid\n",
    "    merge 1:1 ncdsid using $path3\\temp4.dta\n",
    "    drop _merge\n",
    "    duplicates report ncdsid\n",
    "    sort ncdsid\n",
    "    merge 1:1 ncdsid using $path3\\temp5.dta\n",
    "    drop _merge\n",
    "    duplicates report ncdsid\n",
    "    sort ncdsid \n",
    "    \n",
    "capture drop cohort\n",
    "    gen cohort=1\n",
    "    label variable cohort \"Cohort\"\n",
    "    label define cohort 1 \"NCDS\" 2 \"BCS\", replace\n",
    "    label values cohort cohort\n",
    "    tab cohort, mi\n",
    "\n",
    "save $path2\\NCDS_MAIN.dta, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
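  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The repeated merge-and-check steps in the cell above could also be written as a loop. The following is only a minimal sketch on toy data, not the code used for the analysis: the variable names (`id`, `x1`, `x2`), the tempfile names and the number of files are all hypothetical. It uses the `assert(match)` option of `merge`, so that Stata stops with an error if any case fails to match, and `isid` to confirm that the identifier is still unique.\n",
    "\n",
    "```stata\n",
    "* Toy base file with three hypothetical cases\n",
    "clear\n",
    "set obs 3\n",
    "gen id = _n\n",
    "tempfile base\n",
    "save `base'\n",
    "\n",
    "* Two toy files to merge in, each adding one variable\n",
    "forvalues i = 1/2 {\n",
    "    clear\n",
    "    set obs 3\n",
    "    gen id = _n\n",
    "    gen x`i' = id * `i'\n",
    "    tempfile part`i'\n",
    "    save `part`i''\n",
    "}\n",
    "\n",
    "* Merge each file in turn; assert(match) stops with an error on any unmatched case\n",
    "use `base', clear\n",
    "forvalues i = 1/2 {\n",
    "    merge 1:1 id using `part`i'', assert(match) nogenerate\n",
    "}\n",
    "\n",
    "* Confirm that id still uniquely identifies cases\n",
    "isid id\n",
    "list\n",
    "```\n",
    "\n",
    "In the notebook itself the explicit version has the advantage that every merge table and `duplicates report` table is visible in the output."
   ]
  },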
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Delete the temporary data files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". erase $path3\\temp1.dta\n",
      "\n",
      ". erase $path3\\temp2.dta\n",
      "\n",
      ". erase $path3\\temp3.dta\n",
      "\n",
      ". erase $path3\\temp4.dta\n",
      "\n",
      ". erase $path3\\temp5.dta\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "erase $path3\\temp1.dta\n",
    "erase $path3\\temp2.dta\n",
    "erase $path3\\temp3.dta\n",
    "erase $path3\\temp4.dta\n",
    "erase $path3\\temp5.dta\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Preparation of BCS Datasets  <a class=\"anchor\" id=\"bcsprep\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Open the raw BCS data file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path1\\ARCHIVE\\BCS\\S1\\bcs7072a.dta, clear\n",
      "\n",
      ". keep bcsid a0009 a0010 a0014 a0018 a0255\n",
      "\n",
      ". \n",
      ". count\n",
      "  17,196\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path1\\ARCHIVE\\BCS\\S1\\bcs7072a.dta, clear\n",
    "keep bcsid a0009 a0010 a0014 a0018 a0255\n",
    "\n",
    "count\n",
    "\n",
    "*return to jupyter"
   ]
  },
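  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we code father's and mother's RGSC (information collected in the age 0 survey). We do not use this RGSC measure in our main analysis, but we code it here because it may be used in producing the weights and in the multiple imputation. Note that the mother's source variable (a0018) combines SC 1 and SC 2 into a single category, so applying the father's `rgsc` value labels to it shifts every label by one class: in the cross-tabulation below, the category labelled \"II\" contains mothers in SC 3 NM."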
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Age 0 Parental Social Class\n",
      "\n",
      ". \n",
      ". *father's RGSC Age 0\n",
      "\n",
      ". tab a0014\n",
      "\n",
      "     Social |\n",
      "   Class of |\n",
      "  Father in |\n",
      "       1970 |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "    NK / NS |         97        0.56        0.56\n",
      "       SC 1 |        820        4.77        5.33\n",
      "       SC 2 |      1,906       11.08       16.42\n",
      "    SC 3 NM |      1,924       11.19       27.61\n",
      "     SC 3 M |      7,544       43.87       71.48\n",
      "       SC 4 |      2,473       14.38       85.86\n",
      "       SC 5 |      1,106        6.43       92.29\n",
      "      Other |        502        2.92       95.21\n",
      "Unsupported |        824        4.79      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,196      100.00\n",
      "\n",
      ". capture drop bcs0_olddadrgsc\n",
      "\n",
      ".     gen bcs0_olddadrgsc = a0014\n",
      "\n",
      ".     recode bcs0_olddadrgsc (-2=.) (7=.) (8=.)\n",
      "(bcs0_olddadrgsc: 1423 changes made)\n",
      "\n",
      ".     label variable bcs0_olddadrgsc \"BCS Age 0 Dad RGSC Old Coding\"\n",
      "\n",
      ".     label define rgsc 1 \"I\" 2 \"II\" 3 \"III NM\" 4 \"III M\" 5 \"IV\" 6 \"V\"\n",
      "\n",
      ".     label values bcs0_olddadrgsc rgsc\n",
      "\n",
      ".     tab bcs0_olddadrgsc a0014\n",
      "\n",
      " BCS Age 0 |\n",
      "  Dad RGSC |                  Social Class of Father in 1970\n",
      "Old Coding |      SC 1       SC 2    SC 3 NM     SC 3 M       SC 4       SC 5 |     Total\n",
      "-----------+------------------------------------------------------------------+----------\n",
      "         I |       820          0          0          0          0          0 |       820 \n",
      "        II |         0      1,906          0          0          0          0 |     1,906 \n",
      "    III NM |         0          0      1,924          0          0          0 |     1,924 \n",
      "     III M |         0          0          0      7,544          0          0 |     7,544 \n",
      "        IV |         0          0          0          0      2,473          0 |     2,473 \n",
      "         V |         0          0          0          0          0      1,106 |     1,106 \n",
      "-----------+------------------------------------------------------------------+----------\n",
      "     Total |       820      1,906      1,924      7,544      2,473      1,106 |    15,773 \n",
      "\n",
      "\n",
      ".     drop a0014\n",
      "\n",
      ". \n",
      ". *mother's RGSC Age 0\n",
      "\n",
      ". tab a0018\n",
      "\n",
      "    Mothers |\n",
      "     Social |\n",
      "   Class in |\n",
      "       1970 |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "  Not Known |      1,508        8.77        8.77\n",
      "   SC 1 & 2 |      1,466        8.53       17.29\n",
      "    SC 3 NM |      4,682       27.23       44.52\n",
      "     SC 3 M |        841        4.89       49.41\n",
      "       SC 4 |      3,276       19.05       68.46\n",
      "       SC 5 |        211        1.23       69.69\n",
      "      Other |        108        0.63       70.32\n",
      " Housewives |      5,104       29.68      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,196      100.00\n",
      "\n",
      ". capture drop bcs0_oldmumrgsc\n",
      "\n",
      ".     gen bcs0_oldmumrgsc = a0018\n",
      "\n",
      ".     recode bcs0_oldmumrgsc  (-2=.) (6=.) (7=.)\n",
      "(bcs0_oldmumrgsc: 6720 changes made)\n",
      "\n",
      ".     label variable bcs0_oldmumrgsc \"BCS Age 0 Mum RGSC Old Coding\"\n",
      "\n",
      ".     label values bcs0_oldmumrgsc rgsc\n",
      "\n",
      ".     tab bcs0_oldmumrgsc a0018\n",
      "\n",
      " BCS Age 0 |\n",
      "  Mum RGSC |              Mothers Social Class in 1970\n",
      "Old Coding |  SC 1 & 2    SC 3 NM     SC 3 M       SC 4       SC 5 |     Total\n",
      "-----------+-------------------------------------------------------+----------\n",
      "         I |     1,466          0          0          0          0 |     1,466 \n",
      "        II |         0      4,682          0          0          0 |     4,682 \n",
      "    III NM |         0          0        841          0          0 |       841 \n",
      "     III M |         0          0          0      3,276          0 |     3,276 \n",
      "        IV |         0          0          0          0        211 |       211 \n",
      "-----------+-------------------------------------------------------+----------\n",
      "     Total |     1,466      4,682        841      3,276        211 |    10,476 \n",
      "\n",
      "\n",
      ".     drop a0018\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". save $path3\\temp1.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp1.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp1.dta saved\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Age 0 Parental Social Class\n",
    "\n",
    "*father's RGSC Age 0\n",
    "tab a0014\n",
    "capture drop bcs0_olddadrgsc\n",
    "    gen bcs0_olddadrgsc = a0014\n",
    "    *Set -2 (NK / NS), 7 (Other) and 8 (Unsupported) to missing\n",
    "    recode bcs0_olddadrgsc (-2=.) (7=.) (8=.)\n",
    "    label variable bcs0_olddadrgsc \"BCS Age 0 Dad RGSC Old Coding\"\n",
    "    label define rgsc 1 \"I\" 2 \"II\" 3 \"III NM\" 4 \"III M\" 5 \"IV\" 6 \"V\", replace\n",
    "    label values bcs0_olddadrgsc rgsc\n",
    "    tab bcs0_olddadrgsc a0014\n",
    "    drop a0014\n",
    "\n",
    "*mother's RGSC Age 0\n",
    "tab a0018\n",
    "capture drop bcs0_oldmumrgsc\n",
    "    gen bcs0_oldmumrgsc = a0018\n",
    "    *Set -2 (Not Known), 6 (Other) and 7 (Housewives) to missing\n",
    "    recode bcs0_oldmumrgsc  (-2=.) (6=.) (7=.)\n",
    "    label variable bcs0_oldmumrgsc \"BCS Age 0 Mum RGSC Old Coding\"\n",
    "    label values bcs0_oldmumrgsc rgsc\n",
    "    tab bcs0_oldmumrgsc a0018\n",
    "    drop a0018\n",
    "\n",
    "sort bcsid\n",
    "save $path3\\temp1.dta, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
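  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cross-tabulation of `bcs0_oldmumrgsc` by a0018 in the output above shows that code 1 of the mother's variable pools SC 1 and SC 2, so its codes do not line up with the father's `rgsc` value label. One possible way of labelling such a variable is to give it a value label of its own. The following is a minimal sketch on toy data and is not part of our analysis files: the names `mumclass` and `mumrgsc` are hypothetical, and the toy codes simply mirror the categories listed in the a0018 table.\n",
    "\n",
    "```stata\n",
    "* Toy data: one case for each code seen in the a0018 table\n",
    "clear\n",
    "input byte a0018\n",
    "-2\n",
    "1\n",
    "2\n",
    "3\n",
    "4\n",
    "5\n",
    "6\n",
    "7\n",
    "end\n",
    "\n",
    "* Same missing-value recode as used for bcs0_oldmumrgsc\n",
    "gen mumclass = a0018\n",
    "recode mumclass (-2=.) (6=.) (7=.)\n",
    "\n",
    "* Hypothetical label set in which code 1 is the pooled SC 1 & 2 category\n",
    "label define mumrgsc 1 \"I & II\" 2 \"III NM\" 3 \"III M\" 4 \"IV\" 5 \"V\", replace\n",
    "label values mumclass mumrgsc\n",
    "\n",
    "* Check the labels against the original codes\n",
    "tab mumclass a0018, mi\n",
    "```\n",
    "\n",
    "Because the variable is only a candidate for the weights and the multiple imputation, the labelling affects how tables read rather than the values themselves."
   ]
  },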
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Here we code father's education. As above, we code this in line with the method used in Cheung and Egerton (2007, pp. 206-207).\n",
    "\n",
    "We use variable e196, which comes from the age 5 survey. This is question E3, which asks:\n",
    "How many completed years of full-time education did the present parents have after leaving school? (e.g. college of education, polytechnic, university etc.)\n",
    "\n",
    "This variable is coded in the data as the number of completed years after age 15."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path1\\ARCHIVE\\BCS\\S2\\f699b.dta, clear\n",
      "\n",
      ". keep bcsid e008 e009 e189a e189b e190 e191 e192 e193 e194 e195 e196 e245\n",
      "\n",
      ". numlabel, add\n",
      "\n",
      ". \n",
      ". quietly mvdecode e008 e009 e189a e189b e190 e191 e192 e193 e194 e195 e196 e245, mv(-2=. \\ -1=. \\ -6=. \\ -3=. \\ 9=.)\n",
      "\n",
      ". \n",
      ". *Father's Education\n",
      "\n",
      ". \n",
      ". tab e196\n",
      "\n",
      "  Years of Ft Educ |\n",
      "         After Age |\n",
      "         15-Father |      Freq.     Percent        Cum.\n",
      "-------------------+-----------------------------------\n",
      "           0. None |      8,018       65.03       65.03\n",
      "                 1 |      1,615       13.10       78.13\n",
      "                 2 |        784        6.36       84.49\n",
      "                 3 |        543        4.40       88.90\n",
      "                 4 |        253        2.05       90.95\n",
      "                 5 |        293        2.38       93.32\n",
      "                 6 |        346        2.81       96.13\n",
      "                 7 |        278        2.25       98.39\n",
      "                 8 |        152        1.23       99.62\n",
      "                10 |         41        0.33       99.95\n",
      "                11 |          3        0.02       99.98\n",
      "                12 |          1        0.01       99.98\n",
      "                13 |          1        0.01       99.99\n",
      "                19 |          1        0.01      100.00\n",
      "-------------------+-----------------------------------\n",
      "             Total |     12,329      100.00\n",
      "\n",
      ". capture drop bcs_paed_cat\n",
      "\n",
      ". gen bcs_paed_cat = .\n",
      "(13,135 missing values generated)\n",
      "\n",
      ". *Left school at age 15\n",
      "\n",
      ". *No years completed after age 15\n",
      "\n",
      ". replace bcs_paed_cat = 1 if (e196==0)\n",
      "(8,018 real changes made)\n",
      "\n",
      ". *Left school at age 16, 17 or 18\n",
      "\n",
      ". *This one to three years after age 15\n",
      "\n",
      ". replace bcs_paed_cat = 2 if ((e196>=1)&(e196<=3))\n",
      "(2,942 real changes made)\n",
      "\n",
      ". *Left school at age 19 or 20\n",
      "\n",
      ". *This is 4 or 5 years after age 15\n",
      "\n",
      ". replace bcs_paed_cat = 3 if ((e196>=4)&(e196<=5))\n",
      "(546 real changes made)\n",
      "\n",
      ". *Left school at age 21+\n",
      "\n",
      ". *This is 6 or more years after age 15\n",
      "\n",
      ". replace bcs_paed_cat = 4 if ((e196>=6)&(e196<=19))\n",
      "(823 real changes made)\n",
      "\n",
      ". tab bcs_paed_cat\n",
      "\n",
      "bcs_paed_ca |\n",
      "          t |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |      8,018       65.03       65.03\n",
      "          2 |      2,942       23.86       88.90\n",
      "          3 |        546        4.43       93.32\n",
      "          4 |        823        6.68      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     12,329      100.00\n",
      "\n",
      ". label define ed_cat 1 \"Comp\" 2 \"Comp+1-3\" 3 \"Comp+4-5\" 4 \"Comp+6+\", replace\n",
      "\n",
      ". label values bcs_paed_cat ed_cat\n",
      "\n",
      ". label variable bcs_paed_cat \"BCS Father's Education Categories\"\n",
      "\n",
      ". tab bcs_paed_cat\n",
      "\n",
      "        BCS |\n",
      "   Father's |\n",
      "  Education |\n",
      " Categories |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "       Comp |      8,018       65.03       65.03\n",
      "   Comp+1-3 |      2,942       23.86       88.90\n",
      "   Comp+4-5 |        546        4.43       93.32\n",
      "    Comp+6+ |        823        6.68      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     12,329      100.00\n",
      "\n",
      ". \n",
      ". tab e196 bcs_paed_cat, mi\n",
      "\n",
      "  Years of Ft Educ |\n",
      "         After Age |           BCS Father's Education Categories\n",
      "         15-Father |      Comp   Comp+1-3   Comp+4-5    Comp+6+          . |     Total\n",
      "-------------------+-------------------------------------------------------+----------\n",
      "           0. None |     8,018          0          0          0          0 |     8,018 \n",
      "                 1 |         0      1,615          0          0          0 |     1,615 \n",
      "                 2 |         0        784          0          0          0 |       784 \n",
      "                 3 |         0        543          0          0          0 |       543 \n",
      "                 4 |         0          0        253          0          0 |       253 \n",
      "                 5 |         0          0        293          0          0 |       293 \n",
      "                 6 |         0          0          0        346          0 |       346 \n",
      "                 7 |         0          0          0        278          0 |       278 \n",
      "                 8 |         0          0          0        152          0 |       152 \n",
      "                10 |         0          0          0         41          0 |        41 \n",
      "                11 |         0          0          0          3          0 |         3 \n",
      "                12 |         0          0          0          1          0 |         1 \n",
      "                13 |         0          0          0          1          0 |         1 \n",
      "                19 |         0          0          0          1          0 |         1 \n",
      "                 . |         0          0          0          0        806 |       806 \n",
      "-------------------+-------------------------------------------------------+----------\n",
      "             Total |     8,018      2,942        546        823        806 |    13,135 \n",
      "\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path1\\ARCHIVE\\BCS\\S2\\f699b.dta, clear\n",
    "keep bcsid e008 e009 e189a e189b e190 e191 e192 e193 e194 e195 e196 e245\n",
    "numlabel, add\n",
    "\n",
    "quietly mvdecode e008 e009 e189a e189b e190 e191 e192 e193 e194 e195 e196 e245, mv(-2=. \\ -1=. \\ -6=. \\ -3=. \\ 9=.)\n",
    "\n",
    "*Father's Education\n",
    "\n",
    "tab e196\n",
    "capture drop bcs_paed_cat\n",
    "gen bcs_paed_cat = .\n",
    "*Left school at age 15\n",
    "*No years completed after age 15\n",
    "replace bcs_paed_cat = 1 if (e196==0)\n",
    "*Left school at age 16, 17 or 18\n",
    "*This is one to three years after age 15\n",
    "replace bcs_paed_cat = 2 if ((e196>=1)&(e196<=3))\n",
    "*Left school at age 19 or 20\n",
    "*This is 4 or 5 years after age 15\n",
    "replace bcs_paed_cat = 3 if ((e196>=4)&(e196<=5))\n",
    "*Left school at age 21+\n",
    "*This is 6 or more years after age 15\n",
    "replace bcs_paed_cat = 4 if ((e196>=6)&(e196<=19))\n",
    "tab bcs_paed_cat\n",
    "label define ed_cat 1 \"Comp\" 2 \"Comp+1-3\" 3 \"Comp+4-5\" 4 \"Comp+6+\", replace\n",
    "label values bcs_paed_cat ed_cat\n",
    "label variable bcs_paed_cat \"BCS Father's Education Categories\"\n",
    "tab bcs_paed_cat\n",
    "\n",
    "tab e196 bcs_paed_cat, mi\n",
    "\n",
    "*return to jupyter"
   ]
  },
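  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same banding of years into education categories can be expressed in a single `recode` statement. This is only a sketch on toy data, offered as an alternative way of writing the step above rather than the code we ran: the variable `years`, the new variable `ed_band` and the label name `ed_band_lbl` are hypothetical. The `max` keyword in `recode` refers to the largest non-missing value, so missing values stay missing.\n",
    "\n",
    "```stata\n",
    "* Toy data: hypothetical years of full-time education after age 15\n",
    "clear\n",
    "input byte years\n",
    "0\n",
    "1\n",
    "3\n",
    "4\n",
    "5\n",
    "6\n",
    "19\n",
    ".\n",
    "end\n",
    "\n",
    "* Band the years into the four categories in one step\n",
    "recode years (0=1) (1/3=2) (4/5=3) (6/max=4), gen(ed_band)\n",
    "\n",
    "label define ed_band_lbl 1 \"Comp\" 2 \"Comp+1-3\" 3 \"Comp+4-5\" 4 \"Comp+6+\", replace\n",
    "label values ed_band ed_band_lbl\n",
    "\n",
    "* Check every value of years against its category\n",
    "tab years ed_band, mi\n",
    "```\n",
    "\n",
    "The step-by-step `replace` version used in the notebook has the advantage that the number of cases changed at each step is printed in the log."
   ]
  },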
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we code mother's education. Again we use the method described in Cheung and Egerton (2007, pp. 206-207).\n",
    "\n",
    "We use variable e195, which comes from the age 5 survey. This is question E3, which asks: How many completed years of full-time education did the present parents have after leaving school? (e.g. college of education, polytechnic, university etc.)\n",
    "\n",
    "This variable is coded in the data as the number of completed years after age 15.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Mother's Education Categories\n",
      "\n",
      ". tab e195\n",
      "\n",
      "  Years of Ft Educ |\n",
      "         After Age |\n",
      "         15-Mother |      Freq.     Percent        Cum.\n",
      "-------------------+-----------------------------------\n",
      "           0. None |      8,515       65.34       65.34\n",
      "                 1 |      2,015       15.46       80.80\n",
      "                 2 |      1,020        7.83       88.63\n",
      "                 3 |        547        4.20       92.83\n",
      "                 4 |        237        1.82       94.64\n",
      "                 5 |        245        1.88       96.52\n",
      "                 6 |        280        2.15       98.67\n",
      "                 7 |        122        0.94       99.61\n",
      "                 8 |         45        0.35       99.95\n",
      "                10 |          5        0.04       99.99\n",
      "                11 |          1        0.01      100.00\n",
      "-------------------+-----------------------------------\n",
      "             Total |     13,032      100.00\n",
      "\n",
      ". capture drop bcs_moed_cat\n",
      "\n",
      ". gen bcs_moed_cat = .\n",
      "(13,135 missing values generated)\n",
      "\n",
      ". *Left school at age 15\n",
      "\n",
      ". *No years completed after age 15\n",
      "\n",
      ". replace bcs_moed_cat = 1 if (e195==0)\n",
      "(8,515 real changes made)\n",
      "\n",
      ". *Left school at age 16, 17 or 18\n",
      "\n",
      ". *This one to three years after age 15\n",
      "\n",
      ". replace bcs_moed_cat = 2 if ((e195>=1)&(e195<=3))\n",
      "(3,582 real changes made)\n",
      "\n",
      ". *Left school at age 19 or 20\n",
      "\n",
      ". *This is 4 or 5 years after age 15\n",
      "\n",
      ". replace bcs_moed_cat = 3 if ((e195>=4)&(e195<=5))\n",
      "(482 real changes made)\n",
      "\n",
      ". *Left school at age 21+\n",
      "\n",
      ". *This is 6 or more years after age 15\n",
      "\n",
      ". replace bcs_moed_cat = 4 if ((e195>=6)&(e195<=19))\n",
      "(453 real changes made)\n",
      "\n",
      ". tab bcs_moed_cat\n",
      "\n",
      "bcs_moed_ca |\n",
      "          t |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |      8,515       65.34       65.34\n",
      "          2 |      3,582       27.49       92.83\n",
      "          3 |        482        3.70       96.52\n",
      "          4 |        453        3.48      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     13,032      100.00\n",
      "\n",
      ". label values bcs_moed_cat ed_cat\n",
      "\n",
      ". label variable bcs_moed_cat \"BCS Mother's Education Categories\"\n",
      "\n",
      ". tab bcs_moed_cat\n",
      "\n",
      "        BCS |\n",
      "   Mother's |\n",
      "  Education |\n",
      " Categories |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "       Comp |      8,515       65.34       65.34\n",
      "   Comp+1-3 |      3,582       27.49       92.83\n",
      "   Comp+4-5 |        482        3.70       96.52\n",
      "    Comp+6+ |        453        3.48      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     13,032      100.00\n",
      "\n",
      ". \n",
      ". tab e195 bcs_moed_cat, mi\n",
      "\n",
      "  Years of Ft Educ |\n",
      "         After Age |           BCS Mother's Education Categories\n",
      "         15-Mother |      Comp   Comp+1-3   Comp+4-5    Comp+6+          . |     Total\n",
      "-------------------+-------------------------------------------------------+----------\n",
      "           0. None |     8,515          0          0          0          0 |     8,515 \n",
      "                 1 |         0      2,015          0          0          0 |     2,015 \n",
      "                 2 |         0      1,020          0          0          0 |     1,020 \n",
      "                 3 |         0        547          0          0          0 |       547 \n",
      "                 4 |         0          0        237          0          0 |       237 \n",
      "                 5 |         0          0        245          0          0 |       245 \n",
      "                 6 |         0          0          0        280          0 |       280 \n",
      "                 7 |         0          0          0        122          0 |       122 \n",
      "                 8 |         0          0          0         45          0 |        45 \n",
      "                10 |         0          0          0          5          0 |         5 \n",
      "                11 |         0          0          0          1          0 |         1 \n",
      "                 . |         0          0          0          0        103 |       103 \n",
      "-------------------+-------------------------------------------------------+----------\n",
      "             Total |     8,515      3,582        482        453        103 |    13,135 \n",
      "\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Mother's Education Categories\n",
    "tab e195\n",
    "capture drop bcs_moed_cat\n",
    "gen bcs_moed_cat = .\n",
    "*Left school at age 15\n",
    "*No years completed after age 15\n",
    "replace bcs_moed_cat = 1 if (e195==0)\n",
    "*Left school at age 16, 17 or 18\n",
    "*This is one to three years after age 15\n",
    "replace bcs_moed_cat = 2 if ((e195>=1)&(e195<=3))\n",
    "*Left school at age 19 or 20\n",
    "*This is 4 or 5 years after age 15\n",
    "replace bcs_moed_cat = 3 if ((e195>=4)&(e195<=5))\n",
    "*Left school at age 21+\n",
    "*This is 6 or more years after age 15\n",
    "replace bcs_moed_cat = 4 if ((e195>=6)&(e195<=19))\n",
    "tab bcs_moed_cat\n",
    "label values bcs_moed_cat ed_cat\n",
    "label variable bcs_moed_cat \"BCS Mother's Education Categories\"\n",
    "tab bcs_moed_cat\n",
    "\n",
    "tab e195 bcs_moed_cat, mi\n",
    "\n",
    "*return to jupyter"
   ]
  },
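  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Again in line with Cheung and Egerton (2007, pp. 206-207), we take the higher of the two parents' education categories to create a parental education level variable. Where only one parent's education is recorded, that parent's category is used; the variable is missing only when both are missing."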
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Highest of the parent's education \n",
      "\n",
      ". \n",
      ". capture drop bcs_parented\n",
      "\n",
      ". egen bcs_parented = rmax(bcs_paed_cat bcs_moed_cat)\n",
      "(47 missing values generated)\n",
      "\n",
      ". tab bcs_parented\n",
      "\n",
      "bcs_parente |\n",
      "          d |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |      6,840       52.26       52.26\n",
      "          2 |      4,413       33.72       85.98\n",
      "          3 |        775        5.92       91.90\n",
      "          4 |      1,060        8.10      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     13,088      100.00\n",
      "\n",
      ". label values bcs_parented ed_cat\n",
      "\n",
      ". label variable bcs_parented \"BCS Parent's Highest Education\"\n",
      "\n",
      ". tab bcs_parented\n",
      "\n",
      "        BCS |\n",
      "   Parent's |\n",
      "    Highest |\n",
      "  Education |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "       Comp |      6,840       52.26       52.26\n",
      "   Comp+1-3 |      4,413       33.72       85.98\n",
      "   Comp+4-5 |        775        5.92       91.90\n",
      "    Comp+6+ |      1,060        8.10      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     13,088      100.00\n",
      "\n",
      ". \n",
      ". tab bcs_parented bcs_paed_cat\n",
      "\n",
      "       BCS |\n",
      "  Parent's |\n",
      "   Highest |      BCS Father's Education Categories\n",
      " Education |      Comp   Comp+1-3   Comp+4-5    Comp+6+ |     Total\n",
      "-----------+--------------------------------------------+----------\n",
      "      Comp |     6,342          0          0          0 |     6,342 \n",
      "  Comp+1-3 |     1,544      2,691          0          0 |     4,235 \n",
      "  Comp+4-5 |        92        150        491          0 |       733 \n",
      "   Comp+6+ |        40        101         55        823 |     1,019 \n",
      "-----------+--------------------------------------------+----------\n",
      "     Total |     8,018      2,942        546        823 |    12,329 \n",
      "\n",
      "\n",
      ". tab bcs_parented bcs_moed_cat\n",
      "\n",
      "       BCS |\n",
      "  Parent's |\n",
      "   Highest |      BCS Mother's Education Categories\n",
      " Education |      Comp   Comp+1-3   Comp+4-5    Comp+6+ |     Total\n",
      "-----------+--------------------------------------------+----------\n",
      "      Comp |     6,805          0          0          0 |     6,805 \n",
      "  Comp+1-3 |     1,371      3,035          0          0 |     4,406 \n",
      "  Comp+4-5 |       187        232        352          0 |       771 \n",
      "   Comp+6+ |       152        315        130        453 |     1,050 \n",
      "-----------+--------------------------------------------+----------\n",
      "     Total |     8,515      3,582        482        453 |    13,032 \n",
      "\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". save $path3\\temp2.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp2.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp2.dta saved\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Highest of the parent's education \n",
    "\n",
    "capture drop bcs_parented\n",
    "egen bcs_parented = rmax(bcs_paed_cat bcs_moed_cat)\n",
    "tab bcs_parented\n",
    "label values bcs_parented ed_cat\n",
    "label variable bcs_parented \"BCS Parent's Highest Education\"\n",
    "tab bcs_parented\n",
    "\n",
    "tab bcs_parented bcs_paed_cat\n",
    "tab bcs_parented bcs_moed_cat\n",
    "\n",
    "sort bcsid\n",
    "save $path3\\temp2.dta, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
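  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The way the row maximum treats missing values explains why only 47 cases are missing on `bcs_parented`, even though 806 fathers and 103 mothers have no education category. A minimal sketch on toy data follows; the variables `pa`, `mo` and `hi` are hypothetical. It uses `rowmax()`, which, as far as we are aware, is the current name of the `rmax()` function used above.\n",
    "\n",
    "```stata\n",
    "* Toy data: hypothetical father (pa) and mother (mo) education categories\n",
    "clear\n",
    "input byte pa byte mo\n",
    "1 3\n",
    "4 2\n",
    ". 2\n",
    "3 .\n",
    ". .\n",
    "end\n",
    "\n",
    "* Row maximum ignores missing values unless every argument is missing\n",
    "egen hi = rowmax(pa mo)\n",
    "list\n",
    "\n",
    "* Only the case with both parents missing is missing on hi\n",
    "count if missing(hi)\n",
    "```\n",
    "\n",
    "In the toy data only the last case, where both values are missing, has a missing maximum."
   ]
  },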
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Here we produce variables indicating which UK country the cohort members lived in at each sweep."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Country at each sweep\n",
      "\n",
      ". use $path1\\ARCHIVE\\BCS\\S1\\bcs1derived.dta, clear\n",
      "\n",
      ". keep BCSID BD1CNTRY\n",
      "\n",
      ". rename BCSID bcsid\n",
      "\n",
      ". tab BD1CNTRY\n",
      "\n",
      "1970: Country of |\n",
      "     Interview   |      Freq.     Percent        Cum.\n",
      "-----------------+-----------------------------------\n",
      "         England |     14,072       81.83       81.83\n",
      "           Wales |        879        5.11       86.94\n",
      "        Scotland |      1,617        9.40       96.35\n",
      "Northern Ireland |        628        3.65      100.00\n",
      "-----------------+-----------------------------------\n",
      "           Total |     17,196      100.00\n",
      "\n",
      ". rename BD1CNTRY bcs0_country\n",
      "\n",
      ". \n",
      ". duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        17196             0\n",
      "--------------------------------------\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". save $path3\\temp3.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp3.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp3.dta saved\n",
      "\n",
      ". \n",
      ". use $path1\\ARCHIVE\\BCS\\S2\\bcs2derived.dta, clear\n",
      "\n",
      ". keep BCSID BD2CNTRY\n",
      "\n",
      ". rename BCSID bcsid\n",
      "\n",
      ". tab BD2CNTRY\n",
      "\n",
      "1975: Country of |\n",
      "     Interview   |      Freq.     Percent        Cum.\n",
      "-----------------+-----------------------------------\n",
      "         England |     11,157       84.94       84.94\n",
      "           Wales |        748        5.69       90.64\n",
      "        Scotland |      1,166        8.88       99.51\n",
      "        Overseas |         64        0.49      100.00\n",
      "-----------------+-----------------------------------\n",
      "           Total |     13,135      100.00\n",
      "\n",
      ". rename BD2CNTRY bcs5_country\n",
      "\n",
      ". \n",
      ". duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        13135             0\n",
      "--------------------------------------\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". save $path3\\temp4.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp4.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp4.dta saved\n",
      "\n",
      ". \n",
      ". use $path1\\ARCHIVE\\BCS\\S3\\bcs3derived.dta, clear\n",
      "\n",
      ". keep BCSID BD3CNTRY\n",
      "\n",
      ". rename BCSID bcsid\n",
      "\n",
      ". tab BD3CNTRY\n",
      "\n",
      "1980: Country of |\n",
      "     Interview   |      Freq.     Percent        Cum.\n",
      "-----------------+-----------------------------------\n",
      "         Unknown |         81        0.54        0.54\n",
      "         England |     12,514       84.13       84.67\n",
      "           Wales |        825        5.55       90.22\n",
      "        Scotland |      1,455        9.78      100.00\n",
      "-----------------+-----------------------------------\n",
      "           Total |     14,875      100.00\n",
      "\n",
      ". rename BD3CNTRY bcs10_country\n",
      "\n",
      ". \n",
      ". duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        18937             0\n",
      "        2 |          166            83\n",
      "--------------------------------------\n",
      "\n",
      ". *There are some duplicates of BCSID - these have missing info in one version\n",
      "\n",
      ". *Drop these\n",
      "\n",
      ". drop if (bcs10_country==.)\n",
      "(4,228 observations deleted)\n",
      "\n",
      ". duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        14875             0\n",
      "--------------------------------------\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". save $path3\\temp5.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp5.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp5.dta saved\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Country at each sweep\n",
    "use $path1\\ARCHIVE\\BCS\\S1\\bcs1derived.dta, clear\n",
    "keep BCSID BD1CNTRY\n",
    "rename BCSID bcsid\n",
    "tab BD1CNTRY\n",
    "rename BD1CNTRY bcs0_country\n",
    "\n",
    "duplicates report bcsid\n",
    "\n",
    "sort bcsid\n",
    "save $path3\\temp3.dta, replace\n",
    "\n",
    "use $path1\\ARCHIVE\\BCS\\S2\\bcs2derived.dta, clear\n",
    "keep BCSID BD2CNTRY\n",
    "rename BCSID bcsid\n",
    "tab BD2CNTRY\n",
    "rename BD2CNTRY bcs5_country\n",
    "\n",
    "duplicates report bcsid\n",
    "\n",
    "sort bcsid\n",
    "save $path3\\temp4.dta, replace\n",
    "\n",
    "use $path1\\ARCHIVE\\BCS\\S3\\bcs3derived.dta, clear\n",
    "keep BCSID BD3CNTRY\n",
    "rename BCSID bcsid\n",
    "tab BD3CNTRY\n",
    "rename BD3CNTRY bcs10_country\n",
    "\n",
    "duplicates report bcsid\n",
    "*There are some duplicates of BCSID - these have missing info in one version\n",
    "*Drop these\n",
    "drop if (bcs10_country==.)\n",
    "duplicates report bcsid\n",
    "\n",
    "sort bcsid\n",
    "save $path3\\temp5.dta, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Here we code more parental occupational information. These are the old RGSC and SEG measures deposited in the BCS datasets. We will not use these variables in our main analyses. These variables are prepared to be potentially used in the production of the weights and in the multiple imputation.\n",
    "\n",
    "As above, we convert the available Socio-Economic Group information to an approximation of the Goldthorpe Schema using the method outlined in Goldthorpe and Jackson (2007).\n",
    "\n",
    "Goldthorpe, J. H., & Jackson, M. (2007). [Intergenerational class mobility in contemporary Britain: political concerns and empirical findings.](http://onlinelibrary.wiley.com/doi/10.1111/j.1468-4446.2007.00165.x/full) The British journal of sociology, 58(4), 525-546.\n",
    "Chicago.\n",
    "\n",
    "This method builds on an approximation developed by Health and McDonald (1987).\n",
    "\n",
    "Heath, A., & McDonald, S. K. (1987). [Social change and the future of the left.](http://onlinelibrary.wiley.com/doi/10.1111/j.1467-923X.1987.tb02624.x/full) The Political Quarterly, 58(4), 364-377.\n"
   ]
  },
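  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch (not part of the analysis), the SEG-to-class mapping applied in the next cell can also be written as a single `recode`. The toy data and the names `seg`, `seg2egp` and `egp_sketch` are assumptions made for illustration only. SEG codes that are not listed, such as 160 and -9, are left missing, as in the cell below.\n",
    "\n",
    "```stata\n",
    "* Sketch only: toy data with hypothetical variable names\n",
    "clear\n",
    "input seg\n",
    "11\n",
    "22\n",
    "60\n",
    "80\n",
    "90\n",
    "120\n",
    "150\n",
    "160\n",
    "-9\n",
    "end\n",
    "\n",
    "* One recode rule per class; any code not listed becomes missing\n",
    "recode seg (11 12 30 40 = 1) (21 22 51 52 = 2) (60 70 = 3) (120 130 140 = 4) (80 = 5) (90 = 6) (100 110 150 = 7) (else = .), generate(seg2egp)\n",
    "\n",
    "label define egp_sketch 1 \"I\" 2 \"II+IVa\" 3 \"III\" 4 \"IVb+c\" 5 \"V\" 6 \"VI\" 7 \"VII\"\n",
    "label values seg2egp egp_sketch\n",
    "\n",
    "* Cross-check the mapping, showing the unmapped codes as missing\n",
    "tab seg seg2egp, missing\n",
    "```\n",
    "\n",
    "Keeping the mapping in one statement makes it easier to compare against the published scheme, while the chain of `replace` commands used in the notebook reports the number of changes for each class separately."
   ]
  },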
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Age 10 parental Social Class\n",
      "\n",
      ". use $path1\\ARCHIVE\\BCS\\S3\\sn3723.dta, clear\n",
      "\n",
      ". keep bcsid c3_4 c3_11 back10p back20p c4_1a c4_2a\n",
      "\n",
      ". \n",
      ". numlabel, add\n",
      "\n",
      ". \n",
      ". *father's RGSC Age 10\n",
      "\n",
      ". tab c3_4\n",
      "\n",
      "  FATHER'S CORRECTED |\n",
      "   SOCIAL CLASS 1980 |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "                1. I |        767        5.53        5.53\n",
      "               2. II |      2,922       21.07       26.60\n",
      "            3. IIINM |      1,123        8.10       34.70\n",
      "             4. IIIM |      5,418       39.07       73.77\n",
      "               5. IV |      1,513       10.91       84.68\n",
      "                6. V |        489        3.53       88.20\n",
      "8. Insufficient data |        267        1.93       90.13\n",
      "          9. No data |      1,369        9.87      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     13,868      100.00\n",
      "\n",
      ". capture drop bcs10_olddadrgsc\n",
      "\n",
      ".     gen bcs10_olddadrgsc = c3_4\n",
      "(1,002 missing values generated)\n",
      "\n",
      ".     recode bcs10_olddadrgsc (8=.) (9=.)\n",
      "(bcs10_olddadrgsc: 1636 changes made)\n",
      "\n",
      ".     label variable bcs10_olddadrgsc \"BCS Age 10 Dad RGSC Old Coding\"\n",
      "\n",
      ".     label define rgsc 1 \"I\" 2 \"II\" 3 \"III NM\" 4 \"III M\" 5 \"IV\" 6 \"V\"\n",
      "\n",
      ".     label values bcs10_olddadrgsc rgsc\n",
      "\n",
      ".     tab bcs10_olddadrgsc c3_4\n",
      "\n",
      "BCS Age 10 |\n",
      "  Dad RGSC |               FATHER'S CORRECTED SOCIAL CLASS 1980\n",
      "Old Coding |      1. I      2. II   3. IIINM    4. IIIM      5. IV       6. V |     Total\n",
      "-----------+------------------------------------------------------------------+----------\n",
      "         I |       767          0          0          0          0          0 |       767 \n",
      "        II |         0      2,922          0          0          0          0 |     2,922 \n",
      "    III NM |         0          0      1,123          0          0          0 |     1,123 \n",
      "     III M |         0          0          0      5,418          0          0 |     5,418 \n",
      "        IV |         0          0          0          0      1,513          0 |     1,513 \n",
      "         V |         0          0          0          0          0        489 |       489 \n",
      "-----------+------------------------------------------------------------------+----------\n",
      "     Total |       767      2,922      1,123      5,418      1,513        489 |    12,232 \n",
      "\n",
      "\n",
      ". \n",
      ". *mother's RGSC Age 10\n",
      "\n",
      ". tab c3_11\n",
      "\n",
      "  MOTHER'S CORRECTED |\n",
      "   SOCIAL CLASS 1980 |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "                1. I |         55        0.40        0.40\n",
      "               2. II |      1,714       12.36       12.76\n",
      "            3. IIINM |      3,192       23.02       35.77\n",
      "             4. IIIM |        908        6.55       42.32\n",
      "               5. IV |      2,759       19.89       62.22\n",
      "                6. V |        908        6.55       68.76\n",
      "8. Insufficient data |        165        1.19       69.95\n",
      "          9. No data |      4,167       30.05      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     13,868      100.00\n",
      "\n",
      ". capture drop bcs10_olddadrgsc\n",
      "\n",
      ".     gen bcs10_olddadrgsc = c3_11\n",
      "(1,002 missing values generated)\n",
      "\n",
      ".     recode bcs10_olddadrgsc (8=.) (9=.)\n",
      "(bcs10_olddadrgsc: 4332 changes made)\n",
      "\n",
      ".     label variable bcs10_olddadrgsc \"BCS Age 10 Dad RGSC Old Coding\"\n",
      "\n",
      ".     label values bcs10_olddadrgsc rgsc\n",
      "\n",
      ".     tab bcs10_olddadrgsc c3_11\n",
      "\n",
      "BCS Age 10 |\n",
      "  Dad RGSC |               MOTHER'S CORRECTED SOCIAL CLASS 1980\n",
      "Old Coding |      1. I      2. II   3. IIINM    4. IIIM      5. IV       6. V |     Total\n",
      "-----------+------------------------------------------------------------------+----------\n",
      "         I |        55          0          0          0          0          0 |        55 \n",
      "        II |         0      1,714          0          0          0          0 |     1,714 \n",
      "    III NM |         0          0      3,192          0          0          0 |     3,192 \n",
      "     III M |         0          0          0        908          0          0 |       908 \n",
      "        IV |         0          0          0          0      2,759          0 |     2,759 \n",
      "         V |         0          0          0          0          0        908 |       908 \n",
      "-----------+------------------------------------------------------------------+----------\n",
      "     Total |        55      1,714      3,192        908      2,759        908 |     9,536 \n",
      "\n",
      "\n",
      ". \n",
      ". *Father's SEG Age 10\n",
      "\n",
      ". tab back10p\n",
      "\n",
      "   FATHER'S CORRECTED |\n",
      "SOCIAL VARS SEG 1980  |      Freq.     Percent        Cum.\n",
      "----------------------+-----------------------------------\n",
      "-9. No code available |      2,755       18.53       18.53\n",
      "                   11 |         58        0.39       18.92\n",
      "                   12 |        853        5.74       24.65\n",
      "                   21 |        443        2.98       27.63\n",
      "                   22 |        864        5.81       33.44\n",
      "                   30 |        164        1.10       34.55\n",
      "                   40 |        583        3.92       38.47\n",
      "                   51 |        614        4.13       42.60\n",
      "                   52 |        273        1.84       44.43\n",
      "                   60 |        616        4.14       48.57\n",
      "                   70 |         61        0.41       48.98\n",
      "                   80 |      1,194        8.03       57.01\n",
      "                   90 |      3,216       21.63       78.64\n",
      "                  100 |      1,226        8.24       86.89\n",
      "                  110 |        443        2.98       89.87\n",
      "                  120 |      1,014        6.82       96.68\n",
      "                  130 |         84        0.56       97.25\n",
      "                  140 |        129        0.87       98.12\n",
      "                  150 |        119        0.80       98.92\n",
      "                  160 |        161        1.08      100.00\n",
      "----------------------+-----------------------------------\n",
      "                Total |     14,870      100.00\n",
      "\n",
      ". capture drop bcs10_dadseg2egp\n",
      "\n",
      ".     gen bcs10_dadseg2egp = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace bcs10_dadseg2egp = 1 if (back10p==11)|(back10p==12)|(back10p==30)|(back10p==40)\n",
      "(1,658 real changes made)\n",
      "\n",
      ".     replace bcs10_dadseg2egp = 2 if (back10p==21)|(back10p==22)|(back10p==51)|(back10p==52)\n",
      "(2,194 real changes made)\n",
      "\n",
      ".     replace bcs10_dadseg2egp = 3 if (back10p==60)|(back10p==70)\n",
      "(677 real changes made)\n",
      "\n",
      ".     replace bcs10_dadseg2egp = 4 if (back10p==120)|(back10p==130)|(back10p==140)\n",
      "(1,227 real changes made)\n",
      "\n",
      ".     replace bcs10_dadseg2egp = 5 if (back10p==80)\n",
      "(1,194 real changes made)\n",
      "\n",
      ".     replace bcs10_dadseg2egp = 6 if (back10p==90)\n",
      "(3,216 real changes made)\n",
      "\n",
      ".     replace bcs10_dadseg2egp = 7 if (back10p==100)|(back10p==110)|(back10p==150)\n",
      "(1,788 real changes made)\n",
      "\n",
      ".     replace bcs10_dadseg2egp = . if (back10p==-9)\n",
      "(0 real changes made)\n",
      "\n",
      ".     label define egp 1 \"I\" 2 \"II+IVa\" 3 \"III\" 4 \"IVb+c\" 5 \"V\" 6 \"VI\" 7 \"VII\"\n",
      "\n",
      ".     label values bcs10_dadseg2egp egp\n",
      "\n",
      ".     label variable bcs10_dadseg2egp \"BCS Age 10 Dad's EGP from SEG\"\n",
      "\n",
      ".     tab back10p bcs10_dadseg2egp\n",
      "\n",
      "   FATHER'S CORRECTED |                        BCS Age 10 Dad's EGP from SEG\n",
      "SOCIAL VARS SEG 1980  |         I     II+IVa        III      IVb+c          V         VI        VII |     Total\n",
      "----------------------+-----------------------------------------------------------------------------+----------\n",
      "                   11 |        58          0          0          0          0          0          0 |        58 \n",
      "                   12 |       853          0          0          0          0          0          0 |       853 \n",
      "                   21 |         0        443          0          0          0          0          0 |       443 \n",
      "                   22 |         0        864          0          0          0          0          0 |       864 \n",
      "                   30 |       164          0          0          0          0          0          0 |       164 \n",
      "                   40 |       583          0          0          0          0          0          0 |       583 \n",
      "                   51 |         0        614          0          0          0          0          0 |       614 \n",
      "                   52 |         0        273          0          0          0          0          0 |       273 \n",
      "                   60 |         0          0        616          0          0          0          0 |       616 \n",
      "                   70 |         0          0         61          0          0          0          0 |        61 \n",
      "                   80 |         0          0          0          0      1,194          0          0 |     1,194 \n",
      "                   90 |         0          0          0          0          0      3,216          0 |     3,216 \n",
      "                  100 |         0          0          0          0          0          0      1,226 |     1,226 \n",
      "                  110 |         0          0          0          0          0          0        443 |       443 \n",
      "                  120 |         0          0          0      1,014          0          0          0 |     1,014 \n",
      "                  130 |         0          0          0         84          0          0          0 |        84 \n",
      "                  140 |         0          0          0        129          0          0          0 |       129 \n",
      "                  150 |         0          0          0          0          0          0        119 |       119 \n",
      "----------------------+-----------------------------------------------------------------------------+----------\n",
      "                Total |     1,658      2,194        677      1,227      1,194      3,216      1,788 |    11,954 \n",
      "\n",
      "\n",
      ". \n",
      ". *Mother's SEG Age 10\n",
      "\n",
      ". tab back20p\n",
      "\n",
      "   MOTHER'S CORRECTED |\n",
      "SOCIAL VARS SEG 1980  |      Freq.     Percent        Cum.\n",
      "----------------------+-----------------------------------\n",
      "-9. No code available |      5,359       36.04       36.04\n",
      "                   11 |         26        0.17       36.21\n",
      "                   12 |         51        0.34       36.56\n",
      "                   21 |        152        1.02       37.58\n",
      "                   22 |        242        1.63       39.21\n",
      "                   30 |         16        0.11       39.31\n",
      "                   40 |         39        0.26       39.58\n",
      "                   51 |      1,155        7.77       47.34\n",
      "                   52 |        216        1.45       48.80\n",
      "                   60 |      2,842       19.11       67.91\n",
      "                   70 |      1,725       11.60       79.51\n",
      "                   80 |        115        0.77       80.28\n",
      "                   90 |        370        2.49       82.77\n",
      "                  100 |      1,139        7.66       90.43\n",
      "                  110 |        901        6.06       96.49\n",
      "                  120 |        356        2.39       98.88\n",
      "                  130 |         10        0.07       98.95\n",
      "                  140 |         21        0.14       99.09\n",
      "                  150 |        133        0.89       99.99\n",
      "                  160 |          2        0.01      100.00\n",
      "----------------------+-----------------------------------\n",
      "                Total |     14,870      100.00\n",
      "\n",
      ". capture drop bcs10_mumseg2egp\n",
      "\n",
      ".     gen bcs10_mumseg2egp = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace bcs10_mumseg2egp = 1 if (back20p==11)|(back20p==12)|(back20p==30)|(back20p==40)\n",
      "(132 real changes made)\n",
      "\n",
      ".     replace bcs10_mumseg2egp = 2 if (back20p==21)|(back20p==22)|(back20p==51)|(back20p==52)\n",
      "(1,765 real changes made)\n",
      "\n",
      ".     replace bcs10_mumseg2egp = 3 if (back20p==60)|(back20p==70)\n",
      "(4,567 real changes made)\n",
      "\n",
      ".     replace bcs10_mumseg2egp = 4 if (back20p==120)|(back20p==130)|(back20p==140)\n",
      "(387 real changes made)\n",
      "\n",
      ".     replace bcs10_mumseg2egp = 5 if (back20p==80)\n",
      "(115 real changes made)\n",
      "\n",
      ".     replace bcs10_mumseg2egp = 6 if (back20p==90)\n",
      "(370 real changes made)\n",
      "\n",
      ".     replace bcs10_mumseg2egp = 7 if (back20p==100)|(back20p==110)|(back20p==150)\n",
      "(2,173 real changes made)\n",
      "\n",
      ".     replace bcs10_mumseg2egp = . if (back20p==-9)\n",
      "(0 real changes made)\n",
      "\n",
      ".     label values bcs10_mumseg2egp egp\n",
      "\n",
      ".     label variable bcs10_mumseg2egp \"BCS Age 10 Mum's EGP from SEG\"\n",
      "\n",
      ".     tab back20p bcs10_mumseg2egp\n",
      "\n",
      "   MOTHER'S CORRECTED |                        BCS Age 10 Mum's EGP from SEG\n",
      "SOCIAL VARS SEG 1980  |         I     II+IVa        III      IVb+c          V         VI        VII |     Total\n",
      "----------------------+-----------------------------------------------------------------------------+----------\n",
      "                   11 |        26          0          0          0          0          0          0 |        26 \n",
      "                   12 |        51          0          0          0          0          0          0 |        51 \n",
      "                   21 |         0        152          0          0          0          0          0 |       152 \n",
      "                   22 |         0        242          0          0          0          0          0 |       242 \n",
      "                   30 |        16          0          0          0          0          0          0 |        16 \n",
      "                   40 |        39          0          0          0          0          0          0 |        39 \n",
      "                   51 |         0      1,155          0          0          0          0          0 |     1,155 \n",
      "                   52 |         0        216          0          0          0          0          0 |       216 \n",
      "                   60 |         0          0      2,842          0          0          0          0 |     2,842 \n",
      "                   70 |         0          0      1,725          0          0          0          0 |     1,725 \n",
      "                   80 |         0          0          0          0        115          0          0 |       115 \n",
      "                   90 |         0          0          0          0          0        370          0 |       370 \n",
      "                  100 |         0          0          0          0          0          0      1,139 |     1,139 \n",
      "                  110 |         0          0          0          0          0          0        901 |       901 \n",
      "                  120 |         0          0          0        356          0          0          0 |       356 \n",
      "                  130 |         0          0          0         10          0          0          0 |        10 \n",
      "                  140 |         0          0          0         21          0          0          0 |        21 \n",
      "                  150 |         0          0          0          0          0          0        133 |       133 \n",
      "----------------------+-----------------------------------------------------------------------------+----------\n",
      "                Total |       132      1,765      4,567        387        115        370      2,173 |     9,509 \n",
      "\n",
      "\n",
      ". \n",
      ". drop c3_4 c3_11 back10p back20p c4_1a c4_2a\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". \n",
      ". save $path3\\temp6.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp6.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp6.dta saved\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Age 10 parental Social Class\n",
    "use $path1\\ARCHIVE\\BCS\\S3\\sn3723.dta, clear\n",
    "keep bcsid c3_4 c3_11 back10p back20p c4_1a c4_2a\n",
    "\n",
    "numlabel, add\n",
    "\n",
    "*father's RGSC Age 10\n",
    "tab c3_4\n",
    "capture drop bcs10_olddadrgsc\n",
    "    gen bcs10_olddadrgsc = c3_4\n",
    "    recode bcs10_olddadrgsc (8=.) (9=.)\n",
    "    label variable bcs10_olddadrgsc \"BCS Age 10 Dad RGSC Old Coding\"\n",
    "    label define rgsc 1 \"I\" 2 \"II\" 3 \"III NM\" 4 \"III M\" 5 \"IV\" 6 \"V\"\n",
    "    label values bcs10_olddadrgsc rgsc\n",
    "    tab bcs10_olddadrgsc c3_4\n",
    "\n",
    "*mother's RGSC Age 10\n",
    "tab c3_11\n",
    "capture drop bcs10_olddadrgsc\n",
    "    gen bcs10_olddadrgsc = c3_11\n",
    "    recode bcs10_olddadrgsc (8=.) (9=.)\n",
    "    label variable bcs10_olddadrgsc \"BCS Age 10 Dad RGSC Old Coding\"\n",
    "    label values bcs10_olddadrgsc rgsc\n",
    "    tab bcs10_olddadrgsc c3_11\n",
    "\n",
    "*Father's SEG Age 10\n",
    "tab back10p\n",
    "capture drop bcs10_dadseg2egp\n",
    "    gen bcs10_dadseg2egp = .\n",
    "    replace bcs10_dadseg2egp = 1 if (back10p==11)|(back10p==12)|(back10p==30)|(back10p==40)\n",
    "    replace bcs10_dadseg2egp = 2 if (back10p==21)|(back10p==22)|(back10p==51)|(back10p==52)\n",
    "    replace bcs10_dadseg2egp = 3 if (back10p==60)|(back10p==70)\n",
    "    replace bcs10_dadseg2egp = 4 if (back10p==120)|(back10p==130)|(back10p==140)\n",
    "    replace bcs10_dadseg2egp = 5 if (back10p==80)\n",
    "    replace bcs10_dadseg2egp = 6 if (back10p==90)\n",
    "    replace bcs10_dadseg2egp = 7 if (back10p==100)|(back10p==110)|(back10p==150)\n",
    "    replace bcs10_dadseg2egp = . if (back10p==-9)\n",
    "    label define egp 1 \"I\" 2 \"II+IVa\" 3 \"III\" 4 \"IVb+c\" 5 \"V\" 6 \"VI\" 7 \"VII\"\n",
    "    label values bcs10_dadseg2egp egp\n",
    "    label variable bcs10_dadseg2egp \"BCS Age 10 Dad's EGP from SEG\"\n",
    "    tab back10p bcs10_dadseg2egp\n",
    "\n",
    "*Mother's SEG Age 10\n",
    "tab back20p\n",
    "capture drop bcs10_mumseg2egp\n",
    "    gen bcs10_mumseg2egp = .\n",
    "    replace bcs10_mumseg2egp = 1 if (back20p==11)|(back20p==12)|(back20p==30)|(back20p==40)\n",
    "    replace bcs10_mumseg2egp = 2 if (back20p==21)|(back20p==22)|(back20p==51)|(back20p==52)\n",
    "    replace bcs10_mumseg2egp = 3 if (back20p==60)|(back20p==70)\n",
    "    replace bcs10_mumseg2egp = 4 if (back20p==120)|(back20p==130)|(back20p==140)\n",
    "    replace bcs10_mumseg2egp = 5 if (back20p==80)\n",
    "    replace bcs10_mumseg2egp = 6 if (back20p==90)\n",
    "    replace bcs10_mumseg2egp = 7 if (back20p==100)|(back20p==110)|(back20p==150)\n",
    "    replace bcs10_mumseg2egp = . if (back20p==-9)\n",
    "    label values bcs10_mumseg2egp egp\n",
    "    label variable bcs10_mumseg2egp \"BCS Age 10 Mum's EGP from SEG\"\n",
    "    tab back20p bcs10_mumseg2egp\n",
    "\n",
    "drop c3_4 c3_11 back10p back20p c4_1a c4_2a\n",
    "\n",
    "sort bcsid\n",
    "\n",
    "save $path3\\temp6.dta, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Here we code more parental occupational information from the age 16 survey. These variables will not be used in the main analysis but they may potentially be used in producing the weights and in the multiple imputation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Age 16 Parental Social Class\n",
      "\n",
      ". \n",
      ". use $path1\\ARCHIVE\\BCS\\S4\\bcs7016x.dta, clear\n",
      "\n",
      ". \n",
      ". \n",
      ". *Father's SEG Age 16\n",
      "\n",
      ". tab t11_2 \n",
      "\n",
      " Father's social |\n",
      "           class |      Freq.     Percent        Cum.\n",
      "-----------------+-----------------------------------\n",
      "      Not stated |        601        5.17        5.17\n",
      "No questionnaire |      4,279       36.84       42.01\n",
      "               I |        511        4.40       46.41\n",
      "              II |      1,888       16.25       62.67\n",
      "  III non-manual |        653        5.62       68.29\n",
      "      III manual |      2,567       22.10       90.39\n",
      "              IV |        594        5.11       95.51\n",
      "               V |        153        1.32       96.82\n",
      "         Student |        133        1.15       97.97\n",
      "            Dead |        236        2.03      100.00\n",
      "-----------------+-----------------------------------\n",
      "           Total |     11,615      100.00\n",
      "\n",
      ". capture drop bcs16_olddadrgsc\n",
      "\n",
      ".     gen bcs16_olddadrgsc = t11_2 \n",
      "\n",
      ".     recode bcs16_olddadrgsc (-2=.) (-1=.) (7=.) (8=.)\n",
      "(bcs16_olddadrgsc: 5249 changes made)\n",
      "\n",
      ".     label variable bcs16_olddadrgsc \"BCS Age 16 Dad RGSC Old Coding\"\n",
      "\n",
      ".     label define rgsc 1 \"I\" 2 \"II\" 3 \"III NM\" 4 \"III M\" 5 \"IV\" 6 \"V\"\n",
      "\n",
      ".     label values bcs16_olddadrgsc rgsc\n",
      "\n",
      ".     tab bcs16_olddadrgsc t11_2 , mi\n",
      "\n",
      "BCS Age 16 |\n",
      "  Dad RGSC |                                             Father's social class\n",
      "Old Coding | Not state  No questi          I         II  III non-m  III manua         IV          V    Student       Dead |     Total\n",
      "-----------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "         I |         0          0        511          0          0          0          0          0          0          0 |       511 \n",
      "        II |         0          0          0      1,888          0          0          0          0          0          0 |     1,888 \n",
      "    III NM |         0          0          0          0        653          0          0          0          0          0 |       653 \n",
      "     III M |         0          0          0          0          0      2,567          0          0          0          0 |     2,567 \n",
      "        IV |         0          0          0          0          0          0        594          0          0          0 |       594 \n",
      "         V |         0          0          0          0          0          0          0        153          0          0 |       153 \n",
      "         . |       601      4,279          0          0          0          0          0          0        133        236 |     5,249 \n",
      "-----------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "     Total |       601      4,279        511      1,888        653      2,567        594        153        133        236 |    11,615 \n",
      "\n",
      "\n",
      ". \n",
      ". *Mother's RGSC Age 16\n",
      "\n",
      ". tab t11_9\n",
      "\n",
      " Mother's social |\n",
      "           class |      Freq.     Percent        Cum.\n",
      "-----------------+-----------------------------------\n",
      "      Not stated |        424        3.65        3.65\n",
      "No questionnaire |      4,279       36.84       40.49\n",
      "               I |         49        0.42       40.91\n",
      "              II |      1,144        9.85       50.76\n",
      "  III non-manual |      2,058       17.72       68.48\n",
      "      III manual |        399        3.44       71.92\n",
      "              IV |      1,095        9.43       81.34\n",
      "               V |        433        3.73       85.07\n",
      "         Student |      1,681       14.47       99.54\n",
      "            Dead |         53        0.46      100.00\n",
      "-----------------+-----------------------------------\n",
      "           Total |     11,615      100.00\n",
      "\n",
      ". capture drop bcs16_oldmumrgsc\n",
      "\n",
      ".     gen bcs16_oldmumrgsc = t11_9 \n",
      "\n",
      ".     recode bcs16_oldmumrgsc (-2=.) (-1=.) (7=.) (8=.)\n",
      "(bcs16_oldmumrgsc: 6437 changes made)\n",
      "\n",
      ".     label variable bcs16_oldmumrgsc \"BCS Age 16 Mum RGSC Old Coding\"\n",
      "\n",
      ".     label values bcs16_oldmumrgsc rgsc\n",
      "\n",
      ".     tab bcs16_oldmumrgsc t11_9 , mi\n",
      "\n",
      "BCS Age 16 |\n",
      "  Mum RGSC |                                             Mother's social class\n",
      "Old Coding | Not state  No questi          I         II  III non-m  III manua         IV          V    Student       Dead |     Total\n",
      "-----------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "         I |         0          0         49          0          0          0          0          0          0          0 |        49 \n",
      "        II |         0          0          0      1,144          0          0          0          0          0          0 |     1,144 \n",
      "    III NM |         0          0          0          0      2,058          0          0          0          0          0 |     2,058 \n",
      "     III M |         0          0          0          0          0        399          0          0          0          0 |       399 \n",
      "        IV |         0          0          0          0          0          0      1,095          0          0          0 |     1,095 \n",
      "         V |         0          0          0          0          0          0          0        433          0          0 |       433 \n",
      "         . |       424      4,279          0          0          0          0          0          0      1,681         53 |     6,437 \n",
      "-----------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "     Total |       424      4,279         49      1,144      2,058        399      1,095        433      1,681         53 |    11,615 \n",
      "\n",
      "\n",
      ". \n",
      ". keep bcsid bcs16_olddadrgsc bcs16_oldmumrgsc\n",
      "\n",
      ". \n",
      ". save $path3\\temp7.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp7.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp7.dta saved\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Age 16 Parental Social Class\n",
    "\n",
    "use $path1\\ARCHIVE\\BCS\\S4\\bcs7016x.dta, clear\n",
    "\n",
    "\n",
    "*Father's RGSC Age 16\n",
    "tab t11_2 \n",
    "capture drop bcs16_olddadrgsc\n",
    "    gen bcs16_olddadrgsc = t11_2 \n",
    "    recode bcs16_olddadrgsc (-2=.) (-1=.) (7=.) (8=.)\n",
    "    label variable bcs16_olddadrgsc \"BCS Age 16 Dad RGSC Old Coding\"\n",
    "    label define rgsc 1 \"I\" 2 \"II\" 3 \"III NM\" 4 \"III M\" 5 \"IV\" 6 \"V\"\n",
    "    label values bcs16_olddadrgsc rgsc\n",
    "    tab bcs16_olddadrgsc t11_2 , mi\n",
    "\n",
    "*Mother's RGSC Age 16\n",
    "tab t11_9\n",
    "capture drop bcs16_oldmumrgsc\n",
    "    gen bcs16_oldmumrgsc = t11_9 \n",
    "    recode bcs16_oldmumrgsc (-2=.) (-1=.) (7=.) (8=.)\n",
    "    label variable bcs16_oldmumrgsc \"BCS Age 16 Mum RGSC Old Coding\"\n",
    "    label values bcs16_oldmumrgsc rgsc\n",
    "    tab bcs16_oldmumrgsc t11_9 , mi\n",
    "\n",
    "keep bcsid bcs16_olddadrgsc bcs16_oldmumrgsc\n",
    "\n",
    "save $path3\\temp7.dta, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Here we prepare the cognitive ability test measure.\n",
    "\n",
    "We would like to use the variable score20 (the general ability test score), which has been used in previous social stratification studies such as Breen and Goldthorpe (2001).\n",
    "\n",
    "Breen, Richard, and John H. Goldthorpe. [Class, mobility and merit: the experience of two British birth cohorts.](https://academic.oup.com/esr/article-abstract/17/2/81/517646/Class-Mobility-and-Merit-The-Experience-of-Two) European Sociological Review 17.2 (2001): 81-101.\n",
    "\n",
    "This variable is no longer deposited in the BCS datasets. However, SPSS code to produce it is provided in the BCS documentation [here](http://www.cls.ioe.ac.uk/page.aspx?&sitesectionid=843&sitesectiontitle=Derived+variables). The procedure involves computing a total score for each individual sub-test, then computing an overall score from these totals.\n",
    "\n",
    "A data note on the cognitive test scores in the BCS is available [here](http://www.cls.ioe.ac.uk/page.aspx?&sitesectionid=809&sitesectiontitle=BCS70+Age+10+survey+(1980)).\n",
    "\n"
   ]
  },
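  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The sub-test cells that follow share one scoring pattern: set the non-response codes to missing, flag cohort members with no valid items, recode incorrect answers to 0, sum the correct answers, and standardise. The block below is a minimal sketch of that pattern on made-up toy data. The variable names (q1-q3, nmiss, notest, total, ztotal, stdtotal) are hypothetical and are not part of the BCS files, and the sketch uses `rowmiss()`, the current name for the `rmiss()` function used in the cells below.\n",
    "\n",
    "```stata\n",
    "* Hypothetical toy data: three items coded like the BCS test items\n",
    "* (1 = correct, 2 = incorrect, 9 = no response, -6 = no questionnaire)\n",
    "clear\n",
    "input q1 q2 q3\n",
    "1 2 1\n",
    "2 2 9\n",
    "-6 -6 -6\n",
    "1 1 1\n",
    "end\n",
    "\n",
    "* Set the non-response codes to system missing\n",
    "mvdecode q1-q3, mv(-6=. \\ -3=. \\ 9=.)\n",
    "\n",
    "* Count missing items per row; all items missing means the sub-test was not taken\n",
    "egen nmiss = rowmiss(q1-q3)\n",
    "gen notest = (nmiss == 3)\n",
    "\n",
    "* Recode incorrect (2) to 0, then count correct answers for test takers only\n",
    "recode q1-q3 (2=0)\n",
    "egen total = rowtotal(q1-q3) if notest == 0\n",
    "\n",
    "* Standardise to mean 0, sd 1, then rescale to mean 100, sd 15\n",
    "egen ztotal = std(total)\n",
    "gen stdtotal = (ztotal * 15) + 100\n",
    "summarize total stdtotal\n",
    "```\n",
    "\n",
    "Note that `rowtotal()` treats missing items as 0 by default, so a cohort member who answered only some of the items is scored as if the unanswered items were incorrect. Only those with no valid items at all are excluded.\n"
   ]
  },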
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". clear\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "clear\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path1\\ARCHIVE\\BCS\\S3\\sn3723.dta, clear\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path1\\ARCHIVE\\BCS\\S3\\sn3723.dta, clear\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *BAS Word Definitions Sub-Test\n",
      "\n",
      ". \n",
      ". * In the test items \n",
      "\n",
      ". * -6 means no questionnaire\n",
      "\n",
      ". * -3 means not stated\n",
      "\n",
      ". * 9 means no response\n",
      "\n",
      ". * 1 means acceptable response (i.e. correct)\n",
      "\n",
      ". * 2 means unacceptable response (i.e. not correct)\n",
      "\n",
      ". \n",
      ". quietly mvdecode i3504-i3540, mv(-6=. \\ -3=. \\ 9=.)\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "\n",
    "*BAS Word Definitions Sub-Test\n",
    "\n",
    "* In the test items \n",
    "* -6 means no questionnaire\n",
    "* -3 means not stated\n",
    "* 9 means no response\n",
    "* 1 means acceptable response (i.e. correct)\n",
    "* 2 means unacceptable response (i.e. not correct)\n",
    "\n",
    "quietly mvdecode i3504-i3540, mv(-6=. \\ -3=. \\ 9=.)\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Here we identify cohort members who have no responses\n",
      "\n",
      ". * to any of the test items, which indicates that they\n",
      "\n",
      ". * did not take the subtest.\n",
      "\n",
      ". \n",
      ". tab i3504\n",
      "\n",
      "             BAS-WORD |\n",
      "  DEFINITIONS-SPORT   |      Freq.     Percent        Cum.\n",
      "----------------------+-----------------------------------\n",
      "  Acceptable response |      6,880       59.98       59.98\n",
      "Unacceptable response |      4,591       40.02      100.00\n",
      "----------------------+-----------------------------------\n",
      "                Total |     11,471      100.00\n",
      "\n",
      ". \n",
      ". *No correct or incorrect responses\n",
      "\n",
      ". capture drop miss\n",
      "\n",
      ".     egen miss = rmiss(i3504-i3540)\n",
      "\n",
      ".     tab miss\n",
      "\n",
      "       miss |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |         16        0.11        0.11\n",
      "          1 |         11        0.07        0.18\n",
      "          2 |         24        0.16        0.34\n",
      "          3 |         16        0.11        0.45\n",
      "          4 |         26        0.17        0.63\n",
      "          5 |         47        0.32        0.94\n",
      "          6 |         71        0.48        1.42\n",
      "          7 |        104        0.70        2.12\n",
      "          8 |        125        0.84        2.96\n",
      "          9 |        162        1.09        4.05\n",
      "         10 |        196        1.32        5.37\n",
      "         11 |        233        1.57        6.93\n",
      "         12 |        317        2.13        9.07\n",
      "         13 |        382        2.57       11.63\n",
      "         14 |        510        3.43       15.06\n",
      "         15 |        552        3.71       18.78\n",
      "         16 |        756        5.08       23.86\n",
      "         17 |        852        5.73       29.59\n",
      "         18 |        914        6.15       35.74\n",
      "         19 |        855        5.75       41.49\n",
      "         20 |        905        6.09       47.57\n",
      "         21 |        811        5.45       53.03\n",
      "         22 |        735        4.94       57.97\n",
      "         23 |        750        5.04       63.01\n",
      "         24 |        632        4.25       67.26\n",
      "         25 |        506        3.40       70.67\n",
      "         26 |        409        2.75       73.42\n",
      "         27 |        254        1.71       75.12\n",
      "         28 |        122        0.82       75.94\n",
      "         29 |         77        0.52       76.46\n",
      "         30 |         44        0.30       76.76\n",
      "         31 |         46        0.31       77.07\n",
      "         32 |         23        0.15       77.22\n",
      "         33 |         15        0.10       77.32\n",
      "         34 |         12        0.08       77.40\n",
      "         35 |          9        0.06       77.46\n",
      "         36 |          6        0.04       77.51\n",
      "         37 |      3,345       22.49      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,870      100.00\n",
      "\n",
      ". \n",
      ". *This variable identified those who did not complete \n",
      "\n",
      ". * this element of the test.\n",
      "\n",
      ". capture drop bcs10_baswd_notest\n",
      "\n",
      ".     gen bcs10_baswd_notest = 0\n",
      "\n",
      ".     replace bcs10_baswd_notest = 1 if (miss==37)\n",
      "(3,345 real changes made)\n",
      "\n",
      ".     tab bcs10_baswd_notest\n",
      "\n",
      "bcs10_baswd |\n",
      "    _notest |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |     11,525       77.51       77.51\n",
      "          1 |      3,345       22.49      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,870      100.00\n",
      "\n",
      ".     label variable bcs10_baswd_notest \"BCS10 No Test for BAS Word Defin\"\n",
      "\n",
      ".     label values bcs10_baswd_notest yesno\n",
      "\n",
      ".     drop miss\n",
      "\n",
      ". \n",
      ". *In these test items 2 means unacceptable response\n",
      "\n",
      ". * (i.e. wrong answer). We recode this to 0.\n",
      "\n",
      ". * Now the test items are coded 1 (correct) 0 (incorrect)\n",
      "\n",
      ". recode i3504-i3540 (2=0)\n",
      "(i3504: 4591 changes made)\n",
      "(i3505: 1925 changes made)\n",
      "(i3506: 2536 changes made)\n",
      "(i3507: 5147 changes made)\n",
      "(i3508: 2975 changes made)\n",
      "(i3509: 3907 changes made)\n",
      "(i3510: 3719 changes made)\n",
      "(i3511: 3169 changes made)\n",
      "(i3512: 3808 changes made)\n",
      "(i3513: 8212 changes made)\n",
      "(i3514: 4225 changes made)\n",
      "(i3515: 4227 changes made)\n",
      "(i3516: 4421 changes made)\n",
      "(i3517: 4487 changes made)\n",
      "(i3518: 3498 changes made)\n",
      "(i3519: 2321 changes made)\n",
      "(i3520: 6467 changes made)\n",
      "(i3521: 2340 changes made)\n",
      "(i3522: 3403 changes made)\n",
      "(i3523: 2907 changes made)\n",
      "(i3524: 1773 changes made)\n",
      "(i3525: 1255 changes made)\n",
      "(i3526: 4041 changes made)\n",
      "(i3527: 1324 changes made)\n",
      "(i3528: 413 changes made)\n",
      "(i3529: 1616 changes made)\n",
      "(i3530: 1350 changes made)\n",
      "(i3531: 493 changes made)\n",
      "(i3532: 367 changes made)\n",
      "(i3533: 769 changes made)\n",
      "(i3534: 525 changes made)\n",
      "(i3535: 313 changes made)\n",
      "(i3536: 232 changes made)\n",
      "(i3537: 185 changes made)\n",
      "(i3538: 137 changes made)\n",
      "(i3539: 89 changes made)\n",
      "(i3540: 69 changes made)\n",
      "\n",
      ". \n",
      ". *We create a new variable which indicates the number of \n",
      "\n",
      ". * correct answers in this subtest\n",
      "\n",
      ". capture drop bcs10_worddefin\n",
      "\n",
      ".     egen bcs10_worddefin = rowtotal(i3504-i3540) if (bcs10_baswd_notest==0)\n",
      "(3345 missing values generated)\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Here we identify cohort members who have no responses\n",
    "* to any of the test items, which indicates that they\n",
    "* did not take the subtest.\n",
    "\n",
    "tab i3504\n",
    "\n",
    "*No correct or incorrect responses\n",
    "capture drop miss\n",
    "    egen miss = rmiss(i3504-i3540)\n",
    "    tab miss\n",
    "\n",
    "*This variable identifies those who did not complete \n",
    "* this element of the test.\n",
    "capture drop bcs10_baswd_notest\n",
    "    gen bcs10_baswd_notest = 0\n",
    "    replace bcs10_baswd_notest = 1 if (miss==37)\n",
    "    tab bcs10_baswd_notest\n",
    "    label variable bcs10_baswd_notest \"BCS10 No Test for BAS Word Defin\"\n",
    "    label values bcs10_baswd_notest yesno\n",
    "    drop miss\n",
    "\n",
    "*In these test items 2 means unacceptable response\n",
    "* (i.e. wrong answer). We recode this to 0.\n",
    "* Now the test items are coded 1 (correct) 0 (incorrect)\n",
    "recode i3504-i3540 (2=0)\n",
    "\n",
    "*We create a new variable which indicates the number of \n",
    "* correct answers in this subtest\n",
    "capture drop bcs10_worddefin\n",
    "    egen bcs10_worddefin = rowtotal(i3504-i3540) if (bcs10_baswd_notest==0)\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *standardise to mean 0 sd 1\n",
      "\n",
      ".     capture drop sbcs10_worddefin\n",
      "\n",
      ".     egen sbcs10_worddefin = std(bcs10_worddefin)\n",
      "(3345 missing values generated)\n",
      "\n",
      ". \n",
      ".     summ sbcs10_worddefin\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "sbcs10_wor~n |     11,525    4.93e-09           1  -2.023293   4.369977\n",
      "\n",
      ". \n",
      ". *standardise to mean 100 sd 15\n",
      "\n",
      ".     capture drop bcs10_stdworddefin\n",
      "\n",
      ".     gen bcs10_stdworddefin = (sbcs10_worddefin*15)+100\n",
      "(3,345 missing values generated)\n",
      "\n",
      ". \n",
      ".     summ bcs10_stdworddefin   \n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdw~n |     11,525         100          15    69.6506   165.5497\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*standardise to mean 0 sd 1\n",
    "    capture drop sbcs10_worddefin\n",
    "    egen sbcs10_worddefin = std(bcs10_worddefin)\n",
    "\n",
    "    summ sbcs10_worddefin\n",
    "\n",
    "*standardise to mean 100 sd 15\n",
    "    capture drop bcs10_stdworddefin\n",
    "    gen bcs10_stdworddefin = (sbcs10_worddefin*15)+100\n",
    "\n",
    "    summ bcs10_stdworddefin   \n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *BAS Recall of Digits Sub-test\n",
      "\n",
      ". \n",
      ". quietly mvdecode i3541-i3574, mv(-6=. \\ -3=. \\ 9=.)\n",
      "\n",
      ". \n",
      ". *We identify the cases with no responses to any of the\n",
      "\n",
      ". * test items\n",
      "\n",
      ". capture drop miss\n",
      "\n",
      ".     egen miss = rmiss(i3541-i3574)\n",
      "\n",
      ". \n",
      ". *Create a variable to indicate whether cohort member\n",
      "\n",
      ". * took the test\n",
      "\n",
      ". capture drop bcs10_basrd_notest\n",
      "\n",
      ".     gen bcs10_basrd_notest = 0\n",
      "\n",
      ".     replace bcs10_basrd_notest = 1 if (miss==34)\n",
      "(3,358 real changes made)\n",
      "\n",
      ".     label variable bcs10_basrd_notest \"BCS10 No Test for BAS Recall Digits\"\n",
      "\n",
      ".     label values bcs10_basrd_notest yesno\n",
      "\n",
      ".     drop miss\n",
      "\n",
      ". \n",
      ". *Recode the items to indicate (1) correct response\n",
      "\n",
      ". * (0) incorrect response.\n",
      "\n",
      ". recode i3541-i3574 (2=0) \n",
      "(i3541: 3 changes made)\n",
      "(i3542: 7 changes made)\n",
      "(i3543: 7 changes made)\n",
      "(i3544: 4 changes made)\n",
      "(i3545: 5 changes made)\n",
      "(i3546: 15 changes made)\n",
      "(i3547: 27 changes made)\n",
      "(i3548: 21 changes made)\n",
      "(i3549: 25 changes made)\n",
      "(i3550: 41 changes made)\n",
      "(i3551: 215 changes made)\n",
      "(i3552: 445 changes made)\n",
      "(i3553: 360 changes made)\n",
      "(i3554: 619 changes made)\n",
      "(i3555: 1069 changes made)\n",
      "(i3556: 2353 changes made)\n",
      "(i3557: 1919 changes made)\n",
      "(i3558: 4473 changes made)\n",
      "(i3559: 1825 changes made)\n",
      "(i3560: 1749 changes made)\n",
      "(i3561: 6262 changes made)\n",
      "(i3562: 3585 changes made)\n",
      "(i3563: 4319 changes made)\n",
      "(i3564: 4586 changes made)\n",
      "(i3565: 6070 changes made)\n",
      "(i3566: 7580 changes made)\n",
      "(i3567: 6062 changes made)\n",
      "(i3568: 6464 changes made)\n",
      "(i3569: 4480 changes made)\n",
      "(i3570: 2931 changes made)\n",
      "(i3571: 4254 changes made)\n",
      "(i3572: 4181 changes made)\n",
      "(i3573: 3698 changes made)\n",
      "(i3574: 2885 changes made)\n",
      "\n",
      ". \n",
      ". *Create a variable that indicates the number of correct \n",
      "\n",
      ". * responses\n",
      "\n",
      ". capture drop bcs10_digits\n",
      "\n",
      ".     egen bcs10_digits = rowtotal(i3541-i3574) if (bcs10_basrd_notest==0)\n",
      "(3358 missing values generated)\n",
      "\n",
      ". \n",
      ". *standardise to mean 0 sd 1\n",
      "\n",
      ".     capture drop sbcs10_digits\n",
      "\n",
      ".     egen sbcs10_digits = std(bcs10_digits)\n",
      "(3358 missing values generated)\n",
      "\n",
      ".     summ sbcs10_digits\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "sbcs10_dig~s |     11,512    2.71e-09           1  -5.002222   2.712919\n",
      "\n",
      ". \n",
      ". *standardise to mean 100 sd 15\n",
      "\n",
      ". capture drop bcs10_stddigits\n",
      "\n",
      ".     gen bcs10_stddigits = (sbcs10_digits*15)+100\n",
      "(3,358 missing values generated)\n",
      "\n",
      ".     summ bcs10_stddigits\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdd~s |     11,512         100          15   24.96668   140.6938\n",
      "\n",
      ".     \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*BAS Recall of Digits Sub-test\n",
    "\n",
    "quietly mvdecode i3541-i3574, mv(-6=. \\ -3=. \\ 9=.)\n",
    "\n",
    "*We identify the cases with no responses to any of the\n",
    "* test items\n",
    "capture drop miss\n",
    "    egen miss = rmiss(i3541-i3574)\n",
    "\n",
    "*Create a variable to indicate whether cohort member\n",
    "* took the test\n",
    "capture drop bcs10_basrd_notest\n",
    "    gen bcs10_basrd_notest = 0\n",
    "    replace bcs10_basrd_notest = 1 if (miss==34)\n",
    "    label variable bcs10_basrd_notest \"BCS10 No Test for BAS Recall Digits\"\n",
    "    label values bcs10_basrd_notest yesno\n",
    "    drop miss\n",
    "\n",
    "*Recode the items to indicate (1) correct response\n",
    "* (0) incorrect response.\n",
    "recode i3541-i3574 (2=0) \n",
    "\n",
    "*Create a variable that indicates the number of correct \n",
    "* responses\n",
    "capture drop bcs10_digits\n",
    "    egen bcs10_digits = rowtotal(i3541-i3574) if (bcs10_basrd_notest==0)\n",
    "\n",
    "*standardise to mean 0 sd 1\n",
    "    capture drop sbcs10_digits\n",
    "    egen sbcs10_digits = std(bcs10_digits)\n",
    "    summ sbcs10_digits\n",
    "\n",
    "*standardise to mean 100 sd 15\n",
    "capture drop bcs10_stddigits\n",
    "    gen bcs10_stddigits = (sbcs10_digits*15)+100\n",
    "    summ bcs10_stddigits\n",
    "    \n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *BAS Matrices Sub-test\n",
      "\n",
      ". \n",
      ". *Identify missing values\n",
      "\n",
      ". quietly mvdecode i3617-i3644, mv(-6=. \\ -3=. \\ 9=.)\n",
      "\n",
      ". \n",
      ". *Identify the cases with no response to the test items\n",
      "\n",
      ". capture drop miss\n",
      "\n",
      ".     egen miss = rmiss(i3617-i3644)\n",
      "\n",
      ". \n",
      ". *Create a new variable that indicates that no test items\n",
      "\n",
      ". * were responded do\n",
      "\n",
      ". capture drop bcs10_basmat_notest\n",
      "\n",
      ".     gen bcs10_basmat_notest = 0\n",
      "\n",
      ".     replace bcs10_basmat_notest = 1 if (miss==28)\n",
      "(3,374 real changes made)\n",
      "\n",
      ".     label variable bcs10_basmat_notest \"BCS10 No Test for BAS Matrices\"\n",
      "\n",
      ".     label values bcs10_basmat_notest yesno\n",
      "\n",
      ".     drop miss\n",
      "\n",
      ". \n",
      ". *Recode incorrect responses to 0\n",
      "\n",
      ". recode i3617-i3644 (2=0) \n",
      "(i3617: 57 changes made)\n",
      "(i3618: 45 changes made)\n",
      "(i3619: 582 changes made)\n",
      "(i3620: 794 changes made)\n",
      "(i3621: 574 changes made)\n",
      "(i3622: 1156 changes made)\n",
      "(i3623: 2613 changes made)\n",
      "(i3624: 1478 changes made)\n",
      "(i3625: 2797 changes made)\n",
      "(i3626: 2611 changes made)\n",
      "(i3627: 4705 changes made)\n",
      "(i3628: 4415 changes made)\n",
      "(i3629: 4311 changes made)\n",
      "(i3630: 6446 changes made)\n",
      "(i3631: 4149 changes made)\n",
      "(i3632: 4403 changes made)\n",
      "(i3633: 4960 changes made)\n",
      "(i3634: 6104 changes made)\n",
      "(i3635: 5395 changes made)\n",
      "(i3636: 5707 changes made)\n",
      "(i3637: 5172 changes made)\n",
      "(i3638: 5646 changes made)\n",
      "(i3639: 5047 changes made)\n",
      "(i3640: 6641 changes made)\n",
      "(i3641: 6048 changes made)\n",
      "(i3642: 5089 changes made)\n",
      "(i3643: 7176 changes made)\n",
      "(i3644: 6567 changes made)\n",
      "\n",
      ". \n",
      ". *Create a variable that indicates the total number of correct\n",
      "\n",
      ". * responses to this subtest\n",
      "\n",
      ". capture drop bcs10_mat\n",
      "\n",
      ".     egen bcs10_mat = rowtotal(i3617-i3644) if (bcs10_basmat_notest==0)\n",
      "(3374 missing values generated)\n",
      "\n",
      ". \n",
      ". \n",
      ". *standardise to mean 0 sd 1\n",
      "\n",
      ".     capture drop sbcs10_mat\n",
      "\n",
      ".     egen sbcs10_mat = std(bcs10_mat)\n",
      "(3374 missing values generated)\n",
      "\n",
      ".     summ sbcs10_mat\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "  sbcs10_mat |     11,496   -8.16e-09           1  -2.842147   2.344447\n",
      "\n",
      ". \n",
      ". *standardise to mean 100 sd 15\n",
      "\n",
      ". capture drop bcs10_stdmat\n",
      "\n",
      ".     gen bcs10_stdmat = (sbcs10_mat*15)+100\n",
      "(3,374 missing values generated)\n",
      "\n",
      ".     summ bcs10_stdmat\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdmat |     11,496         100          15    57.3678   135.1667\n",
      "\n",
      ".     \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*BAS Matrices Sub-test\n",
    "\n",
    "*Identify missing values\n",
    "quietly mvdecode i3617-i3644, mv(-6=. \\ -3=. \\ 9=.)\n",
    "\n",
    "*Identify the cases with no response to the test items\n",
    "capture drop miss\n",
    "    egen miss = rmiss(i3617-i3644)\n",
    "\n",
    "*Create a new variable that indicates that no test items\n",
    "* were responded to\n",
    "capture drop bcs10_basmat_notest\n",
    "    gen bcs10_basmat_notest = 0\n",
    "    replace bcs10_basmat_notest = 1 if (miss==28)\n",
    "    label variable bcs10_basmat_notest \"BCS10 No Test for BAS Matrices\"\n",
    "    label values bcs10_basmat_notest yesno\n",
    "    drop miss\n",
    "\n",
    "*Recode incorrect responses to 0\n",
    "recode i3617-i3644 (2=0) \n",
    "\n",
    "*Create a variable that indicates the total number of correct\n",
    "* responses to this subtest\n",
    "capture drop bcs10_mat\n",
    "    egen bcs10_mat = rowtotal(i3617-i3644) if (bcs10_basmat_notest==0)\n",
    "\n",
    "\n",
    "*standardise to mean 0 sd 1\n",
    "    capture drop sbcs10_mat\n",
    "    egen sbcs10_mat = std(bcs10_mat)\n",
    "    summ sbcs10_mat\n",
    "\n",
    "*standardise to mean 100 sd 15\n",
    "capture drop bcs10_stdmat\n",
    "    gen bcs10_stdmat = (sbcs10_mat*15)+100\n",
    "    summ bcs10_stdmat\n",
    "    \n",
    "*return to jupyter\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *BAS Verbal Similarities Sub-test\n",
      "\n",
      ". \n",
      ". *Identify the missing values\n",
      "\n",
      ". quietly mvdecode i3575 i3577 i3579 i3581 i3583 i3585 i3587 i3589 i3591 i3593 i3595 i3597 i3599 i3601 i3603 i3605 i3607 i3609 i3611 i3613 i\n",
      "> 3615 i3576 i3578 i3580 i3582 i3584 i3586 i3588 i3590 i3592 i3594 i3596 i3598 i3600 i3602 i3604 i3606 i3608 i3610 i3612 i3614 i3616, mv(-6=\n",
      "> . \\ -3=. \\ 9=.)\n",
      "\n",
      ". \n",
      ". *This variable indicates the number of items with missing values\n",
      "\n",
      ". capture drop miss\n",
      "\n",
      ". egen miss = rmiss(i3575 i3577 i3579 i3581 i3583 i3585 i3587 i3589 i3591 i3593 i3595 i3597 i3599 i3601 i3603 i3605 i3607 i3609 i3611 i3613 \n",
      "> i3615 i3576 i3578 i3580 i3582 i3584 i3586 i3588 i3590 i3592 i3594 i3596 i3598 i3600 i3602 i3604 i3606 i3608 i3610 i3612 i3614 i3616)\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*BAS Verbal Similarities Sub-test\n",
    "\n",
    "*Identify the missing values\n",
    "quietly mvdecode i3575 i3577 i3579 i3581 i3583 i3585 i3587 i3589 i3591 i3593 i3595 i3597 i3599 i3601 i3603 i3605 i3607 i3609 i3611 i3613 i3615 i3576 i3578 i3580 i3582 i3584 i3586 i3588 i3590 i3592 i3594 i3596 i3598 i3600 i3602 i3604 i3606 i3608 i3610 i3612 i3614 i3616, mv(-6=. \\ -3=. \\ 9=.)\n",
    "\n",
    "*This variable indicates the number of items with missing values\n",
    "capture drop miss\n",
    "egen miss = rmiss(i3575 i3577 i3579 i3581 i3583 i3585 i3587 i3589 i3591 i3593 i3595 i3597 i3599 i3601 i3603 i3605 i3607 i3609 i3611 i3613 i3615 i3576 i3578 i3580 i3582 i3584 i3586 i3588 i3590 i3592 i3594 i3596 i3598 i3600 i3602 i3604 i3606 i3608 i3610 i3612 i3614 i3616)\n",
    "\n",
    "* return to jupyter"
   ]
  },
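  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Verbal Similarities sub-test is scored in pairs: a point is awarded only when both elements of an item pair are correct. The block below is a minimal sketch of that rule on made-up toy data, written as a loop. The variable names (a1, b1, a2, b2, pair1, pair2, pairs_correct) are hypothetical and are not part of the BCS files. Unlike the notebook's own code, the sketch codes an unsuccessful pair as 0 rather than leaving it missing.\n",
    "\n",
    "```stata\n",
    "* Hypothetical toy data: two item pairs (a1,b1) and (a2,b2)\n",
    "* coded 1 = correct, 0 = incorrect\n",
    "clear\n",
    "input a1 b1 a2 b2\n",
    "1 1 1 0\n",
    "1 0 1 1\n",
    "1 1 1 1\n",
    "end\n",
    "\n",
    "* A pair scores 1 only if both of its elements are correct\n",
    "forvalues k = 1/2 {\n",
    "    gen pair`k' = (a`k' == 1) & (b`k' == 1)\n",
    "}\n",
    "\n",
    "* Number of pairs answered fully correctly\n",
    "egen pairs_correct = rowtotal(pair1 pair2)\n",
    "list\n",
    "```\n",
    "\n",
    "A loop of this kind is one way to avoid repeating the same three commands for every pair. It relies on the two elements of each pair sharing a numeric suffix, which would first need to be arranged for the BCS item names.\n"
   ]
  },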
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *To get a point on this test the child has to successfully state what the\n",
      "\n",
      ". * items have in common and also successfully provide a further congruent\n",
      "\n",
      ". * example.\n",
      "\n",
      ". \n",
      ". *These variables indicate whether the child got both elements correct for\n",
      "\n",
      ". * each item pair in the test\n",
      "\n",
      ". capture drop score1\n",
      "\n",
      ".     gen score1 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score1 = 1 if (i3575==1)&(i3576==1)\n",
      "(11,342 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score2\n",
      "\n",
      ".     gen score2 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score2 = 1 if (i3577==1)&(i3578==1)\n",
      "(11,257 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score3\n",
      "\n",
      ".     gen score3 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score3 = 1 if (i3579==1)&(i3580==1)\n",
      "(11,343 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score4\n",
      "\n",
      ".     gen score4 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score4 = 1 if (i3581==1)&(i3582==1)\n",
      "(11,289 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score5\n",
      "\n",
      ".     gen score5 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score5 = 1 if (i3583==1)&(i3584==1)\n",
      "(11,178 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score6\n",
      "\n",
      ".     gen score6 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score6 = 1 if (i3585==1)&(i3586==1)\n",
      "(10,954 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score7\n",
      "\n",
      ".     gen score7 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score7 = 1 if (i3587==1)&(i3588==1)\n",
      "(9,999 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score8\n",
      "\n",
      ".     gen score8 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score8 = 1 if (i3589==1)&(i3590==1)\n",
      "(10,300 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score9\n",
      "\n",
      ".     gen score9 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score9 = 1 if (i3591==1)&(i3592==1)\n",
      "(9,448 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score10\n",
      "\n",
      ".     gen score10 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score10 = 1 if (i3593==1)&(i3594==1)\n",
      "(5,773 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score11\n",
      "\n",
      ".     gen score11 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score11 = 1 if (i3595==1)&(i3596==1)\n",
      "(7,961 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score12\n",
      "\n",
      ".     gen score12 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score12 = 1 if (i3597==1)&(i3598==1)\n",
      "(8,092 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score13\n",
      "\n",
      ".     gen score13 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score13 = 1 if (i3599==1)&(i3600==1)\n",
      "(4,601 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score14\n",
      "\n",
      ".     gen score14 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score14 = 1 if (i3601==1)&(i3602==1)\n",
      "(5,709 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score15\n",
      "\n",
      ".     gen score15 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score15 = 1 if (i3603==1)&(i3604==1)\n",
      "(3,205 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score16\n",
      "\n",
      ".     gen score16 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score16 = 1 if (i3605==1)&(i3606==1)\n",
      "(2,412 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score17\n",
      "\n",
      ".     gen score17 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score17 = 1 if (i3607==1)&(i3608==1)\n",
      "(1,754 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score18\n",
      "\n",
      ".     gen score18 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score18 = 1 if (i3609==1)&(i3610==1)\n",
      "(1,066 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score19\n",
      "\n",
      ".     gen score19 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score19 = 1 if (i3611==1)&(i3612==1)\n",
      "(237 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score20\n",
      "\n",
      ".     gen score20 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score20 = 1 if (i3613==1)&(i3614==1)\n",
      "(495 real changes made)\n",
      "\n",
      ". \n",
      ". capture drop score21\n",
      "\n",
      ".     gen score21 = .\n",
      "(14,870 missing values generated)\n",
      "\n",
      ".     replace score21 = 1 if (i3615==1)&(i3616==1)\n",
      "(26 real changes made)\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "\n",
    "*To get a point on this test the child has to successfully state what the\n",
    "* items have in common and also successfully provide a further congruent\n",
    "* example.\n",
    "\n",
    "*These variables indicate whether the child got both elements correct for\n",
    "* each item pair in the test\n",
    "capture drop score1\n",
    "    gen score1 = .\n",
    "    replace score1 = 1 if (i3575==1)&(i3576==1)\n",
    "\n",
    "capture drop score2\n",
    "    gen score2 = .\n",
    "    replace score2 = 1 if (i3577==1)&(i3578==1)\n",
    "\n",
    "capture drop score3\n",
    "    gen score3 = .\n",
    "    replace score3 = 1 if (i3579==1)&(i3580==1)\n",
    "\n",
    "capture drop score4\n",
    "    gen score4 = .\n",
    "    replace score4 = 1 if (i3581==1)&(i3582==1)\n",
    "\n",
    "capture drop score5\n",
    "    gen score5 = .\n",
    "    replace score5 = 1 if (i3583==1)&(i3584==1)\n",
    "\n",
    "capture drop score6\n",
    "    gen score6 = .\n",
    "    replace score6 = 1 if (i3585==1)&(i3586==1)\n",
    "\n",
    "capture drop score7\n",
    "    gen score7 = .\n",
    "    replace score7 = 1 if (i3587==1)&(i3588==1)\n",
    "\n",
    "capture drop score8\n",
    "    gen score8 = .\n",
    "    replace score8 = 1 if (i3589==1)&(i3590==1)\n",
    "\n",
    "capture drop score9\n",
    "    gen score9 = .\n",
    "    replace score9 = 1 if (i3591==1)&(i3592==1)\n",
    "\n",
    "capture drop score10\n",
    "    gen score10 = .\n",
    "    replace score10 = 1 if (i3593==1)&(i3594==1)\n",
    "\n",
    "capture drop score11\n",
    "    gen score11 = .\n",
    "    replace score11 = 1 if (i3595==1)&(i3596==1)\n",
    "\n",
    "capture drop score12\n",
    "    gen score12 = .\n",
    "    replace score12 = 1 if (i3597==1)&(i3598==1)\n",
    "\n",
    "capture drop score13\n",
    "    gen score13 = .\n",
    "    replace score13 = 1 if (i3599==1)&(i3600==1)\n",
    "\n",
    "capture drop score14\n",
    "    gen score14 = .\n",
    "    replace score14 = 1 if (i3601==1)&(i3602==1)\n",
    "\n",
    "capture drop score15\n",
    "    gen score15 = .\n",
    "    replace score15 = 1 if (i3603==1)&(i3604==1)\n",
    "\n",
    "capture drop score16\n",
    "    gen score16 = .\n",
    "    replace score16 = 1 if (i3605==1)&(i3606==1)\n",
    "\n",
    "capture drop score17\n",
    "    gen score17 = .\n",
    "    replace score17 = 1 if (i3607==1)&(i3608==1)\n",
    "\n",
    "capture drop score18\n",
    "    gen score18 = .\n",
    "    replace score18 = 1 if (i3609==1)&(i3610==1)\n",
    "\n",
    "capture drop score19\n",
    "    gen score19 = .\n",
    "    replace score19 = 1 if (i3611==1)&(i3612==1)\n",
    "\n",
    "capture drop score20\n",
    "    gen score20 = .\n",
    "    replace score20 = 1 if (i3613==1)&(i3614==1)\n",
    "\n",
    "capture drop score21\n",
    "    gen score21 = .\n",
    "    replace score21 = 1 if (i3615==1)&(i3616==1)\n",
    "\n",
    "* return to jupyter\n"
   ]
  },
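  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The 21 blocks in the cell above all follow one pattern: item pair *k* uses variable number 3575 + 2(*k* - 1) and the variable immediately after it. As a minimal sketch, and not the code that produced the output above, the same indicators could be generated in a loop. The local macro names `first` and `second` are illustrative choices, and the sketch assumes the item variables run consecutively from `i3575` to `i3616`, as they do in the cell above.\n",
    "\n",
    "```stata\n",
    "forvalues k = 1/21 {\n",
    "    * variable numbers of the two elements of item pair k (illustrative locals)\n",
    "    local first = 3575 + 2*(`k' - 1)\n",
    "    local second = `first' + 1\n",
    "    capture drop score`k'\n",
    "    gen score`k' = .\n",
    "    * one point only if both elements of the pair were answered correctly\n",
    "    replace score`k' = 1 if (i`first'==1) & (i`second'==1)\n",
    "}\n",
    "```\n",
    "\n",
    "For *k* = 1 the loop uses `i3575` and `i3576`, and for *k* = 21 it uses `i3615` and `i3616`, matching the first and last blocks written out above."
   ]
  },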
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *This variable indicates cases which have no responses to any item\n",
      "\n",
      ". * on this subtest.\n",
      "\n",
      ". \n",
      ". capture drop bcs10_basvs_notest\n",
      "\n",
      ".     gen bcs10_basvs_notest = 0\n",
      "\n",
      ".     replace bcs10_basvs_notest = 1 if (miss==42)\n",
      "(3,386 real changes made)\n",
      "\n",
      ".     label variable bcs10_basvs_notest \"BCS10 No Test for BAS Verbal Sim both\"\n",
      "\n",
      ".     label values bcs10_basvs_notest yesno\n",
      "\n",
      ".     drop miss\n",
      "\n",
      ". \n",
      ". *This variable provides the total score for the test\n",
      "\n",
      ". capture drop bcs10_vs\n",
      "\n",
      ". egen bcs10_vs = rowtotal(score1 score2 score3 score4 score5 score6 score7 score8 score9 score10 score11 score12 score13 score14 score15 sc\n",
      "> ore16 score17 score18 score19 score20 score21) if (bcs10_basvs_notest==0)\n",
      "(3386 missing values generated)\n",
      "\n",
      ". sum bcs10_vs\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "    bcs10_vs |     11,484    12.05512    2.610513          0         20\n",
      "\n",
      ". \n",
      ". *standardise to mean 0 sd 1\n",
      "\n",
      ". capture drop sbcs10_vs\n",
      "\n",
      ".     egen sbcs10_vs = std(bcs10_vs)\n",
      "(3386 missing values generated)\n",
      "\n",
      ".     summ sbcs10_vs\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "   sbcs10_vs |     11,484    1.53e-09           1  -4.617912   3.043417\n",
      "\n",
      ". \n",
      ". *standardise to mean 100 sd 15\n",
      "\n",
      ". capture drop bcs10_stdvs\n",
      "\n",
      ".     gen bcs10_stdvs = (sbcs10_vs*15)+100\n",
      "(3,386 missing values generated)\n",
      "\n",
      ".     \n",
      ". summ bcs10_stdvs\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      " bcs10_stdvs |     11,484         100          15   30.73132   145.6512\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*This variable indicates cases which have no responses to any item\n",
    "* on this subtest.\n",
    "\n",
    "capture drop bcs10_basvs_notest\n",
    "    gen bcs10_basvs_notest = 0\n",
    "    replace bcs10_basvs_notest = 1 if (miss==42)\n",
    "    label variable bcs10_basvs_notest \"BCS10 No Test for BAS Verbal Sim both\"\n",
    "    label values bcs10_basvs_notest yesno\n",
    "    drop miss\n",
    "\n",
    "*This variable provides the total score for the test\n",
    "capture drop bcs10_vs\n",
    "egen bcs10_vs = rowtotal(score1 score2 score3 score4 score5 score6 score7 score8 score9 score10 score11 score12 score13 score14 score15 score16 score17 score18 score19 score20 score21) if (bcs10_basvs_notest==0)\n",
    "sum bcs10_vs\n",
    "\n",
    "*standardise to mean 0 sd 1\n",
    "capture drop sbcs10_vs\n",
    "    egen sbcs10_vs = std(bcs10_vs)\n",
    "    summ sbcs10_vs\n",
    "\n",
    "*standardise to mean 100 sd 15\n",
    "capture drop bcs10_stdvs\n",
    "    gen bcs10_stdvs = (sbcs10_vs*15)+100\n",
    "    \n",
    "summ bcs10_stdvs\n",
    "\n",
    "* return to jupyter"
   ]
  },
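  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Both standardisation steps in the cell above are linear transformations of the raw total. Writing $x_i$ for a child's raw score (`bcs10_vs`), $\\bar{x}$ for the sample mean and $s$ for the sample standard deviation (symbols introduced here purely for illustration), the two steps can be sketched as:\n",
    "\n",
    "$$z_i = \\frac{x_i - \\bar{x}}{s}, \\qquad y_i = 100 + 15\\,z_i$$\n",
    "\n",
    "As a check against the summary output above, the lowest raw score of 0 gives $z = (0 - 12.05512)/2.610513 \\approx -4.618$ and $y \\approx 100 + 15 \\times (-4.618) \\approx 30.73$, which agrees with the reported minima of `sbcs10_vs` and `bcs10_stdvs`."
   ]
  },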
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Word Definitions\n",
      "\n",
      ". sum bcs10_stdworddefin\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdw~n |     11,525         100          15    69.6506   165.5497\n",
      "\n",
      ". label variable bcs10_stdworddefin \"BCS 10 BAS Word Definitions std\"\n",
      "\n",
      ". \n",
      ". *Verbal Similarities\n",
      "\n",
      ". sum bcs10_stdvs\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      " bcs10_stdvs |     11,484         100          15   30.73132   145.6512\n",
      "\n",
      ". label variable bcs10_stdvs \"BCS 10 BAS Verbal Similarities std\"\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Word Definitions\n",
    "sum bcs10_stdworddefin\n",
    "label variable bcs10_stdworddefin \"BCS 10 BAS Word Definitions std\"\n",
    "\n",
    "*Verbal Similarities\n",
    "sum bcs10_stdvs\n",
    "label variable bcs10_stdvs \"BCS 10 BAS Verbal Similarities std\"\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Digit Recall\n",
      "\n",
      ". sum bcs10_stddigits\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdd~s |     11,512         100          15   24.96668   140.6938\n",
      "\n",
      ". label variable bcs10_stddigits \"BCS 10 BAS Digit Recall std\"\n",
      "\n",
      ". \n",
      ". *Matrices\n",
      "\n",
      ". sum bcs10_stdmat\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdmat |     11,496         100          15    57.3678   135.1667\n",
      "\n",
      ". label variable bcs10_stdmat \"BCS 10 BAS Matrices std\"\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "\n",
    "*Digit Recall\n",
    "sum bcs10_stddigits\n",
    "label variable bcs10_stddigits \"BCS 10 BAS Digit Recall std\"\n",
    "\n",
    "*Matrices\n",
    "sum bcs10_stdmat\n",
    "label variable bcs10_stdmat \"BCS 10 BAS Matrices std\"\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Here we compute the total non-verbal score\n",
      "\n",
      ". \n",
      ". *We identify cases where the cohort members didn't complete the two non-verbal\n",
      "\n",
      ". * tests\n",
      "\n",
      ". egen rmiss = rmiss(bcs10_stddigits bcs10_stdmat)\n",
      "\n",
      ". tab rmiss\n",
      "\n",
      "      rmiss |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |     11,454       77.03       77.03\n",
      "          1 |        100        0.67       77.70\n",
      "          2 |      3,316       22.30      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,870      100.00\n",
      "\n",
      ". *11,454 completed both tests\n",
      "\n",
      ". \n",
      ". *Create a variable that indicates the total score across these tests\n",
      "\n",
      ". * only if the cohort member completed both tests\n",
      "\n",
      ". capture drop bcs10_nonverbscore\n",
      "\n",
      ". egen bcs10_nonverbscore = rowtotal(bcs10_stddigits bcs10_stdmat) if (rmiss==0)\n",
      "(3416 missing values generated)\n",
      "\n",
      ". \n",
      ". *standardise to mean 0 sd 1\n",
      "\n",
      ". capture drop sbcs10_nonverbscore\n",
      "\n",
      ". egen sbcs10_nonverbscore = std(bcs10_nonverbscore)\n",
      "(3416 missing values generated)\n",
      "\n",
      ". summ sbcs10_nonverbscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "sbcs10_non~e |     11,454    9.65e-11           1  -3.523276   2.895324\n",
      "\n",
      ". \n",
      ". *standardise to mean 100 sd 15\n",
      "\n",
      ". capture drop bcs10_stdnonverbscore\n",
      "\n",
      ". gen bcs10_stdnonverbscore = (sbcs10_nonverbscore*15)+100\n",
      "(3,416 missing values generated)\n",
      "\n",
      ". \n",
      ". summ bcs10_stdnonverbscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdn~e |     11,454         100          15   47.15086   143.4299\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Here we compute the total non-verbal score\n",
    "\n",
    "*We identify cases where the cohort members didn't complete the two non-verbal\n",
    "* tests\n",
    "egen rmiss = rmiss(bcs10_stddigits bcs10_stdmat)\n",
    "tab rmiss\n",
    "*11,454 completed both tests\n",
    "\n",
    "*Create a variable that indicates the total score across these tests\n",
    "* only if the cohort member completed both tests\n",
    "capture drop bcs10_nonverbscore\n",
    "egen bcs10_nonverbscore = rowtotal(bcs10_stddigits bcs10_stdmat) if (rmiss==0)\n",
    "\n",
    "*standardise to mean 0 sd 1\n",
    "capture drop sbcs10_nonverbscore\n",
    "egen sbcs10_nonverbscore = std(bcs10_nonverbscore)\n",
    "summ sbcs10_nonverbscore\n",
    "\n",
    "*standardise to mean 100 sd 15\n",
    "capture drop bcs10_stdnonverbscore\n",
    "gen bcs10_stdnonverbscore = (sbcs10_nonverbscore*15)+100\n",
    "\n",
    "summ bcs10_stdnonverbscore\n",
    "\n",
    "*return to jupyter\n"
   ]
  },
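  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `if (rmiss==0)` qualifier in the cell above matters because the `rowtotal()` function of `egen` treats missing values as zero by default. Without the qualifier, a cohort member who sat only one of the two tests would receive a total based on a single test rather than a missing value. The toy sketch below illustrates this with made-up data; the variables `a`, `b`, `nmiss`, `total_all` and `total_complete` are hypothetical and are not part of the BCS data.\n",
    "\n",
    "```stata\n",
    "* Toy data only: run in a separate Stata session, because -clear- drops the data in memory\n",
    "clear\n",
    "input a b\n",
    "1 2\n",
    "3 .\n",
    ". .\n",
    "end\n",
    "\n",
    "* number of missing values in each row\n",
    "egen nmiss = rowmiss(a b)\n",
    "\n",
    "* total that silently counts missing values as zero\n",
    "egen total_all = rowtotal(a b)\n",
    "\n",
    "* total restricted to rows with no missing values\n",
    "egen total_complete = rowtotal(a b) if nmiss == 0\n",
    "\n",
    "list\n",
    "```\n",
    "\n",
    "In this sketch `total_all` is 3, 3 and 0 for the three rows, whereas `total_complete` is 3 for the first row and missing for the other two."
   ]
  },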
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". summ bcs10_stdnonverbscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdn~e |     11,454         100          15   47.15086   143.4299\n",
      "\n",
      ". \n",
      ". label variable bcs10_stdnonverbscore \"BCS Age 10 Total Non-Verbal Score std\"\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "summ bcs10_stdnonverbscore\n",
    "\n",
    "label variable bcs10_stdnonverbscore \"BCS Age 10 Total Non-Verbal Score std\"\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Total Verbal Score\n",
      "\n",
      ". \n",
      ". *Identify cases which did not complete both tests that make up the verbal score\n",
      "\n",
      ". capture drop rmiss\n",
      "\n",
      ".     egen rmiss = rmiss(bcs10_stdvs bcs10_stdworddefin)\n",
      "\n",
      ".     tab rmiss\n",
      "\n",
      "      rmiss |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |     11,463       77.09       77.09\n",
      "          1 |         83        0.56       77.65\n",
      "          2 |      3,324       22.35      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,870      100.00\n",
      "\n",
      ". *11,464 completed both tests\n",
      "\n",
      ". \n",
      ". *Create a variable that indicates the total verbal score \n",
      "\n",
      ". * if the cohort member completed both tests\n",
      "\n",
      ". capture drop bcs10_verbscore\n",
      "\n",
      ". egen bcs10_verbscore = rowtotal(bcs10_stdvs bcs10_stdworddefin) if (rmiss==0)\n",
      "(3407 missing values generated)\n",
      "\n",
      ". \n",
      ". drop rmiss\n",
      "\n",
      ". \n",
      ". *standardise to mean 0 sd 1\n",
      "\n",
      ". capture drop sbcs10_verbscore\n",
      "\n",
      ". egen sbcs10_verbscore = std(bcs10_verbscore)\n",
      "(3407 missing values generated)\n",
      "\n",
      ". summ sbcs10_verbscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "sbcs10_ver~e |     11,463    1.57e-09           1  -3.656487    3.86983\n",
      "\n",
      ". \n",
      ". *standardise to mean 100 sd 15\n",
      "\n",
      ". capture drop bcs10_stdverbscore\n",
      "\n",
      ". gen bcs10_stdverbscore = (sbcs10_verbscore*15)+100\n",
      "(3,407 missing values generated)\n",
      "\n",
      ". summ bcs10_stdverbscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdv~e |     11,463         100          15    45.1527   158.0475\n",
      "\n",
      ". \n",
      ". summ bcs10_stdverbscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdv~e |     11,463         100          15    45.1527   158.0475\n",
      "\n",
      ". \n",
      ". label variable bcs10_stdverbscore \"BCS Age 10 Total Verbal Score std\"\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "*Total Verbal Score\n",
    "\n",
    "*Identify cases which did not complete both tests that make up the verbal score\n",
    "capture drop rmiss\n",
    "    egen rmiss = rmiss(bcs10_stdvs bcs10_stdworddefin)\n",
    "    tab rmiss\n",
    "*11,464 completed both tests\n",
    "\n",
    "*Create a variable that indicates the total verbal score \n",
    "* if the cohort member completed both tests\n",
    "capture drop bcs10_verbscore\n",
    "egen bcs10_verbscore = rowtotal(bcs10_stdvs bcs10_stdworddefin) if (rmiss==0)\n",
    "\n",
    "drop rmiss\n",
    "\n",
    "*standardise to mean 0 sd 1\n",
    "capture drop sbcs10_verbscore\n",
    "egen sbcs10_verbscore = std(bcs10_verbscore)\n",
    "summ sbcs10_verbscore\n",
    "\n",
    "*standardise to mean 100 sd 15\n",
    "capture drop bcs10_stdverbscore\n",
    "gen bcs10_stdverbscore = (sbcs10_verbscore*15)+100\n",
    "summ bcs10_stdverbscore\n",
    "\n",
    "summ bcs10_stdverbscore\n",
    "\n",
    "label variable bcs10_stdverbscore \"BCS Age 10 Total Verbal Score std\"\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *This bit of code computes the total overall score on the \n",
      "\n",
      ". * cognitive ability test\n",
      "\n",
      ". \n",
      ". *First we create a variable to indicate whether the cohort members\n",
      "\n",
      ". * complete all of the tests that make up the overall test scores\n",
      "\n",
      ". egen rmiss = rmiss(bcs10_stdworddefin bcs10_stddigits bcs10_stdvs bcs10_stdmat)\n",
      "\n",
      ". tab rmiss\n",
      "\n",
      "      rmiss |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |     11,397       76.64       76.64\n",
      "          1 |        121        0.81       77.46\n",
      "          2 |         21        0.14       77.60\n",
      "          3 |         24        0.16       77.76\n",
      "          4 |      3,307       22.24      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,870      100.00\n",
      "\n",
      ". \n",
      ". *We create a variable that indicates the overall test score\n",
      "\n",
      ". * only for thise cohort members who completed all of the required tests\n",
      "\n",
      ". capture drop bcs10_abilityscore\n",
      "\n",
      ". egen bcs10_abilityscore = rowtotal(bcs10_stdworddefin bcs10_stddigits bcs10_stdvs bcs10_stdmat) if (rmiss==0)\n",
      "(3473 missing values generated)\n",
      "\n",
      ". \n",
      ". *standardise to mean 0 sd 1\n",
      "\n",
      ". capture drop sbcs10_abilityscore\n",
      "\n",
      ". egen sbcs10_abilityscore = std(bcs10_abilityscore)\n",
      "(3473 missing values generated)\n",
      "\n",
      ". summ sbcs10_abilityscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "sbcs10_abi~e |     11,397   -1.67e-10           1  -3.760751   3.412836\n",
      "\n",
      ". \n",
      ". *standardise to mean 100 sd 15\n",
      "\n",
      ". capture drop bcs10_stdabilityscore\n",
      "\n",
      ". gen bcs10_stdabilityscore = (sbcs10_abilityscore*15)+100\n",
      "(3,473 missing values generated)\n",
      "\n",
      ". summ bcs10_stdabilityscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stda~e |     11,397         100          15   43.58873   151.1925\n",
      "\n",
      ". \n",
      ". label variable bcs10_stdabilityscore \"BCS Age 10 Total Ability Test Score std\"\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*This bit of code computes the total overall score on the \n",
    "* cognitive ability test\n",
    "\n",
    "*First we create a variable to indicate whether the cohort members\n",
    "* complete all of the tests that make up the overall test scores\n",
    "egen rmiss = rmiss(bcs10_stdworddefin bcs10_stddigits bcs10_stdvs bcs10_stdmat)\n",
    "tab rmiss\n",
    "\n",
    "*We create a variable that indicates the overall test score\n",
    "* only for thise cohort members who completed all of the required tests\n",
    "capture drop bcs10_abilityscore\n",
    "egen bcs10_abilityscore = rowtotal(bcs10_stdworddefin bcs10_stddigits bcs10_stdvs bcs10_stdmat) if (rmiss==0)\n",
    "\n",
    "*standardise to mean 0 sd 1\n",
    "capture drop sbcs10_abilityscore\n",
    "egen sbcs10_abilityscore = std(bcs10_abilityscore)\n",
    "summ sbcs10_abilityscore\n",
    "\n",
    "*standardise to mean 100 sd 15\n",
    "capture drop bcs10_stdabilityscore\n",
    "gen bcs10_stdabilityscore = (sbcs10_abilityscore*15)+100\n",
    "summ bcs10_stdabilityscore\n",
    "\n",
    "label variable bcs10_stdabilityscore \"BCS Age 10 Total Ability Test Score std\"\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We could also produce an overall cognitive ability test scores using principal components analysis (see for example Schoon 2010). Here we compute a general ability test scores using the method described in Schoon (2010)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". * We undertake principal components analysis of the four sub-tests \n",
      "\n",
      ". * that make up the general ability test. Here are these items:\n",
      "\n",
      ". \n",
      ". tab bcs10_stdworddefin, mi \n",
      "\n",
      " BCS 10 BAS |\n",
      "       Word |\n",
      "Definitions |\n",
      "        std |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "    69.6506 |         77        0.52        0.52\n",
      "   72.64745 |        160        1.08        1.59\n",
      "   75.64429 |        268        1.80        3.40\n",
      "   78.64114 |        404        2.72        6.11\n",
      "   81.63799 |        554        3.73        9.84\n",
      "   84.63483 |        685        4.61       14.45\n",
      "   87.63168 |        776        5.22       19.66\n",
      "   90.62852 |        869        5.84       25.51\n",
      "   93.62537 |        900        6.05       31.56\n",
      "   96.62221 |        902        6.07       37.63\n",
      "   99.61906 |        873        5.87       43.50\n",
      "   102.6159 |        807        5.43       48.92\n",
      "   105.6127 |        792        5.33       54.25\n",
      "   108.6096 |        697        4.69       58.94\n",
      "   111.6064 |        587        3.95       62.89\n",
      "   114.6033 |        497        3.34       66.23\n",
      "   117.6001 |        380        2.56       68.78\n",
      "    120.597 |        347        2.33       71.12\n",
      "   123.5938 |        270        1.82       72.93\n",
      "   126.5907 |        196        1.32       74.25\n",
      "   129.5875 |        153        1.03       75.28\n",
      "   132.5844 |         97        0.65       75.93\n",
      "   135.5812 |         68        0.46       76.39\n",
      "    138.578 |         61        0.41       76.80\n",
      "   141.5749 |         35        0.24       77.03\n",
      "   144.5717 |         23        0.15       77.19\n",
      "   147.5686 |         15        0.10       77.29\n",
      "   150.5654 |         13        0.09       77.38\n",
      "   153.5623 |          8        0.05       77.43\n",
      "   156.5591 |          7        0.05       77.48\n",
      "    159.556 |          3        0.02       77.50\n",
      "   165.5497 |          1        0.01       77.51\n",
      "          . |      3,345       22.49      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,870      100.00\n",
      "\n",
      ". tab bcs10_stddigits, mi\n",
      "\n",
      " BCS 10 BAS |\n",
      "      Digit |\n",
      " Recall std |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "   24.96668 |          1        0.01        0.01\n",
      "   28.47356 |          1        0.01        0.01\n",
      "   46.00797 |          3        0.02        0.03\n",
      "   49.51485 |          3        0.02        0.05\n",
      "   53.02174 |          4        0.03        0.08\n",
      "   56.52862 |         14        0.09        0.17\n",
      "    60.0355 |         22        0.15        0.32\n",
      "   63.54238 |         53        0.36        0.68\n",
      "   67.04926 |         90        0.61        1.28\n",
      "   70.55614 |        161        1.08        2.37\n",
      "   74.06303 |        241        1.62        3.99\n",
      "   77.56991 |        301        2.02        6.01\n",
      "   81.07679 |        453        3.05        9.06\n",
      "   84.58367 |        593        3.99       13.05\n",
      "   88.09055 |        882        5.93       18.98\n",
      "   91.59743 |      1,043        7.01       25.99\n",
      "   95.10432 |      1,117        7.51       33.50\n",
      "    98.6112 |      1,151        7.74       41.24\n",
      "   102.1181 |      1,047        7.04       48.29\n",
      "    105.625 |        913        6.14       54.43\n",
      "   109.1318 |        736        4.95       59.37\n",
      "   112.6387 |        618        4.16       63.53\n",
      "   116.1456 |        561        3.77       67.30\n",
      "   119.6525 |        489        3.29       70.59\n",
      "   123.1594 |        372        2.50       73.09\n",
      "   126.6663 |        257        1.73       74.82\n",
      "   130.1731 |        178        1.20       76.02\n",
      "     133.68 |        121        0.81       76.83\n",
      "   137.1869 |         52        0.35       77.18\n",
      "   140.6938 |         35        0.24       77.42\n",
      "          . |      3,358       22.58      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,870      100.00\n",
      "\n",
      ". tab bcs10_stdvs, mi\n",
      "\n",
      " BCS 10 BAS |\n",
      "     Verbal |\n",
      "Similaritie |\n",
      "      s std |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "   30.73132 |         20        0.13        0.13\n",
      "   36.47731 |          2        0.01        0.15\n",
      "   42.22332 |          6        0.04        0.19\n",
      "   47.96931 |         10        0.07        0.26\n",
      "   53.71531 |         26        0.17        0.43\n",
      "    59.4613 |         51        0.34        0.77\n",
      "    65.2073 |        120        0.81        1.58\n",
      "   70.95329 |        229        1.54        3.12\n",
      "    76.6993 |        481        3.23        6.36\n",
      "   82.44529 |        851        5.72       12.08\n",
      "   88.19128 |      1,252        8.42       20.50\n",
      "   93.93729 |      1,571       10.56       31.06\n",
      "   99.68328 |      1,730       11.63       42.70\n",
      "   105.4293 |      1,781       11.98       54.67\n",
      "   111.1753 |      1,378        9.27       63.94\n",
      "   116.9213 |        991        6.66       70.61\n",
      "   122.6673 |        580        3.90       74.51\n",
      "   128.4133 |        278        1.87       76.38\n",
      "   134.1593 |         90        0.61       76.98\n",
      "   139.9053 |         29        0.20       77.18\n",
      "   145.6512 |          8        0.05       77.23\n",
      "          . |      3,386       22.77      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,870      100.00\n",
      "\n",
      ". tab bcs10_stdmat, mi\n",
      "\n",
      " BCS 10 BAS |\n",
      "   Matrices |\n",
      "        std |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "    57.3678 |          4        0.03        0.03\n",
      "   60.14633 |          4        0.03        0.05\n",
      "   62.92487 |         73        0.49        0.54\n",
      "    65.7034 |         84        0.56        1.11\n",
      "   68.48193 |        112        0.75        1.86\n",
      "   71.26046 |        155        1.04        2.91\n",
      "   74.03899 |        255        1.71        4.62\n",
      "   76.81753 |        301        2.02        6.64\n",
      "   79.59606 |        397        2.67        9.31\n",
      "   82.37459 |        440        2.96       12.27\n",
      "   85.15312 |        513        3.45       15.72\n",
      "   87.93166 |        559        3.76       19.48\n",
      "   90.71019 |        618        4.16       23.64\n",
      "   93.48872 |        661        4.45       28.08\n",
      "   96.26725 |        708        4.76       32.84\n",
      "   99.04578 |        791        5.32       38.16\n",
      "   101.8243 |        734        4.94       43.10\n",
      "   104.6029 |        739        4.97       48.07\n",
      "   107.3814 |        756        5.08       53.15\n",
      "   110.1599 |        780        5.25       58.40\n",
      "   112.9384 |        684        4.60       63.00\n",
      "    115.717 |        605        4.07       67.07\n",
      "   118.4955 |        477        3.21       70.28\n",
      "    121.274 |        380        2.56       72.83\n",
      "   124.0526 |        314        2.11       74.94\n",
      "   126.8311 |        195        1.31       76.25\n",
      "   129.6096 |         95        0.64       76.89\n",
      "   132.3882 |         46        0.31       77.20\n",
      "   135.1667 |         16        0.11       77.31\n",
      "          . |      3,374       22.69      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,870      100.00\n",
      "\n",
      ". \n",
      ". \n",
      ". * We only want to include those who completed all four tests. This variable\n",
      "\n",
      ". * was create above.\n",
      "\n",
      ". \n",
      ". tab rmiss\n",
      "\n",
      "      rmiss |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |     11,397       76.64       76.64\n",
      "          1 |        121        0.81       77.46\n",
      "          2 |         21        0.14       77.60\n",
      "          3 |         24        0.16       77.76\n",
      "          4 |      3,307       22.24      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     14,870      100.00\n",
      "\n",
      ". \n",
      ". * We examine the correlation between these tests:\n",
      "\n",
      ". \n",
      ". pwcorr bcs10_stdworddefin bcs10_stddigits bcs10_stdvs bcs10_stdmat if (rmiss==0), sig\n",
      "\n",
      "             | bcs10_.. ~ddigits bcs1~dvs bcs~dmat\n",
      "-------------+------------------------------------\n",
      "bcs10_stdw~n |   1.0000 \n",
      "             |\n",
      "             |\n",
      "bcs10_stdd~s |   0.3257   1.0000 \n",
      "             |   0.0000\n",
      "             |\n",
      " bcs10_stdvs |   0.6521   0.3254   1.0000 \n",
      "             |   0.0000   0.0000\n",
      "             |\n",
      "bcs10_stdmat |   0.4757   0.3093   0.4838   1.0000 \n",
      "             |   0.0000   0.0000   0.0000\n",
      "             |\n",
      "\n",
      ". \n",
      ". * Principal components analysis of the four tests that make up the \n",
      "\n",
      ". * general ability test:\n",
      "\n",
      ". \n",
      ". pca bcs10_stdworddefin bcs10_stddigits bcs10_stdvs bcs10_stdmat if (rmiss==0)\n",
      "\n",
      "Principal components/correlation                 Number of obs    =     11,397\n",
      "                                                 Number of comp.  =          4\n",
      "                                                 Trace            =          4\n",
      "    Rotation: (unrotated = principal)            Rho              =     1.0000\n",
      "\n",
      "    --------------------------------------------------------------------------\n",
      "       Component |   Eigenvalue   Difference         Proportion   Cumulative\n",
      "    -------------+------------------------------------------------------------\n",
      "           Comp1 |      2.31293      1.54579             0.5782       0.5782\n",
      "           Comp2 |      .767141      .195023             0.1918       0.7700\n",
      "           Comp3 |      .572119       .22431             0.1430       0.9130\n",
      "           Comp4 |      .347808            .             0.0870       1.0000\n",
      "    --------------------------------------------------------------------------\n",
      "\n",
      "Principal components (eigenvectors) \n",
      "\n",
      "    --------------------------------------------------------------------\n",
      "        Variable |    Comp1     Comp2     Comp3     Comp4 | Unexplained \n",
      "    -------------+----------------------------------------+-------------\n",
      "    bcs10_stdw~n |   0.5490   -0.2604   -0.3745    0.7004 |           0 \n",
      "    bcs10_stdd~s |   0.3890    0.9182   -0.0743   -0.0034 |           0 \n",
      "     bcs10_stdvs |   0.5510   -0.2638   -0.3432   -0.7135 |           0 \n",
      "    bcs10_stdmat |   0.4936   -0.1396    0.8582    0.0200 |           0 \n",
      "    --------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * Only the first component has an eigenvalue greater than 1.\n",
      "\n",
      ". \n",
      ". \n",
      ". * Here we predict the score for each individual on the first principal\n",
      "\n",
      ". * component. This score is obtained by applying the elements of the \n",
      "\n",
      ". * corresponding eigenvector to the starndardised values of the original\n",
      "\n",
      ". * observations for an individual.\n",
      "\n",
      ". \n",
      ". predict bcs10_pc1 if (rmiss==0), score\n",
      "(3 components skipped)\n",
      "\n",
      "Scoring coefficients \n",
      "    sum of squares(column-loading) = 1\n",
      "\n",
      "    ------------------------------------------------------\n",
      "        Variable |    Comp1     Comp2     Comp3     Comp4 \n",
      "    -------------+----------------------------------------\n",
      "    bcs10_stdw~n |   0.5490   -0.2604   -0.3745    0.7004 \n",
      "    bcs10_stdd~s |   0.3890    0.9182   -0.0743   -0.0034 \n",
      "     bcs10_stdvs |   0.5510   -0.2638   -0.3432   -0.7135 \n",
      "    bcs10_stdmat |   0.4936   -0.1396    0.8582    0.0200 \n",
      "    ------------------------------------------------------\n",
      "\n",
      ". label variable bcs10_pc1 \"BCS Age 10 PCA Score\"\n",
      "\n",
      ". \n",
      ". \n",
      ". * We standardise this variable:\n",
      "\n",
      ". \n",
      ". capture drop bcs10_stdpc1\n",
      "\n",
      ". egen bcs10_stdpc1 = std(bcs10_pc1)\n",
      "(3473 missing values generated)\n",
      "\n",
      ". summ bcs10_stdpc1\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdpc1 |     11,397   -4.73e-10           1  -3.701819   3.371969\n",
      "\n",
      ". label variable bcs10_stdpc1 \"BCS Age 10 standardised PCA Score\"\n",
      "\n",
      ". \n",
      ". summ bcs10_stdpc1\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdpc1 |     11,397   -4.73e-10           1  -3.701819   3.371969\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "* We undertake principal components analysis of the four sub-tests \n",
    "* that make up the general ability test. Here are these items:\n",
    "\n",
    "tab bcs10_stdworddefin, mi \n",
    "tab bcs10_stddigits, mi\n",
    "tab bcs10_stdvs, mi\n",
    "tab bcs10_stdmat, mi\n",
    "\n",
    "\n",
    "* We only want to include those who completed all four tests. This variable\n",
    "* was created above.\n",
    "\n",
    "tab rmiss\n",
    "\n",
    "* We examine the correlation between these tests:\n",
    "\n",
    "pwcorr bcs10_stdworddefin bcs10_stddigits bcs10_stdvs bcs10_stdmat if (rmiss==0), sig\n",
    "\n",
    "* Principal components analysis of the four tests that make up the \n",
    "* general ability test:\n",
    "\n",
    "pca bcs10_stdworddefin bcs10_stddigits bcs10_stdvs bcs10_stdmat if (rmiss==0)\n",
    "\n",
    "* Only the first component has an eigenvalue greater than 1 (2.31); it\n",
    "* accounts for 57.8% of the total variance.\n",
    "\n",
    "\n",
    "* Here we predict the score for each individual on the first principal\n",
    "* component. This score is obtained by applying the elements of the \n",
    "* corresponding eigenvector to the standardised values of the original\n",
    "* observations for an individual.\n",
    "\n",
    "predict bcs10_pc1 if (rmiss==0), score\n",
    "label variable bcs10_pc1 \"BCS Age 10 PCA Score\"\n",
    "\n",
    "\n",
    "* We standardise this variable:\n",
    "\n",
    "capture drop bcs10_stdpc1\n",
    "egen bcs10_stdpc1 = std(bcs10_pc1)\n",
    "summ bcs10_stdpc1\n",
    "label variable bcs10_stdpc1 \"BCS Age 10 standardised PCA Score\"\n",
    "\n",
    "summ bcs10_stdpc1\n",
    "\n",
    "* return to jupyter\n"
   ]
  },
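  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The scoring step described in the comments above can be sketched by hand. This is a minimal illustration rather than part of the original workflow: it standardises each sub-test within the `rmiss==0` sample, weights it by the Comp1 loadings printed in the output above (rounded to four decimals, so agreement with `bcs10_pc1` should be very close rather than exact), and correlates the result with `bcs10_pc1`. The helper variables `z1` to `z4` and `pc1_check` are hypothetical names introduced here, and they are dropped again at the end so the data are left unchanged."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "* Sketch only: rebuild the first principal component score by hand.\n",
    "* z1-z4 and pc1_check are hypothetical helper variables.\n",
    "foreach h in z1 z2 z3 z4 pc1_check {\n",
    "    capture drop `h'\n",
    "}\n",
    "\n",
    "local i = 1\n",
    "foreach v of varlist bcs10_stdworddefin bcs10_stddigits bcs10_stdvs bcs10_stdmat {\n",
    "    * standardise within the PCA estimation sample (rmiss==0)\n",
    "    quietly summ `v' if (rmiss==0)\n",
    "    gen double z`i' = (`v' - r(mean)) / r(sd) if (rmiss==0)\n",
    "    local i = `i' + 1\n",
    "}\n",
    "\n",
    "* Comp1 loadings as printed in the pca output (rounded to four decimals)\n",
    "gen double pc1_check = 0.5490*z1 + 0.3890*z2 + 0.5510*z3 + 0.4936*z4\n",
    "\n",
    "* The correlation should be very close to 1 if the hand calculation is right\n",
    "pwcorr pc1_check bcs10_pc1\n",
    "\n",
    "drop z1 z2 z3 z4 pc1_check\n",
    "\n",
    "* return to jupyter"
   ]
  },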
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". keep bcsid bcs10_stdworddefin bcs10_stdvs bcs10_stddigits bcs10_stdmat bcs10_stdabilityscore bcs10_stdverbscore bcs10_stdnonverbscore bcs1\n",
      "> 0_abilityscore bcs10_vs bcs10_verbscore bcs10_stdpc1\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". \n",
      ". save $path3\\temp8.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp8.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp8.dta saved\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "keep bcsid bcs10_stdworddefin bcs10_stdvs bcs10_stddigits bcs10_stdmat bcs10_stdabilityscore bcs10_stdverbscore bcs10_stdnonverbscore bcs10_abilityscore bcs10_vs bcs10_verbscore bcs10_stdpc1\n",
    "\n",
    "sort bcsid\n",
    "\n",
    "save $path3\\temp8.dta, replace\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using the SPSS code noted above, we have also coded the ability test scores in SPSS. We now compare the scores produced by the SPSS coding and the Stata coding to ensure that the procedures we have carried out in Stata are equivalent to those used to compute the variable in previously published studies.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path3\\temp8.dta, clear\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "use $path3\\temp8.dta, clear\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". merge 1:1 bcsid using $path2\\SPSSBASTOTALSCORE.dta\n",
      "(note: variable bcsid was str7, now str21 to accommodate using data's values)\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                             0\n",
      "    matched                            14,870  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ". drop _merge\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "merge 1:1 bcsid using $path2\\SPSSBASTOTALSCORE.dta\n",
    "drop _merge\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". summ score14\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     score14 |     11,525         100          15    69.6506   165.5497\n",
      "\n",
      ". summ bcs10_stdworddefin\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdw~n |     11,525         100          15    69.6506   165.5497\n",
      "\n",
      ". pwcorr score14 bcs10_stdworddefin, sig\n",
      "\n",
      "             |  score14 bcs10_~n\n",
      "-------------+------------------\n",
      "     score14 |   1.0000 \n",
      "             |\n",
      "             |\n",
      "bcs10_stdw~n |   1.0000   1.0000 \n",
      "             |   0.0000\n",
      "             |\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "summ score14\n",
    "summ bcs10_stdworddefin\n",
    "pwcorr score14 bcs10_stdworddefin, sig\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". summ score15\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     score15 |     11,512         100          15   24.96668   140.6938\n",
      "\n",
      ". summ bcs10_stddigits\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdd~s |     11,512         100          15   24.96668   140.6938\n",
      "\n",
      ". pwcorr score15 bcs10_stddigits, sig\n",
      "\n",
      "             |  score15 bcs10~ts\n",
      "-------------+------------------\n",
      "     score15 |   1.0000 \n",
      "             |\n",
      "             |\n",
      "bcs10_stdd~s |   1.0000   1.0000 \n",
      "             |   0.0000\n",
      "             |\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "summ score15\n",
    "summ bcs10_stddigits\n",
    "pwcorr score15 bcs10_stddigits, sig\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". summ score16\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     score16 |     11,483         100          15   30.66389   145.6857\n",
      "\n",
      ". summ bcs10_stdvs\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      " bcs10_stdvs |     11,484         100          15   30.73132   145.6512\n",
      "\n",
      ". pwcorr score16 bcs10_stdvs, sig\n",
      "\n",
      "             |  score16 bcs1~dvs\n",
      "-------------+------------------\n",
      "     score16 |   1.0000 \n",
      "             |\n",
      "             |\n",
      " bcs10_stdvs |   1.0000   1.0000 \n",
      "             |   0.0000\n",
      "             |\n",
      "\n",
      ". \n",
      ". *There is a one case difference here between the SPSS coding and the coding \n",
      "\n",
      ". *in Stata. There is one fewer case in the SPSS coding.\n",
      "\n",
      ". *By examining the data we can ascertain that this is because the SPSS coding\n",
      "\n",
      ". *determines that this cohort member did not complete the test at all because\n",
      "\n",
      ". *the have no correct or incorrect answers on the two items required to \n",
      "\n",
      ". *gain a score. By examining the data we can see that this cohort member did \n",
      "\n",
      ". *complete the test as they scored many correct answers on the naming \n",
      "\n",
      ". *element of the test. But most of their answers were 'not stated' on the \n",
      "\n",
      ". *example element of the case. It can therefore be determined that the cohort\n",
      "\n",
      ". *member did take the test. Therefore have chosen not to exclude this case.\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "summ score16\n",
    "summ bcs10_stdvs\n",
    "pwcorr score16 bcs10_stdvs, sig\n",
    "\n",
    "*There is a one-case difference here between the SPSS coding and the coding \n",
    "*in Stata. There is one fewer case in the SPSS coding.\n",
    "*By examining the data we can ascertain that this is because the SPSS coding\n",
    "*determines that this cohort member did not complete the test at all because\n",
    "*they have no correct or incorrect answers on the two items required to \n",
    "*gain a score. By examining the data we can see that this cohort member did \n",
    "*complete the test as they scored many correct answers on the naming \n",
    "*element of the test. But most of their answers were 'not stated' on the \n",
    "*example element of the test. It can therefore be determined that the cohort\n",
    "*member did take the test. We have therefore chosen not to exclude this case.\n",
    "\n",
    "* return to jupyter"
   ]
  },
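  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The one-case difference described in the comments above can be located directly. The following is a minimal sketch, assuming the merged data from the previous cells are still in memory; it uses only variables already present (`bcsid`, `score16`, `bcs10_stdvs`) and does not change the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "* Sketch only: find the case scored in the Stata coding but not in SPSS.\n",
    "count if missing(score16) & !missing(bcs10_stdvs)\n",
    "list bcsid score16 bcs10_stdvs if missing(score16) & !missing(bcs10_stdvs)\n",
    "\n",
    "* Confirm that no case is scored in SPSS but missing in the Stata coding\n",
    "count if !missing(score16) & missing(bcs10_stdvs)\n",
    "\n",
    "* return to jupyter"
   ]
  },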
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". summ score17\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     score17 |     11,496         100          15    57.3678   135.1667\n",
      "\n",
      ". summ bcs10_stdmat\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stdmat |     11,496         100          15    57.3678   135.1667\n",
      "\n",
      ". pwcorr score17 bcs10_stdmat, sig\n",
      "\n",
      "             |  score17 bcs10_~t\n",
      "-------------+------------------\n",
      "     score17 |   1.0000 \n",
      "             |\n",
      "             |\n",
      "bcs10_stdmat |   1.0000   1.0000 \n",
      "             |   0.0000\n",
      "             |\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "summ score17\n",
    "summ bcs10_stdmat\n",
    "pwcorr score17 bcs10_stdmat, sig\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". summ score20\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     score20 |     11,396         100          15   43.57055    151.197\n",
      "\n",
      ". summ bcs10_stdabilityscore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "bcs10_stda~e |     11,397         100          15   43.58873   151.1925\n",
      "\n",
      ". pwcorr score20 bcs10_stdabilityscore, sig\n",
      "\n",
      "             |  score20 bcs10_..\n",
      "-------------+------------------\n",
      "     score20 |   1.0000 \n",
      "             |\n",
      "             |\n",
      "bcs10_stda~e |   1.0000   1.0000 \n",
      "             |   0.0000\n",
      "             |\n",
      "\n",
      ". \n",
      ". *This score varies very slightly because of the 1 additional case in our\n",
      "\n",
      ". *coding which is excluded in the SPSS coding.\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "summ score20\n",
    "summ bcs10_stdabilityscore\n",
    "pwcorr score20 bcs10_stdabilityscore, sig\n",
    "\n",
    "*This score varies very slightly because of the 1 additional case in our\n",
    "*coding which is excluded in the SPSS coding.\n",
    "\n",
    "* return to jupyter"
   ]
  },
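  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A correlation of 1.0000 shows that two scores are linearly related, but not that they are numerically equal. As a complementary check, the sketch below reports the largest absolute gap between each SPSS score and its Stata counterpart, using the same pairings as the cells above. It assumes the merged data are still in memory; `absdiff` is a hypothetical helper variable that is dropped at the end. Small non-zero gaps would be expected for `score16` and `score20`, given the one additional case in the Stata coding."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "* Sketch only: largest absolute gap between each SPSS score and its\n",
    "* Stata counterpart. absdiff is a hypothetical helper variable.\n",
    "local spss  score14 score15 score16 score17 score20\n",
    "local stata bcs10_stdworddefin bcs10_stddigits bcs10_stdvs bcs10_stdmat bcs10_stdabilityscore\n",
    "\n",
    "forvalues i = 1/5 {\n",
    "    local a : word `i' of `spss'\n",
    "    local b : word `i' of `stata'\n",
    "    capture drop absdiff\n",
    "    gen double absdiff = abs(`a' - `b')\n",
    "    quietly summ absdiff\n",
    "    display \"`a' vs `b': max abs diff = \" r(max) \" (N = \" r(N) \")\"\n",
    "}\n",
    "capture drop absdiff\n",
    "\n",
    "* return to jupyter"
   ]
  },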
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we code father's NS-SEC; this will be our main parental social class measure in the analysis.\n",
    "\n",
    "The occupational information we are going to use for our parental social class measure comes from the new occupational coding files (SN7023).\n",
    "\n",
    "Gregg, P. (2012). Occupational Coding for the National Child Development Study (1969, 1991-2008) and the 1970 British Cohort Study (1980, 2000-2008). [data collection]. University of London. Institute of Education. Centre for Longitudinal Studies, [original data producer(s)]. UK Data Service. SN: 7023.\n",
    "\n",
    "\"Researchers from the Avon Longitudinal Study of Parents and Children (ALSPAC), based at the University of Bristol, worked on data from selected waves of the NCDS and BCS70. To create occupational code classifications, the computerised questionnaire response text strings were converted into comma separated value (CSV) files and processed using the CASCOT (Computer Assisted Structured COding Tool) software programme, which used automatic and semi-automatic processing to assign Standard Occupational Classification 2000 (SOC2000) codes (SOC2000) to entries.\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". ****Parent's Occupations\n",
      "\n",
      ". use $path1\\ARCHIVE\\NCDSBCS_OCCS\\bcs3_occupation_coding_father.dta, clear\n",
      "\n",
      ". keep BCSID B3FSNSSEC B3FSSOCC B3FSSOC90\n",
      "\n",
      ". rename BCSID bcsid\n",
      "\n",
      ". \n",
      ". \n",
      ". *Father's NSSEC\n",
      "\n",
      ". tab B3FSNSSEC\n",
      "\n",
      "   BCS 1980 |\n",
      "    Father: |\n",
      "     NS-SEC |\n",
      "     social |\n",
      " class code |\n",
      "       SEMI |\n",
      " processing |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          2 |        570        4.73        4.73\n",
      "        3.1 |        624        5.18        9.91\n",
      "        3.2 |        120        1.00       10.91\n",
      "        3.3 |         16        0.13       11.04\n",
      "        4.1 |        905        7.51       18.56\n",
      "        4.2 |        176        1.46       20.02\n",
      "        4.3 |         30        0.25       20.27\n",
      "          5 |        695        5.77       26.04\n",
      "        7.1 |        329        2.73       28.77\n",
      "        7.2 |        471        3.91       32.68\n",
      "        7.3 |        137        1.14       33.81\n",
      "        7.4 |        153        1.27       35.09\n",
      "        8.1 |        224        1.86       36.94\n",
      "        9.1 |      1,137        9.44       46.38\n",
      "        9.2 |        221        1.83       48.22\n",
      "         10 |        182        1.51       49.73\n",
      "       11.1 |      1,574       13.07       62.80\n",
      "       11.2 |        276        2.29       65.09\n",
      "       12.1 |        106        0.88       65.97\n",
      "       12.2 |        290        2.41       68.38\n",
      "       12.3 |        603        5.01       73.38\n",
      "       12.4 |        566        4.70       78.08\n",
      "       12.5 |         88        0.73       78.81\n",
      "       12.6 |         74        0.61       79.43\n",
      "       12.7 |         12        0.10       79.53\n",
      "       13.1 |         52        0.43       79.96\n",
      "       13.2 |        132        1.10       81.05\n",
      "       13.3 |      1,461       12.13       93.18\n",
      "       13.4 |        800        6.64       99.83\n",
      "       13.5 |         21        0.17      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     12,045      100.00\n",
      "\n",
      ". capture drop bcs_panssec\n",
      "\n",
      ".     gen bcs_panssec = .\n",
      "(14,874 missing values generated)\n",
      "\n",
      ".     replace bcs_panssec = 1 if (B3FSNSSEC>=1)&(B3FSNSSEC<=2) \n",
      "(570 real changes made)\n",
      "\n",
      ".     *1.1 Large Employers and Higher Managerial\n",
      "\n",
      ".     replace bcs_panssec = 2 if (B3FSNSSEC>=3.1)&(B3FSNSSEC<=3.4) \n",
      "(760 real changes made)\n",
      "\n",
      ".     *1.2 Higher Professional\n",
      "\n",
      ".     replace bcs_panssec = 3 if (B3FSNSSEC>=4.1)&(B3FSNSSEC<=6) \n",
      "(1,806 real changes made)\n",
      "\n",
      ".     *lower managerial and professional\n",
      "\n",
      ".     replace bcs_panssec = 4 if (B3FSNSSEC>=7.1)&(B3FSNSSEC<=7.4) \n",
      "(1,090 real changes made)\n",
      "\n",
      ".     *intermediate\n",
      "\n",
      ".     replace bcs_panssec = 5 if (B3FSNSSEC>=8.1)&(B3FSNSSEC<=9.2) \n",
      "(1,582 real changes made)\n",
      "\n",
      ".     *small employers and own account\n",
      "\n",
      ".     replace bcs_panssec = 6 if (B3FSNSSEC>=10)&(B3FSNSSEC<=11.2) \n",
      "(2,032 real changes made)\n",
      "\n",
      ".     *lower supervisory and technical\n",
      "\n",
      ".     replace bcs_panssec = 7 if (B3FSNSSEC>=12.1)&(B3FSNSSEC<=12.7) \n",
      "(1,739 real changes made)\n",
      "\n",
      ".     *semiroutine\n",
      "\n",
      ".     replace bcs_panssec = 8 if (B3FSNSSEC>=13.1)&(B3FSNSSEC<=13.5) \n",
      "(2,466 real changes made)\n",
      "\n",
      ".     *routine\n",
      "\n",
      ".     tab bcs_panssec\n",
      "\n",
      "bcs_panssec |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |        570        4.73        4.73\n",
      "          2 |        760        6.31       11.04\n",
      "          3 |      1,806       14.99       26.04\n",
      "          4 |      1,090        9.05       35.09\n",
      "          5 |      1,582       13.13       48.22\n",
      "          6 |      2,032       16.87       65.09\n",
      "          7 |      1,739       14.44       79.53\n",
      "          8 |      2,466       20.47      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     12,045      100.00\n",
      "\n",
      ".     label variable bcs_panssec \"BCS Age 10 Father's NSSEC\"\n",
      "\n",
      ".     label define nssec 1 \"Large Employers and Higher Managerial\" 2 \"Higher Professional\" 3 \"Lower managerial and professional\" 4 \"Intermed\n",
      "> iate\" 5 \"Small employers and own account\" 6 \"Lower Supervisory and Technical\" 7 \"Semi-Routine\" 8 \"Routine\" \n",
      "\n",
      ".     label values bcs_panssec nssec\n",
      "\n",
      ".     numlabel, add\n",
      "\n",
      ".     tab bcs_panssec, mi\n",
      "\n",
      "              BCS Age 10 Father's NSSEC |      Freq.     Percent        Cum.\n",
      "----------------------------------------+-----------------------------------\n",
      "1. Large Employers and Higher Manageria |        570        3.83        3.83\n",
      "                 2. Higher Professional |        760        5.11        8.94\n",
      "   3. Lower managerial and professional |      1,806       12.14       21.08\n",
      "                        4. Intermediate |      1,090        7.33       28.41\n",
      "     5. Small employers and own account |      1,582       10.64       39.05\n",
      "     6. Lower Supervisory and Technical |      2,032       13.66       52.71\n",
      "                        7. Semi-Routine |      1,739       11.69       64.40\n",
      "                             8. Routine |      2,466       16.58       80.98\n",
      "                                      . |      2,829       19.02      100.00\n",
      "----------------------------------------+-----------------------------------\n",
      "                                  Total |     14,874      100.00\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "****Parent's Occupations\n",
    "use $path1\\ARCHIVE\\NCDSBCS_OCCS\\bcs3_occupation_coding_father.dta, clear\n",
    "keep BCSID B3FSNSSEC B3FSSOCC B3FSSOC90\n",
    "rename BCSID bcsid\n",
    "\n",
    "\n",
    "*Father's NSSEC\n",
    "tab B3FSNSSEC\n",
    "capture drop bcs_panssec\n",
    "    gen bcs_panssec = .\n",
    "    replace bcs_panssec = 1 if (B3FSNSSEC>=1)&(B3FSNSSEC<=2) \n",
    "    *1.1 Large Employers and Higher Managerial\n",
    "    replace bcs_panssec = 2 if (B3FSNSSEC>=3.1)&(B3FSNSSEC<=3.4) \n",
    "    *1.2 Higher Professional\n",
    "    replace bcs_panssec = 3 if (B3FSNSSEC>=4.1)&(B3FSNSSEC<=6) \n",
    "    *lower managerial and professional\n",
    "    replace bcs_panssec = 4 if (B3FSNSSEC>=7.1)&(B3FSNSSEC<=7.4) \n",
    "    *intermediate\n",
    "    replace bcs_panssec = 5 if (B3FSNSSEC>=8.1)&(B3FSNSSEC<=9.2) \n",
    "    *small employers and own account\n",
    "    replace bcs_panssec = 6 if (B3FSNSSEC>=10)&(B3FSNSSEC<=11.2) \n",
    "    *lower supervisory and technical\n",
    "    replace bcs_panssec = 7 if (B3FSNSSEC>=12.1)&(B3FSNSSEC<=12.7) \n",
    "    *semiroutine\n",
    "    replace bcs_panssec = 8 if (B3FSNSSEC>=13.1)&(B3FSNSSEC<=13.5) \n",
    "    *routine\n",
    "    tab bcs_panssec\n",
    "    label variable bcs_panssec \"BCS Age 10 Father's NSSEC\"\n",
    "    label define nssec 1 \"Large Employers and Higher Managerial\" 2 \"Higher Professional\" 3 \"Lower managerial and professional\" 4 \"Intermediate\" 5 \"Small employers and own account\" 6 \"Lower Supervisory and Technical\" 7 \"Semi-Routine\" 8 \"Routine\" \n",
    "    label values bcs_panssec nssec\n",
    "    numlabel, add\n",
    "    tab bcs_panssec, mi\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". ****I am going to recode NSSEC from above just to double check\n",
      "\n",
      ". capture drop ukempst\n",
      "\n",
      ". gen ukempst = 0\n",
      "\n",
      ". \n",
      ". describe\n",
      "\n",
      "Contains data from F:\\Data\\RAWDATA\\ARCHIVE\\NCDSBCS_OCCS\\bcs3_occupation_coding_father.dta\n",
      "  obs:        14,874                          \n",
      " vars:             6                          \n",
      " size:       431,346                          \n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "              storage   display    value\n",
      "variable name   type    format     label      variable label\n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "bcsid           str7    %7s                   bcsid\n",
      "B3FSSOCC        str4    %4s                   BCS 1980 Father: SEMI auto SOC2000\n",
      "B3FSSOC90       int     %8.0g                 BCS 1980 Father: SEMI automatic SOC90\n",
      "B3FSNSSEC       double  %10.0g                BCS 1980 Father: NS-SEC social class code SEMI processing\n",
      "bcs_panssec     float   %40.0g     nssec      BCS Age 10 Father's NSSEC\n",
      "ukempst         float   %9.0g                 \n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "Sorted by: \n",
      "     Note: Dataset has changed since last saved.\n",
      "\n",
      ". capture drop soc2000\n",
      "\n",
      ".     gen soc2000 = real(B3FSSOCC)\n",
      "(2,829 missing values generated)\n",
      "\n",
      ". \n",
      ". sort soc2000 ukempst\n",
      "\n",
      ". merge m:m soc2000 ukempst using $path1\\OTHER\\SOC2000_to_NSSEC_20160527_RC_V1.dta\n",
      "(label nssec already defined)\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         5,330\n",
      "        from master                     2,829  (_merge==1)\n",
      "        from using                      2,501  (_merge==2)\n",
      "\n",
      "    matched                            12,045  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ". \n",
      ". tab nssec\n",
      "\n",
      "                                  nssec |      Freq.     Percent        Cum.\n",
      "----------------------------------------+-----------------------------------\n",
      "1. Large Employers and Higher Manageria |        916        6.30        6.30\n",
      "                 2. Higher Professional |      1,052        7.23       13.53\n",
      "   3. Lower managerial and professional |      2,390       16.43       29.96\n",
      "                        4. Intermediate |      1,248        8.58       38.54\n",
      "     5. Small employers and own account |      2,050       14.09       52.63\n",
      "     6. Lower Supervisory and Technical |      2,292       15.76       68.39\n",
      "                        7. Semi-Routine |      1,940       13.34       81.73\n",
      "                             8. Routine |      2,658       18.27      100.00\n",
      "----------------------------------------+-----------------------------------\n",
      "                                  Total |     14,546      100.00\n",
      "\n",
      ". \n",
      ". sort soc2000\n",
      "\n",
      ". drop if _merge==2\n",
      "(2,501 observations deleted)\n",
      "\n",
      ". \n",
      ". drop ukempst\n",
      "\n",
      ". \n",
      ". tab nssec bcs_panssec\n",
      "\n",
      "                      |                                BCS Age 10 Father's NSSEC\n",
      "                nssec | 1. Large   2. Higher  3. Lower   4. Interm  5. Small   6. Lower   7. Semi-R  8. Routin |     Total\n",
      "----------------------+----------------------------------------------------------------------------------------+----------\n",
      "1. Large Employers an |       570          0          0          0          0          0          0          0 |       570 \n",
      "2. Higher Professiona |         0        760          0          0          0          0          0          0 |       760 \n",
      "3. Lower managerial a |         0          0      1,806          0          0          0          0          0 |     1,806 \n",
      "      4. Intermediate |         0          0          0      1,090          0          0          0          0 |     1,090 \n",
      "5. Small employers an |         0          0          0          0      1,582          0          0          0 |     1,582 \n",
      "6. Lower Supervisory  |         0          0          0          0          0      2,032          0          0 |     2,032 \n",
      "      7. Semi-Routine |         0          0          0          0          0          0      1,739          0 |     1,739 \n",
      "           8. Routine |         0          0          0          0          0          0          0      2,466 |     2,466 \n",
      "----------------------+----------------------------------------------------------------------------------------+----------\n",
      "                Total |       570        760      1,806      1,090      1,582      2,032      1,739      2,466 |    12,045 \n",
      "\n",
      "\n",
      ". kap nssec bcs_panssec\n",
      "\n",
      "             Expected\n",
      "Agreement   Agreement     Kappa   Std. Err.         Z      Prob>Z\n",
      "-----------------------------------------------------------------\n",
      " 100.00%      14.54%     1.0000     0.0037     270.50      0.0000\n",
      "\n",
      ". *Perfect match\n",
      "\n",
      ". \n",
      ". drop _merge bcs_panssec\n",
      "\n",
      ". \n",
      ". rename nssec bcs_panssecsimp\n",
      "\n",
      ". rename soc2000 bcs_dadsoc2000simp\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". save $path3\\temp9.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp9.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp9.dta saved\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "\n",
    "****Cross-check: derive NS-SEC from the SOC2000 lookup file and compare it with the coding above\n",
    "capture drop ukempst\n",
    "gen ukempst = 0\n",
    "\n",
    "describe\n",
    "capture drop soc2000\n",
    "    gen soc2000 = real(B3FSSOCC)\n",
    "\n",
    "sort soc2000 ukempst\n",
    "merge m:m soc2000 ukempst using $path1\\OTHER\\SOC2000_to_NSSEC_20160527_RC_V1.dta\n",
    "\n",
    "tab nssec\n",
    "\n",
    "sort soc2000\n",
    "drop if _merge==2\n",
    "\n",
    "drop ukempst\n",
    "\n",
    "tab nssec bcs_panssec\n",
    "kap nssec bcs_panssec\n",
    "*Perfect match\n",
    "\n",
    "drop _merge bcs_panssec\n",
    "\n",
    "rename nssec bcs_panssecsimp\n",
    "rename soc2000 bcs_dadsoc2000simp\n",
    "\n",
    "sort bcsid\n",
    "save $path3\\temp9.dta, replace\n",
    "\n",
    "* return to jupyter"
   ]
  },
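  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cross-check above links each SOC2000 code to the lookup file with `merge m:m`. As a hedged alternative, the sketch below first confirms with `isid` that a lookup table is unique on its keys and then uses a many-to-one merge. The data are toy values entered with `input` (they are not the real SOC2000 to NS-SEC mapping), and the variable `id` and the temporary file `lookup` are hypothetical names used only for illustration.\n",
    "\n",
    "```stata\n",
    "* Toy lookup table: one row per soc2000 x ukempst combination (hypothetical values)\n",
    "clear\n",
    "input soc2000 ukempst nssec\n",
    "1111 0 1\n",
    "2111 0 2\n",
    "3111 0 3\n",
    "end\n",
    "* isid exits with an error if soc2000 and ukempst do not uniquely identify rows\n",
    "isid soc2000 ukempst\n",
    "tempfile lookup\n",
    "save `lookup'\n",
    "\n",
    "* Toy master data: several cohort members can share one occupation code\n",
    "clear\n",
    "input id soc2000 ukempst\n",
    "1 1111 0\n",
    "2 1111 0\n",
    "3 2111 0\n",
    "end\n",
    "\n",
    "* Many-to-one merge: each master row takes nssec from the single matching lookup row\n",
    "merge m:1 soc2000 ukempst using `lookup'\n",
    "tab _merge\n",
    "```\n",
    "\n",
    "If the lookup contained duplicate keys, `isid` would stop with an error before the merge, rather than letting rows be paired in an order-dependent way."
   ]
  },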
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "We also code mother's NS-SEC here. It is not used in the main analysis because mother's NS-SEC is not available in the NCDS."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path1\\ARCHIVE\\NCDSBCS_OCCS\\bcs3_occupation_coding_mother.dta,  clear\n",
      "\n",
      ". keep BCSID B3MSNSSEC B3MSSOCC\n",
      "\n",
      ". rename BCSID bcsid\n",
      "\n",
      ". \n",
      ". *Mother's NSSEC\n",
      "\n",
      ". capture drop bcs_manssec\n",
      "\n",
      ".     gen bcs_manssec = .\n",
      "(14,874 missing values generated)\n",
      "\n",
      ".     replace bcs_manssec = 1 if (B3MSNSSEC>=1)&(B3MSNSSEC<=2) \n",
      "(51 real changes made)\n",
      "\n",
      ".     *1.1 Large Employers and Higher Managerial\n",
      "\n",
      ".     replace bcs_manssec = 2 if (B3MSNSSEC>=3.1)&(B3MSNSSEC<=3.4) \n",
      "(95 real changes made)\n",
      "\n",
      ".     *1.2 Higher Professional\n",
      "\n",
      ".     replace bcs_manssec = 3 if (B3MSNSSEC>=4.1)&(B3MSNSSEC<=6) \n",
      "(1,134 real changes made)\n",
      "\n",
      ".     *lower managerial and professional\n",
      "\n",
      ".     replace bcs_manssec = 4 if (B3MSNSSEC>=7.1)&(B3MSNSSEC<=7.4) \n",
      "(2,154 real changes made)\n",
      "\n",
      ".     *intermediate\n",
      "\n",
      ".     replace bcs_manssec = 5 if (B3MSNSSEC>=8.1)&(B3MSNSSEC<=9.2) \n",
      "(365 real changes made)\n",
      "\n",
      ".     *small employers and own account\n",
      "\n",
      ".     replace bcs_manssec = 6 if (B3MSNSSEC>=10)&(B3MSNSSEC<=11.2) \n",
      "(174 real changes made)\n",
      "\n",
      ".     *lower supervisory and technical\n",
      "\n",
      ".     replace bcs_manssec = 7 if (B3MSNSSEC>=12.1)&(B3MSNSSEC<=12.7) \n",
      "(2,616 real changes made)\n",
      "\n",
      ".     *semiroutine\n",
      "\n",
      ".     replace bcs_manssec = 8 if (B3MSNSSEC>=13.1)&(B3MSNSSEC<=13.5) \n",
      "(2,936 real changes made)\n",
      "\n",
      ".     *routine\n",
      "\n",
      ".     tab bcs_manssec\n",
      "\n",
      "bcs_manssec |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |         51        0.54        0.54\n",
      "          2 |         95        1.00        1.53\n",
      "          3 |      1,134       11.91       13.44\n",
      "          4 |      2,154       22.61       36.05\n",
      "          5 |        365        3.83       39.88\n",
      "          6 |        174        1.83       41.71\n",
      "          7 |      2,616       27.46       69.18\n",
      "          8 |      2,936       30.82      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |      9,525      100.00\n",
      "\n",
      ".     label variable bcs_manssec \"BCS Age 10 Mother's NSSEC\"\n",
      "\n",
      ".     label define nssec 1 \"Large Employers and Higher Managerial\" 2 \"Higher Professional\" 3 \"Lower managerial and professional\" 4 \"Intermed\n",
      "> iate\" 5 \"Small employers and own account\" 6 \"Lower Supervisory and Technical\" 7 \"Semi-Routine\" 8 \"Routine\" , replace\n",
      "\n",
      ".     label values bcs_manssec nssec\n",
      "\n",
      ".     numlabel, add\n",
      "\n",
      ".     tab bcs_manssec, mi\n",
      "\n",
      "              BCS Age 10 Mother's NSSEC |      Freq.     Percent        Cum.\n",
      "----------------------------------------+-----------------------------------\n",
      "1. Large Employers and Higher Manageria |         51        0.34        0.34\n",
      "                 2. Higher Professional |         95        0.64        0.98\n",
      "   3. Lower managerial and professional |      1,134        7.62        8.61\n",
      "                        4. Intermediate |      2,154       14.48       23.09\n",
      "     5. Small employers and own account |        365        2.45       25.54\n",
      "     6. Lower Supervisory and Technical |        174        1.17       26.71\n",
      "                        7. Semi-Routine |      2,616       17.59       44.30\n",
      "                             8. Routine |      2,936       19.74       64.04\n",
      "                                      . |      5,349       35.96      100.00\n",
      "----------------------------------------+-----------------------------------\n",
      "                                  Total |     14,874      100.00\n",
      "\n",
      ".     \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path1\\ARCHIVE\\NCDSBCS_OCCS\\bcs3_occupation_coding_mother.dta,  clear\n",
    "keep BCSID B3MSNSSEC B3MSSOCC\n",
    "rename BCSID bcsid\n",
    "\n",
    "*Mother's NSSEC\n",
    "capture drop bcs_manssec\n",
    "    gen bcs_manssec = .\n",
    "    replace bcs_manssec = 1 if (B3MSNSSEC>=1)&(B3MSNSSEC<=2) \n",
    "    *1.1 Large Employers and Higher Managerial\n",
    "    replace bcs_manssec = 2 if (B3MSNSSEC>=3.1)&(B3MSNSSEC<=3.4) \n",
    "    *1.2 Higher Professional\n",
    "    replace bcs_manssec = 3 if (B3MSNSSEC>=4.1)&(B3MSNSSEC<=6) \n",
    "    *lower managerial and professional\n",
    "    replace bcs_manssec = 4 if (B3MSNSSEC>=7.1)&(B3MSNSSEC<=7.4) \n",
    "    *intermediate\n",
    "    replace bcs_manssec = 5 if (B3MSNSSEC>=8.1)&(B3MSNSSEC<=9.2) \n",
    "    *small employers and own account\n",
    "    replace bcs_manssec = 6 if (B3MSNSSEC>=10)&(B3MSNSSEC<=11.2) \n",
    "    *lower supervisory and technical\n",
    "    replace bcs_manssec = 7 if (B3MSNSSEC>=12.1)&(B3MSNSSEC<=12.7) \n",
    "    *semiroutine\n",
    "    replace bcs_manssec = 8 if (B3MSNSSEC>=13.1)&(B3MSNSSEC<=13.5) \n",
    "    *routine\n",
    "    tab bcs_manssec\n",
    "    label variable bcs_manssec \"BCS Age 10 Mother's NSSEC\"\n",
    "    label define nssec 1 \"Large Employers and Higher Managerial\" 2 \"Higher Professional\" 3 \"Lower managerial and professional\" 4 \"Intermediate\" 5 \"Small employers and own account\" 6 \"Lower Supervisory and Technical\" 7 \"Semi-Routine\" 8 \"Routine\" , replace\n",
    "    label values bcs_manssec nssec\n",
    "    numlabel, add\n",
    "    tab bcs_manssec, mi\n",
    "    \n",
    "* return to jupyter"
   ]
  },
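  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The range conditions used above to collapse the detailed NS-SEC codes can also be written with `inrange()`. This is a minimal sketch on toy data, assuming the same cut points as the code above and showing only four of the eight categories; `code` and `simp` are hypothetical variable names.\n",
    "\n",
    "```stata\n",
    "* Toy detailed NS-SEC codes, including one missing value\n",
    "clear\n",
    "input double code\n",
    "1\n",
    "2\n",
    "3.1\n",
    "4.1\n",
    "6\n",
    "13.5\n",
    ".\n",
    "end\n",
    "\n",
    "* inrange(z, a, b) is true when a <= z <= b and false when z is missing\n",
    "gen byte simp = .\n",
    "replace simp = 1 if inrange(code, 1, 2)\n",
    "replace simp = 2 if inrange(code, 3.1, 3.4)\n",
    "replace simp = 3 if inrange(code, 4.1, 6)\n",
    "replace simp = 8 if inrange(code, 13.1, 13.5)\n",
    "list code simp\n",
    "```\n",
    "\n",
    "Because `inrange()` is false for a missing `code`, missing values stay missing in `simp`, as they do with the paired inequalities."
   ]
  },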
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". ****I am going to recode NSSEC from above just to double check\n",
      "\n",
      ". capture drop ukempst\n",
      "\n",
      ".     gen ukempst = 0\n",
      "\n",
      ".     describe\n",
      "\n",
      "Contains data from F:\\Data\\RAWDATA\\ARCHIVE\\NCDSBCS_OCCS\\bcs3_occupation_coding_mother.dta\n",
      "  obs:        14,874                          \n",
      " vars:             5                          \n",
      " size:       401,598                          \n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "              storage   display    value\n",
      "variable name   type    format     label      variable label\n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "bcsid           str7    %7s                   bcsid\n",
      "B3MSSOCC        str4    %4s                   (BCS 1980 Mother) SEMI auto SOC2000\n",
      "B3MSNSSEC       double  %10.0g                (BCS 1980 Mother) NS-SEC social class code SEMI processing\n",
      "bcs_manssec     float   %40.0g     nssec      BCS Age 10 Mother's NSSEC\n",
      "ukempst         float   %9.0g                 \n",
      "--------------------------------------------------------------------------------------------------------------------------------------------\n",
      "Sorted by: \n",
      "     Note: Dataset has changed since last saved.\n",
      "\n",
      ".     capture drop soc2000\n",
      "\n",
      ".     gen soc2000 = real(B3MSSOCC)\n",
      "(5,349 missing values generated)\n",
      "\n",
      ".     \n",
      ". sort soc2000 ukempst\n",
      "\n",
      ". merge m:m soc2000 ukempst using $path1\\OTHER\\SOC2000_to_NSSEC_20160527_RC_V1.dta\n",
      "(label nssec already defined)\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         7,907\n",
      "        from master                     5,349  (_merge==1)\n",
      "        from using                      2,558  (_merge==2)\n",
      "\n",
      "    matched                             9,525  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ". \n",
      ". sort soc2000\n",
      "\n",
      ". drop if _merge==2\n",
      "(2,558 observations deleted)\n",
      "\n",
      ". \n",
      ". tab nssec bcs_manssec\n",
      "\n",
      "                      |                                BCS Age 10 Mother's NSSEC\n",
      "                nssec | 1. Large   2. Higher  3. Lower   4. Interm  5. Small   6. Lower   7. Semi-R  8. Routin |     Total\n",
      "----------------------+----------------------------------------------------------------------------------------+----------\n",
      "1. Large Employers an |        51          0          0          0          0          0          0          0 |        51 \n",
      "2. Higher Professiona |         0         95          0          0          0          0          0          0 |        95 \n",
      "3. Lower managerial a |         0          0      1,134          0          0          0          0          0 |     1,134 \n",
      "      4. Intermediate |         0          0          0      2,154          0          0          0          0 |     2,154 \n",
      "5. Small employers an |         0          0          0          0        365          0          0          0 |       365 \n",
      "6. Lower Supervisory  |         0          0          0          0          0        174          0          0 |       174 \n",
      "      7. Semi-Routine |         0          0          0          0          0          0      2,616          0 |     2,616 \n",
      "           8. Routine |         0          0          0          0          0          0          0      2,936 |     2,936 \n",
      "----------------------+----------------------------------------------------------------------------------------+----------\n",
      "                Total |        51         95      1,134      2,154        365        174      2,616      2,936 |     9,525 \n",
      "\n",
      "\n",
      ". kap nssec bcs_manssec\n",
      "\n",
      "             Expected\n",
      "Agreement   Agreement     Kappa   Std. Err.         Z      Prob>Z\n",
      "-----------------------------------------------------------------\n",
      " 100.00%      23.77%     1.0000     0.0055     181.76      0.0000\n",
      "\n",
      ". *The two NS-SEC codings agree\n",
      "\n",
      ". \n",
      ". drop _merge bcs_manssec ukempst\n",
      "\n",
      ". \n",
      ". rename nssec bcs_manssecsimp\n",
      "\n",
      ". rename soc2000 bcs_mumsoc2000\n",
      "\n",
      ". \n",
      ". label variable bcs_manssecsimp \"BCS Age 10 Mother's NSSEC Simplified\"\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". save $path3\\temp10.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp10.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp10.dta saved\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "\n",
    "****Cross-check: derive NS-SEC from the SOC2000 lookup file and compare it with the coding above\n",
    "capture drop ukempst\n",
    "    gen ukempst = 0\n",
    "    describe\n",
    "    capture drop soc2000\n",
    "    gen soc2000 = real(B3MSSOCC)\n",
    "    \n",
    "sort soc2000 ukempst\n",
    "merge m:m soc2000 ukempst using $path1\\OTHER\\SOC2000_to_NSSEC_20160527_RC_V1.dta\n",
    "\n",
    "sort soc2000\n",
    "drop if _merge==2\n",
    "\n",
    "tab nssec bcs_manssec\n",
    "kap nssec bcs_manssec\n",
    "*The two NS-SEC codings agree\n",
    "\n",
    "drop _merge bcs_manssec ukempst\n",
    "\n",
    "rename nssec bcs_manssecsimp\n",
    "rename soc2000 bcs_mumsoc2000\n",
    "\n",
    "label variable bcs_manssecsimp \"BCS Age 10 Mother's NSSEC Simplified\"\n",
    "\n",
    "sort bcsid\n",
    "save $path3\\temp10.dta, replace\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Here we use the response files to produce variables indicating the outcome at each sweep of the survey (e.g. productive, not productive).\n",
    "\n",
    "We also code gender using the response files, as this variable has less missing data than the gender variable available in the individual sweeps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". ****Information on response \n",
      "\n",
      ". \n",
      ". use $path1\\ARCHIVE\\BCS\\response\\bcs_response.dta, clear\n",
      "\n",
      ". \n",
      ". keep BCSID OUTCME01 OUTCME02 OUTCME03 SEX\n",
      "\n",
      ". numlabel, add\n",
      "\n",
      ". \n",
      ". capture drop bcs_male\n",
      "\n",
      ".     gen bcs_male = .\n",
      "(19,006 missing values generated)\n",
      "\n",
      ".     replace bcs_male = 1 if (SEX==1)\n",
      "(9,686 real changes made)\n",
      "\n",
      ".     replace bcs_male = 0 if (SEX==2)\n",
      "(8,943 real changes made)\n",
      "\n",
      ".     label variable bcs_male \"BCS Cohort member Male\"\n",
      "\n",
      ".     label define yesno 1 \"Yes\" 0 \"No\", replace\n",
      "\n",
      ".     label values bcs_male yesno\n",
      "\n",
      ".     tab bcs_male, mi\n",
      "\n",
      " BCS Cohort |\n",
      "member Male |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "         No |      8,943       47.05       47.05\n",
      "        Yes |      9,686       50.96       98.02\n",
      "          . |        377        1.98      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     19,006      100.00\n",
      "\n",
      ".     tab bcs_male SEX\n",
      "\n",
      "BCS Cohort |\n",
      "    member | Sex of cohort member\n",
      "      Male |   1. Male  2. Female |     Total\n",
      "-----------+----------------------+----------\n",
      "        No |         0      8,943 |     8,943 \n",
      "       Yes |     9,686          0 |     9,686 \n",
      "-----------+----------------------+----------\n",
      "     Total |     9,686      8,943 |    18,629 \n",
      "\n",
      "\n",
      ".     drop SEX\n",
      "\n",
      ". \n",
      ". rename BCSID bcsid\n",
      "\n",
      ". \n",
      ". *Outcome of the first survey\n",
      "\n",
      ". tab OUTCME01\n",
      "\n",
      "Outcome to BCS1 (1970)   |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     17,196       90.48       90.48\n",
      "   4. Other unproductive |         18        0.09       90.57\n",
      "           6. Not Issued |      1,792        9.43      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     19,006      100.00\n",
      "\n",
      ". rename OUTCME01 bcs_0outcome\n",
      "\n",
      ".     label variable bcs_0outcome \"BCS response outcome 1970 (age 0)\"\n",
      "\n",
      ".     \n",
      ". *Outcome of the age 5 survey\n",
      "\n",
      ". tab OUTCME02\n",
      "\n",
      "Outcome to BCS2 (1975)   |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     13,135       69.11       69.11\n",
      "   4. Other unproductive |      3,256       17.13       86.24\n",
      "           6. Not Issued |      2,016       10.61       96.85\n",
      "                 8. Dead |        599        3.15      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     19,006      100.00\n",
      "\n",
      ". rename OUTCME02 bcs_5outcome\n",
      "\n",
      ".     label variable bcs_5outcome \"BCS response outcome 1975 (age 5)\"\n",
      "\n",
      ".     \n",
      ". *Outcome of the age 10 survey\n",
      "\n",
      ". tab OUTCME03\n",
      "\n",
      "Outcome to BCS3 (1980)   |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     14,869       78.23       78.23\n",
      "   4. Other unproductive |      2,381       12.53       90.76\n",
      "           6. Not Issued |      1,146        6.03       96.79\n",
      "                 8. Dead |        610        3.21      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     19,006      100.00\n",
      "\n",
      ". rename OUTCME03 bcs_10outcome\n",
      "\n",
      ".     label variable bcs_10outcome \"BCS response outcome 1980 (age 10)\"\n",
      "\n",
      ". \n",
      ". *Here we create a simple dummy variable to indicate whether the cohort\n",
      "\n",
      ". *member had a productive interview at the age 10 survey\n",
      "\n",
      ". tab bcs_10outcome\n",
      "\n",
      "    BCS response outcome |\n",
      "           1980 (age 10) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     14,869       78.23       78.23\n",
      "   4. Other unproductive |      2,381       12.53       90.76\n",
      "           6. Not Issued |      1,146        6.03       96.79\n",
      "                 8. Dead |        610        3.21      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     19,006      100.00\n",
      "\n",
      ".     gen sweeptestoutcome = 0\n",
      "\n",
      ".     replace sweeptestoutcome = 1 if (bcs_10outcome==1)\n",
      "(14,869 real changes made)\n",
      "\n",
      ".     tab bcs_10outcome sweeptestoutcome\n",
      "\n",
      " BCS response outcome |   sweeptestoutcome\n",
      "        1980 (age 10) |         0          1 |     Total\n",
      "----------------------+----------------------+----------\n",
      "        1. Productive |         0     14,869 |    14,869 \n",
      "4. Other unproductive |     2,381          0 |     2,381 \n",
      "        6. Not Issued |     1,146          0 |     1,146 \n",
      "              8. Dead |       610          0 |       610 \n",
      "----------------------+----------------------+----------\n",
      "                Total |     4,137     14,869 |    19,006 \n",
      "\n",
      "\n",
      ".     label variable sweeptestoutcome \"Productive at age 10 survey\"\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". save $path3\\temp11.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp11.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp11.dta saved\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "****Information on response \n",
    "\n",
    "use $path1\\ARCHIVE\\BCS\\response\\bcs_response.dta, clear\n",
    "\n",
    "keep BCSID OUTCME01 OUTCME02 OUTCME03 SEX\n",
    "numlabel, add\n",
    "\n",
    "capture drop bcs_male\n",
    "    gen bcs_male = .\n",
    "    replace bcs_male = 1 if (SEX==1)\n",
    "    replace bcs_male = 0 if (SEX==2)\n",
    "    label variable bcs_male \"BCS Cohort member Male\"\n",
    "    label define yesno 1 \"Yes\" 0 \"No\", replace\n",
    "    label values bcs_male yesno\n",
    "    tab bcs_male, mi\n",
    "    tab bcs_male SEX\n",
    "    drop SEX\n",
    "\n",
    "rename BCSID bcsid\n",
    "\n",
    "*Outcome of the first survey\n",
    "tab OUTCME01\n",
    "rename OUTCME01 bcs_0outcome\n",
    "    label variable bcs_0outcome \"BCS response outcome 1970 (age 0)\"\n",
    "    \n",
    "*Outcome of the age 5 survey\n",
    "tab OUTCME02\n",
    "rename OUTCME02 bcs_5outcome\n",
    "    label variable bcs_5outcome \"BCS response outcome 1975 (age 5)\"\n",
    "    \n",
    "*Outcome of the age 10 survey\n",
    "tab OUTCME03\n",
    "rename OUTCME03 bcs_10outcome\n",
    "    label variable bcs_10outcome \"BCS response outcome 1980 (age 10)\"\n",
    "\n",
    "*Here we create a simple dummy variable to indicate whether the cohort\n",
    "*member had a productive interview at the age 10 survey\n",
    "tab bcs_10outcome\n",
    "    gen sweeptestoutcome = 0\n",
    "    replace sweeptestoutcome = 1 if (bcs_10outcome==1)\n",
    "    tab bcs_10outcome sweeptestoutcome\n",
    "    label variable sweeptestoutcome \"Productive at age 10 survey\"\n",
    "\n",
    "sort bcsid\n",
    "save $path3\\temp11.dta, replace\n",
    "\n",
    "* return to jupyter"
   ]
  },
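  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The productive-interview indicator above is built in two steps (`gen` followed by `replace`). A logical expression gives the same values in one line. The sketch below uses toy outcome codes, and `outcome` and `productive` are hypothetical variable names.\n",
    "\n",
    "```stata\n",
    "* Toy response outcome codes (1 = productive; other codes = not productive)\n",
    "clear\n",
    "input outcome\n",
    "1\n",
    "4\n",
    "6\n",
    "8\n",
    "end\n",
    "\n",
    "* A logical expression evaluates to 1 when true and 0 when false\n",
    "gen byte productive = (outcome == 1)\n",
    "\n",
    "* Check the dummy against the original codes\n",
    "assert productive == 1 if outcome == 1\n",
    "assert productive == 0 if outcome != 1\n",
    "tab outcome productive\n",
    "```\n",
    "\n",
    "As in the two-step version, a missing outcome code would be set to 0 rather than to missing."
   ]
  },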
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "We also clean some additional variables that may be used when producing the weights and in the multiple imputation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path1\\ARCHIVE\\BCS\\S1\\bcs1derived.dta, clear\n",
      "\n",
      ". \n",
      ". keep BCSID BD1MAGE BD1REGN BD1AGEFB BD1FAGE BD1MAGM\n",
      "\n",
      ". \n",
      ". numlabel, add\n",
      "\n",
      ". \n",
      ". *Mother's age at cohort member's birth\n",
      "\n",
      ". tab BD1MAGE\n",
      "\n",
      " 1970: Age of mother at CM's |\n",
      "birth (from s1 var a0005a/s2 |\n",
      "                 var e008)   |      Freq.     Percent        Cum.\n",
      "-----------------------------+-----------------------------------\n",
      "-8. No information available |         33        0.19        0.19\n",
      "                          14 |          2        0.01        0.20\n",
      "                          15 |         26        0.15        0.35\n",
      "                          16 |        130        0.76        1.11\n",
      "                          17 |        304        1.77        2.88\n",
      "                          18 |        517        3.01        5.89\n",
      "                          19 |        689        4.01        9.89\n",
      "                          20 |        926        5.38       15.28\n",
      "                          21 |      1,104        6.42       21.70\n",
      "                          22 |      1,349        7.84       29.54\n",
      "                          23 |      1,489        8.66       38.20\n",
      "                          24 |      1,226        7.13       45.33\n",
      "                          25 |      1,308        7.61       52.94\n",
      "                          26 |      1,200        6.98       59.92\n",
      "                          27 |      1,124        6.54       66.45\n",
      "                          28 |        848        4.93       71.38\n",
      "                          29 |        820        4.77       76.15\n",
      "                          30 |        728        4.23       80.38\n",
      "                          31 |        618        3.59       83.98\n",
      "                          32 |        499        2.90       86.88\n",
      "                          33 |        409        2.38       89.26\n",
      "                          34 |        364        2.12       91.38\n",
      "                          35 |        330        1.92       93.29\n",
      "                          36 |        236        1.37       94.67\n",
      "                          37 |        218        1.27       95.94\n",
      "                          38 |        186        1.08       97.02\n",
      "                          39 |        161        0.94       97.95\n",
      "                          40 |        128        0.74       98.70\n",
      "                          41 |         85        0.49       99.19\n",
      "                          42 |         61        0.35       99.55\n",
      "                          43 |         36        0.21       99.76\n",
      "                          44 |         21        0.12       99.88\n",
      "                          45 |          7        0.04       99.92\n",
      "                          46 |          7        0.04       99.96\n",
      "                          47 |          2        0.01       99.97\n",
      "                          49 |          1        0.01       99.98\n",
      "                          50 |          1        0.01       99.98\n",
      "                          51 |          1        0.01       99.99\n",
      "                          52 |          1        0.01       99.99\n",
      "                          53 |          1        0.01      100.00\n",
      "-----------------------------+-----------------------------------\n",
      "                       Total |     17,196      100.00\n",
      "\n",
      ".     recode BD1MAGE (-8=.)\n",
      "(BD1MAGE: 33 changes made)\n",
      "\n",
      ".     rename BD1MAGE bcs_mumagebirth\n",
      "\n",
      ".     label variable bcs_mumagebirth \"BCS Mother's Age at Cohort Member's Birth\"\n",
      "\n",
      ".     tab bcs_mumagebirth\n",
      "\n",
      "  BCS Mother's Age at Cohort |\n",
      "              Member's Birth |      Freq.     Percent        Cum.\n",
      "-----------------------------+-----------------------------------\n",
      "                          14 |          2        0.01        0.01\n",
      "                          15 |         26        0.15        0.16\n",
      "                          16 |        130        0.76        0.92\n",
      "                          17 |        304        1.77        2.69\n",
      "                          18 |        517        3.01        5.70\n",
      "                          19 |        689        4.01        9.72\n",
      "                          20 |        926        5.40       15.11\n",
      "                          21 |      1,104        6.43       21.55\n",
      "                          22 |      1,349        7.86       29.41\n",
      "                          23 |      1,489        8.68       38.08\n",
      "                          24 |      1,226        7.14       45.23\n",
      "                          25 |      1,308        7.62       52.85\n",
      "                          26 |      1,200        6.99       59.84\n",
      "                          27 |      1,124        6.55       66.39\n",
      "                          28 |        848        4.94       71.33\n",
      "                          29 |        820        4.78       76.11\n",
      "                          30 |        728        4.24       80.35\n",
      "                          31 |        618        3.60       83.95\n",
      "                          32 |        499        2.91       86.86\n",
      "                          33 |        409        2.38       89.24\n",
      "                          34 |        364        2.12       91.36\n",
      "                          35 |        330        1.92       93.28\n",
      "                          36 |        236        1.38       94.66\n",
      "                          37 |        218        1.27       95.93\n",
      "                          38 |        186        1.08       97.01\n",
      "                          39 |        161        0.94       97.95\n",
      "                          40 |        128        0.75       98.69\n",
      "                          41 |         85        0.50       99.19\n",
      "                          42 |         61        0.36       99.55\n",
      "                          43 |         36        0.21       99.76\n",
      "                          44 |         21        0.12       99.88\n",
      "                          45 |          7        0.04       99.92\n",
      "                          46 |          7        0.04       99.96\n",
      "                          47 |          2        0.01       99.97\n",
      "                          49 |          1        0.01       99.98\n",
      "                          50 |          1        0.01       99.98\n",
      "                          51 |          1        0.01       99.99\n",
      "                          52 |          1        0.01       99.99\n",
      "                          53 |          1        0.01      100.00\n",
      "-----------------------------+-----------------------------------\n",
      "                       Total |     17,163      100.00\n",
      "\n",
      ". \n",
      ". *Father's age at cohort member's birth\n",
      "\n",
      ". tab BD1FAGE\n",
      "\n",
      " 1970: Age of father at CM's |\n",
      "    birth (from s2 var e009) |      Freq.     Percent        Cum.\n",
      "-----------------------------+-----------------------------------\n",
      "-8. No information available |      4,729       27.50       27.50\n",
      "    -1. N/A No father figure |        529        3.08       30.58\n",
      "                          14 |          1        0.01       30.58\n",
      "                          15 |          4        0.02       30.61\n",
      "                          16 |         11        0.06       30.67\n",
      "                          17 |         40        0.23       30.90\n",
      "                          18 |         89        0.52       31.42\n",
      "                          19 |        190        1.10       32.53\n",
      "                          20 |        252        1.47       33.99\n",
      "                          21 |        426        2.48       36.47\n",
      "                          22 |        686        3.99       40.46\n",
      "                          23 |        721        4.19       44.65\n",
      "                          24 |        741        4.31       48.96\n",
      "                          25 |        831        4.83       53.79\n",
      "                          26 |        897        5.22       59.01\n",
      "                          27 |        842        4.90       63.90\n",
      "                          28 |        700        4.07       67.98\n",
      "                          29 |        685        3.98       71.96\n",
      "                          30 |        736        4.28       76.24\n",
      "                          31 |        563        3.27       79.51\n",
      "                          32 |        502        2.92       82.43\n",
      "                          33 |        500        2.91       85.34\n",
      "                          34 |        417        2.42       87.76\n",
      "                          35 |        304        1.77       89.53\n",
      "                          36 |        316        1.84       91.37\n",
      "                          37 |        228        1.33       92.70\n",
      "                          38 |        245        1.42       94.12\n",
      "                          39 |        160        0.93       95.05\n",
      "                          40 |        178        1.04       96.09\n",
      "                          41 |        122        0.71       96.80\n",
      "                          42 |        102        0.59       97.39\n",
      "                          43 |         91        0.53       97.92\n",
      "                          44 |         69        0.40       98.32\n",
      "                          45 |         50        0.29       98.61\n",
      "                          46 |         26        0.15       98.76\n",
      "                          47 |         40        0.23       98.99\n",
      "                          48 |         39        0.23       99.22\n",
      "                          49 |         27        0.16       99.38\n",
      "                          50 |         26        0.15       99.53\n",
      "                          51 |         11        0.06       99.59\n",
      "                          52 |         13        0.08       99.67\n",
      "                          53 |          5        0.03       99.70\n",
      "                          54 |          8        0.05       99.74\n",
      "                          55 |          8        0.05       99.79\n",
      "                          56 |          6        0.03       99.83\n",
      "                          57 |          5        0.03       99.85\n",
      "                          58 |          4        0.02       99.88\n",
      "                          59 |          5        0.03       99.91\n",
      "                          60 |          1        0.01       99.91\n",
      "                          61 |          1        0.01       99.92\n",
      "                          62 |          2        0.01       99.93\n",
      "                          63 |          3        0.02       99.95\n",
      "                          64 |          2        0.01       99.96\n",
      "                          65 |          3        0.02       99.98\n",
      "                          67 |          1        0.01       99.98\n",
      "                          68 |          1        0.01       99.99\n",
      "                          70 |          1        0.01       99.99\n",
      "                          72 |          1        0.01      100.00\n",
      "-----------------------------+-----------------------------------\n",
      "                       Total |     17,196      100.00\n",
      "\n",
      ".     recode BD1FAGE (-8=.) (-1=.)\n",
      "(BD1FAGE: 5258 changes made)\n",
      "\n",
      ".     rename BD1FAGE bcs_dadagebirth\n",
      "\n",
      ".     label variable bcs_dadagebirth \"BCS Father's Age at Cohort Member's Birth\"\n",
      "\n",
      ".     tab bcs_dadagebirth\n",
      "\n",
      "  BCS Father's Age at Cohort |\n",
      "              Member's Birth |      Freq.     Percent        Cum.\n",
      "-----------------------------+-----------------------------------\n",
      "                          14 |          1        0.01        0.01\n",
      "                          15 |          4        0.03        0.04\n",
      "                          16 |         11        0.09        0.13\n",
      "                          17 |         40        0.34        0.47\n",
      "                          18 |         89        0.75        1.21\n",
      "                          19 |        190        1.59        2.81\n",
      "                          20 |        252        2.11        4.92\n",
      "                          21 |        426        3.57        8.49\n",
      "                          22 |        686        5.75       14.23\n",
      "                          23 |        721        6.04       20.27\n",
      "                          24 |        741        6.21       26.48\n",
      "                          25 |        831        6.96       33.44\n",
      "                          26 |        897        7.51       40.95\n",
      "                          27 |        842        7.05       48.01\n",
      "                          28 |        700        5.86       53.87\n",
      "                          29 |        685        5.74       59.61\n",
      "                          30 |        736        6.17       65.77\n",
      "                          31 |        563        4.72       70.49\n",
      "                          32 |        502        4.21       74.69\n",
      "                          33 |        500        4.19       78.88\n",
      "                          34 |        417        3.49       82.38\n",
      "                          35 |        304        2.55       84.92\n",
      "                          36 |        316        2.65       87.57\n",
      "                          37 |        228        1.91       89.48\n",
      "                          38 |        245        2.05       91.53\n",
      "                          39 |        160        1.34       92.87\n",
      "                          40 |        178        1.49       94.36\n",
      "                          41 |        122        1.02       95.38\n",
      "                          42 |        102        0.85       96.24\n",
      "                          43 |         91        0.76       97.00\n",
      "                          44 |         69        0.58       97.58\n",
      "                          45 |         50        0.42       98.00\n",
      "                          46 |         26        0.22       98.22\n",
      "                          47 |         40        0.34       98.55\n",
      "                          48 |         39        0.33       98.88\n",
      "                          49 |         27        0.23       99.10\n",
      "                          50 |         26        0.22       99.32\n",
      "                          51 |         11        0.09       99.41\n",
      "                          52 |         13        0.11       99.52\n",
      "                          53 |          5        0.04       99.56\n",
      "                          54 |          8        0.07       99.63\n",
      "                          55 |          8        0.07       99.70\n",
      "                          56 |          6        0.05       99.75\n",
      "                          57 |          5        0.04       99.79\n",
      "                          58 |          4        0.03       99.82\n",
      "                          59 |          5        0.04       99.87\n",
      "                          60 |          1        0.01       99.87\n",
      "                          61 |          1        0.01       99.88\n",
      "                          62 |          2        0.02       99.90\n",
      "                          63 |          3        0.03       99.92\n",
      "                          64 |          2        0.02       99.94\n",
      "                          65 |          3        0.03       99.97\n",
      "                          67 |          1        0.01       99.97\n",
      "                          68 |          1        0.01       99.98\n",
      "                          70 |          1        0.01       99.99\n",
      "                          72 |          1        0.01      100.00\n",
      "-----------------------------+-----------------------------------\n",
      "                       Total |     11,938      100.00\n",
      "\n",
      ". \n",
      ". *Mother was married at cohort member's birth\n",
      "\n",
      ". tab BD1MAGM\n",
      "\n",
      "      1970: Age of mother at |\n",
      "           present marriage  |      Freq.     Percent        Cum.\n",
      "-----------------------------+-----------------------------------\n",
      "-8. No information available |        185        1.08        1.08\n",
      "         -1. N/A Not married |      1,000        5.82        6.89\n",
      "                           5 |          1        0.01        6.90\n",
      "                           9 |          1        0.01        6.90\n",
      "                          11 |          1        0.01        6.91\n",
      "                          12 |          4        0.02        6.93\n",
      "                          13 |         17        0.10        7.03\n",
      "                          14 |         15        0.09        7.12\n",
      "                          15 |        182        1.06        8.18\n",
      "                          16 |        590        3.43       11.61\n",
      "                          17 |      1,123        6.53       18.14\n",
      "                          18 |      1,891       11.00       29.13\n",
      "                          19 |      2,334       13.57       42.71\n",
      "                          20 |      2,521       14.66       57.37\n",
      "                          21 |      2,053       11.94       69.31\n",
      "                          22 |      1,488        8.65       77.96\n",
      "                          23 |      1,091        6.34       84.30\n",
      "                          24 |        716        4.16       88.47\n",
      "                          25 |        524        3.05       91.52\n",
      "                          26 |        317        1.84       93.36\n",
      "                          27 |        281        1.63       94.99\n",
      "                          28 |        198        1.15       96.14\n",
      "                          29 |        173        1.01       97.15\n",
      "                          30 |        101        0.59       97.74\n",
      "                          31 |         94        0.55       98.28\n",
      "                          32 |         78        0.45       98.74\n",
      "                          33 |         53        0.31       99.05\n",
      "                          34 |         39        0.23       99.27\n",
      "                          35 |         30        0.17       99.45\n",
      "                          36 |         32        0.19       99.63\n",
      "                          37 |         21        0.12       99.76\n",
      "                          38 |         16        0.09       99.85\n",
      "                          39 |         13        0.08       99.92\n",
      "                          40 |          9        0.05       99.98\n",
      "                          41 |          1        0.01       99.98\n",
      "                          42 |          1        0.01       99.99\n",
      "                          44 |          1        0.01       99.99\n",
      "                          51 |          1        0.01      100.00\n",
      "-----------------------------+-----------------------------------\n",
      "                       Total |     17,196      100.00\n",
      "\n",
      ".     capture drop bcs_mummarried\n",
      "\n",
      ".     gen bcs_mummarried = .\n",
      "(17,196 missing values generated)\n",
      "\n",
      ".     replace bcs_mummarried = 1 if (BD1MAGM>=5)\n",
      "(16,011 real changes made)\n",
      "\n",
      ".     replace bcs_mummarried = 0 if (BD1MAGM==-1)\n",
      "(1,000 real changes made)\n",
      "\n",
      ".     label variable bcs_mummarried \"BCS Mother married at Cohort Member's Birth\"\n",
      "\n",
      ".     label define yesno 1 \"Yes\" 0 \"No\"\n",
      "\n",
      ".     label values bcs_mummarried yesno\n",
      "\n",
      ".     tab bcs_mummarried, mi\n",
      "\n",
      " BCS Mother |\n",
      " married at |\n",
      "     Cohort |\n",
      "   Member's |\n",
      "      Birth |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "         No |      1,000        5.82        5.82\n",
      "        Yes |     16,011       93.11       98.92\n",
      "          . |        185        1.08      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,196      100.00\n",
      "\n",
      ".     drop BD1MAGM\n",
      "\n",
      ". \n",
      ". *Mother's age at first birth\n",
      "\n",
      ". tab BD1AGEFB\n",
      "\n",
      "1970: Age of mother at first |\n",
      "                     birth   |      Freq.     Percent        Cum.\n",
      "-----------------------------+-----------------------------------\n",
      "-8. No information available |        118        0.69        0.69\n",
      "                          12 |          1        0.01        0.69\n",
      "                          13 |          5        0.03        0.72\n",
      "                          14 |         34        0.20        0.92\n",
      "                          15 |        127        0.74        1.66\n",
      "                          16 |        450        2.62        4.27\n",
      "                          17 |        947        5.51        9.78\n",
      "                          18 |      1,411        8.21       17.99\n",
      "                          19 |      1,662        9.67       27.65\n",
      "                          20 |      1,855       10.79       38.44\n",
      "                          21 |      1,887       10.97       49.41\n",
      "                          22 |      1,848       10.75       60.16\n",
      "                          23 |      1,525        8.87       69.03\n",
      "                          24 |      1,287        7.48       76.51\n",
      "                          25 |      1,080        6.28       82.79\n",
      "                          26 |        782        4.55       87.34\n",
      "                          27 |        581        3.38       90.72\n",
      "                          28 |        415        2.41       93.13\n",
      "                          29 |        314        1.83       94.96\n",
      "                          30 |        235        1.37       96.32\n",
      "                          31 |        147        0.85       97.18\n",
      "                          32 |        130        0.76       97.94\n",
      "                          33 |         87        0.51       98.44\n",
      "                          34 |         72        0.42       98.86\n",
      "                          35 |         55        0.32       99.18\n",
      "                          36 |         33        0.19       99.37\n",
      "                          37 |         25        0.15       99.52\n",
      "                          38 |         25        0.15       99.66\n",
      "                          39 |         20        0.12       99.78\n",
      "                          40 |         19        0.11       99.89\n",
      "                          41 |          4        0.02       99.91\n",
      "                          42 |          7        0.04       99.95\n",
      "                          43 |          5        0.03       99.98\n",
      "                          45 |          1        0.01       99.99\n",
      "                          46 |          1        0.01       99.99\n",
      "                          47 |          1        0.01      100.00\n",
      "-----------------------------+-----------------------------------\n",
      "                       Total |     17,196      100.00\n",
      "\n",
      ".     recode BD1AGEFB (-8=.)\n",
      "(BD1AGEFB: 118 changes made)\n",
      "\n",
      ".     rename BD1AGEFB bcs_mumagefirstbirth\n",
      "\n",
      ".     label variable bcs_mumagefirstbirth \"BCS Mother's Age at First Birth\"\n",
      "\n",
      ".     tab bcs_mumagefirstbirth\n",
      "\n",
      "   BCS Mother's Age at First |\n",
      "                       Birth |      Freq.     Percent        Cum.\n",
      "-----------------------------+-----------------------------------\n",
      "                          12 |          1        0.01        0.01\n",
      "                          13 |          5        0.03        0.04\n",
      "                          14 |         34        0.20        0.23\n",
      "                          15 |        127        0.74        0.98\n",
      "                          16 |        450        2.63        3.61\n",
      "                          17 |        947        5.55        9.16\n",
      "                          18 |      1,411        8.26       17.42\n",
      "                          19 |      1,662        9.73       27.15\n",
      "                          20 |      1,855       10.86       38.01\n",
      "                          21 |      1,887       11.05       49.06\n",
      "                          22 |      1,848       10.82       59.88\n",
      "                          23 |      1,525        8.93       68.81\n",
      "                          24 |      1,287        7.54       76.35\n",
      "                          25 |      1,080        6.32       82.67\n",
      "                          26 |        782        4.58       87.25\n",
      "                          27 |        581        3.40       90.65\n",
      "                          28 |        415        2.43       93.08\n",
      "                          29 |        314        1.84       94.92\n",
      "                          30 |        235        1.38       96.30\n",
      "                          31 |        147        0.86       97.16\n",
      "                          32 |        130        0.76       97.92\n",
      "                          33 |         87        0.51       98.43\n",
      "                          34 |         72        0.42       98.85\n",
      "                          35 |         55        0.32       99.17\n",
      "                          36 |         33        0.19       99.37\n",
      "                          37 |         25        0.15       99.51\n",
      "                          38 |         25        0.15       99.66\n",
      "                          39 |         20        0.12       99.78\n",
      "                          40 |         19        0.11       99.89\n",
      "                          41 |          4        0.02       99.91\n",
      "                          42 |          7        0.04       99.95\n",
      "                          43 |          5        0.03       99.98\n",
      "                          45 |          1        0.01       99.99\n",
      "                          46 |          1        0.01       99.99\n",
      "                          47 |          1        0.01      100.00\n",
      "-----------------------------+-----------------------------------\n",
      "                       Total |     17,078      100.00\n",
      "\n",
      ". \n",
      ". *Region at the cohort member's birth\n",
      "\n",
      ". tab BD1REGN\n",
      "\n",
      "  1970: Standard Region |\n",
      "         of residence   |      Freq.     Percent        Cum.\n",
      "------------------------+-----------------------------------\n",
      "               1. North |      1,023        5.95        5.95\n",
      "2. Yorks and Humberside |      1,486        8.64       14.59\n",
      "       3. East Midlands |      1,036        6.02       20.62\n",
      "         4. East Anglia |        539        3.13       23.75\n",
      "          5. South East |      5,022       29.20       52.95\n",
      "          6. South West |      1,051        6.11       59.07\n",
      "       7. West Midlands |      1,745       10.15       69.21\n",
      "          8. North West |      2,170       12.62       81.83\n",
      "               9. Wales |        879        5.11       86.94\n",
      "           10. Scotland |      1,617        9.40       96.35\n",
      "   11. Northern Ireland |        628        3.65      100.00\n",
      "------------------------+-----------------------------------\n",
      "                  Total |     17,196      100.00\n",
      "\n",
      ".     rename BD1REGN bcs_region\n",
      "\n",
      ".     label variable bcs_region \"BCS Region at Birth\"\n",
      "\n",
      ". \n",
      ". rename BCSID bcsid\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". \n",
      ". save $path3\\temp12.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp12.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp12.dta saved\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path1\\ARCHIVE\\BCS\\S1\\bcs1derived.dta, clear\n",
    "\n",
    "keep BCSID BD1MAGE BD1REGN BD1AGEFB BD1FAGE BD1MAGM\n",
    "\n",
    "numlabel, add\n",
    "\n",
    "*Mother's age at cohort member's birth\n",
    "tab BD1MAGE\n",
    "    recode BD1MAGE (-8=.)\n",
    "    rename BD1MAGE bcs_mumagebirth\n",
    "    label variable bcs_mumagebirth \"BCS Mother's Age at Cohort Member's Birth\"\n",
    "    tab bcs_mumagebirth\n",
    "\n",
    "*Father's age at cohort member's birth\n",
    "tab BD1FAGE\n",
    "    recode BD1FAGE (-8=.) (-1=.)\n",
    "    rename BD1FAGE bcs_dadagebirth\n",
    "    label variable bcs_dadagebirth \"BCS Father's Age at Cohort Member's Birth\"\n",
    "    tab bcs_dadagebirth\n",
    "\n",
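    "*Mother was married at cohort member's birth, derived from BD1MAGM\n",
    "*(mother's age at present marriage; -1 = not married, -8 = no information)\n",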
    "tab BD1MAGM\n",
    "    capture drop bcs_mummarried\n",
    "    gen bcs_mummarried = .\n",
    "    replace bcs_mummarried = 1 if (BD1MAGM>=5 & !missing(BD1MAGM))\n",
    "    replace bcs_mummarried = 0 if (BD1MAGM==-1)\n",
    "    label variable bcs_mummarried \"BCS Mother married at Cohort Member's Birth\"\n",
    "    label define yesno 1 \"Yes\" 0 \"No\"\n",
    "    label values bcs_mummarried yesno\n",
    "    tab bcs_mummarried, mi\n",
    "    drop BD1MAGM\n",
    "\n",
    "*Mother's age at first birth\n",
    "tab BD1AGEFB\n",
    "    recode BD1AGEFB (-8=.)\n",
    "    rename BD1AGEFB bcs_mumagefirstbirth\n",
    "    label variable bcs_mumagefirstbirth \"BCS Mother's Age at First Birth\"\n",
    "    tab bcs_mumagefirstbirth\n",
    "\n",
    "*Region at the cohort member's birth\n",
    "tab BD1REGN\n",
    "    rename BD1REGN bcs_region\n",
    "    label variable bcs_region \"BCS Region at Birth\"\n",
    "\n",
    "rename BCSID bcsid\n",
    "\n",
    "sort bcsid\n",
    "\n",
    "save $path3\\temp12.dta, replace\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path1\\ARCHIVE\\BCS\\S1\\bcs7072a.dta, clear\n",
      "\n",
      ". \n",
      ". keep bcsid a0166 a0037 a0038 a0297 a0255 \n",
      "\n",
      ". numlabel, add\n",
      "\n",
      ". \n",
      ". tab a0166\n",
      "\n",
      "             Parity |      Freq.     Percent        Cum.\n",
      "--------------------+-----------------------------------\n",
      "      -2. Not Known |         32        0.19        0.19\n",
      "                  0 |      6,389       37.15       37.34\n",
      "                  1 |      5,520       32.10       69.44\n",
      "                  2 |      2,787       16.21       85.65\n",
      "                  3 |      1,266        7.36       93.01\n",
      "                  4 |        609        3.54       96.55\n",
      "                  5 |        297        1.73       98.28\n",
      "                  6 |        136        0.79       99.07\n",
      "                  7 |         74        0.43       99.50\n",
      "                  8 |         36        0.21       99.71\n",
      "                  9 |         23        0.13       99.84\n",
      "                 10 |         11        0.06       99.91\n",
      "                 11 |          9        0.05       99.96\n",
      "                 12 |          3        0.02       99.98\n",
      "                 13 |          2        0.01       99.99\n",
      "                 14 |          1        0.01       99.99\n",
      "                 17 |          1        0.01      100.00\n",
      "--------------------+-----------------------------------\n",
      "              Total |     17,196      100.00\n",
      "\n",
      ". recode a0166 (-2=.)\n",
      "(a0166: 32 changes made)\n",
      "\n",
      ". rename a0166 bcs_parity\n",
      "\n",
      ". label variable bcs_parity \"BCS Parity at Birth\"\n",
      "\n",
      ". tab bcs_parity\n",
      "\n",
      "BCS Parity at Birth |      Freq.     Percent        Cum.\n",
      "--------------------+-----------------------------------\n",
      "                  0 |      6,389       37.22       37.22\n",
      "                  1 |      5,520       32.16       69.38\n",
      "                  2 |      2,787       16.24       85.62\n",
      "                  3 |      1,266        7.38       93.00\n",
      "                  4 |        609        3.55       96.55\n",
      "                  5 |        297        1.73       98.28\n",
      "                  6 |        136        0.79       99.07\n",
      "                  7 |         74        0.43       99.50\n",
      "                  8 |         36        0.21       99.71\n",
      "                  9 |         23        0.13       99.84\n",
      "                 10 |         11        0.06       99.91\n",
      "                 11 |          9        0.05       99.96\n",
      "                 12 |          3        0.02       99.98\n",
      "                 13 |          2        0.01       99.99\n",
      "                 14 |          1        0.01       99.99\n",
      "                 17 |          1        0.01      100.00\n",
      "--------------------+-----------------------------------\n",
      "              Total |     17,164      100.00\n",
      "\n",
      ". \n",
      ". *Mother attended mothercraft classes\n",
      "\n",
      ". tab a0037\n",
      "\n",
      "   Mothercraft classes |      Freq.     Percent        Cum.\n",
      "-----------------------+-----------------------------------\n",
      "        -3. Not Stated |         90        0.52        0.52\n",
      "         -2. Not Known |         33        0.19        0.72\n",
      "               1. None |     12,372       71.95       72.66\n",
      "2. Individual Instruct |        940        5.47       78.13\n",
      "   3. LHA Clinic Class |      1,923       11.18       89.31\n",
      "     4. Hospital Class |      1,417        8.24       97.55\n",
      "              5. Other |        201        1.17       98.72\n",
      "              6. 2 & 3 |         98        0.57       99.29\n",
      "              7. 3 & 4 |         83        0.48       99.77\n",
      "              8. 2 & 4 |         39        0.23      100.00\n",
      "-----------------------+-----------------------------------\n",
      "                 Total |     17,196      100.00\n",
      "\n",
      ". capture drop bcs_mothercraft\n",
      "\n",
      ".     gen bcs_mothercraft = .\n",
      "(17,196 missing values generated)\n",
      "\n",
      ".     replace bcs_mothercraft = 1 if (a0037>=2)&(a0037<=8)\n",
      "(4,701 real changes made)\n",
      "\n",
      ".     replace bcs_mothercraft = 0 if (a0037==1)\n",
      "(12,372 real changes made)\n",
      "\n",
      ".     label variable bcs_mothercraft \"BCS Mother Attended Mothercraft Classes\"\n",
      "\n",
      ".     label define yesno 1 \"Yes\" 0 \"No\", replace\n",
      "\n",
      ".     label values bcs_mothercraft yesno\n",
      "\n",
      ".     numlabel, add\n",
      "\n",
      ".     tab bcs_mothercraft, mi\n",
      "\n",
      " BCS Mother |\n",
      "   Attended |\n",
      "Mothercraft |\n",
      "    Classes |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |     12,372       71.95       71.95\n",
      "     1. Yes |      4,701       27.34       99.28\n",
      "          . |        123        0.72      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,196      100.00\n",
      "\n",
      ". \n",
      ". *Mother attended labour classes\n",
      "\n",
      ". tab a0038\n",
      "\n",
      "    Labour-preparation |\n",
      "               classes |      Freq.     Percent        Cum.\n",
      "-----------------------+-----------------------------------\n",
      "        -3. Not Stated |        105        0.61        0.61\n",
      "         -2. Not Known |         34        0.20        0.81\n",
      "               1. None |     12,550       72.98       73.79\n",
      "2. Individual Instruct |        641        3.73       77.52\n",
      "   3. LHA Clinic Class |      2,192       12.75       90.27\n",
      "     4. Hospital Class |      1,326        7.71       97.98\n",
      "              5. Other |        251        1.46       99.44\n",
      "              6. 2 & 3 |         42        0.24       99.68\n",
      "              7. 3 & 4 |         44        0.26       99.94\n",
      "              8. 2 & 4 |         11        0.06      100.00\n",
      "-----------------------+-----------------------------------\n",
      "                 Total |     17,196      100.00\n",
      "\n",
      ". capture drop bcs_labourclass\n",
      "\n",
      ".     gen bcs_labourclass = .\n",
      "(17,196 missing values generated)\n",
      "\n",
      ".     replace bcs_labourclass = 1 if (a0038>=2)&(a0038<=8)\n",
      "(4,507 real changes made)\n",
      "\n",
      ".     replace bcs_labourclass = 0 if (a0038==1)\n",
      "(12,550 real changes made)\n",
      "\n",
      ".     label variable bcs_labourclass \"BCS Mother Attended Labour Classes\"\n",
      "\n",
      ".     label values bcs_labourclass yesno\n",
      "\n",
      ".     numlabel, add\n",
      "(no value label to be modified)\n",
      "\n",
      ".     tab bcs_labourclass, mi\n",
      "\n",
      " BCS Mother |\n",
      "   Attended |\n",
      "     Labour |\n",
      "    Classes |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |     12,550       72.98       72.98\n",
      "     1. Yes |      4,507       26.21       99.19\n",
      "          . |        139        0.81      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,196      100.00\n",
      "\n",
      ". \n",
      ". drop a0037 a0038\n",
      "\n",
      ". \n",
      ". *Mother attempted breast feeding\n",
      "\n",
      ". tab a0297\n",
      "\n",
      "   Was Lactation |\n",
      "       Attempted |      Freq.     Percent        Cum.\n",
      "-----------------+-----------------------------------\n",
      "  -3. Not Stated |        228        1.33        1.33\n",
      "   -2. Not Known |          3        0.02        1.34\n",
      "    1. Attempted |      6,311       36.70       38.04\n",
      "2. Not Attempted |     10,654       61.96      100.00\n",
      "-----------------+-----------------------------------\n",
      "           Total |     17,196      100.00\n",
      "\n",
      ".     capture drop bcs_breast\n",
      "\n",
      ".     gen bcs_breast = .\n",
      "(17,196 missing values generated)\n",
      "\n",
      ".     replace bcs_breast = 1 if (a0297==1)\n",
      "(6,311 real changes made)\n",
      "\n",
      ".     replace bcs_breast = 0 if (a0297==2)\n",
      "(10,654 real changes made)\n",
      "\n",
      ".     label variable bcs_breast \"BCS Mother Attempted Breast Feeding\"\n",
      "\n",
      ".     label values bcs_breast yesno\n",
      "\n",
      ".     tab bcs_breast, mi\n",
      "\n",
      " BCS Mother |\n",
      "  Attempted |\n",
      "     Breast |\n",
      "    Feeding |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |     10,654       61.96       61.96\n",
      "     1. Yes |      6,311       36.70       98.66\n",
      "          . |        231        1.34      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,196      100.00\n",
      "\n",
      ".     drop a0297\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". \n",
      ". save $path3\\temp13.dta, replace\n",
      "(note: file F:\\Data\\MYDATA\\TEMP\\temp13.dta not found)\n",
      "file F:\\Data\\MYDATA\\TEMP\\temp13.dta saved\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path1\\ARCHIVE\\BCS\\S1\\bcs7072a.dta, clear\n",
    "\n",
    "keep bcsid a0166 a0037 a0038 a0297 a0255 \n",
    "numlabel, add\n",
    "\n",
    "tab a0166\n",
    "recode a0166 (-2=.)\n",
    "rename a0166 bcs_parity\n",
    "label variable bcs_parity \"BCS Parity at Birth\"\n",
    "tab bcs_parity\n",
    "\n",
    "*Mother attended mothercraft classes\n",
    "tab a0037\n",
    "capture drop bcs_mothercraft\n",
    "    gen bcs_mothercraft = .\n",
    "    replace bcs_mothercraft = 1 if (a0037>=2)&(a0037<=8)\n",
    "    replace bcs_mothercraft = 0 if (a0037==1)\n",
    "    label variable bcs_mothercraft \"BCS Mother Attended Mothercraft Classes\"\n",
    "    label define yesno 1 \"Yes\" 0 \"No\", replace\n",
    "    label values bcs_mothercraft yesno\n",
    "    numlabel, add\n",
    "    tab bcs_mothercraft, mi\n",
    "\n",
    "*Mother attended labour classes\n",
    "tab a0038\n",
    "capture drop bcs_labourclass\n",
    "    gen bcs_labourclass = .\n",
    "    replace bcs_labourclass = 1 if (a0038>=2)&(a0038<=8)\n",
    "    replace bcs_labourclass = 0 if (a0038==1)\n",
    "    label variable bcs_labourclass \"BCS Mother Attended Labour Classes\"\n",
    "    label values bcs_labourclass yesno\n",
    "    numlabel, add\n",
    "    tab bcs_labourclass, mi\n",
    "\n",
    "drop a0037 a0038\n",
    "\n",
    "*Mother attempted breast feeding\n",
    "tab a0297\n",
    "    capture drop bcs_breast\n",
    "    gen bcs_breast = .\n",
    "    replace bcs_breast = 1 if (a0297==1)\n",
    "    replace bcs_breast = 0 if (a0297==2)\n",
    "    label variable bcs_breast \"BCS Mother Attempted Breast Feeding\"\n",
    "    label values bcs_breast yesno\n",
    "    tab bcs_breast, mi\n",
    "    drop a0297\n",
    "\n",
    "sort bcsid\n",
    "\n",
    "save $path3\\temp13.dta, replace\n",
    "\n",
    "* return to jupyter"
   ]
  },
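  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `gen`/`replace` pattern used above for the yes/no indicators could also be written with a single `recode`. The block below is a minimal sketch for the mothercraft variable only, not the code that produced the output above. It assumes it is run in place of the `gen`/`replace` lines, while `a0037` is still in memory, and it lets `recode` create a value label named after the new variable instead of reusing `yesno`.\n",
    "\n",
    "```stata\n",
    "* Sketch only: 1 (None) becomes 0, codes 2 to 8 become 1, everything else is set to missing\n",
    "capture drop bcs_mothercraft\n",
    "recode a0037 (1 = 0 \"No\") (2/8 = 1 \"Yes\") (else = .), generate(bcs_mothercraft)\n",
    "label variable bcs_mothercraft \"BCS Mother Attended Mothercraft Classes\"\n",
    "tab bcs_mothercraft, mi\n",
    "```\n",
    "\n",
    "The negative codes (-3 Not Stated, -2 Not Known) fall under the `else` rule and are set to missing, matching the behaviour of the original commands."
   ]
  },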
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Merge the temporary files created above, matching on the cohort member identifier `bcsid`, to create a working BCS data file (`BCS_MAIN.dta`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path3\\temp1.dta, clear\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp2.dta\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         4,835\n",
      "        from master                     4,448  (_merge==1)\n",
      "        from using                        387  (_merge==2)\n",
      "\n",
      "    matched                            12,748  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        17583             0\n",
      "--------------------------------------\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp3.dta\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                           387\n",
      "        from master                       387  (_merge==1)\n",
      "        from using                          0  (_merge==2)\n",
      "\n",
      "    matched                            17,196  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        17583             0\n",
      "--------------------------------------\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp4.dta\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         4,448\n",
      "        from master                     4,448  (_merge==1)\n",
      "        from using                          0  (_merge==2)\n",
      "\n",
      "    matched                            13,135  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        17583             0\n",
      "--------------------------------------\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp5.dta\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         4,252\n",
      "        from master                     3,480  (_merge==1)\n",
      "        from using                        772  (_merge==2)\n",
      "\n",
      "    matched                            14,103  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        18355             0\n",
      "--------------------------------------\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp6.dta\n",
      "(label rgsc already defined)\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         3,485\n",
      "        from master                     3,485  (_merge==1)\n",
      "        from using                          0  (_merge==2)\n",
      "\n",
      "    matched                            14,870  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        18355             0\n",
      "--------------------------------------\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp7.dta\n",
      "(label rgsc already defined)\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         7,074\n",
      "        from master                     6,907  (_merge==1)\n",
      "        from using                        167  (_merge==2)\n",
      "\n",
      "    matched                            11,448  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        18522             0\n",
      "--------------------------------------\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp8.dta\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         3,652\n",
      "        from master                     3,652  (_merge==1)\n",
      "        from using                          0  (_merge==2)\n",
      "\n",
      "    matched                            14,870  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        18522             0\n",
      "--------------------------------------\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp9.dta\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         3,808\n",
      "        from master                     3,728  (_merge==1)\n",
      "        from using                         80  (_merge==2)\n",
      "\n",
      "    matched                            14,794  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        18602             0\n",
      "--------------------------------------\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp10.dta\n",
      "(label nssec already defined)\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         3,728\n",
      "        from master                     3,728  (_merge==1)\n",
      "        from using                          0  (_merge==2)\n",
      "\n",
      "    matched                            14,874  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        18602             0\n",
      "--------------------------------------\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp11.dta\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                           570\n",
      "        from master                        83  (_merge==1)\n",
      "        from using                        487  (_merge==2)\n",
      "\n",
      "    matched                            18,519  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        19089             0\n",
      "--------------------------------------\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp12.dta\n",
      "(label yesno already defined)\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         1,893\n",
      "        from master                     1,893  (_merge==1)\n",
      "        from using                          0  (_merge==2)\n",
      "\n",
      "    matched                            17,196  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        19089             0\n",
      "--------------------------------------\n",
      "\n",
      ".     merge 1:1 bcsid using $path3\\temp13.dta\n",
      "(label A0255 already defined)\n",
      "(label yesno already defined)\n",
      "\n",
      "    Result                           # of obs.\n",
      "    -----------------------------------------\n",
      "    not matched                         1,893\n",
      "        from master                     1,893  (_merge==1)\n",
      "        from using                          0  (_merge==2)\n",
      "\n",
      "    matched                            17,196  (_merge==3)\n",
      "    -----------------------------------------\n",
      "\n",
      ".     drop _merge\n",
      "\n",
      ".     sort bcsid\n",
      "\n",
      ".     duplicates report bcsid\n",
      "\n",
      "Duplicates in terms of bcsid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        19089             0\n",
      "--------------------------------------\n",
      "\n",
      ".     \n",
      ". capture drop cohort\n",
      "\n",
      ".     gen cohort=2\n",
      "\n",
      ".     label variable cohort \"Cohort\"\n",
      "\n",
      ".     label define cohort 1 \"NCDS\" 2 \"BCS\"\n",
      "\n",
      ".     label values cohort cohort\n",
      "\n",
      ".     tab cohort, mi\n",
      "\n",
      "     Cohort |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "        BCS |     19,089      100.00      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     19,089      100.00\n",
      "\n",
      ". \n",
      ". sort bcsid\n",
      "\n",
      ". save $path2\\BCS_MAIN.dta, replace\n",
      "file F:\\Data\\MYDATA\\WORK\\BCS_MAIN.dta saved\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "\n",
    "use $path3\\temp1.dta, clear\n",
    "    sort bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp2.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp3.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp4.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp5.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp6.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp7.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp8.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp9.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp10.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp11.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp12.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    merge 1:1 bcsid using $path3\\temp13.dta\n",
    "    drop _merge\n",
    "    sort bcsid\n",
    "    duplicates report bcsid\n",
    "    \n",
    "capture drop cohort\n",
    "    gen cohort=2\n",
    "    label variable cohort \"Cohort\"\n",
    "    label define cohort 1 \"NCDS\" 2 \"BCS\"\n",
    "    label values cohort cohort\n",
    "    tab cohort, mi\n",
    "\n",
    "sort bcsid\n",
    "save $path2\\BCS_MAIN.dta, replace\n",
    "\n",
    "* return to jupyter\n"
   ]
  },
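  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The merge sequence above repeats the same four commands for each temporary file. A minimal sketch of the same steps written as a loop is given below. It is an illustrative alternative rather than the code that produced the output above. It assumes the `$path3` global is still defined, uses forward slashes in the file path (which Stata accepts on Windows), replaces `drop _merge` with the `nogenerate` option, and swaps `duplicates report` for `isid`, which stops with an error if `bcsid` is missing or does not uniquely identify the observations.\n",
    "\n",
    "```stata\n",
    "* Sketch only: merge temp2 ... temp13 onto temp1 in a loop\n",
    "use \"$path3/temp1.dta\", clear\n",
    "forvalues i = 2/13 {\n",
    "    * nogenerate suppresses _merge, so no drop is needed between merges\n",
    "    merge 1:1 bcsid using \"$path3/temp`i'.dta\", nogenerate\n",
    "    * isid exits with an error unless bcsid uniquely identifies every row\n",
    "    isid bcsid\n",
    "}\n",
    "```\n",
    "\n",
    "The merge result table is still printed for each file, so the matched and unmatched counts can be inspected as before."
   ]
  },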
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Delete the temporary data files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". erase $path3\\temp1.dta\n",
      "\n",
      ". erase $path3\\temp2.dta\n",
      "\n",
      ". erase $path3\\temp3.dta\n",
      "\n",
      ". erase $path3\\temp4.dta\n",
      "\n",
      ". erase $path3\\temp5.dta\n",
      "\n",
      ". erase $path3\\temp6.dta\n",
      "\n",
      ". erase $path3\\temp7.dta\n",
      "\n",
      ". erase $path3\\temp8.dta\n",
      "\n",
      ". erase $path3\\temp9.dta\n",
      "\n",
      ". erase $path3\\temp10.dta\n",
      "\n",
      ". erase $path3\\temp11.dta\n",
      "\n",
      ". erase $path3\\temp12.dta\n",
      "\n",
      ". erase $path3\\temp13.dta\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "erase $path3\\temp1.dta\n",
    "erase $path3\\temp2.dta\n",
    "erase $path3\\temp3.dta\n",
    "erase $path3\\temp4.dta\n",
    "erase $path3\\temp5.dta\n",
    "erase $path3\\temp6.dta\n",
    "erase $path3\\temp7.dta\n",
    "erase $path3\\temp8.dta\n",
    "erase $path3\\temp9.dta\n",
    "erase $path3\\temp10.dta\n",
    "erase $path3\\temp11.dta\n",
    "erase $path3\\temp12.dta\n",
    "erase $path3\\temp13.dta\n",
    "\n",
    "* return to jupyter"
   ]
  },
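  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The thirteen `erase` commands can also be written as a loop. The block below is a sketch that assumes the `$path3` global is defined and that the files are named `temp1.dta` to `temp13.dta`, as above. The `capture` prefix is an addition that is not in the original code: it allows the cell to be re-run without stopping on a file that has already been deleted.\n",
    "\n",
    "```stata\n",
    "* Sketch only: delete temp1.dta ... temp13.dta\n",
    "forvalues i = 1/13 {\n",
    "    * capture ignores the error raised when a file no longer exists\n",
    "    capture erase \"$path3/temp`i'.dta\"\n",
    "}\n",
    "```"
   ]
  },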
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Append the NCDS and BCS data files to form a pooled dataset, and create a new ID variable (`poolid`) that identifies each case across both cohorts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path2\\NCDS_MAIN.dta, clear\n",
      "\n",
      ".     append using $path2\\BCS_MAIN.dta\n",
      "(label cohort already defined)\n",
      "(label OUTCME01 already defined)\n",
      "(label OUTCME02 already defined)\n",
      "(label yesno already defined)\n",
      "(label nssec already defined)\n",
      "(label rgsc already defined)\n",
      "(label ed_cat already defined)\n",
      "(label egp already defined)\n",
      "\n",
      ".     tab cohort, mi\n",
      "\n",
      "     Cohort |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "       NCDS |     18,558       49.29       49.29\n",
      "        BCS |     19,089       50.71      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     37,647      100.00\n",
      "\n",
      ". \n",
      ". *Here we create a new id number for all cases in our dataset\n",
      "\n",
      ". * We probably won't need this, but we create it just in case\n",
      "\n",
      ". capture drop poolid\n",
      "\n",
      ".     gen poolid = _n\n",
      "\n",
      ".     sum poolid\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "      poolid |     37,647       18824     10867.9          1      37647\n",
      "\n",
      ".     duplicates report poolid\n",
      "\n",
      "Duplicates in terms of poolid\n",
      "\n",
      "--------------------------------------\n",
      "   copies | observations       surplus\n",
      "----------+---------------------------\n",
      "        1 |        37647             0\n",
      "--------------------------------------\n",
      "\n",
      ".     label variable poolid \"New ID for Pooled Data\"\n",
      "\n",
      ".     \n",
      ".     sort poolid\n",
      "\n",
      ".     \n",
      ". save $path3\\pooledNCDSBCS_v1.dta, replace\n",
      "file F:\\Data\\MYDATA\\TEMP\\pooledNCDSBCS_v1.dta saved\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path2\\NCDS_MAIN.dta, clear\n",
    "    append using $path2\\BCS_MAIN.dta\n",
    "    tab cohort, mi\n",
    "\n",
    "*Here we create a new id number for all cases in our dataset\n",
    "* We probably won't need this, but we create it just in case\n",
    "capture drop poolid\n",
    "    gen poolid = _n\n",
    "    sum poolid\n",
    "    duplicates report poolid\n",
    "    label variable poolid \"New ID for Pooled Data\"\n",
    "    \n",
    "    sort poolid\n",
    "    \n",
    "save $path3\\pooledNCDSBCS_v1.dta, replace\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We now create joint variables from the information available in the two cohorts."
   ]
  },
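  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next cell builds each joint variable with a `gen` followed by one `replace` per cohort. As a minimal sketch of an equivalent one-step construction, the same variable could be built with Stata's `cond()` function. This assumes `cohort` is coded 1 for the NCDS and 2 for the BCS, as in the pooled file; `ability_check` is a hypothetical name used only for illustration and is not part of the analysis.\n",
    "\n",
    "```stata\n",
    "* Sketch only: ability_check is a hypothetical variable name\n",
    "* Assumes cohort is coded 1 (NCDS) and 2 (BCS)\n",
    "capture drop ability_check\n",
    "    gen ability_check = cond(cohort==1, ncds11_stdbastotalscore, bcs10_stdabilityscore) if inlist(cohort, 1, 2)\n",
    "    label variable ability_check \"Ability Test Score (check)\"\n",
    "    tabstat ability_check, by(cohort) statistics(n mean sd min max)\n",
    "```\n",
    "\n",
    "The `gen`/`replace` form used in this notebook is longer, but it reports the number of changes made for each cohort, which is a useful check when pooling data."
   ]
  },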
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path3\\pooledNCDSBCS_v1.dta, clear\n",
      "\n",
      ". numlabel, add\n",
      "\n",
      ". \n",
      ". *Cohort member's standardised ability test scores age 10/11\n",
      "\n",
      ". capture drop ability\n",
      "\n",
      ".     gen ability = .\n",
      "(37,647 missing values generated)\n",
      "\n",
      ".     replace ability = ncds11_stdbastotalscore if (cohort==1)\n",
      "(14,131 real changes made)\n",
      "\n",
      ".     replace ability = bcs10_stdabilityscore if (cohort==2)\n",
      "(11,397 real changes made)\n",
      "\n",
      ".     label variable ability \"Ability Test Score\"\n",
      "\n",
      ".     summ ability\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     ability |     25,528         100    14.99971   43.58873   151.1925\n",
      "\n",
      ".     summ ability if (cohort==1)\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     ability |     14,131         100          15   60.10213   134.4337\n",
      "\n",
      ".     summ ability if (cohort==2)\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     ability |     11,397         100          15   43.58873   151.1925\n",
      "\n",
      ".     \n",
      ". \n",
      ". *Create one single variable for the ability PCA score in both cohorts\n",
      "\n",
      ". \n",
      ". summ ncds11_stdpc1 bcs10_stdpc1\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "ncds11_std~1 |     14,131   -1.45e-10           1   -2.68232   2.329129\n",
      "bcs10_stdpc1 |     11,397   -4.73e-10           1  -3.701819   3.371969\n",
      "\n",
      ". \n",
      ". capture drop pcascore\n",
      "\n",
      ".     gen pcascore = .\n",
      "(37,647 missing values generated)\n",
      "\n",
      ".     replace pcascore = ncds11_stdpc1 if (cohort==1)\n",
      "(14,131 real changes made)\n",
      "\n",
      ".     replace pcascore = bcs10_stdpc1 if (cohort==2)\n",
      "(11,397 real changes made)\n",
      "\n",
      ".     label variable pcascore \"PCA Ability Test Score\"\n",
      "\n",
      ".     summ pcascore\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "    pcascore |     25,528   -2.91e-10    .9999804  -3.701819   3.371969\n",
      "\n",
      ".     summ pcascore if (cohort==1)\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "    pcascore |     14,131   -1.45e-10           1   -2.68232   2.329129\n",
      "\n",
      ".     summ pcascore if (cohort==2)\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "    pcascore |     11,397   -4.73e-10           1  -3.701819   3.371969\n",
      "\n",
      ". \n",
      ". *Cohort member's gender\n",
      "\n",
      ". tab ncds_male, mi\n",
      "\n",
      "NCDS Cohort |\n",
      "member Male |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |      8,959       23.80       23.80\n",
      "     1. Yes |      9,595       25.49       49.28\n",
      "          . |     19,093       50.72      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     37,647      100.00\n",
      "\n",
      ". tab ncds_male\n",
      "\n",
      "NCDS Cohort |\n",
      "member Male |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |      8,959       48.29       48.29\n",
      "     1. Yes |      9,595       51.71      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     18,554      100.00\n",
      "\n",
      ". tab bcs_male, mi\n",
      "\n",
      " BCS Cohort |\n",
      "member Male |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |      8,943       23.75       23.75\n",
      "     1. Yes |      9,686       25.73       49.48\n",
      "          . |     19,018       50.52      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     37,647      100.00\n",
      "\n",
      ". tab bcs_male\n",
      "\n",
      " BCS Cohort |\n",
      "member Male |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |      8,943       48.01       48.01\n",
      "     1. Yes |      9,686       51.99      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     18,629      100.00\n",
      "\n",
      ". \n",
      ". capture drop male\n",
      "\n",
      ".     gen male = .\n",
      "(37,647 missing values generated)\n",
      "\n",
      ".     replace male = ncds_male  if (cohort==1)\n",
      "(18,554 real changes made)\n",
      "\n",
      ".     replace male = bcs_male if (cohort==2)\n",
      "(18,629 real changes made)\n",
      "\n",
      ".     label variable male \"male\"\n",
      "\n",
      ".     label values male yesno\n",
      "\n",
      ".     tab male\n",
      "\n",
      "       male |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |     17,902       48.15       48.15\n",
      "     1. Yes |     19,281       51.85      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     37,183      100.00\n",
      "\n",
      ". \n",
      ". *Father's NS-SEC\n",
      "\n",
      ". tab ncds_panssec \n",
      "\n",
      "             NCDS Age 11 Father's NSSEC |      Freq.     Percent        Cum.\n",
      "----------------------------------------+-----------------------------------\n",
      "1. Large Employers and Higher Manageria |        367        3.29        3.29\n",
      "                 2. Higher Professional |        536        4.80        8.09\n",
      "   3. Lower managerial and professional |      1,323       11.86       19.95\n",
      "                        4. Intermediate |      1,058        9.48       29.44\n",
      "     5. Small employers and own account |      1,374       12.32       41.75\n",
      "     6. Lower Supervisory and Technical |      1,817       16.29       58.04\n",
      "                        7. Semi-Routine |      1,972       17.68       75.72\n",
      "                             8. Routine |      2,709       24.28      100.00\n",
      "----------------------------------------+-----------------------------------\n",
      "                                  Total |     11,156      100.00\n",
      "\n",
      ". tab bcs_panssec\n",
      "\n",
      "                                  nssec |      Freq.     Percent        Cum.\n",
      "----------------------------------------+-----------------------------------\n",
      "1. Large Employers and Higher Manageria |        570        4.73        4.73\n",
      "                 2. Higher Professional |        760        6.31       11.04\n",
      "   3. Lower managerial and professional |      1,806       14.99       26.04\n",
      "                        4. Intermediate |      1,090        9.05       35.09\n",
      "     5. Small employers and own account |      1,582       13.13       48.22\n",
      "     6. Lower Supervisory and Technical |      2,032       16.87       65.09\n",
      "                        7. Semi-Routine |      1,739       14.44       79.53\n",
      "                             8. Routine |      2,466       20.47      100.00\n",
      "----------------------------------------+-----------------------------------\n",
      "                                  Total |     12,045      100.00\n",
      "\n",
      ". \n",
      ". capture drop dadnssec\n",
      "\n",
      ".     gen dadnssec = .\n",
      "(37,647 missing values generated)\n",
      "\n",
      ".     replace dadnssec = ncds_panssec if (cohort==1)\n",
      "(11,156 real changes made)\n",
      "\n",
      ".     replace dadnssec = bcs_panssec if (cohort==2)\n",
      "(12,045 real changes made)\n",
      "\n",
      ".     label variable dadnssec \"Father's NSSEC\"\n",
      "\n",
      ".     label values dadnssec nssec \n",
      "\n",
      ".     tab dadnssec\n",
      "\n",
      "                         Father's NSSEC |      Freq.     Percent        Cum.\n",
      "----------------------------------------+-----------------------------------\n",
      "1. Large Employers and Higher Manageria |        937        4.04        4.04\n",
      "                 2. Higher Professional |      1,296        5.59        9.62\n",
      "   3. Lower managerial and professional |      3,129       13.49       23.11\n",
      "                        4. Intermediate |      2,148        9.26       32.37\n",
      "     5. Small employers and own account |      2,956       12.74       45.11\n",
      "     6. Lower Supervisory and Technical |      3,849       16.59       61.70\n",
      "                        7. Semi-Routine |      3,711       16.00       77.69\n",
      "                             8. Routine |      5,175       22.31      100.00\n",
      "----------------------------------------+-----------------------------------\n",
      "                                  Total |     23,201      100.00\n",
      "\n",
      ". \n",
      ". *Here we create an interaction between NSSSEC and Cohort\n",
      "\n",
      ". *Of course we can undetake interactions using the interactions code in Stata\n",
      "\n",
      ". *However creating an interaction here allows a little more clarity.\n",
      "\n",
      ". \n",
      ". *Coefficients and standard errors of a model with interaction\n",
      "\n",
      ". *terms cannot be readily interpreted independently of each other, \n",
      "\n",
      ". *since any given coefficient refers to the combined influence of all \n",
      "\n",
      ". *of the other contributing variables. We specify the interaction as a \n",
      "\n",
      ". *discrete categorical variable that has a distinct value\n",
      "\n",
      ". *for each combination of circumstances. This allows the independent effect\n",
      "\n",
      ". *of each discrete category to be more easily interpreted.\n",
      "\n",
      ". \n",
      ". *See: Jaccard, J. and R. Turrisi (2003) Interaction Effects in Multiple \n",
      "\n",
      ". * Regression. London: Sage. \n",
      "\n",
      ".  \n",
      ". *NSSEC * Cohort Interaction\n",
      "\n",
      ". capture drop nsinteraction\n",
      "\n",
      ".     gen nsinteraction = .\n",
      "(37,647 missing values generated)\n",
      "\n",
      ".     replace nsinteraction = 1 if ((dadnssec==1)&(cohort==1))\n",
      "(367 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 2 if ((dadnssec==1)&(cohort==2))\n",
      "(570 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 3 if ((dadnssec==2)&(cohort==1))\n",
      "(536 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 4 if ((dadnssec==2)&(cohort==2))\n",
      "(760 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 5 if ((dadnssec==3)&(cohort==1))\n",
      "(1,323 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 6 if ((dadnssec==3)&(cohort==2))\n",
      "(1,806 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 7 if ((dadnssec==4)&(cohort==1))\n",
      "(1,058 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 8 if ((dadnssec==4)&(cohort==2))\n",
      "(1,090 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 9 if ((dadnssec==5)&(cohort==1))\n",
      "(1,374 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 10 if ((dadnssec==5)&(cohort==2))\n",
      "(1,582 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 11 if ((dadnssec==6)&(cohort==1))\n",
      "(1,817 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 12 if ((dadnssec==6)&(cohort==2))\n",
      "(2,032 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 13 if ((dadnssec==7)&(cohort==1))\n",
      "(1,972 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 14 if ((dadnssec==7)&(cohort==2))\n",
      "(1,739 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 15 if ((dadnssec==8)&(cohort==1))\n",
      "(2,709 real changes made)\n",
      "\n",
      ".     replace nsinteraction = 16 if ((dadnssec==8)&(cohort==2))\n",
      "(2,466 real changes made)\n",
      "\n",
      ".     tab nsinteraction\n",
      "\n",
      "nsinteracti |\n",
      "         on |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |        367        1.58        1.58\n",
      "          2 |        570        2.46        4.04\n",
      "          3 |        536        2.31        6.35\n",
      "          4 |        760        3.28        9.62\n",
      "          5 |      1,323        5.70       15.33\n",
      "          6 |      1,806        7.78       23.11\n",
      "          7 |      1,058        4.56       27.67\n",
      "          8 |      1,090        4.70       32.37\n",
      "          9 |      1,374        5.92       38.29\n",
      "         10 |      1,582        6.82       45.11\n",
      "         11 |      1,817        7.83       52.94\n",
      "         12 |      2,032        8.76       61.70\n",
      "         13 |      1,972        8.50       70.20\n",
      "         14 |      1,739        7.50       77.69\n",
      "         15 |      2,709       11.68       89.37\n",
      "         16 |      2,466       10.63      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     23,201      100.00\n",
      "\n",
      ".     label variable nsinteraction \"NSSEC Interaction\"\n",
      "\n",
      ".     label define nsint 1 \"NCDS 1.1\" 2 \"BCS 1.1\" 3 \"NCDS 1.2\" 4 \"BCS 1.2\" 5 \"NCDS 2\" 6 \"BCS 2\" 7 \"NCDS 3\" 8 \"BCS 3\" 9 \"NCDS 4\" 10 \"BCS 4\" 1\n",
      "> 1 \"NCDS 5\" 12 \"BCS 5\" 13 \"NCDS 6\" 14 \"BCS 6\" 15 \"NCDS 7\" 16 \"BCS 7\"\n",
      "\n",
      ".     label values nsinteraction nsint\n",
      "\n",
      ". \n",
      ". *Parents Education\n",
      "\n",
      ". \n",
      ". capture drop parented\n",
      "\n",
      ".     gen parented = .\n",
      "(37,647 missing values generated)\n",
      "\n",
      ".     replace parented = ncds_parented if (cohort==1)\n",
      "(15,927 real changes made)\n",
      "\n",
      ".     replace parented = bcs_parented if (cohort==2)\n",
      "(13,088 real changes made)\n",
      "\n",
      ".     label values parented ed\n",
      "\n",
      ".     label variable parented \"Parent's Highest Education\"\n",
      "\n",
      ".     tab parented\n",
      "\n",
      "   Parent's |\n",
      "    Highest |\n",
      "  Education |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |     18,499       63.76       63.76\n",
      "          2 |      7,797       26.87       90.63\n",
      "          3 |      1,021        3.52       94.15\n",
      "          4 |      1,698        5.85      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     29,015      100.00\n",
      "\n",
      ".     tab parented cohort, col\n",
      "\n",
      "+-------------------+\n",
      "| Key               |\n",
      "|-------------------|\n",
      "|     frequency     |\n",
      "| column percentage |\n",
      "+-------------------+\n",
      "\n",
      "  Parent's |\n",
      "   Highest |        Cohort\n",
      " Education |   1. NCDS     2. BCS |     Total\n",
      "-----------+----------------------+----------\n",
      "         1 |    11,659      6,840 |    18,499 \n",
      "           |     73.20      52.26 |     63.76 \n",
      "-----------+----------------------+----------\n",
      "         2 |     3,384      4,413 |     7,797 \n",
      "           |     21.25      33.72 |     26.87 \n",
      "-----------+----------------------+----------\n",
      "         3 |       246        775 |     1,021 \n",
      "           |      1.54       5.92 |      3.52 \n",
      "-----------+----------------------+----------\n",
      "         4 |       638      1,060 |     1,698 \n",
      "           |      4.01       8.10 |      5.85 \n",
      "-----------+----------------------+----------\n",
      "     Total |    15,927     13,088 |    29,015 \n",
      "           |    100.00     100.00 |    100.00 \n",
      "\n",
      "\n",
      ". \n",
      ". *Additional variables that will potentially be used to produce weights and in multiple imputation\n",
      "\n",
      ". *Mother's Age at the Birth of the Cohort member\n",
      "\n",
      ". capture drop mumage\n",
      "\n",
      ".     gen mumage = .\n",
      "(37,647 missing values generated)\n",
      "\n",
      ".     replace mumage = ncds_mumagebirth if (cohort==1)\n",
      "(17,402 real changes made)\n",
      "\n",
      ".     replace mumage = bcs_mumagebirth if (cohort==2)\n",
      "(17,163 real changes made)\n",
      "\n",
      ".     label variable mumage \"Mother's Age at CM Birth\"\n",
      "\n",
      ". \n",
      ". *Cohort Member Parity at Birth\n",
      "\n",
      ". capture drop parity\n",
      "\n",
      ".     gen parity = .\n",
      "(37,647 missing values generated)\n",
      "\n",
      ".     replace parity = ncds_parity if (cohort==1)\n",
      "(17,412 real changes made)\n",
      "\n",
      ".     replace parity = bcs_parity if (cohort==2)\n",
      "(17,164 real changes made)\n",
      "\n",
      ".     label variable parity \"Parity at Birth\"\n",
      "\n",
      ". \n",
      ". *Whether the cohort member's mother is married at cohort member's birth\n",
      "\n",
      ". tab ncds_married   \n",
      "\n",
      "NCDS Mother |\n",
      " married at |\n",
      "     Cohort |\n",
      "   Member's |\n",
      "      Birth |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |        743        4.27        4.27\n",
      "     1. Yes |     16,662       95.73      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,405      100.00\n",
      "\n",
      ". tab bcs_mummarried\n",
      "\n",
      " BCS Mother |\n",
      " married at |\n",
      "     Cohort |\n",
      "   Member's |\n",
      "      Birth |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |      1,000        5.88        5.88\n",
      "     1. Yes |     16,011       94.12      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,011      100.00\n",
      "\n",
      ". capture drop married\n",
      "\n",
      ".     gen married = .\n",
      "(37,647 missing values generated)\n",
      "\n",
      ".     replace married = ncds_married if (cohort==1)\n",
      "(17,405 real changes made)\n",
      "\n",
      ".     replace married = bcs_mummarried if (cohort==2)\n",
      "(17,011 real changes made)\n",
      "\n",
      ".     label variable married \"Mother married at CM birth\"\n",
      "\n",
      ".     label values married yesno\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path3\\pooledNCDSBCS_v1.dta, clear\n",
    "numlabel, add\n",
    "\n",
    "*Cohort member's standardised ability test scores age 10/11\n",
    "capture drop ability\n",
    "    gen ability = .\n",
    "    replace ability = ncds11_stdbastotalscore if (cohort==1)\n",
    "    replace ability = bcs10_stdabilityscore if (cohort==2)\n",
    "    label variable ability \"Ability Test Score\"\n",
    "    summ ability\n",
    "    summ ability if (cohort==1)\n",
    "    summ ability if (cohort==2)\n",
    "    \n",
    "\n",
    "*Create one single variable for the ability PCA score in both cohorts\n",
    "\n",
    "summ ncds11_stdpc1 bcs10_stdpc1\n",
    "\n",
    "capture drop pcascore\n",
    "    gen pcascore = .\n",
    "    replace pcascore = ncds11_stdpc1 if (cohort==1)\n",
    "    replace pcascore = bcs10_stdpc1 if (cohort==2)\n",
    "    label variable pcascore \"PCA Ability Test Score\"\n",
    "    summ pcascore\n",
    "    summ pcascore if (cohort==1)\n",
    "    summ pcascore if (cohort==2)\n",
    "\n",
    "*Cohort member's gender\n",
    "tab ncds_male, mi\n",
    "tab ncds_male\n",
    "tab bcs_male, mi\n",
    "tab bcs_male\n",
    "\n",
    "capture drop male\n",
    "    gen male = .\n",
    "    replace male = ncds_male  if (cohort==1)\n",
    "    replace male = bcs_male if (cohort==2)\n",
    "    label variable male \"male\"\n",
    "    label values male yesno\n",
    "    tab male\n",
    "\n",
    "*Father's NS-SEC\n",
    "tab ncds_panssec \n",
    "tab bcs_panssec\n",
    "\n",
    "capture drop dadnssec\n",
    "    gen dadnssec = .\n",
    "    replace dadnssec = ncds_panssec if (cohort==1)\n",
    "    replace dadnssec = bcs_panssec if (cohort==2)\n",
    "    label variable dadnssec \"Father's NSSEC\"\n",
    "    label values dadnssec nssec \n",
    "    tab dadnssec\n",
    "\n",
    "*Here we create an interaction between NSSSEC and Cohort\n",
    "*Of course we can undetake interactions using the interactions code in Stata\n",
    "*However creating an interaction here allows a little more clarity.\n",
    "\n",
    "*Coefficients and standard errors of a model with interaction\n",
    "*terms cannot be readily interpreted independently of each other, \n",
    "*since any given coefficient refers to the combined influence of all \n",
    "*of the other contributing variables. We specify the interaction as a \n",
    "*discrete categorical variable that has a distinct value\n",
    "*for each combination of circumstances. This allows the independent effect\n",
    "*of each discrete category to be more easily interpreted.\n",
    "\n",
    "*See: Jaccard, J. and R. Turrisi (2003) Interaction Effects in Multiple \n",
    "* Regression. London: Sage. \n",
    " \n",
    "*NSSEC * Cohort Interaction\n",
    "capture drop nsinteraction\n",
    "    gen nsinteraction = .\n",
    "    replace nsinteraction = 1 if ((dadnssec==1)&(cohort==1))\n",
    "    replace nsinteraction = 2 if ((dadnssec==1)&(cohort==2))\n",
    "    replace nsinteraction = 3 if ((dadnssec==2)&(cohort==1))\n",
    "    replace nsinteraction = 4 if ((dadnssec==2)&(cohort==2))\n",
    "    replace nsinteraction = 5 if ((dadnssec==3)&(cohort==1))\n",
    "    replace nsinteraction = 6 if ((dadnssec==3)&(cohort==2))\n",
    "    replace nsinteraction = 7 if ((dadnssec==4)&(cohort==1))\n",
    "    replace nsinteraction = 8 if ((dadnssec==4)&(cohort==2))\n",
    "    replace nsinteraction = 9 if ((dadnssec==5)&(cohort==1))\n",
    "    replace nsinteraction = 10 if ((dadnssec==5)&(cohort==2))\n",
    "    replace nsinteraction = 11 if ((dadnssec==6)&(cohort==1))\n",
    "    replace nsinteraction = 12 if ((dadnssec==6)&(cohort==2))\n",
    "    replace nsinteraction = 13 if ((dadnssec==7)&(cohort==1))\n",
    "    replace nsinteraction = 14 if ((dadnssec==7)&(cohort==2))\n",
    "    replace nsinteraction = 15 if ((dadnssec==8)&(cohort==1))\n",
    "    replace nsinteraction = 16 if ((dadnssec==8)&(cohort==2))\n",
    "    tab nsinteraction\n",
    "    label variable nsinteraction \"NSSEC Interaction\"\n",
    "    label define nsint 1 \"NCDS 1.1\" 2 \"BCS 1.1\" 3 \"NCDS 1.2\" 4 \"BCS 1.2\" 5 \"NCDS 2\" 6 \"BCS 2\" 7 \"NCDS 3\" 8 \"BCS 3\" 9 \"NCDS 4\" 10 \"BCS 4\" 11 \"NCDS 5\" 12 \"BCS 5\" 13 \"NCDS 6\" 14 \"BCS 6\" 15 \"NCDS 7\" 16 \"BCS 7\"\n",
    "    label values nsinteraction nsint\n",
    "\n",
    "*Parents Education\n",
    "\n",
    "capture drop parented\n",
    "    gen parented = .\n",
    "    replace parented = ncds_parented if (cohort==1)\n",
    "    replace parented = bcs_parented if (cohort==2)\n",
    "    label values parented ed\n",
    "    label variable parented \"Parent's Highest Education\"\n",
    "    tab parented\n",
    "    tab parented cohort, col\n",
    "\n",
    "*Additional variables that will potentially be used to produce weights and in multiple imputation\n",
    "*Mother's Age at the Birth of the Cohort member\n",
    "capture drop mumage\n",
    "    gen mumage = .\n",
    "    replace mumage = ncds_mumagebirth if (cohort==1)\n",
    "    replace mumage = bcs_mumagebirth if (cohort==2)\n",
    "    label variable mumage \"Mother's Age at CM Birth\"\n",
    "\n",
    "*Cohort Member Parity at Birth\n",
    "capture drop parity\n",
    "    gen parity = .\n",
    "    replace parity = ncds_parity if (cohort==1)\n",
    "    replace parity = bcs_parity if (cohort==2)\n",
    "    label variable parity \"Parity at Birth\"\n",
    "\n",
    "*Whether the cohort member's mother is married at cohort member's birth\n",
    "tab ncds_married   \n",
    "tab bcs_mummarried\n",
    "capture drop married\n",
    "    gen married = .\n",
    "    replace married = ncds_married if (cohort==1)\n",
    "    replace married = bcs_mummarried if (cohort==2)\n",
    "    label variable married \"Mother married at CM birth\"\n",
    "    label values married yesno\n",
    "\n",
    "* return to jupyter"
   ]
  },
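  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The sixteen `replace` statements that build `nsinteraction` follow a regular pattern: the two cohorts alternate within each NS-SEC category. As a minimal sketch, assuming `dadnssec` is coded 1 to 8 and `cohort` is coded 1 (NCDS) and 2 (BCS) as in the tabulations above, the same coding can be reproduced arithmetically and checked against the original variable. `nsint_check` is a hypothetical name used only for this check.\n",
    "\n",
    "```stata\n",
    "* Sketch only: nsint_check is a hypothetical variable name\n",
    "* (dadnssec - 1)*2 + cohort gives 1 for NS-SEC 1 in the NCDS,\n",
    "* 2 for NS-SEC 1 in the BCS, and so on up to 16 for NS-SEC 8 in the BCS\n",
    "capture drop nsint_check\n",
    "    gen nsint_check = (dadnssec - 1)*2 + cohort if !missing(dadnssec, cohort)\n",
    "    * Stops with an error if the two codings differ for any case\n",
    "    assert nsint_check == nsinteraction\n",
    "    drop nsint_check\n",
    "```\n",
    "\n",
    "The built-in alternative mentioned in the code comments is factor-variable notation, for example `i.dadnssec##i.cohort` in a regression command, which fits the main effects and the interaction without a hand-built variable."
   ]
  },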
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We now define the analytic sample.\n",
    "\n",
    "The BCS includes cohort members born in Northern Ireland. These cohort members are included in the first survey but not in susequent sweeps. We exclude these Northern Irish cohort members for comparabiltiy with the NCDS dataset.\n",
    "\n",
    "The cross sectional sample sizes in the cohort studies vary, because some cohort members (e.g. immigrants to the UK) were included after the first sweep. For consistency and clarity we include only the original birth sample of both cohorts in our analytical sample (i.e. we keep only cohort members who were present at the first survey).\n",
    "\n",
    "More details on the samples of the NCDS and BCS are available [here](https://sp.ukdataservice.ac.uk/doc/5579/mrdoc/pdf/ncds_and_bcs70_response.pdf)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Exclude cohort members from Northern Ireland\n",
      "\n",
      ". \n",
      ". tab ncds0_country, mi\n",
      "\n",
      " NCDS Age 0 |\n",
      "    Country |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      " 1. England |     14,517       38.56       38.56\n",
      "   2. Wales |        914        2.43       40.99\n",
      "3. Scotland |      1,985        5.27       46.26\n",
      "          . |     20,231       53.74      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     37,647      100.00\n",
      "\n",
      ". \n",
      ". tab bcs0_country\n",
      "\n",
      "   1970: Country of |\n",
      "        Interview   |      Freq.     Percent        Cum.\n",
      "--------------------+-----------------------------------\n",
      "         1. England |     14,072       81.83       81.83\n",
      "           2. Wales |        879        5.11       86.94\n",
      "        3. Scotland |      1,617        9.40       96.35\n",
      "4. Northern Ireland |        628        3.65      100.00\n",
      "--------------------+-----------------------------------\n",
      "              Total |     17,196      100.00\n",
      "\n",
      ". drop if bcs0_country==4\n",
      "(628 observations deleted)\n",
      "\n",
      ". *628 babies from NI are deleted\n",
      "\n",
      ". \n",
      ". tab ncds_11outcome \n",
      "\n",
      "   NCDS response outcome |\n",
      "           1969 (age 11) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     15,337       82.64       82.64\n",
      "              2. Refusal |        797        4.29       86.94\n",
      "          3. Non-contact |        406        2.19       89.13\n",
      "   4. Other unproductive |        202        1.09       90.21\n",
      "           6. Not Issued |        275        1.48       91.70\n",
      "7. Not Issued - Emigrant |        701        3.78       95.47\n",
      "    8. Not Issued - Dead |        840        4.53      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,558      100.00\n",
      "\n",
      ". tab bcs_10outcome\n",
      "\n",
      "    BCS response outcome |\n",
      "           1980 (age 10) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     14,851       80.81       80.81\n",
      "   4. Other unproductive |      2,381       12.96       93.76\n",
      "           6. Not Issued |        549        2.99       96.75\n",
      "                 8. Dead |        597        3.25      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,378      100.00\n",
      "\n",
      ". \n",
      ". *Keep only the original birth sample\n",
      "\n",
      ". \n",
      ". tab ncds_0outcome \n",
      "\n",
      "   NCDS response outcome |\n",
      "            1958 (age 0) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     17,415       93.84       93.84\n",
      "          3. Non-contact |        218        1.17       95.02\n",
      "           6. Not Issued |        925        4.98      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,558      100.00\n",
      "\n",
      ". tab ncds_11outcome\n",
      "\n",
      "   NCDS response outcome |\n",
      "           1969 (age 11) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     15,337       82.64       82.64\n",
      "              2. Refusal |        797        4.29       86.94\n",
      "          3. Non-contact |        406        2.19       89.13\n",
      "   4. Other unproductive |        202        1.09       90.21\n",
      "           6. Not Issued |        275        1.48       91.70\n",
      "7. Not Issued - Emigrant |        701        3.78       95.47\n",
      "    8. Not Issued - Dead |        840        4.53      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,558      100.00\n",
      "\n",
      ". tab ncds_0outcome ncds_11outcome\n",
      "\n",
      "NCDS response outcome |                     NCDS response outcome 1969 (age 11)\n",
      "         1958 (age 0) | 1. Produc  2. Refusa  3. Non-co  4. Other   6. Not Is  7. Not Is  8. Not Is |     Total\n",
      "----------------------+-----------------------------------------------------------------------------+----------\n",
      "        1. Productive |    14,574        781        358        195          0        667        840 |    17,415 \n",
      "       3. Non-contact |       182          2         33          0          0          1          0 |       218 \n",
      "        6. Not Issued |       581         14         15          7        275         33          0 |       925 \n",
      "----------------------+-----------------------------------------------------------------------------+----------\n",
      "                Total |    15,337        797        406        202        275        701        840 |    18,558 \n",
      "\n",
      "\n",
      ". drop if (ncds_0outcome !=1)&(cohort==1)\n",
      "(1,143 observations deleted)\n",
      "\n",
      ". \n",
      ". tab bcs_0outcome \n",
      "\n",
      "    BCS response outcome |\n",
      "            1970 (age 0) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     16,568       90.15       90.15\n",
      "   4. Other unproductive |         18        0.10       90.25\n",
      "           6. Not Issued |      1,792        9.75      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,378      100.00\n",
      "\n",
      ". tab bcs_10outcome\n",
      "\n",
      "    BCS response outcome |\n",
      "           1980 (age 10) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     14,851       80.81       80.81\n",
      "   4. Other unproductive |      2,381       12.96       93.76\n",
      "           6. Not Issued |        549        2.99       96.75\n",
      "                 8. Dead |        597        3.25      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     18,378      100.00\n",
      "\n",
      ". tab bcs_0outcome bcs_10outcome\n",
      "\n",
      " BCS response outcome |     BCS response outcome 1980 (age 10)\n",
      "         1970 (age 0) | 1. Produc  4. Other   6. Not Is    8. Dead |     Total\n",
      "----------------------+--------------------------------------------+----------\n",
      "        1. Productive |    13,757      2,212          4        595 |    16,568 \n",
      "4. Other unproductive |        15          2          0          1 |        18 \n",
      "        6. Not Issued |     1,079        167        545          1 |     1,792 \n",
      "----------------------+--------------------------------------------+----------\n",
      "                Total |    14,851      2,381        549        597 |    18,378 \n",
      "\n",
      "\n",
      ". drop if (bcs_0outcome !=1)&(cohort==2)\n",
      "(1,893 observations deleted)\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Exclude cohort members from Northern Ireland\n",
    "\n",
    "tab ncds0_country, mi\n",
    "\n",
    "tab bcs0_country\n",
    "drop if bcs0_country==4\n",
    "*628 babies from NI are deleted\n",
    "\n",
    "tab ncds_11outcome \n",
    "tab bcs_10outcome\n",
    "\n",
    "*Keep only the original birth sample\n",
    "\n",
    "tab ncds_0outcome \n",
    "tab ncds_11outcome\n",
    "tab ncds_0outcome ncds_11outcome\n",
    "drop if (ncds_0outcome !=1)&(cohort==1)\n",
    "\n",
    "tab bcs_0outcome \n",
    "tab bcs_10outcome\n",
    "tab bcs_0outcome bcs_10outcome\n",
    "drop if (bcs_0outcome !=1)&(cohort==2)\n",
    "\n",
    "* return to jupyter"
   ]
  },
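  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we create indicator variables that record whether each cohort member was productive (i.e. took part) in the first survey and in the age 10/11 survey, which is the survey that measured the outcome of interest. We also flag cohort members who had died by the age 10/11 survey."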
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *sweep0outcome indicates that the cohort member was included in the first survey\n",
      "\n",
      ". capture drop sweep0outcome\n",
      "\n",
      ".     gen sweep0outcome = 0\n",
      "\n",
      ".     replace sweep0outcome = 1 if ((ncds_0outcome==1)&(cohort==1))\n",
      "(17,415 real changes made)\n",
      "\n",
      ".     replace sweep0outcome = 1 if ((bcs_0outcome==1)&(cohort==2))\n",
      "(16,568 real changes made)\n",
      "\n",
      ".     label values sweep0outcome yesno\n",
      "\n",
      ".     tab sweep0outcome cohort\n",
      "\n",
      "sweep0outc |        Cohort\n",
      "       ome |   1. NCDS     2. BCS |     Total\n",
      "-----------+----------------------+----------\n",
      "    1. Yes |    17,415     16,568 |    33,983 \n",
      "-----------+----------------------+----------\n",
      "     Total |    17,415     16,568 |    33,983 \n",
      "\n",
      "\n",
      ".     label variable sweep0outcome \"Productive at first survey\"\n",
      "\n",
      ". \n",
      ". tab ncds_11outcome\n",
      "\n",
      "   NCDS response outcome |\n",
      "           1969 (age 11) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     14,574       83.69       83.69\n",
      "              2. Refusal |        781        4.48       88.17\n",
      "          3. Non-contact |        358        2.06       90.23\n",
      "   4. Other unproductive |        195        1.12       91.35\n",
      "7. Not Issued - Emigrant |        667        3.83       95.18\n",
      "    8. Not Issued - Dead |        840        4.82      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     17,415      100.00\n",
      "\n",
      ". tab bcs_10outcome\n",
      "\n",
      "    BCS response outcome |\n",
      "           1980 (age 10) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     13,757       83.03       83.03\n",
      "   4. Other unproductive |      2,212       13.35       96.38\n",
      "           6. Not Issued |          4        0.02       96.41\n",
      "                 8. Dead |        595        3.59      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     16,568      100.00\n",
      "\n",
      ". \n",
      ". *sweeptestoutcome indicates that they were included in the age 10/11 survey\n",
      "\n",
      ". capture drop sweeptestoutcome\n",
      "\n",
      ".     gen sweeptestoutcome = 0\n",
      "\n",
      ".     replace sweeptestoutcome = 1 if ((ncds_11outcome==1)&(cohort==1))\n",
      "(14,574 real changes made)\n",
      "\n",
      ".     replace sweeptestoutcome = 1 if ((bcs_10outcome==1)&(cohort==2))\n",
      "(13,757 real changes made)\n",
      "\n",
      ".     label values sweeptestoutcome yesno\n",
      "\n",
      ".     tab sweeptestoutcome cohort\n",
      "\n",
      "sweeptesto |        Cohort\n",
      "    utcome |   1. NCDS     2. BCS |     Total\n",
      "-----------+----------------------+----------\n",
      "     0. No |     2,841      2,811 |     5,652 \n",
      "    1. Yes |    14,574     13,757 |    28,331 \n",
      "-----------+----------------------+----------\n",
      "     Total |    17,415     16,568 |    33,983 \n",
      "\n",
      "\n",
      ".     label variable sweeptestoutcome \"Productive at age 10/11 survey\"\n",
      "\n",
      ". \n",
      ". *Also create a variable to indicate if the cohort members had died by the age\n",
      "\n",
      ". * 10/11 surveys. This will be used to delete cases after multiple imputation.\n",
      "\n",
      ". capture drop deadtestoutcome\n",
      "\n",
      ".     gen deadtestoutcome = 0\n",
      "\n",
      ".     replace deadtestoutcome = 1 if ((ncds_11outcome==8)&(cohort==1))\n",
      "(840 real changes made)\n",
      "\n",
      ".     replace deadtestoutcome = 1 if ((bcs_10outcome==8)&(cohort==2))\n",
      "(595 real changes made)\n",
      "\n",
      ".     label values deadtestoutcome yesno\n",
      "\n",
      ".     tab deadtestoutcome cohort\n",
      "\n",
      "deadtestou |        Cohort\n",
      "     tcome |   1. NCDS     2. BCS |     Total\n",
      "-----------+----------------------+----------\n",
      "     0. No |    16,575     15,973 |    32,548 \n",
      "    1. Yes |       840        595 |     1,435 \n",
      "-----------+----------------------+----------\n",
      "     Total |    17,415     16,568 |    33,983 \n",
      "\n",
      "\n",
      ".     label variable deadtestoutcome \"Dead at age 10/11 survey\"\n",
      "\n",
      ".     tab deadtestoutcome\n",
      "\n",
      "Dead at age |\n",
      "      10/11 |\n",
      "     survey |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |     32,548       95.78       95.78\n",
      "     1. Yes |      1,435        4.22      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     33,983      100.00\n",
      "\n",
      ". \n",
      ". tab cohort\n",
      "\n",
      "     Cohort |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "    1. NCDS |     17,415       51.25       51.25\n",
      "     2. BCS |     16,568       48.75      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     33,983      100.00\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*sweep0outcome indicates that the cohort member was included in the first survey\n",
    "capture drop sweep0outcome\n",
    "    gen sweep0outcome = 0\n",
    "    replace sweep0outcome = 1 if ((ncds_0outcome==1)&(cohort==1))\n",
    "    replace sweep0outcome = 1 if ((bcs_0outcome==1)&(cohort==2))\n",
    "    label values sweep0outcome yesno\n",
    "    tab sweep0outcome cohort\n",
    "    label variable sweep0outcome \"Productive at first survey\"\n",
    "\n",
    "tab ncds_11outcome\n",
    "tab bcs_10outcome\n",
    "\n",
    "*sweeptestoutcome indicates that they were included in the age 10/11 survey\n",
    "capture drop sweeptestoutcome\n",
    "    gen sweeptestoutcome = 0\n",
    "    replace sweeptestoutcome = 1 if ((ncds_11outcome==1)&(cohort==1))\n",
    "    replace sweeptestoutcome = 1 if ((bcs_10outcome==1)&(cohort==2))\n",
    "    label values sweeptestoutcome yesno\n",
    "    tab sweeptestoutcome cohort\n",
    "    label variable sweeptestoutcome \"Productive at age 10/11 survey\"\n",
    "\n",
    "*Also create a variable to indicate if the cohort members had died by the age\n",
    "* 10/11 surveys. This will be used to delete cases after multiple imputation.\n",
    "capture drop deadtestoutcome\n",
    "    gen deadtestoutcome = 0\n",
    "    replace deadtestoutcome = 1 if ((ncds_11outcome==8)&(cohort==1))\n",
    "    replace deadtestoutcome = 1 if ((bcs_10outcome==8)&(cohort==2))\n",
    "    label values deadtestoutcome yesno\n",
    "    tab deadtestoutcome cohort\n",
    "    label variable deadtestoutcome \"Dead at age 10/11 survey\"\n",
    "    tab deadtestoutcome\n",
    "\n",
    "tab cohort\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We create a variable that counts, for each case, how many of the variables required for our main analysis are missing. Cases with a count of zero have complete information and make up the complete records sample."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". capture drop samplenssec \n",
      "\n",
      ".     egen samplenssec  = rmiss(ability male parented dadnssec)\n",
      "\n",
      ".     tab samplenssec \n",
      "\n",
      "samplenssec |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |     17,716       52.13       52.13\n",
      "          1 |      8,681       25.55       77.68\n",
      "          2 |      3,887       11.44       89.12\n",
      "          3 |      3,690       10.86       99.97\n",
      "          4 |          9        0.03      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     33,983      100.00\n",
      "\n",
      ".     label variable samplenssec  \"Sample non-missing ns-sec measure\"\n",
      "\n",
      ".     tab samplenssec  if (cohort==1)\n",
      "\n",
      "     Sample |\n",
      "non-missing |\n",
      "     ns-sec |\n",
      "    measure |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |      9,617       55.22       55.22\n",
      "          1 |      4,185       24.03       79.25\n",
      "          2 |      1,900       10.91       90.16\n",
      "          3 |      1,710        9.82       99.98\n",
      "          4 |          3        0.02      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,415      100.00\n",
      "\n",
      ".     tab samplenssec  if (cohort==2)\n",
      "\n",
      "     Sample |\n",
      "non-missing |\n",
      "     ns-sec |\n",
      "    measure |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          0 |      8,099       48.88       48.88\n",
      "          1 |      4,496       27.14       76.02\n",
      "          2 |      1,987       11.99       88.01\n",
      "          3 |      1,980       11.95       99.96\n",
      "          4 |          6        0.04      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     16,568      100.00\n",
      "\n",
      ".     \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "capture drop samplenssec \n",
    "    egen samplenssec  = rmiss(ability male parented dadnssec)\n",
    "    tab samplenssec \n",
    "    label variable samplenssec  \"Sample non-missing ns-sec measure\"\n",
    "    tab samplenssec  if (cohort==1)\n",
    "    tab samplenssec  if (cohort==2)\n",
    "    \n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### General Ability Test Scores <a class=\"anchor\" id=\"generalabilitytestscores\"></a>\n",
    "\n",
    "In the NCDS and the BCS, cohort members completed general ability tests at age 11 and age 10 respectively. The general ability test in the NCDS comprised 40 verbal and 40 non-verbal items (see Shepherd, 2012). The general ability test in the BCS comprised four sub-scales from the British Ability Scales: word definition, word similarities, recall of digits and matrices (see Parsons, 2014).\n",
    "\n",
    "We computed an overall cognitive ability test score using the summated test scores. This is the method used in previous studies that examine the role of cognitive ability in educational and occupational attainment (e.g. Breen and Goldthorpe, 2001). Alternatively, principal components analysis (PCA) can be used to summarise the relationship between the cognitive ability subtests in order to produce an estimate of general ability ‘g’. This method has also been deployed in previous studies using the cognitive ability test scores in the NCDS and BCS (e.g. Schoon, 2010). We have computed scores using the two alternative methods, and we find that the total scores and the PCA scores are almost perfectly correlated (NCDS: r = 0.999, p < 0.001; BCS: r = 0.997, p < 0.001). Therefore, we conclude that either approach would be suitable for this analysis, but we have chosen the total score measure because of its direct comparability with previous studies.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". * Correlation between PCA and Total Score Methods\n",
      "\n",
      ". \n",
      ". pwcorr ability pcascore if (cohort==1)&(samplenssec==0), sig\n",
      "\n",
      "             |  ability pcascore\n",
      "-------------+------------------\n",
      "     ability |   1.0000 \n",
      "             |\n",
      "             |\n",
      "    pcascore |   0.9994   1.0000 \n",
      "             |   0.0000\n",
      "             |\n",
      "\n",
      ". pwcorr ability pcascore if (cohort==2)&(samplenssec==0), sig\n",
      "\n",
      "             |  ability pcascore\n",
      "-------------+------------------\n",
      "     ability |   1.0000 \n",
      "             |\n",
      "             |\n",
      "    pcascore |   0.9970   1.0000 \n",
      "             |   0.0000\n",
      "             |\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "* Correlation between PCA and Total Score Methods\n",
    "\n",
    "pwcorr ability pcascore if (cohort==1)&(samplenssec==0), sig\n",
    "pwcorr ability pcascore if (cohort==2)&(samplenssec==0), sig\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The general ability test in the NCDS is comparable with the test in the BCS (see Elliott et al., 1978; Shepherd, 2012). However, it is not possible to directly assess the Flynn Effect using the general ability test measures in the NCDS and the BCS, because the tests include different numbers of items and have different total scores. The two measures are nevertheless suitable for the current analysis because our focus is on relative social class inequalities within each of the two cohorts. In order to operationalise the analyses we construct a cross-cohort measure using arithmetic standardisation, which has been used in previous studies (see Schoon, 2010). The summary statistics for the cognitive ability tests are provided in table 1."
   ]
  },
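  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As an illustration only, arithmetic standardisation of this kind can be written as a rescaled z-score. The sketch below assumes that raw scores were standardised within each cohort to a mean of 100 and a standard deviation of 15. This scale is consistent with the summary statistics in table 1, but it is an assumption rather than a description of the exact computation used here.\n",
    "\n",
    "$$\n",
    "\\text{ability}_{ic} = 100 + 15 \\times \\frac{x_{ic} - \\bar{x}_c}{s_c}\n",
    "$$\n",
    "\n",
    "In this notation (all symbols are introduced for illustration) $x_{ic}$ is the raw total test score of cohort member $i$ in cohort $c$, and $\\bar{x}_c$ and $s_c$ are the mean and standard deviation of the raw scores in that cohort. Under this assumption each cohort is rescaled separately, so the resulting scores describe a child's position relative to others in the same cohort and cannot show whether average ability changed between cohorts."
   ]
  },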
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *****TABLE 1: \n",
      "\n",
      ". *DESCRIPTIVE STATISTICS FOR GENERAL ABILITY TEST SCORES IN THE NCDS AND BCS.\n",
      "\n",
      ". summ ability if (cohort==1) & (samplenssec == 0)\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     ability |      9,617    100.8653     14.7091   60.10213   133.5046\n",
      "\n",
      ". summ ability if (cohort==2) & (samplenssec == 0)\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     ability |      8,099    100.8371    14.76514   45.37629   151.1925\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*****TABLE 1: \n",
    "*DESCRIPTIVE STATISTICS FOR GENERAL ABILITY TEST SCORES IN THE NCDS AND BCS.\n",
    "summ ability if (cohort==1) & (samplenssec == 0)\n",
    "summ ability if (cohort==2) & (samplenssec == 0)\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "<img src=\"https://i.imgur.com/ADh0Qe6.png\" alt=\"Table 1\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Parental Social Class <a class=\"anchor\" id=\"parentalsocialclass\"></a>\n",
    "\n",
    "The central analytical focus of this article is an investigation of the effects of parental social class on filial general cognitive ability test scores. Social class schemes are widely used in sociological research and are regarded as socio-economic measures that divide the population into unequally rewarded categories (Crompton, 2008). We employ an occupation-based socio-economic measure because it provides a robust and parsimonious indicator of parental social positions (see Connelly et al., 2016b). Occupation-based socio-economic measures do not simply act as a proxy where income data are unavailable; they are sociological measures designed to better understand fundamental forms of social relations and inequalities to which income is merely epiphenomenal (Rose and Pevalin, 2003). In this analysis we employ the United Kingdom National Statistics Socio-Economic Classification (NS-SEC) (see Rose and Pevalin, 2005), which is widely used in sociological analyses and in official statistics.\n",
    "\n",
    "Gregg (2012) coded and deposited UK Standard Occupational Classification codes (SOC2000) for the job titles of NCDS fathers collected in the age 11 survey, and of BCS mothers and fathers collected in the age 10 survey (SN7023, Gregg, 2012). These detailed occupational codes are an invaluable resource, and we use them to compute NS-SEC in both cohorts. As detailed occupational information (i.e. SOC codes) is only available for fathers in the NCDS [3](#note3), we only use father’s information in the BCS (see table 2).\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Further Explanatory Variables <a class=\"anchor\" id=\"explanatoryvariables\"></a>\n",
    "\n",
    "Gender differences in childhood cognitive ability test scores have been observed in previous research (see Van der Sluis et al., 2006; Strand et al., 2006; Sullivan et al., 2013), so we include gender as an explanatory variable. Parental education is measured using mother’s and father’s years of education completed after the compulsory school leaving age. We categorise these variables in a similar manner to previous research using these data (see Cheung and Egerton, 2007). We are cautious not to attribute titles to these categories because in British samples years of education do not map neatly onto an individual’s educational experiences and attainments (see Connelly et al., 2016a). We use the highest level of education of the cohort member’s parents to represent the parental level of education (see table 2). Parental education is included as a control variable which may measure an additional dimension of a family’s socio-economic position.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *****TABLE 2: \n",
      "\n",
      ". *DESCRIPTIVE STATISTICS OF THE GENDER AND PARENTAL EDUCATION VARIABLES IN \n",
      "\n",
      ". *THE NCDS AND BCS.\n",
      "\n",
      ". tab male if (cohort==1) & (samplenssec == 0)\n",
      "\n",
      "       male |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |      4,700       48.87       48.87\n",
      "     1. Yes |      4,917       51.13      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |      9,617      100.00\n",
      "\n",
      ". tab male if (cohort==2) & (samplenssec == 0)\n",
      "\n",
      "       male |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |      3,937       48.61       48.61\n",
      "     1. Yes |      4,162       51.39      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |      8,099      100.00\n",
      "\n",
      ". \n",
      ". tab parented if (cohort==1) & (samplenssec == 0)\n",
      "\n",
      "   Parent's |\n",
      "    Highest |\n",
      "  Education |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |      6,964       72.41       72.41\n",
      "          2 |      2,123       22.08       94.49\n",
      "          3 |        136        1.41       95.90\n",
      "          4 |        394        4.10      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |      9,617      100.00\n",
      "\n",
      ". tab parented if (cohort==2) & (samplenssec == 0)\n",
      "\n",
      "   Parent's |\n",
      "    Highest |\n",
      "  Education |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |      4,162       51.39       51.39\n",
      "          2 |      2,773       34.24       85.63\n",
      "          3 |        477        5.89       91.52\n",
      "          4 |        687        8.48      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |      8,099      100.00\n",
      "\n",
      ". \n",
      ". tab dadnssec if (cohort==1) & (samplenssec == 0)\n",
      "\n",
      "                         Father's NSSEC |      Freq.     Percent        Cum.\n",
      "----------------------------------------+-----------------------------------\n",
      "1. Large Employers and Higher Manageria |        296        3.08        3.08\n",
      "                 2. Higher Professional |        447        4.65        7.73\n",
      "   3. Lower managerial and professional |      1,125       11.70       19.42\n",
      "                        4. Intermediate |        898        9.34       28.76\n",
      "     5. Small employers and own account |      1,193       12.41       41.17\n",
      "     6. Lower Supervisory and Technical |      1,589       16.52       57.69\n",
      "                        7. Semi-Routine |      1,714       17.82       75.51\n",
      "                             8. Routine |      2,355       24.49      100.00\n",
      "----------------------------------------+-----------------------------------\n",
      "                                  Total |      9,617      100.00\n",
      "\n",
      ". tab dadnssec if (cohort==2) & (samplenssec == 0)\n",
      "\n",
      "                         Father's NSSEC |      Freq.     Percent        Cum.\n",
      "----------------------------------------+-----------------------------------\n",
      "1. Large Employers and Higher Manageria |        371        4.58        4.58\n",
      "                 2. Higher Professional |        483        5.96       10.54\n",
      "   3. Lower managerial and professional |      1,215       15.00       25.55\n",
      "                        4. Intermediate |        737        9.10       34.65\n",
      "     5. Small employers and own account |      1,037       12.80       47.45\n",
      "     6. Lower Supervisory and Technical |      1,449       17.89       65.34\n",
      "                        7. Semi-Routine |      1,188       14.67       80.01\n",
      "                             8. Routine |      1,619       19.99      100.00\n",
      "----------------------------------------+-----------------------------------\n",
      "                                  Total |      8,099      100.00\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*****TABLE 2: \n",
    "*DESCRIPTIVE STATISTICS OF THE GENDER AND PARENTAL EDUCATION VARIABLES IN \n",
    "*THE NCDS AND BCS.\n",
    "tab male if (cohort==1) & (samplenssec == 0)\n",
    "tab male if (cohort==2) & (samplenssec == 0)\n",
    "\n",
    "tab parented if (cohort==1) & (samplenssec == 0)\n",
    "tab parented if (cohort==2) & (samplenssec == 0)\n",
    "\n",
    "tab dadnssec if (cohort==1) & (samplenssec == 0)\n",
    "tab dadnssec if (cohort==2) & (samplenssec == 0)\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://i.imgur.com/cGnY9c7.png\" alt=\"Table 2\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We save this dataset, which contains our analytical sample for the complete records analysis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". keep cohort ability ncds_male bcs_male male pcascore ncds_panssec bcs_panssec dadnssec nsinteraction ncds_paed_cat ncds_moed_cat ncds_pare\n",
      "> nted ncds_parented bcs_parented parented  mumage parity married samplenssec sweep0outcome sweeptestoutcome deadtestoutcome cohort ncdsid b\n",
      "> csid poolid ncds_region ncds0_olddadrgsc ncds0_country ncds_mumagebirth ncds_parity ncds_married ncds_male ncds_paed_cat ncds0_olddadrgsc \n",
      "> ncds_moed_cat ncds_region bcs_male bcs0_country bcs_paed bcs_moed bcs_region bcs_mumagefirstbirth bcs_mumagebirth bcs_mummarried bcs_parit\n",
      "> y bcs_mothercraft bcs_labourclass bcs_breast ncds_0outcome bcs_0outcome ncds_11outcome bcs_10outcome n539 n1225 n2393 n2394\n",
      "\n",
      ". \n",
      ". save $path3\\pooledNCDSBCS_v2.dta, replace\n",
      "file F:\\Data\\MYDATA\\TEMP\\pooledNCDSBCS_v2.dta saved\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "keep cohort ability ncds_male bcs_male male pcascore ncds_panssec bcs_panssec dadnssec nsinteraction ncds_paed_cat ncds_moed_cat ncds_parented ncds_parented bcs_parented parented  mumage parity married samplenssec sweep0outcome sweeptestoutcome deadtestoutcome cohort ncdsid bcsid poolid ncds_region ncds0_olddadrgsc ncds0_country ncds_mumagebirth ncds_parity ncds_married ncds_male ncds_paed_cat ncds0_olddadrgsc ncds_moed_cat ncds_region bcs_male bcs0_country bcs_paed bcs_moed bcs_region bcs_mumagefirstbirth bcs_mumagebirth bcs_mummarried bcs_parity bcs_mothercraft bcs_labourclass bcs_breast ncds_0outcome bcs_0outcome ncds_11outcome bcs_10outcome n539 n1225 n2393 n2394\n",
    "\n",
    "save $path3\\pooledNCDSBCS_v2.dta, replace\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Descriptive Results <a class=\"anchor\" id=\"descriptiveresults\"></a>\n",
    "\n",
    "The relationship between father’s social class (NS-SEC) and children’s cognitive ability test scores is reported in table 3. There is very clear evidence of a social class effect: on average, children with more occupationally advantaged fathers have higher cognitive ability test scores in both cohorts. The difference between the children with the most advantaged fathers (NS-SEC 1.1, e.g. a chief executive officer) and the least advantaged fathers (NS-SEC 7, e.g. a construction labourer) is on average 13 points for those in the NCDS cohort, and 11 points for those in the BCS cohort. The greatest differences are observed between children with fathers in NS-SEC 1.2 (e.g. a university professor) and children with fathers in NS-SEC 7. These differences are on average 14 points in the NCDS and 15 points in the BCS, which is approximately one standard deviation for both cohorts.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path3\\pooledNCDSBCS_v2.dta, clear\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path3\\pooledNCDSBCS_v2.dta, clear\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". * TABLE 3: MEAN AND STANDARD DEVIATION OF ABILITY TEST SCORES BY FATHERS NS-SEC.\n",
      "\n",
      ". tab dadnssec if (cohort==1)&(samplenssec==0), summarize(ability)\n",
      "\n",
      "   Father's |    Summary of Ability Test Score\n",
      "      NSSEC |        Mean   Std. Dev.       Freq.\n",
      "------------+------------------------------------\n",
      "  1. Large  |   108.54637    12.81598         296\n",
      "  2. Higher |   109.77293   12.447527         447\n",
      "  3. Lower  |   107.58185   13.092885       1,125\n",
      "  4. Interm |   104.93596   13.638081         898\n",
      "  5. Small  |   100.19477   14.394405       1,193\n",
      "  6. Lower  |   100.44714   14.542212       1,589\n",
      "  7. Semi-R |   98.630747   14.160227       1,714\n",
      "  8. Routin |   95.696471   14.372789       2,355\n",
      "------------+------------------------------------\n",
      "      Total |   100.86529   14.709098       9,617\n",
      "\n",
      ". tab dadnssec if (cohort==2)&(samplenssec==0), summarize(ability)\n",
      "\n",
      "   Father's |    Summary of Ability Test Score\n",
      "      NSSEC |        Mean   Std. Dev.       Freq.\n",
      "------------+------------------------------------\n",
      "  1. Large  |   106.46699   13.788343         371\n",
      "  2. Higher |   110.30348   12.838866         483\n",
      "  3. Lower  |   106.39991    14.26658       1,215\n",
      "  4. Interm |   104.66767   13.445025         737\n",
      "  5. Small  |    99.48955   14.141566       1,037\n",
      "  6. Lower  |   99.546757   13.969552       1,449\n",
      "  7. Semi-R |   97.741444   14.447238       1,188\n",
      "  8. Routin |    95.09387   14.182694       1,619\n",
      "------------+------------------------------------\n",
      "      Total |   100.83708   14.765141       8,099\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "* TABLE 3: MEAN AND STANDARD DEVIATION OF ABILITY TEST SCORES BY FATHERS NS-SEC.\n",
    "tab dadnssec if (cohort==1)&(samplenssec==0), summarize(ability)\n",
    "tab dadnssec if (cohort==2)&(samplenssec==0), summarize(ability)\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://i.imgur.com/S0wpdlS.png\" alt=\"Table 3\">"
   ]
  },
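  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough check on the gaps quoted above, the differences can be recomputed by hand from the unweighted means and standard deviations printed in the Stata output for table 3. This is only a back-of-the-envelope calculation, and it assumes that the rows labelled 1, 2 and 8 in the output correspond to NS-SEC 1.1, 1.2 and 7 respectively.\n",
    "\n",
    "NS-SEC 1.1 versus NS-SEC 7:\n",
    "\n",
    "$$108.546 - 95.696 \\approx 12.85 \\quad (\\text{NCDS}), \\qquad 106.467 - 95.094 \\approx 11.37 \\quad (\\text{BCS})$$\n",
    "\n",
    "NS-SEC 1.2 versus NS-SEC 7, and the same gap expressed in units of each cohort's overall standard deviation:\n",
    "\n",
    "$$109.773 - 95.696 \\approx 14.08 \\quad (\\text{NCDS}), \\qquad 110.303 - 95.094 \\approx 15.21 \\quad (\\text{BCS})$$\n",
    "\n",
    "$$\\frac{14.08}{14.71} \\approx 0.96, \\qquad \\frac{15.21}{14.77} \\approx 1.03$$\n",
    "\n",
    "The largest gap is therefore close to one standard deviation in both cohorts, in line with the description in the text."
   ]
  },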
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Missing Data\n",
    "\n",
    "Missing data in the cohort studies has the potential to bias estimates in some analyses. As Carpenter and Kenward (2012) strongly advise, we first conduct a complete records analysis, followed by a series of principled approaches to handling missing data. The NCDS and BCS do not include non-response weights in the deposited datasets. We therefore construct inverse probability weights (IPW) in an attempt to reduce bias in the complete records analysis due to attrition (Höfler et al., 2005). We also undertake multiple imputation by chained equations (see Little and Rubin, 2014), and we use multiple imputation and inverse probability weights in combination (see Seaman et al., 2012). The substantive conclusions of the models using these different missing data strategies are largely consistent, but this could not have been known a priori. We focus our discussion on the more sophisticated models, which use multiple imputation and inverse probability weights to provide improved adjustments in the presence of missing data.\n",
    "\n",
    "An important innovation in the present work is that details of the complete modelling process and outputs are provided within the Jupyter notebook. There is no single agreed-upon approach for handling missing data in large-scale surveys. There are alternative ways of specifying how datasets are multiply imputed, and therefore, for the work to be reproducible, it is essential to have clear documentation that facilitates the precise duplication of the datasets that are created.\n",
    "\n",
    "Missing data techniques are at the cutting edge of statistical methods. It is highly likely that, as statistical theory develops, the techniques and approaches that are currently prescribed will be modified. We also envisage that facilities within data analysis software will change. Therefore, we argue that there are obvious benefits to providing clearly documented information about how missing data are handled, so that the work remains reproducible in the future.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". * TABLE S1: PATTERNS OF UNIT MISSINGNESS FOR THE NCDS AND BCS.\n",
      "\n",
      ". \n",
      ". *Present at the birth survey\n",
      "\n",
      ". tab ncds_0outcome\n",
      "\n",
      "   NCDS response outcome |\n",
      "            1958 (age 0) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     17,415      100.00      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     17,415      100.00\n",
      "\n",
      ". tab bcs_0outcome\n",
      "\n",
      "    BCS response outcome |\n",
      "            1970 (age 0) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     16,568      100.00      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     16,568      100.00\n",
      "\n",
      ". *Outcome at the age 10/11 survey\n",
      "\n",
      ". tab ncds_11outcome\n",
      "\n",
      "   NCDS response outcome |\n",
      "           1969 (age 11) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     14,574       83.69       83.69\n",
      "              2. Refusal |        781        4.48       88.17\n",
      "          3. Non-contact |        358        2.06       90.23\n",
      "   4. Other unproductive |        195        1.12       91.35\n",
      "7. Not Issued - Emigrant |        667        3.83       95.18\n",
      "    8. Not Issued - Dead |        840        4.82      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     17,415      100.00\n",
      "\n",
      ". tab bcs_10outcome\n",
      "\n",
      "    BCS response outcome |\n",
      "           1980 (age 10) |      Freq.     Percent        Cum.\n",
      "-------------------------+-----------------------------------\n",
      "           1. Productive |     13,757       83.03       83.03\n",
      "   4. Other unproductive |      2,212       13.35       96.38\n",
      "           6. Not Issued |          4        0.02       96.41\n",
      "                 8. Dead |        595        3.59      100.00\n",
      "-------------------------+-----------------------------------\n",
      "                   Total |     16,568      100.00\n",
      "\n",
      ". *Deceased at age 10/11 Survey\n",
      "\n",
      ". tab deadtestoutcome\n",
      "\n",
      "Dead at age |\n",
      "      10/11 |\n",
      "     survey |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |     32,548       95.78       95.78\n",
      "     1. Yes |      1,435        4.22      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     33,983      100.00\n",
      "\n",
      ". tab deadtestoutcome if (cohort==1)\n",
      "\n",
      "Dead at age |\n",
      "      10/11 |\n",
      "     survey |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |     16,575       95.18       95.18\n",
      "     1. Yes |        840        4.82      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,415      100.00\n",
      "\n",
      ". tab deadtestoutcome if (cohort==2)\n",
      "\n",
      "Dead at age |\n",
      "      10/11 |\n",
      "     survey |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |     15,973       96.41       96.41\n",
      "     1. Yes |        595        3.59      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     16,568      100.00\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "* TABLE S1: PATTERNS OF UNIT MISSINGNESS FOR THE NCDS AND BCS.\n",
    "\n",
    "*Present at the birth survey\n",
    "tab ncds_0outcome\n",
    "tab bcs_0outcome\n",
    "*Outcome at the age 10/11 survey\n",
    "tab ncds_11outcome\n",
    "tab bcs_10outcome\n",
    "*Deceased at age 10/11 Survey\n",
    "tab deadtestoutcome\n",
    "tab deadtestoutcome if (cohort==1)\n",
    "tab deadtestoutcome if (cohort==2)\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://i.imgur.com/FFAZp3p.png\" alt=\"Table S1\">"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". * TABLE S2: PATTERNS OF ITEM MISSINGNESS FOR THE NCDS AND BCS DATA POOLED.\n",
      "\n",
      ". mvpatterns ability male parented dadnssec\n",
      "Variable     | type     obs   mv   variable label\n",
      "-------------+-----------------------------------------------\n",
      "ability      | float  24828 9155   Ability Test Score\n",
      "male         | float  33974    9   male\n",
      "parented     | float  27778 6205   Parent's Highest Education\n",
      "dadnssec     | float  21791 12192  Father's NSSEC\n",
      "-------------------------------------------------------------\n",
      "\n",
      "Patterns of missing values\n",
      "\n",
      "  +------------------------+\n",
      "  | _pattern   _mv   _freq |\n",
      "  |------------------------|\n",
      "  |     ++++     0   17716 |\n",
      "  |     +++.     1    4908 |\n",
      "  |     .+..     3    3690 |\n",
      "  |     .++.     2    2814 |\n",
      "  |     .+++     1    2340 |\n",
      "  |------------------------|\n",
      "  |     ++.+     1    1433 |\n",
      "  |     ++..     2     771 |\n",
      "  |     .+.+     2     302 |\n",
      "  |     ....     4       9 |\n",
      "  +------------------------+\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "* TABLE S2: PATTERNS OF ITEM MISSINGNESS FOR THE NCDS AND BCS DATA POOLED.\n",
    "mvpatterns ability male parented dadnssec\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://i.imgur.com/zMwdqHk.png\" alt=\"Table S2\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As Carpenter and Kenward (2012) advise, we first conduct a complete records analysis (see table S4), followed by a series of principled approaches to handling missing data.\n",
    "\n",
    "1.\tWe construct inverse probability weights (IPW) in an attempt to reduce bias in the complete records analysis due to attrition (see table S5, model 1). \n",
    "\n",
    "2.\tWe also undertake multiple imputation by chained equations (see table S5, model 2). \n",
    "\n",
    "3.\tWe use multiple imputation and inverse probability weights in combination (results shown in main paper). \n",
    "\n",
    "The substantive conclusions of the models using these different missing data strategies are largely consistent, but this could not have been known a priori. We focus our discussion in the main article on the more sophisticated models, which use multiple imputation and inverse probability weights to provide improved adjustments in the presence of missing data. Missing data techniques are at the cutting edge of statistical methods. It is highly likely that, as statistical theory develops, the techniques and approaches that are currently prescribed will be modified. We also envisage that facilities within data analysis software will change. Therefore, we argue that there are obvious benefits to providing clearly documented information about how missing data are handled, so that the work remains reproducible in the future.\n"
   ]
  },
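  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the third strategy more concrete, the sketch below shows one way that multiple imputation by chained equations and inverse probability weights could be combined in Stata. It is an illustration under stated assumptions rather than the specification used for the results in the paper: `ipw` is a hypothetical name for an attrition weight variable, the choice of imputation model for each variable is assumed, and the number of imputations and the random seed are arbitrary.\n",
    "\n",
    "```stata\n",
    "* Illustrative sketch only, not the authors' specification.\n",
    "* ipw is a hypothetical inverse probability weight variable;\n",
    "* add(5) and rseed(12345) are arbitrary choices.\n",
    "mi set wide\n",
    "mi register imputed ability male parented dadnssec\n",
    "\n",
    "* Chained equations: one conditional model per incomplete variable\n",
    "* (the model type chosen for each variable is an assumption).\n",
    "mi impute chained (regress) ability (logit) male (ologit) parented (mlogit) dadnssec = i.cohort, add(5) rseed(12345)\n",
    "\n",
    "* Fit the weighted model in each imputed dataset and pool with Rubin's rules.\n",
    "mi estimate: regress ability i.dadnssec i.parented i.male i.cohort [pweight = ipw]\n",
    "```\n",
    "\n",
    "Dropping the `[pweight = ipw]` qualifier would give an imputation-only analysis, while running the weighted regression on the unimputed data would give a weights-only analysis, which is how the three strategies listed above relate to one another."
   ]
  },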
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "###### Inverse Probability Weights\n",
    "\n",
    "We constructed inverse probability weights (IPW) in an attempt to reduce bias in the complete records analysis due to attrition (see Höfler, Pfister, Lieb, & Wittchen, 2005). To produce the inverse probability weights, we first model whether a cohort member is present at the age 11 (NCDS) or the age 10 (BCS) sweep of the survey. We selected variables to predict this outcome based on their use in previous models of missingness in the cohort studies (see Mostafa & Wiggins, 2015; Plewis, Calderwood, Hawkes, & Nathan, 2004), and also the degree of missingness on these variables themselves. The variables used in these models are shown in table S3.\n",
    "\n",
    "Only a very small percentage of the variance in missingness at age 11/10 is accounted for by our models (less than 1 per cent in the NCDS, and 3 per cent in the BCS). This indicates that the predictive power of our models is weak and that our attrition weights are unlikely to have a major impact on the results. We have, however, made our best attempt to construct suitable inverse probability weights with the available information. Including additional variables in the models of missingness at age 10/11 did not lead to large increases in pseudo R2, and it reduced the number of observations included in the model because of item missingness. Mostafa and Wiggins (2015) argue that the use of metadata, such as interviewer characteristics and the conditions surrounding data collection, could account for more of the variance in unit missingness in the cohort studies. These metadata variables are not currently available in the deposited NCDS or BCS datasets.\n",
    "\n",
    "Following the models of missingness at age 10/11, we calculated predicted probabilities of observing the cohort member at age 10/11. The weight is the inverse of these predicted probabilities (Höfler et al., 2005). For 235 cases there was missing information that prevented the calculation of the probability of inclusion. In these cases a weight of 1 was allocated to ensure that these cases were included in the models and that the overall sample sizes would remain consistent.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://i.imgur.com/A8clawb.png\" alt=\"Table S3\">"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path3\\pooledNCDSBCS_v2.dta, clear\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "use $path3\\pooledNCDSBCS_v2.dta, clear\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". tab sweeptestoutcome if (cohort==1)\n",
      "\n",
      " Productive |\n",
      "     at age |\n",
      "      10/11 |\n",
      "     survey |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |      2,841       16.31       16.31\n",
      "     1. Yes |     14,574       83.69      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     17,415      100.00\n",
      "\n",
      ". tab sweeptestoutcome if (cohort==2)\n",
      "\n",
      " Productive |\n",
      "     at age |\n",
      "      10/11 |\n",
      "     survey |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |      2,811       16.97       16.97\n",
      "     1. Yes |     13,757       83.03      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     16,568      100.00\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "tab sweeptestoutcome if (cohort==1)\n",
    "tab sweeptestoutcome if (cohort==2)\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Potential explanatory variables in the missingness model for the NCDS (1958 Cohort).\n",
      "\n",
      ". tab ncds_region\n",
      "\n",
      "     Region at PMS |\n",
      "    (1958) - Birth |      Freq.     Percent        Cum.\n",
      "-------------------+-----------------------------------\n",
      "          1. North |      1,234        7.09        7.09\n",
      "     2. North West |      2,295       13.18       20.26\n",
      "   3. E & W.Riding |      1,433        8.23       28.49\n",
      " 4. North Midlands |      1,299        7.46       35.95\n",
      "       5. Midlands |      1,648        9.46       45.41\n",
      "           6. East |      1,242        7.13       52.55\n",
      "     7. South East |      3,444       19.78       72.32\n",
      "          8. South |        955        5.48       77.81\n",
      "     9. South West |        966        5.55       83.35\n",
      "         10. Wales |        914        5.25       88.60\n",
      "      11. Scotland |      1,985       11.40      100.00\n",
      "-------------------+-----------------------------------\n",
      "             Total |     17,415      100.00\n",
      "\n",
      ". tab ncds0_olddadrgsc\n",
      "\n",
      " NCDS Birth |\n",
      "   Dad RGSC |\n",
      " Old Coding |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "       1. I |        746        4.53        4.53\n",
      "      2. II |      2,133       12.96       17.49\n",
      "  3. III NM |      1,592        9.67       27.17\n",
      "   4. III M |      8,376       50.89       78.06\n",
      "      5. IV |      1,995       12.12       90.18\n",
      "       6. V |      1,616        9.82      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     16,458      100.00\n",
      "\n",
      ". tab ncds0_country, mi\n",
      "\n",
      " NCDS Age 0 |\n",
      "    Country |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      " 1. England |     14,516       42.72       42.72\n",
      "   2. Wales |        914        2.69       45.41\n",
      "3. Scotland |      1,985        5.84       51.25\n",
      "          . |     16,568       48.75      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     33,983      100.00\n",
      "\n",
      ". summ ncds_mumagebirth\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "ncds_mumag~h |     17,402    27.45702     5.72552          8         48\n",
      "\n",
      ". summ ncds_parity\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      " ncds_parity |     17,412    1.316219    1.560322          0          9\n",
      "\n",
      ". summ ncds_married\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "ncds_married |     17,405    .9573111    .2021606          0          1\n",
      "\n",
      ". summ ncds_male\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "   ncds_male |     17,412    .5169423    .4997272          0          1\n",
      "\n",
      ". summ ncds_paed_cat\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "ncds_paed_~t |     13,950    1.272688    .6484122          1          4\n",
      "\n",
      ". summ ncds0_olddadrgsc\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "ncds0_oldd~c |     16,458    3.825677    1.227504          1          6\n",
      "\n",
      ". summ ncds_moed_cat\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "ncds_moed_~t |     10,798    1.261993     .578698          1          4\n",
      "\n",
      ". summ ncds_region\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      " ncds_region |     17,415    5.881596    3.099308          1         11\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Potential explanatory variables in the missingness model for the NCDS (1958 Cohort).\n",
    "tab ncds_region\n",
    "tab ncds0_olddadrgsc\n",
    "tab ncds0_country, mi\n",
    "summ ncds_mumagebirth\n",
    "summ ncds_parity\n",
    "summ ncds_married\n",
    "summ ncds_male\n",
    "summ ncds_paed_cat\n",
    "summ ncds0_olddadrgsc\n",
    "summ ncds_moed_cat\n",
    "summ ncds_region\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 92,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Final missingness model selected\n",
      "\n",
      ". logit sweeptestoutcome ncds_mumagebirth ncds_parity ncds_married ncds_male ib7.ncds_region if (cohort==1), allbaselevels\n",
      "\n",
      "Iteration 0:   log likelihood = -7733.3828  \n",
      "Iteration 1:   log likelihood = -7700.5492  \n",
      "Iteration 2:   log likelihood =  -7700.044  \n",
      "Iteration 3:   log likelihood = -7700.0439  \n",
      "\n",
      "Logistic regression                             Number of obs     =     17,395\n",
      "                                                LR chi2(14)       =      66.68\n",
      "                                                Prob > chi2       =     0.0000\n",
      "Log likelihood = -7700.0439                     Pseudo R2         =     0.0043\n",
      "\n",
      "------------------------------------------------------------------------------------\n",
      "  sweeptestoutcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]\n",
      "-------------------+----------------------------------------------------------------\n",
      "  ncds_mumagebirth |   -.001769   .0041896    -0.42   0.673    -.0099805    .0064424\n",
      "       ncds_parity |   .0185246     .01558     1.19   0.234    -.0120116    .0490608\n",
      "      ncds_married |   .4654806    .089686     5.19   0.000     .2896992    .6412619\n",
      "         ncds_male |  -.0631524   .0412148    -1.53   0.125     -.143932    .0176273\n",
      "                   |\n",
      "       ncds_region |\n",
      "         1. North  |   .4368775   .0971786     4.50   0.000     .2464111     .627344\n",
      "    2. North West  |   .0769387   .0712456     1.08   0.280    -.0627002    .2165776\n",
      "  3. E & W.Riding  |   .2697409   .0874209     3.09   0.002      .098399    .4410828\n",
      "4. North Midlands  |   .0849654   .0864225     0.98   0.326    -.0844197    .2543504\n",
      "      5. Midlands  |   .1899264   .0814421     2.33   0.020     .0303029    .3495499\n",
      "          6. East  |   .1562596   .0894535     1.75   0.081     -.019066    .3315852\n",
      "    7. South East  |          0  (base)\n",
      "         8. South  |   .0703747    .096803     0.73   0.467    -.1193557    .2601051\n",
      "    9. South West  |   .1696321   .0989112     1.71   0.086    -.0242303    .3634945\n",
      "        10. Wales  |   .3239034   .1060707     3.05   0.002     .1160087    .5317982\n",
      "     11. Scotland  |   .0284434   .0738455     0.39   0.700     -.116291    .1731779\n",
      "                   |\n",
      "             _cons |   1.124436   .1379711     8.15   0.000     .8540176    1.394854\n",
      "------------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". fitstat\n",
      "\n",
      "Measures of Fit for logit of sweeptestoutcome\n",
      "\n",
      "Log-Lik Intercept Only:    -7733.383     Log-Lik Full Model:        -7700.044\n",
      "D(17379):                  15400.088     LR(14):                       66.678\n",
      "                                         Prob > LR:                     0.000\n",
      "McFadden's R2:                 0.004     McFadden's Adj R2:             0.002\n",
      "Maximum Likelihood R2:         0.004     Cragg & Uhler's R2:            0.006\n",
      "McKelvey and Zavoina's R2:     0.008     Efron's R2:                    0.004\n",
      "Variance of y*:                3.317     Variance of error:             3.290\n",
      "Count R2:                      0.837     Adj Count R2:                  0.000\n",
      "AIC:                           0.887     AIC*n:                     15432.088\n",
      "BIC:                     -154287.392     BIC':                         70.017\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Final missingness model selected\n",
    "logit sweeptestoutcome ncds_mumagebirth ncds_parity ncds_married ncds_male ib7.ncds_region if (cohort==1), allbaselevels\n",
    "\n",
    "fitstat\n",
    "\n",
    "* return to jupyter"
   ]
  },
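  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The pseudo R2 that Stata reports after `logit` is McFadden's measure, which compares the log likelihood of the fitted model with that of the intercept-only model. As a check on the claim that the NCDS model accounts for less than 1 per cent of the variance in missingness, the figure can be reproduced by hand from the two log likelihoods printed by `fitstat` above:\n",
    "\n",
    "$$R^2_{\\text{McFadden}} = 1 - \\frac{\\ln L_{\\text{full}}}{\\ln L_{0}} = 1 - \\frac{-7700.044}{-7733.383} \\approx 0.0043$$\n",
    "\n",
    "The likelihood ratio statistic follows from the same two quantities:\n",
    "\n",
    "$$LR = 2\\,(\\ln L_{\\text{full}} - \\ln L_{0}) = 2\\,(-7700.044 + 7733.383) = 66.678$$\n",
    "\n",
    "Both values agree with the output. The model is statistically distinguishable from the intercept-only model, but it explains very little of the variation in whether a cohort member was productive at age 11."
   ]
  },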
  {
   "cell_type": "code",
   "execution_count": 93,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Create a variable for the predicted probability of missingness from this model.\n",
      "\n",
      ". capture drop pp_ncds\n",
      "\n",
      ". predict pp_ncds\n",
      "(option pr assumed; Pr(sweeptestoutcome))\n",
      "(16588 missing values generated)\n",
      "\n",
      ". summ pp_ncds\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     pp_ncds |     17,395    .8370221    .0231933   .7305862   .8932639\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Create a variable for the predicted probability of a productive outcome (i.e. being observed) at age 11 from this model.\n",
    "capture drop pp_ncds\n",
    "predict pp_ncds\n",
    "summ pp_ncds\n",
    "\n",
    "* return to jupyter"
   ]
  },
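  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The step from predicted probabilities to weights, as described in the text, can be sketched as follows. This is a minimal illustration rather than the code used for the reported results: `ipw_ncds` is a hypothetical variable name, and `pp_ncds` is taken to be the predicted probability of a productive age 11 outcome from the logit model above.\n",
    "\n",
    "```stata\n",
    "* Illustrative sketch only: ipw_ncds is a hypothetical variable name.\n",
    "capture drop ipw_ncds\n",
    "\n",
    "* The weight is the inverse of the predicted probability of being observed.\n",
    "generate double ipw_ncds = 1 / pp_ncds if cohort == 1\n",
    "\n",
    "* Cases with no predicted probability (missing covariates) receive a weight\n",
    "* of 1, as described in the text, so that they stay in the weighted models.\n",
    "replace ipw_ncds = 1 if cohort == 1 & missing(ipw_ncds)\n",
    "\n",
    "summarize ipw_ncds\n",
    "```\n",
    "\n",
    "Given the range of `pp_ncds` in the output above (about 0.73 to 0.89), weights built this way would lie between roughly 1.12 and 1.37. Such a narrow range is consistent with the remark that the attrition weights are unlikely to have a major impact on the results."
   ]
  },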
  {
   "cell_type": "code",
   "execution_count": 94,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Potential explanatory variables in the missingness model for the BCS (1970 Cohort).\n",
      "\n",
      ". tab bcs_male\n",
      "\n",
      " BCS Cohort |\n",
      "member Male |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |      7,975       48.15       48.15\n",
      "     1. Yes |      8,587       51.85      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     16,562      100.00\n",
      "\n",
      ". tab bcs0_country\n",
      "\n",
      "   1970: Country of |\n",
      "        Interview   |      Freq.     Percent        Cum.\n",
      "--------------------+-----------------------------------\n",
      "         1. England |     14,072       84.93       84.93\n",
      "           2. Wales |        879        5.31       90.24\n",
      "        3. Scotland |      1,617        9.76      100.00\n",
      "--------------------+-----------------------------------\n",
      "              Total |     16,568      100.00\n",
      "\n",
      ". tab bcs_paed\n",
      "\n",
      "        BCS |\n",
      "   Father's |\n",
      "  Education |\n",
      " Categories |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "    1. Comp |      7,806       65.23       65.23\n",
      "2. Comp+1-3 |      2,843       23.76       88.99\n",
      "3. Comp+4-5 |        525        4.39       93.38\n",
      " 4. Comp+6+ |        792        6.62      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     11,966      100.00\n",
      "\n",
      ". tab bcs_moed\n",
      "\n",
      "        BCS |\n",
      "   Mother's |\n",
      "  Education |\n",
      " Categories |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "    1. Comp |      8,279       65.50       65.50\n",
      "2. Comp+1-3 |      3,466       27.42       92.93\n",
      "3. Comp+4-5 |        461        3.65       96.57\n",
      " 4. Comp+6+ |        433        3.43      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     12,639      100.00\n",
      "\n",
      ". tab bcs_region\n",
      "\n",
      "    BCS Region at Birth |      Freq.     Percent        Cum.\n",
      "------------------------+-----------------------------------\n",
      "               1. North |      1,023        6.17        6.17\n",
      "2. Yorks and Humberside |      1,486        8.97       15.14\n",
      "       3. East Midlands |      1,036        6.25       21.40\n",
      "         4. East Anglia |        539        3.25       24.65\n",
      "          5. South East |      5,022       30.31       54.96\n",
      "          6. South West |      1,051        6.34       61.30\n",
      "       7. West Midlands |      1,745       10.53       71.84\n",
      "          8. North West |      2,170       13.10       84.93\n",
      "               9. Wales |        879        5.31       90.24\n",
      "           10. Scotland |      1,617        9.76      100.00\n",
      "------------------------+-----------------------------------\n",
      "                  Total |     16,568      100.00\n",
      "\n",
      ". tab bcs_mumagefirstbirth\n",
      "\n",
      "   BCS Mother's Age at First |\n",
      "                       Birth |      Freq.     Percent        Cum.\n",
      "-----------------------------+-----------------------------------\n",
      "                          12 |          1        0.01        0.01\n",
      "                          13 |          5        0.03        0.04\n",
      "                          14 |         33        0.20        0.24\n",
      "                          15 |        126        0.77        1.00\n",
      "                          16 |        444        2.70        3.70\n",
      "                          17 |        929        5.65        9.35\n",
      "                          18 |      1,387        8.43       17.78\n",
      "                          19 |      1,608        9.77       27.55\n",
      "                          20 |      1,793       10.90       38.45\n",
      "                          21 |      1,796       10.92       49.37\n",
      "                          22 |      1,775       10.79       60.16\n",
      "                          23 |      1,469        8.93       69.09\n",
      "                          24 |      1,232        7.49       76.58\n",
      "                          25 |      1,041        6.33       82.91\n",
      "                          26 |        755        4.59       87.50\n",
      "                          27 |        553        3.36       90.86\n",
      "                          28 |        393        2.39       93.25\n",
      "                          29 |        300        1.82       95.07\n",
      "                          30 |        214        1.30       96.37\n",
      "                          31 |        141        0.86       97.23\n",
      "                          32 |        126        0.77       97.99\n",
      "                          33 |         81        0.49       98.49\n",
      "                          34 |         68        0.41       98.90\n",
      "                          35 |         50        0.30       99.20\n",
      "                          36 |         30        0.18       99.39\n",
      "                          37 |         24        0.15       99.53\n",
      "                          38 |         24        0.15       99.68\n",
      "                          39 |         17        0.10       99.78\n",
      "                          40 |         19        0.12       99.90\n",
      "                          41 |          4        0.02       99.92\n",
      "                          42 |          7        0.04       99.96\n",
      "                          43 |          3        0.02       99.98\n",
      "                          45 |          1        0.01       99.99\n",
      "                          46 |          1        0.01       99.99\n",
      "                          47 |          1        0.01      100.00\n",
      "-----------------------------+-----------------------------------\n",
      "                       Total |     16,451      100.00\n",
      "\n",
      ". tab bcs_mumagebirth\n",
      "\n",
      "  BCS Mother's Age at Cohort |\n",
      "              Member's Birth |      Freq.     Percent        Cum.\n",
      "-----------------------------+-----------------------------------\n",
      "                          14 |          2        0.01        0.01\n",
      "                          15 |         26        0.16        0.17\n",
      "                          16 |        130        0.79        0.96\n",
      "                          17 |        300        1.81        2.77\n",
      "                          18 |        509        3.08        5.85\n",
      "                          19 |        670        4.05        9.90\n",
      "                          20 |        902        5.45       15.35\n",
      "                          21 |      1,059        6.40       21.76\n",
      "                          22 |      1,315        7.95       29.71\n",
      "                          23 |      1,446        8.74       38.46\n",
      "                          24 |      1,183        7.15       45.61\n",
      "                          25 |      1,265        7.65       53.26\n",
      "                          26 |      1,156        6.99       60.25\n",
      "                          27 |      1,071        6.48       66.73\n",
      "                          28 |        823        4.98       71.70\n",
      "                          29 |        790        4.78       76.48\n",
      "                          30 |        695        4.20       80.68\n",
      "                          31 |        591        3.57       84.26\n",
      "                          32 |        476        2.88       87.14\n",
      "                          33 |        392        2.37       89.51\n",
      "                          34 |        344        2.08       91.59\n",
      "                          35 |        302        1.83       93.41\n",
      "                          36 |        220        1.33       94.74\n",
      "                          37 |        208        1.26       96.00\n",
      "                          38 |        175        1.06       97.06\n",
      "                          39 |        150        0.91       97.97\n",
      "                          40 |        125        0.76       98.72\n",
      "                          41 |         80        0.48       99.21\n",
      "                          42 |         59        0.36       99.56\n",
      "                          43 |         33        0.20       99.76\n",
      "                          44 |         20        0.12       99.89\n",
      "                          45 |          6        0.04       99.92\n",
      "                          46 |          6        0.04       99.96\n",
      "                          47 |          2        0.01       99.97\n",
      "                          49 |          1        0.01       99.98\n",
      "                          50 |          1        0.01       99.98\n",
      "                          51 |          1        0.01       99.99\n",
      "                          52 |          1        0.01       99.99\n",
      "                          53 |          1        0.01      100.00\n",
      "-----------------------------+-----------------------------------\n",
      "                       Total |     16,536      100.00\n",
      "\n",
      ". tab bcs_mummarried\n",
      "\n",
      " BCS Mother |\n",
      " married at |\n",
      "     Cohort |\n",
      "   Member's |\n",
      "      Birth |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |        978        5.97        5.97\n",
      "     1. Yes |     15,408       94.03      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     16,386      100.00\n",
      "\n",
      ". tab bcs_parity\n",
      "\n",
      "BCS Parity at Birth |      Freq.     Percent        Cum.\n",
      "--------------------+-----------------------------------\n",
      "                  0 |      6,187       37.41       37.41\n",
      "                  1 |      5,356       32.38       69.79\n",
      "                  2 |      2,689       16.26       86.05\n",
      "                  3 |      1,204        7.28       93.33\n",
      "                  4 |        568        3.43       96.77\n",
      "                  5 |        267        1.61       98.38\n",
      "                  6 |        130        0.79       99.17\n",
      "                  7 |         65        0.39       99.56\n",
      "                  8 |         30        0.18       99.74\n",
      "                  9 |         23        0.14       99.88\n",
      "                 10 |          8        0.05       99.93\n",
      "                 11 |          6        0.04       99.96\n",
      "                 12 |          3        0.02       99.98\n",
      "                 13 |          2        0.01       99.99\n",
      "                 14 |          1        0.01      100.00\n",
      "--------------------+-----------------------------------\n",
      "              Total |     16,539      100.00\n",
      "\n",
      ". tab bcs_mothercraft\n",
      "\n",
      " BCS Mother |\n",
      "   Attended |\n",
      "Mothercraft |\n",
      "    Classes |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |     12,038       73.18       73.18\n",
      "     1. Yes |      4,412       26.82      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     16,450      100.00\n",
      "\n",
      ". tab bcs_labourclass\n",
      "\n",
      " BCS Mother |\n",
      "   Attended |\n",
      "     Labour |\n",
      "    Classes |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |     12,089       73.57       73.57\n",
      "     1. Yes |      4,344       26.43      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     16,433      100.00\n",
      "\n",
      ". tab bcs_breast\n",
      "\n",
      " BCS Mother |\n",
      "  Attempted |\n",
      "     Breast |\n",
      "    Feeding |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |     10,123       61.91       61.91\n",
      "     1. Yes |      6,227       38.09      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |     16,350      100.00\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Potential explanatory variables in the missingness model for the BCS (1970 Cohort).\n",
    "tab bcs_male\n",
    "tab bcs0_country\n",
    "tab bcs_paed\n",
    "tab bcs_moed\n",
    "tab bcs_region\n",
    "tab bcs_mumagefirstbirth\n",
    "tab bcs_mumagebirth\n",
    "tab bcs_mummarried\n",
    "tab bcs_parity\n",
    "tab bcs_mothercraft\n",
    "tab bcs_labourclass\n",
    "tab bcs_breast\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Final missingness model selected\n",
      "\n",
      ". logit sweeptestoutcome bcs_male ib5.bcs_region bcs_mumagebirth bcs_mummarried bcs_parity if (cohort==2)\n",
      "\n",
      "Iteration 0:   log likelihood = -7391.3181  \n",
      "Iteration 1:   log likelihood =  -7219.078  \n",
      "Iteration 2:   log likelihood = -7199.8066  \n",
      "Iteration 3:   log likelihood = -7199.7939  \n",
      "Iteration 4:   log likelihood = -7199.7939  \n",
      "\n",
      "Logistic regression                             Number of obs     =     16,353\n",
      "                                                LR chi2(13)       =     383.05\n",
      "                                                Prob > chi2       =     0.0000\n",
      "Log likelihood = -7199.7939                     Pseudo R2         =     0.0259\n",
      "\n",
      "------------------------------------------------------------------------------------------\n",
      "        sweeptestoutcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]\n",
      "-------------------------+----------------------------------------------------------------\n",
      "                bcs_male |  -.0560849   .0425418    -1.32   0.187    -.1394653    .0272954\n",
      "                         |\n",
      "              bcs_region |\n",
      "               1. North  |   .6300561   .1022219     6.16   0.000      .429705    .8304073\n",
      "2. Yorks and Humberside  |   .4836959   .0833184     5.81   0.000     .3203948     .646997\n",
      "       3. East Midlands  |   .3774229   .0941076     4.01   0.000     .1929754    .5618705\n",
      "         4. East Anglia  |   .3941494   .1272544     3.10   0.002     .1447354    .6435635\n",
      "          6. South West  |   .4135188   .0948548     4.36   0.000     .2276067    .5994309\n",
      "       7. West Midlands  |   .3147517   .0752246     4.18   0.000     .1673142    .4621891\n",
      "          8. North West  |   .3327414   .0690646     4.82   0.000     .1973772    .4681056\n",
      "               9. Wales  |   .7710295   .1142771     6.75   0.000     .5470505    .9950086\n",
      "           10. Scotland  |   .4202723   .0784915     5.35   0.000     .2664319    .5741128\n",
      "                         |\n",
      "         bcs_mumagebirth |   .0027454   .0046279     0.59   0.553    -.0063251    .0118159\n",
      "          bcs_mummarried |   1.216936   .0730184    16.67   0.000     1.073823     1.36005\n",
      "              bcs_parity |  -.0547461     .01739    -3.15   0.002      -.08883   -.0206623\n",
      "                   _cons |    .223148   .1224326     1.82   0.068    -.0168154    .4631115\n",
      "------------------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". fitstat\n",
      "\n",
      "Measures of Fit for logit of sweeptestoutcome\n",
      "\n",
      "Log-Lik Intercept Only:    -7391.318     Log-Lik Full Model:        -7199.794\n",
      "D(16338):                  14399.588     LR(13):                      383.048\n",
      "                                         Prob > LR:                     0.000\n",
      "McFadden's R2:                 0.026     McFadden's Adj R2:             0.024\n",
      "Maximum Likelihood R2:         0.023     Cragg & Uhler's R2:            0.039\n",
      "McKelvey and Zavoina's R2:     0.040     Efron's R2:                    0.027\n",
      "Variance of y*:                3.427     Variance of error:             3.290\n",
      "Count R2:                      0.832     Adj Count R2:                 -0.000\n",
      "AIC:                           0.882     AIC*n:                     14429.588\n",
      "BIC:                     -144114.411     BIC':                       -256.920\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Final missingness model selected\n",
    "logit sweeptestoutcome bcs_male ib5.bcs_region bcs_mumagebirth bcs_mummarried bcs_parity if (cohort==2)\n",
    "\n",
    "fitstat\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Create a variable for the predicted probability of missingness from this model.\n",
      "\n",
      ". capture drop pp_bcs\n",
      "\n",
      ". predict pp_bcs\n",
      "(option pr assumed; Pr(sweeptestoutcome))\n",
      "(17630 missing values generated)\n",
      "\n",
      ". summ pp_bcs\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "      pp_bcs |     16,353    .8324466    .0616245   .4754927   .9099228\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Create a variable for the predicted probability of missingness from this model.\n",
    "capture drop pp_bcs\n",
    "predict pp_bcs\n",
    "summ pp_bcs\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Creating the Inverse Probability Weights from the Predicted probabilities\n",
      "\n",
      ". * created above (pp_ncds and pp_bcs).\n",
      "\n",
      ". capture drop ipw\n",
      "\n",
      ". gen ipw = .\n",
      "(33,983 missing values generated)\n",
      "\n",
      ". replace ipw = 1/pp_ncds if (cohort==1)\n",
      "(17,395 real changes made)\n",
      "\n",
      ". replace ipw = 1/pp_bcs if (cohort==2)\n",
      "(16,353 real changes made)\n",
      "\n",
      ". label variable ipw \"Inverse Probability Weight\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Creating the Inverse Probability Weights from the Predicted probabilities\n",
    "* created above (pp_ncds and pp_bcs).\n",
    "capture drop ipw\n",
    "gen ipw = .\n",
    "replace ipw = 1/pp_ncds if (cohort==1)\n",
    "replace ipw = 1/pp_bcs if (cohort==2)\n",
    "label variable ipw \"Inverse Probability Weight\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". summ ipw\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "         ipw |     33,748    1.202474    .0843147   1.098994   2.103082\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "summ ipw\n",
    "\n",
    "* return to jupyter"
   ]
  },
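  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As an illustrative arithmetic check (not part of the original workflow): because ipw is defined as 1/pp, the largest predicted probability should produce the smallest weight, and the smallest predicted probability the largest weight. The sketch below simply inverts the maximum, minimum and mean of pp_bcs as reported by summ pp_bcs above. The distribution of pp_ncds is not shown in this part of the notebook, so the check only speaks to the BCS values.\n",
    "\n",
    "```stata\n",
    "* Sketch: invert the BCS predicted probabilities reported by summ pp_bcs.\n",
    "* Weight implied by the largest predicted probability:\n",
    "display 1/.9099228\n",
    "* Weight implied by the smallest predicted probability:\n",
    "display 1/.4754927\n",
    "* Weight implied by the mean predicted probability:\n",
    "display 1/.8324466\n",
    "```\n",
    "\n",
    "The first two results should be roughly 1.099 and 2.103, in line with the minimum and maximum reported by summ ipw; the third is roughly 1.20."
   ]
  },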
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Those who have missing predicted probabilities (i.e. have missing\n",
      "\n",
      ". *  data in the model) are given a weight of 1.\n",
      "\n",
      ". recode ipw (.=1)\n",
      "(ipw: 235 changes made)\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "*Those who have missing predicted probabilities (i.e. have missing\n",
    "*  data in the model) are given a weight of 1.\n",
    "recode ipw (.=1)\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Drop the predicted values, these were only needed in creating the IPWs.\n",
      "\n",
      ". drop pp_ncds pp_bcs\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Drop the predicted values, these were only needed in creating the IPWs.\n",
    "drop pp_ncds pp_bcs\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". save $path3\\pooledNCDSBCS_v3.dta, replace\n",
      "file F:\\Data\\MYDATA\\TEMP\\pooledNCDSBCS_v3.dta saved\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "save $path3\\pooledNCDSBCS_v3.dta, replace\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "###### Multiple Imputation\n",
    "\n",
    "We implemented Multiple Imputation using the mi commands in Stata SE 14.1. In multiple imputation a number of plausible values are computed that better represent the uncertainty around the missing value. This method imputes multiple values for each missing variable from a cycle of regression models. These datasets are then pooled when analyses are undertaken using ‘Rubin’s rules’ (see Little & Rubin, 2014). We imputed 10 datasets. Our missing data model included all of the variables in the explanatory model of interest as well as parity and mother’s marital status (see table S3 for details of these additional variables). We imputed missing values on both our explanatory and dependent variables. There has been some debate concerning the imputation of missing values for dependent variables (see Von Hippel, 2007). Recent methodological research recommends that imputed values on the dependent variable should not be deleted prior to analysis, and this is the approach we have taken in this article (see Sullivan, Salter, Ryan, & Lee, 2015). We carried out multiple imputation on the missing data from all cohort members present at the first sweep of the studies, and we deleted deceased cohort members before the substantive analysis.\n",
    "\n",
    "When computing the interaction term in our models (father’s NS-SEC x cohort) we have a scenario where there is missing data on the father’s NS-SEC variable but no missing data on the cohort variable. Because cohort is not missing and has only two levels (NCDS or BCS) we carried out multiple imputation separately for each cohort (see Carpenter & Kenward, 2012, pp. 148-150; Enders, 2010, pp. 267-268). We then created the interaction term following the multiple imputation.\n",
    "\n",
    "We also use multiple imputation and inverse probability weights in combination (see Seaman, White, Copas, & Li, 2012). This is carried out by imputing the data as described above, then deleting those cases who were not present in the age 10/11 surveys. In the analytical model we combine the imputed datasets, as described above, and adjust the analyses using the inverse probability weights. Seaman et al. (2012) note that whilst combining multiple imputation and inverse probability weights will generally have no advantages if the imputation models are correctly specified, the combination of multiple imputation and inverse probability weights can act as a robustness check. The strategy of combining multiple imputation and inverse probability weights has been used previously in the analysis of the National Child Development Study (Caldwell et al., 2008; Stansfeld, Clark, Caldwell, Rodgers, & Power, 2008), however we do not have access to the inverse probability weights used in these previous studies.\n",
    "\n"
   ]
  },
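  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of how the combined multiple imputation and inverse probability weighting analysis might be specified, assuming the data have already been imputed with mi impute. The coding sweeptestoutcome==0 for cohort members absent from the age 10/11 sweep and the covariate list are assumptions made for illustration; they are not necessarily the authors' final specification.\n",
    "\n",
    "```stata\n",
    "* Illustrative sketch only (assumed coding and model specification).\n",
    "* Remove cohort members who were not present at the age 10/11 sweep\n",
    "* (assumes sweeptestoutcome==0 marks absence).\n",
    "drop if sweeptestoutcome == 0\n",
    "* Check that the mi data are still consistent after dropping cases.\n",
    "mi update\n",
    "\n",
    "* Pool across the imputed datasets and weight each case by its IPW.\n",
    "mi estimate: regress ability i.dadnssec i.male i.parented [pweight = ipw]\n",
    "```\n",
    "\n",
    "In this sketch the weight enters as a probability weight, so each retained cohort member counts for 1/Pr(present) cases in every imputed dataset."
   ]
  },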
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Additional Notes:\n",
    "\n",
    "We impute the datasets using similar variables available in both datasets\n",
    "and used in the missingness models used to prepare the inverse probability \n",
    "weights described above.\n",
    "\n",
    "There are many approaches that can be taken to imputing an interaction.\n",
    "\n",
    "In our case the interaction involved one variable with no missingness (cohort) \n",
    "and one variable with missingness (the father's NS-SEC variable). Due to this \n",
    "special circumstance we have chosen to split the data by the values of the\n",
    "fully observed variable (cohort) and separately impute the subsets of the data.\n",
    "The interaction is then created after the imputation.\n",
    "\n",
    "see:\n",
    "\n",
    "[Multiple Imputation in Stata:Creating Imputation Models](https://www.ssc.wisc.edu/sscc/pubs/stata_mi_models.htm#InteractionTerms)\n",
    "\n",
    "We could consider cohort as equivalent to race in this description:\n",
    "\n",
    "\"For example, suppose you're regressing income on education, experience, \n",
    "and black (an indicator for \"subject is black\"), but think the returns to \n",
    "education vary by race and thus include black##c.education in the regression. \n",
    "The just another variable approach would create a variable edblack=black*race \n",
    "and impute it, but it's possible for the model to impute a zero for black and \n",
    "a non-zero value for edblack. There's no indication this would cause problems \n",
    "in the analysis model, however.\n",
    "\n",
    "An alternative would be to add the by(black) option to the imputation command, \n",
    "so that whites and blacks are imputed separately. This would allow you to use \n",
    "black##c.education in your analysis model without bias (and it would always \n",
    "correspond to the actual values of black and education). However, running two \n",
    "separate imputation models allows the returns to experience to vary by race in \n",
    "the imputation model, not just education. If you had strong theoretical reasons \n",
    "to believe that was not the case (which is unlikely) that would be a \n",
    "specification problem. A far more more common problem is small sample size: \n",
    "make sure each of your by() groups is big enough for reasonable regressions.\"\n",
    "\n",
    "See also: \n",
    "\n",
    "[Multiple imputation with interactions and non-linear terms](http://thestatsgeek.com/2014/05/10/multiple-imputation-with-interactions-and-non-linear-terms/)\n",
    "\n",
    "See also: \n",
    "\n",
    "Carpenter, James, and Michael Kenward. Multiple imputation and its application. \n",
    "John Wiley & Sons, 2012. chapter 7.\n",
    "\n",
    "\"If Y3 is fully observed, but Y1, Y2 are partially observed then since\n",
    "equation 7.1 fits a straight line for each group defined by Y3 we can\n",
    "impute as followed:\n",
    "\n",
    "1) Divide the data into two groups by values of binary Y3.\n",
    "2) Separately in each group, impute (Y1,Y2) using a bivariate normal model or\n",
    "FCS equivalent, creating K imputed datasets.\n",
    "3) For k = 1, ...., K append the imputed datasets for the two groups, to give\n",
    "k imputed datasets.\n",
    "\n",
    "This approach, of imputing separately in the groups defined by categorical \n",
    "variables in the interaction, is by far the simplest approach; clearly we\n",
    "can have more than the two groups in the above discussion. The imputation\n",
    "groups may be defined by levels of a single vategorical variable, or the \n",
    "interaction of categorical variables. The only requirement is that these\n",
    "variables be fully observed on each unit.\"\n",
    "\n",
    "After running the multiple imputations we will delete those cohort members\n",
    "known to be dead at the age 10/11 sweeps (deadtestoutcome).\n",
    "\n"
   ]
  },
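  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two post-imputation steps described above (removing deceased cohort members and creating the interaction) could look roughly like the following sketch. It assumes that deadtestoutcome==1 marks cohort members known to have died, and the covariate list is illustrative only. Factor-variable notation builds the NS-SEC by cohort interaction at estimation time, so no interaction variable has to be imputed or stored.\n",
    "\n",
    "```stata\n",
    "* Illustrative sketch only (assumed coding of deadtestoutcome).\n",
    "* Delete cohort members known to be dead at the age 10/11 sweeps.\n",
    "drop if deadtestoutcome == 1\n",
    "mi update\n",
    "\n",
    "* The ## operator creates the father's NS-SEC x cohort interaction from\n",
    "* the imputed dadnssec and the fully observed cohort variable.\n",
    "mi estimate: regress ability i.dadnssec##i.cohort i.male i.parented\n",
    "```\n",
    "\n",
    "Because the interaction is formed from the imputed values within each dataset, it always corresponds to the actual values of dadnssec and cohort."
   ]
  },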
  {
   "cell_type": "code",
   "execution_count": 102,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path3\\pooledNCDSBCS_v3.dta, clear\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path3\\pooledNCDSBCS_v3.dta, clear\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". set seed 1485\n",
      "\n",
      ". \n",
      ". keep ability male parented dadnssec cohort ipw  parity married poolid deadtestoutcome sweeptestoutcome samplenssec\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "\n",
    "set seed 1485\n",
    "\n",
    "keep ability male parented dadnssec cohort ipw  parity married poolid deadtestoutcome sweeptestoutcome samplenssec\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We get an error when we run the multiple imputation \"invalid name\".\n",
    "\n",
    "I think this is because the label for NS-SEC is too long.\n",
    "\n",
    "Dropping the NS-SEC label solves the problem so we drop it here.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". label drop nssec\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "label drop nssec\n",
    "\n",
    "* return to jupyter"
   ]
  },
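  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If the NS-SEC category labels are wanted again after imputation, one possible approach (a sketch, not something done in this notebook) is to save the label definitions to a do-file before dropping them. The file name nssec_labels.do is hypothetical, and the sketch assumes that nssec is the value label attached to dadnssec.\n",
    "\n",
    "```stata\n",
    "* Sketch only: nssec_labels.do is a hypothetical file name.\n",
    "* Save the value label definitions before dropping them.\n",
    "label save nssec using nssec_labels.do, replace\n",
    "label drop nssec\n",
    "\n",
    "* Later, after imputation, restore the definitions and reattach them\n",
    "* (assumes nssec was the value label attached to dadnssec).\n",
    "do nssec_labels.do\n",
    "label values dadnssec nssec\n",
    "```\n",
    "\n",
    "Whether restoring the label reintroduces the error in later mi commands would need to be checked."
   ]
  },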
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "We set the dataset to be an mi dataset, we use the mlong style which is described as \"memory efficient\".\n",
    "\n",
    "see: [Multiple-imputation analysis using Stata’s mi command](http://www.stata.com/meeting/boston10/boston10_marchenko.pdf)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi set mlong\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi set mlong\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "mi register imputed identifies which variables in the imputation model have \n",
    "\tmissing information and will be imputed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi register imputed ability male parented dadnssec parity married\n",
      "(16325 m=0 obs. now marked as incomplete)\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "mi register imputed ability male parented dadnssec parity married\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "mi register regular identifies variables which are the same values in the \n",
    "\timputed data and the original data (i.e. that don't have missing values)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 107,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi register regular cohort ipw poolid deadtestoutcome sweeptestoutcome samplenssec\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi register regular cohort ipw poolid deadtestoutcome sweeptestoutcome samplenssec\n",
    "\n",
    "* return to jupyter"
   ]
  },
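  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before imputing, it can be useful to confirm that the registration worked as intended and to see how much is missing on each variable. The following optional checks are a sketch and were not part of the original workflow; the variable list simply repeats the variables registered as imputed above.\n",
    "\n",
    "```stata\n",
    "* Optional checks before running mi impute (illustrative only).\n",
    "* Show the mi style and which variables are registered as imputed or regular.\n",
    "mi describe\n",
    "\n",
    "* Count missing values on each variable that is to be imputed.\n",
    "mi misstable summarize ability male parented dadnssec parity married\n",
    "\n",
    "* Check that each cohort is large enough to be imputed separately.\n",
    "tab cohort\n",
    "```\n",
    "\n",
    "The last check matters because the imputation is run separately within each cohort, so each group has to support its own set of regression models."
   ]
  },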
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We create 60 imputed datasets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 108,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi impute chained (reg) ability parity (logit) male married (mlogit) parented dadnssec, add(60) rseed(1485) by(cohort)\n",
      "\n",
      "Performing setup for each by() group:\n",
      "\n",
      "-> cohort = 1. NCDS\n",
      "Conditional models:\n",
      "            parity: regress parity i.male i.married i.parented ability i.dadnssec\n",
      "              male: logit male parity i.married i.parented ability i.dadnssec\n",
      "           married: logit married parity i.male i.parented ability i.dadnssec\n",
      "          parented: mlogit parented parity i.male i.married ability i.dadnssec\n",
      "           ability: regress ability parity i.male i.married i.parented i.dadnssec\n",
      "          dadnssec: mlogit dadnssec parity i.male i.married i.parented ability\n",
      "\n",
      "-> cohort = 2. BCS\n",
      "Conditional models:\n",
      "              male: logit male parity i.married i.parented ability i.dadnssec\n",
      "            parity: regress parity i.male i.married i.parented ability i.dadnssec\n",
      "           married: logit married i.male parity i.parented ability i.dadnssec\n",
      "          parented: mlogit parented i.male parity i.married ability i.dadnssec\n",
      "           ability: regress ability i.male parity i.married i.parented i.dadnssec\n",
      "          dadnssec: mlogit dadnssec i.male parity i.married i.parented ability\n",
      "\n",
      "Performing imputation for each by() group:\n",
      "\n",
      "-> cohort = 1. NCDS\n",
      "Performing chained iterations ...\n",
      "\n",
      "-> cohort = 2. BCS\n",
      "Performing chained iterations ...\n",
      "\n",
      "Multivariate imputation                     Imputations =       60\n",
      "Chained equations                                 added =       60\n",
      "Imputed: m=1 through m=60                       updated =        0\n",
      "\n",
      "Initialization: monotone                     Iterations =      600\n",
      "                                                burn-in =       10\n",
      "\n",
      "           ability: linear regression\n",
      "            parity: linear regression\n",
      "              male: logistic regression\n",
      "           married: logistic regression\n",
      "          parented: multinomial logistic regression\n",
      "          dadnssec: multinomial logistic regression\n",
      "\n",
      "------------------------------------------------------------------\n",
      "                   |               Observations per m             \n",
      "by()               |----------------------------------------------\n",
      "          Variable |   Complete   Incomplete   Imputed |     Total\n",
      "-------------------+-----------------------------------+----------\n",
      "cohort = 1. NCDS   |                                   |\n",
      "           ability |      13440         3975      3975 |     17415\n",
      "            parity |      17412            3         3 |     17415\n",
      "              male |      17412            3         3 |     17415\n",
      "           married |      17405           10        10 |     17415\n",
      "          parented |      15083         2332      2332 |     17415\n",
      "          dadnssec |      10598         6817      6817 |     17415\n",
      "                   |                                   |\n",
      "cohort = 2. BCS    |                                   |\n",
      "           ability |      11388         5180      5180 |     16568\n",
      "            parity |      16539           29        29 |     16568\n",
      "              male |      16562            6         6 |     16568\n",
      "           married |      16386          182       182 |     16568\n",
      "          parented |      12695         3873      3873 |     16568\n",
      "          dadnssec |      11193         5375      5375 |     16568\n",
      "                   |                                   |\n",
      "-------------------+-----------------------------------+----------\n",
      "Overall            |                                   |\n",
      "           ability |      24828         9155      9155 |     33983\n",
      "            parity |      33951           32        32 |     33983\n",
      "              male |      33974            9         9 |     33983\n",
      "           married |      33791          192       192 |     33983\n",
      "          parented |      27778         6205      6205 |     33983\n",
      "          dadnssec |      21791        12192     12192 |     33983\n",
      "------------------------------------------------------------------\n",
      "(complete + incomplete = total; imputed is the minimum across m\n",
      " of the number of filled-in observations.)\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi impute chained (reg) ability parity (logit) male married (mlogit) parented dadnssec, add(60) rseed(1485) by(cohort)\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 109,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". save $path3\\pooledNCDSBCS_v3_imputed.dta, replace\n",
      "file F:\\Data\\MYDATA\\TEMP\\pooledNCDSBCS_v3_imputed.dta saved\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "save $path3\\pooledNCDSBCS_v3_imputed.dta, replace\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Modelling Results <a class=\"anchor\" id=\"modellingresults\"></a>\n",
    "\n",
    "The models reported in table 4 are ordinary least squares (OLS) linear regression analyses of the cognitive ability test scores. The data from the two cohorts have been pooled, and the models include a dummy variable indicating cohort membership. Table 6, model 1 shows that boys have marginally lower cognitive ability test scores. Children of parents who have spent a longer period of time in education on average have higher cognitive ability test scores. There is also a large social class effect that is significant, net of parental education and gender [4](#note4).\n",
    "\n",
    "Children from the least advantaged social class NS-SEC 7 (e.g. the daughter of a construction labourer) on average score 7 points lower than children from social class NS-SEC 3 (e.g. the daughter of a police officer). By contrast children from social class NS-SEC 1.2 (e.g. the daughter of a university lecturer) on average score 2 points higher than counterparts in NS-SEC 3. Similar socio-economic inequalities in cognitive test scores have previously been reported  (see Shenkin et al., 2001; Lawlor et al., 2005; Feinstein, 2003; Sullivan et al., 2013).\n",
    "\n",
    "In model 4 (table 6) we include an interaction term representing father’s NS-SEC and cohort, to investigate changes between the cohorts. Including the interaction term in the model does not improve the proportion of variance explained overall. We do not find systematic changes NS-SEC inequalities between the two cohorts.\n",
    "\n",
    "Despite the overall lack of improvement in model fit when the interaction is included, we observe that there are some small statistically significant differences between the cohorts (table 4, model 2). To aid in the interpretation of the change in effect sizes between cohorts, a visualisation of this relationship is provided by a plot of the regression coefficients for father’s NS-SEC and 95 per cent quasi-variance comparison intervals [5](#note5) is provided (figure 1). Overall, in figure 1 there is no clear pattern of either increasing or decreasing social class inequalities between the two cohorts. \n",
    "\n",
    "BCS members have marginally lower test scores, across all social class groups. We emphasise that the cognitive ability tests in the NCDS and the BCS are not identical, however the two measures are suitable for the current analysis because our focus is on relative social class inequalities within the two cohorts. The outcome variable in this model is constructed using arithmetic standardisation. The difference between the scores in the NCDS and the BCS in this analysis should not therefore be understood as a direct assessment of the Flynn Effect. We conclude that the more parsimonous model that does not include the interaction is more appropriate.\n"
   ]
  },
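  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a reading aid, the two specifications estimated below can be sketched as follows. The notation ($\\beta$, $\\gamma_j$, $\\delta_k$, $\\lambda$, $\\theta_g$) is introduced here for illustration and is an assumption, not the paper's own notation; the variable names are those used in the Stata commands.\n",
    "\n",
    "$$\n",
    "\\text{Model 1:}\\quad \\text{ability}_i = \\beta_0 + \\beta_1\\,\\text{male}_i + \\sum_{j=2}^{4} \\gamma_j\\,\\mathbf{1}[\\text{parented}_i = j] + \\sum_{k \\neq 4} \\delta_k\\,\\mathbf{1}[\\text{dadnssec}_i = k] + \\lambda\\,\\text{cohort}_i + \\varepsilon_i\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\text{Model 2:}\\quad \\text{ability}_i = \\beta_0 + \\beta_1\\,\\text{male}_i + \\sum_{j=2}^{4} \\gamma_j\\,\\mathbf{1}[\\text{parented}_i = j] + \\sum_{g \\neq 7} \\theta_g\\,\\mathbf{1}[\\text{nsinteraction}_i = g] + \\varepsilon_i\n",
    "$$\n",
    "\n",
    "Here dadnssec has eight categories with category 4 (NS-SEC 3) as the reference, and nsinteraction has sixteen NS-SEC by cohort categories with category 7 (NCDS, NS-SEC 3) as the reference. In Model 2 each $\\theta_g$ is therefore a contrast with NCDS children in NS-SEC 3, so cohort differences are read by comparing the two coefficients within the same NS-SEC category."
   ]
  },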
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Open the dataset with multiple imputation. The models using complete cases and the weights (only) are shown in the supplementary materials."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path3\\pooledNCDSBCS_v3_imputed.dta, clear\n",
      "\n",
      ". \n",
      ". set seed 1485\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path3\\pooledNCDSBCS_v3_imputed.dta, clear\n",
    "\n",
    "set seed 1485\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We included information on the deceased cohort members in the multiple imputation, above (to provide additional information in the model). We now delete the cases who were deceased by the time of the age 10/11 surveys. We do this as we do not want to model outcomes for cohort members who were deceased at the time the ability test was taken."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". tab deadtestoutcome, mi\n",
      "\n",
      "Dead at age |\n",
      "      10/11 |\n",
      "     survey |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |    925,948       91.36       91.36\n",
      "     1. Yes |     87,535        8.64      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |  1,013,483      100.00\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "tab deadtestoutcome, mi\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". summ ability if (deadtestoutcome==1)\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     ability |     86,100    99.38687    15.00202   28.07528   160.2226\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "summ ability if (deadtestoutcome==1)\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". drop if deadtestoutcome==1\n",
      "(87,535 observations deleted)\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "drop if deadtestoutcome==1\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We create an interaction variable. We created this above for the complete records analysis, but we recreate it again here after the multiple imputation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". capture drop nsinteraction\n",
      "\n",
      ". gen nsinteraction = .\n",
      "(925,948 missing values generated)\n",
      "\n",
      ". replace nsinteraction = 1 if ((dadnssec==1)&(cohort==1))\n",
      "(13,535 real changes made)\n",
      "\n",
      ". replace nsinteraction = 2 if ((dadnssec==1)&(cohort==2))\n",
      "(22,498 real changes made)\n",
      "\n",
      ". replace nsinteraction = 3 if ((dadnssec==2)&(cohort==1))\n",
      "(18,391 real changes made)\n",
      "\n",
      ". replace nsinteraction = 4 if ((dadnssec==2)&(cohort==2))\n",
      "(27,622 real changes made)\n",
      "\n",
      ". replace nsinteraction = 5 if ((dadnssec==3)&(cohort==1))\n",
      "(47,752 real changes made)\n",
      "\n",
      ". replace nsinteraction = 6 if ((dadnssec==3)&(cohort==2))\n",
      "(69,934 real changes made)\n",
      "\n",
      ". replace nsinteraction = 7 if ((dadnssec==4)&(cohort==1))\n",
      "(39,097 real changes made)\n",
      "\n",
      ". replace nsinteraction = 8 if ((dadnssec==4)&(cohort==2))\n",
      "(39,944 real changes made)\n",
      "\n",
      ". replace nsinteraction = 9 if ((dadnssec==5)&(cohort==1))\n",
      "(51,654 real changes made)\n",
      "\n",
      ". replace nsinteraction = 10 if ((dadnssec==5)&(cohort==2))\n",
      "(66,476 real changes made)\n",
      "\n",
      ". replace nsinteraction = 11 if ((dadnssec==6)&(cohort==1))\n",
      "(70,532 real changes made)\n",
      "\n",
      ". replace nsinteraction = 12 if ((dadnssec==6)&(cohort==2))\n",
      "(79,389 real changes made)\n",
      "\n",
      ". replace nsinteraction = 13 if ((dadnssec==7)&(cohort==1))\n",
      "(77,046 real changes made)\n",
      "\n",
      ". replace nsinteraction = 14 if ((dadnssec==7)&(cohort==2))\n",
      "(73,037 real changes made)\n",
      "\n",
      ". replace nsinteraction = 15 if ((dadnssec==8)&(cohort==1))\n",
      "(110,371 real changes made)\n",
      "\n",
      ". replace nsinteraction = 16 if ((dadnssec==8)&(cohort==2))\n",
      "(107,913 real changes made)\n",
      "\n",
      ". tab nsinteraction\n",
      "\n",
      "nsinteracti |\n",
      "         on |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |     13,535        1.48        1.48\n",
      "          2 |     22,498        2.46        3.94\n",
      "          3 |     18,391        2.01        5.95\n",
      "          4 |     27,622        3.02        8.96\n",
      "          5 |     47,752        5.22       14.18\n",
      "          6 |     69,934        7.64       21.82\n",
      "          7 |     39,097        4.27       26.10\n",
      "          8 |     39,944        4.36       30.46\n",
      "          9 |     51,654        5.64       36.10\n",
      "         10 |     66,476        7.26       43.37\n",
      "         11 |     70,532        7.71       51.08\n",
      "         12 |     79,389        8.67       59.75\n",
      "         13 |     77,046        8.42       68.17\n",
      "         14 |     73,037        7.98       76.15\n",
      "         15 |    110,371       12.06       88.21\n",
      "         16 |    107,913       11.79      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |    915,191      100.00\n",
      "\n",
      ". label variable nsinteraction \"NSSEC Interaction\"\n",
      "\n",
      ". label define nsint 1 \"NCDS 1.1\" 2 \"BCS 1.1\" 3 \"NCDS 1.2\" 4 \"BCS 1.2\" 5 \"NCDS 2\" 6 \"BCS 2\" 7 \"NCDS 3\" 8 \"BCS 3\" 9 \"NCDS 4\" 10 \"BCS 4\" 11 \"N\n",
      "> CDS 5\" 12 \"BCS 5\" 13 \"NCDS 6\" 14 \"BCS 6\" 15 \"NCDS 7\" 16 \"BCS 7\", replace\n",
      "\n",
      ". label values nsinteraction nsint\n",
      "\n",
      ". \n",
      ". mi register passive nsinteraction\n",
      "(system variable _mi_id updated due to changed number of obs.)\n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "capture drop nsinteraction\n",
    "gen nsinteraction = .\n",
    "replace nsinteraction = 1 if ((dadnssec==1)&(cohort==1))\n",
    "replace nsinteraction = 2 if ((dadnssec==1)&(cohort==2))\n",
    "replace nsinteraction = 3 if ((dadnssec==2)&(cohort==1))\n",
    "replace nsinteraction = 4 if ((dadnssec==2)&(cohort==2))\n",
    "replace nsinteraction = 5 if ((dadnssec==3)&(cohort==1))\n",
    "replace nsinteraction = 6 if ((dadnssec==3)&(cohort==2))\n",
    "replace nsinteraction = 7 if ((dadnssec==4)&(cohort==1))\n",
    "replace nsinteraction = 8 if ((dadnssec==4)&(cohort==2))\n",
    "replace nsinteraction = 9 if ((dadnssec==5)&(cohort==1))\n",
    "replace nsinteraction = 10 if ((dadnssec==5)&(cohort==2))\n",
    "replace nsinteraction = 11 if ((dadnssec==6)&(cohort==1))\n",
    "replace nsinteraction = 12 if ((dadnssec==6)&(cohort==2))\n",
    "replace nsinteraction = 13 if ((dadnssec==7)&(cohort==1))\n",
    "replace nsinteraction = 14 if ((dadnssec==7)&(cohort==2))\n",
    "replace nsinteraction = 15 if ((dadnssec==8)&(cohort==1))\n",
    "replace nsinteraction = 16 if ((dadnssec==8)&(cohort==2))\n",
    "tab nsinteraction\n",
    "label variable nsinteraction \"NSSEC Interaction\"\n",
    "label define nsint 1 \"NCDS 1.1\" 2 \"BCS 1.1\" 3 \"NCDS 1.2\" 4 \"BCS 1.2\" 5 \"NCDS 2\" 6 \"BCS 2\" 7 \"NCDS 3\" 8 \"BCS 3\" 9 \"NCDS 4\" 10 \"BCS 4\" 11 \"NCDS 5\" 12 \"BCS 5\" 13 \"NCDS 6\" 14 \"BCS 6\" 15 \"NCDS 7\" 16 \"BCS 7\", replace\n",
    "label values nsinteraction nsint\n",
    "\n",
    "mi register passive nsinteraction\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The models in table 4 include multiple imputation and the inverse probability weights.\n",
    "\n",
    "For these models we also keep only those present at age 10/11.\n",
    "\n",
    "See: Seaman, S. R., White, I. R., Copas, A. J., & Li, L. (2012). [Combining multiple \n",
    "imputation and inverse probability weighting](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3412287/). Biometrics, 68(1), 129-137.\n",
    "\n",
    "\" Some researchers may prefer to use straightforward MI (what we called MI/MI). \n",
    "Provided that the imputation models are correctly specified, this will be more \n",
    "efficient than IPW/MI. However, our (admittedly contrived) simulations and \n",
    "(not contrived) real data example have shown that those who prefer IPW/MI have \n",
    "some justification for their caution. A possible use for IPW/MI is as a check, \n",
    "or diagnostic, for MI/MI. If the results of IPW/MI and MI/MI are very different,\n",
    "further exploration would be warranted, possibly leading to refinement of the \n",
    "imputation model.\"\n",
    "\n",
    "\n",
    "These two papers also use this method:\n",
    "\n",
    "Caldwell, T. M., Rodgers, B., Clark, C., Jefferis, B. J. M. H., Stansfeld, \n",
    "S. A., & Power, C. (2008). Lifecourse socioeconomic predictors of midlife \n",
    "drinking patterns, problems and abstention: findings from the 1958 British \n",
    "Birth Cohort Study. Drug and alcohol dependence, 95(3), 269-278.\n",
    "\n",
    "Stansfeld, S. A., Clark, C., Caldwell, T., Rodgers, B., & Power, C. (2008). \n",
    "Psychosocial work characteristics and anxiety and depressive disorders in \n",
    "midlife: the effects of prior psychological distress. Occupational and \n",
    "Environmental Medicine, 65(9), 634-642.\t\n",
    "\n",
    "\"Multiple imputation was used to address missing data in the\n",
    "analyses, using the ICE programme in STATA. All psychological\n",
    "health, sociodemographic and work variables reported in this\n",
    "paper were included in the imputation equations; employment\n",
    "status at 33 and father’s social class at 7 and own social class at\n",
    "42 were also included as they were significantly associated with\n",
    "attrition....\n",
    "\n",
    " All living participants were included in the imputation,\n",
    "but analyses were conducted only for those who\n",
    "participated in the study at age 45 and were in paid employment\n",
    "(n = 8243).....\n",
    "\n",
    "In order to address attrition, inverse probability\n",
    "weights were then estimated from a logistic regression model\n",
    "predicting participation in the study at age 45. Sex and all of the\n",
    "independent variables used in the imputation equation, except\n",
    "those measured at 45, and all significant two-way interactions\n",
    "were used as predictors in this logistic regression. The weight\n",
    "was applied to all analyses in this paper.\"\n",
    "\t\n"
   ]
  },
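  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A brief sketch of how the estimates reported below are combined across the imputed datasets. The symbols are introduced here for illustration and are an assumption, not the paper's notation: $w_i$ is the inverse probability weight (ipw), $M = 60$ is the number of imputations, and $\\hat{\\mathbf{b}}^{(m)}$ is the weighted least squares coefficient vector in imputed dataset $m$.\n",
    "\n",
    "$$\n",
    "\\hat{\\mathbf{b}}^{(m)} = \\arg\\min_{\\mathbf{b}} \\sum_i w_i \\bigl( y_i^{(m)} - \\mathbf{x}_i^{(m)\\prime} \\mathbf{b} \\bigr)^2\n",
    "$$\n",
    "\n",
    "For any single coefficient $\\hat{\\beta}^{(m)}$ taken from $\\hat{\\mathbf{b}}^{(m)}$, with robust variance estimate $U^{(m)}$, the pooled estimate and its total variance are\n",
    "\n",
    "$$\n",
    "\\bar{\\beta} = \\frac{1}{M} \\sum_{m=1}^{M} \\hat{\\beta}^{(m)}, \\qquad \\bar{U} = \\frac{1}{M} \\sum_{m=1}^{M} U^{(m)}, \\qquad B = \\frac{1}{M-1} \\sum_{m=1}^{M} \\bigl( \\hat{\\beta}^{(m)} - \\bar{\\beta} \\bigr)^2, \\qquad T = \\bar{U} + \\Bigl( 1 + \\frac{1}{M} \\Bigr) B\n",
    "$$\n",
    "\n",
    "These are Rubin's rules, which mi estimate applies; the reported standard error is $\\sqrt{T}$. The relative variance increase for a coefficient is $(1 + 1/M)\\,B / \\bar{U}$, and the Average RVI in the output is its average over the coefficients."
   ]
  },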
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Keep only those present at the age 10/11 surveys."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". estimates clear\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "estimates clear\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". tab sweeptestoutcome\n",
      "\n",
      " Productive |\n",
      "     at age |\n",
      "      10/11 |\n",
      "     survey |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |    257,237       27.78       27.78\n",
      "     1. Yes |    668,711       72.22      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |    925,948      100.00\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "tab sweeptestoutcome\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *keep only those productive at age 10/11\n",
      "\n",
      ". keep if sweeptestoutcome ==1\n",
      "(257,237 observations deleted)\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*keep only those productive at age 10/11\n",
    "keep if sweeptestoutcome ==1\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *TABLE 4 MODEL 1\n",
      "\n",
      ". \n",
      ". estimates clear\n",
      "\n",
      ". \n",
      ". mi estimate, post: regress ability male i.parented ib4.dadnssec cohort [pweight=ipw], allbaselevels\n",
      "(system variable _mi_id updated due to changed number of obs.)\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.3613\n",
      "                                                Largest FMI       =     0.4366\n",
      "                                                Complete DF       =      28318\n",
      "DF adjustment:   Small sample                   DF:     min       =     308.70\n",
      "                                                        avg       =     874.12\n",
      "                                                        max       =   2,278.12\n",
      "Model F test:       Equal FMI                   F(  12, 7041.2)   =     297.81\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "------------------------------------------------------------------------------\n",
      "     ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "-------------+----------------------------------------------------------------\n",
      "        male |  -.5529273    .179601    -3.08   0.002    -.9051258   -.2007288\n",
      "             |\n",
      "    parented |\n",
      "          2  |   5.865727   .2354561    24.91   0.000     5.403645    6.327808\n",
      "          3  |   8.298234   .5356372    15.49   0.000     7.246654    9.349814\n",
      "          4  |   10.62638   .4562337    23.29   0.000     9.730598    11.52217\n",
      "             |\n",
      "    dadnssec |\n",
      "          1  |   1.787366   .5802228     3.08   0.002     .6480052    2.926727\n",
      "          2  |   2.279596   .5910755     3.86   0.000     1.116549    3.442643\n",
      "          3  |   1.190626   .4294348     2.77   0.006     .3469172    2.034335\n",
      "          5  |  -3.526372   .4348423    -8.11   0.000    -4.380374    -2.67237\n",
      "          6  |  -3.306138   .4132835    -8.00   0.000    -4.117849   -2.494427\n",
      "          7  |  -4.797451   .4256146   -11.27   0.000    -5.633624   -3.961277\n",
      "          8  |  -7.168137   .4124364   -17.38   0.000    -7.978642   -6.357632\n",
      "             |\n",
      "      cohort |  -2.087461   .1840391   -11.34   0.000    -2.448375   -1.726548\n",
      "       _cons |   104.0589   .4338118   239.87   0.000     103.2077    104.9102\n",
      "------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*TABLE 4 MODEL 1\n",
    "\n",
    "estimates clear\n",
    "\n",
    "mi estimate, post: regress ability male i.parented ib4.dadnssec cohort [pweight=ipw], allbaselevels\n",
    "\n",
    "* return to jupyter"
   ]
  },
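  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a worked example of reading the Model 1 coefficients, write $\\hat{\\delta}_k$ for the estimated coefficient on dadnssec category $k$ (notation introduced here for illustration). Following the nsint value labels defined above, dadnssec 2 is NS-SEC 1.2, 4 is NS-SEC 3 and 8 is NS-SEC 7. Because both coefficients are contrasts with NS-SEC 3, the estimated gap between NS-SEC 1.2 and NS-SEC 7 is their difference:\n",
    "\n",
    "$$\n",
    "\\hat{\\delta}_2 - \\hat{\\delta}_8 = 2.279596 - (-7.168137) = 9.447733\n",
    "$$\n",
    "\n",
    "That is, net of gender, parental education and cohort, children in NS-SEC 1.2 score about 9.4 points higher on average than children in NS-SEC 7. This is a point estimate only; a standard error for the difference would need the covariance of the two coefficients, which is not shown in the output."
   ]
  },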
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mibeta ability male i.parented ib4.dadnssec cohort [pweight=ipw], allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.3613\n",
      "                                                Largest FMI       =     0.4365\n",
      "                                                Complete DF       =      28318\n",
      "DF adjustment:   Small sample                   DF:     min       =     308.70\n",
      "                                                        avg       =     874.12\n",
      "                                                        max       =   2,278.12\n",
      "Model F test:       Equal FMI                   F(  12, 7041.2)   =     297.81\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "------------------------------------------------------------------------------\n",
      "     ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "-------------+----------------------------------------------------------------\n",
      "        male |  -.5529273    .179601    -3.08   0.002    -.9051258   -.2007288\n",
      "             |\n",
      "    parented |\n",
      "          2  |   5.865727   .2354561    24.91   0.000     5.403645    6.327808\n",
      "          3  |   8.298234   .5356372    15.49   0.000     7.246654    9.349814\n",
      "          4  |   10.62638   .4562337    23.29   0.000     9.730598    11.52217\n",
      "             |\n",
      "    dadnssec |\n",
      "          1  |   1.787366   .5802228     3.08   0.002     .6480052    2.926727\n",
      "          2  |   2.279596   .5910755     3.86   0.000     1.116549    3.442643\n",
      "          3  |   1.190626   .4294348     2.77   0.006     .3469172    2.034335\n",
      "          5  |  -3.526372   .4348423    -8.11   0.000    -4.380374    -2.67237\n",
      "          6  |  -3.306138   .4132835    -8.00   0.000    -4.117849   -2.494427\n",
      "          7  |  -4.797451   .4256146   -11.27   0.000    -5.633624   -3.961277\n",
      "          8  |  -7.168137   .4124364   -17.38   0.000    -7.978642   -6.357632\n",
      "             |\n",
      "      cohort |  -2.087461   .1840391   -11.34   0.000    -2.448375   -1.726548\n",
      "       _cons |   104.0589   .4338118   239.87   0.000     103.2077    104.9102\n",
      "------------------------------------------------------------------------------\n",
      "\n",
      "Standardized coefficients and R-squared\n",
      "Summary statistics over 60 imputations\n",
      "\n",
      "             |       mean       min        p25     median        p75       max\n",
      "-------------+----------------------------------------------------------------\n",
      "        male |  -.0184715    -.0232  -.0200423  -.0184812  -.0170057    -.0117\n",
      "             |\n",
      "    parented |\n",
      "          2  |   .1745555      .168   .1718451     .17444   .1770959      .185\n",
      "          3  |   .1019757     .0924   .1000508   .1016537   .1045144      .109\n",
      "          4  |   .1657808      .157   .1625834   .1663338   .1687355      .176\n",
      "             |\n",
      "    dadnssec |\n",
      "          1  |   .0229434     .0125   .0205665   .0230113   .0258286     .0314\n",
      "          2  |   .0337025     .0219   .0294873   .0337864   .0377892     .0497\n",
      "          3  |   .0267778    .00998   .0239177   .0274362   .0305021     .0414\n",
      "          5  |  -.0785171    -.0945  -.0812713   -.077357  -.0740187    -.0697\n",
      "          6  |  -.0826347    -.0963  -.0866849  -.0822976  -.0787418    -.0698\n",
      "          7  |  -.1187835      -.13  -.1226171  -.1182057  -.1147945     -.101\n",
      "          8  |   -.201945     -.215  -.2071063  -.2027168  -.1972093     -.186\n",
      "             |\n",
      "      cohort |  -.0697507    -.0759  -.0717631  -.0691932  -.0681856    -.0654\n",
      "-------------+----------------------------------------------------------------\n",
      "    R-square |   .1389302      .134   .1376231   .1393436   .1400777      .144\n",
      "Adj R-square |   .1385653      .134   .1372576   .1389789   .1397133      .143\n",
      "------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mibeta ability male i.parented ib4.dadnssec cohort [pweight=ipw], allbaselevels\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *TABLE 4 MODEL 2\n",
      "\n",
      ". \n",
      ". estimates clear\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*TABLE 4 MODEL 2\n",
    "\n",
    "estimates clear\n",
    "\n",
    "* return to jupyter\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi estimate, post: regress ability male i.parented ib7.nsinteraction [pweight=ipw], allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.4115\n",
      "                                                Largest FMI       =     0.3949\n",
      "                                                Complete DF       =      28311\n",
      "DF adjustment:   Small sample                   DF:     min       =     376.16\n",
      "                                                        avg       =     611.14\n",
      "                                                        max       =   2,252.49\n",
      "Model F test:       Equal FMI                   F(  19, 8704.8)   =     182.60\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5596313   .1796991    -3.11   0.002    -.9120244   -.2072382\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.871626   .2362326    24.86   0.000     5.408002     6.33525\n",
      "           3  |    8.30813   .5331239    15.58   0.000     7.261597    9.354663\n",
      "           4  |   10.64564   .4568768    23.30   0.000     9.748591    11.54268\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |   2.823372    .864195     3.27   0.001     1.126202    4.520541\n",
      "     BCS 1.1  |  -.6376888   .7817899    -0.82   0.415     -2.17297    .8975921\n",
      "    NCDS 1.2  |   1.849502   .7733177     2.39   0.017     .3296352    3.369368\n",
      "     BCS 1.2  |   .9621695   .7714264     1.25   0.213    -.5543167    2.478656\n",
      "      NCDS 2  |   1.708454   .6153074     2.78   0.006     .4989125    2.917995\n",
      "       BCS 2  |  -.9002437   .5898865    -1.53   0.128    -2.058989    .2585018\n",
      "       BCS 3  |  -1.585695    .683324    -2.32   0.021    -2.928931   -.2424602\n",
      "      NCDS 4  |  -3.241009   .6280868    -5.16   0.000    -4.475416   -2.006603\n",
      "       BCS 4  |  -5.423505    .624567    -8.68   0.000    -6.651111     -4.1959\n",
      "      NCDS 5  |  -2.937072    .592384    -4.96   0.000    -4.101296   -1.772848\n",
      "       BCS 5  |  -5.289967   .5726627    -9.24   0.000    -6.414886   -4.165048\n",
      "      NCDS 6  |   -4.47695   .5983433    -7.48   0.000    -5.653467   -3.300433\n",
      "       BCS 6  |  -6.752555   .5875267   -11.49   0.000    -7.906433   -5.598676\n",
      "      NCDS 7  |  -7.119842   .5645244   -12.61   0.000    -8.229579   -6.010104\n",
      "       BCS 7  |   -8.78016   .5601276   -15.68   0.000     -9.88061    -7.67971\n",
      "              |\n",
      "        _cons |   101.7356   .4928228   206.43   0.000     100.7669    102.7043\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi estimate, post: regress ability male i.parented ib7.nsinteraction [pweight=ipw], allbaselevels\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mibeta ability male i.parented ib7.nsinteraction [pweight=ipw], allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.4115\n",
      "                                                Largest FMI       =     0.3948\n",
      "                                                Complete DF       =      28311\n",
      "DF adjustment:   Small sample                   DF:     min       =     376.16\n",
      "                                                        avg       =     611.14\n",
      "                                                        max       =   2,252.49\n",
      "Model F test:       Equal FMI                   F(  19, 8704.8)   =     182.60\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5596313   .1796991    -3.11   0.002    -.9120244   -.2072382\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.871626   .2362326    24.86   0.000     5.408002     6.33525\n",
      "           3  |    8.30813   .5331239    15.58   0.000     7.261597    9.354663\n",
      "           4  |   10.64564   .4568768    23.30   0.000     9.748591    11.54268\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |   2.823372    .864195     3.27   0.001     1.126202    4.520541\n",
      "     BCS 1.1  |  -.6376888   .7817899    -0.82   0.415     -2.17297    .8975921\n",
      "    NCDS 1.2  |   1.849502   .7733177     2.39   0.017     .3296352    3.369368\n",
      "     BCS 1.2  |   .9621695   .7714264     1.25   0.213    -.5543167    2.478656\n",
      "      NCDS 2  |   1.708454   .6153074     2.78   0.006     .4989125    2.917995\n",
      "       BCS 2  |  -.9002437   .5898865    -1.53   0.128    -2.058989    .2585018\n",
      "       BCS 3  |  -1.585695    .683324    -2.32   0.021    -2.928931   -.2424602\n",
      "      NCDS 4  |  -3.241009   .6280868    -5.16   0.000    -4.475416   -2.006603\n",
      "       BCS 4  |  -5.423505    .624567    -8.68   0.000    -6.651111     -4.1959\n",
      "      NCDS 5  |  -2.937072    .592384    -4.96   0.000    -4.101296   -1.772848\n",
      "       BCS 5  |  -5.289967   .5726627    -9.24   0.000    -6.414886   -4.165048\n",
      "      NCDS 6  |   -4.47695   .5983433    -7.48   0.000    -5.653467   -3.300433\n",
      "       BCS 6  |  -6.752555   .5875267   -11.49   0.000    -7.906433   -5.598676\n",
      "      NCDS 7  |  -7.119842   .5645244   -12.61   0.000    -8.229579   -6.010104\n",
      "       BCS 7  |   -8.78016   .5601276   -15.68   0.000     -9.88061    -7.67971\n",
      "              |\n",
      "        _cons |   101.7356   .4928228   206.43   0.000     100.7669    102.7043\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      "Standardized coefficients and R-squared\n",
      "Summary statistics over 60 imputations\n",
      "\n",
      "             |       mean       min        p25     median        p75       max\n",
      "-------------+----------------------------------------------------------------\n",
      "        male |  -.0186955    -.0234  -.0202557  -.0187198  -.0171167     -.012\n",
      "             |\n",
      "    parented |\n",
      "          2  |   .1747311      .168   .1718273   .1747024   .1774466      .186\n",
      "          3  |   .1020976     .0932   .1003903   .1019859   .1046074      .109\n",
      "          4  |   .1660812      .157   .1629075    .166622   .1689558      .176\n",
      "             |\n",
      "nsinteract~n |\n",
      "   NCDS 1.1  |   .0236479     .0145   .0214039   .0242233   .0263084     .0326\n",
      "    BCS 1.1  |  -.0063036    -.0159  -.0092196   -.005852   -.003795    .00758\n",
      "   NCDS 1.2  |   .0186201    .00818   .0153741   .0191881   .0207888     .0315\n",
      "    BCS 1.2  |   .0106749   -.00121   .0069641   .0108788   .0138816     .0218\n",
      "     NCDS 2  |   .0269205     .0077    .023743   .0274129   .0303461     .0459\n",
      "      BCS 2  |  -.0154647    -.0312  -.0189422  -.0141758  -.0118529   -.00146\n",
      "      BCS 3  |  -.0212993    -.0328  -.0244832  -.0221359  -.0179404   -.00803\n",
      "     NCDS 4  |  -.0525572    -.0678  -.0564814  -.0527218  -.0476009    -.0407\n",
      "      BCS 4  |  -.0889345     -.106  -.0934298  -.0887321  -.0842438    -.0748\n",
      "     NCDS 5  |  -.0546596    -.0712  -.0589729  -.0541065   -.050261    -.0381\n",
      "      BCS 5  |  -.0977622     -.115  -.1010194   -.096559  -.0935974    -.0884\n",
      "     NCDS 6  |  -.0864516     -.104  -.0906362  -.0861367  -.0814556    -.0721\n",
      "      BCS 6  |  -.1168589      -.13  -.1203538  -.1155812  -.1128864     -.106\n",
      "     NCDS 7  |  -.1591148     -.178  -.1644767  -.1588776  -.1534594     -.145\n",
      "      BCS 7  |  -.1782031     -.192   -.182083   -.178447  -.1732933     -.164\n",
      "-------------+----------------------------------------------------------------\n",
      "    R-square |   .1393231      .134   .1379374   .1397493   .1404945      .144\n",
      "Adj R-square |   .1387454      .134   .1373589   .1391719   .1399177      .143\n",
      "------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mibeta ability male i.parented ib7.nsinteraction [pweight=ipw], allbaselevels\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://i.imgur.com/6JEIXqv.png\" alt=\"Table 4\">"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". * Here I am repeating table 6 model 2 whilst changing to reference category\n",
      "\n",
      ". * This allows us to compare between categories in the interaction.\n",
      "\n",
      ". \n",
      ". mi estimate: regress ability male i.parented ib1.nsinteraction [pweight=ipw], allbaselevels \n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.4115\n",
      "                                                Largest FMI       =     0.3263\n",
      "                                                Complete DF       =      28311\n",
      "DF adjustment:   Small sample                   DF:     min       =     546.37\n",
      "                                                        avg       =     917.99\n",
      "                                                        max       =   2,252.49\n",
      "Model F test:       Equal FMI                   F(  19, 8704.8)   =     182.60\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5596313   .1796991    -3.11   0.002    -.9120244   -.2072382\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.871626   .2362326    24.86   0.000     5.408002     6.33525\n",
      "           3  |    8.30813   .5331239    15.58   0.000     7.261597    9.354663\n",
      "           4  |   10.64564   .4568768    23.30   0.000     9.748591    11.54268\n",
      "              |\n",
      "nsinteraction |\n",
      "     BCS 1.1  |  -3.461061   .9929515    -3.49   0.001     -5.41153   -1.510591\n",
      "    NCDS 1.2  |    -.97387   .9429399    -1.03   0.302    -2.825543    .8778034\n",
      "     BCS 1.2  |  -1.861202   .9158719    -2.03   0.042     -3.65911   -.0632942\n",
      "      NCDS 2  |  -1.114918     .80775    -1.38   0.168    -2.700447    .4706112\n",
      "       BCS 2  |  -3.723615   .7835263    -4.75   0.000    -5.260958   -2.186273\n",
      "      NCDS 3  |  -2.823372    .864195    -3.27   0.001    -4.520541   -1.126202\n",
      "       BCS 3  |  -4.409067    .859921    -5.13   0.000    -6.097227   -2.720907\n",
      "      NCDS 4  |  -6.064381   .8117831    -7.47   0.000    -7.657461   -4.471301\n",
      "       BCS 4  |  -8.246877   .8179949   -10.08   0.000     -9.85251   -6.641245\n",
      "      NCDS 5  |  -5.760444   .7813254    -7.37   0.000    -7.293525   -4.227363\n",
      "       BCS 5  |  -8.113339   .7773216   -10.44   0.000    -9.638575   -6.588103\n",
      "      NCDS 6  |  -7.300322   .7878075    -9.27   0.000    -8.846467   -5.754177\n",
      "       BCS 6  |  -9.575926   .7966527   -12.02   0.000    -11.13922   -8.012632\n",
      "      NCDS 7  |  -9.943213   .7866582   -12.64   0.000    -11.48757   -8.398859\n",
      "       BCS 7  |  -11.60353   .7729728   -15.01   0.000    -13.12038   -10.08668\n",
      "              |\n",
      "        _cons |    104.559   .7301608   143.20   0.000      103.126    105.9919\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      ". *There is a significant difference between BCS and NCDS for 1.1\n",
      "\n"
     ]
    }
   ],
   "source": [
    "* Here I am repeating table 6 model 2 whilst changing the reference category\n",
    "* This allows us to compare between categories in the interaction.\n",
    "\n",
    "mi estimate: regress ability male i.parented ib1.nsinteraction [pweight=ipw], allbaselevels \n",
    "* BCS 1.1 differs significantly from NCDS 1.1 (coef. -3.46, p = 0.001)\n"
   ]
  },
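  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The re-fit above changes the reference category so that the BCS versus NCDS contrast within a class can be read off a single coefficient. As a hedged alternative sketch (not the authors' procedure), the same contrast could in principle be requested as a transformation of coefficients within `mi estimate`, keeping the `ib7` parameterisation used under the TABLE 4 MODEL 2 heading. The name `diff11` is hypothetical, and the level numbers are an assumption inferred from the output: level 1 is taken to be NCDS 1.1 and level 2 to be BCS 1.1.\n",
    "\n",
    "```stata\n",
    "* Sketch only: assumes nsinteraction level 1 = NCDS 1.1 and level 2 = BCS 1.1\n",
    "* diff11 is a hypothetical label for the BCS 1.1 minus NCDS 1.1 contrast\n",
    "mi estimate (diff11: _b[2.nsinteraction] - _b[1.nsinteraction]): regress ability male i.parented ib7.nsinteraction [pweight=ipw]\n",
    "```\n",
    "\n",
    "From the reported point estimates, -.6376888 - 2.823372 = -3.4610608, which agrees with the BCS 1.1 coefficient of -3.461061 in the `ib1` re-fit, so the two approaches should give the same contrast."
   ]
  },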
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi estimate: regress ability male i.parented ib3.nsinteraction  [pweight=ipw], allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.4115\n",
      "                                                Largest FMI       =     0.4151\n",
      "                                                Complete DF       =      28311\n",
      "DF adjustment:   Small sample                   DF:     min       =     340.89\n",
      "                                                        avg       =     644.64\n",
      "                                                        max       =   2,252.49\n",
      "Model F test:       Equal FMI                   F(  19, 8704.8)   =     182.60\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5596313   .1796991    -3.11   0.002    -.9120244   -.2072382\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.871626   .2362326    24.86   0.000     5.408002     6.33525\n",
      "           3  |    8.30813   .5331239    15.58   0.000     7.261597    9.354663\n",
      "           4  |   10.64564   .4568768    23.30   0.000     9.748591    11.54268\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |     .97387   .9429399     1.03   0.302    -.8778034    2.825543\n",
      "     BCS 1.1  |  -2.487191   .8802418    -2.83   0.005    -4.216437   -.7579438\n",
      "     BCS 1.2  |  -.8873322   .8054189    -1.10   0.271    -2.469263    .6945985\n",
      "      NCDS 2  |  -.1410478   .7311071    -0.19   0.847    -1.578167    1.296071\n",
      "       BCS 2  |  -2.749745   .6848951    -4.01   0.000    -4.094722   -1.404769\n",
      "      NCDS 3  |  -1.849502   .7733177    -2.39   0.017    -3.369368   -.3296352\n",
      "       BCS 3  |  -3.435197   .8188311    -4.20   0.000    -5.045795   -1.824599\n",
      "      NCDS 4  |  -5.090511   .7340165    -6.94   0.000      -6.5323   -3.648723\n",
      "       BCS 4  |  -7.273007   .7331959    -9.92   0.000     -8.71345   -5.832564\n",
      "      NCDS 5  |  -4.786574   .7140294    -6.70   0.000    -6.189363   -3.383785\n",
      "       BCS 5  |  -7.139469     .70503   -10.13   0.000    -8.524578   -5.754359\n",
      "      NCDS 6  |  -6.326452   .7030693    -9.00   0.000    -7.707528   -4.945375\n",
      "       BCS 6  |  -8.602056   .7171048   -12.00   0.000    -10.01055   -7.193567\n",
      "      NCDS 7  |  -8.969343   .6840177   -13.11   0.000    -10.31299   -7.625696\n",
      "       BCS 7  |  -10.62966   .7022876   -15.14   0.000     -12.0096   -9.249726\n",
      "              |\n",
      "        _cons |   103.5851    .641653   161.43   0.000     102.3244    104.8458\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      ". *There is not a significant difference between BCS and NCDS for 1.2\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi estimate: regress ability male i.parented ib3.nsinteraction  [pweight=ipw], allbaselevels\n",
    "* BCS 1.2 does not differ significantly from NCDS 1.2 (coef. -0.89, p = 0.271)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi estimate: regress ability male i.parented ib5.nsinteraction  [pweight=ipw], allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.4115\n",
      "                                                Largest FMI       =     0.3774\n",
      "                                                Complete DF       =      28311\n",
      "DF adjustment:   Small sample                   DF:     min       =     411.08\n",
      "                                                        avg       =     661.86\n",
      "                                                        max       =   2,252.49\n",
      "Model F test:       Equal FMI                   F(  19, 8704.8)   =     182.60\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5596313   .1796991    -3.11   0.002    -.9120244   -.2072382\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.871626   .2362326    24.86   0.000     5.408002     6.33525\n",
      "           3  |    8.30813   .5331239    15.58   0.000     7.261597    9.354663\n",
      "           4  |   10.64564   .4568768    23.30   0.000     9.748591    11.54268\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |   1.114918     .80775     1.38   0.168    -.4706112    2.700447\n",
      "     BCS 1.1  |  -2.346143   .7565414    -3.10   0.002    -3.832342   -.8599437\n",
      "    NCDS 1.2  |   .1410478   .7311071     0.19   0.847    -1.296071    1.578167\n",
      "     BCS 1.2  |  -.7462844   .7104072    -1.05   0.294    -2.142445    .6498759\n",
      "       BCS 2  |  -2.608698   .5483809    -4.76   0.000    -3.686405    -1.53099\n",
      "      NCDS 3  |  -1.708454   .6153074    -2.78   0.006    -2.917995   -.4989125\n",
      "       BCS 3  |  -3.294149   .6291214    -5.24   0.000    -4.530563   -2.057736\n",
      "      NCDS 4  |  -4.949463   .5637005    -8.78   0.000    -6.056628   -3.842298\n",
      "       BCS 4  |  -7.131959    .554731   -12.86   0.000    -8.221485   -6.042433\n",
      "      NCDS 5  |  -4.645526   .5202506    -8.93   0.000    -5.667113   -3.623939\n",
      "       BCS 5  |  -6.998421   .5229782   -13.38   0.000    -8.025714   -5.971127\n",
      "      NCDS 6  |  -6.185404   .5216915   -11.86   0.000    -7.210182   -5.160626\n",
      "       BCS 6  |  -8.461009   .5648557   -14.98   0.000    -9.571182   -7.350835\n",
      "      NCDS 7  |  -8.828296   .5072247   -17.41   0.000    -9.825113   -7.831479\n",
      "       BCS 7  |  -10.48861   .4902061   -21.40   0.000    -11.45097   -9.526259\n",
      "              |\n",
      "        _cons |   103.4441    .416877   248.14   0.000     102.6253    104.2628\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      ". *There is a significant difference between BCS and NCDS for 2\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi estimate: regress ability male i.parented ib5.nsinteraction  [pweight=ipw], allbaselevels\n",
    "* BCS 2 differs significantly from NCDS 2 (coef. -2.61, p < 0.001)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi estimate: regress ability male i.parented ib7.nsinteraction  [pweight=ipw], allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.4115\n",
      "                                                Largest FMI       =     0.3949\n",
      "                                                Complete DF       =      28311\n",
      "DF adjustment:   Small sample                   DF:     min       =     376.16\n",
      "                                                        avg       =     611.14\n",
      "                                                        max       =   2,252.49\n",
      "Model F test:       Equal FMI                   F(  19, 8704.8)   =     182.60\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5596313   .1796991    -3.11   0.002    -.9120244   -.2072382\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.871626   .2362326    24.86   0.000     5.408002     6.33525\n",
      "           3  |    8.30813   .5331239    15.58   0.000     7.261597    9.354663\n",
      "           4  |   10.64564   .4568768    23.30   0.000     9.748591    11.54268\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |   2.823372    .864195     3.27   0.001     1.126202    4.520541\n",
      "     BCS 1.1  |  -.6376888   .7817899    -0.82   0.415     -2.17297    .8975921\n",
      "    NCDS 1.2  |   1.849502   .7733177     2.39   0.017     .3296352    3.369368\n",
      "     BCS 1.2  |   .9621695   .7714264     1.25   0.213    -.5543167    2.478656\n",
      "      NCDS 2  |   1.708454   .6153074     2.78   0.006     .4989125    2.917995\n",
      "       BCS 2  |  -.9002437   .5898865    -1.53   0.128    -2.058989    .2585018\n",
      "       BCS 3  |  -1.585695    .683324    -2.32   0.021    -2.928931   -.2424602\n",
      "      NCDS 4  |  -3.241009   .6280868    -5.16   0.000    -4.475416   -2.006603\n",
      "       BCS 4  |  -5.423505    .624567    -8.68   0.000    -6.651111     -4.1959\n",
      "      NCDS 5  |  -2.937072    .592384    -4.96   0.000    -4.101296   -1.772848\n",
      "       BCS 5  |  -5.289967   .5726627    -9.24   0.000    -6.414886   -4.165048\n",
      "      NCDS 6  |   -4.47695   .5983433    -7.48   0.000    -5.653467   -3.300433\n",
      "       BCS 6  |  -6.752555   .5875267   -11.49   0.000    -7.906433   -5.598676\n",
      "      NCDS 7  |  -7.119842   .5645244   -12.61   0.000    -8.229579   -6.010104\n",
      "       BCS 7  |   -8.78016   .5601276   -15.68   0.000     -9.88061    -7.67971\n",
      "              |\n",
      "        _cons |   101.7356   .4928228   206.43   0.000     100.7669    102.7043\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      ". *There is a significant difference between BCS and NCDS for 3\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi estimate: regress ability male i.parented ib7.nsinteraction  [pweight=ipw], allbaselevels\n",
    "* BCS 3 differs significantly from NCDS 3 (coef. -1.59, p = 0.021)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi estimate: regress ability male i.parented ib9.nsinteraction  [pweight=ipw], allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.4115\n",
      "                                                Largest FMI       =     0.3636\n",
      "                                                Complete DF       =      28311\n",
      "DF adjustment:   Small sample                   DF:     min       =     442.12\n",
      "                                                        avg       =     775.49\n",
      "                                                        max       =   2,252.49\n",
      "Model F test:       Equal FMI                   F(  19, 8704.8)   =     182.60\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5596313   .1796991    -3.11   0.002    -.9120244   -.2072382\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.871626   .2362326    24.86   0.000     5.408002     6.33525\n",
      "           3  |    8.30813   .5331239    15.58   0.000     7.261597    9.354663\n",
      "           4  |   10.64564   .4568768    23.30   0.000     9.748591    11.54268\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |   6.064381   .8117831     7.47   0.000     4.471301    7.657461\n",
      "     BCS 1.1  |   2.603321   .7804266     3.34   0.001     1.070235    4.136406\n",
      "    NCDS 1.2  |   5.090511   .7340165     6.94   0.000     3.648723      6.5323\n",
      "     BCS 1.2  |   4.203179   .7387809     5.69   0.000     2.751532    5.654826\n",
      "      NCDS 2  |   4.949463   .5637005     8.78   0.000     3.842298    6.056628\n",
      "       BCS 2  |   2.340766   .5572555     4.20   0.000      1.24648    3.435051\n",
      "      NCDS 3  |   3.241009   .6280868     5.16   0.000     2.006603    4.475416\n",
      "       BCS 3  |   1.655314   .6106865     2.71   0.007     .4565879     2.85404\n",
      "       BCS 4  |  -2.182496    .566279    -3.85   0.000    -3.294549   -1.070443\n",
      "      NCDS 5  |   .3039371     .51749     0.59   0.557    -.7118248    1.319699\n",
      "       BCS 5  |  -2.048958   .5256657    -3.90   0.000    -3.081013   -1.016902\n",
      "      NCDS 6  |  -1.235941   .5165811    -2.39   0.017    -2.250243   -.2216388\n",
      "       BCS 6  |  -3.511545   .5481737    -6.41   0.000    -4.587886   -2.435205\n",
      "      NCDS 7  |  -3.878832   .4973326    -7.80   0.000    -4.855647   -2.902017\n",
      "       BCS 7  |   -5.53915   .4898767   -11.31   0.000    -6.500522   -4.577779\n",
      "              |\n",
      "        _cons |   98.49461   .4022243   244.87   0.000     97.70519    99.28403\n",
      "-------------------------------------------------------------------------------\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi estimate: regress ability male i.parented ib9.nsinteraction  [pweight=ipw], allbaselevels\n",
    "* BCS 4 differs significantly from NCDS 4 (coef. -2.18, p < 0.001)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi estimate: regress ability male i.parented ib11.nsinteraction  [pweight=ipw], allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.4115\n",
      "                                                Largest FMI       =     0.3628\n",
      "                                                Complete DF       =      28311\n",
      "DF adjustment:   Small sample                   DF:     min       =     444.14\n",
      "                                                        avg       =     790.78\n",
      "                                                        max       =   2,252.49\n",
      "Model F test:       Equal FMI                   F(  19, 8704.8)   =     182.60\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5596313   .1796991    -3.11   0.002    -.9120244   -.2072382\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.871626   .2362326    24.86   0.000     5.408002     6.33525\n",
      "           3  |    8.30813   .5331239    15.58   0.000     7.261597    9.354663\n",
      "           4  |   10.64564   .4568768    23.30   0.000     9.748591    11.54268\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |   5.760444   .7813254     7.37   0.000     4.227363    7.293525\n",
      "     BCS 1.1  |   2.299383   .7361005     3.12   0.002     .8539926    3.744774\n",
      "    NCDS 1.2  |   4.786574   .7140294     6.70   0.000     3.383785    6.189363\n",
      "     BCS 1.2  |   3.899242   .6908882     5.64   0.000     2.542406    5.256078\n",
      "      NCDS 2  |   4.645526   .5202506     8.93   0.000     3.623939    5.667113\n",
      "       BCS 2  |   2.036829   .5130722     3.97   0.000     1.029581    3.044076\n",
      "      NCDS 3  |   2.937072    .592384     4.96   0.000     1.772848    4.101296\n",
      "       BCS 3  |   1.351377   .5907958     2.29   0.022     .1913112    2.511443\n",
      "      NCDS 4  |  -.3039371     .51749    -0.59   0.557    -1.319699    .7118248\n",
      "       BCS 4  |  -2.486433   .5347522    -4.65   0.000    -3.536741   -1.436125\n",
      "       BCS 5  |  -2.352895   .4796666    -4.91   0.000    -3.294389   -1.411401\n",
      "      NCDS 6  |  -1.539878   .4734201    -3.25   0.001    -2.469313   -.6104425\n",
      "       BCS 6  |  -3.815482   .5068211    -7.53   0.000    -4.810484   -2.820481\n",
      "      NCDS 7  |  -4.182769   .4621682    -9.05   0.000    -5.090794   -3.274745\n",
      "       BCS 7  |  -5.843088   .4597781   -12.71   0.000    -6.745676   -4.940499\n",
      "              |\n",
      "        _cons |   98.79855   .3534348   279.54   0.000     98.10489     99.4922\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      ". *There is a significant difference between BCS and NCDS for 5\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi estimate: regress ability male i.parented ib11.nsinteraction  [pweight=ipw], allbaselevels\n",
    "*There is a significant difference between BCS and NCDS for 5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi estimate: regress ability male i.parented ib13.nsinteraction  [pweight=ipw], allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.4115\n",
      "                                                Largest FMI       =     0.3949\n",
      "                                                Complete DF       =      28311\n",
      "DF adjustment:   Small sample                   DF:     min       =     376.16\n",
      "                                                        avg       =     770.95\n",
      "                                                        max       =   2,252.49\n",
      "Model F test:       Equal FMI                   F(  19, 8704.8)   =     182.60\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5596313   .1796991    -3.11   0.002    -.9120244   -.2072382\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.871626   .2362326    24.86   0.000     5.408002     6.33525\n",
      "           3  |    8.30813   .5331239    15.58   0.000     7.261597    9.354663\n",
      "           4  |   10.64564   .4568768    23.30   0.000     9.748591    11.54268\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |   7.300322   .7878075     9.27   0.000     5.754177    8.846467\n",
      "     BCS 1.1  |   3.839261    .740482     5.18   0.000     2.384931    5.293591\n",
      "    NCDS 1.2  |   6.326452   .7030693     9.00   0.000     4.945375    7.707528\n",
      "     BCS 1.2  |    5.43912   .7243258     7.51   0.000     4.015152    6.863087\n",
      "      NCDS 2  |   6.185404   .5216915    11.86   0.000     5.160626    7.210182\n",
      "       BCS 2  |   3.576706   .5187153     6.90   0.000     2.557914    4.595498\n",
      "      NCDS 3  |    4.47695   .5983433     7.48   0.000     3.300433    5.653467\n",
      "       BCS 3  |   2.891255   .5824098     4.96   0.000     1.747697    4.034812\n",
      "      NCDS 4  |   1.235941   .5165811     2.39   0.017     .2216388    2.250243\n",
      "       BCS 4  |  -.9465552   .5074465    -1.87   0.063    -1.942659    .0495483\n",
      "      NCDS 5  |   1.539878   .4734201     3.25   0.001     .6104425    2.469313\n",
      "       BCS 5  |  -.8130169   .4617774    -1.76   0.079    -1.719167    .0931337\n",
      "       BCS 6  |  -2.275604   .5006608    -4.55   0.000    -3.258657   -1.292552\n",
      "      NCDS 7  |  -2.642892   .4281881    -6.17   0.000    -3.483497   -1.802286\n",
      "       BCS 7  |   -4.30321   .4464721    -9.64   0.000    -5.179643   -3.426777\n",
      "              |\n",
      "        _cons |   97.25867   .3372155   288.42   0.000     96.59674     97.9206\n",
      "-------------------------------------------------------------------------------\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi estimate: regress ability male i.parented ib13.nsinteraction  [pweight=ipw], allbaselevels\n",
    "*There is a significant difference between BCS and NCDS for 6"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi estimate: regress ability male i.parented ib15.nsinteraction  [pweight=ipw], allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.4115\n",
      "                                                Largest FMI       =     0.3787\n",
      "                                                Complete DF       =      28311\n",
      "DF adjustment:   Small sample                   DF:     min       =     408.24\n",
      "                                                        avg       =     707.24\n",
      "                                                        max       =   2,252.49\n",
      "Model F test:       Equal FMI                   F(  19, 8704.8)   =     182.60\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5596313   .1796991    -3.11   0.002    -.9120244   -.2072382\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.871626   .2362326    24.86   0.000     5.408002     6.33525\n",
      "           3  |    8.30813   .5331239    15.58   0.000     7.261597    9.354663\n",
      "           4  |   10.64564   .4568768    23.30   0.000     9.748591    11.54268\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |   9.943213   .7866582    12.64   0.000     8.398859    11.48757\n",
      "     BCS 1.1  |   6.482153    .699118     9.27   0.000     5.109842    7.854464\n",
      "    NCDS 1.2  |   8.969343   .6840177    13.11   0.000     7.625696    10.31299\n",
      "     BCS 1.2  |   8.082011   .6867457    11.77   0.000     6.732589    9.431434\n",
      "      NCDS 2  |   8.828296   .5072247    17.41   0.000     7.831479    9.825113\n",
      "       BCS 2  |   6.219598   .4891422    12.72   0.000     5.259014    7.180182\n",
      "      NCDS 3  |   7.119842   .5645244    12.61   0.000     6.010104    8.229579\n",
      "       BCS 3  |   5.534146    .586579     9.43   0.000     4.381427    6.686865\n",
      "      NCDS 4  |   3.878832   .4973326     7.80   0.000     2.902017    4.855647\n",
      "       BCS 4  |   1.696336   .4954524     3.42   0.001      .723276    2.669397\n",
      "      NCDS 5  |   4.182769   .4621682     9.05   0.000     3.274745    5.090794\n",
      "       BCS 5  |   1.829875    .447286     4.09   0.000      .951729     2.70802\n",
      "      NCDS 6  |   2.642892   .4281881     6.17   0.000     1.802286    3.483497\n",
      "       BCS 6  |   .3672871   .4787048     0.77   0.443    -.5728835    1.307458\n",
      "       BCS 7  |  -1.660318   .4242854    -3.91   0.000    -2.493524   -.8271119\n",
      "              |\n",
      "        _cons |   94.61578   .2989141   316.53   0.000     94.02888    95.20267\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      ". *There is a significant difference between BCS and NCDS for 7\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi estimate: regress ability male i.parented ib15.nsinteraction  [pweight=ipw], allbaselevels\n",
    "*There is a significant difference between BCS and NCDS for 7"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We now produce a graph of the coefficients and 95% quasi-variance comparison intervals based on Table 6, Model 2.\n",
    "\n",
    "Ideally we would use the [-qv-](http://econpapers.repec.org/software/bocbocode/s457831.htm) command for this. However, it does not work with multiple-imputation estimates.\n",
    "\n",
    "We therefore use the online quasi-variance calculator, [kuvee](http://www2.warwick.ac.uk/fac/sci/statistics/staff/academic-research/firth/software/qvcalc/kuvee/).\n",
    "\n",
    "To make this more straightforward, we first move the reference category (NCDS 3) so that it is the first category. This makes it easier to enter the data into the kuvee calculator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". numlabel, add\n",
      "\n",
      ". tab nsinteraction\n",
      "\n",
      "      NSSEC |\n",
      "Interaction |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "1. NCDS 1.1 |      9,690        1.46        1.46\n",
      " 2. BCS 1.1 |     16,558        2.50        3.96\n",
      "3. NCDS 1.2 |     13,174        1.99        5.95\n",
      " 4. BCS 1.2 |     19,825        2.99        8.95\n",
      "  5. NCDS 2 |     34,201        5.17       14.11\n",
      "   6. BCS 2 |     50,360        7.61       21.72\n",
      "  7. NCDS 3 |     28,093        4.24       25.96\n",
      "   8. BCS 3 |     28,461        4.30       30.26\n",
      "  9. NCDS 4 |     36,728        5.55       35.81\n",
      "  10. BCS 4 |     49,205        7.43       43.24\n",
      " 11. NCDS 5 |     50,592        7.64       50.88\n",
      "  12. BCS 5 |     56,808        8.58       59.46\n",
      " 13. NCDS 6 |     55,606        8.40       67.85\n",
      "  14. BCS 6 |     53,038        8.01       75.86\n",
      " 15. NCDS 7 |     80,234       12.12       87.98\n",
      "  16. BCS 7 |     79,595       12.02      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |    662,168      100.00\n",
      "\n",
      ". \n",
      ". capture drop newint\n",
      "\n",
      ".     gen newint = .\n",
      "(668,711 missing values generated)\n",
      "\n",
      ".     replace newint = 1 if (nsinteraction==7)\n",
      "(28,093 real changes made)\n",
      "\n",
      ".     replace newint = 2 if (nsinteraction==1)\n",
      "(9,690 real changes made)\n",
      "\n",
      ".     replace newint = 3 if (nsinteraction==2)\n",
      "(16,558 real changes made)\n",
      "\n",
      ".     replace newint = 4 if (nsinteraction==3)\n",
      "(13,174 real changes made)\n",
      "\n",
      ".     replace newint = 5 if (nsinteraction==4)\n",
      "(19,825 real changes made)\n",
      "\n",
      ".     replace newint = 6 if (nsinteraction==5)\n",
      "(34,201 real changes made)\n",
      "\n",
      ".     replace newint = 7 if (nsinteraction==6)\n",
      "(50,360 real changes made)\n",
      "\n",
      ".     replace newint = 8 if (nsinteraction==8)\n",
      "(28,461 real changes made)\n",
      "\n",
      ".     replace newint = 9 if (nsinteraction==9)\n",
      "(36,728 real changes made)\n",
      "\n",
      ".     replace newint = 10 if (nsinteraction==10)\n",
      "(49,205 real changes made)\n",
      "\n",
      ".     replace newint = 11 if (nsinteraction==11)\n",
      "(50,592 real changes made)\n",
      "\n",
      ".     replace newint = 12 if (nsinteraction==12)\n",
      "(56,808 real changes made)\n",
      "\n",
      ".     replace newint = 13 if (nsinteraction==13)\n",
      "(55,606 real changes made)\n",
      "\n",
      ".     replace newint = 14 if (nsinteraction==14)\n",
      "(53,038 real changes made)\n",
      "\n",
      ".     replace newint = 15 if (nsinteraction==15)\n",
      "(80,234 real changes made)\n",
      "\n",
      ".     replace newint = 16 if (nsinteraction==16)\n",
      "(79,595 real changes made)\n",
      "\n",
      ".     tab newint\n",
      "\n",
      "     newint |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |     28,093        4.24        4.24\n",
      "          2 |      9,690        1.46        5.71\n",
      "          3 |     16,558        2.50        8.21\n",
      "          4 |     13,174        1.99       10.20\n",
      "          5 |     19,825        2.99       13.19\n",
      "          6 |     34,201        5.17       18.36\n",
      "          7 |     50,360        7.61       25.96\n",
      "          8 |     28,461        4.30       30.26\n",
      "          9 |     36,728        5.55       35.81\n",
      "         10 |     49,205        7.43       43.24\n",
      "         11 |     50,592        7.64       50.88\n",
      "         12 |     56,808        8.58       59.46\n",
      "         13 |     55,606        8.40       67.85\n",
      "         14 |     53,038        8.01       75.86\n",
      "         15 |     80,234       12.12       87.98\n",
      "         16 |     79,595       12.02      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |    662,168      100.00\n",
      "\n",
      ".     \n",
      ". mi register passive newint\n",
      "\n",
      ".     \n",
      ". tab nsinteraction newint\n",
      "\n",
      "      NSSEC |                                                    newint\n",
      "Interaction |         1          2          3          4          5          6          7          8          9         10 |     Total\n",
      "------------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "1. NCDS 1.1 |         0      9,690          0          0          0          0          0          0          0          0 |     9,690 \n",
      " 2. BCS 1.1 |         0          0     16,558          0          0          0          0          0          0          0 |    16,558 \n",
      "3. NCDS 1.2 |         0          0          0     13,174          0          0          0          0          0          0 |    13,174 \n",
      " 4. BCS 1.2 |         0          0          0          0     19,825          0          0          0          0          0 |    19,825 \n",
      "  5. NCDS 2 |         0          0          0          0          0     34,201          0          0          0          0 |    34,201 \n",
      "   6. BCS 2 |         0          0          0          0          0          0     50,360          0          0          0 |    50,360 \n",
      "  7. NCDS 3 |    28,093          0          0          0          0          0          0          0          0          0 |    28,093 \n",
      "   8. BCS 3 |         0          0          0          0          0          0          0     28,461          0          0 |    28,461 \n",
      "  9. NCDS 4 |         0          0          0          0          0          0          0          0     36,728          0 |    36,728 \n",
      "  10. BCS 4 |         0          0          0          0          0          0          0          0          0     49,205 |    49,205 \n",
      " 11. NCDS 5 |         0          0          0          0          0          0          0          0          0          0 |    50,592 \n",
      "  12. BCS 5 |         0          0          0          0          0          0          0          0          0          0 |    56,808 \n",
      " 13. NCDS 6 |         0          0          0          0          0          0          0          0          0          0 |    55,606 \n",
      "  14. BCS 6 |         0          0          0          0          0          0          0          0          0          0 |    53,038 \n",
      " 15. NCDS 7 |         0          0          0          0          0          0          0          0          0          0 |    80,234 \n",
      "  16. BCS 7 |         0          0          0          0          0          0          0          0          0          0 |    79,595 \n",
      "------------+--------------------------------------------------------------------------------------------------------------+----------\n",
      "      Total |    28,093      9,690     16,558     13,174     19,825     34,201     50,360     28,461     36,728     49,205 |   662,168 \n",
      "\n",
      "\n",
      "      NSSEC |                              newint\n",
      "Interaction |        11         12         13         14         15         16 |     Total\n",
      "------------+------------------------------------------------------------------+----------\n",
      "1. NCDS 1.1 |         0          0          0          0          0          0 |     9,690 \n",
      " 2. BCS 1.1 |         0          0          0          0          0          0 |    16,558 \n",
      "3. NCDS 1.2 |         0          0          0          0          0          0 |    13,174 \n",
      " 4. BCS 1.2 |         0          0          0          0          0          0 |    19,825 \n",
      "  5. NCDS 2 |         0          0          0          0          0          0 |    34,201 \n",
      "   6. BCS 2 |         0          0          0          0          0          0 |    50,360 \n",
      "  7. NCDS 3 |         0          0          0          0          0          0 |    28,093 \n",
      "   8. BCS 3 |         0          0          0          0          0          0 |    28,461 \n",
      "  9. NCDS 4 |         0          0          0          0          0          0 |    36,728 \n",
      "  10. BCS 4 |         0          0          0          0          0          0 |    49,205 \n",
      " 11. NCDS 5 |    50,592          0          0          0          0          0 |    50,592 \n",
      "  12. BCS 5 |         0     56,808          0          0          0          0 |    56,808 \n",
      " 13. NCDS 6 |         0          0     55,606          0          0          0 |    55,606 \n",
      "  14. BCS 6 |         0          0          0     53,038          0          0 |    53,038 \n",
      " 15. NCDS 7 |         0          0          0          0     80,234          0 |    80,234 \n",
      "  16. BCS 7 |         0          0          0          0          0     79,595 |    79,595 \n",
      "------------+------------------------------------------------------------------+----------\n",
      "      Total |    50,592     56,808     55,606     53,038     80,234     79,595 |   662,168 \n",
      "\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "numlabel, add\n",
    "tab nsinteraction\n",
    "\n",
    "capture drop newint\n",
    "    gen newint = .\n",
    "    replace newint = 1 if (nsinteraction==7)\n",
    "    replace newint = 2 if (nsinteraction==1)\n",
    "    replace newint = 3 if (nsinteraction==2)\n",
    "    replace newint = 4 if (nsinteraction==3)\n",
    "    replace newint = 5 if (nsinteraction==4)\n",
    "    replace newint = 6 if (nsinteraction==5)\n",
    "    replace newint = 7 if (nsinteraction==6)\n",
    "    replace newint = 8 if (nsinteraction==8)\n",
    "    replace newint = 9 if (nsinteraction==9)\n",
    "    replace newint = 10 if (nsinteraction==10)\n",
    "    replace newint = 11 if (nsinteraction==11)\n",
    "    replace newint = 12 if (nsinteraction==12)\n",
    "    replace newint = 13 if (nsinteraction==13)\n",
    "    replace newint = 14 if (nsinteraction==14)\n",
    "    replace newint = 15 if (nsinteraction==15)\n",
    "    replace newint = 16 if (nsinteraction==16)\n",
    "    tab newint\n",
    "    \n",
    "mi register passive newint\n",
    "    \n",
    "tab nsinteraction newint\n",
    "\n",
    "* return to jupyter\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mi estimate, post: regress ability ib1.newint male i.parented [pweight=ipw], allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     28,331\n",
      "                                                Average RVI       =     0.4115\n",
      "                                                Largest FMI       =     0.3949\n",
      "                                                Complete DF       =      28311\n",
      "DF adjustment:   Small sample                   DF:     min       =     376.16\n",
      "                                                        avg       =     611.14\n",
      "                                                        max       =   2,252.49\n",
      "Model F test:       Equal FMI                   F(  19, 8704.8)   =     182.60\n",
      "Within VCE type:       Robust                   Prob > F          =     0.0000\n",
      "\n",
      "------------------------------------------------------------------------------\n",
      "     ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "-------------+----------------------------------------------------------------\n",
      "      newint |\n",
      "          2  |   2.823372    .864195     3.27   0.001     1.126202    4.520541\n",
      "          3  |  -.6376888   .7817899    -0.82   0.415     -2.17297    .8975921\n",
      "          4  |   1.849502   .7733177     2.39   0.017     .3296352    3.369368\n",
      "          5  |   .9621695   .7714264     1.25   0.213    -.5543167    2.478656\n",
      "          6  |   1.708454   .6153074     2.78   0.006     .4989125    2.917995\n",
      "          7  |  -.9002437   .5898865    -1.53   0.128    -2.058989    .2585018\n",
      "          8  |  -1.585695    .683324    -2.32   0.021    -2.928931   -.2424602\n",
      "          9  |  -3.241009   .6280868    -5.16   0.000    -4.475416   -2.006603\n",
      "         10  |  -5.423505    .624567    -8.68   0.000    -6.651111     -4.1959\n",
      "         11  |  -2.937072    .592384    -4.96   0.000    -4.101296   -1.772848\n",
      "         12  |  -5.289967   .5726627    -9.24   0.000    -6.414886   -4.165048\n",
      "         13  |   -4.47695   .5983433    -7.48   0.000    -5.653467   -3.300433\n",
      "         14  |  -6.752555   .5875267   -11.49   0.000    -7.906433   -5.598676\n",
      "         15  |  -7.119842   .5645244   -12.61   0.000    -8.229579   -6.010104\n",
      "         16  |   -8.78016   .5601276   -15.68   0.000     -9.88061    -7.67971\n",
      "             |\n",
      "        male |  -.5596313   .1796991    -3.11   0.002    -.9120244   -.2072382\n",
      "             |\n",
      "    parented |\n",
      "          2  |   5.871626   .2362326    24.86   0.000     5.408002     6.33525\n",
      "          3  |    8.30813   .5331239    15.58   0.000     7.261597    9.354663\n",
      "          4  |   10.64564   .4568768    23.30   0.000     9.748591    11.54268\n",
      "             |\n",
      "       _cons |   101.7356   .4928228   206.43   0.000     100.7669    102.7043\n",
      "------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mi estimate, post: regress ability ib1.newint male i.parented [pweight=ipw], allbaselevels\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". estat vce\n",
      "\n",
      "Covariance matrix of coefficients of regress model\n",
      "\n",
      "             |          2.          3.          4.          5.          6.          7.          8.          9.         10.         11.\n",
      "        e(V) |     newint      newint      newint      newint      newint      newint      newint      newint      newint      newint \n",
      "-------------+------------------------------------------------------------------------------------------------------------------------\n",
      "    2.newint |  .74683305                                                                                                             \n",
      "    3.newint |  .18603788   .61119543                                                                                                 \n",
      "    4.newint |  .22785882   .21719497   .59802021                                                                                     \n",
      "    5.newint |  .25155519   .22151191   .27220963   .59509866                                                                         \n",
      "    6.newint |  .23648811   .20872186    .2210529   .23451172   .37860321                                                             \n",
      "    7.newint |  .24044282   .21109345   .23845249   .24178883   .21292383   .34796605                                                 \n",
      "    8.newint |  .23715027   .18746127   .19723372   .20705179    .2248706   .21560595   .46693166                                     \n",
      "    9.newint |  .24116712   .19831142   .22686655   .22189722     .227669    .2159627   .24424337   .39449306                         \n",
      "   10.newint |  .23390068   .19426161   .22526396   .22989888   .23048032   .21112124   .23795957   .23195252   .39008389             \n",
      "   11.newint |  .24364124   .21013518   .21955054   .23434548   .22943067   .21782092   .23440539   .23880801   .22752142   .35091884 \n",
      "   12.newint |   .2352734   .20438136   .21444775   .19188841    .2165198    .2081815   .22195449   .22305559   .20770804   .22439068 \n",
      "   13.newint |  .24210358    .2104483   .23086428   .21423275   .23222796   .21845759   .24287263   .24282589   .24529834   .24240349 \n",
      "   14.newint |   .2286826   .19901053   .21448431   .20325265   .20236443   .20196978   .20485222   .21959314    .2171903   .21961942 \n",
      "   15.newint |  .22334486   .22055867    .2244139   .22108345    .2200071   .21369691   .22077229   .23292057   .23164934   .22800361 \n",
      "   16.newint |   .2315445      .19481   .20927759   .20746533   .22602204   .19893182   .21390876    .2341284   .22183976   .22663294 \n",
      "        male | -.00091878   .00106025  -.00242461  -.00480137   .00295409  -.00143565  -.00168798  -.00133548  -.00337208  -.00044733 \n",
      "  2.parented | -.00377538  -.00211464  -.00729941  -.00521114  -.00010989  -.00760007   .00114305   .02348996   .01137725   .02130149 \n",
      "  3.parented |  .01285024  -.02102008  -.04211511  -.04587194  -.00814881  -.01365593  -.00273648   .02067818  -.01161175   .01990581 \n",
      "  4.parented | -.01107807  -.03566959  -.07243865  -.09386182  -.01440704  -.03461003   .01043986   .02380378   .01807404   .01534046 \n",
      "       _cons | -.22828625  -.20331192  -.21458798  -.20842671  -.22384554  -.20638479   -.2216094  -.23779147   -.2269583  -.23443849 \n",
      "\n",
      "             |         12.         13.         14.         15.         16.                      2.          3.          4.            \n",
      "        e(V) |     newint      newint      newint      newint      newint        male    parented    parented    parented       _cons \n",
      "-------------+------------------------------------------------------------------------------------------------------------------------\n",
      "   12.newint |  .32794261                                                                                                             \n",
      "   13.newint |  .23635951   .35801475                                                                                                 \n",
      "   14.newint |  .21284362   .22627058   .34518763                                                                                     \n",
      "   15.newint |  .22328283   .24667879    .2173586   .31868785                                                                         \n",
      "   16.newint |  .21598215   .23621015   .20279678   .22620632   .31374291                                                             \n",
      "        male |  .00226928  -.00177822  -.00327582  -.00041014  -.00052549   .03229177                                                 \n",
      "  2.parented |  .01083125   .02631137   .01317801    .0240857   .02554511   .00105094   .05580584                                     \n",
      "  3.parented |  .01305168   .02546057  -.00046406   .01498719   .01164087   .00348899   .02291568   .28422109                         \n",
      "  4.parented |  .01772126   .03475864   .02510464   .02376739   .02835881   .00202651   .03276072    .0403896   .20873638             \n",
      "       _cons | -.22146877  -.24358735  -.21362248  -.23610623  -.22731873  -.01625676  -.03184747  -.02453975  -.03064344   .24287427 \n",
      "\n",
      ". \n"
     ]
    }
   ],
   "source": [
    "estat vce\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We enter the covariance matrix above into the kuvee calculator. Here are the results:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://i.imgur.com/MpL1nC0.png\" alt=\"Kuvee Results\">"
   ]
  },
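  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The quasi-variances reported by the calculator can be read as follows (a brief sketch of the standard property, added here for orientation). Writing $q_i$ for the quasi-variance of category $i$, the variance of the difference between any two category coefficients is approximated by the sum of their quasi-variances:\n",
    "\n",
    "$$\\mathrm{Var}(\\hat{\\beta}_i - \\hat{\\beta}_j) \\approx q_i + q_j$$\n",
    "\n",
    "For example, taking the quasi-variances for NCDS 1.1 and BCS 1.1 that are entered into Stata later in this notebook (0.502 and 0.423), the approximate standard error of their difference is\n",
    "\n",
    "$$\\sqrt{0.502 + 0.423} = \\sqrt{0.925} \\approx 0.962$$\n",
    "\n",
    "which is close to the value obtained from the full covariance matrix shown by -estat vce- above:\n",
    "\n",
    "$$\\sqrt{0.7468 + 0.6112 - 2(0.1860)} = \\sqrt{0.9860} \\approx 0.993$$"
   ]
  },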
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
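    "95% quasi-variance comparison intervals are calculated as the coefficient ± 1.96 × QSE, where QSE is the quasi-standard error (the square root of the quasi-variance).\n",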
    "\n",
    "We input these results into Stata along with the coefficients.\n",
    "\n",
    "N.B. We have moved the reference category back into its appropriate position."
   ]
  },
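  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a hedged sketch (not part of the original workflow, where the bounds are typed in by hand), the comparison-interval bounds could instead be generated from the coefficient and the quasi-standard error. The variable names `lb_check` and `ub_check` are hypothetical, and only the first two rows of the data entered in the next cell are reproduced here.\n",
    "\n",
    "```stata\n",
    "* Illustrative only: two rows (NCDS 1.1 and BCS 1.1) copied from the data entered below\n",
    "clear\n",
    "input coef qvse\n",
    "2.82 0.709\n",
    "-0.64 0.65\n",
    "end\n",
    "\n",
    "* 95% comparison interval: coefficient plus or minus 1.96 quasi-standard errors\n",
    "generate lb_check = coef - 1.96*qvse\n",
    "generate ub_check = coef + 1.96*qvse\n",
    "\n",
    "list coef qvse lb_check ub_check, clean\n",
    "```\n",
    "\n",
    "For the first row this gives bounds of roughly 1.43 and 4.21, which agree with the `lb` and `ub` values entered by hand."
   ]
  },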
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". clear\n",
      "\n",
      ". input cohort class  coef se qv qvse lb ub\n",
      "\n",
      "        cohort      class       coef         se         qv       qvse         lb         ub\n",
      "  1. 1 1 2.82 0.864 0.502 0.709 1.43036 4.20964\n",
      "  2. 2 1.1 -0.64 0.782 0.423 0.65 -1.914 0.634\n",
      "  3. 1 2 1.85 0.773 0.375 0.612 0.65048 3.04952\n",
      "  4. 2 2.2 0.96 0.771 0.374 0.611 -0.23756 2.15756\n",
      "  5. 1 3 1.71 0.615 0.157 0.396 0.93384 2.48616\n",
      "  6. 2 3.2 -0.9 0.59 0.146 0.382 -1.64872 -0.15128\n",
      "  7. 1 4 0 0 0.223 0.472 -0.92512 0.92512\n",
      "  8. 2 4.2 -1.59 0.683 0.243 0.493 -2.55628 -0.62372\n",
      "  9. 1 5 -3.24 0.628 0.157 0.397 -4.01812 -2.46188\n",
      " 10. 2 5.2 -5.42 0.625 0.162 0.403 -6.20988 -4.63012\n",
      " 11. 1 6 -2.94 0.592 0.117 0.342 -3.61032 -2.26968\n",
      " 12. 2 6.2 -5.29 0.573 0.114 0.338 -5.95248 -4.62752\n",
      " 13. 1 7 -4.48 0.598 0.107 0.328 -5.12288 -3.83712\n",
      " 14. 2 7.2 -6.75 0.588 0.144 0.379 -7.49284 -6.00716\n",
      " 15. 1 8 -7.12 0.565 0.089 0.298 -7.70408 -6.53592\n",
      " 16. 2 8.2 -8.78 0.56 0.097 0.312 -9.39152 -8.16848\n",
      " 17. end\n",
      "\n",
      ". \n",
      ". summarize\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "      cohort |         16         1.5    .5163978          1          2\n",
      "       class |         16     4.59375    2.378366          1        8.2\n",
      "        coef |         16   -2.488125    3.557466      -8.78       2.82\n",
      "          se |         16    .6129375     .188273          0       .864\n",
      "          qv |         16     .214375    .1311396       .089       .502\n",
      "-------------+---------------------------------------------------------\n",
      "        qvse |         16     .445125    .1316657       .298       .709\n",
      "          lb |         16    -3.36057    3.362431   -9.39152    1.43036\n",
      "          ub |         16    -1.61568    3.760104   -8.16848    4.20964\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "clear\n",
    "input cohort class  coef se qv qvse lb ub\n",
    "1 1 2.82 0.864 0.502 0.709 1.43036 4.20964\n",
    "2 1.1 -0.64 0.782 0.423 0.65 -1.914 0.634\n",
    "1 2 1.85 0.773 0.375 0.612 0.65048 3.04952\n",
    "2 2.2 0.96 0.771 0.374 0.611 -0.23756 2.15756\n",
    "1 3 1.71 0.615 0.157 0.396 0.93384 2.48616\n",
    "2 3.2 -0.9 0.59 0.146 0.382 -1.64872 -0.15128\n",
    "1 4 0 0 0.223 0.472 -0.92512 0.92512\n",
    "2 4.2 -1.59 0.683 0.243 0.493 -2.55628 -0.62372\n",
    "1 5 -3.24 0.628 0.157 0.397 -4.01812 -2.46188\n",
    "2 5.2 -5.42 0.625 0.162 0.403 -6.20988 -4.63012\n",
    "1 6 -2.94 0.592 0.117 0.342 -3.61032 -2.26968\n",
    "2 6.2 -5.29 0.573 0.114 0.338 -5.95248 -4.62752\n",
    "1 7 -4.48 0.598 0.107 0.328 -5.12288 -3.83712\n",
    "2 7.2 -6.75 0.588 0.144 0.379 -7.49284 -6.00716\n",
    "1 8 -7.12 0.565 0.089 0.298 -7.70408 -6.53592\n",
    "2 8.2 -8.78 0.56 0.097 0.312 -9.39152 -8.16848\n",
    "end\n",
    "\n",
    "summarize\n",
    "\n",
    "* return to jupyter"
   ]
  },
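  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because the bounds above were typed in by hand, it may be worth checking them against the formula. The following is a minimal sketch, assuming the data entered above are still in memory; `lb_check` and `ub_check` are hypothetical variable names introduced only for this check, and the tolerance of 0.0001 is an arbitrary choice to allow for rounding.\n",
    "\n",
    "```stata\n",
    "* Recompute the 95% quasi-variance comparison intervals from coef and qvse\n",
    "generate double lb_check = coef - 1.96*qvse\n",
    "generate double ub_check = coef + 1.96*qvse\n",
    "\n",
    "* Stop with an error if any hand-entered bound differs from the formula\n",
    "assert abs(lb - lb_check) < 0.0001\n",
    "assert abs(ub - ub_check) < 0.0001\n",
    "\n",
    "* Remove the temporary check variables so the dataset is unchanged\n",
    "drop lb_check ub_check\n",
    "```\n",
    "\n",
    "If both `assert` commands complete silently, the entered bounds agree with coefficient ± 1.96 × QSE for every class in both cohorts."
   ]
  },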
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Code to make figure 1:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". label variable class \"Father's NSSEC\"\n",
      "\n",
      ". label variable coef \"OLS Coefficient\"\n",
      "\n",
      ". label variable lb \"Lower bound\"\n",
      "\n",
      ". label variable ub \"Upper bound\"\n",
      "\n",
      ". summarize\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "      cohort |         16         1.5    .5163978          1          2\n",
      "       class |         16     4.59375    2.378366          1        8.2\n",
      "        coef |         16   -2.488125    3.557466      -8.78       2.82\n",
      "          se |         16    .6129375     .188273          0       .864\n",
      "          qv |         16     .214375    .1311396       .089       .502\n",
      "-------------+---------------------------------------------------------\n",
      "        qvse |         16     .445125    .1316657       .298       .709\n",
      "          lb |         16    -3.36057    3.362431   -9.39152    1.43036\n",
      "          ub |         16    -1.61568    3.760104   -8.16848    4.20964\n",
      "\n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "label variable class \"Father's NSSEC\"\n",
    "label variable coef \"OLS Coefficient\"\n",
    "label variable lb \"Lower bound\"\n",
    "label variable ub \"Upper bound\"\n",
    "summarize\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". graph set window fontface \"Times New Roman\"\n",
      "\n",
      ". \n",
      ". graph twoway (scatter coef class if (cohort==1), msymbol(circle))(scatter coef class if (cohort==2), msymbol(diamond))|| rspike ub lb clas\n",
      "> s, xlabel(1 \"1.1\" 2 \"1.2\" 3 \"2\" 4 \"3\" 5 \"4\" 6 \"5\" 7 \"6\" 8 \"7\")xtitle( , size(small))xline(3.5, lp(dash))xline(5.6, lp(dash))text(-7.5 2 \"M\n",
      "> anagerial and\" \"Professional\")text(-7.5 5 \"Intermediate\")text(1 7.1  \"Routine and\" \"Manual\")scheme(s1mono)legend(order(1 \"NCDS 58\" 2 \"BCS \n",
      "> 70\") row(1) region(lwidth(none)))title(\"Predictions of General Ability Test Score by Father's Social Class\", size(medsmall))subtitle(\"OLS \n",
      "> Regression Coefficients and 95% Quasi-Variance Comparison Intervals \", size(small))note(\"Data: 1958 National Child Development Study and 1\n",
      "> 970 British Cohort Study.\" \"Note: Estimates are taken from table 6, model 2. Model also contains Gender and Parent's Highest Education.\", \n",
      "> size(vsmall))\n",
      "\n",
      ". graph export F:\\DATA\\MYDATA\\TEMP\\figure1.png, replace\n",
      "(file F:\\DATA\\MYDATA\\TEMP\\figure1.png written in PNG format)\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "graph set window fontface \"Times New Roman\"\n",
    "\n",
    "graph twoway (scatter coef class if (cohort==1), msymbol(circle))(scatter coef class if (cohort==2), msymbol(diamond))|| rspike ub lb class, xlabel(1 \"1.1\" 2 \"1.2\" 3 \"2\" 4 \"3\" 5 \"4\" 6 \"5\" 7 \"6\" 8 \"7\")xtitle( , size(small))xline(3.5, lp(dash))xline(5.6, lp(dash))text(-7.5 2 \"Managerial and\" \"Professional\")text(-7.5 5 \"Intermediate\")text(1 7.1  \"Routine and\" \"Manual\")scheme(s1mono)legend(order(1 \"NCDS 58\" 2 \"BCS 70\") row(1) region(lwidth(none)))title(\"Predictions of General Ability Test Score by Father's Social Class\", size(medsmall))subtitle(\"OLS Regression Coefficients and 95% Quasi-Variance Comparison Intervals \", size(small))note(\"Data: 1958 National Child Development Study and 1970 British Cohort Study.\" \"Note: Estimates are taken from table 6, model 2. Model also contains Gender and Parent's Highest Education.\", size(vsmall))\n",
    "graph export F:\\DATA\\MYDATA\\TEMP\\figure1.png, replace\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## !! Now change your kernel to the Python kernel\n",
    "\n",
    "Go to: Kernel -> Change Kernel -> Python"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAd4AAAFcCAIAAADphEhHAAADAFBMVEUAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAACzMPSIAABNzUlEQVR42mP4PwpGwSgYBaNgkAGG/6NgFIyCUTAKBhkYLZpHwSgY\nBaNg0IHRonkUjIJRMAoGHRgtmkfBKBgFo2DQgdGieRSMglEwCgYdGC2aR8EoGAWjYNCB0aJ5FIyC\nUTAKBh0YLZpHwSgYBaNg0IHRonkUjIJRMAoGHRgtmkfBKBgFo2DQgdGieRSMglEwCgYdGC2aR8Eo\nGAWjYNCB0aJ5FIyCUTAKBh0YLZpHwSgYBaNg0IHRohkT3J5gxYAC0rb9JxbcnmBlNeH2////t6Ux\nQBh4ADFqyATb0hgYGBgY0MyHiTIwpG37v20CihxVwDacPtqWxsCQhhKO8KBCgG0w7ZgMYsFt9Mhj\nYEAPBRLAtjQGBgYGqwm3qRlYtzE9jgfcxvQQAd23J1ihBTVusC2NgNJtJIfANhxRhl389gSo96wm\n3L49IW3C7f+4VI40MFo0Ywfb4En29gQrBmJTyrY0BoJqYemPlmBbGqYjbk+wQioct6UxIPGoBLal\nMeDy/ra0tLQ0ZAu34VaLXY7IcIMruw0vALGFBiaAa0SAbdAi4vYEKwiDGmAbNs8RANuQgw6LQ///\nRwjfnmDFQHzUEgibbVQLgW3YvL0tDSF2e4IVhLMtDYvKEQhGi2bsYBtSZtiWRnxSvw0vEHABApmB\nGgCbIzA9sS0NlU8NgM3m//+hdm1LQ3EBLrX//2OVIzXcsBiBF2AxH8nBRLcZiQCkuuw/skNwAiT3\nE6McCm5PmIBPJZJJFIcAhrdvT0Av77elpW37////bXSVIxKMFs3YAXKaTINW4tvSGNImTECu2yHM\n/////4fxGWBC26Dtjf//b0+wYmBggB
aM26DKYBw0NTD+/21pDGnbtqUxMMCFoErSJmzbdvs/MoBK\nwBTCuVA+BGxLY2BI2/YfO9iWxsDAAFe/DcPq/xAejLstjQEeDtsgMjDDb2PPVNACYFsaVNn/////\n/789wcpqwgSIfqjwNliA3J5gBWHARLZB1DHAXMbAwJC27f///7cnWDEwQNmo4DbMCDiA6oSK3obq\nhIQnVA7NoG1pGEJQbTBRGBdm6v9taQzwwPm/LY2BgQHCRAK3J1iheByqCsL5f3uCFQMDlA0D29Jg\n/G3wAnIbii4YD8aBhRPMaggPxt2WhuTI////I2yFhAYCbEtjYGBI2/YfCUCVYmqGqtsGjbL/29IY\nIAAifnuCFUQcCm5PQC+ZYeA2TCWmCQwMDHBHonGHGxgtmrEDeKJgYGBgSNsGTweQJPMf1kTZBkmG\ntydAR/duT4Akt21pMLXb0iAy29IYrCbcvn17GyzZ/d8GV3MbJnZ7ghUDQ9q22xOsGBgYGBjStv3/\nvy0NogZhI4SGgdtoev//////NkwMDm5PgLgLCm5PsGKAgLRtcBO3pTFYTdgGlUnbBhW4/f8/NgUQ\nkdtQi2A0goECoCXz///boIHx//////9vT7CC2PP/9gQrhrRt/7eloRsMF0EOt/+3J8A0/v///z+s\nI48ObiM0/P////9/FF/c/o/gWk1AMx8FbEtjYIC44f////+3QT2wDWII3I7bECfdnmDFwABXDTH6\n/3+oYji4PcGKgSFtG5SVtg1CM6Rt+w8BmD7alsYAA1CTbkNthtKo7t+WxgAxbhvUZhSXbJtgxcAA\nNwgCEAogNDLYlsbAgFB+G2bP7QlWcEvStv3//39bGoMVtK62mnD7/22oQhiNYEDB7QkoaRIJ3Iao\nhFJwGuY4KA2l4PRwA6NFM3awDZrekAFC7PYEKwY4SNt2G5p2/v//D2dCGQg9UAAV////P5wNpf7/\n/////7Y0SHLdBtMHk7w9wYoBkuZRwG2o9P////9vg+pFEYSA2xOsGKAGwsBtqKLbE6wY4CBt2///\n29Csvo1TAQRA5dO2QdhWE27/RwW3IQqgIG3bfwi4jVALY8JofIz/////vz0BnquhfWBMcBtZw///\ntycgOSJtG5SPUHEbVTkKuD3BCqIU0+dWE6B6tkEDfxtcye0JVgxwABX7//////+3EdrgzNsT8PkI\nYeg2eKv5P0QTA9Ts2zCT/v9HKIcK3oaog4C0bUgKYACiwmoC1AAMcHuCFUT6NsTE//////+/LY3B\nasLtbVjMsoIpuT3BioEBYuf/20ji/////397AtzLaOA2ksrbE+AmQJhoMnDusAOjRTN2gJHg/iOL\n3Z5gBWX9///////biKQEZ0IZCD1QABX///8/nH17ghVc0W2IEEIfTOD//////9+eYMXAAJWAgNsT\nMPUiGAhwe4IVWjK+DVV0G9UvmFbfxqng//9taQwMVhNu354AdcVtiBYUAG8z/////z+SAixMGI2P\nAQG3ITZiNjDh4DaGhjSEM6Dg9gQrWHjeRlX+//////+RCsltaQxp26AkHNyeYAXn34bqRyi5jc3G\n/////78NVfofncmQhsNHCEPhYBuekN8GUw4VvI3uErgCFHB7Ajw0YAAjBG5PgNr3////2xOsrPAU\nzdvwuPD//////9+egF42Q5PKbahKdBP+//////YEFEeicYcTGC2asYNt6AnuP4oYgrktLW3b/21p\nsOQBTyq3EcnLasLt/////9+WlrbtP1z8/38E+/YEWBq9DRX5vw1mAUwE3lxC67/dxtSLxEICtydA\nXQYFt2GK4Fb9h7gQzoepgAv8R1MAZ92eYAVnWE24/R8ZQLMbDCBUIFjb0tC9gIcBA9vSGBgYGFDF\nkAG6hm1Qt/7/vy0tbRtGeKIr//////9tMHf9vw31IEJkW1ratv+3J2AE/ja4NUjMbWlQxv///5HU\n/t
8GN+7/////t+H00Ta4STAAE7k9wQrOgOuEScIF4QL/IS5B8KEAPTRgYBvcgbehFt2egO5lhBqI\n4VDxbVA7bk+wgjOsJtz+jwy2pSH7F243VOU2NBPQHInGHXZgtGjGBLcnWDFAAHKUb0tjYGBACEG5\nDGnb/v//j+AyMDAwpG2DGWE14TZCymrC7f//YVw0NVA2A0Patv//YWoY0rZBxa0m3N42wcqKgYGB\ngYEBogQBoGrgElDNDFYTbv9HAzApBgYGZHmYcNo2OBvZarggA5ICqwm3ETqtrKwYGBjSJiBp+f//\nP8JpcBGYAIPVhNsIDkPatv//4ZJIo6Fwxm2YVRCVELAtDYWLBKAmMTCgaNiWxsDAwAAV2jbBCqom\nbdv/////Q2WhHAjYNgHmAoQERBnUTUg2QeShslBJGBcqCQdwTegS27D4CKEYZur//////98GMRoa\n8ttgAmnboAwGImMQArZNsLJiYGBggChAAGwhcBtdAGYgg9UEeKRaTbgNFYW6ECNtQAFUFQMDA8w8\nXCZs2zbByoqBgYGBgSFt2////9G4ww6MFs2jYIgCrF3/IQ2Gn49GAflgtGgeBUMTQHrPwwkMPx+N\nAgrAaNE8CoYYgHZzh1ExNvx8NAooB6NF8ygYBaNgFAw6MFo0j4JRMApGwaADo0XzKBgFo2AUDDow\nWjSPglEwCkbBoAOjRfMoGAWjYBQMOjBaNBMBoIvgGRgYrCbcRuJDeDAAmWe3mnAbdREUTDEUIEtR\nBWzDsk2BANiWxsDAwIDFrVgBTDUUYCjfRroDSAa30bca34aGKlwUKmA14fb//9vSGBigzP/////H\nulwYqp4BSR2RYBuqf2EGQcW2pTEwwDkYYFsaTimSwLY0BgYGBgYGK6JikESwjVxHboO4yorEAMUL\ntpHrmCEPRotmQuD2BCt4YkOwMTaHwpLQ7QlWGDnzNrxguT3BCk0f3cG2NLj7bmNxKzq4PcEqbRuU\nJB+g7dYmEdzGcCcs+LdBQ/3/Ntiu3f///0MkIeT//9gK5tsTYPGIyiYTwF3x/////yhOoQnYRloM\n0hPcxpVOyEwAtweZ/+gJRotmAuA2amEK48IyPhzcnoA7h9+GJ9ht8PMG/m9LY2CAtTAgbAYGBgaG\ntG3/t6VZpaVZMTCkIZ+5e3uCFQMDQ1ratm2o7P/bYOUCRBSq/P+2NIY0yOZYiCwU3J5ghSoAAbcn\nWDEwMMD0/t8GtXcbVBgK0tLSoJq3pTEwMDAwWE24/f//NlRBuBiS7dvSGBgYGNK2Qe2BuhwGIIIM\nDGnb/v/fhqoRLgfjQ8C2NBh3W5rVBPjuYBQxCImtYP5/GxqJUHB7ghXk/GSrCbf/b0uDGnQbYiYS\nG+rsbXDL4eD2BCurCRADYWXQ7QlWDAwMDAxp2/7/34YcoWnbMGQZULy8LY2BgQHiGCjbasLt/3Bw\newIRMXh7ghXUTIa0bRBD0rb9//9/WxrEp1DubYgShrRt//9vw+FIqK8hHJj529JQ3QwDtydAUzqa\ngm1wOyEsqwm3///fBrWRgYEBInJ7ghUDA3yDOUPatv8IE29PsGJggDlmRIDRopkA2IYoTP//h6SQ\ntG3//29LQxaFgm1pDAwMDAxp2/6jgtsTrBggAC4FLTK2pTGkbfsPNez2BCurCbe3pTFAEup/ZDW3\nISl0W1ratv9obIiptydYWU24/f///21pDLAy1WrC7f+3IYph4DZMGTKAC25LY7CacBvZ3v8Qzu0J\naRNu355gBbcrbdt/CLiNEEybANUFyZNWE27/vz3BKm3b/20TrKzStkEUp237/39bWtq2/3AAEbw9\nwQrT2bcnWFlNuP3///9taQxpqFog4v+hQff//////29PgMTN/21pDAwMDFYTbt+ekDbh9u0JVgwQ\nJ8LAtjQGK7guuD6YUbdhVkNIqwm3b0+wStv2//+2tLRt/29PsEI1
7P//////355gBRHcBm0y355g\nlbbt/+0JVtCi0GrC7f//b0+A6r2NkMXi5bRt/yHg9oS0Cbf//9+G4vnbE6ysJtz+jwrggtvSGNDM\nRLJxWxoDA5zLAEtUtydY4XHkf5ivrSbcxmp+2rb/CADl30ZTsG0CPAHAfZQGt/E/RNH///9vT5iw\n7T+Ed3sCxMbbE6zStsGpbWlp2/6PEDBaNBMAtydYMSAlh9vQFAPLx1gATAkC3IalK4RJ29IYoMBq\nwu3/tydYMTAwQNj//8PMxqYGYgAaO23b///b4IZDWFBhOA0FtyegeAcCIDr+/4eytiHbC8lMEPL/\nbYhhMAdCAFyQAQqsJtyGisHkoNT///9vT0C4HAEgeq0m3P5/G6oSQsMtgnCRAEQHA0QTHECdCQUQ\nHsQMCAkFtyegBMJtSITBlNyG2QWxw2rC7f//b09AOPv2BKu0bf9vT7BiYGCASv///38bpMxKg3L/\n//+/Da4dZjJM7///yLIwMQiNUAtTw8AAUQcFtyegOB4CtqXBxCCs2xOs0rb9////P8w4qACMC+P/\n35bGwAAxHi4Fl/t/e4IVA8TX29IY0rb9//8fxroNVQKjYQDOhzGgNJT6vy2NAQqsJtxG2PgfwoQX\nvBBlVhNuI3T+vw13zAgBo0UzIXB7AmKkAsGGpKX//////w8pASDk///////floaWgm4jJS+oNiTl\n///DkyQEwMxGVnN7woRt////vz3BKm0bMhuZgtoK0X4bIgyn4WBbGlICvz3BymrC7duoepHt/Q/h\nQMj/tyGGQSkogPKgSiAAKgajodR/VJdDwbY0hrRt/29PsLKacPv/bagUhN6WBg3u2xOskFwNAyhW\n/v///z9yQMIkt6VZTbgNJWHg9gRYPCKxb0+wStsGFWBIS0NyFaqzoRQ6uD3BioEBLrENSft/hOW3\nIXpRZG9DxKA0hIQAmBfQwDaCMfj/NswUCPc/TGBbGkTZ7QlWVhNuQ3gQ9n+Yyv//b0PUIvv6Ng7z\nYTQMwPkwBpSGUv9RfAQxCgJuT7CysoLIobjq/22ITmTH/B8ZYLRoJgLcnmDFAAFWE26j8BkYGKCC\ntyekwc4oZEBJPXDFadsQvLRt/7elMTAwMDAwpG2DCTIwMDAwpCFGA/4jqbl9ewKEnbbtPzIbosRq\nwm240RAOTBhG/0cCMIUMCBmYEJS/LY2BgYGBAeYYKLCysmJggCiBK4D6A01wWxoDA4MVku3b0hgY\nGFB9AQcQu62srBgY0tLSGBgYrNA1MjAwMGDRwmA14fb////hXGQ1SNkeIgvnQgHCYLguhFDaNpgu\nKysGhrQJyM7elsbAgGHa///////fhlIxWDEwQLRbWVkxMEB0wPQiyaalQcW2QWmIIgYGhGUMEDYS\ngOhnYGCA6Pj/HyFkNeE2TFvaNohgGiKStqUxQIDVhNswPVZWVlACIrotDcK6jexrqFoGBitkJTD6\nPwRsS2NgYGBgsEJIwGiIVNo2uBoGaNKymoDQnLbt////MJusrKwYGBApCdUxIwKMFs2DAGzbBktv\n0MbBKKALgBQCaSMoxJFqrFEwyMFo0TzwYBukPcDAwMAwmnHoDiChPyLKZ0hVNJrGhgYYLZpHwSgY\nBaNg0IHRonkUjIJRMAoGHaBf0cwwCoYFMDAwYBgFdAEGo0E9HMF/4gCx6igHxLtpFAxmYGBg8H8U\n0AWMBvXwA8QXg8SqoxwQ76ZRMJjBaHlBNzAa1MMPEF8MEquOckC8m0bBKBgFo2BYAuKLQWLVUQ6I\nd9MoGAWjYBQMS0B8MUisOsoB8W4aBaNgFIyCYQmILwaJVQcD29IQq/O3pTFAAVwID2Ag2k2jYDCD\n0QFQuoHRoB5+gPhikFh1MLANVjRvS0PaV7QNJooHEO+mUTCYwWh5QTcwGtTDDxBfDBKrDgaghfDt\nCVaIgvn//////2+bgML9//8/
Awb4PwqGPhgtL+gGRoN6+AHii0Fi1SGBbWkMDAwMaCXzf/jBUbgA\nA9FuGgWDGYyWF3QDo0E9/ADxxSCx6ggBwkdaEe+mUTCYwWh5QTcwGtTDDxBfDBKrjnJAvJtGwWAG\no+UF3cBoUA8/QHwxSKw6ygHxbhoFgxmMlhd0A6NBPfwA8cUgseooB8S7aRSMglEwCoYlIL4YJFYd\n5YB4N42CUTAKRsGwBMQXg8SqoxwQ76ZRMApGwSgYloD4YpBYdZQD4t00CkbBKBgFwxIQXwwSq45y\nQLybRsFgBqNzU3QDo0E9/ADxxSCx6igHuNzU0NDwfxQMHTBaXtANjAb18AO4ikFMQKw6ygEuN40W\nzUMLjJYXdAOjQT38AK5iEBMQq45ygMtNo0Xz0AKj5QXdwGhQDz+AqxjEBMSqoxzgctNo0Ty0wGh5\nQTcwGtTDD+AqBjEBseooB7jcNFo0Dy0wWl7QDYwG9fADuIpBTECsOsoBLjeNFs1DC4yWF3QDo0E9\n/ACuYhATEKuOcoDLTaNF8ygYBaNghABcxSAmIFYd5QCXm0aL5lEwCkbBCAG4ikFMQKw6ygEuN40W\nzaNgFIyCEQJwFYOYgFh1lANcbhotmocWGB0ApRsYDerhB3AVg5iAWHWUA1xuGi2ahxYYLS/oBkaD\nevgBXMUgJiBWHeUAl5tGi+ahBUbLC7qB0aAefgBXMYgJiFVHOcDlptGieWiB0fKCbmA0qIcfwFUM\nYgJi1VEOcLlptGgeWmC0vKAbGA3q4QdwFYOYgFh1lANcbhotmocWGC0v6AZGg3r4AVzFICYgVh3l\nAKubEhISGoZO0TyEnEo7MFpe0A2MBvXwA1iLQayAWHWUA0w3JSQkJCQkNDQ0JCQk/B8KoGG0aB4F\no2AUUAAwi0FcgFh1lAM0NyXAQENDQ0JCQsJQKJ0bRovmUTAKRgEFAK0YxAOIVUc5QHNTAgw0jBbN\no2AUjIKRAdCKQTyAWHWUA0w3JSQkJMCK5v9DATSMFs2jA6B0BKNBPfwAZjGICxCrjnKA1U0JCQkN\nQ6e8G0JOpR0YLS/oBkaDevgBrMUgVkCsOsoBLjcNofJuCDmVdmC0vKAbGA3q4QdwFYOYgFh1lANc\nbhpC5d0QcirtwGh5QTcwGtTDD+AqBjEBseooB7jcNITKuyHk1P80c+1oeUE3MBrUww/gKgYxAbHq\nKAe43ESjEoQWYAg59T/NXDtaXtANjAb18AO4ikFMQKw6ygEuN9GoBKEFGEJO/U8z146WF3QDo0E9\n/ACuYhATEKuOcoDLTTQqQWgBhpBT/w81146CUTASAK5iEBMQq45ygMtNQ6gEIcapxKihDxg8LhkF\no2AUQACuYhATEKuOcoDLTUOlBCFyCTYxaugDBo9LRsEoGAUQgKsYxATEqqMc4HLTkChBEhISEhIS\nGojYuNgwaLwzeFwyCkbBKIAAXMUgJiBWHeUAl5sGfwmSAAMNDQ0JCQkJeEvnhsHhnYSEhAbauGR0\nbopuYDSohx/AVQxiAmLVUQ5wuYlGJQgVQQIMNAyRojkhISEB5tr/1Aaj5QXdwNAJ6tsTrBiQgdWE\n2//JBNvSKNFNANyeYEU7w4kCDDiKQUxArDrKAS43NQyCsowgSEhISCCusGsYaO8kwEBDQ0NCQkIC\nIQeTCoZOeTHkwZAK6tsTrBjStv3//////21pDAwwNnHg9oQ0OpSY29IYKKo1qAEYcBSDmIBYdZQD\nXG5qGOiyjEiQkJDQQIRTiVFDU5AAAw2jRfMQB0MqqJGK5v+3J1iRVgRuSyNJOdng9mirGQvA5aaG\ngS7LiAfEOJUYNbQGCQkJCbCi+T+1wZAqL4Y2GFJBfXsCrGi+PcEKuXF6e4IVAwMDA1RoWxoDA6RJ\nvS2NAUUMIvp/G2xAY1saQ9q2bWkMDFBV////h/DgXBiACkMN+L8NU+N/CJeBAUloYAADjmIQEx
Cr\njnKAy00Ng6AsIxIQ41Ri1NABJCQkNNDGJUOqvBjaYEgFNbwERi39bsPaqbcnWEHLzm1pEPr/bYjc\n7dvbIIz///9vS4Povz3BioGBAaJhG7SwhjWtYXwouA3VDKFvY9F4ewK01rg9gcTWPPUBA45iEBMQ\nq45yQLybBi0gprAjRg19AI1cMqTKi6ENhlRQ354AKf+2wQre//////9/e4IVvCzcBi0pt8FUwCXh\njP9IbHRltydACl0GBgYGqAwCQCXTtv3HqtFqwu3/////R2YOEGAguhgkVh3lgHg3DVpATGFHjBr6\ngMHjklEwAsDtCVbQEhFeNP5HFv3//za0XITLwwQQjP9IbHRltydYQfjoYFsavKGdtg3CT9v2/z+y\nRqsJt/////8fmTlAgPhikFh1lAPi3TRoATGFHTFq6AMGj0tGwQgAt2El43+kwhEiDCkNb8OKRbjs\ntjQGBgaGtG0Iqf/YlMFE4AL/t6VBGf////8PE749AeoAmMD/2wiNCBEGGHtgAAPRxSCx6igHxLtp\n0AKChd2CBQsaGhoWLFjwfxCABkKuHQWjgEoAUuQxMMCLvW1pDAwMEB5cLm3bfwiASjJYWUGKTphI\n2jaYYqsJE9IYGBgYGNK2wUVuw5QxMKTBTPr//z9c1MrKioGBIS0NwsOukYGBgQFVM70BA9HFILHq\nKAfEu2nQAvyF3YIFCxbAiuYFg6B0bsDrWrLBkBoAHdpgNKiHHyC+GCRWHeWAeDcNWoCnsFsAAw2w\nonnBQJfODbhdSwkYLS/oBkaDevgB4otBYtVRDoh306AFuAq7BUigAaloXjCgpXMDDtdSCEbLC7qB\n0aAefoD4YpBYdZQD4t00aAGewm4BDDQgFc3/hyOgRXnRgDtgRzKgRVAPEIAO/GId6IUOBFtNuP2f\ncrAtjUoG0QgwEF0MEquOckC8mwYtwF+CLFiwYAFS0fx/mAJalBcNeAN2xAJaBPUAgtsTrLAUwNhF\nyQTb0hioZxhNAAPRxSCx6igHxLtp0AKCJcgCWNH8f/gCWpQXDYQCdmQCWgT1AILbEyZMmGCF1nCG\nCFKvNL1NTcNoAIgvBolVRzkg3k1DGgz7UoYW5cWwDzTyAC2CegDB7QkTtv3floZSNm9LS9uGVJpu\nS2OAAIiabWkMyKdhQFhp2/5DWFYTbv+HMhkYIOL//yMZNigBA9HFILHqKAdEummo59Kh7v4BAaOB\nNhLA7QkTtv3/j1x2Qs4ChYvAGBD69gQrBgYGSJG7LQ1REKdt+/8fpgZGwWkEY5ACBuKKwf//R4tm\naoOh7v4BAaOBNhIApGhGKV4nbPv//z9qaXobUiKnbfuPohKqBFPk////t5G03EYSH4yAgbhi8P//\n0aKZ2mCou39AwGigjQQALYn//98GaQNvS0vb9v////+I0nRbGgODFaS9nLYNwk/b9v8/khIMkW1o\nWm5DxQcrILIY/P9/tGimNhjq7h8QMBpoIwHAi+b/tydYMVhZpUGL0NsToKXptjR48QphwAT+30ZX\n8n9bGgMDA0NaGpR/G6blNkzlIAUMxBWD//+PFs3UBkPd/QQBLeamhn2gkQdoEdQDBG5PsGJgYGBg\nYICWm9uQilQosJpwe1saAwMDAwMxp2FYWVkh860gWiYgqRycgIG4YvD//9GimdpgqLufIKB6eUG7\nU/+HOqB6UI+CAQdEFoP//5NXNENrKgRI2/afIGAgzk1DPZcOdfcTBNQtLxISEhISEhpoc1fWUAfU\nDepRMBgAkcXg//+kF83b0rCUxFgF0QCRbmoY4kXbUHc/QUDF8iIBBhoaGhISEhJGS2dUQMWgHgWD\nBBBZDP7/T3LRvG0CjmEcxBg/DDBggP9EgIYhXrQNdfcTBFQsLxJgoGG0aMYGqBjUo2CQACKLwf//\nSS6a/29Lw9Y+xi6KAoh0U8MQL9qGuvsJAuqWFwkJCQmwov
n/KEAF1A3qUTAYAJHF4P//pBfN////\n35bGgA4Ilcv//xPrpoYhXrQNIfeT51SqlxcJCQkNZLlk2AOqB/UoGHBAZDH4/z9ZRTN5gEg3jeZS\nuoHBE9SDxyWjYBTQFBBZDP7/P1o0j2AweIJ68LhkFIwCmgIii8H//wdZ0bxgBByqOXhAw6ApEAeP\nS0bBKKApIKYYhABi1VEOCLppwYIFC2BF84LR0pn2oIGsApEWA6DkuWTYA1oE9SgYWECwGIQDYtVR\nDvC7aQEMNMCK5gWjpTONQQNZBSItygvyXDLsAS2CehQMLMBfDCIDYtVRDvC4aQESaEAqmheMls60\nBA1kFYi0KC+IcQkxaoYZoEVQj4KBBXiKQTRArDrKAX43LYCBBqSi+f8ooBlYAAvq/yQCWpQXDUQU\nu8SoGWaAFkE9CgYW4C8GkQGx6igHBN20YMGCBbDyYgHpRcYoIB4sWLBgAblBTYvyooGIYpcYNcMM\n0CKoR8HAAoLFIBwQq45yQIybFsDKi/+jgGZgAQw0NDQsgIH/RANalBcNRBS7xKgZZoAWQT0KBhYQ\nUwxCALHqKAdEumkE5kB6ggVIoAGpaF5AdOk8UOVFw8hLGAMV1KOAdoDIYvD//9GieeSBBTDQgFQ0\n/x/0oGE0YYyCoQ+ILAb//x8tmkckWLBgwQKkovn/UAANowljFAx9QGQx+P//aNE8UsECWNH8f4iA\nhtGEMQqGPiCyGPz/f7RoHsGAvKAeqAFQ8lw7pMFABfUooB0gshj8/3+0aB7BgLygHqjygjzXDmkw\nUEE9CmgHiCwG//8fLZpHMCAvqAekvFgw1IZfqAIGJKhHAU0BkcXg//+jRfMIBuQFNf3LiwULFiyA\nFc0LRlLpTP+gHgW0BkQWg///jxbNIxiQF9R0Li8WwEADrGheMGJKZzoH9SigAyCyGPz/f7RoHgUk\nAnqWFwuQQANS0bxgZJTO9AzqUUAfQGQx+P//aNE8CkgEdC4vFsBAA1LR/H9kADoH9SigAyCyGPz/\nf7RoHgWDHixYsGABUtH8fxSMgiELiCwG//8fLZpHwVAAC2BF8/9RMAqGMiCyGPz/f/AVzaNgFGAF\nDaN19igY+oD4YpBYdZQD4t00CkYBJhgtmkfBMADEF4PEqqMcEO+mUTCYwUDNTY3AonmggnoU0A4Q\nXwwSq45yQLybRsFgBgNVXowWzaNgGADii0Fi1VEOiHfTKBjMYKDKi9GieRQMA0B8MUisOsoB8W4a\nBYMZDFR5MVo0kwdGYLgNZkB8MUisOsoB8W4aBYMZUKW8IAOMwCKGKkE9AsNtMAPii0Fi1VEOiHfT\nKBjMgCrlBRlg8BQxdHMJVYKabq4dBcQA4otBYtVRDoh30ygYzIAq5cWQBnQr7KgS1HRz7SggBhBf\nDBKrjnJAvJtGwSgYzGBoFXZDy7XDHhBfDBKrjnJAvJtGwSgYzGAIFXYJCQkNQ8e1IwEQXwwSq45y\nQLybRsEoGMxgqBR2CQkJCQkJDQ0NCQkJ/0fB4ADEF4PEqqMcEO+mUTCYAVUGQIc0aKBX0UxJUCfA\nQENDQ0JCQsJo6Tw4APHFILHqKAfEu2kUDGZASXkxPEDDaNE8CsgFxBeDxKqjHBDvplEwmAEl5cUw\nAAkJCQ1DoWj+//9/QkJCQkJCQ0NDwmi5PGgA8cUgseooB8S7aRQMZkBheTGkQUJCQgIdCzvKgzoh\nIaGBXhXJKCAGEF8MEquOckC8m0bBYAaUlxdDFCTAQENDQ0JCQgLtS2eqBHXDaNE8mADxxSCx6igH\nxLtpFAxmQJXyYiiCBBhoGC2aRwG5gPhikFh1lAPi3TQKBjOgSnkxREFCQkICrGj+T3tAlaBuGC2a\nBxMgvhgkVh3lgHg3jYJRMGhBQkJCw5Aq7IaWa4c9IL4YJFYd5YB4N42CUTCYwdAq7IaWa4c9IL4Y\nJFYd5YB4N42CUTCYwd
Aq7IaWa4c9IL4YJFYd5YB4N42CUTCYwWhhNwrIBsQXg8SqoxwQ76ZRMJgB\nVeamhjSgW9E8GtTDDxBfDBKrjnJAvJtGwWAGo+XFaNE8CsgGxBeDxKqjHBDvplEwmMFoeTFaNNMI\n0C1gBxAQXwwSq45yQLybRsFgBiOtvMAEdCtBRlpQ0y1gBxAQXwwSq45yQLybRsFgBiOtvBhAMNKC\nerRoRgbEqqMcEO+mUTCYwUgrLwYQjLSgHi2akQGx6igHxLtpFAxmMNLKiwEEIy2oR4tmZECsOsoB\n8W4aBaNgFIw0MOR2wJMHiC8GiVVHOSDeTaNgFIyCEQUSEhISEhIa6HVu1AAC4otBYtVRDoh30ygY\nBaNg5IAEGGhoaEhISEgY1qUz8cUgseooB8S7aRQMZjDSBkAHEIyQoE6AgYbRohkJEKuOckC8m0bB\nYAYjpLwYDGDkBHVCQkICrGj+P6wB8cUgseooB8S7aRQMZjByyguyQQOVprPoE9TUci2FICEhoWFw\nuISmgPhikFh1MLAtjSFt238I2JbGAAVwITyAgWg3jYLBDOhTXgxpQK0ihj5BTS3XUg4Gj0toB4gv\nBolVBwPbYEXztjQGBqsJt/////////9tMFE8gHg3jYLBDOhTXgxpQK0ihj5BTS3XUg4Gj0toB4gv\nBolVBwPQQvj2BCtEwfz//////7dNQOFiAuLdNAoGM6BPeTGkAbWKGPoENbVcSzkYPC6hHSC+GCRW\nHRLYlsbAwMCAVjL/35aWtu0/PsBAtJtGwWAG9CkvhjSgVhFDn6CmlmspB4PHJbQDxBeDxKojBLal\noRXV//8zYID/o2AUjAAwtIqYwePaweMS2gHii0Fi1VEOiHfTKBgFQxoMoSJmwYIFDQ0NCxYs+D8I\nQMPQCTeyAfHFILHqcIJtaehDGzgA8W4aBaNgSIOhUsQsWLBgAaxoXjA4SudhD4gvBolVhxOMFs2j\nYBQggQWwwu7/4AYLYKChoWEBDPwfBTQGxBeDxKrDCUaL5hEG6DM3NUTBggULFiAVdv8pA7QL6gVI\noAHmWgj4PwpoCYgvBolVRzkg3k2jYDAD2pUXQx0sgIEGpMLuPwWApkG9AAYaqOTaUUAMIL4YJFYd\n5YB4N42CwQxoWl4MXbAACTQgFXYLKCjvaB3UCxYsWIDk2v+jgPaA+GKQWHWUA+LdNAoGM6B1eTF0\nwQIYaIAVdgsoK+/oENQLYK79PwroAogvBolVRzkg3k2jYDADOpQXQxcsWLBgAaywW0BxeUefoG4Y\nIutJhgcgvhgkVh3lgHg3jYLBDOhTXgxdsABWNP+nGNAnqBtGi2Y6AuKLQWLVUQ6Id9MoGAVDGgyt\nwm5ouXaoA+KLQWLVUQ6Id9MoGAVDGgytwm5ouXaoA+KLQWLVUQ6Id9MoGAVDGgytwm5ouXaoA+KL\nQWLVUQ6Id9MoGAVDGowWdrQDQz1siS8GiVVHOSDeTaNgMAP6zE0NaUCt4mM0qDEBtcJ2oADxxSCx\n6igHxLtpFAxmMFpeEATUKj5GgxoTUCtsBwoQXwwSq45yQLybRsFgBqPlBd3AaFBjgtGimfqAeDeN\ngsEMRssLuoHRoMYEo0Uz9QHxbhoFgxmMlhd0A6NBjQlGi2bqA+LdNAoGMxgtL+gGRoMaE4wWzdQH\nxLtpFAxmMFpe0A2MBjUaWEC9TfCUgwayKgnii0Fi1VEOiHfTKBgFo2AUoIEFCxYsgBXNCwZB6dww\nWjTjBbcnWDEwMDAwpG37jwG2pTEwMDAQeQkLuWAbjltecIkTA25PsCJb7ygYBcMQLICBBljRvGCg\nS+eG0aKZILg9wQpLAYxdlE5gWxoD2ZZToncUjILhBxYggQakonnBgJbODaNFM0Fwe8KECROs0BrO\nEMGBK+FuU2A5JXppDkYHQOkGRoMaDhbAQANS0fx/QEHDaNFMENyeMGHb/21pKGXztrS0
bUgl3LY0\nBgiAqNmWxpC2bVsaAwO8eQrhMMAU3J5gxQADadvg0gjFaRMmWDEwWE24vQ02cLENogSi/v//2wjL\noQBNwbY0BjQ3/P8P4TIwIAkNOjBaXtANjAY1MliwYMECpKL5/4CChISEhtGimSC4PWHCtv//kcvC\n2xPSJtz+DxeBMSD07QlWDAwMkBJyG6RgvT3BymoCXMH/2xMgbfDbEO7/bWlWE27///9/WxqD1YRt\nE6wYGGBl57Y0KBOmFEYjGFAA40Po2xBDGNK2/f+/LY3BCirDkLYNohQiMijBaHlBNzAa1GhgAaxo\n/j+gICEhISEhoaGhISEh4T+JgIHoYpBYdZQD4t1EKoAUzf//b0uDFG0wgdsTrJBKuNsTrBgYIIUh\nskqEktsIBVD52xDZ2xAJCEjbBpf+//8/XM3//////78NUZi2DcKGi8PBbSQFcENuQ1RCqf///yMz\nBx8YLS/oBkaDGhM0kNVWpSJIgIGGhoaEhIQEEktnBqKLQWLVUQ6IdxOpAFoS//+/DdL+3JaGUub9\n//9/WxoDgxWkrZq2DcJP2/b/P0LJNjQFtydYMTAwMMA5adv+I4FtMO3///+/jdsEqwm3/yMAuoJt\nMENuQ1RCqf///yMzBx8YKuXFtjQGCKB7UMKjb1saZZYPlaCmJ2gYLZqpDoh3E6kAXjT/vz3BisHK\nKg2aHW5PQGSRtG3///+/PQFHsQjj34YqgI1gQAFM+v9/SKmP4P//fxu7Cf9vQ8ThAEMBTOD/bajK\nbWkMCBEGGHvQgSFUXtyGhix2ABn1ojrYlsZAsDog0uohFNR0Aw0DXTT/h5XODQ0NCSSWy/9JKQaJ\nVUc5IN5NpIDbE6wYGBgY4JlhG7TAg4szMFhNuL0tjYGBgYHBysqKgYEhLQ3CS9sGVYShACbBwMDA\nYDXh9v///6HyDGnb4GyrCbfh1lihmzABLv4fBlAVpEF4cKusJtyGK2FgYGBgSNv2f3CCIVRe3J6A\nr2hGq3+pB/Bb+/8/0VYPoaCmG2gYBEXz////ExISGshyCQPRxSCx6igHxLtp4MG2CfCcg8QcBUMM\nIJWR29IYUNfDQNiwKhDKg0j935YGrVytJmybYGU1YdsEKwYGBgYrWAVsBVH2H8KDc2F8BpjQtjQE\nAwLStiHx0rb9/w/nQRSOAkKAvAKRFoA8lzAQXQwSq45yQLybBhog5ejRknkoA1hE3oYUrQxp2/7/\n3wYpLm/f3gaV/P//P6wRuy2NAbkchmu0mnD7//9taQwQE25DNaLouv3/9gToSNXtCVYMVhNu/9+W\nxgDRehuqAUrjtvr2/1FACJBXINICkOcSBqKLQWLVUQ6Id9PAg21pDFCQtu3/KBiq4DZyEQiNSrgY\nnPH/9gQrBjhI24ak+D8S+zZM/e0JVgxp2/7fnoCi6zZM+v9/OBPO+P///22I8rRtEDZU/DZEFALS\ntv0fBUMHjBbNZACUBM/AQEKi3wYtk6E5hziwjYIWDyV6RwEBcBteBP7fhq2EhTD+38a3AgfOvg1T\nf3sCrGhO2/YfDm7DpJGYMMa2NAYGK6TFObeh4v//30Y1ZBQMITBaNJMJtsGy1P/bE6wYGKBZAT+A\n9S7pBralMRDptMEDhtDc1G14Efh/Gyw5wMXgDCTJ/9swVuDA2XD1tydASli4zP9taWnb/m9LgzUB\nbk+wYmBgSNv2/zZEyzaowtsTrOAMqwm3/////x8u9///tjQoAwGGUFCPAiIBA9HFILHqKAfEu4kq\nAJHm///fBs81eAFSjqEbGAg7KQNDpbzYlsYAAVaw2TuGtG23J1gxMDAwWE24DVMASRcQNoQHZSOp\nYYDNIDKgDkVDBRnStv3//x/BZWBgYEjb9h9hF1QCujhnG0xl2rb//2FsGA8FDJWgHgXEAwaii0Fi\n1VEOiHcTVcA2RNG8LY2BwQqaldKgM++3YTkHwvmP
4EL429IYGBigbLhk2oRt225jcreloamD6/u/\nLY0Blq2hQhAOAwMDxHW3J4wWzaMABxgN6uEHGIguBolVRzkg3k1UAfAikIEBUgzehhSb0ILwNqxI\nvD3BCiL///9tmNh/2MjGNkihi+BaTbgNp6D0tjSYqXD9tyFm3p5gxcDAwMCQtu3//20Qk25DlcBo\nBGPIgNHygm5gNKiHH2AguhgkVh3lgHg3UQVsS2NI2/YfFSDEkEvEbZBSE0nwNrRMZWBgYGBIg3VN\nIXL//6Nz/9+G6INS////R5i5DWYjsuTtCVYMDAwQidtI4kMDjJYXdAOjQT38AAPRxSCx6igHxLuJ\nKgBeKCIBhNjtCVYw5v/bsNIRmQGTRIDbE6wYGOCakLm3IfpuT8A0E24jkgCDFdbJ+qECRssLuoHR\noB5+gIHoYpBYdZQD4t1EFbANVigiASSx2xMg+wL+/78NLxwRLITCbWlp2xD7TqBDGKhcuL7bmGbC\nDYKKwPi3J4wWzaOAEBgN6iENsK6uI74YJFYd5YB4N1EMbk+wYoAA5GJvWxoDAwNCCK4obdv////h\n0lB5GC9t2////7dNsEJWi8qFmmM14TacDZH4DzMEPupshT5ZPwEu/n8UjIJRMKzAaNE8CkbBKBgF\ngw6MFs2jYBSMglEw6MBo0TwK6ApGB0DpBkaDekiD0aJ5FNAVjJYXdAOjQT10wQIc1xgSXwwSq45y\nQLybRsFgBqPlBd3AaFAPUbBgwYIFsKJ5AWrpTHwxSKw6ygHxbhoFgxmMlhd0A6NBPRTBAhhogBXN\nC5BKZ+KLQWLVUQ6Id9MoGMxgtLygGxgN6iEHFiCBBqSieQGsdCa+GCRWHeWAeDeNgsEMRssLuoHR\noB6KYAEMNCAVzf9hgPhikFh1lAPi3TQKBjMYLS/oBkaDeoiCBQsWLEAqmv8jAeKLQWLVUQ6Id9Mo\nGAWjYBQMabAAVjT/RwXEF4PEqqMcEO+mUTAKRsEoGOqgYXRd8ygYBaNgFAw2MFo0j4JRMApGwaAD\no0XzKKArGJ2bohsYDeohDUaL5lFAVzBaXtANjAb18APEF4PEqkMB0EOHEQByPDF+wEC0m0bBYAaj\n5QXdwGhQDz9AfDFIrDo42JaGpSTGKogGiHfTKBjMYLS8oBsYDerhB4gvBolVBwPbJuC4keP2hAn4\ny2bi3TQKBjMYLS/oBkaDevgB4otBYtXBwbY0bO1j7KIogHg3jYLBDEbLC7qB0aAefoD4YpBYdchg\nWxoDOiBULv//T4KbRsFgBqPlBd3AaFAPP0B8MUisOsoB8W4aBaNgFIyCYQmILwaJVUc5IN5No2AU\njIJRMCwB8cUgseoIAsxpQAYM8H8UjIJRMApGMCC+GCRWHQzcnmDFgAMQGG9mINpNo2Awg9EBULqB\n0aAefoD4YpBYdQiwLQ1rGYzZakYDxLtpFAxmMFpe0A2MBvXwA8QXg8SqoxwQ76ZRMJjBaHlBNzAa\n1MMPEF8MEquOckC8m0bBYAaj5QXdwGhQDz9AfDFIrDqcYFsagxWODYKogHg3jYLBDEbLC7qB0aAe\nfoD4YpBYdTjBaNE8wsBoeUE3MBrUww8QXwwSqw4nIKVoHgXDABgYGDCMAroAg9GgHo7gP3GAWHWU\nA4ZRMApGwSgY8eA/cYBYdbQDxLt1wMEQcur/IeXaIeTU/6OupRkYQk79T3vX0tZ0YgCtfUhFMISc\n+n9IuXYIOfX/qGtpBoaQU//T3rW0NZ0YQGsfUhEMIaf+H1KuHUJO/T/qWpqBIeTU/7R3LW1NJwbQ\n2odUBEPIqf+HlGuHkFP/j7qWZmAIOfU/7V1LW9NHwSgYBaNgFJABRovmUTAKRsEoGHRgtGgeBaNg\nFIyCQQdGi+ZRMApGwSgYdGC0aB4Fo2AUjIJBBwagaMaztRuP1IAAHO6B3yeA9eTqAQeD3HkoAOZW\nbME8WMHtCVZD
xLnbYBcsD4GU8P//bUhaGPRuhYcqA03TLb2LZkjwY/UQHqkBAbjcc3tCGkRsWxo2\n6YEGg9x5KADm1tsTrAZ/joSBIRCuULBtwpBw5v//////35bGMARK5f/////ftg3uShzXilAH0Lto\n/v///23cjQ48UgMCCLjn9uAuUQa585DBtjR84TyYwLa0tAn4U8VgAbcnWDEwDJXiDnv3dJADWMuC\nNmC0aMYHCLlnW9qgTvmD3HlwQPDyskEDbk+YsO0/oVQxSMDt27f//789wWrwl863J1gxpKWlMTAw\nDH63IgHalsyjRTNeQMA9NI4bSsEgdx4E3J5gxcDAMDSGCKABeht/qhhsYNA79/YEK4a0tG3b/v//\nvy1t6BTO0NRAMzBaNOMD+N2zbXAP5Q1y5yGD2xOshkDZDAvQ23hTxeADtydYDeri7jZSeG4bMiMb\ntC6ZR4tmvACPe24P7vQ+yJ2HAQb/YPPtCZDmPQwMdvciAM0LEUrBNsTA223cOW6QgW1pMDfTCIwW\nzfgALvcgicOaUoMJDHLnYQGDv2RGAkjBOxTAEAjbbbC28tAJWpqXzHQvmm/D2h7QGLg9wQqZycDA\nAJcaaIDuntsTIE7dlsaABKwm3P4/qMAgdx4yuD0BGsIMtE7mVAW3h0T5AU8HQyJsYa4dEo79//8/\n7UtmuhfNo2AUjIJRMAoIgtGieRSMglEwCgYdGC2aR8EgAbfhwxtQgKvLuC0NpxQM3MY1C7otjQF5\nlGdbGgMDlA9nwvuqqCK3UZ1nNeH2/////8NUMTBYQdWNglFAFTBaNI+CwQNuI5WomOsKbk+YsO3/\n/9sTrHCX2jCAqRkJbEtDMgBi6P9t0OIeYTqmyH+kCTVoKbwNYRSSulEwCigHo0XzKBg84DasaIYt\nLLk9wYqBgQFS5G1LY2BgYEjb9v/2BKu0CROsIJz/MAmrCbf//9+WZpWWZsXAYGWVNuE2THsaZDsD\nHGybMGHbBNhMHrRovj0BMseLAJgi/7dBi2Zoufz/9gQrqBNGwSigNhgtmkfB4AG3J1gxQACkDLw9\nwSpt2//bE6ysJtzeNsHKKm3b/////9+eACk0b0Nl0ybc/v9/WxpDWloaXOf//////4co+L8NWpLC\nwLYJE27//78tjSFt2///0KL5/////7elMTAwMDCkwfgYIlAuA0zk9gQrJNtGwSigJhgtmkfB4AG3\nIWXpf2jp+f//f2hxaDXhNkISxoDQEAUMDAwMVhNuw9q1MHB7ghUDrCCFA5jh29IY0rYhFc0QcBuj\nvIWLwEyHlfW3J4y2mkcBrcBo0TwKBg+4PcEKpajblsaQtu3/bUjReBsmCWNAaGirGQJghScEQItd\niDoEgBXN///fnmAFLbiRTIFYikXkP5LptyFmbkuD6v//////2xB3joJRQA0wWjSPgkECbk+wYmBg\nYGBgQJRvECErKysGhjRo+zht27Y0BgYGqwm3YTREnIGBIQ0yAG01Aa799gSIVBqs9PwPUwwXgbaA\nb09Is7JigIC0bf//Y4pA3IIAadv+//+PIoyweBSMAorBaNE8CkbBKBgFgw6MFs2jYBSMglEw6MBo\n0TwKRsEoGAWDDowWzaNgFIyCUTDowGjRPApGwSgYBYMOjBbNo2AUjIJRMOjAaNE8CkbBKBgFgw6M\nFs2jYBSMglEw6MBo0TwKRsEoGAWDDowWzaNgFIyCUTDowGjRPApGwSgYBYMOjBbNo2AUjIJRMOjA\naNE8CkbBKBgFgw6MFs2jYBSMglEw6MBo0TwKRsEoGAWDDowWzaOAigB6GjLSycVQEbgQ/Hxjqwm3\nb09Im3AbLoAAVlC1///D9FtNuL1tApIoFEAkGeCHJ8MFYHyaAgyXW6G4EC5tNeH27QlpEDmIA62w\neAciAwFWUDmoETDuKBhJYLRoHgVUBpDyBKlwhF428h9S/MDKmdsTrOCcbWlI6r
elMTDA9G9Lgyi5\nPQFyHSAqQC/eYKr/b0M2jxKwYMGC//jA7QlIV1BtS2OAc7alwT33//YEmE+3QR14ewKGd7Ztg+r8\n/x96uv//2xOgt6Yg3a4yCkYMYPg/CkYB0SAhISEhIeE/XnB7woQJE6wQpdT/29Ci+fYE9AIJVght\nQytKt6UhyjKYBHo5/P/2BCsGBiRr/t+GFWX/b0Ouh6IQLICB/zjB7QlW2BxwewJ2n+LzDhzAWtjI\nihHMUTBSAMP/UTAKiAMJSOA/bgApibelwUtNiMB/zPIKDrahFz1wtdvSUEpfZHD79u3//29PsELI\n354A0XWbCiXzAlTwHzu4PQGpaN6WBnXK7QkQd2CCbWk4vQMDsJL5Nqyc/4/KHgUjBDD8HwWjgAiQ\ngAH+4wDQkvj/tjRo0xcqcBtngfV/G86i+f///9vSGBgY4DxMcBup3Lo9wQqvWiLBAmzgPxYAsQ8G\nYH64PcEKtxsIeAdWMv+/jeotOHsUjBDA8H8UjAIiQAIG+I8DQEvi////34YUUVABKO8/FrANX9H8\n/z+EjyKADG4jGsnbJkyAXOcKEyADLMAN/qOD2xPQWs0Madsgorgc+///f4gC7CrgJfP/20jFMTJ7\nFIwQwPB/FIwC4kACEviPG0BL4v//////f3uCFYNVWhpE4PYEK7TyCKZ0G1rRvC0NWnJBhmj/////\n//82NDUIACvObsMLMASLPLAAG/iPBdyegFQ0/789AepBOAMOID4l7B3sKpCYo2CkAIb/o2AUEA0S\nEhIS8JbL/2GlEBxsS2OAlyvb0hiQiizEwoNtyEXPtjQGuKptaTDW7QnIhSAygBuDUHKbwqL5P0bp\n/B87uA238v///9vSGOBe3ZYG98P///9hTtxGyDtIJfP//7ehvoDRo2BEAYb/o2AUUA9sS2NgYGBA\nLpb+/0cpb2AKGGCl2O0JVgxoACLx/////9smTNgGk0eI/v///z/CICRxuFnI1pMNFsDAf+wAbhsC\nILkF4UCEMB7v/P////9/lJD6D7eBKr4ZBUMMMPwfBaNgFOAAuMvlUTAKaAsY/o+CUTAKRsEoGGRg\ntGgeBaNgFIyCQQdGi+ZRMApGwSgYdGC0aB4Fo2AUjIJBB0aL5lEwCkbBKBh0YLRoHgWjYBSMgkEH\nRovmUTAKRsEoGHRgtGgeBaNgFIyCQQdGi+ZRMApGwSgYdGC0aB4Fo2AUjIJBB0aL5lEwCkbBKBh0\nYLRoHgWjYBSMgkEHRovmUTAKRsEoGHSA3kXz7W0TJkzYtm3bhAnbiD/o8Pa2CbAjgG9v27ZtG1zv\ntglpaRDO7W3btm3btu32fyi4fXsCTM9tZHEYuA2TxQtuT0A5pfH2tgkTJkzYtm3b7dv/b0PPb///\n/////7cnTNj2/zZC8e1taROgzP//b2+DaJowAdMZGADJEGqA29smTJgwYds23DajOJU4gKbl9rYJ\nMO7tbdu2bYN6c9uEtLS0tDQI7/a2CRMgLAiA8Ldtm4As+P/2BJhBqOz//28jhzYM3N42YcKECdu2\nbbuNKoVVMXaAMOI/DoDm2VEwCugE6F00I/IN7Hjx27dv//////a2CRNu////H8pFA/B8ClEFNeP2\nBHjGvj1hwjYYCQW3t6VBTyDftm3b////oSbDLYKC28ic/3AORPH/bYiCEmrp/////9++DSuaoQIQ\nAFd8+zaqSyakTbj9/////9ugXv5/G0pCqf////+H0UiG/EcCyBwIG0IiK4OwkD14ewKEffv2bZj4\n7dv/ocT//////78Nc+ptiBiEhILbSDwkFkwLDMC5UBtg3v3///9tqMiEbf//I4vDOXDvwgHUnTBw\n+/bt//+hym//RwCIyP////9Dlfz/D6WgUlDO//9wcQhANv82ZeEzCkYB7cDAFc3/t6VZTdg2YcK2\n/9smTLgNEb4N4/6/jdKQuT1hwrb//////w
/JyRDF///fvr0tLS1t2///////vz3BCn5D3P//////\nv71t2+1taVYTbv/ftm3b/9sTJkBMhurdlpYGFYQw/9+eYJW27fYEKHsCRDFU7v////+Rmf////9/\ne0IaXD1UbltaGqqZUHAbYuX///+3pVlN2LZtwrbb29LSJtzeljbh9v//27Zt+///NpJgGqohtydY\npW3bNiEtbcLt/7cnpKVtu71tQhrEH9u2QXVtm2CVBnPM7QlpE27/h4FtaVZWaRO23YaJQ8htUJUI\nW/5vS5tw+z/EMRBwe8IESCDcRjF8wrb//6FaYAAi+B8mfntC2oTb/////////+1t227/xyL+///t\nCVZWaRMmpKWlbbv9///tCWlp26B23IYo25aWBrH+/7Zt2/7fnpCWBpWHgm1I7P///0PcABG8PSEN\nrhhDfNu2tLQJt//DAPnhMwpGAU3BwBXNtyekpW37v23bhG0TJky4fRsiDOP+RwWQTAIBtyFF038o\ngEjdnjBh2//b0FwEAZBi4faEtLS0bdsQJt+GaN6WlrYNyoQUABDO7QkTtv3/D1f8f1ta2rb/ELAt\nLW3C7f9wgKo+LW0bhIQIw8QhACr2///tCWlp27ZNgHD+//+/bcKEbdu23Yawbv//////f2RDIKJQ\nzra0tG1QNsTwbWlpaRMm3P7//////8jiEDYyuD0hLQ3mcQi5DWHYf4iu//+3TUjbtm3b7f9wsA0a\nCBB1tydMwNACBchclNi5DTVvW1ratv//byMkEJzbE6zS4OZuS0Nj/7+9LS0tbcJtqCCyRdvS0ibc\n/g8DEPn//7dNmHAbyrk9YQLcMFTxtAm3/yOD2+SFzygYBbQEdC+at6WlTdgGycDboLwJMOL2hDQY\n6/aECUi54PaENBj39u1tEyAZ5Pa2Cdtuw3P/hAm3b9+esG3bfxiAyfzflpa27f+2tDSIydsmpKVN\nuH17W1ratv+3J6SlbbsNyXC3J6RNgObJ/9vSYOq2paVt+w8D2yakTbh9+/ZtDPVQaltaGtRM1KbZ\ntrS0Cdtu3942IW3Ctv//b09Is0qbMGHb7f///9+eAFV2Gy64LQ1myO1t27bd/g9jT4DqnYBUfEyA\n6YKLQNSkTYD2u29vm7Dt9u3b2yZM2AYT35aWNmHbBCurCbf/356QlrYN4dTbE6AMCNiWljZh2wQI\nMQHZ8G0ILRBwewKce/v2tgnboOz/t7fBmLcnTNh2e9u2bf/hYFuaVdq227e3TUhL2/b/P9SEbWlp\nyO7c9n/btm23b0+YsO327QlpE2BugIFtE9Im3L59+/a2bbf//789IS1t2+1t26BsuOLbGOK3J6Sl\nTaAofG5vm7Dt9v9RMApoCuheNI8CEgG8RKA1uL1ttMDBB0bDZxTQE4wWzYMc3N42YcKE2zQvE25v\nGy148IHR8BkFdAZ0Lppv354wYcK2/////ycpsUO0bdu2DYuW2xOgBtIYEN963bZtG9FqkcHtbWkI\nr9yekJYG5yCD29u2bSO1qEYx+f9/VMNvb0tLs0rDCNjb2yYg+u23kdgIcBufsTCAVZAicHvCBEIG\n3t42AQKwuBoB0N1PHEDRdXvbhAkTtm2bQFp0b5uAMAECbm+bMGHC7dsQxrbbmH68TXSSwsgsKA4m\nAG5PIFrpKKAtoHPR/P//7W1pVlYTbv///3/bNkgiuH37////////v71twgQIEyaCAKgJ8/bt2///\nI6mHAQgPSqKpuQ2h/t+GMSAAjfsfiQvXC1EDdcLt2/8hAM74/x/B+v///+0JKBnhNqrk////YeT/\n/5hWTNj2Hya7DVai3YYK/P/////tCWkTkLioACIBIVF13YaY/P////8Qcbjh///fvn37////2zDN\nvX0bVqje3jYhDSp/+/Z/OLh9G4+x////v40qCOX+//8fzoADuAic8f////9Qv9xGDSVkAOejSMFs\nRAhDqf
////9HYmG4HxXcRhJEYiHp+v8fGiOQ2cz//28jafn///9/JC6EBfHLtgkTbv//j6IYYs5/\nZMf//////3+YGog8lPP/P0L89u3//6HGQgFEKRzchjkYovL2fyiAsW7fvv3/P6oJ////v43M+Y/C\nGQV0AQNQNG+7vS3NasLt/9u2bfsPSTSQ1AhJUbe3Tdh2e1ta2oTbt7dtg6eI2xOsrNImTEibsA2S\nrCF60ybchmq+PSEtbdvtbRPSJty+PSFtwrYJUIW3J6RNQDJzwoQJEL0QcBtJWdq2bdsmQJXd/v//\n////tyF6JyDUTLi9bdu22///IwzcNsEqbdvtCWlpMCP//789IW0CRGjbtrQJ26AO/H97Qlratttw\nF97+//////8QFqoVtyekpU24DdV2e9uEbbe3QQT+//+/Lc0qbdu2CTAuHNyekIZs+LYJE7b9/78t\nLW3b//+3kdnbYKZtS0vb9h8JQFShgtu3b29LS5tw+//tbdugRk/YBjcKmY3N2NsTJkA8BVEEYW3b\nBhGHiEHB7QkTIMK3J0zYBpO6jRSwtyekTbj9//aECTBj0tK2/UdWABHdtu0/BGxLS4Mwb2+bMGHC\n7W0w5yFH1u0JE7bBVN7eBleQlrYN2iW5PWECxLrbOHRBwe0JaWnbtk1IS9v2//aECTAtaWnbtm2b\nADcW2YS0Cbf/////f9uECchu/n97Qlratm3btm2bYJW27T/EjtsTJmz7D1H5//aENIQh26AmT5gw\nAWLIbZix/////38bkVmgJkB0TUibcBvOngCRvD1hAtzNE25Dld6eMGEblPn/9gQruL2jgJ5gQIrm\n//9vT0hLS9u27TYkPfzfNmHC7f8QzrYJE27/xwAQOQi4vS0tLW0CTD0kBUHYiCS1bduEbRMmwBLc\ntgkTbv////8/Qi8U4FL2//////8hgshqrNLStv3///8/QuXtCWkTYPZCwe0JaRNuI4j/UNUQDkTl\nNpg5EEEkKyZs+/8foh6iCMKEg21paQgFSABiDkT7tjQrq7QJt////79twoTbEBmY1IQJt///////\n/3+oQVAAiRN0cPv27f+3J6SlTdgGCd9tE9Im3P7///+2CUQZC/MUVBAa8lCNSHpgInDGtgkTbkM5\nEPMhbDQDIYIQBTDD/0PAtjQrKPv2hLQJt7dNQOb9vz1hAkz3/9sTJmz7/x9NARTArIMI3p4wAU0X\nFEDF/v//j65l24QJEAkIF6ILwv7//z+amxES29LStsHJCRO23d42YdttmDzEkG0wk+GGQGShAMGB\nsiC6IJxtaWlwj/z//x/Nzf+3paVhy5UQE0YBPcEAFc3////flgZJI2nbbm/btu32//+3J6SlTbi9\nbUKaVdqECdtu/789YcLt/1CwLS1twrbbt29vm7BtwrZtt29PmLDtNkT97W1pEHMm3IaS29LS0tLS\nJmybACNub4OZuQ2m9/////+hpk6AEBNu/78NU/b/////MPdMSEubsA1KTLi9bULahNv//9+Gqbw9\nIW3C7f/b0tIQCff2hLQJUPHbE9LSIJZChaEkXP3tCWlp6Fbc3jZh2+3//yGKbsMs+g8B2yZMgBsI\nVff/////tyekTUAyfMKEtDS4sglpadu2bUtLm3D7/224advS0iAugChIS0tLm7DtP4qZ/7dt23b7\n//9taRNu//+/Lc1qwu3bE9LSiDZ2W1rahG0T0tImQCJo27Ztt29PmLDt9u0JaWkwQyDg9oS0tAno\nUrdhftn2///tCWlpSKEEMfA2koJt27bdvj1hwrbb/6Fg24S0Cdtu3769bULahNu3Yc67PSENruX2\nhLS0bejuvz0hbQLMjG1paRDrtk1Im4BNFxQgcbalpU3YNgFCTMAwdlsazIQJt2//x3DztrS0Cbf/\n//9/e4JV2rb//7elpW37f3tbWtqEbdu2bbv9//9tVEMgJm/bBjXkNszY
/////9+WljZh2+3bt7dN\n2LZtQlraNpiDt6WlTdg2wcpqwu3/tyekpU2YMGEbxIYJMOL27W1padv+356Qlrbt9rZtGPZum7Dt\n9v9RQB9A96J5FFANbBvNJ8MbbIPG8LZt2/6PgpEGRovmUTAKBiu4vW3ChAnbYK3hUTCiwGjRPApG\nwSgYBYMOjBbNo2AUjIJRMOjAaNE8CkbBKBgFgw6MFs2jYBSMglEw6MBo0TwKRsEoGAWDDowWzaNg\nFIyCUTDowGjRPApGwSgYBYMOjBbNo2AUjIJRMOjAaNE8CkbBKBgFgw6MFs2jYBSMglEw6MBo0TwK\nRsEoGAWDDowWzaNgFIyCUTDowGjRPApGwSgYBYMOjBbNo2AUjIJRMOgAAAyGJghegAiEAAAAAElF\nTkSuQmCC\n",
      "text/plain": [
       "<IPython.core.display.Image object>"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from IPython.core.display import Image\n",
    "Image(filename=('f:/DATA/MYDATA/TEMP/figure1.png'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://i.imgur.com/7aflA3I.png\" alt=\"figure 1\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## !! Now return to the Stata kernel\n",
    "\n",
    "Go to: Kernel -> Change Kernel -> Stata"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We need to reset the paths in Stata. Stata forgets this when the kernel is changed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". global path1 \"F:\\Data\\RAWDATA\"\n",
      "\n",
      ". global path2 \"F:\\Data\\MYDATA\\WORK\"\n",
      "\n",
      ". global path3 \"F:\\Data\\MYDATA\\TEMP\"\n",
      "\n",
      ". global path4 \"F:\\Data\\MYDATA\\FINAL\"\n",
      "\n",
      ". \n",
      ". clear\n",
      "\n",
      ". \n",
      ". *return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "global path1 \"F:\\Data\\RAWDATA\"\n",
    "global path2 \"F:\\Data\\MYDATA\\WORK\"\n",
    "global path3 \"F:\\Data\\MYDATA\\TEMP\"\n",
    "global path4 \"F:\\Data\\MYDATA\\FINAL\"\n",
    "\n",
    "clear\n",
    "\n",
    "*return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Discussion of Social Class Effect <a class=\"anchor\" id=\"socialclasseffect\"></a>\n",
    "\n",
    "There is a clear and observable negative social class gradient that is net of gender and parental education. Overall children from more occupationally advantaged social classes perform better on the general cognitive ability test. The negative social class gradient, and the differences between social class categories, may reflect the instability, and the economic and social strain that results from belonging to the more disadvantaged social class groups (Layte, 2017; Elder, 1994; Conger and Conger, 2002). These differences may also reflect other characteristics of parents jobs, such as complexity (see Parcel and Menaghan, 1994).\n",
    "\n",
    "At the apex of the social class hierarchy are fathers in the ‘managerial and professional’ class. These include fathers in NS-SEC 1.1 (Large employers, higher managerial and administrative occupations), along with fathers in NS-SEC 1.2 (Higher professional occupations) and fathers in NS-SEC 2 (Lower managerial, administrative and professional occupations). The ‘managerial and professional’ class comprises more complex and higher skilled occupations, and employees usually enjoy a high degree of job security, and have a regular, and known, monthly income (Goldthorpe and McKnight, 2006). Fathers in the ‘managerial and professional’ class can have realistic expectations of salary increases, for example via incremental pay scales, and they can realistically expect to be promoted within their occupations up to the age of 50 and even beyond. These advantages are likely to make substantial economic, social, and cultural contributions to the households in which children grow up.\n",
    "\n",
    "At the base of the social class hierarchy are the ‘routine and manual occupations’ (NS-SEC 5, 6 and 7). In both cohorts children born into families in ‘routine and manual occupations’ have markedly lower cognitive ability test scores than children from ‘managerial and professional occupations’ families (NS-SEC 1.1, 1.2 and 2). The fathers in NS-SEC 6 and NS-SEC 7 comprise a group of wage-workers in lower skilled jobs that are usually of a routine nature. The economic lives of the fathers in these classes are characterised by a relatively high risk of job loss, recurrent and often long-term unemployment, and lower earnings. Occupations in NS-SEC 6 and NS-SEC 7 are often rewarded on a weekly rather than an annual basis, and pay can vary as a result of the availability of overtime, piece-rates or shift work premia (Goldthorpe, 2016). The advent of negative events such as job loss and unemployment are likely to have immediate impacts on a household’s economic and social circumstances. We speculate that the precarious nature of the employment conditions that are experienced by employees in routine and manual occupations hangs like a sword of Damocles over these families. The lack of economic security and the lower material rewards associated with jobs in NS-SEC 6 and NS-SEC 7 may contribute to the impoverished cognitive ability of children with fathers in these classes.\n",
    "\n",
    "Occupations in NS-SEC 5 (Lower Supervisory and Technical) usually require specific skills and organisational knowledge. Occupations in this class generally provide more stable employment and include some of the conditions of employment, for example an annual salary, typical in the managerial and professional class. The additional occupational complexity, along with the improved economic security and benefits associated with occupations in NS-SEC 5 may contribute to the improved cognitive ability of children with fathers in this class.\n",
    "\n",
    "Between the ‘managerial and professional occupations’ (NS-SEC 1.1, 1.2 and 2) and the ‘routine and manual occupations’ (NS-SEC 5, 6 and 7) rests the ‘intermediate occupations’ (NS-SEC 3 and 4). Despite being distinctive the ‘intermediate occupations’ are not organised into a hierarchical order. NS-SEC 4 (Small Employers and Own Account Workers) theoretically stands apart from NS-SEC 3 (Intermediate) because it is composed of self-employed workers and small employers. NS-SEC 4 comprises both those who are engaged in largely manual work along with others who are engaged in non-manual work. In contrast to fathers in NS-SEC 1.1 (Large Employers and Higher Managerial Occupations), the fathers in NS-SEC 4 carry out the majority of the entrepreneurial and managerial functions within their enterprise. The children with fathers in NS-SEC 4 have cognitive test scores that are more similar to counterparts in NS-SEC 5 than to other children with fathers in NS-SEC 3. The better performance of children with fathers in NS-SEC 3 may be a reflection of their father’s being engaged in intermediate occupations that can reasonably be described as being ‘white collar’. Being engaged in white collar occupations generally leads to better employment conditions and economic rewards. \n",
    "\n",
    "In the discussion above we have focussed on the employment characteristics and conditions associated with the NS-SEC categories.  The observable negative gradient leads to the plausible conclusion that class differences with substantial differences in the economic, social and cultural milieus within households. We speculate that social class differences in cultural values, parenting styles and family activities may also play a role in reproducing inequalities (see Bourdieu and Passeron, 1977; Ermisch, 2008; Kiernan and Mensah, 2011; Lareau, 2011; Washbrook, 2011; Vincent and Ball, 2007; Sullivan et al., 2013). Researchers in fields such as psychology have pointed to the heritability of general cognitive ability (see for example Tucker-Drob et al., 2013; Hill et al., 2014; Deary et al., 2006), which might be another potentially plausible dimension contributing to the social class gradient.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Conclusions <a class=\"anchor\" id=\"conclusions\"></a>\n",
    "\n",
    "Overall, this article provides persuasive evidence that whilst there are sociologically important and informative differences between social classes, there has not been a notable change in the relative ordering of social class inequalities in childhood general cognitive ability test scores between these two birth cohorts. These analyses detect that gender, parental education and social class have structuring effects on general cognitive ability in childhood. This underlines the benefits of moving beyond psychology’s standard disciplinary boundaries in order to develop a more comprehensive understanding of social influences on cognitive inequalities (Flynn, 2012). \n",
    "\n",
    "In Britain since the end of the Second World War there have been ongoing concerns about social inequality in education. Despite numerous new educational policies and initiatives the structure and organisation of primary schools remained relatively unchanged in the second half of the twentieth century. Primary schools in the post-war period might reasonably be described as being in a state of ‘constant flux’. The children of the NCDS began primary school in the early 1960s, and the children of the BCS entered primary school twelve years later. Nevertheless, the evidence from analysing the two British birth cohorts is that social class inequalities in childhood cognitive ability test scores were notable and persistent. The extent to which parental social class inequalities in general cognitive ability test scores have changed in more recent cohorts is a question for further empirical investigation. Unfortunately, at the current time we are not aware of any nationally representative UK datasets that contain suitable general cognitive ability test measures to effectively examine more recent cohorts.\n",
    "\n",
    "Children’s cognitive ability test scores summarize their capability to understand complex ideas, to engage in various forms of reasoning, to learn from experience and to effectively adapt to their environment. The overall finding, that social class divisions in cognitive ability can be observed when children are still at primary school, and that these inequalities are persistent, is a disturbing result. Pupils with fathers in ‘routine and manual occupations’ are at a distinct disadvantage. These pupils arrived at secondary school already weighed down with stones in their satchels. This is an important finding to emphasise because cognitive ability is known to influence individuals throughout their lives (see Deary et al., 2007; Nettle, 2003; Vanhanen, 2011; Schoon, 2010). \n",
    "\n",
    "There is an increasing desire and requirement to make sociological research more transparent, and to actively render it reproducible. In addition to the substantive findings, this article makes a ground breaking methodological contribution by using Jupyter notebooks which are an internationally recognised open source research platform. Publishing the Jupyter notebook allows third parties to fully reproduce the complete workflow behind the production of the article, and to duplicate the empirical results. In addition to increasing transparency, this approach enables the possibility for other researchers to extend the work, for example with different measures, additional data or alternative techniques. Improving transparency is an attractive feature and is highly likely to make a major contribution to quantitative sociology.\n",
    "\n",
    "In developing an open and published workflow we have drawn upon ideas advanced in computer science especially the concept ‘literate computing’, which is the weaving of a narrative directly into live computation, interleaving text with code and results in order to construct a complete piece that achieves the goals of communicating results [6](#note6) (Knuth, 1992). A further innovation within this work has been the adoption of ‘pair programming’ which is a technique from software development in which two programmers work together in the development of code. In addition we have also used ‘code peer review’, and each author has run the complete workflow independently using a different computer and software set-up. This has enabled us to undertake an in-depth test of the reproducibility of the work. These practices are rarely utilised in sociological research but bring great benefits to the discipline.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Notes <a class=\"anchor\" id=\"notes\"></a>\n",
    "\n",
    "[1]<a class=\"anchor\" id=\"note1\"></a> Although there is some evidence that these increases may have slowed, or even stopped, in recent years (Teasdale & Owen, 2008).\n",
    "  \n",
    "[2]<a class=\"anchor\" id=\"note2\"></a> The 1970 British Cohort Study included babies born in Northern Ireland in the first interview (at birth) but these babies were dropped from all subsequent sweeps of data collection within the study.\n",
    "  \n",
    "[3]<a class=\"anchor\" id=\"note3\"></a> A measure of mother’s occupation before pregnancy was collected in the NCDS birth survey, variable n539. However more than half of mothers in our sample have no occupational information and the information available is only provided as non-standard occupational categories (e.g. ‘bank clerks etc.’, ‘Textile-labourer’, ‘Clerks, typists’). Information on mother’s occupation is also provided in the age 11 NCDS survey dataset, variable n1225. This variable indicates that over half of mothers in our sample have no occupational information and also uses a non-standard categorisation of occupations. Mother’s Registrar General Social Class (n2393) and Socio-Economic Group (n2394) are available from the age 16 NCDS survey, however more than half mothers in our sample have no occupational information. We chose not to use the available mother’s occupational information because of the large number of mothers with no occupational information. The non-standard classification of occupations would not enable us to produce comparable socio-economic measures in a suitably standardised manner. We do not use the mother’s occupational information from the age 16 sweep of the survey as it is collected 5 years after the outcome of interest and it would not allow us to produce NS-SEC in a standardised manner, we therefore consider that it is not an appropriate measure for the present analysis.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path3\\pooledNCDSBCS_v2.dta, clear\n",
      "\n",
      ". \n",
      ". tab n539 if (cohort==1), mi\n",
      "\n",
      "0 Mums paid job when |\n",
      "  starting this baby |\n",
      "          (GRO 1951) |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "         1. Teachers |        269        1.54        1.54\n",
      " 2. Nurses qualified |         92        0.53        2.07\n",
      "  3. Bank clerks etc |        246        1.41        3.49\n",
      "  4. Shopkeepers etc |         60        0.34        3.83\n",
      " 5. Others in SCI,II |        101        0.58        4.41\n",
      " 6. Nurses- not qual |        109        0.63        5.04\n",
      "   7. Clerks,typists |      1,559        8.95       13.99\n",
      " 8. Shop asst,hairdr |        799        4.59       18.58\n",
      "  9. Garment workers |        152        0.87       19.45\n",
      "10. Textile wkr skld |        281        1.61       21.06\n",
      "11. Personal service |        224        1.29       22.35\n",
      "12. Others in SC III |        553        3.18       25.52\n",
      "      13. Machinists |        287        1.65       27.17\n",
      "14. Textile wkr SCIV |        104        0.60       27.77\n",
      "   15. Personal-SCIV |        379        2.18       29.95\n",
      " 16. Others in SC IV |        988        5.67       35.62\n",
      "17. Textile-labourer |        356        2.04       37.66\n",
      "   18. Personal-SC V |        122        0.70       38.36\n",
      "                   . |     10,734       61.64      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     17,415      100.00\n",
      "\n",
      ". tab n1225 if (cohort==1), mi\n",
      "\n",
      "   2P Mothers's most |\n",
      " recent work and SEG |\n",
      "          (GRO 1966) |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "       1. Prof,manag |        255        1.46        1.46\n",
      " 2. Intermed non-man |        751        4.31        5.78\n",
      "  3. Typist,clerical |      1,257        7.22       12.99\n",
      "   4. Shop assistant |        805        4.62       17.62\n",
      " 5. Telephonists etc |        178        1.02       18.64\n",
      " 6. Personal service |      1,717        9.86       28.50\n",
      " 7. Forewomen,manual |        120        0.69       29.19\n",
      "   8. Manual workers |      2,705       15.53       44.72\n",
      "      9. Own account |         67        0.38       45.10\n",
      "    10. Farm workers |        140        0.80       45.91\n",
      " 11. Inadequate info |         42        0.24       46.15\n",
      "                   . |      9,378       53.85      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     17,415      100.00\n",
      "\n",
      ". tab n2393 if (cohort==1), mi\n",
      "\n",
      "  3P Mother-s social |\n",
      " class,if works (GRO |\n",
      "               1970) |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "                1. I |         35        0.20        0.20\n",
      "               2. II |      1,134        6.51        6.71\n",
      "      3. III non-man |      2,230       12.81       19.52\n",
      "       4. III manual |        522        3.00       22.52\n",
      "       5. IV non-man |      1,290        7.41       29.92\n",
      "        6. IV manual |      1,107        6.36       36.28\n",
      "                7. V |        742        4.26       40.54\n",
      "     8. Unclassified |        104        0.60       41.14\n",
      "                   . |     10,251       58.86      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     17,415      100.00\n",
      "\n",
      ". tab n2394 if (cohort==1), mi\n",
      "\n",
      "          3P Mothers |\n",
      "      Socio-economic |\n",
      " group,if works (GRO |\n",
      "               1970) |      Freq.     Percent        Cum.\n",
      "---------------------+-----------------------------------\n",
      "  1. Emp,manag large |         43        0.25        0.25\n",
      "  2. Emp,manag small |        212        1.22        1.46\n",
      "    3. Prof-self-emp |          7        0.04        1.50\n",
      "   4. Prof-employees |         39        0.22        1.73\n",
      " 5. Intermed non-man |        935        5.37        7.10\n",
      "   6. Junior non-man |      2,239       12.86       19.95\n",
      " 7. Personal service |      1,289        7.40       27.36\n",
      "     8. Foremen-man. |         55        0.32       27.67\n",
      "   9. Skilled manual |        279        1.60       29.27\n",
      "10. Semi skld manual |      1,039        5.97       35.24\n",
      "11. Unskilled manual |        743        4.27       39.51\n",
      "12. Work own account |        111        0.64       40.14\n",
      "13. Farmer-emp,manag |          2        0.01       40.16\n",
      "14. Farm-own account |          6        0.03       40.19\n",
      "    15. Agric worker |         59        0.34       40.53\n",
      "    16. Armed forces |          1        0.01       40.53\n",
      " 17. Inadequate info |        104        0.60       41.13\n",
      "                   . |     10,252       58.87      100.00\n",
      "---------------------+-----------------------------------\n",
      "               Total |     17,415      100.00\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path3\\pooledNCDSBCS_v2.dta, clear\n",
    "\n",
    "tab n539 if (cohort==1), mi\n",
    "tab n1225 if (cohort==1), mi\n",
    "tab n2393 if (cohort==1), mi\n",
    "tab n2394 if (cohort==1), mi\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[4]<a class=\"anchor\" id=\"note4\"></a> There is a significant relationship between parental education and father’s social class (χ2 = 4700; p < 0.001 @ 21 d.f.). The association between parental education and father’s social class is relatively weak (V = 0.30). The average variance inflation from the complete records model was 1.70. Following conventional methodological advice we conclude that multicollinearity is not a concern in this model (see Menard, 2002)."
   ]
  },
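  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The figures quoted in note 4 can be checked by hand against the Stata output in the cells that follow. This is a sketch of the arithmetic, not part of the original analysis: it assumes the standard definitions of Cramér’s V and of the variance inflation factor, and it uses the rounded values that Stata prints, so small discrepancies are to be expected.\n",
    "\n",
    "The cross-tabulation has $r = 4$ parental education categories, $c = 8$ father’s NS-SEC categories and $N = 17{,}716$ children, so\n",
    "\n",
    "$$\\mathrm{d.f.} = (r-1)(c-1) = 3 \\times 7 = 21, \\qquad V = \\sqrt{\\frac{\\chi^2}{N \\, \\min(r-1,\\, c-1)}}.$$\n",
    "\n",
    "Rearranging with the reported $V = 0.2968$ gives\n",
    "\n",
    "$$\\chi^2 \\approx V^2 \\times N \\times 3 = 0.2968^2 \\times 53{,}148 \\approx 4{,}682,$$\n",
    "\n",
    "which agrees with the value of $4.7 \\times 10^3$ that Stata displays.\n",
    "\n",
    "For the $j$-th term in the complete records model, the variance inflation factor is based on $R_j^2$, the proportion of variance in that term explained by the other explanatory variables:\n",
    "\n",
    "$$\\mathrm{VIF}_j = \\frac{1}{1 - R_j^2}, \\qquad \\overline{\\mathrm{VIF}} = \\frac{1}{12} \\sum_{j=1}^{12} \\mathrm{VIF}_j = \\frac{20.44}{12} \\approx 1.70.$$\n",
    "\n",
    "The largest individual value is 2.80 (fathers in the Routine class), which corresponds to $R_j^2 = 1 - 0.357 \\approx 0.64$."
   ]
  },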
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path3\\pooledNCDSBCS_v3.dta, clear\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path3\\pooledNCDSBCS_v3.dta, clear\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". tab parented dadnssec if(samplenssec==0), chi V r\n",
      "\n",
      "+----------------+\n",
      "| Key            |\n",
      "|----------------|\n",
      "|   frequency    |\n",
      "| row percentage |\n",
      "+----------------+\n",
      "\n",
      "  Parent's |\n",
      "   Highest |                                     Father's NSSEC\n",
      " Education | 1. Large   2. Higher  3. Lower   4. Interm  5. Small   6. Lower   7. Semi-R  8. Routin |     Total\n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "         1 |       210        141        831        747      1,529      2,099      2,219      3,350 |    11,126 \n",
      "           |      1.89       1.27       7.47       6.71      13.74      18.87      19.94      30.11 |    100.00 \n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "         2 |       321        360        980        718        592        774        589        562 |     4,896 \n",
      "           |      6.56       7.35      20.02      14.67      12.09      15.81      12.03      11.48 |    100.00 \n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "         3 |        47         78        146         82         62        102         54         42 |       613 \n",
      "           |      7.67      12.72      23.82      13.38      10.11      16.64       8.81       6.85 |    100.00 \n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "         4 |        89        351        383         88         47         63         40         20 |     1,081 \n",
      "           |      8.23      32.47      35.43       8.14       4.35       5.83       3.70       1.85 |    100.00 \n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "     Total |       667        930      2,340      1,635      2,230      3,038      2,902      3,974 |    17,716 \n",
      "           |      3.76       5.25      13.21       9.23      12.59      17.15      16.38      22.43 |    100.00 \n",
      "\n",
      "         Pearson chi2(21) =  4.7e+03   Pr = 0.000\n",
      "               CramÃ©r's V =   0.2968\n",
      "\n",
      ". kap parented dadnssec if(samplenssec==0)\n",
      "\n",
      "             Expected\n",
      "Agreement   Agreement     Kappa   Std. Err.         Z      Prob>Z\n",
      "-----------------------------------------------------------------\n",
      "   4.54%       4.84%    -0.0031     0.0013      -2.34      0.9904\n",
      "\n",
      ". pwcorr parented dadnssec if(samplenssec==0), sig\n",
      "\n",
      "             | parented dadnssec\n",
      "-------------+------------------\n",
      "    parented |   1.0000 \n",
      "             |\n",
      "             |\n",
      "    dadnssec |  -0.4370   1.0000 \n",
      "             |   0.0000\n",
      "             |\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "tab parented dadnssec if(samplenssec==0), chi V r\n",
    "kap parented dadnssec if(samplenssec==0)\n",
    "pwcorr parented dadnssec if(samplenssec==0), sig\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[5]<a class=\"anchor\" id=\"note5\"></a> Quasi-variance comparison intervals allow comparisons to be made between all categories whereas conventional confidence intervals are restricted to comparisons with the reference category (see Firth, 2003; Gayle and Lambert, 2007). The quasi-variance estimation approach is based on an approximation (see Firth, 2003).\n",
    "  \n",
    "[6]<a class=\"anchor\" id=\"note6\"></a> See also [here](http://blog.fperez.org/).\n"
   ]
  },
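  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the idea in note 5 concrete, the quasi-variance approximation can be sketched as follows. The notation ($\\hat{\\beta}_j$ for the estimate for category $j$, $q_j$ for its quasi-variance) is introduced here for illustration and is not taken from the article. Quasi-variances are chosen so that, for every pair of categories $j \\neq k$,\n",
    "\n",
    "$$\\widehat{\\mathrm{var}}(\\hat{\\beta}_j - \\hat{\\beta}_k) = \\widehat{\\mathrm{var}}(\\hat{\\beta}_j) + \\widehat{\\mathrm{var}}(\\hat{\\beta}_k) - 2\\,\\widehat{\\mathrm{cov}}(\\hat{\\beta}_j, \\hat{\\beta}_k) \\approx q_j + q_k.$$\n",
    "\n",
    "An approximate 95% comparison interval for category $j$, including the reference category, is then\n",
    "\n",
    "$$\\hat{\\beta}_j \\pm 1.96 \\sqrt{q_j}.$$\n",
    "\n",
    "Because the covariance term is absorbed into the $q_j$, any two categories can be compared without access to the full variance-covariance matrix, at the cost of the approximation error mentioned in note 5."
   ]
  },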
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Supplementary Materials <a class=\"anchor\" id=\"supplement\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Complete Records Models"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". use $path3\\pooledNCDSBCS_v3.dta, clear\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "use $path3\\pooledNCDSBCS_v3.dta, clear\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". regress ability male i.parented ib4.dadnssec i.cohort if(samplenssec==0), allbaselevels\n",
      "\n",
      "      Source |       SS           df       MS      Number of obs   =    17,716\n",
      "-------------+----------------------------------   F(12, 17703)    =    226.92\n",
      "       Model |  512719.985        12  42726.6654   Prob > F        =    0.0000\n",
      "    Residual |  3333217.85    17,703   188.28548   R-squared       =    0.1333\n",
      "-------------+----------------------------------   Adj R-squared   =    0.1327\n",
      "       Total |  3845937.84    17,715   217.10064   Root MSE        =    13.722\n",
      "\n",
      "-----------------------------------------------------------------------------------------------------------\n",
      "                                  ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "------------------------------------------+----------------------------------------------------------------\n",
      "                                     male |  -.6621118   .2063506    -3.21   0.001    -1.066579   -.2576445\n",
      "                                          |\n",
      "                                 parented |\n",
      "                                       1  |          0  (base)\n",
      "                                       2  |   5.629584   .2533412    22.22   0.000      5.13301    6.126158\n",
      "                                       3  |   7.873391   .5870722    13.41   0.000     6.722672     9.02411\n",
      "                                       4  |     10.239   .4896922    20.91   0.000     9.279153    11.19884\n",
      "                                          |\n",
      "                                 dadnssec |\n",
      "1. Large Employers and Higher Managerial  |   1.564068   .6322793     2.47   0.013     .3247385    2.803397\n",
      "                  2. Higher Professional  |   2.076091   .5853141     3.55   0.000     .9288175    3.223364\n",
      "    3. Lower managerial and professional  |   1.152207   .4456866     2.59   0.010     .2786179    2.025797\n",
      "                         4. Intermediate  |          0  (base)\n",
      "      5. Small employers and own account  |  -3.436122   .4502615    -7.63   0.000    -4.318679   -2.553566\n",
      "      6. Lower Supervisory and Technical  |  -3.255671   .4248892    -7.66   0.000    -4.088495   -2.422847\n",
      "                         7. Semi-Routine  |  -4.664142   .4306101   -10.83   0.000     -5.50818   -3.820104\n",
      "                              8. Routine  |    -6.9763   .4134982   -16.87   0.000    -7.786797   -6.165803\n",
      "                                          |\n",
      "                                   cohort |\n",
      "                                 1. NCDS  |          0  (base)\n",
      "                                  2. BCS  |  -2.004158   .2129956    -9.41   0.000     -2.42165   -1.586666\n",
      "                                          |\n",
      "                                    _cons |   102.6546   .3858041   266.08   0.000     101.8984    103.4108\n",
      "-----------------------------------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "regress ability male i.parented ib4.dadnssec i.cohort if(samplenssec==0), allbaselevels\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". vif\n",
      "\n",
      "    Variable |       VIF       1/VIF  \n",
      "-------------+----------------------\n",
      "        male |      1.00    0.999011\n",
      "    parented |\n",
      "          2  |      1.21    0.828021\n",
      "          3  |      1.08    0.923138\n",
      "          4  |      1.29    0.773549\n",
      "    dadnssec |\n",
      "          1  |      1.36    0.733736\n",
      "          2  |      1.60    0.623698\n",
      "          3  |      2.14    0.466728\n",
      "          5  |      2.10    0.476441\n",
      "          6  |      2.41    0.414359\n",
      "          7  |      2.39    0.418451\n",
      "          8  |      2.80    0.357238\n",
      "    2.cohort |      1.06    0.943997\n",
      "-------------+----------------------\n",
      "    Mean VIF |      1.70\n",
      "\n",
      ". \n",
      ". testparm ib4.dadnssec##i.cohort\n",
      "\n",
      " ( 1)  1.dadnssec = 0\n",
      " ( 2)  2.dadnssec = 0\n",
      " ( 3)  3.dadnssec = 0\n",
      " ( 4)  5.dadnssec = 0\n",
      " ( 5)  6.dadnssec = 0\n",
      " ( 6)  7.dadnssec = 0\n",
      " ( 7)  8.dadnssec = 0\n",
      " ( 8)  2.cohort = 0\n",
      "\n",
      "       F(  8, 17703) =   99.17\n",
      "            Prob > F =    0.0000\n",
      "\n",
      ". \n",
      ". predict r, resid\n",
      "(16,267 missing values generated)\n",
      "\n",
      ". kdensity r, normal\n",
      "\n",
      ". pnorm r\n",
      "\n",
      ". qnorm r\n",
      "\n",
      ". \n",
      ". rvfplot, yline(0)\n",
      "\n",
      ". estat imtest\n",
      "\n",
      "Cameron & Trivedi's decomposition of IM-test\n",
      "\n",
      "---------------------------------------------------\n",
      "              Source |       chi2     df      p\n",
      "---------------------+-----------------------------\n",
      "  Heteroskedasticity |     158.55     54    0.0000\n",
      "            Skewness |     153.80     12    0.0000\n",
      "            Kurtosis |      51.61      1    0.0000\n",
      "---------------------+-----------------------------\n",
      "               Total |     363.96     67    0.0000\n",
      "---------------------------------------------------\n",
      "\n",
      ". estat hettest\n",
      "\n",
      "Breusch-Pagan / Cook-Weisberg test for heteroskedasticity \n",
      "         Ho: Constant variance\n",
      "         Variables: fitted values of ability\n",
      "\n",
      "         chi2(1)      =    41.27\n",
      "         Prob > chi2  =   0.0000\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
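    "* variance inflation factors: check for multicollinearity among the predictors\n",
    "vif\n",
    "\n",
    "* joint Wald test of the class and cohort terms (the fitted model has no\n",
    "* interaction terms, so only the main effects are tested)\n",
    "testparm ib4.dadnssec##i.cohort\n",
    "\n",
    "* residuals: kernel density, P-P and Q-Q plots against the normal distribution\n",
    "predict r, resid\n",
    "kdensity r, normal\n",
    "pnorm r\n",
    "qnorm r\n",
    "\n",
    "* residual-versus-fitted plot, then formal tests of heteroskedasticity\n",
    "rvfplot, yline(0)\n",
    "estat imtest\n",
    "estat hettest\n",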
    "\n",
    "* return to jupyter"
   ]
  },
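  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Both `estat imtest` and `estat hettest` in the cell above reject constant error variance. One common response, shown here only as a hedged sketch and not as part of the reported Table S4 models, is to re-estimate the same specification with heteroskedasticity-robust standard errors using the `vce(robust)` option of `regress`.\n",
    "\n",
    "```stata\n",
    "* Sketch only, not one of the reported models: same specification as the\n",
    "* regression above, with robust (sandwich) standard errors\n",
    "regress ability male i.parented ib4.dadnssec i.cohort if samplenssec==0, vce(robust) allbaselevels\n",
    "```\n",
    "\n",
    "The point estimates would be unchanged; only the standard errors, t statistics and confidence intervals would differ. The weighted models later in this notebook report robust standard errors by default because they use `pweight`."
   ]
  },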
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *TABLE S4 MODEL 1\n",
      "\n",
      ". regress ability male i.parented  ib4.dadnssec i.cohort if(samplenssec==0), allbaselevels\n",
      "\n",
      "      Source |       SS           df       MS      Number of obs   =    17,716\n",
      "-------------+----------------------------------   F(12, 17703)    =    226.92\n",
      "       Model |  512719.985        12  42726.6654   Prob > F        =    0.0000\n",
      "    Residual |  3333217.85    17,703   188.28548   R-squared       =    0.1333\n",
      "-------------+----------------------------------   Adj R-squared   =    0.1327\n",
      "       Total |  3845937.84    17,715   217.10064   Root MSE        =    13.722\n",
      "\n",
      "-----------------------------------------------------------------------------------------------------------\n",
      "                                  ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "------------------------------------------+----------------------------------------------------------------\n",
      "                                     male |  -.6621118   .2063506    -3.21   0.001    -1.066579   -.2576445\n",
      "                                          |\n",
      "                                 parented |\n",
      "                                       1  |          0  (base)\n",
      "                                       2  |   5.629584   .2533412    22.22   0.000      5.13301    6.126158\n",
      "                                       3  |   7.873391   .5870722    13.41   0.000     6.722672     9.02411\n",
      "                                       4  |     10.239   .4896922    20.91   0.000     9.279153    11.19884\n",
      "                                          |\n",
      "                                 dadnssec |\n",
      "1. Large Employers and Higher Managerial  |   1.564068   .6322793     2.47   0.013     .3247385    2.803397\n",
      "                  2. Higher Professional  |   2.076091   .5853141     3.55   0.000     .9288175    3.223364\n",
      "    3. Lower managerial and professional  |   1.152207   .4456866     2.59   0.010     .2786179    2.025797\n",
      "                         4. Intermediate  |          0  (base)\n",
      "      5. Small employers and own account  |  -3.436122   .4502615    -7.63   0.000    -4.318679   -2.553566\n",
      "      6. Lower Supervisory and Technical  |  -3.255671   .4248892    -7.66   0.000    -4.088495   -2.422847\n",
      "                         7. Semi-Routine  |  -4.664142   .4306101   -10.83   0.000     -5.50818   -3.820104\n",
      "                              8. Routine  |    -6.9763   .4134982   -16.87   0.000    -7.786797   -6.165803\n",
      "                                          |\n",
      "                                   cohort |\n",
      "                                 1. NCDS  |          0  (base)\n",
      "                                  2. BCS  |  -2.004158   .2129956    -9.41   0.000     -2.42165   -1.586666\n",
      "                                          |\n",
      "                                    _cons |   102.6546   .3858041   266.08   0.000     101.8984    103.4108\n",
      "-----------------------------------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*TABLE S4 MODEL 1\n",
    "regress ability male i.parented  ib4.dadnssec i.cohort if(samplenssec==0), allbaselevels\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". est sto m1\n",
      "\n",
      ". fitstat, s(m1)\n",
      "\n",
      "Measures of Fit for regress of ability\n",
      "\n",
      "Log-Lik Intercept Only:   -72796.653     Log-Lik Full Model:       -71529.256\n",
      "D(17700):                 143058.513     LR(12):                     2534.793\n",
      "                                         Prob > LR:                     0.000\n",
      "R2:                            0.133     Adjusted R2:                   0.133\n",
      "AIC:                           8.077     AIC*n:                    143090.513\n",
      "BIC:                      -30086.843     BIC':                      -2417.407\n",
      "\n",
      "(Indices saved in matrix fs_m1)\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "est sto m1\n",
    "fitstat, s(m1)\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *TABLE S4 Model 2\n",
      "\n",
      ". regress ability male i.parented ib7.nsinteraction if(samplenssec==0), allbaselevels\n",
      "\n",
      "      Source |       SS           df       MS      Number of obs   =    17,716\n",
      "-------------+----------------------------------   F(19, 17696)    =    143.71\n",
      "       Model |  514115.506        19  27058.7109   Prob > F        =    0.0000\n",
      "    Residual |  3331822.33    17,696  188.281099   R-squared       =    0.1337\n",
      "-------------+----------------------------------   Adj R-squared   =    0.1327\n",
      "       Total |  3845937.84    17,715   217.10064   Root MSE        =    13.722\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.6695247   .2064042    -3.24   0.001    -1.074097   -.2649522\n",
      "              |\n",
      "     parented |\n",
      "           1  |          0  (base)\n",
      "           2  |   5.635049   .2536317    22.22   0.000     5.137906    6.132192\n",
      "           3  |   7.903272   .5880555    13.44   0.000     6.750625    9.055918\n",
      "           4  |   10.27633   .4906155    20.95   0.000     9.314679    11.23799\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |      2.409   .9207012     2.62   0.009     .6043356    4.213665\n",
      "     BCS 1.1  |  -1.037565   .8526089    -1.22   0.224    -2.708762    .6336323\n",
      "    NCDS 1.2  |    1.43215   .8092031     1.77   0.077    -.1539669    3.018268\n",
      "     BCS 1.2  |   .7326651   .8009008     0.91   0.360    -.8371791    2.302509\n",
      "      NCDS 2  |   1.509022   .6158601     2.45   0.014     .3018761    2.716168\n",
      "       BCS 2  |  -1.101743   .6127955    -1.80   0.072    -2.302883    .0993958\n",
      "      NCDS 3  |          0  (base)\n",
      "       BCS 3  |   -1.90088   .6853285    -2.77   0.006    -3.244191   -.5575688\n",
      "      NCDS 4  |   -3.40209   .6086739    -5.59   0.000    -4.595151    -2.20903\n",
      "       BCS 4  |  -5.373346   .6258719    -8.59   0.000    -6.600116   -4.146576\n",
      "      NCDS 5  |  -3.033046   .5757856    -5.27   0.000    -4.161642    -1.90445\n",
      "       BCS 5  |   -5.40104   .5834729    -9.26   0.000    -6.544704   -4.257376\n",
      "      NCDS 6  |  -4.571711   .5696539    -8.03   0.000    -5.688288   -3.455133\n",
      "       BCS 6  |  -6.679397    .607365   -11.00   0.000    -7.869892   -5.488902\n",
      "      NCDS 7  |  -7.164837     .54458   -13.16   0.000    -8.232268   -6.097407\n",
      "       BCS 7  |  -8.580919    .573249   -14.97   0.000    -9.704543   -7.457294\n",
      "              |\n",
      "        _cons |   102.6061   .4826751   212.58   0.000       101.66    103.5522\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*TABLE S4 Model 2\n",
    "regress ability male i.parented ib7.nsinteraction if(samplenssec==0), allbaselevels\n",
    "\n",
    "* return to jupyter"
   ]
  },
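  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Model 2 captures the class-by-cohort interaction through the single 16-category factor `nsinteraction`. As a hedged sketch, and assuming that `dadnssec` and `cohort` carry exactly the categories that were combined to build `nsinteraction`, a factorial specification should reproduce the same fit while allowing the interaction terms to be tested on their own. This is not the specification reported in Table S4.\n",
    "\n",
    "```stata\n",
    "* Sketch only: factorial form of Model 2, with Intermediate (4) as the base class\n",
    "regress ability male i.parented ib4.dadnssec##i.cohort if samplenssec==0, allbaselevels\n",
    "\n",
    "* joint Wald test of the seven class-by-cohort interaction terms only\n",
    "testparm ib4.dadnssec#i.cohort\n",
    "```\n",
    "\n",
    "In this form the main effects of `dadnssec` describe the class gradient in the NCDS, and the interaction terms describe how that gradient differs in the BCS."
   ]
  },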
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". est sto m2\n",
      "\n",
      ". fitstat\n",
      "\n",
      "Measures of Fit for regress of ability\n",
      "\n",
      "Log-Lik Intercept Only:   -72796.653     Log-Lik Full Model:       -71525.547\n",
      "D(17694):                 143051.094     LR(19):                     2542.212\n",
      "                                         Prob > LR:                     0.000\n",
      "R2:                            0.134     Adjusted R2:                   0.133\n",
      "AIC:                           8.077     AIC*n:                    143095.094\n",
      "BIC:                      -30035.568     BIC':                      -2356.350\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "est sto m2\n",
    "fitstat\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". fitstat, using(m1)\n",
      "\n",
      "Measures of Fit for regress of ability\n",
      "\n",
      "                             Current            Saved       Difference\n",
      "Model:                       regress          regress\n",
      "N:                             17716            17716                0\n",
      "Log-Lik Intercept Only:   -72796.653       -72796.653            0.000\n",
      "Log-Lik Full Model:       -71525.547       -71529.256            3.709\n",
      "D:                        143051.094(17694) 143058.513(17700)   -7.419(-6)\n",
      "LR:                         2542.212(19)     2534.793(12)        7.419(7)\n",
      "Prob > LR:                     0.000            0.000            0.000\n",
      "R2:                            0.134            0.133            0.000\n",
      "Adjusted R2:                   0.133            0.133            0.000\n",
      "AIC:                           8.077            8.077            0.000\n",
      "AIC*n:                    143095.094       143090.513            4.581\n",
      "BIC:                      -30035.568       -30086.843           51.275\n",
      "BIC':                      -2356.350        -2417.407           61.057\n",
      "\n",
      "Difference of   61.057 in BIC' provides very strong support for saved model.\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "fitstat, using(m1)\n",
    "\n",
    "* return to jupyter"
   ]
  },
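  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The BIC' values in the comparison above can be checked by hand, which may help when reading the `fitstat` table. The sketch below assumes the definition BIC' = -LR + df * ln(N), with LR, df and N copied from the two `fitstat` outputs, and treats Model 1 as nested in Model 2.\n",
    "\n",
    "```stata\n",
    "* Model 1: LR(12) = 2534.793, N = 17716\n",
    "display -2534.793 + 12*ln(17716)\n",
    "\n",
    "* Model 2: LR(19) = 2542.212, N = 17716\n",
    "display -2542.212 + 19*ln(17716)\n",
    "\n",
    "* likelihood-ratio comparison of the two models: 7.419 on 7 degrees of freedom\n",
    "display chi2tail(7, 7.419)\n",
    "```\n",
    "\n",
    "The first two lines should return values close to the reported -2417.4 and -2356.4. The gap of about 61 favours Model 1 because the seven additional parameters in Model 2 raise the log-likelihood by only about 3.7."
   ]
  },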
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". esttab m1 m2, replace cells(b(star fmt(2) label(Coef.)) se(par fmt(2))) stats(N) unstack\n",
      "\n",
      "--------------------------------------------\n",
      "                      (1)             (2)   \n",
      "                  ability         ability   \n",
      "                 Coef./se        Coef./se   \n",
      "--------------------------------------------\n",
      "male                -0.66**         -0.67** \n",
      "                   (0.21)          (0.21)   \n",
      "1.parented           0.00            0.00   \n",
      "                      (.)             (.)   \n",
      "2.parented           5.63***         5.64***\n",
      "                   (0.25)          (0.25)   \n",
      "3.parented           7.87***         7.90***\n",
      "                   (0.59)          (0.59)   \n",
      "4.parented          10.24***        10.28***\n",
      "                   (0.49)          (0.49)   \n",
      "1.dadnssec           1.56*                  \n",
      "                   (0.63)                   \n",
      "2.dadnssec           2.08***                \n",
      "                   (0.59)                   \n",
      "3.dadnssec           1.15**                 \n",
      "                   (0.45)                   \n",
      "4.dadnssec           0.00                   \n",
      "                      (.)                   \n",
      "5.dadnssec          -3.44***                \n",
      "                   (0.45)                   \n",
      "6.dadnssec          -3.26***                \n",
      "                   (0.42)                   \n",
      "7.dadnssec          -4.66***                \n",
      "                   (0.43)                   \n",
      "8.dadnssec          -6.98***                \n",
      "                   (0.41)                   \n",
      "1.cohort             0.00                   \n",
      "                      (.)                   \n",
      "2.cohort            -2.00***                \n",
      "                   (0.21)                   \n",
      "1.nsintera~n                         2.41** \n",
      "                                   (0.92)   \n",
      "2.nsintera~n                        -1.04   \n",
      "                                   (0.85)   \n",
      "3.nsintera~n                         1.43   \n",
      "                                   (0.81)   \n",
      "4.nsintera~n                         0.73   \n",
      "                                   (0.80)   \n",
      "5.nsintera~n                         1.51*  \n",
      "                                   (0.62)   \n",
      "6.nsintera~n                        -1.10   \n",
      "                                   (0.61)   \n",
      "7.nsintera~n                         0.00   \n",
      "                                      (.)   \n",
      "8.nsintera~n                        -1.90** \n",
      "                                   (0.69)   \n",
      "9.nsintera~n                        -3.40***\n",
      "                                   (0.61)   \n",
      "10.nsinter~n                        -5.37***\n",
      "                                   (0.63)   \n",
      "11.nsinter~n                        -3.03***\n",
      "                                   (0.58)   \n",
      "12.nsinter~n                        -5.40***\n",
      "                                   (0.58)   \n",
      "13.nsinter~n                        -4.57***\n",
      "                                   (0.57)   \n",
      "14.nsinter~n                        -6.68***\n",
      "                                   (0.61)   \n",
      "15.nsinter~n                        -7.16***\n",
      "                                   (0.54)   \n",
      "16.nsinter~n                        -8.58***\n",
      "                                   (0.57)   \n",
      "_cons              102.65***       102.61***\n",
      "                   (0.39)          (0.48)   \n",
      "--------------------------------------------\n",
      "N                17716.00        17716.00   \n",
      "--------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "esttab m1 m2, replace cells(b(star fmt(2) label(Coef.)) se(par fmt(2))) stats(N) unstack\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://i.imgur.com/5AzSe13.png\" alt=\"Table S4\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Models with inverse probability weighting (IPW) only and with multiple imputation (MI) only"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Table S5 Model 1\n",
      "\n",
      ". \n",
      ". use $path3\\pooledNCDSBCS_v3.dta, clear\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Table S5 Model 1\n",
    "\n",
    "use $path3\\pooledNCDSBCS_v3.dta, clear\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". regress ability male i.parented  ib2.dadnssec i.cohort if(samplenssec==0) [pweight=ipw], allbaselevels\n",
      "(sum of wgt is   2.1096e+04)\n",
      "\n",
      "Linear regression                               Number of obs     =     17,716\n",
      "                                                F(12, 17703)      =     248.97\n",
      "                                                Prob > F          =     0.0000\n",
      "                                                R-squared         =     0.1338\n",
      "                                                Root MSE          =     13.723\n",
      "\n",
      "-----------------------------------------------------------------------------------------------------------\n",
      "                                          |               Robust\n",
      "                                  ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "------------------------------------------+----------------------------------------------------------------\n",
      "                                     male |  -.6477882   .2063705    -3.14   0.002    -1.052295   -.2432817\n",
      "                                          |\n",
      "                                 parented |\n",
      "                                       1  |          0  (base)\n",
      "                                       2  |   5.651065   .2551178    22.15   0.000      5.15101    6.151121\n",
      "                                       3  |   7.930292    .576153    13.76   0.000     6.800976    9.059608\n",
      "                                       4  |   10.26015   .4696664    21.85   0.000      9.33956    11.18074\n",
      "                                          |\n",
      "                                 dadnssec |\n",
      "1. Large Employers and Higher Managerial  |  -.4656737   .6546943    -0.71   0.477    -1.748939    .8175914\n",
      "                  2. Higher Professional  |          0  (base)\n",
      "    3. Lower managerial and professional  |  -.9255402    .492401    -1.88   0.060    -1.890694    .0396141\n",
      "                         4. Intermediate  |  -2.029144   .5376404    -3.77   0.000    -3.082972    -.975316\n",
      "      5. Small employers and own account  |  -5.509167   .5327768   -10.34   0.000    -6.553462   -4.464873\n",
      "      6. Lower Supervisory and Technical  |  -5.326667   .5091641   -10.46   0.000    -6.324678   -4.328655\n",
      "                         7. Semi-Routine  |  -6.735277   .5200197   -12.95   0.000    -7.754566   -5.715987\n",
      "                              8. Routine  |  -9.037007   .5062525   -17.85   0.000    -10.02931   -8.044703\n",
      "                                          |\n",
      "                                   cohort |\n",
      "                                 1. NCDS  |          0  (base)\n",
      "                                  2. BCS  |  -2.029992   .2115781    -9.59   0.000    -2.444706   -1.615279\n",
      "                                          |\n",
      "                                    _cons |   104.7123   .4818961   217.29   0.000     103.7678    105.6569\n",
      "-----------------------------------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
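    "* note: the reference class here is Higher Professional (2), whereas the Table S4 models use Intermediate (4)\n",
    "regress ability male i.parented  ib2.dadnssec i.cohort if(samplenssec==0) [pweight=ipw], allbaselevels\n",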
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". regress ability male i.parented ib7.nsinteraction if(samplenssec==0) [pweight=ipw], allbaselevels\n",
      "(sum of wgt is   2.1096e+04)\n",
      "\n",
      "Linear regression                               Number of obs     =     17,716\n",
      "                                                F(19, 17696)      =     157.98\n",
      "                                                Prob > F          =     0.0000\n",
      "                                                R-squared         =     0.1341\n",
      "                                                Root MSE          =     13.723\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "              |               Robust\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.6557797   .2063706    -3.18   0.001    -1.060286   -.2512732\n",
      "              |\n",
      "     parented |\n",
      "           1  |          0  (base)\n",
      "           2  |    5.65727   .2551589    22.17   0.000     5.157134    6.157406\n",
      "           3  |   7.955523    .577343    13.78   0.000     6.823874    9.087172\n",
      "           4  |   10.29457   .4706686    21.87   0.000     9.372012    11.21712\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |   2.394681   .8726179     2.74   0.006     .6842641    4.105097\n",
      "     BCS 1.1  |  -.9899849   .8255089    -1.20   0.230    -2.608063    .6280935\n",
      "    NCDS 1.2  |   1.423045   .7383901     1.93   0.054    -.0242719    2.870362\n",
      "     BCS 1.2  |   .6962341   .7361712     0.95   0.344    -.7467336    2.139202\n",
      "      NCDS 2  |   1.497368   .5844982     2.56   0.010     .3516939    2.643041\n",
      "       BCS 2  |  -1.139992     .59768    -1.91   0.056    -2.311504    .0315194\n",
      "      NCDS 3  |          0  (base)\n",
      "       BCS 3  |  -1.845178   .6617521    -2.79   0.005    -3.142277    -.548079\n",
      "      NCDS 4  |  -3.411276   .6080687    -5.61   0.000    -4.603151   -2.219402\n",
      "       BCS 4  |  -5.404264   .6235907    -8.67   0.000    -6.626563   -4.181965\n",
      "      NCDS 5  |  -3.030302    .575939    -5.26   0.000    -4.159199   -1.901405\n",
      "       BCS 5  |   -5.44063   .5753243    -9.46   0.000    -6.568322   -4.312938\n",
      "      NCDS 6  |  -4.557855   .5661954    -8.05   0.000    -5.667654   -3.448057\n",
      "       BCS 6  |  -6.737881   .6089291   -11.07   0.000    -7.931442    -5.54432\n",
      "      NCDS 7  |  -7.159155   .5433949   -13.17   0.000    -8.224262   -6.094047\n",
      "       BCS 7  |  -8.603699   .5704641   -15.08   0.000    -9.721864   -7.485533\n",
      "              |\n",
      "        _cons |   102.5984    .474524   216.21   0.000     101.6683    103.5285\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "regress ability male i.parented ib7.nsinteraction if(samplenssec==0) [pweight=ipw], allbaselevels\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". est sto m3\n",
      "\n",
      ". fitstat\n",
      "\n",
      "Measures of Fit for regress of ability\n",
      "\n",
      "Log-Lik Intercept Only:   -72802.894     Log-Lik Full Model:       -71527.330\n",
      "D(17694):                 143054.660     LR(19):                     2551.128\n",
      "                                         Prob > LR:                     0.000\n",
      "R2:                            0.134     Adjusted R2:                   0.133\n",
      "AIC:                           8.077     AIC*n:                    143098.660\n",
      "BIC:                      -30032.002     BIC':                      -2365.266\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "est sto m3\n",
    "fitstat\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". testparm ib7.nsinteraction\n",
      "\n",
      " ( 1)  1.nsinteraction = 0\n",
      " ( 2)  2.nsinteraction = 0\n",
      " ( 3)  3.nsinteraction = 0\n",
      " ( 4)  4.nsinteraction = 0\n",
      " ( 5)  5.nsinteraction = 0\n",
      " ( 6)  6.nsinteraction = 0\n",
      " ( 7)  8.nsinteraction = 0\n",
      " ( 8)  9.nsinteraction = 0\n",
      " ( 9)  10.nsinteraction = 0\n",
      " (10)  11.nsinteraction = 0\n",
      " (11)  12.nsinteraction = 0\n",
      " (12)  13.nsinteraction = 0\n",
      " (13)  14.nsinteraction = 0\n",
      " (14)  15.nsinteraction = 0\n",
      " (15)  16.nsinteraction = 0\n",
      "\n",
      "       F( 15, 17696) =   54.19\n",
      "            Prob > F =    0.0000\n",
      "\n",
      ". esttab m3, replace cells(b(star fmt(2) label(Coef.)) se(par fmt(2))) stats(N) unstack\n",
      "\n",
      "----------------------------\n",
      "                      (1)   \n",
      "                  ability   \n",
      "                 Coef./se   \n",
      "----------------------------\n",
      "male                -0.66** \n",
      "                   (0.21)   \n",
      "1.parented           0.00   \n",
      "                      (.)   \n",
      "2.parented           5.66***\n",
      "                   (0.26)   \n",
      "3.parented           7.96***\n",
      "                   (0.58)   \n",
      "4.parented          10.29***\n",
      "                   (0.47)   \n",
      "1.nsintera~n         2.39** \n",
      "                   (0.87)   \n",
      "2.nsintera~n        -0.99   \n",
      "                   (0.83)   \n",
      "3.nsintera~n         1.42   \n",
      "                   (0.74)   \n",
      "4.nsintera~n         0.70   \n",
      "                   (0.74)   \n",
      "5.nsintera~n         1.50*  \n",
      "                   (0.58)   \n",
      "6.nsintera~n        -1.14   \n",
      "                   (0.60)   \n",
      "7.nsintera~n         0.00   \n",
      "                      (.)   \n",
      "8.nsintera~n        -1.85** \n",
      "                   (0.66)   \n",
      "9.nsintera~n        -3.41***\n",
      "                   (0.61)   \n",
      "10.nsinter~n        -5.40***\n",
      "                   (0.62)   \n",
      "11.nsinter~n        -3.03***\n",
      "                   (0.58)   \n",
      "12.nsinter~n        -5.44***\n",
      "                   (0.58)   \n",
      "13.nsinter~n        -4.56***\n",
      "                   (0.57)   \n",
      "14.nsinter~n        -6.74***\n",
      "                   (0.61)   \n",
      "15.nsinter~n        -7.16***\n",
      "                   (0.54)   \n",
      "16.nsinter~n        -8.60***\n",
      "                   (0.57)   \n",
      "_cons              102.60***\n",
      "                   (0.47)   \n",
      "----------------------------\n",
      "N                17716.00   \n",
      "----------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "testparm ib7.nsinteraction\n",
    "esttab m3, replace cells(b(star fmt(2) label(Coef.)) se(par fmt(2))) stats(N) unstack\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *Table S5 Model 2\n",
      "\n",
      ". \n",
      ". use $path3\\pooledNCDSBCS_v3_imputed.dta, clear\n",
      "\n",
      ". \n",
      ". set seed 1485\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*Table S5 Model 2\n",
    "\n",
    "use $path3\\pooledNCDSBCS_v3_imputed.dta, clear\n",
    "\n",
    "set seed 1485\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". *drop observations who had died by age 10/11 sweep\n",
      "\n",
      ". tab deadtestoutcome, mi\n",
      "\n",
      "Dead at age |\n",
      "      10/11 |\n",
      "     survey |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "      0. No |    925,948       91.36       91.36\n",
      "     1. Yes |     87,535        8.64      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |  1,013,483      100.00\n",
      "\n",
      ". summ ability if (deadtestoutcome==1)\n",
      "\n",
      "    Variable |        Obs        Mean    Std. Dev.       Min        Max\n",
      "-------------+---------------------------------------------------------\n",
      "     ability |     86,100    99.38687    15.00202   28.07528   160.2226\n",
      "\n",
      ". drop if deadtestoutcome==1\n",
      "(87,535 observations deleted)\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "*drop observations who had died by age 10/11 sweep\n",
    "tab deadtestoutcome, mi\n",
    "summ ability if (deadtestoutcome==1)\n",
    "drop if deadtestoutcome==1\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". * Create the interaction\n",
      "\n",
      ". capture drop nsinteraction\n",
      "\n",
      ". gen nsinteraction = .\n",
      "(925,948 missing values generated)\n",
      "\n",
      ". replace nsinteraction = 1 if ((dadnssec==1)&(cohort==1))\n",
      "(13,535 real changes made)\n",
      "\n",
      ". replace nsinteraction = 2 if ((dadnssec==1)&(cohort==2))\n",
      "(22,498 real changes made)\n",
      "\n",
      ". replace nsinteraction = 3 if ((dadnssec==2)&(cohort==1))\n",
      "(18,391 real changes made)\n",
      "\n",
      ". replace nsinteraction = 4 if ((dadnssec==2)&(cohort==2))\n",
      "(27,622 real changes made)\n",
      "\n",
      ". replace nsinteraction = 5 if ((dadnssec==3)&(cohort==1))\n",
      "(47,752 real changes made)\n",
      "\n",
      ". replace nsinteraction = 6 if ((dadnssec==3)&(cohort==2))\n",
      "(69,934 real changes made)\n",
      "\n",
      ". replace nsinteraction = 7 if ((dadnssec==4)&(cohort==1))\n",
      "(39,097 real changes made)\n",
      "\n",
      ". replace nsinteraction = 8 if ((dadnssec==4)&(cohort==2))\n",
      "(39,944 real changes made)\n",
      "\n",
      ". replace nsinteraction = 9 if ((dadnssec==5)&(cohort==1))\n",
      "(51,654 real changes made)\n",
      "\n",
      ". replace nsinteraction = 10 if ((dadnssec==5)&(cohort==2))\n",
      "(66,476 real changes made)\n",
      "\n",
      ". replace nsinteraction = 11 if ((dadnssec==6)&(cohort==1))\n",
      "(70,532 real changes made)\n",
      "\n",
      ". replace nsinteraction = 12 if ((dadnssec==6)&(cohort==2))\n",
      "(79,389 real changes made)\n",
      "\n",
      ". replace nsinteraction = 13 if ((dadnssec==7)&(cohort==1))\n",
      "(77,046 real changes made)\n",
      "\n",
      ". replace nsinteraction = 14 if ((dadnssec==7)&(cohort==2))\n",
      "(73,037 real changes made)\n",
      "\n",
      ". replace nsinteraction = 15 if ((dadnssec==8)&(cohort==1))\n",
      "(110,371 real changes made)\n",
      "\n",
      ". replace nsinteraction = 16 if ((dadnssec==8)&(cohort==2))\n",
      "(107,913 real changes made)\n",
      "\n",
      ". tab nsinteraction\n",
      "\n",
      "nsinteracti |\n",
      "         on |      Freq.     Percent        Cum.\n",
      "------------+-----------------------------------\n",
      "          1 |     13,535        1.48        1.48\n",
      "          2 |     22,498        2.46        3.94\n",
      "          3 |     18,391        2.01        5.95\n",
      "          4 |     27,622        3.02        8.96\n",
      "          5 |     47,752        5.22       14.18\n",
      "          6 |     69,934        7.64       21.82\n",
      "          7 |     39,097        4.27       26.10\n",
      "          8 |     39,944        4.36       30.46\n",
      "          9 |     51,654        5.64       36.10\n",
      "         10 |     66,476        7.26       43.37\n",
      "         11 |     70,532        7.71       51.08\n",
      "         12 |     79,389        8.67       59.75\n",
      "         13 |     77,046        8.42       68.17\n",
      "         14 |     73,037        7.98       76.15\n",
      "         15 |    110,371       12.06       88.21\n",
      "         16 |    107,913       11.79      100.00\n",
      "------------+-----------------------------------\n",
      "      Total |    915,191      100.00\n",
      "\n",
      ". label variable nsinteraction \"NSSEC Interaction\"\n",
      "\n",
      ". label define nsint 1 \"NCDS 1.1\" 2 \"BCS 1.1\" 3 \"NCDS 1.2\" 4 \"BCS 1.2\" 5 \"NCDS 2\" 6 \"BCS 2\" 7 \"NCDS 3\" 8 \"BCS 3\" 9 \"NCDS 4\" 10 \"BCS 4\" 11 \"N\n",
      "> CDS 5\" 12 \"BCS 5\" 13 \"NCDS 6\" 14 \"BCS 6\" 15 \"NCDS 7\" 16 \"BCS 7\", replace\n",
      "\n",
      ". label values nsinteraction nsint\n",
      "\n",
      ". \n",
      ". mi register passive nsinteraction\n",
      "(system variable _mi_id updated due to changed number of obs.)\n",
      "\n",
      ". \n",
      ". estimates clear\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "* Create the interaction\n",
    "capture drop nsinteraction\n",
    "gen nsinteraction = .\n",
    "replace nsinteraction = 1 if ((dadnssec==1)&(cohort==1))\n",
    "replace nsinteraction = 2 if ((dadnssec==1)&(cohort==2))\n",
    "replace nsinteraction = 3 if ((dadnssec==2)&(cohort==1))\n",
    "replace nsinteraction = 4 if ((dadnssec==2)&(cohort==2))\n",
    "replace nsinteraction = 5 if ((dadnssec==3)&(cohort==1))\n",
    "replace nsinteraction = 6 if ((dadnssec==3)&(cohort==2))\n",
    "replace nsinteraction = 7 if ((dadnssec==4)&(cohort==1))\n",
    "replace nsinteraction = 8 if ((dadnssec==4)&(cohort==2))\n",
    "replace nsinteraction = 9 if ((dadnssec==5)&(cohort==1))\n",
    "replace nsinteraction = 10 if ((dadnssec==5)&(cohort==2))\n",
    "replace nsinteraction = 11 if ((dadnssec==6)&(cohort==1))\n",
    "replace nsinteraction = 12 if ((dadnssec==6)&(cohort==2))\n",
    "replace nsinteraction = 13 if ((dadnssec==7)&(cohort==1))\n",
    "replace nsinteraction = 14 if ((dadnssec==7)&(cohort==2))\n",
    "replace nsinteraction = 15 if ((dadnssec==8)&(cohort==1))\n",
    "replace nsinteraction = 16 if ((dadnssec==8)&(cohort==2))\n",
    "tab nsinteraction\n",
    "label variable nsinteraction \"NSSEC Interaction\"\n",
    "label define nsint 1 \"NCDS 1.1\" 2 \"BCS 1.1\" 3 \"NCDS 1.2\" 4 \"BCS 1.2\" 5 \"NCDS 2\" 6 \"BCS 2\" 7 \"NCDS 3\" 8 \"BCS 3\" 9 \"NCDS 4\" 10 \"BCS 4\" 11 \"NCDS 5\" 12 \"BCS 5\" 13 \"NCDS 6\" 14 \"BCS 6\" 15 \"NCDS 7\" 16 \"BCS 7\", replace\n",
    "label values nsinteraction nsint\n",
    "\n",
    "mi register passive nsinteraction\n",
    "\n",
    "estimates clear\n",
    "\n",
    "* return to jupyter"
   ]
  },
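  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The sixteen `replace` statements in the cell above map each combination of `dadnssec` (1 to 8) and `cohort` (1 = NCDS, 2 = BCS) to a single code from 1 to 16. As a minimal sketch, and not part of the original analysis, the same coding can be written as one arithmetic expression. The variable name `nsinteraction_alt` is hypothetical and is used only for this check; the sketch assumes `dadnssec` and `cohort` take only the values listed above or are missing.\n",
    "\n",
    "```stata\n",
    "* Hypothetical cross-check of the 16-category interaction coding.\n",
    "* Assumes dadnssec is coded 1-8 and cohort is coded 1 (NCDS) or 2 (BCS).\n",
    "capture drop nsinteraction_alt\n",
    "gen nsinteraction_alt = 2*(dadnssec - 1) + cohort ///\n",
    "    if inrange(dadnssec, 1, 8) & inrange(cohort, 1, 2)\n",
    "\n",
    "* The two codings should agree on every row, including rows where both are missing.\n",
    "assert nsinteraction_alt == nsinteraction\n",
    "\n",
    "* Drop the temporary variable so that it does not remain in the mi dataset.\n",
    "drop nsinteraction_alt\n",
    "```\n",
    "\n",
    "If the `assert` passes silently, the explicit `replace` statements and the arithmetic coding are equivalent for these data. The explicit version is kept in the analysis because it makes the mapping to the value labels easy to read."
   ]
  },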
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". * TABLE S5: Model 2\n",
      "\n",
      ". estimates clear\n",
      "\n",
      ". \n",
      ". mi estimate, post: regress ability male i.parented ib7.nsinteraction, allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     32,548\n",
      "                                                Average RVI       =     0.6145\n",
      "                                                Largest FMI       =     0.4773\n",
      "                                                Complete DF       =      32528\n",
      "DF adjustment:   Small sample                   DF:     min       =     259.36\n",
      "                                                        avg       =     361.08\n",
      "                                                        max       =     675.68\n",
      "Model F test:       Equal FMI                   F(  19, 6063.8)   =     168.96\n",
      "Within VCE type:          OLS                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5792898   .1829449    -3.17   0.002    -.9384987   -.2200809\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.841913   .2323516    25.14   0.000     5.385475    6.298352\n",
      "           3  |   8.215579   .5555597    14.79   0.000     7.123194    9.307965\n",
      "           4  |   10.63947   .4667382    22.80   0.000     9.722092    11.55685\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |   2.907636   .9075401     3.20   0.001     1.123164    4.692108\n",
      "     BCS 1.1  |  -.6489383   .7901949    -0.82   0.412    -2.202167    .9042902\n",
      "    NCDS 1.2  |   1.863414   .8200751     2.27   0.024     .2504675     3.47636\n",
      "     BCS 1.2  |   .9660809   .8012698     1.21   0.229    -.6109012    2.543063\n",
      "      NCDS 2  |    1.71432   .6402835     2.68   0.008     .4542301    2.974411\n",
      "       BCS 2  |  -.8967187   .5858467    -1.53   0.127    -2.048396    .2549588\n",
      "       BCS 3  |  -1.642699   .6991959    -2.35   0.019    -3.018973   -.2664246\n",
      "      NCDS 4  |  -3.264382   .6328476    -5.16   0.000    -4.509892   -2.018872\n",
      "       BCS 4  |  -5.416053   .6137982    -8.82   0.000    -6.623595   -4.208511\n",
      "      NCDS 5  |  -2.933258   .5881108    -4.99   0.000    -4.090345   -1.776171\n",
      "       BCS 5  |  -5.307132   .5814633    -9.13   0.000    -6.450962   -4.163302\n",
      "      NCDS 6  |  -4.485968   .6058989    -7.40   0.000    -5.679075    -3.29286\n",
      "       BCS 6  |  -6.726404   .5894738   -11.41   0.000    -7.885691   -5.567118\n",
      "      NCDS 7  |  -7.091002   .5762464   -12.31   0.000     -8.22565   -5.956353\n",
      "       BCS 7  |  -8.764152   .5622134   -15.59   0.000    -9.870086   -7.658219\n",
      "              |\n",
      "        _cons |   101.7585   .4936476   206.14   0.000     100.7872    102.7298\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "* TABLE S5: Model 2\n",
    "estimates clear\n",
    "\n",
    "mi estimate, post: regress ability male i.parented ib7.nsinteraction, allbaselevels\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ". mibeta ability male i.parented ib7.nsinteraction, allbaselevels\n",
      "\n",
      "Multiple-imputation estimates                   Imputations       =         60\n",
      "Linear regression                               Number of obs     =     32,548\n",
      "                                                Average RVI       =     0.6145\n",
      "                                                Largest FMI       =     0.4773\n",
      "                                                Complete DF       =      32528\n",
      "DF adjustment:   Small sample                   DF:     min       =     259.36\n",
      "                                                        avg       =     361.08\n",
      "                                                        max       =     675.68\n",
      "Model F test:       Equal FMI                   F(  19, 6063.8)   =     168.96\n",
      "Within VCE type:          OLS                   Prob > F          =     0.0000\n",
      "\n",
      "-------------------------------------------------------------------------------\n",
      "      ability |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "--------------+----------------------------------------------------------------\n",
      "         male |  -.5792898   .1829449    -3.17   0.002    -.9384987   -.2200809\n",
      "              |\n",
      "     parented |\n",
      "           2  |   5.841913   .2323516    25.14   0.000     5.385475    6.298352\n",
      "           3  |   8.215579   .5555597    14.79   0.000     7.123194    9.307965\n",
      "           4  |   10.63947   .4667382    22.80   0.000     9.722092    11.55685\n",
      "              |\n",
      "nsinteraction |\n",
      "    NCDS 1.1  |   2.907636   .9075401     3.20   0.001     1.123164    4.692108\n",
      "     BCS 1.1  |  -.6489383   .7901949    -0.82   0.412    -2.202167    .9042902\n",
      "    NCDS 1.2  |   1.863414   .8200751     2.27   0.024     .2504675     3.47636\n",
      "     BCS 1.2  |   .9660809   .8012698     1.21   0.229    -.6109012    2.543063\n",
      "      NCDS 2  |    1.71432   .6402835     2.68   0.008     .4542301    2.974411\n",
      "       BCS 2  |  -.8967187   .5858467    -1.53   0.127    -2.048396    .2549588\n",
      "       BCS 3  |  -1.642699   .6991959    -2.35   0.019    -3.018973   -.2664246\n",
      "      NCDS 4  |  -3.264382   .6328476    -5.16   0.000    -4.509892   -2.018872\n",
      "       BCS 4  |  -5.416053   .6137982    -8.82   0.000    -6.623595   -4.208511\n",
      "      NCDS 5  |  -2.933258   .5881108    -4.99   0.000    -4.090345   -1.776171\n",
      "       BCS 5  |  -5.307132   .5814633    -9.13   0.000    -6.450962   -4.163302\n",
      "      NCDS 6  |  -4.485968   .6058989    -7.40   0.000    -5.679075    -3.29286\n",
      "       BCS 6  |  -6.726404   .5894738   -11.41   0.000    -7.885691   -5.567118\n",
      "      NCDS 7  |  -7.091002   .5762464   -12.31   0.000     -8.22565   -5.956353\n",
      "       BCS 7  |  -8.764152   .5622134   -15.59   0.000    -9.870086   -7.658219\n",
      "              |\n",
      "        _cons |   101.7585   .4936476   206.14   0.000     100.7872    102.7298\n",
      "-------------------------------------------------------------------------------\n",
      "\n",
      "Standardized coefficients and R-squared\n",
      "Summary statistics over 60 imputations\n",
      "\n",
      "             |       mean       min        p25     median        p75       max\n",
      "-------------+----------------------------------------------------------------\n",
      "        male |  -.0193603    -.0268  -.0217452  -.0193787  -.0175059    -.0102\n",
      "             |\n",
      "    parented |\n",
      "          2  |   .1736487      .166   .1708019   .1734237    .176119      .184\n",
      "          3  |   .1018874     .0913   .0994058   .1018867   .1049844       .11\n",
      "          4  |   .1670908      .156   .1638327   .1671821    .169572      .178\n",
      "             |\n",
      "nsinteract~n |\n",
      "   NCDS 1.1  |   .0242594     .0133   .0218914   .0248503   .0270079     .0341\n",
      "    BCS 1.1  |  -.0064477     -.016  -.0093394  -.0063697  -.0043418    .00824\n",
      "   NCDS 1.2  |   .0186347    .00605   .0153932   .0190033   .0216102     .0351\n",
      "    BCS 1.2  |   .0107719   -.00212   .0068104   .0108283   .0142366     .0251\n",
      "     NCDS 2  |   .0268754    .00936   .0222938   .0261917   .0313399     .0491\n",
      "      BCS 2  |   -.015498    -.0317  -.0192108   -.014646  -.0114262   -.00158\n",
      "      BCS 3  |  -.0221424    -.0368  -.0257876   -.023296   -.017001   -.00749\n",
      "     NCDS 4  |   -.052821    -.0718   -.057195  -.0537473  -.0479207     -.036\n",
      "      BCS 4  |   -.089176     -.104  -.0926978  -.0891446  -.0847819     -.076\n",
      "     NCDS 5  |  -.0544292     -.067  -.0603177  -.0531146  -.0497369    -.0329\n",
      "      BCS 5  |  -.0984158     -.117  -.1021296  -.0976271   -.093829    -.0862\n",
      "     NCDS 6  |  -.0863433     -.107  -.0912311  -.0861199  -.0812531    -.0707\n",
      "      BCS 6  |  -.1168845     -.129   -.120967  -.1166671   -.112037     -.104\n",
      "     NCDS 7  |   -.158157     -.176  -.1649127  -.1569527  -.1523098     -.138\n",
      "      BCS 7  |  -.1785939     -.194   -.183423   -.177937  -.1724781      -.16\n",
      "-------------+----------------------------------------------------------------\n",
      "    R-square |   .1389118      .134   .1373248   .1389225   .1405054      .144\n",
      "Adj R-square |   .1384088      .134   .1368209   .1384195   .1400034      .144\n",
      "------------------------------------------------------------------------------\n",
      "\n",
      ". \n",
      ". * return to jupyter\n",
      "\n"
     ]
    }
   ],
   "source": [
    "mibeta ability male i.parented ib7.nsinteraction, allbaselevels\n",
    "\n",
    "* return to jupyter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"https://i.imgur.com/kNk0QQo.png\" alt=\"Table S5\">\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### References <a class=\"anchor\" id=\"references\"></a>\n",
    "\n",
    "Atkinson, M. (2015). Millennium Cohort Study: Interpreting the CANTAB cognitive measures. London: UCL Centre for Longitudinal Studies.\n",
    "\n",
    "Blanden, J., Goodman, A., Gregg, P., & Machin, S. (2004). Changes in Intergenerational mobility in Britain. In M. Corak (Ed.), Generational Income Mobility in North America and Europe. Cambridge: Cambridge University Press.\n",
    "\n",
    "Blanden, J., & Gregg, P. (2004). Family income and educational attainment: a review of approaches and evidence for Britain. Oxford Review of Economic Policy, 20(2), 245-263.\n",
    "\n",
    "Blanden, J., Gregg, P., & Machin, S. (2005). Educational Inequality and Intergenerational Mobility. In S. Machin & A. Vignoles (Eds.), What's The Good of Education? The Economics of Education In The UK. (pp. 99-114). Princeton: Princeton University Press.\n",
    "\n",
    "Blanden, J., Gregg, P., & Macmillan, L. (2007). Accounting for intergenerational income persistence: non-cognitive skills, ability and education. Economic Journal, 117(519), 43-60.\n",
    "\n",
    "Blanden, J., Gregg, P., & Macmillan, L. (2013). Intergenerational persistence in income and social class: the effect of within-group inequality. Journal of the Royal Statistical Society: Series A (Statistics in Society), 176(2), 541-563.\n",
    "\n",
    "Blanden, J., & Machin, S. (2004). Educational Inequality and the Expansion of UK Higher Education. Scottish Journal of Political Economy, 51(2), 230-249.\n",
    "\n",
    "Blanden, J., & Machin, S. (2010). Millennium Cohort Study Briefing 13: Intergenerational inequality in early years assessments. London: Institute for Education.\n",
    "\n",
    "Bourdieu, P., & Passeron, J.-C. (1977). Reproduction in education, culture and society. London: Sage.\n",
    "\n",
    "Bradbury, B., Corak, M., Waldfogel, J., & Washbrook, E. (2015). Too many children left behind: The US achievement gap in comparative perspective. New York: Russell Sage Foundation.\n",
    "\n",
    "Breen, R. (Ed.). (2004). Social Mobility in Europe. Oxford: Oxford University Press.\n",
    "\n",
    "Breen, R., & Goldthorpe, J. H. (2001). Class, Mobility and Merit: The experience of Two Birth Cohorts. European Sociological Review, 17(2), 81-101.\n",
    "\n",
    "Breen, R., Luijkx, R., Muller, W., & Pollak, R. (2010). Long-term Trends in Educational Inequality in Europe: Class Inequalities and Gender Differences. European Sociological Review, 26(1), 31-48.\n",
    "\n",
    "Caldwell, T., Rodgers, B., Clark, C., Jefferis, B., Stansfeld, S., & Power, C. (2008). Lifecourse socioeconomic predictors of midlife drinking patterns, problems and abstention: findings from the 1958 British Birth Cohort Study. Drug and alcohol dependence, 95(3), 269-278.\n",
    "\n",
    "Carpenter, J., & Kenward, M. (2012). Multiple imputation and its application. London: John Wiley & Sons.\n",
    "\n",
    "Cheung, S. Y., & Egerton, M. (2007). Great Britain: Higher Education Expansion and Reform: Changing Educational Inequalities. In Stratification in Higher Education: A Comparative Study (pp. 195-219). Stanford: Stanford University Press.\n",
    "\n",
    "Conger, R. D., & Conger, K. J. (2002). Resilience in Midwestern families: Selected findings from the first decade of a prospective, longitudinal study. Journal of Marriage and Family, 64(2), 361-373.\n",
    "\n",
    "Connelly, R., Gayle, V., & Lambert, P. S. (2016a). A review of educational attainment measures for social survey research. Methodological Innovations, 9, 2059799116638001.\n",
    "\n",
    "Connelly, R., Gayle, V., & Lambert, P. S. (2016b). A Review of occupation-based social classifications for social survey research. Methodological Innovations, 9, 2059799116638003.\n",
    "\n",
    "Connelly, R., & Platt, L. (2014). Cohort profile: UK millennium Cohort study (MCS). International journal of epidemiology, 43(6), 1719-1725.\n",
    "\n",
    "Crompton, R. (2008). Class and Stratification. Cambridge: Polity Press.\n",
    "\n",
    "Cunha, F., & Heckman, J. (2009). The Economics and Psychology of Inequality and Human Development NBER Working Paper No. 14695. Cambridge: National Bureau of Economic Research.\n",
    "\n",
    "Deary, I., Strand, S., Smith, P., & Fernandes, C. (2007). Intelligence and educational achievement. Intelligence, 35, 13-21.\n",
    "\n",
    "Deary, I. J., Spinath, F. M., & Bates, T. C. (2006). Genetics of intelligence. European journal of human genetics: EJHG, 14(6), 690.\n",
    "\n",
    "Dickerson, A., & Popli, G. (2012). Persistent Poverty and Children's Cognitive Development CLS Working Paper 2012/2. London: Centre for Longitudinal Studies.\n",
    "\n",
    "Dickerson, A., & Popli, G. (2016). Persistent poverty and children's cognitive development: Evidence from the UK Millennium Cohort Study. Journal of the Royal Statistical Society: Series A (Statistics in Society), 179(2), 534-558.\n",
    "\n",
    "Diggle, P. J. (2015). Statistics: a data science for the 21st century. Journal of the Royal Statistical Society: Series A (Statistics in Society), 178(4), 793-813.\n",
    "\n",
    "Duncan, G., Yeung, W. J., Brooks-Gunn, J., & Smith, J. (1998). How much does childhood poverty affect the life chances of children? American Sociological Review, 63(3), 406-423.\n",
    "\n",
    "Elder, G. H. (1994). Families in troubled times: Adapting to change in rural America. Hawthorne, NY: De Gruyter Aldine.\n",
    "\n",
    "Elliott, C., Murray, D., & Pearson, L. (1978). British Ability Scales. London: National Foundation for Educational Research.\n",
    "\n",
    "Elliott, J., & Shepherd, P. (2006). Cohort profile: 1970 British birth cohort (BCS70). International journal of epidemiology, 35(4), 836-843.\n",
    "\n",
    "Enders, C. K. (2010). Applied Missing Data Analysis. London: Guilford Press.\n",
    "\n",
    "Erikson, R., & Goldthorpe, J. H. (1992). The Constant Flux: A Study of Class Mobility in Industrial Societies. Oxford: Clarendon.\n",
    "\n",
    "Erikson, R., Goldthorpe, J. H., Jackson, M., Yaish, M., & Cox, D. R. (2005). On class differentials in educational attainment. Proceedings of the National Academy of Sciences of the United States of America, 102(27), 9730-9733.\n",
    "\n",
    "Ermisch, J. (2008). Origins of social immobility and inequality: Parenting and early child development. National Institute Economic Review, 205(1), 62-71.\n",
    "\n",
    "Feinstein, L. (2003). Inequality in the early cognitive development of British children in the 1970 cohort. Economica, 70(277), 73-97.\n",
    "\n",
    "Firth, D. (2003). Overcoming the Reference Category Problem in the Presentation of Statistical Models. Sociological Methodology, 33(1), 1-18.\n",
    "\n",
    "Flynn, J. R. (2012). Are We Getting Smarter? Rising IQ in the Twenty-First Century. Cambridge: Cambridge University Press.\n",
    "\n",
    "Gayle, V., & Lambert, P. (2007). Using Quasi-Variance To Communicate Sociological Results From Statistical Models. Sociology, 41(6), 1191-1208.\n",
    "\n",
    "Goisis, A., Özcan, B., & Myrskylä, M. (2017). Decline in the negative association between low birth weight and cognitive ability. Proceedings of the National Academy of Sciences, 114(1), 84-88.\n",
    "\n",
    "Goldthorpe, J., & Jackson, M. (2007). Intergenerational Class Mobility in Contemporary Britain: Political Concerns And Empirical Findings. The British Journal of Sociology, 58(4), 525-546.\n",
    "\n",
    "Goldthorpe, J., & McKnight, A. (2006). The Economic Basis of Social Class. In S. L. Morgan, D. B. Grusky & G. S. Fields (Eds.), Mobility and Inequality (pp. 109-136). Stanford: Stanford University Press.\n",
    "\n",
    "Goldthorpe, J. H. (2016). Social class mobility in modern Britain: changing structure, constant process. Journal of the British Academy, 4, 89-111.\n",
    "\n",
    "Goodman, A., & Gregg, P. (2010). Poorer children's educational attainment: how important are attitudes and behaviour. London: Joseph Rowntree Foundation.\n",
    "\n",
    "Gottfried, A., Gottfried, A., Bathurst, K., Guerin, D., & Parramore, M. (2003). Socioeconomic status in children's development and family environment: infancy through adolescence. In M. Bornstein & R. Bradley (Eds.), Socioeconomic status, parenting and child development (pp. 189-207). Mahwah: Lawrence Erlbaum.\n",
    "\n",
    "Gregg, P. (2012). Occupational Coding for the National Child Development Study (1969, 1991-2008) and the 1970 British Cohort Study (1980, 2000-2008). [data collection]. SN: 7023. Colchester: UK Data Archive.\n",
    "\n",
    "Hawkes, D., & Plewis, I. (2006). Modelling non‐response in the national child development study. Journal of the Royal Statistical Society: Series A (Statistics in Society), 169(3), 479-491.\n",
    "\n",
    "Hill, W. D., Davies, G., Van De Lagemaat, L. N., Christoforou, A., Marioni, R., Fernandes, C., . . . Craig, L. C. (2014). Human cognitive ability is influenced by genetic variation in components of postsynaptic signalling complexes assembled by NMDA receptors and MAGUK proteins. Translational psychiatry, 4(1), e341.\n",
    "\n",
    "Höfler, M., Pfister, H., Lieb, R., & Wittchen, H. (2005). The use of weights to account for non-response and drop-out. Social psychiatry and psychiatric epidemiology, 40(4), 291-299.\n",
    "\n",
    "Kiernan, K., & Mensah, F. K. (2011). Poverty, family resources and children's educational attainment: The mediating role of parenting. British Educational Research Journal, 37(2), 317-336.\n",
    "\n",
    "King, G. (1995). Replication, replication. PS: Political Science & Politics, 28(3), 444-452.\n",
    "\n",
    "King, G. (2003). The future of replication. International Studies Perspectives, 4, 100–105.\n",
    "\n",
    "Kluyver, T., Ragan-Kelley, B., Pérez, F., Granger, B. E., Bussonnier, M., Frederic, J., . . . Corlay, S. (2016). Jupyter Notebooks-a publishing format for reproducible computational workflows. Paper presented at the ELPUB.\n",
    "\n",
    "Knuth, D. E. (1992). Literate Programming. Stanford: Stanford University Press.\n",
    "\n",
    "Lareau, A. (2011). Unequal childhoods: Class, race, and family life. Berkeley: University of California Press.\n",
    "\n",
    "Lawlor, D. A., Batty, G. D., Morton, S., Deary, I., Macintyre, S., Ronalds, G., & Leon, D. A. (2005). Early life predictors of childhood intelligence: evidence from the Aberdeen children of the 1950s study. Journal of Epidemiology and Community Health, 59(8), 656-663.\n",
    "\n",
    "Layte, R. (2017). Why Do Working-Class Kids Do Worse in School? An Empirical Test of Two Theories of Educational Disadvantage. European Sociological Review, Online Preview. DOI: https://doi.org/10.1093/esr/jcx054.\n",
    "\n",
    "Little, R., & Rubin, D. (2014). Statistical Analysis with Missing Data. Hoboken: John Wiley & Sons.\n",
    "\n",
    "Machin, S., & Vignoles, A. (2004). Educational Inequality: The Widening Socio-Economic Gap. Fiscal Studies, 25(2), 22.\n",
    "\n",
    "McCulloch, A., & Joshi, H. (2001). Neighbourhood and family influences on the cognitive ability of children in the British National Child Development Study. Social Science and Medicine, 53(5), 579-591.\n",
    "\n",
    "McDermott, P. A., Fantuzzo, J. W., & Glutting, J. J. (1990). Just say no to subtest analysis: A critique on Wechsler theory and practice. Journal of Psychoeducational Assessment, 8(3), 290-302.\n",
    "\n",
    "Menard, S. (2002). Applied logistic regression analysis (Vol. 106). Thousand Oaks: Sage.\n",
    "\n",
    "Mostafa, T., & Wiggins, R. (2014). Handling attrition and non-response in the 1970 British Cohort Study. London: Centre for Longitudinal Studies.\n",
    "\n",
    "Mostafa, T., & Wiggins, R. (2015). The impact of attrition and non-response in birth cohort studies: a need to incorporate missingness strategies. Longitudinal and Life Course Studies, 6(2), 131-146.\n",
    "\n",
    "Must, O., te Nijenhuis, J., Must, A., & van Vianen, A. E. M. (2009). Comparability of IQ scores over time. Intelligence, 37(1), 25-33.\n",
    "\n",
    "Nature [Editorial]. (2016). Reality check on reproducibility. Nature, 533, 437.\n",
    "\n",
    "Neisser, U., Boodoo, G., Bouchard, T., Boykin, A., Brody, N., Ceci, S., . . . Urbina, S. (1995). Intelligence: Knowns And Unknowns. American Psychologist, 51(2), 77-101.\n",
    "\n",
    "Nettle, D. (2003). Intelligence and Class Mobility in the British Population. British Journal Of Psychology, 94(4), 551-561.\n",
    "\n",
    "Parcel, T. L., & Menaghan, E. G. (1994). Parents' jobs and children's lives. New York: Aldine De Gruyter.\n",
    "\n",
    "Parsons, S. (2014). Childhood Cognition in the 1970 British Cohort Study. London: Centre for Longitudinal Studies.\n",
    "\n",
    "Plewis, I., Calderwood, L., Hawkes, D., & Nathan, G. (2004). Changes in the NCDS and BCS70 populations and samples over time. London: Centre for Longitudinal Studies.\n",
    "\n",
    "Power, C., & Elliott, J. (2006). Cohort profile: 1958 British birth cohort (National Child Development Study). International Journal of Epidemiology, 35(1), 34-41.\n",
    "\n",
    "Rose, D., & Pevalin, D. J. (2003). The NS-SEC Explained. In D. Rose & D. J. Pevalin (Eds.), A Researcher's Guide to the National Statistics Socio-economic Classification (pp. 28-43). London: Sage.\n",
    "\n",
    "Rose, D., & Pevalin, D. J. (2005). The National Statistics Socio-Economic Classification: Origins, Development and Use. Colchester: University of Essex.\n",
    "\n",
    "Schoon, I. (2010). Childhood cognitive ability and adult academic attainment: Evidence from three British cohort studies. Longitudinal and Life Course Studies, 1(3), 241-258.\n",
    "\n",
    "Schoon, I., Jones, E., Cheng, H., & Maughan, B. (2010). Resilience in children's development. In K. Hansen, H. Joshi & S. Dex (Eds.), Children of the 21st Century: The first five years. Bristol: Policy Press.\n",
    "\n",
    "Schoon, I., Jones, E., Cheng, H., & Maughan, B. (2011). Family hardship, family instability and cognitive development. Journal of Epidemiology and Community Health, 643(1), 239-266.\n",
    "\n",
    "Seaman, S., White, I., Copas, A., & Li, L. (2012). Combining multiple imputation and inverse-probability weighting. Biometrics, 68(1), 129-137.\n",
    "\n",
    "Shavit, Y., & Blossfeld, H. (1991). Persistent Inequality: Changing Educational Attainment in Thirteen Countries. Boulder, Colorado: Westview Press.\n",
    "\n",
    "Shavit, Y., Yaish, M., & Bar-Haim, E. (2007). The Persistence Of Persistent Inequality. In S. Scherer, R. Pollak, G. Otte & M. Gangl (Eds.), From origin to destination: Trends and mechanisms in social stratification research. Frankfurt: Campus Verlag.\n",
    "\n",
    "Shenkin, S., Starr, J., Pattie, A., Rush, M., Whalley, L., & Deary, I. (2001). Birth weight and cognitive function at age 11 years: the Scottish Mental Survey 1932. Archives of Disease in Childhood, 85(3), 189-196.\n",
    "\n",
    "Shepherd, P. (2012). 1958 National Child Development Study User Guide: Measures of Ability At ages 7 to 16. London: Centre for Longitudinal Studies, University of London.\n",
    "\n",
    "Stansfeld, S. A., Clark, C., Caldwell, T., Rodgers, B., & Power, C. (2008). Psychosocial work characteristics and anxiety and depressive disorders in midlife: the effects of prior psychological distress. Occupational and Environmental Medicine, 65(9), 634-642.\n",
    "\n",
    "Sternberg, R., Grigorenko, E., & Bundy, D. (2001). The predictive value of IQ. Merrill-Palmer Quarterly, 47(1), 1-41.\n",
    "\n",
    "Strand, S., Deary, I., & Smith, P. (2006). Sex differences in cognitive abilities test scores: A UK national picture. British Journal of Educational Psychology, 76(3), 463-480.\n",
    "\n",
    "Sullivan, A., Ketende, S., & Joshi, H. (2013). Social Class and Inequalities in Early Cognitive Scores. Sociology, 47(6), 1187-1206.\n",
    "\n",
    "Sullivan, T. R., Salter, A. B., Ryan, P., & Lee, K. J. (2015). Bias and precision of the “multiple imputation, then deletion” method for dealing with missing outcome data. American Journal of Epidemiology, 182(6), 528-534.\n",
    "\n",
    "Tampubolon, G., & Savage, M. (2012). Intergenerational and Intragenerational Social Mobility in Britain. In P. S. Lambert, R. Connelly, M. Blackburn & V. Gayle (Eds.), Social Stratification: Trends and Processes (pp. 115-131). Aldershot: Ashgate.\n",
    "\n",
    "Teasdale, T. W., & Owen, D. R. (2008). Secular declines in cognitive test scores: A reversal of the Flynn Effect. Intelligence, 36(2), 121-126.\n",
    "\n",
    "Tucker-Drob, E. M., Briley, D. A., & Harden, K. P. (2013). Genetic and environmental influences on cognition across development and context. Current Directions in Psychological Science, 22(5), 349-355.\n",
    "\n",
    "University of London. (2013). 1970 British Cohort Study: Birth and 22-Month Subsample, 1970-1972. [data collection]. 3rd Edition. SN: 2666. UK Data Service.\n",
    "\n",
    "University of London. (2014). National Child Development Study: Childhood Data, Sweeps 0-3, 1958-1974. [data collection]. 3rd Edition. SN: 5565. Original data producers: National Birthday Trust Fund & National Children's Bureau. UK Data Service.\n",
    "\n",
    "University of London. (2015a). Millennium Cohort Study: Fifth Survey, 2012. [data collection]. 2nd Edition. SN: 7464. UK Data Service.\n",
    "\n",
    "University of London. (2015b). National Child Development Study Response and Outcomes Dataset, 1958-2013. [data collection]. 5th Edition. SN: 5560. UK Data Service.\n",
    "\n",
    "University of London. (2016a). 1970 British Cohort Study: Five-Year Follow-Up, 1975. [data collection]. 5th Edition. SN: 2699. UK Data Service.\n",
    "\n",
    "University of London. (2016b). 1970 British Cohort Study: Ten-Year Follow-Up, 1980. [data collection]. 6th Edition. SN: 3723. UK Data Service.\n",
    "\n",
    "Van der Sluis, S., Posthuma, D., Dolan, C., de Geus, E., Colom, R., & Boomsma, D. (2006). Sex differences on the Dutch WAIS-III. Intelligence, 34(3), 273-289.\n",
    "\n",
    "Vanhanen, T. (2011). IQ and international wellbeing indexes. The Journal of Social, Political, and Economic Studies, 36(1), 80.\n",
    "\n",
    "Vincent, C., & Ball, S. J. (2007). 'Making up' the middle-class child: Families, activities and class dispositions. Sociology, 41(6), 1061-1077.\n",
    "\n",
    "Von Hippel, P. T. (2007). Regression with missing Ys: An improved strategy for analyzing multiply imputed data. Sociological Methodology, 37(1), 83-117.\n",
    "\n",
    "Washbrook, E. (2011). Early Environments and Child Outcomes: An Analysis Commission for the Independent Review on Life Chances. Bristol: Centre for Market and Public Organization, University of Bristol."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Raw Cell Format",
  "kernelspec": {
   "display_name": "Stata",
   "language": "stata",
   "name": "stata"
  },
  "language_info": {
   "file_extension": "do",
   "mimetype": "text/x-stata",
   "name": "stata"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}