* stata_codebook.do - attach long-form notes to the .dta files (run once in Stata).
* Generated by build_data_dictionary.py - do not edit by hand.

* ---- raw_data.dta ----
use "raw_data.dta", clear
label data "Full worker wage panel: 5 waves, 2,209 workers"
note _dta: Full wage panel from quarcs-lab/data-open (isds/wage_panel_bob4.dta), exported verbatim except that union is mapped from Yes/No to 1/0. One row per worker-year, five biennial waves 2010-2018, 2,209 workers, balanced. The tutorial's analysis panel is the 2010+2012 subset in data_panel.csv.
note ID: Unique person ID; the panel cross-sectional (unit) dimension. Construction: From the source NLSY-style file; stored as a float in the CSV. Units: integer id. Source: quarcs-lab data-open
note year: Calendar year of the observation; the panel time dimension. Construction: From the source file; raw has 2010-2018 (biennial), the panel keeps 2010 and 2012. Units: year. Source: quarcs-lab data-open
note age: Worker age in years at the survey wave. Construction: From the source file. Units: years. Source: quarcs-lab data-open
note wagerate: Worker's hourly wage rate (level), the basis for log wage. Construction: From the source file; lwage = log(wagerate). Units: US$/hour. Source: quarcs-lab data-open
note schooling: Completed years of education; time-invariant within a worker over the panel window. Construction: From the source file. Units: years. Source: quarcs-lab data-open
note region: US Census region of residence. Construction: From the source file (categorical text). Units: category. Source: quarcs-lab data-open
note union: 1 if the worker is a union member in that wave, else 0; the treatment of interest. Construction: Mapped from the source Yes/No string to 1/0. Units: 0/1. Source: quarcs-lab data-open
note lwage: Natural log of the hourly wage rate; the outcome variable. Construction: log(wagerate), from the source file. Units: log US$/hour. Source: quarcs-lab data-open
note gender: Worker's reported gender (Female / Male). Construction: From the source file (categorical text). Units: category. Source: quarcs-lab data-open
save "raw_data.dta", replace

* ---- data_panel.dta ----
use "data_panel.dta", clear
label data "Analysis panel: 2010 & 2012 only (T=2), with female dummy"
note _dta: Analysis subset of raw_data.csv restricted to 2010 and 2012 so that T=2 and first-differences coincide with the within estimator. Rows missing lwage/union/age/schooling are dropped (2,199 of 2,209 workers retained), year is cast to integer, and a female dummy is added. Perfectly balanced: every worker contributes exactly two observations.
note ID: Unique person ID; the panel cross-sectional (unit) dimension. Construction: From the source NLSY-style file; stored as a float in the CSV. Units: integer id. Source: quarcs-lab data-open
note year: Calendar year of the observation; the panel time dimension. Construction: From the source file; raw has 2010-2018 (biennial), the panel keeps 2010 and 2012. Units: year. Source: quarcs-lab data-open
note age: Worker age in years at the survey wave. Construction: From the source file. Units: years. Source: quarcs-lab data-open
note wagerate: Worker's hourly wage rate (level), the basis for log wage. Construction: From the source file; lwage = log(wagerate). Units: US$/hour. Source: quarcs-lab data-open
note schooling: Completed years of education; time-invariant within a worker over the panel window. Construction: From the source file. Units: years. Source: quarcs-lab data-open
note region: US Census region of residence. Construction: From the source file (categorical text). Units: category. Source: quarcs-lab data-open
note union: 1 if the worker is a union member in that wave, else 0; the treatment of interest. Construction: Mapped from the source Yes/No string to 1/0. Units: 0/1. Source: quarcs-lab data-open
note lwage: Natural log of the hourly wage rate; the outcome variable. Construction: log(wagerate), from the source file. Units: log US$/hour. Source: quarcs-lab data-open
note gender: Worker's reported gender (Female / Male). Construction: From the source file (categorical text). Units: category. Source: quarcs-lab data-open
note female: 1 if the worker's gender is Female, else 0; time-invariant covariate. Construction: Derived in the analysis panel: 1 if gender == 'Female', else 0. Units: 0/1. Source: Derived (this study)
save "data_panel.dta", replace
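
* ---- optional sanity check (illustrative sketch, not emitted by build_data_dictionary.py) ----
* A minimal check of the claims made in the data_panel.dta notes: one row per
* worker-year, only the 2010 and 2012 waves, and exactly two rows per worker.
* Assumes data_panel.dta is in the working directory and that year is numeric.
use "data_panel.dta", clear
isid ID year                       // ID and year jointly identify each row
assert inlist(year, 2010, 2012)    // only the two analysis waves are present
bysort ID: assert _N == 2          // balanced panel: T = 2 for every worker
notes                              // list the dataset and variable notes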