* stata_codebook.do - attach long-form notes to the .dta files (run once in Stata).
* Generated by build_data_dictionary.py - do not edit by hand.

* ---- dataSIM4RCT.dta ----
use "dataSIM4RCT.dta", clear
label data "Simulated RCT household panel: 2021 baseline + 2024 endline"
note _dta: Synthetic balanced household panel, 2,000 households observed in 2021 (baseline) and 2024 (endline). Random assignment to offer (treat) is fixed within household; receipt (D) turns on only at endline for compliers. True treatment effect = 0.12 log points (~12%). wave/year/post encode the same time axis; alpha/eps/y0 are data-generating-process internals.
note id: Unique household ID; the panel unit, repeated across the two waves. Construction: 1..2000, one per household. Units: integer. Source: Simulation
note age: Age in years of the household head (time-invariant in this panel). Construction: Drawn at baseline; held fixed across waves. Units: years. Source: Simulation
note female: 1 if the household head is female, else 0 (the chance baseline imbalance, SMD ~9.3%). Construction: Drawn at baseline; held fixed across waves. Units: 0/1. Source: Simulation
note poverty: 1 if the household is in poverty at baseline (the randomization stratum), else 0. Construction: Drawn at baseline; randomization was stratified on this variable. Units: 0/1. Source: Simulation
note edu: Years of education of the household head (time-invariant in this panel). Construction: Drawn at baseline; held fixed across waves. Units: years. Source: Simulation
note treat: Random assignment to the program offer; exogenous, fixed within household across waves. Construction: Stratified (block) randomization within poverty strata; ~52% assigned. Units: 0/1. Source: Simulation (randomized)
note wave: Integer wave index; an alternative encoding of the time axis to year/post. Construction: 1 for the 2021 baseline, 2 for the 2024 endline. Units: 1/2. Source: Simulation
note year: Calendar year of the survey wave. Construction: 2021 for the baseline wave, 2024 for the endline wave. Units: year. Source: Simulation
note post: Binary post-treatment period flag; 1 at endline, 0 at baseline. Construction: 1 if year==2024 (endline), else 0. Units: 0/1. Source: Simulation
note D: Actual receipt of the transfer; endogenous take-up, non-zero only at endline. Construction: 0 at baseline; at endline ~85% of the offered and ~5% of controls receive (imperfect compliance). Units: 0/1. Source: Simulation
note alpha: Simulation random component contributing to consumption; not a tutorial covariate. Construction: Generated by the data-generating process (household/wave-level term). Units: log scale. Source: Simulation (DGP internal)
note eps: Simulation idiosyncratic shock to consumption; not a tutorial covariate. Construction: Generated by the data-generating process (per-observation noise). Units: log scale. Source: Simulation (DGP internal)
note y: Outcome variable: natural log of monthly household consumption in each wave. Construction: Baseline level + household and time components, plus the 0.12 treatment bump at endline for recipients. Units: log of monetary units. Source: Simulation
note y0: Each household's 2021 baseline value of y, carried to both rows for ANCOVA-style adjustment. Construction: y at the baseline wave; constant within household across the two rows. Units: log of monetary units. Source: Simulation
save "dataSIM4RCT.dta", replace
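
* ---- Optional sanity checks (illustrative sketch; not emitted by build_data_dictionary.py) ----
* The notes above state several structural facts about the panel. The asserts
* below restate a few of them as checks. They assume the variables are coded
* exactly as documented; the 1e-6 tolerance on y0 is an assumption made here
* to allow for float storage, not something the codebook specifies.
use "dataSIM4RCT.dta", clear                   // reload so the checks run on the saved file
isid id wave                                   // one row per household per wave
assert _N == 4000                              // 2,000 households x 2 waves (balanced panel)
assert post == (year == 2024)                  // post flags the 2024 endline
assert wave == 1 + post                        // wave 1 = baseline, wave 2 = endline
assert D == 0 if post == 0                     // no receipt at baseline
bysort id (wave): assert treat == treat[1]     // assignment fixed within household
bysort id (wave): assert abs(y0 - y[1]) < 1e-6 // y0 equals the baseline-wave y on both rows
notes                                          // list every attached note for review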