* stata_codebook.do - attach long-form notes to the .dta files (run once in Stata).
* Generated by build_data_dictionary.py - do not edit by hand.

* ---- health_data.dta ----
use "health_data.dta", clear
label data "Cross-section: 50 countries with two raw health indicators"
note _dta: Cross-section of 50 simulated countries (seed = 42). One identifier and two raw health indicators (life expectancy, infant mortality); the post's six-step PCA pipeline turns these into the composite Health Index.
note country: Synthetic country label. Construction: Generated as Country_01 ... Country_50 (zero-padded sequential index). Units: string. Source: Simulation
note life_exp: Average life expectancy at birth - a positive health indicator (higher is better). Construction: Simulated: life_exp = 55 + 30*base_health + N(0, 2), rounded to 1 decimal; base_health ~ Uniform(0, 1). Units: years. Source: Simulation
note infant_mort: Deaths before age 1 per 1,000 live births - a negative health indicator (higher is worse); polarity-adjusted before PCA. Construction: Simulated: infant_mort = 60 - 55*base_health + N(0, 3), rounded to 1 decimal; same base_health as life_exp (shared latent factor). Units: per 1,000 live births. Source: Simulation
save "health_data.dta", replace