# National Health and Nutrition Examination Survey (NHANES), 2007-2008 (ICPSR 25505)

## Summary
#### The National Health and Nutrition Examination Surveys (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The NHANES combines personal interviews and physical examinations, which focus on different population groups or health topics. These surveys were conducted by the National Center for Health Statistics (NCHS) on a periodic basis from 1971 to 1994. In 1999, the NHANES became a continuous program with a changing focus on a variety of health and nutrition measurements designed to meet current and emerging concerns. The surveys examine a nationally representative sample of approximately 5,000 persons each year. These persons are located in counties across the United States, 15 of which are visited each year. For NHANES 2007-2008, 12,946 persons were selected for the sample; 10,149 of those were interviewed (78.4 percent) and 9,762 (75.4 percent) were examined in the mobile examination centers (MEC). Many of the NHANES 2007-2008 questions were also asked in NHANES II 1976-1980, Hispanic HANES 1982-1984, NHANES III 1988-1994, and NHANES 1999-2006. New questions were added to the survey based on recommendations from survey collaborators, NCHS staff, and other interagency work groups. As in past health examination surveys, data were collected on the prevalence of chronic conditions in the population. Estimates for previously undiagnosed conditions, as well as those known to and reported by survey respondents, are produced through the survey. Risk factors, those aspects of a person's lifestyle, constitution, heredity, or environment that may increase the chances of developing a certain disease or condition, were examined. Data on smoking, alcohol consumption, sexual practices, drug use, physical fitness and activity, weight, and dietary intake were collected.
Information on certain aspects of reproductive health, such as use of oral contraceptives and breastfeeding practices, was also collected. The diseases, medical conditions, and health indicators that were studied include: anemia, cardiovascular disease, diabetes and lower extremity disease, environmental exposures, equilibrium, hearing loss, infectious diseases and immunization, kidney disease, mental health and cognitive functioning, nutrition, obesity, oral health, osteoporosis, physical fitness and physical functioning, reproductive history and sexual behavior, respiratory disease (asthma, chronic bronchitis, emphysema), sexually transmitted diseases, skin diseases, and vision. The sample for the survey was selected to represent the United States population of all ages. The NHANES target population is the civilian, noninstitutionalized United States population. Beginning in 2007, some changes were made to the domains being oversampled. The primary change is the oversampling of the entire Hispanic population instead of just the Mexican American (MA) population, which has been oversampled since 1988. Sufficient numbers of MAs were retained in the sample design so that trends in the health of MAs can continue to be monitored. Persons 60 years of age and older, Blacks, and low income persons were also oversampled. In addition, for each of the race/ethnicity domains, the 12-15 and 16-19 year age domains were combined and the 40-59 year age minority domains were split into 10-year age domains of 40-49 and 50-59. This has led to an increase in the number of participants aged 40 and older and a decrease in 12- to 19-year-olds from previous cycles. The oversample of pregnant women and adolescents in the survey from 1999-2006 was discontinued to allow for the oversampling of the Hispanic population. NCHS is working with public health agencies to increase knowledge of the health status of older Americans. NHANES has a primary role in this endeavor.
In the examination, all participants visit the physician who takes their pulse or blood pressure. Dietary interviews and body measurements are included for everyone. All but the very young have a blood sample taken and see the dentist. Depending upon the age of the participant, the rest of the examination includes tests and procedures to assess the various aspects of health listed above. Usually, the older the individual, the more extensive the examination. Demographic data file variables are grouped into three broad categories: (1) Status Variables: Provide core information on the survey participant. Examples of the core variables include interview status, examination status, and sequence number. (Sequence number [SEQN] is a unique ID number assigned to each sample person and is required to match the information on this demographic file to the rest of the NHANES 2007-2008 data.) (2) Recoded Demographic Variables: The variables include age (age in months for persons under age 80, age in years for 1 to 80-year-olds, and a top-coded age group of 80 years and older), gender, a race/ethnicity variable, current or highest grade of education completed (less than high school, high school, and more than high school education), country of birth (United States, Mexico, or other foreign born), ratio of family income to poverty threshold, income, and a pregnancy status variable (adjudicated from various pregnancy-related variables). Some of the groupings were made due to limited sample sizes for the two-year dataset. (3) Interview and Examination Sample Weight Variables: Sample weights are available for analyzing NHANES 2007-2008 data. Most data analyses require either the interviewed sample weight (variable name: WTINT2YR) or examined sample weight (variable name: WTMEC2YR). The two-year sample weights (WTINT2YR, WTMEC2YR) should be used for NHANES 2007-2008 analyses.
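
To make the roles of SEQN and the sample weights concrete, here is a minimal sketch. It assumes two small made-up tables standing in for a demographic file and an examination file; the numbers are invented, and `weighted_mean` is a hypothetical helper rather than part of any NHANES tooling. It yields a weighted point estimate only. Standard errors would additionally need the survey design variables, which this sketch does not attempt.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for two NHANES files; every value here is invented for illustration.
demo = pd.DataFrame({
    "SEQN": [1, 2, 3, 4],
    "WTMEC2YR": [1000.0, 3000.0, 2000.0, 4000.0],  # examined sample weight
})
exam = pd.DataFrame({
    "SEQN": [1, 2, 4],
    "BMXWT": [60.0, 80.0, 70.0],  # body weight (hypothetical values)
})

# SEQN identifies the same sample person in every file, so it is the merge key.
merged = demo.merge(exam, on="SEQN", how="inner")


def weighted_mean(values, weights):
    """Weighted mean that skips rows whose value is missing (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    keep = ~np.isnan(values)
    return np.average(values[keep], weights=weights[keep])


unweighted = merged["BMXWT"].mean()
weighted = weighted_mean(merged["BMXWT"], merged["WTMEC2YR"])
print(unweighted, weighted)
```

Measurements taken in the mobile examination center pair with WTMEC2YR, while interview-only items pair with WTINT2YR.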

## Citation
#### United States Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics. National Health and Nutrition Examination Survey (NHANES), 2007-2008. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2012-02-22. https://doi.org/10.3886/ICPSR25505.v3

## Link
#### https://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/25505

### Data:

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import ipywidgets as widgets
import pandas as pd

In [2]:
from IPython.display import display, clear_output
from ipywidgets import Button, HBox, VBox
from collections import Counter

In [5]:
df = pd.read_csv('25505-0012-Data.tsv', sep= '\t', low_memory = False)
df.head(10)

Unnamed: 0,SEQN,BMDSTATS,BMXWT,BMIWT,BMXRECUM,BMIRECUM,BMXHEAD,BMIHEAD,BMXHT,BMIHT,...,FIAPROXY,FIAINTRP,MIALANG,MIAPROXY,MIAINTRP,AIALANG,WTINT2YR,WTMEC2YR,SDMVPSU,SDMVSTRA
0,41475,3,138.9,,,,,,154.7,,...,2,2,1.0,2.0,2.0,1.0,59356.356426,60045.772497,1,60
1,41476,1,22.0,,,,,,120.4,,...,2,2,,,,,35057.218405,35353.21044,1,70
2,41477,1,83.9,,,,,,167.1,,...,2,2,1.0,2.0,2.0,1.0,9935.266183,10074.150074,1,67
3,41478,1,11.5,,77.4,,,,,,...,2,2,,,,,12846.712058,14560.472652,2,59
4,41479,1,65.7,,,,,,154.4,,...,2,2,2.0,2.0,2.0,2.0,8727.797555,9234.055759,1,70
5,41480,1,27.0,,,,,,122.7,,...,2,2,,,,,7379.745086,7297.067503,2,69
6,41481,1,77.9,,,,,,182.7,,...,2,2,1.0,2.0,2.0,1.0,24342.505253,24655.376656,2,68
7,41482,1,101.6,,,,,,173.8,,...,2,2,2.0,2.0,2.0,2.0,9811.075078,11602.178638,2,65
8,41483,3,133.1,,,,,,173.8,,...,2,2,1.0,2.0,2.0,1.0,8058.685296,7920.812275,2,66
9,41484,1,9.3,,72.7,,,,,,...,2,2,,,,,8942.951928,9259.270099,2,59
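
The `Unnamed: 0` column in the preview suggests the file's first column is a saved row index with an empty header cell. The sketch below assumes that layout and uses a three-row in-memory sample (not the real file) to show how `index_col=0` keeps that column out of the variables, and how to check the share of missing values per column.

```python
import io

import pandas as pd

# Tiny tab-separated sample with an empty first header cell (an assumption about the file layout).
tsv = "\tSEQN\tBMXWT\n0\t41475\t138.9\n1\t41476\t22.0\n2\t41477\t\n"

# Default read: the unnamed first column shows up as a data column.
plain = pd.read_csv(io.StringIO(tsv), sep="\t")

# Treat the first column as the row index instead.
indexed = pd.read_csv(io.StringIO(tsv), sep="\t", index_col=0)

# Fraction of missing values per column; many NHANES variables apply only to some age groups.
missing_share = indexed.isna().mean()
print(list(plain.columns), list(indexed.columns))
print(missing_share)
```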


### Graph Template:

In [6]:
style = {'description_width': 'initial'}

# set up initial text box and button
graphs_num = widgets.IntText(value = 1, description = "Number of Graphs", style = style)
graphs_ex = widgets.Button(description = "Execute")

# set up the graph widgets
x_axis = []
x_filter = []
x_filter_num = []
y_axis = []
y_filter = []
y_filter_num = []
graph_type = []

# create graph button
run_graph = widgets.Button(description = "Graph!")

# display setup buttons
display(graphs_num)
display(graphs_ex)
x_axis.append(widgets.Dropdown(options = list(df), description = "X Variable", style = style))
x_filter.append(widgets.Dropdown(options = ["No Filter", "=", "!=", ">", ">=", "<", "<="], description = "Filter:", style = style))
x_filter_num.append(widgets.IntText(value = 0))
y_axis.append(widgets.Dropdown(options = list(df), description = "Y Variable", style = style))
y_filter.append(widgets.Dropdown(options = ["No Filter", "=", "!=", ">", ">=", "<", "<="], description = "Filter:", style = style))
y_filter_num.append(widgets.IntText(value = 0))
graph_type.append(widgets.Dropdown(options = ["Bar (Sum)", "Line (Trends)", "Pie (Percent)"], description = "Type:", style = style))
x_group = HBox([x_axis[0], x_filter[0], x_filter_num[0]])
y_group = HBox([y_axis[0], y_filter[0], y_filter_num[0]])
x_and_y = VBox([x_group, y_group])
print("Graph 1:")
display(x_and_y)
display(graph_type[0])
display(run_graph)

print_x = []
print_y = []
xvar = []

# create graphs
def publish_graph(p):
    clear_output()
    display(graphs_num)
    display(graphs_ex)
    
    for i in range(graphs_num.value):     
        x_group = HBox([x_axis[i], x_filter[i], x_filter_num[i]])    
        y_group = HBox([y_axis[i], y_filter[i], y_filter_num[i]])
        x_and_y = VBox([x_group, y_group])
        print("Graph %d:" % (i + 1))
        display(x_and_y)
        display(graph_type[i])
    display(run_graph)
    
    for i in range(graphs_num.value):
        # collect the row labels that pass the X filter (df uses the default 0..n-1 index)
        x_col = df[x_axis[i].value]
        x_val = x_filter_num[i].value
        if x_filter[i].value == 'No Filter':
            xvar.extend(range(len(df)))
        elif x_filter[i].value == "=":
            xvar.extend(x_col[x_col == x_val].index)
        elif x_filter[i].value == "!=":
            xvar.extend(x_col[x_col != x_val].index)
        elif x_filter[i].value == ">":
            xvar.extend(x_col[x_col > x_val].index)
        elif x_filter[i].value == ">=":
            xvar.extend(x_col[x_col >= x_val].index)
        elif x_filter[i].value == "<":
            xvar.extend(x_col[x_col < x_val].index)
        elif x_filter[i].value == "<=":
            xvar.extend(x_col[x_col <= x_val].index)
                    
######## BAR GRAPH ############################################################
######## PIE CHART ############################################################

        if graph_type[i].value == "Bar (Sum)" or graph_type[i].value == "Pie (Percent)":         
            if y_filter[i].value == "No Filter":
                for k in range(len(xvar)):
                    print_y.append(df[y_axis[i].value][xvar[k]])
         
            elif (y_filter[i].value == "="):
                for k in range(len(xvar)):
                    if (df[y_axis[i].value][xvar[k]] == y_filter_num[i].value):
                        print_y.append(df[y_axis[i].value][xvar[k]])   
        
            elif (y_filter[i].value == "!="):
                for k in range(len(xvar)):
                    if (df[y_axis[i].value][xvar[k]] != y_filter_num[i].value):
                        print_y.append(df[y_axis[i].value][xvar[k]]) 
       
            elif (y_filter[i].value == ">"):
                for k in range(len(xvar)):
                    if (df[y_axis[i].value][xvar[k]] > y_filter_num[i].value):
                        print_y.append(df[y_axis[i].value][xvar[k]]) 
                        
            elif (y_filter[i].value == ">="):
                for k in range(len(xvar)):
                    if (df[y_axis[i].value][xvar[k]] >= y_filter_num[i].value):
                        print_y.append(df[y_axis[i].value][xvar[k]]) 
                        
            elif (y_filter[i].value == "<"):
                for k in range(len(xvar)):
                    if (df[y_axis[i].value][xvar[k]] < y_filter_num[i].value):
                        print_y.append(df[y_axis[i].value][xvar[k]]) 
                        
            elif (y_filter[i].value == "<="):
                for k in range(len(xvar)):
                    if (df[y_axis[i].value][xvar[k]] <= y_filter_num[i].value):
                        print_y.append(df[y_axis[i].value][xvar[k]])
                        
            D = Counter(print_y)
                        
######## LINE GRAPH ############################################################

        else:
            y_col = df[y_axis[i].value]
            y_val = y_filter_num[i].value
            y_op = y_filter[i].value
            for k in xvar:
                # look up Y by the row label stored in xvar, not by the position within xvar
                y = y_col[k]
                if (y_op == "No Filter"
                        or (y_op == "=" and y == y_val)
                        or (y_op == "!=" and y != y_val)
                        or (y_op == ">" and y > y_val)
                        or (y_op == ">=" and y >= y_val)
                        or (y_op == "<" and y < y_val)
                        or (y_op == "<=" and y <= y_val)):
                    print_x.append(df[x_axis[i].value][k])
                    print_y.append(y)
########## Print Graph ########################################################                    
        #print(print_x)
        #print(print_y)
        plt.figure()
        if (graph_type[i].value == "Bar (Sum)"):
            plt.bar(range(len(D)), list(D.values()), align='center')
            plt.xticks(range(len(D)), list(D.keys()))
            
        elif (graph_type[i].value == "Pie (Percent)"):
            plt.pie([float(v) for v in D.values()], labels=[float(k) for k in D], autopct='%1.1f%%')
        else:
            plt.plot(print_x, print_y)
        # draw the figure now; inside a widget callback it may not be rendered automatically
        plt.show()
            
        print_x.clear()
        print_y.clear()
        xvar.clear()
            
def run_setup(r):
    clear_output()
    display(graphs_num)
    display(graphs_ex)
    for i in range(graphs_num.value):
        # reuse widgets created earlier; only build the ones that are still missing
        if i >= len(x_axis):
            x_axis.append(widgets.Dropdown(options = list(df), description = "X Variable", style = style))
            x_filter.append(widgets.Dropdown(options = ["No Filter", "=", "!=", ">", ">=", "<", "<="], description = "Filter:", style = style))
            x_filter_num.append(widgets.IntText(value = 0))
            y_axis.append(widgets.Dropdown(options = list(df), description = "Y Variable", style = style))
            y_filter.append(widgets.Dropdown(options = ["No Filter", "=", "!=", ">", ">=", "<", "<="], description = "Filter:", style = style))
            y_filter_num.append(widgets.IntText(value = 0))
            graph_type.append(widgets.Dropdown(options = ["Bar (Sum)", "Line (Trends)", "Pie (Percent)"], description = "Type:", style = style))
        
        x_group = HBox([x_axis[i], x_filter[i], x_filter_num[i]])
        y_group = HBox([y_axis[i], y_filter[i], y_filter_num[i]])
        x_and_y = VBox([x_group, y_group])
        print("Graph %d:" % (i + 1))
        display(x_and_y)
        display(graph_type[i])
    display(run_graph)
    
graphs_ex.on_click(run_setup)
run_graph.on_click(publish_graph)
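
The X and Y filters in the template each repeat the same comparison for every operator. One possible simplification, shown here only as a sketch, maps the dropdown labels to functions from the standard `operator` module and builds a boolean mask per filter. The names `OPS` and `filter_mask` and the four-row `toy` frame are hypothetical and are not part of the notebook above.

```python
import operator

import pandas as pd

# Map the dropdown labels used in the template to comparison functions.
OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def filter_mask(frame, column, op_label, threshold):
    """Boolean mask of the rows that pass one filter (hypothetical helper)."""
    if op_label == "No Filter":
        return pd.Series(True, index=frame.index)
    return OPS[op_label](frame[column], threshold)


# Invented values; the column names only mirror the ones shown in the data preview.
toy = pd.DataFrame({
    "BMXWT": [138.9, 22.0, 83.9, 11.5],
    "BMXHT": [154.7, 120.4, 167.1, None],
})

# Combine the X and Y filters; a comparison against a missing value is False.
mask = filter_mask(toy, "BMXWT", ">", 20) & filter_mask(toy, "BMXHT", "<=", 160)
selected = toy[mask]

# Tally for a bar or pie chart; value_counts leaves out missing values by default.
counts = selected["BMXHT"].value_counts().sort_index()
print(selected)
print(counts)
```

Because the masks are computed on whole columns, the X and Y conditions always refer to the same rows, which avoids the kind of index mix-up that per-row loops make easy.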


