{ "metadata": { "name": "", "signature": "sha256:bc6490dd883689bab0367408eba78343b257bc52d1d9fdda1eb8c3266b80c991" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Exercise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this exercise, reproduce some of the findings from [What Makes Houston the Next Great American City? | Travel | Smithsonian](http://www.smithsonianmag.com/travel/what-makes-houston-the-next-great-american-city-4870584/), specifically the calculation represented in\n", "\n", "![Alt text](http://thumbs.media.smithsonianmag.com//filer/Houston-diversity-3.jpg__600x0_q85_upscale.jpg \"Optional title\")\n", "\n", "whose caption is\n", "\n", "
To assess the parity of the four major U.S. ethnic and racial groups, Rice University researchers used a scale called the Entropy Index. It ranges from 0 (a population has just one group) to 1 (all groups are equivalent). Edging New York for the most balanced diversity, Houston had an Entropy Index of 0.874 (orange bar).\n", "\n", "The research report cited by *Smithsonian Magazine* is\n", "[Houston Region Grows More Racially/Ethnically Diverse, With Small Declines in Segregation: A Joint Report Analyzing Census Data from 1990, 2000, and 2010](http://kinder.rice.edu/uploadedFiles/Urban_Research_Center/Media/Houston%20Region%20Grows%20More%20Ethnically%20Diverse%202-13.pdf) by the Kinder Institute for Urban Research & the Hobby Center for the Study of Texas. \n", "\n", "In the report, you'll find the following quotes:\n", "\n", "
How does Houston\u2019s racial/ethnic diversity compare to the racial/ethnic\n", "diversity of other large metropolitan areas? The Houston metropolitan\n", "area is the most racially/ethnically diverse.\n", "\n", "....\n", "\n", "
Houston is one of the most racially/ethnically diverse metropolitan\n", "areas in the nation as well. *It is the most diverse of the 10 largest\n", "U.S. metropolitan areas.* [emphasis mine] Unlike the other large metropolitan areas, all\n", "four major racial/ethnic groups have substantial representation in\n", "Houston with Latinos and Anglos occupying roughly equal shares of the\n", "population.\n", "\n", "....\n", "\n", "
Houston has the highest entropy score of the 10 largest metropolitan\n", "areas, 0.874. New York is a close second with a score of 0.872.\n", "\n", "....\n", "\n", "Your task is:\n", "\n", "1. Tabulate all the metropolitan/micropolitan statistical areas. Remember that you have to group various entities that show up separately in the Census API but which belong to the same area. You should find 942 metropolitan/micropolitan statistical areas in the 2010 Census.\n", "\n", "1. Calculate the normalized Shannon index (`entropy5`) using the categories of White, Black, Hispanic, Asian, and Other as outlined in the [Day_07_G_Calculating_Diversity notebook](http://nbviewer.ipython.org/github/rdhyee/working-open-data-2014/blob/master/notebooks/Day_07_G_Calculating_Diversity.ipynb#Converting-to-Racial-Dot-Map-Categories) \n", "\n", "1. Calculate the normalized Shannon index (`entropy4`) by not considering the Other category. In other words, assume that the total population is the sum of White, Black, Hispanic, and Asian.\n", "\n", "1. Figure out how exactly the entropy score was calculated in the report from Rice University. Since you'll find that the entropy score reported matches neither `entropy5` nor `entropy4`, you'll need to play around with the entropy calculation to figure out how to use 4 categories to get the score for Houston to come out to \"0.874\" and that for NYC to be \"0.872\". [I **think** I've done so and get 0.873618 and \n", "0.872729 respectively.]\n", "\n", "1. Add a calculation of the [Gini-Simpson diversity index](https://en.wikipedia.org/wiki/Diversity_index#Gini.E2.80.93Simpson_index) using the five categories of White, Black, Hispanic, Asian, and Other.\n", "\n", "1. Note where the Bay Area stands in terms of the diversity index.\n", "\n", "For bonus points:\n", "\n", "* make a bar chart in the style used in the Smithsonian Magazine\n", "\n", "Deliverable:\n", "\n", "1. 
You will need to upload your notebook to a gist and render the notebook in nbviewer and then enter the nbviewer URL (e.g., http://nbviewer.ipython.org/gist/rdhyee/60b6c0b0aad7fd531938)\n", "2. On bCourses, upload the CSV version of your `msas_df`." ] }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Hispanic or Latino Origin and Racial Subcategories" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "http://www.census.gov/developers/data/sf1.xml\n", "\n", "compare to http://www.census.gov/prod/cen2010/briefs/c2010br-02.pdf \n", "\n", "I think the P0050001 might be the key category\n", "\n", "* P0010001 = P0050001\n", "* P0050001 = P0050002 + P0050010\n", "\n", "P0050002 Not Hispanic or Latino (total) = \n", "\n", "* P0050003 Not Hispanic White only \n", "* P0050004 Not Hispanic Black only\n", "* P0050006 Not Hispanic Asian only\n", "* Not Hispanic Other (should also be P0050002 - (P0050003 + P0050004 + P0050006))\n", "    * P0050005 Not Hispanic: American Indian and Alaska Native alone\n", "    * P0050007 Not Hispanic: Native Hawaiian and Other Pacific Islander alone\n", "    * P0050008 Not Hispanic: Some Other Race alone\n", "    * P0050009 Not Hispanic: Two or More Races\n", "\n", "* P0050010 Hispanic or Latino\n", " \n", "P0050010 = P0050011...P0050017\n", "\n", "From [Hispanic and Latino Americans (Wikipedia)](https://en.wikipedia.org/w/index.php?title=Hispanic_and_Latino_Americans&oldid=595018646): \n", "\n", "
While the two terms are sometimes used interchangeably, Hispanic is a narrower term which mostly refers to persons of Spanish speaking origin or ancestry, while Latino is more frequently used to refer more generally to anyone of Latin American origin or ancestry, including Brazilians.\n", "\n", "and\n", "\n", "
The Census Bureau's 2010 census does provide a definition of the terms Latino or Hispanic and is as follows: \u201cHispanic or Latino\u201d refers to a person of Cuban, Mexican, Puerto Rican, South or Central American, or other Spanish culture or origin regardless of race. It allows respondents to self-define whether they were Latino or Hispanic and then identify their specific country or place of origin.[52] On its website, the Census Bureau defines \"Hispanic\" or \"Latino\" persons as being \"persons who trace their origin [to]... Spanish speaking Central and South America countries, and other Spanish cultures\".\n", "\n", "In the [Racial Dot Map](http://bit.ly/rdotmap): \"Whites are coded as blue; African-Americans, green; Asians, red; Hispanics, orange; and all other racial categories are coded as brown.\" \n", "\n", "In this notebook, we will relate the Racial Dot Map 5-category scheme to the P005\\* variables." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%pylab --no-import-all inline" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Populating the interactive namespace from numpy and matplotlib\n" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from pandas import DataFrame, Series, Index\n", "import pandas as pd\n", "\n", "from itertools import islice" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "import census\n", "import us\n", "\n", "import settings" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The census documentation has example URLs but needs your API key to work. 
In this notebook, we'll use the IPython notebook HTML display mechanism to help out.\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "c = census.Census(key=settings.CENSUS_KEY)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "# generators for the various census geographic entities of interest\n", "\n", "def states(variables='NAME'):\n", "    geo={'for':'state:*'}\n", "    states_fips = set([state.fips for state in us.states.STATES])\n", "    # need to filter out non-states\n", "    for r in c.sf1.get(variables, geo=geo):\n", "        if r['state'] in states_fips:\n", "            yield r\n", "\n", "def counties(variables='NAME'):\n", "    \"\"\"ask for all the counties in one call\"\"\"\n", "\n", "    # tabulate a set of fips codes for the states\n", "    states_fips = set([s.fips for s in us.states.STATES])\n", "\n", "    geo={'for':'county:*',\n", "         'in':'state:*'}\n", "    for county in c.sf1.get(variables, geo=geo):\n", "        # eliminate counties that aren't in a state or DC\n", "        if county['state'] in states_fips:\n", "            yield county\n", "\n", "\n", "def counties2(variables='NAME'):\n", "    \"\"\"generator for all counties\"\"\"\n", "\n", "    # since we can get all the counties in one call,\n", "    # this function is for demonstrating the use of walking through\n", "    # the states to get at the counties\n", "\n", "    for state in us.states.STATES:\n", "        geo={'for':'county:*',\n", "             'in':'state:{fips}'.format(fips=state.fips)}\n", "        for county in c.sf1.get(variables, geo=geo):\n", "            yield county\n", "\n", "\n", "def tracts(variables='NAME'):\n", "    for state in us.states.STATES:\n", "\n", "        # handy to print out state to monitor progress\n", "        # print state.fips, state\n", "        counties_in_state={'for':'county:*',\n", "                           'in':'state:{fips}'.format(fips=state.fips)}\n", "\n", "        for county in c.sf1.get('NAME', geo=counties_in_state):\n", "\n", "            # print county['state'], county['NAME']\n", "            tracts_in_county =
{'for':'tract:*',\n", "                'in': 'state:{s_fips} county:{c_fips}'.format(s_fips=state.fips,\n", "                                                             c_fips=county['county'])}\n", "\n", "            for tract in c.sf1.get(variables,geo=tracts_in_county):\n", "                yield tract\n", "\n", "\n", "def msas(variables=\"NAME\"):\n", "\n", "    for state in us.STATES:\n", "        geo = {'for':'metropolitan statistical area/micropolitan statistical area:*',\n", "               'in':'state:{state_fips}'.format(state_fips=state.fips)\n", "               }\n", "\n", "        for msa in c.sf1.get(variables, geo=geo):\n", "            yield msa\n", "\n", "def block_groups(variables='NAME'):\n", "    # http://api.census.gov/data/2010/sf1?get=P0010001&for=block+group:*&in=state:02+county:170\n", "    # let's use the county generator\n", "    for county in counties(variables):\n", "        geo = {'for':'block group:*',\n", "               'in':'state:{state} county:{county}'.format(state=county['state'],\n", "                                                           county=county['county'])\n", "               }\n", "        for block_group in c.sf1.get(variables, geo):\n", "            yield block_group\n", "\n", "\n", "def blocks(variables='NAME'):\n", "    # http://api.census.gov/data/2010/sf1?get=P0010001&for=block:*&in=state:02+county:290+tract:00100\n", "\n", "    # make use of the tract generator\n", "    for tract in tracts(variables):\n", "        geo={'for':'block:*',\n", "             'in':'state:{state} county:{county} tract:{tract}'.format(state=tract['state'],\n", "                                                                       county=tract['county'],\n", "                                                                       tract=tract['tract'])\n", "             }\n", "        for block in c.sf1.get(variables, geo):\n", "            yield block\n", "\n", "def csas(variables=\"NAME\"):\n", "    # http://api.census.gov/data/2010/sf1?get=P0010001&for=combined+statistical+area:*&in=state:24\n", "    for state in us.STATES:\n", "        geo = {'for':'combined statistical area:*',\n", "               'in':'state:{state_fips}'.format(state_fips=state.fips)\n", "               }\n", "\n", "        for csa in c.sf1.get(variables, geo=geo):\n", "            yield csa\n", "\n", "def districts(variables=\"NAME\"):\n", "    # http://api.census.gov/data/2010/sf1?get=P0010001&for=congressional+district:*&in=state:24\n", "    for state in us.STATES:\n", "        geo
= {'for':'congressional district:*',\n", "               'in':'state:{state_fips}'.format(state_fips=state.fips)\n", "               }\n", "\n", "        for district in c.sf1.get(variables, geo=geo):\n", "            yield district\n", "\n", "def zip_code_tabulation_areas(variables=\"NAME\"):\n", "    # http://api.census.gov/data/2010/sf1?get=P0010001&for=zip+code+tabulation+area:*&in=state:02\n", "    for state in us.STATES:\n", "        geo = {'for':'zip code tabulation area:*',\n", "               'in':'state:{state_fips}'.format(state_fips=state.fips)\n", "               }\n", "\n", "        for zip_code_tabulation_area in c.sf1.get(variables, geo=geo):\n", "            yield zip_code_tabulation_area" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "\n", "def census_labels(prefix='P005', n0=1, n1=17, field_width=4, include_name=True, join=False):\n", "    \"\"\"convenience function to generate census labels\"\"\"\n", "\n", "    label_format = \"{i:0%dd}\" % (field_width)\n", "\n", "    variables = [prefix + label_format.format(i=i) for i in xrange(n0,n1+1)]\n", "    if include_name:\n", "        variables = ['NAME'] + variables\n", "\n", "    if join:\n", "        return \",\".join(variables)\n", "    else:\n", "        return variables\n", "\n", "def rdot_labels(other=True):\n", "    if other:\n", "        return ['White', 'Black', 'Asian', 'Hispanic', 'Other']\n", "    else:\n", "        return ['White', 'Black', 'Asian', 'Hispanic']\n", "\n", "FINAL_LABELS = ['NAME', 'Total'] + rdot_labels() + ['p_White', 'p_Black', 'p_Asian', 'p_Hispanic', 'p_Other'] + ['entropy5', 'entropy4', 'entropy_rice', 'gini_simpson']\n", "\n", "def convert_to_rdotmap(row):\n", "    \"\"\"takes the P005 variables and maps them to a Series with\n", "    Total, White, Black, Hispanic, Asian, Other\"\"\"\n", "    return pd.Series({'Total':row['P0050001'],\n", "                      'White':row['P0050003'],\n", "                      'Black':row['P0050004'],\n", "                      'Asian':row['P0050006'],\n", "                      'Hispanic':row['P0050010'],\n", "                      'Other': row['P0050005'] + row['P0050007'] + row['P0050008'] + row['P0050009'],\n", 
"                      }, index=['Total', 'White', 'Black', 'Hispanic', 'Asian', 'Other'])\n", "\n", "\n", "def normalize(s):\n", "    \"\"\"take a Series and divide each item by the sum so that the new series adds up to 1.0\"\"\"\n", "    total = np.sum(s)\n", "    return s.astype('float') / total\n", "\n", "def normalize_relabel(s):\n", "    \"\"\"take a Series and divide each item by the sum so that the new series adds up to 1.0\n", "    Also relabel the indices by adding p_ prefix\"\"\"\n", "    total = np.sum(s)\n", "    new_index = list(Series(s.index).apply(lambda x: \"p_\"+x))\n", "    return Series(list(s.astype('float') / total),new_index)\n", "\n", "def entropy(series):\n", "    \"\"\"Normalized Shannon Index\"\"\"\n", "    # a series in which all the entries are equal should result in normalized entropy of 1.0\n", "\n", "    # eliminate 0s\n", "    series1 = series[series!=0]\n", "\n", "    # if fewer than 2 nonzero entries remain (i.e., 0 or 1) then return 0\n", "\n", "    if len(series1) > 1:\n", "        # calculate the maximum possible entropy for the full length of the input series\n", "        # (zero entries still count as categories)\n", "        max_s = -np.log(1.0/len(series))\n", "\n", "        total = float(sum(series1))\n", "        p = series1.astype('float')/total\n", "        return sum(-p*np.log(p))/max_s\n", "    else:\n", "        return 0.0\n", "\n", "def gini_simpson(s):\n", "    # https://en.wikipedia.org/wiki/Diversity_index#Gini.E2.80.93Simpson_index\n", "    s1 = normalize(s)\n", "    return 1-np.sum(s1*s1)\n", "\n", "def entropy_rice(series):\n", "    \"\"\"hard code how Rice U did calculation\n", "    This function takes the entropy5 calculation and removes the contribution from 'Other'\n", "    \"\"\"\n", "    # pass in a Series with\n", "    # 'Asian','Black','Hispanic','White','Other'\n", "    # http://kinder.rice.edu/uploadedFiles/Urban_Research_Center/Media/Houston%20Region%20Grows%20More%20Ethnically%20Diverse%202-13.pdf\n", "\n", "    s0 = normalize(series)\n", "    s_other = s0['Other']*np.log(s0['Other']) if s0['Other'] > 0 else 0.0\n", "    # np.log(0.2)*entropy(series) undoes the ln(5) normalization, leaving sum(p*ln(p)) over all five groups;\n", "    # subtracting s_other drops the Other term, and dividing by ln(0.25) renormalizes for four groups\n", "    return (np.log(0.2)*entropy(series) - s_other)/np.log(0.25)\n", "\n", "def
diversity(df):\n", "    \"\"\"Takes a df with the P005 variables and does entropy calculation\"\"\"\n", "    # convert populations to int\n", "    df[census_labels(include_name=False)] = df[census_labels(include_name=False)].astype('int')\n", "    df = pd.concat((df, df.apply(convert_to_rdotmap, axis=1)),axis=1)\n", "    df = pd.concat((df,df[rdot_labels()].apply(normalize_relabel,axis=1)), axis=1)\n", "    df['entropy5'] = df.apply(lambda x:entropy(x[rdot_labels()]), axis=1)\n", "    df['entropy4'] = df.apply(lambda x:entropy(x[rdot_labels(other=False)]), axis=1)\n", "    df['entropy_rice'] = df.apply(lambda x:entropy_rice(x[rdot_labels()]), axis=1)\n", "    df['gini_simpson'] = df.apply(lambda x:gini_simpson(x[rdot_labels()]), axis=1)\n", "    return df" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "States" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# grab states, convert populations to int\n", "states_df = DataFrame(list(states(census_labels())))\n", "states_df = diversity(states_df)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "states_df[FINAL_LABELS].head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "

| | NAME | Total | White | Black | Asian | Hispanic | Other | p_White | p_Black | p_Asian | p_Hispanic | p_Other | entropy5 | entropy4 | entropy_rice | gini_simpson |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Alabama | 4779736 | 3204402 | 1244437 | 52937 | 185602 | 92358 | 0.670414 | 0.260357 | 0.011075 | 0.038831 | 0.019323 | 0.541001 | 0.570292 | 0.573075 | 0.480755 |
| 1 | Alaska | 710231 | 455320 | 21949 | 37459 | 39249 | 156254 | 0.641087 | 0.030904 | 0.052742 | 0.055262 | 0.220004 | 0.646677 | 0.475235 | 0.510480 | 0.533815 |
| 2 | Arizona | 6392017 | 3695647 | 239101 | 170509 | 1895149 | 391611 | 0.578166 | 0.037406 | 0.026675 | 0.296487 | 0.061266 | 0.663524 | 0.643529 | 0.646914 | 0.571955 |
| 3 | Arkansas | 2915918 | 2173469 | 447102 | 35647 | 186050 | 73650 | 0.745381 | 0.153331 | 0.012225 | 0.063805 | 0.025258 | 0.515025 | 0.526205 | 0.530902 | 0.416039 |
| 4 | California | 37253956 | 14956253 | 2163804 | 4775070 | 14013719 | 1345110 | 0.401468 | 0.058083 | 0.128176 | 0.376167 | 0.036107 | 0.796994 | 0.843670 | 0.838778 | 0.676216 |

5 rows × 16 columns

| | NAME | Total | White | Black | Asian | Hispanic | Other | p_White | p_Black | p_Asian | p_Hispanic | p_Other | entropy5 | entropy4 | entropy_rice | gini_simpson |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11 | Hawaii | 1360301 | 309343 | 19904 | 513294 | 120842 | 396918 | 0.227408 | 0.014632 | 0.377339 | 0.088835 | 0.291787 | 0.833108 | 0.750762 | 0.707954 | 0.712656 |
| 4 | California | 37253956 | 14956253 | 2163804 | 4775070 | 14013719 | 1345110 | 0.401468 | 0.058083 | 0.128176 | 0.376167 | 0.036107 | 0.796994 | 0.843670 | 0.838778 | 0.676216 |
| 28 | Nevada | 2700551 | 1462081 | 208058 | 191047 | 716501 | 122864 | 0.541401 | 0.077043 | 0.070744 | 0.265317 | 0.045496 | 0.751622 | 0.774363 | 0.771193 | 0.623482 |
| 32 | New York | 19378102 | 11304247 | 2783857 | 1406194 | 3416922 | 466882 | 0.583352 | 0.143660 | 0.072566 | 0.176329 | 0.024093 | 0.732727 | 0.787727 | 0.785917 | 0.602124 |
| 43 | Texas | 25145561 | 11397345 | 2886825 | 948426 | 9460921 | 452044 | 0.453255 | 0.114805 | 0.037717 | 0.376246 | 0.017977 | 0.727466 | 0.793870 | 0.792449 | 0.638073 |

5 rows × 16 columns

| | NAME | Total | White | Black | Asian | Hispanic | Other | p_White | p_Black | p_Asian | p_Hispanic | p_Other | entropy5 | entropy4 | entropy_rice | gini_simpson |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Autauga County | 54571 | 42154 | 9595 | 467 | 1310 | 1045 | 0.772462 | 0.175826 | 0.008558 | 0.024005 | 0.019149 | 0.441816 | 0.453294 | 0.458294 | 0.371372 |
| 1 | Baldwin County | 182265 | 152200 | 16966 | 1340 | 7992 | 3767 | 0.835048 | 0.093084 | 0.007352 | 0.043848 | 0.020668 | 0.388299 | 0.386196 | 0.392968 | 0.291627 |
| 2 | Barbour County | 27457 | 12837 | 12820 | 107 | 1387 | 306 | 0.467531 | 0.466912 | 0.003897 | 0.050515 | 0.011145 | 0.580086 | 0.636407 | 0.637309 | 0.560717 |
| 3 | Bibb County | 22915 | 17191 | 5024 | 22 | 406 | 272 | 0.750207 | 0.219245 | 0.000960 | 0.017718 | 0.011870 | 0.421943 | 0.448712 | 0.451897 | 0.388665 |
| 4 | Blount County | 57322 | 50952 | 724 | 115 | 4626 | 905 | 0.888873 | 0.012630 | 0.002006 | 0.080702 | 0.015788 | 0.274015 | 0.263741 | 0.270876 | 0.202978 |

5 rows × 16 columns

| | NAME | Total | White | Black | Asian | Hispanic | Other | p_White | p_Black | p_Asian | p_Hispanic | p_Other | entropy5 | entropy4 | entropy_rice | gini_simpson |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1868 | Queens County | 2230722 | 616727 | 395881 | 508334 | 613750 | 96030 | 0.276470 | 0.177468 | 0.227879 | 0.275135 | 0.043049 | 0.925644 | 0.989171 | 0.976964 | 0.762589 |
| 68 | Aleutians West Census Area | 5561 | 1745 | 318 | 1575 | 726 | 1197 | 0.313792 | 0.057184 | 0.283222 | 0.130552 | 0.215249 | 0.920216 | 0.882623 | 0.829850 | 0.754673 |
| 186 | Alameda County | 1510271 | 514559 | 184126 | 390524 | 339889 | 81173 | 0.340706 | 0.121916 | 0.258579 | 0.225052 | 0.053747 | 0.910834 | 0.957875 | 0.944102 | 0.748656 |
| 233 | Solano County | 413344 | 168628 | 58743 | 59027 | 99356 | 27590 | 0.407960 | 0.142116 | 0.142804 | 0.240371 | 0.066748 | 0.897416 | 0.926901 | 0.911537 | 0.730745 |
| 67 | Aleutians East Borough | 3141 | 425 | 212 | 1113 | 385 | 1006 | 0.135307 | 0.067494 | 0.354346 | 0.122572 | 0.320280 | 0.896064 | 0.864996 | 0.777253 | 0.733972 |

5 rows × 16 columns

| metropolitan statistical area/micropolitan statistical area | NAME | Total | White | Black | Asian | Hispanic | Other | p_White | p_Black | p_Asian | p_Hispanic | p_Other | entropy5 | entropy4 | entropy_rice | gini_simpson |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 26420 | Houston-Sugar Land-Baytown, TX Metro Area | 5946800 | 2360472 | 998883 | 384596 | 2099412 | 103437 | 0.396931 | 0.167970 | 0.064673 | 0.353032 | 0.017394 | 0.796281 | 0.876425 | 0.873618 | 0.685115 |
| 35620 | New York-Northern New Jersey-Long Island, NY-N... | 18897109 | 9233812 | 3044096 | 1860840 | 4327560 | 430801 | 0.488636 | 0.161088 | 0.098472 | 0.229006 | 0.022797 | 0.805286 | 0.876454 | 0.872729 | 0.672625 |
| 47900 | Washington-Arlington-Alexandria, DC-VA-MD-WV M... | 5582170 | 2711258 | 1409473 | 513919 | 770795 | 176725 | 0.485700 | 0.252496 | 0.092064 | 0.138082 | 0.031659 | 0.808094 | 0.864206 | 0.859318 | 0.671797 |
| 31100 | Los Angeles-Long Beach-Santa Ana, CA Metro Area | 12828837 | 4056820 | 859086 | 1858148 | 5700862 | 353921 | 0.316227 | 0.066965 | 0.144842 | 0.444379 | 0.027588 | 0.798070 | 0.859159 | 0.855080 | 0.676304 |
| 19100 | Dallas-Fort Worth-Arlington, TX Metro Area | 6371773 | 3201677 | 941695 | 337815 | 1752166 | 138420 | 0.502478 | 0.147792 | 0.053017 | 0.274989 | 0.021724 | 0.759459 | 0.824101 | 0.821697 | 0.646772 |
| 33100 | Miami-Fort Lauderdale-Pompano Beach, FL Metro ... | 5564635 | 1937939 | 1096536 | 122082 | 2312929 | 95149 | 0.348260 | 0.197054 | 0.021939 | 0.415648 | 0.017099 | 0.749136 | 0.821351 | 0.819535 | 0.666348 |
| 16980 | Chicago-Joliet-Naperville, IL-IN-WI Metro Area... | 9461105 | 5204489 | 1613644 | 526857 | 1957080 | 159035 | 0.550093 | 0.170556 | 0.055687 | 0.206855 | 0.016809 | 0.736833 | 0.807444 | 0.805894 | 0.622136 |
| 12060 | Atlanta-Sandy Springs-Marietta, GA Metro Area | 5268860 | 2671757 | 1679979 | 252510 | 547400 | 117214 | 0.507084 | 0.318851 | 0.047925 | 0.103893 | 0.022247 | 0.729649 | 0.787682 | 0.786026 | 0.627614 |
| 37980 | Philadelphia-Camden-Wilmington, PA-NJ-DE-MD Me... | 5965343 | 3875845 | 1204303 | 293656 | 468168 | 123371 | 0.649727 | 0.201883 | 0.049227 | 0.078481 | 0.020681 | 0.640825 | 0.685528 | 0.686114 | 0.528088 |
| 14460 | Boston-Cambridge-Quincy, MA-NH Metro Area (par... | 4552402 | 3408585 | 301533 | 292786 | 410516 | 138982 | 0.748744 | 0.066236 | 0.064315 | 0.090176 | 0.030529 | 0.556973 | 0.565366 | 0.569788 | 0.421795 |

10 rows × 16 columns

| metropolitan statistical area/micropolitan statistical area | NAME | Total | White | Black | Asian | Hispanic | Other | p_White | p_Black | p_Asian | p_Hispanic | p_Other | entropy5 | entropy4 | entropy_rice | gini_simpson | entropy_all |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 44700 | Stockton, CA Metro Area | 685306 | 245919 | 48540 | 94547 | 266341 | 29959 | 0.358846 | 0.070830 | 0.137963 | 0.388645 | 0.043716 | 0.828052 | 0.869824 | 0.862634 | 0.694223 | 0.695439 |
| 31100 | Los Angeles-Long Beach-Santa Ana, CA Metro Area | 12828837 | 4056820 | 859086 | 1858148 | 5700862 | 353921 | 0.316227 | 0.066965 | 0.144842 | 0.444379 | 0.027588 | 0.798070 | 0.859159 | 0.855080 | 0.676304 | 0.689053 |
| 23420 | Fresno, CA Metro Area | 930450 | 304522 | 45005 | 86856 | 468070 | 25997 | 0.327285 | 0.048369 | 0.093348 | 0.503058 | 0.027940 | 0.732562 | 0.780302 | 0.778371 | 0.627984 | 0.685842 |
| 40140 | Riverside-San Bernardino-Ontario, CA Metro Area | 4224851 | 1546666 | 301523 | 249899 | 1996402 | 130361 | 0.366088 | 0.071369 | 0.059150 | 0.472538 | 0.030856 | 0.736344 | 0.779591 | 0.777447 | 0.633143 | 0.680298 |
| 25260 | Hanford-Corcoran, CA Metro Area | 152982 | 53879 | 10314 | 5339 | 77866 | 5584 | 0.352192 | 0.067420 | 0.034900 | 0.508988 | 0.036501 | 0.702746 | 0.729483 | 0.728699 | 0.609796 | 0.676985 |
| 32900 | Merced, CA Metro Area | 255793 | 81599 | 8785 | 18183 | 140485 | 6741 | 0.319004 | 0.034344 | 0.071085 | 0.549214 | 0.026353 | 0.679215 | 0.719629 | 0.719421 | 0.589674 | 0.675128 |
| 41500 | Salinas, CA Metro Area | 415057 | 136435 | 11300 | 23777 | 230003 | 13542 | 0.328714 | 0.027225 | 0.057286 | 0.554148 | 0.032627 | 0.662618 | 0.688024 | 0.688723 | 0.579780 | 0.672056 |
| 46700 | Vallejo-Fairfield, CA Metro Area | 413344 | 168628 | 58743 | 59027 | 99356 | 27590 | 0.407960 | 0.142116 | 0.142804 | 0.240371 | 0.066748 | 0.897416 | 0.926901 | 0.911537 | 0.730745 | 0.670699 |
| 12540 | Bakersfield-Delano, CA Metro Area | 839631 | 323794 | 45377 | 33100 | 413033 | 24327 | 0.385638 | 0.054044 | 0.039422 | 0.491922 | 0.028973 | 0.686088 | 0.722859 | 0.722509 | 0.603981 | 0.668019 |
| 26420 | Houston-Sugar Land-Baytown, TX Metro Area | 5946800 | 2360472 | 998883 | 384596 | 2099412 | 103437 | 0.396931 | 0.167970 | 0.064673 | 0.353032 | 0.017394 | 0.796281 | 0.876425 | 0.873618 | 0.685115 | 0.661636 |
| 31460 | Madera-Chowchilla, CA Metro Area | 150865 | 57380 | 5009 | 2533 | 80992 | 4951 | 0.380340 | 0.033202 | 0.016790 | 0.536851 | 0.032817 | 0.618488 | 0.634707 | 0.637158 | 0.564671 | 0.659675 |
| 17500 | Clewiston, FL Micro Area | 39140 | 13650 | 5057 | 275 | 19243 | 915 | 0.348748 | 0.129203 | 0.007026 | 0.491645 | 0.023378 | 0.685630 | 0.733128 | 0.732653 | 0.619370 | 0.658762 |
| 10740 | Albuquerque, NM Metro Area | 887077 | 374214 | 19766 | 16769 | 414222 | 62106 | 0.421851 | 0.022282 | 0.018904 | 0.466952 | 0.070012 | 0.662122 | 0.629810 | 0.634408 | 0.598243 | 0.658328 |
| 29820 | Las Vegas-Paradise, NV Metro Area | 1951269 | 935955 | 194821 | 165121 | 568644 | 86728 | 0.479665 | 0.099843 | 0.084622 | 0.291423 | 0.044447 | 0.800982 | 0.835903 | 0.830088 | 0.665889 | 0.654037 |
| 33700 | Modesto, CA Metro Area | 514453 | 240423 | 13065 | 24712 | 215658 | 20595 | 0.467337 | 0.025396 | 0.048035 | 0.419199 | 0.040033 | 0.675950 | 0.691203 | 0.691824 | 0.601313 | 0.653970 |
| 24380 | Grants, NM Micro Area | 27213 | 5857 | 221 | 136 | 9934 | 11065 | 0.215228 | 0.008121 | 0.004998 | 0.365046 | 0.406607 | 0.702078 | 0.552323 | 0.551140 | 0.654998 | 0.652172 |
| 41740 | San Diego-Carlsbad-San Marcos, CA Metro Area | 3095313 | 1500047 | 146600 | 328058 | 991348 | 129260 | 0.484619 | 0.047362 | 0.105985 | 0.320274 | 0.041760 | 0.764654 | 0.795817 | 0.792070 | 0.647349 | 0.652044 |
| 41940 | San Jose-Sunnyvale-Santa Clara, CA Metro Area | 1836911 | 648063 | 42686 | 566764 | 510396 | 69002 | 0.352800 | 0.023238 | 0.308542 | 0.277856 | 0.037564 | 0.805816 | 0.852024 | 0.846600 | 0.701179 | 0.649439 |
| 47300 | Visalia-Porterville, CA Metro Area | 442179 | 143935 | 5497 | 14204 | 268065 | 10478 | 0.325513 | 0.012432 | 0.032123 | 0.606236 | 0.023696 | 0.573133 | 0.598715 | 0.601417 | 0.524771 | 0.647226 |
41860 | \n", "San Francisco-Oakland-Fremont, CA Metro Area | \n", "4335391 | \n", "1840372 | \n", "349895 | \n", "994616 | \n", "938794 | \n", "211714 | \n", "0.424500 | \n", "0.080707 | \n", "0.229418 | \n", "0.216542 | \n", "0.048834 | \n", "0.859532 | \n", "0.901183 | \n", "0.891526 | \n", "0.711379 | \n", "0.645037 | \n", "
20 rows × 17 columns

 | msas | P0050001 | P0050002 | P0050003 | P0050004 | P0050005 | P0050006 | P0050007 | P0050008 | P0050009 | P0050010 | P0050011 | P0050012 | P0050013 | P0050014 | P0050015 | P0050016 | P0050017 | NAME | Total | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
metropolitan statistical area/micropolitan statistical area | | | | | | | | | | | | | | | | | | | | | |
35620 | [New York-Northern New Jersey-Long Island, NY-... | 18897109 | 14569549 | 9233812 | 3044096 | 31377 | 1860840 | 4859 | 93753 | 300812 | 4327560 | 1943852 | 318520 | 61255 | 17421 | 3729 | 1670891 | 311892 | New York-Northern New Jersey-Long Island, NY-N... | 18897109 | ... |
31100 | [Los Angeles-Long Beach-Santa Ana, CA Metro Area] | 12828837 | 7127975 | 4056820 | 859086 | 25102 | 1858148 | 30821 | 30960 | 267038 | 5700862 | 2710537 | 48532 | 65858 | 26521 | 4627 | 2545313 | 299474 | Los Angeles-Long Beach-Santa Ana, CA Metro Area | 12828837 | ... |
16980 | [Chicago-Joliet-Naperville, IL-IN-WI Metro Are... | 9461105 | 7504025 | 5204489 | 1613644 | 12777 | 526857 | 1975 | 13026 | 131257 | 1957080 | 979392 | 32349 | 23748 | 5944 | 986 | 815750 | 98911 | Chicago-Joliet-Naperville, IL-IN-WI Metro Area... | 9461105 | ... |
19100 | [Dallas-Fort Worth-Arlington, TX Metro Area] | 6371773 | 4619607 | 3201677 | 941695 | 24758 | 337815 | 5431 | 9049 | 99182 | 1752166 | 959603 | 20176 | 18632 | 3688 | 765 | 668721 | 80581 | Dallas-Fort Worth-Arlington, TX Metro Area | 6371773 | ... |
37980 | [Philadelphia-Camden-Wilmington, PA-NJ-DE-MD M... | 5965343 | 5497175 | 3875845 | 1204303 | 9541 | 293656 | 1563 | 10971 | 101296 | 468168 | 192506 | 37477 | 6799 | 2110 | 653 | 191036 | 37587 | Philadelphia-Camden-Wilmington, PA-NJ-DE-MD Me... | 5965343 | ... |
5 rows × 35 columns
\n", "