Understanding the academic achievement gaps

Warning: This is a long, somewhat meandering post and a work in progress.

My intent here was to compile the evidence in a narrative fashion.  There are more detailed and more technical sources for much of the information I present here, but much of it is scattered and much of it is targeted at people who are both knowledgeable and willing to invest the time.  My approach here was to present the information in a relatively accessible, top-down fashion, i.e., first identify the magnitude of the problem, then characterize it, then present evidence that the favored environmental explanations do not add up, and then (briefly) touch upon some more controversial hypotheses….

One of the first things that clued me into the fact that school systems and socioeconomic status cannot explain the black-white (B-W) academic achievement gaps was seeing SAT data like this:

sat race income 2003

sat race education 1995

sat race income 1995


The obvious pattern here is that high socioeconomic status (SES) blacks do no better (and often worse) than low SES whites, whether measured by their parents’ income or their parents’ educational credentials.   This is really hard to explain away as being mainly a product of poverty, bad schools, and things of that sort.

Roland Fryer and Steven Levitt wrote a paper on this subject that shows that the academic achievement gaps start before black children enter first grade and that, even after an exhaustive set of controls, the gap grows by approximately 0.1 standard deviations per year through the 4th grade.  They point out that these gaps exist in the same schools, same classrooms, and with the same teachers, i.e., differences in the education inputs in the form of segregation, funding, tracking, or the like cannot “explain” more than a tiny fraction of the observed difference.

Since these models can be a bit complicated (they are susceptible to assumptions) and some people have made some silly complaints about the SAT and the like, I am going to (tediously) document that differences of similar magnitudes can be found much earlier, i.e. K-12 academic outcomes, and that the usual proffered explanations simply do not stand up to even modest scrutiny (using less complicated methods with nation-wide data).

Take a gander at this data from the Department of Education’s NAEP Data Explorer.


White kids whose parents did not even graduate high school do as well as black kids with college graduate parents.


Similar patterns are found even after you control for school poverty.  Kids in low poverty schools (using school lunch eligibility proportions as a reliable proxy) tend to be much higher SES than their presumably similarly educated counterparts in high poverty schools.  The academic rigor and grading standards of schools tend to reflect the demonstrated academic performance of the community.

Moreover, low SES groups are more likely to attain less demanding credentials (as measured by SAT scores, academic rigor, choice of major, etc) in higher ed and beyond (see illustration below for this concept).  “College graduates” in low SES or predominantly URM communities are far more likely to be of the community college or non-competitive variety than “college graduates” in high SES communities (low school lunch eligibility).


Put differently, the parents’ nominal educational credentials, as reported in these sorts of statistics, do convey meaningful information (especially within community differences), but they are still fairly crude proxies for the things we care about (as in, actual abilities, actual academic curriculum/education, etc) and cannot be assumed to allow 1:1 or apples-to-apples comparisons between substantially different communities by race/ethnicity, SES, or even in significantly different places [Note: this is one of the huge mistakes made by observational studies that use these observables overly literally to impute strong causative effects to “poverty” and other measures of economic well being]


Blacks at majority white schools do not perform dramatically better.  White kids in majority black schools don’t do dramatically worse either (controlling for just parental education, which is crude).  There is a general consistency to these racial/ethnic gaps within all of these various units of analysis.


Not much changes if we look at the proportions of the school that are black either.


Or latino proportions.

Of course, most of the apparent racial proportion effect is driven by its correlation to the schools’ demographics (including SES).   Schools with large proportions of blacks or hispanics tend to be relatively low SES all over.  You can see that most of this relationship evaporates if you compare school-wide free lunch proportions vs percent of school white.


Or using the NAEP regression analysis tool to crunch the individual data directly (using school lunch eligibility percentages AND percent of white students):


Note: the small regression coefficient between 0 and 51+ percent white is just 2.2 points using just race/ethnicity and school lunch program eligibility (about 0.05 standard deviations and much smaller than the coefficient for blacks, latinos, etc).  It certainly does not look to me like there is any large or consistent effect in whiter schools for any group.  The data explorer program only allows 3 measures at a time, but I’d bet with more variables (e.g., parent education) it’d be even smaller.
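The conversion in the note above is just the point difference divided by the scale's standard deviation. Here is a minimal sketch of that arithmetic. The function name is made up for illustration, and the candidate standard deviations are assumptions rather than figures from NAEP documentation; 44 is simply the value implied by "2.2 points is about 0.05 standard deviations".

```python
def points_to_sd(point_diff, scale_sd):
    """Express a raw score difference in standard deviation units."""
    if scale_sd <= 0:
        raise ValueError("scale_sd must be positive")
    return point_diff / scale_sd

# Assumed standard deviations (in scale points); none are official NAEP values.
ASSUMED_SDS = (30.0, 35.0, 40.0, 44.0)

# 2.2 points is the regression coefficient quoted in the note above.
effects = {sd: points_to_sd(2.2, sd) for sd in ASSUMED_SDS}

for sd, effect in effects.items():
    print(f"assumed SD = {sd:g} points -> {effect:.3f} SD")
```

Whichever of these assumed standard deviations is closest to the real one, the coefficient stays well under a tenth of a standard deviation.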

Also note that the school setting (urban/rural/suburban) seems to have little effect along these lines too.


The much mythologized “suburbs” do not systematically outperform large cities once you account for school characteristics like national school lunch program eligibility proportions and the individual’s own race/ethnicity (as reported by the school).   Moreover, small towns and rural settings, where whites are much over-represented, do somewhat worse than we’d expect with these sorts of controls in place.

We find broadly similar results if we control for the students’ parents’ educational credentials instead.


My point here is that the students’ individual race/ethnicity, their parents’ educational credentials (despite the above-mentioned flaws), and school SES/poverty more broadly are much better predictors of individual student outcomes than urban/suburban/rural or school racial/ethnic proportions per se.  Although the “data explorer” product won’t allow me to evaluate more than 3 measures at a time (one of which is race/ethnicity) to evaluate this directly, it’s unlikely that adding school setting or racial proportions will add all that much incremental power based on this analysis.

Of course even in the same school districts we routinely find results that seem to vary strongly according to the proportions of the “minorities” in the schools.

NYC school level “college readiness” as proportion black or latino


Many people mistakenly believe that this must be because the schools are underfunded, have much worse teachers, etc., but the reality is that this can be predicted quite well by looking at the race/ethnicity of the individual students and better measures of individual SES (school SES tends to correlate for obvious reasons).  Schools that are predominantly URM schools are also predominantly low SES.  We know that both low SES and black (or latino) status predict much worse average outcomes (as a general rule) even in the same schools, classrooms, and the like.

Thus it is not surprising that when we actually look at much-bandied examples of school integration “success” like Louisville, KY (Jefferson County school district) or Charlotte, NC, we see precious little evidence of equalized outcomes within the schools or even appreciably better minority results as compared to other areas nationally.  These presumably exemplary school districts appear to do worse by blacks than the “highly segregated” NYC schools.

By parental education level:


Simple means:


In reporting unit B-W gaps


Note: These comparisons are only possible where they provide the data.  If the n falls below some very conservative threshold number they mask the data to prevent individuals from being conceivably identified.

Two parent households

It does look like there is a modest correlation between two-parent household status and outcomes using school SES or parent educational credentials.  However, it does not eliminate the B-W gap and it is probably at least partially confounded by the fact that higher SES groups are more likely to get (and stay) married these days (put differently, these proxies for SES are crude enough that there is still likely to be significant residual power left in markers like marriage).


These differences do not just show up in test scores alone

Despite the fact that there is substantial systematic variation in academic rigor, grading standards, and course selection across schools nationwide, these patterns are visible in raw GPA and related in-school measures.   Even without test scores or adjustments for academic rigor, it is quite obvious that there are large differences in GPA between racial and parent education groups.

12th grade GPA by race and overall school GPA percentiles


12th grade GPA by race, parents’ education, and school GPA percentiles


The B-W differences are actually larger amongst high SES groups than low SES groups and high SES groups generally earn higher GPAs.

If you go further and look within the reported curriculum level achieved (academic rigour) the differences between groups grow even further and they better approximate the patterns we find in standardized testing.



These patterns are similar even in high vs low minority schools.


There are also large differences in post-HS expectations.  Curiously, blacks actually have notably higher academic expectations at any given GPA / academic rigour level (affirmative action likely plays a large role here!).


This despite the fact that they have lower rates of credential attainment in absolute terms.


The differences in HS “completion” are even larger if you exclude GEDs and the like.


Long story short, these standardized test score differences represent real, objective, and meaningful gaps in academic ability and academic achievement.

A brief exploration of cultural explanations

While I personally believe that “cultural” differences (broadly defined) probably play some role in this, there is not much direct evidence for it in practice.

Reported homework hours, for instance, seem to be pretty similar between blacks and whites (though I take self-reports with a HUGE grain of salt).


Media consumption is reported to be significantly higher amongst blacks as compared to whites.  There does seem to be a correlation between the two, but whites whose parents just graduated HS and report watching 6+ hours outperform blacks whose parents graduated college and watch 1 or fewer hours…


Likewise, differences in reported homework hours don’t seem to “explain” these differences either.



Likewise, while I think there is something to the “acting white” argument (in some sub-groups), we, again, observe large differences in metrics that should be a fairly decent proxy for this sort of thing.

White HS grad kids who supposedly strongly agree with the proposition that friends make fun of people that try to do well in school do about as well as black college grad kids who strongly disagree (and certainly better than their white counterparts with the “same” nominal credentials)


Similar patterns are observed with reported parental involvement with school studies (and there are large differences within groups according to this measure).


Likewise if we look at reading scores according to reported beliefs about learning through reading:


If these self-reports are remotely honest, then the gaps are much too big, in my opinion, to be explained by plausible differences in subjective views or small differences in forthrightness between the groups.  I can believe that there is some difference, which makes 1:1 comparison impossible, but when we observe differences this profound across radically different ends of the distributions it strongly suggests that something more profound is at work.

Objective differences in adult literacy levels

There are similar large objective differences in adult literacy levels as measured by the OECD PIAAC.

Adult literacy by race (& ethnicity) and highest educational credential (detailed)


Adult literacy by race and highest educational credential (collapsed)


Adult literacy by current work requirements education level


and by age group…


Adult literacy by race and income decile


Adult literacy by race and economic sector


[I should have added educational credentials into this!]

Adult literacy by race and hours per week at current job


These adult literacy differences are not explained by parent educational credentials either

Adult literacy by RE & father’s education level


Adult literacy by RE & mother’s education level


It is not just a literacy problem

Technology problem solving


Numeracy by income level


Numeracy by education level


The reason why school/neighborhood SES predicts student outcomes is that parent SES is well correlated with ability and people don’t move at random

There is a strong relationship between SES, as measured by education or income (and especially both combined), and cognitive ability, literacy skills, numeracy, and so on and so forth.   That is to say that there are differences in fundamental skills that the vast majority of children today are exposed to in the primary and secondary school.  These differences persist well into adulthood.

Adult literacy scores by detailed educational credentials (all race/ethnic groups)


Literacy score by income percentile


Literacy score by income and age group


[Note: There are age specific income patterns due to income mobility and educational pipelines.  Also there is a documented decline in fluid cognitive ability as people age…]

Mean income by IQ decile (white men age 40-50)

The point here is that neighborhoods and schools are substantially sorted by ability, conscientiousness, actual acquired education/knowledge, occupational interests, and more.   The proxies that we use to try to assess individual SES (e.g., binning nominal educational credentials into broad categories) are, in many ways, less powerful indicators than this same information aggregated at a community or school level.

URMs are typically not held to the same academic standards and tend to produce their credentials and income differently

A significant reason why we see such large differences between people of the “same” SES largely has to do with: neighborhood specific grading standards in primary & secondary schools; affirmative action; “disparate impact” laws and torts; quotas in government and gov’t contractor hiring; individuals sorting into less cognitively demanding occupations (e.g., sales instead of engineering); and so on and so forth.  These systems/issues/flaws are not enough to fully offset the underlying problems in URM communities (i.e., most of them are still poorer, less credentialed, etc), but they are enough to strongly skew the statistics when we try to make apples-to-apples comparisons between groups based on educational attainment or income levels.

Put differently, the apparent disconnect between nominal measures of parent SES and childhood academics has less to do with regression to the mean than the fact that very few of the parents ever achieved at comparable academic or cognitive levels as their white or asian peers.


Of course, most progressives assume that childhood cognitive abilities and academic achievement are somehow purchased through better nutrition, better schools, poorly specified “enrichment” activities, and so on and so forth, but this is generally wildly at odds with the evidence or, at least, has little in the way of empirical support behind it.  We clearly see that even the highest income and/or highest credentialed blacks fail to perform appreciably better than generally poor and/or uneducated whites (and especially most asian groups here).

These differences start from a very young age

Differences in language processing skills and proxies for cognitive ability are found in children as young as 18 months of age along SES measures, long before differences in school systems have a chance to have an impact.


Note: The low SES groups are approximately where the high SES groups were 6 months earlier in accuracy and reaction time.

There appear to be large differences in some early childhood parenting practices by SES

It appears that there are large differences in verbal engagement and parenting practices by SES.


Of course, correlation does not imply causation!

These differences exist between blacks and whites at a young age

Observed raw verbal IQ scores by age (months)


These differences cannot be explained by SES:

verbal_iq_race_ses_age

High SES blacks are performing worse than low SES whites as early as 36 months of age, much like we see in the various test scores later in life.

Early childhood intervention programs have failed to demonstrate significant long-term positive cognitive or academic gains

Brookings Institution Report:

Not one of the studies that has suggested long-term positive impacts of center-based early childhood programs has been based on a well-implemented and appropriately analyzed randomized trial, and nearly all have serious limitations in external validity. In contrast, the only two studies in the list with both high internal and external validity (Head Start Impact and Tennessee) find null or negative impacts, and all of the studies that point to very small, null, or negative effects have high external validity. In general, a finding of meaningful long-term outcomes of an early childhood intervention is more likely when the program is old, or small, or a multi-year intervention, and evaluated with something other than a well-implemented RCT. In contrast, as the program being evaluated becomes closer to universal pre-k for four-year-olds and the evaluation design is an RCT, the outcomes beyond the pre-k year diminish to nothing.

I conclude that the best available evidence raises serious doubts that a large public investment in the expansion of pre-k for four-year-olds will have the long-term effects that advocates tout.

This doesn’t mean that we ought not to spend public money to help families with limited financial resources access good childcare for their young children. After all, we spend tax dollars on national parks, symphony orchestras, and Amtrak because they make the lives of those who use them better today. Why not childcare?

It does mean that we need public debate that recognizes the mixed nature of the research findings rather than a rush to judgment based on one-sided and misleading appeals to the preponderance of the evidence.


Interventions do not seem to work in primary or secondary school either

Experimentally designed studies, like “moving to opportunity” (MTO), wherein whole families are moved to much lower poverty neighborhoods with better schools find no evidence of lasting significant academic or cognitive gains as compared to the control (intent-to-treat) groups.

If there are any real academic or cognitive gains they are apt to be very, very modest.

The effects of winning the lottery (exogenous income/wealth shock) do not seem to cause significant lasting gains

A study of Swedish lottery winners found similar null results for academic and cognitive outcomes (although this is not exactly an experimental design, the random income/wealth shock gets us very close to it).


And yes, despite the fact that Sweden has a very large welfare state and is more homogeneous than the US (historically), there are still large differences at birth in academic, cognitive, health, and numerous other outcomes.


Quantitative and qualitative data from some of the best open enrollment schools in the country point in a similar direction

Some people argue that experimental designs like MTO don’t mean anything because the neighborhood change wasn’t dramatic enough (never mind that the data shows that it was very significant).  Presumably they think that the “best” schools with the highest test scores are that way because they spend a lot of money…. or something.

To this I say, take a look at Lower Merion in suburban Philadelphia and this 2006 article on academic outcomes of (relatively middle class) blacks there:

With an average household income of $86,373, LMSD can spend $19,392 per pupil annually, more than twice as much as the majority of Philadelphia’s schools and more than nearly every other American public school district. Lower Merion High School, one of the district’s two high schools, was one of the Wall Street Journal’s top 60 high schools in April 2004, public or private, and given that the median Lower Merion home costs $334,500, it is unsurprising that 94 percent of graduates attend college. District schools routinely win some of the most prestigious state and national competitions, such as the National Science Olympiad. Eighty percent of the district’s students are proficient or better in math and reading on the Pennsylvania System of School Assessment (PSSA). But what the white Main Line sees as a source of pride infuriates South Ardmore, where most of LMSD’s few blacks live. Only 27 of LMSD’s 500 black students are identified as gifted; for whites, 790 out of about 6,000 make the cut. (That’s five vs. 13 percent.) One in four blacks is in special ed.

Most alarming, 60 percent of black students are not grade-level proficient in reading and math in a school district flush enough to provide many staffers with snazzy digital organizers and to test-drive a global positioning system to track its school buses. Which is why, at that highly charged January meeting, Mosley also said, “We are particularly enraged that this district dares to take credit for being one of the top school districts in the state, even the nation, at the same time that it allows our African-American students to stagnate!”

One district, two very different realities–that much is clear. What we don’t know is whose fault it is that Main Line children are doing so poorly–whether the school district is to blame, or whether, as Bill Cosby has pointedly suggested in recent remarks, much of the fault may lie with black parents and students themselves.


The black students at Lower Merion are (at worst) lower-middle class, neither rich nor poor, whereas the whites are mostly upper-middle class.   Most of them went through the same schools as everyone else starting in kindergarten and their neighborhood (South Ardmore) is not a crime-ridden “ghetto”.  And yet, not only do they perform worse than their mostly high SES white peers, they perform worse than the state average.  Clearly their deficit cannot be explained by differences in school quality.

Many people seem to forget that the parents of these “rich” schools are (mostly) highly educated by national standards.  Intelligent well educated people are far more likely to have children that are also intelligent and well educated (both genetic and cultural).





Similar patterns are found in other “good” schools with substantial black proportions, especially when there are significant SES differences.   The reason why “good” schools are “good” on aggregate has much, much more to do with the sorts of students in them than the inputs associated with the schools themselves (e.g., per pupil spending, teacher credentials, class size, computer labs, etc).  There is little, if any, systematic relationship between “good” in the marginal, value-add, sense of the word and good aggregate performance.   Predominantly high SES white schools (like Lower Merion) are good mostly because the students are smarter and more motivated than average.  Different groups perform differently mostly because they are different, not because of the school “quality” for the most part.

A brief analysis of Pennsylvania’s PSSA test results


[blacks perform much worse than their white peers in the same schools in general]


[white scores are uncorrelated with URM proportions, whereas black scores appear to be modestly correlated]


[overall scores are well correlated with economic disadvantage]


[but different groups clearly experience different outcomes in the same schools.  Whites and asians significantly outperform blacks even in significantly disadvantaged schools]


[Predominantly URM schools are more likely to be poor, but there are still many “white” schools that are poor too.  It’s not surprising the URM proportions correlate with underperformance for blacks, in particular, for this reason]


[White outcomes correlate somewhat with economic disadvantage proportions]


[Black outcomes correlate better with economic disadvantage than URM proportions.]


[When we compare directly by score, we find that whites and asians are much more likely to reach “advanced” attainment levels than their black peers]


[Likewise for whites and asians versus latinos, albeit to a lesser degree]

The SAT and the PSSA correlate strongly…


Note: The “effect sizes” of parental education are generally much higher than parental income.

Data from interracial families

It appears that black children with white mothers experience approximately intermediate cognitive and academic outcomes.  See this study by Peter Arcidiacono et al.



Note: The family characteristics of the black children with white mothers are, by many measures, worse than those of their black counterparts (see income, single parent, welfare status, etc).


Note: The black children with white mothers lived in somewhat “whiter” neighborhoods and attended “whiter” schools, but the economic characteristics of the neighborhoods and, especially, the schools are pretty similar otherwise.


Note: They report that both the mother and father typically have similar mean IQ scores, scores that are about half way between the overall white and black means.   Also, the black fathers appear to be less involved in day-to-day childrearing in the interracial case.


Note: Small differences in media use, sleeping, etc. The white moms of black children are more likely to work and work longer hours.

Regression analysis for black boys (mixed and otherwise).



Mother’s characteristics: income, on welfare, single parent, mother’s age, mother’s education, and biological mother.

Father’s characteristics: child knows anything about, child lives with, child ever lived with, child speaks to weekly, HS diploma, some college, college degree, no child support requirement, missing race, missing education, and monthly child support payment.

Their baseline IQ gap is 0.86 SD and the mother’s race with assorted controls accounts for 0.57 SD without school fixed effects and 0.36 SD with school fixed effects.  Those gaps are still significant and that is with a fairly generous set of assumptions regarding the arrow of causation and the comparability of educational credentials…

The authors of this paper rule out genetics early on with nary any thought and then go on to argue that socially defined race or skin color gaps can be almost entirely explained by “observables” (which includes the mother’s race, amongst others).

interracial_full_factor_model

Moving from model 1 to model 3 (more controls) causes the apparent effect of the student’s “race” (black or hispanic) to fall below the level of statistical significance.  This would appear to suggest that once you control for the race of the mother and her educational credentials, the race of the child, her income level, and the like are of relatively little importance.

Note the large gap between the effect of the mother’s educational credentials and the father’s educational credentials.  The mother’s college degree accounts for ~0.3 SD but the father’s college degree is only worth ~0.1 SD.  Likewise, the mother’s HS degree is worth ~0.1 SD whereas the father’s is worth ~0.03 SD.  This ought to jump out at people.


We do not see these sorts of patterns when we look at the national data within racial/ethnic groups.  The mother’s and father’s education credentials are about equally predictive and they appear to be quite additive.

The regression analysis shows much the same thing:


We do, however, observe familiar and pronounced differences across racial/ethnic groups in the matrix.  Black kids whose parents both graduated from college score just 5 points (~0.15 SD) better than white kids whose parents both failed to graduate from high school.  The reason why the minority father’s education in their analysis has so much less effect than the mother’s education is almost certainly the result of the fact that the white mother’s educational credentials in their analysis are so much more meaningful (“real”) measures of accomplishment than a naive reading of nominal measures would suggest (which goes to verbal skills, cognitive skills, and other aspects).  Whether the transmission from mother to child is best explained by genetic, verbal/interactions in early childhood, varying parenting practice, or what have you, I’ll leave that be for now…

Skin tone

There appear to be significant differences in cognitive and academic measures according to skin tone.  Without any controls the difference between light and dark is about 0.2 SD, but after the race/ethnicity of the mothers and other characteristics associated with them (e.g., education) are taken into account there isn’t an appreciable difference in this analysis.


My interpretation of this data is mainly that “colorism” per se likely plays little to no role in explaining these patterns.  I don’t want to get into the weeds here too much, but there is a documented relationship between skin tone and african racial admixture amongst african-americans; the correlation is roughly 0.44.


Thus darker african americans will tend to be significantly more african and people with high african admixture will tend to be dark, even though some people with light skin will have significantly more african ancestry than you might expect and some people with dark skin will have unusually low african ancestry.  Put differently, skin color is an imperfect proxy for african ancestry.
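To put a number on "imperfect proxy": assuming a simple linear relationship, squaring a correlation gives the share of variance in one measure that is linearly associated with the other. The sketch below uses only the 0.44 figure quoted above; the variable names are mine.

```python
# Correlation between skin tone and admixture, as quoted in the text above.
r = 0.44

# Under a simple linear model, r squared is the share of variance in one
# measure that is linearly associated with the other.
shared_variance = r ** 2
unshared_variance = 1.0 - shared_variance

print(f"shared: {shared_variance:.1%}, not shared: {unshared_variance:.1%}")
```

So most of the variation in one measure is not captured by the other, which is why individual exceptions in both directions are common.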


Histogram of african american melanin content


[It is approximately normally distributed]


In any event, we find the skin color – outcomes relationship in numerous outcomes measures and it is actually pretty strong.

See this table for IQ – skin color relationships

add-health-verbal-iq-of-blacks-by-family-ancestry-by-skin-color-weighted 2

Note: This was extracted from the NLSY97 dataset — large N.

We also find similar outcomes patterns with admixture data per se:


To wrap this up…. for now

None of this proves that the cause is substantially genetic (and I don’t feel like reviewing the evidence for this hypothesis just yet), but the evidence is quite consistent with the hereditarian hypothesis whereas the popular progressive notions about the causes, like uniquely “bad schools” for blacks or uniform SES induced differences, simply do not come anywhere close to explaining these differences.  If it is caused mostly by cultural differences, like verbal skills or parenting practices, it sure still acts a lot like a statistical difference in genetic cognitive abilities.  More importantly, at an operational level, we certainly do not know how to fix it today (it’s not for want of effort)!

Race is not just a social construct

I have frequently heard people insist that “race is just a social construct”, that there is no genetic basis to it, that it has no statistical relevance, and so on and so forth.  This is clearly wrong, as others have pointed out repeatedly, but people keep on repeating it for some reason.   To save myself and others time next time around, here is a compilation of the facts, evidence, expert opinion, and more that ought to settle the issue for most fair-minded people that are not overly ideologically blinkered.

In no particular order….

Expert opinion

Jerry Coyne on race: 

I do think that human races exist in the sense that biologists apply the term to animals, though I don’t think the genetic differences between those races are profound, nor do I think there is a finite and easily delimitable number of human races.  Let me give my view as responses to a series of questions.  I discuss much of this in chapter 8 of WEIT.

On the popularly reported black implicit association test (IAT) results

Recently the media and various friends and family have been asserting that implicit association tests (IAT) “prove” that whites are biased against blacks and that this presumably substantially explains the racial disparities in police shootings.


Since I am skeptical about the racial angle in police shooting, the validity of measures like IATs, and of received wisdom in general, I thought I would take a look at “Project Implicit” to better understand it.  The raw data for these results is available in SPSS format on OSF.io (albeit at >2GB) so I downloaded the data and performed some analysis in R.

Here are a few things I can say:

1: The reported averages by group (e.g., ethnicity, gender, political views, etc.) hide a lot of variance within groups and overlap amongst groups.


The typical standard deviation is ~0.4 for every group with a reasonably large N.



[Note: 1 = strongly conservative, 7 = strongly liberal.  N for 7 is fairly small]



2: The differences between the averages are typically very small by comparison to the variance in all of these groups

The only arguable exception to this is the target group (as in, blacks in this case), but even then this presumably implies that a lot of blacks are strongly biased against black people.
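To put "small relative to the variance" in concrete terms, here is a minimal Python sketch of a standardized difference (Cohen's d) and the overlap between two group distributions. It assumes both groups are normally distributed with the same SD of ~0.4 mentioned above. The two group means are made-up illustrative values, not figures taken from the Project Implicit data, and the function names are hypothetical.

```python
from statistics import NormalDist


def cohens_d(mean_a, mean_b, sd):
    """Difference between two group means, in units of the (shared) SD."""
    return (mean_a - mean_b) / sd


def distribution_overlap(mean_a, mean_b, sd):
    """Overlapping area (0 to 1) of two normal curves with a shared SD."""
    return NormalDist(mean_a, sd).overlap(NormalDist(mean_b, sd))


# Illustrative values only: the means are invented, the SD is the ~0.4 noted above.
SD = 0.4
mean_group_a = 0.35
mean_group_b = 0.30

d = cohens_d(mean_group_a, mean_group_b, SD)
shared = distribution_overlap(mean_group_a, mean_group_b, SD)
print(f"d = {d:.3f}, overlap = {shared:.1%}")
```

With a within-group SD this large, a gap of a few hundredths between averages leaves the two distributions almost entirely overlapping, which is the point being made about the reported group means.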

3: The measured differences in the United States between whites, asians, hispanics, and other major ethnic groups are pretty negligible.


If these tests are measuring something “real” and important, the reporters ought to observe that non-hispanic whites are broadly in line with every other group, save for blacks.

4: The patterns appear to hold up internationally

iat_white_both_rez_and_cit iat_white_by_country_primary_residence iat_white_by_primary_citizenship



These patterns appear internationally and do not appear to be notably stronger amongst non-hispanic whites in the US than they are in European countries generally. Other ethnic groups, both inside and outside the US (various configurations), also have this “bias” (again, save mainly for blacks/africans).  Non-Hispanic whites appear to have a slightly stronger “white” bias than other ethnic groups, but I would hardly say that that is strong evidence for even marginally higher anti-black bias.  As in, it’s more likely that groups prefer their own and certainly find them more familiar, other things being roughly equal.  I would bet that acculturation of minority groups in various majority populations likely makes them more familiar with the faces they encounter more regularly, providing that experience is not a generally negative experience.

5: Said bias appears to correlate positively with “diversity”


Non-hispanic whites reporting to live in counties (not just the state) with large black populations are more likely, not less likely, to exhibit this bias.


Likewise, “whiter” counties are less likely to exhibit this “bias”.  I would think this would run counter to the narrative of the people reporting this stuff, i.e., presumably whites with less interaction with blacks should harbor stronger biases, not weaker biases.

iat_black_scatter_by_pct_nhw_means iat_black_scatter_by_pct_county_black_means




iat_black_scatter_faceted_by_state_black

[Note: East Asians, Hispanics, and other groups also demonstrate more “bias” the larger the black population and/or the smaller the non-hispanic white population]

See here for: IAT (W-B) amongst non-hispanic whites PDF (long) density graphs

6: Similar “bias” exists against “Asians”






Asians have higher income, higher education, significantly lower imprisonment rates, higher life expectancy, better average grades, and so on and so forth for most measurable statistics.  A certain breed of progressive even calls them a “model minority”.  They surely also have fewer interactions with the police despite this IAT “bias” that is presumably every bit as strong against them (and I would bet that they are under-represented nationally in police shootings and in crimes against the police, i.e., based on my recollection of summary statistics from my earlier review here).

7: The tests have real measurement issues


People that report taking more IATs have “significantly” less bias.  I suspect there is a real training issue here (which calls into question the whole pursuit)


Likewise, there is variance (which they imply is significant in other contexts) depending on the order of the test.  If all groups are observed at a high rate this might not be an issue, but when the N is small this is likely to (randomly) skew the results further, i.e., depending on which order the test takers are randomly assigned.

8: The faces chosen surely influence the results

I have not reviewed the literature, but, other issues aside, they only use a small number of faces to reach sweeping conclusions about race bias/association.  Are these faces and expressions truly representative of the broader black or white population so that we can draw sweeping conclusions about normal interactions in the real world?  I would bet that there is considerable variability depending on the facial structure, gender, expression, facial hair, etc present on the face presented (I don’t think they had this level of detail in their data files, but it’s possible I missed it), not to mention things that they do not even show like clothing, observed behaviors, etc.



Some additional perspective on the heat maps by state



[Note: some sample sizes with some of the minority groups are small in some of these states (I should have filtered them first!), so I wouldn’t read too much into them other than, perhaps, to note broader patterns in this data]

A brief post on racial disparities in officer involved shootings

I have recently heard it said that the reason the police shoot blacks, especially young black men, at such a disproportionate rate is because they have an irrational fear of them because they are black.   Presumably the proponents of this view believe that shootings, “justifiable” or otherwise, should happen in roughly equal proportion to their share of population.  Although I do not believe the police are incapable of excessive force, racial discrimination, negligence, or what have you, the presumption that such disparities must be explained by presumed irrational fear of blacks strikes me as terribly naive on several levels.

Robert VerBruggen of RealClearPolicy did an interesting post on “Race, Age, and Police Killings” a few weeks back that compared nation-wide homicide rates by age group and race to the police shooting statistics.

rcp_white_black_homicide_offenders rcp_whites_blacks_killed_by_age


I thought this was a good and fair way to better illuminate the “fairness” issues here, since groups (e.g., sex, age, race, ethnicity, education, etc) that commit more murder (and other violent crimes) nationally can be reasonably assumed to be more likely to have confrontations with police and more violent confrontations when they do.

I found some data to take this point further by looking more granularly at the demographics of offenders that have actually killed law enforcement and offenders that have assaulted and seriously injured the police (as in with guns, knives, etc).  This data gives us a much better sense of the risks posed by each group to the police and which groups are relatively more likely to be confrontational, disobey, or even resort to violence, i.e., it speaks much more directly to the dynamics of police encounters with particular demographics (to the extent that one can argue that, say, national homicide rates are only black-on-black, gang-on-gang, or some such).  Most police encounters do not result in death of either party or even an exchange of gun fire, but groups that kill, injure, or assault the police at (much) higher rates can be reasonably presumed to be at (much) higher risk of getting killed by the police, “justifiable” or otherwise.

Law enforcement officers feloniously killed by demographics of known assailants


[Blacks killed 44% of the police officers killed in the FBI’s 2003-2012 data]


[Those under age 35 (all races) accounted for the vast majority of police killings]


[Males killed 98% of the police in this dataset]

Law enforcement officer assaulted AND injured with firearms, knives, and other cutting instruments in 2013 by demographics of known assailants


[Blacks accounted for 54%]


[Age distribution of white assailants]


[Age distribution of black assailants.  Notice that, amongst blacks, young people account for a much larger share of the assaults than amongst whites]

Source: A study of felonious assaults on our nation’s law enforcement officers

I am not involved with law enforcement in any shape, way, or form. I do not believe that the police are beyond reproach.  I do not subscribe to the notion that police tactics are necessarily optimal today. I do not believe that there is zero bias or racism in police departments nationally.

However, it does not strike me that these disparities are out of line with the data that most honest people should be well aware of by now.  The police may well be more “fearful” in dealing with some groups more than others (e.g., blacks more than others, males more than women, young more than old, poor more than rich, big more than small, etc), but that does not imply that there is not a very real rational basis for their fears or that this fear is likely to explain the outcomes we find.  When we know that some groups are actually disproportionately likely to kill, injure, or assault police we should not be very surprised when these same groups are the “victims” (“justifiable” or otherwise) of police shootings at higher rates too.



Some of this is likely because some groups clearly have many more encounters with police, i.e., because they commit crimes at higher rates or are otherwise involved in circumstances that require police intervention (e.g., fights, disturbing the peace, etc).  Some groups are more likely to resist arrest, to confront the police, or even to try to assault or injure the police on average.  Some cities and communities also have different risks and relationships with police.  These and other factors are likely to notably increase each group’s odds of getting shot and/or killed by the police.  We do not need to invoke notions like irrational fear to explain why some groups are disproportionately likely to get shot.   In this particular case, I cannot help but notice that blacks accounted for 44% of the LEO killed and 54% of injuries of LEO (assaults with guns and knives) but “only” 31% of LEO’s victims (at least in 2012).

Moreover, I object to the false dichotomy being presented here by many progressive and libertarian-types, i.e., either the victim(s) “deserved” it or the individual police officer(s) are guilty of murder or, at least, manslaughter.  Police officers are imperfect human beings who we depend on to enforce the law and whose lives are at very real risk in these sorts of situations (usually).   That law enforcement may sometimes make mistakes, both in the shootings and in the moments leading up to them, does not mean that every shooting is a mistake or that every victim “deserved” it.  People that argue this point of view demonstrate a poverty of imagination.

Just because we learn that a particular victim was “unarmed”, for instance, does not mean that law enforcement (especially lone officers) do not have good reason to fear for their life or that they necessarily had other options (e.g., if the suspect already assaulted them, if the suspect is larger and/or stronger, if the suspect disobeys a direct order and charges them, etc).  It especially does not imply that the officer’s individual behavior was actually criminal or should be treated as such. There is room to acknowledge that these victims are not necessarily homicidal murderers that “deserve” to die or that police procedures might be able to change so as to minimize the frequency of these sorts of incidents, but we should also recognize that the so-called victims often play a very large role in their own deaths (e.g., in assaulting the police officers, ignoring their orders, drawing weapons, etc…. not to mention choosing to engage in criminal actions more often than not).

In any event, some groups clearly place themselves at much greater risk of being shot by the police than others for a variety of reasons. If we care about “fairness” and about actually minimizing these deaths (both LEO and the broader community), then we should look at all of the data and try to better understand the root causes instead of putting forth facile arguments about subconscious bias explaining everything with nary a reference to assaults on police officers or what have you!

Some visualizations of ancestry.com’s genetic data

As a quick follow up to my earlier post using ancestry.com’s “Genetic Census of America”, I thought I’d post some more heat maps using the data I aggregated by major continental group (“race”) and by the more granular “adjusted” European ethnicities (i.e., whereby I simply divide the ethnicity by the total european “ethnicities” in the state).

Note: You can click these images for an interactive view to see the actual numbers for each state if you care.

Adjusted European Ethnicities

Continental Groups (no adjustment)


[Note: These values here were all well below 1 percent and openheatmap rounds to the nearest integer, so I multiplied these by 100 to show the small absolute variance in more detail]


Please note that the low levels of subsaharan African ethnicities and american indian ethnicity in ancestry.com’s data imply that blacks and latinos, in particular, are very much under-represented relative to their share of the population.  I would presume that this is because blacks and latinos are much less likely to use AncestryDNA (and probably genealogy services in general).

Below I very crudely estimated their share of the population based on the proportions of American Indian (~40% average for Latinos) and subsaharan african (~80% average for blacks) genetic material, assuming no admixture with other groups (which obviously is not quite right either).
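One reading of this crude estimate, sketched in Python: divide the observed share of a marker ancestry among the test takers by the average proportion of that ancestry assumed for the group, then compare the result to the group's census share. Only the ~40% and ~80% averages come from the text above. The observed fraction and the census share below are made up for illustration, and the function names are hypothetical.

```python
# Average marker-ancestry proportions quoted in the text (rough figures).
LATINO_AMERINDIAN_AVG = 0.40
BLACK_SUBSAHARAN_AVG = 0.80


def implied_group_share(observed_ancestry_fraction, group_avg):
    """Share of test takers implied by a marker ancestry's overall fraction.

    Assumes the marker ancestry comes only from this group (no admixture
    in other groups), which the text notes is not quite right.
    """
    if not 0 < group_avg <= 1:
        raise ValueError("group_avg must be in (0, 1]")
    return observed_ancestry_fraction / group_avg


def relative_usage(implied_share, census_share):
    """Implied share among test takers relative to the census share."""
    return implied_share / census_share


# Made-up example: 4% sub-Saharan African ancestry observed in a state's
# results, in a state where 12.5% of census respondents report being black.
implied_black = implied_group_share(0.04, BLACK_SUBSAHARAN_AVG)
usage = relative_usage(implied_black, 0.125)
print(f"implied share = {implied_black:.3f}, relative usage = {usage:.2f}")
```

The ratio here is relative to the group's census share rather than directly to the usage rate of non-hispanic whites, so it is only a first approximation of the comparison described in the text.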


I am not claiming that these figures are exactly right or uniform in all states (I know there is real variance, especially with hispanics/latinos), but it ought to be pretty clear that they are using the service at something like 20-50% the rate of non-hispanic whites, or at least were at the time ancestry.com compiled this “census” (which is why I felt more comfortable using my crude adjustment method to calculate the European percentages!).

census_comparisons.png-1 census_comparisons.png-2 census_comparisons.png-3 census_comparisons.png-4


The probable genetic explanation for interstate differences in mortality amongst non-hispanic whites

A couple months ago I stumbled across ancestry.com’s  “Genetic Census of America”.  Since I was researching the health outcomes question already I remembered that this data existed and I decided to bite the bullet and actually analyze this data systematically.  Lo and behold, I quickly discovered some very strong correlations between these genetic proportions (crudely without any particular techniques) and the life expectancy of non-hispanic whites in each state.  I refined this a bit and produced a toy model that can explain about 85% of the variance in life expectancy between states (not to mention other things)!

Before I get started, let me get some caveats out of the way:

  • correlation does not necessarily imply causation
  • these particular genetic groups may just be proxies in this country for particular ethnic or other genetic groups (at least in part)
  • this could be “cultural” (people with particular frequencies of SNPs are also more likely to have had particular cultural mores, values, and the like passed onto them through their ancestors/parents).
  • most “whites” have some fraction of other continental groups, but it’s usually pretty small on average
  • the DNA testers may not necessarily be representative of the larger “white” population, but I think it’s good enough to represent the white population (probably less so other groups).
  • binning these together by states and other high levels of aggregation likely improves the “accuracy” of these methods since random accidents, stochastic variances in gene expression, or what have you get averaged out to a large degree.  Likewise, to the extent these groups are just a crude proxy for actual groups, this level of aggregation likely further helps.
  • Ancestry.com does not provide details by race/ethnic group and my procedures cannot perfectly remove any potential signature introduced by others.   Blacks and latinos, in particular, surely introduce some european genetic groups into this data, although they are obviously much under-represented in ancestry’s DNA analysis and I do not think it would skew the results that much.

This is a simple model that I produced to calculate non-hispanic white (NHW) life expectancy by state using a simple genetic calculation and smoking rates (weighted equally on standard deviations from the national non-hispanic mean amongst states).


As for how I got here….

First, you can start to see these patterns with the three groups I keyed in on, i.e., Great Britain, Ireland, and W. Europe, without any adjustments at all.

No-Adjustment: NHW LE by % GB


No-Adjustment: NHW LE by % Irish


No-Adjustment: NHW LE by % Western European


“Great British” is pretty apparent, but the latter two are obviously noisier.  This is without any adjustment for other major racial/ethnic groups in ancestry’s estimates (e.g., blacks, latinos, asians, american indians, etc) that took these DNA tests in those states and, obviously, some states have considerably more (broad-strokes) racial/ethnic diversity than others.  Put differently, states with more non-european groups and thus test takers (albeit seriously under-represented) will tend to skew the european fractions down without any adjustment.

However, if we merely aggregate these three groups together they tighten up a bit:


We can do quite a bit better though if we crudely divide these and other components all by the total european ethnic groups (as defined by ancestry.com, which is generally pretty similar to major continental categories broadly) to prevent these other genetic groups from obscuring this signal (Yes, I know that’s imperfect, but it seems to work!)
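A minimal Python sketch of this adjustment: each European group's share is divided by the total of all European groups in that state, so that non-European test takers no longer pull the fractions down. The state shares below are made up, the group list is abbreviated, and the function name is hypothetical.

```python
def adjust_european(shares, european_groups):
    """Rescale each European group's share by the total European share."""
    total = sum(shares[g] for g in european_groups)
    if total <= 0:
        raise ValueError("no European ancestry in this row")
    return {g: shares[g] / total for g in european_groups}


# Made-up ancestry shares for one hypothetical state (fractions of all results).
state = {
    "Great Britain": 0.30,
    "Ireland": 0.15,
    "Europe West": 0.20,
    "Scandinavia": 0.05,
    "Africa": 0.20,
    "Native American": 0.10,
}
european = ["Great Britain", "Ireland", "Europe West", "Scandinavia"]

adjusted = adjust_european(state, european)

# Combined adjusted fraction for the three groups used in the text.
combined = sum(adjusted[g] for g in ("Great Britain", "Ireland", "Europe West"))
print(f"adjusted GB = {adjusted['Great Britain']:.3f}, combined = {combined:.3f}")
```

After the adjustment the European groups sum to 1 within each state, so two states with the same European mix but different non-European shares end up with the same adjusted fractions.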

For each group I used:

Adj. British


Adj. Irish


Adj. Western European


Adjusted Combined (Great British + Irish + W. European)


I did similar analysis for every other european group (and other non-european groups), most of which produced less strong signals and/or positive relationships (e.g., E. Europe, Scandinavia, etc), but I found this to be the cleanest (probably, in part, because these populations are larger and thus their contribution to mean NHW life expectancy can be most clearly observed).

Moreover, there is a good reason to think that these particular groups (GB, Irish, and W. European) are closer to each other genetically than other groups (geographical, historical, and genetic analysis).

See this PCA analysis (amongst others):


source: Population structure and genome-wide patterns of variation in Ireland and Britain.

Or this:


As for ancestry.com’s methodology definitions of these three ethnic groups, see here:


[Note: As you can see I am about as “Great British” as the average English person, presumably. That is probably significantly more than I should be if I interpret these as strictly nationally-aligned ethnic groups rather than as probabilities of particular ethnic/genetic groups being found in particular regions in higher concentrations.  Although, if you look carefully, you can see a fair amount of GB in northern parts of FR, DE, BE, NL, etc]


[Much less than I would expect based on my knowledge of my family’s recent history, genealogy, and this map!]


A list of ancestry.com’s European ethnic/genetic groups by the average proportions of people in those regions


[Note: Both the “Great Britain” and “Europe West” regions have a lot of other genetic groups.  Ireland, by contrast, is much more “pure” or, more technically, less recently admixed or to a much lesser degree]


Their method is far from perfect and their ethnic labels are a bit confusing, if interpreted in a particular literal fashion, but there is still real information content there and it seems to be good enough to find evidence of population structure within the non-hispanic white population of the United States.

I ran some comparisons against my crude genetic model (GB+IR+WEU fraction alone) to see if there were any apparent systematic patterns whereby other groups seemed to cause notable under- or over-prediction.  I did not find much evidence for this (note: positive Y values=over-prediction)

I also ran a series of cross-checks to see if perhaps other populations might be skewing my data somehow:

My calculated fraction by proportion of census takers reporting to be NHW


[There does not seem to be any systematic relationship between my calculation and the proportion of NHW on the US census]


[States with higher latino proportions on the US census do not seem to skew my calculated fraction significantly]

My calculated fraction by proportion of census takers reporting to be black


[States with larger black populations report somewhat larger adj. GB/IR/WEU proportions, which is consistent with known migrations and much stronger patterns in this data.  If black test takers (national avg ~20% european) were skewing this a lot I think we’d see a stronger shift here.]

NHW LE by proportion of NHW on census


[NHW life expectancy (LE) is not correlated with the census NHW percentage]

NHW LE by the proportion of census latino


[States with larger latino populations have very slightly higher NHW life expectancies, but that’s probably not significant]

NHW LE by proportion census black


[States with proportionally larger black populations have lower NHW life expectancies.  I interpret this as mostly being a reflection of southern states having both more blacks and lower NHW LE for the reasons I am describing here (genetic and possibly cultural).  Although I suppose it is possible that the small amount of african admixture may explain a small fraction of this too]

Delta from my simple prediction (genes-only) by percentage American Indian


[A genetic check for the latino angle, mainly.  Apparently not many latinos amongst the DNA testers (that outlier is NM) and they don’t seem to be skewing the results terribly]


[States with larger fractions of sub-saharan african genetic/ethnic groups in the genetic data may have their life expectancy over-estimated slightly, based on my crude method (excluding smoking), but not by all that much.]

NHW LE by total proportion of European ethnic groups in ancestry.com’s DNA results


[Note: Most states’ results are very european; obviously not many non-whites took these tests. Those states with higher european genetic fractions amongst ancestry’s results have higher NHW life expectancy, but that’s consistent with the south and the like]

I compared all of the other major continental groups versus my model (over-prediction) — See this PDF here (to save space/bandwidth)


We can produce a simple model using this great british, irish, and w. european fraction like so. r**2=0.54, not bad.


However, we can do much better if we add in the proportion of current NHW smokers.


[This is a very simple model that just uses smoking rates and the european proportion, equally weighted, for r**2=0.85. I can do a bit better with more variables and stronger weighting, but to prove the point I thought I’d keep this as simple as possible (for now at least) lest I be accused of over-fitting this!]
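A minimal Python sketch of an equal-weight model of this kind: convert each predictor to standard deviations from the mean across states, average the two z-scores, and measure how much of the variance in life expectancy a straight-line fit on that index accounts for. The five "states" below are made-up numbers, so the printed value says nothing about the real data, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standard deviations from the mean for each value."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def equal_weight_index(gene_fraction, smoking_rate):
    """Average of the two predictors' z-scores (equal weights)."""
    zg, zs = zscores(gene_fraction), zscores(smoking_rate)
    return [(a + b) / 2 for a, b in zip(zg, zs)]


def r_squared(x, y):
    """Squared Pearson correlation, i.e. R^2 of a one-predictor linear fit."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Made-up values for five hypothetical states.
gene = [0.60, 0.70, 0.80, 0.85, 0.90]   # adjusted GB/IR/WEU fraction
smoke = [0.12, 0.16, 0.20, 0.22, 0.26]  # share of current smokers
life = [80.5, 79.8, 78.9, 78.1, 77.0]   # life expectancy in years

index = equal_weight_index(gene, smoke)
print(f"r**2 = {r_squared(index, life):.2f}")
```

Because the weights are fixed in advance rather than fitted, the only thing estimated from the data is the single line relating the index to life expectancy, which is one way to limit the risk of over-fitting with only ~50 observations.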

It can be interesting to compare this analysis with life expectancy for other groups to see how they correlate (or not).  Yes, I am well aware that “Asian” and “Hispanic” are not homogeneous populations, i.e., some states have systematically different national/ethnic and admixture characteristics here than others, however lack of correspondence ought to tell us something.  As in, if we find no relationship amongst these other groups then state-wide explanations like healthcare systems, policy, and the like are unlikely to explain much (this position ought to be especially difficult for genetic and/or culture deniers to argue with a straight face!).

State life expectancy by estimated GB/IR/WEU proportion amongst whites


Whites and blacks move together, although the fit is considerably tighter for whites, whereas asians and latinos do not appear to correlate at all!   I suppose one could make the argument that cultural/lifestyle differences explain this particular pattern, especially amongst whites and blacks, but one could also argue, if this is genetic, that that particular european admixture in the south has something to do with this too!  It is certainly hard to blame state policy without making any account for genes, culture, or other behavioral elements given the different outcomes observed in different states.

I, for one, find the similarity of the patterns that I observed earlier in the state of California very interesting.



[source for the data]

[Obviously this is at a different level of granularity, i.e., state vs SES, and it’s possible that the latinos and asians are significantly different than these “same” groups in other states (different national origin, culture, and/or admixture proportions), but I really do not believe that is mere coincidence!  Although these other groups are clearly not homogeneous generally, my theory is that they are actually less stratified genetically and culturally with respect to measures of SES because they have not been here nearly as long, on average, and because their historical regions have not had the same structure with respect to long established market systems and the like.]


[Similar patterns… although, unfortunately, the only other ethnicity they had was “other” — presumably mostly asians and latinos]





[Other groups appear to be virtually uncorrelated with this measure]


[Curiously asthma rates seem to be notably lower for blacks in these states]



It can also be useful to compare this against other measures.




[repeat of the same above for easy comparison]


[Little to no correlation here and, if anything, it does not move in the expected direction]


[Not much of a correlation here either!]

Comparison between ethnic group life expectancy and each group’s “own” (corresponding) 8th grade NAEP scores


[Note: This fit is much better than just about anything else, save for smoking rates by group probably, and there is some data to support the notion that IQ predicts health outcomes quite well (or here).  Obviously NAEP scores are not a perfect proxy for IQ and likely more subject to variances in effort and conscientiousness, but they are certainly pretty well correlated…]


[repeat of earlier graph for comparison]

Life expectancy by current smoking rates for each group (data is spotty for latinos and not available for asians)


Obviously smoking adds a lot to this, especially for non-hispanic whites:


However, it’s important to note that these genetic proportions predict much higher rates of smoking, lower NAEP scores, higher obesity, and more, so I wouldn’t assume that smoking per se explains as much itself (surely it’s terrible for health, but it’s also a signal of intelligence these days and preferences/lifestyle).  Why do some states have much higher smoking rates than others?  Genes and (arguably) culture explain this better than GDP or inequality (see above).


[inequality adds virtually nothing and it’s probably not even statistically significant]


[likewise for GDP per capita]


[NAEP scores seem to add something though]

Even without smoking we can improve our model by adding NAEP scores



[smoking per se still adds something, but less than before]

Or obesity rates



[likewise for smoking with respect to obesity and genes]

Or we can combine genes (GB/IR/WEU fraction), NAEP, and obesity rates for an even better alternative



[smaller still]

I still think smoking adds real incremental power and that it is certainly absolutely terrible for health, but my point here is that smoking per se does not explain everything (i.e., it’s as much a signal of other behaviors and other health issues as it is a cause in and of itself) and that still leaves open the question of why some groups smoke so much more than others (the variances today are not apt to be well explained by policy).


[We can account for most of the variance in smoking rates amongst NHW with a simple model using equal weights for genes, obesity, and NAEP scores!]


[and the usual suspects like GDP per capita don’t tell us nearly as much]


[inequality tells us virtually nothing for non-hispanic whites]


[NAEP scores in and of themselves tell us something]


[States that are 1 SD above average, i.e., average of lower gene fraction  and higher NAEP scores in SD from mean, smoke, on average, ~2.5 percentage points less per capita and vice versa]


[States that are 1 SD above average, i.e., average of lower gene fraction and higher NAEP scores in SD from mean, have on average ~5.3 K more GDP per capita (remember this is without factoring other groups into the mix that surely confound this!)]


[A giant mess of different measures/outcomes against the gene percentage]



[States that are 1 SD unit above the mean in GB/IR/W. Euro gene fraction and NHW smoking rates are, on average, ~1 SD unit below the mean in life expectancy]

Below I tossed a lot of this data into a heat map (US states) to try to show the patterns in standard deviation terms.  Blue=better, white=average, red=worse.

I picked this color scheme to show the differences in the sharpest relief possible.  When it’s off by a bit, especially around the mean (+/- average) it looks worse than it really is.  Nevertheless, you can probably pick up some pretty clear patterns here!


[Note: I accidentally flipped the sign here!  Positive = under-prediction = blue; negative = over-prediction = red.  Nevada is a clear over-prediction here!]


[Texas NHW do surprisingly well on the NAEP.  Accurate or skewed by varying state standards?]


And, of course, these statewide averages obscure tremendous county-level variation (which is all the more reason why “policy” is unlikely to explain much here!)

Male life-expectancy


Female life-expectancy


[Note: these maps include all racial/ethnic groups so it tends to exaggerate a bit]

Nevertheless, you can see these by white-only groups for specific major causes of death, i.e., CHD and stroke, both hospitalizations and death rates here:


You can sort of observe the non-random distribution of ethnicity/ancestry by looking at specific (reported) ancestry in the US census:












[Note: The responses are fairly fickle (trends, politics, etc) and not weighted by actual proportion amongst “whites”, so it’s not that useful as an absolute gauge… which is why I prefer the genetic approach here, even if it’s not quite as granular!]


1990 US census data

In any event, I am not necessarily arguing that the three ethnic/genetic groups I picked out are homogenous (within or between said groups) or have a uniform distribution, so that we should necessarily expect the same (implied) outcomes in, say, England (although there’s probably some general directional effect here if we compare these groups to, say, southern European life expectancy) or with any individual that has said proportions (i.e., if you could somehow clone said individual or their immediate family 1000 times and observe their mortality rates).  However, it seems very likely to me that this is at least a good proxy for particular ethnic groups that settled in or emigrated to this country, so we can make some reasonable guesses about the sort of people in these states (on average) based on the average genetic proportions amongst (apparent) Europeans broadly.

In other words, it’s not necessarily that the “Great British”, “Irish”, or “Western European” genetic groups are as unhealthy or otherwise underperforming as we (generally) find here, but rather that those states with high proportions of “Great British”, “Irish”, or “West European” (ancestry.com’s methods) ethnic groups amongst NHWs claim a higher share of particular finer-grain ethnic populations and that many of those groups have a long history of problems (although they could be sorted to some degree based on these sorts of admixture levels).   Likewise, these are averages amongst states: different cities and regions have different distributions of ethnic groups and the like too.  The Scottish (including our so-called “Scotch-Irish”), Irish (especially Northern/Protestants), and people in northern parts of England have long had worse outcomes (and still do) and I rather suspect many of our early immigrants came from those groups and brought their troubles with them even as they prospered.  Many of them settled in the south, greater Appalachia, etc. and a good number of them have since interbred with other groups and moved to other parts of the US (which is where genetic/admixture analysis can possibly reveal more than, say, US census reported ethnicity, which changes depending on current trends/popularity and the like!).

These sorts of patterns are not entirely unique to the United States.

There is a very well established “North-South” gradient in the UK in health outcomes.


Scottish life expectancy by council area


CHD mortality ratios for Wales, N. Ireland, and Scotland vs England


CHD mortality in 1961 vs 2009


[Note: Very little change in relative risk!]




[Notice the relative lack of spread in Northern Ireland, by health authority at least.  I would not be surprised if this was the result of relatively (traditionally) greater genetic homogeneity in NI — see ancestry.com’s regional distribution, amongst others]


Nor have they abolished differences by major ethnic/racial groups







Study of Y-haplogroups and CHD risks in the UK


Patterns in health disparities are hardly unique to the UK or the US either

Health inequality study in various european countries (healthcare amenable vs non-amenable mortality)


[There are significant long-standing differences in mortality rates according to class in all of Europe: both “amenable” and otherwise]

Finland CHD mortality rates by income group


Life expectancy in Finland, amongst adults age 35, by class and sex



The notion that all groups have the same risk factors, the same life expectancy, and so on, ergo all differences in outcomes can only be explained by differences in health care treatment, is really not tenable any longer (it’s not even likely).  I suspect that these differences are mostly genetic; even many of the behavioral components are apt to be strongly influenced by genes (e.g., propensity for addiction to tobacco, over-eating, etc.), but, if nothing else, we really do not have good environmental explanations, save for smoking, particular types of drug use, STDs, etc.  We barely have good policy solutions to address even those limited parts that we do understand (never mind that which we do not even understand!).   Many medical/technological improvements will help all groups, but, more often than not, these will tend to amplify pre-existing disparities in outcomes and rarely close them appreciably (unless a particular treatment is highly efficacious in treating diseases that disproportionately affect particular groups so as to have a pronounced effect on overall mortality/odds ratios)!

We clearly have more genetic diversity than do most European/Anglo countries that we are typically compared to, even amongst “whites”.



Many of our “white” immigrants came from different countries, regions, religious groups, and more.   If these different broad groups have (had) different genotypical life expectancies (given similar environmental conditions), risk factors, and so on this alone will create more apparent health “inequality” (especially if we look at the nation as a whole rather than more narrowly within a particular broad ethnic group and region) even if most people do not think of these groups as being (visibly) different.  (Not to mention some potential admixing with other groups and/or other groups identifying as NHW)

Furthermore, even many of these individual countries, regions, and the like are clearly not homogeneous either.

We can see this quite clearly with the averages produced by ancestry.com’s analysis for Europe (in these regions, some individuals clearly have more, some have less since this admixture is relatively recent):


It is quite likely that they were and still are stratified genetically by class, by region, by historical religious groups, etc (whether they know it or not) and that this has implications for life expectancy, disease rates, and other important outcomes.  Emigration from these places to the United States was hardly random: some classes, religious groups, regions, and so on clearly settled here at very different rates for very different reasons at different times.  If we drew a disproportionate share of lower class or otherwise marginalized groups, it is quite likely that many of our groups could have somewhat worse outcomes even when we try to compare apples-to-apples (not that easy!).  There is some evidence for long-standing historical differences in different immigrant populations (and these differences still exist in many of these countries and, particularly, regions that they immigrated from).

See this table on Irish immigrants to the United States and historical health disparities (starting in 1850):


[This is obviously long before modern medicine, fast food, etc and these patterns in CHD differences existed even then!]

Life expectancy at birth by region (both sexes) — zoom in Northern Europe


Life expectancy at birth, both sexes, zoom out


source: WHO

There is clearly not just one German, English, Scottish, French, Dutch, etc  life expectancy, not even at regional level (i.e., there is certainly more within regions, ethnic groups, classes, etc)!  [I would love to get similar level of detail for UK and various european countries to see how this sort of analysis plays out, crude though it may be].  Some regions have different distributions of risk factors and there is some evidence that this correlates with population structure (see this for CHD risks in the UK).  We also know that these groups are not that homogeneous either.

I suspect (but cannot prove) that interstate differences amongst non-hispanic whites, like differences within and between other first world countries, are largely genetic and probably somewhat cultural/behavioral (to the extent we can disentangle culture/behavior from genetics).  Of course lifestyle (e.g., smoking, eating habits, etc) and healthcare systems can have an impact, and they can change over time, but these variances are  usually not that pronounced in practice and they tend not to explain much within broadly similar environments (as in, amongst well established citizens that have adopted broadly western diets, that have decent healthcare systems, not excessively high rates of accidents/murder/etc).

Some other issues with comparing US healthcare costs and so-called “outcomes”

Besides my previously mentioned objections to simplistic comparisons between healthcare systems, vis-a-vis naive economic comparisons and the effect of taxation on behaviors, it is very difficult to compare the actual performance of healthcare systems, in both financial and human terms (e.g., life expectancy, mortality rates, etc.), without accounting for other differences in the populations (e.g., genetics, health/risk behaviors, lifestyle, etc.).   These simplistic comparisons of national health care systems based on crude mortality rates and the like are very much like comparing the performance of goalies in various sports based on who wins the game alone, i.e., without making any real attempt to control for the performance of the rest of the team’s defense, the performance of the offense, and so on, when what we really want to know, at bare minimum, is the number of saves as compared to the number of shots on goal (and even then that’s an imperfect metric).  Of course some goalies are likely to be somewhat more effective than others and, other things equal, goalies can have a pronounced impact on the outcomes, but you cannot simply assume that there are not any significant systematic differences between teams in general or on game day.

These are just a few relevant differences I can think of off the top of my head:

  • The United States population is not a mirror image of Europe: genetically, culturally, or otherwise
  • Much higher smoking rates historically
  • Relatively high rates of obesity (although other countries are starting to catch up to us now)
  • Much higher homicide rate.
  • Higher rates of sexually transmitted diseases (see the AIDS crisis)
  • More geographically distributed than most (as in, lower population density, significant populations living in rural locations, etc)
  • Higher rates of serious automobile accidents per capita

….. (and probably more I’m forgetting)


The United States is sicker by many measures and much of this can be attributed to behavioral differences.

Self-reported disease rates by country and gender, ages 65+


Percentage of population using anti-hypertensive drugs, ages 50+


Life Expectancy at age 50 and rate of obesity by country and gender, 2004


Age Adjusted Obesity Rates in USA, aged 20+ by gender


Obesity Trends International Comparison


Comparison of risk factors between US and UK by education and income


Comparison of self-reported health between US and UK by education and income



The role of diet and obesity

Comparable cross-national data on dietary practices are limited for the reasons noted above, including the challenges that countries face in evaluating the diets of their populations and inconsistencies across countries in food culture, defining indicators, sampling respondents, and administering surveys. Data collected within the United States suggest that the American diet has become less nutritious over time. Between 1971 and 2000, average daily caloric consumption increased from 2,450 kcals to 2,618 kcals among men and from 1,542 kcals to 1,877 kcals among women; similarly, carbohydrate intake increased by 67.7 grams and 62.4 grams, respectively for men and women, and total fat intake increased by 6.5 grams and 5.3 grams, respectively, for men and women (Centers for Disease Control and Prevention, 2004). Between 1950 and 2000, annual per capita food consumption in the United States increased by 20 percent for fruits and vegetables but also for grains (by 44.5 pounds, a 29 percent increase), meats (by 57 pounds, a 41 percent increase), cheese (by 22.1 pounds, a 287 percent increase), and caloric sweeteners (by 42.8 pounds, a 39 percent increase). High-fructose corn syrup consumption per capita rose from zero in 1950 to 85.3 pounds by 2000. Some of these increases may be associated with an increase in dining out, which increased from 18 percent of total food energy consumption in 1977-1978 to 32 percent in 1994-1996 (U.S. Department of Agriculture, 2012). How do these trends compare with other rich nations? Americans consumed 3,770 kcals per person per day in 2005-2007, more than any other country in the world: see Figure 5-4. This trend is not new: the United States also had the highest caloric consumption in 2003-2005 and ranked fourth in the world in 1999-2001 (behind Austria, Belgium, and Italy). Between 1999-2001 and 2005-2007, the U.S. ranking on fat intake rose from seventh to fourth in the world, with Americans consuming an average of 161 grams per person per day. By comparison, in 2005-2007 the average Swede consumed 17 percent fewer calories and 24 percent less fat (Food and Agriculture Organization, 2010).


Overweight/obesity rates 5-17 years old, comparison


Cardiovascular risks comparison by sex, ages 50-54


[Note: Smoking, diabetes, and obesity are all strongly associated with cardiovascular risks and the US is very much elevated in this cohort across all three]

Diabetes rates by sex and age group comparison

While the impact of obesity (especially at modest levels of it) on life expectancy is far from settled, it surely plays a large role in influencing type II diabetes rates and surely increases healthcare costs (e.g., treatment of diabetes, heart conditions, etc.).


Smoking/tobacco consumption

The United States historically had much higher rates of smoking and intensiveness (e.g., cigarettes per day per smoker) than any other comparable country.  The effects of this are surely still being felt today.

Trends in cigarette consumption per capita, USA vs selected countries

Tobacco Consumption Rates (grams per capita), 1960-2010 USA vs typical comparison countries (decadal average)


Although the United States is no longer the top smoking nation (amongst presumably comparable countries), we are still amongst the top and, more importantly, the cumulative impact of those earlier decades of smoking has had and still has a pronounced impact on health and mortality today (this was/is especially true for US women).   This behavior likely explains much of our elevated rates of stroke, heart disease, and more.  Even when current or prior smoking rates are reported to be very similar amongst presumably comparable demographics (e.g., same age group in the US and the UK), that does not mean that both groups of “smokers” started at the same age or smoked as intensely.

OECD Estimates of “healthcare amenable” mortality in 2007 as compared to cigarette consumption


[Note: These are all presented in standard deviation units from the mean of the reported countries — overall rates have fallen substantially so the same differences in recent years are less pronounced in absolute terms]

source: cigarette consumption data

[Note: The fit is pretty good in earlier decades with this limited data but it gets increasingly worse in subsequent decades…. which is not surprising given that we do not expect instantaneous consequences from recent smoking activity and the declining rates of consumption internationally in recent decades!]

Estimated gains in life expectancy by gender from eliminating smoking in 2003


Estimated life expectancy gains in the US at age 50 without smoking by gender


Estimated Fraction of All Deaths at age 50 and older attributable to smoking by country and gender


[Note: Observe how the fraction has increased in recent years despite falling rates of current consumption.  Also the mortality rate was and is higher in the US, so the similar fractions still imply significantly more deaths from smoking per capita terms]

Using an innovative macrostatistical method, Preston and colleagues (2010a, 2010b) estimated the attributable fraction of deaths after age 50 from smoking and its effect on life expectancy at age 50 among 10 high-income countries in 1955, 1980, and 2003. The authors calculated that by 2003 smoking accounted for 41 percent of the difference in male life expectancy at age 50 between the United States and 9 comparison countries and for 78 percent of the difference in female life expectancy at age 50 (Preston et al., 2010b). Smoking appeared to have a larger impact on women because of the later uptake of smoking by U.S. women (U.S. Department of Health and Human Services, 2000, 2002): see Figure 5-3. The smoking-attributable fraction of U.S. deaths among males age 50 and older was 23 and 22 percent in 1980 and 2003, respectively, but during the same years increased from 8 to 20 percent among females of the same age (Preston et al., 2010b). Based on the researchers’ assumptions, smoking accounted for 67 percent of the shortfall in life expectancy gains that U.S. women experienced relative to 20 other countries between 1950 and 2003. These findings implicate smoking as a potential cause of the shorter life expectancy of adults age 50 and older, but they do not explain the lower life expectancy observed in younger people. The U.S. health disadvantage before age 50 has worsened over the same time that smoking prevalence rates in this population have decreased. The reduction in smoking rates will produce benefits in years to come. Wang and Preston (2009) predicted that the future will bring a decline in deaths attributable to smoking among men but that improvements for women will occur later.


Time lag illustration between smoking rates and smoking related mortality


Age Standardized Mortality Rates amongst men 50+ by cause


Age Standardized Mortality Rates amongst women 50+ by cause


I am, of course, not the only one to point out the role of historical smoking.  Some proponents of single-payer have tried to argue that our current smoking rates really are not all that different, but you might notice that they don’t look at earlier rates, they don’t account for volume/intensity, and they don’t reference the peer-reviewed literature on the impacts of lifetime smoking history!  Nor do they mention that many of the countries with recently elevated rates of smoking have seen very similar patterns of disease rates and the like as the United States has (albeit in the earlier stages of this progression).  Put simply, differences in tobacco consumption can, in and of themselves, explain a very large share of differences in life expectancy, differences in mortality from various specific diseases, and rates and severity of various diseases (e.g., heart, stroke, etc.).


Patterns in trends

I, for one, find it hard to reconcile the argument that these other differences must primarily be the result of lack of single payer when our differences with Europe mostly predate the rollout of national healthcare systems.  Both 55 year old men and women in the US had amongst the highest probabilities of death in 1950 and they still do.  Our male trend has not much changed, relative to Europe, whereas it clearly has with women (see age 65 & 75), which is very consistent with the evidence on smoking.

Mortality rates comparison by age group amongst men


Mortality rates comparison by age group amongst women


The role of race/ethnicity on health outcomes

There are clearly very large differences in mortality rates by cause and by age between racial/ethnic groups in the United States.

Estimated years of life lost by category cause, odds ratio relative to non-hispanic whites, amongst populations before age 70


[Note: Accidents, Suicide, and Homicides alone account for ~25% of the estimated life expectancy impact and non-hispanic whites actually do worse than the average here if we combine them all like this.  However, when it comes to heart disease, stroke, diabetes, and other presumably preventable causes, save for cancer, the combined population- and age-adjusted impact of our other major racial/ethnic groups skews our outcomes for the worse, especially amongst the major causes and those that are most apt to be considered “amenable” to healthcare]


A more detailed version of the above graph



Age-adjusted mortality rates by category cause, odds ratio relative to non-hispanic whites


[Note: This is the age-adjusted mortality data across all age groups (not just <70).  Also the age-adjusted mortality data does NOT map 1:1 with life expectancy (or the comparable measure years-of-life lost) because deaths at a younger age (e.g., car accidents) have a much greater impact than those that occur amongst predominantly older people]
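A quick way to see why the two measures diverge is to compute years of life lost directly. This is only a rough sketch in Python: the death counts are hypothetical round numbers (not CDC data), the cutoff of 70 simply mirrors the "before age 70" convention used in the earlier charts, and `years_of_life_lost` is a made-up helper name.

```python
def years_of_life_lost(deaths_by_age, cutoff=70):
    """Sum of (cutoff - age at death) over all deaths before the cutoff."""
    return sum(count * (cutoff - age)
               for age, count in deaths_by_age.items()
               if age < cutoff)

# Hypothetical deaths per 100,000 (NOT real data): the same number of
# deaths in each category, but occurring at very different ages.
car_accidents = {25: 10}   # 10 deaths at age 25
heart_disease = {65: 10}   # 10 deaths at age 65

ypll_accidents = years_of_life_lost(car_accidents)  # 10 * (70 - 25)
ypll_heart = years_of_life_lost(heart_disease)      # 10 * (70 - 65)

print(ypll_accidents, ypll_heart)
```

With identical death counts the younger deaths contribute nine times as many lost years in this toy example, which is why a cause can look modest in mortality-rate terms and still weigh heavily on life expectancy.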

Age-adjusted mortality rates by (detailed) cause, odds ratio relative to non-hispanic whites

Source: CDC

There are major differences between racial/ethnic groups in terms of deaths from HIV, diabetes, suicide, homicide, cancer, stroke, heart disease, and many, many more.  Some of this may be purely cultural/behavioral and some of it is very likely genetic (or see here).


There are probably also very significant differences within all of these populations, including non-hispanic whites, so I do not think that anyone can reasonably downplay the role of genes and behavioral differences without even trying to account for differences between major known groups.

Most countries do not track this data unfortunately but whenever they do you invariably find very similar patterns and large disparities between groups.  Compare the following data from the UK.

UK Diabetes rates differences


UK Diabetes Rates by ethnic group and sex

UK infant mortality trends


UK Infant mortality rates differences


UK CHD mortality rates differences


UK stroke mortality rates differences


Canadian black immigrant outcomes (Quebec)


[Note: Canada publishes practically no data disaggregated by race/ethnicity, but you can clearly see that both Haitians and other black Caribbean immigrants have notably worse odds ratios]

I am not suggesting that there are no differences between “whites” in the US versus the UK (see higher obesity & diabetes rates when adjusting for SES), but that ethnic differences clearly play a very large role here, too, and that it is quite possible that different ethnic groups in the US have significantly different risk factors (e.g., people with more southern European vs northern European heritage).

So when I see analysis that claims to attribute differences in “healthcare amenable” mortality to differences in healthcare systems I am thoroughly unconvinced.

Below I have compared the OECD “healthcare amenable” mortality rates of the United States to the averages of leading (mostly European) comparison countries (Australia, Austria, Canada, Germany, Denmark, Finland, France, United Kingdom, Ireland, Italy, Luxembourg, Netherlands, Norway, New Zealand, Sweden).

Age-adjusted “healthcare amenable” mortality rates, odds ratio vs European mean


Age-adjusted “healthcare amenable” mortality, US compared to European mean


Age-adjusted “healthcare amenable” mortality, European US delta


Age-adjusted “healthcare amenable” mortality, stacked US delta vs *weighted* European mean


The US apparently outperforms in a few categories (cancers, diseases of the nervous system, digestive system), but apparently mostly underperforms in heart disease, stroke, infectious diseases, and the genitourinary system.  A lot of these categories map fairly closely with some pretty significant differences in US categories by ethnic group (see this repeated graph) and some pretty well established historical behavioral differences in the US more broadly (see smoking, over-eating, etc.).   [If these behavioral differences are the fault of the healthcare system, despite damn little empirical evidence for successful behavioral modification, then that must imply that these other systems are failing too; see rising obesity rates in Europe, higher smoking rates in the Netherlands today, etc.]  Moreover, given the large variances between these leading European countries, it is hard to credit systematic differences in healthcare delivery with these (especially when they are unable to identify significant plausible causative explanations and there are known lifestyle differences, like smoking rates, obesity, and the like).

Transit mortality comparison


Violence related mortality comparison


Mortality from injuries comparison


Deaths from communicable diseases


Deaths from non-communicable diseases


Death rates from specific causes


Years of life lost before age 50, males


Years of life lost before age 50, females


Years of life lost before age 50, major causes for males


Years of life lost before age 50, causes for females

Comparison of mortality rates by age group between US and Canada


Many of the differences within the United States are not at all consistent with the argument that “class” or healthcare explains group differences.   Actually, poor Asians and Hispanics outlive most whites regardless of SES (and especially blacks) and, unlike other groups, they demonstrate much less difference (“inequality”) in outcomes by SES.

Male Life Expectancy at birth in California, 1999-2001 by SES


Female Life Expectancy at birth in California, 1999-2001 by SES


source: California Life expectancy data by ethnic group

The arguments for healthcare differences would be much more persuasive if this sort of evidence did not exist and if the national healthcare systems of Europe demonstrated dramatically smaller differences in healthcare outcomes.  If these differences are not genetic and if modern healthcare can presumably make differences in obesity, smoking, drug use, and the like irrelevant, then we should not find the outcomes that we routinely find when we look closely.

UK Spatial/Racial inequality


Relative Inequality Comparison


Inequality in Europe: both “amenable” and “non-amenable”


[Note: Both amenable and non-amenable causes of death demonstrate very consistent patterns within and between nations by levels of SES.  National healthcare systems clearly do not eliminate differences in genetics, behaviors, or what have you.]

Age-adjusted mortality rates by class in selected European countries


Finland age-adjusted smoking rates


Finland differences in CHD mortality by income level


Age-adjusted obesity rates by class in Finland


Finland male and female life expectancy at age 35 by educational level


Finland life-expectancy by SES by sex at age 35


source for Finland health data

 Mortality rate data according to education, males 30-74 (includes black and white in US)


 Mortality rate data according to education, female 30-74 (includes black and white in US)


[Note: One of the problems with the two above graphs is that, under their particular method, the US has many more “educated” people in proportional terms than other countries do, so the least educated in the US are truly not comparable to the same categories in, say, the UK or France.  Moreover, the United States is much more heterogeneous, even within these groups, than most of Europe, is larger, and is more distributed geographically, culturally, and otherwise so we should probably expect somewhat more distance given larger differences in behaviors at least…. compare whites in, say, Minnesota to West Virginia]

We also observe very large differences within the US, even at the level of states, when we compare the “same” groups.

State Life Expectancy by ethnicity and 8th grade math scores by ethnicity


[Note: This is 8th grade math scores and the correlation here is much better than it is with income or inequality statistics below!  Yes, I know they’re not the “same” people but we can reasonably assume that the current cohort of children in each group are broadly representative of the overall population in each group]

State Life Expectancy by Ethnicity and state average TIMSS 8th grade math scores


[Note: These scores are the average for the entire state, not by race/ethnic group.  The prior graph (NAEP) has test scores by ethnic group and correlates much better overall]

State Life Expectancy by Ethnicity and State GDP per capita


[Note: They do appear to be somewhat correlated, but not nearly as much as life expectancy is with test scores]

State Life Expectancy by State Income inequality (GINI)


[Note: These are basically completely uncorrelated]

source: Life expectancy data by state and race/ethnicity

source: NAEP scores by state and race/ethnicity


Smoking and obesity rates, males 50+ international comparison

Smoking and obesity rates, females aged 50+ international comparison


Amenable mortality international comparison, 1998 & 2007


Amenable mortality by state map


Some Infant Mortality Rate Comparisons and Data

Infant mortality rate comparison, international and state ranking, 2002


Odds ratio by race/ethnicity and education relative to mean infant mortality rate of non-hispanic whites


Infant Mortality Rate odds ratio in Massachusetts by ethnicity/country of origin and education level


MA IMR by ethnicity (actual rates)


MA IMR odds ratio by ethnicity


CDC: Infant mortality rate by ethnicity


CDC: Infant mortality rate by birth weight and birth cohort


CDC: Infant Mortality Rate by State


CDC: Infant mortality rate by major race/ethnicity, 1960 – 2010


[Note: Infant mortality rates have fallen across groups and, especially, if you look at survival by birth weight, prematurity, etc…. it’s only when you average it all together without regard for race/ethnicity, maternal age, and type of delivery that you find this apparent (naive) underperformance of the “system”]
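The averaging problem described in the note can be illustrated with a small direct-standardization sketch in Python. All of the numbers here are invented for illustration (they are not CDC or OECD figures), the two "countries" are hypothetical, and `crude_rate` and `standardized_rate` are made-up helper names. The point is only that a country can have lower mortality in every birth-weight stratum and still show a higher pooled rate if it has more low-weight births.

```python
def crude_rate(strata):
    """Overall deaths per 1,000 births, pooling all strata together."""
    births = sum(s["births"] for s in strata.values())
    deaths = sum(s["births"] * s["rate"] / 1000 for s in strata.values())
    return 1000 * deaths / births

def standardized_rate(strata, standard_births):
    """Direct standardization: apply each stratum's rate to a shared birth mix."""
    total = sum(standard_births.values())
    return sum(strata[k]["rate"] * standard_births[k] for k in strata) / total

# Hypothetical numbers (NOT real data); rate = infant deaths per 1,000 births.
country_a = {"low_weight": {"births": 120, "rate": 50.0},
             "normal_weight": {"births": 880, "rate": 2.0}}
country_b = {"low_weight": {"births": 60, "rate": 60.0},
             "normal_weight": {"births": 940, "rate": 2.5}}

# Shared reference mix of births used for both countries (also hypothetical).
standard = {"low_weight": 90, "normal_weight": 910}

crude_a, crude_b = crude_rate(country_a), crude_rate(country_b)
std_a = standardized_rate(country_a, standard)
std_b = standardized_rate(country_b, standard)

print(f"crude:        A={crude_a:.2f}  B={crude_b:.2f}")
print(f"standardized: A={std_a:.2f}  B={std_b:.2f}")
```

In this toy case country A looks worse on the crude rate but better once both are evaluated on the same birth-weight mix.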

Underweight birth rates


Pre-term birth rates


CDC comparison of pre-term birth rates in the US and various OECD countries


Adolescent pregnancy rates comparison


US vs Canada, IMR by birth weight comparison


CDC: Various childbirth related mortality rates by race/ethnicity


Perinatal mortality rate under alternative definition


CDC compilation of infant mortality rates by gestational age between US and Europe


[Note: Even without taking into account race/ethnicity or other characteristics such as birthweight, the US is pretty comparable in IMR]

Perinatal mortality rates amongst reasonably comparable European countries


[Note: Even if you compare perinatal mortality rates with the OECD’s official data, which addresses many of the vagaries of IMR calculations and differences between countries, especially with respect to premature and other presumably non-viable fetuses, the US compares much more favorably (and would likely be well above average if we account for other previously mentioned issues).  They define it thusly “The ratio of deaths of children within one week of birth (early neonatal deaths) plus fetal deaths of minimum gestation period 28 weeks or minimum fetal weight of 1000g, expressed per 1000 births.”]
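The quoted definition translates into a one-line calculation. The sketch below is in Python with hypothetical counts (not OECD data), and `perinatal_mortality_rate` is a made-up function name. I am also assuming the caller has already filtered fetal deaths to the 28-week / 1000g threshold and chosen the appropriate "births" denominator, since the quote does not spell out either detail.

```python
def perinatal_mortality_rate(early_neonatal_deaths, fetal_deaths, births):
    """Early neonatal deaths plus qualifying fetal deaths, per 1,000 births.

    `fetal_deaths` is assumed to be restricted already to a minimum
    gestation of 28 weeks or a minimum fetal weight of 1000 g.
    """
    return 1000 * (early_neonatal_deaths + fetal_deaths) / births

# Hypothetical counts (NOT real data).
rate = perinatal_mortality_rate(early_neonatal_deaths=300,
                                fetal_deaths=350,
                                births=100_000)
print(f"{rate:.1f} per 1,000 births")
```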

Various IMR-related 2007 statistics from the CDC relating to race/ethnicity





Here is a study into the reasons behind the black – white gap in IMR

Black vs white IMR by birth weight, 2001


Black vs White IMR by gestational age


Birth Statistics by birth cohort, part 1

Birth Statistics by birth cohort, part 2 (1986 | 1991 | 1996 | 2001 | 2004)


Some more data and analysis on US state level mortality as compared to various predictors

CDC: white obesity rate by state


CDC: black obesity rate by state


CDC: latino obesity rate by state


CDC: obesity rate by state (all races/ethnicity combined)


CDC: Diabetes rate by state amongst adults with more than HS education

CDC: Diabetes rate by state amongst those with only HS education


CDC: Diabetes rate by state amongst those with less than HS education


[Note: the color coding with the diabetes maps varies for each level of education.  Diabetes rates have increased across the board, but the rates are much higher amongst the less well educated]

Life expectancy by county, male


Life expectancy by county, female


source: for maps and data

Shorter version: There is tremendous variance amongst states and even counties, and very little of it has to do with policy.  People who smoke heavily and overeat are at much greater risk, and there is only so much the healthcare system can do to counteract these effects (not to mention different risk factors due to genetics and other unmeasured differences).

If you compare state smoking rates to the variances in these outcomes you can actually account for the vast majority of the differences.

Below I analyzed available data from the CDC and KFF to compare white non-hispanic health outcomes to white non-hispanic smoking rates (amongst others) and vice versa.  I have converted all of them to standard deviation units from the mean of states to make it easier to compare relative differences and the like….
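For readers who want to reproduce the conversion to standard deviation units, here is a minimal sketch. It assumes an unweighted mean across states and the population standard deviation; the function name and the sample rates are made up, and the `invert` flag only illustrates how a measure can be flipped so that higher always means "worse".

```python
from statistics import mean, pstdev


def to_sd_units(values, invert=False):
    """Convert raw state-level values to SD units from the mean of states."""
    m = mean(values)
    s = pstdev(values)  # population SD; a sample SD would differ slightly
    if s == 0:
        raise ValueError("all values are identical; SD units are undefined")
    sign = -1.0 if invert else 1.0
    return [sign * (v - m) / s for v in values]


# Hypothetical smoking rates (%) for five states, not real data
smoking_rates = [10.0, 15.0, 20.0, 25.0, 30.0]
z = to_sd_units(smoking_rates)
print([round(v, 2) for v in z])  # [-1.41, -0.71, 0.0, 0.71, 1.41]
```

Once every measure is on this scale, a state that is 1 SD above the mean on smoking can be compared directly with its position on life expectancy, IMR, or test scores.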

Assorted health outcomes by smoking rates (complex)


State life expectancy and infant mortality rates (IMR) by smoking rates


A 1 SD increase in smoking rates amongst non-hispanic whites (by state) is associated with a ~0.85 SD decrease in life expectancy and a ~0.78 SD increase in infant mortality (note: life expectancy is inverted to make viewing easier here).  I suppose it is possible that smoking rates may act as something of a (health) IQ test or life-style preferences test amongst states and that this is not entirely a product of actual smoking, but it’s worth noting that it is not nearly as well associated with state GDP per capita or income inequality.
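One way to read "SD per SD" figures like these: when both variables are in SD units, the slope of a simple one-predictor least-squares fit is the same number as the Pearson correlation. The sketch below checks that identity on made-up numbers; it assumes a one-predictor fit (the post does not spell out the exact method), and the data and function names are purely illustrative.

```python
from math import sqrt
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to mean 0 and (population) SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def standardized_slope(x, y):
    """OLS slope of standardized y on standardized x."""
    zx, zy = zscores(x), zscores(y)
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)


def pearson_r(x, y):
    """Pearson correlation computed from the raw values."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical state-level values, not real data
smoking = [12.0, 15.0, 18.0, 21.0, 25.0, 28.0]
life_expectancy = [80.5, 79.8, 79.9, 78.6, 78.0, 77.1]

slope = standardized_slope(smoking, life_expectancy)
r = pearson_r(smoking, life_expectancy)
print(abs(slope - r) < 1e-9)  # True: the two quantities coincide
```

Under that assumption, a slope of about -0.85 in SD units would correspond to smoking rates "explaining" roughly 0.85² ≈ 72% of the variance between states in the statistical sense, which says nothing by itself about causation.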

State Life expectancy and IMR by GDP per capita (inverted)


State life expectancy and IMR by income inequality


However, both smoking rates and obesity rates are quite well associated with state test scores (NAEP 8th grade test scores amongst non-hispanic whites).


They are poorly associated with “inequality”


And not nearly as well associated with state GDP per capita


So, for these reasons, amongst others, it’s not very surprising when we find that a 1 SD gain in test scores corresponds to an average gain of ~0.7 SD in life expectancy (albeit with more variance in outcomes, possibly due to differences in smoking and other behaviors).



[Note: I inverted the SD units for test scores here, i.e., higher X axis values = worse test scores, to keep the directional impact the same across all measures]

We can also do a bit better with a simple weighted predictor (smoking=2, NAEP = 2, obesity = 1).
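A minimal sketch of how a weighted predictor like this could be built from the standardized components. It assumes each input is already in SD units with higher meaning "worse" (i.e., NAEP inverted), and that the weighted sum is divided by the total weight; whether the original charts rescale the composite this way is an assumption, and the names and values below are made up.

```python
# Weights as stated in the text: smoking=2, NAEP=2, obesity=1
WEIGHTS = {"smoking": 2, "naep": 2, "obesity": 1}


def weighted_predictor(z_by_measure, weights=WEIGHTS):
    """Combine per-state z-scores into one weighted composite per state."""
    total = sum(weights.values())
    n_states = len(next(iter(z_by_measure.values())))
    return [
        sum(weights[k] * z_by_measure[k][i] for k in weights) / total
        for i in range(n_states)
    ]


# Hypothetical z-scores for two states (NAEP already inverted), not real data
z_by_measure = {
    "smoking": [1.0, -1.0],
    "naep": [0.5, -0.5],
    "obesity": [0.0, 0.0],
}
composite = weighted_predictor(z_by_measure)
print([round(v, 2) for v in composite])  # [0.6, -0.6]
```

Dividing by the total weight only changes the scale of the composite, not the ranking of states or its correlation with any outcome.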

Various health outcomes by predictor variable


Life expectancy and IMR by the same


Comparing the above specified weighted predictor to the various components used


[Note: it ought to be pretty obvious that these three measures are pretty well correlated with each other.  High scoring states tend to be low smoking and low obesity rate states and vice versa.]

Obviously life expectancy is a product of multiple causes of death, each with different impacts (depending on mean age of death) and with different contributions, so it can be helpful to decompose various elements of it.  Obesity, for example, may not be well correlated with life expectancy independent of these other predictors, but for some things it is likely much better (e.g., obesity -> diabetes -> poor control -> death from diabetes).

Here is another way to approach this problem: swap the axes so that the outcome we are trying to predict (e.g., life expectancy) is on the X axis and the various predictors (or other presumed correlates) are on the Y axis (e.g., smoking).

This allows us to quickly compare multiple predictors in the same view.

Life expectancy amongst non-hispanic whites


CHD mortality amongst non-hispanic whites


Diabetes mortality rates amongst non-hispanic whites


Infant mortality rates amongst non-hispanic whites


Pre-term birth rate amongst non-hispanic whites


Low-birth weight rate amongst non-hispanic whites


There are real geographic patterns to this data too (and the few outliers usually have known systematic differences)

The heat maps below are all presented in SD units, inverted where necessary, from the mean amongst states in each category.












[Note: inequality is much more random than the others — many of the rich/progressive states have high inequality whereas others are poor and relatively poorly educated]


Below are some of the same data as heat maps with a more contrasty color scheme to throw these differences into sharper relief.  Blue = “good”, white = “average”, red = “bad”

You should also be able to click through to them for an interactive map too…..


The National Academy of Sciences report on US health differences identified the following key differences:

1. Adverse birth outcomes: For decades, the United States has experienced the highest infant mortality rate of high-income countries and also ranks poorly on other birth outcomes, such as low birth weight. American children are less likely to live to age 5 than children in other high-income countries.

2. Injuries and homicides: Deaths from motor vehicle crashes, non-transportation-related injuries, and violence occur at much higher rates in the United States than in other countries and are a leading cause of death in children, adolescents, and young adults. Since the 1950s, U.S. adolescents and young adults have died at higher rates from traffic accidents and homicide than their counterparts in other countries.

3. Adolescent pregnancy and sexually transmitted infections: Since the 1990s, among high-income countries, U.S. adolescents have had the highest rate of pregnancies and are more likely to acquire sexually transmitted infections.

4. HIV and AIDS: The United States has the second highest prevalence of HIV infection among the 17 peer countries and the highest incidence of AIDS.

5. Drug-related mortality: Americans lose more years of life to alcohol and other drugs than people in peer countries, even when deaths from drunk driving are excluded.

6. Obesity and diabetes: For decades, the United States has had the highest obesity rate among high-income countries. High prevalence rates for obesity are seen in U.S. children and in every age group thereafter. From age 20 onward, U.S. adults have among the highest prevalence rates of diabetes (and high plasma glucose levels) among peer countries.

7. Heart disease: The U.S. death rate from ischemic heart disease is the second highest among the 17 peer countries. Americans reach age 50 with a less favorable cardiovascular risk profile than their peers in Europe, and adults over age 50 are more likely to develop and die from cardiovascular disease than are older adults in other high-income countries.

8. Chronic lung disease: Lung disease is more prevalent and associated with higher mortality in the United States than in the United Kingdom and other European countries.

9. Disability: Older U.S. adults report a higher prevalence of arthritis and activity limitations than their counterparts in the United Kingdom, other European countries, and Japan.


Population structure in Europe (genetic PCA analysis)


To wrap this up, I do not think anyone can plausibly argue that these differences must be, or are even likely to be, explained by differences in our healthcare system given:

  • Large interstate differences amongst white non-hispanics that cannot plausibly be explained by healthcare and which appear to be strongly correlated with test scores, smoking, and obesity (especially combined)
  • Large differences within essentially all of the countries by education, class, and the like, i.e., their healthcare systems empirically do not make these behavioral and/or genetic differences between their own (relatively homogeneous) groups insignificant.
  • Large differences in life expectancy and, especially, specific causes of mortality amongst different (large grain) racial/ethnic groups in the United States, which are not well explained by SES, and which also appear to be quite consistent with the international data (where available, e.g., UK).
  • Large differences in historical smoking rates (!!!)
  • Large differences in historical and current obesity rates (especially with respect to diabetes and complications thereof) — this also goes to costs independent of life expectancy.
  • Large observed differences in population health measures (e.g., diabetes, heart disease, etc) which cannot plausibly be blamed on healthcare system differences
  • The very real possibility that different European or “white” populations differ significantly at a genetic level with respect to health outcomes, i.e., both between US regions/states and different historically European countries (including Canada, New Zealand, etc).
  • Significant cultural and lifestyle differences and preferences driving both observed (e.g., over-eating, smoking, drug use, teen pregnancy, etc) and unobserved differences.
  • Changing population makeup and its effects on different health measures (e.g., falling proportion of “white” births in the US–many of these groups have worse statistics with respect to IMR and early childhood mortality)
  • The paucity of data published on racial/ethnic group differences within almost all European countries.
  • The existence of very large unexplained differences between these presumably comparable European countries.
  • Lack of good empirical data on the impact of nutrition (e.g., fat, carbs, nutrients, etc) on life expectancy and health more broadly.
  • Rising diabetes and obesity rates in much of Europe (if our obesity rates are the fault of our healthcare system, then theirs are too; we’ve been richer & fatter much longer than them, broadly speaking).
  • High smoking rates in several European countries (which will likely start to see the associated health outcomes in ensuing decades as the damage accumulates and those cohorts approach middle age and beyond)
  • The fact that most of these differences pre-existed the rollout of national healthcare systems in Europe and that even the divergence appears to be largely a US inflection point (i.e., we stopped gaining ground, especially women, but that’s pretty well explained by smoking and other lifestyle differences).
  • Different immigrant populations (see, in particular, infectious diseases that are much more common in 3rd world countries).