A response to concerns over r-squared statistics and non-linear models

Recently on Twitter, Nassim Taleb attacked the r-squared statistic I presented in a single plot in my prior post on health care prices.

Another guy on Twitter, Salil Mehta, even created an entire YouTube video to argue this point.


A few comments


I have no doubt that both are competent statisticians. However, I suspect at least one of them has internalized the conventional wisdom on US health spending and leaped to the conclusion that I must have been playing fast and loose with the statistics to produce anything resembling the r-squared I presented in the plot. Even if their concerns are purely technical and limited in scope, not everyone will treat them as such, so I’ll say a few more words on this topic.


I do recognize there is a perfectly legitimate concern over the use of r-squared statistics with non-linear models and that there is always the potential for over-fitting.


While accounting for observed non-linear increase puts the US substantially closer to trend, my core arguments genuinely do not hinge on this.  It’s a distant secondary point at best.  Indeed, my earlier posts on the explanatory power of disposable income and consumption focused near-exclusively on linear specifications.   Merely switching to more robust national indicators of material living conditions, as opposed to less predictive, yet commonly cited, GDP or GNI, dramatically alters this picture for the US specifically and does a much better job of predicting health expenditures within and between countries (even if the US is excluded).  I would be quite content to get more people to simply accept this argument!


Salil asserts that the indicated r-squared is entirely the result of the “highly complex” non-linear model and would presumably be much smaller but for these extra degrees of freedom (“r-squared….less than 0.5”). In fact, the r-squared from a plain linear specification is quite similar!


Whereas I obtained r-squared of .94 (adjusted r^2=.93) in the above (previously undisclosed) 2nd-order polynomial specification (just one more degree of freedom), I obtained r-squared of .91 in the linear specification on precisely the same data.
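
A minimal sketch of the adjusted r-squared arithmetic, for readers who want to see how small the penalty for one extra term is. The sample size below is a placeholder assumption, not the number of countries in the plot, so the printed values are illustrative and do not reproduce the figures above.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjusted r-squared: penalize r2 for each predictor besides the intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical sample size (an assumption, not taken from the plotted data).
N_OBS = 30

linear_adj = adjusted_r2(0.91, N_OBS, 1)     # linear specification: one slope term
quadratic_adj = adjusted_r2(0.94, N_OBS, 2)  # 2nd-order polynomial: two terms

print(round(linear_adj, 3), round(quadratic_adj, 3))
```

With a few dozen observations, one additional parameter moves the adjusted figure only slightly, which is why the ranking of the two specifications does not change.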

Even this crude linear specification would have the US spending 2-3 times the OECD average — much more than many naively argue we would consume if we had a health system that looked more like the typical OECD country.



High r-squared is also obtained with other sorts of standard linear models:


And the r-squared is also quite high when health care expenditure (HCE) is measured as a share of disposable income.



While r-squared statistics are uninterpretable for models that are genuinely non-linear, the polynomial model I plotted is actually a linear model in the statistical sense: it is non-linear in income, but it is “linear in the parameters.” [cite, cite, cite] Therefore, while Taleb and others may question the extra degree of freedom it adds, it is not inherently wrong to show an r-squared statistic.
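
To make “linear in the parameters” concrete, here is a minimal sketch using NumPy. The data are synthetic and noise-free (an assumption chosen so the recovered coefficients are easy to check); this is not the code or data behind the plot.

```python
import numpy as np

# Synthetic, noise-free quadratic (assumption): y = 2 + 0.5*x + 0.3*x^2.
x = np.linspace(1.0, 10.0, 25)
y = 2.0 + 0.5 * x + 0.3 * x**2

# The model y = b0 + b1*x + b2*x^2 is non-linear in x but linear in (b0, b1, b2),
# so fitting it is ordinary least squares on the columns [1, x, x^2].
design = np.column_stack([np.ones_like(x), x, x**2])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)

# The usual r-squared definition applies unchanged.
fitted = design @ coefs
ss_res = np.sum((y - fitted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(coefs, r2)
```

Because the squared term is just another column in the design matrix, the fit is the same least-squares problem as a multiple regression with two predictors.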

My original plot (the fitted line and the actuals) should have made this reasonably clear visually, I think. It did not suggest a complex model whose extra terms would have radically inflated the r-squared.

This time I’ll plot the equation and the adjusted r-squared.


The residuals on the same 2014 plot look kinda random to my eyes.


Likewise with 2012 data

Similarly, if I feed data from 2012 into the same model, I get much the same plot.
[note: While I expect some very modest secular time trend, both HCE and income are expressed in constant PPP-adjusted 2010 USD]


Both polynomial terms in the original plot are statistically significant with the initial orthogonal polynomial transformation I used.


The orthogonal polynomial is model two.  I included the more readily recognizable polynomial specification in model three to show that it generates an identical fit.  That these terms are both statistically significant, even on the more limited cross-sectional data I plotted for 2014, lends some support to the view that the true trend is non-linear with respect to income.
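
The claim that an orthogonal polynomial basis and the raw polynomial terms give an identical fit can be checked with a short sketch. It uses a QR decomposition as one way to build an orthogonal basis, which is an assumption and not necessarily the transformation used for model two; the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): hypothetical incomes and a noisy quadratic response.
x = np.linspace(10.0, 50.0, 30)
y = 1.0 + 0.2 * x + 0.01 * x**2 + rng.normal(0.0, 1.0, x.size)

# Raw polynomial design matrix: columns 1, x, x^2.
raw = np.column_stack([np.ones_like(x), x, x**2])

# QR yields orthonormal columns that span the same space as the raw columns.
orth, _ = np.linalg.qr(raw)

raw_coefs, *_ = np.linalg.lstsq(raw, y, rcond=None)
orth_coefs, *_ = np.linalg.lstsq(orth, y, rcond=None)

raw_fit = raw @ raw_coefs
orth_fit = orth @ orth_coefs

# The coefficients differ, but the fitted values agree.
print(np.allclose(raw_fit, orth_fit))
```

The two bases span the same column space, so the fitted values and the r-squared are the same. Only the individual coefficients and their standard errors change, which is why significance tests on individual terms can look different between the two parameterizations.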


The non-linearity in the cross-sectional OECD data may be somewhat debatable in any given year. However, having studied this data on many prior occasions and from many different angles, I already knew that cross-sectional analyses covering many more countries, and a more economically diverse array of them, make a compelling case for these sorts of models.

For example, using data from the World Bank’s 2011 International Comparison Program (ICP), we find this:


I’ve included a loess smoother to make clear that the linear model is systematically biased.  This is even more evident if I plot the residuals against the fitted values.
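
The kind of systematic bias described here is easy to reproduce on toy data. The sketch below fits a straight line to a synthetic convex curve (an assumption, not the ICP data) and averages the residuals within the low, middle, and high parts of the range. Binning stands in for the loess smoother as a cruder way to see the same pattern.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean(values):
    return sum(values) / len(values)


# Synthetic convex "spending" curve (assumption): spending = 0.05 * income^2.
incomes = [float(v) for v in range(1, 21)]
spending = [0.05 * x**2 for x in incomes]

intercept, slope = fit_line(incomes, spending)
residuals = [y - (intercept + slope * x) for x, y in zip(incomes, spending)]

# The slope is positive, so ordering by income is also ordering by fitted value.
low_mean = mean(residuals[:6])
mid_mean = mean(residuals[6:14])
high_mean = mean(residuals[14:])

print(low_mean, mid_mean, high_mean)
```

A well-specified model would leave residual averages near zero in every bin. Here the line under-predicts at both ends and over-predicts in the middle, the U-shaped signature of a missing higher-order term.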


My preferred model specification here, given the unusually extensive range of economic conditions and the modest bias found in the 2nd-degree polynomial, is a 3rd-degree polynomial.  As you can see in model three, all terms are very much significant on the full dataset whereas when I exclude the US, as in model four, this term loses its significance.


Plotted as in model three (including the US):


Plotted as in model four (excluding the US, but projecting the model out):



Much the same goes for my prior analysis of OECD panel data.  I will briefly re-do part of that analysis here.

For instance, if we coarsely plot HCE by Actual Individual Consumption (AIC):


The 3rd-degree polynomial model explains substantially more of the variance and lacks the readily apparent bias found in the linear specification.



I would not suggest this model is perfect. Amongst other issues, there is a small independent secular trend.   However, if I plot the residuals by year, it nonetheless holds up reasonably well.


Moving to slightly more complex models, wherein I account for linear time trends or year fixed effects, I find time only has a modest effect (as expected).  The time coefficients in both specifications are relatively small (averaging around 7 dollars per year) and it does not dramatically alter these conclusions.


All polynomial coefficients are still very much statistically significant and similar in magnitude.
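
For readers unfamiliar with this kind of specification, here is a minimal sketch of a 3rd-degree polynomial in income plus a linear time trend, fitted by OLS with NumPy. Everything about the data is an assumption: the panel is synthetic, the country count and growth rate are invented, and the 7-per-year trend built into it simply echoes the rough magnitude mentioned above. It shows the mechanics, not my results.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic panel (assumption): 8 hypothetical countries, 1970-2015,
# each starting at a different income (in thousands) and growing 2% a year.
years = np.arange(1970, 2016)
base_income = np.linspace(8.0, 20.0, 8)
growth = 1.02 ** (years - 1970)
income = (base_income[:, None] * growth[None, :]).ravel()
year = np.tile((years - 1970).astype(float), base_income.size)

# Invented data-generating process: cubic in income plus a small time trend.
TRUE_TREND = 7.0
hce = (50.0 + 20.0 * income + 1.5 * income**2 + 0.02 * income**3
       + TRUE_TREND * year + rng.normal(0.0, 5.0, income.size))

# Design matrix: intercept, three polynomial terms, linear time trend.
design = np.column_stack(
    [np.ones_like(income), income, income**2, income**3, year]
)
coefs, *_ = np.linalg.lstsq(design, hce, rcond=None)
trend_estimate = coefs[-1]

print(trend_estimate)
```

The trend is identifiable because countries reach a given income level in different years. That is also what lets the panel data separate the effect of income from the passage of time.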


All of these models suggest that the passage of time, i.e., changes in technology, disease burdens, social norms, etc., has only modest explanatory power once changes in material living conditions have been adequately accounted for. While changes in medical technology and know-how are likely to be a primary proximate cause of spending in high-income countries, they are likely not a substantial root cause.

To put this into perspective:


Changes in material living conditions (the X-axis) are doing almost all of the work in this model, not time (the Y-axis). Put differently, this model indicates a country with a given constant income would be expected to spend roughly the same amount on health in 1970 as in 2015 (+/- 400 USD, and less on average over the period).

If I plot the fitted values against the actuals for each country using a 3rd-degree polynomial and an annual time trend (fewer degrees of freedom than year FE), it looks quite reasonable.


Likewise for fitted versus residuals


This still seemingly modestly underpredicts both the relatively low-income and relatively high-income observations (time+country), which an extra degree (4th-order) would smooth out considerably, but I’m not going to get into that here.



Cross-sectional linear specifications perform abysmally in forecasting future expenditures, even when we adjust for inflation and compare against the same presumably comparable set of countries (excluding the US).


As these countries moved towards higher real income levels, the cross-sectional data suggest an increasingly steep slope, and the average spending level is much higher than earlier regressions would have predicted. The linear slopes are increasing here in a way that is consistent with the polynomial fit observed in the panel data. Because these are computed in constant PPP-adjusted dollars, inflation is unlikely to explain it. The time trend coefficients I presented earlier also suggest the passage of time (technology, social norms, etc.) is unlikely to explain this either.
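
The mechanism is simple enough to show on toy numbers. In the sketch below, spending follows the same hypothetical convex function of income in both periods (an assumption, and the incomes are invented). The only thing that changes is that every country gets richer.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def spending(income):
    """Hypothetical convex income-spending relationship, fixed over time."""
    return 0.01 * income**2


# Invented real incomes (thousands) for the same six countries in two periods.
early_incomes = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
late_incomes = [x + 10.0 for x in early_incomes]

early_intercept, early_slope = fit_line(
    early_incomes, [spending(x) for x in early_incomes]
)
late_intercept, late_slope = fit_line(
    late_incomes, [spending(x) for x in late_incomes]
)

# Forecast later average spending by extrapolating the early cross-sectional line.
late_mean_income = sum(late_incomes) / len(late_incomes)
forecast_late_mean = early_intercept + early_slope * late_mean_income
actual_late_mean = sum(spending(x) for x in late_incomes) / len(late_incomes)

print(early_slope, late_slope, forecast_late_mean, actual_late_mean)
```

Nothing about the underlying relationship changes between the two periods, yet the later cross-section has a steeper linear slope and the early line under-forecasts later spending. This is the pattern described above, produced without any inflation or time effect.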


I can respect a principled argument that there is some uncertainty, given the substantial difference between the US and most of the rest of the OECD vis-a-vis income and health spending, as Salil Mehta argued on YouTube.


However, if one is to be consistent, this view should also call into question assertions that the US’s high health expenditures cannot be explained by income, as Frakt and Carroll implied in the NYTimes and have surely explicitly argued on their blog (polynomials and r^2s too!). At least it should if one uses measures of average material living conditions, such as Actual Individual Consumption or Adjusted Household Disposable Income, which are more robust predictors of health expenditures!

That being said, my choice of PPPs (AIC PPPs, which better indicate household purchasing power than GDP PPPs do) amplifies the US difference on both axes (US households are more affluent than typical comparisons let on). US HCE is not all that far from the relatively high-income OECD countries in the latest data when figures are converted to GDP PPPs instead.


We can also plot it like this:


For comparison’s sake, at AIC PPPs:


More generally, if the argument is that there is something fundamental about the US health system that causes exceptionally high health spending conditional on incomes, we can reasonably compare what the US spent on health in relation to its income (or consumption) to other countries of comparable means in constant PPP-adjusted terms.


What we find is that countries with similar income levels tend to spend comparable shares of their income (or overall consumption) on health in the long run (i.e., countries tend to revert to broader international trends). I, for one, certainly do not think it is a coincidence that when the US had contemporaneous material living conditions closer to the likes of Switzerland and Norway, its spending was much more in line with the OECD average. Presumably, the same idiosyncratic market failures that are supposed to cause excessive health expenditures at present should also have produced excessive spending prior to the mid-80s.



Not until US material living conditions diverged rapidly from the OECD did spending jump so clearly away from the linear trend. (N.B. Keep in mind that there is some noise in all of these measurements and that the observed income response lag varies somewhat between countries; US private spending, in particular, responds more rapidly to increases and decreases.)





The change in US national health expenditures is also rather well explained by the change in material living conditions in time series analysis (which tends to support causal inference).

This has been pointed out by a good number of people.  For example, Victor Fuchs did so with GDP (less than ideal as a measure of material living conditions, but still a reasonable proxy for it) using moving averages.


Louise Sheiner of the Brookings Institution fitted a reasonable model using out of sample data on GDP.


CMS has, for years, used disposable personal income (DPI) as the primary exogenous variable in their long-term projections and has made similar observations as I have about its utility as compared to GDP.


Disposable Personal Income is a much stronger indicator of domestic material living conditions and, not coincidentally, a much stronger predictor of changes in domestic health spending than GDP.

Instead of using a moving average, I’ll show much the same for Personal Consumption Expenditures (PCE) and Disposable Personal Income (DPI), the approximate domestic counterparts to AIC and NAHDI respectively, using loess smoothers.


We also find this linkage if we deflate health expenditures by the PCE health price index or use the BLS’s supplied quantity index.


Contrary to the arguments made by Frakt and Carroll, though some of the growth can be attributed to excess health inflation (health inflation over and above the general rate of inflation), most of the long run growth is attributable to changes in the volumes (quantities) of health care consumed per capita.  It is also likely that a large part of the excess health inflation can be attributed to Baumol’s cost disease (rising income levels).
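
The split between excess health inflation and volume is just a multiplicative decomposition, sketched below. All three growth factors are invented round numbers (an assumption) chosen to make the arithmetic easy to follow. They are not estimates from the PCE or BLS series.

```python
import math

# Hypothetical cumulative growth factors over some long period (assumptions).
nominal_spending_factor = 20.0  # nominal health spending per capita
general_price_factor = 5.0      # general price level
health_price_factor = 8.0       # health-specific price index

# Real growth = nominal growth deflated by general prices.
real_growth_factor = nominal_spending_factor / general_price_factor

# Excess health inflation = health prices relative to general prices.
excess_inflation_factor = health_price_factor / general_price_factor

# Volume (quantity) growth = nominal growth deflated by health prices.
volume_factor = nominal_spending_factor / health_price_factor

# The factors multiply, so shares of growth are additive in logs.
volume_share = math.log(volume_factor) / math.log(real_growth_factor)
price_share = math.log(excess_inflation_factor) / math.log(real_growth_factor)

print(real_growth_factor, excess_inflation_factor, volume_factor)
print(round(volume_share, 3), round(price_share, 3))
```

Real growth equals excess inflation times volume growth, so the two log shares always sum to one. Which component dominates in practice depends entirely on the price index used to deflate, which is why the choice of the PCE health price index or the BLS quantity index matters.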

Many other expenditure categories respond to changes in material living conditions and even respond more rapidly, with greater volatility, but health is one of a few categories of expenditure whose growth persistently and dramatically exceeds the rate of long-run income growth in time series, cross-sectional, and panel data analyses.


(Of course, many of these items would likely show at least somewhat higher growth as quantity indices, given differences in inflation rates.)


