Modelling Stillbirth

William Easterly and Laura Freschi go after "Inception Statistics" in the latest post on AidWatch. They criticize -- in typically hyperbolic style, with bonus points for the pun in the title -- both the stillbirth estimates themselves and their coverage in the news media. I left a comment on their blog outlining my thoughts, but thought I'd re-post them here with a little more explanation. Here's what I said:

Thanks for this post (it's always helpful to look critically at the quality of estimates), but I think the direction of your criticism needs to be clarified. Which of the following are you upset about (choose all that apply)?

a) The fact that the researchers used models at all? I don't know the researchers personally, but I imagine they are concerned with data quality in general and would have much preferred reliable data from all the countries they work with. But in the absence of that data (and while working towards it), isn't it helpful to have the best possible estimates on which to set global health policy, while acknowledging their limitations? Based on the available data, is there a better way to estimate these figures, or do you think we'd be better off without them (in which case stillbirth might be getting even less attention)?

b) A misrepresentation of their data as something other than a model? If so, could you please specify where you think that mistake occurred? To me it seems like they present it in the literature as what it is and nothing more.

c) The coverage of these data in the media? On that I basically agree. It's helpful to have critical viewpoints on articles where there is legitimate disagreement.

I get the impression your main beef is with (c), in which case I agree that press reports should be more skeptical. But I think calling the data "made up" also goes too far. Yes, it'd be nice to have pristine data for everything, but in the meantime we should aim for the best possible estimates, because we need something on which to base policy decisions. Along those lines, I think this commentary by Neff Walker (full disclosure: my advisor) in the same issue is worthwhile. Walker asks these five questions, noting areas where the estimates need improvement:

- "Do the estimates include time trends, and are they geographically specific?" (because these allow you to cross-check numbers for credibility)
- "Are modelled results compared with previous estimates and differences explained?"
- "Is there a logical and causal relation between the predictor and outcome variables in the model?"
- "Do the reported measures of uncertainty around modelled estimates show the amount and quality of available data?"
- "How different are the settings from which the datasets used to develop the model were drawn from those to which the model is applied?" (here Walker says further work is needed)
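To make the kind of modelling at issue concrete, here is a minimal, entirely hypothetical sketch of the general approach: fit a regression on countries that do have reliable vital registration data, then predict rates (with uncertainty intervals) for countries that don't. The covariates, numbers, and model form below are invented for illustration; this is not the Lancet authors' actual method.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# --- Synthetic "observed" data: countries with vital registration ---
# Hypothetical covariates: log neonatal mortality rate and log GDP per capita.
n_obs = 40
log_nmr = rng.normal(2.5, 0.6, n_obs)   # log(neonatal deaths / 1000 births)
log_gdp = rng.normal(8.5, 1.0, n_obs)   # log(GDP per capita, USD)
# A made-up "true" relationship plus noise:
log_sbr = 0.8 * log_nmr - 0.15 * log_gdp + 1.0 + rng.normal(0, 0.2, n_obs)

# --- Fit ordinary least squares by hand ---
X = np.column_stack([np.ones(n_obs), log_nmr, log_gdp])
beta, *_ = np.linalg.lstsq(X, log_sbr, rcond=None)
resid = log_sbr - X @ beta
dof = n_obs - X.shape[1]
s2 = resid @ resid / dof                # residual variance
XtX_inv = np.linalg.inv(X.T @ X)

# --- Predict for a country with no registration data ---
x_new = np.array([1.0, 3.2, 7.0])       # hypothetical covariate values
pred = x_new @ beta
# Prediction interval: reflects both parameter and residual uncertainty.
se_pred = np.sqrt(s2 * (1 + x_new @ XtX_inv @ x_new))
t_crit = stats.t.ppf(0.975, dof)
lo, hi = pred - t_crit * se_pred, pred + t_crit * se_pred

print(f"Predicted stillbirth rate: {np.exp(pred):.1f} per 1000 births")
print(f"95% interval: ({np.exp(lo):.1f}, {np.exp(hi):.1f})")
```

The width of that interval is what Walker's fourth question is probing: whether the reported uncertainty honestly reflects how much, and how good, the underlying data are.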

I'll admit to being in over my head in evaluating these particular models. As Easterly and Freschi note, "the number of people who actually understand these statistical techniques well enough to judge whether a certain model has produced a good estimate or a bunch of garbage is very, very small." Very true. But in the absence of better data, we need models on which to base decisions -- otherwise we're basing those decisions on uninformed guesswork rather than informed guesswork.

I think the criticism of media coverage is valid. Even if these models are the best ever, they should still be reported as good estimates at best. But when Easterly calls the data "made up" I think the hyperbole is counterproductive. There's an incredibly wide spectrum of data quality, from completely pulled-out-of-the-navel to comprehensive data from a perfectly functioning vital registration system. We should recognize that the data we work with aren't perfect. And there probably is a cut-off point at which estimates are based on so many models-within-models that they are harmful rather than helpful in making informed decisions. But are these particular estimates at that point? I would need to see a much more robust criticism than AidWatch has provided so far to be convinced that these estimates aren't helpful in setting priorities.
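That "models-within-models" worry can be made precise: when one model's output becomes another model's input, the errors compound. A quick Monte Carlo sketch (with a made-up 20% error per layer, chosen purely for illustration) shows how the combined uncertainty grows with each additional layer:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims = 100_000
true_value = 100.0

# Each modelling layer introduces a multiplicative error of roughly 20%
# (a made-up figure, chosen only to illustrate the compounding).
layer_error = 0.20

estimate = np.full(n_sims, true_value)
for layer in range(1, 5):
    # Each layer's output feeds the next layer as its input.
    estimate *= rng.lognormal(mean=0.0, sigma=layer_error, size=n_sims)
    cv = estimate.std() / estimate.mean()   # coefficient of variation
    print(f"After {layer} layer(s): CV = {cv:.2f}")
```

Uncertainty grows with every layer, but gradually (roughly with the square root of the number of layers), which is why stacked models can still be useful well past the first layer. Whether these particular estimates sit on the tolerable or intolerable side of that curve is exactly the question a robust critique would need to answer.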