Miscellany: Epidemic City and life expectancy

In 8 days I'll be done with my first year of graduate studies and will have a chance to write a bit more. I've been keeping notes all year on things to write about when I have more time, so I should have no shortage of material! In the meantime, two links to share: 1) Just in time for my summer working with the New York City Department of Health comes Epidemic City: The Politics of Public Health in New York. The Amazon / publisher's blurb:

The first permanent Board of Health in the United States was created in response to a cholera outbreak in New York City in 1866. By the mid-twentieth century, thanks to landmark achievements in vaccinations, medical data collection, and community health, the NYC Department of Health had become the nation's gold standard for public health. However, as the city's population grew in number and diversity, new epidemics emerged, and the department struggled to balance its efforts between the treatment of diseases such as AIDS, multi-drug resistant tuberculosis, and West Nile Virus and the prevention of illness-causing factors like lead paint, heroin addiction, homelessness, smoking, and unhealthy foods. In Epidemic City, historian of public health James Colgrove chronicles the challenges faced by the health department in the four decades following New York City's mid-twentieth-century peak in public health provision.

This insightful volume draws on archival research and oral histories to examine how the provision of public health has adapted to the competing demands of diverse public needs, public perceptions, and political pressure.

Epidemic City delves beyond a simple narrative of the NYC Department of Health's decline and rebirth to analyze the perspectives and efforts of the people responsible for the city's public health from the 1960s to the present. The second half of the twentieth century brought new challenges, such as budget and staffing shortages, and new threats like bioterrorism. Faced with controversies such as needle exchange programs and AIDS reporting, the health department struggled to maintain a delicate balance between its primary focus on illness prevention and the need to ensure public and political support for its activities.

In the past decade, after the 9/11 attacks and bioterrorism scares partially diverted public health efforts from illness prevention to threat response, Mayor Michael Bloomberg and Department of Health Commissioner Thomas Frieden were still able to work together to pass New York's Clean Indoor Air Act restricting smoking and significant regulations on trans-fats used by restaurants. Because of Bloomberg's willingness to exert his political clout, both laws passed despite opposition from business owners fearing reduced revenues and activist groups who decried the laws' infringement upon personal freedoms. This legislation, preventative in nature much like the 1960s lead paint laws and the department's original sanitary code, reflects a return to the 19th-century roots of public health, when public health measures were often overtly paternalistic. The assertive laws conceived by Frieden and executed by Bloomberg demonstrate how far the mandate of public health can extend when backed by committed government officials.

Epidemic City provides a compelling historical analysis of the individuals and groups tasked with negotiating the fine line between public health and political considerations during the latter half of the twentieth century. By examining the department's successes and failures during the ambitious social programs of the 1960s, the fiscal crisis of the 1970s, the struggles with poverty and homelessness in the 1980s and 1990s, and in the post-9/11 era, Epidemic City shows how the NYC Department of Health has defined the role and scope of public health services, not only in New York, but for the entire nation.

2) Aaron Carroll at the Incidental Economist writes about the subtleties of life expectancy. His main point is that infant mortality skews life expectancy figures so much that if you're talking about end-of-life expectations for adults who have already made it past the (historically) most perilous years of childhood, you really need to look at different data altogether.

The blue points on the graph below show life expectancy at birth for all races in the US, while the red line shows life expectancy among those who have reached the age of 65. That is, if you're a 65-year-old who wants to know your chances of dying (on average!) within a certain period of time, it's best to consult a more complete life table rather than life expectancy at birth, because you've already dodged the bullet for 65 years.

(from the Incidental Economist)
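To see why the two series differ, here's a minimal sketch in Python of how both numbers can be computed from the same life table. The death probabilities below are invented purely for illustration (they are not real US data), but the logic is the standard one: life expectancy at birth averages over everyone, including infant deaths, while expectancy at 65 only averages over people who already survived that far.

```python
import math

# Toy life-table calculation. The mortality probabilities qx are made up
# for illustration only -- they are NOT real US life-table values.

def remaining_life_expectancy(qx, start_age=0):
    """Expected remaining years of life for someone alive at start_age,
    given qx[age] = probability of dying between age and age + 1."""
    alive = 1.0          # fraction of the starting cohort still alive
    person_years = 0.0   # years lived beyond start_age, per starting person
    for age in range(start_age, len(qx)):
        # survivors contribute a full year; those who die contribute ~half
        person_years += alive * (1 - qx[age]) + alive * qx[age] * 0.5
        alive *= 1 - qx[age]
    return person_years

MAX_AGE = 110
qx = [min(1.0, 0.0001 * math.exp(0.09 * age)) for age in range(MAX_AGE + 1)]
qx[0] = 0.05          # exaggerated infant mortality, to make the point
qx[MAX_AGE] = 1.0     # close out the table

e_at_birth = remaining_life_expectancy(qx, start_age=0)
e_at_65 = remaining_life_expectancy(qx, start_age=65)
print(f"Life expectancy at birth:            {e_at_birth:.1f} years")
print(f"Expected age at death, given age 65: {65 + e_at_65:.1f} years")
```

Even with these made-up numbers, the expected age at death for someone who has reached 65 comes out higher than life expectancy at birth, which mirrors the gap between the blue points and the red line in Carroll's graph.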

Modelling Stillbirth

William Easterly and Laura Freschi go after "Inception Statistics" in the latest post on AidWatch. They criticize -- in typically hyperbolic style, with bonus points for the pun in the title -- both the estimates of stillbirth and their coverage in the news media. I left a comment on their blog outlining my thoughts but thought I'd re-post them here with a little more explanation. Here's what I said:

Thanks for this post (it’s always helpful to look at quality of estimates critically) but I think the direction of your criticism needs to be clarified. Which of the following are you upset about (choose all that apply)?

a) the fact that the researchers used models at all? I don’t know the researchers personally, but I would imagine that they are concerned with data quality in general and would have much preferred to have had reliable data from all the countries they work with. But in the absence of that data (and while working towards it) isn’t it helpful to have the best possible estimates on which to set global health policy, while acknowledging their limitations? Based on the available data, is there a better way to estimate these, or do you think we’d be better off without them (in which case stillbirth might be getting even less attention)?

b) a misrepresentation of their data as something other than a model? If so, could you please specify where you think that mistake occurred — to me it seems like they present it in the literature as what it is and nothing more.

c) the coverage of these data in the media? On that I basically agree. It’s helpful to have critical viewpoints on articles where there is legitimate disagreement.

I get the impression your main beef is with (c), in which case I agree that press reports should be more skeptical. But I think calling the data “made up” goes too far too. Yes, it’d be nice to have pristine data for everything, but in the meantime we should try for the best possible estimates because we need something on which to base policy decisions. Along those lines, I think this commentary by Neff Walker (full disclosure: my advisor) in the same issue is worthwhile. Walker asks these five questions – noting areas where the estimates need improvement:

- “Do the estimates include time trends, and are they geographically specific?” (because these allow you to crosscheck numbers for credibility)
- “Are modelled results compared with previous estimates and differences explained?”
- “Is there a logical and causal relation between the predictor and outcome variables in the model?”
- “Do the reported measures of uncertainty around modelled estimates show the amount and quality of available data?”
- “How different are the settings from which the datasets used to develop the model were drawn from those to which the model is applied?” (here Walker says further work is needed)

I'll admit to being in over my head in evaluating these particular models. As Easterly and Freschi note, "the number of people who actually understand these statistical techniques well enough to judge whether a certain model has produced a good estimate or a bunch of garbage is very, very small." Very true. But in the absence of better data, we need models on which to base decisions -- otherwise we're basing those decisions on uninformed guesswork rather than informed guesswork.

I think the criticism of media coverage is valid. Even if these models are the best ever they should still be reported as good estimates at best. But when Easterly calls the data "made up" I think the hyperbole is counterproductive. There's an incredibly wide spectrum of data quality, from completely pulled-out-of-the-navel to comprehensive data from a perfectly-functioning vital registration system. We should recognize that the data we work with aren't perfect. And there probably is a cut-off point at which estimates are based on so many models-within-models that they are hurtful rather than helpful in making informed decisions. But are these particular estimates at that point? I would need to see a much more robust criticism than AidWatch has provided so far to be convinced that these estimates aren't helpful in setting priorities.

"Small Changes, Big Results"

The Boston Review has a whole new set of articles on the movement of development economics towards randomized trials. The main article is Small Changes, Big Results: Behavioral Economics at Work in Poor Countries and the companion and criticism articles are here. They're all worth reading, of course. I found them through Chris Blattman's new post "Behavioral Economics and Randomized Trials: Trumpeted, Attacked, and Parried." I want to re-state a point I made in the comments there, because I think it's worth re-wording to get it right. It's this: I often see the new randomized trials in economics compared to clinical trials in the medical literature. There are many parallels, to be sure, but the medical literature is huge, and one particular subset of it offers much closer parallels.

Within global health research there are a slew of large (and not-so-large), randomized (and otherwise rigorously designed), controlled (placebo or not) trials that are done in "field" or "community" settings. The distinction is that clinical trials usually draw their study populations from a hospital or other clinical setting, so their results generalize to the broader population (external validity) only to the extent that the clinical population is representative of the whole population, whereas community trials are designed to draw from everyone in a given community.

Because these trials draw their subjects from whole communities -- and they're often cluster-randomized so that whole villages or clinic catchment areas are the unit that's randomized, rather than individuals -- they are typically larger, more expensive, more complicated and pose distinctive analytical and ethical problems. There's also often room for nesting smaller studies within the big trials, because the big trials are already recruiting large numbers of people meeting certain criteria and there are always other questions that can be answered using a subset of that same population. [All this is fresh on my mind since I just finished a class called "Design and Conduct of Community Trials," which is taught by several Hopkins faculty who run very large field trials in Nepal, India, and Bangladesh.]
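For readers who haven't seen one, here is a toy sketch (hypothetical village names, not drawn from any actual trial or from that course) of what cluster randomization means in practice: the village, not the individual, is the unit that gets assigned to an arm.

```python
# Toy sketch of cluster randomization: whole (hypothetical) villages, not
# individual people, are assigned to the treatment or control arm.
import random

villages = [f"village_{i:02d}" for i in range(1, 21)]   # 20 made-up clusters

random.seed(2011)            # fixed seed so the allocation is reproducible
random.shuffle(villages)
treatment_arm = sorted(villages[:10])
control_arm = sorted(villages[10:])

print("Treatment clusters:", treatment_arm)
print("Control clusters:  ", control_arm)
# Everyone in a treatment village receives the intervention; outcomes are then
# compared between arms, with analysis adjusted for within-village correlation.
```

Because outcomes within a village tend to be correlated, the effective sample size is smaller than the raw number of individuals enrolled, which is part of why these trials end up larger and analytically trickier than individually randomized ones.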

Blattman is right to argue for registration of experimental trials in economics research, as is done with medical studies. (For nerdy kicks, you can browse registered trials at ISRCTN.) But many of the problems he quotes Eran Bendavid describing in economics trials--"Our interventions and populations vary with every trial, often in obscure and undocumented ways"--can also be true of community trials in health.

Likewise, these trials -- which can take years and hundreds of thousands of dollars to run -- often yield a lot of knowledge about the process of how things are done. Essential elements include doing good preliminary studies (such as validating your instruments), having continuous qualitative feedback on how the study is going, and gathering extra data on "process" questions so you'll know why something worked or not, and not just whether it did (a lot of this is addressed in Blattman's "Impact Evaluation 2.0" talk). I think the best parallels for what that research should look like in practice will be found in the big community trials of health interventions in the developing world, rather than in clinical trials in US and European hospitals.

Evaluation in education (and elsewhere)

Jim Manzi has some fascinating thoughts on evaluating teachers at the American Scene. Some summary outtakes:

1. Remember that the real goal of an evaluation system is not evaluation. The goal of an employee evaluation system is to help the organization achieve an outcome....

2. You need a scorecard, not a score. There is almost never one number that can adequately summarize the performance of complex tasks like teaching that are executed as part of a collective enterprise....

3. All scorecards are temporary expedients. Beyond this, no list of metrics can usually adequately summarize performance, either....

4. Effective employee evaluation is not fully separable from effective management

When you zoom out to a certain point, all complex systems in need of reform start to look alike, because they all combine social, political, economic, and technical challenges, and the complexity, irrationality, and implacability of human behavior rear their ugly heads at each step of the process. The debates about tactics and strategy and evaluation for reforming American education or US aid policy or improving health systems or fostering economic development start to blend together, so that Manzi's conclusions sound oddly familiar:

So where does this leave us? Without silver bullets.

Organizational reform is usually difficult because there is no one, simple root cause, other than at the level of gauzy abstraction. We are faced with a bowl of spaghetti of seemingly inextricably interlinked problems. Improving schools is difficult, long-term scut work. Market pressures are, in my view, essential. But, as I’ve tried to argue elsewhere at length, I doubt that simply “voucherizing” schools is a realistic strategy...

Read the rest of his conclusions here.

Academic vs. Applied... Everything

When I posted on Academic vs. Applied Epi I included the following chart:

Then I realized that this breakdown likely works pretty well for other fields too. I sent a link to an economist friend, who responded: "No doubt this is similar with econ. The theoreticians live in a world of (wrong) assumptions, while the practitioners are facing the tough policy challenges. And there are quite a few similarities with the below...such as urgency etc."

You can replace "physicians" with "economists" or many other professions and the chart still works. Contrasting academic economics researchers with policymakers, the fields for Timeline, Data quality, Scientific values, Outputs, and Competencies needed all hold up pretty well.

Many positions that are basically epidemiological in nature are filled by physicians who have clinical training but very little formal training in public health or epidemiology, and there's a strong parallel in the policy realm. Some sort of graduate training is generally necessary for many jobs, so those aiming for the applied track tend to get multipurpose 'public policy' degrees that the more purist academics often view as weak, while those studying public policy deride the inapplicability of the theoretical work done by academics. And the fact that many academic fields are oriented towards a set of skills primarily useful in pursuits the more applied practitioners don't value highly may go a long way towards explaining the animosity between the two camps.

Randomizing in the USA, ctd

[Update: There's quite a bit of new material on this controversy if you're interested. Here's a PDF of Seth Diamond's testimony in support of (and extensive description of) the evaluation at a recent hearing, along with letters of support from a number of social scientists and public health researchers. Also, here's a separate article on the City Council hearing at which Diamond testified, and an NPR story that basically rehashes the Times one. Michael Gechter argues that the testing is wrong because there isn't doubt about whether the program works, but, as noted in the comments there, he doesn't acknowledge that denial of service was already part of the program because it was underfunded.]

A couple weeks ago I posted a link to this NYTimes article on a program of assistance for the homeless that's currently being evaluated by a randomized trial. The Poverty Action Lab blog had some discussion on the subject that you should check out too.

The short version is that New York City has a housing assistance program that is supposed to keep people from becoming homeless, but they never gave it a truly rigorous evaluation. It would have been better to evaluate it up front (before the full program was rolled out), but they didn't do that, and now they are. The policy isn't proven to work, and they don't have resources to give it to everyone anyway, so instead of using a waiting list (arguably a fair system) they're randomizing people into receiving the assistance or not, and then tracking whether they end up homeless. If that makes you a little uncomfortable, that's probably a good thing -- it's a sticky issue, and one that might wrongly be easier to brush aside when working in a different culture. But I think on balance it's still a good idea to evaluate programs when we don't know if they actually do what they're supposed to do.

The thing I want to highlight for now is how much the tone and presentation of an article shape your reaction to the issue being discussed. There's obviously an effect, but I thought this would be a good example because I noticed that the Times article contains both valid criticisms of the program and a good defense of why it makes sense to test it.

I reworked the article by rearranging the presentation of those sections. Mostly I just shifted paragraphs, but in a few cases I rearranged some clauses as well. I changed the headline, but otherwise I didn't change a single word, other than clarifying some names when they were introduced in a different order than in the original. And by leading with the rationale for the policy instead of with the emotional appeal against it, I think the article gives a much different impression. Let me know what you think:

City Department Innovates to Test Policy Solutions

By CARA BUCKLEY with some unauthorized edits by BRETT KELLER

It has long been the standard practice in medical testing: Give drug treatment to one group while another, the control group, goes without.

Now, New York City is applying the same methodology to assess one of its programs to prevent homelessness. Half of the test subjects — people who are behind on rent and in danger of being evicted — are being denied assistance from the program for two years, with researchers tracking them to see if they end up homeless.

New York City is among a number of governments, philanthropies and research groups turning to so-called randomized controlled trials to evaluate social welfare programs.

The federal Department of Housing and Urban Development recently started an 18-month study in 10 cities and counties to track up to 3,000 families who land in homeless shelters. Families will be randomly assigned to programs that put them in homes, give them housing subsidies or allow them to stay in shelters. The goal, a HUD spokesman, Brian Sullivan, said, is to find out which approach most effectively ushered people into permanent homes.

The New York study involves monitoring 400 households that sought Homebase help between June and August. Two hundred were given the program’s services, and 200 were not. Those denied help by Homebase were given the names of other agencies — among them H.R.A. Job Centers, Housing Court Answers and Eviction Intervention Services — from which they could seek assistance.

The city’s Department of Homeless Services said the study was necessary to determine whether the $23 million program, called Homebase, helped the people for whom it was intended. Homebase, begun in 2004, offers job training, counseling services and emergency money to help people stay in their homes.

The department, added Commissioner Seth Diamond, had to cut $20 million from its budget in November, and federal stimulus money for Homebase will end in July 2012.

Such trials, while not new, are becoming especially popular in developing countries. In India, for example, researchers using a controlled trial found that installing cameras in classrooms reduced teacher absenteeism at rural schools. Children given deworming treatment in Kenya ended up having better attendance at school and growing taller.

“It’s a very effective way to find out what works and what doesn’t,” said Esther Duflo, an economist at the Massachusetts Institute of Technology who has advanced the testing of social programs in the third world. “Everybody, every country, has a limited budget and wants to find out what programs are effective.”

The department is paying $577,000 for the study, which is being administered by the City University of New York along with the research firm Abt Associates, based in Cambridge, Mass. The firm’s institutional review board concluded that the study was ethical for several reasons, said Mary Maguire, a spokeswoman for Abt: because it was not an entitlement, meaning it was not available to everyone; because it could not serve all of the people who applied for it; and because the control group had access to other services.

The firm also believed, she said, that such tests offered the “most compelling evidence” about how well a program worked.

Dennis P. Culhane, a professor of social welfare policy at the University of Pennsylvania, said the New York test was particularly valuable because there was widespread doubt about whether eviction-prevention programs really worked.

Professor Culhane, who is working as a consultant on both the New York and HUD studies, added that people were routinely denied Homebase help anyway, and that the study was merely reorganizing who ended up in that pool. According to the city, 5,500 households receive full Homebase help each year, and an additional 1,500 are denied case management and rental assistance because money runs out.

But some public officials and legal aid groups have denounced the study as unethical and cruel, and have called on the city to stop the study and to grant help to all the test subjects who had been denied assistance.

“They should immediately stop this experiment,” said the Manhattan borough president, Scott M. Stringer. “The city shouldn’t be making guinea pigs out of its most vulnerable.”

But, as controversial as the experiment has become, Mr. Diamond said that just because 90 percent of the families helped by Homebase stayed out of shelters did not mean it was Homebase that kept families in their homes. People who sought out Homebase might be resourceful to begin with, he said, and adept at patching together various means of housing help.

Advocates for the homeless said they were puzzled about why the trial was necessary, since the city proclaimed the Homebase program as “highly successful” in the September 2010 Mayor’s Management Report, saying that over 90 percent of families that received help from Homebase did not end up in homeless shelters. One critic of the trial, Councilwoman Annabel Palma, is holding a General Welfare Committee hearing about the program on Thursday.

“I don’t think homeless people in our time, or in any time, should be treated like lab rats,” Ms. Palma said.

“This is about putting emotions aside,” [Mr. Diamond] said. “When you’re making decisions about millions of dollars and thousands of people’s lives, you have to do this on data, and that is what this is about.”

Still, legal aid lawyers in New York said that apart from their opposition to the study’s ethics, its timing was troubling because nowadays, there were fewer resources to go around.

Ian Davie, a lawyer with Legal Services NYC in the Bronx, said Homebase was often a family’s last resort before eviction. One of his clients, Angie Almodovar, 27, a single mother who is pregnant with her third child, ended up in the study group denied Homebase assistance. “I wanted to cry, honestly speaking,” Ms. Almodovar said. “Homebase at the time was my only hope.”

Ms. Almodovar said she was told when she sought help from Homebase that in order to apply, she had to enter a lottery that could result in her being denied assistance. She said she signed a letter indicating she understood. Five minutes after a caseworker typed her information into a computer, she learned she would not receive assistance from the program.

With Mr. Davie’s help, she cobbled together money from the Coalition for the Homeless and a public-assistance grant to stay in her apartment. But Mr. Davie wondered what would become of those less able to navigate the system. “She was the person who didn’t fall through the cracks,” Mr. Davie said of Ms. Almodovar. “It’s the people who don’t have assistance that are the ones we really worry about.”

Professor Culhane said, “There’s no doubt you can find poor people in need, but there’s no evidence that people who get this program’s help would end up homeless without it.”