Thursday, April 14, 2011

Evaluating NDLP

In the post by Matt Kahn that I blogged about a few weeks ago, he argues that one purpose of a blog is to publicize one's own research. I have not done much of that, other than in a very broad sense, but it is probably a good idea.

So, in that spirit, I note this paper, which I recently circulated as an IZA working paper:
The Impact of the UK New Deal for Lone Parents on Benefit Receipt

Peter Dolton
Royal Holloway College, University of London,
London School of Economics and IZA

Jeffrey Smith
University of Michigan

This paper evaluates the UK New Deal for Lone Parents (NDLP) program, which aims to return lone parents to work. Using rich administrative data on benefit receipt histories and a “selection on observed variables” identification strategy, we find that the program modestly reduces benefit receipt among participants. Methodologically, we highlight the importance of flexibly conditioning on benefit histories, as well as taking account of complex sample designs when applying matching methods. We find that survey measures of attitudes add information beyond that contained in the benefit histories and that incorporating the insights of the recent literature on dynamic treatment effects matters even when not formally applying the related methods. Finally, we explain why our results differ substantially from those of the official evaluation of NDLP, which found very large impacts on benefit exits.

This is an old paper. It got started in the late 1990s as a result of the two of us serving on the technical advisory panel for the official UK government evaluation of the NDLP. The results of that evaluation were so positive that even the government agency sponsoring the evaluation did not quite believe them (a high bar indeed!) and so Peter and I, along with his graduate student Joao Pedro Azevedo, who is now at the World Bank, were retained to reanalyze the data.

Filled with hubris, we expected that if we just redid the analysis and tweaked the methods a bit, the results would change dramatically. This turned out to be wrong. Neither varying the matching method nor worrying a lot about survey non-response moved the estimates very much at all. Changing the outcome variable from "leaving income support within six months" to a monthly measure of benefit receipt did matter some. We document these findings in this report for the UK Department for Work and Pensions.

We continued to pursue the mystery of the oddly high impacts. The paper reflects the additional work we did after writing the report for the DWP. In the end, using just the administrative data, we are able to get the impact estimates down to levels that are likely still a bit too high, but at least within the range of plausibility suggested by the literature. Our analysis reinforces the message from some of my earlier work with Heckman and others regarding the importance of flexibly conditioning on pre-program outcomes. We also indirectly show the value of recent developments in the literature on dynamic treatment assignment, as in the important paper by Barbara Sianesi (2004, Review of Economics and Statistics). We found that questions about attitudes toward work do surprisingly well as conditioning variables, while variables related to local labor markets matter very little, contrary to the findings using the JTPA data in Heckman, Ichimura, Smith and Todd (1998, Econometrica). This latter point merits further investigation: when do local labor markets matter in evaluating active labor market programs and when do they not?

Given the heavy UK content, we sent the paper to the Economic Journal, which is the journal of the Royal Economic Society.

Regular readers will note that our estimator in this paper is drawn from the curio cabinet. That is to say, we rely on nearest neighbor matching. The reason for this is essentially path dependence. When we started the work in the early noughties, the gain in speed from using nearest neighbor matching rather than kernel matching (inverse propensity weighting was not on the radar screen at that point) was large enough that it seemed the correct choice. Once we started down that path, we never quite got around to changing estimators to something better. We'll see what the referees think about that.
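For readers who have not seen the estimator before, here is a minimal sketch of single-nearest-neighbor matching on the propensity score, the basic idea behind what we use in the paper. The function name and the simulated numbers are my own illustration, not our actual code or the NDLP data: each treated unit gets matched (with replacement) to the control with the closest propensity score, and the treatment effect on the treated is the mean outcome difference across matched pairs.

```python
import numpy as np

def nn_match_att(p_treated, p_control, y_treated, y_control):
    """Single-nearest-neighbor matching (with replacement) on the
    propensity score. For each treated unit, find the control with
    the closest score; estimate the ATT as the mean difference in
    outcomes across the matched pairs."""
    # (n_treated, n_control) matrix of absolute score distances
    dist = np.abs(p_treated[:, None] - p_control[None, :])
    idx = dist.argmin(axis=1)  # index of nearest control per treated unit
    return np.mean(y_treated - y_control[idx])

# Toy illustration with made-up data (nothing to do with NDLP).
rng = np.random.default_rng(0)
p_t = rng.uniform(0.3, 0.8, size=200)            # treated propensity scores
p_c = rng.uniform(0.1, 0.7, size=500)            # control propensity scores
y_c = 2.0 * p_c + rng.normal(0, 0.1, size=500)   # untreated outcome
y_t = 2.0 * p_t - 0.5 + rng.normal(0, 0.1, 200)  # true effect is -0.5

att = nn_match_att(p_t, p_c, y_t, y_c)
```

The speed advantage over kernel matching comes from the argmin step: each treated unit needs only its single closest control, rather than a weighted average over all controls, which mattered a great deal with the computing power we had at the time.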
