Overdrive for Results? The Blessings and Limitations of Randomized Control Trials for Development Policy

By Luca Etter

Whenever politicians or economists claim to have found the answer to most of the developing world’s problems, caution is warranted. Politicians – currently incapable of addressing even the most basic issues – seem to be in a rare phase of humble agreement that they do not have the recipes to lift the world’s poorest out of misery. Economists, meanwhile, have reached an apparent consensus not on which development policies are best, but at least on how to measure their impact. A new generation of highly acclaimed economic researchers – all on the noble quest to find out what “works and what doesn’t” – has made randomized control trials (RCTs) the gold standard of program evaluation in development practice. While RCTs have certainly helped practitioners around the world gain a much better understanding of the effectiveness of their policies, the proponents of this movement should also start listening to their critics. If they do, Poor Economics can indeed be the key to a wealth not only of data, but also of much-needed policy solutions.

Without any doubt, RCTs are a blessing for development practitioners around the world, and they have the potential to be a blessing for the world’s poor, too. By randomly assigning participants to a program and comparing their outcomes with those of a similar group of people who do not participate, researchers can attribute any difference between the groups to the policy, usually referred to as the “treatment”. Randomizing the treatment takes care of many statistical problems associated with other forms of policy evaluation, most notably the unobserved characteristics – for example, motivation and ability – of people who choose to participate in a certain program. These characteristics, rather than the program itself, may drive the results. RCTs therefore provide policymakers with a clear answer to the question of whether a program was effective in achieving a certain outcome.
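To make this logic concrete, here is a minimal sketch in Python (using NumPy); the sample size, effect size, and variable names are purely illustrative assumptions, not drawn from any study discussed in this article. Because assignment is random, unobserved characteristics such as ability are balanced across the two groups on average, so the simple difference in mean outcomes recovers the effect of the treatment.

```python
# Illustrative simulation of a randomized controlled trial.
# All numbers are hypothetical; nothing here reproduces a real study.
import numpy as np

rng = np.random.default_rng(seed=42)
n = 1_000  # number of study participants

# Unobserved characteristics such as motivation or ability. In a
# non-randomized program, these would be correlated with who chooses
# to participate and would bias a naive comparison.
ability = rng.normal(loc=0.0, scale=1.0, size=n)

# Random assignment: each person has a 50% chance of receiving the
# treatment, independent of ability, so the groups are comparable.
treated = rng.random(n) < 0.5

true_effect = 0.3  # assumed effect of the program on the outcome
outcome = ability + true_effect * treated + rng.normal(scale=1.0, size=n)

# With random assignment, the difference in mean outcomes is an
# unbiased estimate of the treatment effect.
diff_in_means = outcome[treated].mean() - outcome[~treated].mean()

# Rough standard error (unequal-variance formula).
se = np.sqrt(outcome[treated].var(ddof=1) / treated.sum()
             + outcome[~treated].var(ddof=1) / (~treated).sum())

print(f"Estimated treatment effect: {diff_in_means:.3f} (SE {se:.3f})")
```

Had people self-selected into the program instead – say, the more motivated or able signing up first – the same difference in means would conflate the program’s effect with those pre-existing differences, which is exactly the problem randomization is designed to solve.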

The Limitations: Are RCTs evaluating policy questions or policy environments?

There are, however, limitations and problems associated with randomized experiments. The limitations mainly concern the generalizability of a finding outside the specific context in which an experiment was conducted or, in economic terms, the external validity of an evaluation. Because experiments by definition have to be carried out in a specific treatment area, they will also produce results that may be specific to that area and that point in time. As Harvard economist Dani Rodrik points out:

“The typical evaluation will have been carried out in a specific locale on a specific group and under specific experimental conditions.  Its generalizability to other settings is never assured—this is the problem of “external validity”—and it is certainly not established by the evaluation itself… the only truly hard evidence that randomized evaluations typically generate relates to questions that are so narrowly limited in scope and application that they are in themselves uninteresting. The “hard evidence” from the randomized evaluation has to be supplemented with lots of soft evidence before it becomes usable.”

Other limitations of RCTs, summarized by Martin Ravallion, include a bias towards evaluating (and reporting on) projects that succeed rather than those that fail; the fact that short-term impacts get more attention than the long-term consequences of interventions; and the fact that some policies (transfers and other social programs) are evaluated more often than others (infrastructure), which may skew the research agenda away from the most important issues in a specific country and time context. Many of these shortcomings, however, are not unique to RCTs, and the benefits of randomized experiments may still outweigh the limitations, in particular when compared to other evaluation methods (for a full discussion of the pros and cons of RCTs vis-à-vis other methods, read this paper by Guido Imbens in response to some of these critiques).

This leaves us with three problems. First and foremost, there are inherent ethical issues associated with RCTs: by randomly allocating people into treatment and control groups, a program is by design denied to people who want and need it while it is imposed on others who don’t. This is often the case when randomization is carried out not at the individual level but by government district, school, health clinic, or any other entity providing public services. What is more, there have been unintended but predictable adverse consequences of experimental designs, most famously a corruption study in India in which individuals were asked to bribe officials to get a driver’s license, thereby putting unsafe drivers on the road and endangering others. Second, RCTs have become so dominant in many research areas that they not only influence policy priorities but also disqualify “non-rigorous” data. In many developing countries data is scarce, and development economists should try to design methodologies that make the best use of existing data rather than requiring new data collection before any policy intervention. Third, while the great push for RCTs has freed up tremendous resources for implementing and evaluating pilot projects, the attention of researchers and policymakers often fades once the pilot is over.

Moving forward: Randomization Plus

To summarize: yes, the influx of RCTs into development economics has had a tremendous impact on development policy by focusing on results instead of textbook paradigms. People at the forefront of the movement – namely Esther Duflo and others at J-PAL – have created a body of evidence that will make policies around the developing world smarter and more cost-effective. To truly exploit the benefits of experimental designs, however, the movement needs to acknowledge its limitations and address some of its problems. First, other quantitative and qualitative forms of evidence need to be treated as valuable complements to RCTs rather than as second-class data. Second, where implementing a policy on a randomized basis can harm recipients, RCTs should not be used; to better identify such cases, evaluability assessments should be carried out before an RCT is launched. And third, RCTs should be conducted only in environments where there are the resources and the commitment to continue and scale up the policy if it is found to be working.
