Policy-makers, scientists and the public are engaged in heated debates about the right responses to Covid-19. In making decisions on such matters as school re-opening or mandatory mask wearing, we need a way to measure both the benefits and the economic and psychological costs.
The global pandemic has galvanised governments, public health agencies, scientists and the public to take strong measures to prevent the spread of coronavirus. Many countries have implemented lockdowns, closed schools and businesses, and made social distancing measures mandatory.
Opinions differ widely about how much these measures reduce transmission of Covid-19 – and about whether those benefits outweigh the costs, which potentially include negative effects on mental health, increases in domestic violence and widening educational and economic disparities. Even agencies such as the World Health Organization have changed their views over time, for example, on the use of masks. How can we know what works and what doesn’t?
How can we evaluate interventions in the context of Covid-19?
In medical science, new treatments and vaccines are tested using ‘randomised controlled trials’ (RCTs). In these studies, participants are allocated to a treatment or a control group by lottery, and due to this assignment mechanism, any subsequent difference between the average outcomes of the two groups can be unambiguously attributed to the treatment. In practice, individuals will be randomly placed in one of two groups, for example, A and B, where:
Group A: Individuals receive a treatment: for example, a new drug that scientists have developed.
Group B: Individuals receive a placebo: for example, a pill containing no active ingredient.
Comparison of outcomes for individuals in Group A and Group B will identify the impact of the drug.
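As a minimal sketch of this logic in Python – using simulated data and made-up numbers, not results from any real trial – random assignment means a simple difference in group means estimates the drug’s effect:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)

# Hypothetical trial with 1,000 participants; all numbers are made up for illustration.
n = 1000
in_group_a = rng.integers(0, 2, size=n).astype(bool)  # True = Group A (drug), False = Group B (placebo)

# Simulated recovery times in days, assuming (purely for illustration)
# that the drug shortens recovery by 2 days on average.
recovery_days = rng.normal(loc=14, scale=4, size=n) - 2 * in_group_a

# Because assignment is random, the difference in group means estimates the drug's effect.
effect = recovery_days[in_group_a].mean() - recovery_days[~in_group_a].mean()
_, p_value = stats.ttest_ind(recovery_days[in_group_a], recovery_days[~in_group_a])
print(f"Estimated effect of the drug: {effect:.2f} days (p = {p_value:.3f})")
```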
A prominent example of the successful deployment of RCTs under crisis conditions is the Ebola outbreak in West Africa in 2014-15, during which a candidate Ebola vaccine was tested in a large RCT. Now, in the Covid-19 outbreak, medical scientists are already testing candidate vaccines using RCTs.
Rolling out a vaccine to the general population without having first conducted such a trial would be unthinkable: deploying such a measure without knowing whether it works, and what its side effects are, would put many lives at risk, violate ethical norms and damage public trust in government and science.
Why, then, are non-medical interventions such as school closures not also subjected to such tests? It cannot be because the stakes are lower: education has large returns in terms of future income and health, and schools are vital in producing an educated citizenry that understands and sustains democracy. Nor can it be because the stakes are too high: when we test vaccines, the stakes are also high, and testing is required precisely for that reason.
Is testing of non-medical interventions realistic?
Another possible argument against systematic testing of non-medical interventions against Covid-19 is that it may be practically infeasible. But as we have recently argued (Haushofer and Metcalf, 2020), RCTs to test such policies could be conducted relatively easily and quickly.
To take the example of schools: a country, region or city might decide to stagger the re-opening of schools, beginning earlier in some parts than in others. In practice, as above, communities would be randomly placed in one of two groups, A and B, where:
Group A: Schools in these communities open immediately.
Group B: Schools in these communities open two weeks later.
Comparison of numbers of cases and other relevant psychological and social measures in communities in Group A and Group B will identify the impact of the intervention (as long as the communities are far enough apart so that ‘spillover’ effects are minimised).
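The same logic applies when randomisation happens at the level of communities rather than individuals. A minimal sketch with simulated data – all numbers below are assumptions for illustration, not estimates from any study – shows how the comparison would be made:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

# Hypothetical illustration: 40 communities, half open schools immediately (Group A),
# half two weeks later (Group B). All numbers are made up.
n_communities = 40
opens_now = rng.permutation(np.repeat([True, False], n_communities // 2))

# Simulated cases per 100,000 two weeks after Group A re-opens, assuming
# (purely for illustration) that earlier opening adds 30 cases per 100,000 on average.
cases_per_100k = rng.normal(loc=120, scale=25, size=n_communities) + 30 * opens_now

# With random assignment at the community level (and communities far enough apart
# to limit spillovers), the difference in community averages identifies the effect.
diff = cases_per_100k[opens_now].mean() - cases_per_100k[~opens_now].mean()
_, p_value = stats.ttest_ind(cases_per_100k[opens_now], cases_per_100k[~opens_now])
print(f"Estimated effect of earlier re-opening: {diff:.1f} cases per 100,000 (p = {p_value:.3f})")
```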
Similarly, lockdowns could be lifted in phases, with some areas beginning sooner than others; and travel restrictions could be eased first for some people and regions, and then for others. If this is done in a randomised fashion, governments and researchers can make statements about the relative importance of each intervention in preventing disease spread, and the potential dangers of relaxing it.
Importantly, even relatively short periods are sufficient to enable such comparisons; for example, staggering the re-opening of schools in different areas by two weeks would allow statements about the effectiveness of closings and the dangers of re-opening in terms of infection transmission.
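Whether a given stagger and sample are large enough depends on the expected difference in outcomes and on how much those outcomes vary across communities; Haushofer and Metcalf (2020) discuss such measurement requirements in detail. A rough, back-of-the-envelope sample-size calculation – every parameter value below is an assumption for illustration – gives a sense of the orders of magnitude involved:

```python
from math import ceil
from scipy import stats

# Back-of-the-envelope sample-size calculation; every number below is an assumption.
alpha, power = 0.05, 0.80
effect = 30.0   # assumed difference in cases per 100,000 between the two arms after two weeks
sigma = 40.0    # assumed standard deviation of case counts across communities

z_alpha = stats.norm.ppf(1 - alpha / 2)
z_power = stats.norm.ppf(power)

# Standard two-sample formula for the number of communities needed in each arm.
n_per_arm = 2 * ((z_alpha + z_power) * sigma / effect) ** 2
print(f"Communities needed per arm: {ceil(n_per_arm)}")   # about 28 under these assumptions
```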
Of course, such gradual re-openings or closures have to be ethically defensible. A crucial consideration in this context is that of ‘equipoise’ – that is, uncertainty as to which of the tested policies is better than the other.
For many policies, this uncertainty exists: as heated debates in recent months have shown, all interventions have costs and benefits, and reasonable observers can disagree whether, for example, schools should re-open or remain closed at any given point. Re-opening creates the risk of disease spread and deaths; continued school closings deprive children of their education and burden their parents. Each choice is a policy; there is no neutral option. As a result, there is also no single ‘correct’ answer as to whether schools should re-open; policy-makers have to make a judgment call.
In addition, even if there is agreement that re-opening is necessary, there is likely to be at least some uncertainty about precisely when the right moment is. For example, there was nothing magical about the date of 1 July, when many European countries re-opened; rather, it was a salient date that fell roughly in the right timeframe.
Randomised, phased re-opening is both an acknowledgment of this uncertainty and, at the same time, a way to generate knowledge about the precise costs and benefits of closures. In clinical studies, such uncertainty features prominently in deciding whether a trial is ethically defensible or not.
In addition, phase-in designs imply that everyone is at some point exposed to both policies, further increasing fairness. As mentioned above, the ethical acceptability of phase-in/phase-out designs is illustrated by the fact that even vaccines for highly lethal pathogens such as Ebola are tested in clinical trials using such designs (Henao-Restrepo et al, 2017).
The research frontier for interventions against Covid-19
Thus, randomised experiments to test interventions during the pandemic are practically possible and ethically defensible. As a result, such approaches have already been used to test the impact of policies to contain Covid-19.
For example, a group of economists led by 2019 Nobel laureate in economics Abhijit Banerjee used his fame in his native West Bengal, India, to generate behaviour change: they sent text messages to 25 million people, containing either a short clip in which Banerjee encourages hygiene behaviours and social distancing (Banerjee et al, 2020), or a message that simply referred people to government information. In the framing above, individuals were randomly placed into two groups, where:
Group A: Individuals received the text message with the clip from Banerjee.
Group B: Individuals received a message referring them to government information.
People in Group A were more likely than those in the control group (Group B) to report Covid-19 symptoms to local authorities. They also reduced their travel and increased handwashing. Interestingly, the intervention ‘spilled over’: it affected behaviours that were not mentioned in the message (such as wearing masks), and it affected people who did not receive the messages themselves but had neighbours who did.
Because text messages are cheap to send, and because of these spillover effects, this intervention promises to be very cost-effective, although a precise quantification has yet to be undertaken. The study shows that simple, low-cost interventions can play an important role in encouraging preventative behaviours. But more importantly, it illustrates the power of systematic randomised testing to identify what works and what doesn’t.
A further example from Norway illustrates that such studies can also be used to test closings and re-openings. In March 2020, the Norwegian government issued an emergency law that mandated the closing of all gyms. To test whether it was safe to re-open, randomly selected members of a small number of gyms were invited to begin training again, following strict hygiene rules. These individuals could then be compared to a randomly chosen control group for whom training was still disallowed. In practice, gym members were randomly placed into two groups where:
Group A: Gyms were accessible.
Group B: Gyms were not accessible.
The two groups showed no difference in infection rates several weeks after re-opening, suggesting that, with hygiene rules in place, opening the gyms did not pose a major health hazard at that point in the outbreak (TRAiN Study Group, 2020). It should be noted, however, that almost nobody in the country was infected at the time; the results are therefore only valid for this specific stage of the outbreak.
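To see why such a comparison is only informative for the prevalence at which it was run, consider a minimal sketch in Python; the counts and the choice of Fisher’s exact test below are illustrative assumptions, not the TRAiN study’s actual data or analysis:

```python
from scipy import stats

# Hypothetical counts, made up for illustration: one infection among 1,000 members
# with gym access and none among 1,000 members without access.
infected_open, healthy_open = 1, 999
infected_closed, healthy_closed = 0, 1000

# With so few infections, an exact test of the 2x2 table is appropriate.
_, p_value = stats.fisher_exact([[infected_open, healthy_open],
                                 [infected_closed, healthy_closed]])
print(f"p = {p_value:.2f}")  # close to 1: no detectable difference at such low prevalence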
Thus, closings and re-openings can be tested using randomised phase-in designs. Yet governments have shown little interest in using this approach to understand the effects of preventative measures against Covid-19. Given what is at stake, it seems imprudent to deploy and relax such policies without understanding their effects, and wasteful to do so without using the opportunity to learn more. If policy-makers embrace this principle of learning, the rewards could be rapid and considerable.
Where can I find out more?
- Messages on Covid-19 prevention in India increased symptoms reporting and adherence to preventive behaviors among 25 million recipients with similar effects on non-recipient members of their communities: Economics Nobel laureate Abhijit Banerjee and colleagues show how a randomised controlled trial can be used to test the effect of text messages on care-seeking behaviour for Covid-19 in India.
- Building resilient health systems: experimental evidence from Sierra Leone and the 2014 Ebola outbreak: Darin Christensen and colleagues show that accountability interventions randomly assigned to health clinics by the government of Sierra Leone before the 2014-15 Ebola crisis increased care-seeking behaviour for Ebola and reduced mortality.
- Randomized re-opening of training facilities during the COVID-19 pandemic: The TRAiN Study Group shows how randomised gym re-opening can be used to test the impact of gym use on Covid-19 spread.
- Which interventions work best in a pandemic? Johannes Haushofer and Jessica Metcalf lay out the practical and ethical case for randomised testing of openings and closings in the context of Covid-19, and provide mathematical illustrations of the measurement requirements in such studies.
Who are experts on this question?
- Johannes Haushofer, Princeton University
- Jessica Metcalf, Princeton University
- Oriana Bandiera, LSE
- Nava Ashraf, LSE
- Imran Rasul, UCL
- Stefan Dercon, University of Oxford
- Simon Quinn, University of Oxford
- Stefano Caria, University of Warwick
- Christopher Roth, University of Warwick