Sample size and clustering advice needed

brb243

[Question]

1 min read · Jul 29, 2020

Comments 6

Sindy

Hey, thank you for the work you are doing! Here are my thoughts (I'm an economist at IDinsight and work on this type of research):

If you want to understand the impact of your program, I don't recommend doing an RCT at this stage. This seems like a very small pilot, and you won't have enough power / sample size to detect an effect (see below for more). You should only consider running an RCT if and when you plan to scale this up to a sufficient size.
Instead, what I advise is trying to understand and improve your impact by doing a small-sample survey plus qualitative research. E.g. when you go to a village, talk to locals to understand their current knowledge, attitudes, and behavior around COVID (what knowledge they lack, which attitudes need to change, what rumors are around, etc.) so that you can better design your messages. Ideally, capture a good representation of different types of people in the community, not just leaders but also relatively marginalized groups; you could do rigorous sampling, but I'm not sure that's realistic or worthwhile at this stage given the trouble involved. Also ask them what kind of information campaign would engage them, and after you run your program ask how they felt: whether they liked it, whether they found it useful, what they learned, what they'd do differently, etc. You can also contact them some time later to see if they observe any behavioral change among people in the community (better than asking what they themselves do, because of social desirability bias).

More technical details:

Since you're doing a clustered RCT (treatment is at the village level, and the outcomes of people within a village are likely positively correlated), you'll need a larger sample size than if you were doing an individual-level RCT (for the math, see section 4.2 of this -- generally a great resource for RCT design). You can do a power calculation for a clustered randomized controlled trial, e.g. using Stata's "power twomeans" command. One parameter that's missing is the intraclass correlation (the correlation of outcomes among individuals within the same treatment unit). However, since your number of clusters is SO small (3 and 3), when I try to do this calculation in Stata with any reasonable assumption, Stata says you cannot have enough power (assuming you want all the standard settings: 80% power, 5% significance level, etc.). That's why I recommend not doing an RCT unless you have a program at scale.
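To illustrate why clustering inflates the required sample, here is a minimal Python sketch of the standard design-effect approximation, 1 + (m - 1) × ICC, where m is the number of people observed per cluster. It is not the Stata calculation mentioned above. The ICC of 0.05, the 300 people observed per village, and the 50% baseline are illustrative assumptions, and the function names are made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def minimum_detectable_effect(
    clusters_per_arm: int,
    cluster_size: int,
    icc: float,
    baseline: float = 0.5,   # assumed share already wearing a face covering
    alpha: float = 0.05,     # two-sided significance level
    power: float = 0.80,
) -> float:
    """Approximate smallest detectable difference in proportions.

    Uses a normal approximation, which is optimistic when there are
    only a handful of clusters.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    n_per_arm = clusters_per_arm * cluster_size
    person_variance = baseline * (1 - baseline)
    deff = design_effect(cluster_size, icc)
    # Standard error of the difference between two arm-level proportions.
    se_diff = sqrt(2 * person_variance * deff / n_per_arm)
    return multiplier * se_diff


# Hypothetical inputs: 3 villages per arm, 300 people observed in each.
clustered = minimum_detectable_effect(3, 300, icc=0.05)
unclustered = minimum_detectable_effect(3, 300, icc=0.0)
print(f"MDE with ICC 0.05: {clustered:.2f}, with ICC 0: {unclustered:.2f}")
```

Under these assumed inputs the minimum detectable effect is roughly 26 percentage points, well above a 10-point change (e.g. from 50% to 60%); with an ICC of 0 it would be about 7 points. Because the normal approximation flatters designs with very few clusters, the real picture is likely worse.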

brb243

Hello Sindy,

Thank you so much. This answers my question. Yes, there will be a before-and-after qualitative survey asking about respondents' own and others' behavior, which may need to be shortened so that we can speak with a wider range of groups. The face covering data can then be used to complement the survey information.

Matt_Lerner

If you don't already have it, I would strongly recommend getting a copy of Gerber & Green's Field Experiments. I would also very strongly recommend that you (or EA Cameroon) engage an experimental methodology expert for this project, rather than pose the question on the forum (I am not such an expert).

It is very difficult to address all of these questions in a broad way, since the answers depend on:

The smallest effect size you would hope to observe
Your available resources
The population within each cluster
The total population
Your analysis methodology

I'm a little confused about the setup. You say that there are 6 groups, so how would it be possible to have "6 intervention + 3 non-intervention"? Sorry if I'm misunderstanding.

In general, and particularly in this context, it makes sense to split your clusters evenly between treatment and control. For a fixed total number of clusters, this is the setup that minimizes the standard error of the difference between groups. When that standard error is larger, smaller effect sizes are harder to detect. The fewer clusters you have in your control group, for example, the larger the effect would have to be for you to make a statistically defensible claim.
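A rough way to check the allocation point: if village-level means are the unit of analysis and both arms have the same between-village variance (both are assumptions made for this sketch), the standard error of the difference is proportional to sqrt(1/k_T + 1/k_C), where k_T and k_C are the numbers of treated and control clusters. The helper names below are hypothetical.

```python
from math import sqrt


def relative_se(treated_clusters: int, control_clusters: int) -> float:
    """SE of the difference in arm means, in units of the between-cluster SD."""
    return sqrt(1 / treated_clusters + 1 / control_clusters)


def best_split(total_clusters: int) -> tuple[int, int]:
    """Allocation with the smallest relative SE for a fixed total."""
    splits = [(k, total_clusters - k) for k in range(1, total_clusters)]
    return min(splits, key=lambda split: relative_se(*split))


# Compare the allocations discussed in this thread.
for treated, control in [(3, 3), (6, 3), (6, 6)]:
    print(f"{treated}+{control}: relative SE = {relative_se(treated, control):.2f}")

print("Best split of 12 clusters:", best_split(12))
```

In this approximation the relative standard errors are about 0.82 for 3+3, 0.71 for 6+3, and 0.58 for 6+6. The even-split advice applies when the total is fixed: 12 clusters are best used as 6+6, and 9 clusters would do slightly better as 4+5 than as 6+3. Adding clusters to either arm still helps.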

With such a small number of clusters, effect sizes would have to be very large in order to be statistically distinguishable from zero. If indeed 50% of the population in these groups is already masked, 6 clusters may not be enough to see an effect.

Can we get some clarification on some of your questions? Particularly:

How important, in terms of statistical power, is it to include all clusters?

If you have only 6 to choose from, then the answer is "very important." But I'm not sure this is the sense in which you mean this.

How many persons should be observed at each place?

My inclination here is to say "as many as possible." But this is constrained by your resources and your method of observation. Can you say more about the data collection plan?

brb243

Thank you. I was not able to get (a pdf of) Field Experiments, but I downloaded "Field Experimental Designs for the Study of Media Effects," also co-authored by Green. The authors point to "robust cluster standard errors" for estimating the "individual-level average treatment effect" (172).

To answer your points:

The smallest effect size you would hope to observe

20% in relative terms: from 5/10 to 6/10 (10 percentage points), or an equivalent percentage increase.

Your available resources

Researchers in all of the campaign clusters and some of the non-campaign ones. They can count whether, e.g., a few hundred individuals wear face coverings.

The population within each cluster

It varies; the average is 180,000/6 = 30,000.

The total population

Since we are just looking to estimate the impact of the 180,000-person campaign and not to generalize it, this should be 180,000 × 2 (180,000 participants and an equal number of non-participants who are the nearest geographically and in characteristics).

Your analysis methodology

Probit, logit or simple linear regression, but open to suggestions

I meant 6 groups in the intervention area, and some number of groups (e.g. 3 or 6) in the non-intervention area.

OK. So 3 intervention clusters and 3 non-intervention clusters are better than 6 intervention clusters and 3 non-intervention clusters, but 6+6 may be necessary? Would the answer depend on the intra-cluster correlation coefficient (ρ)? Perhaps the texts that discuss clustering in general assume relatively large between-cluster variability and low within-cluster variability (so high ρ). In this study, however, how people respond to the messaging may not depend much on their 'cluster assignment' but much more on their individual characteristics, which, on average, may be comparable across the clusters and the studied population.

I should ask EA Cameroon about the possibility of different average responses in different villages.

Do you know of any online sample size calculator that includes clusters?

Matt_Lerner

I refer you to Sindy's comment (she is actually an expert), but I want to note and verify that it sounds as if you may not actually be thinking of collecting individual-level data, and that you're thinking of making observations at the village level (e.g. what % of people in this village wear masks?). So it's not just the case that you wouldn't have enough clusters to make a statistical claim, but you may actually be talking about doing an experiment in which the units are villages... so n = 6 to 12. Then of course you'd have considerable error in the village-level estimate, and uncertainty about the representativeness of the sample within each village. I agree with Sindy that you probably don't want an RCT here.
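One way to see how little n = 6 to 12 villages can support is to count the possible treatment assignments, as one would for a randomization (permutation) test on village-level outcomes. This technique is not proposed in the thread itself; it is brought in here only as an illustration, it assumes villages really are assigned at random, and the function names are hypothetical.

```python
from math import comb


def possible_assignments(treated: int, control: int) -> int:
    """Number of distinct ways to pick which villages get the campaign."""
    return comb(treated + control, treated)


def smallest_one_sided_p(treated: int, control: int) -> float:
    """Best-case p-value: the observed assignment is the single most extreme one."""
    return 1 / possible_assignments(treated, control)


for treated, control in [(3, 3), (6, 3), (6, 6)]:
    count = possible_assignments(treated, control)
    floor = smallest_one_sided_p(treated, control)
    print(f"{treated}+{control}: {count} assignments, smallest one-sided p = {floor:.4f}")
```

With 3+3 villages there are only 20 possible assignments, so even the most extreme result gives a one-sided p-value of 0.05 (and 0.10 two-sided in a balanced design, since the mirror-image assignment ties). With 6+3 there are 84 assignments and with 6+6 there are 924, which leaves more room, though the effect would still need to be large and consistent across villages.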

brb243

OK, thank you.
