Experiment Design Checklist
Atul Gawande’s The Checklist Manifesto is a compelling account of how the simple checklist can scaffold expertise and teamwork in complex domains. Good checklists, he says, nudge our memories, prompting us to do what we already know we should do but are at risk of missing.
Good checklists are also short: 5-9 “killer” items, says Gawande.
So I got to thinking: what would be on my checklist for experiment design? Additional suggestions are welcome, by email or on Mastodon, but this is what I have so far:
1. Did you discuss authorship with the research team?
2. Is your study adequately powered?
Most studies are underpowered; don’t be one of them. Useful: Understanding mixed effects models through data simulation. What is your recruitment plan? How will participants be enticed to take part? How will they be rewarded for taking part? Will they be brought into the research process in any way?
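If you want a concrete starting point, here is a minimal sketch of estimating power by simulation for a simple two-group design. The assumed effect size (d = 0.5), the candidate sample sizes, and alpha are placeholders, not recommendations; replace them with values you can justify for your own study:

```python
# Power by simulation for a two-group between-subjects design.
# All parameter values below are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)

def simulated_power(n_per_group, d=0.5, alpha=0.05, n_sims=5000):
    """Proportion of simulated experiments that reach significance."""
    hits = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n_per_group)
        treatment = rng.normal(d, 1.0, n_per_group)  # true effect of d SDs
        if stats.ttest_ind(treatment, control).pvalue < alpha:
            hits += 1
    return hits / n_sims

for n in (20, 50, 100):
    print(f"n = {n:>3} per group: power ~ {simulated_power(n):.2f}")
```

The same logic extends to more realistic designs (mixed effects, unequal groups); the tutorial linked above walks through the full approach.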
3. Should you include a manipulation check?
Probably you should. How will you know that participants were affected by your intervention in the way you think they should have been?
4. Does your experiment produce data that you can analyse?
5. How will you judge the size of any effect?
Even a statistically significant result might be meaningless. How will you calculate an effect size, and how will you gauge whether it is important? Bonus points: maximal positive controls.
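To make this concrete, here is a minimal sketch of one common effect-size measure, Cohen’s d with a pooled standard deviation. The data are invented, and whether any given d is “important” still depends on your domain, not on conventional benchmarks:

```python
# Cohen's d for a two-group comparison, using the pooled SD.
# Example data are invented for illustration.
import numpy as np

def cohens_d(group_a, group_b):
    """Standardised mean difference with pooled standard deviation."""
    a, b = np.asarray(group_a, float), np.asarray(group_b, float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

treatment = [5.1, 6.2, 5.8, 6.5, 5.9]
control = [4.8, 5.0, 5.4, 4.9, 5.2]
print(f"d = {cohens_d(treatment, control):.2f}")
```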
6. Will you be able to interpret all possible results which aren’t in line with your predictions?
7. Do you have a plan (and consent) for storing and sharing the data?
Aim for your final data to be FAIR - Findable, Accessible, Interoperable and Reusable
8. Have you checked prior work on this topic?
How systematic was that review of the previous literature?
9. How will the final result be criticised?
Imagine what your strongest critic will say when presented with your final results. Plan your defence. You might want to consider List #2.
List #2 : Common criticisms
Standard flaws, and the standard criticisms you might hear about your result:
failure to generalise
Does your sample align with the population you want to draw inferences about? Do the stimuli you use fairly represent the situations you want to talk about?
Reading on this: Yarkoni, T. (2019, November 22). The Generalizability Crisis. https://doi.org/10.31234/osf.io/jqw35
placebo effect
Participants responding to the situation, not your intended treatment. Address with an adequate control condition.
demand effect
Participants responding to what they think you want. Address by keeping participants blind to which condition they are in. Advanced: your experiment contains an incentive structure which participants respond to. It may be that errors are more costly than successes (so you are rewarding conservative behaviour), or it may be something as simple as being able to get through the experiment more quickly by taking a certain choice (so you are rewarding fast/inaccurate responding). You should know what the incentive structure of your experiment is; it may be driving behaviour as much as any deeper psychological desires or biases. Even if this isn’t the case, how will you convince a critic that your experiment reveals something about human psychology beyond “participants do what they are rewarded for”?
experimenter bias
May not be deliberate: it could result from unintentional exploitation of researcher degrees of freedom, or inadvertent signalling to participants. Partially address with preregistration.
selection and survivorship bias
Are the results distorted by how you recruited, or how participants differentially dropped out during the experiment? Partially address with (truly) random assignment.
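As a sketch of what “truly random” assignment can look like in practice (the participant count and seed below are arbitrary; a fixed seed makes the assignment auditable afterwards):

```python
# Random assignment by seeded shuffle, rather than by arrival order,
# session, or convenience. Participant IDs here are hypothetical.
import random

participant_ids = list(range(1, 41))  # hypothetical 40 participants
rng = random.Random(42)               # fixed seed: the assignment can be re-derived
rng.shuffle(participant_ids)
half = len(participant_ids) // 2
assignment = {pid: ("treatment" if i < half else "control")
              for i, pid in enumerate(participant_ids)}
print(assignment)
```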
regression to the mean
Particularly a problem with test-retest designs.
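A quick simulation makes the problem concrete: select participants for extreme pre-test scores, retest them with no intervention at all, and their group mean moves back toward the population mean. The reliability value (each test correlating 0.6 with a stable true score) and the selection cutoff are illustrative assumptions:

```python
# Regression to the mean under the null: no intervention, yet the
# selected group's mean "improves" at retest.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
r = 0.6  # each test's assumed correlation with the stable true score
true_score = rng.normal(0, 1, n)
noise_sd = np.sqrt(1 - r**2)
pretest = r * true_score + noise_sd * rng.normal(0, 1, n)
posttest = r * true_score + noise_sd * rng.normal(0, 1, n)  # no treatment applied

selected = pretest < -1.0  # screen in the low pre-test scorers
print(f"selected group, pre-test mean:  {pretest[selected].mean():.2f}")
print(f"selected group, post-test mean: {posttest[selected].mean():.2f}")  # nearer 0
```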
common method variance
Particularly a problem if both your dependent and independent measures are of the same type (e.g. survey items).
so what?
“This was obvious”, says everyone after you’ve done the work to show it happens.
Here’s a quote to keep in mind if you do want to rest the value of the experiment on theory-testing:
But I will mention one severe but useful private test – a touchstone of strong inference – that removes the necessity for third-person criticism, because it is a test that anyone can learn to carry with him for use as needed. It is our old friend the Baconian “exclusion,” but I call it “The Question.” Obviously it should be applied as much to one’s own thinking as to others’. It consists of asking in your own mind, on hearing any scientific explanation or theory put forward, “But sir, what experiment could disprove your hypothesis?”; or, on hearing a scientific experiment described, “But sir, what hypothesis does your experiment disprove?”
Platt, J. R. (1964). Strong inference. Science, 146(3642), 347-353.
false positive
You got lucky, they say. Address by replicating and/or pre-registering.
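To see why replication helps, here is a minimal sketch simulating experiments where the null is true: roughly 5% of single studies come out “significant” at alpha = .05, while demanding an independent significant replication cuts the rate to roughly 0.25%. Sample sizes and seeds are arbitrary:

```python
# False positives under a true null, with and without a replication.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, n = 10_000, 30  # simulated experiments; participants per group

def null_p():
    """p-value from one two-group experiment where the null is true."""
    return stats.ttest_ind(rng.normal(0, 1, n), rng.normal(0, 1, n)).pvalue

original = np.array([null_p() for _ in range(n_sims)]) < 0.05
replication = np.array([null_p() for _ in range(n_sims)]) < 0.05
print(f"single study 'significant':   {original.mean():.3f}")                    # ~ 0.05
print(f"plus significant replication: {(original & replication).mean():.4f}")    # ~ 0.0025
```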
confound
Something else you didn’t control for produced the effect.
You may also enjoy
- Peter Norvig: Warning Signs in Experimental Design and Interpretation
- Julia Strand: Error Tight: Exercises for Lab Groups to Prevent Research Mistakes
- Samuel J. Lord: Checklist for Experimental Design (from a cell biology and microscopy researcher)
- Ritter, F. E., Kim, J. W., Morgan, J. H., & Carlson, R. A. (2011). Practical aspects of running experiments with human participants. In Universal Access in Human-Computer Interaction. Design for All and eInclusion: 6th International Conference, UAHCI 2011, Held as Part of HCI International 2011, Orlando, FL, USA, July 9-14, 2011, Proceedings, Part I 6 (pp. 119-128). Springer Berlin Heidelberg.
- Kane, J. V. (2024). More than meets the ITT: A guide for anticipating and investigating nonsignificant results in survey experiments. Journal of Experimental Political Science, 1-16. https://doi.org/10.1017/XPS.2024.1
Created: 2021-07-10
Latest update: 2024-06-11
Repo (contains citation widget): github.com/tomstafford/psy-checklist
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.