I was recently asked to co-guest-edit a special issue of Frontiers in Cognition on
“failures to replicate.” I liked the idea of a special issue. I just didn’t
think it had the right angle. If someone had “successfully” replicated a
study, would they not be allowed to submit? I was worried this would create a
kind of reverse file drawer problem: a replication would be a candidate for
publication only if it was unsuccessful. Others have expressed the same
concern.
If you think about it, this bias makes sense. In a superficial sense, nonreplications
are more informative than replications. Replications are like someone in the
desert yelling, “Look over there, an oasis!” followed by someone else yelling, “Yes,
I see it too.” A nonreplication is like the second person yelling “No, that’s
not an oasis, it’s a mirage.”
At a deeper level, however, both are informative. The replication
gives us greater confidence in the presence of an oasis. After all, how can we
stake our lives on the word of a single dehydrated member of our crew of explorers? The
nonreplication decreases our confidence in the presence of an oasis and keeps us
from wasting our resources (or even our lives).
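To put rough, purely made-up numbers on this intuition: suppose we start with 1:1 odds that the effect (the oasis) is real, and suppose the replication attempt has 90% power and a 5% false-positive rate. Bayes’ rule then gives

\[
\frac{P(\text{real}\mid \text{replicates})}{P(\text{not real}\mid \text{replicates})}
= \frac{0.90}{0.05}\times 1 = 18
\;\Rightarrow\; P(\text{real}) \approx .95,
\]
\[
\frac{P(\text{real}\mid \text{fails to replicate})}{P(\text{not real}\mid \text{fails to replicate})}
= \frac{0.10}{0.95}\times 1 \approx 0.11
\;\Rightarrow\; P(\text{real}) \approx .10.
\]

On these illustrative numbers, the two outcomes shift our beliefs by comparable amounts; the asymmetry in how exciting we find them is psychological, not statistical.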
Still, nonreplications seem sexier than replications. I fell for this myself when, in a previous
post, I said “The most interesting effect occurred…”, referring to the one nonreplication
in the paper.
So how do we eliminate this inherent bias toward
nonreplication? The highly useful Psychfiledrawer
site lists replication attempts in psychology. Right now, there are about twice
as many nonreplications as replications listed, but it is still early
in the game, so there is no real evidence of a nonreplication bias. On the
contrary, the site even reports a successful
replication of Bem’s work on precognition (as well as an unsuccessful one). Moreover,
we really have no idea what percentage of findings will replicate.
The Reproducibility
Project will give us an estimate for the 2008 volumes of three different
journals.
Still, there is a way to avoid this bias: pre-registration.
The steps required are nicely outlined here.
Researchers register their replication attempt beforehand. They indicate why it
is important to replicate a certain study, they perform power analyses, and
they specify the research plan. This proposal is reviewed, and if it checks out,
the paper is provisionally accepted, regardless of how the results will turn out. Provisionally accepted studies are then carried out and the results are included in the paper. The full
paper is reviewed to make sure the authors have delivered what they
promised, that the methods were executed properly, and that the discussion is fair. The
outcome of the experiment plays no role at any point in the evaluation
process.
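As a concrete illustration of the power-analysis step, here is a minimal sketch in Python using statsmodels. The numbers are placeholders (an assumed effect size of Cohen’s d = 0.5, alpha of .05, and a target power of 90%); a real proposal would justify the effect size from the original study, ideally adjusting it downward for publication bias.

```python
# Minimal power-analysis sketch for a pre-registered replication proposal.
# All values are illustrative placeholders, not a recommendation.
import math

from statsmodels.stats.power import TTestIndPower

# Sample size per group needed to detect an assumed effect of Cohen's d = 0.5
# with 90% power in a two-sided independent-samples t-test at alpha = .05.
n_per_group = TTestIndPower().solve_power(
    effect_size=0.5,
    alpha=0.05,
    power=0.90,
    alternative="two-sided",
)

print(f"Plan for at least {math.ceil(n_per_group)} participants per group")
# With these placeholder values, that comes out to roughly 86 per group.
```

Reviewers of the proposal can then verify, before any data are collected, that the planned sample gives the replication attempt a fair chance of detecting the original effect.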
The editors of Frontiers in Cognition liked our plan, so
we are going ahead with it. I will provide more information and a call for proposals in my next post.
To close with an anecdote, here is the labyrinthine route
toward nonreplication that we once took. We discussed an interesting paper
outside of our research area during a lab meeting. We developed ideas on how to
tweak the paradigm described in the paper for our own studies on language.
Our first experiment, titled “Object 1” (maybe we had the precognition that this would be the first in a series), was an abysmal failure. It was not a failure to replicate (we weren’t
even trying to replicate), just a bad experiment. Object 2 was not much better,
and then we realized we should probably move closer to the original experiment.
This is what we did in successive steps in Object 3 through Object 12. By now
we were pretty close to the original experiment. Object 13 was our final
attempt: a very close replication. Again no effect. We gave up. Apparently,
this paradigm was beyond our capabilities.
I discussed our failed attempts with a colleague at a
conference. He said he had also had repeated failures to get the effect and
then contacted the author (which we should have done as well, of course). He
found out there was a critical aspect of the manipulation that was not
mentioned in the paper. With this component in place, the effect proved reproducible.
The authors can be faulted for not including this component
in the paper. The omission wasted a lot of time: ours, the colleague’s, and probably
other people’s as well. But maybe the authors had simply forgotten to mention this
detail, or they were not aware of its critical role. This just goes to
show that no detail is too trivial to mention in a method section.
There is another point, and maybe it doesn’t reflect well on
us. We went about it bass ackwards. Rather than taking the paradigm and running
with it, we should have sat down and tried an exact replication of the original
finding first: Object 1 in this alternate universe. If we hadn’t been able to
replicate the original finding, there probably would not have been alternate
Objects 2 through 13, and we would have had a lot of alternate time to run other
experiments.