In the latest conversational
turn in my ongoing dialog (diablog?) with Dan Simons about replications and
validity, Dan provides some useful insights into what qualifies as a direct
replication:
a direct replication
can be functionally the same if it uses the same materials, tasks, etc. and is
designed to generalize across the same variations as the original.
I agree completely. As Dan notes, no replication can be exact
and some changes are inevitable for the experiment to make sense. At Registered
Replication Reports (RRR), Dan and his colleague Alex Holcombe have instituted some
interesting procedures:
Our approach with
Registered Replication Reports is to ask the original authors to specify a
range of tolerances on the parameters of the study.
This is a great idea. What I like even more is that Dan not
only talks the talk but also walks the walk. He is using this approach in his
own papers by adding a paragraph to the method section in which he states the
generalization target for his experiments. It would be tremendously useful if
we all did this. For example, I’d be very interested to know whether authors think
their experiments can be extended to Mechanical Turk.
Of course, allowing authors to define the scope provides
them with a way to obstruct replication attempts. For example, an author could
claim that the effect can only be found in cubicles of such-and-such dimensions
on the 12th floor of a building in a medium-sized Dutch city on a sunny Thursday afternoon.
Fortunately, RRR has a robust way of dealing with such shenanigans:
…we should then treat
the original effect as unreliable and unreproducible, and it should not factor
into larger theorizing…
This sounds exactly right to me.
I think there is only one issue that needs to be clarified.
It is quite possible that I did not make myself clear enough. Dan states
the bottom line of my previous
post as follows:
Rolf's larger point,
though, is that it should be considered a direct replication to vary things in
ways that are consistent with the theory that governs the study itself.
I’m not sure this is how I would characterize my larger
point. Rather, my point was that a direct replication could be augmented with
slight variations in the manner I described. This would enable us to transcend
idiosyncrasies of the original study to produce an authoritative test of the
hypothesis. So the direct replication would be part of a larger constellation
of studies. This way we would have (direct replication) our cake and eat
(validity) it too. This is exactly what Dan describes in his last sentence.
So the way I see it, we are in complete agreement. Direct
replications are the first important step but they should be followed up or
combined with slight variations to enhance the validity of our findings.
Commenting on my first post on this subject, Etienne LeBel suggests that this
goal can often be achieved by adding one or more conditions to the direct
replication. If the original experiment has a between-subjects design, this is
a great idea (though its execution may not always be feasible). For
within-subjects designs, it would clearly be a bad idea, as it would
significantly alter the experiment.
I look forward to hearing more about the thinking behind RRR
as it progresses. I already think it is a major positive development in our
field and it is likely to become even better.
I'm not sure I like the word "diablog"; it reminds me of "diabolical" and "diatribe". Fortunately, your interchange has been neither of those, and I have followed it with great interest. How about "blogalog"? Or does that sound too sing-song?
Thanks, blogalog does sound a bit too "sing-songy" for me, although it does capture the friendly nature of the interaction between Dan and me.