Wednesday, June 5, 2013

The Diablog on Replications and Validity Continues


In the latest conversational turn in my ongoing dialog (diablog?) with Dan Simons about replications and validity, Dan provides some useful insights into what qualifies as a direct replication:

a direct replication can be functionally the same if it uses the same materials, tasks, etc. and is designed to generalize across the same variations as the original.

I agree completely. As Dan notes, no replication can be exact and some changes are inevitable for the experiment to make sense. At Registered Replication Reports (RRR), Dan and his colleague Alex Holcombe have instituted some interesting procedures:

Our approach with Registered Replication Reports is to ask the original authors to specify a range of tolerances on the parameters of the study. 

This is a great idea. What I like even more is that Dan not only talks the talk but also walks the walk. He is using this approach in his own papers by adding a paragraph to the method section in which he states the generalization target for his experiments. It would be tremendously useful if we all did this. For example, I’d be very interested to know whether authors think their experiments can be extended to Mechanical Turk.

Of course, allowing authors to define the scope provides them with a way to obstruct replication attempts. For example, an author could claim that the effect can only be found in cubicles of such-and-such dimensions on the 12th floor of a building in a medium-sized Dutch city on a sunny Thursday afternoon. Fortunately, RRR has a robust way of dealing with such shenanigans:

…we should then treat the original effect as unreliable and unreproducible, and it should not factor into larger theorizing…

This sounds exactly right to me.

I think there is only one issue that needs to be clarified. It is quite possible that I did not make myself clear enough. Dan states the bottom line of my previous post as follows:

Rolf's larger point, though, is that it should be considered a direct replication to vary things in ways that are consistent with the theory that governs the study itself. 

I’m not sure this is how I would characterize my larger point. Rather, my point was that a direct replication could be augmented with slight variations in the manner I described. This would enable us to transcend idiosyncrasies of the original study to produce an authoritative test of the hypothesis. So the direct replication would be part of a larger constellation of studies. This way we would have (direct replication) our cake and eat (validity) it too. This is exactly what Dan describes in his last sentence.

So the way I see it, we are in complete agreement. Direct replications are the first important step but they should be followed up or combined with slight variations to enhance the validity of our findings. Commenting on my first post on this subject, Etienne LeBel suggests that this goal can often be achieved by adding one or more conditions to the direct replication. If the original experiment has a between-subjects design, this is a great idea (though its execution may not always be feasible). For within-subjects designs, it would clearly be a bad idea, as it would significantly alter the experiment.

I look forward to hearing more about the thinking behind RRR as it progresses. I already think it is a major positive development in our field and it is likely to become even better.

2 comments:

  1. I'm not sure I like the word "diablog" - it reminds me of "diabolical" and "diatribe". Fortunately, your interchange has been neither of those; I have followed it with great interest. How about "blogalog"? Or does that sound too sing-song?

    1. Thanks. "Blogalog" does sound a bit too "sing-songy" for me, although it does capture the friendly nature of the interaction between Dan and me.
