Monday, August 7, 2017

Publishing an Unsuccessful Self-replication: Double-dipping or Correcting the Record?

Collabra: Psychology has a submission option called streamlined review. Authors can submit papers that were previously rejected by another journal for reasons other than a lack of scientific, methodological, or ethical rigor. Authors request permission from the original journal and then submit their revised manuscript with the original action letters and reviews. Editors like me then make a decision about the revised manuscript. This decision can be based on the ported reviews or we can solicit further reviews.

One recent streamlined submission had previously been rejected by an APA journal. It is a failed self-replication. In the original experiment, the authors had found that a certain form of semantic priming, forward priming, can be eliminated by working-memory load, which suggests that forward semantic priming is not automatic. This is informative because it contradicts theories of automatic semantic priming. When they tried to follow up on this work for a new paper, however, the researchers were unable to obtain this elimination effect in two experiments. Rather than relegating the study to the file drawer, they decided to submit it to the journal that had also published their first paper on the topic. Their submission was rejected. It is now out in Collabra: Psychology. The reviews can be found here.

[Side note: I recently conducted a little poll on Twitter asking whether or not journals should publish self-nonreplications. A staggering 97% of the respondents said journals should indeed publish self-nonreplications. However, if anything, this is evidence of the Twitter bubble I’m in. Reality is more recalcitrant.]

I thought the other journal’s reviews were thoughtful. Nevertheless, I reached a different conclusion than the original editor. A big criticism in the reviews was the concern about “double-dipping.” If an author publishes a paper with a significant finding, it is unfair to let that same author then publish a paper that reports a nonsignificant finding, as this gives the researcher two bites at the apple.

I understand the point. What drives this perception of unfairness is our current incentive system. People are (still) rewarded for the number of articles they publish, so letting someone first publish a finding and then a nonreplication of this finding is unfair. It is as if in football (the real football, where you use your feet to propel the ball) you get a point for scoring a goal and then an additional point for missing a shot from the same position.

However understandable, this idea loses its persuasive power once we take the scientific record into account. As scientists, we want to understand the world and lay a foundation for further research. It is therefore important to have good estimates of effect sizes and the confidence we should have in them. A nonreplication serves to correct the scientific record. It tells us that the effect is less robust than we initially thought. This is useful information for meta-analysts, who can now include both findings in their collection. And even more importantly, it is very useful for researchers who want to build on this research. They now know that the finding is less reliable than they previously thought. It might prevent them from wandering into a potential blind alley.

As with anything in science, allowing the publication of self-nonreplications opens the door to gaming the system. People could p-hack their way to a significant finding, publish it and then fail to “replicate” the finding in a second paper. As an added bonus, the self-nonreplication will also give them the aura of earnest, self-critical, and ethical researchers. Moreover, the self-nonreplication pretty much inoculates the finding from “outside” replication efforts. Why try to replicate something that even the authors themselves could not replicate?

That’s not two, not three, but four birds with one stone! You might think that I’m making up the inoculation motive for dramatic effect. I’m not. A researcher I know actually suspects another researcher of using the inoculation strategy.

How worried should we be about the misuse of self-nonreplications? I’m not sure. One potential safeguard is to have the authors explain why they performed the replication. Did they think there was something wrong with the original finding or were they just trying to build on it and were surprised to discover they couldn’t reproduce the original finding? And if a researcher makes a habit of publishing self-nonreplications, I’m sure people would be on to them in no time and questions would be asked.

So I think we should publish self-nonreplications. (1) They help to make the scientific record more accurate. (2) They are likely to prevent other researchers from ending up in a cul-de-sac.

The concern about double-dipping is only a concern given our current incentive system, which is one more indication that this system is detrimental to good science. But that’s a topic for a different post.





6 comments:

  1. I agree that of course we should publish self-nonreplications. The kind of person who deliberately p-hacks to get a sexy result is not the kind of person who would want the world to know the result didn't replicate. If people really would do this as a cynical ploy to boost their publication count then there really is no hope for the field.

  2. I agree with you two (not that it matters). Since when is it in a reviewer/editor's role or prerogatives to judge the consequence of a publication for a researcher's (or others') career? It would be unfair to others to publish a non-replication by the same authors who found the effect in the first place? No wonder there is such a publication bias. The main source of non-replicated effects is probably the people who work on these effects in the first place, and who, using the same or comparable material and methods, fail to find the effect they were looking for. This is frustrating per se, and these authors may not be willing to publish such non-replications, but if reviewers or editors of a journal also refrain from publishing these, furthermore for fallacious reasons, we are not out of the woods.

  3. At least in UK universities, what counts for promotion etc. these days is not really the number of papers, but the number of papers that are deemed 3*-4* for the REF (Research Excellence Framework). It's hard to imagine a failed self-replication ever being deemed 3*, meaning that the double-dipping issue doesn't arise - A nice (if rare) example of an unintended consequence that turns out positive!

  4. I completely agree. You've already laid out all the philosophical reasons for that and also why it simply is unlikely to be a major problem in practice. But there is also another pragmatic reason why we should make it easy to publish non-replications: if we want to change the incentive structures in research we need to give people a better incentive for doing self-corrective science. If authors are not allowed to publish failures to replicate their own findings, what possible reason would anyone have for doing it?

    I would bet that even in these days of shameless little bullies and methodological terrorism, most replications are done by the authors themselves because they follow up on previous research they did. (Probably more often than not these are non-exact replications but still the same point applies). People don't just replicate other people's experiments unless there is a good reason for it (sensational claim, key result for theory, etc). Under the status quo there is not really any good motivation to publish self-non-replications. Yes, it adds a publication to your CV but probably also hurts your citations on the original study.

    In general I strongly feel we must get away from this thinking. We should not treat individual papers as gemstones in our crowns. Any scientific finding should produce a tree of follow-up research, including direct replications, indirect replications, or investigations of confounding or moderating factors. The whole thing should be regarded in its entirety, not just each piece on its own. I don't know how easy it is to get to that stage but I think it's essential.

    On a more basic note, new discoveries should probably be expected to contain at least one pre-registered self-replication. But importantly, publication of those findings shouldn't be contingent on the outcome of that either. Even an inconsistent set of results could be important - possibly for reasons that have nothing to do with the result (such as the method used, the theory behind it, etc).

  5. hi all,
    I was one of the authors of the paper under discussion. I'm 57 and at this moment in my career, I have absolutely nothing to gain from an additional paper. but apart from that, if I were the kind of person who wants to boost his CV by cheating, wouldn't there be better ways to do that (e.g., by just making up data and writing a sensational paper for a high impact journal -- I could name one, but I won't do it)? what I mean is, isn't the double-dipping idea far-fetched?
    gert storms, university of leuven
