In the past few years, a set of new terms has become common parlance in post-publication discourse in psychology and other social sciences: sloppy science, questionable research practices, researcher degrees of freedom, fishing expeditions, and data that are too-good-to-be-true. An excellent new paper by Andrew Gelman and Eric Loken takes a critical look at this development. The authors point out that they regret having used the term fishing expedition in a previous article that contained critical analyses of published work.
The problem with such terminology, they assert, is that it implies conscious actions on the part of the researchers, even though, as they are careful to point out, the people who have coined or are using those terms (this includes me) may not think in terms of conscious agency. The main point Gelman and Loken make in the article is that there are various ways in which researchers can unconsciously inflate effects; I will write more about this in a later post. Here I want to focus on the nomenclature issue. Gelman and Loken are right that, despite post-publication reviewers’ best intentions, the terms they use do evoke conscious agency.
We need to distinguish between post-publication review and ethics investigations in this regard, as these activities have different goals. Scientific integrity committees are charged with investigating the potential wrongdoings of scientists; they need to reverse-engineer behavior from the information at their disposal (published data, raw data, interviews with the researcher, their collaborators, and so on). Post-publication review is not about research practices. It is about published results and the conclusions that can or cannot be drawn from them.
If we accept this division of labor, then we need to agree with Gelman and Loken that the current nomenclature is not well suited for post-publication review. Actions cannot be unambiguously reverse-engineered from published data. Let me give a linguistic example to illustrate. Take the sentence “Visiting relatives can be frustrating.” Without further context, it is impossible to know which process gave rise to this utterance. The sentence is a standing ambiguity: any Chomskyan linguist will tell you that it has one surface structure (the actual sentence) and two deep structures (meanings). It can mean that it is frustrating to visit relatives or that it is frustrating when they visit you. There is no way to tell which deep structure has given rise to this surface structure.
It is the same with published data. Are the results the outcome of a stroke of luck, optional stopping, selective removal of data, selective reporting, an honest error, or outright fraud? This is often difficult to tell and probably not something that ought to be discussed in post-publication discourse anyway.
So the problem is that the current nomenclature generally brings to mind agency. Take sloppy science. It implies that the researcher has failed to exert an appropriate amount of care and attention; science itself cannot be sloppy. As Gelman and Loken point out, p-hacking is not necessarily intended to mean that someone deliberately bent the rules (and, in fact, their article is about how researchers unwittingly inflate the effects they report; more about this interesting idea in a later post). However, the verb form implies action on the part of the researcher; it is not a description of the results of a study. The same is true, of course, of fishing expedition. It is the researchers who are going on a fishing expedition; it is not the data that have cast their lines. Questionable research practices is obviously a statement about the researcher, as is researcher degrees of freedom.
But how about too-good-to-be-true? This one clearly qualifies as a statement about the data and not about the researcher. Uri Simonsohn used it to describe the data of Dirk Smeesters, and the Scientific Integrity Committee I chaired adopted this characterization as well. Still, it has a distinctly negative connotation. Frankly, the first thing I think of when I hear too-good-to-be-true is Donald Trump’s hair. And let’s face it: no researcher on this planet wants to be associated—however remotely—with Donald Trump’s hair.
What we need for post-publication review is a term that does not imply agency or refer to the researcher—we cannot reverse-engineer behavior from the published data—and that does not have a negative connotation. A candidate is implausible pattern of results (IPR). Granted, researchers will not be overjoyed when someone calls their results implausible, but the term does not imply any wrongdoing on their part and yet does express a concern about the data.
But who am I to propose a new nomenclature? If readers of this blog have better suggestions, I’d love to hear them.