Today, a first for this blog: a guest post! In it, Alexa Tullett reflects on the consequences of Fox's data manipulation, which I described in the previous post, for her own research and that of her collaborator, Will Hart.
Alexa Tullett
University of Alabama
[Disclaimer: The opinions expressed in this post are my own
and not the views of my employer]
When I read Rolf’s previous post about the verb aspect RRR, much of what he said resonated with me. I have been in Rolf’s position before as
an outside observer of scientific fraud, and I have a lot of admiration for his
work in exposing what happened here. In
this case, I’m not an outside observer. Although I was not involved with the
RRR that Rolf describes in detail, I was a collaborator of Fox’s (I’ll keep up
the pseudonym), and my name is on papers that have been, or are in the process of being, retracted. I also continue to be a collaborator of Will Hart’s, and
hope to be for a long time to come. Rolf has been kind enough to allow me space
here to provide my perspective on what I know of the RRR and the surrounding
events. My account is colored by my personal relationships with the people
involved, and while this unquestionably undermines my ability to be objective,
perhaps it also offers a perspective that a completely detached account cannot.
I first became involved in these events after Rolf requested
that Will re-examine the data from his commentary for the RRR. Will was of the
mind that data speak louder than words, so when the RRR did not replicate his
original study he asked Fox to coordinate data collection for an additional
replication. Fox was not an author on the original paper, and was not told the
purpose of the replication. Fox ran the replication, sent the results to Will,
and Will sent those and his commentary to Rolf. Will told me that he had
reacted defensively to Rolf’s concerns about these data, but eventually Will
started to have his own doubts. These doubts deepened when Will asked Fox for
the raw data and Fox said he had deleted the online studies from Qualtrics
because of “confidentiality” issues. After a week or two of communicating with
the people at Qualtrics, Will was able to obtain the raw data, and at that point he asked me if I would be willing to compare them with the “cleaned” data he had sent to Perspectives.
I will try to be as transparent as possible in documenting
my thought process at the time these events unfolded. It’s easy to forget – or
never consider – this naïve perspective once fraud becomes uncontested. When I
first started to look at the data, I was far from the point where I seriously
entertained the possibility that Fox had tampered with the data. I thought
scientific fraud was extremely rare. Fox was, in my mind, a generally
dependable and well-meaning graduate student. Maybe he had been careless with
these data, but it seemed far-fetched to me that he had intentionally changed
or manipulated them.
I started by looking for duplicates, because this was the
concern that Will had passed along from Rolf. They weren’t immediately obvious
to me, because the participant numbers (the only unique identifiers) had been
deleted by Fox. But when I sorted by the free-response answers, several duplicates became apparent, as one can see in Rolf’s screenshot. There were more
duplicates as well, but they were harder to identify for participants who
hadn’t given free-response answers. I had to find these duplicates based on
patterns of Likert-scale answers. I considered how this might have happened,
and thought that perhaps Fox had accidentally downloaded the same condition
twice, rather than downloading the two conditions. As I looked at these data
further I realized that there had also been deletions. I speculated that Fox
had been sloppy when copying and pasting between datasets – maybe some
combination of removing outliers without documenting them and accidentally
repeatedly copying cases from the same dataset.
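To make this concrete, here is a minimal sketch of what such a duplicate check can look like in Python/pandas. The file and column names are hypothetical placeholders, not the actual study variables, and this is the general idea rather than the exact code I used: match rows first on free-response text, then on full patterns of Likert-scale answers.

```python
# Minimal sketch of screening for duplicated rows after the unique
# identifiers have been removed. File and column names are hypothetical.
import pandas as pd

df = pd.read_csv("cleaned_data.csv")  # placeholder file name
likert_cols = [c for c in df.columns if c.startswith("likert_")]

# Easiest catches: rows that share identical free-response text.
has_text = df["free_response"].notna()
text_dupes = df[has_text & df.duplicated(subset=["free_response"], keep=False)]

# Harder catches: rows that share the full pattern of Likert-scale answers.
pattern_dupes = df[df.duplicated(subset=likert_cols, keep=False)]

print(f"{len(text_dupes)} rows share free-response text")
print(f"{len(pattern_dupes)} rows share an identical Likert pattern")
```

Of course, with short scales identical Likert patterns can occur by chance, so rows flagged this way call for closer inspection rather than automatic conclusions.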
I only started to genuinely question Fox’s intentions when I ran the key analysis separately on the duplicated and the deleted cases and tested the interaction (a sketch of this check follows this paragraph). Sure enough, the effect was present in the duplicated cases and absent in the deleted cases. This may seem like damning evidence, but to be
honest I still hadn’t given up on the idea that this might have happened by
accident. Concluding that this was fraud felt like buying into a conspiracy
theory. I only became convinced when Fox eventually admitted that he had done this knowingly, and that he had done the same thing with many other datasets that were the foundation of several published papers, including some on which I am an author.
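For illustration only, a rough sketch of that check might look like the following. The file and variable names are invented placeholders, not the study's actual ones, and this is not the analysis script we used.

```python
# Hypothetical sketch: fit the key interaction model separately to the
# duplicated and the deleted cases. Names ("dv", "condition", "aspect",
# and the CSV files) are invented placeholders.
import pandas as pd
import statsmodels.formula.api as smf

for label, path in [("duplicated cases", "duplicated_cases.csv"),
                    ("deleted cases", "deleted_cases.csv")]:
    subset = pd.read_csv(path)
    fit = smf.ols("dv ~ condition * aspect", data=subset).fit()
    print(label)
    print(fit.summary().tables[1])  # inspect the condition:aspect term
```

If the interaction is reliable among the duplicated cases and absent among the deleted ones, that is exactly the asymmetry described above.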
Fox confessed to doing this on his own, without the
knowledge of Will, other graduate students, or collaborators. Since then, a
full investigation by UA’s IRB has drawn the same conclusion. We were asked not
to talk about these events until that investigation was complete.
Hindsight’s a bitch. My thinking prior to Fox’s confession
seems as absurd to me as it probably does to you. How could I have been so
naively reluctant to consider fraud? How could I have missed duplicates in
datasets that I handled directly? I
think part of the answer is that when we get a dataset from a student or a
collaborator, we assume that those data are genuine. Signs of fraud are more
obvious when you are looking for them. I wish we had treated our data with the
skepticism of someone who was trying to determine whether they were fabricated,
but instead we looked at them with the uncritical eye of scientists whose
hypotheses were supported.
Fox came to me to apologize after he admitted to the
fabrication. He described how and why he started tampering with data. The first
time it happened he had analyzed a dataset and the results were just shy of
significance. Fox noticed that if he duplicated a couple of cases and deleted a
couple of cases, he could shift the p-value to below .05. And so he did. Fox
recognized that the system rewarded him, and his collaborators, not for
interesting research questions, or sound methodology, but for significant
results. When he showed his collaborators the findings they were happy with
them—and happy with Fox.
The silver lining. I’d like to think I’ve learned something
from this experience. For one thing, the temptation to manipulate and fake
data, especially for junior researchers, has become much more visible to me.
This has made me at once more understanding and more cynical. Fox convinced
himself that his research was so trivial that faking data would be
inconsequential, and so he allowed his degree and C.V. to take priority. Other
researchers have told me it’s not hard to relate. Now that I have seen and can
appreciate these pressures, I have become more cynical about the prevalence of
fraud.
My disillusionment is at least partially curbed by the
increased emphasis on replicability and transparency that has occurred in our
field over the past 5 years. Things have changed in ways that make it much more
difficult to get away with fabrication and fraud. Without policies requiring open data, cases like this one would often go undiscovered. Even more
encouragingly, things have changed in ways that begin to alter the incentive
structures that made Fox’s behavior (temporarily) rewarding. More and more
journals are adopting registered report formats where researchers can submit a
study proposal for evaluation and know that, if they faithfully execute that
study, it will get published regardless of outcome. In other words, they will
have the freedom to be un-invested in how their study turns out.
Yes, it is amazing how people overlook misconduct. In an analysis of reviews and reasons for rejecting/accepting articles, Bornmann, Nast, and Daniel (2008) identified 572 reasons... none of them related to (suspected) misconduct. Even when misconduct in a paper is as evident as the sun on a cloudless day in the Sahara, co-authors, editors, and reviewers most often do not see it.
One line in the post disturbed me, and it is this one: "For one thing, the temptation to manipulate and fake data, especially for junior researchers, has become much more visible to me." Please leave out "especially for junior researchers". The temptations are big for everyone, and not particularly for young researchers. As far as I can tell (but please correct me if I am wrong), no evidence exists that misconduct is more prevalent or tempting for young researchers.
Thanks for raising this point, Marcel. I certainly didn’t mean to imply that misconduct is more prevalent among junior researchers. I simply wanted to acknowledge that often the stakes are higher for junior researchers – especially those without jobs or tenure – because they may have more to lose if they don’t meet certain standards of productivity. Certainly these same pressures apply to academics at all levels, and I know of no evidence that one group succumbs to these pressures more than others.
Alexa Tullett’s thought-provoking comments regarding Will Hart’s recently retracted Psychological Science article end with the wish that, in the future, registered reports will enable researchers to “have the freedom to be un-invested in how their study turns out.” That gave me pause. It might very well be a good thing, but what a sea change! In my perception, the modus operandi of psychological science has long been that researchers stake out some proposition (e.g., that misleading postevent suggestions impair memory or that exercising self-control depletes a limited resource) and then cleverly defend those claims against all comers. Yes, researchers are motivated to publish, but perhaps even more they are motivated to advance and defend their theories. Mike Watkins observed back in 1984 that in psychology theories are like toothbrushes; everyone has one and no one wants to use anyone else’s. Obviously that's not great, but surely we need theory and we need research that refines theory.
There are things to be said in favour of caring how our studies turn out. Such care may motivate the use of effective manipulations, sensitive and reliable measures, rigorous control, and high power. I think what we should aspire to is not a world in which we don’t care how our studies turn out, but instead one in which we care more about producing results that are informative than about obtaining a p < .05. Perhaps that’s what Alexa meant.
Thanks to Roddy Roediger for reminding me that Mike Watkins was the source of the models/toothbrushes analogy and for comments on a draft of this post.
Steve Lindsay raises an interesting point. I actually don't see a tension between his goal of a world in which "we care more about producing results that are informative than about obtaining p < .05" and Alexa's wish for researchers to "have the freedom to be un-invested in how their results turn out." In order to care more about producing results that are informative than about showing that you or your theory are right, you must have the freedom to be wrong, and to be happy to have found out you were wrong. Maybe "un-invested" is not the right word, but I think you are both getting at the same idea. A researcher, even one who has provided support for a theory their whole career, should *want* to know if they are wrong, and if the results of their study tell them that they're wrong, they should be glad that they found out. Which is inconsistent with the model of science where a researcher defends their theory against all criticisms.
That doesn't mean we can't be invested in a theory at all, but it does mean we can't let that investment be more important to us than being right. We need to sincerely hope that if we're wrong, the truth will come out. That might indeed mean a sea change.
PS: I meant to sign my comment (above)! -Simine Vazire
Steve, I agree that it’s important to consider the compromise between caring about a theory and being open to any result. As Simine notes, in principle the two are not in conflict. Although it can be very difficult to completely eliminate your stake in your own theory, I still think there are many merits to striving for disinterest (not in the sense that one doesn’t care, but in the sense that one is open to any result). Caring about knowing an answer, and not caring what the answer is, seems to me to be the ideal way to ensure that truth gets prioritized over a particular theoretical position.
Thanks very much for this post, Alexa. One comment:
"How could I have missed duplicates in datasets that I handled directly? I think part of the answer is that when we get a dataset from a student or a collaborator, we assume that those data are genuine. Signs of fraud are more obvious when you are looking for them."
It's true that few people go looking for fraud in their colleagues' data, but surely it is important to be alert to the possibility of error? In this case, the duplications "Fox" produced could easily have arisen accidentally. If I saw such duplications in my students' data my first thought would be: they've copied and pasted something wrongly in Excel. And I would be alert to that possibility because I've made similar errors myself (only to catch them later) - I'm sure we all have.