It's becoming a trend: another guest blog post. This time, J.P. de Ruiter presents a view, one I happen to share, on the value of experience in criticizing research.
J.P. de Ruiter
One of the reasons that the scientific method was such a brilliant idea is that it has criticism built into the process. We don’t believe something on the basis of authority; we need to be convinced by relevant data and sound arguments, and if we think that either the data or the argument is flawed, we say so. Before a study is conducted, this criticism is usually provided by colleagues or, in the case of preregistration, by reviewers. After a study is submitted, critical evaluations are performed by reviewers and editors. And even after publication, the criticism continues, in the form of discussions in follow-up articles, at conferences, and on social media. This self-corrective aspect of science is essential, so criticism, even though it can at times be difficult to swallow (we are all human), is a very good thing.
We often think of criticism as pointing out flaws in the data collection, statistical analyses, and argumentation of a study. In methods education, we train our students to become aware of the pitfalls of research. We teach them about assumptions, significance, power, interpretation of data, experimenter expectancy effects, Bonferroni corrections, optional stopping, and so on. This type of training makes young researchers very adept at finding flaws in studies, and that is a valuable skill to have.
While I appreciate that noticing and formulating the flaws and weaknesses in other people’s studies is a necessary skill for becoming a good critic (or reviewer), it is, in my view, not sufficient. It is very easy to find flaws in any study, no matter how well it is done. We can always point out alternative explanations for the findings, note that the sample was not representative, or state that the study needs more power. Always. So pointing out why a study is not perfect is not enough: good criticism takes into account that research always involves a trade-off between validity and practicality.
As a hypothetical example: if we review a study about a relatively rare type of aphasia and notice that the authors have studied 7 patients, we could point out that (a) in order to generalize their findings, they need inferential statistics, and (b) in order to do that, given the estimated effect size at hand, they would need at least 80 patients. We could, but we probably wouldn’t, because we would realize that it was probably hard enough to find 7 patients with this affliction to begin with, so finding 80 is likely impossible. So we would focus on other aspects of the study instead. We do, of course, keep in mind that we cannot generalize from the results of this study with the same level of confidence as from a lexical decision experiment with a within-subject design and 120 participants. But we are not going to say, “This study sucks because it had low power.” At least, I want to defend the opinion here that we shouldn’t say that.
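The back-of-the-envelope arithmetic behind claims like "they'd need at least 80 patients" can be sketched with the standard normal-approximation power formula. The effect size used below (Cohen's d = 0.5) is a hypothetical assumption for illustration, not a figure from any actual aphasia study:

```python
from math import ceil
from statistics import NormalDist

def required_n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided,
    two-sample t-test, using the normal approximation:
    n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = .05
    z_power = z.inv_cdf(power)           # ~0.84 for power = .80
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

# Hypothetical medium effect size (Cohen's d = 0.5):
print(required_n_per_group(0.5))  # -> 63 per group
```

The exact t-distribution calculation gives a slightly larger answer (64 rather than 63 per group for d = 0.5), but the point is the scaling: since n grows with 1/d², halving the expected effect size quadruples the required sample, which is exactly what makes such demands unrealistic for rare populations.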
While this is a rather extreme example, I believe that this principle should be applied at all levels and in all aspects of criticism. I remember that as a grad student, a local statistics hero informed me that my statistical design was flawed, and proceeded to require an ANOVA that was way beyond the computational capabilities of even the most powerful supercomputers available at the time. We know that full linear mixed models (LMMs) with random slopes and intercepts often do not converge. We know that many Bayesian analyses are intractable. In experimental designs, one runs into practical constraints as well. Many independent variables simply can’t be studied in a within-subject design. Phenomena that only occur spontaneously (e.g. iconic gestures) cannot be fully controlled. In EEG studies, it is not feasible to control for artifacts due to muscle activity, so studying speech production is not really possible with this paradigm.
My point is: good research is always a compromise between experimental rigor, practical feasibility, and ethical considerations. To be able to appreciate this as a critic, it really helps to have been actively involved in research projects. Not only because that gives us a deeper appreciation of the trade-offs involved, but also, perhaps more importantly, because it gives us the experience of really wanting to discover, prove, or demonstrate something. It makes us experience first-hand how tempting it can be, in Feynman’s famous formulation, to fool ourselves. I do not mean to say that we should become less critical, but rather that we become better, more constructive critics if we are able to empathize with the researcher’s goals and constraints. Nor do I want to say that criticism from those who have not yet had first-hand research experience should be taken less seriously. All I want to say here is that (and why) having been actively involved in the process of contributing new knowledge to science makes us better critics.