Monday, March 21, 2016

Truth in Advertising

As I indicated in my previous post, it is not easy to estimate beforehand to which situations your conclusions generalize. But it is important to at least make an effort. Often, conclusions are wildly oversold, which creates a paradoxical situation when the results fail to replicate. Usually, a hidden moderator is invoked and all of a sudden the scope of conclusions previously advertised as far-reaching is drastically narrowed. An example of this arrived in my mailbox the other day.

A while ago, our registered replication report of Hart & Albarracin (2011) came out. I already blogged about it. The first author, William Hart, has now written a response to our report. It will appear in the next issue of Perspectives on Psychological Science; I have an advance copy of the response.

Hart doesn’t raise substantive concerns about our report but he does suggest that maybe we didn’t replicate the original findings because the original study was run in Florida; he doesn't specify where, but I'm assuming Gainesville. Most of the studies in our replication were conducted with what he views as more liberal samples.

A number of issues are relevant here. For example, how strong were the original findings in the first place? Another question is how predictive the conservativeness of a county is of the conservativeness of the student body at a university situated in that county. Large universities attract students from all over the country and the way I understand it--political scientists might want to correct me here--students typically vote in their home state. 

I'm going to ignore these questions here because this post is about truth in advertising. Did the original study warn us that the conclusions would only hold for conservative samples? The answer is simple. No, it didn’t. All we get is this.

This doesn't even tell us the university that these 48 students were from, let alone how conservative they are. At least we now have Hart's response, which has narrowed our location down to the Sunshine State. Clearly, at the time the authors didn't think the geographical location of the student sample (let alone its conservativeness) was worth mentioning.

But the discrepancy between truth and advertising is even larger. Here are the article's conclusions.

Rather than alerting the reader that the effect is limited to conservative student samples, this statement suggests that the findings might generalize from the lab to the courtroom!

Am I being fair here? After all, isn't it normal scientific progress when later research finds the limitations of earlier findings? Yep, that's true. But I'm not so sure this is the case here. For one, in his response Hart doesn't provide any evidence that the finding replicates in a conservative sample--he merely offers the suggestion. 

And there also is this question. Does it make sense to generalize from findings with p-values of .01, .03, .02, obtained with a small (N=48 in a between-subjects design) sample, and a single vignette to courtroom behavior?

Rather than using the discussion section for overgeneralizations, it makes more sense to use it for specifying the situations under which the conclusions can be expected to hold. Not only does this provide more truth in advertising but it's also an important theoretical exercise. It's not easy, though.

I plan to pursue the topic of calibrating our conclusions in future posts. Now please excuse me while I try to assemble a Billy, Hemnes, Klippan, Poäng, or Bestå.


  1. This is a minor quibble, but if we're on truth in advertising, the excerpt you quoted specifically said "might be able to" and you say that he claims they "will generalize." It's fair to hold the original authors to the standard of the claims they make, but perhaps unfair to hold them to the standards of those that they don't.

    1. Fair point. I've changed "will generalize" to "might generalize."

  2. I agree wholeheartedly with your point, Rolf. At the same time, being able to specify, at the time of publication, all of the moderating variables and boundary conditions that limit the generalizability of your findings is asking a lot. I know I'm not capable of it. But acknowledging this difficulty means that authors ought to be circumspect in the speculation about possible "hidden moderators" following a failure to replicate. If they didn't think it mattered enough to mention at the time of publication, it's problematic to claim it as a key moderator to explain a failure to replicate. --Don Moore

    1. Thanks for your comment, Don. I agree that it is asking a lot (if not impossible) to specify all the boundary conditions ahead of time. This is why I'm going to devote a series of posts to it. I do think we can get calibrated a little bit better, however. In some cases, the discrepancy between what is claimed initially and what is claimed after a nonreplication is quite jarring.

  3. Quick note on the politics: US college students are allowed to vote in the state they go to school in; they have a choice. For national elections, a vote in FL, a battleground state, counts for more than a vote in a state that consistently goes to one party, so I wouldn't be surprised if many students from outside FL choose to vote there. Also, Alachua County, which contains Gainesville, is dominated by registered Democrats; at least Wikipedia speculates that this is a university-driven trend.
    With the exception of religious or traditionally conservative private universities, I can't see any sample of undergrads in intro psych being meaningfully more conservative than at other universities. Direct data about the sample would be needed to back up that claim.

    One feature of the current replication atmosphere is the immediate post-mortem by the original author, which seems to be becoming standard after a failure to replicate. Why reply right away with speculation in print? If there are no clear issues with how the replication was carried out, why not take the time to do some careful experimentation? If an effect is real but the boundaries weren't sufficiently specified to guide the researchers who attempted replication, this is an opportunity to iterate and clarify. Nobody wants to find out that their work doesn't replicate. The need for an immediate reply forces researchers to respond while likely feeling more defensive than collaborative. It also puts them in the weak position of having to speculate, since they don't have time to collect data before their reply is expected.

    1. Thanks for your thoughtful comments. I figured Alachua County was not too different from Leon County, where I used to live. This suggests that the conservatism argument is a nonstarter. That is, if the experiment was conducted in Gainesville. I’m not sure this was the case.

      Regarding your second point: I know for a fact that in this particular case there was more than enough time for the original authors to conduct their own replications.

  4. Thank you for your nice blog about a topic I come across quite often.

    I'm wondering which findings from cognitive psychology/cognitive neuroscience can be generalised to any meaningful extent. I give a class in cognitive ergonomics to industrial design bachelor students. They are looking for relevant and robust design principles guided by what we know about human cognition. If you look at the material that is out there, you arrive at work by Norman, Wickens and Proctor, who base themselves mostly on what we learned about human cognition 30 years ago. The evergreens about attention, memory, action, language and decision making. If you take a recent (2014) book, Designing with the Mind in Mind, by someone who is also an interaction designer and who consulted with cognitive scientists, then this book also presents only these golden oldies.

    Maybe this is just me (as an impatient human) looking at something (scientific progress, and its implications) that just takes a looong time. :)

    In any case, I'm very curious about your next posts on generalizations!

    1. Thanks Matthijs. You bring up a very interesting perspective that I hadn't thought of. I've also noticed that applied researchers usually rely on old chestnuts. They seem a few decades behind. This is another reason why it is useful to put more effort into thinking about the generalization of our conclusions.