Sunday, February 3, 2013

The Actual Results are in!

Good thing I called the previous update “preliminary results!” I discovered later that many subjects had “clicked through” the descriptions, merely reading the titles and spending only a second or so on the descriptions. In my preregistration I had said that I would exclude the data from subjects if their viewing times were < 30 sec for two or more abstracts.

Well, it turned out I would have had to throw away data from more than half the subjects! I therefore decided to rerun the study, this time with a stern warning in the instructions: viewing times would be measured, data from subjects with impossibly short viewing times would be unusable, and those subjects would not get paid.

This seemed to help. In the second run, far fewer subjects had impossibly short viewing times, although there were still quite a few delinquents (they did not get paid).
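The preregistered exclusion rule (viewing times under 30 seconds for two or more abstracts) can be sketched roughly as follows. This is only an illustration and not the script I actually used: the function names, the subject labels, and the viewing times are all made up.

```python
from typing import Sequence

MIN_VIEWING_SECONDS = 30.0  # threshold stated in the preregistration
MAX_SHORT_VIEWS = 2         # "two or more" short viewings trigger exclusion


def count_short_views(viewing_times: Sequence[float],
                      min_seconds: float = MIN_VIEWING_SECONDS) -> int:
    """Count the abstracts a subject viewed for less than min_seconds."""
    return sum(1 for t in viewing_times if t < min_seconds)


def is_excluded(viewing_times: Sequence[float],
                min_seconds: float = MIN_VIEWING_SECONDS,
                max_short: int = MAX_SHORT_VIEWS) -> bool:
    """Return True if the subject has max_short or more too-short viewings."""
    return count_short_views(viewing_times, min_seconds) >= max_short


# Hypothetical viewing times in seconds, one value per abstract.
subjects = {
    "careful_reader": [45.0, 52.3, 38.1, 61.0],
    "one_quick_skim": [44.0, 12.5, 40.2, 50.9],
    "click_through": [1.2, 0.9, 47.0, 1.5],
}

usable = [name for name, times in subjects.items() if not is_excluded(times)]
print(usable)  # ['careful_reader', 'one_quick_skim']
```

Note that a single short viewing is tolerated under this rule; only the second one leads to exclusion.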

The lesson here: first run a sizeable pilot study before you pre-register, dummy!

I overshot a little, but at least I have 206 usable subjects now. These subjects took about 47 seconds on average to read the abstracts, which seems reasonable. There was no difference in viewing times between the amusing and nonamusing conditions.

As a good boy, I’m separating the discussion into a confirmatory part and an exploratory part.

The confirmatory part

People expressed less confidence in the research in the amusing condition than in the nonamusing condition, and this difference was significant (p=.006; the bars in the figure represent standard errors). The effect was a lot smaller than in my original report, though, and according to a Bayesian analysis, the evidence for my alternative hypothesis is only about twice as strong as the evidence for the null hypothesis of no effect.

The pattern for interestingness was basically a wash. Numerically, people found the studies with the nonamusing titles more interesting than those with amusing titles (opposite to my prediction), but this difference was not significant (p=.13).

So much for the confirmatory part; on to the exploratory part.

The exploratory part

At the end of the experiment, I asked one true/false question about each study. The question always pertained to the main finding of the study. I had merely included these questions to get a sense of the subjects’ understanding of the abstracts without expecting there to be differences between the conditions. However, the amusing condition yielded a lower proportion of correct answers than the nonamusing condition (.66 vs. .69, p=.013). Although this evidence is at best ambiguous in the land of Bayes, it might not be a bad idea to include a more extensive test of comprehension in future studies.

Limitations: Part I

The effects of amusing titles will likely be more pronounced with researchers in psychology, or with scientists in general. One reason for this is that the experts will be more intrinsically motivated to read the abstracts.

It is also important to note that laypeople may not know some of the technical terms in the nonamusing titles and descriptions, so the amusing title might provide scaffolding (to use a term from educational psychology) for their understanding of the description, whereas it will be ornamental to the experts.

Also, experts may take a more serious view of science than do laypeople, and might therefore be more likely to be put off by amusing titles. But this is an empirical question of course. After all, it is the scientists who generated the amusing titles in the first place!

There was one comment from a subject I found both moving and telling about the current economic situation: “Thank you for easing my unemployment.”

Limitations: Part II

Another limitation was the set of titles and abstracts. I selected only 12 because I thought this was about as many as the Turkers could handle, which is probably right. I would have been more comfortable with at least 20.

In addition to the number of stimuli, their content is also a potential issue. From perusing many amusing titles and the associated abstracts, I learned that amusing titles come in many variants (more about this in a later post). I didn’t really take this into account and basically selected titles in which the pre-colon part was (somewhat) amusing and did not provide information that the post-colon part did not also provide. This was a judgment call, of course, and as I indicated above, the pre-colon part might not have always been redundant to the subjects.

Another selection criterion was that the abstract should not be too technical (again, a judgment call on my part).

It is quite possible that these criteria have produced a set of titles that is heterogeneous in terms of its amusingness.


The main conclusion is that there is a tendency among laypeople to view evidence from articles with amusing titles as somewhat less convincing than the same evidence from the same articles with nonamusing titles.

Where do we go from here?

An experiment with an expert sample would be a good idea (assuming there still are psychologists left who are not readers of this blog;)).

This experiment would have to involve more abstracts to gain more power. More abstracts will be less onerous on the experts than they would be on Turkers.

It might also be a good idea to first perform a careful analysis of types of amusing titles. It is likely that they don’t all have the same effect.

And for the rest I’m open to your suggestions. Please fire away!


  1. One idea is that instead of simply dividing abstracts into "amusing" vs. "non-amusing," you could collect individual ratings of amusingness from each participant, and predict a participant's confidence and interest from their amusingness ratings in a multilevel model.

    And if you decide to do something like this, I guess you would want to think about whether amusingness in itself is really what's driving things here, or something else related to it. Is it literally the fact that people personally find the titles humorous? Or maybe just the perception that some people might find it humorous, whether or not they do? Maybe it is not directly to do with humor, but instead just that it seems generally less professional in some way? I don't know the right answers to these conceptual questions, or to what extent you wish to address them, but they are things to think about.

    1. You are right. "Amusing" is just a label. I got this from an article on this topic, which I cite in an earlier post. I'm working on a better labeling system. The general category would be "nonliteral" titles. The subcategory of this will be the one that you might label "nonserious" or "nonfunctional." It is those titles that I expect will have a harmful effect on the perception of the research.

  2. Thanks for the follow-up. Just a quick note: I would also check whether the declared level of an abstract's interestingness was predicted by the reading time measurement and/or the index of understanding of the abstract in both conditions. This might give you a hint about the mechanism to test in the next study. I would predict that the regression would be significant only in the non-amusing condition.

    1. Thanks for the suggestion. I'd have to control for a number of factors then, such as length of the abstract (in words or characters), word frequency, syntactic structure, coherence, all of which are known to influence reading times.

  3. Maybe this is a blunt question to ask, but why is this research interesting in the first place?

    I thought we were moving away from doing research into 'funny' subjects without theoretical underpinning.

    1. To be equally blunt, the presuppositions behind your question are completely off.

      (1) This is a blog, not a scientific article. In an earlier post, I proposed to strengthen the division of labor between scientific articles and blogs. Blogs are partly meant to entertain, articles are not. Extensive theoretical underpinnings are for articles but not so much for blogs.

      (2) The idea I'm investigating here is that the division between entertainment and science has been blurred over the years and that the main manifestation of this is in the titles of journal articles. I wrote about this in my first post on this topic. Then, in a later post, I provided quantitative evidence that amusing titles have been increasing in number over the years (well, in one journal at least).

      (3) Then I set out to investigate the effects of amusing titles on the perception of research, of which this is the first experiment.

      (4) In performing this experiment, I conducted a meta-experiment on open science by pre-registering the study, then performing it, and then reporting the results. I also wrote about pre-registration in an earlier post and this was a way of putting my money where my mouth is.

      (5) I am now learning from the feedback of interested readers, which may help me design future experiments. In that sense, it is also an experiment in crowd sourcing research. If I ever write them up in the form of a scientific article, they will be grounded in the literature. Again, though, that is not the point of this blog.

      (6) This is indeed a "funny" subject, but the undertone is dead serious: maybe we should be less funny in scientific articles, because this is what happens. Now, I'll be the first one to say that we cannot draw this conclusion based on this one experiment, but it's a start.

    2. "Blogs are partly meant to entertain, articles are not. Extensive theoretical underpinnings are for articles but not so much for blogs."
      I actually disagree with this point a little; it can work this way but it's not compulsory. We've been using our blog as a place to work out serious science for several years and it's been extraordinarily useful. There is a slightly more relaxed atmosphere with a blog; it's slightly more 'work in progress'.

      But I think it's ok to take blogging seriously too, if you want to. We've used ours as a space to talk about and critique published work, for example, and we've taken that seriously: we haven't just been shooting from the hip because it's a blog. I think you can make a serious contribution in this forum.

    3. "There is a slightly more relaxed atmosphere with a blog; it's slightly more 'work in progress'."

      We're on the same page here.

      "But I think it's ok to take blogging seriously too, if you want to."

      Again, no disagreement here. I'm aiming for a combination of the serious and the loose.

    4. Thank you for your extensive answer.

      But when you state "If I ever write them up in the form of a scientific article, they will be grounded in the literature", shouldn't it be the other way around? First reading and thinking instead of immediately rushing into conducting (exploratory) experiments.

    5. The topic of this study is in the area of discourse comprehension. I have read and thought about this topic. I have even written a paper or two on this topic. I've even done research showing that people's expectations of the genre of a text influence their processing and mental representations of it.

      So the general grounding is there. I just don't think an extensive lit review belongs in a blogpost. As far as I know, there is no research on the effects of amusing titles. But perhaps you can point me to the relevant literature.

      Finally, the experiment was confirmatory. I had a specific hypothesis, which I tested.

  4. I think your blog should have a 'like' button!

    Or is it the titles of the articles that should have one? Or, an amusing title of a short post on your blog, like this one;)
    I still think that the experts already form their first impression from the title alone ("do I want to continue reading?"), and in that case an informative title that reveals the results of the study is preferable, whether amusing or not. I'm looking forward to reading your analysis on the types of titles, keep it up!

  5. Thanks!

    It is indeed a good idea to examine perceptions of the titles themselves and assess whether people are going to read the articles. This would be fairly easy to implement.

  6. Final try:

    Just looked up a random article title in a physics journal and found: "A DFT + U study of (Rh, Nb)-codoped rutile TiO2", which impressed me immensely, because I have no idea what it is about. Possibly the amusing titles give readers a sense of understanding, causing the lower confidence judgments (if I can understand this as a non-scientist, it cannot be much).

    Was also wondering whether in-group scientists (psychologists) appreciate amusing titles as a form of inside joke, while out-group scientists do not. This also seems to be a matter of norms.

    Enjoying this curiosity-in-action on this blog!

  7. Interesting point. Might be true for some. I have an alternative explanation. See my next post.

    The in-group out-group idea makes sense as well. You might find those effects. As a critical observer I would say that science should never be an in-group thing.

    Thanks for the compliment. I like the description!