Friday, August 16, 2013

50 Questions About Messy Rooms and Clean Data

About a month ago, I had a difficult conversation with my daughter. Her year in college had not gone particularly well and I asked her what she was going to do differently next year. One of the first things she was going to do, she said, was to clean up her student room. It was just too cluttered to concentrate.

Not only 20-year-old students hypothesize about the effects of environments on thought; social psychologists do too. My daughter’s hypothesis is straightforward: messy environments are distracting. The social psychologists’ hypotheses take us a little further afield. For example: Messy environments promote stereotyping. The paper describing research into this hypothesis was co-authored by Diederik Stapel and has been retracted. Another hypothesis is that messy environments promote a longing for simplicity. The paper describing research into this hypothesis was co-authored by Dirk Smeesters and has been retracted.

Now there is a new study on messiness. It is about to be published in Psychological Science and has already received a lot of press coverage. The main findings are claimed to be that neat environments promote giving to charity and healthy eating behavior whereas messy environments promote creativity.

While I was reading the article, many questions arose. Given their obviousness, I’m surprised that these questions did not occur to the researchers who wrote the paper, the reviewers who commented on the manuscript, the editor who accepted the manuscript for publication, and the journalists who wrote breathless news stories about it. So in the rest of this post I’m just going to list these questions. I will not focus on theoretical aspects of the study (or the lack thereof), which would have made the list even longer.

My questions follow the structure of the paper.

Experiment 1

Thirty-four Dutch students participated. They were randomly assigned to an orderly or a disorderly condition.

(1) Isn’t 34 a small N for a between-subjects design with a subtle manipulation?
(2) At what university were these students? The authors are at the University of Minnesota. (I learned via Twitter that the subjects were likely run at Radboud University in Nijmegen, a university that the authors are not affiliated with.)
(3) How many male and female students were in the sample?
(4) Is there any additional information on the subjects that might be relevant?
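
To put a rough number on question (1), here is a back-of-the-envelope power calculation. It is a sketch under assumptions that are mine, not the authors': two equal groups of 17, a two-sided test at α = .05, and a normal approximation to the t test (which slightly overstates power at this sample size). The function name and the effect sizes, Cohen's conventional benchmarks of 0.2, 0.5, and 0.8, are illustrative only.

```python
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test for effect size d.

    Normal approximation; ignores the negligible chance of a significant
    result in the wrong direction.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    noncentrality = d * (n_per_group / 2) ** 0.5
    return z.cdf(noncentrality - z_crit)


# Assumption: 34 participants split evenly over the two rooms.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: power is roughly {approx_power(d, 17):.2f}")
```

Under these assumptions, only an effect near the large end of the conventional range would be detected more often than not.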

We manipulated environmental orderliness by having participants complete the study in an orderly or disorderly room (Fig. 1).

(5) Doesn’t the “orderly room” look like a testing room and the “disorderly room” like someone’s office? Is, in other words, orderliness the only thing that is varied between conditions or are there one or more confounds? 

Participants wrote the amount, if any, they chose to donate on a sheet of paper, which they placed into a sealed envelope (so that self-presentation concerns would be dispelled).

(6) Did the subjects actually donate the money? If so, how was this accomplished?

Upon exiting, participants were allowed to take an apple or chocolate bar, which constituted the measure of healthy food choice.

(7) What was the motivation for using these particular snacks?
(8) Weren’t the authors worried that some people may never eat chocolate whereas others never eat apples?
(9) Did the authors have independent information on the subjects’ snack preferences?

Participants who completed the study in the orderly room donated more than twice as much as those who completed the study in the disorderly room (M = €3.19, SD = 3.01, vs. M = €1.29, SD = 1.76), t(32) = 2.24, p = .03, d = 0.73. Fully 82% of participants in the orderly room donated some money, versus 47% in the disorderly room, χ²(1, N = 34) = 4.64, p < .04, ϕ = .37.

(10) Didn’t the authors/reviewers/editor find this a surprisingly strong effect for such a small sample and such a subtle manipulation?
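
It helps to translate the reported percentages back into head counts. The sketch below assumes 17 participants per room (the paper excerpt gives only the total of 34), in which case 82% and 47% correspond to about 14 and 8 donors. The function name is mine; the formula is the standard Pearson chi-square for a 2 × 2 table without continuity correction.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Assumption: 17 participants per room (34 in total, split evenly).
# 82% of 17 is about 14 donors; 47% of 17 is about 8 donors.
orderly_donors, orderly_non = 14, 3
disorderly_donors, disorderly_non = 8, 9

chi2 = chi_square_2x2(orderly_donors, orderly_non, disorderly_donors, disorderly_non)
phi = math.sqrt(chi2 / 34)
print(round(chi2, 2), round(phi, 2))  # prints 4.64 0.37
```

If the 17/17 split is right, the reported χ² and ϕ are reproduced, and the difference in donation rates comes down to six people.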

Also as predicted, participants in the orderly room chose the apple (over the chocolate) more often than those in the disorderly room (M = 67% vs. M = 20%), χ²(1, N = 30) = 6.65, p < .05, ϕ = .44.

(11) See previous question.
(12) How can the authors be sure that out of 30 subjects randomly assigned to two conditions the numbers who normally prefer apples over chocolate were about equal before the manipulation? Weren’t they worried that an over-representation of apple lovers in the disorderly room would destroy their hypothesized effect? If not, why were they unconcerned about this?
(13) What did the subjects do with the snacks? Eat them? Give them away? Dump them in the trash?
(14) Does it count as a healthy choice if someone selects an apple but then doesn’t eat it?
(15) How were the snacks presented? Was the chocolate in a wrapper? And how about the apple?
(16) What would have happened if more subjects in the orderly room had selected the chocolate? Would the authors have post-hoc hypothesized that some compensatory mechanism was at work? (Chocolate counteracts the effects of being in sterile environments.)
(17) Was no one concerned that the donation task might influence the snack-selection task?
(18) Was no one concerned about demand effects?
(19) Were these two tasks the only tasks that were performed?
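
Question (12) can be made concrete. The sketch below assumes, purely for illustration, that 15 of the 30 subjects would pick the apple no matter which room they sat in, and that each room held 15 subjects. It then asks how often random assignment puts 10 or more of those apple lovers in the same room. The function name and all the numbers are hypothetical; nothing in the paper tells us the subjects' actual snack preferences.

```python
from math import comb


def split_probability(k, lovers=15, total=30, room_size=15):
    """P(exactly k apple lovers land in one given room) under random assignment."""
    return comb(lovers, k) * comb(total - lovers, room_size - k) / comb(total, room_size)


# Hypothetical: 15 of the 30 subjects prefer apples regardless of the room.
# An even split would put 7 or 8 of them in each room. If one given room
# holds k of them, the other holds 15 - k, so "k <= 5 or k >= 10" covers
# every case in which either room ends up with 10 or more.
lopsided = sum(split_probability(k) for k in range(16) if k <= 5 or k >= 10)
print(f"P(one room gets 10 or more of the 15 apple lovers) = {lopsided:.3f}")
```

The point is not the exact figure but that, with groups this small, chance alone can produce a noticeable imbalance in baseline preferences.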

Experiment 2

Given that orderliness is paired with valuing convention, a disorderly state should encourage breaking with convention, which is needed to be creative (Simonton, 1999). Therefore, we predicted that being in a disorderly environment would have the desirable effect of stimulating creativity.

(20) Did the reviewers/editor consider this a convincing rationale for the prediction?

Forty-eight American students participated in a two-condition (orderly vs. disorderly environment) design.

(21) What was these students’ affiliation?
(22) How many males vs. females were in the experiment?

Participants completed tasks in a room arranged to be either orderly or disorderly (Fig. 2).

(23) Are the authors/reviewers/editor/journalists serious? Is this really the same manipulation of orderliness as in Experiment 1? The room looks orderly all right in the picture on the left, but on the right it looks as if some errant groundskeeper had just wandered in with a leaf blower on at full blast.
(24) Do the authors/reviewers/editor seriously believe that orderliness is the only dimension along which the two rooms differ? Are there no confounds?
(25) What did the subjects say upon entering the disorderly room? Did they perchance say, “Is this a practical joke?” In other words, did they take the experiment seriously? At least as seriously as those in the orderly room?

Two coders, blind to condition, rated each idea on a 3-point scale (1 = not at all creative, 3 = very creative; κ = .81, p < .01); disagreements were resolved through discussion.

(26) What were the criteria that were used by the raters? What is an example of a “very creative” idea?

Results (all effects were significant and effect sizes were large)

(27) Was nobody surprised about this? Not the authors, not the reviewers, and not the editor?


It could be that our disorderly laboratory violated participants’ expectations

(28) Was this sentence included for comical effect? If so, it worked.

Our preferred explanation, though, is that cues of disorder can produce creativity because they inspire breaking free of convention

(29) Did the reviewers/editor consider this a satisfactory explanation? I mean, when it comes to ice cream flavors, I prefer pistachio to strawberry. Of course, the ice cream vendor doesn’t demand an explanation; he’s just as happy to sell me the pistachio as he is to sell me the strawberry. We’re talking not about ice cream flavors here, though, but about science, so shouldn't people be held to a higher standard than merely stating their preferences?
(30) Didn’t anyone find it ironic that the alternative explanation is supported with a reference whereas the preferred one is not?

Experiment 3

We measured preference for a new versus a classic option. Participants completed a task that ostensibly would help local restaurateurs create new menus. One of the options was labeled differently in the two conditions. That option was framed as either classic, or new, an unexplored option (Eidelman et al., 2009). We predicted that participants would choose the option framed as classic more when seated in an orderly (vs. disorderly) room, and, conversely, that they would choose the option framed as new more when seated in a disorderly (vs. orderly) room.

(31) Many questions could be asked at this point. I’ll just ask one: WTF? On the positive side, the sequence of experiments does bring back fond memories of Lazy Susan.

One hundred eighty-eight American adults participated in a 2 (environmental orderliness: orderly vs. disorderly) × 2 (label: classic vs. new) between-subjects design.

(32) Who were these mysterious “American adults”? I assume they were not students, unless the authors got tired of typing “students.”
(33) How were they recruited?
(34) What was their age range?
(35) How many of them were male vs. female?
(36) Where were they tested? Were the rooms on a college campus?
(37) How were they compensated?

We manipulated environmental orderliness by randomly assigning participants to complete the study in a room arranged to be orderly or disorderly (Fig. 3)

(38) What did the subjects say when they stepped into the rooms on the right? Did they say, “If you want me to participate in your experiment, can you first please clean up the mess, or do you want me to hopscotch to my seat?”
(39) Do the authors/reviewers/editor/journalists really believe that orderliness was the only dimension on which these rooms varied? The “disorderly” rooms look very staged for example. And the “orderly” rooms look like the “disorderly” room of Experiment 1.
(40) Was there an effect of room? For example, one orderly room has a boombox whereas the other does not. One disorderly room has a book weirdly placed behind the monitor whereas the other one has pencils strewn all over the floor. 

Participants imagined that they were getting a fruit smoothie with a “boost” (i.e., additional ingredients). Three types of boosts were available: health, wellness, or vitamin.

(41) Didn’t the subjects have a problem performing this task? Did anyone care to ask? Maybe I’m a particularly unimaginative guy but I don’t think I could do a good job imagining a "fruit smoothie with a health boost." And how is that different from one with a “wellness” or “vitamin” boost anyway?

We varied the framing of the health-boost option so that it cued the concept of convention or novelty (Fig. 4). To cue novelty, we added a star with the word new superimposed. To cue convention, we added a star with the word classic superimposed. The dependent measure was choice of the health-boost option.

(42) Were the authors confident that this manipulation of room and the labels “classic” vs. “new” would yield a crossover interaction? I guess they were but I wonder if anyone else would be, besides the reviewers and editor of course.

Planned contrasts supported our predictions (Fig. 5).

(43) No kidding. Evidently, the authors’ ability to create messy rooms is matched only by their ability to obtain perfect crossover interactions. Did the reviewers/editor not think that this interaction is, indeed, very very pretty?
(44) Was the pretest conducted in an orderly room? If so, it shows that there is no preference for label in an orderly room. Doesn't this contradict the main experiment, where a 35% vs. 17% preference was found for the classic label?

General discussion

Orderly environments promote convention and healthy choices…

(45) Is it a healthy choice if someone selects an apple and then doesn’t eat it? I guess you could call it that but it would be meaningless unless you're interested in demand effects. Have the authors/reviewers/editor considered demand effects in any of these experiments?

Our systematic investigations revealed that both kinds of settings can enable people to harness the power of these environments to achieve their goals.

(46) Did the reviewers/editor not think the authors grossly overstated their results here?
(47) Did no one chuckle when reading about the power of these environments in connection with the messy rooms?

One such person was Einstein, who is widely reported to have observed, “If a cluttered desk is a sign of a cluttered mind, of what, then, is an empty desk a sign?” (e.g.,

(48) Was it too much trouble to locate the source of this quote?

Author contributions

Data collection and analyses were overseen by all authors.

(49) How did this work if the data were collected at a university that none of the authors are affiliated with? I’m sure it can be done, but it would be important to know. And what does “overseen” mean here?
(50) And finally, does reading this article prime any thoughts of Stapel and Smeesters?

I’m sure that there are a lot more questions that could be asked about this research. My point is that they should have been asked and answered by all concerned before the research was published and before big claims about it were made in the media.

I hope no one will mind if I keep my office reasonably orderly. I’m sure the lady who cleans my office won’t appreciate me ransacking the place just so I can be more creative. And I don't think this study has convinced me that it would matter anyway. In fact, I find my daughter’s hypothesis far more compelling—and she didn’t even need imaginary smoothies, .8 effect sizes, and perfect crossover interactions to convince me: Too much clutter is distracting. Papers on messiness are a case in point.