Wednesday, May 15, 2013

Fun with Flying Pigs: The Importance of Context in Language Comprehension


How do we understand the phrase flying pig? This sounds like a silly question (and later on I will show that it is) but people can get quite emotional about it. In a recent blog post, for example, Greg Hickok took Ben Bergen to task for things Ben said about a flying pig in an interview on NPR.

Flying pig you ask? Yes, it was a comment that Ben made about a flying pig that set Greg off in what someone on Twitter called an “epic rant.” An epic rant about a flying pig: what could be better than that? (In truth the rant was about more than just the pig but I hope you forgive my fascination with the pig.)

What did Ben say about the pig? Here is the transcript of his interview. Ben first sets the NPR listeners’ minds at ease: a flying pig isn't something that actually exists in the real world.  I like the no-nonsense approach here, but then Ben immediately veers into the danger zone: Yet when we read those words we see one in our mind's eye. Most people see a pig with wings above its shoulders… But some people imagine a pig with a cape, flying like Superman.

Uh-oh, epic rant alert! Ben should not have said this because now Greg is all over it: Or maybe I combine pig with my experience flying on 737s and imagine a pig sitting in coach ordering a Diet Coke.  Or should I combine pig with my baseball experiences and picture a mini pig being used as a baseball and getting smacked out to center field.  Hold it right there, sir! To have a pig drink Diet Coke is one thing but to miniaturize it and then smack it out to center field, well that’s just cruel.

What does all of this have to do with language? Thought you’d never ask. Ben argues in the interview that we understand the phrase flying pig by performing mental simulations based on our previous experiences: a flying pig has meaning to us because our brain is using things we have seen (pigs and birds) to create something we've never seen. This is one way to think about it, but you really don’t have to think of it in terms of visual representations; abstract symbols will do just fine.

Traditional cognitive theories assume that we have networks of nodes that represent concepts and their connections. Those connections specify the relation between the concepts. For example, BIRD might have CAN FLY and HAS WINGS as features. AIRPLANE would have the same features (obviously, because airplanes are not identical to birds, the two also have distinguishing features, such as HAS LEGS for birds and HAS WHEELS for airplanes). PIG has neither CAN FLY nor HAS WINGS, but when it becomes associated with FLYING, HAS WINGS might become temporarily activated and associated with PIG. In this case, there is no mental simulation that involves the visual system but we would still end up with a winged pig.
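For readers who like to see such things spelled out, here is a minimal sketch of that idea in Python. It is a toy, not anyone's published model: the feature sets, the HAS SNOUT feature, and the combine function are all my own illustrative assumptions. The point is only that a winged pig can fall out of plain symbol manipulation, with no visual system in sight.

```python
# Toy feature network. All names are illustrative assumptions,
# not taken from any published model of semantic memory.
BASE_FEATURES = {
    "BIRD": {"CAN FLY", "HAS WINGS", "HAS LEGS"},
    "AIRPLANE": {"CAN FLY", "HAS WINGS", "HAS WHEELS"},
    "PIG": {"HAS LEGS", "HAS SNOUT"},
}

# Features a modifier tends to bring along (assumed for the demo).
MODIFIER_ASSOCIATES = {
    "FLYING": {"CAN FLY", "HAS WINGS"},
}


def combine(modifier, concept):
    """Return the concept's stored features plus the features that the
    modifier activates temporarily (those the concept lacks)."""
    stored = BASE_FEATURES[concept]
    borrowed = MODIFIER_ASSOCIATES.get(modifier, set()) - stored
    return {"stored": set(stored), "temporary": borrowed}


flying_pig = combine("FLYING", "PIG")
print(sorted(flying_pig["temporary"]))  # ['CAN FLY', 'HAS WINGS']
```

Note that the stored entry for PIG is left untouched: the wings are on loan for the duration of the phrase only.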

As Ben and Greg have just shown us, there are other ways to think of a flying pig. Ben argues that these can also be represented via mental simulation. That’s possible but they can also be represented by the amodal-abstract-arbitrary symbol system that I just described (or a variant of it). So Ben and Greg are both wrong. Ben is wrong to imply that mental simulation is the only way in which multiple interpretations of FLYING PIG can be generated and Greg is wrong to view the unrestrained generation of flying-pig interpretations as a unique weakness of mental simulation, because the same criticism can be leveled at traditional models of semantic representation.

Ben and Greg both ignore an important issue. Without sufficient context, any phrase is open to multiple interpretations. We really cannot say much about the interpretation of flying pig in isolation. Suppose someone came up to you at a party and said: I saw a flying pig. I highly doubt that you would generate all the interpretations that Ben and Greg came up with. Your response would probably be more like Wow, someone must have spiked your drink, bro or, if you’re more polite, Enjoy the rest of the party, after which you would hurry to a far corner of the room pretending to have spotted an old friend.

Context serves to restrict the number of potential interpretations. Only (psycho)linguists and philosophers study language snippets in isolation. This has led to fruitless, decades-long debates about the interpretation of sentences no sane individual would ever utter, like The horse raced past the barn fell or The present king of France is bald.

So let’s create a context. Suppose there is a guy named Greg who lives in Macon, Georgia. Greg’s brother Duane is a huge Game of Thrones fan. Huge fan. So huge in fact that he has become interested in medieval warfare and has built his own trebuchet. Unfortunately for Duane though, there are no boulders on his property to hurl at targets. Resourceful guy that he is, Duane quickly realizes the solution is living right next door. His neighbor, Dickey, is a pig farmer. He buys all of Dickey’s pigs and each night (non-miniaturized) squealing pigs (talk about live ammo) are being launched at targets. One day Duane’s wife, Jessica, says to Greg: I’m gettin’ sick of them flyin’ pigs.

Small chance Greg will interpret this (either through mental simulation or through amodal symbol manipulation) to refer to a pig with a cape or a pig ordering Diet Coke on a plane.

Making the relatively uncontroversial assumption that in most contexts flying primes wings, we can predict that this priming effect is erased by the context of the story. In fact, there is a study showing exactly this. In most contexts, peanut and salted are associated, meaning that peanut primes salted. But let’s suppose that we have a story in which the peanut is a protagonist who feels emotions, such as happiness and sadness. Normally peanut would not prime sadness, but in the context of the story it should, and it should no longer prime salted. This is exactly what Mante Nieuwland and Jos van Berkum found in their cool study.

So Ben and Greg are both wrong. The promiscuity of interpretations of flying pig is not a pro (Ben) or a con (Greg) of simulation theory. It is a problem that occurs when we take language out of context and study “textoids.”

In closing, here is an appropriately titled song by Pink Floyd. It sounds a little tinny but the guitar solo is great.




Friday, May 10, 2013

Replication Done Right


When I started this blog, I was a little worried that I might soon run out of topics. So far, however, the topics have been presenting themselves. And now readers have even started to suggest topics for blog posts!

I recently received an email message from Etienne LeBel who said he’d enjoyed my Lazy Susan and Bruce Springsteen post (Lazy Susan, the gift that keeps on giving) and suggested I write a post about a recent positive replication experience of his. Specifically, he said: We really want to get the message out that these replication efforts need not be adversarial and antagonistic, and that it should be considered as a normal part of ensuring our science is self-correcting.

I thought this was a great idea. Who wouldn’t want to be the bearer of good news? So here goes.

LeBel and his co-author Lorne Campbell (a great name for a sheriff in a Western) set out to replicate Study 1 of a paper by Matthew Vess published in Psychological Science. Vess compared individuals who were more or less “anxiously attached.” (Being a non-specialist in this area, my first association with the phrase anxiously attached was that of a severed limb having been put back in place by a nervous surgeon, but I think I have a global idea of what it means now.) Vess asked both groups of subjects about their food preferences. The more anxiously attached subjects reported heightened preferences for warm foods compared to the more securely attached subjects. But this occurred only when attachment concerns were activated (i.e., by reflecting on a romantic breakup) and not in a control condition.

LeBel and Campbell (L&C) say that they are sympathetic to the theoretical integration of the study but that they wanted to assess the reproducibility of that finding, given that it was only reported in one study with 56 subjects.

How did they go about it?

(1) They made sure they had sufficient power to detect the effect. They didn’t take half-measures and quadrupled the original sample size so that they had a power of .95 to detect an effect. I recently said on Twitter that an underpowered (and failed) replication attempt is more like libel than like research, which seemed to resonate with several people interested in replication. Clearly, L&C are not guilty of libel.

(2) They contacted the original author, Matthew Vess, for details about the procedure and materials.

(3) They preregistered the studies prior to data collection.

(4) As in any good replication study, they faithfully copied the procedure and nature of the sample of the original study.

(5) They went the extra mile by asking the original author to critique their first attempt. Vess noticed some small discrepancies between L&C’s first attempt and his original experiment. These discrepancies were resolved before the second attempt.

(6) They used the exact same analytical procedures as were used in the original study.

(7) They report their findings in a concise and respectful manner.

So what did L&C find? Well, they did not replicate the original finding in either sample. We all can check this because they made all project materials, raw data, and syntax files available online.

L&C argue that their findings are difficult to reconcile with the original ones. There were no major procedural discrepancies between their replication attempts and the original study and they had sufficient power to detect an effect. And because the replication attempts were preregistered, selective reporting was not an issue.

L&C conclude: Our findings, however, do not provide empirical support for the notion that activating the attachment system of more anxious individuals increases sensitivity to temperature cues, although it is possible that this theoretical idea reflects a reproducible phenomenon under a different set of operationalizations. We therefore advise researchers to proceed with caution when exploring links between anxious attachment and temperature experiences in potentially relationship threatening contexts.

This is a nicely worded conclusion. It contains no criticism of the original study, merely suggesting that researchers in this area should tread lightly, given that the empirical foundation may not be as strong as formerly believed. As such, this study is a prime example of what I talked about in my very first post: replications should be about checking the structural integrity of the empirical foundations of the field rather than about pointing fingers.

The steps that L&C followed might serve as a blueprint for other replication studies. Clearly, researchers need to be true to the original study in terms of design, procedure, exclusion criteria, and data analysis. Clearly, they need sufficient power to detect an effect of the size reported in the original paper. Because publication bias tends to inflate published effect sizes, this means running considerably more subjects than in the original study. Preregistration seems the way to go, as it prevents a reverse file-drawer problem in which people will only report non-replications, as these might seem more informative than replications.
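To get a feel for how quickly the required sample grows as the true effect shrinks, here is a rough sketch. It rests on assumptions of mine: it treats the effect as a simple correlation, uses the textbook Fisher z approximation, and plugs in hypothetical effect sizes. It is not the power analysis L&C ran, and the values of r below are not taken from the original paper.

```python
from math import atanh, ceil
from statistics import NormalDist


def required_n(r, power=0.95, alpha=0.05):
    """Approximate sample size needed to detect a correlation r with a
    two-sided test, using the Fisher z approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


# Hypothetical effect sizes, chosen only for illustration.
for r in (0.5, 0.3, 0.2):
    print(f"r = {r}: N of about {required_n(r)}")
```

If the published effect is an overestimate, the true r is smaller than reported, and the N needed for the same power climbs steeply. That is the reason for going well beyond the original sample size.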

It is also important to consult the original author.  Obviously, original authors may not always be as helpful as Vess evidently was in this case but they should at least be consulted regarding the materials, design, procedure, and analyses of the intended replication attempt.

L&C’s paper is currently in press in Psychological Science, the journal that also published the original study by Vess.

The original findings may not have been replicated, but I’d still call this a successful replication attempt. 

Thursday, May 2, 2013

Social Priming in Theory Part 2


The discussion on social priming is still raging, with researchers being unable to replicate key original findings, replication efforts being criticized by the original researchers, replication researchers replying to the criticism, other researchers weighing in, journalists sensationalizing the controversy, and wiser heads trying to put things in perspective and calm the waters.

As in my previous post, I’m going to ignore the empirical debate and look at social priming from a more theoretical perspective. Some people (on Twitter) wondered what the point of this was. After all, if some of the key findings cannot be replicated, does it make sense to build a theory on this? My response to this criticism is that just because there might be problems with some of the experiments in this area, it doesn’t mean that the phenomenon itself does not exist. Perhaps it has not been investigated properly.

So let’s look at the essence of social priming. The basic idea—as I understand it—is that there are man-made cues in our environment that impact our thoughts and actions in non-arbitrary ways. It is easy to think of examples here. Maybe the most extreme one would be a dictatorial regime, where people are bombarded on a daily basis with a barrage of images, sounds, and words promoting the leader, the regime, national pride, and a certain ethos. Other examples in this vein are armies and religious groups, where uniformity in dress (uniforms, habits) and behavior (e.g., marching, mass prayer) serve to suppress individuality in thought and action. Of course, the Milgram and Stanford prison experiments are classic examples of this in the social psychological literature.

More general forms of social priming may have their origin in constraints imposed by geography or biology. For example, being on higher ground provides an advantage in battle. This is where the commanders are usually situated, overlooking the battlefield. Commanders, such as knights, used to be on horseback whereas foot soldiers were (you guessed it) on foot. So it is only natural perhaps that our culture has come to associate power with being “up.” Take a look at the organogram of the university or company you work at. There is a good chance that the levels that have the most power are located at the top and the ones with the least power near the bottom.

The signature examples of social priming are nothing like this, however. For example, the word bingo is not designed to make people walk slowly. Rather, it is used to refer to a boring game that happens to be played mostly by old people. Likewise, the word professor refers to an academic rank and is not designed to make people perform better on a general knowledge test.

There have been several attempts to make sense of the social priming literature. I discussed two of them in my previous post. I will focus on a third one here, an article by Chris Loersch and Keith Payne. Right off the bat, we are confronted with an unfortunate detail. They build their theory on the works of Stapel and Smeesters, who have had their articles retracted, and Dijksterhuis and Bargh, whose main findings have not been replicated in recent studies. Clearly, these studies do not make for the strongest of empirical foundations.

Still, I think it is possible to separate the theory from its shaky foundation. In fact, let’s just assume that there is no empirical foundation at all and that the theory is based on casual observations and armchair philosophizing of how social priming might work in principle.

Loersch & Payne, like other theoreticians of social priming whose work I discussed in my previous post, argue that there are different ways of priming. They provide a nice example. Suppose you are primed with words related to hostility, like anger and punch. You might exhibit the following types of priming.

(1) Semantic priming. You would more quickly recognize words like enemy and violence.
(2) Construal Priming. You would perceive another person as more hostile.
(3) Behavior Priming. You would become more hostile yourself.
(4) Goal Priming. You would become more motivated to seek out opportunities to be aggressive yourself.

The evidence is clearly strongest for semantic priming. I would even go so far as to say that semantic priming is uncontroversial in both cognitive and social psychology. The notion underlying semantic priming is that of a semantic network in which the nodes are words or concepts and the links are associations between them. If two words are closely associated, for example apple and pear, there is more priming between them than if they are less closely associated, for example apple and bread.

It is logical to think that semantic priming is more direct than the other three types of priming, which require activation to spread among many more nodes in a network. Dan Simons makes the same argument in a recent blog post. The amount of spreading activation is necessarily smaller at each cycle (each new set of nodes that is activated); it has to be this way because otherwise the entire network would be activated each time a stimulus is presented.
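Here is a toy illustration of that argument. Everything in it is made up for the occasion: the network, the node labels, the decay value, and the threshold. It only shows the logic: activation that shrinks at each cycle dies out after a few links, whereas activation that does not shrink lights up every reachable node.

```python
# Toy association network. Links, labels, and parameter values are
# assumptions made for illustration only.
LINKS = {
    "anger": ["hostile", "punch"],
    "hostile": ["enemy"],
    "punch": ["violence"],
    "enemy": ["judge others as hostile"],
    "violence": [],
    "judge others as hostile": [],
}


def spread(source, decay=0.5, threshold=0.1):
    """Breadth-first spreading activation. At each cycle a node passes on
    a fraction `decay` of its activation; spreading stops once the amount
    passed on falls below `threshold`."""
    activation = {source: 1.0}
    frontier = [source]
    while frontier:
        next_frontier = []
        for node in frontier:
            passed_on = activation[node] * decay
            if passed_on < threshold:
                continue
            for neighbor in LINKS.get(node, []):
                if neighbor not in activation:
                    activation[neighbor] = passed_on
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return activation


result = spread("anger")
for node, level in result.items():
    print(f"{node}: {level}")
```

In this sketch the direct associates of the prime receive 0.5, nodes two links away 0.25, and the node three links away 0.125. Calling spread("anger", decay=1.0) instead leaves every reachable node at full activation, which is the runaway scenario the argument rules out.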

And what I would write next is exactly what Simons wrote in his post, so I’m just going to follow his reasoning, using Loersch & Payne as a framework. First, Simons notes that the effect size for semantic priming is about r=.21. The reported effect sizes for the three other forms of priming tend to be larger. How can this be if these forms of priming are less direct? Dan Simons suggests three possibilities.

  • The chain of associations for construal, goal, and behavioral priming is more direct than for semantic priming. This would require a rethinking of the structure of representations.
  • The mechanisms guiding behaviors in goal priming are different from and more powerful than those underlying other forms of priming.
  • The social priming effects are not as large (or the semantic priming results not as small) as the published reports suggest.

I think there is a fourth possibility. The primes used in social priming research are stronger than those in semantic priming research. In a typical semantic priming experiment, a single prime is presented, followed (at some interval) by the target. This is then repeated multiple times so that averages can be computed for two or more conditions (for example semantic associates versus unrelated words) within each subject.

In many social priming studies, the sentence unscrambling method developed by Bargh is used. In this task, subjects see about 15 sets of five words, such as "disciplined", "man", "flower", "the", "was". For each set of five words, participants must form a sentence, using only four of these words, such as "the man was disciplined". Embedded within about 60% to 80% of these sets is a word that is synonymous with the goal, motivation, or value that researchers would like to evoke. All other words are unrelated to the goals. In other words, subjects receive some 9 to 12 primes for a single target and they end up using some of them in a sentence to boot.

A first useful empirical step might be to put the four types of priming on equal footing by comparing effect sizes using the sentence-unscrambling task. My prediction, and that of Simons and probably many others, would be that semantic priming produces the largest effect.

Several interesting things might happen. Suppose semantic priming is not the winner of our priming competition. This would have to lead to a theoretical revision of the structure of mental representations, as Simons notes.

But suppose it proves difficult to reliably get any other form of priming than semantic priming. This might require a reconsideration of the priming paradigm. Surely living in a dictatorship, serving in the army, or participating in a Stanford prison experiment exposes you to a lot more primes than the 9 or 12 in a sentence-unscrambling task, not to mention that those primes are going to be a lot more salient.

Once very strong (and replicable!) social priming effects have been found, it would then be possible to gradually dismantle the priming procedure to find out how much social priming is needed to find an effect. And one could go from there…