Thursday, July 25, 2013

The Gains and Pains of Joint Authorship

There are many differences between the scholarly cultures of the humanities and the natural sciences. One seemingly superficial but striking difference is in the number of authors per paper. A significant number of papers in the humanities are single-authored whereas in the natural sciences multi-authored papers are the norm; for some articles, the author list is longer than the article itself. For example, this paper on an experiment performed at the CERN Large Hadron Collider sports an army of 2,926 authors, enough to fill two concert halls.

In contrast to the natural sciences, publications in the humanities are typically single-authored; examples are essays in philosophy, linguistics, and literary criticism. Such scholarly endeavors are by nature individualistic. The author’s style of writing and argumentation plays an important role. References to these essays are therefore often accompanied by quotes rather than by a dry summary of findings. It is apparently not only important what the author said but also how he or she said it.

In the humanities, the author can be held responsible for the entire content of a paper; “all remaining errors are my own” is a common expression in the acknowledgements of such scholarly contributions. In the natural sciences, complementary types of expertise are essential to carry out a project. (I haven’t tried it yet but I’m pretty sure you cannot single-handedly conduct an experiment in a particle accelerator.) As a result, among the thousands of authors there probably isn't a single one who oversees the entire paper.

So what is the lay of the land in the social sciences? In psychology—the field I am focusing on—multi-authored papers have become the norm. Often co-authorships are student-mentor partnerships but especially with the advent of neuroimaging techniques, the complementary-expertise model of the natural sciences has become common.

Multi-authored papers raise all kinds of issues regarding credit. How much credit should go to the first author relative to the other authors and what is the status of the last author? Various journals, including Psychological Science as of this year, are now requiring authors to specify their respective contributions. Who designed the experiment? Who analyzed the data? Who wrote the paper? Who went along for the ride? And so on. This is a good idea. Moreover, it is not only a good idea for assigning credit but also for assigning responsibility.

To what extent should a co-author be held accountable for the entirety of a scientific article? What I would call a shared-responsibility view holds that by signing on as co-author, a researcher is responsible for the entire paper. The rationale for this assumption is that if you want the credit then you should also accept the responsibility. And if there is blame to throw around, it should fall on everyone. “If you burn your behind, you’re going to have to sit on the blisters,” as the Dutch expression goes.

Another view assumes divided responsibility. Authors are only responsible for the part that is covered by their domain of expertise. The rationale for this view is that you cannot hold people responsible for things they have no control over. This would seem obvious in the physics example I just gave. Forgive my profound lack of knowledge on the topic, but I would assume that the guy who cranks up the particle accelerator and the guy who touches up the images in Photoshop have no overlapping expertise, so it would seem unfair to rake the former over the coals if the images are artistically subpar.

Which view should we adopt in psychology, shared or divided responsibility? The case of Barbara Fredrickson, which I described in my previous post, provides a poignant illustration of the issue. Fredrickson had co-authored a paper with Marcial Losada in which they presented, among other things, a mathematical model of emotional dynamics based on fluid dynamics. A recent paper convincingly and eloquently showed this model to be a mathematical shambles.

In a response to this critique, Fredrickson radically disowned the model. She argued that the modeling was entirely Losada’s work and that, all things considered, it was not even relevant to the rest of the research, so that it could be safely expunged from the record.

Many people find Fredrickson’s response inadequate (see for example the comments on this Neuroskeptic post) and it is easy to see why. On multiple occasions Fredrickson has embraced the model and touted its virtues, for example in a popular book and in this talk (starting at 12:35). Her own website until very recently displayed the butterfly image produced by the mathematical model. The image is gone now, but at the bottom of this post is a screen shot.

By washing her hands of the model now, Fredrickson has shifted from a shared model of credit to a divided model of responsibility. All gain, no pain, in other words.

There may not be a good solution to the problem of assigning credit and responsibility, but “all gain, no pain” doesn’t seem to be the right model. It might be a start to require authors to indicate not only which components of the work they want to receive credit for but also which ones they want to be held responsible for. And wouldn’t we want to have a perfect match between credit and responsibility?

Monday, July 22, 2013

The Fanciful Number 2.9013, Plus or Minus Nothing

Can we capture any aspect of human psychology in a single number?

In 1956 a paper entitled “The Magical Number Seven, Plus or Minus Two” saw the light of day. It was destined to become a classic in cognitive psychology. In it, George A. Miller argued that human short-term memory is limited to seven units of information. The 7 was just an approximation, which is why Miller added the cautious plus-or-minus-two.

In 1993 my former colleague Anders Ericsson wrote a much-cited paper in which he argued that minimally 10 years of deliberate practice are required to achieve expertise in a domain, an idea that was popularized by Malcolm Gladwell.

It is rare to tie psychological performance to a number, but Miller’s proposal is modest, as is Ericsson’s. The numbers are not claimed to be exact and are derived in a straightforward way from the data.

There is another number in psychology that makes a lot less sense: the number 2.9013. It represents the ratio between positive emotions and negative emotions that an individual or a group expresses within a given period of time. How did this number get to be so exact, with no fewer than four decimal places? I mean, it’s not 3 plus or minus 2; it’s exactly 2.9013! The number is based on a set of differential equations used to model fluid dynamics.

According to a paper by Fredrickson and Losada (2005) there is a tipping point in the ratio between positive and negative emotions. Below the tipping point (e.g., if your ratio is 2.8593) you are languishing and above the tipping point you are flourishing (up to 11.6346, after which things take a turn for the worse again). 2.9013 has become known as the positivity ratio.

There is only one problem. The math behind 2.9013 and 11.6346 has recently been demonstrated by Brown and colleagues to be entirely fanciful. To make matters worse, the reasoning that the math gave rise to was also flawed.

Fredrickson & Losada's (2005) Butterfly Plot
The paper by Fredrickson and Losada has been cited 970 times on Google Scholar (as of July 19, 2013), which is a very high number for a psychology article. Hindsight is 20/20, and I was not aware of the positivity ratio before I read the Brown et al. article, so we must take what I’m saying here with a grain of salt, but I found it shocking that so many researchers had apparently forgotten to put on their critical thinking caps when they cited this article. Why did they fall for the positivity ratio? (I don't mean the general idea, which seems plausible enough, but the exact ratio and its origin in fluid dynamics.) Here are some reasons why this might have happened.

(1) Physics envy. At heart, all social scientists want to be natural scientists and so if natural scientists use differential equations to do cool stuff, they want to use them too.

(2) A desire for simplicity. How nice when everything and everyone can be reduced to a single number. We all have our BMIs and IQs; if we are baseball hitters, our RBIs and OBPs; if we are chess players, our Elo ratings; if we are researchers, our h-indexes; if we are earthquakes, our Richter scale scores; and if we are scientific journals, our impact factors. So why can’t we have a positivity ratio as well?

(3) A desire to see order in nature. The universe has yielded to scientific inquiry. Just as the elements let themselves be herded beautifully into the Periodic Table, so human emotional well-being can be cordoned off between two numerical boundaries.

(4) The desire to see all things connected. The positivity ratio was derived from a model of fluid dynamics. Apparently, the same set of equations that can be used to describe convective flow in fluids can be used to derive the upper and lower limits of emotional well-being. This gives a sense of profundity: the researchers have really hit on something fundamental to human existence here. It appears that Losada was so taken with the correspondence between fluid dynamics and emotional dynamics that he took his own metaphor literally, which resulted in a bizarre line of reasoning. He used a parameter in his model that expresses the ratio between buoyancy and viscosity in fluids. In the performance of the teams of workers he had observed, he noticed an interaction between what he described (entirely metaphorically of course) as buoyancy and viscosity. High-performance teams operated in a buoyant atmosphere whereas low-performance teams could be characterized as being stuck in a viscous atmosphere highly resistant to flow. 

In her response to the Brown et al. article, Fredrickson swiftly relinquishes the tipping point and distances herself from Losada’s mathematical modeling. She notes that Losada did not want to respond to the Brown et al. article. This left her to defend the questionable math for which she was not responsible, so it is perfectly understandable that her positivity ratio dipped significantly below 2.9013 and that she threw her erstwhile co-author under the bus.

Fredrickson argues that there are three components to her 2005 paper with Losada: (1) theory, (2) empirical evidence, and (3) mathematical modeling. Without the mathematical modeling, the first two remain. Fredrickson is right that it is common for psychology to only have theory and empirical data. But one has to wonder whether the work would have had the impact it has had without the mathematical model. One also has to wonder what the relevance of the model to the theory is if it can be jettisoned so easily and why it was included in the first place.

So what lessons can we draw from this latest kerfuffle in social psychology? We should be cautious whenever someone claims a psychological phenomenon can be captured in a number, especially when it has a lot of decimals. And we should keep our physics envy in check. We should also be wary of “deep truths” and of our inclination to see connections everywhere.

See also my next post on this topic.

Monday, July 15, 2013

Let the Sabers Rattle: A Cross-cultural Comparison of Doctoral Defenses

Along with getting tenure, the most important rite of passage in an academic career is the defense of the doctoral dissertation.  Across the globe, defenses range from academic inquisitions to folkloristic public events.

Dissertation defenses in the Netherlands fall on the folkloristic end of the spectrum. This is because the defense and conferral of the degree are rolled into one event. The candidate’s family, friends, and department colleagues are in attendance and the eight or so committee members, including the candidate’s major professor, are wearing full academic regalia.

Defenses in the United Kingdom are a decidedly more austere affair involving only the candidate and two external examiners; not even the major professor is allowed to be present. The examination I participated in took place in a small room in the bowels of a building. After the (successful) defense, the candidate’s major professor and two lab mates suddenly appeared in the hallway with champagne glasses and the candidate was toasted. I’ve been to pet funerals that were more festive.

Defenses in the United States and Canada are less grim than those in the U.K. The committee is made up of familiar (and usually friendly) faces from the department and the major professor is present. But they also lack the pomp and circumstance of Dutch defenses. This is because the defense is separated in time and space from the conferral of the degree.

Some countries have their unique traditions. Almost 10 years ago, I was asked to serve as an opponent in a Finnish dissertation defense. I agreed, assuming that a Finnish defense was going to be similar to a folkloristic Dutch defense rather than an American one. (After all, I was repeatedly asked to bring my cap and gown.) I carefully read the dissertation and prepared two questions.

The day before my departure, I was talking to my much-respected senior colleague Al Lang. I told him about my upcoming Finnish adventure, telling him that I would be one of several opponents, assuming he was not familiar with European tradition. “No, you’re it,” Al said. “What do you mean, you’re it?” I asked. It turned out that Al had been an opponent at a different Finnish university some years before. He told me that the defense was indeed public but that I was the sole opponent and that I was supposed to question the candidate in public for at least an hour and up to however long it took. I quickly ran to my office and generated a whole bunch more questions.

Finnish defenses are folkloristic all right, but in their own special way, as I soon found out. Right before the defense, the candidate, her promotor, and I gathered in the promotor’s office. He opened an expensive-looking box, which contained three glasses. He poured each of us a single malt whisky. We toasted, downed our scotch, and then walked, thus fortified, to the room where the defense was to take place.

To this day, I’m not completely sure whether the scotch is an integral part of the Finnish protocol or an idiosyncrasy on the part of the promotor (after all, his bookshelves were lined with empty cartons of Laphroaig, Macallan, Talisker, Oban, Aberlour, and Lagavulin), but I kind of hope it is the former; the box looked official enough.

There was one other unexpected twist. Whereas I was wearing my cap and gown (not exactly my most manly outfit), the promotor was wearing a saber. I felt jealous that I wasn’t issued a cool weapon as well: if not a saber, then perhaps a crossbow. Or maybe a mace. As it was, I felt seriously outgunned.

The defense went fine, although I felt the pressure to be both critical of the student and at least somewhat entertaining to the audience for over an hour.

Later that day, there was a dinner at which I was to be the guest of honor. Great care was taken to make sure I was the last one to arrive. When I entered the dining room, everyone stood up. I was led to my table. The dinner was a buffet and I was wondering why nobody was getting food. Then I noticed that people were looking at me. I was supposed to get up first. There were speeches, which were all held in English for my benefit, as I was the only one there who did not speak Finnish. I found this very touching and humbling, although I also felt ill at ease in my role of foreign dignitary.

After the dinner, a band was making preparations and all of a sudden, the candidate said to me, “Now we have to dance.” If she’d said “Now we’re going to cut your left pinky off with the saber,” I would have been equally pleased. But I was a good sport and danced, as best as was possible for a foreign dignitary. Luckily, the dance floor quickly filled up with other people so I could escape to the bar.

Much alcohol was consumed and the party had moved outdoors. The candidate received presents. I probably was not the only one who was surprised to see that one of her presents was a chainsaw. What with the saber, I could not be completely sure but I still thought it unlikely that it was a Finnish tradition to gift candidates with big power tools after a successful defense.

My confusion was resolved. Apparently, the candidate had inherited a vacation home and needed to clear some trees. She then made preparations to cut a copy of her dissertation, which was lying in the grass. Even after several drinks, I could see that this was a recipe for disaster. I was relieved to see that the plan was 86ed before any damage was done.

Back to our cross-cultural comparisons of doctoral defense traditions. I appreciate the public nature of the Finnish and Dutch defenses. Friends, relatives, and others can get an idea of what the candidate has been doing with their tax money for all these years. I am always moved by the looks of pride in the eyes of the candidate’s parents and grandparents. And I thoroughly enjoy the banter and drinks with colleagues at the reception after a Dutch defense. But I still wonder if this is the best way to organize a doctoral defense.

I think the British, usually not averse to pomp and circumstance, have it right. A defense should be austere and challenging. There is only one thing missing of course. External examiners must at all times be outfitted with a saber.