
Who’s Gonna Lay Down the Law in Psytown?

These are troubled times in our little frontier town called Psytown. The priest keeps telling us that deep down we’re all p-hackers and that we must atone for our sins.

If you go out on the streets, you face arrest by any number of unregulated police forces and vigilantes.

If you venture out with a p-value of .065, you should count yourself lucky if you run into Deputy Matt Analysis. He’s a kind man and will let you off with a warning if you promise to run a few more studies, conduct a meta-analysis, and remember never to use the phrase “approaching significance” ever again.

It could be worse.

You could be pulled over by a Bayes Trooper. “Please step out of the vehicle, sir.” You comply. “But I haven’t done anything wrong, officer, my p equals .04.” He lets out a derisive snort. “You reckon that’s doin’ nothin’ wrong? Well, let me tell you somethin’, son. Around these parts we don’t care about p. We care about Bayes factors. And yours is way below the legal limit. Your evidence is only anecdotal, so I’m gonna have to book you.”

Or you could run into the Replication Watch. “Can we see your self-replication?” “Sorry, I don’t have one on me, but I do have a p<.01.” “That’s nice, but without a self-replication we cannot allow you on the streets.” “But I have to go to work.” “Sorry, can’t do, buddy. Just sit tight while we try to replicate you.”

Or you could be at a party when suddenly two sinister people in black show up and grab you by the arms. Agents from the Federal Bureau of Pre-registration. “Sir, you need to come with us. We have no information in our system that you’ve pre-registered with us.” “But I have p<.01 and I replicated it,” you exclaim as they put you in a black van and drive off.

Is it any wonder that the citizens of Psytown stay in most of the day, fretting about their evil tendency to p-hack, obsessively stepping on the scale worried about excess significance, and standing in front of the mirror checking their p-curves?

And then when they are finally about to fall asleep, there is a loud noise. The village idiot has gotten his hands on the bullhorn again. “SHAMELESS LITTLE BULLIES” he shouts into the night. “SHAMELESS LITTLE BULLIES.”

Something needs to change in Psytown. The people need to know what’s right and what’s wrong. Maybe they need to get together to devise a system of rules. Or maybe a new sheriff needs to ride into town and lay down the law.

Comments

  1. Or maybe we need to make a stronger distinction between the scientific investigation of truth and the forensic determination of fraud - which all these metaphors are doing their best to blur.

    Replies
    1. I only had the former in mind when writing this post. None of these methods are suitable for the determination of fraud, in my view.

  2. Fantastic ;-) I am going to read this parable to my children (when they are older, maybe in their twenties ...).

    Replies
    1. Thanks! I'm thinking of turning it into a TV series. ;)

  3. Nice post, which sums up what seems (to me as an outsider, anyway) to be one of the principal problems in social science generally, namely that there is very little agreement on what any of the statistical systems actually *mean* --- as illustrated by the number of law enforcement organisations, each enforcing their own laws, some of them mutually incompatible. As a result, everyone has their own statistical system, with their own preferred interpretations. Not only is that inherently bad science, but it also creates lots of convenient cracks in which to hide QRPs.

    Andrew Gelman had a blog post about a month ago on the astonishingly basic question of whether (and if so, when) it's justifiable to use one-tailed tests. It turns out there is surprisingly little consensus. So some authors will continue to double-dip on p<.05 by doing one-tailed comparisons "because I stated a directional hypothesis", and dare the reviewers to call them out on it (a minimal sketch of that move follows at the end of this comment).

    To still be arguing over basic questions like this, 70 or more years after Fisher, Neyman, and Pearson, is ridiculous. Of course, stats need interpretation, but without some kind of standards (which, I suggest, can only be imposed by the journals), it's going to continue to be possible, indeed almost mandatory, for (A) and (not A) to be true --- not undetermined, but actually true --- simultaneously. How about the next edition of the APA Publication Manual taking a position on some of these questions, instead of finding more obsessive rules for how to punctuate references?
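
    To make that concrete, here is a minimal sketch in Python (simulated data and scipy's standard t-test; an illustration, not anyone's actual analysis). Whenever the observed difference lands in the predicted direction, the one-tailed p is exactly half the two-tailed p, which is precisely the wiggle room a post-hoc "directional hypothesis" buys you.

    # Sketch: the same data, two verdicts, depending on how the test is declared.
    # Requires a reasonably recent scipy (the `alternative` argument).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)                      # made-up example data
    control = rng.normal(loc=0.0, scale=1.0, size=30)
    treatment = rng.normal(loc=0.4, scale=1.0, size=30)

    t, p_two = stats.ttest_ind(treatment, control, alternative="two-sided")
    _, p_one = stats.ttest_ind(treatment, control, alternative="greater")

    print(f"t = {t:.2f}")
    print(f"two-tailed p = {p_two:.3f}")
    # When the difference goes in the predicted direction (t > 0),
    # the one-tailed p is half the two-tailed p:
    print(f"one-tailed p = {p_one:.3f}")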

    Replies
    1. A counterpoint against "more rules" and "more standards": truth and objectivity are only approached by invariance. (Note: "approached". Probably never reached, and if reached, we can never know whether we have reached it.)

      What does invariance mean? Gerhard Vollmer wrote an illuminating paper about it, here's the key sentence:

      "A proposition about the world is objective if and only if its meaning and its truth is invariant against a change in the conditions under which it was formulated, that is, if it is independent of its author, observer, reference system, test method, and conventions."

      That means that when different researchers, using different tests and different conventions, come to the same conclusion, it has a good chance of being closer to the truth than otherwise.
      So, if Bayes factors, p values, likelihoods, and posteriors all agree about the presence or the absence of an effect (and other conditions, such as validity, hold), *then* we can make a claim about the world.
      If they disagree, we have to take a step back and start thinking again - why do they differ? (A small sketch of such a cross-check follows after the reference below.)


      Some more excerpts (I really like the paper):

      "But when is a description of nature objective? Evidently it is desirable to have a criterion of objectivity. How about intersubjectivity? Very often people are satisfied with such a criterion or even define objectivity as intersubjectivity. [...]
      But this is not enough: When all men were convinced that the earth was a disk, this conviction was completely intersubjective, but it was wrong and it was by no means objective. [...]
      Hence, intersubjectivity is not enough. It is necessary but not sufficient. [...]
      There is, indeed, another common property. It is the independence of the structure in question of certain changes, its stability against pertinent alterations, its invariance under some specified transformations. Thus, we say: A proposition is objective if and only if its meaning and its truth is invariant against a change in the conditions under which it was formulated, that is, if it is independent of its author, observer, reference system, test method, and conventions."

      Vollmer, G. (2010). Invariance and Objectivity. Foundations of Physics, 40(9-10), 1651–1667. doi:10.1007/s10701-010-9471-x
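
      As a rough sketch of such a cross-check (my illustration, not anything from the post or from Vollmer): the Sellke-Bayarri-Berger bound says that, for p < 1/e, the Bayes factor in favor of the alternative can be at most 1 / (-e * p * ln p), whatever well-behaved prior is placed on the alternative. So one can at least ask whether the p-value verdict and the best-case Bayes-factor verdict agree:

      # Sketch: does a "significant" p value even permit strong Bayesian evidence?
      import math

      def max_bf10(p):
          """Upper bound on the Bayes factor for H1 implied by p (valid for p < 1/e)."""
          if not 0 < p < 1 / math.e:
              raise ValueError("bound only applies for 0 < p < 1/e")
          return 1.0 / (-math.e * p * math.log(p))

      for p in (0.04, 0.01, 0.001):
          bf = max_bf10(p)
          verdict = "anecdotal at best" if bf < 3 else "more than anecdotal"
          print(f"p = {p}: BF10 is at most {bf:.2f} ({verdict} on a Jeffreys-style scale)")

      For p = .04 the bound is about 2.9, so a result that passes the .05 rule can still be, at best, anecdotal evidence in Bayes-factor terms - exactly the kind of disagreement that should send us a step back to think again.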

    2. Good points! I'm reminded of an experience early in my career. One of my co-authors wanted to include one-tailed tests but the editor told us "to get rid of that one-tailed nonsense." A few years later, I encountered one-tailed tests in the editor's own papers.

  4. Ha, great allegory! Reminded me of this bit, which fits well with your portrayal of Psytown.

    "O Zarathustra, here is the great city: here have you nothing to seek and
    everything to lose."

    Thus spoke Zarathustra, p. 140

    Replies
    1. I had Clint Eastwood in mind, but nice that it makes you think of Nietzsche. ;)

  5. There are whispers about an ancient fella called Omniscient Jones, who is a master in the lost way of the Theory. He is wanted by the Correlational Intelligence Agency because his apprentices disappear without any trace of scientific output, only to emerge at least five years later, arguments blazing, shooting holes in the very fabric of reality.

  6. I think it's a bit too early to set up rules. I, for one, still haven't seen an ANOVA where the effect size (e.g., eta squared) is not only reported but also interpreted. So I'm not really sure what the effect-size troops on the ground are up to... (a minimal sketch of what that could look like follows at the end of this comment).

    Also, I think many researchers hold the misguided idea that once you get tenure you will use the same methodology until you retire. Then they are surprised when a new methodology is asked of them. Methodology and statistics, like other branches of science, evolve, and researchers should keep an eye on new developments.
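
    For what it's worth, here is a minimal sketch (invented three-group data, numpy/scipy) of what reporting and then actually interpreting eta squared could look like; the .01/.06/.14 cutoffs are the conventional Cohen-style benchmarks, offered as one possible reading rather than a rule.

    # Sketch: one-way ANOVA with eta squared reported AND interpreted.
    import numpy as np
    from scipy import stats

    g1 = np.array([5.1, 4.8, 6.0, 5.5, 4.9, 5.7])        # made-up groups
    g2 = np.array([6.2, 5.9, 6.8, 6.5, 6.1, 5.8])
    g3 = np.array([5.0, 5.3, 4.7, 5.6, 5.2, 4.9])
    groups = [g1, g2, g3]

    f_stat, p_value = stats.f_oneway(*groups)

    # Eta squared = between-group sum of squares / total sum of squares.
    allvals = np.concatenate(groups)
    grand_mean = allvals.mean()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_total = ((allvals - grand_mean) ** 2).sum()
    eta_sq = ss_between / ss_total

    label = "large" if eta_sq >= .14 else "medium" if eta_sq >= .06 else "small"
    print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
    print(f"eta squared = {eta_sq:.2f}: the grouping accounts for roughly "
          f"{eta_sq:.0%} of the variance (a {label} effect by Cohen-style benchmarks).")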

    Replies
    1. Right now the problem is that nobody seems to agree on what to do, and so whatever you do, there will always be people ready to jump on you. This is making people skittish.

  7. The problem with statistics--old approaches and new--is our tendency to regard them as probative in themselves. At best, the new bureaus and agencies of your fable promise, with their updated decision-making criteria, to free us (as "p=.05" once did) from all the bother and trouble of exercising our scientific judgement. At worst, they deny us the opportunity to do the same. (Apologies to Robert Abelson.)

  8. Perhaps one of the problems in Psytown is that there is black-market demand for novel results that are "blessed" with some type of statistical significance. With the high demand (and reward) that come with making this product, some of the folks in Psytown are not interested in implementing any quality-control procedures or in changing their products. The product still sells.

  9. That was a fun read, but I think a more appropriate analogy is that scientists are like people trying to build a house. It's a complicated process and they may not always know what they are doing. Various inspectors find faults that may seem superfluous to the homebuilder, but it is usually in everyone's best interest to comply with the regulations (even if you don't care about the electrical wiring being up to code, your neighbors care, and you might sell the house to someone else).

    Where this analogy breaks down is that I am not sure the regulations being applied make sense, and this agrees with your final paragraph. Although they are relevant, I don't see any of the proposed methods (meta-analysis, Bayes factors, replication, or pre-registration) as really solving the fundamental issues. I am also not sure what the fundamental issues are, but I think it involves theory development from statistical data.

    What this implies to me is that scientists need to be careful about their claims. I think we should stop the press releases and gushing enthusiasm about new (and old) findings until we more fully understand how to generate and interpret our data.

    Replies
    1. I completely agree with your last paragraph in particular. I also think the homebuilder metaphor is apt; I've used it in previous posts. It takes multiple metaphors to describe the target domain. In this post I was trying to convey "the angst of the experimenter on the ground," which sounds awfully pretentious of course. ;)

