
Mechanistic, repetitive, and unreflective hypothesis testing

Mind your Ps and Qs – use your IQ…

Statistics has been called “the grammar of science” (Cumming, 2012), and inferential reasoning lies at the very heart of the conclusions drawn from scientific research. Currently, Fisherian null hypothesis significance testing (hereafter NHST) is the dominant inferential method in many scientific disciplines (Fisher himself was a geneticist). Unfortunately, it is a robust empirical finding that the underlying Aristotelian syllogistic logic of NHST is widely misunderstood, not just by students but also by their teachers (e.g., Haller & Krauss, 2002), by professional academic researchers (e.g., Rozeboom, 1960), and even by professional statisticians (e.g., Lecoutre et al., 2003). That is, unsound logical reasoning and mistaken beliefs about NHST are ubiquitous in the scientific community. Peer-reviewed publications, textbooks, lecturers, and high-ranking professionals perpetuate these misinterpretations (they hand down mutated memes). To break this vicious circle, researchers should (1) acknowledge the problem, (2) understand the logical pitfalls, and (3) learn about alternative inferential techniques (e.g., Bayesian probability theory).
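
One of these logical pitfalls can be made concrete with a little arithmetic. A p-value concerns the probability of the data given the null hypothesis, not the probability of the null hypothesis given the data; getting from one to the other requires Bayes' theorem and a prior. The following minimal Python sketch is illustrative only: the function name, the prior probabilities, and the assumed statistical power of 0.5 are assumptions chosen for the example, not values taken from the cited literature, and conditioning on "the test came out significant" is a simplification of conditioning on an exact p-value.

```python
def prob_null_given_significant(prior_null, alpha, power):
    """Return P(H0 is true | test is significant) via Bayes' theorem.

    prior_null: assumed prior probability that H0 is true (hypothetical input)
    alpha:      significance level, i.e. P(significant | H0 true)
    power:      assumed P(significant | H0 false) (hypothetical input)
    """
    p_sig_and_null = alpha * prior_null          # false positives
    p_sig_and_alt = power * (1.0 - prior_null)   # true positives
    return p_sig_and_null / (p_sig_and_null + p_sig_and_alt)


# Same alpha and power throughout; only the assumed prior changes.
for prior_null in (0.5, 0.8, 0.9):
    posterior = prob_null_given_significant(prior_null, alpha=0.05, power=0.5)
    print(f"P(H0) = {prior_null:.1f} -> P(H0 | significant) = {posterior:.3f}")
```

Under these assumed inputs, the probability that the null hypothesis is true after a significant result ranges from roughly 0.09 to roughly 0.47. None of these values equals .05, which is why reading p as "the probability that the null hypothesis is true" is a fallacy.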


Further References

Gigerenzer, G. (2004). Mindless statistics. Journal of Socio-Economics.

DOI: 10.1016/j.socec.2004.09.033
“Statistical rituals largely eliminate statistical thinking in the social sciences. Rituals are indispensable for identification with social groups, but they should be the subject rather than the procedure of science. What I call the ‘null ritual’ consists of three steps: (1) set up a statistical null hypothesis, but do not specify your own hypothesis nor any alternative hypothesis, (2) use the 5% significance level for rejecting the null and accepting your hypothesis, and (3) always perform this procedure. I report evidence of the resulting collective confusion and fears about sanctions on the part of students and teachers, researchers and editors, as well as textbook writers. © 2004 Elsevier Inc. All rights reserved.”
Haller, H., & Krauss, S. (2002). Misinterpretations of Significance: A Problem Students Share with Their Teachers? Methods of Psychological Research.

URL: http://www.mpr-online.de
“The use of significance tests in science has been debated from the invention of these tests until the present time. Apart from theoretical critiques on their appropriateness for evaluating scientific hypotheses, significance tests also receive criticism for inviting misinterpretations. We presented six common misinterpretations to psychologists who work in German universities and found out that they are still surprisingly widespread – even among instructors who teach statistics to psychology students. Although these misinterpretations are well documented among students, until now there has been little research on pedagogical methods to remove them. Rather, they are considered ‘hard facts’ that are impervious to correction. We discuss the roots of these misinterpretations and propose a pedagogical concept to teach significance tests, which involves explaining the meaning of statistical significance in an appropriate way.”
Cohen, J. (1994). The earth is round (p < .05). American Psychologist.

DOI: 10.1037/0003-066X.49.12.997
“After 4 decades of severe criticism, the ritual of null hypothesis significance testing (mechanical dichotomous decisions around a sacred .05 criterion) still persists. This article reviews the problems with this practice, including near universal misinterpretation of p as the probability that H₀ is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H₀ one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, an emphasis on estimating effect sizes using confidence intervals, and the informed use of available statistical methods are suggested. For generalization, psychologists must finally rely, as has been done in all the older sciences, on replication.”
Cohen, J. (1995). The earth is round (p < .05): Rejoinder. American Psychologist.

DOI: 10.1037/0003-066X.50.12.1103
“Replies to the comments of G. L. Baril and J. T. Cannon, K. O. McGraw, S. Parker, and R. W. Frick (see PA, Vol 83:13436; 13468; 13472; and 13448, respectively) on J. Cohen’s article discussing the problems and misinterpretations in null hypothesis significance testing.”