Mechanistic, repetitive, and unreflective hypothesis testing
Mind your Ps and Qs – use your IQ…
Statistics has been called “the grammar of science” (Cumming, 2012), and inferential reasoning lies at the very heart of the conclusions drawn from scientific research. Currently, Fisherian null hypothesis significance testing (hereafter NHST) is the dominant inferential method in many scientific disciplines (Fisher himself was a geneticist). Unfortunately, it is a robust empirical finding that the underlying Aristotelian syllogistic logic of NHST is widely misunderstood, not just by students but also by their teachers (e.g., Haller & Krauss, 2002), by professional academic researchers (e.g., Rozeboom, 1960), and even by professional statisticians (e.g., Lecoutre et al., 2003). That is, unsound logical reasoning and mistaken beliefs about NHST are ubiquitous in the scientific community. Peer-reviewed scientific publications, textbooks, lecturers, and high-ranking professionals perpetuate these misinterpretations (they hand down mutated memes). To break this vicious circle, researchers should (1) acknowledge the problem, (2) understand the logical pitfalls, and (3) learn about alternative inferential techniques (e.g., Bayesian probability theory).
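One of the logical pitfalls is treating the probability of the data given the null hypothesis as if it were the probability of the null hypothesis given the data. The following is a minimal sketch of that distinction using Bayes' theorem. The function name and all input values (prior probability of H0, significance level, power) are illustrative assumptions, not figures taken from any of the cited studies, and "significant" is simplified to the event that the test rejects at level alpha.

```python
def prob_null_given_significant(prior_null: float, alpha: float, power: float) -> float:
    """Return P(H0 | significant result) via Bayes' theorem.

    prior_null: assumed prior probability that H0 is true (hypothetical input)
    alpha:      P(significant | H0 true), the significance level
    power:      P(significant | H0 false), the power of the test
    """
    p_sig_and_null = alpha * prior_null           # false positives
    p_sig_and_alt = power * (1.0 - prior_null)    # true positives
    return p_sig_and_null / (p_sig_and_null + p_sig_and_alt)


if __name__ == "__main__":
    # Same alpha and power, two different assumed priors for H0.
    for prior in (0.5, 0.9):
        posterior = prob_null_given_significant(prior_null=prior, alpha=0.05, power=0.5)
        print(f"P(H0) = {prior:.1f} -> P(H0 | significant) = {posterior:.3f}")
```

Under these assumed inputs the posterior probability of H0 after a significant result is about 0.091 when the prior is 0.5 and about 0.474 when the prior is 0.9. In both cases it differs from the .05 significance level, which illustrates why a significant result cannot be read directly as the probability that H0 is true.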
Further References
Gigerenzer, G. (2004). Mindless statistics. Journal of Socio-Economics
Haller, H., & Krauss, S. (2002). Misinterpretations of Significance: A Problem Students Share with Their Teachers? Methods of Psychological Research
“The use of significance tests in science has been debated from the invention of these tests until the present time. Apart from theoretical critiques on their appropriateness for evaluating scientific hypotheses, significance tests also receive criticism for inviting misinterpretations. We presented six common misinterpretations to psychologists who work in German universities and found out that they are still surprisingly widespread – even among instructors who teach statistics to psychology students. Although these misinterpretations are well documented among students, until now there has been little research on pedagogical methods to remove them. Rather, they are considered ‘hard facts’ that are impervious to correction. We discuss the roots of these misinterpretations and propose a pedagogical concept to teach significance tests, which involves explaining the meaning of statistical significance in an appropriate way.”
Cohen, J. (1994). The earth is round (p < .05). American Psychologist
“After 4 decades of severe criticism, the ritual of null hypothesis significance testing (mechanical dichotomous decisions around a sacred .05 criterion) still persists. This article reviews the problems with this practice, including near universal misinterpretation of p as the probability that H₀ is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H₀ one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, an emphasis on estimating effect sizes using confidence intervals, and the informed use of available statistical methods are suggested. For generalization, psychologists must finally rely, as has been done in all the older sciences, on replication.”
Cohen, J. (1995). The earth is round (p < .05): Rejoinder. American Psychologist
“Replies to the comments of G. L. Baril and J. T. Cannon, K. O. McGraw, S. Parker, and R. W. Frick (see PA, Vol 83:13436; 13468; 13472; and 13448, respectively) on J. Cohen’s article discussing the problems and misinterpretations in null hypothesis significance testing.”