The p-value measures the strength of the statistical evidence against the null hypothesis in a hypothesis test: it is the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one actually observed. When the p-value is less than 0.05, the result is conventionally labeled statistically significant, and the null hypothesis is rejected in favor of the alternative hypothesis.
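To make the decision rule concrete, here is a minimal sketch of a two-sided one-sample z-test in Python, using only the standard library. The function name, the test statistic formula, and the illustrative numbers (a sample mean of 103 against a hypothesized mean of 100, with a known population standard deviation of 15 and n = 100) are our own example values, not taken from any particular study.

```python
import math

def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Two-sided one-sample z-test; returns (z, p_value).

    Assumes the population standard deviation `sigma` is known,
    which is what distinguishes a z-test from a t-test."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Standard normal CDF evaluated at |z|, via the error function.
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    # Two-sided p-value: probability of a result at least this extreme
    # in either direction, under the null hypothesis.
    p_value = 2 * (1 - phi)
    return z, p_value

z, p = one_sample_z_test(sample_mean=103.0, mu0=100.0, sigma=15.0, n=100)
print(f"z = {z:.3f}, p = {p:.4f}")  # z = 2.000, p = 0.0455
print("reject H0 at the 5% level" if p < 0.05 else "fail to reject H0")
```

Here p ≈ 0.0455 falls just under 0.05, so the conventional rule rejects the null hypothesis; a slightly smaller effect would not be rejected, which illustrates how arbitrary a sharp cutoff can be.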
The tradition of using a p-value of less than 0.05 (or a 5% significance level) as the threshold for statistical significance dates back to the early 20th century and is attributed to the statistician Ronald Fisher, who suggested the 5% level as a convenient cutoff for judging whether a deviation from the null hypothesis deserved attention. (The formal framework for trading off false positives, Type I errors, against false negatives, Type II errors, was developed later by Neyman and Pearson.) It is important to note, however, that Fisher did not intend the 5% level as a hard-and-fast rule, but rather as a useful guideline.
The selection of 0.05 as a significance level has since become deeply ingrained in many fields of science and research, often due to tradition, convenience, and the desire for a common standard that allows results to be compared across studies. It has also been codified in the teaching of statistics, further reinforcing its usage.
However, there has been ongoing debate within the scientific community about the reliance on this threshold. Critics argue that it can lead to a binary view of results (significant or not significant) rather than a nuanced interpretation of the evidence. Some also argue that it contributes to issues like p-hacking, where researchers manipulate their analysis to achieve a p-value below the threshold.
In recent years, there has been a push towards a more thoughtful interpretation of p-values and statistical significance, considering the context of the study, the size of the effect, and other evidence rather than simply whether p is less than 0.05. In some cases, researchers might choose a stricter threshold (like 0.01) or a more lenient one (like 0.10) based on the specifics of their study and field.
BridgeText can help you with all of your statistical analysis needs.