by W. B. Meitei, PhD
Before exploring the differences between p-values and Bayes factors, it’s important to understand the fundamental distinction between the Frequentist and Bayesian approaches to statistics.
Frequentist vs Bayesian
Statistics is commonly framed within two main perspectives: the Frequentist approach and the Bayesian approach. While Bayesian methods often provide a more intuitive and flexible framework for inference and hypothesis comparison, both approaches have unique philosophies and methodologies. The key difference lies in how they interpret and use probability.
The Frequentist approach relies on classical tools such as p-values, significance levels, statistical power, and confidence intervals. Here, probability is used only to model the specific processes described by the sampling procedure (or by the sample), which allows the data to carry some amount of uncertainty. When adopting a Frequentist approach, there is always a concern about whether the "correct model" has been specified, or whether a null model is supported by the data.
In contrast, the Bayesian approach uses probability more broadly, modelling the sampling process as well as other related uncertainties. Here, we focus on whether the parameters and model are sensible for the particular dataset, and we report credible intervals (the Bayesian analogues of confidence intervals), which are interval estimates based on the posterior distribution. With this approach, we don't need to worry about setting up a null hypothesis; instead, we can make direct probability statements about the parameter of interest using the observed sample. By comparison, the p-value is calculated by assuming a hypothetical infinite number of samples (i.e., a sampling distribution) that we never actually observe. The Bayesian framework also naturally incorporates prior information, such as previous research findings, theory, or expert knowledge, which can add valuable context to the analysis.
Moreover, Bayesian methods often perform better with smaller sample sizes, offering robust inference where Frequentist methods might struggle. Overall, the Bayesian approach provides a more comprehensive and probabilistic understanding of both data and uncertainty.
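To make the idea of a direct probability statement concrete, here is a minimal sketch in Python. It assumes a hypothetical dataset of 7 successes in 10 Bernoulli trials and a uniform Beta(1, 1) prior on the success probability theta; neither comes from the discussion above, and the helper name is made up for illustration. Under these assumptions the posterior is Beta(8, 4), and a 95% credible interval and P(theta > 0.5 | data) can be approximated by sampling from it.

```python
import random

def posterior_summary(successes, trials, prior_a=1.0, prior_b=1.0,
                      draws=50_000, seed=0):
    """Summarise the Beta posterior of a binomial proportion by sampling.

    Illustrative helper (an assumption, not from the post): a
    Beta(prior_a, prior_b) prior combined with binomial data gives a
    Beta(prior_a + successes, prior_b + failures) posterior.
    """
    rng = random.Random(seed)
    post_a = prior_a + successes
    post_b = prior_b + (trials - successes)
    samples = sorted(rng.betavariate(post_a, post_b) for _ in range(draws))
    # Equal-tailed 95% credible interval from the sorted posterior draws.
    lower = samples[int(0.025 * draws)]
    upper = samples[int(0.975 * draws) - 1]
    # Direct probability statement about the parameter itself.
    prob_above_half = sum(s > 0.5 for s in samples) / draws
    return lower, upper, prob_above_half

lower, upper, prob_above_half = posterior_summary(successes=7, trials=10)
print(f"95% credible interval for theta: ({lower:.2f}, {upper:.2f})")
print(f"P(theta > 0.5 | data) is approximately {prob_above_half:.2f}")
```

The last quantity is a statement about the parameter given the observed sample, which is the kind of statement a p-value does not provide.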
p-value vs Bayes factor
p-value:
Fisher originally introduced the p-value in the context of a carefully designed agricultural experiment. While it serves as an intuitively useful measure of evidence against the null hypothesis, it is often misunderstood or misused. Some common errors in interpreting p-values include:
- Interpreting it as the probability that H0 is (or is not) true, whereas it only measures how extreme the observed result is under H0.
- Treating it as the probability that the observed result occurred under H0. It is rather the probability, under H0, of observing a result at least as extreme as the one obtained. This implies that it is based not only on the observed result but also on fictive (never observed) data. For example, to test the significance of regression estimates (beta coefficients), we obtain the p-value from the t-distribution, which describes hypothetical repeated samples.
- Treating it as an absolute measure. A small p-value does not necessarily mean there is a large or practically important difference between two or more characteristics of interest (variables).
- Overlooking that it does not take into account the size of the study (Royall, 1997).
It is often more informative to report a 95% confidence interval in place of the p-value, since the interval is considered to provide more insight into the obtained result.
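That a small p-value need not signal a large difference can be illustrated with a short sketch. It assumes a two-sided z-test of H0: mu = 0 with a known standard deviation, and the sample mean and sample sizes are made up for illustration. The p-value is the tail area of the sampling distribution under H0, while the confidence interval reports the estimate together with its precision.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, sigma, n, mu0=0.0, level=0.95):
    """Two-sided z-test of H0: mu = mu0 with known sigma, plus a confidence interval."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    z = (sample_mean - mu0) / se
    # Probability, under H0, of a statistic at least as extreme as the observed one.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    ci = (sample_mean - z_crit * se, sample_mean + z_crit * se)
    return p_value, ci

# Same observed effect (0.2) and sigma, but two hypothetical study sizes.
p_small, ci_small = z_test(sample_mean=0.2, sigma=1.0, n=25)
p_large, ci_large = z_test(sample_mean=0.2, sigma=1.0, n=400)
print(f"n = 25:  p = {p_small:.4f}, 95% CI = ({ci_small[0]:.3f}, {ci_small[1]:.3f})")
print(f"n = 400: p = {p_large:.6f}, 95% CI = ({ci_large[0]:.3f}, {ci_large[1]:.3f})")
```

The estimated effect is identical in both calls; only n changes. The p-value moves from clearly non-significant to very small, whereas the two intervals make it visible that the estimate itself is the same and only its precision differs.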
Bayes factor:
The Bayes factor is a key concept in Bayesian statistics, used to quantify how strongly the data support one statistical hypothesis over another. It is one of the major contributions of Jeffreys, dating from the early 20th century. The Bayes factor compares the probabilities of the observed data under two competing models or hypotheses, typically a null and an alternative (i.e., it measures the change from prior to posterior odds favouring the null hypothesis). It is the Bayesian counterpart of the likelihood ratio test. By providing a direct, interpretable measure of evidence from the data, the Bayes factor offers an alternative to traditional p-values, allowing researchers to update their beliefs in light of new information and assess which hypothesis is better supported by the evidence.
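If y represents the observed data and H0 represents the null hypothesis to be tested, then, according to Bayes' theorem,
$$P(H_0 \mid y) = \frac{p(y \mid H_0)\,P(H_0)}{p(y \mid H_0)\,P(H_0) + p(y \mid H_1)\,P(H_1)}$$
Here, H1 is the alternative hypothesis.
Similarly,
$$P(H_1 \mid y) = \frac{p(y \mid H_1)\,P(H_1)}{p(y \mid H_0)\,P(H_0) + p(y \mid H_1)\,P(H_1)}$$
Thus,
$$\frac{P(H_0 \mid y)}{P(H_1 \mid y)} = \frac{p(y \mid H_0)}{p(y \mid H_1)} \times \frac{P(H_0)}{P(H_1)}$$
The term,
$$BF = \frac{p(y \mid H_0)}{p(y \mid H_1)}$$
is known as the Bayes factor. Its value ranges from 0 to infinity.
Thus,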
posterior odds = Bayes factor × prior odds
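As a worked illustration of the relation above, the sketch below computes a Bayes factor for binomial data. The setup is an assumption made for the example: H0 fixes the success probability at theta = 0.5, while H1 places a uniform prior on theta, so that the marginal likelihood under H1 integrates to 1/(n + 1). The data (9 successes in 10 trials) and the equal prior odds are likewise hypothetical.

```python
from math import comb

def bayes_factor_01(successes, trials, theta0=0.5):
    """Bayes factor of H0: theta = theta0 against H1: theta ~ Uniform(0, 1)."""
    failures = trials - successes
    # p(y | H0): binomial likelihood at the point null.
    lik_h0 = comb(trials, successes) * theta0**successes * (1 - theta0)**failures
    # p(y | H1): binomial likelihood averaged over the uniform prior,
    # which equals 1 / (n + 1) for every possible count y.
    marg_h1 = 1 / (trials + 1)
    return lik_h0 / marg_h1

def posterior_prob_h0(bf01, prior_odds=1.0):
    """Convert a Bayes factor and prior odds P(H0)/P(H1) into P(H0 | y)."""
    posterior_odds = bf01 * prior_odds
    return posterior_odds / (1 + posterior_odds)

bf01 = bayes_factor_01(successes=9, trials=10)
print(f"BF = {bf01:.3f}, P(H0 | y) = {posterior_prob_h0(bf01):.3f}")
```

Here the Bayes factor is below 1, so the data shift the odds away from H0; with prior odds other than 1, the same Bayes factor would simply rescale a different starting point.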
Values of the Bayes factor larger than 1 are interpreted as evidence in favour of H0 (relative to H1): the larger the value, the stronger the evidence. Conversely, values less than 1 favour H1.
According to Jeffreys, the classification of the Bayes factor favouring H0 against H1 is given below,
- "decisive" if Bayes factor > 100
- "very strong" if 32 < Bayes factor ≤ 100
- "strong" if 10 < Bayes factor ≤ 32
- "substantial" if 3.2 < Bayes factor ≤ 10
- "not worth more than a bare mention" if 1 < Bayes factor ≤ 3.2
A more precise version of Jeffreys' scale, covering Bayes factors favouring H0 against H1 as well as H1 against H0, as provided in Jeffreys' book "Theory of Probability", is given below.
| BF favouring H0 against H1 | log10(BF) | Strength of evidence | BF favouring H1 against H0 | log10(BF) | Strength of evidence |
|---|---|---|---|---|---|
| 1 to 10^(1/2) | 0 to 1/2 | Not worth more than a bare mention | 1 to 10^(-1/2) | 0 to -1/2 | Not worth more than a bare mention |
| 10^(1/2) to 10^1 | 1/2 to 1 | Substantial | 10^(-1/2) to 10^(-1) | -1/2 to -1 | Substantial |
| 10^1 to 10^(3/2) | 1 to 3/2 | Strong | 10^(-1) to 10^(-3/2) | -1 to -3/2 | Strong |
| 10^(3/2) to 10^2 | 3/2 to 2 | Very strong | 10^(-3/2) to 10^(-2) | -3/2 to -2 | Very strong |
| > 10^2 | > 2 | Decisive | < 10^(-2) | < -2 | Decisive |
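Jeffreys' scale lends itself to a small lookup function. The sketch below is one possible encoding, not an established API: the function name is hypothetical, and treating each upper cut point as inclusive is an assumption that follows the "≤" convention of the list given earlier. It works on |log10(BF)|, so the same thresholds serve both directions of evidence.

```python
from math import log10

def jeffreys_category(bf):
    """Classify a Bayes factor (H0 against H1) on Jeffreys' half-unit log10 scale."""
    if bf <= 0:
        raise ValueError("A Bayes factor must be positive.")
    if bf == 1:
        favoured = "neither"
    else:
        favoured = "H0" if bf > 1 else "H1"
    # The scale is symmetric on the log10 axis, so only the magnitude matters.
    strength = abs(log10(bf))
    if strength <= 0.5:
        label = "not worth more than a bare mention"
    elif strength <= 1.0:
        label = "substantial"
    elif strength <= 1.5:
        label = "strong"
    elif strength <= 2.0:
        label = "very strong"
    else:
        label = "decisive"
    return favoured, label

for bf in (2, 5, 0.05, 150):
    print(bf, jeffreys_category(bf))
```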
A note of caution: the interpretation of the Bayes factor is not universal. Because its value depends on the priors, conclusions should be calibrated through a sensitivity analysis using different priors, which helps reduce the risk of misleading conclusions. The table below summarises how various researchers, including Jeffreys, have classified the strength of evidence against the null hypothesis (H0) when the Bayes factor is less than or equal to 1.
| Bayes factor | Evidence against H0 (Jeffreys) | Evidence against H0 (Goodman) | Evidence against H0 (Held & Ott) |
|---|---|---|---|
| 1 to 1/3 | Bare mention | | Weak |
| 1/3 to 1/10 | Substantial | Weak to moderate | Moderate |
| 1/10 to 1/30 | Strong | Moderate to strong | Substantial |
| 1/30 to 1/100 | Very strong | Strong | Strong |
| 1/100 to 1/300 | Decisive | Very strong | Very strong |
| < 1/300 | | | Decisive |

Table note: Jeffreys actually used the slightly different cut points 1/10^(a/2), a = 1, 2, 3, 4, whereas Goodman specified evidence categories "weak", "moderate", "moderate to strong", and "strong to very strong" for Bayes factors of 1/5, 1/10, 1/20, and 1/100, respectively, which Held & Ott modified and aligned with their own cut points.
Table source: Held & Ott (2018)
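The sensitivity to the prior noted above can be sketched for a simple binomial example. Everything in the setup is an assumption made for illustration: H0 fixes theta = 0.5, H1 places a Beta(a, b) prior on theta, the data are 9 successes in 10 trials, and the three priors are arbitrary choices. Under H1 the marginal likelihood is available in closed form through the Beta function, which is evaluated here on the log scale with `math.lgamma`.

```python
from math import exp, lgamma, log

def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def bayes_factor_01(successes, trials, prior_a, prior_b, theta0=0.5):
    """Bayes factor of H0: theta = theta0 against H1: theta ~ Beta(prior_a, prior_b)."""
    failures = trials - successes
    # The binomial coefficient appears in both marginal likelihoods and cancels.
    log_lik_h0 = successes * log(theta0) + failures * log(1 - theta0)
    log_marg_h1 = (log_beta(successes + prior_a, failures + prior_b)
                   - log_beta(prior_a, prior_b))
    return exp(log_lik_h0 - log_marg_h1)

# Hypothetical priors under H1: flat, concentrated around 0.5, and U-shaped.
priors = {"uniform": (1, 1), "near_half": (10, 10), "u_shaped": (0.5, 0.5)}
results = {name: bayes_factor_01(9, 10, a, b) for name, (a, b) in priors.items()}
for name, bf in results.items():
    print(f"{name}: BF = {bf:.3f}")
```

With these inputs the same data give a Bayes factor of roughly 0.11 under the uniform prior but roughly 0.40 under the Beta(10, 10) prior, so the verbal category attached to the evidence can change with the prior alone.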
Note: This blog highlights the key conceptual difference between the p-value and the Bayes factor. For more information, please go through the material provided in the reading list.
Suggested Reading:
- Assaf, A. G., & Tsionas, M. (2018). Bayes factors vs p-values. Tourism Management, 67, 17-31.
- Held, L., & Ott, M. (2018). On p-values and Bayes factors. Annual Review of Statistics and Its Application, 5(1), 393-419.
- Lesaffre, E., & Lawson, A. B. (2012). Bayesian Biostatistics. John Wiley & Sons.
- Royall, R. (1997). Statistical evidence: a likelihood paradigm. Routledge.
- Taboga, M. (2021). Jeffreys' scale. Fundamentals of Statistics.
- Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90(430), 773-795.
Suggested Citation: Meitei, W. B. (2020). Difference between p-value and Bayes factor. WBM STATS.