Model Comparison#
By computing the evidence, we can take advantage of model comparison. The evidence is \(p(D | M)\), the probability that the data, \(D\), would be observed given some model \(M\), regardless of the parameters of \(M\). By comparing the Bayesian evidence for two different models of the same data, we can therefore quantify the relative quality of the models. This is an extremely powerful tool for the rational comparison of models, particularly in guarding against overfitting, i.e., the use of analytical models with more and more parameters. You may be familiar with the idea that with four complex parameters, you can fit an elephant [9]. Bayesian model selection is a tool that we can use to defend against this.
Bayes Factor#
The Bayes factor, \(B\), is what enables model comparison. To compare model \(M_a\) with model \(M_b\), it is computed as follows,

\[
B_{ab} = \frac{p(D | M_b)\,p(M_b)}{p(D | M_a)\,p(M_a)},
\]
where \(p(M_a)\) is the prior associated with model \(a\), and similarly for \(p(M_b)\). These priors encode our relative confidence in each model: if we think both models are equally likely, the prior ratio is unity, whereas if we believed that model \(a\) was ten times more likely than model \(b\), we would set \(p(M_a)/p(M_b) = 10\).
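The effect of the model priors can be sketched numerically. The evidences and priors below are purely illustrative assumptions, not values from any real analysis.

```python
# Illustrative (made-up) evidences for two models of the same data.
evidence_a = 0.4  # p(D | M_a)
evidence_b = 0.6  # p(D | M_b)

# Equal model priors: the prior ratio is unity, so the Bayes factor
# reduces to the ratio of the evidences.
b_equal = (evidence_b * 0.5) / (evidence_a * 0.5)

# If model a is believed ten times more likely than model b a priori,
# p(M_a) / p(M_b) = 10 and the Bayes factor is scaled down accordingly.
prior_a, prior_b = 10 / 11, 1 / 11
b_skewed = (evidence_b * prior_b) / (evidence_a * prior_a)

print(b_equal, b_skewed)
```

Note that a strong prior preference for the simpler model can outweigh a modest advantage in evidence.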
Interpretation of the Bayes Factor#
Typically, the Bayes factor is used to compare a more complex model \(b\) with a simpler model \(a\), as we will see in the next section. In Table 1, we outline the interpretation provided by Kass and Raftery [10], which is commonly used for such comparisons. One way to think about the Bayesian evidence when comparing a simpler model with a more complex one is that the evidence is an integral over all \(\Theta\), and each additional parameter of the more complex model adds a dimension to this integral. Therefore, each added parameter must bring a substantial increase in the probability that the model describes the data.
| \(\log_{10} B_{ab}\) Range | \(B_{ab}\) Range | Evidence for a More Complex Model |
|---|---|---|
| 0 - 0.5 | 1 - 3.2 | Not worth more than a bare mention |
| 0.5 - 1 | 3.2 - 10 | Substantial |
| 1 - 2 | 10 - 100 | Strong |
| Greater than 2 | Greater than 100 | Decisive |
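As a rough sketch, the scale in Table 1 can be encoded as a small helper function. The function name and structure below are our own convenience, not part of any library; the bands themselves are Kass and Raftery's.

```python
def interpret_bayes_factor(log10_b):
    """Map log10 of a Bayes factor onto the Kass and Raftery scale.

    A hypothetical helper for illustration only.
    """
    if log10_b < 0.5:
        return "Not worth more than a bare mention"
    if log10_b < 1:
        return "Substantial"
    if log10_b < 2:
        return "Strong"
    return "Decisive"


print(interpret_bayes_factor(1.5))
```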
Comparing COVID-19 Tests#
Consider our example from the previous section, where we computed the evidence for a given COVID-19 test, which we will call test \(a\). We want to compare test \(a\) with another COVID-19 test, \(b\), which is more complex and expensive to administer. Bayesian model selection is the ideal tool for this. First, we again calculate the evidence for test \(a\).
# Test a: true-positive and true-negative rates, and the prior
# probability that a patient has COVID-19.
likelihood_positive_covid_a = 0.995
likelihood_negative_nocovid_a = 0.989
prior_covid = 0.25

# Evidence, p(D | M_a): the total probability of a positive result.
evidence_a = likelihood_positive_covid_a * prior_covid + (
    1 - likelihood_negative_nocovid_a) * (1 - prior_covid)
evidence_a
0.257
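As a quick sanity check (using the same numbers as above), the evidence is exactly the normalising constant in Bayes' theorem: dividing each term by it, the posterior probabilities of the two hypotheses sum to one.

```python
# Posterior probabilities of having and not having COVID-19, given a
# positive test result, with the evidence as the normalising constant.
posterior_covid = 0.995 * 0.25 / 0.257
posterior_nocovid = (1 - 0.989) * (1 - 0.25) / 0.257

print(posterior_covid + posterior_nocovid)
```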
Now consider the more expensive test \(b\), which the manufacturer claims will be positive when a patient has COVID-19 99.99 % of the time, while returning a negative result for a patient without COVID-19 at the same 98.9 % rate as test \(a\).
# Test b: a higher claimed true-positive rate, with the same
# true-negative rate as test a.
likelihood_positive_covid_b = 0.9999
likelihood_negative_nocovid_b = 0.989
prior_covid = 0.25

# Evidence, p(D | M_b).
evidence_b = likelihood_positive_covid_b * prior_covid + (
    1 - likelihood_negative_nocovid_b) * (1 - prior_covid)
evidence_b
0.25822500000000004
We can then compute the Bayes factor between the two tests, taking the model priors to be equal so that the prior ratio is unity.
evidence_b / evidence_a
1.0047665369649807
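To place this value on the Kass and Raftery scale, we can take its base-10 logarithm; the evidence values below are hard-coded to match those computed in the cells above.

```python
import math

# Evidences computed above for tests a and b.
evidence_a = 0.257
evidence_b = 0.258225

# log10 of the Bayes factor, for comparison against Table 1.
log10_b = math.log10(evidence_b / evidence_a)
print(log10_b)
```

The result is roughly 0.002, far below the 0.5 threshold of the lowest band in Table 1.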
This tells us that the evidence for the more complex and expensive test \(b\) is "not worth more than a bare mention". However, this is just a toy example of how one would compute the Bayesian evidence in a real Bayesian analysis.