# Solution 3 - Model comparison

#### Goals

In this exercise you are asked to compare different substitution models (and their variations) on the same dataset.

##### Datasets

Dataset file:

##### Execution

| Model   | Log-likelihood | Parameters | AIC         |
|---------|----------------|------------|-------------|
| JC      | -6379.66383    | 37         | 12833.32766 |
| JC_I    | -6311.4165     | 38         | 12698.833   |
| JC_G    | -6304.38084    | 38         | 12684.76168 |
| JC_G_I  | -6304.35263    | 39         | 12686.70526 |
| HKY     | -6251.43051    | 41         | 12584.86102 |
| HKY_I   | -6180.09689    | 42         | 12444.19378 |
| HKY_G   | -6172.58045    | 42         | 12429.1609  |
| HKY_G_I | -6172.52896    | 43         | 12431.05792 |
| GTR     | -6241.48788    | 45         | 12572.97576 |
| GTR_I   | -6171.17472    | 46         | 12434.34944 |
| GTR_G   | -6163.87291    | 46         | 12419.74582 |
| GTR_G_I | -6172.52896    | 47         | 12439.05792 |

The number of parameters is computed as follows for an unrooted
tree, where the first term counts the branch lengths:

$k = (2 \ast \text{no. taxa} - 3) + \text{no. parameters in substitution model} + \text{extra parameters}$

- Substitution models: JC = 0 parameters, HKY = 4 parameters, GTR = 8 parameters
- Invariant sites: +1 parameter
- Gamma rates: +1 parameter
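
The counting rule above can be sketched in a few lines of Python. This is only an illustration: the function name `count_parameters` is made up for this example, and the value of 20 taxa is an assumption inferred from the JC row of the results (2 * 20 - 3 = 37), since the number of taxa is not stated explicitly here.

```python
# Free parameters of each substitution model, as listed above.
SUBSTITUTION_PARAMS = {"JC": 0, "HKY": 4, "GTR": 8}


def count_parameters(n_taxa, model, invariant=False, gamma=False):
    """Return k = (2 * n_taxa - 3) + model parameters + extra parameters."""
    branch_lengths = 2 * n_taxa - 3           # first term of the formula
    extras = int(invariant) + int(gamma)      # +1 each for I and G
    return branch_lengths + SUBSTITUTION_PARAMS[model] + extras


# ASSUMPTION: 20 taxa, inferred from k = 37 for plain JC.
N_TAXA = 20
for model in ("JC", "HKY", "GTR"):
    plain = count_parameters(N_TAXA, model)
    full = count_parameters(N_TAXA, model, invariant=True, gamma=True)
    print(model, plain, full)
```

Under that assumption, the printed counts should agree with the "Parameters" column for the plain and the `_G_I` variants of each model.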

We compute the AIC value as follows, where $k$ is the number of parameters and $\log(L)$ is the log-likelihood of the model:

$AIC = 2k - 2\log(L)$
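
As a quick check of the formula, the following minimal sketch recomputes the AIC for two rows of the results. The helper name `aic` is chosen for this example only; the log-likelihoods and parameter counts are copied from the results listed under Execution.

```python
def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 log(L)."""
    return 2 * k - 2 * log_likelihood


# Values taken from the JC and GTR_G rows of the results.
print(f"JC    {aic(-6379.66383, 37):.5f}")
print(f"GTR_G {aic(-6163.87291, 46):.5f}")
```

Because the log-likelihood is negative, the second term increases the AIC; a model is rewarded for a higher likelihood and penalized by 2 for every additional parameter.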

##### Questions

1. Which model is the best (including HKY+Gamma), based on the AIC criterion?

We select the model with the lowest AIC value. In this case, that is GTR + Gamma (GTR_G), with an AIC of 12419.74582.
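
The selection step can also be scripted. The sketch below is one possible way to do it, not part of the original exercise: it copies the log-likelihoods and parameter counts from the results, recomputes the AIC for each model, and picks the minimum. The variable names are illustrative.

```python
# (log-likelihood, number of parameters) per model, copied from the results.
results = {
    "JC": (-6379.66383, 37),
    "JC_I": (-6311.4165, 38),
    "JC_G": (-6304.38084, 38),
    "JC_G_I": (-6304.35263, 39),
    "HKY": (-6251.43051, 41),
    "HKY_I": (-6180.09689, 42),
    "HKY_G": (-6172.58045, 42),
    "HKY_G_I": (-6172.52896, 43),
    "GTR": (-6241.48788, 45),
    "GTR_I": (-6171.17472, 46),
    "GTR_G": (-6163.87291, 46),
    "GTR_G_I": (-6172.52896, 47),
}

# AIC = 2k - 2 log(L) for every model.
aic_scores = {name: 2 * k - 2 * log_l for name, (log_l, k) in results.items()}

# The preferred model is the one with the lowest AIC.
best = min(aic_scores, key=aic_scores.get)

# Print all models from best to worst.
for name in sorted(aic_scores, key=aic_scores.get):
    print(f"{name:8s} {aic_scores[name]:.5f}")
print("Best model by AIC:", best)
```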

This exercise can also be solved using jModelTest, which automatically performs phylogenetic tree reconstruction under every model and every combination of extra parameters (i.e. invariant sites, gamma).