Evolution vs. Creationism: A completely unambiguous, logically unassailable scientific test. Now we can all stop arguing on the internet about it.

Bill Nye the Science Guy.

Bill Nye (the Science Guy!) has recently thrust himself into the public eye with some commentary on the implications of the persistent fights over the teaching of evolutionary theory in the United States. One of the sound bites that emerged from the whole thing really jumped out at me:

Your world just becomes fantastically complicated when you don’t believe in evolution.

As an evolutionary biologist, my first defense against any religious impingement on science is often to say that appeals to divine intervention are not rejectable. Therefore they cannot be addressed using the hypothetico-deductive method and so should be excluded from scientific inquiry. I often add when talking to undergraduates that if any of them came up with a rigorous, falsifiable model of divinity, they would certainly be famous for it.

But when I read this quote, I realized I’d been selling science short. Bill Nye’s statement suggests that we don’t HAVE to view religious hypotheses as untestable simply because they are unrejectable. In many cases in science, we are interested in finding a useful working model of some phenomenon. In those cases we regularly view the issue as one of choosing the best model from among a set of candidates rather than one of rejecting all models that are wrong. In at least one case I can think of, we apply a complex, unrejectable model in a test of the adequacy of a much simpler model (about which, more below).

When we engage in this process, we often employ information theory to guide our selection. Without getting into the details, we can think of information-theoretic criteria for model selection as formally implementing Occam’s Razor: the simplest model with the most explanatory power is to be preferred. By preferring simple models, you guard against overinterpreting data, a pitfall that can make models poor predictors of new observations.

So, I realized as long as we can formulate any mathematical model of “The Hand of God”, rejectable or not, we can compare it to an evolutionary model in this framework. If, as Nye suggests, evolutionary theory is simple and powerful, and creationism is a model of fantastical complexity that doesn’t much improve our understanding of the data, information theory would help us sort that out.

So why not give it a whirl?

First we need to decide what data to analyze. I work in molecular phylogenetics, so I’m going to stick with what I know. Molecular phylogenetics seeks to explain observed DNA sequence data using models of DNA sequence evolution on phylogenetic trees. The core data are DNA sequence alignments. An example alignment is shown below:

A DNA sequence alignment.

In this case I’ve displayed part of the Cytochrome b gene from six great ape species. Cytochrome b is a very important metabolic gene whose function is essentially identical across the species we’re considering. Each of the columns in the alignment represents a homologous nucleotide in each of the six species, and the evolutionary model will seek to explain the differences between them within each column. In the model, we’ll assume a tree that gives the shared common ancestry of these species and a DNA mutation process that accounts for the differences among them. The tree I assumed is the following, which is in accordance with the accepted phylogeny:

A phylogenetic tree of six great ape species.

The model is considered adequate to the extent that it can account for the patterns at all the columns in the alignment. We measure this ability as the probability, under the proposed model, of producing the columns in the sequence alignment. The complexity of the model is measured as the number of parameters to be estimated. I believe this model has 9+5 = 14 such parameters: nine branch lengths in the tree, plus five for the sequence evolution model I’ve chosen (HKY+G).
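For readers who want to check the 9+5 bookkeeping, here is a minimal sketch. It assumes the tree is unrooted and fully bifurcating (so it has 2n-3 branches for n species) and that HKY+G contributes one transition/transversion ratio, three free base frequencies, and one gamma shape parameter. The function and variable names are mine, purely for illustration.

```python
def unrooted_branch_count(n_taxa):
    """Number of branches in an unrooted, fully bifurcating tree."""
    return 2 * n_taxa - 3

# Assumed breakdown of the free parameters in HKY+G:
# kappa (transition/transversion ratio), 3 free base frequencies
# (the fourth is fixed because they sum to 1), and the gamma shape.
HKY_GAMMA_FREE_PARAMS = {
    "kappa": 1,
    "base_frequencies": 3,
    "gamma_shape": 1,
}

n_branches = unrooted_branch_count(6)                   # six great ape species
n_substitution = sum(HKY_GAMMA_FREE_PARAMS.values())
k_evolution = n_branches + n_substitution

print(n_branches, n_substitution, k_evolution)
```

Under those assumptions the counts come out to nine branch lengths and five substitution-model parameters, matching the total used in the rest of the post.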

Now comes the harder part: deciding how to model “God”. Fortunately for us, the Bible gives us some insight into God’s creative process. After having created the universe and everything in it, “God was pleased with what he saw.” I think, therefore, we can at least initially propose a model in which the abundance of a thing God has created is an index of his pleasure with it.

If we stick with our initial approach of trying to model columns in the alignment, I think the only thing we can infer about the patterns we observe is that it pleased God that they be so. As such, we will use the frequency of each pattern in the alignment as an index of God’s pleasure. In order to make this model comparable to the evolutionary one, we will make a big jump and assume that the probability of observing a column is proportional to how much it pleased God. (This model is actually already in use in molecular phylogenetics and is the unrejectable model I referred to above. It’s called the “multinomial model”. It can’t be rejected because it is a perfect fit to the data). The complexity of this model is n-1, where n is the number of possible patterns in the alignment. This is because we need to estimate the probability of observing each of these patterns. In this case that number is (4^6)-1, which is 4095.
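To make the multinomial model concrete, here is a rough sketch of how its log-likelihood and parameter count can be computed. The three-sequence alignment is a made-up toy example, not the Cytochrome b data, and the function names are hypothetical. Each distinct column pattern gets probability equal to its observed frequency, and the log-likelihood is the sum of the log of those probabilities over all columns.

```python
import math
from collections import Counter

# Hypothetical toy alignment: 3 sequences, 8 columns.
# This is NOT the Cytochrome b alignment analyzed in the post.
alignment = [
    "ACGTACGA",
    "ACGTACGT",
    "ACGAACGT",
]

def multinomial_log_likelihood(seqs):
    """Natural-log likelihood when each column pattern's probability
    is set to its observed frequency in the alignment."""
    columns = list(zip(*seqs))          # one tuple of bases per column
    n_sites = len(columns)
    counts = Counter(columns)           # how often each pattern occurs
    return sum(c * math.log(c / n_sites) for c in counts.values())

def multinomial_free_parameters(n_taxa, n_states=4):
    """Number of possible column patterns minus one
    (the pattern probabilities must sum to 1)."""
    return n_states ** n_taxa - 1

toy_lnL = multinomial_log_likelihood(alignment)
k_god = multinomial_free_parameters(6)  # six species, four nucleotides

print(toy_lnL, k_god)
```

Because the pattern probabilities are read straight off the data, no other assignment of pattern probabilities can give this alignment a higher likelihood, which is why the model cannot be rejected.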

So, we now have two models, both of which we can use to calculate the probability of the data. We will now be able to see which model predicts the data best, and decide if either of the models invokes unnecessary complexity.

Let’s do this:

The probabilities of obtaining any given data set for models such as these tend to be very tiny because the universe of possible data is very large, so we usually deal with them as log probabilities. So don’t be shocked that these numbers are all negative and very large in magnitude. The highest probability is the number that is least negative (probabilities calculated using PAUP*).

The log probability of the data given the evolutionary model is -3262.76306.

The log probability of the data given “God” is -3075.75959.

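Wow. So in real terms, the probability of obtaining the data under the “God” hypothesis is about 1.6 * 10^81 times greater than under the evolutionary model (that is e raised to the difference of 187.00347 between the two values, taking PAUP*’s log-likelihoods as natural logs).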

As we noted before, however, the complexity of the model plays a big role in whether we prefer it. The evolutionary model has 14 parameters, while the “God” model has 4095. Perhaps you can see where this is going.

When we calculate a metric that accounts for complexity (the Akaike information criterion) and then turn it into an odds ratio comparable to the one above, we get +Infinity in favor of the evolutionary model. This is because the corrected likelihood of the God model relative to the evolutionary model is so small that my computer can’t store it as a floating-point number.
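Here is a minimal sketch of that calculation, for anyone who wants to reproduce the overflow at home. It assumes the standard definition AIC = 2k - 2 ln(L), treats the log-likelihoods reported above as natural logs, and uses the usual relative likelihood exp(-ΔAIC/2) to compare the two models. The variable names are mine, and this is not necessarily the exact procedure I ran at the time.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

# Values reported in the post (log-likelihoods assumed to be natural logs).
lnL_evolution, k_evolution = -3262.76306, 14
lnL_god, k_god = -3075.75959, 4095

aic_evolution = aic(lnL_evolution, k_evolution)
aic_god = aic(lnL_god, k_god)
delta_aic = aic_god - aic_evolution     # positive: the God model scores worse

# Relative likelihood of the God model; this underflows to 0.0 in
# double-precision floating point.
relative_likelihood = math.exp(-delta_aic / 2)

# The odds in favor of evolution are the reciprocal, so guard the division.
odds = math.inf if relative_likelihood == 0.0 else 1 / relative_likelihood

# Working in log10 avoids the underflow and gives the order of magnitude.
log10_odds = (delta_aic / 2) / math.log(10)

print(aic_evolution, aic_god, delta_aic)
print(relative_likelihood, odds, log10_odds)
```

Working on the log scale shows that the odds are not literally infinite, just far beyond what a double-precision number can hold.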

So in conclusion, we must side with Bill Nye. If we model God in this way, we are invoking an incredible and unnecessary level of complexity. We certainly can’t reject the God model in this framework, but do we really need to? We can argue that our model of God is silly (it surely is), that even if God did create everything, he certainly wasn’t thinking in terms of columns in sequence alignments, or that this whole thing is blasphemy and I’m going to hell. Still, I think the broader point holds: evolutionary theory is a simpler and more powerful explanation than the idea that every single living thing on earth was individually specified to please God.

12 comments on “Evolution vs. Creationism: A completely unambiguous, logically unassailable scientific test. Now we can all stop arguing on the internet about it.”

  1. Yoder says:

    Cross-commenting from Facebook:

    “I think, therefore, we can at least initally propose a model in which the abundance of a thing God has created is an index of his pleasure with it.” —What, no citation to J.B.S. Haldane?

  2. Luke Harmon says:

    Fantastic! I think this analysis is related to a paper in Nature testing universal common ancestry: http://www.nature.com/nature/journal/v465/n7295/full/nature09014.html

  3. I suspect when you tried to calculate the AIC, God crashed your computer in an attempt to stop you from uncovering the truth!

  4. John McVay says:

    I’d like to see some Bayes Factor analysis. What shape do you think God’s prior is?

  5. noahmattoon says:

    John, in this case I believe the Dirichlet distribution might be the most appropriate for “God”.

  6. Ryan Baldini says:

    I believe that if you use AICc (a finite-sample size correction which is usually preferred to AIC), then the God model is *infinitely* bad. The second term of AICc has a denominator that equals zero when k (# of parameters) = n-1; hence infinity results. So they aren’t even theoretically comparable, by that criterion!

    • noahmattoon says:

      There are two reasons I didn’t use the AICc. First, an analysis of another wildly overparameterized model, the “no common mechanism model” of DNA sequence evolution (Holder, Lewis, and Swofford 2010), did not use it either. That paper was a partial inspiration for this blog post. In that analysis, Holder et al. say outright that the asymptotic assumptions of the AIC break down with such models, and yet the results are still enlightening.

      The second reason is that the AICc is designed to ADD to the parameterization penalty for models where the number of parameters approaches the number of data points. The fact that the correction becomes negative when parameters outnumber data points is prima facie justification for not using it here.

      But if your real point is that this blog post is not, in fact, a completely unambiguous and logically unassailable test, well then I would have to say that I agree with you.

    • noahmattoon says:

      Whoa, apologies there. My brain misread your comment as “the evolutionary model is *infinitely* bad” and therefore misunderstood your point. In this case, strictly following the second term of the AICc for the God model yields a very large negative number, which I think causes the God model to actually have infinite (arithmetic overflow) odds in favor, so I guess I just filled in the blanks. Chalk it up to six hours of intro biology TA this evening.

  7. Cool post. But PAUP’s log likelihoods are probably natural log not log10, no?

  8. [...] Mattoon contributes Evolution vs. Creationism: A completely unambiguous, logically unassailable scientific test. Now we … at Nothing in biology makes [...]

