I arrived in Ottawa a day before the proper start of the Evolution 2012 meetings so as to attend the symposium hosted by the journal Molecular Ecology, which was almost entirely devoted to the joys of genome-scale data collected from wild populations of our favorite species—and what we can and can’t learn from it. This, readers of this blog will recall, is one of the biggest changes in our field in the last few years.
Alex Buerkle kicked things off with an interesting question: how much data, exactly, do we need? It’s easy (given the funding) to obtain a lot of DNA sequence fragments from next-generation sequencing (NGS) methods, but is it better to collect lots of data from a few individuals (and thereby have high confidence in the data), or to collect less data from more individuals and accept that there will be some uncertainty in the data for any one individual? Buerkle argued that the second option is preferable; it’s possible to account for uncertainty in your analysis, but if you don’t sample enough individuals, you can miss rare gene variants.
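To put rough numbers on the rare-variant point, here is a back-of-the-envelope sketch of my own, not anything Buerkle presented. It assumes diploid individuals sampled at random, and that every sampled allele copy is actually observed, so it ignores the per-individual uncertainty that comes with low sequencing coverage. The function name and the example frequency are hypothetical.

```python
def prob_variant_sampled(freq: float, n_individuals: int, ploidy: int = 2) -> float:
    """Chance that at least one copy of a variant ends up in the sample.

    Assumes random sampling and that each individual contributes `ploidy`
    independent allele copies, each carrying the variant with probability
    `freq` (its population frequency).
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be between 0 and 1")
    if n_individuals < 0 or ploidy < 1:
        raise ValueError("n_individuals must be >= 0 and ploidy >= 1")
    n_copies = ploidy * n_individuals
    # Probability that all sampled copies lack the variant, subtracted from 1.
    return 1.0 - (1.0 - freq) ** n_copies


if __name__ == "__main__":
    # Hypothetical example: a variant at 1% frequency in the population.
    for n in (10, 50, 200):
        print(n, round(prob_variant_sampled(0.01, n), 3))
```

Under these assumptions, a variant at 1% frequency has only about an 18% chance of appearing at all in a sample of 10 individuals, about 63% with 50, and about 98% with 200. No amount of extra sequencing depth per individual changes those odds.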
A tension between confidence and uncertainty in these great big genetic datasets ran through the whole symposium. Buerkle also noted that patterns of differentiation and diversity across the genomes of related species can be very complex, and in the question-and-answer session, it was pointed out that real complexity and noise can be hard to tell apart.