This week’s post is a guest contribution by David Hembry, who recently finished his Ph.D. at the University of California, Berkeley, working on coevolution and diversification of the obligate pollination mutualism between leafflower plants (Phyllantheae) and leafflower moths (Epicephala). He will be starting a postdoctoral fellowship at Kyoto University in the fall.
Last month, I filed my PhD dissertation, bringing to an end an intellectual and personal journey that began seven years ago in the summer of 2005. I know a lot more now than I did then, and I know a lot more about the boundaries of what I don’t know, too. But not only has my knowledge changed: the field of evolution and ecology looks a lot different now than it did seven years ago when I was planning my dissertation research. At some point, and often multiple points, in the process of getting a PhD, everybody wonders whether what they’re doing is already out of date. Some of the transformations in the field I think I could see coming. For instance, it was clear in 2005 that computational power would keep increasing, phylogenetics would be used more and more to ask interesting questions, more and more genomes would be available for analysis, and evolutionary developmental biology was on the rise. It was unfortunately also predictable that it would be possible to study climate change in real time over PhD-length timescales. And although the 2008 global financial crisis didn’t help, it was clear that funding and jobs were going to be more competitive than they had been for our predecessors.
But there were a number of things I didn’t see coming, and which have made the field look radically different than it was back in 2005. Looking back, and looking towards the future, here are the changes I think were most important (from an evolutionist’s perspective), and what I think they mean for young scientists.
1. The rise of next-generation sequencing
This is the obvious one, but I didn’t think it would come so fast. In 2005, eukaryotic genome sequencing was still an enormous operation. Although more and more genomes were becoming available at the time, my vision of the future (which was hazily located sometime after getting my first faculty job) consisted of eventually being able to get one representative genome sequenced from my organism, and then I could use that for targeted Sanger sequencing of new loci, as well as hiring a postdoc or student to do a comparative genome evolution project. When I first took phylogenetics in 2006, the professor’s advice was (more or less) that if we ever fantasized about the possibility of having access to genome-scale datasets, we should go take a cold shower and never think about it again. Now, with the rise of high-throughput, next-generation sequencing (NGS) methods like Illumina sequencing (and a couple of newer, apparently faster, methods that have yet to become available on the market; see this excellent presentation by Jonathan Eisen), it can cost the same to get massive amounts of sequence data for all taxa in a moderate-sized phylogenetics study as to get a many-gene dataset for the same taxa using PCR and Sanger sequencing.
The flip side is that these datasets are so large (on the scale of terabytes) that they cost more to store than to obtain; they are large enough that uploading and downloading them over the networks available to university biology departments is no longer trivial; and although there are lots of bioinformaticians out there, very few people trained in EEB (including yours truly) know how to use datasets of this size. The likely prospect of having to throw out a lot of data because storing it is too expensive is particularly bizarre, and disturbing, to a lot of researchers.
Of course, having this kind of data available is going to also have a huge effect on the kinds of questions that students and researchers in EEB ask—and if the number of publications that bioinformaticians have produced from the earliest sequenced genomes alone is any guide, a reasonable amount of NGS data could be more than enough for an entire dissertation, and put you in directions you might never have intended to pursue. The effect this is going to have on the direction the field as a whole goes, I think, has yet to become clear.
2. Reanalyzing other people’s data
Not only is gathering more data than you need trivial, you don’t even need to gather any on your own! Instead of spending years gathering data or samples in the field, processing them in the lab, and finally analyzing them, you can go to repositories like GenBank or TreeBase or Dryad or the Interaction Web Database and download all the gene sequences, phylogenetic trees, or ecological networks you want. This has spawned a new, fast-moving subfield that centers around the expert reanalysis of other people’s data, a trend that has definitely been facilitated by the rise to prominence in EEB of the R programming language. It’s possible now to build a whole career around analyzing previously published data, which is faster and cheaper than relying on going into the field or the lab to get your data. With the combination of NGS and data analysis in R, the whole field has gotten a lot more technical. This poses a number of challenges for new researchers who haven’t received instruction in these methods (as many grad programs don’t yet provide all their students with good training in them), but it can also be an opportunity. Research funding has become increasingly competitive, but as long as you have a salary and a fast computer, meta-analyses are free.
3. “Field model organisms” in evolutionary ecology
I confess, I didn’t go to the Evolution meetings for three years. I missed Minnesota in 2008 due to fieldwork, Idaho in 2009 due to illness, and Portland in 2010 due to the EAPSI. When I “returned” in 2011 in Norman, it was like everybody had switched to working on anoles and sticklebacks!
This is not entirely fair. Research on anoles and sticklebacks had left its imprint on evolutionary ecology long before I started grad school. But in the time that I had been away, anoles and sticklebacks had fully blossomed into what I call “field model organisms.” In contrast to the classic model organisms in biology (mouse, C. elegans, E. coli, Drosophila), which were picked because they were hard to kill by neglect in the lab, “field model organisms” were picked because they had done really interesting things evolutionarily. And now it has become possible to leverage an unprecedented amount of genetic, developmental, phylogenetic, and ecological data on these organisms to ask fascinating “big questions” about their evolution in the wild, in ways that, until recently, were only possible in the wild relatives of lab model organisms. This is, I think, the most fascinating of the recent developments I didn’t anticipate.
Equally exciting, there are a number of other systems that are showing signs of moving in the same direction as anoles and sticklebacks, including the columbines (Aquilegia), monkeyflowers (Mimulus), and the leaf-mining drosophilid flies (Scaptomyza) that feed on wild Arabidopsis, just to mention a few.
But I think this also poses an important choice for young scientists: either you can ask bigger questions perhaps more easily (but with more competition), or you can go it alone on your own, less-studied system and not necessarily be able to ask the big questions as quickly. I personally feel like I struck a pretty good balance here by picking a system (leafflower trees and leafflower moths) that was already partially characterized, and analogous to a couple of better-studied systems (figs and fig wasps, yuccas and yucca moths), yet I was the only person in the United States working on it. Which brings me to …
4. The rise of EEB in Brazil and China
Back in 2005, as any economist will tell you, the world was a very different place. And it was a very different place for an internationally-minded biologist. Most of the researchers were in the temperate zones, but most of the biodiversity was in the tropics. That fact alone has driven generations of biologists to travel from their temperate-zone institutions to tropical field sites. And if you had told me in 2005 that by now I would have gone to Brazil, be collaborating with Brazilian scientists, and be making plans to visit Chinese colleagues who work on my system, I probably wouldn’t have believed you.
The rapid ascent of ecology and evolution in Brazil and China, fuelled in part by these countries’ rapid economic growth and spectacular increases in basic research investment, has changed this dynamic. (If you disagree, just open a recent issue of your favorite journal.) To say that both these countries have a lot of biodiversity is an understatement. The rise of in situ tropical biology research in these countries is probably the best thing to happen to basic research in our field in a long time. There are all kinds of fascinating organisms, interactions, and processes in these countries that have been barely studied—and will have a lot to tell us. More broadly, anybody who has done research in another country knows that while science is international, different countries develop different scientific cultures with different emphases and perspectives—and the more people working in our field, the more complementary perspectives and discoveries there will be. For young researchers everywhere, these provide potentially fantastic long-term opportunities for collaborations and intellectual exchange. This is no small matter given that funding is more competitive in many Western countries, and it’s getting more and more expensive to travel to the tropics to do tropical biology.
5. Rising fuel costs
It’s getting more expensive to do field biology. The fact that fuel is getting more expensive isn’t a bad thing for biodiversity. But it makes it harder to study it in the field, especially for students. Both the rise in the cost of plane tickets and gas and the increasing competitiveness of funding are making it more and more difficult for students to cobble together funds to do extensive fieldwork, and for PIs to support that kind of fieldwork by their students. In my PhD I spent twenty months in the field in six countries in the South Pacific and Asia piecing together a huge biogeographic and cophylogenetic story, but it’s hard for me to imagine funding myself to keep doing it, much less being able to support a few students doing those kinds of theses. Combined with trends 1 and 2, it’s easy to see our field turning away from its roots in field biology and turning more and more towards the low-cost analysis of massive sequencing datasets and previously published data. On the other hand, trends 3 and 4 suggest a pretty bright future for some kinds of field biology. I think we have yet to see how this will all shake out, but I for one am looking forward to it.
 In a sign of how fast the field is changing, note that typing “Sanger sequencing” into Wikipedia directs you to the main “DNA sequencing” article, rather than to the subsection “chain-termination methods”, which is the only part of the article concerning Sanger sequencing itself.