Research

Our research is focused on mechanisms of evolution at the gene, genomic, cellular, and phenotypic levels, with special attention being given to the roles of mutation, random genetic drift, and recombination. For these purposes, we are currently utilizing several model systems, most notably the microcrustacean Daphnia, the ciliate Paramecium, and numerous additional unicellular eukaryotes and prokaryotes. In addition, comparative analyses of completely sequenced genomes are being performed to shed light on issues concerning the origins of genomic, gene-structural, and cellular diversity. Most of our empirical work is integrated with the development and use of mathematical theory in an effort to develop a formal understanding of the constraints on the evolutionary process. Evolution is a population-level process, and the underlying philosophy of our research has long been that "nothing in evolution makes sense except in the light of population genetics."

Evolutionary cell biology

Remarkably, although we have fairly well-established fields of molecular evolution, genome evolution, and phenotypic evolution, there is no comprehensive field of evolutionary cell biology. Yet, the resources that link molecular and phenotypic evolution reside at the level of cellular architecture. Thus, we are beginning to explore the potential for developing evolutionary theory to explain a wide array of observations from comparative cell biology. Some current interests include the evolution of multimeric proteins, the evolution of cellular surveillance mechanisms, the evolution of maximum growth capacity, the coevolution of interacting mitochondrial and nuclear-encoded proteins, and the limits to molecular perfection imposed by the barrier of random genetic drift. Related to these endeavors, the Center for Mechanisms of Evolution hosts a journal club on monthly focal topics (non-ASU participants are welcome to attend), and a graduate class in Evolution of Proteins and Cells is being developed (see posted book chapters for material).

The evolution of genome architecture

It is commonly assumed that a causal link exists between complexity at the genomic and organismal levels. However, using population-genetic principles as a guide to understanding the evolution of duplicate genes, introns, mobile-genetic elements, and regulatory-region complexity, our work is advancing the hypothesis that much of eukaryotic genome complexity initially evolved as a passive indirect response to reduced population size (relative to the situation in prokaryotes). 

One of the primary goals of our work on gene duplication is to explain the shortcomings of the classical model, which postulates that the usual fate of a duplicated gene is either conversion to a nonfunctional pseudogene or acquisition of a new function. We promoted the alternative view that duplicate genes are frequently preserved through a partitioning of functions of ancestral genes (subfunctionalization), rather than by the evolution of new functions. Our empirical work in this area is now focused on ciliates within the Paramecium aurelia complex, which arose as a cryptic species radiation following two whole-genome duplication events (dating to nearly a billion years ago). Sequencing the complete genomes of the members of this lineage, along with the pre-duplication outgroup species, is revealing the degree to which specific members of duplicate-gene pairs are lost/preserved in parallel or divergently resolved in sister taxa, and high-throughput work in transcriptomics and proteomics is helping reveal the mechanisms of subfunctionalization. Over the next few years, we hope to fully ascertain the regulatory vocabulary (transcription-factor binding sites) of the members of this complex, how this has diverged over time, and how the functions of various proteins evolve across lineages.

Our work on intron evolution is focused on the hypothesis that newly arisen introns are typically mildly deleterious. A major goal is to understand how introns eventually came to be integrated into fundamental aspects of gene-transcript processing. Empirical work in this area is being pursued with populations of Daphnia, which have revealed an unprecedented level of intron gain (to the extent that presence/absence polymorphisms, as well as parallel intron gains, can be found within populations). We hope that this work will eventually yield an answer to the long-standing mystery of the origins of introns.

Finally, to empirically determine the response of genomes to alterations in population effective sizes and mutation rates, we have been pursuing long-term experiments with highly replicated populations of the bacterium Escherichia coli, exposed to different population sizes and mutation rates, as well as to the presence of viral parasites.

The role of mutation in evolution

Although mutations provide the ultimate material upon which natural selection depends, most mutations are deleterious, and in certain settings can lead to a substantial fitness load. We are attempting to understand the nearly 1000-fold range of variation in the mutation rate that exists across the Tree of Life, through the study of a diversity of invertebrates, unicellular eukaryotes, prokaryotes, and viruses. This work exploits a mutation-accumulation strategy in which lines are propagated as single individuals, often for hundreds to thousands of generations, to minimize the ability of natural selection to influence the fate of newly arisen mutations. Molecular analyses of these lines by complete-genome sequencing yield unbiased estimates of the rate and spectrum of mutations at the DNA level, revealing a dramatic negative scaling of the mutation rate with population size, an apparently universal mutation pressure towards AT composition, and many other previously unknown mutational features.
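As a rough illustration of how a per-site mutation rate can be obtained from a mutation-accumulation experiment, the sketch below pools mutations across lines and divides by the total number of site-generations surveyed. The function name, argument names, and all numbers are hypothetical and chosen only for illustration; this is not our analysis pipeline, which must also deal with sequencing error, coverage variation, and mutation calling.

```python
def mutation_rate_per_site_per_generation(mutation_counts, generations, callable_sites):
    """Pooled mutation-rate estimate from mutation-accumulation (MA) lines.

    mutation_counts: mutations detected in each line
    generations:     generations each line was propagated
    callable_sites:  sites in each line with enough coverage to call mutations
    (All three are hypothetical inputs, one entry per MA line.)
    """
    if not (len(mutation_counts) == len(generations) == len(callable_sites)):
        raise ValueError("need one entry per MA line in every list")

    total_mutations = sum(mutation_counts)
    # Opportunity for mutation: each analyzed site in each generation of each line.
    site_generations = sum(g * n for g, n in zip(generations, callable_sites))
    if site_generations <= 0:
        raise ValueError("no site-generations surveyed")

    return total_mutations / site_generations


# Made-up example: three lines, 1000 generations each, 10 million callable sites.
example_rate = mutation_rate_per_site_per_generation(
    mutation_counts=[4, 6, 5],
    generations=[1000, 1000, 1000],
    callable_sites=[10_000_000, 10_000_000, 10_000_000],
)
print(example_rate)
```

Pooling across lines weights each line by its number of site-generations, so lines with longer propagation or better coverage contribute proportionally more to the estimate.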

In addition, we have developed a novel method that allows us to estimate transcription error rates and the degree to which these vary among eukaryotic lineages. Remarkably, error rates at this level are typically >1000x those at the level of genome replication. The implication is that >1% of transcripts typically contain an erroneous base.
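The link between a per-base error rate and the fraction of transcripts carrying at least one error can be sketched as follows, assuming errors occur independently at each base. The error rate and transcript length used here are illustrative placeholders, not measured values from our work.

```python
def fraction_of_transcripts_with_error(error_rate_per_base, transcript_length):
    """Probability that a transcript contains at least one erroneous base,
    assuming independent errors with the same rate at every position."""
    if not 0.0 <= error_rate_per_base <= 1.0:
        raise ValueError("error rate must be a probability")
    if transcript_length < 0:
        raise ValueError("transcript length must be non-negative")
    # P(no errors) = (1 - u)^L, so P(at least one error) = 1 - (1 - u)^L.
    return 1.0 - (1.0 - error_rate_per_base) ** transcript_length


# Hypothetical inputs: 1e-5 errors per base, 1500-base transcript.
example_fraction = fraction_of_transcripts_with_error(1e-5, 1500)
print(example_fraction)
```

With these placeholder values, the fraction comes out at a little over 1%, which shows how a per-base rate that looks small can still affect a noticeable share of transcripts.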

Our work on mutation extends to the development of population-genetic theory for the evolution of the mutation rate itself, and to obtaining a general understanding of the consequences of somatic mutations for the evolution of multicellularity. Here, we are promoting the idea that the power of random genetic drift imposes a lower bound to the degree to which natural selection can reduce the mutation rate. This drift-barrier hypothesis appears to explain a number of previously disconnected observations, including the increase in the mutation rate with reductions in effective population size, the magnified error rates associated with DNA polymerases and repair enzymes used only infrequently in replication, and the extraordinarily high rates of base misincorporation into transcripts.

The role of recombination in evolution

Sexual recombination provides a powerful means for producing multi-locus genotypes with high fitness but also has the negative side-effect of breaking apart coadaptive complexes of alleles. A great deal of theory has been developed to help explain the phylogenetic distribution of recombination, but the key biological observations for testing the various hypotheses remain to be made. To help provide a mechanistic understanding of the causes and consequences of the loss of recombination, we are studying the microcrustacean Daphnia pulex, which consists of both sexual and asexual races of various evolutionary ages. Specific projects now include: the isolation and characterization of the genes responsible for meiosis suppression in obligate asexuals as well as for the suppression of male production in sexual forms; quantification of the rate of accumulation of deleterious mutations in asexual vs. sexual lineages; estimation of the rate and tempo of allele and genotype turnover in natural populations; and the coevolution of interacting mitochondrial and nuclear-encoded genes. The asexual lineages in this species complex are remarkably young (often <100 years), apparently owing to rapid extinction resulting from the "loss of heterozygosity" and exposure of pre-existing deleterious alleles by gene conversion, rather than from the accumulation of de novo mutations.

In this continuing project, we have sequenced the genomes of >3000 isolates, including the sequencing of ~100 genomes from each of ~50 populations, some of which have experienced extreme population bottlenecks. The pattern of temporal evolution is being pursued in an annual survey of one particular population, now with 10 consecutive years of data. 

This study, which is expected to yield an unprecedented understanding of the genomic features of natural populations, is now being expanded into a variety of areas of organismal biology, as we develop a complete atlas of tissue-specific and developmental-stage gene expression, and begin to explore methodologies for genetic transformation. 

Methods for analysis of high-throughput genomic data

The rapid emergence of technological innovations in genome sequencing has resulted in a situation where it is now possible to sequence multiple genomes from multiple natural populations. In anticipation of such data, and the enormous challenges that come with them (most notably, incomplete sampling of parental alleles and errors in sequence reads), we have begun to develop maximum-likelihood methods for estimating a broad array of population-genetic parameters, including nucleotide heterozygosity, linkage disequilibrium, the allele-frequency spectrum, population subdivision, and historical changes in effective population sizes. Applications of these methods currently involve the quantification of the above-mentioned parameters in Daphnia.
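To give a flavor of the maximum-likelihood approach, the following is a deliberately simplified sketch for estimating nucleotide heterozygosity from read counts in a single diploid individual. It assumes only two genotype classes per site (homozygous for the reference nucleotide, or heterozygous), a known per-read error rate, and equal sampling of the two parental alleles in heterozygotes. All function names, the default error rate, and the toy data are hypothetical; this is not the estimator used in our published methods.

```python
import math


def site_likelihoods(n_reads, n_alt, error_rate):
    """Binomial probability of seeing n_alt non-reference reads out of n_reads
    under the homozygous and the heterozygous genotype."""
    ways = math.comb(n_reads, n_alt)
    # Homozygote: a non-reference read can only arise through a read error.
    p_hom = ways * error_rate ** n_alt * (1.0 - error_rate) ** (n_reads - n_alt)
    # Heterozygote: each read samples either parental allele with probability 1/2.
    p_het = ways * 0.5 ** n_reads
    return p_hom, p_het


def log_likelihood(pi, sites, error_rate):
    """Log-likelihood of heterozygosity pi given (n_reads, n_alt) pairs."""
    total = 0.0
    for n_reads, n_alt in sites:
        p_hom, p_het = site_likelihoods(n_reads, n_alt, error_rate)
        total += math.log((1.0 - pi) * p_hom + pi * p_het)
    return total


def estimate_heterozygosity(sites, error_rate=0.01, iterations=200):
    """Maximize the log-likelihood over pi by ternary search.

    The log-likelihood is a sum of logs of functions linear in pi, so it is
    concave and a ternary search converges to its maximum.
    """
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must lie strictly between 0 and 1")
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, sites, error_rate) < log_likelihood(m2, sites, error_rate):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


# Toy data: nine sites with no non-reference reads, one site with a 10:10 split.
toy_sites = [(20, 0)] * 9 + [(20, 10)]
toy_estimate = estimate_heterozygosity(toy_sites)
print(toy_estimate)
```

Because each site contributes a mixture over genotypes rather than a hard genotype call, sites with few reads are down-weighted automatically, which is the main appeal of a likelihood treatment when parental alleles are incompletely sampled.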