Our research focuses on mechanisms of evolution at the gene, genomic, cellular, and phenotypic levels, with special attention to the roles of mutation, random genetic drift, and recombination. For these purposes, we currently use several model systems, most notably the microcrustacean Daphnia, the ciliate Paramecium, and numerous additional unicellular eukaryotes and prokaryotes. In addition, we are performing comparative analyses of completely sequenced genomes to shed light on the origins of genomic, gene-structural, and cellular diversity. Most of our empirical work is integrated with the development and use of mathematical theory in an effort to reach a formal understanding of the constraints on the evolutionary process. Evolution is a population-level process, and the underlying philosophy of our research is that "nothing in evolution makes sense except in the light of population genetics."
The Evolution of Genome Architecture
It is commonly assumed that a causal link exists between complexity at the genomic and organismal levels. However, using population-genetic principles as a guide to understanding the evolution of duplicate genes, introns, mobile genetic elements, and regulatory-region complexity, our work is advancing the hypothesis that much of eukaryotic genome complexity initially evolved as a passive indirect response to reduced population size (relative to the situation in prokaryotes). One of the primary goals of our work on gene duplication is to explain the shortcomings of the classical model, which postulates that the usual fate of a duplicated gene is either conversion to a nonfunctional pseudogene or acquisition of a new function. We believe that duplicate genes are frequently preserved through a partitioning of the functions of ancestral genes (subfunctionalization), rather than by the evolution of new functions.
Our empirical work on the evolutionary fates of duplicate genes is now focused on the genomes of species within the Paramecium aurelia complex, which arose as a cryptic species radiation following two whole-genome duplication events (dating to nearly a billion years ago). Sequencing the complete genomes of the members of this lineage, along with the pre-duplication outgroup species, is revealing the degree to which specific members of duplicate-gene pairs are lost or preserved in parallel, or divergently resolved, in sister taxa, and work on subcellular localization is helping reveal the mechanisms of subfunctionalization. Over the next few years, we hope to fully ascertain the regulatory vocabulary (transcription-factor binding sites) of the members of this complex, how it has diverged over time, and how the functions of various proteins evolve across lineages.
Our work on intron evolution is focused on the hypothesis that newly arisen introns are typically mildly deleterious. A major goal is to understand how introns eventually came to be integrated into fundamental aspects of gene-transcript processing. Empirical work in this area is being pursued with populations of Daphnia, which have revealed an unprecedented level of intron gain (to the extent that presence/absence polymorphisms, as well as parallel intron gains, can be found within populations). We hope that this work will eventually resolve the long-standing mystery of the origins of introns.
Finally, to empirically determine the response of genomes to alterations in effective population sizes and mutation rates, we have initiated long-term experiments with highly replicated populations of the bacterium Escherichia coli. The goals of these experiments include testing the mutational-hazard theory of genome evolution, ascertaining the degree to which the pathways taken by evolution are repeatable, understanding how the mutation rate evolves in different population-genetic environments, and determining whether population bottlenecks induce heritable problems in protein folding and challenges for chaperone systems.
The Role of Mutation in Evolution
Although mutations provide the ultimate material upon which natural selection depends, most mutations are deleterious, and in certain settings can lead to a substantial fitness load. We are attempting to understand the nearly 1000-fold range of variation in the mutation rate that exists across the Tree of Life, through the study of a diversity of invertebrates and unicellular eukaryotes and prokaryotes. This work exploits a mutation-accumulation strategy in which lines are propagated as single individuals, often for hundreds to thousands of generations, to minimize the ability of natural selection to influence the fate of newly arisen mutations. Molecular analyses of these lines by complete-genome sequencing are yielding the first direct quantitative estimates of the rate and spectrum of mutations at the DNA level, revealing a dramatic scaling of the mutation rate with genome size, an apparently universal mutation pressure towards AT composition, and many other previously unknown mutational features. This work is now being extended to ~30 bacterial species, ranging widely in genome size and nucleotide composition.
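To make the arithmetic behind a mutation-accumulation estimate concrete, the following is a minimal sketch of how a per-site, per-generation mutation rate might be computed from pooled counts. The function name, the Poisson-based standard error, and every number in the usage lines are illustrative assumptions rather than values or procedures from our studies.

```python
import math

def ma_mutation_rate(n_mutations, n_lines, generations, sites_per_line):
    """Estimate the mutation rate per site per generation.

    Illustrative assumptions: every line was propagated for the same
    number of generations, every line has the same number of callable
    sites, and mutation counts are Poisson distributed.
    Returns (rate, approximate standard error).
    """
    opportunities = n_lines * generations * sites_per_line
    if opportunities <= 0:
        raise ValueError("need a positive number of line-generation-sites")
    rate = n_mutations / opportunities
    # For a Poisson count m, the standard deviation is sqrt(m).
    std_err = math.sqrt(n_mutations) / opportunities
    return rate, std_err

# Hypothetical experiment: 50 lines, 1000 generations each,
# 5 million callable sites per line, 250 mutations detected in total.
rate, std_err = ma_mutation_rate(
    n_mutations=250, n_lines=50, generations=1000, sites_per_line=5_000_000
)
print(f"rate = {rate:.2e} +/- {std_err:.1e} per site per generation")
```

A real analysis would also have to account for variation in coverage among lines and for false-positive and false-negative mutation calls, which this sketch ignores.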
In addition, we have recently developed a novel method that allows us to estimate transcription error rates and the degree to which these vary among eukaryotic lineages. Remarkably, error rates at this level are typically >1000x those at the level of genome replication. The implication is that >1% of transcripts typically contain an erroneous base.
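The step from a per-base error rate to the fraction of transcripts carrying at least one error can be sketched as follows. If errors arise independently at each position with probability u, a transcript of L bases contains at least one error with probability 1 - (1 - u)^L. The specific error rate and transcript length used below are hypothetical round numbers chosen only to illustrate the calculation, as is the assumption of independence.

```python
import math

def fraction_with_error(error_rate_per_base, transcript_length):
    """Probability that a transcript contains at least one erroneous base,
    assuming errors occur independently and at a uniform rate per base."""
    # 1 - (1 - u)^L, computed with log1p/expm1 for accuracy when u is tiny.
    return -math.expm1(transcript_length * math.log1p(-error_rate_per_base))

# Hypothetical values: 1e-5 errors per base, 1500-base transcript.
transcription = fraction_with_error(1e-5, 1500)

# For comparison, a rate 1000-fold lower (an assumed replication-level rate).
replication = fraction_with_error(1e-8, 1500)

print(f"transcription-level: {transcription:.3%} of molecules with an error")
print(f"replication-level:   {replication:.5%} of molecules with an error")
```

Under these assumed values, the transcription-level fraction comes out somewhat above 1%, in line with the implication stated above.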
Our work on mutation extends to the development of population-genetic theory for the evolution of the mutation rate itself, and to obtaining a general understanding of the consequences of somatic mutations for the evolution of multicellularity. Here, we are promoting the idea that the power of random genetic drift limits the degree to which natural selection can reduce the mutation rate, thereby imposing a lower bound on the rate itself. This drift-barrier hypothesis appears to account for a number of previously disconnected observations, including the increase in the mutation rate with reductions in effective population size, the magnified error rates associated with DNA polymerases and repair enzymes used only infrequently in replication, and the extraordinarily high rates of base misincorporation into transcripts.
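One way to see why drift can set such a barrier is to compare the fixation probability of a slightly beneficial mutation (for example, one that marginally lowers the mutation rate) with that of a neutral mutation. The sketch below uses Kimura's classical diffusion approximation for a new additive mutation in an idealized diploid population whose census and effective sizes are equal. The selection coefficient and population sizes are hypothetical, and this is a textbook illustration rather than the theory developed in our own work.

```python
import math

def fixation_probability(s, n_e):
    """Kimura's diffusion approximation for the fixation probability of a
    new mutation with additive heterozygous effect s, starting at
    frequency 1/(2N) in a diploid population with N = N_e (an assumption)."""
    if s == 0:
        return 1.0 / (2.0 * n_e)        # neutral expectation
    if 4.0 * n_e * s < -700.0:
        return 0.0                      # strongly deleterious; avoids overflow
    numerator = -math.expm1(-2.0 * s)           # 1 - exp(-2s)
    denominator = -math.expm1(-4.0 * n_e * s)   # 1 - exp(-4 N_e s)
    return numerator / denominator

def relative_to_neutral(s, n_e):
    """Fixation probability divided by the neutral value 1/(2 N_e)."""
    return fixation_probability(s, n_e) * 2.0 * n_e

s = 1e-6  # hypothetical advantage of a slightly more accurate replication enzyme
for n_e in (1e3, 1e5, 1e8):
    ratio = relative_to_neutral(s, n_e)
    print(f"N_e = {n_e:.0e}: fixation probability is {ratio:.3f}x neutral")
```

With these assumed values, the mutation behaves almost exactly like a neutral one when N_e is 1000 (the ratio is close to 1), whereas at N_e of 100 million it is roughly 400 times more likely to fix than a neutral mutation. Selection for a further small improvement in replication fidelity is therefore effective only in the larger population.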
The Role of Recombination in Evolution
Sexual recombination provides a powerful means for producing multi-locus genotypes with high fitness, but also has the negative side-effect of breaking apart coadaptive complexes of alleles. A great deal of theory has been developed to help explain the phylogenetic distribution of recombination, but the key biological observations for testing the various hypotheses remain to be developed. To help provide a mechanistic understanding of the causes and consequences of the loss of recombination, we are studying the microcrustacean Daphnia pulex, which consists of both sexual and asexual races of various evolutionary ages. Specific projects now include: the isolation and characterization of the genes responsible for meiosis suppression in obligate asexuals; quantification of the rate of accumulation of deleterious mutations in asexual vs. sexual lineages; estimation of the rate and tempo of allele and genotype turnover in natural populations; and the quantification of the influence of recombination on the activity of mobile genetic elements. The asexual lineages in this species complex are remarkably young (often <100 years), apparently owing to rapid extinction resulting from the "loss of heterozygosity" and exposure of pre-existing deleterious alleles by gene conversion, rather than from the accumulation of de novo mutations.
Our overall work on Daphnia genomics extends well beyond the above-mentioned work, having recently morphed into a much larger endeavor – the “5000 Genomes Project,” which is supporting the sequencing of ~100 genomes from each of ~50 populations, some of which have experienced extreme population bottlenecks. This study is expected to yield an unprecedented understanding of the genomic features of natural populations.
Methodology for the Analysis of High-Throughput Genomic Data
The rapid emergence of technological innovations in genome sequencing has made it possible to sequence multiple genomes from natural populations. In anticipation of such data, and the enormous challenges that come with them (most notably, incomplete sampling of parental alleles and errors in sequence reads), we have begun to develop a new generation of maximum-likelihood methods for estimating a broad array of population-genetic parameters, including nucleotide heterozygosity, linkage disequilibrium, the allele-frequency spectrum, and population subdivision. Applications of these methods currently involve the quantification of the above-mentioned parameters in a variety of taxa. Particular emphasis is now placed on estimating patterns of linkage disequilibrium in natural populations, and on using this information to estimate effective population sizes, recombination frequencies, and gene-conversion tract lengths.
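As a toy illustration of the general idea, and not of the specific methods we are developing, the sketch below estimates nucleotide heterozygosity for a single diploid individual by maximum likelihood while allowing for read errors and for the chance that a heterozygous site is sampled unevenly. It assumes a fixed, known error rate, biallelic sites, homozygous sites that always match the reference, and a simple grid search. All names and data are hypothetical.

```python
import math

def site_log_likelihood(n_reads, n_alt, pi, error_rate):
    """Log-likelihood of seeing n_alt non-reference reads out of n_reads.

    The site is heterozygous with probability pi (each read then shows the
    non-reference allele with probability 0.5) and otherwise homozygous for
    the reference (non-reference reads then arise only through error).
    """
    binom = math.comb(n_reads, n_alt)
    hom = binom * error_rate ** n_alt * (1 - error_rate) ** (n_reads - n_alt)
    het = binom * 0.5 ** n_reads
    return math.log((1 - pi) * hom + pi * het)

def estimate_heterozygosity(sites, error_rate=0.01, grid_size=999):
    """Grid-search maximum-likelihood estimate of pi.

    sites is a list of (n_reads, n_alt) pairs, one per site.
    """
    best_pi, best_ll = None, -math.inf
    for i in range(1, grid_size + 1):
        pi = i / (grid_size + 1)
        ll = sum(site_log_likelihood(n, k, pi, error_rate) for n, k in sites)
        if ll > best_ll:
            best_pi, best_ll = pi, ll
    return best_pi

# Hypothetical data: 90 sites with no non-reference reads and 10 sites
# with an even split, all at 20x coverage.
sites = [(20, 0)] * 90 + [(20, 10)] * 10
print(estimate_heterozygosity(sites))
```

A practical method would estimate the error rate jointly with heterozygosity and would handle all four nucleotides. The grid search is used here only to keep the example short.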
Evolutionary Cell Biology
Remarkably, although we have fairly well-established fields of molecular evolution, genome evolution, and phenotypic evolution, there is no comprehensive field of evolutionary cell biology. Yet, one could argue that the resources that link molecular and phenotypic evolution reside at the level of cellular architecture. Thus, we are beginning to explore the potential for linking evolutionary theory with various observations from comparative cell biology. Some current interests include the evolution of multimeric proteins, the evolution of cellular surveillance mechanisms, and the limits to molecular perfection imposed by the barrier of random genetic drift. To this end, members of the lab are now studying issues related to vesicle transport and gene expression using the Paramecium system, as well as comparative issues regarding protein architecture. A local journal club in the area and a graduate class in Evolution of Proteins and Cells have emerged from these interests. Time will tell whether this foray into cell biology is a worthwhile venture.