NOTES ON COALESCENT THEORY

Spring 2014 -- Peter Ralph -- USC
http://petrelharp.github.io/coaltheory/outline.html

Notes

Here are links to web pages of notes that roughly follow the course. They are XHTML with SVG figures, so they might not work in Internet Explorer; try Firefox or Opera if you have trouble. Or, download the source from my GitHub repository and build whatever you want from the LaTeX. They are currently (Spring 2014) in rapid upheaval.

  1. Introduction
  2. Summary statistics (of the realized ancestral recombination graph)

If you use these notes, please cite them, for instance:
Notes on Coalescent Theory, preprint, Peter Ralph, 2014. http://petrelharp.github.io/coaltheory

Overview

This short course is aimed at people with some degree of familiarity with probability and stochastic processes (say, a first-year graduate course). At the end of each class I will assign a paper to read; everyone is expected to bring, written down, one or two questions or observations about the reading to the next class. The first hour to hour-and-a-quarter of each class will be lecture (interruptions and questions encouraged), and the rest of the class will be spent discussing the paper, guided by the questions contributed by the class (and by me). The goal of this structure is to complement lecture with the sorts of learning we get by sorting through papers from the literature together, and trying to explain concepts to each other.

I am aiming for the papers we read to complement the lectures: so, there will be things in the papers that we won't have talked about in class, and some things I talk about in class that won't appear in the papers. I am concurrently writing up lecture notes at http://petrelharp.github.io/coaltheory/ ; comments (edits, pull requests, etc) are welcome.

Expectations

I expect everyone to turn in, at the start of each class, a short (half a page or less) question or observation on the weekly discussion paper, suitable as a topic for discussion. At the end of my section of the course (on March 4th), I also expect a two-page writeup, either summarizing further readings in the literature or reporting the results of simulating something we've learned about in class. An example of the first would be to read a few more papers on a particular topic, and write a summary of the main results, methods, and relationship to current practice. An example of the second would be to simulate a particular population process of interest, and compare the results to theory. I will provide suggestions for either, and am open to your ideas.
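To give a sense of scale for the simulation option, here is a minimal sketch (not a template you must follow). It simulates the time to the most recent common ancestor (TMRCA) of n sampled lineages under Kingman's coalescent and compares the average to the standard theoretical value 2(1 - 1/n). It assumes time is measured in coalescent units, so that each pair of lineages coalesces at rate 1; the function names, sample size, and number of replicates are illustrative choices, not anything prescribed by the course.

```python
import random


def simulate_tmrca(n, rng):
    """Simulate the TMRCA of n lineages under Kingman's coalescent.

    Assumption: time is in coalescent units, so each pair of lineages
    coalesces at rate 1 and k lineages coalesce at total rate k(k-1)/2.
    """
    t = 0.0
    k = n
    while k > 1:
        rate = k * (k - 1) / 2.0
        t += rng.expovariate(rate)  # waiting time until the next coalescence
        k -= 1                      # two lineages merge into one
    return t


def expected_tmrca(n):
    """Theoretical mean TMRCA: sum over k of 2 / (k(k-1)) = 2(1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)


def mean_tmrca(n, reps, seed=1):
    """Average simulated TMRCA over `reps` independent replicates."""
    rng = random.Random(seed)
    return sum(simulate_tmrca(n, rng) for _ in range(reps)) / reps


if __name__ == "__main__":
    n, reps = 10, 20000
    print("simulated mean TMRCA:", mean_tmrca(n, reps))
    print("theoretical mean TMRCA:", expected_tmrca(n))
```

A writeup along these lines would go further than this, for instance by replacing the idealized coalescent with a forward-in-time population model and asking how well the theoretical value still holds.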

Readings

  1. (1/14) Gene genealogies and the coalescent process, Hudson, 1990. This is a review paper, which doesn't give many details, but is a very useful overview.
  2. (1/21) Gene genealogies within a fixed pedigree, and the robustness of Kingman's coalescent, by Wakeley, King, Low, and Ramachandran. Using a real-life pedigree, this paper examines the effect of correlations that arise from a fixed pedigree and that disappear in the large-population limit underlying most theoretical results.
  3. (1/28) Darwinian and demographic forces affecting human protein coding genes, by Nielsen et al, 2009. This is a good example of something we'd like to do: infer demography and the effects of selection.
  4. (2/4) Inference of human population history from individual whole-genome sequences, Li and Durbin, 2011. This paper, remarkably, shows how to use a single diploid sequence to infer past coalescent rates. Be sure to read the methods in the supplement, and also Fast "coalescent" simulation, Marjoram and Wall, 2006, for a more readable description of the Markov assumption made here.
  5. (2/11) Classic selective sweeps were rare in recent human evolution, Hernandez et al., 2011; and Pervasive Adaptive Protein Evolution Apparent in Diversity Patterns around Amino Acid Substitutions in Drosophila simulans, Sattath et al., 2011. These two similar papers use mostly descriptive statistics of whole-genome data to look in aggregate at the nature of recent selection in humans and in a fruit fly.
  6. (2/18) (find one or several papers you would like to read and report on)
  7. (2/25) (ten-minute student summaries of your papers)
  8. (3/4) (ten-minute student summaries of your papers)

Outline (tentative)

  1. Pedigrees and inheritance, and statistics of the realized ancestral recombination graph (ARG)

  2. Population models of pedigrees and lineages

  3. Expected values under population models

  4. Recombination, and blocks

  5. Junctions and multiply shared blocks

  6. Bulk coalescent theory

Further readings

Further topics to possibly cover: