<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8"/> <link rel="stylesheet" href="LaTeXML.css" type="text/css"/> <link rel="stylesheet" href="ltx-article.css" type="text/css"/> <link rel="stylesheet" href="latexmliness/plr-style.css" type="text/css"/> <script src="latexmliness/LaTeXML-maybeMathjax.js" type="text/javascript"/> <script src="latexmliness/adjust-svg.js" type="text/javascript"/> </head> <body> <div class="ltx_page_main"> <div class="ltx_page_content"> <div class="ltx_document"> <div id="Sx1" class="ltx_section"> <h1 class="ltx_title ltx_title_section">General goal</h1> <div id="Sx1.p1" class="ltx_para"> <p class="ltx_p">Perhaps we are presented with data: genome sequences plus other information about a number of individuals, usually random samples from some population(s), and we want to learn about their shared history: estimate levels of relatedness between the samples; infer their ancestral genome sequence(s); or identify genomic locations subject to selection. Or, perhaps we want to compute the levels of genetic diversity predicted under a certain population model. In either case, we must understand the relationship between process parameters (e.g. migration rates, selection coefficients) and the observed genomic patterns of dissimilarity.</p> </div> <div id="Sx1.p2" class="ltx_para"> <p class="ltx_p">This discussion starts out from the unconventional direction of treating the usual population genetic quantities as summary statistics of the (unobserved) pedigree–with–recombination, the “ancestral recombination graph”, or ARG. Often, quantities like “coalescent time” are defined only in the context of mathematical models, but they can equally well be thought of as descriptive statistics, whose expected values we can compute under certain models. This seems to me at least conceptually useful when thinking about such quantities as estimated from real data, where assumptions such as random mating are rarely met. 
</p> </div> <div id="Sx1.p3" class="ltx_para"> <p class="ltx_p">Some other discussions that take this point of view to at least some degree are: <cite class="ltx_cite"/>, OTHERS. Other good references for coalescent theory are <cite class="ltx_cite">Hudson (<a href="#bib.bib6" title="Gene genealogies and the coalescent process" class="ltx_ref">3</a>)</cite> (a nice article-length review), and the books <cite class="ltx_cite">Wakeley (<a href="#bib.bib1" title="Coalescent theory, an introduction" class="ltx_ref">4</a>)</cite> (fairly gentle) and <cite class="ltx_cite">Ewens (<a href="#bib.bib3" title="Mathematical population genetics" class="ltx_ref">1</a>)</cite> (fairly mathematical, and covers more general population genetics).</p> </div> </div> <div id="S1" class="ltx_section"> <h1 class="ltx_title ltx_title_section"><span class="ltx_tag ltx_tag_section">1 </span>Recombination, segregation, and mutation</h1> <div id="S1.p1" class="ltx_para"> <p class="ltx_p">First, an outline of how we model diploid reproduction, i.e. how the autosomal genome of an offspring gets assembled from those of the parents. There are exceptions (of course), but mostly, each organism has two copies of each autosomal chromosome, one from mom and one from dad. 
Offspring are the union of two gametes (an egg and a sperm), and each gamete is produced from a single diploid cell by:</p> <ul id="I1" class="ltx_itemize"> <li id="I1.i1" class="ltx_item" style="list-style-type:none;"><span class="ltx_tag ltx_tag_itemize">•</span> <div id="I1.i1.p1" class="ltx_para"> <p class="ltx_p">duplicating each chromosome (by DNA replication)</p> </div></li> <li id="I1.i2" class="ltx_item" style="list-style-type:none;"><span class="ltx_tag ltx_tag_itemize">•</span> <div id="I1.i2.p1" class="ltx_para"> <p class="ltx_p">recombining the homologous copies</p> </div></li> <li id="I1.i3" class="ltx_item" style="list-style-type:none;"><span class="ltx_tag ltx_tag_itemize">•</span> <div id="I1.i3.p1" class="ltx_para"> <p class="ltx_p">segregating these copies among four daughter cells. </p> </div></li> </ul> <p class="ltx_p">Sometimes all four daughter cells become gametes; sometimes (e.g. female meiosis in mammals) only one will.</p> </div> <div id="S1.SS0.SSS0.P1" class="ltx_paragraph"> <h3 class="ltx_title ltx_title_paragraph">Recombination</h3> <div id="S1.SS0.SSS0.P1.p1" class="ltx_para"> <p class="ltx_p">is complicated, varies along the genome, and is affected by genomic factors and motifs. To make it tractable, we assume that recombination breakpoints occur as a Poisson point process with a constant mean of 2 breakpoints per unit of length, and that each breakpoint is a crossing over between a randomly chosen maternal and a randomly chosen paternal chromosome. This produces an average of 1 crossover per chromosome per unit length, i.e. 
we are measuring length in units of genetic map length (Morgans).</p> </div> </div> <div id="S1.SS0.SSS0.P2" class="ltx_paragraph"> <h3 class="ltx_title ltx_title_paragraph">Mutation</h3> <div id="S1.SS0.SSS0.P2.p1" class="ltx_para"> <p class="ltx_p">In a major concession to mathematical convenience, we’ll simply model mutation as another Poisson point process: suppose that each gamete differs from its progenitor chromosome at the points of a Poisson point process. This is the “infinite sites model”, which assumes that mutation cannot hit the same location twice. Usually we assume the point process of mutations has mean rate <math xmlns="http://www.w3.org/1998/Math/MathML" id="S1.SS0.SSS0.P2.p1.m1" class="ltx_Math" alttext="\mu" display="inline"><semantics><mi>μ</mi><annotation encoding="application/x-tex">\mu</annotation></semantics></math> per unit of map length, but in reality this rate varies along the genome.</p> </div> </div> <div id="S1.SS0.SSS0.P3" class="ltx_paragraph"> <h3 class="ltx_title ltx_title_paragraph">Segregation</h3> <div id="S1.SS0.SSS0.P3.p1" class="ltx_para"> <p class="ltx_p">Once four gametes are produced, it remains to be decided which of the four produces the offspring. We assume that one is chosen uniformly at random, independently of the result of recombination. (Again, there are exceptions.)</p> </div> </div> </div> <div id="S2" class="ltx_section"> <h1 class="ltx_title ltx_title_section"><span class="ltx_tag ltx_tag_section">2 </span>The pedigree, with recombination</h1> <div id="S2.p1" class="ltx_para"> <p class="ltx_p">The <em class="ltx_emph">pedigree</em> of a set of individuals is a graph describing all parent-offspring relationships between these and some set of their ancestors. The <em class="ltx_emph">population pedigree</em> describes all such relationships between all individuals, living and possibly dead. This records mate choice, but omits the important information of recombination and segregation, i.e. 
which parts of each chromosome derive from which of the parent’s two homologous copies. Adding this information to the pedigree yields what is known as the <em class="ltx_emph">ancestral recombination graph</em>, or ARG. There are a number of ways to formalize this notion; see <cite class="ltx_cite">(<a href="#bib.bib4" title="An ancestral recombination graph" class="ltx_ref">2</a>; <a href="#bib.bib6" title="Gene genealogies and the coalescent process" class="ltx_ref">3</a>)</cite> for discussions. One way simply notes that the relatedness structure at any particular locus is a treelike subset of the pedigree, and that the collection of these trees – one for each base in the genome, say – is sufficient to reconstruct the entire history of inheritance, recombination and segregation. Alternatively, one can annotate each link in the pedigree with a labeling of the genome by <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.p1.m1" class="ltx_Math" alttext="\{m,p\}" display="inline"><semantics><mrow><mo>{</mo><mrow><mi>m</mi><mo>,</mo><mi>p</mi></mrow><mo>}</mo></mrow><annotation encoding="application/x-tex">\{m,p\}</annotation></semantics></math>, denoting which segments of the parent’s maternal and paternal chromosomal copies were passed down along that link. See Figure <a href="#S2.F1" title="Figure 1 ‣ 2 The pedigree, with recombination" class="ltx_ref"><span class="ltx_text ltx_ref_tag">1</span></a> for a partial depiction.</p> </div> <div id="S2.F1" class="ltx_figure"><object data="pedigree-ibd-recombination.svg" id="S2.F1.g1" class="ltx_graphics ltx_centering" width="316" height="134" alt=""/> <div class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 1: </span> A small pedigree relating two cousins to their six grandparents, with the extra information of recombination and segregation on one chromosome encoded by a coloring: each chromosome is composed of a patchwork of the grandparental chromosomes. 
</div> </div> <div id="S2.SS1" class="ltx_subsection"> <h2 class="ltx_title ltx_title_subsection"><span class="ltx_tag ltx_tag_subsection">2.1 </span>Coalescence times and gene trees</h2> <div id="S2.SS1.p1" class="ltx_para"> <p class="ltx_p">Perhaps the simplest thing we can obtain from the full ancestral recombination graph is the typical degree of relatedness of pairs of individuals. More concretely, we might ask for the empirical distribution of pairwise times back to the most recent common ancestor across all pairs of chromosomes and all loci: that is, the distribution of <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS1.p1.m1" class="ltx_Math" display="inline" alttext="\tau_{T}"><semantics><msub><mi>τ</mi><mi>T</mi></msub><annotation encoding="application/x-tex">\tau_{T}</annotation></semantics></math>, where <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS1.p1.m2" class="ltx_Math" alttext="\tau_{T}" display="inline"><semantics><msub><mi>τ</mi><mi>T</mi></msub><annotation encoding="application/x-tex">\tau_{T}</annotation></semantics></math> is the length of a randomly chosen one of these paths.</p> </div> <div id="S2.SS1.p2" class="ltx_para"> <p class="ltx_p">A more precise way of formulating this is as follows: pick two random chromosomes and a random locus; follow the lineages of the two chromosomes at that locus up through the pedigree until they reach their common ancestor; one-half the number of meioses encountered is <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS1.p2.m1" class="ltx_Math" display="inline" alttext="\tau_{T}"><semantics><msub><mi>τ</mi><mi>T</mi></msub><annotation encoding="application/x-tex">\tau_{T}</annotation></semantics></math>. 
Since these two lineages move together from that point on if followed further back through the pedigree, <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS1.p2.m2" class="ltx_Math" alttext="\tau_{T}" display="inline"><semantics><msub><mi>τ</mi><mi>T</mi></msub><annotation encoding="application/x-tex">\tau_{T}</annotation></semantics></math> is known as the “coalescence time” of the two lineages.</p> </div> <div id="S2.SS1.p3" class="ltx_para"> <p class="ltx_p">In this way, the phrase “coalescence time” is shorthand for “number of generations back to the most recent common ancestor”, taken as a random quantity across random samples of sets of chromosomes and/or loci. In this formulation, its distribution is the empirical distribution of lengths of a certain set of paths to common ancestors through the pedigree.</p> </div> <div id="S2.SS1.p4" class="ltx_para"> <p class="ltx_p">The path back through the pedigree along which these two chromosomes have inherited that particular locus is a very simple tree, with two leaves (at the samples) and a root at their most recent common ancestor; the height of that tree is the coalescence time. More generally, the ancestral recombination graph encodes, at each locus, the marginal tree along which any set of sampled individuals has inherited that locus. These are called “gene trees”. 
Each gene tree follows a path through the links of the pedigree, and if due to recombination during one of the meioses, loci <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS1.p4.m1" class="ltx_Math" alttext="x" display="inline"><semantics><mi>x</mi><annotation encoding="application/x-tex">x</annotation></semantics></math> and <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS1.p4.m2" class="ltx_Math" display="inline" alttext="y"><semantics><mi>y</mi><annotation encoding="application/x-tex">y</annotation></semantics></math> are inherited from different parental chromosomes, the marginal gene trees at <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS1.p4.m3" class="ltx_Math" display="inline" alttext="x"><semantics><mi>x</mi><annotation encoding="application/x-tex">x</annotation></semantics></math> and <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS1.p4.m4" class="ltx_Math" alttext="y" display="inline"><semantics><mi>y</mi><annotation encoding="application/x-tex">y</annotation></semantics></math> will differ.</p> </div> </div> <div id="S2.SS2" class="ltx_subsection"> <h2 class="ltx_title ltx_title_subsection"><span class="ltx_tag ltx_tag_subsection">2.2 </span>Formalities</h2> <div id="S2.SS2.p1" class="ltx_para"> <p class="ltx_p">Now, a formal definition for the so far obvious-yet-vague “ancestral recombination graph”. 
This records all relationships between both chromosomes of all individuals that ever lived in the population: so, let <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m1" class="ltx_Math" alttext="\mathcal{C}_{t}" display="inline"><semantics><msub><mi>𝒞</mi><mi>t</mi></msub><annotation encoding="application/x-tex">\mathcal{C}_{t}</annotation></semantics></math> denote those chromosomes that were alive at time <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m2" class="ltx_Math" display="inline" alttext="t"><semantics><mi>t</mi><annotation encoding="application/x-tex">t</annotation></semantics></math>, and let <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m3" class="ltx_Math" display="inline" alttext="\mathcal{C}=\bigcup_{t}\mathcal{C}_{t}"><semantics><mrow><mi>𝒞</mi><mo>=</mo><mrow><msub><mo>⋃</mo><mi>t</mi></msub><msub><mi>𝒞</mi><mi>t</mi></msub></mrow></mrow><annotation encoding="application/x-tex">\mathcal{C}=\bigcup_{t}\mathcal{C}_{t}</annotation></semantics></math> denote all chromosomes in our universe. These are furthermore grouped together into individuals: <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m4" class="ltx_Math" display="inline" alttext="\mathcal{I}_{t}"><semantics><msub><mi>ℐ</mi><mi>t</mi></msub><annotation encoding="application/x-tex">\mathcal{I}_{t}</annotation></semantics></math> is a partition of <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m5" class="ltx_Math" alttext="\mathcal{C}_{t}" display="inline"><semantics><msub><mi>𝒞</mi><mi>t</mi></msub><annotation encoding="application/x-tex">\mathcal{C}_{t}</annotation></semantics></math> into pairs, and likewise <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m6" class="ltx_Math" display="inline" alttext="\mathcal{I}"><semantics><mi>ℐ</mi><annotation encoding="application/x-tex">\mathcal{I}</annotation></semantics></math> is a partition of <math xmlns="http://www.w3.org/1998/Math/MathML" 
id="S2.SS2.p1.m7" class="ltx_Math" alttext="\mathcal{C}" display="inline"><semantics><mi>𝒞</mi><annotation encoding="application/x-tex">\mathcal{C}</annotation></semantics></math> into pairs. A relationship arises when two chromosomes in a diploid cell undergo recombination and meiosis, producing one chromosome for an offspring. So, if <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m8" class="ltx_Math" display="inline" alttext="(c_{m},c_{p})"><semantics><mrow><mo>(</mo><mrow><msub><mi>c</mi><mi>m</mi></msub><mo>,</mo><msub><mi>c</mi><mi>p</mi></msub></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">(c_{m},c_{p})</annotation></semantics></math> is a pair of chromosomes in <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m9" class="ltx_Math" display="inline" alttext="\mathcal{I}_{t}"><semantics><msub><mi>ℐ</mi><mi>t</mi></msub><annotation encoding="application/x-tex">\mathcal{I}_{t}</annotation></semantics></math>, and they recombine at locations <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m10" class="ltx_Math" display="inline" alttext="r=(r_{1},r_{2},\ldots,r_{k})"><semantics><mrow><mi>r</mi><mo>=</mo><mrow><mo>(</mo><mrow><msub><mi>r</mi><mn>1</mn></msub><mo>,</mo><msub><mi>r</mi><mn>2</mn></msub><mo>,</mo><mi mathvariant="normal">…</mi><mo>,</mo><msub><mi>r</mi><mi>k</mi></msub></mrow><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">r=(r_{1},r_{2},\ldots,r_{k})</annotation></semantics></math> to produce chromosome <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m11" class="ltx_Math" display="inline" alttext="c"><semantics><mi>c</mi><annotation encoding="application/x-tex">c</annotation></semantics></math>, then we say that the meiosis <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m12" class="ltx_Math" display="inline" 
alttext="(c,c_{m},c_{p},r)"><semantics><mrow><mo>(</mo><mrow><mi>c</mi><mo>,</mo><msub><mi>c</mi><mi>m</mi></msub><mo>,</mo><msub><mi>c</mi><mi>p</mi></msub><mo>,</mo><mi>r</mi></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">(c,c_{m},c_{p},r)</annotation></semantics></math> has occurred. As a matter of convention, say that <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m13" class="ltx_Math" alttext="c" display="inline"><semantics><mi>c</mi><annotation encoding="application/x-tex">c</annotation></semantics></math> has inherited from <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m14" class="ltx_Math" display="inline" alttext="c_{m}"><semantics><msub><mi>c</mi><mi>m</mi></msub><annotation encoding="application/x-tex">c_{m}</annotation></semantics></math> on the odd-numbered intervals <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m15" class="ltx_Math" alttext="[0,r_{1})" display="inline"><semantics><mrow><mo>[</mo><mrow><mn>0</mn><mo>,</mo><msub><mi>r</mi><mn>1</mn></msub></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">[0,r_{1})</annotation></semantics></math>, <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m16" class="ltx_Math" display="inline" alttext="[r_{2},r_{3})"><semantics><mrow><mo>[</mo><mrow><msub><mi>r</mi><mn>2</mn></msub><mo>,</mo><msub><mi>r</mi><mn>3</mn></msub></mrow><mo>)</mo></mrow><annotation encoding="application/x-tex">[r_{2},r_{3})</annotation></semantics></math>, and so on, and has inherited from <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m17" class="ltx_Math" alttext="c_{p}" display="inline"><semantics><msub><mi>c</mi><mi>p</mi></msub><annotation encoding="application/x-tex">c_{p}</annotation></semantics></math> on the remaining intervals, and that <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m18" class="ltx_Math" display="inline" alttext="0\leq 
r_{1}<r_{2}<\cdots<r_{k}=G"><semantics><mrow><mn>0</mn><mo>≤</mo><msub><mi>r</mi><mn>1</mn></msub><mo><</mo><msub><mi>r</mi><mn>2</mn></msub><mo><</mo><mi mathvariant="normal">⋯</mi><mo><</mo><msub><mi>r</mi><mi>k</mi></msub><mo>=</mo><mi>G</mi></mrow><annotation encoding="application/x-tex">0\leq r_{1}<r_{2}<\cdots<r_{k}=G</annotation></semantics></math>, where <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m19" class="ltx_Math" display="inline" alttext="G"><semantics><mi>G</mi><annotation encoding="application/x-tex">G</annotation></semantics></math> is the length of the chromosome. Let <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m20" class="ltx_Math" alttext="\mathcal{M}_{t}" display="inline"><semantics><msub><mi>ℳ</mi><mi>t</mi></msub><annotation encoding="application/x-tex">\mathcal{M}_{t}</annotation></semantics></math> be the meioses occurring at <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m21" class="ltx_Math" display="inline" alttext="t"><semantics><mi>t</mi><annotation encoding="application/x-tex">t</annotation></semantics></math>, and <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m22" class="ltx_Math" display="inline" alttext="\mathcal{M}=\bigcup_{t}\mathcal{M}_{t}"><semantics><mrow><mi>ℳ</mi><mo>=</mo><mrow><msub><mo>⋃</mo><mi>t</mi></msub><msub><mi>ℳ</mi><mi>t</mi></msub></mrow></mrow><annotation encoding="application/x-tex">\mathcal{M}=\bigcup_{t}\mathcal{M}_{t}</annotation></semantics></math>. 
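</p> <p class="ltx_p">The inheritance convention above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout <code class="ltx_verbatim ltx_font_typewriter">(c, c_m, c_p, r)</code> mirrors the notation in the text, but the function name <code class="ltx_verbatim ltx_font_typewriter">parent_at</code>, the string labels, and the example breakpoints are hypothetical. Counting the breakpoints at or to the left of a position tells us which parental copy was transmitted there: an even count means the maternal copy, an odd count the paternal copy.</p>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
from bisect import bisect_right


def parent_at(meiosis, x):
    """Return the parental chromosome from which c inherited position x.

    `meiosis` is a tuple (c, c_m, c_p, r), where r is the increasing
    sequence of breakpoints whose last entry is the chromosome length G.
    """
    c, c_m, c_p, r = meiosis
    # Number of breakpoints at or to the left of x.
    n_crossed = bisect_right(r, x)
    # Even count: an odd-numbered interval, inherited from c_m.
    # Odd count: one of the remaining intervals, inherited from c_p.
    return c_m if n_crossed % 2 == 0 else c_p


# Hypothetical meiosis on a chromosome of length G = 1.0.
meiosis = ("c", "c_m", "c_p", (0.3, 0.7, 1.0))
print([parent_at(meiosis, x) for x in (0.1, 0.3, 0.5, 0.9)])
```
</pre> <p class="ltx_p">In this example, positions 0.1 and 0.9 are inherited from the maternal copy, and positions 0.3 and 0.5 from the paternal copy, because the intervals are closed on the left.</p> <p class="ltx_p">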
To reiterate, if <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m23" class="ltx_Math" alttext="(c,c_{m},c_{p},r)\in\mathcal{M}_{t}" display="inline"><semantics><mrow><mrow><mo>(</mo><mrow><mi>c</mi><mo>,</mo><msub><mi>c</mi><mi>m</mi></msub><mo>,</mo><msub><mi>c</mi><mi>p</mi></msub><mo>,</mo><mi>r</mi></mrow><mo>)</mo></mrow><mo>∈</mo><msub><mi>ℳ</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">(c,c_{m},c_{p},r)\in\mathcal{M}_{t}</annotation></semantics></math>, then <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m24" class="ltx_Math" alttext="c" display="inline"><semantics><mi>c</mi><annotation encoding="application/x-tex">c</annotation></semantics></math> must be in an individual birthed at <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m25" class="ltx_Math" alttext="t" display="inline"><semantics><mi>t</mi><annotation encoding="application/x-tex">t</annotation></semantics></math> from living parents; i.e. <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m26" class="ltx_Math" alttext="c\in\mathcal{C}_{t}" display="inline"><semantics><mrow><mi>c</mi><mo>∈</mo><msub><mi>𝒞</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">c\in\mathcal{C}_{t}</annotation></semantics></math> but <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m27" class="ltx_Math" display="inline" alttext="c\notin\mathcal{C}_{s}"><semantics><mrow><mi>c</mi><mo>∉</mo><msub><mi>𝒞</mi><mi>s</mi></msub></mrow><annotation encoding="application/x-tex">c\notin\mathcal{C}_{s}</annotation></semantics></math> for any <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m28" class="ltx_Math" alttext="s<t" display="inline"><semantics><mrow><mi>s</mi><mo><</mo><mi>t</mi></mrow><annotation encoding="application/x-tex">s<t</annotation></semantics></math>; and <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p1.m29" class="ltx_Math" alttext="(c_{m},c_{p})\in\mathcal{I}_{t}" 
display="inline"><semantics><mrow><mrow><mo>(</mo><mrow><msub><mi>c</mi><mi>m</mi></msub><mo>,</mo><msub><mi>c</mi><mi>p</mi></msub></mrow><mo>)</mo></mrow><mo>∈</mo><msub><mi>ℐ</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">(c_{m},c_{p})\in\mathcal{I}_{t}</annotation></semantics></math>. Note that it is natural to identify each chromosome with the meiosis that produced it.</p> </div> <div id="S2.SS2.p2" class="ltx_para"> <p class="ltx_p">Since <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p2.m1" class="ltx_Math" alttext="\mathcal{M}" display="inline"><semantics><mi>ℳ</mi><annotation encoding="application/x-tex">\mathcal{M}</annotation></semantics></math> contains all the information about the population pedigree as well as how recombination has acted within the pedigree, we refer to <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p2.m2" class="ltx_Math" display="inline" alttext="\mathcal{M}"><semantics><mi>ℳ</mi><annotation encoding="application/x-tex">\mathcal{M}</annotation></semantics></math> as “the ancestral recombination graph” (or, the ARG).</p> </div> <div id="S2.SS2.p3" class="ltx_para"> <p class="ltx_p">The <em class="ltx_emph">population pedigree</em> is just the information about who was whose parents; we denote this by <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p3.m1" class="ltx_Math" alttext="\mathcal{P}=\{(c,c_{m},c_{p})\colon(c,c_{m},c_{p},r)\in\mathcal{M}\}" display="inline"><semantics><mrow><mi>𝒫</mi><mo>=</mo><mrow><mo>{</mo><mrow><mrow><mo>(</mo><mrow><mi>c</mi><mo>,</mo><msub><mi>c</mi><mi>m</mi></msub><mo>,</mo><msub><mi>c</mi><mi>p</mi></msub></mrow><mo>)</mo></mrow><mo separator="true">:</mo><mrow><mrow><mo>(</mo><mrow><mi>c</mi><mo>,</mo><msub><mi>c</mi><mi>m</mi></msub><mo>,</mo><msub><mi>c</mi><mi>p</mi></msub><mo>,</mo><mi>r</mi></mrow><mo>)</mo></mrow><mo>∈</mo><mi>ℳ</mi></mrow></mrow><mo>}</mo></mrow></mrow><annotation 
encoding="application/x-tex">\mathcal{P}=\{(c,c_{m},c_{p})\colon(c,c_{m},c_{p},r)\in\mathcal{M}\}</annotation></semantics></math>. The ARG <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p3.m2" class="ltx_Math" display="inline" alttext="\mathcal{M}"><semantics><mi>ℳ</mi><annotation encoding="application/x-tex">\mathcal{M}</annotation></semantics></math> carries the additional information of recombination locations, which is equivalent to knowing the population gene tree at each location on the chromosome.</p> </div> <div id="S2.SS2.p4" class="ltx_para"> <p class="ltx_p">For instance, let <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p4.m1" class="ltx_Math" display="inline" alttext="T_{x}"><semantics><msub><mi>T</mi><mi>x</mi></msub><annotation encoding="application/x-tex">T_{x}</annotation></semantics></math> be the gene tree for a set of samples at position <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p4.m2" class="ltx_Math" alttext="x" display="inline"><semantics><mi>x</mi><annotation encoding="application/x-tex">x</annotation></semantics></math> along the genome. 
This is the minimal acyclic graph (a tree) whose nodes are chromosomes (or, equivalently, meioses), that contains the sampled chromosomes, and such that if a chromosome <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p4.m3" class="ltx_Math" alttext="c" display="inline"><semantics><mi>c</mi><annotation encoding="application/x-tex">c</annotation></semantics></math> is in <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p4.m4" class="ltx_Math" display="inline" alttext="T_{x}"><semantics><msub><mi>T</mi><mi>x</mi></msub><annotation encoding="application/x-tex">T_{x}</annotation></semantics></math>, then so is the parent of <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p4.m5" class="ltx_Math" display="inline" alttext="c"><semantics><mi>c</mi><annotation encoding="application/x-tex">c</annotation></semantics></math> from which <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p4.m6" class="ltx_Math" display="inline" alttext="c"><semantics><mi>c</mi><annotation encoding="application/x-tex">c</annotation></semantics></math> has inherited at <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p4.m7" class="ltx_Math" display="inline" alttext="x"><semantics><mi>x</mi><annotation encoding="application/x-tex">x</annotation></semantics></math>. 
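</p> <p class="ltx_p">The definition of the gene tree just given can be sketched as a small Python function. This is a minimal illustration, not a definitive implementation: the dictionary layout, the function names <code class="ltx_verbatim ltx_font_typewriter">parent_at</code> and <code class="ltx_verbatim ltx_font_typewriter">gene_tree_edges</code>, and the toy pedigree are all assumptions made for the example. Chromosomes with no recorded meiosis are treated as founders, so in a truncated pedigree the result may be a forest rather than a single tree.</p>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
from bisect import bisect_right


def parent_at(meiosis, x):
    """Parental chromosome from which the offspring inherited position x."""
    c, c_m, c_p, r = meiosis
    return c_m if bisect_right(r, x) % 2 == 0 else c_p


def gene_tree_edges(meioses, samples, x):
    """Child-to-parent map of the gene tree at position x.

    `meioses` maps each non-founder chromosome c to its tuple
    (c, c_m, c_p, r); chromosomes without an entry are founders
    (an assumption of this sketch).
    """
    edges = {}
    stack = list(samples)
    while stack:
        c = stack.pop()
        if c in edges or c not in meioses:
            continue  # already visited, or a founder with no recorded meiosis
        edges[c] = parent_at(meioses[c], x)
        stack.append(edges[c])  # keep following the lineage back in time
    return edges


# Hypothetical toy pedigree: founders "a" and "b" form one individual,
# and chromosomes "u" and "v" are produced by two of its meioses.
meioses = {
    "u": ("u", "a", "b", (0.4, 1.0)),
    "v": ("v", "a", "b", (1.0,)),
}
left = gene_tree_edges(meioses, ["u", "v"], 0.2)
right = gene_tree_edges(meioses, ["u", "v"], 0.6)
print(left == right)
```
</pre> <p class="ltx_p">In this toy example, both samples trace back to <code class="ltx_verbatim ltx_font_typewriter">a</code> at position 0.2, whereas at position 0.6 the sample <code class="ltx_verbatim ltx_font_typewriter">u</code> traces back to <code class="ltx_verbatim ltx_font_typewriter">b</code>, so the two marginal trees differ.</p> <p class="ltx_p">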
These trees change whenever a breakpoint is encountered in any of the constituent meioses: Formally, define <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p4.m8" class="ltx_Math" display="inline" alttext="R(c)"><semantics><mrow><mi>R</mi><mo>⁢</mo><mrow><mo>(</mo><mi>c</mi><mo>)</mo></mrow></mrow><annotation encoding="application/x-tex">R(c)</annotation></semantics></math> to be the set of recombination breakpoints of the meiosis that created <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p4.m9" class="ltx_Math" alttext="c" display="inline"><semantics><mi>c</mi><annotation encoding="application/x-tex">c</annotation></semantics></math>; then the next point to the right of <math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.SS2.p4.m10" class="ltx_Math" alttext="x" display="inline"><semantics><mi>x</mi><annotation encoding="application/x-tex">x</annotation></semantics></math> at which the tree changes is:</p> <table id="S3.EGx1" class="ltx_equationgroup ltx_eqn_align"> <tr id="S2.E1" class="ltx_equation ltx_align_baseline"> <td class="ltx_eqn_pad"/> <td class="ltx_td ltx_align_right"><math xmlns="http://www.w3.org/1998/Math/MathML" id="S2.E1.m1" class="ltx_Math" alttext="\displaystyle\inf\{y>x\colon T_{y}\neq T_{x}\}=\inf\{z\colon z>x\;\text{and}\;% z\in\bigcup_{c\in T_{x}}R(c)\}." 
display="inline"><semantics><mrow><mrow><mrow><mo movablelimits="false">inf</mo><mo>⁡</mo><mrow><mo>{</mo><mrow><mrow><mi>y</mi><mo>></mo><mi>x</mi></mrow><mo separator="true">:</mo><mrow><msub><mi>T</mi><mi>y</mi></msub><mo>≠</mo><msub><mi>T</mi><mi>x</mi></msub></mrow></mrow><mo>}</mo></mrow></mrow><mo>=</mo><mrow><mo movablelimits="false">inf</mo><mo>⁡</mo><mrow><mo>{</mo><mrow><mi>z</mi><mo separator="true">:</mo><mrow><mi>z</mi><mo>></mo><mrow><mpadded width="+2.777778pt"><mi>x</mi></mpadded><mo>⁢</mo><mpadded width="+2.777778pt"><mtext>and</mtext></mpadded><mo>⁢</mo><mi>z</mi></mrow><mo>∈</mo><mrow><mstyle displaystyle="true"><munder><mo movablelimits="false">⋃</mo><mrow><mi>c</mi><mo>∈</mo><msub><mi>T</mi><mi>x</mi></msub></mrow></munder></mstyle><mrow><mi>R</mi><mo>⁢</mo><mrow><mo>(</mo><mi>c</mi><mo>)</mo></mrow></mrow></mrow></mrow></mrow><mo>}</mo></mrow></mrow></mrow><mo>.</mo></mrow><annotation encoding="application/x-tex">\displaystyle\inf\{y>x\colon T_{y}\neq T_{x}\}=\inf\{z\colon z>x\;\text{and}\;% z\in\bigcup_{c\in T_{x}}R(c)\}.</annotation></semantics></math></td> <td class="ltx_eqn_pad"/> <td rowspan="1" class="ltx_align_middle ltx_align_right"><span class="ltx_tag ltx_tag_equation">(1)</span></td></tr> </table> </div> <div id="S2.SS2.p5" class="ltx_para"> <p class="ltx_p">This notation, so far, describes the facts: actual relationships that have already occurred in a population; many of which may be unobservable. 
To make inferences, we will need to place models on some of these processes: mutation, recombination, mate choice, and/or offspring number.</p> </div> </div> </div> <div id="S3" class="ltx_section"> <h1 class="ltx_title ltx_title_section"><span class="ltx_tag ltx_tag_section">3 </span>Conventions and definitions</h1> <div id="S3.SS2.SSS0.P1" class="ltx_paragraph"> <h3 class="ltx_title ltx_title_paragraph">Generations</h3> <div id="S3.SS2.SSS0.P1.p1" class="ltx_para"> <p class="ltx_p">When two chromosomes share a common ancestor, we like to say that that ancestor lived some number of “generations” in the past. For most organisms, the notion of a generation is statistical, rather than a fixed quantity. What we actually care about is the number of meioses separating the two chromosomes – so, we hereby define the length of a path through the pedigree in generations as one-half the number of meioses. In fact, in the presence of inbreeding, it is possible for two chromosomes to have inherited different genomic regions from the same ancestor along different paths through the pedigree, which may have different lengths!</p> </div> </div> <div id="S3.SS2.SSS0.P2" class="ltx_paragraph"> <h3 class="ltx_title ltx_title_paragraph">Mutation process</h3> <div id="S3.SS2.SSS0.P2.p1" class="ltx_para"> <p class="ltx_p">We will mostly work in the <em class="ltx_emph">infinite sites</em> model of mutation; we do this basically so that we can keep track of how many mutations have occurred at each locus, which, although unrealistic, helps greatly in the analysis. We also usually assume that mutation rates are homogeneous along the genome. 
This is clearly not correct, but is a very good approximation over suitable scales.</p> </div> </div> <div id="S3.SS2.SSS0.P3" class="ltx_paragraph"> <h3 class="ltx_title ltx_title_paragraph">Discrete or continuous rates</h3> <div id="S3.SS2.SSS0.P3.p1" class="ltx_para"> <p class="ltx_p">Here we define <math xmlns="http://www.w3.org/1998/Math/MathML" id="S3.SS2.SSS0.P3.p1.m1" class="ltx_Math" display="inline" alttext="\mu_{d}"><semantics><msub><mi>μ</mi><mi>d</mi></msub><annotation encoding="application/x-tex">\mu_{d}</annotation></semantics></math> to be the (“discrete”) mutation rate per generation per base – the probability that a given base differs from the homologous base in the parent it was inherited from. We will sometimes find it convenient to use <math xmlns="http://www.w3.org/1998/Math/MathML" id="S3.SS2.SSS0.P3.p1.m2" class="ltx_Math" display="inline" alttext="\mu=-\log(1-\mu_{d})"><semantics><mrow><mi>μ</mi><mo>=</mo><mrow><mo>-</mo><mrow><mi>log</mi><mo>⁡</mo><mrow><mo>(</mo><mrow><mn>1</mn><mo>-</mo><msub><mi>μ</mi><mi>d</mi></msub></mrow><mo>)</mo></mrow></mrow></mrow></mrow><annotation encoding="application/x-tex">\mu=-\log(1-\mu_{d})</annotation></semantics></math>, so that the probability of no mutation across <math xmlns="http://www.w3.org/1998/Math/MathML" id="S3.SS2.SSS0.P3.p1.m3" class="ltx_Math" display="inline" alttext="2t"><semantics><mrow><mn>2</mn><mo>⁢</mo><mi>t</mi></mrow><annotation encoding="application/x-tex">2t</annotation></semantics></math> meioses is <math xmlns="http://www.w3.org/1998/Math/MathML" id="S3.SS2.SSS0.P3.p1.m4" class="ltx_Math" alttext="(1-\mu_{d})^{2t}=\exp(-2t\mu)" 
display="inline"><semantics><mrow><msup><mrow><mo>(</mo><mrow><mn>1</mn><mo>-</mo><msub><mi>μ</mi><mi>d</mi></msub></mrow><mo>)</mo></mrow><mrow><mn>2</mn><mo>⁢</mo><mi>t</mi></mrow></msup><mo>=</mo><mrow><mi>exp</mi><mo>⁡</mo><mrow><mo>(</mo><mrow><mo>-</mo><mrow><mn>2</mn><mo>⁢</mo><mi>t</mi><mo>⁢</mo><mi>μ</mi></mrow></mrow><mo>)</mo></mrow></mrow></mrow><annotation encoding="application/x-tex">(1-\mu_{d})^{2t}=\exp(-2t\mu)</annotation></semantics></math>.</p> </div> </div> </div> <div id="bib" class="ltx_bibliography"> <h1 class="ltx_title ltx_title_bibliography">References</h1> <ul id="L1" class="ltx_biblist"> <li id="bib.bib3" class="ltx_bibitem ltx_bib_book"><span class="ltx_bibtag ltx_bib_author-year ltx_role_refnum">W.J. Ewens (2004)</span> <span class="ltx_bibblock"><span class="ltx_text ltx_bib_title">Mathematical population genetics</span>, </span> <span class="ltx_bibblock"> <span class="ltx_text ltx_bib_publisher">Springer</span>. </span> <span class="ltx_bibblock ltx_bib_cited">Cited by: <a href="#Sx1.p3" title="General goal" class="ltx_ref"><span class="ltx_text ltx_ref_title">General goal</span></a>. </span></li> <li id="bib.bib4" class="ltx_bibitem ltx_bib_incollection"><span class="ltx_bibtag ltx_bib_author-year ltx_role_refnum">R. C. Griffiths and P. Marjoram (1997)</span> <span class="ltx_bibblock"><span class="ltx_text ltx_bib_title">An ancestral recombination graph</span>, </span> <span class="ltx_bibblock">in <span class="ltx_text ltx_bib_inbook">Progress in population genetics and human evolution (Minneapolis, MN, 1994)</span>, </span> <span class="ltx_bibblock"><span class="ltx_text ltx_bib_series">IMA Vol. Math. Appl.</span>, Vol. <span class="ltx_text ltx_bib_volume">87</span>, <span class="ltx_text ltx_bib_pages"> pp. 257–270</span>. 
</span> <span class="ltx_bibblock">External Links: <span class="ltx_text ltx_bib_links"><a href="http://www.math.canterbury.ac.nz/~r.sainudiin/recomb/ima.pdf" title="" class="ltx_ref ltx_bib_external">Link</a>, <a href="http://www.ams.org/mathscinet-getitem?mr=1493031" title="" class="ltx_ref mr ltx_bib_external">MathReview Entry</a></span>. </span> <span class="ltx_bibblock ltx_bib_cited">Cited by: <a href="#S2.p1" title="2 The pedigree, with recombination" class="ltx_ref"><span class="ltx_text ltx_ref_tag">2</span></a>. </span></li> <li id="bib.bib6" class="ltx_bibitem ltx_bib_article"><span class="ltx_bibtag ltx_bib_author-year ltx_role_refnum">R.R. Hudson (1990)</span> <span class="ltx_bibblock"><span class="ltx_text ltx_bib_title">Gene genealogies and the coalescent process</span>, </span> <span class="ltx_bibblock"><span class="ltx_text ltx_bib_journal">Oxford surveys in evolutionary biology</span> <span class="ltx_text ltx_bib_volume">7</span> (<span class="ltx_text ltx_bib_number">1</span>), <span class="ltx_text ltx_bib_pages"> pp. 44</span>. </span> <span class="ltx_bibblock">External Links: <span class="ltx_text ltx_bib_links"><a href="http://web.eve.ucdavis.edu/pbg298/pdfs/Hudson_OxfordSurveysEvolBiol_1991.pdf" title="" class="ltx_ref ltx_bib_external">Link</a></span>. </span> <span class="ltx_bibblock ltx_bib_cited">Cited by: <a href="#S2.p1" title="2 The pedigree, with recombination" class="ltx_ref"><span class="ltx_text ltx_ref_tag">2</span></a>, <a href="#Sx1.p3" title="General goal" class="ltx_ref"><span class="ltx_text ltx_ref_title">General goal</span></a>. </span></li> <li id="bib.bib1" class="ltx_bibitem ltx_bib_book"><span class="ltx_bibtag ltx_bib_author-year ltx_role_refnum">J. 
Wakeley (2005)</span> <span class="ltx_bibblock"><span class="ltx_text ltx_bib_title">Coalescent theory, an introduction</span>, </span> <span class="ltx_bibblock"> <span class="ltx_text ltx_bib_publisher">Roberts and Company</span>, <span class="ltx_text ltx_bib_place">Greenwood Village, CO</span>. </span> <span class="ltx_bibblock">External Links: <span class="ltx_text ltx_bib_links"><a href="http://www.coalescentheory.com/" title="" class="ltx_ref ltx_bib_external">Link</a></span>. </span> <span class="ltx_bibblock ltx_bib_cited">Cited by: <a href="#Sx1.p3" title="General goal" class="ltx_ref"><span class="ltx_text ltx_ref_title">General goal</span></a>. </span></li> </ul> </div> </div> </div> <div class="ltx_page_footer"> <div class="ltx_page_logo">Generated on Wed Jan 22 10:48:40 2014 by <a href="http://dlmf.nist.gov/LaTeXML/">LaTeXML</a></div></div> </div> </body> </html>