Sunday, July 28, 2013

The human population harbors 172 mutations per non-lethal genome position. What'll happen to them?

A recent Panda's Thumb post highlighted that, given the size of the human genome, the rate of de novo point mutations, and the total size of the population, every non-lethal position can be expected to vary - meaning that, for every genome position or site, there's very likely at least one person (and usually dozens or more) with a new mutation there, so long as it's non-lethal. It's a trivial calculation and, while we could refine it in various ways, the essential point is clear.
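
For a feel of the arithmetic, here's a minimal sketch in Python. The inputs are round-number assumptions rather than the figures used in the original calculation (the haploid genome size of about 3.2 billion positions, in particular, is assumed here), and every site is treated as equally mutable. Expect the same order of magnitude as the headline number, not an exact match.

```python
# Round-number assumptions, not the inputs of the original calculation
POPULATION = 7e9           # people alive today
DE_NOVO_PER_PERSON = 100   # new point mutations per person
GENOME_SITES = 3.2e9       # positions in the haploid genome (assumed)

# Total new mutations carried by everyone alive
total_new_mutations = POPULATION * DE_NOVO_PER_PERSON

# Average number of people with a new mutation at any one site,
# assuming mutations land uniformly across the genome
carriers_per_site = total_new_mutations / GENOME_SITES

print(f"total de novo mutations: {total_new_mutations:.1e}")
print(f"people with a new mutation per site: {carriers_per_site:.0f}")
```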

"We are all, regardless of race,
genetically 99.9% the same."

Right or wrong?

Still, let's try to understand this a bit further. First, an equally simple, entirely compatible fact which might attenuate our surprise: the existence of a couple hundred people with new mutations at a given site leaves about seven billion without a new mutation there. Indeed, at the vast majority of sites, almost all people are homozygous for the same allele - identical by descent from the hominid lineage.

In that light, here's a deep question one can ask about all those hundreds of billions of de novo mutations: what will be their ultimate fate? Will they all shuffle through the future human population, making our genome's future evolution look like the reels on a slot machine? Or is it going to be rather more like the pitch drop experiment?

Take any one of the 100 or so original mutations in my own genome. It could eventually go on to world domination - fixation, as it's usually called by the less ambitious - spreading by descent to every chromosome in the future population. Or at some point it could simply cease to exist - lost to the random vagaries of meiosis as well as my fecundity and that of my progeny. On a certain timescale, these are the only two possible fates (except under some special conditions I won't consider here).

Intuition tells us world domination must be very unlikely, and so the alternative - loss - must be very likely. But let's be generous, and assume the mutation in question is actually a selectively beneficial one. Our intuition might then hesitate, so it'll be helpful to recall a classic equation from population genetics, which effectively considers all of the mutation's possible evolutionary trajectories toward future fixation or loss in the Wright-Fisher model. Let $s$ denote the selection coefficient associated with the beneficial mutation, which here we'll roughly understand to mean that the mutation additively confers an expected $\frac{s}{2(1-s)}$ more offspring. Also let $N$ be the human population size, and $N_e$ the effective population size. The probability of my mutation attaining world domination is:

$$ \mathbf{P}(\operatorname{Fix}|s,N,N_e) = \frac{1-\exp\left(- \frac{N_e}{N} s\right)}{1-\exp(-2 N_e s)} $$

Take $s = 0.001$, meaning that this mutation gives me a 0.05% better chance of passing it along. If that sounds small, consider everything that has to go just right for human reproduction to happen (giggity) and all the genomic loci that must play a role therein; you might then agree it's actually quite a large contribution from just a single point mutation! We can also take roughly $N = 10^{10}$. It's much more difficult to estimate what $N_e$ will be going forward into many future generations, so let's consider a wide range of possible values from $10^5$ to $10^9$:

    $x$       $\mathbf{P}(\operatorname{Fix}|s=0.001, N=10^{10}, N_e = x)$
    $10^5$    $10^{-8}$
    $10^7$    $10^{-6}$
    $10^9$    $10^{-4}$
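
To check entries like these numerically, here's a minimal Python sketch of the formula above. The function name `p_fix` is just an illustrative choice, and `math.expm1` is used because $1 - e^{-x}$ loses precision in floating point when $x$ is as small as $10^{-8}$.

```python
import math

def p_fix(s: float, n: float, n_e: float) -> float:
    """Fixation probability of one new copy under the formula above."""
    if s == 0:
        # The ratio is 0/0 at s = 0; its limit as s -> 0 is 1/(2N)
        return 1.0 / (2.0 * n)
    # expm1(-x) = exp(-x) - 1, so the two minus signs cancel in the ratio
    return math.expm1(-(n_e / n) * s) / math.expm1(-2.0 * n_e * s)

for n_e in (1e5, 1e7, 1e9):
    print(f"N_e = {n_e:.0e}: P(Fix) = {p_fix(0.001, 1e10, n_e):.1e}")
```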

So, even under a very generous assumption of $N_e = 10^9$, my quite beneficial mutation will be lost with 99.99% certainty. Its chances are far worse under smaller - and perhaps more plausible - guesses of the future effective population size, e.g. $10^6$ish. And even worse still for any less-beneficial or deleterious mutation: as $s \rightarrow 0$, the fixation probability tends to $\frac{1}{2N} = 5 \times 10^{-11}$ (to verify this, recall $e^{-x} \approx 1-x$ for small $x$).
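
Spelled out, and assuming $s$ is small enough that $2 N_e s \ll 1$ (so the approximation applies to both exponentials):

$$ \mathbf{P}(\operatorname{Fix}|s,N,N_e) \approx \frac{\frac{N_e}{N} s}{2 N_e s} = \frac{1}{2N} = \frac{1}{2 \times 10^{10}} = 5 \times 10^{-11} $$

Note that $N_e$ cancels: in this limit the mutation's chance is simply one over the number of chromosomes in the population.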

So if indeed very beneficial mutations are very rare, then this simple model suggests that, of all those hundreds of billions of de novo mutations that must exist throughout the human population today, all but a few thousand or so are destined for evolutionary oblivion. Of course, they'll be "replaced" by the new ones arising in every future person born, a similarly tiny fraction of which will go on to fixation, and so the molecular evolution show will go on.

• • 

Here are two nagging issues with what we've concluded so far. First, perhaps it's too extreme to expect my mutation to achieve total world domination; couldn't I die happy if I only knew it would ultimately propagate to, say, 1% of the future population? Second, while the mutation is a new one in my genome, there are probably dozens of other people in the world whose genomes fortuitously experienced the very same mutation, by the original argument from the PT post. What if we considered their mutations and mine interchangeably?

We can address both of these questions with a slightly more general formula for the fixation probability,

$$ \mathbf{P}(\operatorname{Fix}|s,N,N_e,k) = \frac{1-\exp\left(- \frac{N_e}{N} sk\right)}{1-\exp(-2 N_e s)} $$

where $k$ is the number of instances of the mutation found throughout the human population (previously we'd considered the special case $k=1$). To the second question above, even with $k=100$, we have $\mathbf{P}(\operatorname{Fix}|s=0.001,N=10^{10},N_e=10^6,k=100) = 10^{-5}$; it's still merely a one-in-a-hundred-thousand shot, even though this is a fairly beneficial mutation.

The way to the answer for the first question is a bit more roundabout. Let's assume my mutation were to attain 1% frequency. Under that more modest, yet still glorious, scenario, what is its fate? $\mathbf{P}(\operatorname{Fix}|s=0.001,N=10^{10},N_e=10^6,k=0.01 \cdot 10^{10}) = 99.995\%$. Once it rises to 1%, fixation is essentially inevitable! But, since we know that world domination for my new mutation is extremely unlikely, it must follow that attaining even 1% frequency is also extremely unlikely. And, more unlikely still for any other less-beneficial mutation.
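
Both numbers are easy to reproduce. Here's a minimal Python sketch of the generalized formula (the name `p_fix` and the default `k=1` are illustrative choices, nothing canonical):

```python
import math

def p_fix(s: float, n: float, n_e: float, k: float = 1.0) -> float:
    """Fixation probability given k existing copies (requires s != 0)."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x
    numerator = -math.expm1(-(n_e / n) * s * k)
    denominator = -math.expm1(-2.0 * n_e * s)
    return numerator / denominator

S, N, N_E = 0.001, 1e10, 1e6

hundred_copies = p_fix(S, N, N_E, k=100)
one_percent = p_fix(S, N, N_E, k=0.01 * N)

print(f"k = 100:        {hundred_copies:.1e}")
print(f"k = 1% of N:    {one_percent:.5f}")
```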

It's intriguing that, while dozens to hundreds of instances of the mutation stand almost no chance, capturing just 1% of the population makes fixation virtually certain. What's the source of this striking shift in the underlying dynamics? The answer has to do with exponential growth. Natural selection acts to grow the existing population of the beneficial mutation: if there are few instances of the mutation, growth is slow, and the more instances there are, the faster the growth. By 1% the train has left the station. But, middling along with a few hundred copies, the slow rate of growth is more-or-less offset by the occasional losses to meiotic shuffling, infertility, premature death, abstinence, and contraception.
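
One way to watch this play out is to simulate it. Below is a toy Wright-Fisher simulation, a sketch under loud assumptions: a tiny population of 100 chromosomes with $N_e = N$, an unrealistically large $s = 0.2$ so that it runs quickly, and each copy of the mutation given relative fitness $1 + s/2$. The function names are hypothetical. It compares the simulated fixation frequency with the formula, first for a single copy and then for a starting frequency of 10%.

```python
import math
import random

def fixes(two_n, s, k, rng):
    """Run one Wright-Fisher trajectory from k copies; True if it fixes."""
    count = k
    while 0 < count < two_n:
        p = count / two_n
        # Selection step (assumed model): each copy has fitness 1 + s/2
        p_sel = p * (1 + s / 2) / (1 + p * s / 2)
        # Drift step: binomial sampling of the next generation's chromosomes
        count = sum(rng.random() < p_sel for _ in range(two_n))
    return count == two_n

def estimate_fixation(two_n, s, k, replicates, seed=0):
    """Fraction of independent replicates in which the mutation fixes."""
    rng = random.Random(seed)
    return sum(fixes(two_n, s, k, rng) for _ in range(replicates)) / replicates

def formula(two_n, s, k):
    """The fixation formula with N_e = N, so N_e/N = 1 and 2*N_e = two_n."""
    return math.expm1(-s * k) / math.expm1(-two_n * s)

TWO_N, S = 100, 0.2   # toy values, far from anything human
results = {}
for k, replicates in ((1, 2000), (10, 500)):
    results[k] = (estimate_fixation(TWO_N, S, k, replicates), formula(TWO_N, S, k))
    print(f"k = {k:2d}: simulated {results[k][0]:.3f}, formula {results[k][1]:.3f}")
```

With these toy settings the formula gives roughly 0.18 for one copy and 0.86 at 10%, and the simulated estimates should land nearby, give or take sampling noise and the formula's own approximation error at such a large $s$.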

As Gillespie wrote (below): "Think of all the great mutations that failed to get by the quagmire of rareness!"

• • 

Having gained a little intuition from a simple model, it's always wise to recall salient limitations of that model. Some are obvious, like its treatment of $s$, $N$, and $N_e$ as constant over time and geography, the uncertainty about their values going forward, and its neglect of extensive human population stratification. I can think of two others that seem worthy of mention in closing.

Despite my previous argument that $s=0.001$ would be a large contribution for an individual point mutation, there are some fascinating epistasis theories about the potential for one mutation to operate synergistically with many other loci, essentially providing the "straw that broke the camel's back" to a much larger fitness advantage. The neutral network is an interesting and complementary way to conceptualize this powerful idea - that many previously neutral alleles can suddenly gain tremendous selective meaning in light of a single new mutation. If we can seriously entertain $s \gg 0.001$, then of course such a mutation stands a better chance.

Finally, aside from far higher selection coefficients, my mutations have at least one other potential route to glory, absent entirely from the simple model considered above: genetic linkage to other beneficial variants, whether mine or my parents', rare or common. As for the ease of that route, we're still collecting data!

Recommended reading