In the book, we're talking about adaptive evolution. Sewall Wright thought that was more likely to happen in small, isolated populations - his ‘shifting balance’ theory - while Fisher thought that adaptive evolution would move faster in large populations, since a large population will have more beneficial mutations. I’d say that the evidence favors Fisher. This faster evolution in large populations has been repeatedly demonstrated in selection experiments: adaptation to an environmental change goes a lot faster in large populations, because you don’t have to wait so long for a favorable mutation.
The idea that a favorable mutation can somehow be ‘diluted out’ is just wrong.
Here’s how it works. Imagine a situation in which the environment has changed, so that a population is no longer well-matched to that environment. Changes that would improve fitness are now possible. For example, suppose that some human population starts raising cattle. They start out lactose intolerant - they stop making lactase around age five - and so cannot easily and completely digest milk as adults.
This is the default human pattern.
This population probably does use cow’s milk, but in a somewhat limited and inefficient way. Kids can drink it, and some probably do, for example if they lose their mother. Adults probably make cheese or yogurt out of it, but in so doing lose about a third of the food value.
Imagine someone with a mutation that leaves lactase production turned on indefinitely. Presto, he can drink milk as an adult. He can use a common food far more efficiently than other people. His fitness is higher: maybe as much as 10% higher, when we consider how fast lactose-tolerance mutations have spread.
Most likely this beneficial mutation does not spread. Imagine that he has two children: each has a 50% chance of inheriting the mutation, so there’s a 25% chance that neither has a copy, which would end the story. Even if it survives the first generation, there are many ways in which it can be lost in later generations - at least up to a point. But its frequency does not just drift randomly (usually hitting zero and disappearing) as would be the case with a new allele with no advantage. If it survives at all, its frequency tends to increase, because of that 10% advantage. If it survives drift, it will become more and more common, eventually becoming immune to drift. After that it spreads. In a well-mixed population, the frequency grows exponentially while the allele is still rare. In a spatially spread-out population, it spreads as a nonlinear wave (Fisher-Kolmogorov wave) with a propagation speed proportional to the square root of the selective advantage.
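To put rough numbers on that spread, here is a minimal sketch, not a fitted model. It assumes the simplest deterministic picture, dp/dt = s·p·(1−p), in which growth is exponential while the allele is rare and then levels off. For the spatial case it assumes the classic Fisher-Kolmogorov result that the wave speed is σ·√(2s), where σ is the typical distance between parent and offspring birthplaces per generation. The function names, the starting frequency of 0.1%, and σ = 1 are all illustrative assumptions, not figures from the text above.

```python
import math


def logit(p):
    """Log-odds of an allele frequency p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def generations_to_reach(p_start, p_end, s):
    """Generations for frequency to rise from p_start to p_end.

    Assumes deterministic logistic growth dp/dt = s * p * (1 - p),
    under which the log-odds of the frequency rise linearly at rate s.
    """
    return (logit(p_end) - logit(p_start)) / s


def wave_speed(s, sigma):
    """Asymptotic Fisher-Kolmogorov wave speed, sigma * sqrt(2 * s).

    sigma is the per-generation dispersal distance (hypothetical units);
    the result is in those same distance units per generation.
    """
    return sigma * math.sqrt(2.0 * s)


s = 0.10  # the 10% advantage used in the text

# Illustrative assumption: the allele has already escaped drift at 0.1%.
t_half = generations_to_reach(0.001, 0.5, s)
print(f"generations from 0.1% to 50%: {t_half:.0f}")

# Quadrupling the advantage only doubles the wave speed.
print(f"speed ratio for s=0.4 vs s=0.1: {wave_speed(0.4, 1.0) / wave_speed(0.1, 1.0):.2f}")
```

The square-root dependence is the point of the second function: a much stronger advantage buys only a modestly faster wave, so how far people and their genes move each generation matters just as much.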
Such an allele is either lost early or will spread widely. The probability of that spread is calculable. If the selective advantage is s (10% in this case) and s is small, the chance of success is about 2s, or 20% here. So you’d have to have about 5 lactose-tolerance mutations, on average, before one would take off.
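The 2s rule is an approximation for small s, and it is easy to check against a simple model. The sketch below assumes each carrier leaves a Poisson-distributed number of carrier offspring with mean 1 + s. That is a standard textbook idealization, not something measured for lactase persistence. The chance the lineage eventually dies out, q, is then the smallest solution of q = exp((1 + s)(q − 1)), which can be found by repeated substitution starting from zero. The function name and tolerance are illustrative choices.

```python
import math


def establishment_probability(s, tol=1e-12, max_iter=100_000):
    """Chance that a single new beneficial mutation escapes early loss.

    Assumes a branching process with Poisson offspring, mean 1 + s.
    Iterates q <- exp((1 + s) * (q - 1)) from q = 0, which converges
    to the smallest root, i.e. the extinction probability.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp((1.0 + s) * (q - 1.0))
        if abs(q_next - q) < tol:
            return 1.0 - q_next
        q = q_next
    return 1.0 - q


s = 0.10
p_model = establishment_probability(s)
p_rule = 2.0 * s  # the 2s rule of thumb from the text

print(f"branching-process estimate: {p_model:.3f}")
print(f"2s rule of thumb:           {p_rule:.3f}")
print(f"mutations needed on average: {1.0 / p_model:.1f} vs {1.0 / p_rule:.1f}")
```

Under these assumptions the model's answer comes out somewhat below 2s, so the average number of new mutations needed is a bit above five. The conclusion is the same either way: a handful of independent mutations is enough for one to take off.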
This has happened, several times. The European lactose tolerance mutation is the oldest: it probably originated something like 8-10 thousand years ago, and is now found in Europe (especially northern Europe), in north India, and at lower frequencies in North Africa and even in the Fulani, just south of the Sahara.
And of course in populations that originated from those places.
The Bedouin have their own mutation that roughly dates back to the domestication of the camel, and several pastoral groups in Africa (Nilotic and Cushitic groups) have their own versions. The Tutsi have a Nilotic mutation and about 90% of them are lactose tolerant.