After hearing that Iain Banks died recently, I decided that I would finally try reading some of his books. The book 'Consider Phlebas' had been recommended to me as a good place to start. I quite enjoyed it, except for this one passage near the introduction: (emphasis mine)

There were in excess of eighteen trillion people in the Culture, just about every one of them well nourished, extensively educated and mentally alert, and only thirty or forty of them had this unusual ability to forecast and assess on a par with a well-informed Mind (of which there were already many hundreds of thousands). It was not impossible that this was pure luck; toss eighteen trillion coins in the air for a while and a few of them are going to keep landing the same side up for a long, long time.

I had to stop and actually calculate the chances of this happening because it bothered me so much. Yes, so maybe I'm a little bit too much of a probability geek . . .

Here are the basic assumptions I'm going to make:

  1. There are 18 trillion people in the Culture.
  2. Each of the people is represented by a perfectly fair coin.
  3. The coins are flipped one hundred times.
  4. A person is considered to be a Culture Referer (i.e. one of the thirty or forty) iff their coin shows the same side across at least 90 of the 100 coinflips.

The third point might be the one with the most contention, but it seems reasonable in context, especially given later passages such as:

. . . who could give you an intuitive idea of what was going to happen, or tell you why [they] thought that something which had already happened had happened the way it did, and almost certainly turn out right every time.

Well. Let's see about that, shall we?

The calculation

Let $X$ be a random variable denoting the number of people in the Culture who are Referers. We can then express $X$ as a sum of indicator random variables $X_i$, which are 1 if person $i$ is a Referer, and 0 otherwise.

By the assumptions made above, we know that $X_i$ is 1 iff a coin, flipped 100 times, shows the same side at least 90 times. There are twenty-two possible numbers of heads/tails in a flip sequence that will give this result: 0 heads/100 tails, 1 head/99 tails, ..., 10 heads/90 tails, 90 heads/10 tails, 91 heads/9 tails, ..., 100 heads/0 tails.

Consider a case with $k$ heads and $100 - k$ tails. There are $\binom{100}{k}$ possible sequences that will result in this outcome. Hence, the number of sequences that will result in at least 90 of the same side showing is:

$$2 \sum_{k=0}^{10} \binom{100}{k} = 38{,}831{,}816{,}295{,}672$$

So, the probability of one fair coin showing the same side ninety times or more out of a hundred is $2 \sum_{k=0}^{10} \binom{100}{k} / 2^{100} \approx 3.06 \times 10^{-17}$. That is to say, $\Pr[X_i = 1] \approx 3.06 \times 10^{-17}$ for arbitrary $i$. But, as $X_i$ is an indicator rv, that is also its expected value.

But since $X$ is the sum of all the $X_i$, and expectation is linear, we can conclude that the expected number of coins, from the eighteen trillion, to show the same side at least 90% of the time will be: $E[X] = 18 \times 10^{12} \cdot \Pr[X_i = 1] \approx 5.5 \times 10^{-4}$
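As a sanity check, the arithmetic above can be reproduced in a few lines of Python (the constant names are mine; the numbers mirror the assumptions listed earlier):

```python
from math import comb

FLIPS = 100
THRESHOLD = 90            # "same side at least 90 times"
POPULATION = 18 * 10**12  # eighteen trillion people

# Sequences with 0..10 heads, doubled to cover the symmetric
# 90..100-heads cases.
favourable = 2 * sum(comb(FLIPS, k) for k in range(FLIPS - THRESHOLD + 1))

# Probability that a single fair coin qualifies.
p = favourable / 2**FLIPS

# Linearity of expectation: expected number of Referers.
expected = POPULATION * p

print(f"p    = {p:.3e}")        # ~3.063e-17
print(f"E[X] = {expected:.3e}")  # ~5.514e-04
```

The `comb` calls use exact integer arithmetic, so there is no floating-point error until the final division.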

Supposedly, there are thirty or forty of these. How about ballparking the probability of the existence of even twenty Referers via a tail bound?

Tail-bound

One way to state a Chernoff bound 1 is that, for $n$ independent Poisson trials $X_1, \ldots, X_n$, with $X = \sum_{i=1}^{n} X_i$ and $\mu = E[X]$, for any $\delta > 0$,

$$\Pr[X > (1+\delta)\mu] < \left[\frac{e^\delta}{(1+\delta)^{1+\delta}}\right]^\mu$$

If we substitute an appropriate $\delta$ into this equation -- choosing $\delta$ so that $(1+\delta)\mu = 20$, to bound the probability of getting 20 such Referers -- we get that

$$\Pr[X \geq 20] < \left[\frac{e^\delta}{(1+\delta)^{1+\delta}}\right]^\mu \approx 3 \times 10^{-83}$$

And this is for 20. If you set the target to be 30 Referers instead, you find that the probability is bounded by roughly $10^{-129}$.
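These bounds can be evaluated numerically with a short Python sketch (the helper name `chernoff_log10_bound` is mine; the bound is computed in log-space, since $10^{-83}$ underflows ordinary floats raised via the formula directly):

```python
from math import comb, log

FLIPS, THRESHOLD = 100, 90
POPULATION = 18 * 10**12

# Per-coin probability and expectation, as derived above.
p = 2 * sum(comb(FLIPS, k) for k in range(FLIPS - THRESHOLD + 1)) / 2**FLIPS
mu = POPULATION * p  # ~5.5e-4

def chernoff_log10_bound(target, mu):
    """log10 of the Chernoff upper bound on Pr[X >= target].

    With (1 + delta) = target/mu, the bound (e^delta / (1+delta)^(1+delta))^mu
    has log equal to (target - mu) - target * ln(target/mu).
    """
    ln_bound = (target - mu) - target * log(target / mu)
    return ln_bound / log(10)

print(chernoff_log10_bound(20, mu))  # ~ -82.5, i.e. bound ~ 10^-82.5
print(chernoff_log10_bound(30, mu))  # ~ -129
```

Working with $\log_{10}$ of the bound also makes the result easy to read off directly as an order of magnitude.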

Summary

At this point, I think it's safe to say that the analogy status is busted. The concept of a Referer is an interesting one, and I don't actually think it's terribly unrealistic, given a large enough population base to draw from.

But sequences of coins are a lot more predictable than one might think.

- ethereal


  1. See, for example, 'Randomized Algorithms' by Motwani and Raghavan, Theorem 4.1 on pg. 68.