What does randomness look like?

[Image: cutaway diagram of a V-1 flying bomb]

On 13 June 1944, a week after the Allied invasion of Normandy, a loud buzzing sound rattled through the skies of battle-worn London. The source of the sound was a newly developed German instrument of war, the V-1 flying bomb. A precursor to the cruise missile, the V-1 was a self-propelled flying bomb, guided by gyroscopes and powered by a simple pulse jet engine that gulped air and ignited fuel 50 times a second. This high-frequency pulsing gave the bombs their characteristic sound and earned them the nickname buzzbombs.

From June to October 1944, the Germans launched 9,521 buzzbombs from the coasts of France and the Netherlands, of which 2,419 reached their targets in London. The British worried about the accuracy of these aerial drones. Were they falling haphazardly over the city, or were they hitting their intended targets? Had the Germans really worked out how to make an accurately targeting self-guided bomb?

Fortunately, the British had been scrupulous in maintaining a bomb census that tracked the place and time of nearly every bomb dropped on London during World War II. With this data, they could ask statistically whether the bombs were falling randomly over London or whether they were targeted. This was a math question with very real consequences.

Imagine, for a moment, that you are working for British intelligence, and you’re tasked with solving this problem. Someone hands you a piece of paper with a cloud of points on it, and your job is to figure out if the pattern is random.

Let’s make this more concrete. Here are two patterns, from Steven Pinker’s book, The Better Angels of Our Nature. One of the patterns is randomly generated. The other imitates a pattern from nature. Can you tell which is which?

[Image: the two point patterns from Pinker’s book, side by side]

Thought about it?

Here is Pinker’s explanation.

The one on the left, with the clumps, strands, voids, and filaments (and perhaps, depending on your obsessions, animals, nudes, or Virgin Marys) is the array that was plotted at random, like stars. The one on the right, which seems to be haphazard, is the array whose positions were nudged apart, like glowworms.

That’s right, glowworms. The points on the right record the positions of glowworms on the ceiling of the Waitomo cave in New Zealand. These glowworms aren’t sitting around at random; they’re competing for food and nudging themselves away from each other. They have a vested interest against clumping together.

Update: Try this out for yourself. After reading this article, praptak and roryokane over at Hacker News wrote a script that will generate random and uniform distributions in your browser, nicely illustrating the point.

Try to sprinkle sand uniformly over a surface, and it might look like the pattern on the right. You’re instinctively avoiding places where you’ve already dropped sand. Random processes have no such prejudices; the grains of sand simply fall where they may, clumps and all. It’s more like sprinkling sand with your eyes closed. The key difference is that randomness is not the same thing as uniformity. True randomness can have clusters, like the constellations that we draw into the night sky.
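To see the difference for yourself, here’s a minimal sketch (in Python, assuming matplotlib is available) of the two kinds of pattern: truly random points, and points nudged apart by rejection sampling, like the glowworms.

```python
import random
import matplotlib.pyplot as plt

def random_points(n):
    # Truly random: each point ignores all the others,
    # so clumps and voids appear naturally.
    return [(random.random(), random.random()) for _ in range(n)]

def nudged_points(n, min_dist=0.03):
    # Rejection sampling: discard any point that lands too close
    # to an existing one, like glowworms competing for space.
    points = []
    while len(points) < n:
        x, y = random.random(), random.random()
        if all((x - px) ** 2 + (y - py) ** 2 >= min_dist ** 2
               for px, py in points):
            points.append((x, y))
    return points

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
for ax, pts, title in [(axes[0], random_points(300), "random"),
                       (axes[1], nudged_points(300), "nudged apart")]:
    ax.scatter(*zip(*pts), s=4, color="black")
    ax.set_title(title)
    ax.set_aspect("equal")
plt.show()
```

Run it a few times – the clumps on the random side reshuffle, but they never go away.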

Here’s another example. Imagine a professor asks her students to flip a coin 100 times. One student diligently does the work and writes down the results. The other student is a bit of a slacker and decides to make up fake coin tosses instead of doing the experiment. Can you identify which student is the slacker?

Student 1:

THHHTHTTTTHTTHTTTHHTHTTHT

HHHTHTHHTHTTHHTTTTHTTTHTH

TTHHTTTTTTTTHTHHHHHTHTHTH

THTHTHHHHHTHHTTTTTHTTHHTH

Student 2:

HTTHTTHTHHTTHTHTHTTHHTHTT

HTTHHHTTHTTHTHTHTHHTTHTTH

THTHTHTHHHTTHTHTHTHHTHTTT

HTHHTHTHTHTHHTTHTHTHTTHHT

Take a moment to reason this through.

The first student’s data has clusters – long runs of up to eight tails in a row. This might look surprising, but it’s actually what you’d expect from random coin tosses (I should know – I did a hundred coin tosses to get that data!). The second student’s data is suspiciously lacking in clusters. In fact, in a hundred coin tosses, they didn’t get a single run of four or more heads or tails in a row. This has about a 0.1% chance of ever happening, suggesting that the student fudged the data (and indeed I did).
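You can check this for yourself. Here’s a quick sketch that finds the longest run in each student’s data (the sequences are copied from above):

```python
# Longest run of identical flips in each student's data.
student1 = ("THHHTHTTTTHTTHTTTHHTHTTHT" "HHHTHTHHTHTTHHTTTTHTTTHTH"
            "TTHHTTTTTTTTHTHHHHHTHTHTH" "THTHTHHHHHTHHTTTTTHTTHHTH")
student2 = ("HTTHTTHTHHTTHTHTHTTHHTHTT" "HTTHHHTTHTTHTHTHTHHTTHTTH"
            "THTHTHTHHHTTHTHTHTHHTHTTT" "HTHHTHTHTHTHHTTHTHTHTTHHT")

def longest_run(flips):
    best = run = 1
    for prev, cur in zip(flips, flips[1:]):
        run = run + 1 if cur == prev else 1  # extend or reset the run
        best = max(best, run)
    return best

print(longest_run(student1))  # 8 - a run of eight tails
print(longest_run(student2))  # 3 - never even four in a row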

[Comic by Scott Adams]

Trying to work out whether a pattern of numbers is random may seem like an arcane mathematical game, but this couldn’t be further from the truth. The study of random fluctuations has its roots in nineteenth-century French criminal statistics. As France was rapidly urbanizing, population densities in cities began to shoot up, and crime and poverty became pressing social problems.

[Image: portrait of Adolphe Quetelet, by Joseph-Arnold Demannez]

In 1825, France began to collect statistics on criminal trials. What followed was perhaps the first instance of statistical analysis used to study a social problem. Adolphe Quetelet was a Belgian mathematician, and one of the early pioneers of the social sciences. His controversial goal was to apply probability ideas used in astronomy to understand the laws that govern human beings.

In the words of Michael Maltz,

In finding the same regularity in crime statistics that was found in astronomical observations, he argued that, just as there was a true location of a star (despite the variance in the location measurements), there was a true level of criminality: he posited the construct of l’homme moyen (the “average man”) and, moreover, l’homme moyen moral. Quetelet asserted that the average man had a statistically constant “penchant for crime,” one that would permit the “social physicist” to calculate a trajectory over time that “would reveal simple laws of motion and permit prediction of the future” (Gigerenzer et al, 1989).

Quetelet noticed that the conviction rate of criminals was slowly falling over time, and deduced that there must be a downward trend in the “penchant for crime” in French citizens. There were some problems with the data he used, but the essential flaw in his method was uncovered by the brilliant French polymath and scientist Siméon-Denis Poisson.

[Image: portrait of Siméon-Denis Poisson]

Poisson’s idea was both ingenious and remarkably modern. In today’s language, he argued that Quetelet was missing a model of his data. He didn’t account for how jurors actually came to their decisions. According to Poisson, jurors were fallible. The data that we observe is the rate of convictions, but what we want to know is the probability that a defendant is guilty. These two quantities aren’t the same, but they can be related. The upshot is that when you take this process into account, there is a certain amount of variation inherent in conviction rates, and this is what one sees in the French crime data.

In 1837, Poisson published this result in “Research on the Probability of Judgments in Criminal and Civil Matters”. In that work, he introduced a formula that we now call the Poisson distribution. It tells you the odds that a given number of infrequent events will occur (such as the majority of French jurors coming to the wrong decision). For example, let’s say that on average, 45 people are struck by lightning in a year. Feed this average into Poisson’s formula, and it will spit out the odds that, say, 10 people will be struck by lightning in a year, or 50, or 100. The assumption is that lightning strikes are independent, rare events that are just as likely to occur at any time. In other words, Poisson’s formula can tell you the odds of seeing unusual events, simply due to chance.
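In modern notation, the formula gives the probability of exactly k events as λ^k e^(−λ) / k!, where λ is the average rate. Here’s a minimal sketch of the lightning example (the average of 45 strikes a year is the made-up figure from above):

```python
from math import exp, factorial

def poisson(k, lam):
    # Probability of exactly k events when the average rate is lam.
    return lam ** k * exp(-lam) / factorial(k)

lam = 45  # average number of people struck by lightning per year
for k in (10, 45, 50, 100):
    print(f"P({k:>3} struck in a year) = {poisson(k, lam):.2e}")
```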

[Comic by Randall Munroe]

One of the first applications of Mr. Poisson’s formula came from an unlikely place. Leap sixty years ahead, over the Franco-Prussian War, and land in 1898 Prussia. Ladislaus Bortkiewicz, a Russian statistician of Polish descent, was trying to understand why, in some years, an unusually large number of soldiers in the Prussian army were dying from horse kicks. In a single army corps, there were sometimes 4 such deaths in a single year. Was this just coincidence?

A single death by horse kick is rare (and presumably independent, unless the horses had a hidden agenda). Bortkiewicz realized that he could use Poisson’s formula to work out how many deaths you would expect to see. Here is the prediction, next to the real data.

Deaths by horse kick in a year    Predicted instances (Poisson)    Observed instances
0                                 108.67                           109
1                                  66.29                            65
2                                  20.22                            22
3                                   4.11                             3
4                                   0.63                             1
5                                   0.08                             0
6                                   0.01                             0

See how well they line up? The sporadic clusters of horse-related deaths are just what you would expect if horse-kicking was a purely random process. Randomness comes with clusters.
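The predicted column is easy to reproduce. The totals aren’t shown in the table, so this sketch assumes the figures from Bortkiewicz’s original dataset: 200 corps-years of observations with 122 deaths in all, for an average of λ = 0.61 deaths per corps per year.

```python
from math import exp, factorial

corps_years = 200   # 10 corps observed over 20 years (assumed)
total_deaths = 122  # total horse-kick deaths in that period (assumed)
lam = total_deaths / corps_years  # 0.61 deaths per corps per year

for k in range(7):
    expected = corps_years * lam ** k * exp(-lam) / factorial(k)
    print(f"{k} deaths: {expected:6.2f} predicted instances")
```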

[Comic by Ryan North]

I decided to try this out for myself. I looked for publicly available datasets of deaths due to rare events, and came across the International Shark Attack File, which tabulates worldwide incidents of sharks attacking people. Here’s the data on shark attacks in South Africa.

Year    Shark attacks in South Africa
2000    4
2001    3
2002    3
2003    2
2004    5
2005    4
2006    4
2007    2
2008    0
2009    6
2010    7
2011    5

The numbers are fairly low, with an average of 3.75 attacks a year. But compare 2008 and 2009. One year has zero shark attacks, and the next has 6. And then in 2010, there are 7. You can already imagine the headlines crying out, “Attack of the sharks!” But is there really a shark rebellion, or would you expect to see these clusters of shark attacks due to chance? To find out, I compared the data to Mr. Poisson’s prediction.

[Figure: observed shark-attack counts (blue bars) with the Poisson prediction (red dotted line)]
“Anyone else see the shark fin?” Nice catch, by @Gareth_Elms

In blue are the observed counts of years with 0, 1, 2, 3, … shark attacks. For example, the long blue bar represents the 3 years in which there were 4 shark attacks (2000, 2005, and 2006). The red dotted line is the Poisson distribution, and it represents the outcomes that you would expect if the shark attacks were a purely random process. It fits the data well – I found no evidence of clustering beyond what is expected by a Poisson process (p=0.87). I’m afraid this rules out the great South African shark uprising of 2010. The lesson, again, is that randomness isn’t uniform.
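Here’s a sketch of that comparison, using the data from the table above (the p-value calculation is left out; this just prints observed versus expected counts):

```python
from collections import Counter
from math import exp, factorial

# Shark attacks in South Africa, 2000-2011 (from the table above).
attacks = [4, 3, 3, 2, 5, 4, 4, 2, 0, 6, 7, 5]
lam = sum(attacks) / len(attacks)  # 3.75 attacks per year
observed = Counter(attacks)

for k in range(8):
    expected = len(attacks) * lam ** k * exp(-lam) / factorial(k)
    print(f"{k} attacks: {observed.get(k, 0)} year(s) observed, "
          f"{expected:.2f} expected")
```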

Which brings us back to the buzzbombs. Here’s a visualization of the number of bombs dropped over different parts of London, reconstructed by Charles Franklin using the original maps in the British Archives in Kew.

[Image: distribution of buzzbomb strikes over London, reconstructed by Charles Franklin]

Note: A clarification. The plot above shows the distribution of bombs that were dropped over London. The question I’m asking is, if you zoom in on the part of the city most heavily under attack (essentially the mountain that you see in the figure above), are the bombs being guided more precisely, to hit specific targets?

It’s far from a uniform distribution, but does it show evidence of precise targeting? At this point, you can probably guess how to answer this question. In a report titled An Application of the Poisson Distribution, a British statistician named R. D. Clarke wrote,

During the flying-bomb attack on London, frequent assertions were made that the points of impact of the bombs tended to be grouped in clusters. It was accordingly decided to apply a statistical test to discover whether any support could be found for this allegation.

Clarke took a 12 km × 12 km heavily bombed region of South London and sliced it up into a grid. In all, he divided it into 576 squares, each about the size of 25 city blocks. Next, he counted the number of squares with 0 bombs dropped, 1 bomb dropped, 2 bombs dropped, and so on.

In all, 537 bombs fell over these 576 squares. That’s a little under one bomb falling per square, on average. He plugged this number into Poisson’s formula, to work out how much clustering you would expect to see by chance. Here’s the relevant table from his paper:

No. of flying bombs per square    Expected no. of squares (Poisson)    Actual no. of squares
0                                 226.74                               229
1                                 211.39                               211
2                                  98.54                                93
3                                  30.62                                35
4                                   7.14                                 7
5 and over                          1.57                                 1

Compare the two columns, and you can see how incredibly close the prediction comes to reality. There are 7 squares that were hit by 4 bombs each – but this is what you would expect by chance. Within a large area of London, the bombs weren’t being targeted. They rained down at random in a devastating, city-wide game of Russian roulette.
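Clarke’s numbers are easy to reproduce: 537 bombs over 576 squares gives λ = 537/576 ≈ 0.93 bombs per square. A sketch:

```python
from math import exp, factorial

squares, bombs = 576, 537
lam = bombs / squares  # about 0.93 bombs per square

cumulative = 0.0
for k in range(5):
    p = lam ** k * exp(-lam) / factorial(k)
    cumulative += p
    print(f"{k} bombs: {squares * p:7.2f} squares expected")
print(f"5+ bombs: {squares * (1 - cumulative):6.2f} squares expected")
```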

The Poisson distribution has a habit of cropping up in all sorts of places, some inconsequential, and others life-altering. The number of mutations in your DNA as your cells age. The number of cars ahead of you at a traffic light, or patients in line before you at the emergency room. The number of typos in each of my blog posts. The number of patients with leukemia in a given town. The numbers of births and deaths, marriages and divorces, or suicides and homicides in a given year. The number of fleas on your dog.

From mundane moments to matters of life and death, these Victorian scientists have taught us that randomness plays a larger role in our lives than we care to admit. Sadly, this fact offers little consolation when life’s cards don’t fall in your favor.

“So much of life, it seems to me, is determined by pure randomness.” – Sidney Poitier

References

Shark attacks and the Poisson approximation. A nice introduction to using Poisson’s formula, with applications including the birthday paradox, one of my favorite examples of how randomness is counter-intuitive.

From Poisson to the Present: Applying Operations Research to Problems of Crime and Justice. A good read about the birth of operations research as applied to crime.

Applications of the Poisson probability distribution. Includes a list of many applications of the Poisson distribution.

Steven Pinker’s book The Better Angels of Our Nature has many great examples of how our intuition about randomness is generally wrong.

Want to know more about the accuracy of the flying bombs? The story is surprisingly rich, involving counterintelligence and espionage. Here’s a teaser.


  • Pedro Daltro

    amazing!

  • http://twitter.com/SPulim Sandeep Pulim

    Wow…Thanks for sharing!

  • Gay 8er

    Good read

  • Ryan Beesley

    Interesting read.

    I was recently watching a magic trick that involved a “random” die throw. What I noticed about the supposedly random die was that in 5 throws it never once landed on a face it had already shown… there was no grouping, no collisions.

    Using what I knew about hash collisions and the birthday paradox, I satisfied myself that what was supposedly random was in fact not random at all. To do this I calculated the odds that on each die throw the die would land on a face that had already been thrown. By the 5th throw, there was only a 10% chance that the die wouldn’t have resulted in a collision. By the 6th throw, that fell to less than 2%, and while 2% certainly allows for the possibility that the throws were indeed random, I was fairly certain I was looking at loaded dice.
    It occurs to me that what I was trying to solve is a somewhat related problem. The difference is that if you are measuring shark attacks there is no upper bound on how many shark attacks you could have in a year. Therefore I believe the problem I was solving is a related but slightly more constrained version of the same problem.

    As to the random distribution of V-1 bombs, I don’t know that it is entirely random still. The V-1 was targeted in that there were probably several launched from the same site with similar amounts of fuel, meaning that there was some trajectory involved. Since they weren’t actively guided bombs, other factors such as atmospheric conditions and manufacturing inaccuracies compounded to produce a bomb that may have been intended to have more precise attack patterns, but chaos is actually what generated the random distribution seen in the data.

  • http://www.facebook.com/profile.php?id=503220607 Josh On

    There was a fantastic Radiolab episode that covered some of these ideas: http://www.radiolab.org/2009/jun/15/

  • http://thisiscsr.com/ CSR

    Brilliant!!

    I have never really liked probability and statistics (which kind of explains why I don’t usually like Quantum mechanics). But this article got me HUGELY interested in randomness!

    By far, the best thing I’ve read this week!

    p.s: there’s a broken link where you’ve linked ‘Poisson’s distribution.’

    • http://www.empiricalzeal.com Aatish

      Very happy to hear it. Quantum mechanics is awesome, so I hope you get over your bias against it :) I suggest picking up QED by Richard Feynman as a tricky but very enlightening read on the subject. Here’s a teaser to pique your interest. http://www.empiricalzeal.com/2011/06/10/why-a-quantum-particle-is-not-like-a-water-drop-a-tale-of-two-slits-part-1/

      • http://thisiscsr.com/ CSR

        Another clincher. I’m into QM now thanks to you :)
        That interference pattern from electrons sent one at a time strangely brings me back to Poisson’s distribution :P I guess I’ll have to really come to terms with statistics to begin my journey back into QM after all these years.

  • Adolf Hitler

    Very nice read! Thank you.

  • Jo

    Poisson distribution pdf link is broken.

    • http://www.empiricalzeal.com Aatish

      fixed. thanks

  • maninthemiddle

    Excellent article. Thanks

  • alex

    I’ve learned a few things. Thanks!

  • http://www.facebook.com/john.mount John Mount

    Great article. I (and others) recently tried to test the video game XCOM’s pseudo-random number generator for fairness: http://www.win-vector.com/blog/2012/12/how-to-test-xcom-dice-rolls-for-fairness/ . Ran into a lot of issues with player perception.

  • Prateek

    Cool! One thing about the bombs on London which is confusing me: The analysis with the cells shows that the cells were chosen independently. But what if the “5 and over” cells all cluster around a point, and overall the distribution tails off away from that point? If you choose your binning fine enough, wouldn’t you end up with a Poisson-like distribution for a finite sample generated from almost any underlying distribution?

    • http://www.empiricalzeal.com Aatish

      The cells weren’t chosen independently – they are just a subgrid of a square region of London. You may be right that all the heavily bombed regions are next to each other, the only way to know is to get your hands on the data, which I’ve been hoping to do.

      I disagree with the idea that any distribution would tend to the Poisson distribution if binned finely enough – think of localized spikes in the extreme case (delta functions): no matter how finely you bin, you’d never get a fit to a Poisson. Also, at some scale of binning, the assumptions that go into the Poisson distribution must break down.

      • Prateek

        Agree with the delta function spike, but let’s stick with smooth distributions. Why do you say that decreasing the bin size decreases the sample size? You’d still have the same “N”, but now distributed over more “cells”, right?

        • http://www.empiricalzeal.com Aatish

          Ah, I misunderstood your point. Are you looking at distributions with large mean? If so, I don’t expect it to fit a Poisson distribution even when you get to the limit of one bin for each integer. If you have a low mean and continuous distribution – it could be bimodal or have a very long tail (like an exponential), and then it won’t fit a Poisson. If you’re assuming a normal-like distribution with low mean, then sure, I agree – because the Poisson is exactly that limit.

          • Prateek

            Hmm, I am not sure I agree (if I understand what you said). Take a bimodal continuous distribution, with whatever mean you like. Then sample a million points in a given range. Now bin the data finely enough that the expected number of points in a given bin is say, 2. Is it not true that the number of bins with 0,1,2… points are given by the Poisson distribution? (Btw, I don’t mean to pick nits, just curious).

          • http://www.empiricalzeal.com Aatish

            Eesh.. I was getting confused again. I thought we were binning an empirical distribution with a finite number of points. Now I see that you mean to draw samples from a distribution. (sorry for the confusion, just glossed over your meaning even though you were quite clear). You’re probably right, but I’ll have to think it over a bit – not sure why this is true.

  • Sebi

    If I understand it right you could predict the next thing with a higher probability. e.g. The shark attacks distribution looks like there will be only one attack in 2012. So you can predict random numbers, if they happen in a chain?

    • Anders Jackson

      No, I don’t think you can. If you could, it wouldn’t be random.
      The attacks are supposed to happen without any dependency on other shark attacks.
      You might want to do some Markov chain analysis to see if there actually is some non-random part, where the next event depends on the previous one. But I guess there are better ways to see that.

    • Jason

      No, you cannot predict random numbers. There is no guarantee that a Poisson distribution will be present. It is just highly likely that random numbers will form a Poisson distribution as the number of observations grows large. So the odds of there being one shark attack in 2012 are the same as they were in 2011, and in every other year (assuming no change in shark hunting habits and the number of sharks). The odds of there being many, many years without one shark attack, however, are considerably lower, but still non-zero.

      • http://www.facebook.com/profile.php?id=639065138 Darek Jedzok

        The thing I don’t get is that the “Deaths by Horse Kick” table looks like a very precise prediction, all 8 steps are spot on. How is that possible?

        • http://www.empiricalzeal.com Aatish

          Because even though each individual death is a random event, they occur at a certain fixed rate, so you can predict how many streaks one should see according to the rules of chance. You don’t know when it will happen, but you can predict how many times it happens, on average. It’s like saying, I can predict how many people will win a lottery, but I don’t know who will win.

        • Gen Zhang

          Actually, it is well known that this particular dataset turns out to have been fabricated. Run a quick chi-squared test or something, and the deviation from the perfect Poisson distribution is far too small to be random.

  • peter jankuliak

    Thank you for the article, I enjoyed reading it.

    Poisson distribution is new to me, but I could rationalize to myself why I would use the binomial one. Thus I’m having trouble understanding why one would choose one over the other. In the case of buzzbombs the table would look similar:

    No. of flying bombs per square    Expected no. of squares (Binomial)
    0                                 226.56
    1                                 211.59
    2                                  98.62
    3                                  30.59
    4                                   7.10

    • http://www.empiricalzeal.com Aatish

      Hi – great question! It turns out that the Poisson distribution is just a limiting case of the binomial distribution. Think of the year of bombing as being made up of a huge number (N) of infinitesimal moments, each with the same fixed probability p of a single bomb falling. Poisson took the limit of the binomial distribution as N, the number of such moments, goes to infinity, while keeping the expected number of bombs ( = p*N) constant.

      From wikipedia: “There is a rule of thumb stating that the Poisson distribution is a good approximation of the binomial distribution if N is at least 20 and p is smaller than or equal to 0.05, and an excellent approximation if n ≥ 100 and np ≤ 10.” Citation: http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc331.htm
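      Here’s a quick numerical sketch of that limit, using the bomb numbers from the article:

      ```python
      from math import comb, exp, factorial

      lam = 537 / 576  # expected bombs per square, from Clarke's data

      # Split the exposure into N slices, each with probability p = lam/N
      # of a bomb; Binomial(N, p) approaches Poisson(lam) as N grows.
      for N in (10, 100, 10_000):
          p = lam / N
          binom = comb(N, 2) * p**2 * (1 - p) ** (N - 2)
          poisson = lam**2 * exp(-lam) / factorial(2)
          print(f"N={N:>6}: P(2 bombs) binomial={binom:.5f} vs poisson={poisson:.5f}")
      ```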

      • Peter Jankuliak

        I think I get it, just need to digest :). Thanks again.

        • http://www.empiricalzeal.com Aatish

          In hindsight I muddled up that explanation, because I was describing a Poisson distribution over time (dividing a time interval into tiny moments), but the bombs example is a distribution in space, not time. The same idea can apply, but now instead you divide a geographical area into tiny squares. If there are too few squares, the Poisson distribution won’t work well.

  • Gculliss

    In the London example, one can see that if you zoom in far enough you will find randomness. You could perform the same analysis by looking at the distribution of hits on a bullseye and conclude that the hits were random, but when looking at the dartboard it would be clear that they were well targeted shots. So, finding randomness is a function of the scope of your inquiry.

    • http://www.facebook.com/sandip.dev Sandip Dev

      On the contrary, hits on a dartboard are expected to be clustered. It would be random if you closed your eyes and hit. And randomness is not dependent on the scope of your inquiry.

      • http://twitter.com/_rahim Rahim Packir Saibo

        I believe your example disproves your point.

        Imagine you were an alien using only the location of the darts to detect some non random process in action (in this case the whims of human beings).

        Assuming closing your eyes led to always hitting a random point on the board then we’d expect sampling equal areas of the board to fit Poisson.

        However, if our samples were board sized and covered the room, or room sized and covered the house we’d expect to see a distribution that was decidedly non random.

    • http://twitter.com/Jiaaro James Robert

      It’s worth considering that the real goal was to uncover whether the bombs were hitting specific targets (like a certain building), or if they were aimed at London in general.

  • http://twitter.com/socrates1998 socrates

    I have read Nassim N. Taleb’s books on similar topics. Is the Poisson distribution related to, or completely different from, the Gaussian distribution? Or is it more like a power law?
    If you could point me in the right direction, I would appreciate it.

    • http://www.empiricalzeal.com Aatish

      Here’s one way it’s related – if you have a Poisson distribution with a large mean, it approaches a Gaussian distribution. Both the Gaussian and Poisson distributions are limits of something called the binomial distribution, which is the one that tells you the odds of getting, say, 3 heads out of 10 coin tosses. More here http://www.phy.duke.edu/courses/391/faqs/faq3/node2.html

  • Clifton

    A couple important themes of Pynchon’s _Gravity’s Rainbow_ involve the Poisson distribution and randomness as specifically applied to V-2 rockets falling on London. You might enjoy a read, if you can get through it. (It’s not easy going for most readers.)

  • Cookie

    I think you’ve written a great article. Many examples and insightful thoughts. Before I continue, I must emphasize what I’m going to say are my opinions alone. I don’t mean to offend anybody. So here goes. I don’t think the question What does randomness look like? is a well framed question. It might give people the wrong impression. Think of a hypothetical equation that can tell you everything about anything in this universe. By anything, I mean any subatomic/atomic/micro/macro object that exists in this universe. Before someone talks about the Uncertainty principle, let’s assume this hypothetical equation doesn’t give a damn about it and can still accurately tell you everything about anything in this universe. Let’s also assume this equation knows that you know about it. Thus there are no loopholes. Would anything seem “random” to you? No. Does humankind have such an equation? No? Good, probability is useful. Even if we had such an equation, would we be able to design a computer that can store all the data required? No? Great. The use of probability becomes obvious. In my opinion nothing in the universe is random. Everything has an effect, no matter how infinitesimal, on everything else. As humans we cannot quantify all these effects, at least not in real time. Thus we try to limit our scope of observation.

    In my opinion probability (and its unpopular offspring, statistics) are based on a few basic ideas:

    1) We perform a set of actions (let’s refer to this set as experiment)

    2) The experiment can be repeated innumerable times under the same conditions.

    3) The experiment can have multiple outcomes.

    The last two points are crucial. We tacitly assume the experiment (such as tossing a coin) can be repeated in the exact same way innumerable times. If the experiment always produced the same outcome, then it would have a unique deterministic outcome, and it would not seem random to us. Thus we are interested only in experiments that can have multiple outcomes. Now that we have this in place, let’s ask ourselves: can we really reproduce any experiment in real life? Impossible. Even if we managed to execute all the actions in the exact same way as before, the universe, which is changing, will have some effect on it. We can never reproduce any experiment exactly. Of course, this problem is solved as easily as it arose. Simply define the experiment better. When the Germans dropped the bombs on London, did they do so under identical conditions? Of course not. Did it bother R.D. Clarke? No. He defined his experiment in such a way that weather and everything else that’s superfluous was removed. Does this mean that those superfluous factors didn’t matter? No. They did. R. D. Clarke chose to ignore them (thank God). Something that seems random is not necessarily random. We are actually trying to account for what we don’t know.

    For the two pattern example, let’s say a computer generates a sequence {1,2,3,4,5,6,7,8,9,10}. Would you say it’s not random? Why? The sequence {1,2,3,4,5,6,7,8,9,10} is as random as the sequence {10,7,6,1,3,5,8,2,4,9} with either having a chance of 1/10! of occurring. Hard to really call something random. Nothing is random. It’s easier to accept the limitations on the information available. Then we can use suitable mathematical tools to tackle lack of information. Probability (which studies randomness) is one such tool. Most of what has been described here is a frequentist view of probability, where one observes an experiment repeatedly and checks how many times any outcome occurs. It’s not really “random”. We are checking what effect all the information we ignored has on the outcome. Even if we deduce the probability distribution for an experiment, the future use of it is faithfully based on the fact that the experiment will repeat in the same way as it did in the past. To make a long story short, randomness is really lack of information. It is not necessarily a pattern. By suitably defining the experiment, these patterns can be changed. Once again, my opinion. I don’t expect anyone to agree with me.

    • 2 S’s

      For reference, why the sequence {1,2,3,4,5,6,7,8,9,10} looks less random than {10,7,6,1,3,5,8,2,4,9} is related to Kolmogorov complexity. The theory would show that some sequences look ‘more random’ because of how incompressible they are; they have less structure. As an example of the idea, I can refer to the first sequence by {1,..,10}, but there is no obvious way to shorthand the second. (There may be one for it, but some of the sequences of numbers will not be shrinkable. And it is – in a precise way – very hard to tell which are not compressible.)

      Of course, this is not really related to your point since they are equally likely, probability-wise, but you appealed to the appearance of randomness, to which this is closely related.

  • http://twitter.com/worldofpiggy Piggy

    Nice post! Easy and interesting reading.

  • http://twitter.com/hwestiii Howard West

    I had the same reaction as @clifton, you’ve just restated the first 100 pages of “Gravity’s Rainbow” in terms that real people can understand.

  • dude_1024

    Stars are randomly distributed? Hmmm, galaxies seem to have a lot of order to me. I get the idea of humans looking for patterns and being deceived, but start with something completely random to base your claim on. Also, while glowworms may be actively distributing themselves, there is no global patterning, thus the information channel is noisy, introducing a certain level of randomness into their placement.

    • Rx

      Stars in the sky are randomly distributed. They are all pretty close to us and thus the grander-scale order in the galaxy doesn’t come into play.

  • Mitch

    Here’s the actual London WWII bombing data that everyone is talking about: http://bombsight.org/

  • AddyVar

    Well written – but I still don’t get it – while Poisson’sD does tell the probability of ‘X’ num of events occurring together over a unit interval/area – it still doesn’t say anything about events clustering around in successive/adjacent areas. E.g, for the above, there’s a 36 % chance of 1 bomb dropping per unit area – but that doesn’t tell us anything about how many such ‘units’ will be geographically adjacent! How do we account for real geo clustering of at least 1 bomb per unit area?

    Well written – but I still don’t get it – while Poisson’sD does tell the probability of ‘X’ num of events occurring together over a unit interval/area – it still doesn’t say anything about events clustering around in successive/adjacent areas. E.g, for the above, there’s a 36 % chance of 1 bomb dropping per unit area – but that doesn’t tell us anything about how many such ‘units’ will be geographically adjacent! How do we account for real geo clustering of at least 1 bomb per unit area?