So far we’ve talked about probabilities for discrete systems, like getting three heads, or 25 date rejections. Now let’s move on to something that you run up against quite frequently: continuous probability distributions.

As an example, Bob decides to switch majors and study botany. For his senior thesis he gets it in his head that he wants to study the distribution of leaf lengths in Poa supina. This is the grass in the lawn outside the library where he spends most of his time. He spends some time measuring the lengths of blades of grass, compiling six measurements: 4.53 cm, 6.14 cm, 6.50 cm, 6.22 cm, 5.91 cm, 6.19 cm. But what does he do with this data? He’d like to plot it but doesn’t know quite how. He decides to plot it using impulses like this:

This is less than edifying. So Bob gets serious. After several hours with his nose in the grass he does 100 measurements. He gets this:

This doesn’t strike him as the best way of displaying the data. Then he remembers something about doing a histogram. You bin the data into buckets (or bins), say of size $1\,\mathrm{cm}$. And then you count the number of times a data point falls into a bucket. This gives:
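To make the binning step concrete, here is a minimal sketch in Python that applies it to Bob's six measurements. The function name `bin_counts` is made up for illustration, and placing the bin edges at whole multiples of the bin size is an assumption; the text does not say where Bob's bins start.

```python
import math
from collections import Counter

# Bob's six measurements from the text, in cm.
lengths_cm = [4.53, 6.14, 6.50, 6.22, 5.91, 6.19]

def bin_counts(data, bin_width):
    """Count how many data points land in each bin [k*w, (k+1)*w).

    Assumption: bin edges sit at whole multiples of bin_width.
    """
    counts = Counter(math.floor(x / bin_width) for x in data)
    # Key each bin by its left edge so the result is easy to read.
    return {k * bin_width: counts[k] for k in sorted(counts)}

counts = bin_counts(lengths_cm, 1.0)
print(counts)  # {4.0: 1, 5.0: 1, 6.0: 4}
```

With only six points, four of them fall in the 6 cm to 7 cm bucket, which is about all the histogram can tell you at this stage.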

But Bob still didn’t think this was accurate enough to show to his adviser. He went back and did a lot more measurements, 10,000 in all. He then binned them the same way and got:

This looks nice and symmetric. Clearly those measurements were worth the effort. He showed this result to his adviser, Professor Snodgrass, who peered down through his glasses and said, “That’s very interesting for a first try, but why don’t you use smaller bins?” Bob was crestfallen. “I’ll never make captain of the science club if I can’t think of that myself,” he said to himself. After that golden suggestion Bob gets:

This is starting to look more like a brontosaurus. “I wonder if there’s a connection?” thought Bob. Yet there was no grass at the time of the dinosaurs, so he was left with this elusive yet tantalizing similarity.

But one other thing he noticed was that the distribution seemed to have gotten smaller. The maximum has gone down by about a factor of 4. Why is that? Well, if your bin size is smaller, fewer points will land inside each bin. Therefore it makes sense to divide by the size of the bin:

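That looks more like it. You divide the count in each bin by the total number of points and by the size of the bin, so the height no longer depends on how many measurements you took or how finely you binned them.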
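The effect of this normalization can be checked numerically. The sketch below is only an illustration under stated assumptions: the text does not say how leaf lengths are distributed, so the 10,000 samples are drawn from a normal distribution with a made-up mean of 6 cm and spread of 0.5 cm, and the two bin sizes (0.2 cm and 0.05 cm) are chosen to be narrow compared with that spread. The name `density_histogram` is hypothetical.

```python
import math
import random
from collections import Counter

def raw_histogram(data, bin_width):
    """Return {bin_index: count} with bins [k*w, (k+1)*w)."""
    return Counter(math.floor(x / bin_width) for x in data)

def density_histogram(data, bin_width):
    """Return {left_edge: count / (N * bin_width)} for each occupied bin."""
    n = len(data)
    counts = raw_histogram(data, bin_width)
    return {k * bin_width: c / (n * bin_width) for k, c in sorted(counts.items())}

# Hypothetical stand-in for Bob's data (assumed normal, mean 6 cm, spread 0.5 cm).
rng = random.Random(0)
samples = [rng.gauss(6.0, 0.5) for _ in range(10_000)]

wide, narrow = 0.2, 0.05  # assumed bin sizes in cm, a factor of 4 apart

# Raw counts: the tallest bin shrinks roughly in proportion to the bin size.
raw_peak_wide = max(raw_histogram(samples, wide).values())
raw_peak_narrow = max(raw_histogram(samples, narrow).values())

# Normalized heights: the peaks are comparable regardless of bin size.
dens_wide = density_histogram(samples, wide)
dens_narrow = density_histogram(samples, narrow)
peak_wide = max(dens_wide.values())
peak_narrow = max(dens_narrow.values())

# Area under each histogram: sum of height * width over all bins.
area_wide = sum(h * wide for h in dens_wide.values())
area_narrow = sum(h * narrow for h in dens_narrow.values())

print("raw peaks:       ", raw_peak_wide, raw_peak_narrow)
print("normalized peaks:", peak_wide, peak_narrow)
print("areas:           ", area_wide, area_narrow)
```

The raw peaks differ by roughly the ratio of the bin sizes, while the normalized peaks come out close to each other, and the total area under either normalized histogram is 1 up to rounding. The narrower bins are noisier because each one holds fewer points, which is why Bob needs more measurements as the bins shrink.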

If Bob had the patience and the longevity to go to an astronomical number of measurements and small enough bins, he’d get:

This is a continuous distribution. In this limit you have bins of infinitesimal size. You’re interested in the probability of being in a bin divided by the size of a bin.

Another way of thinking about this is as a probability per unit length, or probability density. The name comes from the analogy with an ordinary density in one dimension: you take the mass and divide it by the length.
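In symbols, the limiting procedure described above can be written as follows. The notation is an assumption introduced here for illustration: $N$ is the total number of measurements, $\Delta x$ is the bin size, $n(x, x+\Delta x)$ is the number of measurements that land in the bin starting at $x$, and $p(x)$ is the resulting probability density.

$$p(x) = \lim_{\Delta x \to 0} \, \lim_{N \to \infty} \frac{n(x, x+\Delta x)}{N \, \Delta x}$$

Because $p(x)$ is a probability per unit length, you recover an actual probability only by multiplying by a length, that is, by integrating over an interval:

$$P(a \le x \le b) = \int_a^b p(x)\, dx, \qquad \int_{-\infty}^{\infty} p(x)\, dx = 1 .$$

If $x$ is measured in cm, then $p(x)$ carries units of $\mathrm{cm}^{-1}$, just as a mass per unit length carries units of mass divided by length.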

This is the right way of thinking about continuous distributions. You think about things in terms of probability density, and in fact you’ve got to be careful about the labeling of the vertical axis because now it denotes probability per unit length.