# Reverend Bayes helps us understand the Higgs

I was listening to Tim Harford’s More or Less podcast last week in the wake of all of the excitement over the possible detection of the Higgs boson. More or Less is a show all about statistical abuses in the news and in life, and I strongly recommend it. In this episode, they had Robert Matthews, a physicist from Aston University, talking about the nature of statistical significance. He did a good job, but his discussion was oddly lacking any reference to the Reverend Thomas Bayes.

This post will be a little bit technical (but nothing on the level of what we’ve seen in some of our relativity discussions), so you may want to get a bit of scratch paper out. It deals with how we actually discover new things, and in particular, how we assess the uncertainty of things that we can’t possibly know.

I should also add that while I will throw around a few example numbers, and while these resemble the real numbers put out by the ATLAS and CMS experiments at the LHC and by the Tevatron at Fermilab, this is really meant to be an overview.

What $\sigma$ means

Take the plot at the top of the page. That’s a measurement of the signal and noise from the ATLAS collaboration. They were, as you know, looking for the Higgs boson. The idea of the plot is that if there is no Higgs, then the signal (the solid line) should appear within the green band approximately 68% of the time (so-called $1\sigma$), and within the yellow band approximately 95% of the time ($2\sigma$).

This “68% of the time” or “95% of the time” needs some explanation. An experiment is going to be noisy. Sometimes the noise is going to produce an uptick, and sometimes a downtick. If there were no Higgs, and if you ran the same experiment thousands of times, you’d expect to find yourself within the yellow 95% of the time, above the yellow 2.5% of the time, and below the yellow 2.5% of the time.

In other words, a $2\sigma$ result is suggestive, but even naively, it’s not overwhelming. I wouldn’t bet my life on something where, 1 time in 40, no signal at all could produce as significant a result.

There are other complications that I’m going to ignore here, since they don’t change the basic picture. One is that in searching for the Higgs, you get to look at lots of different possible masses. If you could look at 40 different masses, and all of them are independent, 1 of them is likely to be a $2\sigma$ result. This is the “look elsewhere” effect that is often cited alongside these results.
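To put a rough number on "likely," here is a minimal sketch. It assumes 40 fully independent mass bins and a 2.5% chance of an upward $2\sigma$ fluctuation in each, which is my simplification, not the experiments' actual procedure. The function name is just for illustration.

```python
def prob_at_least_one(p_single: float, n_trials: int) -> float:
    """Chance that at least one of n_trials independent looks fluctuates up."""
    # Complement of "every single look stays quiet".
    return 1.0 - (1.0 - p_single) ** n_trials

# Assumed inputs: 2.5% per mass bin, 40 independent bins.
p_any = prob_at_least_one(0.025, 40)
expected_count = 0.025 * 40  # average number of 2-sigma bumps from noise alone

print(f"P(at least one 2-sigma bump) = {p_any:.2f}")
print(f"Expected number of bumps     = {expected_count:.1f}")
```

On average you expect one such bump from noise alone, and the chance of seeing at least one is somewhat better than even.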

For our purposes, we want to only take into account the data around 125 GeV, and ask what we can say about whether a Higgs at that mass is likely or not. Taking all of this into account, the ATLAS result is at $2.3\sigma$. This means:

If there is no Higgs, the probability of finding a signal this far above the expected background level is about 1%.
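If you want to check where the 1% comes from, here is a small sketch. It assumes the noise is Gaussian and that we only care about upward fluctuations (a one-sided tail). Both are my simplifying assumptions, and the function name is made up for this post.

```python
import math

def one_sided_p_value(n_sigma: float) -> float:
    """Probability that Gaussian noise fluctuates upward by at least n_sigma."""
    # Upper tail of the standard normal, written with the complementary error function.
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

for n in (1.0, 2.0, 2.3, 3.1, 5.0):
    print(f"{n:>3} sigma -> p = {one_sided_p_value(n):.2e}")
```

Under those assumptions, $2.3\sigma$ comes out at roughly 1%, and $2\sigma$ at a bit over 2%, close to the "1 time in 40" figure used above.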

This does not mean that the probability that the Higgs is real is 99%. That was Matthews’s point. To figure out how likely the Higgs is, we need to say a bit about Bayesian inference.

Bayes Theorem

Suppose I told you that I had a magic genie that grants wishes. I could prove it. I wish for a coin that comes up heads every time. To test my conjecture, I flip a coin 5 times in a row and get heads every time. That’s only a 1 in 32 probability (about 3%)!

Would you assume that I really did have a genie?

• My argument (following the same logic that we applied to the Higgs) might be that the probability of the genie is 97%.
• The argument against is that while 5 heads in a row (especially if I called it ahead of time) is unlikely, genies are even more unlikely.

All of this can be formalized in a relation known as Bayes’ theorem, which says that if I have two events, A and B, their probabilities can be related with the expression:

$\displaystyle P(A|B)=\frac{P(B|A)P(A)}{P(B)}$

Don’t be scared off by the notation. It’s more straightforward than you might think. $P(A)$ and $P(B)$ simply mean the probability of A or B being true before we do any experiments at all.

In this case:

• $A$ = There really is a genie with the properties I described.
• $B$ = I flip 5 heads in a row.

$P(A|B)$ is “The probability of A given that B has already occurred.” In our case,

• $P(B|A)$ = The probability that we flip 5 heads if there really is a genie. = 100%. (That’s the rule for a genie.)

Computing the other terms is a bit more complicated. For one thing, we don’t really know ahead of time what the a priori probability of having a genie is ($P(A)$). This is really a measure of our pre-existing belief in genies.

At one extreme, I might say that there is literally zero chance of genies, or I might pick an incredibly small number, perhaps 1 in a million. At the other extreme, I might say that I’ve seen and interacted with genies before, so the chance is 100%. We really don’t know.

The good news, though, is that once we pick this number, we can compute everything else. The overall probability of flipping 5 heads, genie or no genie, is simply:

$\displaystyle P(B)=(1-P(A))\times \left(\frac{1}{2}\right)^5+P(A)$

Put into words, this is simply the probability of flipping 5 heads on a fair coin, weighted by the probability that there is no genie, added to the probability that there’s a genie (in which case 5 heads are guaranteed).

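From this, we can compute the “posterior probability” of there being a genie given that I’ve just flipped 5 heads, by plugging into Bayes’ theorem with $P(B|A)=1$:

$\displaystyle P(A|B)=\frac{P(A)}{(1-P(A))\times \left(\frac{1}{2}\right)^5+P(A)}$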

Naturally, if we already knew that there was a genie, our certainty was 100% before and after we did the experiment.

At the other extreme, if we really don’t believe in genies, this experiment isn’t going to help very much. If we thought that this particular type of genie had a one in a million chance of existing (a rather generous assumption), then even after seeing this “proof” we would only raise that belief from 0.0001% to 0.0032%.
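The genie arithmetic is easy to try for yourself. Here is a minimal sketch of the calculation described above, with the coin assumed fair when there is no genie and guaranteed heads when there is one. The function and variable names are my own.

```python
def posterior_genie(prior: float, n_heads: int = 5) -> float:
    """P(genie | n_heads heads in a row), given a prior belief in the genie."""
    p_heads_given_genie = 1.0            # the genie's coin always lands heads
    p_heads_no_genie = 0.5 ** n_heads    # fair coin, independent flips
    # P(B): total probability of seeing the heads, with or without a genie.
    evidence = prior * p_heads_given_genie + (1.0 - prior) * p_heads_no_genie
    return prior * p_heads_given_genie / evidence

for prior in (1e-6, 0.01, 0.5, 1.0):
    print(f"prior = {prior:<8} -> posterior = {posterior_genie(prior):.6f}")
```

With a one in a million prior the posterior comes out near 0.0032%, matching the number quoted above. Try raising `n_heads` to see how many flips it would take to convince a skeptic.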

And the Higgs

The calculation is very similar for the Higgs. We’re assuming that everything else has been ruled out and the “look elsewhere” effect has already been taken into account. We’re just asking about a Higgs at or near 125 GeV.

In this case, our two events might be:

• $A$ = There is no Higgs.
• $B$ = The ATLAS team finds a $2.3\sigma$ measurement (or higher).

We already know one of these terms, the probability of getting such a large amount of noise if there really is no Higgs:

$\displaystyle P(B|A)=0.01$

The number we don’t know, of course, is $P(A)$, the probability that there’s no Higgs. Theory strongly supports the existence of a Higgs particle in approximately the range we’re talking about, so I might be inclined to put the a priori probability of no Higgs as quite low, perhaps 10%. But let’s argue from the other direction. Suppose that you would have given the Higgs only a 5% chance of being real. That means:

$\displaystyle P(A)=0.95$

which can then be plugged in to compute

$\displaystyle P(B)=(1-P(A))+0.01\times P(A)\approx 0.06$

where I’ve assumed (not entirely accurately) that if there really were a Higgs, then we’d see a $2.3\sigma$ result 100% of the time.

Just to make this clear: we’ll get a signal this large either in the 5% of cases where there is a Higgs, or in the 95% of cases with no Higgs times the 1% chance of getting the level of noise that we see. That adds up to about 6%.

Plugging in the numbers, we’d get:

$\displaystyle P(A|B)=\frac{P(B|A)P(A)}{P(B)}=\frac{0.01\times 0.95}{0.06}\approx 0.16$

Or in other words, even if you believed with 95% confidence ahead of time that there shouldn’t be a Higgs, after conducting this experiment, your posterior probability is more like 16%. The likelihood of a Higgs is therefore 84%.
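Here is the same calculation as a short sketch, in case you want to plug in your own numbers. The third argument, the chance of seeing a signal this big if the Higgs is real, is set to 100% to match my (not entirely accurate) assumption above. The names and the 50% alternative are hypothetical, chosen only to show how that assumption matters.

```python
def posterior_no_higgs(prior_no_higgs: float,
                       p_signal_given_no_higgs: float,
                       p_signal_given_higgs: float = 1.0) -> float:
    """P(no Higgs | signal), following Bayes' theorem with A = 'no Higgs'."""
    prior_higgs = 1.0 - prior_no_higgs
    # P(B): signal from a real Higgs plus signal faked by noise.
    evidence = (prior_higgs * p_signal_given_higgs
                + prior_no_higgs * p_signal_given_no_higgs)
    return prior_no_higgs * p_signal_given_no_higgs / evidence

# The numbers used in the text: 95% prior for "no Higgs", 1% chance of noise this large.
skeptic = posterior_no_higgs(0.95, 0.01)
print(f"P(no Higgs | data) = {skeptic:.2f}")
print(f"P(Higgs | data)    = {1.0 - skeptic:.2f}")

# Hypothetical variation: a real Higgs only shows up this strongly half the time.
weaker = posterior_no_higgs(0.95, 0.01, p_signal_given_higgs=0.5)
print(f"With a 50% detection chance: P(no Higgs | data) = {weaker:.2f}")
```

Relaxing the 100% assumption makes the data less persuasive, but it does not change the basic picture.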

More generally, we can look at the posterior probabilities for any prior we like:

Note that the y-axis has a log scaling, and that it represents the probability of no Higgs (after seeing the data from ATLAS).
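If you would rather see numbers than a curve, the sketch below tabulates the same relationship for a handful of priors. It assumes the 1% noise probability from the ATLAS example and a 100% chance of seeing the signal if the Higgs is real. The list of priors is my own arbitrary choice.

```python
def posterior_no_higgs(prior_no_higgs: float, p_noise: float = 0.01) -> float:
    """P(no Higgs | signal), assuming a real Higgs always produces the signal."""
    evidence = (1.0 - prior_no_higgs) + p_noise * prior_no_higgs
    return p_noise * prior_no_higgs / evidence

# Arbitrary sample of prior beliefs that there is no Higgs.
priors = [0.1, 0.5, 0.9, 0.95, 0.99, 0.999, 0.9999]
table = [(p, posterior_no_higgs(p)) for p in priors]

for prior, post in table:
    print(f"prior P(no Higgs) = {prior:<7} -> posterior P(no Higgs) = {post:.4f}")
```

The more skeptical you start out, the more skeptical you remain, which is the whole point of the exercise.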

Some things to remember

1. I picked my prior probability of the Higgs rather arbitrarily. If you think it 99.99% probable that there’s no Higgs (or any probability you like) then it’s going to be that much harder to convince you.
2. ATLAS isn’t the only experiment in town. CMS detected something like a $1.9\sigma$ result, and the Tevatron had something like a $1\sigma$ result. Combined, these are at the $3.1\sigma$ level (about 0.1% probability by pure chance). Using our 5% prior probability of the Higgs really being in this range, we get about a 98% posterior probability. This is why I’m so confident that the Higgs will ultimately be officially detected.
3. The particle physics community usually relies on a $5\sigma$ result to “detect” a particle, which means less than 1 part in a million for $P(B|A)$. Even if you were 99.999% sure ahead of time that there was no Higgs (and really, how did you come to such a precise number?), a $5\sigma$ detection would still have a 97% probability of being real.
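The numbers in points 2 and 3 can be checked with a short sketch that chains the two steps together: turn a significance into $P(B|A)$, then apply Bayes’ theorem. It assumes Gaussian noise, a one-sided tail, and a 100% chance of seeing the signal if the Higgs is real. These are my simplifications, and the helper names are made up.

```python
import math

def one_sided_p_value(n_sigma: float) -> float:
    """Chance that Gaussian noise alone fluctuates up by at least n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

def posterior_higgs(prior_higgs: float, n_sigma: float) -> float:
    """P(Higgs | signal at n_sigma), assuming a real Higgs always gives the signal."""
    p_noise = one_sided_p_value(n_sigma)
    evidence = prior_higgs + (1.0 - prior_higgs) * p_noise
    return prior_higgs / evidence

# Point 2: combined 3.1 sigma, with a 5% prior for the Higgs.
combined = posterior_higgs(0.05, 3.1)
# Point 3: 5 sigma, with a 0.001% prior for the Higgs (99.999% sure there is none).
discovery = posterior_higgs(1.0 - 0.99999, 5.0)

print(f"3.1 sigma, 5% prior:     P(Higgs | data) = {combined:.3f}")
print(f"5 sigma,   0.001% prior: P(Higgs | data) = {discovery:.3f}")
```

Both results land close to the 98% and 97% quoted in the list.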

Apologies for the fairly technical post, and to the statistics community, apologies for playing fast and loose with some terminology.

In either case, I hope this helps to explain why there are some people who aren’t quite ready to believe the Higgs is real (yet) and why others (like me) are.

-Dave


### 4 Responses to Reverend Bayes helps us understand the Higgs

1. Thanks for the kind words about my piece on More or Less about significance testing abuse. I can understand why you’re puzzled about my decision to leave out mention of Bayes’s Theorem, given its key role in transposition of conditional probabilities. As you’d appreciate, one always has to make a judgement call about just how many unfamiliar ideas to hit an audience with in one go, and I decided introducing Bayes was just one too many… As it is, I’ve had to write an explanatory piece for listeners about the fairly easy stuff I did include!

Anyhow, it’s great to see you guys take on the job of explaining the issues further. Of course, the statistical analysis of the Higgs data is way more sophisticated and complicated than I was able to convey… but some of its concepts (e.g. the CLs method) include some p-value approaches that this Bayesian finds very hard to swallow!
Cheers
Robert

• dave says: