A young Catholic lay Apologist who has an addiction to all things Papist, Romanist, and shiny.

Wednesday, September 28, 2011

Perplexity's Exposition of Bayes' Theorem.

This post will contain Perplexity's explanation of Bayes' Theorem.

Bayes' Theorem:

The human mind assesses evidence for a belief or hypothesis in a certain manner and on the basis of certain assumptions. We naturally assume, for instance, that if a belief or hypothesis is probably true, then our evidence will hold few surprises for it. We’re inclined to think that a good way to discover reality is to collect evidence and adopt the belief which is least surprised by it. Conversely, we assume that if a belief makes our evidence very surprising (it doesn’t lead us to expect the evidence), then it’s not worthy of our belief…at least initially.

For example, take the following competing hypotheses: (1) Jones is not the murderer. (2) Jones is the murderer.

Suppose we have some evidence at the scene of the crime, namely the murder weapon, a gun which was licensed to and bought by Jones, and Jones’ fingerprints on the murder weapon. Call all of this evidence E.

Which of the competing hypotheses leads us to expect E, if we suppose it were true? Well, if (1) is true, it’d be surprising that Jones’ gun was the murder weapon! Of all the people in the world who own guns, why would Jones’ be the one? And even more shocking is that his fingerprints were on it! How couldn’t he be the murderer? Therefore, (2) makes much more sense of E; consequently, we naturally take E to confirm or support (2) rather than (1).

What I propose to do in this post is to outline a principled way in which we can discern when some evidence should lead us to think a belief or hypothesis is surprising, or to be expected. We would then have a principled way of discerning truth.

Let H = Hypothesis, E = Evidence, and K = Background Knowledge. We already know what a hypothesis is and have an intuitive grasp of what evidence is.[1] But, what is background knowledge? It’s all the things we can assume about the world which aren’t already included in E. Thus, in the homicide example above, K could include knowledge that Jones had motive for killing the victim, that Jones had stated on many occasions that he wished to kill this person, etc.

What we’re looking for is how probably true some hypothesis (call it H) is, given the evidence we have [E] and our background knowledge, K. That is, how probable do E and K make H? We can formalize this notion as follows: P(H|E & K), which stands for “the probability of H given E and K.”

P(H|E & K) is also known as posterior probability. It is this probability that we’re interested in when we’re trying to figure out how well evidenced a given hypothesis is. It’s this probability we should pay careful attention to in the debate to follow, where the position Alert argues for will be his hypothesis, and the position I argue for will be mine.

How do we figure this probability out? We figure it out by using something called Bayes’ Theorem. This may look intimidating at first, but it’s not nearly as complicated as it may initially appear, and it can be quite fun once you get the hang of it. Also, at the end of this post I’ve linked to a “Bayesian calculator,” which is an easy shortcut to getting H’s posterior:

P(H|E & K) = P(H|K) * P(E|H & K) / P(E|K)

So, there are two notions here that we need a fair understanding of; neither is complicated.

P(E|H & K) is simply how probable the evidence is given our hypothesis and background knowledge, or assuming our hypothesis and K to be the case, how likely is the evidence we have? This is the kind of reasoning we used in the homicide example above: we assumed each hypothesis was true and then discerned how probable the evidence was. This is also known as a hypothesis’ explanatory power.

P(H|K) is simply the probability of the hypothesis given our background knowledge, before we even take into account the evidence E for that hypothesis. This is known as prior probability, or prior for short.

Finally, P(E|K) is simply the prior of the evidence. That is, given our background knowledge, how probable is E?

So, Bayes’ Theorem can be translated as follows:

The posterior of H is equal to the prior of H times the explanatory power of H divided by the prior of E.
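
If it helps to see that translation spelled out, here’s a minimal sketch in Python; the function and variable names are just my own labels for the terms above, nothing official.

def posterior(prior_h, explanatory_power, prior_e):
    """Bayes' Theorem as stated above: P(H|E & K) = P(H|K) * P(E|H & K) / P(E|K).

    prior_h           -- P(H|K), the prior of H
    explanatory_power -- P(E|H & K), the explanatory power of H
    prior_e           -- P(E|K), the prior of E
    """
    return prior_h * explanatory_power / prior_e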

That’s almost all you need to know at this point. There remains only one more thing, perhaps the most important: where do you get the numbers from?

Well, when you go to the doctor’s office and you’re asked to rate your pain on a scale of 1-10, do you have any trouble representing your degree of pain numerically? I doubt it. It’s the same concept, except, instead of degrees of pain, we’re trying to numerically represent degrees of confidence. This is a pretty subjective thing, just like pain, which is why this is actually a great way for *you* to discern how a given piece of evidence affects the probability of a hypothesis, given your own relative knowledge, etc.

So, basically, 1 = the highest degree of confidence, 0 = the least, and .5 = neither confident nor unconfident.

When you roll three dice, how confident are you that they’re all going to land on 6? How about when rolling twenty dice? Thought experiments like these can be good ways to gauge your degrees of confidence. (I’ve seen lotteries used, or cards in a deck.)
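
If it helps to put numbers on those thought experiments, here’s a quick Python illustration (assuming fair, six-sided dice):

# Probability that n fair six-sided dice all land on 6 is (1/6) ** n.
p_three_dice = (1 / 6) ** 3    # about 0.0046 -- a very low degree of confidence
p_twenty_dice = (1 / 6) ** 20  # about 3e-16 -- essentially zero confidence
print(p_three_dice, p_twenty_dice)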

So, let’s return to the homicide example to see how this all fits together (if it does):

Let H1 = The hypothesis that Jones is the murderer.

Let H2 = The hypothesis that Jones is not the murderer.

Let E = The murder weapon is a gun bought by and licensed to Jones, and Jones’ fingerprints were on the murder weapon.

Let K = Fingerprints on murder weapons generally indicate who the murderer was; Jones had murderous intent towards the victim, motives for killing her, and so forth.

What is H1’s posterior? In other words, how probable is H1 given our evidence and background knowledge? Well, remember that H1’s posterior is equal to H1’s prior times H1’s explanatory power divided by E’s prior.

So, first, what are H1’s prior and explanatory power?

Personally, I’d say K makes H1 more probable than not, so I’m willing to say .6.

What is P(E|H1 & K)? Well, do H1 and K lead us to expect E? For me at least, yes! So, if the “doctor” asked me to rate my confidence that H1 and K make E expected on a scale of 0 to 1, 0 being the least and 1 being the most, I’d have to say far greater than .5! For simplicity, I’ll just say .75.

I’m not really sure about the prior of E, though. I don’t know *how* serious Jones was with his intent, but let’s suppose it was worrisome. I’d place it around .5. So how does this all add up?

Well, just take the numbers I’ve given, plug them into the theorem above, and calculate:

.6 * .75 / .5 = .9
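
Here’s the same arithmetic in Python, using the subjective estimates above (the long decimal it prints is just floating-point rounding, not a fact about the theorem):

# H1's prior, H1's explanatory power, and E's prior, as estimated above.
prior_h1 = 0.6
explanatory_power_h1 = 0.75
prior_e = 0.5

posterior_h1 = prior_h1 * explanatory_power_h1 / prior_e
print(posterior_h1)  # 0.9 (printed as 0.8999... due to floating-point rounding)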

This is basically it.

Now, there are online Bayesian calculators you can use as shortcuts, although you’ll still need to figure some numbers out. So, for instance, this link is to a calculator for posteriors (when discerning between competing hypotheses), but you’ll still need to figure out the priors and explanatory power yourself.
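
I can’t speak to how the linked calculator is implemented, but a typical two-hypothesis calculator gets E’s prior from each hypothesis’ prior and explanatory power (the law of total probability) and then normalizes. Here’s a sketch reusing H1’s numbers from above; H2’s explanatory power of .1 is a made-up value purely for illustration:

# H1's prior and explanatory power come from the example above.
# H2's explanatory power (0.10) is a hypothetical value; its prior is 1 - P(H1|K).
prior_h1, power_h1 = 0.6, 0.75
prior_h2, power_h2 = 0.4, 0.10

# Law of total probability: P(E|K) = P(H1|K)*P(E|H1 & K) + P(H2|K)*P(E|H2 & K).
prior_e = prior_h1 * power_h1 + prior_h2 * power_h2

posterior_h1 = prior_h1 * power_h1 / prior_e
posterior_h2 = prior_h2 * power_h2 / prior_e
print(posterior_h1, posterior_h2)  # roughly 0.92 and 0.08; the two posteriors sum to 1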

Remember, it’s a pretty subjective thing.

Finally, I’m only claiming that when Bayes’ theorem calculates a probability, it’s right. I’m not claiming the numerical representations we use are 100% accurate [those are subjective and open to variance], but all we *really* need to know is whether something is more probable than not, and Bayes’ theorem is great for that.



[1]

Bayesianism usually says something counts as evidence for some hypothesis if it makes that hypothesis more probable than it otherwise would be. This is generally formalized as follows:

E counts as evidence for H if P(H|E & K) > P(H|K)
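
In code, that check is just a comparison of the posterior against the prior; with the numbers from the homicide example, E comes out as evidence for H1:

def is_evidence(posterior_h, prior_h):
    # E counts as evidence for H exactly when P(H|E & K) > P(H|K).
    return posterior_h > prior_h

print(is_evidence(0.9, 0.6))  # True: E raised H1's probability from .6 to .9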
