Monthly Archives: September 2012

How to Tell the Future

Many religions and new religious movements use foretelling as a staple part of both their justification and often their religious practice. I’ve had Christians tell me that the chance of Jesus fulfilling all the prophecies he fulfilled, just by chance, would be equivalent to throwing ten thousand coins and them all landing heads. I’ve read spiritualist testimony of receiving detailed foretelling of 9/11 and the London 7/7 bombings. The coming of the Prophet Mohammed is traced by some Muslims into the Vedas, right down to the Sanskrit translation of his name and specific details of his biography. Bahá’í likewise believe that Bahá’u’lláh was foretold in many different contexts, tracing prophecies of his arrival into the New Testament, the Hebrew Bible, the Quran, and into Buddhist writings. It is a very common, almost ubiquitous, feature of religious justification, and is particularly strong in religious movements with a messianic figure.

I’ll use the terms ‘prophecy’ and ‘foretelling’ to mean the same thing here. I’m aware that ‘prophecy’ in technical theological terms means something a little different (so most foretelling is prophecy, but the opposite is not necessarily true). But for the duration of this post, I’ll use it in the less formal sense.

So how does this foretelling work?

It turns out the process is pretty consistent across religions, large and small. There are a few tricks to it.

1. Vagueness — Prophecies are often vague. A vague prophecy is one that could be matched to a number of different events. Believers will, of course, argue that the prophecy is more specific than it really is. They will normally do so by pointing to specific interpretations of specific details in the prophecy (using a combination of this and technique 5, below). The vagueness is easy to show in an experiment, however. Give the prophecy to someone unfamiliar with it, along with the proposed fulfilment and a number of carefully selected other events from history, each worded to maximise its fit, and see which they classify as hits and which they do not.

2. Setting Concrete — You will notice that the vast majority of the details of what any prophecy means are first recorded after the prophecy is considered to have been fulfilled. Thus “a man will come from the East” might become “Blessed leader John was born in Hong Kong, but taught mostly in Europe”. The prophecy will never say “A leader named John will be born in Hong Kong, but teach mostly in Europe.” Sometimes the details will be written back into the prophecy, but invariably the first confirmed record of the detail will be post-hoc. The fingerprint of this technique is that the correct interpretation of the prophecy was not the consensus interpretation until after it was fulfilled.

3. Self-Fulfilment — The most striking prophecy fulfilments are those that are specifically engineered to be fulfilled. Any time people involved with the fulfilment are aware of the prophecy, they may consciously or subconsciously conform their behaviour to the prophecy. This is very common with new religious movements that trace their existence to prophecies in the scripture of major world religions. The ‘prophecies’ of the book of Revelation are popular source material. As are the Prophets of the Hebrew Bible. Thus Jesus’s Son of Man claims (assuming they are claims made by the historical Jesus, of course), have to be tempered by the fact he would have had access to their source in Daniel.

4. Revising History — An example shows this the best. A large number of Christians I’ve asked about this say they know that during Crucifixion a person’s shoulders, elbows and wrists are dislocated. Yet we have no historical records for this, we have scant historical records for how Roman crucifixion was performed, in detail, and the forms of crucifixion we do know of do not normally result in dislocation of the limbs. The source of this information is Psalm 22, where it says “all my bones are out of joint”, which is taken as a prophecy of the crucifixion of Jesus. Ancient prophecies can come to be fulfilled, simply by virtue of becoming the source material for their own fulfilment. When pressed, those who make these claims will often use a variant of “well, it isn’t impossible that Jesus’s bones were dislocated”. Which, of course, it isn’t, but that isn’t a fulfilment of prophecy.

5. Hit Counter — Prophecies are often drawn from large works: whether the Vedas, the Hebrew Bible, the Book of Revelation, or the Quran and Hadiths. In each case there is a lot of source material. Some of it is interpreted as highly accurate (via the methods above), the rest of it is interpreted as not relevant. So Psalm 22 (as noted above) is clearly messianic prophecy, but Psalm 21 is clearly not a messianic prophecy. How do we know? Because it doesn’t seem to contain anything recognizable in the life of Jesus. Isaiah 7:14 is a prophecy of Jesus’s birth (“behold a virgin has conceived…”), but Isaiah 7:16 isn’t a prophecy of his early life (some Christians do have very creative ways to interpret all this passage as being about Jesus, but only by hypothesizing history for the purpose of being fulfilled). So you count the passages that work, and ignore the rest, even if (as in the Isaiah case) they are part of the same passage.

6. Go Metaphysical — When a prophecy that really should fit doesn’t, it can be interpreted as being about the spiritual or supernatural realm. This is particularly crucial for prophecies made before the fact. For example, the famous “Great Disappointment” in 1844, the date William Miller had foretold the second coming. Nothing appeared to have happened (as usual), and many were disillusioned. But others claimed the prophecy had been fulfilled: Christ had come again, but in the supernatural realm. His return couldn’t be seen, but could be experienced in a believer’s heart. From that event come fringe Christian groups like the Seventh-day Adventists, who maintain that prophecy was fulfilled that day. (Other Christians maintain that the second coming happened at other points, such as at Pentecost, or at the appearance to Paul on the Damascus Road). Interestingly there is a teaching in the Bahá’í faith that the second coming did occur around the time Miller said, in the descending of the spirit upon the Báb, the first key figure in the emergence of the Bahá’í faith.

7. Go Future — Another approach to conspicuously unfulfilled prophecy is to say it has not yet been fulfilled. So for those Christians who believe the second coming has not occurred, it must be in the future. Thus ultimately, few prophecies are ever falsifiable. Even when a prophecy appears to give a date (such as Jesus describing his second coming as within the generation of those listening), it can be reinterpreted to extend the date, so that it remains not-yet-fulfilled, rather than unfulfilled.

8. Quantity not Quality — Faced with a detailed analysis of a particular prophecy, religious believers may concede it has multiple interpretations, or that it was only recognized as a prophecy post-hoc, or that it sits among non-prophetic content. But they may claim that the sheer quantity of prophecies fulfilled makes up for the fact that no particular prophecy is watertight. All those prophecies can’t be wrong: even if each one were 50/50, the chance of them all being wrong is billions-to-one. In practice, the prophecies are not fulfilled independently, so it is as simple to create a thousand prophetic fulfilments with these techniques as it is to create one. And most prophetically minded faiths tend to have large quantities of fulfilment claims.

9. Sure Things (h/t Shane) We all have a relatively poor sense of probability. We tend to estimate that things are less likely than they really are, and we have a very poor understanding of repeated-trials: when something can happen in very many ways, even if each one is unlikely, the overall chance of it happening is good. These cognitive biases converge to make some foretellings appear to be impressive, when in fact they are almost certain. Shane uses the example of a particular city being destroyed by war. In iron-age times this was very common, and really only a matter of time. A more contemporary example might be the occurrence of a huge natural disaster or a terrorist attack.

10. Unverifiable (h/t Bob Moore) This is the general form of number 4, above. If there is no evidence that a fulfilment didn’t happen, then one can simply claim that it did. In the best cases this claim will then get remembered as historical, and we’re at number 4; if not, then it is still difficult to prove that something didn’t happen, and if said with enough conviction, those predisposed to believe will accept that it did.

11. After the Fact (h/t Malcolm) If you take number 1 to its extreme, you have a prophecy that only arises after the fact, but is claimed to have been made further back in time, before the thing it refers to. This is a particularly important kind of prophecy for dating ancient texts. A text telling the story of a prior figure will describe them as having prophesied a later event. We use this as evidence that the text was composed no earlier than that event.
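The arithmetic behind items 8 and 9 can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers (30 prophecies, 500 opportunities at 1% each) are made up for the example, and both formulas assume independent events, which is exactly the assumption item 8 says does not hold for prophetic fulfilments.

```python
def chance_all_wrong(n, p_hit=0.5):
    """Chance that n independent prophecies, each fulfilled with
    probability p_hit, ALL fail. Independence is an assumption."""
    return (1 - p_hit) ** n


def chance_at_least_one(n, p_each):
    """Chance that at least one of n independent opportunities comes off,
    when each one has probability p_each."""
    return 1 - (1 - p_each) ** n


# Item 8: thirty independent 50/50 prophecies all failing is about
# one chance in a billion (2**30 is roughly 1.07 billion).
print(chance_all_wrong(30))

# Item 9: an event with a 1% chance per opportunity, given 500
# opportunities, is close to a sure thing.
print(chance_at_least_one(500, 0.01))
```

The first number only looks impressive if the fulfilments really are independent. The second shows why a foretelling of "a city destroyed by war" is less striking than it sounds.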

So here are several tried and tested ways to make prophecies work. If you happen to believe in the prophecies of your own religious tradition, then you’ll probably agree that these approaches (or others like them) account for prophetic fulfilment in other religions. But you’ll think your religion is special. The same goes for believers in each religion of course.

In my experience, when prophecy is met with due skepticism (the kind of skepticism anyone would apply to another person’s religion) it is interpreted as hostility. There is something quite primal, personal, and precious about the belief that one’s beliefs rest on a basis of miraculous foretelling.

Please suggest additions to this list, or suggest ways to demonstrate that the prophecies you believe do not fall into these categories. Both will help compile a more complete list.


Filed under Uncategorized

Cold Reading

I’m reading Ian Rowland’s book on Cold Reading, and how it works in the context of the psychic industry. It is excellent, and fascinating, and I’ve been trying bits out, which is fun.

Anyway, last night a thought struck me. I’ve not come across this kind of psychic reading in a Christian context, and I wonder if I’ve missed something. Let me explain what I mean with a kind of example.

Emma goes to visit “Word of Truth Ministries”. She is shown into the office of “Pastor Mark”, who has pictures on his walls showing bible verses, a seminary certificate, a photo of him on the couch with Pat Robertson, and a bookcase full of fancy theology books. Mark welcomes Emma and says:

“As you know, God gives us each spiritual gifts. He has given me a strong gift of knowledge, as the Apostle Paul wrote in his letter to the Corinthians. Some pastors are called to be preachers and teachers, I’m called to the ministry of prophecy. I’m happy to use that gift to help you, with God’s grace. Shall we pray?”

“Lord Jesus Christ, thank you for Emma and her earnest desire to seek your will here today. Please bless me again through your holy spirit, strengthen your gift of knowledge within me. Bless Emma with a gift of wisdom and interpretation, so she can understand and apply what knowledge you give to me. In your precious name we pray, Amen.”

“Now, Emma, I am just a sinner, God has given me a precious gift, but as it says in the bible, I still see dimly, as through a mirror. Only in the life to come will things be totally clear. So as God lays on my heart different things for you, it is important you use your gifts and intuition so we can hone in on the details of what God is saying… are you ready?”

And then the cold reading begins, with things such as

“God is giving me the picture of a young man, whose name is James, or Jay, or something similar.”

Emma: “My father’s name is James, but everyone calls him Jim.”

Mark: “Yes, it is your father that God is putting on my heart, but when he was a young man. His parents used to call him James, I think.”

Emma: “I don’t know.”

Mark: “They did, yes, and he is the same age as you are now. And he is worried you will make the same mistakes as he did. You need to ask him about that, because he’ll not tell you himself.”

Emma: “But he died a couple of years ago.”

Mark: “Yes, I know, that’s why he can’t tell you what he wants you to know, so you need to ask the question: ask yourself what might he have regretted, what you could do differently? God’s heart is to avoid you suffering the same pain he did.”

… and so on …

It strikes me that the story surrounding the cold reading could easily be made to fit evangelical theology. And it strikes me that evangelical anti-psychic rhetoric (which I’ve heard many times) means that significant numbers of adherents would never consult a psychic. So there is a big market. So why hasn’t it been tapped? If it has, can someone point me to it in a way that I can find out more about the scene? I would find it fascinating. If it hasn’t, is there a reason why not? Even if you think Christianity is true, there is no doubt there are plenty of hucksters around looking for an opportunity. Why is this not huge?

Just to acknowledge, I’m aware of larger group events, like Peter Popoff and other healing service folks, where cold-reading techniques are used (Popoff famously used hot-reading, of course). But I was thinking more of this one-on-one reading-style format. Any thoughts?



An Introduction to Probability Theory and Why Bayes’s Theorem is Unhelpful in History

This post follows from the previous review of Richard Carrier’s “Proving History”, which attempts to use Bayes’s Theorem to prove Jesus didn’t exist. In my review I point out a selection of the mathematical problems with that book, even though I quite enjoyed it. This post is designed to explain what Bayes’s Theorem actually does, and show why it isn’t particularly useful outside of specific domains. It is a journey through basic probability theory, for folks who aren’t into math (though I’ll assume high-school math). It is designed to be simple, and therefore is rather long. I will update it and clarify it from time to time. [Edit: There is also a new post on errors, which follows on from this].

Let’s think about the birth of Christianity. How did it happen? We don’t know, which is to say there are a lot of different things that could have happened. Let’s use an illustration to picture this.

Complex diagram, eh? I want this rectangle to represent all possible histories: everything that could have happened. In math we call this rectangle the ‘universe’, but meant metaphorically: the universe of possibilities. In the rectangle each point is one particular history. So there is one point which is the actual history, the one-true-past (OTP in the diagram below), but we don’t know which it is. In fact, we can surely agree we’ve no hope of ever finding it, right? To some extent there will always be things in history that are uncertain.

When we talk about something happening in history, we aren’t narrowing down history to a point. If we consider the claim “Jesus was the illegitimate child of a Roman soldier”, there are a range of possible histories involving such a Jesus. Even if we knew 100% that were true, there would be a whole range of different histories including that fact.

Napoleon moved his knife in a particular way during his meal on January 1st 1820, but he could have moved that knife in any way, or been without a knife, and the things we want to say about him wouldn’t change. His actual knife manipulation is part of the one-true-past, but totally irrelevant for Napoleonic history [1].

So any claim about history represents a whole set of possible histories. We draw such sets as circles. And if you’re a child of the new math, you’ll recognize the above as a Venn diagram. But I want to stress what the diagram actually means, so try to forget most of your Venn diagram math for a while.

At this point we can talk about what a probability is.

There are essentially an infinite number of possible histories (the question of whether it is literally infinite is one for the philosophy of physics, but even if finite, it would be so large as to be practically infinite for the purpose of our task). So each specific history would be infinitely unlikely. We can’t possibly say anything useful about how likely any specific point is, we can’t talk about the probability of a particular history.

So again we turn to our sets. Each set has some likelihood of the one-true-past lying somewhere inside it. How likely is it that Jesus was born in Bethlehem? That’s another way of asking how likely it is that the one-true-past lies in the set of possible histories that we would label “Jesus Born in Bethlehem”. The individual possibilities in the set don’t have a meaningful likelihood, but our historical claims encompass many possibilities, and as a whole those claims do have meaningful likelihood. In other words, when we talk about how likely something was to have happened, we are always talking about a set of possibilities that match our claim.

We can represent the likelihood on the diagram by drawing the set bigger or smaller. If we have two sets, one drawn double the size of the other, then the one-true-past is twice as likely to be in the one that is drawn larger.

So now we can define what a probability is for a historical claim. A probability is a ratio of the likelihood of a set, relative to the whole universe of possibilities. Or, in terms of the diagram, what fraction of the rectangle is taken up by the set of possibilities matching our claim?

If we can somehow turn likelihood into a number (i.e. let’s say that the likelihood of a set S is a number written L(S)), and if the universe is represented by the set U, probability can be mathematically defined as:

P(S) = L(S) / L(U)

But where do these ‘likelihood’ numbers come from? That’s a good question, and one that turns out to be very hard to give an answer for that works in all cases. But for our purpose, just think of them as a place-holder for any of a whole range of different things we could use to calculate a probability. For example: if we were to calculate the probability of rolling 6 on a die, the likelihood numbers would be the number of sides: the likelihood of rolling a 6 would be 1 side, the likelihood of rolling anything would be 6 sides, so the probability of rolling a six is 1/6. If we’re interested in the probability of a scanner diagnosing a disease, the likelihoods would be the numbers of scans: on top would be the number of successful scans, the number on the bottom would be the total number of scans. We use the abstraction as a way of saying “it doesn’t much matter what these things are, as long as they behave in a particular way, the result is a probability”.
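As a rough sketch of the die and scanner examples, here is the ratio L(S) / L(U) written out in Python. The function name `probability` and the scan counts (1,900 successes out of 2,000) are invented for illustration; only the die figures come from the text above.

```python
from fractions import Fraction


def probability(likelihood_of_set, likelihood_of_universe):
    """P(S) = L(S) / L(U), for any 'likelihood' numbers that behave
    like counts: non-negative, and L(S) never larger than L(U)."""
    if likelihood_of_universe <= 0:
        raise ValueError("the universe must have a positive likelihood")
    if not 0 <= likelihood_of_set <= likelihood_of_universe:
        raise ValueError("a set cannot be more likely than the universe")
    return Fraction(likelihood_of_set, likelihood_of_universe)


# Rolling a 6: one side out of six.
p_six = probability(1, 6)

# Scanner: hypothetical counts, 1,900 successful scans out of 2,000.
p_scan = probability(1900, 2000)

print(p_six, p_scan)
```

Exact fractions are used rather than floats so that the "what goes on the top and bottom of the division" stays visible in the result.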

Now we’ve reached probabilities, we’ve used these ‘likelihoods’ as a ladder, and we can move on. We only really worry about how the probability is calculated when we have to calculate one, and then we do need to figure out what goes on the top and bottom of the division.

Another diagram.

In this diagram we have two sets. These are two claims, or two sets of possible histories. The sets may overlap in any combination. If no possible history could match both claims (e.g. “Jesus was born in Bethlehem” and “Jesus was born in Nazareth”), then the two circles wouldn’t touch [kudos if you are thinking “maybe there are ways both could be kind-of true” – that’s some math for another day]. Or it might be that the claims are concentric (“Jesus was born in Bethlehem”, “Jesus was born”): any possibility in the first set will always be in the second. Or they may, as in this case, overlap (“Jesus was born in Nazareth”, “Jesus was born illegitimately”).

I’ve been giving examples of sets of historical claims, but there is another type of set that is important: the set of possible histories matching something that we know happened. Of all the possible histories, how many of them produce a New Testament record that is similar to the one we know?

This might seem odd. Why does our universe include things we know aren’t true? Why are there possibilities which lead to us never having a New Testament? Why are there histories where we have a surviving comprehensive set of writings by Jesus? Can’t we just reject those outright? The unhelpful answer is that we need them for the math to work. As we’ll see, Bayes’s Theorem requires us to deal with the probability that history turned out the way it did. I’ll give an example later of this kind of counter-factual reasoning.

So we have these two kinds of set. One kind which are historical claims, and the other which represent known facts. The latter are often called Evidence, abbreviated E, the former are Hypotheses, or H. So let’s draw another diagram.

where H∩E means the intersection of sets H and E – the set of possible histories where we both see the evidence and where our hypothesis is true (you can read the mathematical symbol ∩ as “and”).

Here is the basic historical problem. We have a universe of possible histories. Some of those histories could have given rise to the evidence we know, some might incorporate our hypothesis. We know the one true past lies in E, but we want to know how likely it is to be in the overlap, rather than the bit of E outside H. In other words, how likely is it that the Hypothesis is true, given the Evidence we know?

Above, I said that probability is how likely a set is, relative to the whole universe. This is a simplification we have to revisit now. Probability is actually how likely one set is, relative to some other set that completely encompasses it (a superset in math terms).

We’re not actually interested in how likely our Hypothesis is, relative to all histories that could possibly have been. We’re only interested in how likely our hypothesis is, given our evidence: given that the one-true-past is in E.

So the set we’re interested in is the overlap where we have the evidence and the hypothesis is true. And the superset we want to compare it to is E, because we know the one-true-past is in there (or at least we are willing to assume it is). This is what is known as a conditional probability. It says how likely is H, given that we know or assume E is true: we write it as P(H|E) (read as “the probability of H, given E”). And from the diagram it should be clear the answer is:

P(H|E) = P(H∩E) / P(E)

It is the ratio of the size of the overlap, relative to the size of the whole of E. This is the same as our previous definition of probability, only before we were comparing it to the whole universe U, now we’re comparing it to just the part of U where E is true [2].
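To make the "zooming in" concrete, here is a toy version in Python. It is only a sketch: the twelve numbered "histories" and the memberships of E and H are invented, and every history is assumed equally likely so that counting set members stands in for the likelihoods.

```python
from fractions import Fraction

# A toy universe of twelve equally likely "histories" (hypothetical).
universe = set(range(1, 13))
E = {1, 2, 3, 4, 5, 6}   # histories that produce the evidence we have
H = {4, 5, 6, 7}         # histories in which the hypothesis is true


def prob(s, given=universe):
    """Probability of set s relative to the superset `given`.
    With the default, this is the plain P(S), i.e. P(S|U)."""
    return Fraction(len(s & given), len(given))


p_h = prob(H)                       # relative to the whole universe
p_h_given_e = prob(H, given=E)      # zoomed in to E
via_formula = prob(H & E) / prob(E)  # P(H∩E) / P(E)

print(p_h, p_h_given_e, via_formula)  # prints: 1/3 1/2 1/2
```

Counting inside E directly and dividing the two universe-relative probabilities give the same answer, which is all the conditional probability formula says.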

We could write all probabilities as conditional probabilities, because ultimately any probability is relative to something. We could write P(S|U) to say that we’re interested in the probability of S relative to the universe. We could, but it would be pointless, because that is what P(S) means. Put another way, P(S) is just a conveniently simplified way of writing P(S|U).

So what is a conditional probability doing? It is zooming in, so we’re no longer talking about probabilities relative to the whole universe of possibilities (most of which we know aren’t true anyway), we’re now zooming in, to probabilities relative to things we know are true, or we’re willing to assume are true. Conditional probabilities throw away the rest of the universe of possibilities and just focus on one area: for P(H|E), we zoom into the set E, and treat E as if it were the universe of possibilities. We’re throwing away all those counter-factuals, and concentrating on just the bits that match the evidence.

The equation for conditional probability is simple, but in many cases it is hard to find P(H∩E), so we can manipulate it a little, to remove P(H∩E) and replace it with something simpler to calculate.

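Bayes’s Theorem is one of many such manipulations. We can use some basic high school math to derive it:

P(H|E) = P(H∩E) / P(E)
P(H|E) P(E) = P(H∩E) = P(E∩H) = P(E|H) P(H)
P(H|E) P(E) = P(E|H) P(H)
P(H|E) = P(E|H) P(H) / P(E)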

Step-by-step math explanation: The first line is just the formula for conditional probability again. If we multiply both sides by P(E) (and therefore move it from one side of the equation to the other) we get the first two parts on the second line. We then assume that P(H∩E) = P(E∩H) (in other words, the size of the overlap in our diagram is the same regardless of which order we write the two sets), which means that we can get the fourth term on the second line just by changing over E and H in the first term. Line three repeats these two terms on one line without the P(H∩E) and P(E∩H) in the middle. We then divide by P(E) again to get line four, which gives us an equation for P(H|E) again.

What is Bayes’s Theorem doing? Notice the denominator is the same as for conditional probability, P(E), so what Bayes’s Theorem is doing is giving us a way to calculate P(H∩E) differently. It is saying that we can calculate P(H∩E) by looking at the proportion of H taken up by H∩E, multiplied by the total probability of H. If I want to find the amount of water in a cup, I could say “it’s half the cup, the cup holds half a pint, so I have one half times half a pint, which is a quarter of a pint”. That’s the same logic here. The numerator of Bayes’s theorem is just another way to calculate P(H∩E).

So what is Bayes’s Theorem for? It lets us get to the value we’re interested in, P(H|E), if we happen to know, or can calculate, the other three quantities: the probability of each set, P(H) and P(E) (relative to the universe of possibilities), and the probability of seeing the evidence if the hypothesis were true, P(E|H). Notice that, unlike the previous formula, we’ve now got three things to find in order to use the equation. And either way, we still need to calculate the probability of the evidence, P(E).
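A minimal sketch of the theorem as a function, assuming we already have the three inputs. The numbers are toy values chosen for the example (P(E|H) = 3/4, P(H) = 1/3, P(E) = 1/2) and have nothing to do with any historical question.

```python
from fractions import Fraction


def bayes(p_e_given_h, p_h, p_e):
    """P(H|E) = P(E|H) * P(H) / P(E)."""
    return p_e_given_h * p_h / p_e


# Toy inputs (hypothetical).
p_e_given_h = Fraction(3, 4)
p_h = Fraction(1, 3)
p_e = Fraction(1, 2)

# The numerator is just another route to P(H∩E): the proportion of H
# taken up by the overlap, times the probability of H.
p_h_and_e = p_e_given_h * p_h

print(p_h_and_e)                      # prints: 1/4
print(bayes(p_e_given_h, p_h, p_e))   # prints: 1/2
print(p_h_and_e / p_e)                # same value, via P(H∩E) / P(E)
```

The function does nothing clever. All of the difficulty is in justifying the three numbers passed to it.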

Bayes’s Theorem can also be useful if we could calculate P(H∩E), but with much lower accuracy than we can calculate P(H) and P(E|H). Then we’d expect our result from Bayes’s Theorem to be a more accurate value for P(H|E). If, on the other hand we could measure P(H∩E), or we had a different way to calculate that, we wouldn’t need Bayes’s Theorem.

Bayes’s Theorem is not a magic bullet; it is just one way of calculating P(H|E). In particular it is the simplest formula for reversing the condition: if you know P(E|H), you use Bayes’s Theorem to give you P(H|E) [3].

So the obvious question is: if we want to know P(H|E), what shall we use to calculate it? Either of the two formulae above needs us to calculate P(E): in the universe of possible histories, how likely are we to have ended up with the evidence we have? Can we calculate that?

And here things start to get tricky. I’ve never seen any credible way of doing so. What would it mean to find the probability of the New Testament, say?

Even once we’ve done that, we’d only be justified in using Bayes’s Theorem if our calculations for P(H) and P(E|H) are much more accurate than we could manage for P(H∩E). Is that true?

I’m not sure I can imagine a way of calculating either P(H∩E) or P(E|H) for a historical event. How would we credibly calculate the probability of the New Testament, given the Historical Jesus? Or the probability of having both New Testament and Historical Jesus in some universe of possibilities? If you want to use this math, you need to justify how on earth you can put numbers on these quantities. And, as we’ll see when we talk about how these formulae magnify errors, you’ll need to do more than just guess.

But what of Carrier’s (and William Lane Craig’s) favoured version of Bayes’s Theorem? It is derived from the normal version by observing:

P(E) = P(E∩H) + P(E∩~H)

in other words, the set E is just made up of the bit that overlaps with H and the bit that doesn’t (~H means “not in H”), so because

P(E∩H) = P(E|H) P(H)

(which was the rearrangement of the conditional probability formula we used on line two of our derivation of Bayes’s Theorem), we can write Bayes’s Theorem as

P(H|E) = P(E|H) P(H) / [ P(E|H) P(H) + P(E|~H) P(~H) ]

Does that help?

I can’t see how. This is just a further manipulation. The bottom of this equation is still just P(E), we’ve just come up with a different way to calculate it [4]. We’d be justified in doing so, only if these terms were obviously easier to calculate, or could be calculated with significantly lower error than P(E).
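The point that the longer form only recomputes P(E) can be checked with a short Python sketch. The function names and the toy inputs (P(E|H) = 3/4, P(H) = 1/3, P(E|~H) = 3/8) are assumptions made for the example.

```python
from fractions import Fraction


def p_evidence(p_e_given_h, p_h, p_e_given_not_h):
    """P(E) = P(E|H) P(H) + P(E|~H) P(~H), with P(~H) = 1 - P(H)."""
    return p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)


def bayes_expanded(p_e_given_h, p_h, p_e_given_not_h):
    """The expanded form: the denominator is P(E) spelled out."""
    numerator = p_e_given_h * p_h
    return numerator / p_evidence(p_e_given_h, p_h, p_e_given_not_h)


# Toy inputs (hypothetical).
args = (Fraction(3, 4), Fraction(1, 3), Fraction(3, 8))

print(p_evidence(*args))      # prints: 1/2
print(bayes_expanded(*args))  # prints: 1/2
```

With these inputs the denominator comes out as P(E) = 1/2, so the result matches what the ordinary form would give for the same P(E). One unknown, P(E), has been traded for another, P(E|~H).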

If these terms are estimates, then we’re just using other estimates that we haven’t justified. We’re still having to calculate P(E|H), and now P(E|~H) too. I cannot conceive of a way to do this that isn’t just unredeemable guesswork. And it is telling that nobody I’ve seen advocate Bayes’s Theorem in history has actually worked through such a process with anything but estimates.

This is bad news, and it might seem that Bayes’s Theorem could never be useful for anything. But there are cases when we do have the right data.

Let’s imagine that we’re trying a suspect for murder. The suspect has a DNA match at the scene (the Evidence). Our hypothesis is that the DNA came from the suspect. What is P(H|E) – the probability that the DNA is the suspect’s, given that it is a match? This is a historical question, right? We’re asked to find what happened in history, given the evidence before us. We can use Bayes here, because we can get all the different terms.

P(E|H) is simple – what is the probability our test would give a match, given the DNA was the suspect’s? This is the accuracy of the test, and is probably known. P(E) is the probability that we’d get a match regardless. We can use a figure for the probability that two random people would have matching DNA. P(H) is the probability that our suspect is the murderer, in the absence of evidence. This is the probability that any random person is the murderer (if we had no evidence, we’d have no reason to suspect any particular person). So the three terms we need can be convincingly provided, measured, and their errors calculated. And, crucially, these three terms are much easier to calculate, with lower errors, than if we used the P(H∩E) form. What could we measure to find the probability that the suspect is the murderer and their DNA matched? Probably nothing – Bayes’s Theorem really is the best tool to find the conditional probability we’re interested in.
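Here is one way the trial calculation might look with numbers plugged in. Every figure is hypothetical: a pool of one million people who could have left the sample, a test that reports a true match 99% of the time, and a one-in-a-million chance that an unrelated person matches. P(E) is built from those pieces using the expanded form, which is one possible reading of "the probability that we’d get a match regardless", not necessarily the only one.

```python
# All numbers below are invented for illustration.
population = 1_000_000       # people who could have left the sample
p_h = 1 / population         # prior: any random person, no other evidence
p_e_given_h = 0.99           # test reports a match when the DNA is theirs
p_e_given_not_h = 1e-6       # chance an unrelated person matches anyway

# P(E): the probability of seeing a match at all.
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Bayes's Theorem: P(H|E) = P(E|H) P(H) / P(E).
posterior = p_e_given_h * p_h / p_e

print(f"P(E)   = {p_e:.3e}")
print(f"P(H|E) = {posterior:.3f}")
```

With these made-up inputs the posterior is just under one half: a one-in-a-million match rate sounds decisive, but so is a one-in-a-million prior. The calculation is only meaningful because each input could, in principle, be measured.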

While we’re thinking about this example, I want to return briefly to what I said about counter-factual reasoning. Remember I said that Bayes’s Theorem needs us to work with a universe of possibilities where things we know are true, might not be true? The trial example shows this. We are calculating the probability that the suspect’s DNA would match the sample at the crime scene – but this is counter-factual, because we know it did (otherwise we’d not be doing the calculation). We’re calculating the probability that the DNA would match, assuming the suspect were the murderer, but again, this is counter-factual, because the DNA did match, and we’re trying to figure out whether they are the murderer. This example shows that the universe of possibilities we must consider has to be bigger than the things we know are true. We have to work with counter-factuals, to get the right values.

So Bayes’s Theorem is useful when we have the right inputs. Is it useful in history? I don’t think so. What is the P(E) if the E we’re interested in is the New Testament? Or Josephus? I simply don’t see how you can give a number that is rooted in anything but a random guess. I’ve not seen it argued with any kind of rational basis.

So ultimately we end up with this situation. Bayes’s Theorem is used in these kinds of historical debates to feed in random guesses and pretend the output is meaningful. I hope if you’ve been patient enough to follow along, you’ll see that Bayes’s Theorem has a very specific meaning, and that when seen in the cold light of day for what it is actually doing, the idea that it can be numerically applied to general questions in history is obviously ludicrous.

But, you might say, in Carrier’s book he pretty much admits that numerical values are unreliable, and suggests that we can make broad estimates, erring on the side of caution, and so make what he calls an a fortiori argument: if a result comes from putting in unrealistically conservative estimates, then that result can only get stronger if we make the estimates more accurate. This isn’t true, unfortunately, but for that, we’ll have to delve into the way these formulas impact errors in the estimates. We can calculate the accuracy of the output, given the accuracy of each input, and it isn’t very helpful for a fortiori reasoning. That is a topic for another part.

As is the little teaser from earlier, where I mentioned that, in subjective historical work, sets that seem not to overlap can be imagined to overlap in some situations. This is another problem for historical use of probability theory, but to do it justice we’ll need to talk about philosophical vagueness and how we deal with that in mathematics.

Whether I get to those other posts or not, the summary is that both of them significantly reduce the accuracy of the conclusions that you can reach with these formulas, if your inputs are uncertain. It doesn’t take much uncertainty on the input before you lose any plausibility for your output.
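
As a rough illustration of that last point (a sketch only, not the full error analysis promised above), the Python below takes the form of Bayes’s Theorem in which P(E) is expanded as P(E|H)P(H) + P(E|~H)P(~H), and asks what happens when each input is known only to within a range. The ranges are made-up guesses and the function names are hypothetical. Because the formula rises or falls steadily with each input, the smallest and largest outputs occur at the corners of the ranges, so it is enough to try every combination of endpoints.

```python
from itertools import product


def posterior_long_form(p_h, p_e_given_h, p_e_given_not_h):
    """P(H|E) with P(E) expanded as P(E|H)P(H) + P(E|~H)P(~H)."""
    numerator = p_e_given_h * p_h
    denominator = numerator + p_e_given_not_h * (1.0 - p_h)
    return numerator / denominator


def posterior_range(p_h_range, p_e_given_h_range, p_e_given_not_h_range):
    """Smallest and largest posterior when each input lies in its (low, high) range.

    The formula is monotonic in each input (for non-zero values), so the
    extremes are found at the corners of the box of possible inputs.
    """
    corners = product(p_h_range, p_e_given_h_range, p_e_given_not_h_range)
    values = [posterior_long_form(*corner) for corner in corners]
    return min(values), max(values)


# Assumed, purely illustrative ranges for three guessed inputs:
low, high = posterior_range(
    p_h_range=(0.3, 0.7),
    p_e_given_h_range=(0.4, 0.8),
    p_e_given_not_h_range=(0.2, 0.6),
)
print(f"P(H|E) lies somewhere between {low:.2f} and {high:.2f}")
```

With these guesses, each input is uncertain by ±0.2, yet the output runs from roughly 0.22 to 0.90, which covers most of the possible answers.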

1 Of course, we can hypothesize some historical question for which it might not be irrelevant. Perhaps we’re interested in whether he was sick that day, or whether he was suffering a degenerating condition that left his hands compromised. Still, the point stands: even those claims encompass a set of histories; they don’t refer to a single point.

2 Our definition of probability involved L(S) values, what happened to them? Why are we now dividing probabilities? Remember that a Likelihood, L(S), could be any number that represented how likely something was. So something twice as likely had double the L(S) value. I used examples like number of scans or number of sides of a die, but probability values also meet those criteria, so they can also be used as L(S) values. The opposite isn’t true, not every Likelihood value is a probability (e.g. we could have 2,000 scans, which would be a valid L(S) value, but 2,000 is not a valid probability).

3 Though Bayes’s Theorem is often quoted as being a way to reverse the condition P(H|E) from P(E|H), it does still rely on P(E) and P(H). You can do further algebraic manipulations to find these quantities, one of which we’ll see later to calculate P(E). Here the nomenclature is a bit complex. Though Bayes’s Theorem is a simple algebraic manipulation of conditional probability, further manipulation doesn’t necessarily mean a formula is no longer a statement of Bayes’s Theorem. The presence of P(E|H) in the numerator is normally good enough for folks to call it Bayes’s Theorem, even if the P(E) and P(H) terms are replaced by more complex calculations.

4 You’ll notice, however, that P(E|H)P(H) is on both the top and the bottom of the fraction now. So it may seem that we’re using the same estimate twice, cutting down the number of things to find. This is only partially helpful, though. If I write a follow up post on errors and accuracy, I’ll show why I think that errors on top and bottom can pull in different directions. I did write that other post, but I didn’t talk about this, as my views on this are rather half-baked, and not easily demonstrable. For the purpose of this discussion this claim is not very relevant, however, so I’ll withdraw it.


Filed under Uncategorized

Eyewitness Testimony

An excellent TED talk hit my reader today.

The implications for religious testimony are obvious.

The bit about 9/11 really confused me. I don’t have the memory he described: on 9/11 I got to a TV later that night. But just by logic: how come it was 24 hours later when footage of the second tower collapsing aired? I thought the news networks were all showing live footage, at least from key vantage points, even if the helicopters had been grounded by then. Can anyone shed light on that?



A Mathematical Review of “Proving History” by Richard Carrier

Carrier, Richard. Proving History: Bayes’s Theorem and the Quest for the Historical Jesus. Prometheus Books, 2012.

This is a review of Carrier’s book from a purely mathematical perspective; the historical merit has been reviewed elsewhere. Given the primary audience of this blog, and the book, however, I will review the mathematics in fairly non-technical terms, though I will assume some knowledge of probability theory.

Proving History is the first book of a pair on the topic of whether there was a historical figure of Jesus, written by independent scholar Carrier (who describes himself as a historian and philosopher), funded philanthropically by members of an online atheist group. It sets up a thesis based on using probability theory to reason about historical evidence. In particular, Carrier focuses on what he calls Bayes’s Theorem as the fundamental underlying process of doing history.

I will be unable to deal with every mathematical problem in the book in a short review, so will limit myself to issues arising from the first mathematical chapter. The issues below are not mitigated later in the book.

Chapter three introduces Bayes’s theorem (BT) as

P(h|e.b) = [P(h|b) × P(e|h.b)] / ([P(h|b) × P(e|h.b)] + [P(~h|b) × P(e|~h.b)])

[sic] which is a confusing and unnecessarily complex form of the formula more easily stated as

P(H|E) = P(E|H)P(H) / P(E)

While Carrier devotes a lot of ink to describing the terms of his long-form BT, he nowhere attempts to describe what Bayes’s Theorem is doing. Why are we dividing probabilities? What does his denominator represent? He comes perilously close in chapter six (“Hard Stuff”), when talking about reference classes (which are quite closely related to the meaning of the denominator), but doesn’t try to bring his audience to the level of competency to do anything more than take his word for these mathematical assertions.

So this unduly complex version of BT serves as a kind of magic black box:

“the theorem was discovered in the late eighteenth century and has since been formally proved, mathematically and logically, so we know its conclusions are always necessarily true—if its premises are true”

In this, Carrier allows himself to sidestep the question of whether these necessarily true conclusions are meaningful in a particular domain. That discussion would be awkward for his approach, and its absence would surely have been more conspicuous if he had described why BT is the way it is.

The addition of background knowledge (b in his formula) in every term (the probability theory equivalent of writing ×1 or +0) is highly idiosyncratic, though I’ve seen William Lane Craig use the same trick. Footnote 10 states that he made the choice so as to make explicit that background knowledge is always present. Clearly his audience can’t be expected to remember this basic tenet of probability theory.

Carrier correctly states that he is allowed to divide content between evidence and background knowledge any way he chooses, provided he is consistent. But he then fails to be consistent throughout the book. For example, on page 51 there is an explanation of a ‘prior’ probability which explicitly includes the evidence in the prior, and therefore presumably in the background knowledge (emphasis original):

“the measure of how ‘typical’ our proposed explanations is a measure of how often that kind of evidence has that kind of explanation. Formally this is called the prior

Going on to say (emphasis original):

For example, if someone claims they were struck by lightning five times … the prior probability they are telling the truth is not the probability of being struck by lightning five times, but the probability that someone in general who claims such a thing would be telling the truth.

This is not wrong, per se, but highly bizarre. One can certainly bundle the claim and the event like that, but if you do so Bayes’s Theorem cannot be used to calculate the probability that the claim is true based on that evidence. The quote is valid, but highly misleading in a book which is seeking to examine the historicity of documentary claims.

The final problem I want to focus on in chapter three is the claim of BT’s special status. Carrier asserts it is both necessary and sufficient for any probabilistic reasoning about evidence. This is indicative of a confusion of nomenclature that permeates the book: at times he uses Bayes’s Theorem to mean probabilistic reasoning generally, then switches to using his idiosyncratic equation form (as if his claims about the former therefore lead one necessarily to the latter), and at other times uses it as a stand-in for Bayesian reasoning (to which I’ll return below). If he had started from the definition of conditional probability:

P(H|E) = P(H∩E) / P(E)

he might have noticed that Bayes’s Theorem is merely an application of basic high-school algebra to the definition, and there are many other such applications which would not give BT, but would be equally valid. Thus statements such as

“any historical reasoning that cannot be validly described by Bayes’s Theorem is itself invalid”

(which he claims he will show in the following chapter, but does not credibly do so) are laughable if understood to mean the equation itself, but have been argued for (though by no means to universal acceptance) if taken to mean the Bayesian interpretation of probabilities.
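
The point about the definition of conditional probability can be checked numerically. The Python sketch below uses a small made-up joint distribution over H and E (the four numbers are assumptions chosen only so that they sum to one) and computes P(H|E) three ways: straight from the definition, from Bayes’s Theorem, and from a different but equally valid rearrangement that uses the complement. All three agree, because all three are just algebra on the same definition.

```python
import math

# Hypothetical joint probabilities for (H, E); they must sum to 1.
joint = {
    (True, True): 0.2,    # H and E
    (True, False): 0.1,   # H and not E
    (False, True): 0.2,   # not H and E
    (False, False): 0.5,  # not H and not E
}


def prob(event):
    """Total probability of all outcomes (h, e) for which event(h, e) is true."""
    return sum(p for outcome, p in joint.items() if event(*outcome))


p_h = prob(lambda h, e: h)
p_e = prob(lambda h, e: e)
p_h_and_e = prob(lambda h, e: h and e)
p_not_h_and_e = prob(lambda h, e: (not h) and e)

# 1. The definition of conditional probability.
from_definition = p_h_and_e / p_e

# 2. Bayes's Theorem: substitute P(H∩E) = P(E|H) * P(H) into the definition.
p_e_given_h = p_h_and_e / p_h
from_bayes = p_e_given_h * p_h / p_e

# 3. Another rearrangement that is not BT: P(H|E) = 1 - P(~H∩E) / P(E).
from_complement = 1.0 - p_not_h_and_e / p_e

assert math.isclose(from_definition, from_bayes)
assert math.isclose(from_definition, from_complement)
print(f"P(H|E) = {from_definition:.2f} by all three routes")
```

Which route is most useful depends entirely on which quantities can actually be measured; none of them has a special logical status over the others.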

Carrier joins that latter debate too, in what he describes as a “cheeky” unification of Bayesian and Frequentist interpretations, but what reads as a misunderstanding of what the differences between Bayesian and Frequentist statistics are. Describing what this means is beyond my scope here, but I raise it because it is illustrative of a tone of arrogance and condescension that I consistently perceived throughout the book. To use the word “cheeky” to describe his “solution” of this important problem in mathematical philosophy, suggests he is aware of his hubris. Perhaps cheeky indicates that his preposterous claim was made in jest. But given the lack of mathematical care demonstrated in the rest of the book, to me it came off as indicative of a Dunning-Kruger effect around mathematics.

I had many other problems with the mathematics presented in the book. I felt there were severe errors in his arguments a fortiori (i.e. a kind of reasoning from inequalities: the probability is no greater than X), and his set-theoretic treatment of reference classes was likewise muddled (though in the latter case it coincidentally did not seem to result in incorrect conclusions). But in the interest of space, the above discussion gives a flavour of the issues I found throughout.


Outside the chapters on the mathematics, I enjoyed the book, and found it entertaining to consider some of the historical content in mathematical terms. I strongly support mathematical literacy in the arts. History and biblical criticism would be better if historians had a better understanding of probability (among other topics: I do not think the lack of such knowledge is an important weakness in the field).

I am also rather sympathetic to many of Carrier’s opinions, and therefore predisposed towards his conclusions. So while I consistently despaired of his claims to have shown his results mathematically, I agree with some of the conclusions, and I think that gestalts in favour of those conclusions can be supported by probability theory.

But ultimately I think the book is disingenuous. It doesn’t read as a mathematical treatment of the subject, and I can’t help but think that Carrier is using Bayes’s Theorem in much the same way that apologists such as William Lane Craig use it: to give their arguments a veneer of scientific rigour that they hope cannot be challenged by their generally more math-phobic peers. To enter an argument against the overwhelming scholarly consensus with “but I have math on my side, math that has been proven, proven!” seems transparent to me, more so when the quality of the math provided in no way matches the bombast.

I suspect this book was always designed to preach to the choir, and will not make much impact in scholarly circles. I hope it doesn’t become a blueprint for other similar scholarship, despite agreeing with many of its conclusions.

