# A Mathematical Review of “Proving History” by Richard Carrier

Carrier, Richard. Proving History: Bayes’s Theorem and the Quest for the Historical Jesus. Prometheus Books, 2012.

This is a review of Carrier’s book from a purely mathematical perspective; the historical merit has been reviewed elsewhere. Given the primary audience of this blog, and of the book, however, I will review the mathematics in fairly non-technical terms, though I will assume some knowledge of probability theory.

Proving History is the first book of a pair on the topic of whether there was a historical figure of Jesus, written by independent scholar Carrier (who describes himself as a historian and philosopher), funded philanthropically by members of an online atheist group. It sets up a thesis based on using probability theory to reason about historical evidence. In particular, Carrier focuses on what he calls Bayes’s Theorem as the fundamental underlying process of doing history.

I will be unable to deal with every mathematical problem in the book in a short review, so will limit myself to issues arising from the first mathematical chapter. The issues below are not mitigated later in the book.

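Chapter three introduces Bayes’s theorem (BT) as

$$P(h \mid e.b) = \frac{P(h \mid b) \times P(e \mid h.b)}{\left[P(h \mid b) \times P(e \mid h.b)\right] + \left[P(\sim h \mid b) \times P(e \mid \sim h.b)\right]}$$

[sic] which is a confusing and unnecessarily complex form of the formula more easily stated as

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$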

While Carrier devotes a lot of ink to describing the terms of his long-form BT, he nowhere attempts to describe what Bayes’s Theorem is doing. Why are we dividing probabilities? What does his denominator represent? He comes perilously close in chapter six (“Hard Stuff”), when talking about reference classes (which are quite closely related to the meaning of the denominator), but doesn’t try to bring his audience to the level of competency to do anything more than take his word for these mathematical assertions.

So this unduly complex version of BT serves as a kind of magic black box:

“the theorem was discovered in the late eighteenth century and has since been formally proved, mathematically and logically, so we know its conclusions are always necessarily true—if its premises are true”

In this Carrier allows himself to sidestep the question of whether these necessarily true conclusions are meaningful in a particular domain. That discussion would be awkward for his approach, and its absence would surely have been more conspicuous had he described why BT is the way it is.
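To make concrete what the denominator represents, here is a minimal numerical sketch. The three input probabilities are hypothetical values chosen purely for illustration (none come from the book), and the function names are my own. It shows that the long-form denominator is just P(E), the overall probability of the evidence, written out using the law of total probability, so the long and short forms give the same answer.

```python
# Hypothetical inputs, chosen only for illustration (not taken from the book).
p_h = 0.3              # P(H): prior probability of the hypothesis
p_e_given_h = 0.8      # P(E|H): probability of the evidence if H is true
p_e_given_not_h = 0.1  # P(E|~H): probability of the evidence if H is false


def total_evidence(p_h, p_e_given_h, p_e_given_not_h):
    """P(E) by the law of total probability over H and ~H."""
    return p_e_given_h * p_h + p_e_given_not_h * (1.0 - p_h)


def posterior_short(p_h, p_e_given_h, p_e):
    """Short form: P(H|E) = P(E|H) P(H) / P(E)."""
    return p_e_given_h * p_h / p_e


def posterior_long(p_h, p_e_given_h, p_e_given_not_h):
    """Long form: the same numerator over the expanded denominator."""
    numerator = p_e_given_h * p_h
    return numerator / (numerator + p_e_given_not_h * (1.0 - p_h))


p_e = total_evidence(p_h, p_e_given_h, p_e_given_not_h)
short_form = posterior_short(p_h, p_e_given_h, p_e)
long_form = posterior_long(p_h, p_e_given_h, p_e_given_not_h)

print(f"P(E) = {p_e:.4f}")
print(f"P(H|E), short form = {short_form:.4f}")
print(f"P(H|E), long form  = {long_form:.4f}")
```

With these made-up inputs P(E) comes to 0.31 and both forms give a posterior of about 0.77. Dividing by P(E) rescales the joint probability of hypothesis and evidence to the cases in which the evidence actually occurs.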

The addition of background knowledge (b in his formula) in every term (the probability theory equivalent of writing ×1 or +0) is highly idiosyncratic, though I’ve seen William Lane Craig use the same trick. Footnote 10 states that he made the choice so as to make explicit that background knowledge is always present. Clearly his audience can’t be expected to remember this basic tenet of probability theory.

Carrier correctly states that he is allowed to divide content between evidence and background knowledge any way he chooses, provided he is consistent. But he then fails to do so throughout the book. For example, on page 51 is an explanation of a ‘prior’ probability which explicitly includes the evidence in the prior, and therefore presumably in the background knowledge (emphasis original):

“the measure of how ‘typical’ our proposed explanations is a measure of how often that kind of evidence has that kind of explanation. Formally this is called the prior

Going on to say (emphasis original):

For example, if someone claims they were struck by lightning five times … the prior probability they are telling the truth is not the probability of being struck by lightning five times, but the probability that someone in general who claims such a thing would be telling the truth.

This is not wrong, per se, but highly bizarre. One can certainly bundle the claim and the event like that, but if you do so Bayes’s Theorem cannot be used to calculate the probability that the claim is true based on that evidence. The quote is valid, but highly misleading in a book which is seeking to examine the historicity of documentary claims.

The final problem I want to focus on in chapter three is the claim of BT’s special status. Carrier asserts it is both necessary and sufficient for any probabilistic reasoning about evidence. This is indicative of a confusion of nomenclature that permeates the book: at times he uses Bayes’s Theorem to mean probabilistic reasoning generally, then switches to using his idiosyncratic equation form (as if his claims about the former therefore lead one to the latter necessarily), and at other times uses it as a stand-in for Bayesian reasoning (to which I’ll return below). If he had started from the definition of conditional probability:

$$P(H \mid E) = \frac{P(H \cap E)}{P(E)}$$

he might have noticed that Bayes’s Theorem is merely an application of basic high-school algebra to the definition, and there are many other such applications which would not give BT, but would be equally valid. Thus statements such as

“any historical reasoning that cannot be validly described by Bayes’s Theorem is itself invalid”

(which he claims he will show in the following chapter, but does not credibly do so) are laughable if understood to mean

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$

but have been argued for (though by no means to universal acceptance) if taken to mean the Bayesian interpretation of probabilities.
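The algebra in question can be sketched in a few lines. This is my own outline of the standard derivation, using H for the hypothesis and E for the evidence, and it is not taken from the book. The last line is one example of a different rearrangement of the same definition that is equally valid.

```latex
\begin{align*}
P(H \mid E) &= \frac{P(H \cap E)}{P(E)}
  && \text{definition of conditional probability} \\
P(E \mid H) &= \frac{P(H \cap E)}{P(H)}
  && \text{the same definition with the roles swapped} \\
P(H \cap E) &= P(E \mid H)\,P(H)
  && \text{multiply the second line by } P(H) \\
P(H \mid E) &= \frac{P(E \mid H)\,P(H)}{P(E)}
  && \text{substitute into the first line} \\
P(E) &= P(E \mid H)\,P(H) + P(E \mid \sim H)\,P(\sim H)
  && \text{law of total probability, giving the long form} \\
P(H \cap E) &= P(H \mid E)\,P(E)
  && \text{product rule: another rearrangement of the first line}
\end{align*}
```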

Carrier joins that latter debate too, in what he describes as a “cheeky” unification of Bayesian and Frequentist interpretations, but what reads as a misunderstanding of what the differences between Bayesian and Frequentist statistics are. Describing what this means is beyond my scope here, but I raise it because it is illustrative of a tone of arrogance and condescension that I consistently perceived throughout the book. To use the word “cheeky” to describe his “solution” of this important problem in mathematical philosophy, suggests he is aware of his hubris. Perhaps cheeky indicates that his preposterous claim was made in jest. But given the lack of mathematical care demonstrated in the rest of the book, to me it came off as indicative of a Dunning-Kruger effect around mathematics.

I had many other problems with the mathematics presented in the book. I felt there were severe errors in his arguments a fortiori (i.e. a kind of reasoning from inequalities, such as “the probability is no greater than X”), and his set-theoretic treatment of reference classes was likewise muddled (though in the latter case it coincidentally did not seem to result in incorrect conclusions). But in the interest of space, the above discussion gives a flavour of the issues I found throughout.

## Conclusion

Outside the chapters on the mathematics, I enjoyed the book, and found it entertaining to consider some of the historical content in mathematical terms. I strongly support mathematical literacy in the arts. History and biblical criticism would be better if historians had a better understanding of probability (among other topics: I do not think the lack of such knowledge is an important weakness in the field).

I am also rather sympathetic to many of Carrier’s opinions, and therefore predisposed towards his conclusions. So while I consistently despaired of his claims to have shown his results mathematically, I agree with some of the conclusions, and I think that gestalts in favour of those conclusions can be supported by probability theory.

But ultimately I think the book is disingenuous. It doesn’t read as a mathematical treatment of the subject, and I can’t help but think that Carrier is using Bayes’s Theorem in much the same way that apologists such as William Lane Craig use it: to give their arguments a veneer of scientific rigour that they hope cannot be challenged by their generally more math-phobic peers. To enter an argument against the overwhelming scholarly consensus with “but I have math on my side, math that has been proven, proven!” seems transparent to me, more so when the quality of the math provided in no way matches the bombast.

I suspect this book was always designed to preach to the choir, and will not make much impact in scholarly circles. I hope it doesn’t become a blueprint for other similar scholarship, despite agreeing with many of its conclusions.


### 30 responses to “A Mathematical Review of “Proving History” by Richard Carrier”

1. Well done. I was most likely taken due to my predisposition to the arguments as well. But I don’t know the math. I will wait to see if he responds for a final concession.

PS I think you meant idiosyncratic.

2. Thanks. That was an interesting review.

I guess I should add that I am a mathematician and have actually taught classes in statistical inference, though typically using Neyman-Pearson methods rather than Bayesian methods.

What always concerns me about statistical inference, is that it is too easy to use statistics carelessly so as to provide apparent (but dubious) support for what you want to be true. I see from your review that this is also one of your concerns.

…, but I raise it because it is illustrative of a tone of arrogance and condescension that I consistently perceived throughout the book.

When Richard Carrier began to blog on FreeThoughtBlogs, I read the first few of his posts. And that tone of arrogance and condescension came through very clearly. I concluded that I cannot trust his judgment. I cannot tell whether his conclusion is based on the evidence, or whether he has his own axe to grind and is using the evidence to back up conclusions that really come from his own beliefs.

3. Ian

Thanks Neil. Yes, that is my underlying concern. It is telling in the book that, though Carrier gives several numerical worked examples, every single value he uses (outside dice rolls) is estimated without any explicit grounds for the estimation. Even if he had correctly articulated what the terms mean, it wouldn’t help his overall argument.

I’m planning to write another blog post about why Bayes’s Theorem isn’t very useful in history, beyond informing the judgement of historians.

Ultimately I think folks without math knowledge get lured into thinking that these very precise, provable formulae give some weight to their output, regardless of the quality of the input. Both Carrier and Lane Craig try to get agreement on inputs and want to claim that means we should all agree on the conclusions generated by the output.

4. Ian

Mark, thanks. I did mean idiosyncratic, yes, corrected now.

5. Grizel

Thanks for this Ian. Well done and understandable to a BT novice like myself.

I was intrigued by Carrier when he appeared on the scene, but have since grown wary (his post on FtB about people being either friends or enemies of Atheism+ made me realize how strongly his biases may drive his work).

When you say “outside of the math” you enjoyed the book, would you mind going into a little more detail? I was under the assumption the book mainly dealt with BT. Does he deal with historical methods and theories on more than a superficial level?

I look forward to reading why you feel BT isn’t very useful in history. Oh, and I am now subscribed to new comments AND posts!

6. Grizel

Ah. I just followed the link you supplied to McGrath’s blog and this answers my question somewhat.

7. Ian

Thanks Grizel. So I share what I think is Carrier’s view on Jesus “Questing” and its results. Carrier spends a good chunk of the book talking about criteria (by far the biggest chapter), and although there are many issues I have there, I share his views about applying them too strongly. I think they are useful to inform general analysis, but are not reliable in their own right. So I think some of Carrier’s discussion of why we should be skeptical is good, and I think some of the conclusions he draws are valid. I don’t think the criteria are quite as irredeemable as he does (because I can see how he’s stacked the mathematical deck against them), but they are broad and sensitive heuristics. And when Carrier lets himself off from trying to make everything and his dog about BT, he argues quite cogently and interestingly, I thought.

I did get the sense, however, that he is somewhat arguing against a type of Historical Jesus scholarship that is waning in the academy. We’ve come a long way from the third Quest now, I think, and more recent books on HJ don’t tend to be so atomic in the way they apply criteria.

8. Interesting review, thanks.

Because Christian apologists and counter-apologists use BT in their arguments, it seems important for a “seeker” to have a general knowledge of BT and how it’s used in these arguments.

I’ve heard that Bolstad’s Intro to BT is a good source for beginners to learn the basics. Do you have any other recommendations?

9. Ian

Geoffrey, thanks for commenting, and welcome to the blog. I don’t have a good recommendation, no, though googling comes up with a few options that seem reasonable. I’ve written, but have yet to illustrate, a major post here introducing probability theory (including Bayes’s Theorem, but not only that). I hope I’ll get it finished this week. If you subscribe to posts in the top right of the blog, you’ll get notified, and you can cancel your subscription right from that notification email.

10. Interesting criticisms – they seem important. The math escaped me, however. Maybe your next post will help. Thanx.

11. PeeJ

@Neil Rickert: “When Richard Carrier began to blog on FreeThoughtBlogs, I read the first few of his posts. And that tone of arrogance and condescension came through very clearly. I concluded that I cannot trust his judgment. I cannot tell whether his conclusion is based on the evidence, or whether he has his own axe to grind and is using the evidence to back up conclusions that really come from his own beliefs, …”

Maybe if you paid more attention to the evidence he presents and the argument he makes you would not have that problem. I can assure you that relying on your own assessment of actual evidence and strength of argument is better than relying on someone’s conclusion based on what you perceive their attitude to be.

12. PeeJ

It is clear that Carrier has a layman’s understanding of Bayesian methods, I’ll give you that. But I find it less offensive than you because I am not a mathematician (though I do have a B.S. in math) and also because I think he is going about it (inaptly to some extent but also with some validity) in good faith. William Lane Craig, who clearly has almost no understanding of the math, is misapplying it in bad faith.

History is not a rigorous discipline in the same way that the sciences are. I don’t fault Carrier for trying to inject more rigorous methods into what is otherwise a “who can convince the most people with the cleverest argument” discipline. To approach the subject as a question of probabilities is, in my eyes, a very good thing. And for that, I am willing to cut some slack. The big but: but we should neither accept nor discard any particular conclusion or assessment without examining whether that particular result is justified by the application of the method in that case.

13. Ian

@PeeJ – it doesn’t surprise me that you think Carrier made honest mistakes in good faith, while Craig is a nefarious schemer trying to deceive. But can’t you recognize that your assessment is a transparent statement of your biases? There are plenty who would make the opposite judgement, for exactly the same reasons as you: they happen to agree with one side rather than the other.

This is not how real scholarship works. It doesn’t matter who is arguing in good or bad faith, it matters how good their arguments are (yes, ‘who can convince the most people with the cleverest argument’ – nothing wrong with that – and that is exactly Carrier’s gambit, to use the math to make him sound ‘cleverer’). Both Carrier and Craig claim to be bringing more rigour into the discussion with Bayes. They are both wrong, they both end up using it as a fig leaf to their predetermined conclusions. It is telling, for example, that both of them discover, using Bayes’s Theorem, that the position they’ve been arguing all along happens to be supported by the theorem – what an amazing coincidence! That I personally happen to disagree strongly with Craig, and agree mostly with Carrier is irrelevant.

Your “big but” is right, and it seems we’re probably on the same side overall, so how about we all ante up and call a spade a spade? We have to be willing to say “we generally agree with you, Carrier, but when you pull out pseudoscholarship to prove your point, you don’t get a free pass.” That way we get to be more credible when we tell Craig he’s talking rubbish.

Otherwise we may as well just uncritically accept anything said to us by people we agree with, and reject outright those we disagree with. That’s not the mark of people who are honestly seeking truth.

14. You make it sound as if P(H|E) = P(E|H)P(H)/(P(E|H)P(H) + P(E|~H)P(~H)) is some obscure and unnecessarily complex form of Bayes’s formula, when it is probably the most common way the formula is expressed in textbooks and is usually more useful than just writing P(E) in the denominator, since P(E|~H) is independent of P(E|H) and P(H). What’s really strange about Carrier’s way of writing the formula is the inclusion of the background information explicitly, which he may have picked up from Lane.

Carrier’s explanation on p. 51 for the prior probability and the lightning example I found particularly annoying. He writes as if one couldn’t use H=”probability of being struck by lightning 5 times” and E=”someone claims to have been struck 5 times” to calculate the probability that someone who claims to have been struck by lightning 5 times is telling the truth. Sure, it may be very difficult to estimate P(H) or P(E) this way, but that doesn’t make this approach “wrong.”

Instead, without informing the reader, he treats E as if it were really the intersection of two events, E and F, with E = “someone claims to have been struck 5 times” and F = “someone makes a similar sort of claim,” with E of course contained within F, and then treats F as part of the background:

P(H|EB) = P(H|EFB) = P(E|HFB)P(H|FB)/P(E|FB).

There’s nothing wrong with this mathematically, and it may even make the calculation easier, but it would be nice of him to say that this is what he is doing, since it doesn’t match his description of the formula.

15. Ian

“You make it sound as if P(H|E) = P(E|H)P(H)/(P(E|H)P(H) + P(E|~H)P(~H)) is some obscure and unnecessarily complex form of Bayes’s formula, when it is probably the most common way the formula is expressed in textbooks ”

It isn’t the most common way to introduce it, in my experience (which is using statistics in bioinformatics and computing), though obviously it can be useful if you have its terms. Any textbook worth using should explain why Bayes’s Theorem is how it is. And I’d be surprised if anyone tries to do this with the rewritten denominator. Of course, depending on your domain, it may be the form you want to use most often. In the cases with which I deal now (AI) it isn’t. Even if so, you may need to expand some other terms. Unless you know the definition of conditional probability, Bayes’s Theorem is just some magic mathematical box, which is how Carrier is introducing it, in my opinion.

The rest, yes, I agree with you. But I’ll respond to your comment on the other post later in more detail.

16. I think I understand what you’re getting at – that Carrier should’ve derived the formula, since it’s not that hard and would give the readers a better idea of what it means – and I agree, but I don’t see anything wrong with initially stating the formula using an expanded denominator, since that is the way it will be used.

I took a look through the probability books that I have and found that out of 12 books where I could find a definition of Bayes’s formula/theorem/rule, 9 of them stated it with the expanded denominator, 2 with only P(E) in the denominator, and 1 gave both as parts (a) and (b) of Bayes’s theorem. Naturally any proof of the theorem starts with the definition of conditional probability, rearranges the numerator, and then expands the denominator, but not all books proved the formula – some just stated it. One of the 2 books that defined it using only P(E) then gave the other form as an application in the next sentence.

One of the most effective examples for introducing Bayes’s formula that I’ve seen is the medical testing one, and that one uses the expanded denominator. Carrier certainly could’ve worked through one of those, considering how much other space in the book was wasted with pedantic fluff or repetition.

17. Ian

Interesting, and quite surprising. Maybe the issue isn’t limited to Carrier then. How do your books explain what Bayes’s Theorem is doing, then, without mentioning what the denominator is? Do they just introduce it as a magic black box?

Proving Bayes’s Theorem is fine, and trivial with the right starting point. But I’m less concerned pedagogically with proofs and more with explanations.

I have a slight problem with the medical testing example done with the long-form denominator, too. It strikes me that P(+ve test) is at least as accurate as P(+ve test|~disease), since the former is obviously measurable, where the latter assumes accurate exclusion of diagnosis (not an unwarranted assumption, but unnecessary — the numerator assumes accurate diagnosis, of course, which is a much worse assumption, as the medical example is usually tasked with showing).

18. Not surprising at all to me because when I was in school (more than 20 years ago) it was taught with the expanded denominator. The formula is such a trivial derivation from the definition of conditional probability and the axioms of probability theory that it’s hardly worthy of being called a theorem, especially if all you’ve done is rearranged the numerator. Ultimately the only reason to give it a name is because it is frequently used, so I guess the practice would be to label whichever form is most common in applications; or perhaps Bayes originally used it in this form (I have no idea if he did).

The books I looked through varied a lot in level and target audience, but almost all of them derived the theorem. Naturally the proof would include the version you prefer as an intermediate step.

In medical testing, you generally see reported the incidence of false positives (P(E|~H)) and accuracy (P(E|H)). It wouldn’t make much sense to report P(E) because that would involve assumptions about the distribution of the disease in the population (i.e., P(H)). If someone introduces a new test, you can compare it to already existing ones by comparing their P(E|~H) and P(E|H). No one would just compare P(E) because that tells you nothing about which is a better test without making further assumptions, which lowers the reliability of your assessments of these values. When using Bayes’s formula in the medical context, the P(E|H) and P(E|~H) come from the manufacturer, in controlled experiments where the incidence of the disease in the general population is irrelevant (and these are generally much better known than P(H) or P(E)), whereas P(H) comes from a different group of medical researchers doing epidemiology.

19. Ian

Interesting, thanks. I agree that the long-form is neither surprising nor problematic (I just think it isn’t as understandable). Obv. a trivial extension. As for naming it, looking around, I think “Bayes’s Theorem” seems to be used generally for anything where P(A|B) (or a function of it) is expressed in terms of P(B|A). Bayes’s original form was neither: http://www.stat.ucla.edu/history/essay.pdf Laplace was the person who first brought it to light, and after wrestling for years with it, he finally concluded that the problem of priors was insurmountable, and became a frequentist.

I’m not very familiar with the medical use, other than in automated diagnosis (I do AI for a living), and there my knowledge is in the more algorithmic internals. So it is interesting to hear what you’re saying.

I agree that P(E) would be useless in assessing a test, which is a good practical point (but a rather different probability question – it is no surprise that finding a different confidence should require different inputs!).

Your main point seems to be that P(E|H) and P(E|~H) are manufacturer supplied based on highly controlled tests, and therefore likely to be much higher accuracy than aggregate statistics acquired in the field. The dependence of P(E) on P(H) you refer to is presumably because we don’t tend to test people unless we have some reason to suspect P(H), so if S is suspicion, using the rates of positive tests would be equivalent to doing P(H|E,S) = P(E|H) P(H) / P(E|S), which is not valid. Both good points I missed.

I’ve appreciated the punctilious responses, thanks.

20. Hi, first time here. A friend pointed me to this critique recently, as I’m plowing through Carrier’s book at the moment.

“There are models of God I don’t have a problem with, but they aren’t the ones people think of when you say “God”, so I don’t find it helpful to label my beliefs with the term… I used to be a Christian. I still live in a Christian culture. Most of my friends and family are Christians.”

Precisely.

I’m a research engineer by profession, so I have a natural propensity for technical things. Perhaps birds of a feather in natural bent with a scientist chap like yourself.

When my feelers first shot up regarding Christianity, I dug in quite hard on the research side of things, and I’ve now wound up an unbeliever. My autobiography of departure is over at my blog under “Journey,” which may or may not be of interest. My departure began from evolutionary dilemmas but centered around the historical collapse of Judaism as a foundational implosion for atonement surrounding Jesus. There is more data, as they say, for a firm ruling about the multiple grand claims in the OT than there is surrounding the meagerly documented life and times of Jesus. (And here I certainly agree with Carrier about the despairing state of Historical Jesus studies.)

21. Ian

Thanks for the kind words. I’m probably not going to update the blog much this year, for reasons that I don’t want to make public. I’ve been reading through your blog for the last hour. There are startling similarities yes! If you want to connect, I’d be happy to ping you by email. I have your email, but just reply to the comment if you want to.

22. Ian, sure I’d love to exchange email. It’s the one there on the site…

23. Reblogged this on Research Reviews and commented:
It was interesting to find another mathematical perspective concerning Carrier’s book. However, unlike the author of this treatment, I have somewhat of a background in history and historical methods in addition to a mathematical background (or focus; at least of late). Apart from studying French, German, and Italian, my degree in classical languages required studying…well…classical languages (ancient Greek and Latin). Anybody who has done this at the university level knows that one can’t actually study these languages without studying history. So the only thing I would add to the review of Dr. Carrier’s mathematical treatment of “Bayes’ Theorem” proffered by the blog repost below is that the historical treatment in the book is bereft of any historiographical value. Hopefully, this is because Dr. Carrier sought to be accessible to a wide audience. The sequel to Proving History may prove much more valuable.

24. Elliott

I think Carrier is fine with using Bayes’s Theorem as a tool for evaluating claims about history, although I wish that he had perhaps gotten a mathematician to write up the discussion of BT. Either this or he could have included an appendix where he goes into detail about it. Preferably this appendix would have a brief explanation of set theory and probability theory. I agree with the author of this post that Carrier’s treatment of the relevant math is very hard to follow, and could be cleared up by just using some notation, the art majors be damned!

That being said I do want to defend his use of the more complicated form of BT. Since Carrier is trying to get people to consider how well different hypotheses explain the set of evidence that we have, I think he’s ok with breaking up P(E) into its different parts. Now I wish he had bothered to explain what exactly he mentioned by “e.b” right away rather than when he did, or nothe worries so much about the distinction between “e” and “b”, although I appreciate

25. Ian

I’ve written a few more posts on why Carrier isn’t fine using BT in this way.

I think he’s ok with breaking up P(E) into its different parts.

The problem is twofold

a) He doesn’t say he’s breaking it up into different parts, in fact he doesn’t discuss the fact that the denom is P(E) at all. I suspect this is because, if you say “The probability of the evidence” it will be much clearer that you’ve no hope of actually coming up with a sane number, whereas by obfuscating this it makes it less obvious.

b) Breaking up an expression like that, really just substituting P(E) for an expression that evaluates to P(E), is only justified if the component parts are more easily acquirable than the whole. If you’ve got a reasonable likelihood of knowing P(E|H) and P(E|~H) to a degree of accuracy that is much higher than you would if you estimated P(E). The two are not equivalent practically, since more terms independently acquired introduces more independent sources of error. There are cases where it is advisable to use this, but not this one.

So I still stand by my assessment that this form of BT is only used because it is more bamboozly than the simpler form. An assessment which I think is backed up by his lack of desire to explain the probability to his readership, and his tendency to further obfuscate the equation with background knowledge.