[This isn’t another post on Richard Carrier’s Proving History. It is, however, a post I suggested I might write as a follow on to my review, that explores another way to think mathematically about knowledge, since there are quite a few folks who’ve arrived here because of my posts on probability theory.]
So in recent posts we’ve been looking at probabilities and their use to talk about knowledge. Probability, as we saw, relies only on some general notion of ‘likelihood’, and in one of the comments, I mentioned that probabilities can be used for a whole range of related ideas. The most common are the odds of something happening (a so-called frequentist interpretation) and the confidence one can have in something being true (the Bayesian interpretation). It might be obvious that in many cases these two interpretations are rather similar: the odds of rolling a 6 on a die may seem obviously the same as your confidence that a hidden die-roll came up 6. The two interpretations do diverge, however; they aren’t always the same (but that’s not the point of this post, so I won’t go into the Frequentist-Bayesian problem).
But there are other ways we can interpret probability, and one of those will let us jump into some territory that is less well trodden, but even more fruitful.
We can use probability to represent how true something is. This is quite different from a Bayesian interpretation, which is about how confident we can be that something is black-and-white true. We’ll call our ‘how true’ interpretation the fuzzy interpretation, because it treats truth as somewhat fuzzy.
Let’s take an example (and to avoid the criticism that this is about Carrier or mythicism, let’s talk sport). If I described an NBA basketball player, and asked “What is the probability that he is tall?”, a Bayesian might say: “let’s define tall as over 6’6″, how confident am I that an NBA player is at least that tall?” A fuzzy interpretation might say “someone who is 5’6″ is not at all ‘tall’, someone who is 7′ is entirely ‘tall’, people in between are somewhat tall, how tall is my player?”
So let’s imagine that the average height of NBA players is 6’9″, and only 30% are shorter than 6’6″. The Bayesian then says the probability of your NBA player being tall is 70% (because only 30% aren’t). The ‘fuzzyist’ says the probability is 83%: if tallness rises evenly from 5’6″ to 7′, then 6’9″ is about 83% of the way there, and all we know about NBA players is that on average they are 83% tall.
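To make the two answers concrete, here is a minimal sketch in Python. It assumes that ‘tallness’ rises in a straight line from 5’6″ (not at all tall) to 7′ (entirely tall), which is the simplest reading that reproduces the 83% figure; the function name `tallness` and the choice of a straight line are my own illustration, and other shapes would be just as legitimate.

```python
def tallness(height_in, not_tall=66.0, fully_tall=84.0):
    """Fuzzy truth of 'is tall' for a height in inches.

    Assumption: truth rises linearly from 0 at 5'6" (66 in)
    to 1 at 7'0" (84 in), and is clamped outside that range.
    """
    if height_in <= not_tall:
        return 0.0
    if height_in >= fully_tall:
        return 1.0
    return (height_in - not_tall) / (fully_tall - not_tall)


# Fuzzy reading: how tall is the average NBA player (6'9" = 81 in)?
average_height_in = 6 * 12 + 9
fuzzy_answer = tallness(average_height_in)

# Bayesian reading: how confident are we that a player clears 6'6"?
share_shorter_than_cutoff = 0.30
bayesian_answer = 1.0 - share_shorter_than_cutoff

print(f"fuzzy: {fuzzy_answer:.0%}, Bayesian: {bayesian_answer:.0%}")
```

The two numbers answer different questions: one is a degree of tallness, the other a degree of confidence about a sharp cutoff.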
There are some bits of knowledge that are black-and-white, but perhaps not many. The philosophical problem of vagueness is important (see my post on Sorites paradox for some more background), and it crops up in many places. When you ask questions about knowledge, it turns out to be very difficult to be precise enough to avoid having vague criteria. Even in our Bayesian NBA example there is a small degree of vagueness. How do we measure? Should the person be standing as straight as they can? How firmly do we press down their hair? Do we round near values up or down? How confident are we in our measuring stick? In this case, the vagueness probably isn’t important, but it is still there.
We can get away with ignoring vagueness if we can be specific enough, but in many cases it would be impossible to agree standards specific enough that no degree of interpretation is needed. So fuzzy values turn out to be a very natural way of representing knowledge in a large number of domains. It turns out to be much easier to agree the extremes than to find an exact boundary in the middle where something flips from being false to being true.
So far I’ve said that ‘fuzzy values’ are another way of interpreting probability. They certainly can be. Probability theory is a perfectly valid way to model and manipulate them. But it isn’t a very common one. In fact, it turns out that the math of probability theory isn’t very good for representing non-trivial reasoning.
Logic has a series of logical connectives (also called ‘truth functions’) which combine bits of knowledge into bigger wholes. So ‘and’ is such a connective: if we have claims A and B, then we can form a new claim “A and B”. Other connectives are “not” and “or”, and most significantly, “therefore”. Probability theory can model these connectives, but doing so requires us to figure out the way the probability of each claim depends on all the other claims before we can combine them. This is difficult at best and often impossible.
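A small sketch of the difference, in Python. The probabilistic ‘and’ needs a conditional probability as an extra input, so the same two claims give different answers depending on how they are related. The fuzzy versions here use the minimum/maximum/complement definitions usually attributed to Zadeh, which are only one of several possible choices; the numbers and function names are hypothetical.

```python
def prob_and(p_a, p_b_given_a):
    """P(A and B) = P(A) * P(B | A): we must know how B depends on A."""
    return p_a * p_b_given_a


def fuzzy_and(a, b):
    """Zadeh 'and': the conjunction is as true as its least true part."""
    return min(a, b)


def fuzzy_or(a, b):
    """Zadeh 'or': the disjunction is as true as its most true part."""
    return max(a, b)


def fuzzy_not(a):
    """Zadeh 'not': the complement of the truth value."""
    return 1.0 - a


# Hypothetical claims with P(A) = 0.7 and P(B) = 0.6.
p_a = 0.7

# Three hypothetical dependence stories, all consistent with P(B) = 0.6.
for label, p_b_given_a in [("independent", 0.60),
                           ("positively linked", 0.85),
                           ("negatively linked", 0.45)]:
    print(label, round(prob_and(p_a, p_b_given_a), 3))

# The fuzzy conjunction needs only the two truth values themselves.
print("fuzzy and:", fuzzy_and(0.7, 0.6))
```

The probabilistic answer moves with every assumption about dependence; the fuzzy one is fixed once the two truth values are known.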
When probabilities are used with operations commonly found in logic, it is called ‘probabilistic logic’, whereas one of the alternate mathematical formulations is normally called a ‘fuzzy logic’. (Note that we can still use our ‘fuzzy interpretation’ with a probabilistic logic: the fuzzy interpretation is just how we interpret the number, not what we do with it afterwards.)
Unless you have good reason not to, I’d suggest that any logical reasoning with knowledge should probably be done with fuzzy logic, not probability theory.
Aside from the ability to easily do math on common concepts such as ‘and’ and ‘therefore’, fuzzy logics also provide sets of tools for modelling other ideas about knowledge. There are a set of mathematical tools to represent intensives, known as hedges. So “X is *very* Y” has a natural mathematical form in fuzzy logic. We can put all this together in questions such as “if very tall players make successful centers, is a not-very-tall player better as a forward?”, and it will have a natural mathematical expression, that can be manipulated, and a truth value will result (i.e. the result will be “from zero to one, how true is it that a not-very-tall player is better as a forward?”).
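As an illustration of hedges, here is a sketch using one common convention, in which ‘very’ squares a truth value and ‘somewhat’ takes its square root. That convention is an assumption on my part (other definitions are in use), and the starting value is simply the 83%-tall player from the earlier example.

```python
def very(truth):
    """Hedge 'very': squaring makes the claim harder to satisfy.
    Assumption: the common 'concentration' convention, truth ** 2."""
    return truth ** 2


def somewhat(truth):
    """Hedge 'somewhat': the square root makes the claim easier to satisfy.
    Assumption: the common 'dilation' convention, truth ** 0.5."""
    return truth ** 0.5


def fuzzy_not(truth):
    """Complement of a truth value."""
    return 1.0 - truth


tall = 15 / 18                      # the '83% tall' player
very_tall = very(tall)              # lower than 'tall'
not_very_tall = fuzzy_not(very_tall)
somewhat_tall = somewhat(tall)      # higher than 'tall'

print(f"tall={tall:.2f} very tall={very_tall:.2f} "
      f"not very tall={not_very_tall:.2f} somewhat tall={somewhat_tall:.2f}")
```

Hedges compose with the connectives, so “not-very-tall” is just the complement applied to the output of ‘very’.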
Fuzzy logic has advantages over probabilistic systems.
The advantage of not having to account for dependence is one. A second is that the output of fuzzy reasoning is generally more robust to inaccuracies in input data. Fuzzy logic systems can cope with hundreds of thousands of different pieces of input data, some dependent, some independent. Probabilistic logics often struggle beyond a few inputs. (Here I mean different sources of data. Probabilistic logics tend to be fine when you have masses of data that all mean the same thing, like when you’re analysing SAT scores for a state.)
The results or intermediate values in probabilistic calculations often get very small, making errors an absolute nightmare to handle in practice. As I showed in the last post on probability theory, Bayes’s Theorem isn’t nicely behaved with small inputs.
But the biggest, by far the biggest, win with fuzzy logics, is that there is a way to turn expert knowledge and expert reasoning into math. So, in our NBA case, we can interview expert talent scouts, and they can come up with rules like “If a player is fast, but is easily put off his dribble by a larger opponent, then I try him with strength drills, if he shows the potential to keep his ground, even if his skills don’t stay as precise, he’s worth a second look.”
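To show roughly how a rule like that becomes math, here is a sketch in Python. It reads each condition as a truth value between zero and one and takes the minimum as the degree to which the rule applies. The ratings, the dictionary keys and the `rule_strength` helper are all hypothetical, and a real system would have many such rules and a step for combining their outputs.

```python
def rule_strength(*conditions):
    """Degree to which a rule applies: the minimum ('and') of its conditions."""
    return min(conditions)


# Hypothetical scout ratings for one player, each a truth value in [0, 1].
player = {
    "fast": 0.9,
    "put_off_dribble_by_larger_opponent": 0.7,
    "keeps_ground_in_strength_drills": 0.6,
}

# "If a player is fast, but is easily put off his dribble by a larger
#  opponent, then I try him with strength drills."
try_strength_drills = rule_strength(
    player["fast"],
    player["put_off_dribble_by_larger_opponent"],
)

# "If he shows the potential to keep his ground ... he's worth a second look."
# Skill precision is deliberately left out, as the scout's rule ignores it.
worth_second_look = rule_strength(
    try_strength_drills,
    player["keeps_ground_in_strength_drills"],
)

print("try strength drills:", try_strength_drills)
print("worth a second look:", worth_second_look)
```

The point is only that the scout’s sentence maps onto the calculation clause by clause, which is what makes it possible to check the rule with the expert who stated it.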
This is part of my skepticism about how useful probability theory is in the humanities. Because it has so few rules, and they are so universal, reasoning is reduced to estimating inputs. There is no place for process, for higher order reasoning. It encourages atomization of complex problems, without due consideration of whether the pieces can be reassembled into anything meaningful.
Many automated financial trading systems work on fuzzy logic, for this reason; it is relatively easy to add sophisticated rules, new sources of evidence, and new evaluation criteria. You can easily bring the results of probabilistic calculations into a fuzzy logic core, but not the other way around.
If anyone is serious about putting reasoning in the humanities on a mathematical footing, fuzzy logic might be a good place to start, because it works well in domains where things aren’t clean: the kind of tricky judgement-call reasoning, in the presence of huge bodies of contradictory evidence, where it isn’t clear what should be an exception and what should be a rule. That’s where it gets used every day.
 Some ‘fuzzyists’ might object to basing an estimate off the average NBA player height at all, and would insist they can only give you the result if you tell them the player in question’s exact height. Different probability theorists can be quite finicky about what statistics they allow in their calculations, and reasoning with fuzzy values isn’t normally done with averages in this way. I intend this example to be more illustrative of the meaning of truth values, not an example of actually doing statistics with these quantities.
 I alluded to the fact that there are different ways to do the math in fuzzy logic. This definition of the AND operator is the most common; it is called a Zadeh operator. Fuzzy logic math is actually determined by the ‘therefore’ operator and the modus ponens rule of classical logic through a mathematical relation called the ‘T-norm’. The details of this are irrelevant here, but I mention it because probability theory defines one such T-norm, and this is how we can show that probability theory forms a perfectly valid basis for reasoning about fuzziness.
 At the simplest, you can simply interpret your Bayesian probability for some claim X as the fuzzy truth of the statement “I am confident that X”. The latter statement is one with graduated truth.