Vague Knowledge

[This isn't another post on Richard Carrier's Proving History. It is, however, a post I suggested I might write as a follow on to my review, that explores another way to think mathematically about knowledge, since there are quite a few folks who've arrived here because of my posts on probability theory.]

So in recent posts we’ve been looking at probabilities and their use to talk about knowledge. Probability, as we saw, relies only on some general notion of ‘likelihood’, and in one of the comments I mentioned that probabilities can be used for a whole range of related ideas. The most common are the odds of something happening (a so-called frequentist interpretation) and the confidence one can have in something being true (the Bayesian interpretation). In many cases these two interpretations are rather similar: the odds of rolling a 6 on a die may seem obviously the same as your confidence that a hidden die-roll came up 6. The two interpretations do diverge, however; they aren’t always the same (but that’s not the point of this post, so I won’t go into the Frequentist-Bayesian problem).

But there are other ways we can interpret probability, and one of those will let us jump into some territory that is less well trodden, but even more fruitful.

We can use probability to represent how true something is. This is quite different from a Bayesian interpretation which is how confident we can be that something is black-and-white true. We’ll call our ‘how true’ interpretation the fuzzy interpretation, because it treats truth as somewhat fuzzy.

Let’s take an example (and to avoid the criticism that this is about Carrier or mythicism, let’s talk sport). If I described an NBA basketball player, and asked “What is the probability that he is tall?”, a Bayesian might say: “let’s define tall as over 6’6″, how confident am I that an NBA player is at least that tall?” A fuzzy interpretation might say “someone who is 5’6″ is not at all ‘tall’, someone who is 7′ is entirely ‘tall’, people in between are somewhat tall, how tall is my player?”

So let’s imagine that the average height of NBA players is 6’9″, and only 30% are shorter than 6’6″. The Bayesian then says the probability of your NBA player being tall is 70% (because only 30% aren’t). The ‘fuzzyist’ says the probability is 83%, because 6’9″ is 15 of the 18 inches from 5’6″ to 7′, so all we know about NBA players on average is they are 83% tall[1].
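The 83% figure can be reproduced with a small sketch. It assumes a straight-line ramp between the two extremes named above (5’6″ and 7′); the post doesn’t commit to a particular curve, so the linear shape and the function name `tall_membership` are illustrative choices, not the only way to do it.

```python
def tall_membership(height_in, not_tall_at=66.0, fully_tall_at=84.0):
    """Degree (0 to 1) to which a height in inches counts as 'tall'.

    Assumption: a linear ramp from 5'6" (66 in, not at all tall)
    to 7' (84 in, entirely tall).
    """
    if height_in <= not_tall_at:
        return 0.0
    if height_in >= fully_tall_at:
        return 1.0
    return (height_in - not_tall_at) / (fully_tall_at - not_tall_at)


average_nba_height = 6 * 12 + 9  # 6'9" expressed in inches
print(f"{tall_membership(average_nba_height):.0%}")  # prints 83%
```

Note that the Bayesian’s 70% can’t be computed this way: it needs the distribution of heights, not just one height and two endpoints.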

There are some bits of knowledge that are black-and-white, but perhaps not many. The philosophical problem of vagueness is important (see my post on Sorites paradox for some more background), and it crops up in many places. When you ask questions about knowledge, it turns out to be very difficult to be precise enough to avoid having vague criteria. Even in our Bayesian NBA example there is a small degree of vagueness: how do we measure? Should the person be standing as straight as they can? How firmly do we press down their hair? Do we round near values up or down? How confident are we in our measuring stick? In this case, the vagueness probably isn’t important, but it is still there.

We can get away with ignoring vagueness if we can be specific enough, but in many cases it would be impossible to agree standards specific enough that no degree of interpretation is needed. So fuzzy values turn out to be a very natural way of representing knowledge in a large number of domains. It turns out to be much easier to agree the extremes than to find an exact boundary in the middle where something flips from being false to being true.

So far I’ve said that ‘fuzzy values’ are another way of interpreting probability. They certainly can be. Probability theory is a perfectly valid way to model and manipulate them. But it isn’t a very common one. In fact, it turns out that the math of probability theory isn’t very good for representing non-trivial reasoning.

Logic has a series of logical connectives (also called ‘truth functions’) which combine bits of knowledge into bigger wholes. So ‘and’ is such a connective: if we have claims A, B, then we can form a new claim “A and B”. Other connectives are “not” and “or”, and most significantly, “therefore”. Probability theory can model these connectives, but doing so requires us to figure out the way the probability of each claim depends on all the other claims before we can combine them. This is difficult at best and often impossible.

Here’s a sample of the difference in math. If I have two independent claims, A with a Bayesian probability of P(A)=0.5 and B with P(B)=0.2, then the probability of both claims, “A and B”, is:

P(A and B) = P(A)*P(B) = 0.1.

if I’m interested in P(A or B) I get:

P(A or B) = P(A)+P(B)-P(A and B) = 0.6

In fuzzy logic if A has a value e(A)=0.5 and B is e(B)=0.2, then

e(A and B) = min(e(A), e(B)) = 0.2[2]

and

e(A or B) = max(e(A), e(B)) = 0.5

Both have the same definition of “not”:

P(not A) = 1-P(A)

e(not A) = 1-e(A)

And, you can confirm that the numbers work for logical identities such as

A or B = not (not A and not B)

Note that in the probability case we have to make sure that the two claims are totally independent, otherwise the calculation is wrong. In the fuzzy case, this is not so. The fuzzy approach is more general.
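The numbers above, and the identity, can be checked with a few lines of Python. This is just a sketch: the function names are mine, the probabilistic ‘and’/‘or’ are only valid under the independence assumption, and the fuzzy versions are the min/max (Zadeh) operators used above.

```python
# Probabilistic connectives. p_and and p_or assume A and B are independent.
def p_not(p):
    return 1.0 - p

def p_and(pa, pb):
    return pa * pb

def p_or(pa, pb):
    return pa + pb - p_and(pa, pb)

# Fuzzy (Zadeh) connectives.
def e_not(e):
    return 1.0 - e

def e_and(ea, eb):
    return min(ea, eb)

def e_or(ea, eb):
    return max(ea, eb)

a, b = 0.5, 0.2

# The identity: A or B = not (not A and not B)
p_identity = p_not(p_and(p_not(a), p_not(b)))
e_identity = e_not(e_and(e_not(a), e_not(b)))

print(f"probability: and={p_and(a, b):.2f} or={p_or(a, b):.2f} identity={p_identity:.2f}")
print(f"fuzzy:       and={e_and(a, b):.2f} or={e_or(a, b):.2f} identity={e_identity:.2f}")

# A fully dependent pair: combine claim A with itself.
# The product rule gives 0.25 where the right answer is P(A) = 0.5;
# the min rule returns 0.5.
print(f"'A and A': product={p_and(a, a):.2f} min={e_and(a, a):.2f}")
```

The last line is the simplest case of dependence I can think of: the product formula goes wrong as soon as the independence assumption fails, while min gives back the value you started with.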

When probabilities are used with operations commonly found in logic, it is called ‘probabilistic logic’, whereas one of the alternate mathematical formulations is normally called a ‘fuzzy logic’ (note that we can still use our ‘fuzzy interpretation’ with a probabilistic logic, the fuzzy interpretation is just how we interpret the number, not what we do with it afterwards).

Unless you have good reason not to, I’d suggest that any logical reasoning with knowledge should probably be done with fuzzy logic, not probability theory.

Aside from the ability to easily do math on common concepts such as ‘and’ and ‘therefore’, fuzzy logics also provide sets of tools for modelling other ideas about knowledge. There is a set of mathematical tools to represent intensives, known as hedges. So “X is *very* Y” has a natural mathematical form in fuzzy logic. We can put all this together in questions such as “if very tall players make successful centers, is a not-very-tall player better as a forward?”, and it will have a natural mathematical expression that can be manipulated, and a truth value will result (i.e. the result will be “from zero to one, how true is it that a not-very-tall player is better as a forward?”).
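As a rough sketch of what a hedge looks like: one common convention (going back to Zadeh) models “very” by squaring the truth value and “somewhat” by taking its square root. The post doesn’t say which definitions it has in mind, so treat these as assumptions; the input value of 0.75 is also made up for illustration.

```python
def very(mu):
    # Assumed convention: 'very' concentrates a truth value by squaring it.
    return mu ** 2

def somewhat(mu):
    # Assumed convention: 'somewhat' dilates a truth value with a square root.
    return mu ** 0.5

def e_not(mu):
    return 1.0 - mu

tall = 0.75  # hypothetical: how true "this player is tall" is

very_tall = very(tall)              # 0.5625
not_very_tall = e_not(very_tall)    # 0.4375

print(f"tall={tall} very tall={very_tall} not very tall={not_very_tall}")
```

So a player who is fairly tall (0.75) is only about half-way to being very tall, and “not very tall” is a little under half true of him. Those values can then be fed into the ‘and’, ‘or’ and ‘therefore’ operators like any other truth value.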

Fuzzy logic has advantages over probabilistic systems.

The advantage of not having to account for dependence is one. A second is that the output of fuzzy reasoning is generally more robust to inaccuracies in input data. Fuzzy logic systems can cope with hundreds of thousands of different pieces of input data, some dependent, some independent. Probabilistic logics often struggle beyond a few inputs (here I mean different sources of data, probabilistic logics tend to be fine when you have masses of data that all mean the same thing, like when you’re analysing SAT scores for a state).

The results or intermediate values in probabilistic calculations often get very small, making errors an absolute nightmare to handle in practice. As I showed in the last post on probability theory, Bayes’s Theorem isn’t nicely behaved with small inputs.

But the biggest, by far the biggest, win with fuzzy logics, is that there is a way to turn expert knowledge and expert reasoning into math. So, in our NBA case, we can interview expert talent scouts, and they can come up with rules like “If a player is fast, but is easily put off his dribble by a larger opponent, then I try him with strength drills, if he shows the potential to keep his ground, even if his skills don’t stay as precise, he’s worth a second look.”
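To give a flavour of how a rule like the scout’s might be written down, here is a deliberately tiny sketch. Everything in it is an assumption made for illustration: the variable names, the input truth values, and the choice to read “but” as a min-style ‘and’. Real fuzzy rule systems are a good deal messier than this.

```python
def e_and(*values):
    # Zadeh 'and': the rule is only as true as its weakest condition.
    return min(values)

# Hypothetical scouting judgements for one player, each from 0 to 1.
fast = 0.8
put_off_dribble_by_larger_opponent = 0.6
holds_ground_in_strength_drills = 0.7

# "If a player is fast, but is easily put off his dribble by a larger
# opponent, then I try him with strength drills."
try_strength_drills = e_and(fast, put_off_dribble_by_larger_opponent)

# "If he shows the potential to keep his ground ... he's worth a second look."
worth_second_look = e_and(try_strength_drills, holds_ground_in_strength_drills)

print(f"try strength drills: {try_strength_drills}")
print(f"worth a second look: {worth_second_look}")
```

The point is only that the expert’s sentence maps onto the math more or less clause by clause, which is what makes it practical to sit down with a scout and tune the rules.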

This is part of my skepticism about how useful probability theory is in the humanities. Because it has so few rules, and they are so universal, reasoning is reduced to estimating inputs. There is no place for process, for higher order reasoning. It encourages atomization of complex problems, without due consideration of whether the pieces can be reassembled into anything meaningful.

Many automated financial trading systems work on fuzzy logic, for this reason; it is relatively easy to add sophisticated rules, new sources of evidence, and new evaluation criteria. You can easily bring the results of probabilistic calculations into a fuzzy logic core[3], but not the other way around.

If anyone is serious about putting reasoning in the humanities on a mathematical footing, it might be a good place to start, because it works well in domains where things aren’t clean. The kind of tricky judgement-call reasoning, in the presence of huge bodies of contradictory evidence, where it isn’t clear what should be an exception and what should be a rule. That’s where it gets used every day.

[1] Some ‘fuzzyists’ might object to basing an estimate off the average NBA player height at all, and would insist they can only give you the result if you tell them the player in question’s exact height. Different probability theorists can be quite finicky about what statistics they allow in their calculations, and reasoning with fuzzy values isn’t normally done with averages in this way. I intend this example to be more illustrative of the meaning of truth values, not an example of actually doing statistics with these quantities.

[2] I alluded to the fact that there are different ways to do the math in fuzzy logic. This definition of the AND operator is the most common; it is called a Zadeh operator. Fuzzy logic math is actually determined by the ‘therefore’ operator and the modus ponens rule of classical logic through a mathematical relation called the ‘T-norm’. The details of this are irrelevant here, but I mention it because probability theory defines one such T-norm, and this is how we can show that probability theory forms a perfectly valid basis for reasoning about fuzziness.

[3] At the simplest, you can simply interpret your Bayesian probability for some claim X as the fuzzy truth of the statement “I am confident that X”. The latter statement is one with graduated truth.


5 responses to “Vague Knowledge”

  1. stuart

    Fascinating stuff.

    Richard Carrier’s argument is that since all reasoning is Bayesian reasoning then all we have to do is to be more explicit about it. It seems to me that there is a fatal problem with this. We simply don’t know what happens in our brains when we make judgements. For example, it should be possible for a chess grandmaster to explain exactly why he made a particular move, but in practice this isn’t possible. Chess grandmasters simply can’t explain why they make certain moves because they don’t know what’s happening in their brains at the time. A grandmaster can no more tell you how the decision was made than a tennis player can tell you exactly how much force was exerted by every single muscle in his body when he played a shot.

    Richard Carrier himself doesn’t know what processes occurred in his brain when he arrived at the judgement that Jesus never existed. So when he explains in his next book how bayesian reasoning shows that Jesus never existed he won’t even be telling us how he arrived at that conclusion himself. He will just be offering a plausible sounding rationalisation.

  2. Ian

    Thanks Stuart, and welcome. I think you’re right. I think probability theory can help guide our intuitions, and can be useful to show obvious mistakes in reasoning, but real historical research isn’t so atomic. We simultaneously weigh lots of bits of evidence against one another, and use quite elaborate heuristics to guide the process.

    My original review of Carrier’s book concluded by saying that the book should be seen as polemic rather than math. A rational defence of his views, using the domain of probability theory, rather than an objective method for arriving at them.

    Of course, we’re yet to see how he applies the process to the HJ question.

    One of the interesting things about fuzzy logic, is that it is implemented in software in ways that are deliberately intended to allow experts to experiment with their intuitions to generate good rules. It isn’t perfect. The reality is a lot more messy than I made out above, but it is about the best tool I can think of for doing it.

  3. Orwin O'Dowd

    Hi, Ian, your caution about higher-order issues rings a bell here. Richard Bacon has a proof, responding to the Sorites paradox, that vagueness ramifies to any order. One can say the same for Russell’s famous theory of types, which may be the cost of the vague claim that there exist statements or propositions and such. The effect in Medieval thought was infinite hierarchies of angels and demons, and the sense still lingering that we might yet evolve a whole way. But I must say this doesn’t bode well for the fuzzy interpretation, and fuzzy controls for appliances have quietly tanked. Is fuzzy finance doing any better? Quantum excitations also ramify to any order, converging on the ionization energy, which suggests that physical chaos may be a better analogy. With a strange attractor, the essence is vague but the effect is categorical: you can’t say what it is, but you know when you’ve got it. That resonates in history with ideas like the Stoic Way of Nature, the Chinese Mandate of Heaven, and Christian grace.

  4. Ian

    I’m sorry as a whole I found this response incomprehensible.

    “One can say the same thing about Russell’s famous theory of types” – well one clearly can, since “ramifies to any order” is surely a direct reference to Russell’s Type theory. If that phrase were used in any context, surely that would be the allusion. You may want to say a bit more about how this applies to Sorites paradox.

    “The effect in Medieval thought was infinite hierarchies of angels” And a 90 degree turn… if you can’t be explicit about why Russell’s type theory had the effect of inspiring speculation about angels hundreds of years before it was suggested, then I feel entitled to start thinking you’re a bit of a crank.

    “fuzzy controls for appliances have quietly tanked. Is fuzzy finance doing any better” Fuzzy control is alive and well. It is totally pointless in applications with small numbers of inputs and relatively simple rules. You may as well just write a control function in that case. “Fuzzy logic” rice steamers were as much a marketing plot as the fad for things having “neural networks” a few years later.

    “Quantum excitations also ramify to any order, converging on the ionization energy” Right, now we’re back on stuff that I understand to some extent, so I’m pretty sure you’re bluffing here. I assume you’re talking about the Rydberg series. If so, then I understand them pretty well, and they have nothing to do with this topic. I do note, however, that mentioning Quantum anything is a *very* popular tactic for appearing to say something profound while actually uttering nonsense.

    “which suggests that physical chaos may be a better analogy.” Analogy for what? Not to mention another non sequitur. If we’re talking about electron binding energies, then pray, how does this suggest deterministic chaos at all?

    “With a strange attractor, the essence is vague but the effect is categorical: you can’t say what it is, but you know when you’ve got it.” Now you’ve wandered into territory I know quite a lot about, having done a fair amount of complex systems research academically and professionally. And again I’m pretty sure you’re making it up because things like ‘chaos’ and ‘strange attractors’ sound mystically profound. You’ll need to get very specific here, I think, before I can begin to take this seriously. As for the clause after your colon – both bits of that sentence are simply false.

    “That resonates in history with ideas like the Stoic Way of Nature, the Chinese Mandate of Heaven, and Christian grace.” And, surely it is also at the heart of the obscurantist’s doctrine of semiotic fungibility!

    A few minutes and I was able to find your real name, ‘research’ and a plethora of reviews that show I’m not alone in thinking that you substitute buzzword soup for anything of substance… Ah, life is interesting.

  5. This is really a great post!

    One of my main problems with single point valued objective Bayesianism is that it cannot make a difference between warranted knowledge and ignorance.

    As I wrote in a post:

    https://lotharlorraine.wordpress.com/2014/03/02/a-mathematical-proof-of-ockhams-razor/

    **********************
    The principle of indifference is not only unproven but also often leads to absurd consequences. Let us suppose that I want to know the probability of certain coins to land odd. After having carried out 10000 trials, I find that the relative frequency tends to converge towards a given value which was 0.35, 0.43, 0.72 and 0.93 for the four last coins I investigated. Let us now suppose that I find a new coin I’ll never have the opportunity to test more than one time. According to the principle of indifference, before having ever started the trial, I should think something like that:

    Since I know absolutely nothing about this coin, I know (or consider here extremely plausible) it is as likely to land odd as even.

    I think this is magical thinking in its purest form.
    **************************************

    Such concern has been leading an increasing number of Bayesians to consider degrees of belief as probability intervals instead of single values.
