### Ellsberg's Wager

There is a bag containing at least one token. Every token is either black or white. One token will be drawn at random from the bag and then returned to it, say at noon on Sunday. Now imagine that everyone in some community has one black token and one white token. They can place one (and only one) on a Magic Box before noon, and if its colour matches the token drawn out at noon, then a Desirable Prize is dispensed. There's no penalty for getting it wrong. Which token should they choose? For those who are interested, we're playing a game whose payoff matrix is the identity.

the bag's outcome \ token choice | black | white |
---|---|---|
black | 1 | 0 |
white | 0 | 1 |

Obviously the best strategy depends on what the bag contains. Suppose it's a *balanced* bag that contains an equal number of white tokens and black tokens. Even small children will tell you that either token is equally good in this case, and when psychologists have investigated people's choices, about 50% choose black and 50% choose white. What if we have no idea what's in the bag, except that nothing but a white or a black token will be pulled out? It's a bag of *mystery*! Again either token has a 50-50 chance, and given the choice, in this case too people will choose the two tokens in roughly equal proportions.
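That claim is easy to check with a quick simulation. A minimal sketch in Python; note that modelling the mystery bag as having a uniformly random proportion of black tokens is my own assumption about what "no idea" means:

```python
import random

random.seed(0)

def one_off_win_rate(draw, trials=100_000):
    """Estimated chance of winning a single game by always playing black."""
    return sum(draw() == "black" for _ in range(trials)) / trials

# Balanced bag: equal numbers of black and white tokens.
def balanced():
    return random.choice(["black", "white"])

# Mystery bag: total ignorance, modelled here by giving each trial's bag
# a uniformly random proportion of black tokens.
def mystery():
    p_black = random.random()
    return "black" if random.random() < p_black else "white"

# Both rates come out at roughly 0.5 - the one-off strategies really are identical.
```

(Playing white instead gives the same answer, by symmetry.)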

So let's be clear - the strategies for a one-off game are **identical** for the *balanced* and *mystery* bags. Now suppose we offer people a choice of matching their token against a token drawn from either the *balanced* bag or the *mystery* bag. Most people choose to play against the *balanced* bag, and most of them will not switch to the *mystery* bag **even if the game is repeated**. This is known as Ellsberg's paradox. It's named after Daniel Ellsberg, better known for leaking the Pentagon Papers in an attempt to stop the Vietnam war, by the way. He's now an outspoken opponent of the USA's war in Iraq, and of the frequently threatened attack on Iran.

See, in the repeated game the *mystery* bag offers the opportunity to learn about, and exploit, any imbalance that may exist in the proportion of token colours, while in the *balanced* case you're always stuck at 50% and can never improve upon it. Ellsberg's paradox caused much consternation amongst economists, as their theories are frequently based on the assumption (spoken or unspoken, and frequently attended by hordes of similar assumptions) that all agents act rationally (i.e. in their best interests). Psychologists aren't really sure why so many people choose the *balanced* bag - could it be a commitment to fairness, a preference for not making hard choices, fear of the unknown...? Nobody really knows for certain. But maybe we can make an educated guess.
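The repeated-game advantage is also easy to simulate. A sketch in which the adaptive player simply guesses the colour seen most often so far; the 70%-black composition of the mystery bag is a hypothetical choice of mine, unknown to the player:

```python
import random

random.seed(1)

def play(draw, rounds=10_000, adaptive=False):
    """Win rate over repeated rounds. An adaptive player guesses the
    colour seen most often so far; otherwise guesses are random."""
    counts = {"black": 0, "white": 0}
    wins = 0
    for _ in range(rounds):
        if adaptive and counts["black"] != counts["white"]:
            guess = max(counts, key=counts.get)
        else:
            guess = random.choice(["black", "white"])
        outcome = draw()
        wins += guess == outcome
        counts[outcome] += 1
    return wins / rounds

balanced = lambda: random.choice(["black", "white"])
# A hypothetical mystery bag that happens to be 70% black -
# the player doesn't know this, but can learn it from the draws.
mystery = lambda: "black" if random.random() < 0.7 else "white"
```

Against the balanced bag no strategy can beat 50%; against an imbalanced mystery bag the adaptive player's win rate climbs towards the true majority proportion.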

Let's look more closely at how strategies for the one-off game are arrived at. In the case of the *balanced* bag we have a probability estimate of 50%, and maximum confidence in that estimate - given that the token colour is a random variable, we can't do better. The *balanced* probability is considered to be unconditional. In the case of the bag of *mystery* we have an initial probability estimate of 50% but no confidence at all in it - it's really a representation of our own ignorance, what Bayesians would call a 'prior'.

A mathematical formalism that addresses this notion of confidence is called Dempster-Shafer theory, a generalisation of Bayesian statistics. Take the set ω of outcomes. In the case above ω = {black, white}. We are interested in the 'power set' or set-of-all-subsets of ω. We write this as 2^{ω} = {ø, {black}, {white}, ω}. Now let's define a function called the mass m, which will be a little bit like probability in that it adds up to 1, but it's defined not on ω but on 2^{ω}. We'll define m(ø) = 0 and then we can ignore the empty set completely.
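The power set and a mass function translate directly into code. A minimal sketch in Python (using frozensets so that subsets can serve as dictionary keys):

```python
from itertools import chain, combinations

def power_set(omega):
    """All subsets of omega (including the empty set), as frozensets."""
    return [frozenset(s) for s in
            chain.from_iterable(combinations(sorted(omega), r)
                                for r in range(len(omega) + 1))]

omega = frozenset({"black", "white"})
subsets = power_set(omega)   # [ø, {black}, {white}, {black, white}]

# A mass function assigns weight to subsets of omega; the weights sum
# to 1, and since m(ø) = 0 the empty set can simply be left out.
m = {frozenset({"black"}): 0.5, frozenset({"white"}): 0.5}
assert abs(sum(m.values()) - 1.0) < 1e-9
```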

The mass of a set s is m(s) and represents the 'weight of evidence' that the outcome is in that set, but not evidence for a particular member of that set. At the beginning of our game:

m | Balanced | Mystery |
---|---|---|
ω | 0 | 1 |
{white} | ½ | 0 |
{black} | ½ | 0 |

As the game progresses, we will receive evidence for the *mystery* bag's internal state (the particular outcomes of each draw) - and gradually we will see m(ω) going down, and m({black}) and m({white}) rising. While we will never have the rock-solid certainty of the *balanced* bag system, we could conceivably find that one or other of the outcomes is in the lead and adapt our strategy accordingly.
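The standard machinery for this updating is Dempster's rule of combination, which fuses two mass functions. A minimal sketch; treating each black draw as evidence with mass 0.3 on {black} is an arbitrary discounting choice of mine, not anything dictated by the game:

```python
def combine(m1, m2):
    """Dempster's rule of combination: intersect focal sets, multiply
    masses, discard the conflicting (empty) intersections, renormalise."""
    out, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                out[inter] = out.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb
    return {s: w / (1.0 - conflict) for s, w in out.items()}

omega = frozenset({"black", "white"})
m = {omega: 1.0}                  # the mystery bag: total ignorance

# Hypothetical evidence from observing one black draw.
black_draw = {frozenset({"black"}): 0.3, omega: 0.7}

for _ in range(3):                # three black draws in a row
    m = combine(m, black_draw)
# m(omega) has shrunk to 0.7 ** 3 = 0.343; m({black}) has risen to 0.657
```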

For decision making purposes, we usually don't want to work with the masses themselves, but with an interval of probabilities. The *Support* S(p), where p ⊆ ω, is defined as the sum of m(x) over all x ∈ 2^{p}, i.e. all x ⊆ p. The *Plausibility* of p is the extent to which the evidence against p still leaves room for the possibility that the outcome is in p, and is given by Pl(p) = 1 - S(¬p). Note that S(p) ≤ Pl(p).
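These definitions are one-liners in code. A sketch: `support` sums mass over subsets of p, and `plausibility` sums mass over sets intersecting p, which is equivalent to 1 - S(¬p) when the masses sum to 1:

```python
def support(m, p):
    """S(p): total mass committed to subsets of p."""
    return sum(w for s, w in m.items() if s <= p)

def plausibility(m, p):
    """Pl(p) = 1 - S(not p): total mass of sets that intersect p."""
    return sum(w for s, w in m.items() if s & p)

omega = frozenset({"black", "white"})
white = frozenset({"white"})

m_balanced = {frozenset({"black"}): 0.5, white: 0.5}
m_mystery  = {omega: 1.0}

# Balanced bag pins {white} down: S = Pl = 1/2.
# Mystery bag leaves it wide open: S = 0, Pl = 1.
```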

To look at our table again:

 | Balanced m | S | Pl | Mystery m | S | Pl |
---|---|---|---|---|---|---|
ω | 0 | 1 | 1 | 1 | 1 | 1 |
{white} | ½ | ½ | ½ | 0 | 0 | 1 |
{black} | ½ | ½ | ½ | 0 | 0 | 1 |

This sort of reasoning has led some people to assert the nonexistence (or sometimes the non-utility) of probability altogether (whatever that turns out to mean). Judea Pearl, an AI dude from UCLA, has argued that the support is actually a probability of provability, where provability is contrasted with truth. Pearl is also the originator of much of the science of belief propagation (by 'belief' he means 'support') and of causality research (which I've become very interested in recently).

As evidenced by the confused thinking on the subject of decision under uncertainty, a pervasive human cognitive bias appears to be a preference for the experience of irreducible risk over the experience of having to deal with our own ignorance. Knowing about this bad mental habit is certainly useful in itself, but I was interested to find out that there is an entire branch of decision theory based on it, for those situations where it is actually appropriate. They call it 'info-gap theory', which is just about the worst name I could have imagined. It comes as no surprise that the 1950s are ultimately responsible for this crime against nomenclature.

The appropriate domain is wherever the precautionary principle applies: wherever making the wrong decision leads to irreversible harm and there is no further opportunity to make the decision over again - medicine, biodiversity, etc. In info-gap theory we make decisions by optimising robustness under failure, not the expectation of 'utility' (or just 'payoff' in pure game theory). Pascal's wager is a fun example: the argument that you have nothing to lose from believing in a God, because if (s)He doesn't exist you'll still be just as dead as if you hadn't. There is a similar, but subtler, argument attributed to the Buddha.

There is an obvious problem with Pascal's approach: the assumption that Gods will invariably reward belief and/or punish denial. We don't know this. Now we're almost back to our bag of tokens; we have two levels of uncertainty. Drawing any conclusions - that is, deciding whether to believe in Gods, deciding which set of Gods to pick, deciding whether to decide whether to believe in Gods at all, and adapting your strategy to account for potential past or future lives - is left as an exercise for the reader.