Why evidential and causal decision theory usually agree

Are you accounting for all the evidence?

A friend of mine once told me about a colleague of his with a controversial opinion: it’s a good thing that elite colleges preferentially admit legacy students. Why? We should expect the children of alumni to share many of the impressive qualities of their parents. For example, the children of Barack Obama are more likely to be exceptionally intelligent and determined than the average applicant to Columbia. Given this, colleges should use what they know about legacy to predict who will succeed.

Even if we put aside considerations of unfairness, this is a specious argument. Having a parent who is an alumnus may be a signal of competence in the absence of any other information. But student applications contain much more information than just legacy status: students must submit their high school transcript, standardized test scores, letters of recommendation, and much more. With all these more direct predictors of academic achievement, it is unlikely that having a parent who is an alumnus will provide substantial additional information about this student’s potential.1

In statistics, we call two pieces of information conditionally independent if knowing one of these pieces of information tells us nothing about the other piece once we account for some additional information we already have. In this example, we might conjecture that the alumni status of an applicant’s parents is (more or less) conditionally independent of the applicant’s future academic performance; once we condition on information such as that student’s grades and test scores, we learn very little about the student’s potential success by learning that a parent is an alumnus. For this reason, alumni status should be irrelevant to a college’s decision of whether to admit a student.

Almost every decision we make in the real world draws on many pieces of information, so most of the time, the type of independence we should attend to is conditional independence. Many observations provide substantial information on their own, but negligible information in the presence of other observations. In what follows, I consider how this concept may offer clarity on a longstanding puzzle in decision theory: Newcomb’s paradox.

The strangest game show in the universe

You’ve found yourself on the set of a game show, and two boxes have been placed in front of you. One box is transparent, and in it you can see a thousand dollars in cash. The other box is opaque. From an intercom, a host informs you that this box either contains a million dollars or nothing at all. At this point, you can see that nobody will be able to manipulate the contents of the boxes. You face a simple choice: you can either (a) take just the contents of the opaque box (a decision we will call “one-boxing”) or (b) take the contents of both boxes (“two-boxing”).

Before you entered the set, however, the producers of the show had fed some information about you into a computer program (which I will call the “Oracle”) whose goal is to predict what you will do as accurately as possible. If the Oracle had predicted that you would one-box, the host placed a million dollars in the opaque box before you entered the set; if it had predicted you would two-box, he put nothing in this box. You know that the Oracle has a track record of making predictions that are clearly better than random chance and that it is discriminating in its predictions (i.e., it doesn’t just predict one- or two-boxing for all contestants).

At this point, we can imagine different versions of how the Oracle might be making predictions. As I will argue below, these details are crucial to understanding what you should do. But first let’s briefly explore why Newcomb’s problem is considered a “paradox.” The problem highlights a rift between two schools of decision theory: “evidential” and “causal.”

Evidential decision theorists believe that the standard tools of probability theory and Bayesian inference are sufficient to prescribe the right course of action in this game and, more generally, for any decision. Because the Oracle can discriminate who will one- or two-box with better-than-chance accuracy, your choice in this game — whether to one- or two-box — seems to provide evidence about the Oracle’s prior prediction and, in turn, whether there is a million dollars in the opaque box. Evidential decision theory, therefore, says that, if this evidence is strong enough, you should one-box to increase your confidence that you’ll win a million dollars.

Causal decision theorists argue that this line of thinking is misguided, and it’s easy to see why. By the time you entered the set, the contents of the opaque box were fixed. So, in the absence of time travel, your decision couldn’t possibly change what the Oracle predicted; thinking that it could is wishful thinking. You should just take the extra thousand dollars and hope that the Oracle predicted you would one-box.

The crux of the disagreement can be formalized with some arithmetic. The goal of both evidential and causal decision theory is to pick the action that maximizes “expected utility.” For simplicity, “utility” here just refers to money earned.2 To compute the utility that is expected, we weigh each possible outcome by how likely we think it is to occur. Call one’s belief about the likelihood of a given outcome a “credence,” where \(\text{Cr}(X \mid Y)\) can be read as “the decision maker’s credence in \(X\) given that \(Y\) is true”. Then, the expected utility of one-boxing (\(1b\)) and two-boxing (\(2b\)) is the following:

\[ \begin{split} EU(1b) & = \text{Cr}(\text{opaque filled} \mid 1b)*\$1000k + \text{Cr}(\text{opaque empty} \mid 1b)*\$0 \\ EU(2b) & = \text{Cr}(\text{opaque filled} \mid 2b)*\$1001k + \text{Cr}(\text{opaque empty} \mid 2b)*\$1k, \end{split} \]

where \(\text{Cr}(\text{opaque empty} \mid A) = 1 - \text{Cr}(\text{opaque filled} \mid A)\) for either action \(A\). But evidential and causal decision theories disagree about what these credences should be. To the evidential theorist, a credence is just a conditional probability, i.e.,

\[ \text{Cr}(X \mid Y) = P(X \mid Y). \]

(We’re going to embellish this definition soon.) Meanwhile, to the causal decision theorist, credences depend on one’s causal beliefs. Without getting into the details, a causal decision theorist would argue in this case that

\[ \text{Cr}(\text{opaque filled} \mid 1b) = \text{Cr}(\text{opaque filled} \mid 2b), \]

i.e., your action in the game is irrelevant to your belief about whether the opaque box has a million dollars in it.

So on the causal decision theorist’s version of expected utility,

\[ EU(2b) = EU(1b) + \$1,000. \]

Since your goal is to maximize expected utility, two-boxing is the obvious choice. For evidential decision theory, however,

\[ P(\text{opaque filled} \mid 1b) > P(\text{opaque filled} \mid 2b) \]

because the Oracle’s history of accurate prediction indicates that it is more probable for the opaque box to be filled in cases in which people one-box than in cases in which they two-box. If the probability on the left is even slightly higher than the probability on the right, then \(EU(1b) > EU(2b)\), i.e., you should one-box.
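For readers who like to see the two theories side by side in code, here is a minimal Python sketch of the comparison above. The particular credence values are placeholders I made up for illustration; nothing in the problem pins them down.

```python
# A minimal sketch of the expected-utility comparison above.
# The credence values are illustrative placeholders, not claims about
# what the "correct" numbers are.

def expected_utilities(cr_filled_if_1b: float, cr_filled_if_2b: float):
    """Return (EU(1b), EU(2b)) given one's credences that the opaque box is filled."""
    eu_1b = cr_filled_if_1b * 1_000_000 + (1 - cr_filled_if_1b) * 0
    eu_2b = cr_filled_if_2b * 1_001_000 + (1 - cr_filled_if_2b) * 1_000
    return eu_1b, eu_2b

# Evidential reasoning: your action is evidence about the prediction,
# so Cr(filled | 1b) > Cr(filled | 2b). One-boxing comes out ahead.
print(expected_utilities(cr_filled_if_1b=0.9, cr_filled_if_2b=0.1))  # ~ (900000, 101000)

# Causal reasoning: the box is already filled or empty, so the two
# credences are equal and two-boxing wins by exactly $1,000.
print(expected_utilities(cr_filled_if_1b=0.5, cr_filled_if_2b=0.5))  # ~ (500000, 501000)
```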

How wily is the Oracle?

By now, you can probably figure out where I’m headed. While a naive version of evidential decision theory uses only one’s decision as evidence for what the Oracle predicted, a more sophisticated version uses much more information. That other information usually renders your action conditionally independent of the Oracle’s prediction, just as a student’s grades and standardized test scores render her parent’s alumni status conditionally independent of her future academic success. However, in the case of Newcomb’s problem, the details of the game matter. In some versions, one’s future action is clearly conditionally independent of the Oracle’s prediction, or at least provides so little new information as to be practically irrelevant. In these cases, you should two-box. In less realistic versions of the game, your action does provide unique information about the past state of the world; therefore, you should one-box.

The omniscient Oracle

Some versions of Newcomb’s problem assume a science-fiction universe in which the Oracle is basically omniscient and can predict the future with perfect, or near-perfect, accuracy.3 Even though this AI makes its prediction in advance of your decision, it has such a detailed model of the universe that it knows you better than you know yourself. Even at the point at which you feel unsure about whether you’ll one- or two-box, the Oracle knows where your thought process will lead. Your brain, after all, is part of the physical world and part of the Oracle’s impeccable model.

Given this scenario, some people contend that there’s no point debating what the ‘right’ choice is, as it’s clear that your choice was predetermined and you lack free will. I believe this line of thinking is misguided: we can divorce questions of free will from questions about what’s rational. After all, most scientists believe that we live in a more or less deterministic universe, which means that all our choices could theoretically be predicted in advance. Yet we can still discriminate rational from irrational decisions.

So what should you do? From your vantage point, you know that if you one-box, you’ll get a million dollars, and you know that if you two-box, you’ll leave with only a thousand. It’s difficult for me to fathom the contorted logic that might lead someone to two-box in this situation. It would seem to depend on a mistaken model of the world according to which your private thoughts are unknown to the Oracle in advance. But you can’t outsmart an omniscient AI, and I don’t see the point in trying. A million dollars is waiting for you in all worlds in which you one-box. You can’t get that money if you two-box.

In short, in this sci-fi universe, causal decision theory leads you down the road to crazy town by recommending two-boxing. If you’re one of those poor losers who steadfastly believes that two-boxing is always the reasonable choice, though, you can stop reading now, as I will next argue that two-boxing is indeed the smart choice in any realistic version of Newcomb’s problem. The question is whether we need causal decision theory to get there.

Just another ML algorithm

Consider a version of the game with a much less impressive Oracle. A very slight majority of men choose to two-box, while a very slight majority of women choose to one-box. The Oracle uses only the contestant’s sex to predict what he or she will do, so the Oracle puts a million dollars in the opaque box for all female contestants and nothing in the opaque box for all male contestants.

If you were a contestant in this game, what would you do? Well, if you already knew how the majority of your sex chooses to play the game, you know what’s in the opaque box, so your action couldn’t possibly give you any additional information about what the Oracle predicted; i.e., your action is conditionally independent of the prediction. But that’s not very interesting.4 Instead, imagine you know how the Oracle is making its prediction — by simply predicting the majority decision for each sex in its training data — but not whether your sex predominantly one- or two-boxes. Further, the host informs you that the Oracle is correct in its predictions 50.5% of the time.

A 50.5% accuracy doesn’t sound impossible by today’s standards, nor does it sound particularly impressive. The Oracle may be significantly above chance in its predictions (assuming chance is 50%), but barely. Still, the naive version of evidential decision theory recommends one-boxing5:

\[ \begin{split} EU(1b) & = P(\text{opaque filled} \mid 1b)*\$1000k + P(\text{opaque empty} \mid 1b)*\$0 \\ & = 0.505*\$1000k \\ & = \$505,000 \\ \\ EU(2b) & = P(\text{opaque filled} \mid 2b)*\$1001k + P(\text{opaque empty} \mid 2b)*\$1k \\ & = 0.495*\$1001k + 0.505*\$1k \\ & = \$496,000. \end{split} \]
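As a quick sanity check on this arithmetic, here is a short Python sketch of the naive calculation, using the shortcut from footnote 5 (a uniform prior, with the Oracle’s accuracy treated as the posterior probability that the box is filled when your action matches its prediction):

```python
# Naive evidential calculation for the 50.5%-accurate Oracle.
# Assumes (per footnote 5) that accuracy can be read as
# P(opaque filled | you one-box) and 1 - accuracy as P(opaque filled | you two-box).

acc = 0.505

eu_1b = acc * 1_000_000 + (1 - acc) * 0          # ~ $505,000
eu_2b = (1 - acc) * 1_001_000 + acc * 1_000      # ~ $496,000

print(eu_1b, eu_2b)  # one-boxing comes out ahead by about $9,000
```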

Now it seems that evidential decision theory is leading us down the road to crazy town! Unlike the omniscient Oracle, this one is using the simplest algorithm imaginable — taking a majority vote of its training data for a single demographic feature. Even if you don’t know whether your sex mostly one- or two-boxes, it’s absurd to imagine that your choice in the game will give you useful information about that. And this intuition is correct; we need to think more carefully about the evidence.

When conditional independence matters

Let’s dig a bit deeper into the math. The way we’ve set up this particular Oracle’s prediction algorithm, it will tell the host to fill the opaque box if and only if the majority of the contestant’s sex has chosen to one-box in the past.6 Let \(M_{1b}\) indicate that the majority of the contestant’s sex has chosen to one-box and \(M_{2b}\) that the majority has chosen to two-box. Then \(P(\text{opaque filled} \mid A)\) is just \(P(M_{1b} \mid A)\) for either action \(A\), and we can rewrite the above equations as

\[ \begin{split} EU(1b) & = P(M_{1b} \mid 1b)*\$1000k \\ EU(2b) & = P(M_{1b} \mid 2b)*\$1000k + \$1k. \end{split} \]

From Bayes’ theorem, we can rewrite either of the above probabilities as

\[ \begin{split} P(M_{1b} \mid A) & = \frac{P(M_{1b})P(A \mid M_{1b})}{P(M_{1b})P(A \mid M_{1b}) + P(M_{2b})P(A \mid M_{2b})} \\ & = \frac{P(M_{1b})P(A \mid M_{1b})}{P(M_{1b})P(A \mid M_{1b}) + (1 - P(M_{1b}))P(A \mid M_{2b})}. \end{split} \]

… or at least this is how a naive evidential decision theorist would infer what the Oracle predicted. The equation tells us that this belief depends on the contestant’s prior belief that the majority of his sex one-boxes (\(P(M_{1b})\)) and on the likelihood of his action under each possibility: the probability that he would act this way if the majority of his sex one-boxes (\(P(A \mid M_{1b})\)) versus if the majority two-boxes (\(P(A \mid M_{2b})\)). The Oracle’s past predictive accuracy indicates that, in the absence of other information, a contestant is slightly more likely to one-box if the majority of his sex one-boxes than if the majority two-boxes, and slightly more likely to two-box in the reverse case.

What are the implications of this formula? First, as already discussed, the likelihood terms are irrelevant if the contestant already knows that the majority of men two-box, i.e., \(P(M_{1b}) = 0\). More generally, the importance of the likelihood terms depends on the contestant’s confidence in his prior belief. If he’s fairly sure that most men two-box (e.g., \(P(M_{1b}) = 0.05\)), then \(P(M_{1b} \mid A)\) will be closer to \(P(M_{1b})\) than if he’s unsure (e.g., \(P(M_{1b}) = 0.45\)).

A strong prior probability is thus one avenue by which the contestant’s action can come to be irrelevant, or mostly irrelevant, to his belief about the Oracle’s prediction. But we can clearly imagine versions of Newcomb’s problem in which the prior will be closer to 50%. For example, if I were playing such a game, I would be quite uncertain whether the majority of men have a tendency to one- or two-box. (Indeed, I assumed this uninformative prior in the calculations earlier that show one-boxing has a higher expected utility than two-boxing.) So our focus will be on the likelihood terms.

The naive contestant’s action influences his expected value calculations because of this inequality:

\[ P(A \mid M_{1b}) \neq P(A \mid M_{2b}). \]

This inequality tells us that the probability that the contestant one-boxes depends on whether the majority of men take this action. In the absence of any other information, we would expect a male contestant to be slightly more likely to two-box if we learned that a (slight) majority of men have two-boxed in the past, and vice versa for women.
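To see how weakly the naive update moves these beliefs, here is a small Python sketch of the Bayes formula above. The likelihood values (0.505 and 0.495) are my own illustrative choices, picked to match the Oracle’s 50.5% accuracy under a uniform prior.

```python
# Naive posterior that the majority of the contestant's sex one-boxes,
# given that he one-boxes, computed from the Bayes formula above.
# Likelihoods are illustrative: P(1b | M_1b) = 0.505, P(1b | M_2b) = 0.495.

def posterior_m1b(prior_m1b: float, p_act_if_m1b: float, p_act_if_m2b: float) -> float:
    numerator = prior_m1b * p_act_if_m1b
    return numerator / (numerator + (1 - prior_m1b) * p_act_if_m2b)

for prior in (0.05, 0.45, 0.50):
    post = posterior_m1b(prior, p_act_if_m1b=0.505, p_act_if_m2b=0.495)
    print(f"prior = {prior:.2f} -> posterior after one-boxing = {post:.3f}")

# prior = 0.05 -> posterior after one-boxing = 0.051  (a confident prior barely budges)
# prior = 0.45 -> posterior after one-boxing = 0.455
# prior = 0.50 -> posterior after one-boxing = 0.505  (the number used in the naive calculation above)
```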

But are these the likelihood terms that a less naive evidential decision theorist playing this game would actually use? This more thoughtful contestant should realize that she has access to a cornucopia of information about herself. She has a detailed knowledge of her personality, values, memories, and so on — and she has access to her private thoughts leading up to her final decision. Call this private knowledge \(K\). Then the correct formula this wise female contestant should be using to update her beliefs about the Oracle’s prediction is

\[ P(M_{1b} \mid A, K) = \frac{P(M_{1b} \mid K)P(A \mid M_{1b}, K)}{P(M_{1b} \mid K)P(A \mid M_{1b}, K) + (1 - P(M_{1b} \mid K))P(A \mid M_{2b}, K)}. \]

In other words, the contestant ought to be conditioning on all of her knowledge when estimating whether the majority of women one- or two-box and, in turn, whether the Oracle put a million dollars in the opaque box.

What should we make of the relationship between \(P(A \mid M_{1b}, K)\) and \(P(A \mid M_{2b}, K)\) now? At this point, there’s no easy way to specify what these probabilities should be, as they depend on a myriad of unique facts about the contestant and her thought process before she commits to one- or two-boxing. Nonetheless, here’s an intuition of how we might think about the contestant’s state of mind, narrated from the standpoint of a third-party observer.

Imagine we are observing our female contestant, Susan, playing this game. Although the Oracle is only making its prediction on the basis of Susan’s sex, we have access to a detailed dossier that contains every imaginable detail about Susan’s life that Susan herself could tell us. Further, as Susan is deliberating on the set of the game show, we can view a direct readout of her conscious thoughts as if we were Susan herself. At the moment immediately before Susan announces her decision, we take a guess about how likely it is that Susan is about to one-box. (Remember, we know everything about her that she knows about herself, and we’ve heard her entire inner monologue as she contemplates her decision.) Right after we make this first guess about what Susan will do, we learn one further piece of information: a slight majority of female contestants choose to one-box. Now we can give a new estimate of Susan’s likelihood of one-boxing. Does it change?

Although I narrated this inferential process from the standpoint of a third-party observer, this is the way that Susan herself should reason. Up until she commits to a decision, she can contemplate whether her final choice will give her any additional information about whether the majority of women one- or two-box. She does this by imagining what would happen if she learned that the majority of women one- or two-box. Would this make it more or less likely that someone just like her in her exact situation would one- or two-box? Surely not in any practically relevant sense; this coarse demographic information about women as a whole is useless in the face of all the extremely specific facts Susan knows about herself and her process of deliberation. That is,

\[ P(A \mid K) \approx P(A \mid M_{1b},K), \]

and likewise for \(P(A \mid M_{2b}, K)\). This implies that

\[ P(M_{1b} \mid K) \approx P(M_{1b} \mid A,K), \]

i.e., the Oracle’s prediction is conditionally independent of Susan’s final choice. Given this, Susan should two-box, as she is expected to make an extra $1000 regardless of what she estimates \(P(M_{1b} \mid K)\) to be. Indeed, she doesn’t even need to bother trying to estimate this quantity; it’s enough to know that her action won’t make her more or less likely to think that there’s a million dollars in the opaque box.
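Here is the same Bayes sketch from before, now with hypothetical likelihoods conditioned on Susan’s private knowledge \(K\). The specific numbers (a 0.30 chance of one-boxing, whether or not the majority of women one-box) are made up; the point is only that when the two likelihood terms are equal, the posterior collapses to the prior and two-boxing wins by exactly the visible $1,000.

```python
# When Susan conditions on her private knowledge K, the likelihoods
# P(A | M_1b, K) and P(A | M_2b, K) are (approximately) equal, so her action
# tells her nothing new about the Oracle's prediction.

def posterior_m1b(prior: float, p_act_if_m1b: float, p_act_if_m2b: float) -> float:
    numerator = prior * p_act_if_m1b
    return numerator / (numerator + (1 - prior) * p_act_if_m2b)

prior_given_k = 0.5                                     # Susan's estimate of P(M_1b | K)

post_if_1b = posterior_m1b(prior_given_k, 0.30, 0.30)   # P(M_1b | 1b, K)
post_if_2b = posterior_m1b(prior_given_k, 0.70, 0.70)   # P(M_1b | 2b, K)

eu_1b = post_if_1b * 1_000_000
eu_2b = post_if_2b * 1_000_000 + 1_000

print(post_if_1b, post_if_2b)  # 0.5 0.5 -- both equal the prior
print(eu_2b - eu_1b)           # 1000.0 -- two-boxing is better by exactly $1,000
```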

Screening off your decision

Thus far, to simplify the problem, I have assumed that the Oracle is just using information about Susan’s sex. But the same logic of conditional independence applies to an Oracle that uses a more comprehensive, but still superficial, set of features (e.g., income, age, education, etc.). Even if the AI knows that the majority of 33-year-old married women who majored in biology and graduated from Susan’s alma mater and live in Atlanta choose to one-box, this information pales in comparison to what Susan knows about herself before she commits to a decision. The various reasons that women like this may, on average, slightly favor one- or two-boxing are already reflected in Susan’s private knowledge. Note, also, that Susan doesn’t need to know exactly what features the Oracle is using to make its prediction; she just needs to know that the features are superficial enough to be superseded by her private knowledge.7

There is a more formal way, popularized by the computer scientist Judea Pearl, to represent what Susan knows about the relationship between her introspective knowledge, her decision, and the Oracle’s prediction. We can draw a directed graph that represents the causal dependencies:

\[ \text{Oracle's prediction} \leftarrow \text{Superficial facts about Susan's demographic} \rightarrow \text{Susan's introspective knowledge} \rightarrow \text{Susan's decision}. \]

In this graph-based framework, causality is fully understood through the logic of conditional dependencies. That is, we are not denying that causal relationships exist; we are showing how they can be inferred in a purely evidential framework. The graph implies that when Susan doesn’t condition on her introspective knowledge, her decision depends on the Oracle’s prediction, even though this relationship is not causal. She blocks this “backdoor” path, however, by conditioning on her introspective knowledge, \(K\). This screens off the dependence between the Oracle’s prediction and Susan’s future decision, rendering the decision irrelevant to the prediction.8
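To make the screening-off claim concrete, here is a toy simulation of the graph above (my own construction, with arbitrary noise levels): a superficial demographic feature drives both the Oracle’s prediction and Susan’s self-knowledge, and her self-knowledge drives her decision. The decision and the prediction are correlated overall, but the correlation vanishes once we condition on what she knows about herself.

```python
# Toy simulation of the causal graph above:
#   prediction <- demographic -> knowledge -> decision
# Marginally, decision and prediction are associated; conditional on
# knowledge, the backdoor path is blocked and the association disappears.

import numpy as np

rng = np.random.default_rng(0)
n = 200_000

demographic = rng.integers(0, 2, n)                    # superficial feature
prediction = demographic                               # Oracle predicts from the demographic alone
knowledge = (demographic + (rng.random(n) < 0.3)) % 2  # richer (but noisy) self-knowledge
decision = (knowledge + (rng.random(n) < 0.1)) % 2     # decision driven only by self-knowledge

# Marginal association: the prediction differs by decision (~0.66 vs ~0.34).
print(prediction[decision == 1].mean(), prediction[decision == 0].mean())

# Conditional on knowledge, the difference (approximately) vanishes.
for k in (0, 1):
    mask = knowledge == k
    print(k,
          prediction[mask & (decision == 1)].mean(),
          prediction[mask & (decision == 0)].mean())
```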

Moreover, by formalizing causal relationships in this way, we can now understand why Susan should not discount her decision as evidence in the sci-fi version of Newcomb’s problem, in which the Oracle is omniscient. As discussed, for the Oracle to be omniscient, it must have a detailed predictive model of the universe, which includes the full workings of Susan’s brain up to her final decision. This model thus knows more about Susan than she can introspect about herself. So in addition to the backdoor path shown above, which flows through Susan’s introspective knowledge, there is a different backdoor path that Susan cannot block with her introspective knowledge:

\[ \text{Oracle's prediction} \leftarrow \text{Perfect simulation of Susan's brain} \rightarrow \text{Non-introspectable workings of Susan's mind} \rightarrow \text{Susan's decision}. \]

In other words, Susan can only learn that she’s about to win a million dollars by announcing her decision to one-box. It would be foolish to pass up on this opportunity!

Avoid the headache: Stick with causal decision theory

Hopefully by now it’s clear that the standard objections to evidential decision theory evince a misunderstanding about the nature of evidence. In any realistic version of Newcomb’s problem, a properly specified evidential decision theory arrives at the same answer as causal decision theory, without leading to the absurd conclusion that we should two-box when the Oracle is omniscient. Furthermore, we can preserve the tools of probability theory without resorting to the new, ad hoc methods that causal decision theory requires.

Practically speaking, though, it is probably easiest for most people to think of their decisions in a causal manner. That is, as we contemplate different possible behaviors to engage in, we can hold our beliefs about everything prior to the decision fixed without worrying about how a future action might give us evidence about the past. Because of conditional independence, this heuristic is unlikely to lead us astray in any real-world context (at least before the robots take over), and it avoids the kind of incorrectly specified evidential models that could recommend irrational decisions.

Was all of this, then, just a pointless philosophical exercise? Well, yeah, mostly. But a detailed understanding of Newcomb’s problem has given me a new perspective on many of the more prosaic dilemmas I face in life, even if it hasn’t affected the ultimate decisions I would make.

Here’s one example. For most of my life, I have dealt with a minor stomach condition known as “GERD” (i.e., acid reflux disease), which doctors have treated with the drug esomeprazole. For the last couple of decades, this class of drug has been blamed for just about every health problem imaginable and has been associated with premature death — even after statistically controlling for obvious demographic and lifestyle differences between people who take the medication and those who don’t. Understandably, I have been concerned about staying on the medication, even though it is by far the most effective drug for managing my heartburn. Would it be worth the cost of switching to an inferior medication to reduce my risk of an early death?

Fortunately, in recent years, scientists have finally gotten around to testing for causality directly by conducting a randomized controlled trial, and this trial found essentially no evidence that the medication causes adverse events. A commonsense causal decision theorist would tell me that I should stay on the medication without worrying about it killing me.

But wait, wouldn’t an evidential decision theorist argue that I should still get off the drug? Even though it seems unlikely that the drug itself is causing harm, all of the observational findings suggest that taking the drug portends future doom, even for people who share my demographic and lifestyle characteristics. Clearly, there must be something different between people who take the drug and people who don’t, even within this restricted group of people who are similar to me, but I don’t know exactly what this difference is.9

However, the more careful evidential decision theorist in me notes that, whatever this difference is, it’s not something that the action of taking the drug itself will provide unique evidence about, once my private knowledge is accounted for. For example, perhaps the group that takes the drug is a bit more likely to engage in some hard-to-measure unhealthy behaviors, such as consuming excess sugar or sitting for long periods of time. If these behaviors explain the spurious correlation between use of the drug and adverse events, I won’t learn anything about the extent to which I engage in these behaviors by observing whether or not I take esomeprazole. I already know everything there is to know about my habits and lifestyle. This knowledge screens off the information that the action itself might provide.

I suppose it’s possible that the decision to take the drug uniquely reveals some latent psychological tendency to neglect my health that cannot be learned from any other past behavior or introspective knowledge. But I’m pretty sure that the world doesn’t work that way. I’ll take the risk.


  1. If anything, if we compare a legacy student to a non-legacy student with identical grades and test scores, we might expect the non-legacy student to perform better, given that they probably needed to work harder to achieve the same thing.↩︎

  2. Technically, most people’s utility function is not linear in dollars earned, but for the sake of this problem, we can pretend that dollars are utility or that the game is played on a utility scale.↩︎

  3. If the possibility of a perfectly accurate, omniscient Oracle messes with your conception of causality, feel free to imagine that the Oracle is just extremely unlikely to make a mistake (e.g., the odds are one in a trillion). Thus it is at least ‘possible’ that you could win over a million dollars by two-boxing.↩︎

  4. Not to mention that if it was common knowledge that the Oracle was giving a million dollars to all the women, women would start two-boxing, and the Oracle’s predictions would break down.↩︎

  5. I am taking a shortcut here in assuming that the prior probability \(P(\text{opaque filled})\) is 0.5, and therefore we can use Bayes’ theorem to equate the Oracle’s known accuracy with the posterior probability that the Oracle predicted one- or two-boxing.↩︎

  6. For simplicity, I’m ignoring sampling error here. Assume that the Oracle has sampled enough men and women to know whether the majority of each sex chooses to one- or two-box. None of our conclusions will depend on this simplification.↩︎

  7. If we wanted to dive further into the math, we could marginalize, or average, over Susan’s beliefs about the possible prediction algorithms that the Oracle could be using. For all plausible algorithms, Susan’s private knowledge should render her action conditionally independent of the prediction.↩︎

  8. Note, further, that the timing of the Oracle’s prediction is irrelevant here. What matters is the information reflected in that prediction.↩︎

  9. In this case, you might guess that it’s just difficult to properly control for general characteristics such as caring about your health, eating well, and so on. In other words, even if the researchers tried to control for important lifestyle factors, there is residual confounding.↩︎

Adam Bear
Research/Data Scientist