This is the second in a series of posts exploring consequentialist cluelessness and its implications for effective altruism:
- The first post describes cluelessness & its relevance to EA, arguing that for many popular EA interventions we don’t have a clue about the intervention’s overall net impact.
- This post considers a potential reply to concerns about cluelessness – maybe when we are uncertain about a decision, we should just choose the option with the highest expected value.
- Following posts discuss how tractable cluelessness is, and what being clueless implies about doing good.
Consider reading the first post first.
A rationalist’s reply to concerns about cluelessness could be as follows:
- Cluelessness is just a special case of empirical uncertainty.[1]
- We have a framework for dealing with empirical uncertainty – expected value.
- So for decisions where we are uncertain, we can determine the best course of action by multiplying our best-guess probability of each outcome by our best-guess utility of that outcome, summing over each option’s outcomes, then choosing the option with the highest expected value (a toy sketch follows below).
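In code, the reply amounts to something like this (a toy sketch; the options, probabilities, and utilities are all made up and don’t correspond to any real intervention):

```python
# Toy expected-value comparison. The outcome probabilities and utilities
# below are invented purely for illustration.
options = {
    "intervention_A": [(0.90, 100), (0.10, -50)],   # (probability, utility) pairs
    "intervention_B": [(0.05, 10000), (0.95, 0)],
}

def expected_value(outcomes):
    return sum(p * u for p, u in outcomes)

for name, outcomes in options.items():
    print(name, expected_value(outcomes))

best = max(options, key=lambda name: expected_value(options[name]))
print("choose:", best)
```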
While this approach makes sense in the abstract, it doesn’t work well in real-world cases. The difficulty is that it’s unclear which “best-guess” probabilities & utilities we should assign, and unclear to what extent we should believe our best guesses.
Consider this passage from Greaves 2016 (“credence function” can be read roughly as “probability”):
The alternative line I will explore here begins from the suggestion that in the situations we are considering, instead of having some single and completely precise (real-valued) credence function, agents are rationally required to have imprecise credences: that is, to be in a credal state that is represented by a many-membered set of probability functions (call this set the agent’s ‘representor’). Intuitively, the idea here is that when the evidence fails conclusively to recommend any particular credence function above certain others, agents are rationally required to remain neutral between the credence functions in question: to include all such equally-recommended credence functions in their representor.
To translate a little, Greaves is saying that real-world agents don’t assign a single precise probability to each outcome; instead they entertain multiple plausible probabilities for each outcome (taken together, these probability assignments make up the agent’s “representor”). Because an agent holds multiple probabilities for each outcome, and has no principled way to arbitrate between them, it cannot use a straightforward expected value calculation to determine the best option (as the sketch below illustrates).
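A minimal sketch of the resulting problem (with made-up numbers; a real representor would typically contain many more credence functions, over a much richer outcome space):

```python
# One intervention, two coarse outcomes, and a representor containing three
# credence functions the evidence fails to decide between. All numbers are
# invented purely for illustration.
utilities = {"good": 100, "bad": -120}

representor = [
    {"good": 0.7, "bad": 0.3},
    {"good": 0.5, "bad": 0.5},
    {"good": 0.4, "bad": 0.6},
]

evs = [sum(p[o] * utilities[o] for o in utilities) for p in representor]
print(evs)  # [34.0, -10.0, -32.0]: positive under some credence functions, negative under others
```

Because the expected value comes out positive under some members of the representor and negative under others, “choose the option with the highest expected value” doesn’t return a verdict.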
Intuitively, this makes sense. Probabilities can only be formally assigned when the sample space is fully mapped out, and for most real-world decisions we can’t map the full sample space (in part because the world is very complicated, and in part because we can’t predict the long-run consequences of an action).[2] We can make subjective probability estimates, but if a probability estimate does not flow out of a clearly articulated model of the world, its believability is suspect.[3]
Furthermore, because multiple probability estimates can seem sensible, agents can hold multiple estimates simultaneously (i.e. their representor). For decisions where the full sample space isn’t mapped out (i.e. most real-world decisions), the method by which human decision-makers convert their multi-value representor into a single-value, “best-guess” estimate is opaque.
The next time you encounter someone making a subjective probability estimate, ask “how did you arrive at that number?” The answer will frequently be along the lines of “it seems about right” or “I would be surprised if it were higher.” Answers like this indicate that the estimator doesn’t have visibility into the process by which they’re arriving at their estimate.
So we have believability problems on two levels:
- Whenever we make a probability estimate that doesn’t flow from a clear world-model, the believability of that estimate is questionable.
- And if we attempt to reconcile multiple probability estimates into a single best-guess, the believability of that best-guess is questionable because our method of reconciling multiple estimates into a single value is opaque.[4]
By now it should be clear that simply following expected value is not a sufficient response to concerns about cluelessness. However, it’s possible that cluelessness can be addressed by other routes – perhaps by diligent investigation, we can grow clueful enough to make believable decisions about how to do good.
The next post will consider this further.
Thanks to Jesse Clifton and an anonymous collaborator for thoughtful feedback on drafts of this post. Views expressed above are my own. Cross-posted to my personal blog.
Footnotes
[1]: This is separate from normative uncertainty – uncertainty about what criterion of moral betterness to use when comparing options. Empirical uncertainty is uncertainty about the overall impact of an action, given a criterion of betterness. In general, cluelessness is a subset of empirical uncertainty.
[2]: Leonard Savage, who worked out much of the foundations of Bayesian statistics, considered Bayesian decision theory to only apply in "small world" settings. See p. 16 & p. 82 of the second edition of his Foundations of Statistics for further discussion of this point.
[3]: Thanks to Jesse Clifton for making this point.
[4]: This problem persists even if each input estimate flows from a clear world-model.
I was pretty surprised by this sentence. Maybe you could say more precisely what you mean?
I take the core concern of cluelessness to be that perhaps we have no information about what options are best. Expected value gives a theoretical out to that (with some unresolved issues around infinite expectations for actors with unbounded utility functions). Approximations to expected value that humans can implement are, as you point out, kind of messy and opaque, but that's a feature of human reasoning in general, and doesn't seem particularly tied to expected value. Is that what you're pointing at?
I can’t speak for the author, but I don’t think the problem is the difficulty of “approximating” expected value. Indeed, in the context of subjective expected utility theory there is no “true” expected value that we are trying to approximate. There is just whatever falls out of your subjective probabilities and utilities.
I think the worry comes more from wanting subjective probabilities to come from somewhere — for instance, models of the world that have a track-record of predictive success. If your subjective probabilities are not grounded in such a model, as is arguably often the case with EAs trying to optimize complex systems or the long-run future, then it is reasonable to ask why they should carry much epistemic / decision-theoretic weight.
(People who hold this view might not find the usual Dutch book or representation theorem arguments compelling.)
I'll second this. In double-cruxing EV calculations with others, it is clear that they are often quite parameter-sensitive and that awareness of such parameter sensitivity is rare/does not come for free. Just the opposite: trying to do sensitivity analysis on what are already fuzzy qualitative-to-quantitative heuristics is quite stressful/frustrating. Results from sufficiently complex EV calculations usually fall prey to ontology failures, i.e. key assumptions turning out to be wrong; in studies of analyst performance in the intelligence community, key assumptions turned out wrong about 25% of the time, and most scenarios have more than 4 key assumptions.
Good points, but this seems to point to a weakness in the way we do modeling, not a weakness in expected value.
But that just means that people are making estimates that are insufficiently robust to unknown information and are therefore vulnerable to the optimizer's curse. It doesn't imply that taking the expected value is not the right solution to the idea of cluelessness.
I'm not sure what you mean. There is nothing being estimated and no concept of robustness when it comes to the notion of subjective probability in question.
The expected value of your actions is being estimated. Those estimates are based on subjective probabilities and can be well or poorly supported by evidence.
For a Bayesian, there is no sense in which subjective probabilities are well or poorly supported by the evidence, unless you just mean that they result from calculating the Bayesian update correctly or incorrectly.
Likewise there is no true expected utility to estimate. It is a measure of an epistemic state, not a feature of the external world.
I am saying that I would like this epistemic state to be grounded in empirical reality via good models of the world. This goes beyond subjective expected utility theory. As does what you have said about robustness and being well or poorly supported by evidence.
Yes, whether you are Bayesian or not, it means that the estimate is robust to unknown information.
No, subjective expected utility theory is perfectly capable of encompassing whether your beliefs are grounded in good models. I don't see why you would think otherwise.
No, everything that has been written on the optimizer's curse is perfectly compatible with subjective expected utility theory.
I’m having difficulty understanding what it means for a subjective probability to be robust to unknown information. Could you clarify?
Could you give an example where two Bayesians have the same subjective probabilities, but SEUT tells us that one subjective probability is better than the other due to better robustness / resulting from a better model / etc.?
It means that your credence will change little (or a lot) depending on information which you don't have.
For instance, if I know nothing about Pepsi then I may have a 50% credence that their stock is going to beat the market next month. However, if I talk to a company insider who tells me why their company is better than the market thinks, I may update to 55% credence.
On the other hand, suppose I don't talk to that guy, but I did spend the last week talking to lots of people in the company and analyzing a lot of hidden information about them which is not available to the market. And I have found that there is no overall reason to expect them to beat the market or not - the good news balances out the bad. So I again have a 50% credence. However, if I talk to that one guy who tells me why the company is great, I won't update to a 55% credence; I'll update to 51%, or not at all.
Both people here are being perfect Bayesians. Before talking to the one guy, they both have 50% credence. But the latter person has more reason to be surprised if Pepsi diverges from the mean expectation.
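One way to make the two epistemic states concrete (a rough sketch; the Beta-Binomial setup below is not a serious model of stock returns, just the simplest way to get two agents with the same 50% credence backed by different amounts of information):

```python
# Both agents assign 50% credence to "Pepsi beats the market", but the second
# agent's 50% is backed by far more information. One new favourable datum
# (the insider's pitch) moves the first agent much more than the second.
def updated_credence(alpha, beta, favourable=0, unfavourable=0):
    # Posterior mean of a Beta(alpha, beta) prior after observing the data.
    return (alpha + favourable) / (alpha + beta + favourable + unfavourable)

weakly_informed = (1, 1)      # mean 0.5, based on almost nothing
heavily_informed = (50, 50)   # mean 0.5, based on a week of digging

print(updated_credence(*weakly_informed), updated_credence(*heavily_informed))    # 0.5   0.5
print(updated_credence(*weakly_informed, favourable=1),
      updated_credence(*heavily_informed, favourable=1))                          # ~0.67  ~0.505
```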
It sounds to me like this scenario is about a difference in the variances of the respective subjective probability distributions over future stock values. The variance of a distribution of credences does not measure how “well or poorly supported by evidence” that distribution is.
My worry about statements of the form “My credences over the total future utility given intervention A are characterized by distribution P” does not have to do with the variance of the distribution P. It has to do with the fact that I do not know whether I should trust the procedures that generated P to track reality.
Well in this case at least, it is apparent that the differences are caused by how well or poorly supported people's beliefs are. It doesn't say anything about variance in general.
Distribution P is your credence. So you are saying "I am worried that my credences don't have to do with my credence." That doesn't make sense. And sure we're uncertain of whether our beliefs are accurate, but I don't see what the problem with that is.
I’m having difficulty parsing the statement you’ve attributed to me, or mapping it to what I’ve said. In any case, I think many people share the intuition that “frequentist” properties of one’s credences matter. People care about calibration training and Brier scores, for instance. It’s not immediately clear to me why it’s nonsensical to say “P is my credence, but should I trust it?”
I agree with Jesse's reply.
I’m late to the party, but I’ve really enjoyed this series of posts. Thanks for writing.
I don't see how this implies that the expected value isn't the right answer. Also, what exactly do you mean by "believability"? It's a subjective probability estimate.
I don't hold multiple probabilities in this way. Sure some agents do, but presumably those agents aren't doing things correctly. Maybe the right answer here is "don't be confused about the nature of probability."
There are lots of claims we make on the basis of intuition. Do you believe that all such claims are poor, or is probability some kind of special case? It would help to be more clear about your point - what kind of visibility do we need and why is it important?
This statement is kind of nonsensical with a subjective Bayesian model of probability; the estimate is your belief. If you don't have that model, then sure a probability estimate could be described as likely to be wrong, but it's still not clear why that would prevent us from saying that a probability estimate is the best we can do.
The way of reconciling multiple estimates is to treat them as evidence and update via Bayes' Theorem, or to weight them by their probability of being correct and average them using standard expected value calculation. If you simply take issue with the fact that real-world agents don't do this formally, I don't see what the argument is. We already have a philosophical answer, so naturally the right thing to do is for real-world agents to approximate it as well as they can.
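Concretely, the second option is just a weighted average (a toy sketch with made-up numbers):

```python
# Three candidate probability estimates for the same event, weighted by the
# credence that each is the right one. All numbers are invented.
estimates = [0.20, 0.35, 0.60]
weights = [0.50, 0.30, 0.20]   # must sum to 1

best_guess = sum(w * e for w, e in zip(weights, estimates))
print(best_guess)  # ~0.325
```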
"Approximate it as well as they can" implies a standard beyond the subjective Bayesian framework by which subjective estimates are compared. Outside of the subjective Bayesian framework seems to be where the difficulty lies.
I agree with what Jesse stated above: "I am saying that I would like this epistemic state to be grounded in empirical reality via good models of the world. This goes beyond subjective expected utility theory. As does what you have said about robustness and being well or poorly supported by evidence."
A standard like "how accurately does this estimate predict the future state of the world?" is what we seem to use when comparing the quality (believability) of subjective estimates.
I think the difficulty is that it is very hard to assess the accuracy of subjective estimates about complicated real-world events, where many of the causal inputs of the event are unknown & the impacts of the event occur over a long time horizon.
How does it imply that? A Bayesian agent updates its beliefs to approximate the real world as well as it can. That's just regular Bayesian updating, whether you are subjective or not.
I don't see what this has to do with subjective estimates. If we talk about estimates in objective and/or frequentist terms, it's equally difficult to observe the long term unfolding of the scenario. Switching away from subjective estimates won't make you better at determining which estimates are correct or not.
I don't have a fully articulated view here, but I think the problem lies with how the agent assesses how its approximations are doing (i.e. the procedure an agent uses to assess when an update is modeling the world more accurately or less).
Agreed. I think the difficulty applies to both types of estimates (sorry for being imprecise above).
This business with multiple possible probabilities sounds like you are partway through reinventing Bayesian model uncertainty. Seems like "representor" corresponds to "posterior distribution over possible models". From a Bayesian perspective, you can solve this problem by using the full posterior for inference, and summing out the model.
"It is better to be approximately right than to be precisely wrong." - Warren Buffett
"Anything you need to quantify can be measured in some way that is superior to not measuring it at all." - Gilb's Law
I don't follow. What does it mean to use "the full posterior for inference," in this context?
A couple examples would help me.
I think this is the jargon: https://en.wikipedia.org/wiki/Posterior_predictive_distribution
Sorry, I'm not sure what the official jargon is for the thing I'm trying to refer to. If I tried to make this fully accessible, I'd basically be teaching a class in Bayesian statistics, and that's not something I'm qualified to do. (I don't even remember the jargon!) But the point is that there are theoretically well-developed methods for talking about these issues, and maybe you shouldn't reinvent the wheel. Also, I'm almost certain they work fine with expected value.
Hm, I feel sorta strange about this exchange.
Here's a toy model of the story in my head:
Does that seem strange to you, too? I'm not trying to be unfair here.
Basically it seems strange that you know that Bayesian statistics addresses this issue, but it's not easy to give examples of how.
Do you think it's an acceptable conversational move for me to give you pointers to a literature which I believe addresses issues you're working on even if I don't have a deep familiarity with that literature?
I think it's acceptable, but being "acceptable" feels like a pretty low bar.
Basically I don't think it's rude, or arguing in bad faith, or anything like that. But not being able to give a specific reference when we dig into one of your claims lowers my credence in that claim.
For what it's worth, this is Greaves' terminology, not mine.