Tl;dr: the problem of how to make decisions using multiple (potentially incompatible) worldviews (which I'll call the problem of meta-rationality) comes up in a range of contexts, such as epistemic deference. Applying a policy-oriented approach to meta-rationality, and deferring to other worldviews' advice, dissolves several undesirable consequences of the standard approach of deferring to credences.
When thinking about the world, we’d ideally like to be able to integrate all our beliefs into a single coherent worldview, with clearly-demarcated uncertainties, and use that to make decisions. Unfortunately, in complex domains, this can be very difficult. Updating our beliefs about the world often looks less like filling in blank parts of our map, and more like finding a new worldview which reframes many of the things we previously believed. Uncertainty often looks less like a probability distribution over a given variable, and more like a clash between different worldviews which interpret the same observations in different ways.
By “worldviews” I include things like ideologies, scientific paradigms, moral theories, perspectives of individual people, and sets of heuristics. The key criterion is that each worldview has “opinions” about the world which can be determined without reference to any other worldview. Although of course different worldviews can have overlapping beliefs, in general their opinions can be incompatible with those of other worldviews - for example:
I think of “intelligence” as the core ability to develop and merge worldviews; and “rationality” as the ability to point intelligence in the most useful directions (i.e. taking into account where intelligence should be applied). Ideally we’d like to always be able to combine seemingly-incompatible worldviews into a single coherent perspective. But we usually face severe limitations on our ability to merge worldviews together (due to time constraints, cognitive limitations, or lack of information). I’ll call the skill of being able to deal with multiple incompatible worldviews, when your ability to combine them is extremely limited, meta-rationality. (Analogously, the ideal of emotional intelligence is to have integrated many different parts of yourself into a cohesive whole. But until you’ve done so, it’s important to have the skill of facilitating interactions between them. I won’t talk much about internal parts as an example of clashing worldviews throughout this post, but I think it’s a useful one to keep in mind.)
I don’t think there’s any sharp distinction between meta-rationality and rationality. But I do think meta-rationality is an interesting limiting case to investigate. The core idea I’ll defend in this post is that, when our ability to synthesize worldviews into a coherent whole is very limited, we should use each worldview to separately determine an overall policy for how to behave, and then combine those policies at a high level (for example by allocating a share of resources to each). I’ll call this the policy approach to meta-rationality; and I’ll argue that it prevents a number of problems (such as over-deference) which arise when using other approaches, particularly the epistemic approach of combining the credences of different worldviews directly.
Let’s consider one central example of meta-rationality: taking into account other people’s disagreements with us. In some simple cases, this is straightforward - if I vaguely remember a given statistic, but my friend has just looked it up and says I’m wrong, I should just defer to them on that point, and slot their correction into my existing worldview. But in other cases, people have worldviews that clash with our own on large-scale questions, and we don’t know how to (or don’t have time to) merge them together without producing a Frankenstein worldview with many internal inconsistencies.
How should we deal with this case, or other cases involving multiple inconsistent worldviews? The epistemic approach suggests:
This seems sensible, but leads to a few important problems:
The key problem which underlies these different issues is that the epistemic approach evaluates and merges the beliefs of different worldviews too early in the decision-making process, before the worldviews have used their beliefs to evaluate different possible strategies. By contrast, the policy approach involves:
One intuitive description of how this might occur is the parliamentary approach. Under this approach, each worldview is treated as a delegate in a parliament, with a number of votes proportional to how much weight is placed on that worldview; delegates can then spread their votes over possible policies, with the probability of a policy being chosen proportional to how many votes are cast for it.
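A minimal sketch of this voting rule, assuming each delegate reports what fraction of its votes it casts for each policy. The worldview names, policy names, and weights below are hypothetical placeholders, and the function names are made up for illustration; this is one simple reading of the parliamentary approach rather than a full specification of it.

```python
import random
from collections import defaultdict


def policy_lottery(weights, vote_shares):
    """Return each policy's probability of being chosen.

    weights: worldview -> number of votes held by that delegate
    vote_shares: worldview -> {policy: fraction of that delegate's votes}
    """
    totals = defaultdict(float)
    for worldview, weight in weights.items():
        for policy, share in vote_shares[worldview].items():
            totals[policy] += weight * share
    total_votes = sum(totals.values())
    # Probability of a policy is proportional to the votes cast for it.
    return {policy: votes / total_votes for policy, votes in totals.items()}


def choose_policy(lottery, rng=random):
    """Sample one policy according to the lottery."""
    policies = list(lottery)
    return rng.choices(policies, weights=[lottery[p] for p in policies], k=1)[0]


# Hypothetical example: worldview A holds 60 votes and splits them evenly,
# worldview B holds 40 votes and puts them all on one policy.
weights = {"A": 60, "B": 40}
vote_shares = {
    "A": {"fund_research": 0.5, "fund_outreach": 0.5},
    "B": {"fund_outreach": 1.0},
}
lottery = policy_lottery(weights, vote_shares)
chosen = choose_policy(lottery, random.Random(0))
```

Here `fund_outreach` receives 70 of the 100 votes and `fund_research` receives 30, so they are chosen with probability 0.7 and 0.3 respectively. Note that beliefs are never averaged: each worldview only decides where to put its votes.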
The policy approach largely solves the problems I identified previously:
I also think that the policy approach is much more compatible with good community dynamics than the epistemic approach. I’m worried about cycles where everyone defers to everyone else’s opinion, which is formed by deferring to everyone else’s opinion, and so on. Groupthink is already a common human tendency even in the absence of explicit epistemic-modesty-style arguments in favor of it. By contrast, the policy approach eschews calculating or talking about all-things-considered credences. This pushes people towards talking about (and further developing) their own worldviews, which has positive externalities for others, who can now draw on more distinct worldviews to make their own decisions.
Having said all this, there are several salient problems with the policy approach; I’ll cover four, but argue that none of them are strong objections.
Firstly, although we have straightforward ways to combine credences on different claims, in general it can be much harder to combine different policies. For example, if two worldviews disagree on whether to go left or right (and both think it’s a very important decision) then whatever action is actually taken will seem very bad to at least one of them. However, I think this is mainly a problem in toy examples, and becomes much less important in the real world. In the real world, there are almost always many different strategies available to us, rather than just two binary options. This means that there’s likely a compromise policy which doesn’t differ too much from any given worldview’s policy on the issues it cares about most. Admittedly, it’s difficult to specify a formal algorithm for finding that compromise policy, but the idea of fairly compromising between different recommendations is one that most humans find intuitive to reason about. A simple example: if two policies disagree on many spending decisions, we can give each a share of our overall budget and let it use that money how it likes. Then each policy will be able to buy the things it cares about most: getting control over half the money is usually much more than half as valuable as getting control over all the money.
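To make the budget-splitting intuition concrete, here is a toy calculation. The wishlist, its costs, and its values are invented for illustration, and the assumption that a worldview buys items strictly in priority order is a simplification.

```python
def value_bought(budget, wishlist):
    """Buy items in priority order until the next one is unaffordable.

    wishlist: list of (cost, value) pairs, most valued item first.
    """
    total_value = 0
    for cost, value in wishlist:
        if cost > budget:
            break
        budget -= cost
        total_value += value
    return total_value


# Hypothetical wishlist with diminishing returns: later items matter less.
wishlist = [(25, 50), (25, 30), (25, 15), (25, 5)]

full = value_bought(100, wishlist)  # buys all four items
half = value_bought(50, wishlist)   # buys only the top two
ratio = half / full
```

With these made-up numbers, the full budget buys 100 units of value while half the budget still buys 80, so half the money captures 80% of the value. The effect depends entirely on returns diminishing; a worldview whose top priority costs more than its share would fare much worse.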
Secondly, it may be significantly harder to produce a good estimate of the value of each worldview’s advice than of the accuracy of each worldview’s predictions, because we tend to have much less data about how well personalized advice works out. For example, if a worldview tells us what to do in a dozen different domains, but we only end up entering one domain, it’s hard to evaluate its advice in the others. By contrast, if a worldview makes predictions about a dozen different domains, it’s easier to evaluate all of them in hindsight. (This is analogous to how credit assignment is much harder in reinforcement learning than in supervised learning.)
However, even if in practice we end up mostly evaluating worldviews based on their epistemic track record, I claim that it’s still valuable to consider the epistemic track record as a proxy for the quality of their advice, rather than using it directly to evaluate how much we trust each worldview. For example: suppose that a worldview is systematically overconfident. Using a direct epistemic approach, this would be a big hit to its trustworthiness. However, the difference between being overconfident and being well-calibrated plausibly changes the worldview’s advice very little, e.g. because it doesn’t change that worldview’s relative ranking of options. Another example: predictions which many people disagree with can allow you to find valuable neglected opportunities, even if conventional wisdom is more often correct. So when we think of predictions as a proxy for advice quality, we should place much more weight on whether predictions were novel and directionally correct than whether they were precisely calibrated.
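The overconfidence example can be illustrated with a small sketch. It assumes, purely for illustration, that overconfidence acts as a monotone stretching of log-odds away from 50%; the option names and probabilities are hypothetical.

```python
import math


def sharpen(p, factor=2.0):
    """Model overconfidence by scaling log-odds (an illustrative assumption)."""
    log_odds = math.log(p / (1 - p))
    return 1 / (1 + math.exp(-factor * log_odds))


def ranking(estimates):
    """Order options from most to least promising."""
    return sorted(estimates, key=estimates.get, reverse=True)


# Hypothetical success probabilities from a well-calibrated worldview.
calibrated = {"option_a": 0.55, "option_b": 0.70, "option_c": 0.60}
# The same worldview, made systematically overconfident.
overconfident = {name: sharpen(p) for name, p in calibrated.items()}

same_advice = ranking(calibrated) == ranking(overconfident)
```

Because the distortion is monotone, every probability moves further from 50% but the ordering of options is unchanged, so a worldview that simply recommends its top-ranked option gives the same advice either way. Real overconfidence need not be this well-behaved, which is why the claim in the text is only that the advice "plausibly" changes very little.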
Thirdly, the policy approach as described thus far doesn’t allow worldviews to have more influence over some individuals than others - perhaps individuals who have skills that one worldview cares about far more than any other; or perhaps individuals in worlds where one worldview’s values can be fulfilled much more easily than others’. Intuitively speaking, we’d like worldviews to be able to get more influence in those cases, in exchange for having less influence in other cases. In the epistemic approach, this is addressed via variance normalization across many possible worlds - but as discussed above, this could be significantly affected by how you differentiate the possibilities (and also what your prior is over those worlds). I think the policy approach can deal with this in a more principled way: for any set of possible worlds (containing people who follow some set of worldviews) you can imagine the worldviews deciding on how much they care about different decisions by different people in different possible worlds before they know which world they’ll actually end up in. In this setup, worldviews will trade away influence over worlds they think are unlikely and people they think are unimportant, in exchange for influencing the people who will have a lot of influence over more likely worlds (a dynamic closely related to negotiable reinforcement learning).
This also allows us a natural interpretation of what we’re doing when we assign weights to worldviews: we’re trying to rederive the relative importance weights which worldviews would have put on the branch of reality we actually ended up in. However, the details of how one might construct this “updateless” original position are an open problem.
One last objection: hasn’t this become far too complicated? “Reducing” the problem of epistemic deference to the problem of updateless multi-agent negotiation seems very much like a wrong-way reduction - in particular because in order to negotiate optimally, delegates will need to understand each other very well, which is precisely the work that the whole meta-rationality framing was attempting to avoid. (And given that they understand each other, they might try adversarial strategies like threatening other worldviews, or choosing which decisions to prioritize based on what they expect other worldviews to do.)
However, even if finding the optimal multi-agent bargaining solution is very complicated, the question that this post focuses on is how to act given severe constraints on our ability to compare and merge worldviews. So it’s consistent to believe that, if worldviews are unable to understand each other, they’ll do better by merging their policies than merging their beliefs. One reason to favor this idea is that multi-agent negotiation makes sense to humans on an intuitive level - which hasn’t proved to be true for other framings of epistemic modesty. So I expect this “reduction” to be pragmatically useful, especially when we’re focusing on simple negotiations over a handful of decisions (and given some intuitive notion of worldviews acting “in good faith”).
I also find this framing useful for thinking about the overall problem of understanding intelligence. Idealized models of cognition like Solomonoff induction and AIXI treat hypotheses (aka worldviews) as intrinsically distinct. By contrast, thinking of these as models of the limiting case where we have no ability to combine worldviews naturally points us towards the question of what models of intelligence which involve worldviews being merged might look like. This motivates me to keep a hopeful eye on various work on formal models of ideal cognition using partial hypotheses which could be merged together, like finite factored sets (see also the paper) and infra-bayesianism. I also note a high-level similarity between the approach I've advocated here and Stuart Armstrong's anthropic decision theory, which dissolves a number of anthropic puzzles via converting them to decision problems. The core insight in both cases is that confusion about how to form beliefs can arise from losing track of how those beliefs should relate to our decisions - a principle which may well help address other important problems.