Note: this was written months ago, before the recent FTX events - so the post can’t be read as a specific (over-)reaction to the FTX fiasco. I do think there is a weak connection: some blogs and comments by the FTX leadership provide some evidence that the sort of bounded consequentialism / local consequentialism criticised in this post was part of what went wrong at FTX, or at least part of the rationalisations. This makes me a bit more confident that if people are making updates from the FTX events, the arguments in this post are at least directionally sane.

This is a rough attempt to present deontology and virtue ethics as approximate, effective theories [1] of consequentialist morality, for agents severely bounded in their ability to determine the consequences of their actions, and delegating actions to future selves. I think this view could help effective altruists to avoid some problems caused by what we may call bounded consequentialism or local consequentialism. None of this is very original: you can get some of the same intuitions from the works of Derek Parfit and Toby Ord among others, but it seemed worth writing down a short, less formal synthesis of my current views.

The main claims I want to make are:

  • Because of boundedness, deontology and virtue ethics act as effective theories of consequentialism 
  • Where attempts at consequentialist calculations diverge from virtue ethical and deontic intuitions, it's often because the consequentialist calculation is mistaken
  • If you're strongly consequentialist, because of boundedness and delegation, you should take deontic and virtue ethical intuitions pretty seriously - as that will lead to better consequences

In the post, I:

  • Set out how deontology and virtue ethics can be seen as effective approximations of idealised, unbounded consequentialism, given we are just humans
  • Offer some metaphors from physics for understanding the relationship between different ethical theories
  • Give some real-life examples of what I see as bounded consequentialism in effective altruism, relating to fire alarms, demandingness, outreach to young people, and optimization of social interactions

Consequentialism

Consequentialism is the view that the right thing to do is always the action that will produce the best consequences. 

Act consequentialism - where an actor directly chooses actions based on an assessment of the consequences - works straightforwardly in situations where it is possible to compute the consequences of actions, and we are able to estimate the value of the consequences. A textbook example would be selecting between a small number of lotteries, which have different probabilities of leading to different amounts of money. Another example would be a case of triage, where a paramedic has to choose between several casualties with different survival probabilities.
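As a minimal sketch of the lottery case, assuming money is valued linearly and all probabilities are known exactly (both simplifications; the lottery names and numbers below are made up for illustration), the direct procedure is just an expected-value comparison:

```python
# Hypothetical lotteries: each is a list of (probability, payoff) pairs.
lotteries = {
    "A": [(0.5, 100.0), (0.5, 0.0)],
    "B": [(0.9, 40.0), (0.1, 0.0)],
    "C": [(0.01, 3000.0), (0.99, 0.0)],
}


def expected_value(lottery):
    """Probability-weighted sum of payoffs."""
    return sum(p * payoff for p, payoff in lottery)


def best_lottery(options):
    """Return the name of the option with the highest expected value."""
    return max(options, key=lambda name: expected_value(options[name]))


choice = best_lottery(lotteries)
```

With only three options and known probabilities the whole decision fits in a few lines. The procedure stops being this simple once the options cannot be enumerated or the probabilities are themselves uncertain.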

The direct act-based decision procedure becomes less tractable with large action spaces, incomplete information about the world, and limited ability to predict consequences. Note that this is usually the situation we find ourselves in in practice.

Deontology 

Classically, deontology is the view that some actions are intrinsically right or wrong, regardless of the consequences. It holds that there are certain things that we ought never to do, even if doing them would lead to good consequences. For example, it is always wrong to torture an innocent person, regardless of whether or not doing so would lead to some greater good. 

It has been well argued already that some versions of rule-based consequentialism - where instead of assessing the consequences of individual actions, actors follow rules which are selected for their average consequences - are equivalent to deontology, in the sense that both theories will recommend the same actions. 

In my view, the deontological approach to ethics as a whole can often be seen as an approximate effective consequentialist theory in the following way:

  • The consequences of individual actions are often hard to compute, and hard to experimentally verify. 
  • The consequences of a large number of applications of a rule are easier to compute, and easier to experimentally verify. 
  • So deontology can be seen as a method of reducing the space of possible actions by eliminating some actions which are likely to have bad consequences, either on the basis of past data, or considerations about the consequences of widespread application of the rules

Within a given ethical system, it is possible that some deontological rules will have worse consequences in some cases than a direct consequentialist approach. But, in general, deontological rules are much more likely to have good consequences than the actions they rule out.  

The normative consequentialist claim is something like this: because decision making based on rules is easier to compute and the effects of rules are easier to verify, a deontic decision procedure is often the optimal computation to run under some combination of bounds on computation, information, rationality and cluelessness. 
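A toy simulation can illustrate this claim. It is a sketch under strong assumptions that are mine, not part of the argument above: the true value of a tempting rule-violating action is drawn from a normal distribution with a negative mean (so a "never do this" rule is right on average), the agent only sees that value plus Gaussian noise, and the default action is worth zero. The function name and all parameters are hypothetical.

```python
import random


def simulate(n_decisions=10_000, noise_sd=5.0, seed=0):
    """Compare a noisy case-by-case evaluator with a fixed 'never do it' rule.

    Returns (average value per decision for the act evaluator,
             average value per decision for the rule follower).
    """
    rng = random.Random(seed)
    act_total = 0.0
    rule_total = 0.0  # the rule follower always takes the default action (value 0)
    for _ in range(n_decisions):
        # Assumed: the tempting action is usually bad, occasionally good.
        true_value = rng.gauss(-1.0, 1.0)
        # The bounded agent only sees a noisy estimate of that value.
        estimate = true_value + rng.gauss(0.0, noise_sd)
        # Act-based procedure: take the action whenever it looks positive.
        if estimate > 0:
            act_total += true_value
    return act_total / n_decisions, rule_total / n_decisions


noisy_act, rule = simulate(noise_sd=5.0)
perfect_act, _ = simulate(noise_sd=0.0)
```

In this toy setup the case-by-case evaluator beats the rule when its estimates are perfect (`noise_sd=0.0`) and does worse than the rule when the noise is large relative to the spread of true values. Which regime a real decision falls into is exactly the empirical question the post is pointing at.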

Briefly: something like non-fanatical deontology seems like an effective first order approximation of consequentialism in the regime of bounded rationality at human level and moderate cluelessness.

Very informal paraphrase: in practice, if you are violating some clearly sensible deontic constraint, such as 'don't lie', and justifying it with some sort of clever consequentialist argument, it's likely you are more confused and less smart than you think. 

Virtue ethics 

Virtue ethics is the view that our actions should be motivated by the virtues and habits of character that promote the good life. Virtues are not intrinsically right or wrong; they are attributes that we should strive to cultivate. In classical virtue ethics, the virtuous life is seen as the purpose of humans, and of intrinsic value. 

Taking honesty as an example virtue, we should strive to be honest, even if being dishonest would lead to some greater good. 

In my view, virtue ethics can also often be seen as an approximate effective theory of consequentialism in the following way:

Realistically, the action we take most of the time with our most important decisions is to delegate them to other agents. This is particularly true if you think about your future selves as different ("successor") agents. From this perspective, actions usually have two different types of consequences - in the external world, and in "creating a successor agent, namely future you".  

Virtue ethics tracks the consideration that the self-modification part is often the larger part of the consequences. 

In all sorts of delegations, often the sensible thing to do is to try to characterise what sort of agent we delegate to, or what sort of computation we want the agent to run, rather than trying to specify what actions the agent should take. [2] As you are delegating the rest of your life to your future selves, it makes sense to pay attention to who they are.

Once we are focusing on what sort of person it would be good for us to be, what sort of person would make good decisions for the benefit of others as well as themselves, what sort of computations to select, we are close to virtue ethics. (Or at least to what I would call virtue ethics)

Relations between the approximate theories

Often, people try to understand the relationships between the theories with the help of "credences", or of philosophical arguments. Here I’m going to offer a different approach: using physics as a metaphor for understanding the relationship between theories. The short version: theories often exist alongside one another. 

Pareto-optimality under computational constraints

In contrast to the oversimplified view sometimes presented in introductory history of science courses, it is often not the case that newer theories or paradigms wholly replace older theories. Older theories often stay pareto-optimal under particular computational constraints. 

For example, even in predicting the motion of molecules and atoms, quantum mechanics has not fully replaced Newtonian mechanics for practical applications. While QM provides deeper understanding, in practice so-called "force field" simulations are common, and are basically Newtonian mechanics. Why do we use them, when we have had QM for more than a century? Because these simulations are pareto-optimal on some combination of system size, time, precision, effects studied, and available computational budget. 
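To make "pareto-optimal" concrete, here is a small sketch. The method names, costs and errors are invented purely for illustration (they are not measurements of any real simulation method); the only point is the definition: a method stays on the frontier if no other method is at least as good on every axis and strictly better on one.

```python
from typing import NamedTuple


class Method(NamedTuple):
    name: str
    cost: float   # compute cost, arbitrary made-up units
    error: float  # prediction error, arbitrary made-up units


def pareto_front(methods):
    """Keep the methods that are not dominated by any other method."""
    def dominated(m):
        return any(
            o.cost <= m.cost and o.error <= m.error
            and (o.cost < m.cost or o.error < m.error)
            for o in methods
        )
    return [m for m in methods if not dominated(m)]


# Hypothetical numbers, chosen only to show the shape of the trade-off.
methods = [
    Method("classical force field", cost=1.0, error=10.0),
    Method("full quantum treatment", cost=1000.0, error=1.0),
    Method("badly tuned hybrid", cost=1200.0, error=5.0),
]
front = [m.name for m in pareto_front(methods)]
```

Here both the cheap, coarse method and the expensive, precise one survive, while the option that is worse on both axes drops out. Which surviving method to use then depends on the available budget.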

Perturbative expansions

Another common relation between theories in physics is perturbative expansion: the harder-to-compute theory is used to generate some modifications or approximations, which are then added on top of the easier-to-compute theory in borderline cases. For example, general relativity is used to compute small deviations from Newton's law, and this modified Newtonian theory is then used to do calculations.

Limiting cases

Sometimes, the relation between theories in physics is that one theory can be understood to be a special or limited case of another theory, for example a low-energy limit, high-energy limit, or classical limit. In these cases, the theory covering more of the territory can sometimes tell us when we can safely use the simpler, limiting case theory.


Using the physics metaphor to reconcile ethical theories

I suggest attempting to understand the relations between deontology and virtue ethics on the one hand, and consequentialism on the other, in a similar way.

In the same way that different theories in physics are pareto-optimal under different computational constraints, system sizes and complexities; deontology and virtue ethics can be pareto-optimal under different computational constraints, decision problem sizes and environmental complexities.

A lot of philosophising about ethics deals with extreme cases, improbable thought experiments, and situations that "stretch ethical theories to the limits of their validity." This is often seen as a way to refute a theory or point out its weaknesses. From the perspective of the physics metaphor, we can see these edge cases not as refuting moral theories, but as either suggesting modifications (as when harder-to-compute theories in physics are used to generate modifications or approximations which are added on top of the easier-to-compute theory), or illustrating the limits of a moral theory’s validity (as when a new theory in physics shows that an older theory is actually a limit case, and suggests where the old theory still applies).

Applying this metaphor to virtue ethics, deontology and consequentialism puts consequentialism in the role of the newer/harder-to-compute/more complete theory.

But note the other side of this view: in practice, even when you assume that consequentialism is never outside of its domain of theoretical validity, the normative theory often suggests that bounded and clueless agents should be doing something other than directly attempting to predict and compare consequences. To take another physics example: if you want to direct gunfire using a pocket calculator, your best bet is Newtonian mechanics. In practice, attempts to do the numerical calculations directly from the quantum level would fail horribly. Consequentialism understood correctly actually warns you against this sort of calculation.

This view also points to typical mistakes that bounded or local consequentialists are prone to:

  1. Mistakes from the shallow depth of evaluating plans. 
  2. Mistakes from not understanding the action space: when deliberative resources are spent on comparing just a few options from a huge action space.
  3. Delegation failures: mistakes when instead of trying to specify virtues of a delegate, the naive consequentialist attempts to specify what actions should be taken.

A conflict between local consequentialist intuitions and virtue ethics and/or deontic rules often indicates that the local consequentialist calculation is actually mistaken, and that a different decision procedure should be used.

Examples in practice

How does this relate to a range of practical problems in current effective altruism?

Fire alarms prompting people to suggest rule-violating and unvirtuous acts

Multiple recent advances of machine learning, coupled with sceptical takes on AI alignment progress by Eliezer Yudkowsky, prompted some people to suggest various extreme actions which would be both unvirtuous, and rule-violating under most deontic considerations. While I won't link to specific public proposals, you can, for example, imagine actions in the space of "sabotaging the work of others".

In multiple cases here, the conflict between the alleged consequentialist value of the suggested acts, and virtuous or rule-abiding behaviour, is resolved just by reflecting for a longer time. When a less bounded consequentialist starts to consider the impacts of such non-virtuous actions on coordination, epistemics, responses from other actors in the space… the actions seem clearly bad bets. In these cases, the fact that a local bounded calculation comes apart from virtuous or deontic intuitions should be taken as an indication that the calculations may be incorrect; and just acting upon virtuous or deontic intuitions in the first place would be a better approximation of consequentialist action.

(It's probably worth noting that Eliezer Yudkowsky himself understands this and warned readers of his post against non-virtuous, confused consequentialist actions in strong words.)

Demandingness objection

In a recent criticism of effective altruism, Michael Nielsen eloquently raises a concern about the principle of "doing the most good" leading to unhealthy outcomes when taken very seriously, such as people considering whether to give up having children, converting ice cream purchases to lives saved, or working in an inadequate environment.

Again, to me, these unhealthy outcomes seem like problems of local consequentialism. There are at least two things going on here:

  • Often the action space is huge and there's lots of uncertainty about which actions are best. Even if you are willing to sacrifice your happiness for the greater good, in practice it's likely that there will be many actions in the intersection between 'could have really good consequences' and 'makes me happy'. Choosing actions from the 'makes me unhappy' part of the space often seems confused given large amounts of uncertainty, and likely negative second-order effects on your future selves. 
  • We don't control our minds. Ignoring the things you actually want, care about, feel motivated by (like children or friends or art) seems empirically to make people miserable, which in turn seems to lead to bad consequences overall.

Noticing the tension between these demanding lifestyles and virtues like moderation or wholesomeness can be read as a warning that there may be mistakes in the naive consequentialist calculation.

Naive over-optimization of social interactions

A local consequentialist group organiser may be prone to thinking about people as means to their own impact, and optimising by, for example, allocating their time based on the estimated increase in the likelihood that the person will become a 'highly engaged EA' per amount of effort spent on persuasion.

There are various problems with this approach from a deontological perspective, but a clear one is that the approach is not stable upon becoming public: some of the people you would want to talk to the most would not agree to such a conversation if they understood that this was the motivation.

From a virtue perspective, the group organiser should be worried if they are on a path to turning themselves from a thinker into an idea salesperson.

Outreach to young people

Some EA-motivated efforts targeting young people seem to me to be based on local consequentialist calculations. Based on an intuition like "we will need more ML engineers in 10 years", the aim seems to be to motivate young people to move to such careers early.

This seems problematic from the perspective of virtue ethics, and also conflicts with deontic constraints such as "not considering humans as just means to your impact". 

What might this tension point to? Plausibly, again, a problem with cluelessness. I don’t know what will happen in 10 years’ time. If someone asked me to propose concrete effective altruistic actions for them to take in 2032, I would be pretty stuck. Naive consequentialist calculations may make approaches like this seem valuable - but the world is a sufficiently complex system that it seems hubristic to actually act upon them.

Approaching people in a more virtue ethical way seems much more robust to me:

  • I would be much more confident in advising someone about which virtues it would be good for them to cultivate - for example scale-sensitivity, ability to think clearly, getting better at noticing inadequacies… 
  • Virtue-based approaches have a much better track record: decisions made by scale-sensitive people attempting to think clearly now seem much better than their speculations about specific actions from 10 years ago.

Personal addendum

This part was written post FTX events.

In my personal experience, what's often hard about this is that confused bounded consequentialist actions don't feel like that from the inside. I was most prone to them when my internal feeling was closest to "consequentialism is clearly the only sensible theory here".

For this reason, I also dislike the philosophical vocabulary using the term "naive consequentialism" for all related problems: the word "naive" subtly suggests that if you are smart, and have spent a few hours reasoning about something, surely you are not naive? The reality is much worse - in some sense, all consequentialism running on human brains is quite bounded, and only rarely do humans see the contours of what true, non-naive consequentialism would imply.

So is dropping direct consequentialism and embracing deontology and virtue ethics the answer? Unfortunately, while in places this post advocates giving them a lot of weight, these theories are also not the answer. We are in a state where everything has serious limits or is too hard to compute.

Also: we don't have some nice, simple and easy to explain meta-theory even for the effective theories of physics, and the current state of theorizing about moral uncertainty is much less developed than that. In physics, we at least have many links explaining some effective theories as limit cases of stronger theories - however, in practice, physicists also rely on a lot of implicit reasoning in decisions about what you can ignore, and what models to use, and this is also likely the current best option for ethics.

Thanks Rose and  Gavin for help with editing various versions of this text.
 

  1. ^

    I'm using effective theory as in physics: effective theory is a tool used to handle situations where full knowledge of a phenomenon is not available, but where partial knowledge is sufficient to make useful predictions. An effective theory may be thought of as an approximate model of the underlying full theory. It is usually simpler than the full theory, and contains only the degrees of freedom and interactions that are relevant to the problem at hand.

  2. ^

    Let's take a simple example of a car repair shop. If I arrive there with the problem "the car has started making strange noises", I probably have the best chance of a good result if I can specify the virtues of the mechanic.

    Since I don't understand the car, I can't describe what actions the mechanic should take.

    I can't even describe well the "goal" of the repair: for example, the specification "remove strange noises" could be met by a reckless mechanic by removing the part of the brake that is causing the noise.


11 comments

Thanks for the post! I liked the physics perspective, and am a bit more convinced of the value of virtue ethics than I was before.

Classically, deontology is the view that some actions are intrinsically right or wrong, regardless of the consequences. It holds that there are certain things that we ought never to do, even if doing them would lead to good consequences. For example, it is always wrong to torture an innocent person, regardless of whether or not doing so would lead to some greater good.

 

  1. I think that what you describe is better termed "absolute deontology". Arguably, absolute deontology is (under most precisifications) a pretty radical view that most people would reject (at least if they really thought about it). More common is what Zamir and Medina call "moderate" (or "threshold") deontology. Moderate deontology allows one to do things like lie, cheat, steal, or cause direct bodily harm to innocents when the benefits of doing so massively outweigh the costs. It prohibits one from doing so when the benefits only moderately outweigh the costs.
    1. Does stealing from the rich and giving to the poor count as a case where the benefit massively outweighs the costs?
    2. It should be noted that, under these views, the 'benefits and costs' will not be just total welfare, but may instead involve things like deservingness, autonomy, rights, equality, justice.
    3. Of course, there are different degrees of moderate deontology.
    4. How best to think about moderate deontology under uncertainty may depend on the case, but notions of ex-ante and ex-post harm can be used.
  2. Although I agree with your general point that thinking in somewhat non-consequentialist ways can often be a more effective way to pursue good consequences (if that's what you want to do), this is not always true. If you are a real utilitarian, there is a non-trivial chance that at some point in your life, you're gonna have to do something which is seriously at odds with conventional morality.
  3. These issues about how to discipline one's mind to effectively pursue good consequences (directly or indirectly) are very important in practice for consequentialists, but I think are probably not fully resolvable in theory. Ultimately, pursuing good consequences is an art.
  1. Yes / I mostly tried to describe the "pure" version of the theory, not moderated by applications of other types of reasoning.

  2. I don't think the way I'm using 'deontology' and 'virtue ethics' reduces to 'conventional morality' either.

For example, I currently have something like a deontic commitment to not do/say things which would predictably damage epistemics - either of me or of others. (Even if the epistemic damage would have an upside like someone taking some good actions)

I think this is ultimately more of an 'attempt at approximating true consequentialism' rather than 'conventional morality', which does not care about it.

  1. The theory & current state of cognitive science & applied rationality certainly don't fully resolve the problem, but I'm optimistic they provide at least some decent approximate guesses

Just skimmed this, but note that there is an existing literature in philosophy on whether non-consequentialist theories can or should be 'consequentialised', that is, turned into consequentialist theories, e.g. Portmore (2009), Brown (2011), Sinnott-Armstrong (2019),  Shroeder (2019),  Muñoz (2021). I found these in 5 minutes, so there's probably loads more. 

A very general problem with the move, as Sinnott-Armstrong (2019) points out, is that if all theories can be re-presented as consequentialist, then it means little to label a theory as consequentialist. Even if successful, we would then have many different 'consequentialisms' that suggest, in practice, different things: should you kill the one and harvest their organs to save the five? 

Of course, all minimally plausible versions of deontology and virtue ethics must be concerned in part with promoting the good. As John Rawls, not a consequentialist himself, famously put in A Theory of Justice: “All ethical doctrines worth our attention take consequences into account in judging rightness.  One which did not would simply be irrational, crazy.” That does not, however, make everyone a consequentialist. 

I would suggest actually reading, and trying to understand, the post?

The papers you link mostly use the notion of 'consequentializing' in the sense that you can re-cast many other theories as consequentialist. But often this is almost trivial, if you allow yourself the degree of freedom of 'changing what's considered good' on the consequentialist side (as some of the papers do). For some weird reason, you have a deontic theory prohibiting people from drinking blue liquids? Fine, you can represent that in consequentialist terms, by ranking all the worlds where people drink blue liquids as worse than any world where this does not happen. This has the problem you mention - everything becomes 'some form of consequentialism'.

This is not what this post is about, and what I'm arguing for goes mostly in a different direction. You basically take consequentialism as "true" and some ordering of world states by goodness as given. Next, you notice that act-based consequentialism is often computationally intractable for humans using their brains. (This seems a different angle from most philosophy papers, which by default ignore the wisdom you get from computational complexity or information theory & physics.)

if all theories can be re-presented as consequentialist, then it means little to label a theory as consequentialist. Even if successful, we would then have many different 'consequentialisms' that suggest, in practice, different things

I think this is very much answered in the post, by the analogy to limiting cases:

Sometimes, the relation between theories in physics is that one theory can be understood to be a special or limited case of another theory, for example a low-energy limit, high-energy limit, or classical limit. In these cases, the theory covering more of the territory can sometimes tell us when we can safely use the simpler, limiting case theory.

The claim is not "all these theories are equivalent despite arriving at different conclusions" - but rather "these are what consequentialism boils down to under the assumptions that are in fact relevant to us most of the time". Different limiting cases can have contradictory results, but that just means you need to know which one is a good approximation for your case and which one isn't.

Note: this was written months ago, before the recent FTX events - so the post can’t be read as a specific (over-)reaction to the FTX fiasco.

Are you confident that you would end up publishing this post on this forum if the FTX events did not occur?

Pretty confident. I typically have a stack of drafts of the type this was, and they  end up public at some point. 

Btw, I think parts of "Ways money can make things worse" describe what I think the EA community actually did wrong, even ex post: succumbing to the epistemic distortion field too much.

something like non-fanatical deontology seems like an effective first order approximation of consequentialism in the regime of bounded rationality at human level and moderate cluelessness

 

Here is one really concrete sense in which deontology is what one gets by Taylor expanding consequentialism up to first order (this is copied from a rambling monograph):

Actually, there is a way to justify a kind of deontological principle as a heuristic for utility maximization, at least for a completely altruistic agent. For concreteness, consider the question of whether to recycle (or whether to be vegan for environmental reasons (the same approach also works for animal welfare, although in this case the negative effect is less diffuse and easier to grasp directly), or whether to engage in some high-emission activity, or possibly whether to lie, etc.). It seems like the positive effect from recycling to each other agent is tiny, so maybe it can be safely ignored, so recycling has negative utility? I think this argument is roughly as bad as saying that the number of people affected is huge, so the positive effect must be infinite. A tiny number times a large number is sometimes a reasonably-sized number.

A better first-order way to think of this is the following. Try to imagine a world in which everyone recycles, and one in which no one does. Recycle iff you'd prefer the former to the latter. This is a lot like the categorical imperative. What justifies this equivalence? Consider the process of going from a world where no one recycles to a world where everyone does, switching people from non-recycler to recycler one by one. We will make a linearity assumption, saying that each step along the way changes total welfare by the same amount. It follows that one person becoming a recycler changes total welfare by a positive amount iff a world in which everyone recycles has higher total welfare than a world in which no one does. So if one is completely altruistic (i.e. maximizes total welfare), then one should become a recycler iff one prefers a world where everyone is a recycler.
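Spelled out, the linearity step looks roughly like this. The notation is introduced here only for illustration and does not appear in the quoted text: N is the number of people, W(k) is total welfare when exactly k of them recycle, and δ is the assumed constant change in welfare per additional recycler.

```latex
% Notation (assumed here for illustration, not from the quoted text):
% N people in total, W(k) = total welfare when exactly k of them recycle.
\begin{align*}
W(k+1) - W(k) &= \delta \quad \text{for all } 0 \le k < N
  \quad \text{(linearity assumption)} \\
W(N) - W(0) &= \sum_{k=0}^{N-1} \bigl( W(k+1) - W(k) \bigr) = N\delta \\
\delta > 0 &\iff W(N) > W(0)
\end{align*}
```

So, under the linearity assumption, the sign of one person's marginal contribution is the same as the sign of the difference between the "everyone recycles" and "no one recycles" worlds.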

I think the main benefit of this is that it makes the tradeoff easier to imagine, at least in some cases. Here are three final remarks on this:

1) If our agent is not completely altruistic, then one can still understand diffuse effects in this way, except one needs to add a multiplier on one side of the equation. E.g. if one assigns a weight of w to everyone else, then one should compare the current world to a world in which everyone recycles, but with the diffuse benefits from recycling being only w of what they actually are.

2) We might deviate from linearity, but we can often understand this deviation. E.g. early vegans probably have a superlinear impact because of promoting veganism.

3) See this for discussion of an alternative similar principle.

I'd also add that virtues and deontologically right actions are results of a memetic evolution and as such can be thought of as precomputed actions or habits that have proven to be beneficial over time and have thus high expected value.

Were slavery and oppression propagated by the same "memetic evolution" mechanics though?