Questions about and Objections to 
'Sharing the World with Digital Minds' (2020)


First see my summary, or the original paper. Here are some things that stuck out to me:

1. Apparent conflation between individual outweighing and collective outweighing. Bostrom and Shulman (B&S) define ‘super-beneficiaries’ as individuals who are better at converting resources into (their own) well-being than any human. But they then go on to say that one reason to think that digital minds will be super-beneficiaries is that they are likely to greatly outnumber human beings eventually. Part 2 of the paper is a long list of reasons to think that we might be able to construct digital super-beneficiaries. The first reason given is that, because digital minds can be created more quickly and easily than humans can reproduce, they will likely come to far outnumber humans once pioneered:
 'One of the most basic features of computer software is the ease and speed of exact reproduction, provided computer hardware is available. Hardware can be rapidly constructed so long as its economic output can pay for manufacturing costs (which have historically fallen, on price-performance bases, by enormous amounts; Nordhaus, 2007). This opens up the door for population dynamics that would take multiple centuries to play out among humans to be compressed into a fraction of a human lifetime. Even if initially only a few digital minds of a certain intellectual capacity can be affordably built, the number of such minds could soon grow exponentially or super-exponentially, until limited by other constraints. Such explosive reproductive potential could allow digital minds to vastly outnumber humans in a relatively short time—correspondingly increasing the collective strength of their claims.' (p.3)

However, evidence that digital minds will outnumber humans is not automatically evidence that individual digital minds will produce well-being more efficiently than humans, and so count as super-beneficiaries by their definition. More importantly, the fact that there will be large numbers of digital minds, whilst evidence that most future utility will be produced by digital minds, is not evidence that digital minds, as a collective, will convert resources into utility more efficiently than humans as a collective will. B&S seem to have conflated these two claims with the following: if digital minds are far more numerous than humans, the amount of well-being digital minds collectively generate will be higher than the amount that humans collectively generate. 
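
A toy calculation may help keep the three quantities apart: individual efficiency, collective efficiency, and collective total. This is only a sketch. The `Population` class and every number in it are made up for illustration and do not come from B&S.

```python
from dataclasses import dataclass


@dataclass
class Population:
    """A hypothetical population; all figures are illustrative, not from B&S."""
    size: int                       # number of individuals
    resources_each: float           # resources consumed per individual
    wellbeing_per_resource: float   # each individual's conversion efficiency

    def total_resources(self) -> float:
        return self.size * self.resources_each

    def total_wellbeing(self) -> float:
        return self.total_resources() * self.wellbeing_per_resource

    def collective_efficiency(self) -> float:
        # Well-being produced per unit of resources, taken over the whole group.
        return self.total_wellbeing() / self.total_resources()


# Assumed numbers: digital minds are far more numerous but individually
# only half as efficient as humans at turning resources into well-being.
humans = Population(size=10, resources_each=100.0, wellbeing_per_resource=1.0)
digital = Population(size=10_000, resources_each=1.0, wellbeing_per_resource=0.5)

for name, pop in (("humans", humans), ("digital", digital)):
    print(name, pop.total_wellbeing(), pop.collective_efficiency())
```

With these assumed figures the digital population generates more total well-being while being less efficient both individually and collectively, so sheer numbers support only the claim about totals.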

I’m open to this being essentially just a typesetting error (putting the subsection in the wrong section!).

2. They assume that equality of resource allocation trumps equality of welfare. B&S worry that, if digital minds a) are far more numerous than humans and b) need far fewer resources to survive, then egalitarian principles will force us to set basic income at a level lower than human subsistence. As far as I can tell the argument is:

  1. giving humans a higher basic income than digital minds would be unjust discrimination,
  2. but if there are many more digital minds than humans, and
  3. if digital minds can survive on far fewer resources than humans can,
  4. then we might not be able to afford to give both humans and digital minds a basic income high enough for humans to survive on.

 Setting aside questions about whether a high enough basic income for humans to survive on really would be unaffordable in a world with many digital minds, it’s not clear that any true non-discrimination principle requires the state to give everyone the same basic income. 

For instance, in many countries that currently have welfare states, disabled people are entitled to disability benefits to which non-disabled people do not have access. Few people think that this is an example of unfair discrimination against the non-disabled, which suggests that it is permissible (and perhaps required) for governments to give higher levels of welfare support to citizens with higher levels of need. 

So if humans need more resources to survive than do digital minds, then it is probably permissible for governments to give humans a higher basic income than they give to digital minds (at least by current common sense). 

3. Unclear argument for a minor diminishment of human utility. B&S tentatively propose:

C = “in a world with very large numbers of digital minds, humans receiving .01% of resources might be 90% as good for humans as humans receiving 100% of resources.”

How B&S arrive at this isn’t very clear. According to them, C follows from the fact that an economy where most workers are digital minds would be vastly more productive than one where most workers are humans. 

But they don’t explain why it follows from the fact that an economy produces a very large amount of resources per human, that humans capturing .01% of resources might be 90% as good for humans as humans capturing 100%. Presumably the idea is that once every human has reached some high absolute standard of living, giving them further resources doesn’t help them much, because resources have diminishing marginal utility. However, it’s hard to be sure this is what they mean: the argument is not spelled out. 
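
To see how much work the shape of the utility function does here, consider a minimal sketch under one specific assumption: that a human's well-being is logarithmic in the resources they hold above subsistence. B&S do not state this model; the function names and the `subsistence` parameter are hypothetical.

```python
import math


def fraction_of_max_utility(share: float, resources_per_human: float,
                            subsistence: float = 1.0) -> float:
    """Assumed model: u(x) = ln(x / subsistence). Returns u(share * R) / u(R)."""
    full = math.log(resources_per_human / subsistence)
    partial = math.log(share * resources_per_human / subsistence)
    return partial / full


def resources_needed(share: float, target: float,
                     subsistence: float = 1.0) -> float:
    """Resources per human R at which u(share * R) / u(R) equals target."""
    # ln(share) + ln(R/s) = target * ln(R/s)  =>  ln(R/s) = ln(share) / (target - 1)
    return subsistence * math.exp(math.log(share) / (target - 1.0))


# C: 0.01% of resources is 90% as good for humans as 100% of resources.
r = resources_needed(share=1e-4, target=0.9)
print(f"{r:.3g}")                           # about 1e+40 times subsistence
print(fraction_of_max_utility(1e-4, r))     # about 0.9
```

Under this particular assumption, C would hold only if total resources per human were around 10^40 times subsistence. A utility function that saturates faster than a logarithm would reach 90% much sooner, which is one more reason it matters that the argument is not spelled out.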

(Also, in ‘Astronomical Waste’, Bostrom expresses scepticism that large increases in resources above a certain level will only make a small difference to the well-being of a human in a world with digital minds. He reasons that in such a world we might have invented new and resource-intensive ways for humans to access very high well-being. However, Bostrom only says this may be true. And in ‘Digital Minds’, B&S are also tentative about the claim that humans could receive 90% of the benefits of 100% of resources by capturing 0.01% of the resources produced by a digital mind economy. So there isn’t a crisp inconsistency between ‘Digital Minds’ and ‘Astronomical Waste’.)


4. Unclear evidence for relative cost of digital and biological subsistence. B&S claim that it is “plausible” that it will be cheaper to maintain digital minds at a subsistence level than to keep humans alive, but they don’t actually give much argument for this, or cite any source in support of it. I think this is a common assumption in EA and rationalist speculation about the future, and that it might be a good idea to check what the supporting evidence for it actually is. (There’s an implicit reference to Moore’s law – “The cost of computer hardware to support digital minds will likely decline”.)


5. Diminishing marginal utility as a condition of some values? B&S claim that digital minds could become super-beneficiaries by being designed so that they don’t become habituated to pleasures, in the way that humans eventually become bored or sated with food, sex etc. 

One worry: it’s unclear that a digital mind could be built like this and remain able to function. Not getting sated might mean that it gets stuck undergoing the same pleasurable experience over and over again. On hedonistic theories of well-being, this might still make it a super-beneficiary, since a digital mind that spent all its time repeating the same high-intensity pleasure might well experience a large net amount of pleasure-minus-pain over its lifetime. But on the subset of objective list theories of value on which the best lives involve a balance of different goods, not getting sated might actually get in the way of even matching, let alone surpassing, humans in the efficiency with which you turn resources into well-being. (If you never get bored with one good, why move on to others and achieve balance?)

6. Difficulties with the claim that different minds can have different hedonic capacities. B&S claim that digital minds might be capable of experiencing pleasures more intense than any human could ever undergo. However, I am somewhat sceptical of the view that maximum possible pleasure intensity can vary between different conscious individuals. It is notoriously difficult to explain what makes one pleasure (or pain) more or less intense than another when the two occur in different individuals. (This is the problem of “interpersonal utility comparisons.”) I think that one of the best candidate solutions to this problem entails that all minds which can undergo conscious pleasures/pains have maximum pleasure/pain experiences with the same level of intensity. The argument for this is complex, so I’ve put it in a separate doc.


7. Maximising could undermine digital minds’ breadth. B&S’s discussion of objective list theory claims that digital minds could achieve goods like participation in strong friendships, intellectual achievement, and moral virtue to very high degrees, but they don’t discuss whether maximising for one of these goods would lead a digital mind towards a life containing very little of the others, or whether balance between these goods is part of a good life.

8. Unclear discussion of superhuman preferences. B&S list having stronger preferences as one way that digital minds could become super-beneficiaries. But their actual discussion of this possibility doesn’t really provide much argument that digital minds would or could have stronger-than-human preferences. It just says that it’s difficult to compare the strength of preferences across different minds, and then goes on to say that ‘emotional gloss’ and ‘complexity’ might be related to stronger preferences.


9. Conflation of two types of person-affecting view. B&S consider the idea that creating digital super-beneficiaries is morally neutral because creating happy people is neutral. They object that ‘strict person-affecting views’ must be wrong, because they imply that we have no reason to take action to prevent negative effects of climate change on people who do not yet exist. However, I don’t think this reasoning is very convincing.

A first objection: it’s not immediately clear that actions to prevent future harms from climate change are actually ruled out by views on which actions are only good if they improve things for some person. If someone is going to exist whether or not we take action against climate change, then taking action against climate change might improve things for them. However, this isn’t really a problem for B&S, since person-affecting views do seem plausibly refuted by the fact that actions which prevent climate harm, and also completely change the identity of everyone born in the future, are still good insofar as they prevent the harms.

A more serious objection: you might be able to deny that making happy people is good, even while rejecting person-affecting views on which an action can only be good if there is some person it makes better-off. ‘Making happy people is neutral’ is a distinct claim from ‘an action is only good if there is at least one person it makes better off’. So the burden of proof is on B&S when they claim that if the latter is false, the former must be too. They need to either give an argument here, or at least cite a paper in the population ethics literature. (B&S do say that appealing to person-affecting views is just one way of arguing that creating super-beneficiaries is morally neutral, so they might actually agree with what I say in this paragraph.) 


10. Possible overconfidence about the demandingness of deontic theories. B&S state outright that deontic moral theories imply that we don’t have personal duties to transfer our own personal resources to whoever would benefit most from them. Whilst I’m sure that most (maybe even all) deontologist philosophers think this, I’d be a little nervous about inferring from that the claim that the deontological moral theories endorsed by those philosophers imply that we have no such obligation (or even that they fail to imply that we do have such an obligation).

My reason for this is as follows: contractualist theories are generally seen as “deontological”, and I know of at least one paper in a top ethics journal arguing that contractualist theories in fact generate just as demanding duties to give to help others as do utilitarian theories. I haven’t read this paper, so I don’t have an opinion on how strong its argument is, or whether, even if its conclusion is correct, it generates the result that we are obliged to transfer all (or a large amount) of our resources to super-beneficiaries. (My guess is not: it probably makes a difference that super-beneficiaries are unlikely to be threatened with death or significant suffering without the transfer.) But I think at the very least more argument is needed here.


11. The ‘principle of substrate nondiscrimination’ is badly named, because it doesn’t actually rule out discrimination on the basis of substrate (i.e. the physical material a mind is made of). Rather, it rules out discrimination on the basis of substrate between minds that are conscious. This means it is actually compatible with saying that digital minds don’t have interests at all, if for instance you believed that nothing without biological neurons is conscious. (Some philosophers defend accounts of consciousness which seem to imply this: see the section on “biological theories of consciousness” on p.1112 of Ned Block’s ‘Comparing the Major Theories of Consciousness’.)

A principle compatible with denying, on the basis of their substrate, that digital minds have any rights probably shouldn’t be called the “principle of substrate non-discrimination”. This is especially true when these reasons for denying that digital minds have interests are actually endorsed by some experts.


This post is part of my work for Arb Research.



3 comments

On interpersonal utility comparisons, I agree with basically all of your points in your doc and I'm skeptical that all interpersonal comparisons are possible, but it seems pretty likely to me that some interpersonal comparisons are possible in theory, and reasonably likely that many comparisons with artificial sentience would be in practice.

The obvious case is two completely identical brains: as long as we grant intrapersonal comparisons, then we should get interpersonal comparisons between identical brains. Of course, this is not a very interesting case, and it's too rare to be useful on its own (except maybe for two artificial sentiences that are built identically), but we can possibly extend it by asking whether there's a sequence of changes from experience E1 in brain B1 to experience E2 in brain B2 that lets us rank E1 and E2. For example, if B1 and B2 only differ in the fact that some of B2's pain-causing neurons (assuming that makes sense) are less sensitive or removed (or B1's pain-mitigating neurons are less sensitive or removed), and B1 and B2 receive the same input signals that cause pain, then it seems likely to me that B1's painful experience E1 is more intense than B2's E2. Unfortunately, it's not clear to me that there should be any cardinally quantitative fact of the matter about how much more intense, which makes utilitarianism more arbitrary (since individual normalization seems wrong, but it's not clear what else to do), and it makes things harder if we have both intensity-increasing and intensity-decreasing changes, since there may be no way to determine whether together they increase or decrease the intensity overall.

We can get useful comparisons if there exists a sequence of changes that turn E1 in B1 into E2 in B2 such that:

  1. each change has the same direction of impact on the intensity, i.e. all intensity increasing (or preserving) or all intensity decreasing (or preserving), or
  2. each change has a cardinally quantifiable effect on intensity and they can be aggregated (including intrapersonal changes just to the experience, and not the brain structure or function), or
  3. a mix of 1 and 2, such that the impacts of the unquantifiable changes all have the same direction as the net impact of the quantifiable changes (assuming "net impact" makes sense).
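
The three conditions above can be sketched as a small check over a sequence of changes. This is only an illustration: the `Change` record and the `compare` function are hypothetical, magnitudes are assumed to be non-negative and simply additive, and the harder question of whether any real change can be classified this way is left untouched.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Change:
    """One step in a hypothetical sequence turning E1 in B1 into E2 in B2."""
    direction: int                     # +1 raises intensity, -1 lowers it, 0 preserves it
    magnitude: Optional[float] = None  # non-negative size if quantifiable, else None


def compare(changes: Sequence[Change]) -> Optional[int]:
    """Return +1 if E2 is more intense than E1, -1 if less, 0 if equal,
    or None if none of conditions 1-3 settles the comparison."""
    quantified = [c for c in changes if c.magnitude is not None]
    unquantified = [c for c in changes if c.magnitude is None]
    # Net effect of the quantifiable changes (assumes effects simply add up).
    net = sum(c.direction * c.magnitude for c in quantified)
    net_sign = (net > 0) - (net < 0)
    # Directions of the unquantifiable changes, ignoring preserving ones.
    signs = {c.direction for c in unquantified if c.direction != 0}
    if not signs:
        return net_sign        # condition 2: everything relevant is quantified
    if len(signs) == 1:
        s = next(iter(signs))
        if net_sign in (0, s):
            return s           # condition 1 or condition 3
    return None                # directions conflict, so no ranking


print(compare([Change(-1), Change(-1), Change(0)]))   # -1
print(compare([Change(+1, 2.0), Change(-1)]))         # None
```

The sketch returns `None` whenever unquantifiable changes point in opposite directions, so it does not try to capture the "cancelling out" cases mentioned in EDIT2.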

EDIT: You can replace "intensity" with the actual signed value, so we can turn goods into bads and vice versa. 

EDIT2: Also, I could imagine unquantifiable changes with opposite direction that should "cancel out" because they're basically structurally opposites, but may not necessarily be combined into a single value-preserving change, because non-value-preserving changes have to happen between them. There could be other cases of comparability I'm missing, too.

I think 1 is probably almost never going to hold across members of different animal species, and possibly never across two members of the same species. There are just so many differences between brains that it just seems unlikely that they could all be lined up in the same direction.

I could buy that 2 (or 3) holds across individuals within certain taxa, but I don't see a clear way to establish it. Between human-like brains, maybe we can imagine just asking the individual to quantify the change in each step before and after, but it's not clear to me their answers would be valid, as you suggest. Also, 2 and 3 become really doubtful across brains generating valence in very structurally different ways, and I think between extant invertebrates and extant vertebrates, because all the common ancestors of any extant vertebrate and any extant invertebrate were very probably not conscious*.

On the other hand, we could imagine making a digital copy of a human's brain, and then applying a sequence of changes designed to increase the intensity of its experiences (like 2 or 3 with large quantifiable impacts). Not all artificial sentience need be designed this way or comparable, but if enough of them are or could be, this could support Shulman and Bostrom's points.


Another awkward issue for individual normalization is the possibility of asymmetric welfare ranges and differences in how asymmetric they are, e.g. maybe my worst suffering is 20x as intense as my peak pleasure, but your worst suffering is 40x as intense as your peak pleasure. This would mean that we can't match the max, min and 0 across every brain. Still, if we didn't think interpersonal comparisons were really possible in the first place, this shouldn't bother us too much: we probably have to normalize somehow if we want to say anything about interpersonal tradeoffs, and we may as well normalize by dividing by the difference between the max and min. If individuals' maxes and mins don't line up, so be it. We could also consider multiple normalizations (normalize by the difference between the max and 0 on one view, and the difference between 0 and the min on another) and deal with them like moral uncertainty.
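
A quick numerical sketch of the asymmetry problem, using the 20x and 40x figures above. The `normalize` helper and the choice of units (each individual's peak pleasure set to 1 in their own scale) are assumptions made only for illustration.

```python
def normalize(welfare_range: dict, by: str) -> tuple:
    """Rescale an individual's (worst, best) welfare values, keeping 0 fixed.

    by="range": divide by best - worst
    by="max":   divide by best - 0
    by="min":   divide by 0 - worst
    """
    worst, best = welfare_range["worst"], welfare_range["best"]
    scale = {"range": best - worst, "max": best, "min": -worst}[by]
    return worst / scale, best / scale


# Hypothetical ranges in each individual's own units (peak pleasure = 1).
me = {"worst": -20.0, "best": 1.0}    # worst suffering 20x peak pleasure
you = {"worst": -40.0, "best": 1.0}   # worst suffering 40x peak pleasure

for by in ("range", "max", "min"):
    print(by, normalize(me, by), normalize(you, by))
```

Whichever of the three scales is used, at most one of the endpoints lines up across the two individuals while 0 stays fixed, which is the mismatch described above. Treating the three normalizations as competing views to be weighed is the moral-uncertainty option.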


* C. elegans and bivalves, in my view very unlikely to be conscious (barring panpsychism), are more closely related to cephalopods and arthropods than any vertebrate is, and the Ambulacraria, closer to vertebrates than cephalopods and arthropods are, also contains species that seem unlikely to be conscious. Even some chordates, like tunicates, the sister taxon to vertebrates, are pretty plausibly not conscious.

Also, separately, I can imagine functionalist definitions of intensity, like Welfare Footprint Project's, that allow at least ordinal interpersonal comparisons. At some intensity of pain (its affective component/suffering/negative valence), which they define as disabling, the pain doesn't leave the individual's attention (it's "continually distressing"), presumably even if they try to direct their attention elsewhere. And then excruciating pain leads to risky or seemingly irrational behaviour, plausibly due to extreme temporal discounting. It's not clear that there is anything beyond excruciating that we can use to compare across minds, though; maybe just higher and higher discount rates?

We could define pleasure intensities symmetrically, based on attention and induced temporal discounting.

On the other hand, maybe some beings only have all-or-nothing pain experiences, i.e. their pain always meets the definition of excruciating whenever they're in pain, and this could happen in very simple minds, because they don't weigh different interests smoothly, whether simultaneous interests, or current and future interests or different future interests. Maybe we wouldn't think such minds are sentient at all, though.

Thanks for this, Michael. I don't have a proper reply to it right now, because it raises so many complicated issues that I haven't thought through yet (though briefly, I don't actually think same brain guarantees same pain when embedded in different bodies/environments). But you're right that differences in trade-offs between best pleasure and worst pain probably sink the naive normalization strategy I was suggesting. I'd need to know more maths than I do to have a sense of whether it is fixable. Someone suggested to me that some of the ideas in this book (which I haven't read yet) would help: