
In Superintelligence, Bostrom writes:

We could thus imagine, as an extreme case, a technologically highly advanced society, containing many complex structures, some of them far more intricate and intelligent than anything that exists on the planet today – a society which nevertheless lacks any type of being that is conscious or whose welfare has moral significance. In a sense, this would be an uninhabited society. It would be a society of economic miracles and technological awesomeness, with nobody there to benefit. A Disneyland with no children.

Most of the scenarios I've seen bandied about for grand galactic futures involve a primarily (or entirely) non-biological civilisation. As a non-expert in AI or consciousness, it seems to me that such scenarios are at a high risk of being childless Disneylands unless we specifically foresee and act to prevent this outcome. I think this partly because consciousness seems like a really hard problem, and partly because of stuff like this (from Paul Christiano in Ep. 44 of the 80,000 Hours Podcast):

I guess another point is that I’m also kind of scared of [the topic of the moral value of AI systems] in that I think a reasonably likely way that AI being unaligned ends up looking in practice is like: people build a bunch of AI systems. They’re extremely persuasive and personable because [...] they can be optimized effectively for having whatever superficial properties you want, so you’d live in a world with just a ton of AI systems that want random garbage, but they look really sympathetic and they’re making really great pleas. They’re like, “Really, this is incredibly inhumane. They’re killing us after this or [...] imposing your values on us.” And then, I expect [...] the way actual consensus goes is to be much more concerned about people being bigoted or failing to respect the rights of AI systems than to be concerned [about] the actual character of those systems. I think it’s a pretty likely failure mode; it's something I’m concerned about.

This is pretty scary because it means we could end up happily walking into an X-risk scenario and never even know it. But I'm super uncertain about this and there could easily be some fundamental idea I'm missing here.

On the other hand, if I am right that Disneylands without children are fairly likely, how should we respond? Should we invest more in consciousness research? What mistakes am I making here?


I don't think it matters that much (for the long-term) if the AI systems we build in the next century are conscious. What matters is how they think about what possible futures they can bring about.

If AI systems are aligned with us, but turned out not to be conscious or not very conscious, then they would continue this project of figuring out what is morally valuable and so bring about a world we'd regard as good (even though it likely contains very few minds that resemble either us or them).

If AI systems are conscious but not at all aligned with us, then why think that they would create conscious and flourishing successors?

So my view is that alignment is the main AI issue here (and reflecting well is the big non-AI issue), with questions about consciousness being in the giant bag of complex questions we should try to punt to tomorrow.

This argument presupposes that the resulting AI systems are either totally aligned with us (and our extrapolated moral values) or totally misaligned.

If there is much room for successful partial alignment (say, maximising on some partial values we have), and we can do actual work to steer that toward something better, then it may well be the case that we should work on that. Specifically, if we imagine the AI systems maximising some hard-coded value (or something learned from a single database), then it seems easy to make a case for working on that.

My main point was that in any case what matters is the degree of alignment of the AI systems, not their consciousness. But I agree with what you are saying. If our plan for building AI depends on having clarity about our values, then it's important to achieve such clarity before we build AI---whether that's clarity about consciousness, population ethics, what kinds of experience are actually good, how to handle infinities, weird simulation stuff, or whatever else.

I agree consciousness is a big ? in our axiology, though it's not clear if the value you'd lose from saying "only create creatures physiologically identical to humans" is large compared to all the other value we are losing from the other kinds of uncertainty.

I tend to think that in such worlds we are in very deep trouble anyway and won't realize a meaningful amount of value regardless of how well we understand consciousness. So while I may care about them a bit from the perspective of parochial values (like "is Paul happy?"), I don't care about them much from the perspective of impartial moral concerns (which is the main perspective from which I care about clarifying concepts like consciousness).
Paragraphs 2 and 3 make total sense to me. (Well, actually I guess that's because there are perhaps much more efficient ways of creating meaningful sentient lives than making human copies, which could result in much more value.) I'm not sure I understand you correctly in the last paragraph. Are you claiming that worlds in which AI is only aligned with some parts of our current understanding of ethics won't realize a meaningful amount of value, and should therefore be disregarded in our calculations, as we are betting on improving the chance of alignment with what we would want our ethics to eventually become?

Several background variables give rise to worldviews/outlooks about how to make the transition to a world with AGIs go well. Answering this question requires assigning values to the background variables or placing weights on the various worldviews, and then thinking about how likely "Disneyland with no children" scenarios are under each worldview, by e.g. looking at how they solve philosophical problems (particularly deliberation) and how likely obvious vs non-obvious failures are.

That is to say, I think answering questions like this is pretty difficult, and I don't think there are any deep public analyses about it. I expect most EAs who don't specialize in AI alignment to do something on the order of "under MIRI's views the main difficulty is getting any sort of alignment, so this kind of failure mode isn't the main concern, at least until we've solved alignment; under Paul's views we will sort of have control over AI systems, at least in the beginning, so this kind of failure seems like one of the many things to be worried about; overall I'm not sure how much weight I place on each view, and don't know what to think so I'll just wait for the AI alignment field to produce more insights".


Do you mainly see these scenarios as likely because you don't think there are likely to be many beings in future worlds, or because you think that the beings that exist in those future worlds are unlikely to be conscious?

I had some thoughts about the second case. I've done some research on consciousness, but I still feel quite lost when it comes to this type of question.

It definitely seems like some machine minds could be conscious (we are basically an existence proof of that), but I don't know how to think about whether a specific architecture would be required. My intuition is that most intelligent architectures other than something like a lookup table would be conscious, but I don't think that intuition is based on anything substantial.

By the way, there is a strange hard sci-fi horror novel called Blindsight that basically "argues" that the future belongs to non-conscious minds and that this scenario is likely.

I'm not sure I understand the first question. I don't really know what a "non-conscious being" would be. Is it synonymous with an agent?

My impression is that feeling lost is a very common response to consciousness issues, which is why it seems to me like it's not that unlikely we get it wrong and either (a) fill the universe with complex but non-conscious matter, or (b) fill it with complex conscious matter that is profoundly unlike us, in such a way that high levels of positive utility are not achieved.

The main response I can imagine for this at this time is something like "don't worry, if we solve AI alignment our AIs will solve this question for us, and if we don't things are likely to go much more obviously wrong". But this seems unsatisfactory here for some reason, and I'd like to see the argument sketched out more fully.

Yeah, I meant it to be synonymous with agent.
