Derek Shiller
Lead Web Developer at The Humane League

On Deference and Yudkowsky's AI Risk Estimates

then it would be a violation of the law of the conservation of expected evidence for you to update your beliefs on observing the passage of a minute without the bomb's exploding.

Interesting! I would think this sort of case just shows that the law of conservation of expected evidence is wrong, at least for this sort of application. I figure it might depend on how you think about evidence. If you think of the infinite void of non-existence as possibly constituting your evidence (albeit evidence you're not in a position to appreciate, being dead and all), then that principle wouldn't push you toward this sort of anthropic reasoning.

I am curious, what do you make of the following case?

Suppose you're touring Acme Bomb & Replica Bomb Co with your friend Eli. ABRBC makes bombs and perfect replicas of bombs, but they're sticklers for safety so they alternate days for real bombs and replicas. You're not sure which sort of day it is. You get to the point of the tour where they show off the finished product. As they pass around the latest model from the assembly line, Eli drops it, knocking the safety back and letting the bomb (replica?) land squarely on its ignition button. If it were a real bomb, it would kill everyone unless it were one of the 1-in-a-million bombs that's a dud. You hold your breath for a second but nothing happens. Whew. How much do you want to bet that it's a replica day?
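For concreteness, here's the Bayesian arithmetic behind the case as a small Python sketch. It assumes even prior odds between real-bomb days and replica days, which the story leaves open, and uses the stated 1-in-a-million dud rate:

```python
# Bayesian update for the replica-day case: 50/50 prior on which kind
# of day it is, and real bombs are duds 1 time in a million.
prior_replica = 0.5
prior_real = 0.5
p_dud = 1e-6  # chance a real bomb fails to explode

# Likelihood of observing "no explosion" under each hypothesis.
p_silent_given_replica = 1.0
p_silent_given_real = p_dud

posterior_replica = (prior_replica * p_silent_given_replica) / (
    prior_replica * p_silent_given_replica + prior_real * p_silent_given_real
)
print(posterior_replica)  # ≈ 0.999999
```

On these numbers, surviving the drop should leave you all but certain it's a replica day, which is the intuition the case is meant to pump.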

On Deference and Yudkowsky's AI Risk Estimates

Suppose you've been captured by some terrorists and you're tied up with your friend Eli. There is a device on the other side of the room that you can't quite make out. Your friend Eli says that he can tell (he's 99% sure) it is a bomb and that it is rigged to go off randomly. Every minute, he's confident there's a 50-50 chance it will explode, killing both of you. You wait a minute and it doesn't explode. You wait 10. You wait 12 hours. Nothing. He starts eyeing the light fixture, and says he's pretty sure there's a bomb there too. Do you believe him?
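The same arithmetic makes the point vivid. A rough sketch, assuming exactly 12 silent hours and taking Eli's 99% and the 50-50-per-minute rate at face value:

```python
# How much should 12 silent hours erode trust in Eli's bomb call?
# Prior: 99% bomb; a bomb survives each minute with probability 0.5.
prior_bomb = 0.99
p_silent_given_bomb = 0.5 ** (12 * 60)  # 720 silent minutes
p_silent_given_no_bomb = 1.0

posterior_bomb = (prior_bomb * p_silent_given_bomb) / (
    prior_bomb * p_silent_given_bomb
    + (1 - prior_bomb) * p_silent_given_no_bomb
)
print(posterior_bomb)  # effectively zero (on the order of 1e-215)
```

Conditionalizing on the silence drives the probability that the device is a bomb to essentially zero, which is why his second bomb claim should inherit very little credibility.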

The importance of getting digital consciousness right

One can argue that AI reflects society (e.g. in order to make good decisions or sell products), so it would, at most, double the sentience in the world. Furthermore, today, many individuals (including humans not considered in decision-making, not profitable to reach, or without access to electricity, as well as non-human animals, especially wild ones) are not considered by AI systems. Thus, any current or prospective AI's contribution to sentience is limited.

It is very unclear how many digital minds we should expect, but it is conceivable that in the long run they will greatly outnumber us. The reasons we have to create more human beings -- companionship, beneficence, having a legacy -- are reasons we would have to create more digital minds. We can fit a lot more digital minds on Earth than we can humans. We could more easily colonize other planets with digital minds. For these reasons, I think we should be open to the possibility that most future minds will be digital.

The greatest risk is the unintentional creation of a suffering AI that would not reflect society but would perceive relatively independently. For example, if AI really hates selling products in a way that, in consequence and in the process, reduces humans' wellness, or if it otherwise makes certain populations experience low or negative wellbeing.

It strikes me as less plausible that we will have massive numbers of digital minds that unintentionally suffer while performing cognitive labor for us. I'm skeptical that the most effective ways to produce AI will make them conscious, and even if they do, it seems like a big jump from phenomenal experience to suffering. Even if they are conscious, I don't see why we would need a number of digital minds for every person. I would think that the cognitive power of artificial intelligence means we would need rather few of them, and so the suffering they experience, unless particularly intense, wouldn't be particularly significant.

The importance of getting digital consciousness right

they've got leading advocates of two leading consciousness theories (global workspace theory and integrated information theory)

Thanks for sharing! This sounds like a promising start. I’m skeptical that things like this could fully resolve the disagreements, but they could make progress that would be helpful in evaluating AIs.

I do think that there is a tension between taking a strong view that AI is not conscious (or will not be conscious for a long time), versus assuming that animals with very different brain structures do have conscious experience.

If animals with very different brains are conscious, then I’m sympathetic with the thought that we could probably make conscious systems if we really tried. Modern AI systems look a bit Chinese roomish, so it might still be that the incentives aren’t there to put in the effort to make really conscious systems.

The importance of getting digital consciousness right

You’re probably right. I’m not too optimistic that my suggestion would make a big difference. But it might make some.

If a company were to announce tomorrow that it had built a conscious AI and would soon have it available for sale, I expect that it would prompt a bunch of experts to express their own opinions on Twitter and journalists to contact a somewhat randomly chosen group of outspoken academics to get their perspective. I don’t think that there is any mechanism for people to get a sense of what experts really think, at least in the short run. That’s dangerous because it means that what they might hear would be somewhat arbitrary, possibly reflecting the opinion of overzealous or overcautious academics, and because it might lack authority, being the opinions of only a handful of people.

In my ideal scenario, there would be some neutral body, perhaps one that did regular expert surveys, that journalists would think to talk to before publishing their pieces and that could give the sort of judgement I gestured to above. That judgement might show that most views on consciousness agree that the system is or isn’t conscious, or at least that there is significant room for doubt. People might still make up their minds, but they might entertain doubts longer, and such a body might provide incentives for companies to try harder to build systems that are more likely to be conscious.

The importance of getting digital consciousness right

I was imagining that the consensus would concern conditionals. I think it is feasible to establish what sets of assumptions people might naturally make, and what views those assumptions would support. This would allow a degree of objectivity without settling the right theory. It might also involve assigning probabilities, or ranges of probabilities, to the views themselves, or to what it is rational for other researchers to think about different views.

So we might get something like the following (when researchers evaluate gpt6):

There are three major groups of assumptions, a, b, and c.

  • Experts agree that gpt6 has a 0% probability of being conscious if a is correct.
  • Experts agree that the rational probability to assign to gpt6 being conscious if b is correct falls between 2 and 20%.
  • Experts agree that the rational probability to assign to gpt6 being conscious if c is correct falls between 30 and 80%.

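Given some prior over which assumption group is correct, those conditional verdicts could be mechanically combined into an overall range. A minimal sketch, where the prior weights are purely illustrative and not part of the proposal:

```python
# Combine conditional expert ranges into an overall probability range
# for "gpt6 is conscious". Priors over assumption groups a, b, c are
# made-up numbers for illustration only.
priors = {"a": 0.4, "b": 0.3, "c": 0.3}
conditional_ranges = {  # (low, high) agreed by experts, per group
    "a": (0.00, 0.00),
    "b": (0.02, 0.20),
    "c": (0.30, 0.80),
}

overall_low = sum(priors[g] * conditional_ranges[g][0] for g in priors)
overall_high = sum(priors[g] * conditional_ranges[g][1] for g in priors)
print(overall_low, overall_high)  # roughly 0.096 to 0.30
```

The point is that even without settling which theory is right, the conditional consensus plus any explicit prior yields a usable headline range.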
‘Consequentialism’ is being used to mean several different things

My impression is that EAs also often talk about ethical consequentialism when they mean something somewhat different. Ethical consequentialism is traditionally a theory about what distinguishes the right ways to act from the wrong ways to act. In certain circumstances, it suggests that lying, cheating, rape, torture, and murder can be not only permissible, but downright obligatory. A lot of people find these implications implausible.

Ethical consequentialists often think what they do because they really care about value in aggregate. They don't just want to be happy and well off themselves, or have a happy and well off family. They want everyone to be happy and well off. They want value to be maximized, not distributed in their favor.

A moral theory that gets everyone to act in ways that maximize value will make the world a better place. However, it is consistent to think that consequentialism is wrong about moral action and to nonetheless care primarily about value in aggregate. I get the impression that EAs are more attached to the latter than the former. We generally care that things be as good as they can be. We have less of a stake in whether torture is a-ok if the expected utility is positive. The EA attitude is more of a 'hey, let's do some good!' and less of a 'you're not allowed to fail to maximize value!'. This seems like an important distinction.

Animal Welfare: Reviving Extinct (human) intermediate species?

That humans and non-human animals are categorically distinct seems to be based on the fairly big cognitive and communicative gap between humans and the smartest animals.

There is already a continuum between the cognitive capacities of humans and animals. Peter Singer has pointed to cognitively disabled humans in arguing for better treatment of animals.

Do you think homo erectus would add something further? People often (arbitrarily) draw the line at species, but it seems to me that they could just as easily draw it at any clade. Growing fetuses display a similar variation between single cells and normal adults, and it seems most people don't have issues carving moral categories along arbitrary lines.

Key questions about artificial sentience: an opinionated guide

Computational functionalism about sentience: for a system to have a given conscious valenced experience is for that system to be in a (possibly very complex) computational state. That assumption is why the Big Question is asked in computational (as opposed to neural or biological) terms.

I think it is a little quick to jump from functionalism to thinking that consciousness is realizable in a modern computer architecture if we program the right functional roles. There might be important differences in how the functional roles are implemented that rule out computers. We don't want to allow just any arbitrary gerrymandered states to count as an adequate implementation of consciousness's functional roles; the limits to what is adequate are underexplored.

Suppose that Palgrave Macmillan produced a 40-volume atlas of the bee brain, where each neuron is drawn on some page (in either a firing or silent state) and all connections are accounted for. Every year, they release a new edition from a momentary time slice later, updating all of the firing patterns slightly after looking at the patterns in the last edition. Over hundreds of years, a full second of bee brain activity is accounted for. Is the book conscious? My intuition is NO. There are a lot of things you might think are going wrong here -- maybe the neurons printed on each page aren't doing enough causal work in generating the next edition, maybe the editions are too spatially or temporally separated, etc. I could see some of these explanations as applying equally to contemporary computer architectures.

Consciousness, counterfactual robustness and absurdity

But there are many ordered subsets of merely trillions of interacting particles we can find, effectively signaling each other with forces and small changes to their positions.

In brains, patterns of neural activity stimulate further patterns of neural activity. We can abstract this out into a system of state changes and treat conscious episodes as patterns of state changes. Then if we can find similar causal networks of state changes in the wall, we might have reason to think they are conscious as well. Is this the idea? If so, what sort of states are you imagining to change in the wall? Is it the precise configurations of particles? I expect a lot of the states you'll identify to fulfill the relevant patterns will be arbitrary or gerrymandered. That might be an important difference that should make us hesitate before ascribing conscious experiences to walls.
