Derek Shiller

Philosophy Researcher @ Rethink Priorities
589 karma · Joined Mar 2019 · Derekshiller.com


It sounds like you're giving IIT approximately zero weight in your all-things-considered view. I find this surprising, given IIT's popularity amongst people who've thought hard about consciousness, and given that you seem aware of this.

From my experience, there is a significant difference in the popularity of IIT by field. In philosophy, where I got my training, it isn't a view that is widely held. Partly because of this bias, I haven't spent a whole lot of time thinking about it. I have read the seminal papers that introduce the formal model and give the philosophical justifications, but I haven't looked much into the empirical literature. The philosophical justifications seem very weak to me -- the formal model seems very loosely connected to the axioms of consciousness that supposedly motivate it. And without much philosophical justification, I'm wary of the empirical evidence. The human brain is messy enough that I expect you could find evidence to confirm IIT whether or not it is true, if you look long enough and frame your assumptions correctly. That said, it is possible that existing empirical work does provide a lot of support for IIT that I haven't taken the time to appreciate.

Additionally, I'd be interested to hear how your view may have updated in light of the recent empirical results from the IIT-GNWT adversarial collaboration:

If you don't buy the philosophical assumptions, as I do not, then I don't think you should update much on the IIT-GNWT adversarial collaboration results. IIT may have done better, but if the two views being compared were essentially picked out of a hat as representatives of different philosophical kinds, then the fact that one view does better doesn't say much about the kinds. It seems weird to me to compare the precise predictions of theories that are so drastically different in their overall view of things. I don't really like the idea of adversarial approaches across different frameworks. I would think it makes more sense to compare nearby theories to one another.

That strikes me as plausible, but if so, then rats are much more competent than humans in their 'blindsight'-like abilities. My impression is that in humans, blindsight is very subtle. A human cannot use blindsight to walk into the kitchen and get a glass of water. Rats seem able to rely on their midbrain to do this sort of thing. If rats can engage in complex behavior without consciousness, that should make us wonder whether consciousness ever plays a role in their complex behavior. If it doesn't, then why should we think they are conscious?

You might think that we have evidence from comparative neuroanatomy. Humans have some abilities in both the midbrain and the cortex, and something in the cortex adds consciousness. Maybe the same is true for rats. But if so, that would push the questions down the phylogeny to creatures who can achieve as much or more than rats with their midbrains and don't have any complex cortex.

I've generally been more sympathetic with functionalism than any other realist view about the nature of consciousness. This project caused me to update on two things.

1.) Functionalism can be developed in a number of different ways, and many of those ways will not allow for digital consciousness in contemporary computer architectures, even if they were to run a program faithfully simulating a human mind. The main issue is abstraction. Some versions of functionalism allow a system to count as running a program if some highly convoluted abstractions on that system can be constructed that mirror that program. Some versions require the program to have a fairly concrete mapping to the system. I think digital consciousness requires the former kind of view, and I don't think that there are good reasons to favor that kind of functionalism over the other.

2.) Functionalism is a weirder view than I think a lot of people give it credit for, and there really isn't much in the way of good arguments for it. A lot of the arguments come down to intuitions about cases, but it is hard to know why we should trust our intuitions about whether random complex systems are conscious. Functionalism seems most reasonable if you don't take consciousness very seriously to begin with and you think that our intuitions are constitutive in carving off a category that we happen to care about, rather than getting at an important boundary in the world.

Overall, I feel more confused than I used to be. My probability of functionalism went down, but it didn't go to a rival theory.

I worry about the effect that AI friends and partners could have on values. It seems plausible that most people could come to have a good AI friend in the coming decades. Our AI friends might always be there for us. They might get us. They might be funny and insightful and eloquent. How would it play out if their opinions were crafted by tech companies, or the government, or even were reflections of what we want our friends to think? Maybe AI will develop fast enough and be powerful enough that it won't matter what individuals think or value, but I see reasons for concern potentially much greater than the individual harms of social media.

I find points 4, 5, and 6 really unconvincing. Are there any stronger arguments for these, that don't consist of pointing to a weird example and then appealing to the intuition that "it would be weird if this thing was conscious"?

I'm not particularly sympathetic with arguments that rely on intuitions to tell us about the way the world is, but unfortunately, I think that we don't have a lot else to go on when we think about consciousness in very different systems. It is too unclear what empirical evidence would be relevant and theory only gets us so far on its own.

That said, I think there are some thought experiments that should be compelling, even though they just elicit intuitions. I believe that the thought experiments I provide are close enough to this for it to be reasonable to put weight on them. The mirror grid, in particular, just seems to me to be the kind of thing where, if you accept that it is conscious, you should probably think everything is conscious. There is nothing particularly mind-like about it; it is just complex enough to read any structure you want into it. And lots of things are complex. (Panpsychism isn't beyond the pale, but it is not what most people are on board with when they endorse functionalism or wonder if computers could be conscious.)

Another way to think about my central point: there is a history in philosophy of trying to make sense of why random objects (rocks, walls) don't count as properly implementing the same functional roles that characterize conscious states. Some accounts have been given of this, but it is not clear that those accounts wouldn't also rule out consciousness in contemporary computers. There are plausible readings of those accounts on which contemporary computers would not be conscious no matter what programs they run. If you don't particularly trust your intuitions, and you don't want to accept that rocks and walls properly implement the functional roles of conscious states, you should probably be uncertain over exactly which view is correct. Since many views would rule out consciousness in contemporary computers, you should lower the probability you assign to that.

I don't get the impression that EAs are particularly motivated by morality. Rather, they are motivated to produce things they see as good. Some moral theories, like contractualism, see producing a lot of good things (within the bounds of our other moral duties) as morally optional. You're not doing wrong by living a normal decent life. It seems perfectly aligned with EA to hold one of those theories and still personally aim to do as much good as possible.

A moral theory is more important in what it tells you you can't do in pursuit of the good. Generally, effectively pursuing the good and abiding by the standard moral rules of society (e.g. don't steal money to give to charity) go hand in hand, so I would expect to see less discussion of this on the forum. Where they come apart, it is probably a significant reputational risk to discuss them.

I like this take: if AI is dangerous enough to kill us in three years, no feasible amount of additional interpretability research would save us.

Our efforts should instead go to limiting the amount of damage that initial AIs could do. That might involve work securing dangerous human-controlled technologies. It might involve creating clever honey pots to catch unsophisticated-but-dangerous AIs before they can fully get their act together. It might involve lobbying for processes or infrastructure to quickly shut down Azure or AWS.

Even in humans, language production is generally subconscious. At least, my experience of talking is that I generally first become conscious of what I say as I'm saying it. I have some sense of what I might want to say before I say it, but the machinery that selects specific words is not conscious. Sometimes, I think of a couple of different things I could say and consciously select between them. But often I don't: I just hear myself speak. Language generation may often lead to conscious perceptions of inner speech, but it doesn't seem to rely on it.

All of this suggests that the possibility of non-conscious chatbots should not be surprising. It may be that chatbots provide pretty good evidence that cognitive complexity can come apart from consciousness. But introspection alone should provide sufficient evidence for that.
