



I think game playing AI is pretty well characterized as having the goal of winning the game, and being more or less capable of achieving that goal at different degrees of training. Maybe I am just too used to this language but it seems very intuitive to me. Do you have any examples of people being confused by it?

I don’t think the airplane analogy makes sense because airplanes are not intelligent enough to be characterized as having their own preferences or goals. If there were a new dog breed that was stronger/faster than all previous dog breeds, but also more likely to attack their owners, it would be perfectly straightforward to describe the dog as “more capable” (but also more dangerous).

I think you should update approximately not at all from Sam Altman saying GPT-5 is going to be much better. Every CEO says every new version of their product is much better--building hype is central to their job.

I think this post is conflating two different things in an important way. There are two distinctions here about what we might mean by "A is trying to do what H wants":

  • Whether the trying is instrumental, or terminal. (A might be trying to do what H wants because A intrinsically wants to try to do so, or because it thinks trying to do so is a good strategy to achieve some unrelated thing it wants.)
  • Whether the trying is because of coinciding utility functions, or because of some other value structure, such as loyalty/corrigibility. A might be trying to do what H wants because it values [the exact same things as H values], or it might be because A values [trying to do things H values].

To illustrate the second distinction: I could drive my son to a political rally because I also believe in the cause, or because I love my son and want to see him succeed at whatever goals he has.

I think it is much more likely that we will instill AIs with something like loyalty than that we will instill them with our exact values directly, and I think most alignment optimists consider this the more promising direction. (I think this is essentially what the term "corrigibility" refers to in alignment.) I know this has been Paul Christiano's approach for a long time now, see for example this post.

I think this comment does a really bad job of interpreting Austin in good faith. You are putting words in his mouth, rewriting the tone and substance of his comment so that it is much more contentious than what he actually expressed. Austin did not claim:

  • that Manifest is cooler or less snooty than EA
  • that EA and Manifest are equally controversial
  • that turning people away is never the right thing to do (barring a physical threat)

I think it is pretty poor form to read someone's comment in such a hostile way and then attribute views to them they didn't express.

Interesting post! I think analogies are good for public communication but not for understanding things at a deep level. They're like a good way to quickly template something you haven't thought about at all with something you are familiar with. I think effective mass communication is quite important and we shouldn't let the perfect be the enemy of the good.

I wouldn't consider my Terminator comparison an analogy in the sense of the other items on this list. Most of the other items have the character of "why might AI go rogue?" and then they describe something other than AI that is hard to understand or goes rogue in some sense and assert that AI is like that. But Terminator is just literally about an AI going rogue. It's not so much an analogy as a literal portrayal of the concern. My point wasn't so much that you should proactively tell people that AI risk is like Terminator, but that people are just going to notice this on their own (because it's incredibly obvious), and contradicting them makes no sense.

I'm of a split mind on this. On the one hand, I definitely think this is a better way to think about what will determine AI values than "the team of humans that succeeds in building the first AGI".

But I also think the development of powerful AI is likely to radically reallocate power, potentially towards AI developers. States derive their power from a monopoly on force, and I think there is likely to be a period before the obsolescence of human labor in which these monopolies are upset by whoever is able to most effectively develop and deploy AI capabilities. It's not clear who this will be, but it hardly seems guaranteed to be existing state powers or property holders, and AI developers have an obvious expertise and first mover advantage.

I think the first point is subtly wrong in an important way. 

EAGs are not only useful insofar as they let community members do better work in the real world. EAGs are useful insofar as they result in a better world coming to be.

One way in which EAGs might make the world better is by fostering a sense of community, validation, and inclusion among those who have committed themselves to EA, thus motivating people to so commit themselves and to maintain such commitments. This function doesn't bear on "letting" people do better work per se.

Insofar as this goal is an important component of EAG's impact, it should be prioritized alongside more direct effects of the conference. EAG obviously exists to make the world a better place, but serving the EA community and making EAs happy is an important way in which EAG accomplishes this goal. 

I think this is a great post.

One reason I think it would be cool to see EA become more politically active is that political organizing is a great example of a low-commitment way for lots of people to enact change together. It kind of feels ridiculous that if there is an unsolved problem with the world, the only way I can personally contribute is to completely change careers to work on solving it full time, while most people are still barely aware it exists. 

I think the mechanism of "try to build broad consensus that a problem needs to get solved, then delegate collective resources towards solving it" is underrated in EA at current margins. It probably wasn't underrated before EA had billionaire-level funding, but as EA comes to have about as much money as you can get from small numbers of private actors, and it starts to enter the mainstream, I think it's worth taking the prospect of mass mobilization more seriously. 

This doesn't even necessarily have to look like getting a policy agenda enacted. I think of climate change as a problem that is being addressed by mass mobilization, but in the US, this mass mobilization has mostly not come in the form of government policy (at least not national policy). It's come from widespread understanding that it's a problem that needs to get solved, and is worth devoting resources to, leading to lots of investment in green technology.

I don't think this is a good characterization of e.g. Kelsey's preference for her Philip Morris analogy over the Terminator analogy--does rogue Philip Morris sound like a far harder problem to solve than rogue Skynet? Not to me, which is why it seems to me much more motivated by not wanting to sound science-fiction-y. Same as Dylan's piece; it doesn't seem to be saying "AI risk is a much harder problem than implied by the Terminator films", except insofar as it misrepresents the Terminator films as involving evil humans intentionally making evil AI.

It seems to me like the proper explanatory path is "Like Terminator?" -> "Basically" -> "So why not just not give AI nuclear launch codes?" -> "There are a lot of other ways AI could take over". 

"Like Terminator?" -> "No, like Philip Morris" seems liable to confuse the audience about the very basic details of the issue, because Philip Morris didn't take over the world. 
