All of skluug's Comments + Replies

I think game-playing AI is pretty well characterized as having the goal of winning the game, and being more or less capable of achieving that goal at different stages of training. Maybe I am just too used to this language, but it seems very intuitive to me. Do you have any examples of people being confused by it?

I don’t think the airplane analogy makes sense because airplanes are not intelligent enough to be characterized as having their own preferences or goals. If there were a new dog breed that was stronger/faster than all previous dog breeds, but also more likely to attack their owners, it would be perfectly straightforward to describe the dog as “more capable” (but also more dangerous).

2
sawyer🔸
I think people would say that the dog was stronger and faster than all previous dog breeds, not that it was "more capable". It's in fact significantly less capable at not attacking its owner, which is an important dog capability. I just think the language of "capability" is somewhat idiosyncratic to AI research and industry, and I'm arguing that it's not particularly useful or clarifying language.

More to my point (though probably orthogonal to your point), I don't think many people would buy this dog, because most people care more about not getting attacked than they do about speed and strength.

As a side note, I don't see why preferences and goals change any of this. I'm constantly hearing AI (safety) researchers talk about "capabilities research" on today's AI systems, but I don't think most of them think those systems have their own preferences and goals. At least not in the sense that a dog has preferences or goals. I just think it's a word that AI [safety?] researchers use, and I think it's unclear and unhelpful language. #taboocapabilities

I think you should update approximately not at all from Sam Altman saying GPT-5 is going to be much better. Every CEO says every new version of their product is much better--building hype is central to their job.

8
defun 🔸
That's true for many CEOs (like Elon Musk), but Sam Altman did not over-hype any of the big OpenAI launches (ChatGPT, GPT-3.5, GPT-4, GPT-4o, DALL-E, etc.). It's possible that he's doing it for the first time now, but I think it's unlikely. But let's ignore Sam's claims. Why do you think LLM progress is slowing down?

I think this post is conflating two different things in an important way. There are two distinctions here about what we might mean by "A is trying to do what H wants":

  • Whether the trying is instrumental, or terminal. (A might be trying to do what H wants because A intrinsically wants to try to do so, or because it thinks trying to do so is a good strategy to achieve some unrelated thing it wants.)
  • Whether the trying is because of coinciding utility functions, or because of some other value structure, such as loyalty/corrigibility. A might be trying to do wha
... (read more)

I think this comment does a really bad job of interpreting Austin in good faith. You are putting words in his mouth, rewriting the tone and substance of his comment so that it is much more contentious than what he actually expressed. Austin did not claim:

  • that Manifest is cooler or less snooty than EA
  • that EA and Manifest are equally controversial
  • that turning people away is never the right thing to do (barring a physical threat)

I think it is pretty poor form to read someone's comment in such a hostile way and then attribute views to them they didn't express.

Interesting post! I think analogies are good for public communication but not for understanding things at a deep level. They're a quick way to map something you haven't thought about at all onto something you're already familiar with. I think effective mass communication is quite important, and we shouldn't let the perfect be the enemy of the good.

I wouldn't consider my Terminator comparison an analogy in the sense of the other items on this list. Most of the other items have the character of "why might AI go rogue?" and then they describe somethi... (read more)

I'm of a split mind on this. On the one hand, I definitely think this is a better way to think about what will determine AI values than "the team of humans that succeeds in building the first AGI".

But I also think the development of powerful AI is likely to radically reallocate power, potentially towards AI developers. States derive their power from a monopoly on force, and I think there is likely to be a period before the obsolescence of human labor in which these monopolies are upset by whoever is able to most effectively develop and deploy AI capabilities.... (read more)

I think the first point is subtly wrong in an important way. 

EAGs are not only useful insofar as they let community members do better work in the real world. EAGs are useful insofar as they result in a better world coming to be.

One way in which EAGs might make the world better is by fostering a sense of community, validation, and inclusion among those who have committed themselves to EA, thus motivating people to make and maintain such commitments. This function doesn't bear on "letting" people do better work per se.

Insofar ... (read more)

I think this is a great post.

One reason I think it would be cool to see EA become more politically active is that political organizing is a great example of a low-commitment way for lots of people to enact change together. It kind of feels ridiculous that if there is an unsolved problem with the world, the only way I can personally contribute is to completely change careers to work on solving it full time, while most people are still barely aware it exists. 

I think the mechanism of "try to build broad consensus that a problem needs to get solved, then... (read more)

I don't think this is a good characterization of e.g. Kelsey's preference for her Philip Morris analogy over the Terminator analogy--does rogue Philip Morris sound like a far harder problem to solve than rogue Skynet? Not to me, which is why her preference seems to me to be motivated much more by not wanting to sound science-fiction-y. The same goes for Dylan's piece; it doesn't seem to be saying "AI risk is a much harder problem than implied by the Terminator films", except insofar as it misrepresents those films as involving evil humans intentionally making evil AI.

It seem... (read more)

I feel like this is a pretty insignificant objection, because it implies someone might go around thinking, "don't worry, AI risk is just like Terminator! all we'll have to do is bring humanity back from the brink of extinction, fighting amongst the rubble of civilization after a nuclear holocaust". Surely if people think the threat is only as bad as Terminator, that's plenty to get them to care.

5
Robert_Wiblin
I interpreted them not as saying that Terminator underplays the issue but rather that it misrepresents what a real AI would be able to do (in a way that probably makes the problem seem far easier to solve). But that may be me suffering from the curse of knowledge.

“Perhaps the best window into what those working on AI really believe [about existential risks from AI] comes from the 2016 survey of leading AI researchers. As well as asking if and when AGI might be developed, it asked about the risks: 70 percent of the researchers agreed with Stuart Russell’s broad argument about why advanced AI might pose a risk; 48 percent thought society should prioritize AI safety research more (only 12 percent thought less). And half the respondents estimated that the probability of the longterm impact of AGI being “extremely bad (... (read more)

Thanks for reading—you’re definitely right, my claim about the representativeness of Yudkowsky & Christiano’s views was wrong. I had only a narrow segment of the field in mind when I wrote this post. Thank you for conducting this very informative survey.

Thanks for reading! I admire that you take the time to respond to critiques even by random internet strangers. Thank you for all your hard work in promoting effective altruist ideas.

Yeah, you're right actually, that paragraph is a little too idealistic.

As a practical measure, I think it cuts both ways. Some people will hear "yes, like Terminator" and roll their eyes. Some people will hear "no, not like Terminator", get bored, and tune out. Embracing the comparison is helpful, in part, because it lets you quickly establish the stakes. The best path is probably somewhere in the middle, and dependent on the audience and context.

Overall I think it's just about finding that balance.

fwiw my friend said he recently explained AI risk to his mom, and her response was "yeah, that makes sense."

Wow, this is a really interesting point that I was not aware of.

I think these have more to do with how some people remember Terminator than with Terminator itself:

  • As I stated in this post, the AI in Terminator is not malevolent; it attacks humanity out of self-preservation.
  • Whether the AIs are conscious is not explored in the movies, although we do get shots from the Terminator's perspective, and Skynet is described as "self-aware". Most people have a pretty loose understanding of what "consciousness" means anyway, one that isn't far off from "general intelligence".
  • Cyberdyne Systems is not portrayed as greedy, at least in th
... (read more)
9
technicalities
Fair! Though sudden emergent consciousness is a more natural way to read "self-aware" than sudden recursive self-improvement or whatever other AI/AGI threshold you want to name.

I would like to thank N.N., Voxette, tyuuyookoobung & TCP for reviewing drafts of this post. 

I rewatched Terminator 1 & 2 to write this post. One thing I liked but couldn't fit in: Terminator 2 contains an example of the value specification problem! Young John Connor makes the good Terminator swear not to kill people; the Terminator immediately goes on to merely maim people severely instead.

I think "Windfall" fits the bill as a positive surprise and has the benefit of being an existing word (I'm probably not going to bother setting up an ETH wallet to submit it).

It can seem strange to focus on the wellbeing of future people who don’t even exist yet, when there is plenty of suffering that could be alleviated today. Shouldn’t we aid the people who need help now and let future generations worry about themselves?

We can see the problems with near-sighted moral concern if we imagine that past generations had felt similarly. If prior generations hadn’t cared for the future of their world, we might today find ourselves without many of the innovations we take for granted, suffering from far worse degradation of the environ... (read more)

2
james
Thanks for your submission!

I see the reasoning here for images/video, but I'm not sure it applies to audio--long-form podcasting is a great medium for serious discourse.

3
RyanCarey
Agreed. But if you're not an audio-creator and you want to seriously refer to a podcast, it would usually make the most sense to just transcribe it, especially as this becomes automatable.

On specifically getting them interested: presumably, new-Steve-Jobs doesn't want to come in and run someone else's company, they want to start their own. You could pay them a lot of money, but if they really are Jobs, the opportunity cost of not starting their own is extremely high!

A little more to the point, I think Yudkowsky is seriously underestimating the information/coordination costs associated with finding the next Steve Jobs. Maybe one does exist--in fact, I would say I am pretty sure one does--but how do you find them? How do you get them interested? How do you verify they can do what Steve did, without handing them control of a trillion dollar company? How can you convince everyone else that they should trust and support new-Steve's decisions?

These seem like more significant obstacles than whether a new Steve exists at all.

(I also think the relevance is kind of strained.)

Answer by skluug

Four layers come to mind for me:

  • Have strong theoretical reasons to think your method of creating the system cannot result in something motivated to take dangerous actions
  • Inspect the system thoroughly after creation, before deployment, to make sure it looks as expected and appears incapable of making dangerous decisions
  • Deploy the system in an environment where it is physically incapable of doing anything dangerous
  • Monitor the internals of the system closely during deployment to ensure operation is as expected, and that no dangerous actions are attempted

You know, my take on this is that instead of resisting comparisons to Terminator and The Matrix, they should just be embraced (mostly). "Yeah, like that! We're trying to prevent those things from happening. More or less."

The thing is, when you're talking about something that sounds kind of far out, you can take one of two tactics: you can try to engineer the concept and your language around it so that it sounds more normal/ordinary, or you can just embrace the fact that it is kind of crazy, and use language that makes it clear you understand that perception.

So like, "AI Apocalypse Prevention"?

I like this! UI suggestion: instead of "The first option is 5x as valuable as the second option", I would put the phrase between the two options: "...is 5x as valuable as...". Or, if you're willing to mess up marginal/total utility, you could format it as "One [X] is worth as much as five [Y]", which I think would help make it more concrete to most people.
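
To make the second formatting idea concrete, here's a minimal sketch of the kind of string-building I have in mind (a hypothetical `format_comparison` helper with placeholder option names and ratio, not anything from the actual tool):

```python
def format_comparison(option_a: str, option_b: str, ratio: float) -> str:
    """Render a value comparison as a concrete sentence.

    Hypothetical helper (not from the actual tool): rounds the ratio and
    phrases it as "One [X] is worth as much as N [Y]s", trading
    marginal/total-utility precision for readability, as noted above.
    """
    n = round(ratio)
    suffix = "" if n == 1 else "s"
    return f"One {option_a} is worth as much as {n} {option_b}{suffix}"

# Example with placeholder options and a 5x value ratio
print(format_comparison("malaria net", "deworming treatment", 5.0))
# -> One malaria net is worth as much as 5 deworming treatments
```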

5
NunoSempere
Done!
2
NunoSempere
Hey, this is a good idea, but it turns out it's slightly tricky to program. I'll get around to it eventually, though.

Clickhole is in fact no longer owned by The Onion! It was bought by the Cards Against Humanity team in early 2020. (link)

I also consider their famous article Heartbreaking: The Worst Person You Know Just Made A Great Point an enormous contribution to the epistemic habits of the internet.

6
Aaron Gertler 🔸
Wow, big news! I've made the necessary correction and changed the title.

hi, i'm skluug! i've been consuming EA-sphere content for a long time and have some friends who are heavily involved, but i so far haven't had much formal engagement myself. i graduated college this year and have given a few thousand dollars to the AMF (i signed up for the GWWC Pledge back in college and enjoy finally making good on it!). i'm interested in upping my engagement with the community and hopefully working towards a career with direct impact per 80k recommendations (i'm a religious 80k podcast listener).

4
Aaron Gertler 🔸
Congratulations on getting to follow through on your pledge! I remember making my first post-college donation, once I was finally drawing a salary, and that was a special time.