harfe

902 karma

Comments

Once upon a time, some people were arguing that AI might kill everyone, and that EA resources should address that problem instead of fighting malaria. So OpenPhil poured millions of dollars into orgs such as EpochAI (which got 9 million). Now three people from EpochAI have created a startup that provides training data to help AI replace human workers. Some people are worried that this startup increases AI capabilities, and therefore increases the chance that AI will kill everyone.

However, a model trained to obey the RLHF objective will expect negative reward if it decided to take over the world

If an AI takes over the world, there is no one around to give it a negative reward. So the AI will not expect a negative reward for taking over the world.

The issue is not whether the AI understands human morality. The issue is whether it cares.

The arguments from the "alignment is hard" side that I was exposed to don't rely on the AI misinterpreting what humans want. In fact, superhuman AI is assumed to be better than humans at understanding human morality. It could still do things that go against human morality. Overall, I get the impression you misunderstand what alignment is about (or maybe you just have different associations with words like "alignment" than I do).

Whether a language model can play a nice character that would totally give back its dictatorial powers after a takeover is barely any evidence about whether an actual super-human AI system would step back from its position of world dictator after it has accomplished some tasks.

How is that better than individuals just donating to wherever they think makes sense on the margin?

I think the comment already addresses that here:

moreover, rule by committee enables deliberation and information transfer, so that persuasion can be used to make decisions and potentially improve accuracy or competence at the loss of independence.

This article has a lot of downvoting (net karma of 39 from 28 votes)

This does not seem to be an unusual amount of downvoting to me. The net karma is even higher than the number of votes!

As a more general point, I think people should worry less about downvotes on posts with a high net karma.
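For concreteness, here is a minimal sketch of that arithmetic in Python. The helper `avg_upvote_weight` is hypothetical, and the vote weights are assumptions: actual Forum vote strength depends on voter karma, so this is an illustration, not the site's algorithm.

```python
# A toy calculation, not the Forum's real vote-weighting: vote strength on
# the EA Forum scales with voter karma, so the weights here are assumptions.

def avg_upvote_weight(n_votes: int, net_karma: int, n_down: int,
                      down_weight: float = 1.0) -> float:
    """Average upvote weight needed to reach `net_karma`, assuming `n_down`
    of the `n_votes` voters downvoted with weight `down_weight`."""
    n_up = n_votes - n_down
    if n_up <= 0:
        raise ValueError("need at least one upvote")
    # net = n_up * avg_up - n_down * down_weight  =>  solve for avg_up
    return (net_karma + n_down * down_weight) / n_up

# Net karma 39 from 28 votes: even assuming 5 downvotes, the 23 upvotes only
# need to average ~1.9 karma each -- ordinary votes, not heavy downvoting.
print(avg_upvote_weight(28, 39, 5))  # ~1.91
```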

Answer by harfe

As for existential risk from AI takeover, I don't think having a self-sustaining civilization on Mars would help much.

If an AI has completed a takeover on Earth and killed all humans there, taking over Mars too does not sound that hard, especially since the human civilization on Mars is likely quite fragile. (There might be some edge cases where you solve the AI control problem well enough to guarantee that all advanced AIs leave Mars alone, but not well enough for AI to leave Australia alone, but I think scenarios like these are extremely unlikely.)

For other existential risks, it might in principle be useful, but practically very difficult. Building a self-sustaining city on Mars would take a lot of time and resources. On the scale of centuries, though, it seems like a viable option.

At the same time though I don't think you mean to endorse 1).

I have read or skimmed some of his posts and my sense is that he does endorse 1). But at the same time he says:

critics seem to frequently conflate my arguments with other, simpler positions that can be more easily dismissed.

so maybe this is one of those cases, and I should be more careful.

the AI won't ever have more [...] capabilities to hack and destroy infrastructure than Russia, China or the US itself.

Having better hacking capability than China seems like a low bar for super-human AGI. The AGI would only need to be better at writing and understanding code than a small group of talented humans, and to have access to some servers. This sounds easy if you accept the premise of smarter-than-human AGI.

Merely listing EA under "Memetics adjacence" does not support the claim that he "is also an avowed effective altruist."
