Linch commented on Future of Humanity Institute 2005-2024: Final Report 12m ago

221

Future of Humanity Institute 2005-2024: Final Report

· 13d ago · 6m read

This is a linkpost for https://static1.squarespace.com/static/660e95991cf0293c2463bcc8/t/661a3fc3cecceb2b8ffce80d/1712996303164/FHI+Final+Report.pdf

Anders Sandberg has written a “final report” released simultaneously with the announcement of FHI’s closure. The abstract and an excerpt follow.

Normally manifestos are written first, and then hopefully stimulate actors to implement their vision. This document is the reverse

...

Linch12m2

Yudkowsky's comments at his sister's wedding seems surprisingly relevant here:

David Bashevkin:
And I would not think it was not think that Eliezer Yudkowsky would be the best sheva brachos speaker, but it was the most lovely thing that he said. What did Eliezer Yudkowsky say at your sheva brachos?
Channah Cohen:
Yeah, it’s a great story because it was mind-blowingly surprising at the time. And it is, I think the only thing that anyone said at a sheva brachos that I actually remember, he got up at the first sheva brachos and he said, when you die after 120 yea

... (read more)

Arepo

21m

When did he get feedback from Kings? Googling it, the only thing I can see is that he was invited to an event which the Swedish king was also at. Also, most of Bostrom's extra-academic prestige is based on a small handful of the papers listed. That might justify making him something like a public communicator of philosophy, but it doesn't obviously merit sponsoring an entire academic department indefinitely. To be clear, I have no strong view on whether the university acted reasonably a) in the abstract or b) according to incentives in the unique prestige ecosystem which universities inhabit. But I don't think listing a handful of papers our subgroup approves of is a good rationale for claiming that it did neither.

Linch

16m

I'm at work and don't have the book with me, but you can look at the "Acknowledgements" section of Superintelligence. I agree that it's not clear whether the Department of Philosophy acted reasonably in the unique prestige ecosystem which universities inhabit, whether in the abstract or after adjusting for FHI quite possibly being unusually difficult/annoying to work with. I do think history will vindicate my position in the abstract and "normal people" with a smattering of facts about the situation (though perhaps not the degree of granularity where you understand the details of specific academic squabbles) will agree with me.

Denis posted Guardian article on Nick Bostrom and the closure of the Future of Humanity Institute (link-post) 24m ago

Guardian article on Nick Bostrom and the closure of the Future of Humanity Institute (link-post)

Denis

· 24m ago · 1m read

Just read this in the Guardian.

The title is: "‘Eugenics on steroids’: the toxic and contested legacy of Oxford’s Future of Humanity Institute"

The sub-headline states: "Nick Bostrom’s centre for studying existential risk warned about AI but also gave rise to cultish ideas such as effective altruism."

The tone of the rest of the article is similar. Very disappointing from the Guardian, who typically would align with EA thinking on many topics. But probably today EA is an easy target. It's useful to be aware of the uphill struggle we have, even among liberals, to ensure that EA gets at least a fair hearing.

Sharing for info only. Obviously I don't agree with the article.

Matthew_Barnett commented on Analyzing the moral value of unaligned AIs 1h ago

Analyzing the moral value of unaligned AIs

Matthew_Barnett

· 22d ago · 23m read

A crucial consideration in assessing the risks of advanced AI is the moral value we place on "unaligned" AIs—systems that do not share human preferences—which could emerge if we fail to make enough progress on technical alignment.

In this post I'll consider three potential...

Rohin Shah

Given my new understanding of the meaning of "contingent" here, I'd say my claims are: 1. I'm unsure about how contingent the development of utilitarianism in humans was. It seems quite plausible that it was not very historically contingent. I agree my toy model does not accurately capture my views on the contingency of total utilitarianism. 2. I'm also unsure how contingent it is for unaligned AI, but aggregating over my uncertainty suggests more contingent. One way to think about this is to ask: why are any humans utilitarians? To the extent it's for reasons that don't apply to unaligned AI systems, I think you should feel like it is less likely for unaligned AI systems to be utilitarians. So e.g. if I thought humans were utilitarians primarily because it is simple to express in concepts that humans and AIs share, then I would agree with you. But in fact I feel like it is pretty important that humans feel pleasure and pain, and have empathy, to explain why some humans are utilitarians. (Mostly I think the "true explanation" will have to appeal to more than simplicity, and the additional features this "true explanation" will appeal to are very likely to differ between humans and AIs.) Indeed I feel like AIs probably build fewer pyramids in expectation, for basically the same reason. (The concrete hypothesis I generated for why humans build pyramids was "maybe pyramids were especially easy to build historically".) General note: I want to note that my focus on AI alignment is not necessarily coming from a utilitarian perspective. I work on AI alignment because in expectation I think a world with aligned AI will better reflect "my values" (which I'm uncertain about and may not reflect utilitarianism) than a world with unaligned AI. But I'm happy to continue talking about the implications for utilitarians.

Matthew_Barnett1h2

So e.g. if I thought humans were utilitarians primarily because it is simple to express in concepts that humans and AIs share, then I would agree with you. But in fact I feel like it is pretty important that humans feel pleasure and pain, and have empathy, to explain why some humans are utilitarians. (Mostly I think the "true explanation" will have to appeal to more than simplicity, and the additional features this "true explanation" will appeal to are very likely to differ between humans and AIs.)

Thanks for trying to better understand my views. I apprecia... (read more)

Rohin Shah

I agree it's clear that you claim that unaligned AIs are plausibly comparably utilitarian as humans, maybe more. What I didn't find was discussion of how contingent utilitarianism is in humans. Though actually rereading your comment (which I should have done in addition to reading the post) I realize I completely misunderstood what you meant by "contingent", which explains why I didn't find it in the post (I thought of it as meaning "historically contingent"). Sorry for the misunderstanding. Let me backtrack like 5 comments and retry again.

You'll get a weekly email with the best posts from the past week. The Forum team selects the posts to feature based on personal preference and Forum popularity, and also adds some announcements and a classic post.

tlevin posted a Quick Take 1h ago

tlevin1h5

I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable.

I'm not familiar with a lot of systematic empirical evidence on either side, but it seems to me like the more effective actors in the DC establishment overall are much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy than of pushing for their ideal vision and then making concessions. Possibly an ideal ecosystem has both strategies, but it seems possible that at least some versions of "Overton Window-moving" strategies executed in practice have larger negative effects via associating their "side" with unreasonable-sounding ideas in the minds of very bandwidth-constrained policymakers, who strongly lean on signals of credibility and consensus when quickly evaluating policy options, than the positive effects of increasing the odds of ideal policy and improving the framing for non-ideal but pretty good policies.

In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences.

Would be interested in empirical evidence on this question (ideally actual...

lauradelaplaya posted How we can improve the U.S. social safety net and save lives 3h ago

How we can improve the U.S. social safety net and save lives

lauradelaplaya

· 3h ago · 9m read

Summary

U.S. poverty is deadlier than many might realize and worse than most wealthy countries.
The U.S. spends $600B/year on poverty interventions, and many are less effective than they could be because of poor design.
The work of GiveDirectly U.S. and others aims to improve how government poverty programs are designed and administered, in effect making that $600B/year do more good.
While a dollar to a U.S. cause has less impact by EA-merits than giving to an international one, we should have a similar debate on the effectiveness of U.S. causes for those already giving domestically.

Author: Laura Keen, Senior U.S. Program Manager at GiveDirectly

NOTE: This post is specific to GiveDirectly’s work in the U.S., which is run by dedicated U.S. staff and funded by U.S.-restricted donations. Donations to GiveDirectly only fund our international work unless expressly given...

Joseph Miller commented on Why I'm doing PauseAI 3h ago

Why I'm doing PauseAI

Joseph Miller

· 7h ago

GPT-5 training is probably starting around now. It seems very unlikely that GPT-5 will cause the end of the world. But it’s hard to be sure. I would guess that GPT-5 is more likely to kill me than an asteroid, a supervolcano, a plane crash or a brain tumor. We can predict...

Matthew_Barnett

A few questions: * What is the risk level below which you'd be OK with unpausing AI? * What do you think about the potential benefits from AI? * How do you interpret models of AI pause, such as this one from Chad Jones?

Joseph Miller3h8

What is the risk level below which you'd be OK with unpausing AI?

I think approximately 1 in 10,000 chance of extinction for each new GPT would be acceptable given the benefits of AI. This is approximately my guess for GPT-5, so I think if we could release that model and then pause, I'd be okay with that.

A major consideration here is the use of AI to mitigate other x-risks. Some of Toby Ord's x-risk estimates:

AI - 1 in 10
Engineering Pandemic - 1 in 30
Unforeseen anthropogenic risks (eg. dystopian regime, nanotech) - 1 in 30
Other anthropogenic risks - 1 in 50

... (read more)

SummaryBot

Executive summary: The author argues that mass protests against AI development are necessary as a backup plan to prevent potential catastrophic risks from future AI systems like GPT-6, given the uncertainty and limitations of current governance and technical solutions. Key points: 1. GPT-5 training is starting soon, and while catastrophic risks are unlikely, they are hard to predict and mitigate with certainty. 2. Governance efforts and technical solutions for AI alignment may not be sufficient to prevent the development of potentially dangerous AI systems like GPT-6 by 2028. 3. Mass protests against AI are a viable "Plan B" because they require no new ideas or permissions, and most people support pausing AI development without feeling like they are sacrificing anything. 4. Building a small protest movement now through efforts like PauseAI can lay the foundation for a larger, more impactful movement when the general public becomes more aware of AI's imminent effects on society and the economy. This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

Jacob-Haimes posted "Open Source AI" is a lie, but it doesn't have to be 4h ago

"Open Source AI" is a lie, but it doesn't have to be

Jacob-Haimes

· 4h ago · 8m read

This is a linkpost for https://jacob-haimes.github.io/independent/Open-Source-AI-is-a-lie/

As advanced machine learning systems become increasingly widespread, the question of how to make them safe is also gaining attention. Within this debate, the term “open source” is frequently brought up. Some claim that open sourcing models will potentially increase the likelihood of societal risks, while others insist that open sourcing is the only way to ensure the development and deployment of these “artificial intelligence,” or “AI,” systems goes well. Despite this idea of “open source” being a central debate of “AI” governance, only one group has released cutting edge “AI” which can be considered Open Source.

Image by Alan Warburton / © BBC / Better Images of AI / Plant / CC-BY 4.0

The term Open Source was first used to describe software in 1998, and was coined by Christine Peterson to describe the principles that would guide the development of the Netscape...

Ryan Greenblatt commented on Joining the Carnegie Endowment for International Peace 4h ago

181

Joining the Carnegie Endowment for International Peace

Holden Karnofsky

· 1d ago · 2m read

Effective today, I’ve left Open Philanthropy and joined the Carnegie Endowment for International Peace^[1] as a Visiting Scholar. At Carnegie, I will analyze and write about topics relevant to AI risk reduction. In the short term, I will focus on (a) what AI capabilities...

Ryan Greenblatt4h5

Additionally, how are you feeling about voluntary commitments from labs (RSPs included) relative to alternatives like mandatory regulation by governments

This is discussed in Holden's earlier post on the topic here.

Matthew_Barnett commented on Imitation Learning is Probably Existentially Safe 4h ago

Imitation Learning is Probably Existentially Safe

Vasco Grilo

· 6h ago · 3m read

This is a linkpost for https://www.openphilanthropy.org/wp-content/uploads/Imitation_Learning_Safe_ready.pdf

This is a linkpost for Imitation Learning is Probably Existentially Safe by Michael Cohen and Marcus Hutter.

Abstract

Concerns about extinction risk from AI vary among experts in the field. But AI encompasses a very broad category of algorithms. Perhaps some algorithms would

...

Matthew_Barnett4h4

I agree with the title and basic thesis of this article but I find its argumentation weak.

First, we’ll offer a simple argument that a sufficiently advanced supervised learning algorithm, trained to imitate humans, would very likely not gain total control over humanity (to the point of making everyone defenseless) and then cause or allow human extinction from that position.
No human has ever gained total control over humanity. It would be a very basic mistake to think anyone ever has. Moreover, if they did so, very few humans would accept human extinction. A

... (read more)

Effective Altruism Forum
EA Forum

New & upvoted

Posts tagged community

Quick takes

Popular comments

Recent discussion

Summary

Abstract

Resources

Opportunities

Listen to posts anywhere