New & upvoted


Quick takes

tlevin · 1h
I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable.

I'm not familiar with a lot of systematic empirical evidence on either side, but it seems to me like the more effective actors in the DC establishment overall are much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy than of pushing for their ideal vision and then making concessions. Possibly an ideal ecosystem has both strategies, but it seems possible that at least some versions of "Overton Window-moving" strategies executed in practice have larger negative effects via associating their "side" with unreasonable-sounding ideas in the minds of very bandwidth-constrained policymakers, who strongly lean on signals of credibility and consensus when quickly evaluating policy options, than the positive effects of increasing the odds of ideal policy and improving the framing for non-ideal but pretty good policies.

In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences.

Would be interested in empirical evidence on this question (ideally actual studies from psych, political science, sociology, econ, etc literatures, rather than specific case studies due to reference class tennis type issues).
Excerpt from the most recent update from the ALERT team:

"Highly pathogenic avian influenza (HPAI) H5N1: What a week! The news, data, and analyses are coming in fast and furious. Overall, ALERT team members feel that the risk of an H5N1 pandemic emerging over the coming decade is increasing. Team members estimate that the chance that the WHO will declare a Public Health Emergency of International Concern (PHEIC) within 1 year from now because of an H5N1 virus, in whole or in part, is 0.9% (range 0.5%-1.3%). The team sees the chance going up substantially over the next decade, with the 5-year chance at 13% (range 10%-15%) and the 10-year chance increasing to 25% (range 20%-30%)."

Their estimated 10-year risk is a lot higher than I would have anticipated.
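
(A quick back-of-the-envelope on why that 10-year figure stands out. The sketch below is my own arithmetic, not the ALERT team's model: it assumes, purely for illustration, a constant hazard rate over each horizon and asks what average annual risk each cumulative estimate would imply.)

```python
import math

# My own back-of-the-envelope, not the ALERT team's model: treat each cumulative
# PHEIC probability p over horizon T as if it came from a constant annual hazard
# rate lam, i.e. p = 1 - exp(-lam * T), and solve for lam.
estimates = {1: 0.009, 5: 0.13, 10: 0.25}  # horizon in years -> cumulative probability

for years, p in estimates.items():
    lam = -math.log(1 - p) / years  # implied average annual hazard rate
    print(f"{years:>2}-year chance of {p:.1%} -> ~{lam:.1%} per year on average")

# Prints roughly 0.9%/yr, 2.8%/yr and 2.9%/yr: the longer-horizon estimates imply
# an average annual risk about three times the current ~1%/yr, i.e. the team
# expects the underlying risk to climb, which is what makes the 10-year figure
# seem high relative to the 1-year one.
```
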
I can't find a better place to ask this, but I was wondering whether/where there is a good explanation of the scepticism of leading rationalists about animal consciousness/moral patienthood. I am thinking in particular of Zvi and Yudkowsky. In the recent podcast with Zvi Mowshowitz on 80K, the question came up a bit, and I know he is also very sceptical of interventions for non-human animals on his blog, but I had a hard time finding a clear explanation of where this belief comes from. I really like Zvi's work, and he has been right about a lot of things I was initially on the other side of, so I would be curious to read more of his or similar people's thoughts on this. This seems like it could be a place with a motivation gap: people who don't work on animal welfare have little incentive to explain why they think the things I work on are not that useful.
American Philosophical Association (APA) announces two $10,000 AI2050 Prizes for philosophical work related to AI, with June 23, 2024 deadline:  https://dailynous.com/2024/04/25/apa-creates-new-prizes-for-philosophical-research-on-ai/ https://www.apaonline.org/page/ai2050 https://ai2050.schmidtsciences.org/hard-problems/
Time for the Shrimp Welfare Project to do a Taylor Swift crossover? https://www.instagram.com/p/C59D5p1PgNm/?igsh=MXZ5d3pjeHAxeHR2dw==


Recent discussion

Anders Sandberg has written a “final report” released simultaneously with the announcement of FHI’s closure. The abstract and an excerpt follow.


Normally manifestos are written first, and then hopefully stimulate actors to implement their vision. This document is the reverse.

...
Continue reading

Yudkowsky's comments at his sister's wedding seem surprisingly relevant here:

David Bashevkin:

And I would not have thought that Eliezer Yudkowsky would be the best sheva brachos speaker, but it was the most lovely thing that he said. What did Eliezer Yudkowsky say at your sheva brachos?

Channah Cohen:

Yeah, it’s a great story because it was mind-blowingly surprising at the time. And it is, I think, the only thing that anyone said at a sheva brachos that I actually remember. He got up at the first sheva brachos and he said, when you die after 120 yea

... (read more)
Arepo · 21m
When did he get feedback from Kings? Googling it, the only thing I can see is that he was invited to an event which the Swedish king was also at. Also, most of Bostrom's extra-academic prestige is based on a small handful of the papers listed. That might justify making him something like a public communicator of philosophy, but it doesn't obviously merit sponsoring an entire academic department indefinitely. To be clear, I have no strong view on whether the university acted reasonably a) in the abstract or b) according to incentives in the unique prestige ecosystem which universities inhabit. But I don't think listing a handful of papers our subgroup approves of is a good rationale for claiming that it did neither.
Linch · 16m
I'm at work and don't have the book with me, but you can look at the "Acknowledgements" section of Superintelligence.

I agree that it's not clear whether the Department of Philosophy acted reasonably in the unique prestige ecosystem which universities inhabit, whether in the abstract or after adjusting for FHI quite possibly being unusually difficult/annoying to work with. I do think history will vindicate my position in the abstract, and that "normal people" with a smattering of facts about the situation (though perhaps not the degree of granularity where you understand the details of specific academic squabbles) will agree with me.

Just read this in the Guardian. 

The title is: "‘Eugenics on steroids’: the toxic and contested legacy of Oxford’s Future of Humanity Institute"

The sub-headline states: "Nick Bostrom’s centre for studying existential risk warned about AI but also gave rise to cultish ideas such as effective altruism."

The tone of the rest of the article is similar. It's very disappointing from the Guardian, which would typically align with EA thinking on many topics, but EA is probably an easy target these days. It's useful to be aware of the uphill struggle we face, even among liberals, to ensure that EA gets at least a fair hearing.

Sharing for info only. Obviously I don't agree with the article. 

Continue reading

A crucial consideration in assessing the risks of advanced AI is the moral value we place on "unaligned" AIs—systems that do not share human preferences—which could emerge if we fail to make enough progress on technical alignment.

In this post I'll consider three potential...

Continue reading
Rohin Shah · 2h
Given my new understanding of the meaning of "contingent" here, I'd say my claims are:

1. I'm unsure about how contingent the development of utilitarianism in humans was. It seems quite plausible that it was not very historically contingent. I agree my toy model does not accurately capture my views on the contingency of total utilitarianism.
2. I'm also unsure how contingent it is for unaligned AI, but aggregating over my uncertainty suggests more contingent.

One way to think about this is to ask: why are any humans utilitarians? To the extent it's for reasons that don't apply to unaligned AI systems, I think you should feel like it is less likely for unaligned AI systems to be utilitarians. So e.g. if I thought humans were utilitarians primarily because it is simple to express in concepts that humans and AIs share, then I would agree with you. But in fact I feel like it is pretty important that humans feel pleasure and pain, and have empathy, to explain why some humans are utilitarians. (Mostly I think the "true explanation" will have to appeal to more than simplicity, and the additional features this "true explanation" will appeal to are very likely to differ between humans and AIs.) Indeed I feel like AIs probably build fewer pyramids in expectation, for basically the same reason. (The concrete hypothesis I generated for why humans build pyramids was "maybe pyramids were especially easy to build historically".)

General note: I want to note that my focus on AI alignment is not necessarily coming from a utilitarian perspective. I work on AI alignment because in expectation I think a world with aligned AI will better reflect "my values" (which I'm uncertain about and may not reflect utilitarianism) than a world with unaligned AI. But I'm happy to continue talking about the implications for utilitarians.

So e.g. if I thought humans were utilitarians primarily because it is simple to express in concepts that humans and AIs share, then I would agree with you. But in fact I feel like it is pretty important that humans feel pleasure and pain, and have empathy, to explain why some humans are utilitarians. (Mostly I think the "true explanation" will have to appeal to more than simplicity, and the additional features this "true explanation" will appeal to are very likely to differ between humans and AIs.)

Thanks for trying to better understand my views. I apprecia... (read more)

Rohin Shah · 3h
I agree it's clear that you claim that unaligned AIs are plausibly comparably utilitarian as humans, maybe more. What I didn't find was discussion of how contingent utilitarianism is in humans. Though actually rereading your comment (which I should have done in addition to reading the post) I realize I completely misunderstood what you meant by "contingent", which explains why I didn't find it in the post (I thought of it as meaning "historically contingent"). Sorry for the misunderstanding. Let me backtrack like 5 comments and retry again.

Summary 

  • U.S. poverty is deadlier than many might realize and worse than most wealthy countries. 
  • The U.S. spends $600B/year on poverty interventions, and many are less effective than they could be because of poor design.
  • The work of GiveDirectly U.S. and others aims to improve how government poverty programs are designed and administered, in effect making that $600B/year do more good. 
  • While a dollar given to a U.S. cause has less impact on EA merits than one given to an international cause, we should have a similar debate about the effectiveness of U.S. causes for those already giving domestically.

Author: Laura Keen, Senior U.S. Program Manager at GiveDirectly

NOTE: This post is specific to GiveDirectly’s work in the U.S., which is run by dedicated U.S. staff and funded by U.S.-restricted donations. Donations to GiveDirectly only fund our international work unless expressly given...

Continue reading

GPT-5 training is probably starting around now. It seems very unlikely that GPT-5 will cause the end of the world. But it’s hard to be sure. I would guess that GPT-5 is more likely to kill me than an asteroid, a supervolcano, a plane crash or a brain tumor. We can predict...

Continue reading
Matthew_Barnett · 5h
A few questions:

  • What is the risk level below which you'd be OK with unpausing AI?
  • What do you think about the potential benefits from AI?
  • How do you interpret models of AI pause, such as this one from Chad Jones?
  • What is the risk level below which you'd be OK with unpausing AI?

I think approximately 1 in 10,000 chance of extinction for each new GPT would be acceptable given the benefits of AI. This is approximately my guess for GPT-5, so I think if we could release that model and then pause, I'd be okay with that.

A major consideration here is the use of AI to mitigate other x-risks. Some of Toby Ord's x-risk estimates:

  • AI - 1 in 10
  • Engineered pandemic - 1 in 30
  • Unforeseen anthropogenic risks (e.g. dystopian regime, nanotech) - 1 in 30
  • Other anthropogenic risks - 1 in 50
... (read more)
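
(To put these numbers side by side: the sketch below is my own rough arithmetic, not Matthew's or Toby Ord's modelling. It compounds the 1-in-10,000 per-release threshold over repeated releases, and naively combines the century-scale estimates quoted above as if they were independent.)

```python
# My own rough arithmetic, not Matthew's or Toby Ord's modelling.

# (1) Cumulative extinction risk if each new GPT release carries ~1/10,000 risk.
per_release = 1 / 10_000
for n in (1, 5, 10, 20):
    cumulative = 1 - (1 - per_release) ** n
    print(f"{n:>2} releases at 1/10,000 each -> {cumulative:.3%} cumulative risk")

# (2) The century-scale estimates quoted above, combined under a naive
#     independence assumption: 1 - prod(1 - p_i).
ord_estimates = {
    "AI": 1 / 10,
    "engineered pandemic": 1 / 30,
    "unforeseen anthropogenic": 1 / 30,
    "other anthropogenic": 1 / 50,
}
p_no_catastrophe = 1.0
for p in ord_estimates.values():
    p_no_catastrophe *= 1 - p
print(f"combined risk across these categories: {1 - p_no_catastrophe:.0%}")  # ~18%
```
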
SummaryBot · 6h
Executive summary: The author argues that mass protests against AI development are necessary as a backup plan to prevent potential catastrophic risks from future AI systems like GPT-6, given the uncertainty and limitations of current governance and technical solutions.

Key points:

1. GPT-5 training is starting soon, and while catastrophic risks are unlikely, they are hard to predict and mitigate with certainty.
2. Governance efforts and technical solutions for AI alignment may not be sufficient to prevent the development of potentially dangerous AI systems like GPT-6 by 2028.
3. Mass protests against AI are a viable "Plan B" because they require no new ideas or permissions, and most people support pausing AI development without feeling like they are sacrificing anything.
4. Building a small protest movement now through efforts like PauseAI can lay the foundation for a larger, more impactful movement when the general public becomes more aware of AI's imminent effects on society and the economy.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

As advanced machine learning systems become increasingly widespread, the question of how to make them safe is also gaining attention. Within this debate, the term "open source" is frequently brought up. Some claim that open sourcing models could increase the likelihood of societal risks, while others insist that open sourcing is the only way to ensure the development and deployment of these "artificial intelligence," or "AI," systems goes well. Yet despite "open source" being central to the debate over "AI" governance, only one group has released cutting-edge "AI" that can be considered Open Source.

Image by Alan Warburton / © BBC / Better Images of AI / Plant / CC-BY 4.0

The term Open Source was first used to describe software in 1998, and was coined by Christine Peterson to describe the principles that would guide the development of the Netscape...

Continue reading

Effective today, I’ve left Open Philanthropy and joined the Carnegie Endowment for International Peace[1] as a Visiting Scholar. At Carnegie, I will analyze and write about topics relevant to AI risk reduction. In the short term, I will focus on (a) what AI capabilities...

Continue reading

Additionally, how are you feeling about voluntary commitments from labs (RSPs included) relative to alternatives like mandatory regulation by governments?

This is discussed in Holden's earlier post on the topic here.

This is a linkpost for Imitation Learning is Probably Existentially Safe by Michael Cohen and Marcus Hutter.

Abstract

Concerns about extinction risk from AI vary among experts in the field. But AI encompasses a very broad category of algorithms. Perhaps some algorithms would

...
Continue reading

I agree with the title and basic thesis of this article but I find its argumentation weak.

First, we’ll offer a simple argument that a sufficiently advanced supervised learning algorithm, trained to imitate humans, would very likely not gain total control over humanity (to the point of making everyone defenseless) and then cause or allow human extinction from that position.

No human has ever gained total control over humanity. It would be a very basic mistake to think anyone ever has. Moreover, if they did so, very few humans would accept human extinction. A

... (read more)