Evan R. Murphy

AI Alignment Researcher @ Independent/Non-profit

597 karma · Joined Oct 2021 · Working (6-15 years) · Vancouver, BC, Canada

Bio

Formerly a software engineer at Google, now I'm doing independent AI alignment research.

Because of my focus on AI alignment, I tend to post more on LessWrong and AI Alignment Forum than I do here.

I'm always happy to connect with other researchers or people interested in AI alignment and effective altruism. Feel free to send me a private message!

Posts
6


Evan R. Murphy's Quick takes

Evan R. Murphy

· 4y ago · 1m read

Proposal: Funding Diversification for Top Cause Areas

Evan R. Murphy

· 2y ago · 3m read

New US Senate Bill on X-Risk Mitigation [Linkpost]

Evan R. Murphy

· 3y ago · 1m read

New series of posts answering one of Holden's "Important, actionable research questions"

Evan R. Murphy

· 3y ago · 1m read

Action: Help expand funding for AI Safety by coordinating on NSF response

Evan R. Murphy

· 3y ago · 4m read

People in bunkers, "sardines" and why biorisks may be overrated as a global priority

Evan R. Murphy

· 3y ago · 3m read

Comments
71

Joining the Carnegie Endowment for International Peace

Evan R. Murphy · 5mo · 1

one thing I have been pretty enthused about for a while is putting more effort into investigating potentially concerning AI incidents in the wild. Based on case studies, I believe that exposing and helping the public understand any concerning incidents could easily be the most effective way to galvanize more interest in safety standards, including regulation. I'm not sure how many concerning incidents there are to be found in the wild today, but I suspect there are some, and I expect there to be more over time as AI capabilities advance.

Interesting idea - I can see how exposing AI incidents could be important. This brought to my mind the paper Malla: Demystifying Real-world Large Language Model Integrated Malicious Services. (No affiliation with the paper; it's just one that I remember reading and that we referenced in some Berkeley CLTC AI Security Initiative research earlier this year.) The researchers on the Malla paper dug into the dark web and uncovered hundreds of malicious services based on LLMs being distributed in the wild.

Evan R. Murphy's Quick takes

Evan R. Murphy · 1y · 6

Animal welfare

Open Phil claims that campaigns to make more Americans go vegan and vegetarian haven't been very successful. But does this analysis account for immigration?

If people who already live in the US are shifting their diets, but new immigrants skew omnivore, a simple analysis could easily miss the former shift because immigration is fairly large in the US.

Source of Open Phil claim at https://www.openphilanthropy.org/research/how-can-we-reduce-demand-for-meat/ :

But these advocates haven’t achieved the widespread dietary changes they’ve sought — and that boosters sometimes claim they have. Despite the claims, 6% of Americans aren’t vegan and vegetarianism hasn’t risen fivefold lately: Gallup polls show a constant 5-6% of Americans have identified as vegetarians since 1999 (Gallup found 2% identified as vegans the only time it asked, in 2012). The one credible poll showing vegetarianism doubling in recent years still found only 5-7% of Americans identifying as vegetarian in 2017 — consistent with the stable Gallup numbers.

Shutting down AI Safety Support

Evan R. Murphy · 2y · 11

Will the AI alignment Slack continue to run?

Thanks JJ and everyone who has worked on AISS for all your great work!

AGI x Animal Welfare: A High-EV Outreach Opportunity?

Evan R. Murphy · 2y · 1

Peter Singer and Tse Yip Fai were doing some work on animal welfare relating to AI last year (https://link.springer.com/article/10.1007/s43681-022-00187-z). It looks like Fai at least is still working in this area. But I'm not sure whether they have considered or initiated outreach to AGI labs, which seems like a great idea.

If your AGI x-risk estimates are low, what scenarios make up the bulk of your expectations for an OK outcome?

Evan R. Murphy · 2y · 3

I place significant weight on the possibility that when labs are in the process of training AGI or near-AGI systems, they will be able to see alignment opportunities that we can't from a more theoretical or distanced POV. In this sense, I'm sympathetic to Anthropic's empirical approach to safety. I also think there are a lot of really smart and creative people working at these labs.

Leading labs also employ some people focused on the worst risks. For misalignment risks, I am most worried about deceptive alignment, and Anthropic recently hired one of the people who coined that term. (From this angle, I would feel safer about these risks if Anthropic were in the lead rather than OpenAI. I know less about OpenAI's current alignment team.)

Let me be clear though: even if I'm right above and the massively catastrophic misalignment risk from one of these labs creating AGI is ~20%, I consider that very much an unacceptably high risk. I think even a 1% chance of extinction is unacceptably high. If some other kind of project had a 1% chance of causing human extinction, I don't think the public would stand for it. Imagine some particle accelerator or biotech project had a 1% chance of causing human extinction. If the public found out, I think they would want the project shut down immediately until it could be pursued safely. And I think they would be justified in that, if there's a way to coordinate on doing so.

If your AGI x-risk estimates are low, what scenarios make up the bulk of your expectations for an OK outcome?

Answer by Evan R. Murphy · May 02, 2023 · 1

A key part of my model right now relies on who develops the first AGI and on how many AGIs are developed.

If the first AGI is developed by OpenAI, Google DeepMind or Anthropic - all of whom seem relatively cautious (perhaps some more than others) - I put the chance of massively catastrophic misalignment at <20%.

If one of those labs is first and somehow able to prevent other actors from creating AGI after this, then that leaves my overall massively catastrophic misalignment risk at <20%. However, while I think it's likely one of these labs would be first, I'm highly uncertain about whether they would achieve the pivotal outcome of preventing subsequent AGIs.

So, if some less cautious actor overtakes the leading labs, or if the leading lab that first develops AGI cannot prevent many others from building AGI afterward, I think there's a much higher likelihood of massively catastrophic misalignment from one of these attempts to build AGI. In this scenario, my overall massively catastrophic misalignment risk is definitely >50%, and perhaps closer to the 75-90% range.

NYT: Google will ‘recalibrate’ the risk of releasing AI due to competition with OpenAI

Evan R. Murphy · 2y · 2

You're right - I wasn't very happy with my word choice calling Google the 'engine of competition' in this situation. The engine was already in place and involves the various actors working on AGI and the incentives to do so. But these recent developments with Google doubling down on AI to protect their search/ad revenue are revving up that engine.

NYT: Google will ‘recalibrate’ the risk of releasing AI due to competition with OpenAI

Evan R. Murphy · 2y · 13

The way this is shaking out is somewhat surprising to me. I would expect DeepMind's and OpenAI's AGI research to be competing with one another*. But here it looks like Google is the engine of competition, motivated less by any future-focused ideas about AGI and more by the fact that their core search/ad business model appears to be threatened by OpenAI's AGI research.

*And hopefully cooperating with one another too.

Keep EA high-trust

Evan R. Murphy · 2y · 83

I think it's not quite right that low trust is costlier than high trust. Low trust is costly when things are going well. There's kind of a slow burn of additional cost.

But high trust is very costly when bad actors, corruption or mistakes arise that a low-trust community would have preempted. So the cost is lumpier: cheap in the good times and expensive in the bad.

(I read fairly quickly so may have missed where you clarified this.)

Process for Returning FTX Funds Announced

Evan R. Murphy · 2y · 14

If anyone consults a lawyer about this or starts the process with FTXrepay@ftx.us, it could be very useful to many of us if you followed up here and shared what your experience of the process was like.
