Peter

658 karmaJoined Aug 2021Working (0-5 years)

Message

Interests:

Effective altruism art and fictionAI governanceS-riskBioethicsElectoral politics

Bio

Participation
2

Interested in AI safety talent search and development.

How others can help me

Discuss charity entrepreneurship ideas, nuts & bolts.
Robustly good opportunities for positive impact in short AI timeline worlds <3 years
Connect me with peers, partners, or cowriters for research or fiction.

How I can help others

Making and following through on specific concrete plans.

Posts
8

Sorted by New

Contest for Better AGI Safety Plans

Peter

· 11d ago · 10m read

Has anyone done research on why Covid-19 didn't become a bigger warning shot for future pandemic prevention?

Peter

· 4mo ago · 1m read

Over 4 billion people don't have access to clean drinking water at home, more than 2x previous estimates

Peter

· 11mo ago · 1m read

"We can Prevent AI Disaster Like We Prevented Nuclear Catastrophe"

Peter

· 2y ago · 1m read

How to Reform Effective Altruism after SBF Vox interview with Holden Karnofsky 1/23/2023

Peter

· 2y ago · 2m read

EARadio Interview: Could treating depression be a top intervention?

Peter

· 3y ago · 1m read

EARadio Returns - Suggest Episodes and Shoutouts

Peter

· 3y ago · 1m read

Carrick Flynn Results and Additional Ideas for Passing Pandemic Prevention Policy

Peter

· 3y ago · 3m read

Comments
140

Topic contributions
2

Lessons from the Iraq War for AI policy

Peter1h1

Would be curious to hear your thoughts on this as one strategy for eliciting better plans

Lessons from the Iraq War for AI policy

Peter3d3

Do you have ideas about how we could get better plans?

Why AI Safety Needs a Centralized Plan - And What It Might Look Like

Peter2mo3

I've actually been working on how to get better AI safety plans so I would be keen to chat with anyone who is interested in this. I think the best plan so far (covering the Alignment side) is probably Google's [2504.01849] An Approach to Technical AGI Safety and Security. On the more governance side, one of the most detailed one's is probably AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions

kuhanj's Quick takes

Peter2mo18

I've been thinking about coup risks more lately so would actually be pretty keen to collaborate or give feedback on any early stuff. There isn't much work on this (for example, none at RAND as far as I can tell).

I think EAs have frequently suffered from a lack of expertise, which causes pain in areas like politics. Almost every EA and AI safety person was way off on the magnitude of change a Trump win would create - gutting USAID easily dwarfs all of EA global health by orders of magnitude. Basically no one took this seriously as a possibility, or at least I do not know of anyone. And it's not like you'd normally be incentivized to plan for abrupt major changes to a longstanding status quo in the first placce.

Oversimplification of neglectedness has definitely been an unfortunate meme for a while. Sometimes things are too neglected to make progress or don't make sense for your skillset, or are neglected for a reason, or just less impactful. To a lesser extent, I think there has been some misuse/misunderstanding of counterfactual thinking as well instead of Shapley additives. Or being overly optimistic "our few week fellowship can very likely change someone's entrenched career path" if they haven't strongly shown that as their purpose for participating.

Definitely agree we have a problem with deference/not figuring things out. It's hard and there's lots of imposter syndrome where people think they aren't good enough to do this or try to do it. I think sometimes people get early negative feedback and over-update, dropping projects before they've tested things to see results. I would definitely like to see more rigorous impact evaluation in the space. At one point I wanted to start an independent org that did this. It seems surprisingly underprioritized. There's a meme that EAs like to think and research and need to just do more things, but I think it's a bit of a false dichotomy and on net more research + iteration is valuable and amplifies your effectiveness, making sure you're prioritizing the right things in the right ways.

Another way deference expresses negative effects is that established orgs act as whirlpools that suck up all the talent and offer more "legitimacy" including frontier AI companies, but I think they're often not the highest impact thing you could do. Often there is something that would be impactful but won't happen if you don't do it. Or would happen worse. Or happen way later. People also underestimate how much the org they work at will change how they think and what they think about and what they want to do or are willing to give up. But finding alternatives can be tough - how many people really want to continue working as independent contractors with no benefits and no coworkers indefinitely? it's very adverse selection against impact. Sure, this level of competition might weed out some worse ideas but also good ones.

Genes did misalignment first: comparing gradient hacking and meiotic drive

Peter3mo1

This makes me wonder if there could be good setups for evaluating AI systems as groups. You could have separate agent swarms in different sandboxes competing on metrics of safety and performance. The one that does better gets amplified. The agents may then have some incentive to enforce positive social norms for their group against things like sandbagging or deception. When deployed they might have not only individual IDs but group or clan IDs that tie them to each other and continue this dynamic.

Maybe there is some mechanism where membership gets shuffled around sometimes the way alleles do between genes. Or traits of the systems, though that seems less clearly desirable. There are already algorithms to imitate genetic recombination but that would be somewhat different. You could also combine social group membership systems and trait recombination systems potentially. Given the level of influence over AIs, it might be somewhat closer to selective breeding in certain respects but not entirely.

Protesting Now for AI Regulation might be more Impactful than AI Safety Research

Peter3mo14

I'm not sure the policies have been mostly worked out but not implemented. Figuring out technical AI governance solutions seems like a big part of what is needed.

On January 1, 2030, there will be no AGI (and AGI will still not be imminent)

Peter3mo14

That's a really broad question though. If you asked something like, which system unlocked the most real-world value in coding, people would probably say the jump to a more recent model like o3-mini or Gemini 2.5

You could similarly argue the jump from infant to toddler is much more profound in terms of general capabilities than college student to phd but the latter is more relevant in terms of unlocking new research tasks that can be done.

What's the Difference between the AI Threat and the Multinational Mega Corporation?

Peter4mo2

States and corporations rely on humans. They have no incentive to get rid of them. AGI would mean you don't have to rely on humans. So AGIs or people using AGIs might not care about humans anymore or even see them as an obstacle.
States and corporations aren't that monolithic; they are full of competing factions and people who often fail to coordinate or behave rationally. AGI will probably be much better at strategizing and coordinating.
States and corporation are constrained by balance of power with other states/corporations/actors. Superhuman AIs might not have this problem if they exceed all of humanity combined or might think they have more in common with each other than with humans.

Ozzie Gooen's Quick takes

Peter4mo3

Hmm maybe it could still be good to try things in case timelines are a bit longer or an unexpected opportunity arises? For example, what if you thought it was 2 years but actually 3-5?

Orienting to 3 year AGI timelines

Peter7mo1

So it seems like you're saying there are at least two conditions: 1) someone with enough resources would have to want to release a frontier model with open weights, maybe Meta or a very large coalition of the opensource community if distributed training continues to scale, 2) it would need at least enough dangerous capability mitigations like unlearning and tamper resistant weights or cloud inference monitoring, or be behind the frontier enough so governments don't try to stop it. Does that seem right? What do you think is the likely price range for AGI?

I'm not sure the government is moving fast enough or interested in trying to lock down the labs too much given it might slow them down more than it increases their lead or they don't fully buy into risk arguments for now. I'm not sure what the key factors to watch here are. I expected reasoning systems next year, but it seems like even open weight ones were released this year that seem around o1 preview level just a few weeks after, indicating that multiple parties are pursuing similar lines of AI research somewhat independently.

Peter

Bio

Participation2

How others can help me

How I can help others

Posts 8

Comments140

Topic contributions2

Participation
2

Posts
8

Comments
140

Topic contributions
2