The linked article is my research project for the AI Safety Fundamentals course. As I waited for active moderator approval of the preprint, Erich Grunewald beat me to writing a post with the same point. Luckily, our arguments largely complement each other, which is nicely symbolic of the whole debate.

In summary:

  • AI ethics is more focused on existing problems, AI safety on those arising in the near future. Since these communities build on different intellectual traditions, they view AI risks through different aesthetics.
    •  E.g. AI safety speaks in terms of utility functions, agents, and incentives, while AI ethics speaks in terms of fairness or accountability.
  • In terms of policy recommendations, these differences don't seem to matter. Unfortunately, the combination of political attention, tough competition within academia, and Big Tech involvement heightens suspicion and incentivizes the formation of intellectual coalitions, which often coalesce around aesthetics as the simplest common denominator.
    • This may lead to the feeling that people focused on different problems divert attention from what's really important.
  • However, this micropolitics masks the fact that, in practice, AI ethics and AI safety are highly complementary and both benefit from the shared spotlight of attention.
  • This was well demonstrated by the EU AI Act, as:
    1. The act demonstrated there is a consensus on the meta-principles that should guide AI policy.
      1. Particularly the principle that AI developers need to provide sufficient evidence that they are taking reasonable measures to ensure that their technology is beneficial.
      2. Therefore, there doesn't need to be an agreement regarding p(doom). If it's obvious that an advanced AI is extremely unlikely to cause a catastrophe, it should be easy to demonstrate. In such a case, the policy rightfully wouldn't slow AI development down. However, if an AI has destructive capabilities and there aren't arguments or experiments that can demonstrate safety, it's good if regulation poses an obstacle to its development.
    2. The act demonstrated that the two perspectives offer different reasons for a plethora of important policies.
      1. Both safety and fairness require corporate governance that ensures transparent, controllable algorithms and practical accountability for digital firms.
      2. Both perspectives highlight the harms caused by the social power of social media, addiction, manipulation, surveillance, social credit systems, and autonomous weapons.
    3. The act demonstrated that the increased attention around AI risks gives weight to the voices from both sides.
      1. Studies suggest that mentioning scientific uncertainty (stemming from known gaps in knowledge) and technical uncertainty (inherent to statistical models) in science communication has neutral or positive effects on trust in the source. However, consensus uncertainty, stemming from differences in opinion, has clearly negative effects on trust towards both sides of a debate, particularly if the debate is heated. Importantly, it's a question of framing whether uncertainty is presented as stemming from a gap in knowledge or from differences among scientists. Therefore, even merely friendlier relationships between the two communities seem beneficial for promoting policies in their intersection, since not knowing whom to trust fosters inaction.
      2. In practice, we might have witnessed this when a group of MEPs wrote their own version of the FLI open letter, which possibly sped up the EU AI Act, even though the group's members expressed skepticism towards the x-risk concerns in the FLI letter. This may have been enabled by the FLI letter's openness towards both AI ethics and AI safety-based concerns.
  • I ran a tiny survey (n=82) to explore how AI safety and AI ethics attitudes interact. I found that concern about AI safety and concern about AI bias correlated positively (r = .28). More importantly, when respondents were first asked to think about AI safety, their concern regarding AI bias was not lower on any of the measured dimensions (perceived significance, support for policy, support for research funding) - and vice versa. The interaction effect was significant in only one direction: people who were first asked about AI safety reported higher support for policy targeting AI bias (β = .23). A minimal sketch of this kind of analysis appears after this list.
  • Erich's post offers more evidence, based on web interest and the themes of other policy & funding initiatives.
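
For readers curious about how numbers like these could be computed, here is a minimal sketch of that kind of analysis in Python. It is not the author's actual code: the file name and column names (concern_safety, concern_bias, policy_support_bias, safety_first) are hypothetical, and the real operationalization may differ.

```python
# Minimal sketch of the survey analysis described above; NOT the author's
# actual code. File name and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# Assumed layout: one row per respondent, with
#   concern_safety, concern_bias  - Likert-scale concern ratings
#   policy_support_bias           - support for policy targeting AI bias
#   safety_first                  - 1 if the AI-safety block came first, else 0
df = pd.read_csv("survey_responses.csv")  # hypothetical file

# Correlation between concern for AI safety and concern for AI bias
r = df["concern_safety"].corr(df["concern_bias"])
print(f"Pearson r (safety vs. bias concern): {r:.2f}")

# Question-order effect on support for AI-bias policy: standardizing the
# outcome puts the coefficient on roughly the scale of a standardized beta.
df["z_policy"] = (
    df["policy_support_bias"] - df["policy_support_bias"].mean()
) / df["policy_support_bias"].std()
print(smf.ols("z_policy ~ safety_first", data=df).fit().summary())
```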
Comments



In terms of policy recommendations, these differences don't seem to matter.

Maybe I'm nitpicking, but I see this point often and I think it's a little too self-serving. There are definitely policy ideas in both spheres that trade off against the other's. E.g. many AI x-risk policy analysts (used to) want few players in order to reduce race dynamics, while such concentration of power would be bad for present-day harms. Or keeping significant chip production out of developing countries.

More generally, if governments really took x-risk seriously, they would be willing to sacrifice significant civil liberties, which wouldn't be acceptable at low x-risk estimates.

That's a good note. But it seems to me a little like pointing out that there's friction between a free-market policy and a pro-immigration policy because

a) Some pro-immigration policies would be anti-free market (e.g. anti-discrimination law)
b) Americans who support one tend to oppose the other

While that's true, philosophically the positions support each other, and most pro-free-market policies are presumably neutral or positive for immigration.

Similarly, you can endorse the principles that guide AI ethics while endorsing less popular solutions because of additional x-risk considerations. If there are disagreements, they aren't about moral principles but about empirical claims (x-risk clearly wouldn't be an outcome AI ethics proponents support). And the empirical claims themselves ("AI causes harm now" and "AI might cause harm in the future") support each other and correlated in my sample. My guess is that they actually correlate in academia as well.

It seems to me the negative effects of the concentration of power can be eliminated by other policies (e.g. the Digital Markets Act, the Digital Services Act, tax reforms).
