I recently co-authored a paper with Pablo Moreno, "Implications of Quantum Computing for Artificial Intelligence alignment research", which can be accessed through arXiv.

Our paper focuses on analyzing the interaction between Quantum Computing (QC) and the current landscape of research in Artificial Intelligence (AI) alignment, and we weakly conclude that knowledge of QC is unlikely to be helpful to address current bottlenecks in AI alignment.

In this post I intend to very briefly summarize the generator of the main arguments of the paper, convey the main conclusions and invite the reader to read the full report if they wish to get a deeper intuition or see our list of open questions.


It might be tempting to conclude that QC has important implications for AI alignment since there are some promising avenues of research in Quantum Machine Learning, so QC might end up being an integral component of future AI systems.

However, we argue that for the most part QC can be simplified away as a black-box accelerator that lets you exponentially speed up certain computations - the so-called quantum speedup. This is relevant because we believe that current alignment research should feel free to invoke oracles of this kind when discussing formal solutions to the field's problems, and worry about their concrete, efficient implementation later down the line.


The biggest challenge that QC poses for AI alignment is what we call quantum obfuscation - the fact that reading the contents of a quantum computer is hard to do classically, which may render useless some of the oversight mechanisms we might design.

However, most of the research agendas and problems AI alignment researchers are working on have little to do with the actual implementation of low-level oversight mechanisms, and focus instead on aligning the incentives of AI systems so that they cooperatively send information to their operators in an interpretable way.

Furthermore, there might be direct analogues of classical oversight in the quantum realm, so research conducted in this stage may be rescued later instead of wasted.


We have also looked into reasons why QC might be a good tool to solve some AI alignment subproblems, and identified a couple of cases. They are however not especially promising.

First, we identified the possibility of using access to quantum computing as an amplification of an overseer that verifies or provides the reward in a way hard to understand by an agent being verified - we call this exploiting quantum asymmetry.

Second, we might be able to exploit quantum isolation to monitor quantum agents - the fact that a quantum computer has to remain isolated to be able to achieve quantum speedups. This might point in the direction of a tripwire that would allow us to detect whether a system has interacted with the outside world without our consent. Although we have not looked into this in depth, we weakly argue against the possibility of an efficient scheme of this type.


Long story short, we do not believe that QC is a critical area of knowledge for advancing current research agendas of technical AI alignment, and I would weakly recommend against pursuing a career in it for this purpose or funding research in this intersection.

For the full discussion of our reasoning and a list of open questions, I refer the reader to our paper.

This post was written by Jaime Sevilla, summer fellow at the Future of Humanity Institute. I want to thank Pablo Moreno for working with me on this topic and his feedback on this summary.

Comments (4)



Maybe having a good understanding of Quantum Computing and how it could be leveraged in different paradigms of ML might help with forecasting AI timelines as well as dominant paradigms, to some extent?

If that was true, while not necessarily helpful for a single agenda, knowledge about quantum computing would help with the correct prioritization of different agendas.

I do agree with your assessment, and I would be medium excited about somebody informally researching what algorithms can be quantized to see if there is low-hanging fruit in terms of simplifying assumptions that could be made in a world where advanced AI is quantum-powered.

However, my current intuition is that there is not much sense in digging into this unless we are fairly confident that 1) we will have access to QC before TAI and 2) QC will be a core component of AI.

To give a bit more context to the article, Pablo and I originally wrote it because we disagreed on whether current research in AI Alignment would still be useful if quantum computing were a core component of advanced AI systems.

Had we concluded that quantum obfuscation threatened to invalidate some assumptions made by current research, we would have been more emphatic about the necessity of having quantum computing experts working on "safeguarding our research" on AI Alignment.

First off, I really appreciate the straightshooter conclusion of 'QC is unlikely to be helpful to address current bottlenecks in AI alignment.' even while you both spent many hours looking into it.


Second, I'm curious to hear any thoughts on the amateur speculation I threw at Pablo in a chat at the last AI Safety Camp:

Would quantum computing afford the mechanisms for improved prediction of the actions that correlated agents would decide on?

As a toy model, I'm imagining hundreds of almost-homogeneous reinforcement learning agents within a narrow distribution of slightly divergent maps of the state space, probability weightings/policies, and environmental inputs. Would current quantum computing techniques, assuming the hardware to run them on is available, be able to more quickly/precisely derive what % of those agents at, say, State1 would take Action1, Action2, or Action3?

I have a broad, vague sense that if that set-up works out, you could leverage it to create a 'regulator agent' for monitoring some 'multi-agent system' composed of quasi-homogeneous autonomous 'selfish agents' (e.g. each negotiating on behalf of their respective human interest group) that has a meaningful influence on our physical environment. This regulator would interface directly with a few of the selfish agents. If that subset of selfish agents is about to select Action1, it will predict what % of the other, slightly divergent algorithms would also decide on Action1. If the regulator predicts that an excessive number of Action1s will be taken – leading to reduced rewards to, or robustness of, the collective (e.g. a Tragedy of the Commons case of overutilisation of local resources) – it would override that decision by commanding a compensating number of the agents to instead select the collectively conservative Action2.

That's a lot of jargon, half of which I feel I have little clue about... But curious to read any arguments you have on how this would (not) work.

Would current quantum computing techniques, assuming the hardware to run them on is available, be able to more quickly/precisely derive what % of those agents at, say, State1 would take Action1, Action2, or Action3?

I think so! But I also think that you can do it easily with a bunch of GPUs. Let me explain: the idea is parallelizing the process of the agents and then just sampling from the agents. You can do that using "quantum parallelism", but I feel it will be simpler to just use GPUs for that.

I believe that you might be able to get some (polynomial, probably quadratic) speedup in the precision of the estimate using quantum resources, although I am not sure how useful that is.
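As a rough illustration of the classical baseline mentioned above, here is a minimal sketch of "just sampling from the agents" to estimate what share of them would take each action at State1. Everything in it is an assumption made for illustration: the names `make_agent` and `estimate_action_shares`, the shared base policy, and the Gaussian noise that makes the agents "slightly divergent" are all hypothetical and do not come from the paper. With N sampled agents the error of such an estimate shrinks roughly as 1/sqrt(N); a quadratic quantum speedup would correspond to an error shrinking roughly as 1/N in the number of queries.

```python
import random
from collections import Counter

ACTIONS = ("Action1", "Action2", "Action3")


def make_agent(rng, base_weights=(0.5, 0.3, 0.2), noise=0.05):
    """Build a toy agent: a slightly perturbed copy of a shared policy at State1.

    The base policy and the noise scale are hypothetical illustration values.
    """
    # Perturb each weight, clip away from zero, then renormalise to probabilities.
    weights = [max(w + rng.gauss(0.0, noise), 1e-6) for w in base_weights]
    total = sum(weights)
    probs = [w / total for w in weights]

    def act():
        # Sample one action from this agent's own policy.
        return rng.choices(ACTIONS, weights=probs, k=1)[0]

    return act


def estimate_action_shares(n_agents, seed=0):
    """Estimate the share of agents choosing each action by plain sampling."""
    rng = random.Random(seed)
    counts = Counter(make_agent(rng)() for _ in range(n_agents))
    return {action: counts[action] / n_agents for action in ACTIONS}


if __name__ == "__main__":
    # More agents give a tighter estimate of the population-level shares.
    for n in (100, 10_000):
        print(n, estimate_action_shares(n))
```

Each agent here is independent of the others, so the loop parallelises trivially, which is the sense in which a batch of GPUs would do the job.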
