AI Programme Officer at Longview Philanthropy and AI DPhil student at Oxford
The topline comparison between LLMs and superforecasters seems a bit unfair. You compare a single LLM's forecast against the median from a crowd of superforecasters. But we know the median from a crowd is typically more accurate than any particular member of the crowd. Therefore I think it'd be more fair to compare a single LLM to a single superforecaster, or a crowd of LLMs against a crowd of superforecasters. Do we know whether the best LLM is better than the best individual forecaster in your sample, or how the median LLM compares to the median forecaster?
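For intuition on why this matters, here's a minimal simulation of the "median beats the typical member" point. It assumes independent, similarly skilled forecasters and Brier scoring, which may not match your actual setup, so treat it as a toy illustration rather than a claim about your data:

```python
import numpy as np

rng = np.random.default_rng(0)
n_questions, n_forecasters = 1000, 30

# True probabilities for each question, and realized binary outcomes.
true_p = rng.uniform(0.05, 0.95, size=n_questions)
outcomes = rng.binomial(1, true_p)

# Each forecaster reports the true probability plus independent noise,
# clipped to stay inside (0, 1).
noise = rng.normal(0, 0.15, size=(n_forecasters, n_questions))
forecasts = np.clip(true_p + noise, 0.01, 0.99)

def brier(pred, outcome):
    return np.mean((pred - outcome) ** 2)

individual_scores = [brier(forecasts[i], outcomes) for i in range(n_forecasters)]
median_forecast = np.median(forecasts, axis=0)

print(f"Mean individual Brier score:  {np.mean(individual_scores):.4f}")
print(f"Best individual Brier score:  {np.min(individual_scores):.4f}")
print(f"Median-of-crowd Brier score:  {brier(median_forecast, outcomes):.4f}")
```

Under these (strong) independence assumptions, the median of the crowd scores noticeably better than the average individual, which is why a crowd-vs-crowd or individual-vs-individual comparison seems more apples-to-apples.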
(Nitpick aside, this is very interesting research, thanks for doing it.)
Thanks for this post—I agree on many of the key points. I was Longview's grant investigator on CAIP and, as I wrote in our official reply to CAIP (posted here), I wish there had been enough 501(c)(4) funding available to sustain CAIP. Unfortunately, funding for 501(c)(4) work remains scarce.
If anyone reading this is interested in contributing >$100K to 501(c)(4) policy advocacy or any other kind of work on AI safety, please feel free to reach out to me at aidan@longview.org. We've comprehensively reviewed the 501(c)(4) policy advocacy ecosystem and many other opportunities, and we'd be happy to offer detailed info and donation recommendations to potential large donors.
Agreed with the other answers on the reasons why there's no GiveWell for AI safety. But in case it's helpful, I should say that Longview Philanthropy offers advice to donors looking to give >$100K per year to AI safety. Our methodology is a bit different from GiveWell’s, but we do use cost-effectiveness estimates. We investigate funding opportunities across the AI landscape from technical research to field-building to policy in the US, EU, and around the world, trying to find the most impactful opportunities for the marginal donor. We also do active grantmaking, such as our calls for proposals on hardware-enabled mechanisms and digital sentience. More details here. Feel free to reach out to aidan@longview.org or simran@longview.org if you'd like to learn more.
Now, Anthropic, OpenAI, Google DeepMind, and xAI say their most powerful models might have dangerous biology capabilities and thus could substantially boost extremists—but not states—in creating bioweapons.
I think the "not states" part of this is incorrect in the case of OpenAI, whose Deep Research system card said: "Our evaluations found that deep research can help experts with the operational planning of reproducing a known biological threat, which meets our medium risk threshold."
One other potential suggestion: Organizers should consider focusing on their own career development rather than field-building if their timelines are shortening and they think they can have an impact sooner through direct work than through field-building. Personally, I regret much of the time I spent starting an AI safety club in college because it traded off against building skills and experience in direct work. I think my impact through direct work has been significantly greater than my impact through field-building, and I should've spent more time on direct work in college.
What about corporations or nation states during times of conflict - do you think it's accurate to model them as roughly as ruthless in pursuit of their own goals as future AI agents?
They don't have the same psychological makeup as individual people, they have a strong tradition and culture of maximizing self-interest, and they face strong incentives and selection pressures to maximize fitness (i.e. for companies to profit and for nation states to ensure their own survival) lest they be outcompeted by more ruthless competitors. While I'd expect these entities to show some care for goals besides self-interest, I think the most reliable predictor of their behavior is the maximization of their self-interest.
If they're roughly as ruthless as future AI agents, and we've developed institutions that somewhat robustly align their ambitions with pro-social action, then we should have some optimism that we can find similarly productive systems for working with misaligned AIs.
Human history provides many examples of agents with different values choosing to cooperate thanks to systems and institutions:
If two agents' utility functions are perfect inverses, then I agree that cooperation is impossible. But when agents share a preference for some outcomes over others, even if they disagree about the ordering of most outcomes, cooperation is possible. In such general-sum games, well-designed institutions can systematically promote cooperative behavior over conflict.
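To make that concrete, here's a toy sketch (the payoff numbers are entirely illustrative, not drawn from anything in the post): in a prisoner's-dilemma-style general-sum game, adding an enforceable penalty for unilateral defection, the kind of thing contracts and institutions provide, makes mutual cooperation a stable outcome that both players prefer.

```python
# Payoffs are (row player, column player); the numbers are illustrative only.
payoffs = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 4),
    ("defect",    "cooperate"): (4, 0),
    ("defect",    "defect"):    (1, 1),
}

# Without enforcement this is a prisoner's dilemma: defection is each player's
# best response to anything. Suppose an institution imposes a penalty of 3 on
# a player who defects against a cooperator (a contract with damages, say).
penalty = 3
with_contract = {
    actions: (
        r - penalty * (actions[0] == "defect" and actions[1] == "cooperate"),
        c - penalty * (actions[1] == "defect" and actions[0] == "cooperate"),
    )
    for actions, (r, c) in payoffs.items()
}

print(with_contract[("defect", "cooperate")])    # (1, 0): defecting against a cooperator no longer pays
print(with_contract[("cooperate", "cooperate")])  # (3, 3): the mutual gain is preserved
```

With the penalty in place, cooperating becomes each player's best response to cooperation, so mutual cooperation is an equilibrium that both sides prefer to mutual defection, even though their values never came to agree.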
Nice, I think this is a great perspective. One comment on "becoming known": I used to think trying to become well known was mostly zero-sum—you're just competing against other candidates for a fixed pool of jobs and marketing yourself to beat them. That's definitely part of the story, but it misses a key positive-sum benefit of becoming better known.
Employers have a pessimistic prior on job applicants and struggle to tell whether someone is truly excellent, so making your skills legible (e.g. by writing in public, getting credentials, or building relationships with domain experts who can vouch for you) allows employers to hire you when they otherwise wouldn't have the confidence to hire anyone. From an employer's perspective, one candidate known to be very good is often better than a pool of candidates who might have better skills in expectation but whose skills are extremely difficult to verify: you can actually hire the first person, whereas you might not be able to make a hire from the second pool.