Oliver Sourbut

PhD student (AI) @ University of Oxford
286 karma · Joined Sep 2020 · Pursuing a doctoral degree (e.g. PhD) · Working (6–15 years) · Oxford, UK



Call me Oliver or Oly - I don't mind which.

I'm particularly interested in sustainable collaboration and the long-term future of value. I'd love to contribute to a safer and more prosperous future with AI! Always interested in discussions about axiology, x-risks, s-risks.

I'm currently (2022) embarking on a PhD in AI in Oxford, and also spend time in (or in easy reach of) London. Until recently I was working as a senior data scientist and software engineer, and doing occasional AI alignment research with SERI.

I enjoy meeting new perspectives and growing my understanding of the world and the people in it. I also love to read - let me know your suggestions! In no particular order, here are some I've enjoyed recently:

  • Ord - The Precipice
  • Pearl - The Book of Why
  • Bostrom - Superintelligence
  • McCall Smith - The No. 1 Ladies' Detective Agency (and series)
  • Melville - Moby-Dick
  • Abelson & Sussman - Structure and Interpretation of Computer Programs
  • Stross - Accelerando
  • Simsion - The Rosie Project (and trilogy)

Cooperative gaming is a relatively recent but fruitful interest for me. Here are some of my favourites:

  • Hanabi (can't recommend enough; try it out!)
  • Pandemic (ironic at time of writing...)
  • Dungeons and Dragons (I DM a bit and it keeps me on my creative toes)
  • Overcooked (my partner and I enjoy the foodie themes and frantic real-time coordination playing this)

People who've got to know me only recently are sometimes surprised to learn that I'm a pretty handy trumpeter and hornist.


Sure, take it or leave it! I think for the field-building benefits it can look more obviously like an externality (though I-the-fundraiser would in fact be pleased and not indifferent, presumably!), but the epistemic benefits could easily accrue mainly to me-the-fundraiser (of course they could also benefit other parties).

How much of this is lost by compressing to something like: virtue ethics is an effective consequentialist heuristic?

I've bought into that idea for a long time. As Shaq says, 'Excellence is not a singular act, but a habit. You are what you repeatedly do.'

We can also make analogies to martial arts, music, sports, and other practice/drills, and to aspects of reinforcement learning (artificial and natural).

Simple, clear, thought-provoking model. Thanks!

I also faintly recall hearing something similar in this vicinity: apparently some volunteering groups get zero (or less!?) value from many/most volunteers, but engaged volunteers dominate donations, so it's worthwhile bringing in volunteers and training them! (citation very much needed)

Nitpick: are these 'externalities'? I'd have said, 'side effects'. An externality is a third-party impact from some interaction between two parties. The effects you're describing don't seem to be distinguished by being third-party per se (I can imagine glossing them as such but it's not central or necessary to the model).

Yeah. I also sometimes use 'extinction-level' if I expect my interlocutor not to already have a clear notion of 'existential'.

Point of information: at least half the funding comes from Schmidt Futures (not OpenAI), though OpenAI are publicising and administering it.

Another high(er?) priority for governments:

  • start building multilateral consensus and preparations on what to do if/when
    • AI developers go rogue
    • AI leaked to/stolen by rogue operators
    • AI goes rogue

I think this is a good and useful post in many ways, in particular laying out a partial taxonomy of differing pause proposals and gesturing at their grounding and assumptions. What follows is a mildly heated response I had a few days ago, whose heatedness I don't necessarily endorse but whose content seems important to me.

Sadly this letter is full of thoughtless remarks about China and the US/West. Scott, you should know better. Words have power. I recently wrote an admonishment to CAIS for something similar.

The biggest disadvantage of pausing for a long time is that it gives bad actors (eg China) a chance to catch up.

There are literal misanthropic 'effective accelerationists' in San Francisco, some of whose stated purpose is to train/develop AI which can surpass and replace humanity. There's Facebook/Meta, whose leaders and executives have been publicly pooh-poohing discussion of AI-related risks as pseudoscience for years, and whose actual motto is 'move fast and break things'. There's OpenAI, which with great trumpeting announces its 'Superalignment' strategy without apparently pausing to think, 'But what if we can't align AGI in 5 years?'. We don't need to invoke bogeyman 'China' to make this sort of point. Note also that the CCP (along with EU and UK gov) has so far been more active in AI restraint and regulation than, say, the US government, or orgs like Facebook/Meta.

Suppose the West is right on the verge of creating dangerous AI, and China is two years away. It seems like the right length of pause is 1.9999 years, so that we get the benefit of maximum extra alignment research and social prep time, but the West still beats China.

Now, this was in the context of paraphrases of others' positions on a pause in AI development, so it's at least slightly mention-flavoured (as opposed to use). But as far as I can tell, the precise framing here has been introduced in Scott's retelling.

Whoever introduced this formulation, this is bonkers in at least two ways. First, who is 'the West' and who is 'China'? This hypothetical frames us as hivemind creatures in a two-player strategy game with a single lever. Reality is a lot more porous than that, in ways which matter (strategically and in terms of outcomes). I shouldn't have to point this out, so this is a little bewildering to read. Let me reiterate: governments are not currently pursuing advanced AI development, only companies. The companies are somewhat international, mainly headquartered in the US and UK but also to some extent China and EU, and the governments have thus far been unwitting passengers with respect to the outcomes. Of course, these things can change.

Second, actually think about the hypothetical where 'we'[1] are 'on the verge of creating dangerous AI'. For sufficient 'dangerous', the only winning option for humanity is to take the steps we can to prevent, or at least delay[2], that thing coming into being. This includes advocacy, diplomacy, 'aggressive diplomacy' and so on. I put forward that the right length of pause then is 'at least as long as it takes to make the thing not dangerous'. You don't win by capturing the dubious accolade of nominally belonging to the bloc which directly destroys everything! To be clear, I think Scott and I agree that 'dangerous AI' here is shorthand for, 'AI that could defeat/destroy/disempower all humans in something comparable to an extinction event'. We already have weak AI which is dangerous to lesser levels. Of course, if 'dangerous' is more qualified, then we can talk about the tradeoffs of risking destroying everything vs 'us' winning a supposed race with 'them'.

I'm increasingly running with the hypothesis that many anglophones are mind-killed on the inevitability of contemporary great power conflict in a way which I think wasn't the case even, say, 5 years ago. Maybe this is how thinking people felt in the run-up to WWI, I don't know.

I wonder if a crux here is some kind of general factor of trustingness toward companies vs toward governments - I think extremising this factor would change the way I talk and think about such matters. I notice that a lot of American libertarians seem to have a warm glow around 'company/enterprise' that they don't have around 'government/regulation'.

[ In my post about this I outline some other possible cruxes and I'd love to hear takes on these ]

Separately, I've got increasingly close to the frontier of AI research and AI safety research, and the challenge of ensuring these systems are safe remains very daunting. I think some policy/people-minded discussions are missing this rather crucial observation. If you expect it to be easy (and expect others to expect that) to control AGI, I can see more why people would frame things around power struggles and racing. For this reason, I consider it worthwhile repeating: we don't know how to ensure these systems will be safe, and there are some good reasons to expect that they won't be by default.

I repeat that the post as a whole is doing a service and I'm excited to see more contributions to the conversation around pause and differential development and so on.

  1. Who, me? You? No! Some development team at DeepMind or OpenAI, presumably, or one of the current small gaggle of other contenders, or a yet-to-be-founded lab. ↩︎

  2. If it comes to it, extinction an hour later is better than an hour sooner. ↩︎

I think that the best work on AI alignment happens at the AGI labs

Based on your other discussion e.g. about public pressure on labs, it seems like this might be a (minor?) load-bearing belief?

I appreciate that you qualify this further in a footnote

This is a controversial view, but I’d guess it’s a majority opinion amongst AI alignment researchers.

I just wanted to call out that I weakly hold the opposite position, and also opposite best guess on majority opinion (based on safety researchers I know). Naturally there are sampling effects!

This is a marginal sentiment, and I certainly wouldn't trade all lab researchers for non-lab researchers or vice versa. Diversification of research settings seems quite precious, and the dialogue is important to preserve.

I also question

Reasons include: access to the best alignment talent,

because a lot of us are very reluctant to join AGI labs, for obvious reasons! I know folks inside and outside of AGI labs, and it seems to me that the most talented are among the outsiders (though this could definitely be an artefact of sample sizes).

This is an exemplary and welcome response: concise, full-throated, actioned. Respect, thank you Aidan.

Sincerely, I hope my feedback was all-things-considered good from your perspective. As I noted in this post, I felt my initial email was slightly unkind at one point, but I'm overall glad I shared it - and glad you appreciated my getting exercised about this, even over a few paragraphs!

It’s important to discuss national AI policies which are often explicitly motivated by goals of competition without legitimizing or justifying zero-sum competitive mindsets which can undermine efforts to cooperate.

Yes, and I repeat that the CAIS newsletter strikes a good balance of nuance, correctness, helpfulness, and reach. Hopefully your example here sets the tone for conversations in this space!

(Prefaced with the understanding that your comment is to some extent devil's advocating and this response may be too)

both the US and Chinese governments have the potential to step in when corporations in their country get too powerful

What is 'step in'? I think when people are describing things in aggregated national terms without nuance, they're implicitly imagining govts either already directing, or soon/inevitably appropriating and directing (perhaps to aggressive national interest plays). But govts could just as readily regulate and provide guidance on underprovisioned dimensions (like safety and existential risk mitigation). Or they could in fact be powerless, or remain basically passive until too late, or... (all live possibilities to me).

In these alternative cases, the kind of language and thinking I'm highlighting in the post seems like a sort of nonsense to me - like it doesn't really parse unless you tacitly assume some foregone conclusions.
