Oliver Sourbut

Oliver - or call me Oly: I don't mind which!

I'm particularly interested in sustainable collaboration and the long-term future of value. Currently based in London, I'm in my early-ish career working as a senior software engineer/data scientist, and doing occasional AI alignment work with SERI.

I'd love to contribute to a safer and more prosperous future with AI! Always interested in discussions about axiology, x-risks, s-risks.

I enjoy meeting new perspectives and growing my understanding of the world and the people in it. I also love to read - let me know your suggestions! Recently I've enjoyed:

  • Ord - The Precipice
  • Pearl - The Book of Why
  • Bostrom - Superintelligence
  • McCall Smith - The No. 1 Ladies' Detective Agency
  • Abelson & Sussman - Structure and Interpretation of Computer Programs
  • Stross - Accelerando

Cooperative gaming is a relatively recent but fruitful interest for me. Here are some of my favourites:

  • Hanabi (can't recommend enough; try it out!)
  • Pandemic (ironic at time of writing...)
  • Dungeons and Dragons (I DM a bit and it keeps me on my creative toes)
  • Overcooked (my partner and I enjoy the foodie themes and the frantic real-time coordination)

People who've got to know me only recently are sometimes surprised to learn that I'm a pretty handy trumpeter and hornist.

Comments

Critiques of EA that I want to read

Got it, I think you're quite right on one reading. I should have been clearer about what I meant, which is something like:

  • there is a defensible reading of that claim which maps to some negative utilitarian claim (without necessarily being a central example)
  • furthermore, I expect many people expressing such sentiments are motivated by a basically pretheoretic negative utilitarian insight

E.g. imagine a minor steelification (which loses the aesthetic and rhetorical strength) like "nobody's positive wellbeing (implicitly stemming from their freedom) can/should be celebrated until everyone has freedom (implicitly necessary to escape negative wellbeing)" which is consistent with some kind of lexical negative utilitarianism.

You're right that if we insist that 'freedom' be interpreted identically in both places (parsimonious, granted, though I think the symmetry is better explained by aesthetic/rhetorical concerns), another reading explicitly neglects the marginal benefit of lifting merely some people out of illiberty. That reading is only consistent with utilitarianism if we use an unusual aggregation theory (i.e. minimising) - though I have also seen this discussed under negative utilitarianism.

Anecdata: as someone whose (past) political background and involvement (waning!) are definitely some kind of lefty, and who, if it weren't for various x- and s-risks, would plausibly consider some form (my form, naturally!) of lefty politics to be highly important (if not highly tractable), my reading of that claim at least goes something like the first one. I might not be representative in that respect.

I have no doubt that many people expressing that kind of sentiment would still celebrate marginal 'releases', while considering it wrong to go further and celebrate the fruits of such freedom while ignoring others' lack of it.

Are too many young, highly-engaged longtermist EAs doing movement-building?

It's possible the selection bias is high, but I don't have good evidence for this besides personal anecdata. I don't know how many people are relevantly similar to me, and I don't know how representative we are of the latest EA 'freshers', since dynamics will change and I'm reporting with several years' lag.

Here's my personal anecdata.

Since 2016, around when I completed undergrad, I've been an engaged (not sure what counts as 'highly engaged') longtermist. (Before that point I had not heard of EA per se, but my motives were somewhat proto-EA and I wanted to contribute to 'sustainable flourishing at scale' and 'tech for good'.) Nevertheless, until 2020 or so I was relatively invisibly upskilling, reflecting on priorities, consuming advice and ideas, etc., and figuring out (perhaps too humbly and slowly) how to orient. More recently I've overcome some amount of impostor syndrome and have simultaneously become more 'community engaged' (hence visible) and started directly contributing to technical AI safety research.

If there are a lot with stories like that, they might form a large but quiet cohort countervailing your concern.

Having said that, I think what you express here is excellent to discuss; I think I may have been unusually quiet and cautious; I didn't encounter EA during undergrad; and I suspect (without justifying it here) that community dynamics have changed sufficiently that my anecdote is not IID with the cohort you're discussing.

On Deference and Yudkowsky's AI Risk Estimates

I just wanted to register agreement that a large number of people seem to have largely misread Death with Dignity, at least relative to what seems to me the most plausible intended message: it is mainly about the ethical injunctions (which are very important for a finitely rational and prone-to-rationalisation being), as Yudkowsky has written about in the past.

The additional detail of 'and by the way this is a bad situation and we are doing badly' is basically modal Yudkowsky schtick and I'm somewhat surprised it updated anyone's beliefs (about Yudkowsky's beliefs, and therefore their all-things-considered-including-deference beliefs).

I think if he had been a little more audience-aware he might have written it differently. Then again maybe not, if the net effect is more attention and investment in AI safety - and more recent posts and comments suggest he's more willing than before to use certain persuasive techniques to spur action (which seems potentially misguided to me, though understandable).

Blake Richards on Why he is Skeptical of Existential Risk from AI

I wrote something similar (with more detail) about the Gato paper at the time.

I don't think this is any evidence at all against AI risk though? It is maybe weak evidence against 'scaling is all you need' or that sort of thing.

Blake Richards on Why he is Skeptical of Existential Risk from AI

Thanks Rohin, I second almost all of this.

I'd be interested to hear more about why long-term credit assignment isn't needed for powerful AI. I think it depends on how you quantify those things, and I'm pretty unsure about this myself.

Is it because there are already loads of human-generated data which implicitly embody or contain enough long-term credit assignment? Or is it that long-term credit assignment is irrelevant for long-term reasoning? Or maybe long-term reasoning isn't needed for 'powerful AI'?

How I failed to form views on AI safety

OK, this is the terrible, terrible failure mode which I think we are both agreeing on (emphasis mine):

the perceived standard of "you have to think about all of this critically and by your own, and you will probably arrive to similar conclusions than others in this field"

By 'a sceptical approach' I basically mean 'the thing where we don't do that', because there is not enough epistemic credit in the field yet to expect all (tentative, not-yet-consensus) conclusions to be definitely right.

In traditional/undergraduate mathematics, it's different - almost always when you don't understand or agree with the professor, she is simply right and you are simply wrong or confused! This is a justifiable perspective based on the enormous epistemic weight of all the existing work on mathematics.

I'm very glad you call out the distinction between performing skepticism and actually doing it.

How I failed to form views on AI safety

I feel like while “superintelligent AI would be dangerous” makes sense if you believe superintelligence is possible, it would be good to look at other risk scenarios from current and future AI systems as well.

I agree, and I think there's a gap for thoughtful and creative folks with technical understanding to contribute to filling out the map here!

One person I think has made really interesting contributions here is Andrew Critch, for example on Multipolar Failure and Robust Agent-Agnostic Processes (I realise this is literally me sharing a link without much context, which was a conversation failure mode discussed in the OP, so feel free to pass on this). He has also made some attempts to discuss things more broadly, e.g. here. Critch isn't the only one.

How I failed to form views on AI safety

I’m fairly sure deep learning alone will not result in AGI

How sure? :)

What about some combination of deep learning (e.g. massive self-supervised learning) + within-context/episodic memory/state + procedurally-generated tasks + large-scale population-based training + self-play...? I'm just naming a few contemporary 'prosaic' practices which, to me, seem plausibly sufficient to produce AGI - enough so that the possibility warrants attention.

How I failed to form views on AI safety

I was one of the facilitators in the most recent run of EA Cambridge's AGI Safety Fundamentals course, and I also have professional DS/ML experience.

In my case I very deliberately emphasised a sceptical approach to engaging with all the material, while providing clarifications and corrections where people's misconceptions are the source of scepticism. I believe this was well-received by my cohort, all of whom appeared to engage thoughtfully and honestly with the material.

I think this is the best way to engage, when time permits, because (in brief):

  • many arguments invoke ill-defined terms, and we need to sharpen these
  • many arguments are (perhaps explicitly) speculative and empirically uncertain
  • even mathematically/empirically rigorous content has important modelling assumptions and experimental caveats
  • scepticism often produces better creative/generative engagement
  • collectively we will fail if our individual opinions are overly shaped by founder effects

I hope that this is a common perspective, but to the extent that it isn't, I wonder if this (especially the last point) may be a source of some of your confusing experiences.

I'd also say: it seems appropriate to have 'very messy views' if by that you mean uncertainty about where things are going and how to make them better! I think folks who don't are doing one of two things:

  • mistakenly concentrating more hypothesis weight than their observations/thinking in fact justify (which is a bad idea)
  • engaging in a thinking manoeuvre something like 'temporary MAP stance' or 'subjective probability matching' (which may be a good idea, if done transparently)

'Temporary MAP stance' or 'subjective probability matching'

MAP is Maximum A Posteriori, i.e. your best guess after considering the evidence. Probability matching means making actions/guesses in proportion to your estimate of each option being right (rather than always picking the single MAP choice).

By this manoeuvre I'm gesturing at a kind of behaviour where you are quite unsure about what's best (e.g. 'should I work on interpretability or demystifying deception?') and rather than allowing that to result in analysis paralysis, you temporarily collapse some uncertainty and make some concrete assumptions to get moving in one or other direction. Hopefully in so doing you a) make a contribution and b) grow your skills and collect new evidence to make better decisions/contributions next time.
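To make the contrast concrete, here's a minimal Python sketch (the two options are borrowed from my example above; the credences are made up purely for illustration). The MAP stance always acts on the single highest-credence option, whereas probability matching samples an option in proportion to your credences, so over repeated decisions your effort roughly tracks your uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative only: two candidate directions and made-up credences
# about which is the better use of my time.
options = ["interpretability", "demystifying deception"]
credences = np.array([0.6, 0.4])  # must sum to 1

# 'Temporary MAP stance': commit (for now) to the single most probable option.
map_choice = options[int(np.argmax(credences))]

# 'Subjective probability matching': sample a direction in proportion to credence.
matched_choice = rng.choice(options, p=credences)

print("MAP choice:", map_choice)
print("Probability-matched choice:", matched_choice)
```

Run the matching step repeatedly and the sampled choice will switch between the options roughly in proportion to the credences, while the MAP choice stays fixed until the credences themselves are updated.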
