All of Oliver Sourbut's Comments + Replies

A nit

lifestyle supports the planet, rather than taking from it

appeals to me, and I'm sure to some others, but (I sense) it could come across with a particular political-tribal flavour, which you might want to try neutralising. (Or not, if that'd detract from the net appeal!)

2
Myles Stremick
This reminds me of when I pitched this idea to my brother and he said "I don't like the framing that a human life is negative and needs to be made up for." and I clarified "I think on the whole our lives are good and nothing to be ashamed about, but there are particular areas of our life that cause harm that we can make up for. Not that the whole life needs to be made up for."  I do think having some but not an overwhelming amount of guilt-based messaging is useful here, but I'll reconsider this line and see if it's too much. 

On point 1 (space colonization), I think it's hard and slow! So the same issue as with bio risks might apply: AGI doesn't get you this robustness quickly for free. See other comment on this post.

I like your point 2 about chancy vs merely uncertain. I guess a related point is that when the 'runs' of the risks are in some way correlated, having survived once is evidence that survivability is higher. (Up to and including the fully correlated 'merely uncertain' extreme?)

For clarity, you're using 'important' here in something like an importance x tractability x neglectedness factoring? So yes more important (but there might be reasons to think it's less tractable or neglected)?

2
Toby_Ord
Yeah, I mean 'more valuable to prevent', before taking into account the cost and difficulty.

I've been meaning to write something about 'revisiting the alignment strategy'. Section 5 here ('Won't AGI make post-AGI catastrophes essentially irrelevant?') makes the point very clearly:

On this view, a post-AGI world is nearly binary—utopia or extinction—leaving little room for Sisyphean scenarios.

But I think this is too optimistic about the speed and completeness of the transition to globally deployed, robustly aligned "guardian" systems.

without making much of a case for it. Interested in Will and reviewers' sense of the space and literature here.

2
Toby_Ord
I've often been frustrated by this assumption over the last 20 years, but don't remember any good pieces about it. It may be partly from Eliezer's first alignment approach being to create a superintelligent sovereign AI, where if that goes right, other risks really would be dealt with.

Yep, definitely for me 'big civ setbacks are really bad' was already baked in from the POV of setting bad context for pre-AGI-transition(s) (as well as their direct badness). But while I'd already agreed with Will about post-AGI not being an 'end of history' (in the sense that much remains uncertain re safety), I hadn't thought through the implication that setbacks could force a rerun of the most perilous transition(s), which does add some extra concern.

A small aside: some put forth interplanetary civilisation as a partial defence against both total destruction and 'setback'. But reaching the milestone of having a really robustly interplanetary civ might itself take quite a long time after AGI - especially if (like me) you think digital uploading is nontrivial.

(This abstractly echoes the suggestion in this piece that bio defence might take a long time, which I agree with.)

3
William_MacAskill
I agree with this. One way of seeing that is to ask: how many doublings of energy consumption can civilisation have before it needs to move beyond the solar system? The answer is about 40 doublings. Which, depending on your views on just how fast explosive industrial expansion goes, could be a pretty long time, e.g. decades.
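A rough sanity check on that figure, as a sketch with assumed round numbers (roughly 20 TW of current use and the Sun's full output as the ceiling - not figures taken from the comment):

```python
# Back-of-envelope: how many doublings of energy use fit within the solar system,
# assuming the ceiling is capturing the Sun's entire output (Dyson-swarm style)?
# Both figures below are assumptions for illustration.
import math

current_use_w = 2e13      # ~20 TW: rough current civilisational energy use
solar_output_w = 3.8e26   # total solar luminosity

doublings = math.log2(solar_output_w / current_use_w)
print(f"~{doublings:.0f} doublings")  # ~44, the same ballpark as 'about 40'
```

With a different choice of ceiling (e.g. only the sunlight a smaller swarm intercepts), the answer shifts by a handful of doublings, but the ballpark is robust.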

Some gestures which didn't make the cut as they're too woolly or not quite the right shape:

  • adversarial exponentials might force exponential expense per gain
    • e.g. combatting replicators
    • e.g. brute forcing passwords
  • many empirical 'learning curve' effects appear to consume exponential observations per increment
    • Wright's Law (which is the more general cousin of Moore's Law) requires exponentially many production iterations per incremental efficiency gain (sketched below)
    • Deep learning scaling laws appear to consume exponential inputs per incremental gain
    • AlphaCode and A
... (read more)
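To make the Wright's Law bullet above concrete, here is a minimal sketch; the power-law form and the constants are illustrative assumptions, not empirical fits:

```python
# Wright's Law: unit cost falls as a power law of cumulative production,
#   cost = a * N**(-b)
# so each further halving of cost requires multiplying cumulative production
# by a constant factor 2**(1/b) -- exponentially many units per increment.

def units_needed_for_cost(target_cost: float, a: float = 100.0, b: float = 0.3) -> float:
    """Invert cost = a * N**(-b) to get the cumulative units N for a target cost."""
    return (a / target_cost) ** (1.0 / b)

cost = 50.0
for halving in range(1, 5):
    cost /= 2
    print(f"halving {halving}: cumulative units ~ {units_needed_for_cost(cost):.2e}")
# Each successive halving multiplies the required cumulative production by
# 2**(1/b), here roughly 10x.
```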

This is lovely, thank you!

My main concern would be that it takes the same very approximating stance as much other writing in the area, conflating all kinds of algorithmic progress into a single scalar 'quality of the algorithms'.

You do moderately well here, noting that the most direct interpretation of your model regards speed or runtime compute efficiency, yielding 'copies that can be run' as the immediate downstream consequence (and discussing in a footnote the relationship to 'intelligence'[1] and the distinction between 'inference' and training compute... (read more)

Glad to hear it! Any particular thoughts or suggestions? (Consider applying, or telling colleagues and friends you think would be a good fit!)

On this note, the Future of Life Foundation (headed by Anthony Aguirre, mentioned in this post) is today launching a fellowship on AI for Human Reasoning.

Why? Whether you expect gradual or sudden AI takeoff, and whether you're afraid of gradual or acute catastrophes, it really matters how well-informed, clear-headed, and free from coordination failures we are as we navigate into and through AI transitions. Just the occasion for human reasoning uplift!

12 weeks, $25-50k stipend, mentorship, and potential pathways to future funding and impact. Applications close June 9th.

(cross-posted on LW)

Love this!

As presaged in our verbal discussion, my top conceptual complement would be to emphasise exploration/experimentation as central to the knowledge production loop - the cycle of 'developing good taste to plan better experiments to improve taste (and planning model)' is critical (indispensable?) to 'produce new knowledge which is very helpful by the standards of human civilization' (on any kind of meaningful timescale).

This is because just flailing, or even just 'doing stuff', gets you some novelty of observations, but directedly see... (read more)

I like this decomposition!

I think 'Situational Awareness' can quite sensibly be further divided up into 'Observation' and 'Understanding'.

The classic control loop of 'observe', 'understand', 'decide', 'act'[1], is consistent with this discussion, where 'observe'+'understand' here are combined as 'situational awareness', and you're pulling out 'goals' and 'planning capacity' as separable aspects of 'decide'.
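As a toy illustration of that four-stage loop (purely a sketch: a thermostat stands in for the agent, everything here is assumed for illustration, and 'decide' folds together a goal and a one-step plan):

```python
# observe -> understand -> decide -> act, as a runnable toy.
import random


def observe(true_temp: float) -> float:
    """Noisy sensor reading (observe)."""
    return true_temp + random.gauss(0.0, 0.5)


def understand(estimate: float, reading: float, alpha: float = 0.3) -> float:
    """Fold the reading into a running state estimate (understand)."""
    return (1 - alpha) * estimate + alpha * reading


def decide(estimate: float, goal_temp: float) -> str:
    """Choose an action given the goal and the estimated situation (decide)."""
    if estimate < goal_temp - 0.5:
        return "heat"
    if estimate > goal_temp + 0.5:
        return "cool"
    return "idle"


def act(true_temp: float, action: str) -> float:
    """Apply the action to the world (act)."""
    return true_temp + {"heat": 0.4, "cool": -0.4, "idle": 0.0}[action]


true_temp, estimate, goal = 15.0, 15.0, 20.0
for _ in range(30):
    reading = observe(true_temp)
    estimate = understand(estimate, reading)
    true_temp = act(true_temp, decide(estimate, goal))
print(f"final estimate ~{estimate:.1f} vs goal {goal}")
```

In this factoring, 'situational awareness' corresponds to observe + understand, while goals and planning capacity live inside decide.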

Are there some difficulties with factoring?

Certain kinds of situational awareness are more or less fit for certain goals. And further, the important 're... (read more)

A little followup:

I took part in the inaugural SERI MATS programme in 2021-2022 (where incidentally I interacted with Richard), and started an AI Safety PhD at Oxford in 2022.

I've been working for the AI Safety Institute (UK Gov) since Jan 2024 as a hybrid technical expert, utilising my engineering and DS background alongside AI/ML research and threat modelling. Likely to continue such work, there or elsewhere. Unsure if I'll finish my PhD in the end, as a result, but I don't regret it: I produced a little research, met some great collaborators, and had fun w... (read more)

FWIW I work at the AI Safety Institute UK and we're considering a range of both misuse and misalignment threats, and there are a lot of smart folks on board taking things pretty seriously. I admit I... don't fully understand how we ended up in this situation and it feels contingent and precious, as does the tentative international consensus on the value of cooperation on safety (e.g. the Bletchley declaration). Some people in government are quite good, actually!

Sure, take it or leave it! I think for the field-building benefits it can look more obviously like an externality (though I-the-fundraiser would in fact be pleased and not indifferent, presumably!), but the epistemic benefits could easily accrue mainly to me-the-fundraiser (of course they could also benefit other parties).

How much of this is lost by compressing to something like: virtue ethics is an effective consequentialist heuristic?

I've been bought into that idea for a long time. As Shaq says, 'Excellence is not a singular act, but a habit. You are what you repeatedly do.'

We can also make analogies to martial arts, music, sports, and other practice/drills, and to aspects of reinforcement learning (artificial and natural).

6
Stefan_Schubert
It doesn't just say that virtue ethics is an effective consequentialist heuristic (if it says that) but also has a specific theory about the importance of altruism (a virtue) and how to cultivate it. There's not been a lot of systematic discussion on which specific virtues consequentialists or effective altruists should cultivate. I'd like to see more of it. @Lucius Caviola and I have written a paper where we put forward a specific theory of which virtues utilitarians should cultivate. (I gave a talk along similar lines here.) We discuss altruism but also five other virtues.

Simple, clear, thought-provoking model. Thanks!

I also faintly recall hearing something similar in this vicinity: apparently some volunteering groups get zero (or less!?) value from many/most volunteers, but engaged volunteers dominate donations, so it's worthwhile bringing in volunteers and training them! (citation very much needed)

Nitpick: are these 'externalities'? I'd have said, 'side effects'. An externality is a third-party impact from some interaction between two parties. The effects you're describing don't seem to be distinguished by being third-party per se (I can imagine glossing them as such but it's not central or necessary to the model).

2
Larks
Interesting argument about 'side effects' vs 'externalities'. I was assuming that organizations/individuals were being 'selfishly' rational, and assuming that a relatively small fraction of things like the field-building effects would benefit the specific organization doing the field-building. But 'side effects' does seem like it might be more accurate, so possibly I should adjust the title.

Yeah. I also sometimes use 'extinction-level' if I expect my interlocutor not to already have a clear notion of 'existential'.

Point of information: at least half the funding comes from Schmidt Futures (not OpenAI), though OpenAI are publicising and administering it.

Another high(er?) priority for governments:

  • start building multilateral consensus and preparations on what to do if/when
    • AI developers go rogue
    • AI leaked to/stolen by rogue operators
    • AI goes rogue

I think this is a good and useful post in many ways, in particular laying out a partial taxonomy of differing pause proposals and gesturing at their grounding and assumptions. What follows is a mildly heated response I had a few days ago, whose heatedness I don't necessarily endorse but whose content seems important to me.

Sadly this letter is full of thoughtless remarks about China and the US/West. Scott, you should know better. Words have power. I recently wrote an admonishment to CAIS for something similar.

The biggest disadvantage of pausing for a long

... (read more)

I think that the best work on AI alignment happens at the AGI labs

Based on your other discussion e.g. about public pressure on labs, it seems like this might be a (minor?) loadbearing belief?

I appreciate that you qualify this further in a footnote

This is a controversial view, but I’d guess it’s a majority opinion amongst AI alignment researchers.

I just wanted to call out that I weakly hold the opposite position, and also opposite best guess on majority opinion (based on safety researchers I know). Naturally there are sampling effects!

This is a margi... (read more)

1
AnonResearcherMajorAILab
Yes, if I changed my mind about this I'd have to rethink my position on public advocacy. I'm still pretty worried about the other disadvantages so I suspect it wouldn't change my mind overall, but I would be more uncertain.

This is an exemplary and welcome response: concise, full-throated, actioned. Respect, thank you Aidan.

Sincerely, I hope my feedback was all-considered good from your perspective. As I noted in this post, I felt my initial email was slightly unkind at one point, but I am overall glad I shared it - you appreciate my getting exercised about this, even over a few paragraphs!

It’s important to discuss national AI policies which are often explicitly motivated by goals of competition without legitimizing or justifying zero-sum competitive mindsets which can unde

... (read more)

(Prefaced with the understanding that your comment is to some extent devil's advocating and this response may be too)

both the US and Chinese governments have the potential to step in when corporations in their country get too powerful

What is 'step in'? I think when people are describing things in aggregated national terms without nuance, they're implicitly imagining govts either already directing, or soon/inevitably appropriating and directing (perhaps to aggressive national interest plays). But govts could just as readily regulate and provide guidance... (read more)

Thanks Ben!

Please don't take these as endorsements that this thinking is correct, just that it's what I see when I inspect my instincts about this

Appreciated.

These psychological (and real) factors seem very plausible to me for explaining why mistakes in thinking and communication are made.

maybe we can think of the US companies as simultaneously closer friends and closer enemies with each other?

Mhm, this seems less lossy as a hypothetical model. Even if they were only 'closer friends', though, I don't think it's at all clearcut enough for it to be a... (read more)

Just in case we're out of sync, let's briefly refocus on some object details

China has made several efforts to preserve their chip access, including smuggling, buying chips that are just under the legal limit of performance, and investing in their domestic chip industry.

Are you aware of the following?

  • the smuggling was done by... smugglers
  • the buying of chips under the limit was done by multiple suppliers in China
  • the selling of chips under the limit was done by Nvidia (and perhaps others)
  • the investment in China's chip industry was done by the CCP

If... (read more)

1
Gerald Monroe
What makes it a foregone conclusion is that race dynamics are powerfully convergent. Actions that would cause a party to definitely lose a race have feedback. Over time, multiple competing agents will choose winning strategies, and others will copy those, leading to strategy mirroring. Certain forms of strategy (like nationalizing all the AI labs) are also convergent and optimal. And say a party fails to play optimally: they then observe they are losing, and are forced to choose optimal play in order to lose less. So my seeming overconfidence is because I am convinced the overall game will force all these disparate, uncoordinated individual events to converge on what it must.

I expect there are several views, but let's look at the bioweapon argument for a second. In what computers can the "escaped" AI exist? There is no biosphere of computers. You need at least (1600 Gb x 2 / 80 x 2) = 80 H100s to host a GPT-4 instance. The real number is rumored to be about 128. And that's a subhuman AGI at best, without vision and other critical features. How many cards will a dangerous ASI need to exist? I won't go into the derivation here, but I think the number is > 10,000, and they must be in a cluster with high-bandwidth interconnects.

As for the second part, "how are we going to use it as a stick": simple. If you are unconcerned with the AI "breaking out", you train and try a lot of techniques, and only use "in production" (industrial automation, killer robots etc.) the most powerful model you have that is measurably reliable and efficient and doesn't engage in unwanted behavior. None of the bad AIs ever escape the lab; there's nowhere for them to go.

Note that might be a different story in 2049, which is when Moore's law would put a single GPU at the power of 10,000 of today's. It likely can't continue that long - exponentials stop - but maybe computers built with computronium printed off a nanoforge. But we don't have any of that, and won
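Unpacking the "(1600 Gb x 2 / 80 x 2) = 80" arithmetic above as a sketch (the figures are the commenter's rumoured numbers, and reading "1600 Gb" as roughly 1.6 trillion parameters is my assumption):

```python
# Rough GPU count to host a large model: parameters * bytes-per-parameter,
# divided by per-GPU memory, with an overhead factor for activations / KV cache.
# All inputs are rumoured or assumed, not confirmed specs.

params_billions = 1600     # assumed ~1.6T parameters
bytes_per_param = 2        # fp16 / bf16 weights
gpu_memory_gb = 80         # H100 memory
overhead_factor = 2        # rough allowance for activations, KV cache, etc.

weights_gb = params_billions * bytes_per_param        # ~3200 GB of weights
gpus = weights_gb / gpu_memory_gb * overhead_factor   # ~80 GPUs
print(f"~{gpus:.0f} H100s")
```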

Interesting. I'd love to know if you think the crux schema I outlined is indeed important? I mean this:

How quickly/totally/coherently could US gov/CCP capture AI talent/artefacts/compute within its jurisdiction and redirect them toward excludable destructive ends? Under what circumstances would they want/be able to do that?

Correct me at any point if I misinterpret: I read that, on the basis of answers to something a bit like these, you think an international competition/race is all but inevitable? Presumably that registers as terrifically dangerous for... (read more)

-7
Gerald Monroe

Thanks for this thoughtful response!

this tendency leads to analysis that assumes more coordination among governments, companies, and individuals in other countries than is warranted. When people talk about "the US" taking some action... more likely to be aware of the nuance this ignores... less likely to consider such nuances when people talk about "China" doing something

This seems exactly right and is what I'm frustrated by. Though, further than you give credit (or un-credit) for, frequently I come across writing or talking about "US success in AI", "... (read more)

3
Daniel_Eth
  I'm pretty sure what most (educated) people think is they are part of the US (in the sense that they are "US entities", among other things), that they will pay taxes in the US, will hire more people in the US than China (at least relative to if they were Chinese entities), will create other economic and technological spillover effects in greater amount in the US than in China (similar to how the US's early lead on the internet did), will enhance the US's national glory and morale, will provide strategically valuable assets to the US and deny these assets to China (at least in a time of conflict), will more likely embody US culture and norms than Chinese culture and norms, and will be subject to US regulation much more than Chinese regulation. Most people don't expect these companies will be nationalized (though that does remain a possibility, and presumably more so if they were Chinese companies than US companies, due to the differing economic and political systems), but there are plenty of other ways that people expect the companies to advantage their host country['s government, population, economy, etc].
2
Gerald Monroe
Yes. In the end, all the answers to your questions are yes.

The critical thing to realize is that until basically EOY 2022, AI didn't exist. It was narrow and expensive and essentially non-general - a cool party trick, but the cost to build a model for anything and get to useful performance levels was high. Self-driving cars were endlessly delayed; recsys worked, but its techniques for correlating fields of user data with preferences are only a little better with neural networks than with older, cheaper methods; for most other purposes AI was just a tech demo. You need to think in terms of "what does it mean that AI works now, and how are decisions going to be different". With that said, governments won't nationalize AI companies until they develop a lot stronger models.

Imagine the Manhattan Project never happened, but GE and a few other US companies kept tinkering with fission. Eventually they would have built critical devices, and EOY 2022 is the "Chicago Pile" moment - there's a nuclear reactor, and we can plot out the yield for a nuke, but the devices have not yet been built. Around the time GE is building nuclear bombs for military demos, at some point the US government has to nationalize it all. It's too dangerous.

As for the rest of your post, I don't see how "not framing a competition as a competition" is very useful. It's not the media. We live on a finite planet with finite resources, and the only reason there are different countries is that the most powerful winners have not found a big enough club to conquer everyone else. You know nations used to be way smaller, right? Why do you think they are so large now? In each case someone found a way to depose all the other feudal kings and lords. AGI may be that club, and whoever builds it fastest and bestest may in fact just be able to crush everyone. Even if they can't, each superpower has to assume that they can.

Great read, and interesting analysis. I like encountering models for complex systems (like community dynamics)!

One factor I don't think was discussed (maybe the gesture at possible inadequacy of f(N) encompasses this) is the duration of scandal effects. E.g. imagine some group claiming to be the Spanish Inquisition, the Mongol Horde, or the Illuminati tried to get stuff done. I think (assuming they were taken seriously) they'd encounter lingering reputational damage more than one year after the original scandals! Not sure how this models out; I'm not planning to d... (read more)

2
Ben_West🔸
Thanks Oliver! It seems basically right to me that this is a limitation of the model, in particular f(N), like you say.

OpenAI as a whole, and individuals affiliated with or speaking for the org, appear to be largely behaving as if they are caught in an overdetermined race toward AGI.

What proportion of people at OpenAI believe this, and to what extent? What kind of observations, or actions or statements by others (and who?) would change their minds?

Great post. I basically agree, but in a spirit of devil's advocating, I will say: when I turn my mind to agent foundations thinking, I often find myself skirting queasily close to concepts which feel also capabilities-relevant (to the extent that I have avoided publicly airing several ideas for over a year).

I don't know if that's just me, but it does seem that some agent foundations content from the past has also had bearing on AI capabilities - especially if we include decision theory stuff, dynamic programming and RL, search, planning etc. which it's arg... (read more)

Thank you for sharing this! Especially the points about relevant maps and Meta/FAIR/LeCun.

I was recently approached by the UK FCDO as a technical expert in AI with perspective on x-risk. We had what I think were very productive conversations, with an interesting convergence of my framings and the ones you've shared here - that's encouraging! If I find time I'm hoping to write up some of my insights soon.

1
Oliver Sourbut
I wrote a little here about unpluggability (and crossposted on LessWrong/AF)

I've given a little thought to this hidden qualia hypothesis but it remains very confusing for me.

To what extent should we expect to be able to tractably and knowably affect such hidden qualia?

3
Adam Shriver
Here's the report on conscious subsystems: https://forum.effectivealtruism.org/posts/vbhoFsyQmrntru6Kw/do-brains-contain-many-conscious-subsystems-if-so-should-we 

This is beautiful and important Tyler, thank you for sharing.

I've seen a few people burn out (and come close myself), and I have made a point of gently socially making and reinforcing this sort of point (far less eloquently) myself, in various contexts. 

I have a lot of thoughts about this subject.

One thing I always embrace is silliness and (often self-deprecating) humour, which are useful antidotes to stress for a lot of people. Incidentally, your tweet thread rendition of the Egyptian spell includes

I am light heading for light. Even in the dark, a fi

... (read more)
7
tyleralterman
Agree so much with the antidote of silliness! I’m happy to see that EA Twitter is embracing it. Excited to read the links you shared, they sound very relevant. Thank you, Oliver. May your fire bum into the distance.

Seconded/thirded on Human Compatible being near that frontier. I did find its ending 'overly optimistic' in the sense of framing it like 'but lo, there is a solution!', while other similar resources like Superintelligence and especially The Alignment Problem seem more nuanced, presenting uncertain proposals for paths forward not as oven-ready but as preliminary and speculative.

I think it's a staircase? Maybe like climbing upwards to more good stuff. Plus some cool circles to make it logo ish.

2
Zach Stein-Perlman
“Abstract stairs” was my best guess too. It doesn’t work for me, and I don’t get the second circle.

I'm intrigued by this thread. I don't have an informed opinion on the particular aesthetic or choice of quiz questions, but I note some superficial similarities to Coursera, Khan Academy, and TED-Ed, which are aimed at mainly professional age adults, students of all ages, and youth/students (without excluding adults) respectively.

Fun/cute/cartoon aesthetics do seem to abound these days in all sorts of places, not just for kids.

My uninformed opinion is that I don't see why it should put off teenagers (talented or otherwise) in particular, but I weakly agree that if something is explicitly pitched at teenagers, that might be offputting!

I've considered a possible pithy framing of the Life Despite Suffering question as a grim orthogonality thesis (though I'm not sure how useful it is):

We sometimes point to the substantial majority's revealed preference for staying alive as evidence of a 'life worth living'. But perhaps 'staying-aliveness' and 'moral patient value' can vary more independently than that claim assumes. This is the grim orthogonality thesis.

An existence proof for the 'high staying-aliveness x low moral patient value' quadrant is the complex of torturer+torturee, which quite cl... (read more)

I'm shocked and somewhat concerned that your empirical finding is that so few people have encountered or thought about this crucial consideration.

My experience is different, with maybe 70% of AI x-risk researchers I've discussed with being somewhat au fait with the notion that we might not know the sign of future value conditional on survival. But I agree that it seems people (myself included) have a tendency to slide off this consideration or hope to defer its resolution to future generations, and my sample size is quite small (a few dozen maybe) and quit... (read more)

9
Jacy
This is helpful data. Two important axes of variation here are:

  • Time, where this has fortunately become more frequently discussed in recent years
  • Involvement, where I speak a lot with artificial intelligence and machine learning researchers who work on AI safety but not global priorities research; often their motivation was just reading something like Life 3.0. I think these people tend to have thought through crucial considerations less than, say, people on this forum.

My anecdata is also that most people have thought about it somewhat, and "maybe it's okay if everyone dies" is one of the more common initial responses I've heard to existential risk.

But I agree with OP that I more regularly hear "people are worried about negative outcomes just because they themselves are depressed" than "people assume positive outcomes just because they themselves are manic" (or some other cognitive bias).

Typo hint:

"10<sup>38</sup>" hasn't rendered how you hoped. You can use <dollar>10^{38}<dollar> which renders as

1
Oliver Sourbut
It looks like I got at least one downvote on this comment. Should I be providing tips of this kind in a different way?
2
Fai
Maybe another typo? : "Bostrom argues that if humanizes could colonize the Virgo supercluster", should that be "humanity" or "humans"?
1
Jacy
Whoops! Thanks!

Got it, I think you're quite right on one reading. I should have been clearer about what I meant, which is something like

  • there is a defensible reading of that claim which maps to some negative utilitarian claim (without necessarily being a central example)
  • furthermore I expect many issuers of such sentiments are motivated by basically pretheoretic negative utilitarian insight

E.g. imagine a minor steelification (which loses the aesthetic and rhetorical strength) like "nobody's positive wellbeing (implicitly stemming from their freedom) can/should be cel... (read more)

5
abrahamrowe
That makes sense to me. Yeah, I definitely think that many people from left-leaning spaces who come to EA also become sympathetic to suffering-focused work in my experience, which seems consistent with this.

Minor nitpick: "nobody's free until everyone is free" is precisely a (negative) utilitarian claim (albeit with unusual wording)

9
abrahamrowe
That doesn't seem quite right - negative utilitarians would still prefer marginal improvements even if all suffering didn't end (or in this case, a utilitarian might prefer many become free even if all didn't become free). The sentiment is interesting because it doesn't acknowledge marginal states that utilitarians are happy to compare against ideal states, or worse marginal states.

It's possible the selection bias is high, but I don't have good evidence for this besides personal anecdata. I don't know how many people are relevantly similar to me, and I don't know how representative we are of the latest EA 'freshers', since dynamics will change and I'm reporting with several years' lag.

Here's my personal anecdata.

Since 2016, around when I completed undergrad, I've been an engaged (not sure what counts as 'highly engaged') longtermist. (Before that point I had not heard of EA per se but my motives were somewhat proto EA and I wanted to... (read more)

1
Anonymous_EA
Appreciate the anecdata! I agree that probably there are at least a good number of people like you who will go under the radar, and this probably biases many estimates of the number of non-community-building EAs downward (esp estimates that are also based on anecdata, as opposed to e.g. survey data).

I just wanted to state agreement that it seems a large number of people largely misread Death with Dignity, at least according to what seems to me the most plausible intended message: mainly about the ethical injunctions (which are very important as a finitely-rational and prone-to-rationalisation being), as Yudkowsky has written of in the past.

The additional detail of 'and by the way this is a bad situation and we are doing badly' is basically modal Yudkowsky schtick and I'm somewhat surprised it updated anyone's beliefs (about Yudkowsky's beliefs, and th... (read more)

I wrote something similar (with more detail) about the Gato paper at the time.

I don't think this is any evidence at all against AI risk though? It is maybe weak evidence against 'scaling is all you need' or that sort of thing.

Thanks Rohin, I second almost all of this.

Interested to hear more about why long-term credit assignment isn't needed for powerful AI. I think it depends how you quantify those things and I'm pretty unsure about this myself.

Is it because there is already loads of human-generated data which implicitly embody or contain enough long-term credit assignment? Or is it that long-term credit assignment is irrelevant for long-term reasoning? Or maybe long-term reasoning isn't needed for 'powerful AI'?

3
Rohin Shah
We're tackling the problem "you tried out a long sequence of actions, and only at the end could you tell whether the outcomes were good or not, and now you have to figure out which actions were responsible". Some approaches to this that don't involve "long-term credit assignment" as normally understood by RL practitioners:

  • Have humans / other AI systems tell you which of the actions were useful. (One specific way this could be achieved is to use humans / AI systems to provide a dense reward, kinda like in summarizing books from human feedback.)
  • Supervise the AI system's reasoning process rather than the outcomes it gets (e.g. like chain-of-thought prompting but with more explicit supervision).
  • Just don't even bother; do regular old self-supervised learning on a hard task. In order to get good performance, maybe the model has to develop "general intelligence" (i.e. something akin to the algorithms humans use in order to do long-term planning; after all, our long-term planning doesn't work via trial and error).

I think it's also plausible that (depending on your definitions) long-term reasoning isn't needed for powerful AI.
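A minimal sketch of the first bullet above (dense feedback in place of long-horizon credit assignment); the trajectory and the stand-in labeller are invented purely for illustration:

```python
# Instead of one sparse reward at the end of a long trajectory, a (human or AI)
# labeller scores each step, so no signal has to be propagated back across the
# whole sequence. Toy example only.

trajectory = ["draft outline", "cite wrong paper", "fix citation", "write summary"]

# Sparse, outcome-only signal: a single number for the whole trajectory.
sparse_rewards = [0.0, 0.0, 0.0, 1.0]

# Dense signal: a stand-in labeller scores each action as it happens.
def labeller(action: str) -> float:
    return -1.0 if "wrong" in action else 1.0

dense_rewards = [labeller(a) for a in trajectory]
print(dense_rewards)  # [1.0, -1.0, 1.0, 1.0]: credit assigned per step
```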