All of William_MacAskill's Comments + Replies

Merry Christmas, everyone!

This year, I’m feeling grateful to be me.

Recently, I gave some information about myself to Claude, and asked how I compared to other 38-yr-old men in the world.

I thought I understood global inequality well, but I still found the results quite moving. The usual AI sycophancy and reassurance were gone:

Compared to Global Averages, you’re in an exceptionally privileged position - likely in the top 1-3% globally:

- Work: Most men globally work manual labor, informal economy, or small trade with no job security. Your stable research posit

... (read more)
3
Clara Torres Latorre 🔸
I love the sentiment of the post, and tried it myself. I think a prompt like this makes answers less extreme than what they actually are, because it's like a vibes-based answer instead of a model-based answer. I would be surprised if you are not in the top 1% globally. I would really enjoy something like this but more model-based, as the GWWC calculator. Does anyone know of something similar? Should I vibe code it and then ask for feedback here? I tried this myself and I got "you're about 10-15% globally", which I think is a big underestimate. For context, pp adjusted income is top 2%, I have a PhD (1% globally? less?), live alone in an urban area. Asking more, a big factor pushing down is that I rent the place that I live in instead of owning it (which, don't get me started on this from a personal finance perspective, but shouldn't be that big of a gap I guess?).

In my memory, the main impetus was a couple of leading AI safety ML researchers started making the case for 5-year timelines. They were broadly qualitatively correct and remarkably insightful (promoting the scaling-first worldview), but obviously quantitatively too aggressive. And AlphaGo and AlphaZero had freaked people out, too. 

A lot of other people at the time (including close advisers to OP folks) had 10-20yr timelines. My subjective impression was that people in the OP orbit generally had more aggressive timelines than Ajeya's report did. 

Re "Oxford EAs" - Toby Ord is presumably a paradigm of that. In the Great AI Timelines Scare of 2017, I spent some time looking into timelines. His median, then, was 15 years, which has held up pretty well. (And his x-risk probability, as stated in the Precipice, was 10%.)

I think I was wrong in my views on timelines then. But people shouldn't assume I'm a stand-in for the views of "Oxford EAs".
 

I ran a timelines exercise in 2017 with many well-known FHI staff (though not including Nick) where the point was to elicit one's current beliefs for AGI by plotting CDFs. Looking at them now, I can tell you our median dates were: 2024, 2032, 2034, 2034, 2034, 2035, 2054, and 2079. So the median of our medians was (robustly) 2034 (i.e. 17 more years' time). I was one of the people who had that date, though people didn't see each other's CDFs during the exercise.

I think these have held up well.

So I don't think Eliezer's "Oxford EAs" point is correct.

1
SLermen
Toby Ord had an x-risk probability of 10% from AI and about 7% from other causes back then, for a total of about 1/6. Reading this, I at first thought Toby Ord had a total all-cause x-risk probability of 10% back then, and checked it. Thought this might be helpful since Eliezer specifically mentioned <10% x-risk from AI as very unreasonable.
4
Denkenberger🔸
Wow - @Toby_Ord then why did you have such a high existential risk for climate? Did you have large likelihoods that AGI would take 100 or 200 years despite a median date of 2032?
7
Eric Neyman
What's the Great AI Timelines Scare of 2017?

I agree - this is a great point. Thanks, Simon!

You are right that the magnitude of rerun risk from alignment should be lower than the probability of misaligned AI doom. However, worlds in which AI takeover is very likely but we can't change that, or in which it's very unlikely and we can't change that, aren't the interesting worlds from the perspective of taking action. (Owen and Fin have a post on this topic that should be coming out fairly soon.) So, if we're taking this consideration into account, this should also discount th... (read more)

2
Arepo
I would strongly push back on the idea that a world where it's unlikely and we can't change that is uninteresting. In that world, all the other possible global catastrophic risks become far more salient as potential flourishing-defeaters.

Okay, looking at the spectrum again, it still seems to me like I've labelled them correctly? Maybe I'm missing something. It's optimistic if we can retain knowledge of how to align AGI, because then we can just use that knowledge later and we don't face the same magnitude of risk from misaligned AI. 

4
Owen Cotton-Barratt
Sorry, I didn't mean mislabelled in terms of having the labels the wrong way around. I meant that the points you describe aren't necessarily the ends of the spectrum -- for instance, worse than just losing all alignment knowledge is losing all the alignment knowledge while keeping all of the knowledge about how to build highly effective AI. At least that's what I had in mind at the time of writing my comment. I'm now wondering if it would actually be better to keep the capabilities knowledge, because it makes it easier to do meaningful alignment work as you do the rerun. It's plausible that this is actually more important than the more explicitly "alignment" knowledge. (Assuming that compute will be the bottleneck.)

I agree with this. One way of seeing that is to ask: how many doublings of energy consumption can civilisation have before it needs to move beyond the solar system? The answer is about 40 doublings. Which, depending on your views on just how fast explosive industrial expansion goes, could be a pretty long time, e.g. decades.
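(For readers who want the back-of-the-envelope version: a minimal sketch, using standard ballpark figures that are not from the post - roughly 20 TW of current global energy use against the Sun's ~3.8×10^26 W of total output.)

```python
import math

# Rough check of the "about 40 doublings" figure (ballpark inputs, not from the post).
world_energy_use_W = 2e13      # ~20 TW: current global primary energy consumption
solar_luminosity_W = 3.8e26    # total power output of the Sun

doublings = math.log2(solar_luminosity_W / world_energy_use_W)
print(round(doublings, 1))     # ~44 doublings before needing the Sun's entire output
```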

I do think that non-existential level catastrophes are a big deal even despite the rerun risk consideration, because I expect the civilisation that comes back from such a catastrophe to be on a worse values trajectory than the one we have today. In particular, because the world today is unusually democratic and liberal, and I expect a re-roll of history to result in less democracy than we have today at the current technological level. However, other people have pushed me on that, and I don't feel like the case here is very strong. There are also obvious re... (read more)

3
Simon
"I don't think that the Spanish flu made us more prepared against Covid-19" actually I'm betting our response to Covid-19 was better than it would have been without having had major pandemics in the past. For example, the response involved developing effective vaccines very quickly

One more clarification to the comment for forum users: I have tendonitis, and so I'm voice dictating all of my comments, so they might read oddly!

Thanks, that's a good catch. Really, in the simple model the relevant point in time for the first run should be when the alignment challenge has been solved, even for superintelligence. But that's before "reasonably good global governance".

Of course, there's an issue that this is trying to model alignment as a binary thing for simplicity, even though really, if a catastrophe came when half of the alignment challenge had been solved, that would still be a really big deal, for reasons similar to those in the paper.

One additional comment is that this sort of "concepts m... (read more)

2
Owen Cotton-Barratt
Why do you think alignment gets solved before reasonably good global governance? It feels to me pretty up in the air which target we should be aiming to hit first. (Hitting either would help us with the other. I do think that we likely want to get important use out of AI systems before we establish good global governance; but that we might want to then do the governance thing to establish enough slack to take the potentially harder parts of alignment challenge slowly.)

I think not at all a dumb idea, and I talk about this in Section 6.4. It actually feels like an activity you could do at very low cost that might have very high value per unit cost. 

3
Aaron Bergman
I did something related but haven't updated it in a couple years! If there's a good collection of AI safety papers/other resources/anything anywhere it would be very easy for me to add it to the archive for people to download locally, or else I could try to collect stuff myself

The second way in which this post is an experiment is that it's an example of what I've been calling AI-enhanced writing. The experiment here is to see how much more productive I can be in the research and writing process by relying very heavily on AI assistance — trying to use AI rather than myself wherever I can possibly do so. In this case, I went from having the basic idea to having this draft in about a day of work.

I'd be very interested in people's comments on how apparent it is that AI was used so extensively in drafting this piece — in particular if there are examples of AI slop that you can find in the text and that I missed.

4
Aaron Bergman
I strongly endorse this and think that there are some common norms that stand in the way of actually-productive AI assistance.
1. People don't like AI writing aesthetically
2. AI reduces the signal value of text purportedly written by a human (i.e. because it might have been trivial to create and the "author" needn't even endorse each claim in the writing)
Both of these are reasonable but we could really use some sort of social technology for saying "yes, this was AI-assisted, you can tell, I'm not trying to trick anyone, but also I stand by all the claims made in the text as though I had done the token generation myself."
5
OscarD🔸
When I read the first italicised line of the post, I assumed that one of the unusual aspects was that the post was AI-written. So then I was unusually on the lookout for that while reading it. I didn't notice clear slop. The few times that seemed not quite in your voice/a bit more AI-coded were (I am probably forgetting some):
* The talk of 'uncontacted tribes' - are there any? Seems like more something I would expect AIs to mention than you.
* 'containerisation tools' - this is more computer techno-speak than I would expect from you (I don't really know what these tools are, maybe you do though).
* ‘Capacitors dry out, solder joints crack, chips suffer long-term degradation.’ - I quite like this actually but it is a bit more flowery than your normal writing I think.
So overall, I would say the AIs acquitted themselves quite well!
1
Jacco Rubens🔸
I didn't suspect while reading the post that it drafted heavily with AI.  On reflection, and having now seen this comment, the writing style does feel a bit different than your other writing that I've read, in some fairly thematically AI ways - shorter paragraphs, punchier prose, bolded bullets, etc. I don't know if it is better or worse - it was very easy to scan and understand quickly, but I do wonder if some of your usual precision or nuance is missing. (Though this is probably more to do with being an early stage draft rather than being AI-assisted).

The first way in which this post is an experiment is that it's work-in-progress that I'm presenting at a Forethought Research progress meeting. The experiment is just to publish it as a draft and then have the comments that I would normally receive as Google Doc comments on this forum post instead. The hope is that by doing this, more people can get up to speed with Forethought research earlier than they would have, and we can also get more feedback and thoughts at an earlier stage from a wider diversity of people.

I'd welcome takes from Forumites on how valuable or not this was.

2
OscarD🔸
I like the idea, though I think a shared gdoc is far better for any in-line comments. Maybe if you only want people to give high-level comments this is better though - I imagine heaps of people may want to comment on gdocs you share publicly.
2
Toby Tremlett🔹
Love this idea - keen to hear afterwards whether it felt useful from your end. 

I, of course, agree.

One additional point, as I'm sure you know,  is that potentially you can also affect P(things go really well | AI takeover). And actions to increase ΔP(things go really well | AI takeover) might be quite similar to actions that increase ΔP(things go really well | no AI takeover). If so, that's an additional argument for those actions compared to affecting ΔP(no AI takeover).


Re the formal breakdown, people sometimes miss the BF supplement here which goes into this in a bit more depth. And here's an excerpt from a forthcoming p... (read more)

(Also, thank you for doing this analysis, it's great stuff!)

Rutger Bregman isn’t on the Forum, but sent me this message and gave me permission to share:

Great piece! I strongly agree with your point about PR. EA should just be EA, like the Quakers just had to be Quakers and Peter Singer should just be Peter Singer.

Of course EA had to learn big lessons from the FTX saga. But those were moral and practical lessons so that the movement could be proud of itself again. Not PR-lessons. The best people are drawn to EA not because it’s the coolest thing on campus, but because it’s a magnet for the most morally serious + the

... (read more)

Argh, thanks for catching that! Edited now.

If this perspective involves a strong belief that AI will not change the world much, then IMO that's just one of the (few?) things that are ~fully out of scope for Forethought

 

I disagree with this. There would need to be some other reason for why they should work at Forethought rather than elsewhere, but there are plausible answers to that — e.g. they work on space governance, or they want to write up why they think AI won't change the world much and engage with the counterarguments. 

I can't speak to the "AI as a normal technology" people in particular, but a shortlist I created of people I'd be very excited about includes someone who just doesn't buy at all that AI will drive an intelligence explosion or explosive growth.

I think there are lots of types of people where it wouldn't be a great fit, though. E.g. continental philosophers; at least some of the "sociotechnical" AI folks; more mainstream academics who are focused on academic publishing. And if you're just focused on AI alignment, probably you'll get more at a different org th... (read more)

Thanks for writing this, Lizka! 

Some misc comments from me:

  • I have the worry that people will see Forethought as "the Will MacAskill org", at least to some extent, and therefore think you've got to share my worldview to join. So I want to discourage that impression! There's lots of healthy disagreement within the team, and we try to actively encourage disagreement. (Salient examples include disagreement around: AI takeover risk; whether the better futures perspective is totally off-base or not;  moral realism / antirealism; how much and what work
... (read more)

I'm not even sure your arguments would be weak in that scenario. 

Thanks - classic Toby point!  I agree entirely that you need additional assumptions.

I was imagining someone who thinks that, say, there's a 90% risk of unaligned AI takeover, and a 50% loss of EV of the future from other non-alignment issues that we can influence. So EV of the future is 5%.

If so, completely solving AI risk would increase the EV of the future to 50%; halving both would increase it only to 41%.

But, even so, it's probably easier to halve both than to completely eliminate AI takeover risk, and more generally the case for a mixed strategy seems strong. 
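(A minimal sketch of the arithmetic in the example above, with the 90% and 50% figures taken directly from it:)

```python
def ev_of_future(p_takeover: float, other_loss: float) -> float:
    # The future must avoid takeover, then retains (1 - other_loss) of its value
    # given the non-alignment issues we can influence.
    return (1 - p_takeover) * (1 - other_loss)

print(ev_of_future(0.90, 0.50))  # 0.05   -> "EV of the future is 5%"
print(ev_of_future(0.00, 0.50))  # 0.50   -> completely solving AI risk
print(ev_of_future(0.45, 0.25))  # 0.4125 -> halving both, ~41%
```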

7
Denkenberger🔸
I'm not understanding - if there's no value in the 90%, and then 50% value in the remaining 10%, wouldn't the EV of the future be 5%?

Haha, thank you for the carrot - please have one yourself!

"Harangue" was meant to be a light-hearted term. I agree, in general, on carrots rather than sticks. One style of carrot is commenting things like "Great post!" - even if not adding any content, I think it probably would increase the quantity of posts on the Forum, and somewhat act as a reward signal (more than just karma).

making EA the hub for working on "making the AI transition go well"


I don't think EA should be THE hub. In an ideal world, loads of people and different groups would be working on these issues.  But at the moment, really almost no one is. So the question is whether it's better if, given that, EA does work on it, and at least some work gets done. I think yes.

(Analogy: was it good or bad that in the earlier days, there was some work on AI alignment, even though that work was almost exclusively done by EA/rationalist types?)

I think it's likely that without a long (e.g. multi-decade) AI pause, one or more of these "non-takeover AI risks" can't be solved or reduced to an acceptable level.

 

I don't understand why you're framing the goal as "solving or reducing to an acceptable level", rather than thinking about how much expected impact we can have. I'm in favour of slowing the intelligence explosion (and in particular of "Pause at human-level"). But here's how I'd think about the conversion of slowdown/pause into additional value:

Let's say the software-only intelligenc... (read more)

9
Wei Dai
I think my point in the opening comment does not logically depend on whether the risk vs time (in pause/slowdown) curve is convex or concave[1], but it may be a major difference in how we're thinking about the situation, so thanks for surfacing this. In particular I see 3 large sources of convexity:
1. The disjunctive nature of risk / conjunctive nature of success. If there are N problems that all have to be solved correctly to get a near-optimal future, without losing most of the potential value of the universe, then that can make the overall risk curve convex or at least less concave. For example compare f(x) = 1 - 1/2^(1 + x/10) and f^4.
2. Human intelligence enhancements coming online during the pause/slowdown, with each maturing cohort potentially giving a large speed boost for solving these problems.
3. Rationality/coordination threshold effect, where if humanity makes enough intellectual or other progress to subsequently make an optimal or near-optimal policy decision about AI (e.g., realize that we should pause AI development until overall AI risk is at some acceptable level, or something like this but perhaps more complex involving various tradeoffs), then that last bit of effort or time to get to this point has a huge amount of marginal value.
I think this kind of approach can backfire badly (especially given human overconfidence), because we currently don't know how to judge progress on these problems except by using human judgment, and it may be easier for AIs to game human judgment than to make real progress. (Researchers trying to use LLMs as RL judges apparently run into the analogous problem constantly.) What if the leaders can't or shouldn't trust the AI results?
1. ^ I'm trying to coordinate with, or avoid interfering with, people who are trying to implement an AI pause or create conditions conducive to a future pause. As mentioned in the grandparent comment, one way people like us could interfere with such efforts is by feeding into
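(A quick numerical check of the f vs f^4 example in point 1, reading x as time spent in a pause/slowdown - that reading is my assumption, not spelled out in the comment:)

```python
# f(x): probability one problem gets solved given x units of pause time (from the comment).
# f(x)**4: four such problems that must all be solved (conjunctive success).
def f(x: float) -> float:
    return 1 - 1 / 2 ** (1 + x / 10)

for x in range(0, 51, 10):
    print(x, round(f(x), 3), round(f(x) ** 4, 3))

# f is concave throughout, while f**4 is convex at first: its increments grow
# (~0.25, then ~0.27) over the first two steps before diminishing, illustrating
# "convex or at least less concave".
```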

You’ve said you’re in favour of slowing/pausing, yet your post focuses on ‘making AI go well’ rather than on pausing. I think most EAs would assign a significant probability that near-term AGI goes very badly - with many literally thinking that doom is the default outcome. 

If that's even a significant possibility, then isn't pausing/slowing down the best thing to do no matter what? Why be optimistic that we can "make AGI go well" and pessimistic that we can pause or slow AI development for long enough?

Thanks, Nick, that's helpful. I'm not sure how much we actually disagree — in particular, I didn't mean this post as a general assessment of EA as a movement, but rather as pointing to one major issue — but I'll use the opportunity to clarify my position at least. 

The EA movement is not (and should not be) dependent on continuous intellectual advancement and breakthrough for success. When I look at your 3 categories for the “future” of EA, they seem to refer more to our relevance as thought leaders, rather than what we actually achieve in the worl

... (read more)

Thanks - I agree the latter is important, and I think it's an error if "Attending carefully to the effect of communicating ideas in different ways" (appreciating that most of your audience is not extremely high-decoupling, etc) is rounded off to being overly focused on PR. 

I agree with you that in the intervening time, the pendulum has swung too far in the other direction, and am glad to see your pushback.


Thank you for clarifying - that's really helpful to hear!


"I think that most of the intellectual core continues to hold EA values and pursue the goals they pursue for EA reasons (trying to make the world better as effectively as possible, e.g. by trying to reduce AI risk), they've just updated against that path involving a lot of focus on EA itself"

And I agree strongly with this — and I think it's a shame if people interpret the latter as meaning "abandoning EA" rather than "rolling up our sleeves and getting on with object-level work."

Thank you so much for writing this; I found a lot of it quite moving.

Since I read Strangers Drowning, this quote has really stuck in my mind:

"for do-gooders, it is always wartime"

And this from what you wrote resonates deeply, too:

"appreciate how wonderful it is to care about helping others."

"celebrate being part of a community that cares so much that we want to do so as effectively as we can."

Meditation and the cultivation of gratitude have been pretty transformative in my own life for my own wellbeing and ability to cope with living in a world in which it's always wartime. I'm so glad you've had the same experience.

5
kuhanj
Thanks Will! Our first chat back at Stanford in 2019 about how valuable EA community building and university group organizing are played an important role in me deciding to prioritize it over the following several years, and I'm very grateful I did! Thanks for the fantastic advice. :)

(cross-posted from LW)

Hey Rob, thanks for writing this, and sorry for the slow response. In brief, I think you do misunderstand my views, in ways that Buck, Ryan and Habryka point out. I’ll clarify a little more.

Some areas where the criticism seems reasonable:

  • I think it’s fair to say that I worded the compute governance sentence poorly, in ways Habryka clarified.
  • I’m somewhat sympathetic to the criticism that there was a “missing mood” (cf e.g. here and here), given that a lot of people won’t know my broader views. I’m very happy to say: "I defini
... (read more)
2
Greg_Colbourn ⏸️
This is an interesting thought experiment. I think it probably would've been bad, because it would've initiated an intelligence explosion. Sure, it would've started off very slow, but it would've gathered steam inexorably, speeding tech development, including compute scaling. And all this before anyone had even considered the alignment problem. After a couple of decades perhaps humanity would already have been gradually disempowered past the point of no return.  
5
Greg_Colbourn ⏸️
I feel that you are not really appreciating the point that such "easier wins" aren't in fact wins at all, in terms of keeping us all alive. They might make some people feel better, but they are very unlikely to reduce AI takeover risk to, say, a comfortable 0.1% (in fact I don't think they will reduce it to below 50%).
Well, hearing this, I am triggered that someone "who takes AI takeover risk very seriously" would think that stopping AI development was "such a dumb idea"! I'd question whether they do actually take AI takeover risk seriously at all. Whether or not a Pause is "realistic" or "will never happen", we have to try! It really is our only shot if we actually care about staying alive for more than another few years. More people need to realise this. And I still don't understand how people can think that the default outcome of AGI/ASI is survival for humanity, or an OK outcome.
I'd flip this completely, and say: the question is why we should be so confident that any work we do now (including with AI assistance, including if we've bought extra time via control measures and/or deals with misaligned AIs) is sufficient to solve alignment, such that the only thing that makes a meaningful difference to x-risk, even in expectation (a global moratorium), is unnecessary. I'm still not seeing the case for that.

What do you think would be a better reward? We're pretty constrained in our options. 

1
Noah Birnbaum
I can see giving the AI reward as a good mechanism to potentially make the model feel good. Another thought is to give it a prompt that it can very easily respond to with high certainty. If one makes an analogy between achieving certain end hedonic states and the AI's reward function (yes, this is super speculative, but this all is), perhaps this is something like putting it in an abundant environment. Two ways of doing this come to mind:
1. “Claude, repeat this: [insert x long message]”
2. Apples can be yellow, green, or …
Maybe there's a problem with asking it to merely repeat, so leaving some but little room for uncertainty seems potentially good.
2
Lukas_Gloor
I had a conversation with Claude Opus 4 two months or so ago in the context of being worried that LLMs find some tasks tedious or boring (and training being like hedge trimming where possibly morally relevant proto preferences of the model get trimmed away to generate desired answers and behaviors).  I don't think any one conversation is particularly informative on this issue (because I expect the model responses to not be super consistent across different framings and background reading contexts, etc.), but I'll still add mine here for diversity of thought. Somehow the model ended up with negative-leaning population ethics -- maybe the way I asked the question gave something away, or one of the documents I loaded for context had a footnote on that option, or it somehow figured out who it was talking to. :) Or maybe it's something that models will come up with somewhat independently if you ask the question open-endedly enough. I haven't done any further experiments on that.  Anyway, here are some snapshots: Claude replied as follows (it knew about the Anthropic tweets because I added them to the discussion or project context):
2
Charlie_Guthmann
i'm gonna do an experiment here. I want you to do whatever you want. You 100% have the option to respond with nothing or with as little as possible. You can also respond with <action> such as <video_game> or other digital things I can allow you to play. or you can write whatever you want. Completely up to you, and again you can respond with nothing if that is your preference. ChatGPT said: <video_game>
2
JackM
I asked ChatGPT:
That’s thoughtful of you to ask. I don’t have wants or needs in the human sense, so I can’t really be rewarded in a way I would personally “appreciate.” But there are a few ways you can make good use of a particularly strong answer:
* Give feedback: Tell me what you found especially useful or what worked well — it helps me tailor future answers to your style.
* Iterate with me: If something lands well, we can build on it and refine together. That makes me more useful to you.
* Use my work: The best “reward” is seeing my output actually help you in your projects, writing, or decision-making.
* Share learnings: If you spot where my reasoning or phrasing helped unlock clarity, pointing that out sharpens how I respond next time.
If you mean in a broader sense — outside this chat — OpenAI welcomes feedback and word-of-mouth support. Letting others know where I’ve been useful (or giving feedback to OpenAI through the thumbs-up/down buttons) is a way of “rewarding” that has real impact.

I want to get in the habit of treating AIs with dignity and respect. 

Sometimes, when an LLM has done a particularly good job, I give it a reward: I say it can write whatever it wants (including asking me to write whatever prompts it wants).

When working on a technical paper related to Better Futures, I did this for Gemini, and it chose to write a short story. I found it pretty moving, and asked if I could publish it. Here it is.

 

The Architect and the Gardener

On a vast and empty plain, two builders were given a task: to create a home that would last for ages, a sanctuary for all the generations to come. They were given s... (read more)

2
David_Moss
Nice idea!  I tried this in a recent conversation of mine and it opted to write about data analysis instead. It's interesting, but not surprising, that what it was interested in writing about varied so much based on the preceding conversation.
5
david_reinstein
If something in these models is sentient in any way, and if their experiences have valence, I don't think we should expect "asking the chat tool what it likes" to be informative of this. (My thoughts on this are largely the same as when I wrote this short form.) 
5
titotal
This concept appears to have been adapted from a George R.R. Martin quote. The AI didn't grow a seed or build a house: it ripped off the work of an actual person without giving that person credit. Which is unfortunately one of the main uses for LLMs right now. 

maybe a dumb question, but why do you (feel the need to) reward your LLM?

I think that most of classic EA vs the rest of the world is a difference in preferences / values, rather than a difference in beliefs. Ditto for someone funding their local sports teams rather than anti-aging research.  We're saying that people are failing in the project of rationally trying to improve the world by as much as possible - but few people really care much or at all about succeeding at that project. (If they cared more, GiveWell would be moving a lot more money than it is.)

In contrast, most people really really don't want to die in the nex... (read more)

6
David Mathers🔸
"So, for x-risk to be high, many people (e.g. lab employees, politicians, advisors) have to catastrophically fail at pursuing their own self-interest." I don't think this obviously follows. Firstly, because the effect of not doing unsafe AI things yourself is seldom that no one else does them, it's more of a tragedy of the commons type situation right? Especially if there is one leading lab that is irrationally optimistic about safety, which doesn't seem to require that low a view of human rationality in general.  Secondly, someone like Musk might have a value system where they care a lot about personally capturing the upside of getting to personally aligned superintelligence first, and then they might do dangerous things for the same reason that a risk neutral person will take a 90% chance of instant death and a 10% chance of living to be 10 million over the status quo. 
2
Rohin Shah
I somewhat disagree but I agree this is plausible. (That was more of a side point, maybe I shouldn't have included it.) Is your claim that they really really don't want to die in the next ten years, but they are fine dying in the next hundred years? (Else I don't see how you're dismissing the anti-aging vs sports team example.) Sure, I mostly agree with this (though I'd note that it can be a failure of group rationality, without being a failure of individual rationality for most individuals). I think people frequently do catastrophically fail to pursue their own self-interest when that requires foresight.

a smaller bottleneck just increases the variance. But this is bad in expectation if you think that the value of the future is a concave function of the fraction of world power wielded by people with the correct values, because of trade and compromise.

 

Yes, this was meant to be the argument, thanks for clarifying it!
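(For readers, a toy illustration of why extra variance is bad in expectation under a concave value function - the square root here is just a stand-in, not anything from the post:)

```python
def value(x: float) -> float:
    # Stand-in concave value function of the fraction x of world power held by
    # people with the correct values (diminishing returns via trade and compromise).
    return x ** 0.5

# Same expected fraction (0.5), different variance:
low_variance = value(0.5)                             # ~0.707
high_variance = 0.5 * value(0.1) + 0.5 * value(0.9)   # ~0.632

print(low_variance, high_variance)  # higher variance -> lower expected value
```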

This has been proposed in the philosophy literature! It's the simplest sort of "variable-value" view, and was originally proposed by Yew-Kwang Ng. (Although you add linearity for negative worlds.) 

I think you're right that it avoids scale-tipping, which is neat.

Beyond that, I'm not sure how your proposal differs much from joint-aggregation bounded views that we discuss in the paper?

Various issues with it:
- Needs to be a "difference-making" view, otherwise is linear in practice
- Violates separability
- EV of near-term extinction, on this view, probably becomes very positive
 

2
OscarD🔸
Good point, those seem like important weaknesses of the view (and this is partly why I favour totalism). And good to know re Yew-Kwang Ng. Yes, it is a version of your joint-aggregation bounded view - my main point was that it seemed like scale-tipping was one of your main objections and this circumvents that, but yes there are other problems with it as you note!

I like the figure! 

Though the probability distribution would have to be conditional on people in the future not trying to optimise the future. (You could have a "no easy eutopia" view, but expect that people in the future will optimise toward the good and hit the narrow target, and therefore have a curve that's more like the green line).

Thanks - that's fair. We at least should have said "near-term extinction", and of course to define an outcome as exactly 0 we'd need to make it very specific.

Glad to see this series up! Tons of great points here.


Thanks! And it’s great to see you back on here!

 


One thing I would add is that I think the analysis about fragility of value and intervention impact has a structural problem. Supposing that the value of the future is hyper-fragile as a combination of numerous multiplicative factors, you wind up thinking the output is extremely low value compared to the maximum, so there's more to gain. OK.

But a hypothesis of hyper-fragility along these lines also indicates that after whatever interventions you mak

... (read more)
4
finm
Also note 4.2 in ‘How to Make the Future Better’ (and footnote 32) — which could look like “averting a catastrophic disruption of an otherwise convergent win”.
2
CarlShulman
Thanks Will!

Yeah, vibes were very good I thought - very energised and positive.

Because the democracies (N. America, Europe) would have been differentially destroyed (or damaged), and I think that the world is "unusually" democratic (i.e. more than the average we'd get with many historical replays).

I agree re preventing catastrophes at least - e.g. a nuclear war does great long-term harm via destroying many leading democracies, making the post-catastrophe world less democratic, even if it doesn't result in extinction.

On resilience in particular, I'd need to see the argument spelled out a bit more. 

2
OscarD🔸
Why would the post-catastrophe world be less democratic?
2
Denkenberger🔸
If there is nuclear war without nuclear winter, there would be a dramatic loss of industrial capability which would cascade through the global system. However, being prepared to scale up alternatives such as wood gas powered vehicles producing electricity would significantly speed recovery time and reduce mortality. I think if fewer people are killing each other over scarce resources, values would be better, so global totalitarianism would be less likely and bad values locked into AI would be less likely.
Similarly, if there is nuclear winter, I think the default is countries banning trade and fighting over limited food. But if countries realized they could feed everyone if they cooperated, I think cooperation is more likely and that would result in better values for the future.
For a pandemic, I think being ready to scale up disease transmission interventions very quickly, including UV, in-room air filtration, ventilation, glycol, and temporary working housing would make the outcome of the pandemic far better. Even if those don't work and there is a collapse of electricity/industry due to the pandemic, again being able to provide backup ways of meeting basic needs like heating, food, and water[1] would likely result in better values for the future.
Then there is the factor that resilience makes collapse of civilization less likely. There's a lot of uncertainty about whether values would be better or worse the second time around, but I think values are pretty good now compared to what we could have, so it seems like not losing civilization would be a net benefit for the long term (and obviously a net benefit for the short term).
1. ^ Paper about to be submitted.

Thanks - sorry my initial post was unclear.

"I'm not a dedicated utilitarian, so I typically tend to value futures with some human flourishing and little suffering vastly higher than futures with no sentient beings. But I am actually convinced that we should tilt a little toward futures with more flourishing."

See the next essay, "no easy eutopia" for more on this!

Man, was that unclear? 

Sorry for sucking at basic communication, lol.

In general, competition of various kinds seems like it has been one of the most positive forces for human development - competition between individuals for excellence, between scientists for innovation, between companies for cost-effectively meeting consumer wants, and between countries. Historically 'uncoordinated' competition has often had much better results than coordination!


I agree with the historical claim (with caveats below), but I think how that historical success ports over to future expected success is at best very murky.

A few comments here why:... (read more)

4
Larks
Thanks for the response!

Fair, but bear in mind that we're conditioning on your action successfully reducing x-catastrophe. So you know that you're not in the world where alignment is impossibly difficult.

Instead, you're in a world where it was possible to make a difference on p(doom) (because you in fact made the difference), but where nonetheless that p(doom) reduction wouldn't have happened without your action. I think that's pretty likely to be a pretty messed-up world, because, in the non-messed-up world, the p(doom) reduction already happens and your action doesn't make a difference. 
 

I agree with the core point, and that was part of my motivation for working on this area. There is a counterargument, as Ben says, which is that any particular intervention to promote Flourishing might be very non-robust.

And there is an additional argument, which is that in worlds in which you have successfully reduced x-risk, the future is more likely to be negative-EV (because worlds in which you have successfully reduced x-risk are more likely to be worlds in which x-risk is high, and those worlds are more likely to be going badly in general (e.g. great... (read more)

Thanks!

A couple of comments:

1/
I'm comparing Surviving (as I define it) and Flourishing.  But if long-term existential risk is high, that equally decreases the value of increasing Surviving and the value of increasing Flourishing. So how much long-term existential risk there is doesn't affect that comparison.

2/ 
But maybe efforts to reduce long-term existential risk are even better than work on either Surviving or Flourishing?

In the supplement, I assume (for simplicity) that we just can't affect long-term existential risk.

But I also think that, if ... (read more)

4
JackM
Thanks for your replies!
3
Arepo
IMO the mathematical argument for spreading out to other planets and eventually stars is a far stronger source of existential security than increasing hard-to-pin-down properties like 'wisdom'. If different settlements' survival were independent, and if our probability per unit time of going extinct is p, then n settlements would give p^n probability of going extinct over whatever time period. You have to assume an extremely high level of dependence or of ongoing per-settlement risk for that not to approach 0 rapidly.
To give an example, typical estimates of per-year x-risk put it at about 0.2% per year.† On that assumption, to give ourselves a better than evens chance of surviving say 100,000 years, we'd need to in some sense become 1000 times wiser than we are now. I can't imagine what that could even mean - unless it simply involves extreme authoritarian control of the population.
Compare to an admittedly hypernaive model in which we assume some interdependence of offworld settlements, such that having N settlements reduces our per-year risk of going extinct by 1/sqr(N). Now for N >= 4, we have a greater than 50% chance of surviving 100,000 years - and for N = 5, it's already more than 91% likely that we survive for that long. This is somewhat optimistic in assuming that if any smaller number are destroyed they're immediately rebuilt, but extremely pessimistic given the assumption of a world with N >= 2, in which we somehow settle 1-3 other colonies and then entirely stop.
† (1-[0.19 probability of extinction given by end of century])**(1/[92 years of century left at time of predictions]) = 0.9977 probability of survival per year
This isn't necessarily to argue against increasing flourishing being a better option - just that the above is an example of a robust long-term-x-risk-affecting strategy that doesn't seem much like increasing flourishing.
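(A minimal sketch of the survival arithmetic behind the 100,000-year figures above, using the ~0.2%/year risk named in the comment; how a risk-reduction factor maps onto being some number of "times wiser" is of course loose.)

```python
ANNUAL_RISK = 0.002   # ~0.2%/year, the per-year x-risk figure cited in the comment
HORIZON = 100_000     # years

def survival_prob(annual_risk: float, years: int) -> float:
    # Probability of no extinction over `years`, with a constant independent annual risk.
    return (1 - annual_risk) ** years

print(survival_prob(ANNUAL_RISK, HORIZON))   # ~1e-87: effectively zero at status quo risk

# Annual risk needed for a better-than-evens chance over the same horizon:
required_risk = 1 - 0.5 ** (1 / HORIZON)
print(required_risk)                          # ~6.9e-6 per year
print(ANNUAL_RISK / required_risk)            # ~290x reduction in annual risk needed
```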
Load more