All of Toby_Ord's Comments + Replies

And I'll add that RL training (and to a lesser degree inference scaling) is limited to a subset of capabilities (those with verifiable rewards and that the AI industry cares enough about to run lots of training on). So progress on benchmarks has become less representative of how good models are at the things that aren't being benchmarked than it was in the non-reasoning-model era. So I think the problems of the new era are somewhat bigger than the effects that show up in benchmarks.

That's a great question. I'd expect a bit of a slowdown this year, though not necessarily much. e.g. I think there is roughly a 10x still possible for RL before RL-training compute reaches the size of pre-training compute, and we know they have enough compute to 10x again beyond that (since GPT-4.5 was already 10x more), so there are still some gains in the pipe there. And I wouldn't be surprised if METR timelines keep going up in part due to increased inference spend (i.e. my points about inference scaling not being that good are to do with costs exploding, so if a co... (read more)


Good point about the METR curves not being Pareto frontiers.

Interesting ideas! A few quick responses:

  1. The data for the early 'linear' regime for these models actually appear to be even better than you suggest here. They show a roughly straight line (on a log-log plot), but at a slope that is better than 1. Eyeballing it, I think some have a slope of 5 or higher (i.e. increasing returns, with the time horizon growing as the 5th power of compute — see the power-law sketch after this comment). See my 3rd chart here. If anything, this would strengthen your case for talking about that regime separately from the poorly scaling high-compute regime later on.
  2. I'd also suspected t
... (read more)
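
To make the slope point concrete, here is what a log-log slope means when read as a power law (an illustrative note with made-up round numbers, not values fitted to the METR data; h is the time horizon and c the compute):

```latex
% A log-log slope of s is a power law:  \log h = s \log c + k \iff h \propto c^{s}
s = 1:\quad c \to 10\,c \;\Rightarrow\; h \to 10\,h
\qquad\qquad
s = 5:\quad c \to 2\,c \;\Rightarrow\; h \to 2^{5}\,h = 32\,h,
\quad c \to 10\,c \;\Rightarrow\; h \to 10^{5}\,h
```

So a slope above 1 in that early regime really does mean strongly increasing returns to compute, which is why treating it separately from the later, poorly scaling regime matters.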
7
Ryan Greenblatt
Note that these METR cost vs time horizon curves are not at all Pareto frontiers. These just correspond to what you get if you cut off the agent early, so they are probably very underelicited for "optimal performance for some cost" (e.g. note that if an agent doesn't complete some part of the task until it is nearly out of budget, it would do much worse on this metric at low cost; see e.g. gpt-5, for which this is true). My guess is that with better elicitation you get closer to the regime I expect. At some point, METR might run results where they try to elicit performance at lower budgets such that we can actually get a Pareto frontier. I agree my abstraction might not be the right one and maybe there is a cleaner way to think about this.

Yeah, it isn't just a constant-factor slow-down, and it is fairly hard to describe in detail. Pre-training, RL, and inference all have their own dynamics, and we don't know if there will be good new scaling ideas that breathe new life into them or create a new thing on which to scale. I'm not trying to say the speed at any future point is half what it would have been, but that you might have seen scaling as a big deal, and going forward it is a substantially smaller deal (maybe half as big a deal).

4
Ben_West🔸
Thanks, that's helpful. Do you have a sense of where we are on the current S-curve? E.g., if capabilities continue to progress in a straight line through the end of this year, is that evidence that we have found a new S-curve to stack on the current one?

Thanks for catching that — a lot of symbols in the appendix were lost when converting the post for the forum, so I've edited it to add them back in.

That's an interesting way to connect these. I suppose one way to view your model is as making clear the point that you can't cost-effectively use models on tasks that are much longer than their 50% horizons — even if you are willing to try multiple times — and that the trend of dramatic price improvements over time isn't enough to help with this (a rough calculation is sketched below). Instead you need the continuation of the METR trend of exponentially growing horizons. Moreover, you give a nice intuitive explanation of why that is.

One thing to watch out for is Gus Hamilton's recent study suggesti... (read more)
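
A rough way to see why retries don't rescue cost-effectiveness, under a simple constant-hazard model (a sketch with illustrative assumptions of my own: cost roughly proportional to task length and independent attempts; Hamilton's reanalysis, mentioned above, modifies this picture):

```latex
\text{task length} = k\,h_{50}
\;\Rightarrow\;
P(\text{success per attempt}) \approx 2^{-k},
\qquad
\mathbb{E}[\text{cost per success}] \;\propto\; k \cdot 2^{k}
```

At k = 5 that is roughly 160 times the cost of a one-horizon task per success, and at k = 10 roughly 10,000 times, so a constant-factor price decline each year only buys a roughly constant increment in k — you need the horizon itself to keep growing exponentially.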

5
Margot Stakenborg
Thank you for this thoughtful reply, this comment is basically the reason this update exists. You were right that Hamilton is probably right. I have written a longer update incorporating Hamilton's reanalysis and extending the economics in two directions: a quantitative treatment of verification as the binding constraint, and a systematic look at the economic conditions under which a genuinely dangerous autonomous agent actually gets to run.  Curious whether you think the analysis holds up, and whether there are important considerations I have missed!

And one of the authors of the METR timelines paper has his own helpful critique/clarifications of their results.

Good points. I'm basically taking METR's results at face value and showing that people are often implicitly treating costs (or cost per 'hour') as constant (especially when extrapolating them), whereas these costs appear to be growing substantially.

Re the quality / generalisability of the METR timelines, there is quite a powerful critique of it by Nathan Witkin. I wouldn't go as far as he does, but he's got some solid points. 


Thanks Basil! That's an interesting idea. The constant hazard rate model is just comparing two uses of the same model over different task lengths, so if you use it to work out the 99% time horizon, such a task should cost about 1/70th as much ($1.43). Over time, I think these 99% tasks should rise in cost in roughly the same way as the 50%-horizon ones (as both are increasing in length in proportion). But estimating how that will change in practice is especially dicey as there is too little data.
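
For anyone following the arithmetic, the 1/70 figure falls straight out of the constant hazard rate model, with the extra (hedged) assumption that cost is roughly proportional to task length:

```latex
S(t) = \left(\tfrac{1}{2}\right)^{t/h_{50}},
\qquad
S(t) = 0.99
\;\Rightarrow\;
\frac{t}{h_{50}} = \frac{\ln 0.99}{\ln 0.5} \approx 0.0145 \approx \frac{1}{69}
```

So the 99%-horizon task is roughly 1/70 of the length, and hence roughly 1/70 of the cost; the $1.43 figure corresponds to a 50%-horizon task costing on the order of $100.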

Also, note that Gus Hamilton has written a great essay that takes the s... (read more)

Some great new analysis by Gus Hamilton shows that AI agents probably don't obey a constant hazard rate / half-life after all. Instead their hazard rates systematically decline as the task goes on.

This means that their success rates on tasks beyond their 50%-horizon are better than the simple model suggests, but those for tasks shorter than the 50% horizon are worse.

I had suggested a constant hazard rate was a good starting assumption for how their success rate at tasks decays with longer durations. It is the simplest model and fits the data OK. But Gus us... (read more)
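
Here is a minimal sketch of the qualitative difference (my own toy parameterisation and function names, not Hamilton's fitted model): both curves agree at the 50% horizon, but the declining-hazard one is worse on shorter tasks and better on longer ones, as described above.

```python
import math

def survival_constant_hazard(t, h50):
    """Constant hazard rate: success probability halves for every h50 of task length."""
    return 0.5 ** (t / h50)

def survival_declining_hazard(t, h50, shape=0.5):
    """Weibull-style survival with shape < 1 (a hazard rate that falls as the task
    goes on), calibrated so that both models agree at the 50% horizon."""
    return math.exp(-math.log(2) * (t / h50) ** shape)

h50 = 1.0  # the 50%-horizon task length (units don't matter for the comparison)
for t in [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]:
    print(f"t = {t:4.2f} x h50:  constant-hazard {survival_constant_hazard(t, h50):.3f}"
          f"   declining-hazard {survival_declining_hazard(t, h50):.3f}")
```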

I agree — a bunch of the arguments read like marketing that greatly simplifies the real picture and doesn't seem very interested in digging deeper once a convenient story has been found.

That's a good summary and pretty in-line with my own thoughts on the overall upshots. I'd say that absent new scaling approaches the strong tailwind to AI progress from compute increases will soon weaken substantially. But it wouldn't completely disappear, there may be new scaling approaches, and there remains progress via AI research. Overall, I'd say it lengthens timelines somewhat, makes raw compute/finances less of an overwhelming advantage, and may require different approaches to compute governance.

2
Davidmanheim
Strong agree that absent new approaches the tailwind isn't enough - but it seems unclear that pretraining scaling doesn't have farther to go, and it seems that current approaches with synthetic data and training via RL to enhance one-shot performance have room left for significant improvement. I also don't know how much room there is left until we hit genius level AGI or beyond, and at that point even if we hit a wall, more scaling isn't required, as the timeline basically ends.

A few points to clarify my overarching view:

  1. All kinds of compute scaling are quite inefficient on most standard metrics. There are steady gains, but they are coming from exponentially increasing inputs. These can't continue forever, so all these kinds of gains from compute scaling are naturally time-limited. The exponential growth in inputs may also be masking fundamental deficiencies in the learning algorithms. (A compact statement of this is sketched after this list.)
  2. By 'compute scaling' I'm generally referring to the strategy of adding more GPUs to get more practically useful capabilities. I think this is runni
... (read more)
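
A compact way to state point 1 (a sketch of the general shape rather than a fitted law): if each 10x of compute buys a roughly constant capability increment, then capability is logarithmic in compute, and sustaining a steady rate of gains requires inputs to keep growing by a constant multiplicative factor each year.

```latex
C \approx a \log_{10}(\text{compute}) + b
\;\Rightarrow\;
\text{a constant } \Delta C \text{ per year requires compute to grow by a factor of } 10^{\Delta C / a} \text{ per year}
```

Which is exactly the kind of growth in inputs that cannot continue indefinitely.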
2
Ben_West🔸
I feel confused about this point because I thought the argument you were making implies a non-constant "tailwind." E.g. for the next generation these factors will be 1/2 as important as before, then the one after that 1/4, and so on. Am I wrong?
2
NickLaing
"All kinds of compute scaling are quite inefficient on most standard metrics. There are steady gains, but they are coming from exponentially increasing inputs." Is this kind of the opposite of Moore's law lol?

That is quite a surprising graph — the annual tripling and the correlation between the compute and revenue are much more perfect than I think anyone would have expected. Indeed they are so perfect that I'm a bit skeptical of what is going on. 

One thing to note is that it isn't clear what the compute graph is of (e.g. is it inference + training compute,  but not R&D?). Another thing to note is that it is year-end figures vs year total figures on the right, but given they are exponentials with the same doubling time and different units, that is... (read more)

Comparing AI scaling laws to Wright's law is an interesting idea. That is still a power law rather than logarithmic returns, but usefully comparable to both the pretraining and inference scaling behaviours.

Thanks for the comments. The idea that pretraining has slowed/stalled is in the background in many posts in my series and it is unfortunate I didn't write one where I addressed it head-on. I don't disagree with Vladimir Nesov as much as you may think. Some of this is that the terms are slippery.

I think there are three things under discussion:

  1. Scaling Laws. The empirical relationship between model size (or training data or compute) and the log-loss when predicting tokens from randomly chosen parts of the same data distribution that the model hadn't been trained on.
  2. Train
... (read more)

Thanks Paolo,

I was only able to get weak evidence of a noisy trend from the limited METR data, so it is hard to draw many conclusions from that. Moreover, METR's aim of measuring the exponentially growing length of useful work tasks is potentially more exposed to an exponential rise in compute costs than more safety-related tasks are. But overall, I'd think that the amount of useful compute you can apply to safety evaluations is probably growing faster year-on-year than one can sustainably grow the number of staff.

I'm not sure how the dynamics wil... (read more)

Yes, that is a big limitation. Even more limiting is that it is only based on a subset of METR's data on this. That's enough to raise the question and illustrate what an answer might look like in data like this, but not to really answer it.

I'm not aware of others exploring this question, but I haven't done much looking.

Thanks for the clarification, and apologies for missing that in your original comment.

1
Simon
You're definitely right that my original comment failed to explain the importance of redistribution! 

Hi Simon, I want to push back on your claims about markets a bit. 

Markets are great, especially when there are minimal market failures. I love them. They are responsible for a lot of good things. But the first and second fundamental theorems don't conclude that they maximise social welfare. That is a widely held misconception. 

The first concludes that they reach a point on the Pareto frontier, but such a point could be really quite bad. e.g. a great outcome for one person but misery for 8 billion can be Pareto efficient. I'm not sure that extreme... (read more)

5
Simon
Hey Toby, thanks for your comment! I'm not sure we really disagree, because I agree that transfers are an important part of the picture, that's why I said "all of the interesting work is in designing anti-trust and tax measures that are robust to superintelligence". But I agree that my comment insufficiently emphasized the redistributive aspect. The overall point is that we may not need to design a brand new set of institutions to deal with the rise of superintelligence. At least one hypothesis worth considering is that markets plus avoiding market failure plus redistributive tax and transfer will be sufficient. We already have lots of political institutions that seek to remedy distributive problems associated with markets, and maybe these institutions will scale with growth in GDP. Or maybe not! But at least, I think this is a great place to start for the analysis, as opposed to from scratch 

Whether it is true or not depends on the community and the point I'm making is primarily for EAs (and EA-adjacent people too). It might also be true for the AI safety and governance communities. I don't think it is true in general though — i.e. most citizens and most politicians are not giving too little regard to long timelines. So I'm not sure the point can be made when removing this reference.

Also, I'm particularly focusing on the set of people who are trying to act rationally and altruistically in response to these dangers, and are doing so in a somewhat coordinated manner. e.g. a key aspect is that the portfolio is currently skewed towards the near-term.

2
EdoArad
Re the first point, I agree that the context should be related to a person with an EA philosophy. Re the second point, I think that discussions about the EA portfolio are often interpreted as 0-sum or tribal, and may cause more division in the movement. I agree that most of the effects of such a debate are likely about shifting around our portfolio of efforts. However, there are other possible effects (recruiting/onboarding/promoting/aiding existing efforts, or increasing the amount of total resources by getting more readers involved). Also, a shift in the portfolio can happen as a result of object-level discussion, and it is not clear to me which way is better. I guess my main point is that I'd like people in the community to think less about what the community should think. Err.. oops..

The point I'm trying to make is that we should have a probability distribution over timelines with a chance of short, medium or long — then we need to act given this uncertainty, with a portfolio of work based around the different lengths. So even if our median is correct, I think we're failing to do enough work aimed at the 50% of cases that are longer than the median.

1
Dylan Richardson
I think that is both correct and interesting as a proposition. But the topic as phrased seems more likely to mire it in more timelines debate. The proposition is a step removed from:
  1. What timelines and probability distributions are correct
  2. Whether EAs are correctly calibrated
And only then do we get to:
  3. EAs are "failing to do enough work aimed at longer than median cases".
Arguably my topic "Long timelines suggest significantly different approaches than short timelines" sits between 2 and 3.
Answer by Toby_Ord

"EAs aren't giving enough weight to longer AI timelines"

(The timelines until transformative AI are very uncertain. We should, of course, hedge against it coming early when we are least prepared, but currently that is less of a hedge and more of a full-on bet. I think we are unduly neglecting many opportunities that would pay off only on longer timelines.)

-3
RedCat
I think the opposite: EAs aren't giving enough weight to present AI harms.
9
EdoArad
I think that this question will be better if it is framed not in terms of the EA community. This is because:
  1. The reasoning about the object-level question involving timelines and different intervention strategies is very interesting in itself, and there's no need to add the layer of understanding what the community is doing and how practically it could and should adjust.
  2. Signal boosting a norm of focusing less on intra-movement prioritization and more on personal or marginal additional prioritization and object-level questions.
For example, I like Dylan's reformulation attempt due to it being about object-level differences. Another could be to ask about the next $100K invested in AI safety.
4
Dylan Richardson
Perhaps "Long timelines suggest significantly different approaches than short timelines" is more direct and under discussed? I think median EA AI timelines are actually OK, it's more that certain orgs and individuals (like AI 2027) have tended toward extremity in one way or another.

I ran a timelines exercise in 2017 with many well-known FHI staff (though not including Nick) where the point was to elicit each person's current beliefs about AGI arrival by plotting CDFs. Looking at them now, I can tell you our median dates were: 2024, 2032, 2034, 2034, 2034, 2035, 2054, and 2079. So the median of our medians was (robustly) 2034 (i.e. 17 more years' time). I was one of the people who had that date, though people didn't see each others' CDFs during the exercise.

I think these have held up well.

So I don't think Eliezer's "Oxford EAs" point is correct.

I've often been frustrated by this assumption over the last 20 years, but don't remember any good pieces about it.

It may be partly from Eliezer's first alignment approach being to create a superintelligent sovereign AI, where if that goes right, other risks really would be dealt with.

Yeah, I mean 'more valuable to prevent', before taking into account the cost and difficulty.

At any rate, merely uncertain catastrophic risks do not have rerun risk, while chancy ones do. 

This is a key point. For many existential risks, the risk is mainly epistemic (i.e. we should assign some probability p to it happening in the next time period), rather than it being objectively chancy. For one-shot decision-making sometimes this distinction doesn't matter, but here it does.

Complicating matters, what is really going on is not just that the probability is one of two types, but that we have a credence distribution over the different levels of ... (read more)

4
William_MacAskill
I agree - this is a great point. Thanks, Simon! You are right that the magnitude of rerun risk from alignment should be lower than the probability of misaligned AI doom. However, worlds in which AI takeover is very likely but we can't change that, or worlds where it's very unlikely and we can't change that, aren't the interesting worlds from the perspective of taking action. (Owen and Fin have a post on this topic that should be coming out fairly soon.)  So, if we're taking this consideration into account, this should also discount the value of work to reduce misalignment risk today, too. (Another upshot: bio-risk seems more like chance than uncertainty, so bio-risk becomes comparatively more important than you'd think before this consideration.)
1
Peter Salib
Agree, and this relates to my point about distinguishing the likelihood of retaining alignment knowledge from the likelihood of rediscovering it. 

The value of saving philanthropic resources to deploy post-superintelligence is greater than it otherwise would be.

One way to think of this is that if there is a 10% existential risk from the superintelligence transition and we will attempt that transition, then the world is currently worth 0.90 V, where V is the expected value of the world after achieving that transition. So the future world is more valuable (in the appropriate long-term sense) and saving it is correspondingly more important. With these numbers the effect isn't huge, but would be importan... (read more)
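
One way to unpack that arithmetic (my gloss on the numbers above, kept deliberately crude): resources whose value scales with the value of the world are worth V per unit if deployed after a successful transition, versus 0.9 V in expectation if tied to the world as it stands today.

```latex
\frac{V}{0.9\,V} \approx 1.11
\quad
\text{(an } \approx 11\% \text{ premium at } 10\% \text{ risk; it would be } 2\times \text{ at } 50\% \text{ risk)}
```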

1
Oliver Sourbut
For clarity, you're using 'important' here in something like an importance x tractability x neglectedness factoring? So yes more important (but there might be reasons to think it's less tractable or neglected)?

That's a very nice and clear idea — I think you're right that working on making mission-critical, but illegible, problems legible is robustly high value.

5
Wei Dai
Thanks! I hope this means you'll spend some more time on this type of work, and/or tell other philosophers about this argument. It seems apparent that we need more philosophers to work on philosophical problems related to AI x-safety (many of which do not seem to be legible to most non-philosophers). Not necessarily by attacking them directly (this is very hard and probably not the best use of time, as we previously discussed) but instead by making them more legible to AI researchers, decisionmakers, and the general public.

It's very difficult to do this with benchmarks, because as the models improve, benchmarks come and go. Things that used to be so hard that a model couldn't do better than chance quickly become saturated, and we look for the next thing, then the one after that, and so on. For me, the fact that GPT-4 -> GPT-4.5 seemed to involve climbing about half of one benchmark was slower progress than I expected (and the leaks from OpenAI suggest they had similar views to me). When GPT-3.5 was replaced by GPT-4, people were losing their minds about it — both internally and o... (read more)

1
Peter
Yes, what you are scaling matters just as much as the fact that you are scaling. So now developers are scaling RL post training and pretraining using higher quality synthetic data pipelines. If the point is just that training on average internet text provides diminishing real world returns in many real-world use cases, then that seems defensible; that certainly doesn't seem to be the main recipe any company is using for pushing the frontier right now. But it seems like people often mistake this for something stronger like "all training is now facing insurmountable barriers to continued real world gains" or "scaling laws are slowing down across the board" or "it didn't produce significant gains on meaningful tasks so scaling is done." I mentioned SWE-Bench because that seems to suggest significant real world utility improvements rather than trivial prediction loss decrease. I also don't think it's clear that there is such an absolute separation here - to model the data you have to model the world in some sense. If you continue feeding multimodal LLM agents the right data in the right way, they continue improving on real world tasks. 

I was going to say something about lack of incentives, but I think it is also a lack of credible signals that the work is important, is deeply desired by others working in these fields, and would be used to inform deployments of AI. In my view, there isn't much desire for work like this from people in the field, and they probably wouldn't use it to inform deployment unless a lot of effort is also added from the author to meet the right people, convince them to spend the time to take it seriously, etc.

4
Wei Dai
Any thoughts on Legible vs. Illegible AI Safety Problems, which is in part a response to this?
4
Wei Dai
Right, I know about Will MacAskill, Joe Carlsmith, and your work in this area, but none of you are working on alignment per se full time or even close to full time AFAIK, and the total effort is clearly far from adequate to the task at hand. Any other names you can cite? Thanks, this makes sense to me, and my follow-up is how concerning do you think this situation is? One perspective I have is that at this point, several years into a potential AI takeoff, with AI companies now worth trillions in aggregate, alignment teams at AI companies still have virtually no professional philosophical oversight (or outside consultants that they rely on), and are kind of winging it based on their own philosophical beliefs/knowledge. It seems rather like trying to build a particle collider or fusion reactor with no physicists on the staff, only engineers. (Or worse, unlike engineers' physics knowledge, I doubt that receiving a systematic education in fields like ethics and metaethics is a hard requirement for working as an alignment researcher. And even worse, unlike the situation in physics, we don't even have settled ethics/metaethics/metaphilosophy/etc. that alignment researchers can just learn and apply.) Maybe the AI companies are reluctant to get professional philosophers involved, because in the fields that do have "professional philosophical oversight", e.g., bioethics, things haven't worked out that well. (E.g. human challenge trials being banned during COVID.) But to me, this would be a signal to yell loudly that our civilization is far from ready to attempt or undergo an AI transition, rather than a license to wing it based on one's own philosophical beliefs/knowledge. As an outsider, the situation seems crazily alarming to me, and I'm confused that nobody else is talking about it, including philosophers like you who are in the same overall space and looking at roughly the same things. I wonder if you have a perspective that makes the situation not quite as alarming

I don't know what to make of that. Obviously Vladimir knows a lot about state of the art compute, but there are so many details there without them being drawn together into a coherent point that really disagrees with you or me on this. 

It does sound like he is making the argument that GPT-4.5 was actually fine and on trend. I don't really believe this, and don't think OpenAI believed it either (there are various leaks suggesting they were disappointed with it, they barely announced it, and then they shelved it almost immediately).

I don't think the argument... (read more)

3
Peter
Shouldn't we be able to point to some objective benchmark if GPT-4.5 was really off trend? It got 10x the SWE-Bench score of GPT-4. That seems like solid evidence that additional pretraining continued to produce the same magnitude of improvements as previous scaleups. If there were now even more efficient ways than that to improve capabilities, like RL post-training on smaller o-series models, why would you expect OpenAI not to focus their efforts there instead? RL was producing gains and hadn't been scaled as much as self-supervised pretraining, so it was obvious where to invest marginal dollars. GPT-5 is better and faster than 4.5. This doesn't mean pretraining suddenly stopped working or went off trend from scaling laws though. 

Re 99% of academic philosophers, they are doing their own thing and have not heard of these possibilities and wouldn't be likely to move away from their existing areas if they had. Getting someone to change their life's work is not easy and usually requires hours of engagement to have a chance. It is especially hard to change what people work on in a field when you are outside that field.

A different question is about the much smaller number of philosophers who engage with EA and/or AI safety (there are maybe 50 of these). Some of these are working on some ... (read more)


I appreciate you raising this Wei (and Yarrow's responses too). They both echoed a lot of my internal debate on this. I'm definitely not sure whether this is the best use of my time. At the moment, my research time is roughly evenly split between this thread of essays on AI scaling and more philosophical work connected to longtermism, existential risk and post-AGI governance. It is much easier to demonstrate forward progress on the former, and there is more of a demand signal for it. With the latter, it is harder to be sure it is on the right path, and it is in less demand. M... (read more)

4
Wei Dai
Do you have any insights into why there are so few philosophers working in AI alignment, or closely with alignment researchers? (Amanda Askell is the only one I know.) Do you think this is actually a reasonable state of affairs (i.e., it's right or fine that almost no professional philosophers work directly as or with alignment researchers), or is this wrong/suboptimal, caused by some kind of cultural or structural problem? It's been 6 years since I wrote Problems in AI Alignment that philosophers could potentially contribute to and I've gotten a few comments from philosophers saying they found the list helpful or that they'll think about working on some of the problems, but I'm not aware of any concrete follow-ups. If it is some kind of cultural or structural problem, it might be even higher leverage to work on solving that, instead of object level philosophical problems. I'd try to do this myself, but as an outsider to academic philosophy and also very far from any organizations who might potentially hire philosophers to work on AI alignment, it's hard for me to even observe what the problem might be.

Thanks. I'm also a bit surprised by the lack of reaction to this series given that:

  • compute scaling has been the biggest story of AI in the last few decades
  • it has dramatically changed
  • very few people are covering these changes
  • it is surprisingly easy to make major crisp contributions to our understanding of it just by analysing the few pieces of publicly available data
  • the changes have major consequences for AI companies, AI timelines, AI risk, and AI governance
1
Noah Birnbaum
I agree — it seems weird that people haven’t updated very much.  However, I wrote a similarly-purposed (though much less rigorous) post entitled “How To Update if Pre-Training is Dead,” and Vladmir Nesov wrote the following comment (before GPT 5 release), which I would be curious to hear your thoughts on: Frontier AI training compute is currently increasing about 12x every two years, from about 7e18 FLOP/s in 2022 (24K A100s, 0.3e15 BF16 FLOP/s per chip), to about 1e20 FLOP/s in 2024 (100K H100s, 1e15 BF16 FLOP/s per chip), to 1e21 FLOP/s in 2026 (Crusoe/Oracle/OpenAI Abilene system, 400K chips in GB200/GB300 NVL72 racks, 2.5e15 BF16 FLOP/s per chip). If this trend takes another step, we'll have 1.2e22 FLOP/s in 2028 (though it'll plausibly take a bit longer to get there, maybe 2.5e22 FLOP/s in 2030 instead), with 5 GW training systems. So the change between GPT-4 and GPT-4.5 is a third of this path. And GPT-4.5 is very impressive compared to the actual original GPT-4 from Mar 2023, it's only by comparing it to more recent models that GPT-4.5 isn't very useful (in its non-reasoning form, and plausibly without much polish). Some of these more recent models were plausibly trained on 2023 compute (maybe 30K H100s, 3e19 FLOP/s, 4x more than the original GPT-4), or were more lightweight models (not compute optimal, and with fewer total params) trained on 2024 compute (about the same as GPT-4.5). So what we can actually observe from GPT-4.5 is that increasing compute by 3x is not very impressive, but the whole road from 2022 to 2028-2030 is a 1700x-3500x increase in compute from original GPT-4 (or twice that if we are moving from BF16 to FP8), or 120x-250x from GPT-4.5 (if GPT-4.5 is already trained in FP8, which was hinted at in the release video). Judging the effect of 120x from the effect of 3x is not very convincing. And we haven't really seen what GPT-4.5 can do yet, because it's not a reasoning model. The best large model inference hardware available until ver

For my part, I simply didn't know the series existed until seeing this post, since this is the only post in the series on EAF.  :)

4
Sharmake
The crux for me is that I don't agree that compute scaling has dramatically changed, because I don't think the returns to pre-training scaling have gotten much worse.
3
Lowe Lundin
Agreed! The series has been valuable for my personal thinking around this (I quoted the post I linked above as late as yesterday.) Imo, more people should be paying attention to this.

Thanks Jeff, this was very helpful. I'd listened to Andrew Snyder-Beattie's excellent interview on the 80,000 Hours podcast and wanted to buy one of these, but hadn't known exactly what to buy until now.

my hope with this essay is simply to make a case that all might benefit from a widening of Longtermism’s methods and a greater boldness in proclaiming that it is a part of the greatness of being human to be heroically, even slightly irrationally, generous in our relationship with others, including future generations, out of our love for humanity itself. 

This is a very interesting approach, and I don't think it is in conflict with the approach in the volume. I hope you develop it further.

6
Fr Peter Wyg
Thank you for your encouraging words. They mean a great deal. 

Thanks so much for writing this Will, I especially like the ideas:

  1. It is much more clear now than it was 10 years ago that AI will be a major issue of our time, affecting many aspects of our world (and our future). So it isn't just relevant as a cause, but instead as something that affects how we pursue many causes, including things like global health, global development, pandemics, animal welfare etc.
  2. Previously EA work on AI was tightly focused around technical safety work, but expansion of this to include governance work has been successful and we will ne
... (read more)
6
William_MacAskill
Thanks - classic Toby point!  I agree entirely that you need additional assumptions. I was imagining someone who thinks that, say, there's a 90% risk of unaligned AI takeover, and a 50% loss of EV of the future from other non-alignment issues that we can influence. So EV of the future is 5%. If so, completely solving AI risk would increase the EV of the future to 50%; halving both would increase it only to 41%. But, even so, it's probably easier to halve both than to completely eliminate AI takeover risk, and more generally the case for a mixed strategy seems strong. 
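
The arithmetic behind those figures, assuming the two sources of lost value are independent:

```latex
\text{EV} = (1 - 0.9)(1 - 0.5) = 5\%,
\qquad
\text{fully solve AI risk: } (1 - 0)(1 - 0.5) = 50\%,
\qquad
\text{halve both: } (1 - 0.45)(1 - 0.25) \approx 41\%
```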

Your comment above is the most informative thing I've read so far on the likelihood of the end of democracy in America. I especially appreciate the mix of key evidence pointing in both directions.

Thanks for this Kuhan — a great talk.

I'm intrigued about the idea of promoting a societal-level culture of substantially more altruism. It does feel like there is room for a substantial shift (from a very low base!) and it might be achievable.

Chana, this is incredible work by you, Aric, and the rest of the team.

It’s not at all easy to balance being informative, sober, engaging, and touching — all while addressing the most important issues of our time — but you’re knocking it out of the park.

2
ChanaMessinger
That means the world, Toby, on behalf of the whole team, thank you!

Interesting! So this is a kind of representation theorem (a bit like the VNM Theorem), but instead of saying that Archimedean preferences over gambles can be represented as a standard sum, it says that any aggregation method (even a non-Archimedean one) can be represented by a sum of a hyperreal utility function applied to each of its parts.

2
Ben_West🔸
Yes, I think that's a good summary!

The short answer is that if you take the partial sums of the first n terms, you get the sequence 1, 1, 2, 3, 4, … which settles down to have its nth element being n − 1 and thus is a representative sequence for the number ω − 1. I think you'll be able to follow the maths in the paper quite well, especially if trying a few examples of things you'd like to sum or integrate for yourself on paper.

(There is some tricky stuff to do with ultrafilters, but that mainly comes up as a way of settling the matter for which of two sequences represents t... (read more)

One possibility is this: I don't value prospects by their classical expected utilities, but by the version calculated with the hyperreal sum or integral (which agrees to within an infinitesimal when the answer is finite, but can disagree when it is divergent). So I don't actually want the classical expected utility to be the measure of a prospect. It is possible that the continuity axiom gets you there and my modified or dropped version can allow caring about the hyperreal version of expectation.

Interesting question.

I think there is a version of VNM utility that survives and captures the core of what we wanted: i.e. a way of representing consistent ways of ordering prospects via cardinal values of individual outcomes — it is just that these values of outcomes can be hyperreals. I really do think the 'continuity' axiom (which is really an Archimedean axiom saying that nothing is infinitely valuable compared to something else) is obviously false in these settings, so has to go (or to be replaced by a version that allows infinitesimal probabilities... (read more)

7
Michael St Jules 🔸
Without continuity (but maybe some weaker assumptions required), I think you get a representation theorem giving lexicographically ordered ordinal sequences of real utilities, i.e. a sequence of expected values, which you compare lexicographically. With an infinitary extension of independence or the sure-thing principle, you get lexicographically ordered ordinal sequences of bounded real utilities, ruling out St Petersburg-like prospects, and so also ruling out risk-neutral expectational utilitarianism.

Thanks!

On the hyperreal approach, 1 + 0 + 1 + 1 + … does actually equal ω − 1, as desired.

This is an example of the general fact that adding zeros can change the hyperreal valuation of an infinite sum, which is a property that is pretty common in variant summation methods.
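
Putting the two sums side by side in the representative-sequence picture from the earlier comment (a compact restatement, not new machinery):

```latex
1 + 1 + 1 + \cdots \;\leftrightarrow\; (1,\,2,\,3,\,\ldots,\,n,\,\ldots) = \omega
\qquad\qquad
1 + 0 + 1 + 1 + \cdots \;\leftrightarrow\; (1,\,1,\,2,\,\ldots,\,n-1,\,\ldots) = \omega - 1
```

Inserting the zero makes the partial sums lag one step behind, which here lowers the value from ω to ω − 1 — exactly the sense in which adding zeros can change the hyperreal valuation.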

2
Jeff Kaufman 🔸
Thanks! I'm glad this has 1 + 0 + 1 + 1 + … = ω − 1, but I'm going to need to go read more to understand why ;)

My chapter, Shaping Humanity's Longterm Trajectory, aims to better understand how reducing existential risk compares with other ways of influencing the longterm future. Helping avert a catastrophe can have profound value due to the way that the short-run effects of our actions can have a systematic influence on the long-run future. But it isn't the only way that could happen. 

For example, if we advanced human progress by a year, perhaps we should expect to see us reach each subsequent milestone a year earlier. And if things are generally becoming bett... (read more)

Here is a nice simple model of the trade-off between redundancy and correlated risk. Assume that each time period, each planet has an independent and constant chance of destroying civilisation on its own planet and an independent and constant chance of destroying civilisation on all planets. Furthermore, assume that unless all planets fail in the same time period, they can be restored from those that survive. 

e.g. assume the planetary destruction rate is 10% per century and the galaxy destruction rate is 1 in 1 million per century. Then with one plane... (read more)
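
A minimal sketch of that model in code, under my reading of the setup (each planet independently rolls both the local risk and the galaxy-wide risk every century, and survivors restore any failed planet unless all fail in the same period; function and parameter names are my own):

```python
def extinction_rate(n_planets, p_local=0.10, p_correlated=1e-6):
    """Per-century chance that civilisation is lost everywhere in the toy model:
    it ends if any planet triggers the galaxy-wide destruction event, or if every
    planet independently destroys its own civilisation in the same period."""
    no_galaxy_event = (1 - p_correlated) ** n_planets
    not_all_local_failures = 1 - p_local ** n_planets
    return 1 - no_galaxy_event * not_all_local_failures

for n in [1, 2, 3, 5, 6, 7, 10, 20]:
    print(f"{n:2d} planets: {extinction_rate(n):.2e} per century")
```

With these numbers the rate falls roughly tenfold per extra planet until the correlated term takes over (around six planets), after which adding planets slowly makes things worse — the redundancy versus correlated-risk trade-off in miniature.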

This is a very interesting post. Here's how it fits into my thinking about existential risk and time and space.

We already know about several related risk effects over space and time:

  1. If different locations in space can serve as backups, such that humanity fails only if all of them fail simultaneously, then the number of these only needs to grow logarithmically before there is a non-zero chance of indefinite survival (a short derivation is sketched after this list).
  2. However, this does not solve existential risk, as it only helps with uncorrelated risks such as asteroid impacts. Some risks are correlated bet
... (read more)
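
A sketch of why logarithmic growth suffices for point 1 (uncorrelated risks only, in my own notation): suppose period t has n_t independent backups, each failing that period with probability p.

```latex
P(\text{all fail in period } t) = p^{\,n_t},
\qquad
P(\text{indefinite survival}) = \prod_{t \ge 1}\bigl(1 - p^{\,n_t}\bigr) > 0
\;\iff\;
\sum_{t \ge 1} p^{\,n_t} < \infty
```

With n_t = c · ln t the summand is t^{−c·ln(1/p)}, which converges whenever c · ln(1/p) > 1, so even logarithmic growth in the number of backups leaves a non-zero chance of surviving forever — provided the risks really are uncorrelated, which is exactly the caveat in point 2.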

