All of Sharmake's Comments + Replies

While I agree that the optimizer's curse is a problem, and one that is relevant for certain sectors of EA, I will also say that, given the very high variance in expected impact between causes, it is much less of a problem than other issues in EA epistemics, which is why it hasn't received much attention.

That said, you do note some very interesting things about the optimizer's curse, so the post is valuable beyond restating the problem. Credit where it's due: it's a nice incremental improvement.

To a large extent I agree that RL scaling is basically just inference scaling, but I disagree immensely with the claim below, and this gives me different expectations of AI progress over the next 4-6 years (in the longer term I agree: absent new paradigms, inference scaling will become more important and AI progress will slow back down to the prior compute trend of 1.55x efficiency per year, rather than 3-4x more compute every year):

> In the last year or two, the most important trend in modern AI came to an end. The scaling-u... (read more)
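For a rough sense of how much those two trajectories diverge, here is a minimal back-of-the-envelope sketch. The 6-year window and the use of 3.5x as a midpoint of the 3-4x range are my own illustrative assumptions, not figures from the comment:

```python
# Back-of-the-envelope comparison (illustrative only): the prior compute
# efficiency trend of ~1.55x/year vs. the recent ~3-4x/year compute scaleup.
YEARS = 6  # roughly the "next 4-6 years" window discussed above

def cumulative_growth(rate_per_year: float, years: int) -> float:
    """Total multiplier after compounding rate_per_year for `years` years."""
    return rate_per_year ** years

slow = cumulative_growth(1.55, YEARS)  # prior efficiency trend
fast = cumulative_growth(3.5, YEARS)   # midpoint of the 3-4x range

print(f"1.55x/year over {YEARS} years: ~{slow:.0f}x")
print(f"3.5x/year over {YEARS} years:  ~{fast:,.0f}x")
print(f"Gap between the two trajectories: ~{fast / slow:.0f}x")
```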

3
Toby_Ord
Thanks for the comments. The idea that pretraining has slowed/stalled is in the background in many posts in my series and it is unfortunate I didn't write one where I addressed it head-on. I don't disagree with Vladimir Nesov as much as you may think. Some of this is that the terms are slippery. I think there are three things under discussion:
1. Scaling Laws. The empirical relationship between model size (or training data or compute) and the log-loss when predicting tokens from randomly chosen parts of the same data distribution that it hadn't trained on.
2. Training Compute Increases. The annual increase in the amount of compute used to pretrain a frontier model.
3. Value of scaling. The practical returns from each 10x to the amount of pre-training compute.
In my view, the scaling laws (1) may well hold, and I wouldn't be surprised if there isn't even a kink in the curve. This is what Nesov is mainly discussing and I don't disagree with him about it. My view is that the annual training compute scaleup for frontier models has declined (from more than 10x to less than 3x) and that the value per 10x has also declined (possibly due to having already trained on all books, leaving only marginal gains in Reddit comments etc.). As witness to this, consider that the Epoch estimates for total training of OpenAI's leading model are just about 2x as high as the original GPT-4, released almost 3 years ago. They did have a version (GPT-4.5) that was 10x as high, but were disappointed by it and quickly sunsetted it. xAI has a version with even more than that, but it is only about the 5th best model and isn't widely acclaimed, despite having scaled the most. I see the fact that companies can't economically serve large pretrained models as part of an explanation for the stalling of pretraining, rather than as a counterargument. Note that I'm not saying pre-training scaling is dead (or anything about Scaling Laws). I'm saying something more like: Pretraining scaling has r

For what it's worth, I think pre-training alone is probably enough to get us to roughly 1-3 month time horizons based on a 7-month doubling time, but pre-training data will start to run out in the early 2030s, meaning that we would no longer (in the absence of other benchmarks) have very good general proxies for capability improvements.
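To make the arithmetic behind that explicit, here is a minimal sketch of the extrapolation. The ~2-hour starting horizon in 2025 and the assumption that the 7-month doubling holds cleanly are illustrative assumptions on my part, not claims from the comment:

```python
import math

# Illustrative extrapolation of task time horizons under a fixed doubling time.
# Assumptions (hypothetical): ~2-hour horizon at the start of 2025, a doubling
# every 7 months, and 1 work month ~= 167 hours.
start_year = 2025.0
start_horizon_hours = 2.0
doubling_time_years = 7 / 12
work_month_hours = 167.0

def year_reached(target_hours: float) -> float:
    """Year at which the horizon reaches target_hours, given the assumptions above."""
    doublings_needed = math.log2(target_hours / start_horizon_hours)
    return start_year + doublings_needed * doubling_time_years

print(f"1-month horizon (~167 h): ~{year_reached(work_month_hours):.1f}")
print(f"3-month horizon (~500 h): ~{year_reached(3 * work_month_hours):.1f}")
```

Under these assumptions the 1-3 month horizons land around the late 2020s to 2030, roughly when the comment expects pre-training data to run out.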

The real issue isn't the difference between hours-long and months-long tasks, but the difference between months-long tasks and century-long tasks, which Steve Newman describes well here.

Nice write-up on the issue.

One thing I will say is that I'm maybe unusually optimistic on power concentration compared to a lot of EAs/LWers, and the main divergence is that I treat this counter-argument as decisive enough that I think the risk of power concentration doesn't go through, even in scenarios where humanity is basically as careless as possible.

This is due to evidence on human utility functions showing that most people have diminishing returns to utility on exclusive goods for personal use, returns that diminish fast enough that altruism... (read more)

The main reason I voted for Forethought and MATS is that I believe AI governance/safety is unusually important, with only Farmed/Wild animal welfare being competitive in terms of EV, and I believe AI has a reasonable chance of being so powerful as to make other cause areas' assumptions irrelevant, meaning their impact is much, much less predictable without considering AI governance/safety.

One of the key issues with "making the future go well" interventions is that we run up against the reality that what counts as a desirable future varies so much between different humans that the concept requires buying into ethical assumptions that people won't share, meaning it's much less valid as any sort of absolute metric to coordinate around:

(A quote from Steven Byrnes here):

  • When people make statements that implicitly treat "the value of the future" as being well-defined, e.g. statements like
... (read more)

I'm commenting late, but I don't think the better futures perspective gets us back to intuitive/normie ethical views, because what counts as a better future has far more variation in values than preventing catastrophic outcomes (I'm making the empirical claim that most human values converge more on what they want to avoid than on what they want to seek out). The other issue is that, to a large extent, AGI/ASI in the medium/long term is very totalizing in its effects, meaning that basically the only thing that matters is getting a friend... (read more)

An example here is this quote, which strays dangerously close to "these people have a morality that you find offensive, therefore they are wrong on the actual facts of the matter" (otherwise you would make the Nazi source allegations less central to your criticism here):

(To be clear, I don't hold the moral views expressed in the quote.)

> It has never stopped shocking and disgusting me that the EA Forum is a place where someone can write a post arguing that Black Africans need Western-funded programs to edit their genomes to increase their intelli

... (read more)

Another issue, and the reason the comment is getting downvoted heavily (including by me), is that you seem to conflate "is" and "ought" throughout this post, and without that conflation of the is-ought distinction, the post would not exist.

You routinely leap from "a person has moral views that are offensive to you" to "they are wrong about the facts of the matter", and your evidence for this is paper-thin at best.

Being able to separate moral views from beliefs about factual claims is one of the things expected in EA/LW spaces.

This is not mut... (read more)

-5
Yarrow Bouchard 🔸

I currently can't find a source, but to elaborate a little: my reason for thinking this is that the GPT-4 to GPT-4.5 scaleup used 15x the compute rather than 100x. I recall that a 10x compute increase is roughly enough to be competitive with current algorithmic improvements that don't involve scaling up models, whereas 100x compute increases produce the wow moments we associate with GPT-3 to GPT-4. The GPT-5 release was not a scale-up of compute, but rather a productionization of GPT-4.5.
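As a rough back-of-the-envelope comparison of those scaleups in orders-of-magnitude terms (the factors simply restate the figures above; treating 10x as the rough non-scaling algorithmic baseline follows the comment and is not an independent estimate):

```python
import math

# Back-of-the-envelope: pre-training compute scaleups between model
# generations, expressed in orders of magnitude (OOM), using the figures above.
scaleups = {
    "GPT-3 -> GPT-4 ('wow'-level jump)": 100,
    "GPT-4 -> GPT-4.5": 15,
    "Roughly matches non-scaling algorithmic gains": 10,
}

for label, factor in scaleups.items():
    print(f"{label}: {factor}x = {math.log10(factor):.2f} OOM")

# On this framing, 15x is only ~0.18 OOM beyond what non-scaling algorithmic
# progress already delivers, versus the ~2 OOM jumps behind earlier step changes.
```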

I'm more in the camp of "I find little reason to believe that pre-training returns have declined" here.

1
Yarrow Bouchard 🔸
I’ll just mention that, for what it’s worth, the AI researcher and former OpenAI Chief Scientist Ilya Sutskever thinks the scaling of pre-training for LLMs has run out of steam. Dario Amodei, the CEO of Anthropic, has also said things that seem to indicate the scaling of pre-training no longer has the importance it once did.  Other evidence would be reporters talking to anonymous engineers inside OpenAI and Meta who have expressed disappointment with the results of scaling pre-training. Toby mentioned this in another blog post and I quoted the relevant paragraph in a comment here.

The crux for me is that I don't agree compute scaling has dramatically changed, because I don't think the returns to pre-training scaling have gotten much worse.

I broadly don't think inference scaling is the only path, primarily because I disagree with the claim that pre-training returns have declined much, and I see the GPT-4.5 evidence as mostly a case of broken compute promises making everything disappointing.

I also have a hypothesis that current RL is mostly serving as an elicitation method for pre-trained AIs.

We shall see in 2026-2027 whether this remains true.

1
Yarrow Bouchard 🔸
Could you elaborate or link to somewhere where someone makes this argument? I'm curious to see if a strong defense can be made of self-supervised pre-training of LLMs continuing to scale and deliver worthwhile, significant benefits.

A big part of the issue, IMO, is that EA funding is often heavily skewed toward people who have captured the long tail of wealth/income. While this is necessary for EA to be as impactful as it is in a world where it's good for EA to remain small, and I'd still say the strategy was positive overall, it also inevitably distorts conversations: people reasonably fear that being unable to justify themselves to (or defer to) a funder means they can't get off the ground at all, since there are few alternative funders.

So th... (read more)

My general take on gradual disempowerment, independent of any other issues raised here, is that it's a coherent scenario but very unlikely to arise in practice, because it relies on an equilibrium in which the very imperfect alignment needed for long-run divergence between human and AI interests remains stable, even as the reasons why alignment in humans is so spotty/imperfect get knocked out.

In particular, I'm relatively bullish on automated AI alignment conditional on n... (read more)

Sharmake
2
0
0
50% agree

> The "arbitrariness" of precise EVs is just a matter of our discomfort with picking a precise number (see above).

 

A non-trivial reason for this is that precise numbers expose ideological assumptions, and a whole lot of people do not like this.

It's easy to lie with numbers, but it's even easier to lie without a number.

Crossposting a comment from LessWrong:

@1a3orn goes deeper into another dynamic that causes groups to hold false beliefs while believing they are true: some bullshit beliefs help you figure out whom to exclude, namely the people who don't currently hold the belief. In particular, assholery also helps people who don't want their claims checked, which is a reason I think politeness is actually useful in practice for rationality:

(Sharmake's first tweet): I wrote something on a general version of this selection effect, and why it's so

... (read more)

Another story is that this is a standard diminishing-returns case: once we have removed all the very big blockers, like non-functional rule of law, insecure property rights, untreated food and water, and disease, it's very hard to help the people who still remain poor, because all the easy wins have been taken, so what we are left with are the harder or near-impossible poverty cases.

2
Linch
Yeah I think these two claims are essentially the same argument, framed in different ways. 

I think each galactic x-risk on the list can probably be disregarded, but combined, and with the knowledge that we are extremely early in thinking about this, they present a very convincing case to me that at least 1 or 2 galactic x-risks are possible.

I think this is kind of a crux, in that I currently think the only possible galactic-scale risks are ones where our standard model of physics breaks down in a deep way. Once you can get at least one Dyson swarm going up, you are virtually invulnerable to extinction methods that don't involve us being ver... (read more)

On hot take 2, this relies on the risks from each star system being roughly independent, so breaking this assumption seems like a good solution, but then each star system being very correlated maybe seems bad for liberalism, diversity of forms of flourishing, and so forth. But maybe some amount of regularity and conformity is the price we need to pay for galactic security.

I think liberalism is unfortunately on a timer that will almost certainly expire pretty soon, no matter what we do.

We either technologically regress due to the human population fall... (read more)

Sharmake
2
0
0
50% disagree

Interstellar travel will probably doom the long-term future

 

A lot of my disagreement stems from thinking that most galactic-scale disasters either don't actually serve as x-risks (like the von Neumann probe scenario) because they can be defended against, or require some shaky premises about physics to come true.

Changing the universe's constants is an example.

Also, in most modern theories of time travel, you only get self-consistent outcomes, and a lot of the classic portrayals of using time travel to destroy the universe through paradoxica... (read more)

1
JordanStone
> I think each galactic x-risk on the list can probably be disregarded, but combined, and with the knowledge that we are extremely early in thinking about this, they present a very convincing case to me that at least 1 or 2 galactic x-risks are possible.

Really interesting point, and probably a key consideration on existential security for a spacefaring civilisation. I'm not sure if we can be confident enough in acausal trade to rely on it for our long-term existential security though. I can't imagine human civilisation engaging in acausal trade if we expanded before the development of superintelligence. There are definitely some tricky questions to answer about what we should expect other spacefaring civilisations to do. I think there's also a good argument for expecting them to systematically eliminate other spacefaring civilisations rather than engage in acausal trade.

The main unremovable advantages of AIs over humans will probably be in the following 2 areas:

  1. A serial speed advantage of 50-1000x, with my median in the 100-500x range, and more generally the ability to run slower or faster to do proportionally more work, though there are tradeoffs at either extreme of running slow or fast.

  2. The ability for compute/software improvements to directly convert into more researchers with essentially 0 serial time necessary, unlike basically all of reproduction (about the only cases where it even ge

... (read more)

I'm trying to identify why the trend has lasted, so that we can predict when the trend will break down.

That was the purpose of my comment.

Sharmake
2
0
0
100% disagree

Consequentialists should be strong longtermists

 

I disagree, mostly due to the "should" wording: believing in consequentialism doesn't obligate you to adopt any particular discount rate or discount function. These are basically free parameters, so discount rates are independent of consequentialism.

Sharmake
2
0
0
50% agree

Bioweapons are an existential risk


I'll just repeat @weeatquince's comment, since he already covered the issue better than I did:

> With current technology probably not an x-risk. With future technology I don’t think we can rule out the possibility of bio-sciences reaching the point where extinction is possible. It is a very rapidly evolving field with huge potential.

I mean the trend of very fast increases in compute dedicated to AI, and my point is that fabs and chip manufacturers have switched their customer base to AI companies.

1
Yarrow Bouchard 🔸
I still don't follow. What point are you trying to make about my comment or about Ege Erdil's post?
Sharmake
2
0
0
40% disagree

AGI by 2028 is more likely than not

 

While I think AGI by 2028 is reasonably plausible, there are way too many factors that have to go right in order to get AGI by 2028, and this is true even if AI timelines are short.

 

To be clear, I do agree that if we don't get AGI by the early 2030s at the latest, AI progress will slow down; I just don't have nearly enough credence in the supporting arguments to put my median at 2028.

The basic reason for the trend continuing so far is that NVIDIA et al have diverted normal compute expenditures into the AI boom.

I agree that the trend will stop, likely around 2027-2033 (this is where my uncertainty is widest), and once that happens the probability of getting AGI soon will go down quite a bit (if it hasn't happened by then).

1
Yarrow Bouchard 🔸
I don't understand what you're trying to say here. By "the trend", do you mean Nvidia's revenue growth? And what do you mean by "have diverted normal compute expenditures into the AI boom"?

@Vasco Grilo🔸's comment is reproduced here for posterity:

> Thanks for sharing, Sharmake! Have you considered crossposting the full post? I tend to think this is worth it for short posts.

8
Vasco Grilo🔸
Thanks for sharing the full post!

My own take is that while I don't want to defend the "find a correct utility function" approach to alignment as sufficient at this time, I do think it is actually necessary, and that the modern era is an anomaly in how much we can get away with misalignment being checked by institutions beyond the individual.

The basic reason why we can get away with not solving the alignment problem is that humans depend on other humans, and in particular you cannot replace humans with much cheaper workers that have their preferences controlled arbitrarily.

AI thr... (read more)

My own take on AI Safety Classic arguments is that o3/Sonnet 3.7 have convinced me the "alignment is very easy" hypothesis looks a lot shakier than it used to, and I suspect future capabilities progress is likely to be at best neutral, and probably negative, for the case that alignment is very easy.

I do think you can still remain optimistic based on other cases, but a pretty core crux is that I think alignment does need to be solved if AIs are able to automate the economy, and this is pretty robust to variations in what happens with AI.

The big reason for this is ... (read more)

For what it's worth, I basically agree with the view that Mechanize is unlikely to be successful at its goals:

> As a side note, it’s also strange to me that people are treating the founding of Mechanize as if it has a realistic chance to accelerate AGI progress more than a negligible amount — enough of a chance of enough of an acceleration to be genuinely concerning. AI startups are created all the time. Some of them state wildly ambitious goals, like Mechanize. They typically fail to achieve these goals. The startup Vicarious comes to mind.
>
> There are many s

... (read more)

I'll flag, for the purposes of scout mindset/honesty, that o3 is pretty clearly misaligned in ways that arguably track standard LW concerns around RL:

https://x.com/TransluceAI/status/1912552046269771985

Relevant part of the tweet thread:

> Transluce: We tested a pre-release version of o3 and found that it frequently fabricates actions it never took, and then elaborately justifies these actions when confronted. We were surprised, so we dug deeper. We generated 1k+ conversations using human prompters and AI investigator agents, then use

... (read more)

I incorrectly thought that you had also left; I've edited my comment.

To be honest, I don't necessarily think it's as bad as people claim, though I still don't think it was a great action relative to the available alternatives, and at best it's not the most effective thing you could do to make AI safe.

One of my core issues, and a big crux here, is that I don't really believe you can succeed at automating the whole economy with cheap robots without also letting actors speed up the race to superintelligence/superhuman AI researchers a lot.

And if we put any weight on misalignment, we sho... (read more)

[This comment is no longer endorsed by its author]

Saying that I personally support faster AI development because I want people close to me to benefit is not the same as saying I'm working at Epoch for selfish reasons.

I've had opportunities to join major AI labs, but I chose to continue working at Epoch because I believe the impact of this work is greater and more beneficial to the world.

That said, I’m also frustrated by the expectation that I must pretend not to prioritize those closest to me. I care more about the people I love, and I think that’s both normal and reasonable—most people operate this way. That doesn’t mean I don’t care about broader impacts too.

I agree with most of this, although I have two big disagreements with the article:

  1. I think alignment is still important and net-positive, but yeah I've come to think it's no longer the number 1 priority, for the reasons you raise.
     

  2. I think that, with the exception of biotech and maybe nanotech, no plausible technology in the physical world can actually become a recipe for ruin, unless we are deeply wrong about how the physical laws of the universe work, so we can just defer that question to AI superintelligences.

The basic reason for this is that once you are a... (read more)

> This is begging the question! My whole objection is that alignment of ASI hasn't been established to be possible.

 

A couple of things I'll say here:

  1. You do not need a strong theory for why something must be possible in order to put non-trivial credence on its being possible, and if you hold a prior that the scientific difficulty of doing something is often overrated, especially if you believe alignment is possibly automatable and that a lot of people overrate the difficulty of automating things, that's enough to cut p(doom) by a lot, argu
... (read more)

The prediction that many moral perspectives care more about averting downsides than producing upsides is well explained if we live in a morally relativist multiverse, where there is an infinity of correct moral systems and which one you arrive at is path- and starting-point-dependent, but where many moral perspectives share the instrumental goal of avoiding extinction/disempowerment, because extinction means that morality loses out in the competition for survival/dominance.

cf @quinn's positive vs negative longtermism framewo... (read more)

Some thoughts on this comment:

On this part:

> I responded well to Richard's call for More Co-operative AI Safety Strategies, and I like the call toward more sociopolitical thinking, since the Alignment problem really is a sociological one at heart (always has been). Things which help the community think along these lines are good imo, and I hope to share some of my own writing on this topic in the future.

I don't think it was always a large sociological problem, but yeah I've updated more towards the sociological aspect of alignment being important (especially... (read more)

I agree that, conditional on escapes/rogue internal deployments like this scenario by Buck, with a lot of contributors, we get much larger disasters, and if the AI is unaligned, then unless we have an aligned AI with somewhat similar capabilities, we lose.

My point is more so that you are way overestimating how many chances the AI has to overthrow us before it is aligned.

https://www.lesswrong.com/posts/ceBpLHJDdCt3xfEok/ai-catastrophes-and-rogue-deployments

But the crux might be that I don't think that we need that much reliability for AI catching, ... (read more)

2
Greg_Colbourn ⏸️
This is begging the question! My whole objection is that alignment of ASI hasn't been established to be possible. So it will worry about being in a kind of panopticon? Seems pretty unlikely. Why should the AI care about being caught any more than it should about any given runtime instance of it being terminated?

While finm made a general comment in response to you, I want to specifically focus on the footnote, because I think it's a central crux in why a lot of EAs are way less doomy than you.

Quote below:

  1. We need at least 13 9s of safety for ASI, and the best current alignment techniques aren't even getting 3 9s...

I think the 13 9s can be reduced to something requiring closer to 1-2 9s at the very least, and there are 2 reasons for this:

  1. I think you drastically overestimate how many chances the AI gets at misalignment, because the trillions of executions will use fa
... (read more)
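To spell out the arithmetic behind the disagreement over "how many 9s", here is a minimal sketch. It treats each high-stakes opportunity as an independent trial and uses an illustrative 10% cumulative risk budget; both the trial counts and the budget are my own assumptions, not figures from either side:

```python
import math

# Illustrative "nines of safety" arithmetic: how the required per-attempt
# reliability depends on how many independent chances the AI actually gets.

def required_failure_rate(num_chances: int, total_risk_budget: float) -> float:
    """Per-attempt failure probability p such that 1 - (1 - p)**num_chances
    equals the total risk budget (assuming independent attempts)."""
    return 1 - (1 - total_risk_budget) ** (1 / num_chances)

risk_budget = 0.10  # illustrative: accept ~10% cumulative risk

for num_chances in (10**12, 10**6, 100, 10):
    p = required_failure_rate(num_chances, risk_budget)
    nines = -math.log10(p)
    print(f"{num_chances:>13,} chances -> p per attempt < {p:.1e} (~{nines:.0f} nines)")
```

Under these assumptions, trillions of independent chances do imply roughly 13 nines per attempt, while only tens to hundreds of genuinely high-stakes chances imply closer to 2-3 nines, which is roughly the shape of the disagreement here.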
2
Greg_Colbourn ⏸️
The little compute leads to much more once it has escaped! The point is that we won't, unless we have many more 9s of reliability in terms of catching such attempts!

I basically agree with this, with one particular caveat: the EA and LW communities might eventually need to fight/block open-source efforts due to issues like bioweapons, and it's very plausible that the open-source community will refuse to stop open-sourcing models even given clear evidence that they can immensely help with/automate biorisk. So while I think the fight happened too early, the fighty/uncooperative parts of making AI safe might eventually matter more than is recognized today.

6
Matrice Jacobine🔸🏳️‍⚧️
If you mean Meta and Mistral I agree. I trust EleutherAI and probably DeepSeek to not release such models though, and they're more centrally who I meant.

To respond to a local point here:

  • Also, I am suspicious of framing "opposition to geoengineering" as bad -- this, to me, is a red flag that someone has not done their homework on uncertainties in the responses of the climate system to large-scale interventions like albedo modification. Geoengineering the planet wrong is absolutely an X-risk.

While I can definitely buy that geoengineering is net-negative, I don't yet see how geoengineering gone wrong could actually result in X-risk, and I don't currently understand the issues that well.

It ... (read more)

This is roughly my take, with the caveat that I'd replace CEV with instruction following, and I wouldn't be so sure that alignment is easy (though I do think we can replace that assumption with the assumptions that solving the AI alignment problem is highly incentivized and that the problem is actually solvable).

Crossposting this comment from LW, because I think there is some value here:

https://www.lesswrong.com/posts/6YxdpGjfHyrZb7F2G/third-wave-ai-safety-needs-sociopolitical-thinking#HBaqJymPxWLsuedpF

The main points are that value alignment will be much more necessary for ordinary people to survive, no matter the institutions adopted; that the world hasn't yet weighed in much on AI safety and plausibly never will, but we do need to prepare for a future in which AI safety becomes mainstream; that Bayesianism is fine, actually; and many more points in the full comment.

Sharmake
1
0
0
21% disagree

The big reason I lean towards disagreeing nowadays is that I've come to believe the AI control/alignment problem is much less neglected and less important to solve than I previously thought, and more generally I've come to doubt the assumption that worlds in which we survive are worlds in which we achieve very large value (under my own value set), such that reducing existential risk is automatically good.

Late comment: I basically agree with the point made here that we should avoid the lump-of-labor fallacy of assuming the amount of work to be done is constant, but I don't think this weakens the argument that human work will be totally replaced by AI work, for 2 reasons:

  1. In a world where you can copy AI labor extremely readily, wages fall for the same reason prices fall when more goods are supplied, and in particular humans have a biological minimum wage of 20-100 watts that fundamentally makes them unemployable once AIs can be run for cheaper than

... (read more)

To be honest, even if we grant the assumptions that AI alignment is achieved and that it matters who achieves AGI/ASI, I'd be much, much less confident in the case for America racing, and I think racing is weakly negative.

One big reason for this is that the pressures AGI introduces are closer to cross-cutting pressures than to nation-dependent ones, like the intelligence-curse sort of scenario where elites have incentives to invest in their automated economy and leave the large non-elite population to starve or be repressed:

https://lukedrago.substack.com/p/the-intelligence-curse

I think this might not be irrationality, but a genuine difference in values.

In particular, I think something like a discount rate disagreement is at the core of a lot of disagreements on AI safety, and to be blunt, you shouldn't expect convergence unless you successfully persuade them of this.

2
Greg_Colbourn ⏸️
I don't think it's discount rate (esp given short timelines); I think it's more that people haven't really thought about why their p(doom|ASI) is low. But people seem remarkably resistant to actually tackle the cruxes of the object level arguments, or fully extrapolate the implications of what they do agree on. When they do, they invariably come up short.

I ultimately decided to vote for the animal welfare groups, because I believe that animal welfare, in both its farmed and wild variants, is probably one of the largest and most robust problems in the world, and, with the exception of groups that are the logistical/epistemic backbone of the movements (they are valuable for gathering data and making sure that the animal welfare groups can act), I've become more skeptical that other causes are robustly net-positive, especially reducing existential risks.

This sounds very much like the missile gap/bomber gap narrative, and yeah this is quite bad news if they actually adopt the commitments pushed here.

The evidence that China is racing to AGI is, quite frankly, very thin, and I see a very dangerous arms race coming:

https://forum.effectivealtruism.org/posts/cXBznkfoPJAjacFoT/are-you-really-in-a-race-the-cautionary-tales-of-szilard-and
