Charlie_Guthmann

Thank you for doing this, love to see some data.

I don't have high familiarity with METR but I think it is probably not great data for this type of analysis. Few issues or clarifications would be needed (and anyone who understands METR better bear with me/or correct me on my mistakes plz).

1. How does METR handle context windows? Are we doing a rolling window? Compact? something else?
How much of this inverse quadratic relationship is just caused by longer tasks having a larger used context window for the back half of the run? How much is caused by a lack of a default information management system that persists?

2. What is the exact harness(es) METR is using?

Harness/enviornment engineering/information management might control more of the cost of long running SWE projects than iq (past a point).

3. Does METR allow repo forking? routing?
In the future no 180 iq ai is building the ORM and buttons for a crud app. They are either forking a boilerplate or routing it to a cheaper model.

etc.

It is said that the current iteration of models suffer from retrograde amnesia. Whether or not this will get bitter lesson pilled is a separate question, but for this class of memento models, version control, information management, context management, and the meta process of improving and routing through the best versions of these combos for a specific task is not some side quest but in fact the main route to making long tasks cheaper. Even as we enter the next paradigm of models that don't have such profound short term memory loss, a huge part of cost reduction will come from the orchestrator meta planning about how much to explore the space of options/build out the software factory / vs actually starting the work.

I’m not denying the core question OP is raising — costs could plausibly be rising and could matter a lot. I’m just not convinced this specific curve cleanly isolates “AI economics” from “how expensive a particular scaffold/set of arbitrary constraints makes long-context work.”

Are the Costs of AI Agents Also Rising Exponentially?

Charlie_Guthmann4d2

yea doesn't arc leaderboard have somewhat opposing trends? https://arcprize.org/leaderboard

Liam Robins's Quick takes

Charlie_Guthmann4d4

might want to check out this (only indirectly related but maybe useful).

https://forum.effectivealtruism.org/posts/zuQeTaqrjveSiSMYo/a-proposed-hierarchy-of-longtermist-concepts

Personally don't mind o-risk think it has some utility but s-risk ~somewhat seems like it still works here. An O-risk is just a smaller scale s-risk no?

Annika Burman's Quick takes

Charlie_Guthmann12d2

What do people think of the idea of pushing for a constitutional convention/amendment? The coalition would be ending presidential immunity + reducing the pardon powers + banning stock trading for elected officials. Probably politically impossible but if there were ever a time it might be now.

Why I reject the longtermist argument against accelerating AI

Charlie_Guthmann13d8

tldr; wrote some responses to sections, don't think I have an overall point. I think this line of argumentation deserves to be taken seriously but think this post is maybe trying to do too much at once. The main argument is simply cluelessness + short term positive EV.

In virtually every other area of human decision-making, people generally accept without much argument that the very-long-term consequences of our actions are extremely difficult to predict.

I'm a little confused what your argumentative technique is here. Is that fact that most humans do something the core component here? Wouldn't this immediately disqualify much of what EAs work on? Or is this just a persuasive technique, and you mean ~ "most humans think this for reason x. I also think this for reason x, though the fact most humans think it matters little to me."
For me, most humans do x is not an especially convincing argument of something.

I don't want to get bogged down on cluelessness because there are many lengthy discussions elsewhere but I'll say that cluelessness depends on the question. If you told me what the rainforest looked like and then asked me to guess the animals I wouldn't have a chance. If you asked me to guess if they ate food and drank water I think I would do decent. Or a more on the nose example. If you took me back to 5 million years ago and asked me to guess what would happen to the chimps if humans came to exist, I wouldn't be able to predict much specifics, but I might be able to predict (1) humans would become the top dog, and with less certainty (2) chimp population would go down and with even less certainty (3) chimps will go extinct. That's why the horse model gets so much play, people have some level of belief that there are certain outcomes that might be less chaotic if modeled correctly.

To wrap up I think your first 4 paragraphs could be shortened to your unique views on cluelessness (specifically wrt ai?) + discount rates/whatever other unique philosophical axioms you might hold.

Understood in this way, AI does not actually pose a risk of astronomical catastrophe in Bostrom's sense.

To be clear, neither does the asteroid. Aliens might exist and our survival similarly presents a risk of replacement for all the alien civs that won't have time to biologically evolve as (humans or ai from earth) speed through the lightcone. Also even if no aliens, we have no idea if conditional on humans being grabby, utility is net positive or negative. There isn't even agreement on this forum or in the world on if there is such a thing as a negative life or not. Don't think i'm arguing against you here but feels like you are being a little loose here (don't want to be too pedantic as I can totally understand if you are writing for a more general audience).

Now, you might still reasonably be very concerned about such a replacement catastrophe. I myself share that concern and take the possibility seriously. But it is crucial to keep the structure of the original argument clearly in mind. ... Even if you accept that killing eight billion people would be an extraordinarily terrible outcome, it does not automatically follow that this harm carries the same moral weight as a catastrophe that permanently eliminates the possibility of 10^23 future lives.

Well I have my own "values". Just because I die doesn't mean these disappear. I'd prefer that those 10^23 lives aren't horrifically tortured for instance.

Though I say this with extremely weak confidence, I feel like in the case where a "single agent/hivemind" misaligned ai immediately wipes us all out, I'm thinking they probably are not going to convert resources into utility as efficiently as me (by my current values), and thus this might be viewed as an s-risk. I'm guessing you might say that we can't possibly predict that, but then can we even predict if those 10^23 lives will be positive or negative? if not I guess i'm not sure why you brought any of this up anyway. Bostrom's whole argument predicates on the assumption that earth descended life is + ev, which predicates on not being clueless or having a very kumbaya pronatal moral philosophy.

So I guess even better for you, from my POV you don't even need to counter argue this.

Virtually every proposed mechanism by which AI systems might cause human extinction relies on the assumption that these AI systems would be extraordinarily capable, productive, or technologically sophisticated.

I might not be especially up to date here. Can't it like cause a nuclear fallout etc? totalitarian lock in? the matrix? Extreme wealth and power disparity? is there agreement that the only scenarios in which our potential is permanently curtailed the terminator flavors?

The reason is that a decade of delayed progress would mean that nearly a billion people will die from diseases and age-related decline who might otherwise have been saved by the rapid medical advances that AI could enable. Those billion people would have gone on to live much longer, healthier, and more prosperous lives.

You might need to flesh this out a bit more for me because I don't think it's as true as you said. Is the claim here that AI will (1) invent new medicine or (2) replace doctors or (3) improve US healthcare policy?

(1) Drug development pipelines are excruciatingly long and mostly not because of a lack of hypotheses. For instance, https://pmc.ncbi.nlm.nih.gov/articles/PMC10786682/ GLP-1 have been in the pipelines for half a century (though debatably with better ai some of the nausea stuff could have been figured out quicker). IL-23 connection to IBD/Crohns was basically known ~2000 as it was one of the first/most significant single nucleotide mutations picked up with GWAS phenotype/genotype studies. Yet Skyrizi only hit the market a few years ago. Assuming ai could instantly just invent the drugs, IIRC it's a minimum of like 7 years to get approval. That's absolute minimum. And likely even super intelligent AI is gonna need physical labs, iteration, make mistakes, etc.

Assuming sufficient AGI in 2030 for this threshold, we are looking at early 2040s before we start to see significant impact on the drugs we use, although it's possible AI will usher a new era of repurposes drug cocktails via extremely good lit review (although IMO the current tools might already be enough to see huge benefits here!).

(2) Doctors, while overpaid, still only make up like 10-15% of healthcare costs in the US. I do think ai will end up being better than them, although whether people will quickly accept this idk. So you can get some nice savings there, but again that's assuming you just break the massive lobbying power they have. And beyond the costs, tons of the most important health stuff is already widely known among the public. Stuff like don't smoke cigarettes, don't drink alcohol, don't be fat, don't be lonely. People still fail to do this stuff. Not an information problem. Further doctors often know when they are overprescribing useless stuff, often just an incentives problem. No good reason to think AI will break this trend unless you are envisioning a completely decentralized or single payer system that uses all ai doctors, both are at least partially political issues not intelligence. And if we are talking solid basic primary care for the developing world, I just question how smart the ai needs to be. I'd assume a 130 iq llm with perfect vision and full knowledge of medical lit would be more than sufficient, and that seems like it will be the next major gemini release?

(3) will leave this for now.

Kinda got sidetracked here and will leave this comment here for now because so long, but I guess takeaway from this section: You can't claim cluelessness on the harms and then assume the benefits are guaranteed.

Linch's Quick takes

Charlie_Guthmann18d2

2 thoughts here just thinking about persuasiveness. I'm not quite sure what you mean by normal people and also if you still want your arguments to be actually arguments or just persuasion-max.

show don't tell for 1-3
- For anyone who hasn't intimately used frontier models but is willing to with an open mind, I'd guess you should just push them to use and actually engage mentally with them and their thought traces, even better if you can convince them to use something agentic like CC.
Ask and/or tell stories for 4
- What can history tell us about what happens when a significantly more tech savy/powerful nation finds another one?
  - no "right" answer here though the general arc of history is that significantly more powerful nations capture/kill/etc.
- What would it be like to be a native during various european conquests in the new world (esp ignoring effects of smallpox/disease to the extent you can)?
  - Incan perspective? Mayan?
  - I especially like Orellena's first expedition down the amazon. As far as I can tell, Orellena was not especially bloodthirsty, had some interest/respect for natives. Though he is certainly misaligned with the natives.
    - Even if Orellana is “less bloodthirsty,” you still don’t want to be a native on that river. You hear fragmented rumors—trade, disease, violence—with no shared narrative; you don’t know what these outsiders want or what their weapons do; you don’t know whether letting them land changes the local equilibrium by enabling alliances with your enemies; and you don’t know whether the boat carries Orellana or someone worse.
    - Do you trade? attack? flee? coordinate? Any move could be fatal, and the entire situation destabilizes before anyone has to decide “we should exterminate them.”
  - and for all of these situations you can actually see what happened (approximately) and usually it doesn't end well.
  - Why is AI different?
    - not rhetorical and gives them space to think in a smaller, more structured way that doesn't force an answer.

Ozzie Gooen's Quick takes

Charlie_Guthmann23d2

Just finding about about this & crux website. So cool. Would love to see something like this for charity ranking (if it isn't already somewhere on the site).

Don't you need a philosophy axioms layer between outputs and outcomes? Existential catastrophe definitions seems to be assuming a lot of things.

Would also need to think harder about why/in what context i'm using this but "governance" being a subcomponent when it's arguably more important/ can control literally everything else at the top level seems wrong.

Whose expected value? Making sense of EA epistemology with Ideal Reflection

Charlie_Guthmann2mo2

Thanks for the post — There is definitely a certain fuzziness at times about value claims in the movement and I have been critical at times of similar things. Also chatgpt edited this but (nearly) all thoughts are my own hope that's ok!

I see a few threads here that are easy to blur:

1) Metaethics (realism vs anti-realism) is mostly orthogonal to Ideal Reflection.
You can be a realist or anti-realist and still endorse (or reject) a norm like “defer to what an idealized version of you would believe, holding evidence fixed.” Ideal Reflection doesn’t have to claim there’s a stance-independent EV “out there”; it can be a procedural claim about which internal standpoint is authoritative (idealized deliberation vs current snap judgment), and about how to talk when you’re trying to approximate that standpoint. I'm not saying you claimed the opposite exactly but language was a bit confusing to me at times.

2) Ideal Reflection is a metanormative framework; EA is basically a practical operationalization of it.
Ideal Reflection by itself is extremely broad. But on its own it doesn’t tell you what you value, and it doesn’t even guarantee that you can map possible world-histories to an ordinal ranking. It might seem less hand-wavy but its lack of assumptions makes it hard to see what non trivial claims can follow. Once you add enough axioms/structure to make action-guiding comparisons possible (some consequentialist-ish evaluative ranking, plus willingness to act under uncertainty), then you can start building “upward” from reflection to action.

It also seems to me (and is part of what makes EA distinctive) that EA ecosystem was built by unusually self-reflective people — sometimes to a fault — who tried hard to notice when they were rationalizing, to systematize their uncertainty, and to actually let arguments change their minds.

On that picture, EA is a specifc operationalization/instance of Ideal Reflection for agents who (a) accept some ranking over world-states/world-histories, and (b) want scalable, uncertainty-aware guidance about what to do next.

3) But this mainly helps with the “upward” direction; it doesn’t make the “downward” direction easier.
I think of philosophy as stacked layers: at the bottom are the rules of the game; at the top is “what should I do next.” EA (and the surrounding thought infrastructure) clarifies many paths upward once you’ve committed to enough structure to compare outcomes. But it’s not obvious why building effective machinery for action gives us privileged access to bedrock foundations. People have been trying to “go down” for a long time. So in practice a lot of EAs seem to do something like: “axiomatize enough to move, then keep climbing,” with occasional hops between layers when the cracks become salient.

4) At the community level, there’s a coordination story that explains the quasi-objective EV rhetoric and the sensitivity to hidden axioms.
Even among “utilitarians,” the shape of the value function can differ a lot — and the best next action can be extremely sensitive to those details (population ethics, welfare weights across species, s-risk vs x-risk prioritization, etc.). Full transparency about deep disagreements can threaten cohesion, so the community ends up facilitating a kind of moral trade: we coordinate around shared methods and mid-level abstractions, and we get the benefits of specialization and shared infrastructure, even without deep convergence.

It's true - Institutionally, founder effects + decentralization + concentrated resources (in a world with billionaires) create path dependence: once people find a lane and drive — building an org, a research agenda, a funding pipeline — they implicitly assume a set of rules and commit resources accordingly. As the work becomes more specific, certain foundational assumptions become increasingly salient, and it’s easy for implicit axioms to harden and complexify over time. To some extent you can say that is what happened, although on the object level it feels like we have picked pretty good stuff to work on in my view. And charitably, when 80k writes fuzzy definitions of the good, it isn't necessarily that the employees and org don't have more specific values, it's that they think its better to leave it at the level of abstraction to build the best coalition right now. And also that they are trying to help you build up from what you have to making a decision.

The funding conversation we left unfinished

Charlie_Guthmann2mo16

I don't see this strain of argument as particularly action relevant. I feel like you are getting way to caught up in the abstractions of what "agi" is and such. This is obviously a big deal, this is obviously going to happen "soon" and/or already "happening", it's obviously time to take this very serious and act like responsible adults.

Ok so you think "AGI" is likely 5+ years away. Are you not worried about anthropic having a fiduciary responsibility to it's shareholders to maximize profits? I guess reading between the lines you see very little value in slowing down or regulating AI? While leaving room for the chance that our whole disagreement does revolve around our object level timeline differences, I think you probably are missing the forrest from the trees here in your quest to prove the incorrectness of people with shorter timelines.

I am not a doom maximilist in the sense that I think this technology is already profoundly world-bending and scary today. I am worried about my cousin becoming a short form addicted goonbot with an AI best friend right now - whether or not robot bees are about to gorge my eyes out.

I think there are a reasonably long list of sensible regulations around this stuff (both x-risk related and more minor stuff) that would probably result in a large drawdown in these companies valuations and really the stock market at large. For example but not limited to - AI companionship, romance, porn should probably be on a pause right now while the government performs large scale AB testing, the same thing we should have done with social media and cellphone use especially in children that our government horribly failed to do because of its inability to utilitize RCTs and the absolute horrifying average age of our president and both houses of congress.

Charlie_Guthmann's Quick takes

Charlie_Guthmann2mo3

it's quite easy, I actually already did it with printful + shopify. I stalled out because (1) I realized it's much more confusing to deal with all the copyright stuff and stepping on toes (I don't want to be competing with ea itself or ea orgs and didn't feel like coordinating with a bunch of people. (2) you kind of get raked using a easy fully automated stack. Not a big deal but with shipping hoodies end up being like 35-40 and t shirts almost 20. I felt like given the size of EA we should probably just buy a heat press or embroidery machine since we probably want to produce 100s+.

Anyway feel free to reach out and we can chat!

here is the example site I spun up, again not actually trying to sell those products was just testing if I could do it https://tp403r-fy.myshopify.com/

Charlie_Guthmann

Bio

Posts 11

Comments274

Posts
11

Comments
274