I think using "unsafe" in a very broad way like this is misleading overall and generally makes the AI safety community look like miscalibrated alarmists.
I agree that when there's no memetic-fitness/calibration trade-off, it's always better to be calibrated. But here there is a trade-off. How should we handle it?
Very glad to see that happening; regranting solves a bunch of problems that centralized grantmaking leaves unsolved.
I mean, I agree that it has nuance, but it's still trained on a set of values that are pretty much those of current Western people, so it will probably put more or less emphasis on various values according to the weight Western people give to each of them.
I may try to write something on that in the future. I'm personally more worried about accidents and think that solving accidents leads one to solve misuse pre-AGI. Post aligned AGI, misuse becomes a major worry again.
Note that saying "this isn't my intention" doesn't prevent net negative effects of a theory of change from applying. Otherwise, doing good would be a lot easier.
I also highly recommend clarifying what exactly you're criticizing, i.e. the philosophy, the movement norms or some institutions that are core to the movement.
Finally, I usually find the criticism of people who are a) at the core of the movement and b) highly truth-seeking the most relevant for improving the movement, so I would expect that if you're trying to improve the movement, you may want to focu...
+1 for clarification. It could be neat if you could use a standard diagram to pinpoint what sort of criticism each one is.
For example, see this one from Astral Codex Ten.
I work every day from about 9:30am to 1am, with about 3h off on average and 30 min of walking, which helps me brainstorm. Technically this is ~12h*7 = 84h per week. The main reasons are that 1) I want us not to die, and 2) I think there are increasing marginal returns on working hours in a lot of situations, mostly because in a lot of domains the winner takes all even if he's only 10% better than the others, and because you accumulate more expertise/knowledge in a single person, which gives access to rarer and rarer skills.
Among that, I would say that I lose a...
"Nobody cared about" LLMs is certainly not true - I'm pretty sure the relevant people watched them closely.
What do you mean by "the relevant people"? I would love for us to talk about specifics here and operationalize what we mean. I'm pretty sure E. Macron hasn't thought deeply about AGI (i.e. has never thought for more than 1h about timelines), and I'm at 50% that if he had any deep understanding of the changes it will bring, he would already be racing. Likewise for Israel, which is a country with a strong track record of becoming a leader in technol...
Hey Misha! Thanks for the comment!
I am quite confused about what probabilities here mean, especially with prescriptive sentences like "Build the AI safety community in China" and "Beware of large-scale coordination efforts."
As I wrote in note 2, I'm claiming here that this claim is more likely to be true under these timelines than under the other timelines. But how could I make that clearer without adding too much clutter? Maybe putting note 2 under the table in italics?
...I also disagree with the "vibes" of probability assignment to a bunch of these, and the lack of clari
Haha, you probably don't realize it, but "you" is actually 4 people: Amber Dawn for the first draft of the post; me (Simeon) for the ideas, the table and the structure of the post; and me, Nicole Nohemi & Felicity Riddel for the partial rewriting of the draft to make it clearer.
So the credits are highly distributed! And thanks a lot, it's great to hear that!
I think that our disagreement comes from what we mean by "regulating and directing it."
My rough model of what usually happens in national governments (and not the EU, which is a lot more independent from its citizens than the typical national government) is that there are two scenarios:
Thanks for your comment!
A couple of remarks:
Thanks for your comment!
First, you have to keep in mind that when people talk about "AI" in industry and policymaking, they usually have mostly non-deep-learning or vision deep learning techniques in mind, simply because they mostly don't know the academic ML field but have heard that "AI" was becoming important in industry. So this sentence is little evidence that Russia (or any other country) is trying to build AGI, and I'm at ~60% that Putin wasn't thinking about AGI when he said that.
...If anyone who could play any role at all in develop
Strongly agree, upvoted.
Just a minor point on the Putin quote, as it comes up so often: he was talking to a bunch of schoolkids, encouraging them to do science and technology. He said similarly supportive things about a bunch of other technologies. I'm at >90% that he wasn't referring to AGI. He's not even that committed to AI leadership: he's taken few actions indicating serious interest in 'leading in AI'. Indeed, his Ukraine invasion has cut off most of his chip supplies and led to a huge exodus of AI/CS talent. It was just an off-the-cuff rhetorical remark.
Thanks for your comment!
That's an important point that you're bringing up.
My sense is that at the movement level, the consideration you bring up is super important. Indeed, even though I have fairly short timelines, I would like funders to hedge for long timelines (e.g. fund stuff for China AI Safety). Thus I think that big actors should have in mind their full distribution to optimize their resource allocation.
That said, despite that, I have two disagreements:
One of my friends and collaborators built this app, which is aimed at predicting the likelihood that we go extinct: https://xriskcalculator.vercel.app/
It might be useful!
It was a way of saying that if you think intelligence is perfectly correlated with being "morally good", then you're fine. But you're right that it doesn't cover all the ways you could reject the orthogonality thesis.
Even if you have a human-ish intelligence, most of the advantage of AI comes from its other features:
- You can process any type of data, orders of magnitude faster than a human, and once you know how to do a task, you deterministically know how to do it.
- You can just double the amount of GPUs and double the number of AIs. If you pair two AIs and make them interact at high speed, it's much more powerful than anything human-ish.
These are two...
I think the meta-point might be the crux of our disagreement.
I mostly agree with your inside view that other catastrophic risks struggle to be existential the way AI would, and I'm often a bit perplexed as to how quick people are to jump from 'nearly everyone dies' to 'literally everyone dies'. Similarly I'm sympathetic to the point that it's difficult to imagine particularly compelling scenarios where AI doesn't radically alter the world in some way.
But we should be immensely uncertain about the assumptions we make, and I would argue that by far the most likely fi...
Yes, that's right, but it's very different to be somewhere and affect AGI by chance, and to be somewhere because you think that it's your best way to affect AGI.
And I think that if you're optimizing for the latter, you're not very likely to end up working in nuclear weapons policy (even if there might be a few people for whom it is the best fit).
I think that this comment is way too outside viewy.
Could you mention concretely one of the "many options" that would change directionally the conclusion of the post?
The claim is "AGI will radically change X". And I tried to argue that if you cared about X and wanted to impact it, basically on the first order you could calculate your impact on it just by measuring your impact on AGI.
"The superintelligence is misaligned with our own objectives but is benign".
You could have an AI with some meta-cognition, able to figure out what's good and maximize it, in the same way EAs try to figure out what's good and maximize it with parts of their lives. This view mostly makes sense if you give some credence to moral realism.
"My personal view on your subject is that you don't have to work in AI to shape its future."
Yes, that's what I wrote in the post.
"You can also do that by bringing the discussion into the public and create awareness for the dangers."
I don't think it's a good method, and I think you should target a much more specific audience, but yes, I know what you mean.
I think that on the AGI timelines of the EA community, yes, other X-risks have a probability of causing extinction roughly indistinguishable from 0.
And conditional on AGI working out, we'll also most likely get out of the other risks.
Whereas without AGI, bio X-risks might become a thing, not in the short run but in the second half of the century.
That's right! I just think that the base rate for "civilisation collapse prevents us from ever becoming a happy intergalactic civilisation" is very low.
And multiplying any probability by 0.1 does matter, because when we're talking about AGI, we're talking about things that are >=10% likely to happen according to a lot of people (I put a higher likelihood on it than that, but Toby Ord putting it at 10% is sufficient).
So it means that even if you condition on biorisks being the same as AGI for everything else (which is the point I argue against), you still need biorisks t...
I'd be glad to stay as long as we can in the domain of aggregate probabilities and proxies for real scenarios, particularly for biorisks.
Mostly because I think that most people can't do a lot about infohazardy things so the first-order effect is just net negative.
I think that's one of the problems that explains why many people find my claim far too strong: in the EA community, very few people have a strong inside view on both advanced AI and biorisks. (I think that's more generally true for most combinations of cause areas.)
And I think that indeed, with the kind of uncertainty one must have when one is deferring, it becomes harder to make claims as strong as the one I'm making here.
Yes, I think you're right actually.
Here's a weaker claim which I think is true:
- When someone knows and has thought about an infohazard, the baseline is that they're way more likely to cause harm via it than to cause good.
- Thus, I'd recommend that anyone who's not actively thinking about ways to prevent the classes of scenarios where this infohazard would end up being very bad try to forget the infohazard and not talk about it, even to trusted individuals. Otherwise it will most likely be net negative.
I think that if you take these infohazards seriously enough, you probably shouldn't even do that, because if everyone has a 95% likelihood of keeping it secret, with 10 people in the know the chance it stays secret is about 60%.
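(That's just $0.95^{10} \approx 0.60$, assuming each of the 10 people keeps the secret independently.)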
Thanks!
Do you think that biorisks/nuclear war could plausibly cause us never to recover our values? What's the weight you give to such a scenario?
(I want to know whether the weight you put on "worse values" is due to stable totalitarianism enabled by new technologies, or due to collapse -> bad people win.)
Thanks for this information!
What's the probability we go extinct due to biorisks by 2045 according to you?
Also, I think that things which are extremely infohazardy shouldn't be weighted too strongly, because without the information being revealed they will likely remain very unlikely.
Basically, as I said in my post I'm fairly confident about most things except the MVP (minimum viable population) where I almost completely defer to Luisa Rodriguez.
Likewise, for the likelihood of irrecoverable collapse, my prior is that the likelihood is very low for the reasons I gave above, but given that I haven't explored the inside-view arguments in favor of it very much, I could quickly update upward, and I think that would be the best way for me to update positively on biorisks actually posing an X-risk in the next 30 years.
My view on t...
I think you would make a good point if we were close in terms of EV, but what matters primarily is the EV, and I expect it to dominate the uncertainty here.
I didn't do the computations, but I feel like if you have something which is OOMs more important than the others, even with very large bars of uncertainty you'd probably put >19/20 of your resources on the highest-EV thing.
In the same way, we don't give to another, less cost-effective org to hedge against AMF, even though it might have some tail chance of having a very significant positive impact on society, just because the estimates' error bars are very large.
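To gesture at the computation I didn't do, here's a minimal Monte Carlo sketch with entirely made-up parameters (the lognormal spreads and the ~100x median gap are my assumptions for illustration, not anyone's actual estimates):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy model: the cost-effectiveness of each option is uncertain over
# orders of magnitude (1 OOM standard deviation in log10 space), but
# option A is ~2 OOMs better in median than option B.
log10_a = rng.normal(loc=2.0, scale=1.0, size=n)  # log10 cost-effectiveness of A
log10_b = rng.normal(loc=0.0, scale=1.0, size=n)  # log10 cost-effectiveness of B

print("P(A beats B):", (log10_a > log10_b).mean())                   # ~0.92
print("E[A] / E[B]:", (10**log10_a).mean() / (10**log10_b).mean())   # ~100
```

Even with very wide error bars, the option that is OOMs better in median still wins in the vast majority of draws and dominates in expectation, which is the intuition behind concentrating resources on the highest-EV option.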
Yep, good point! I just wanted to make clear that IMO a good first-order approximation of your impact on the long-term future is: "What's the causal impact of your work on AI?"
And even though being a UX designer for 80k or doing community building is not focused on AI, those roles are instrumentally very useful for AI, in particular if the person doing them has this theory of change in mind.
Yes, scenarios are a good way to put a lower bound, but if you're not able to create one single scenario, that's a bad sign in my opinion.
For AGI there are many plausible scenarios where I can reach a ~1-10% likelihood of dying. With biorisks it's impossible given my current beliefs about the MVP (minimum viable population).
"If (toy numbers here) AI risk is 2 orders of magnitude more likely to occur than biorisk, but four orders of magnitude less tractable". I think that indeed 2 or 3 OOMs of difference would be needed at least to compensate (especially given that positively shaping biorisks is not extremely positive) and as I argued above I think it's unlikely.
"They are of course not, as irrecoverable collapse , s-risks and permanent curtailing of human potential". I think that irrecoverable collapse is the biggest crux. What likelihood do you put on it? For other type...
By default, you shouldn't have a prior that biorisk is 100x more tractable than AI, though. Some (important) people think that the EA community has had a net negative impact on biorisks because of infohazards, for instance.
Also, I'll argue below that timelines matter for ITN and I'm pretty confident the risk/year is very different for the two risks (which favors AI in my model).
I think it's extremely relevant.
To be honest, I think that if someone without a technical background wanted to contribute, looking into these things would be one of the best default opportunities, because:
1) These points you mention are blindspots of the AI alignment community, because the typical member of the AI alignment community doesn't really care about all this political stuff. Especially questions about values and about "how come those who are 1000x more powerful than others magically don't start ruling the entire world with their aligned AI" ar...
I think that if the community was convinced that it was by far the most important thing, we would try harder to find projects and I'm confident there are a bunch of relevant things that can be done.
I think we're suffering from an argument-to-moderation fallacy that makes us underinvest massively in AI safety because:
1) AI Safety is hard
2) There are other causes that, when you don't think too deeply about them, seem equally important
The portfolio argument is an abstraction that hides the fact that if something is way more important than so...
"Firstly, under the standard ITN (Importance Tractability Neglectedness) framework, you only focus on importance. If there are orders of magnitude differences in, let's say, traceability (seems most important here), then longtermists maybe shouldn't work on AI."
I think this makes sense when we're in the domain of non-existential areas. I think that in practice, when you're confident about existential outcomes and don't know how to solve them yet, you probably should still focus on them, though.
"which probably leads to an overly narrow interprets of what might pos...
Just tell me a story, with probabilities, of how nuclear war or bioweapons could cause human extinction, and you'll see that when you multiply the probabilities, the result goes down to a very low number.
I repeat, but I think that you still don't have a good sense of how difficult it is to kill every human if the minimum viable population (MVP) is around 1000, as argued in the post linked above.
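To illustrate with purely made-up toy numbers (mine, not from the post or the linked article), any such story has to chain several stages, and the product shrinks fast:

$$P(\text{extinction}) \approx P(\text{full-scale war}) \times P(\text{collapse} \mid \text{war}) \times P(\text{fewer than} \sim 1000 \text{ survivors} \mid \text{collapse}) = 0.1 \times 0.1 \times 0.01 = 10^{-4}$$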
"knock-on effects"
I think that's true, but I think that to first order, not dying from AGI is the most important thing, compared with developing it in, say, 100 years.
If there were no preferences, at least 95%, and probably more like 99%. I think that this should update according to our timelines.
And just to clarify, that includes community building etc. as I mentioned.
Thanks for the comment.
I think it would be true if there were other X-risks. I just think that there is no other literal X-risk. I think that there are huge catastrophic risks. But there's still a huge difference between killing 99% of people and killing 100%.
I'd recommend reading (or skimming) this to get a better sense of how different the two are.
I think that in general the sense that it's cool to work on every risk comes precisely from the fact that very few people have thought about every risk, and thus people in AI for instance I...
"no other literal X-risk" seems too strong. There are certainly some potential ways that nuclear war or a bioweapon could cause human extinction. They're not just catastrophic risks.
In addition, catastrophic risks don't just involve massive immediate suffering. They drastically change global circumstances in a way which will have knock-on effects on whether, when, and how we build AGI.
All that said, I directionally agree with you, and I think that probably all longtermists should have a model of the effects their work has on the potentiality of aligned AGI...
I know it's not trivial to do, but if you took your AGI timelines into consideration for this type of forecast, you'd come up with very different estimates. For that reason, I'd be willing to bet on most estimates
I have the impression (coming from simulator theory (https://generative.ink/)) that Decision Transformers (DTs) have some chance (~45%) of being a much safer form of trial-and-error technique than RL. The core reason is that DTs learn to simulate a distribution of outcomes (e.g. they learn to simulate the kind of actions that lead to a reward of 10 as much as those that lead to a reward of 100), and it's only at inference time that you systematically condition on a reward of 100. So in some sense, the agent which has become very good via tr...
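For concreteness, here's a minimal sketch of the return-conditioning idea (my own toy code: a small MLP stands in for the transformer backbone, and all names and sizes are made up):

```python
import torch
import torch.nn as nn

class ReturnConditionedPolicy(nn.Module):
    """Toy illustration of the Decision-Transformer-style idea of
    conditioning action prediction on a target return."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        # The model sees (target return, state) and predicts which action
        # an agent achieving that return would take next.
        self.net = nn.Sequential(
            nn.Linear(1 + state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, return_to_go: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([return_to_go, state], dim=-1))

# Training (not shown) is plain supervised learning on logged trajectories of
# all qualities: each step is labelled with the return actually obtained, so the
# model learns what reward-10 behaviour and reward-100 behaviour both look like.
# Only at inference time do we systematically ask for the high-return behaviour:
policy = ReturnConditionedPolicy(state_dim=4, n_actions=2)
state = torch.randn(1, 4)
target_return = torch.tensor([[100.0]])       # "act like a trajectory that scored 100"
action_logits = policy(target_return, state)  # logits over the 2 actions
```

The selection of reward-100 behaviour happens only through the conditioning input at inference, rather than through an objective directly optimized during training, which is the distinction from standard RL the comment is pointing at.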
[TO POLICYMAKERS]
Trying to align very advanced AIs with what we want is a bit like trying to design a law or a measure to constrain massive companies, such as Google or Amazon, or powerful countries, such as the US or China. You know that when you put a rule in place, they will have enough resources to circumvent it. And you can try as hard as you want: if you didn't design the AI properly in the first place, you won't be able to make it do what you want.
[TO ML RESEARCHERS AND MAYBE TECH EXECUTIVES]
When you look at society's problems, you can observe that many of our structural problems come from strong optimizers.
Now, even thes...
I agree with the general underlying point.
I also think that another important issue is that reasoning on counterfactuals makes people more prone to do things that are unusual AND more prone to errors (e.g. by not taking some other effects into account).
Both combined make counterfactual reasoning without empirical data pretty perilous on average IMO.
In the case of Ali in your example above, for instance, Ali could neglect that his performance will determine the opportunities & impact he has 5y down the line, and so that being exc...