elifland

795 · Joined Aug 2018
www.elilifland.com/

Bio

Interested in improving epistemics and AI safety. More at https://www.elilifland.com/. You can give me anonymous feedback here.

Comments (45)

Prioritizing x-risks may require caring about future people

Thanks for linking; in particular, Greg covered some similar ground to this post in The person-affecting value of existential risk reduction:

Although it seems unlikely x-risk reduction is the best buy from the lights of a person-affecting view (we should be suspicious if it were), given ~$10000 per life year compares unfavourably to best global health interventions, it is still a good buy


although it seems unlikely that x-risk reduction would be the best buy by the lights of a person affecting view, this would not be wildly outlandish.

Prioritizing x-risks may require caring about future people

I agree with most of this, thanks for pointing to the relevant newsletter!

A few specific reactions:

The first $1bn spent on xrisk reduction is very cost-effective

This seems plausible to me but not obvious; in particular, for AI risk the field seems pre-paradigmatic, such that there isn't necessarily "low-hanging fruit" to be plucked, and it's unclear whether previous efforts besides field-building have even been net positive in total.

That said, I think it's fair to say it doesn't depend on something like "strong longtermism". Common sense ethics cares about future generations, and I think suggests we should do far more about xrisk and GCR reduction than we do today.

Agree with this, though I think "strong longtermism" might make the case easier for those who aren't sure about the expected length of the long-term future.

Taking a global perspective, if you can reduce existential risk by 1 percentage point for under $234 billion, you would save lives more cheaply than GiveWell’s top recommended charities — again, regardless of whether you attach any value to future generations or not. 

reducing it by another percentage point might take $100 billion+, which would be only 20% as cost-effective as GiveWell top charities.

Seems like there's a typo somewhere; reducing x-risk by a percentage point for $100 billion would be more cost-effective than doing so for $234 billion, not 20% as cost-effective?
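To make the arithmetic explicit, here's a minimal sketch; it assumes, per the first quote, that $234 billion per percentage point of x-risk reduction is roughly the break-even cost with GiveWell's top charities:

```python
# Minimal sketch of the implied cost-effectiveness comparison, assuming $234B
# per percentage point of x-risk reduction is the break-even cost with
# GiveWell's top charities (per the first quote above).
breakeven_cost_per_pp = 234e9  # dollars per percentage point (from the quote)
next_pp_cost = 100e9           # dollars per percentage point (from the quote)

# For the same 1pp of risk reduction, cost-effectiveness scales inversely with cost.
relative_cost_effectiveness = breakeven_cost_per_pp / next_pp_cost
print(f"{relative_cost_effectiveness:.2f}x the break-even cost-effectiveness")
# -> 2.34x, i.e. *more* cost-effective than GiveWell top charities, not 20% as cost-effective.
```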

Reasons I’ve been hesitant about high levels of near-ish AI risk

Yeah, the idea is that the lower the expected value of the future, the less bad it is if AI causes an existential catastrophe that doesn't involve lots of suffering. So my wording was sloppy here; lower EV of the future perhaps decreases the importance of (existential-catastrophe-preventing) AI risk work, but not my credence in the risk.

Reasons I’ve been hesitant about high levels of near-ish AI risk

Thanks for pointing this out. I agree that I wasn't clear about this in the post.

My hesitations have been around adopting views with timelines and risk level that are at least as concerning as the OpenPhil cluster (Holden, Ajeya, etc.) that you're pointing at; essentially views that seem to imply that AI and things that feed into it are clearly the most important cause area.

I'm not sure Eliezer having occasionally been overconfident, but got the general shape of things right is any evidence at all against >50% AGI in 30 years or >15% chance of catastrophe this century (though it could be evidence against Eliezer's very high risk view).

I wouldn't go as far as "no evidence at all", given that my understanding is that Eliezer (+ MIRI) was heavily involved in influencing the OpenPhil cluster's views, so it's not entirely independent; but I agree it's much weaker evidence against less extreme views.

Fewer 'smart people disagree' about the numbers in your footnote than about the more extreme view.

I was going to say that it seems like a big difference within our community, but both clusters of views are very far away from the median pretty reasonable person and the median AI researcher. Though I suppose the latter actually isn't far away on timelines (potentially depending on the framing?). It definitely seems to be in significant tension with how AI researchers and the general public / markets / etc. act, regardless of stated beliefs (e.g. I found it interesting how short the American public's timelines are, compared to their actions). 

Anyway, overall I think you're right that it makes a difference but it seems like a substantive concern for both clusters of views.

The Carlsmith post you say you roughly endorse seems to have 65% on AGI in 50 years, with a 10% chance of existential catastrophe overall. So I'm not sure if that means your conclusion is 

[...]

The conclusion I intend to convey is something like "I'm no longer as hesitant about adopting views which are at least as concerning as a >50% chance of AGI/TAI/APS-AI within 30 years and a >15% chance of existential catastrophe this century", which, as I noted above, seem to make AI clearly the most important cause area.

Copying my current state on the object-level views from another recent post:

I’m now at ~20% by 2036; my median is now ~2050 though still with a fat right tail.


My timelines shortening [due to reflecting on MATH breakthrough] should also increase my p(AI doom by 2100) a bit, though I’m still working out my views here. I’m guessing I’ll land somewhere between 20 and 60% [TBC, most of the variance is coming from working out my views and not the MATH breakthrough].

Reasons I’ve been hesitant about high levels of near-ish AI risk

Consequently, it's possible to be skeptical of the motivations of anyone in AI safety, expert or novice, on the grounds that "isn't it convenient the best way to save the world is to do cool AI stuff?"


Fair point overall, and I'll edit in a link to this comment in the post. It would be interesting to see data on what percentage of people working in AI safety due to EA motivations would likely be working in AI regardless of impact. I'd predict that it's significant but not a large majority (say, an 80% CI of 25-65%).

A few reactions to specific points/claims:

It's possible that they know the same amount about AI X-risk mitigation, and would perhaps have similar success rate working on some alignment research (which to a great deal involves GPT-3 prompt hacking with near-0 maths).

My understanding is that most alignment research involves either maths or skills similar to ML research/engineering; there is some ~GPT-3 prompt hacking (e.g. this post?) but it seems like <10% of the field?

Imagine that two groups wanted to organise an AI camp or event: a group of AI novice undergrads who have been engaged in EA vs a group of AI profs with no EA connections. Who is more likely to get funding?

I'm not sure about specifically organizing an event, but I'd guess that experienced AI profs with no EA connections but who seemed genuinely interested in reducing AI x-risk would be able to get substantial funding/support for their research.

EA-funded AI safety is actually a pretty sweet deal for an AI novice who gets to do something that's cool at very little cost.

The field has probably gotten easier to break into over time but I'd guess most people attempting to enter still experience substantial costs, such as rejections and mental health struggles.

elifland's Shortform

Sharing an update on my last 6 months that's too personal for me to want to share as more than a shortform for now, but that I think is worth sharing somewhere on the Forum: Personal update: EA entrepreneurship, mental health, and what's next

tl;dr In the last 6 months I started a forecasting org, got fairly depressed and decided it was best to step down indefinitely, and am now figuring out what to do next. I note some lessons I’m taking away and my future plans.

AGI Ruin: A List of Lethalities

But I don't think you learn all that much about how 'concrete and near mode' researchers who expect slower takeoff are being, from them not having given much thought to what to do in this (from their perspective) unlikely edge case.


I'm not sure how many researchers assign little enough credence to fast takeoff that they'd describe it as an unlikely edge case, which sounds like <=10%? E.g. in Paul's blog post he writes "I'm around 30% of fast takeoff".

ETA: One proxy could be what percentage researchers assigned to "Superintelligence" in this survey.

We should expect to worry more about speculative risks

I agree with the general thrust of the post, but when analyzing technological risks I think one can get substantial evidence just by considering the projected "power level" of the technology, whereas you focus on evidence that this power level will lead to extinction. I agree the latter is much harder to get evidence about, but I think the former is sufficient to be very worrisome even without much evidence on the latter.

Specifically, re: AI you write:

we’re reliant on abstract arguments that use ambiguous concepts (e.g. “objectives” and “intelligence”), rough analogies, observations of the behaviour of present-day AI systems (e.g. reinforcement learners that play videogames) that will probably be very different than future AI systems, a single datapoint (the evolution of human intelligence and values) that has a lot of important differences with the case we’re considering, and attempts to predict the incentives and beliefs of future actors in development scenarios that are still very opaque to us.

I roughly agree with all of this, but by itself the argument that we will within the next century plausibly create AI systems that are more powerful than humans (e.g. Ajeya's timelines report) seems like enough to get the risk pretty high. I'm not sure what our prior should be on existential risk conditioned on a technology this powerful being developed, but honestly starting from 50% might not be unreasonable.

Similar points were made previously e.g. by Richard Ngo with the "second species" argument, or by Joe Carlsmith in his report on x-risk from power-seeking AI: "Creating agents who are far more intelligent than us is playing with fire."

How likely is World War III?

Five forecasters from Samotsvety Forecasting discussed the forecasts in this post.


First, I estimate that the chance of direct Great Power conflict this century is around 45%. 

Our aggregated forecast was 23.5% (a sketch of one way individual forecasts like these can be pooled is at the end of this comment). Considerations discussed included the changed incentives in the nuclear era, possible causes (climate change, AI, etc.), and the likelihood of specific wars (e.g. US-China fighting over Taiwan).


Second, I think the chance of a huge war as bad or worse than WWII is on the order of 10%.


Our aggregated forecast was 25%, though we were unsure if this was supposed to only count wars between great powers, in which case it’s bounded above by the first forecast.


There was some discussion of the offense-defense balance as tech capabilities increase; perhaps offense will have more of an advantage over time.


Some forecasters would have preferred to predict based on something like human suffering per capita rather than battle deaths, due to an expected shift in how a 21st century great power war would be waged.



Third, I think the chance of an extinction-level war is about 1%. This is despite the fact that I put more credence in the hypothesis that war has become less likely in the post-WWII period than I do in the hypothesis that the risk of war has not changed.

Our aggregated forecast was 0.1% for extinction. Forecasters were skeptical of using Braumoeller’s model to estimate this, as it seems likely to break down at the tails; killing everyone via a war seems really hard. There was some uncertainty about whether borderline cases, such as war + another disaster to finish people off or war + future tech, would count.


(Noticed just now that MaxRa commented giving a similar forecast with similar reasoning)
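The aggregation method isn't specified above; one common way to pool individual probability forecasts is the geometric mean of odds. A minimal sketch of that approach (the individual forecasts below are hypothetical placeholders, not the actual five forecasts behind the 23.5% aggregate):

```python
import math

def geo_mean_of_odds(probs):
    """Pool probability forecasts by taking the geometric mean of their odds."""
    log_odds = [math.log(p / (1 - p)) for p in probs]
    pooled_odds = math.exp(sum(log_odds) / len(log_odds))
    return pooled_odds / (1 + pooled_odds)

# Hypothetical individual forecasts for "direct great power conflict this century";
# the actual forecasts are not given in the comment.
individual = [0.15, 0.20, 0.25, 0.25, 0.35]
print(f"Aggregate: {geo_mean_of_odds(individual):.1%}")  # ~23.4% with these made-up inputs
```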
