All of Aaron_Scher's Comments + Replies

Elaborating on point 1 and the "misinformation is only a small part of why the system is broken" idea: 

The current system could be broken in many ways but at some equilibrium of sorts. Upsetting this equilibrium could have substantial effects because, for instance, people's built immune response to current misinformation is not as well trained as their built immune response to traditionally biased media. 

Additionally, intervening on misinformation could be far more tractable than other methods of improving things. I don't have a solid grasp of wh... (read more)

I appreciate you writing this, it seems like a good and important post. I'm not sure how compelling I find it, however. Some scattered thoughts:

  • In point 1, it seems like the takeaway is "democracy is broken because most voters don't care about factual accuracy, don't follow the news, and elections are not a good system for deciding things; because so little about elections depends on voters getting reliable information, misinformation can't make things much worse". You don't actually say this, but this appears to me to be the central thrust of your argumen
... (read more)
2
Dan Williams
1mo
Thanks for this. A few points:  * Re. point 1: Yes, I agree with your characterisation in a way: democracy is already a kind of epistemological disaster. Many treatments of dis/misinfo assume that people would be well-informed and make good decisions if not for exposure to dis/misinfo. That's completely wrong and should affect how we view the marginal impact of AI-based disinformation.  * On the issue of social media, see my post: https://www.conspicuouscognition.com/p/debunking-disinformation-myths-part. Roughly: People tend to greatly overestimate how much people are informing themselves about social media in my view.  * Re. point 4: I'm not complaining it's a good thing if establishment propaganda outcompetes anti-establishment propaganda. As I explicitly say, I think this is actually a genuine danger. It's just different from the danger most people focus on when they think about this issue. More generally, in thinking about AI-based disinformation, people tend to assume that AI will only benefit disinformation campaigns, which is not true, and the fact it is not true should be taken into consideration when evaluating the impact of AI-based disinformation.  You write of my arguments: "They feel largely handwavy in the sense of "here is an argument which points in that direction", but it's really hard to know how hard they push that direction. There is ample opportunity for quantitative and detailed analysis (which I generally would find more convincing), but that isn't made here, and is instead obfuscated in links to other work."  It's a fair point. This was just written up as a blog post as a hobby in my spare time. However, what really bothers me is the asymmetry here: There is a VAST amount of alarmism about AI-based disinformation that is guilty of the problems you criticise in my arguments, and in fact features much less reference to existing empirical research. So it felt important for me to push back a bit, and more generally I think it's important t
1
Aaron_Scher
2mo
Elaborating on point 1 and the "misinformation is only a small part of why the system is broken" idea:  The current system could be broken in many ways but at some equilibrium of sorts. Upsetting this equilibrium could have substantial effects because, for instance, people's built immune response to current misinformation is not as well trained as their built immune response to traditionally biased media.  Additionally, intervening on misinformation could be far more tractable than other methods of improving things. I don't have a solid grasp of what the problem is and what makes is worse, but a number of potential causes do seem much harder to intervene on than misinformation: general ignorance, poor education, political apathy. It can be the case that misinformation makes the situation merely 5% worse but is substantially easier to fix than these other issues. 

Due to current outsourcing being of data labeling, I think one of the issues you express in the post is very unlikely:

My general worry is that in future, the global south shall become the training ground for more harmful AI projects that would be prohibited within the Global North. Is this something that I and other people should be concerned about?

Maybe there's an argument about how: 

  • current practices are evidence that AI companies are trying to avoid following the laws (note I mostly don't believe this), 
  • and this is why they're outsourcing part
... (read more)

This line of argument suggests that slow takeoff is inherently harder to steer. Because pretty much any version of slow takeoff means that the world will change a ton before we get strongly superhuman AI.

I'm not sure I agree that the argument suggests that. I'm also not sure slow takeoff is harder to steer than other forms of takeoff — they all seem hard to steer. I think I messed up the phrasing because I wasn't thinking about it the right way. Here's another shot:

Widespread AI deployment is pretty wild. If timelines are short, we might get attempts at AI... (read more)

I think these don’t bite nearly as hard for conditional pauses, since they occur in the future when progress will be slower

Your footnote is about compute scaling, so presumably you think that's a major factor for AI progress, and why future progress will be slower. The main consideration pointing the other direction (imo) is automated researchers speeding things up a lot. I guess you think we don't get huge speedups here until after the conditional pause triggers are hit (in terms of when various capabilities emerge)? If we do have the capabilities for automated researchers, and a pause locks these up, that's still pretty massive (capability) overhang territory. 

1
AnonResearcherMajorAILab
7mo
Yeah, unless we get a lot better at alignment, the conditional pause should hit well before we create automated researchers.

While I’m very uncertain, on balance I think it provides more serial time to do alignment research. As model capabilities improve and we get more legible evidence of AI risk, the will to pause should increase, and so the expected length of a pause should also increase [footnote explaining that the mechanism here is that the dangers of GPT-5 galvanize more support than GPT-4]

I appreciate flagging the uncertainty; this argument doesn't seem right to me. 

One factor affecting the length of a pause would be the (opportunity cost from pause) / (risk of cata... (read more)

4
AnonResearcherMajorAILab
7mo
I agree it's important to think about the perceived opportunity cost as well, and that's a large part of why I'm uncertain. I probably should have said that in the post. I'd still guess that overall the increased clarity on risks will be the bigger factor -- it seems to me that risk aversion is a much larger driver of policy than worries about economic opportunity cost (see e.g. COVID lockdowns). I would be more worried about powerful AI systems being seen as integral to national security; my understanding is that national security concerns drive a lot of policy. (But this could potentially be overcome with international agreements.)

Sorry, I agree my previous comment was a bit intense. I think I wouldn't get triggered if you instead asked "I wonder if a crux is that we disagree on the likelihood of existential catastrophe from AGI. I think it's very likely (>50%), what do you think?" 

P(doom) is not why I disagree with you. It feels a little like if I'm arguing with an environmentalist about recycling and they go "wow do you even care about the environment?" Sure, that could be a crux, but in this case it isn't and the question is asked in a way that is trying to force me to ag... (read more)

I don't think you read my comment:

I don't think extra time pre-transformative-AI is particularly valuable except its impact on existential risk

I also think it's bad how you (and a bunch of other people on the internet) ask this p(doom) question in a way that (in my read of things) is trying to force somebody into a corner of agreeing with you. It doesn't feel like good faith so much as bullying people into agreeing with you. But that's just my read of things without much thought. At a gut level I expect we die, my from-the-arguments / inside view is something like 60%, and my "all things considered" view is more like 40% doom. 

-1
Greg_Colbourn
7mo
Wow that escalated quickly :( It's really not. I'm trying to understand where people are coming from. If someone has low p(doom|AGI), then it makes sense that they don't see pausing AI development as urgent. Or their p(doom) relative to their actions can give some idea of how risk taking they are (but I still don't understand how OpenAI and their supporters think it's ok to gamble 100s of millions of lives in expectation for a shot at utopia without any democratic mandate). and Surely means that extra time now (pausing) is extremely valuable? i.e. because of its impact on existential risk. Or do you think that the chance we're in a net negative world now means that the astronomical future we could save would also most likely be net negative? I don' think this follows. Or that continuing to allow AI to speed up now will actually prevent extinction threats in the next 10 years that we would otherwise be wiped out by (this seems very unlikely to me).

Yep, seems reasonable, I don't really have any clue here. One consideration is that this AI is probably way better than all the human scientists and can design particularly high-value experiments, also biological simulations will likely be much better in the future. Maybe the bio-security community gets a bunch of useful stuff done by then which makes the AI's job even harder. 

there will be governance mechanisms put in place after a failure

Yep, seems reasonably likely, and we sure don't know how to do this now. 

I'm not sure where I'm assuming we can't pause dangerous AI "development long enough to build aligned AI that would be more capable of ensuring safety"? This is a large part of what I mean with the underlying end-game plan in this post (which I didn't state super explicitly, sorry), e.g. the centralization point

centralization is good because it gives this project more time for safety work and securing the world

I'm curious why you don't include intellectually aggressive culture in the summary? It seems like this was a notable part of a few of the case studies. Did the others just not mention this, or is there information indicating they didn't have this culture? I'm curious how widespread this feature is. e.g., 

The intellectual atmosphere seems to have been fairly aggressive. For instance, it was common (and accepted) that some researchers would shout “bullshit” and lecture the speaker on why they were wrong.

we need capabilities to increase so that we can stay up to date with alignment research

I think one of the better write-ups about this perspective is Anthropic's Core Views on AI Safety

From its main text, under the heading The Role of Frontier Models in Empirical Safety, a couple relevant arguments are: 

  • Many safety concerns arise with powerful systems, so we need to have powerful systems to experiment with
  • Many safety methods require large/powerful models
  • Need to understand how both problems and our fixes change with model scale (if model gets big
... (read more)

Thanks Aaron that's a good article appreciate it. It still wasn't clear to me they were making an argument that increasing capabilities could be net positive, more that safety people should be working with whatever is the current most powerful model

"But we also cannot let excessive caution make it so that the most safety-conscious research efforts only ever engage with systems that are far behind the frontier." 

This makes sense to me, the best safety researchers should have full access to the current most advanced models, preferably in my eyes before ... (read more)

Not responding to your main question:

Second in a theoretical situation where capabilities research globally stopped overnight, isn't this just free-extra-time for the human race where we aren't moving towards doom? That feels pretty valuable and high EV in and of itself. 

I'm interpreting this as saying that buying humanity more time, in and of itself, is good. 

I don't think extra time pre-transformative-AI is particularly valuable except its impact on existential risk. Two reasons for why I think this:

... (read more)
2
Greg_Colbourn
7mo
What is your p(doom|AGI)? (Assuming AGI is developed in the next decade.) Note that Bostrom himself says in Astronomical Waste (my emphasis in bold):
4
NickLaing
7mo
Thanks Aaron appreciate the effort. I Faild to point out my central assumpton here, that Transformative AI in our current state of poor preparedness is net negative due to the existential risk it entails. Its a good point about time pre transformative AI not being so valuable in the grand scheme of the future, but that ev would increase substantally assuming transformative AI is the end. Still looking for the fleshing out of this argument that I don't understand - if anyone can be bothered! "It seems to me the argument would have to be that the advantage to the safety work of improving capabilities would outstrip the increasing risk of dangerous GAI, which I find hard to get my head around, but I might be missing something important."  

I'm glad you wrote this post. Mostly before reading this post, I wrote a draft for what I want my personal conflict of interest policy to be, especially with regard to personal and professional relationships. Changing community norms can be hard, but changing my norms might be as easy as leaving a persuasive comment! I'm open to feedback and suggestions here for anybody interested. 

I think Ryan is probably overall right that it would be better to fund people for longer at a time. One counter-consideration that hasn't been mentioned yet: longer contracts implicitly and explicitly push people to keep doing something — that may be sub-optimal — because they drive up switching costs. 

If you have to apply for funding once a year no matter what you're working on, the "switching costs" of doing the same thing you've been doing are similar to the cost of switching (of course they aren't in general, but with regard to funding they might ... (read more)

2
RyanCarey
8mo
I agree. Early-career EAs are more likely to need to switch projects, less likely to merit multi-year funding, and have - on average - less need for job-security. Single-year funding seems OK for those cases. For people and orgs with significant track records, however, it seems hard to justify.
4
Austin
8mo
"Focused Research Org"

How is the super-alignment team going to interface with the rest of the AI alignment community, and specifically what kind of work from others would be helpful to them (e.g., evaluations they would want to exist in 2 years, specific problems in interpretability that seem important to solve early, curricula for AIs to learn about the alignment problem while avoiding content we may not want them reading)? 

To provide more context on my thinking that leads to this question: I'm pretty worried that OpenAI is making themselves a single point of failure in e... (read more)

I am not aware of modeling here, but I have thought about this a bit. Besides what you mention, some other ways I think this story may not pan out (very speculative):

  1. At the critical time, the cost of compute for automated researchers may be really high such that it's actually not cost effective to buy labor this way. This would mainly be because many people want to use the best hardware for AI training or productive work, and this demand just overwhelms suppliers and prices skyrocket. This is like the labs and governments paying a lot more except that they
... (read more)

A few weeks ago I did a quick calculation for the amount of digital suffering I expect in the short term, which probably gets at your question about these sizes, for the short term. tldr of my thinking on the topic: 

  • There is currently a global compute stock of ~1.4e21 FLOP/s (each second, we can do about that many floating point operations). 
  • It seems reasonable to expect this to grow ~40x in the next 10 years based on naively extrapolating current trends in spending and compute efficiency per dollar. That brings us to 1.6e23 FLOP/s in 2033. 
... (read more)

Thanks for your response. I'll just respond to a couple things. 

Re Constitutional AI: I agree normatively that it seems bad to hand over judging AI debates to AIs[1]. I also think this will happen. To quote from the original AI Safety via Debate paper, 

Human time is expensive: We may lack enough human time to judge every debate, which we can address by training ML models to predict human reward as in Christiano et al. [2017]. Most debates can be judged by the reward predictor rather than by the humans themselves. Critically, the reward predictors

... (read more)
7
zhengdong
10mo
Makes sense that this would be a big factor in what to do with our time, and AI timelines. And we're surprised too by how AI can overperform expectations, like in the sources you cited. We'd still say the best way of characterizing the problem of creating synthetic data is that it's a wide open problem, rather than high confidence that naive approaches using current LMs will just work. How about a general intuition instead of parsing individual sources. We wouldn't expect making the dataset bigger by just repeating the same example over and over to work. We generate data by having 'models' of the original data generators, humans. If we knew what exactly made human data 'good,' we could optimize directly for it and simplify massively (this runs into the well-defined eval problem again---we can craft datasets to beat benchmarks of course). An analogy (a disputed one, to be fair) is Ted Chiang's lossy compression. So for every case of synthetic data working, there's also cases where it fails, like Shumailov et el. we cited. If we knew exactly what made human data 'good,' we'd argue you wouldn't see labs continue to ramp up hiring contractors specifically to generate high-quality data in expert domains, like programming. A fun exercise---take a very small open-source dataset, train your own very small LM, and have it augment (double!) its own dataset. Try different prompts, plot n-gram distributions vs the original data. Can you get one behavior out of the next generation that looks like magic compared to the previous, or does improvement plateau? May have nitpicks with this experiment, but I don't think it's that different to what's happening at large scale.

The article doesn't seem to have a comment section so I'm putting some thoughts here. 

  • Economic growth: I don't feel I know enough about historical economic growth to comment on how much to weigh the "the trend growth rate of GDP per capita in the world's frontier economy has never exceeded three percent per year." I'll note that I think the framing here is quite different than that of Christiano's Hyperbolic Growth, despite them looking at roughly the same data as far as I can tell. 
  • Scaling current methods: the article seems to cherrypick the evi
... (read more)

Hey Aaron, thanks for your thorough comment. While we still disagree (explained a bit below), I'm also quite glad to read your comment :)

Re scaling current methods: The hundreds of billions figure we quoted does require more context not in our piece; SemiAnalysis explains in a bit more detail how they get to that number (eg assuming training in 3mo instead of 2 years). We don't want to haggle over the exact scale before it becomes infeasible, though---even if we get another 2 OOM in, we wanted to emphasize with our argument that 'the current method route' ... (read more)

2
PeterSlattery
10mo
I really appreciate that you took the time to provide such a detailed response to these arguments. I want to say this pretty often when on the forum, and maybe I should do it more often! 

I'm not Buck, but I can venture some thoughts as somebody who thinks it's reasonably likely we don't have much time

Given that "I'm skeptical that humans will go extinct in the near future" and that you prioritize preventing suffering over creating happiness, it seems reasonable for you to condition your plan on humanity surviving the creation of AGI. You might then back-chain from possible futures you want to steer toward or away from. For instance, if AGI enables space colonization, it sure would be terrible if we just had planets covered in factor... (read more)

1
Ren Ryba
1y
Thank you, I appreciate you taking the time to construct this convincing and high-quality comment. I'll reflect on this in detail. I did do some initial scoping work for longtermist animal stuff last year, of which AGI-enabled mass suffering was a major part of course, so might be time to dust that off.

I agree that persuasion frames are often a bad way to think about community building.

I also agree that community members should feel valuable, much in the way that I want everybody in the world to feel valued/loved.

I probably disagree about the implications, as they are affected by some other factors. One intuition that helps me is to think about the donors who donate toward community building efforts. I expect that these donors are mostly people who care about preventing kids from dying of malaria, and many donors also donate lots of money towards chariti... (read more)

Sorry about the name mistake. Thanks for the reply. I'm somewhat pessimistic about us two making progress on our disagreements here because it seems to me like we're very confused about basic concepts related to what we're talking about. But I will think about this and maybe give a more thorough answer later. 

Edit: corrected name, some typos and word clarity fixed

Overall I found this post hard to read and I spent far too long trying to understand it. I suspect the author is about as confused about key concepts as I am. David, thanks for writing this, I am glad to see writing on this topic and I think some of your points are gesturing in a useful and important direction. Below are some tentative thoughts about the arguments. For each core argument I first try to summarize your claim and then respond, hopefully this makes it clearer where we actually disagree vs.... (read more)

2
DavidW
1y
Thank you for your thoughtful feedback! You asked a lot of constructive questions, and I wanted to think carefully about my responses, so sorry for the delay. The first point in particular has helped me refine and clarify my own models.  That’s one plausible goal, but I would consider that a direct reward optimizer. That could be dangerous but is outside of the scope of this sequence. Direct reward optimizers can’t be deceptively aligned, because they are already trying to optimize reward in the training set, so there’s no need for them to do that instrumentally. Another possibility is that the model could just learn to predict future words based on past words. This is subtly but importantly different from maximizing reward, because optimizing for reward and minimizing loss might require some level of situational awareness about the loss function (but probably not as much as deceptive alignment requires).  A neural network doesn’t necessarily have a way to know that there’s a loss function to be optimized for. Loss is calculated after all of the model’s cognition, but the model would need to conceptualize that loss calculation before it actually happens to explicitly optimize for it. I haven’t thought about this nearly as much as I’ve thought about deceptive alignment, but it seems much more likely for a pre-trained model to form goals directly derived from past tokens (E.G. predict the next token) than directly optimizing for something that's not in its input (loss). This logic doesn’t carry over to the RL phase, because that will include a lot of information about the training environment in the prompt, which the model can use to infer information about the loss function. Pre-training, on the other hand, just uses a bunch of internet text without any context.  Assuming that pre-training doesn’t create a direct reward optimizer, I expect the goal of the model to shift when the training process shifts to fine-tuning with human feedback. A goal like next token pr

FWIW I often vote on posts at the top without scrolling because I listened to the post via the Nonlinear podcast library or read it on a platform that wasn't logged in. Not all that important of a consideration, but worth being aware of. 

Here are my notes which might not be easier to understand, but they are shorter and capture the key ideas:

  • Uneasiness about chains of reasoning with imperfect concepts
    • Uneasy about conjunctiveness: It’s not clear how conjunctive AI doom is (AI doom being conjunctive would mean that Thing A and Thing B and Thing C all have to happen or be true in order for AI doom; this is opposed to being disjunctive where either A, or B, or C would be sufficient for AI Doom), and Nate Soares’s response to Carlsmith’s powerseeking AI report is not a silver bullet; there is s
... (read more)
2
Pato
1y
Thanks! I think I understood everything now and in a really quick read.
6
NunoSempere
1y
Nice, thanks, great summary.

This evidence doesn't update me very much. 

I would prefer an EA Forum without your critical writing on it, because I think your critical writing has similar problems to this post...

I interpret this quote to be saying, "this style of criticism — which seems to lack a ToC and especially fails to engage with the cruxes its critics have, which feels much closer to shouting into the void than making progress on existing disagreements — is bad for the forum discourse by my lights. And it's fine for me to dissuade people from writing content which hurts disc... (read more)

I expect a project like this is not worth the cost. I imagine doing this well would require dozens of hours of interviews with people who are more senior in the EA movement, and I think many of those people’s time is often quite valuable.

Regarding the pros you mention:

  1. I’m not convinced that building more EA ethos/identity based around shared history is a good thing. I expect this would make it even harder to pivot to new things or treat EA as a question, it also wouldn’t be unifying for many folks (e.g. who having been thinking about AI safety for a dec

... (read more)
7
Pete Rowlett
1y
I’ve addressed the point on costs in other commentary, so we may just disagree there! 1. I think the core idea is that the EA ethos is about constantly asking how we can do the most good and updating based on new information.  So the book would hopefully codify that spirit rather than just talk about how great we’re doing. 2. I find it easier to trust people whose motivations I understand and who have demonstrated strong character in the past.  History can give a better sense of those two things.  Reading about Julia Wise in Strangers Drowning, for example, did that for me. 3. Humans often think about things in terms of stories.  If you want someone to care about global poverty, you have a few ways of approaching it.  You could tell them how many people live in extreme poverty and that by donating to GiveDirectly they’ll get way more QALYs per dollar than they would by donating elsewhere.  You could also tell them about your path to donating, and share a story from the GiveDirectly website about how a participant benefited the money they received.  In my experience, that’s the better strategy.  And absolutely, the EA community exists to serve a purpose.  Right now I think it’s reasonably good at doing the things that I care about, so I want it to continue to exist. 4. Agreed! I think there could be a particular audience for this book, and it likely wouldn’t be EA newbies.  The project could also take on a lot of different forms, from empirical report to personal history, depending on the writer.  Hopefully the right person sees this and decides to go for it if and when it makes sense!  Regardless, your commentary is appreciated.  

The short answer to your question is "yes, if major changes happen in the world fairly quickly, then career advice which does not take such changes into account will be flawed"

I would also point to the example of the advice "most of the impact in your career is likely to come later on in your life, like age 36+" (paraphrase from here and other places). I happen to believe there's a decent chance we have TAI/AGI by the time I'm 36 (maybe I'm >50% on this), which would make the advice less likely to be true. 

Other things to consider: if timelines are... (read more)

I like this comment and think it answers the question at the right level of analysis.

To try and summarize it back: EA’s big assumption is that you should purchase utilons, rather than fuzzies, with charity. This is very different from how many people think about the world and their relationship to charity. To claim that somebody’s way of “doing good” is not as good as they think is often interpreted by them as an attack on their character and identity, thus met with emotional defensiveness and counterattack.

EA ideas aim to change how people act and think (and for some core parts of their identity); such pressure is by default met with resistance.

I have only skimmed this, but it seems quite good and I want more things like it on the forum. Positive feedback! 

My phrasing below is more blunt and rude than I endorse, sorry. I’m writing quickly on my phone. I strong downvoted this post after reading the first 25% of it. Here are some reasons:

“Bayesianism purports that if we find enough confirming evidence we can at some point believe to have found “truth”.” Seems like a mischaracterization, given that sufficient new evidence should be able to change a Bayesian’s mind (tho I don’t know much about the topic).

“We cannot guess what knowledge people will create into the future” This is literally false, we can guess at ... (read more)

2
Toni MUENDEL
1y
Please see my reply’s in Bold   My phrasing below is more blunt and rude than I endorse, sorry. I’m writing quickly on my phone. I strong downvoted this post after reading the first 25% of it. Here are some reasons: “Bayesianism purports that if we find enough confirming evidence we can at some point believe to have found “truth”.” Seems like a mischaracterization, given that sufficient new evidence should be able to change a Bayesian’s mind (tho I don’t know much about the topic).   Yes, that is how Baysianism works. A Bayesian will change their mind based on either the confirming or disconfirming evidence.   “We cannot guess what knowledge people will create into the future” This is literally false, we can guess at this and we can have a significant degree of accuracy. E.g. I predict that there will be a winner in the 2020 US presidential election, even though I don’t know who it will be.   I agree with you, but you can’t predict this for things that will happen 100 years from now. We may find better ways to govern by then.   I can guess that there will be computer chips which utilize energy more efficiently than the current state of the art, even though I do not know what such chips will look like (heck I don’t understand current chips).   Yes, this is the case if progress continues. But it isn’t inevitable. There are groups attempting to create a society that inhibit progress. If our growth culture changes, our “chip” producing progress could stop. Then any prediction about more efficient chips would be an error.   “We can’t predict the knowledge we will have in the future. If we could, we would implement that knowledge today” Still obviously false. Engineers often know approximately what the final product will look like without figuring out all the details along the way.   Yes, that works incrementally. But what the engineers can’t predict is what their next, next renditions of their product will be. And those subsequent steps is what I’m referri

I liked this post and would like to see more of people thinking for themselves about cause prioritization and doing BOTECs.

Some scattered thoughts below, also in the spirit of draft amnesty.

I had a little trouble understanding your calculations/logic, so I'm going to write them out in sentence form: GiveWell's current giving recommendations correspond to spending about $0.50 to save an additional person an additional year of life.  A 10% chance of extinction from misaligned AI means that postponing misaligned AI by a year gets us 10%*current populatio... (read more)

I am a bit confused by the key question / claim. It seems to be some variant of "Powerful AI may allow the development of technology which could be used to destroy the world. While the AI Alignment problem is about getting advanced AIs to do what their human operator wants, this could still lead to an existential catastrophe if we live in such a vulnerable world where unilateral actors can deploy destructive technology. Thus actual safety looks like not just having Aligned AGI, but also ensuring that the world doesn't get destroyed by bad or careless or un... (read more)

1
Jordan Arel
1y
Thank you so much for this reply! I’m glad to know there is already some work on this, makes my job a lot easier. I will definitely look into the articles you mentioned and perhaps just study AI risk / AI safety a lot more in general to get a better understanding of how people think about this. It sounds like what people call “deployment” may be very relevant, so well especially look into this.

Personally I didn't put much weight on this sentence because the more-important-to-me evidence is many EAs being on the political left (which feels sufficient for the claim that EA is not a generally conservative set of ideas, as is sometimes claimed). See the 2019 EA Survey in which 72% of respondents identified as Left or Center Left. 

“There are also strong selection effects on retreat attendees vs. intro fellows”

I wonder what these selection effects are. I imagine you get a higher proportion of people who think they are very excited about EA. But also, many of the wicked smart, high achieving people I know are quite busy and don’t think they have time for a retreat like this, so I wonder if you’re somewhat selecting against these people?

Similarly, people who are very thoughtful about opportunity costs and how they spend their time might feel like a commitment like this is too big given that they don’t know much about EA yet and don’t know how much they agree/want to be involved.

Thanks for making this. I expect that after you make edits based on comments and such this will be the most up to date and accurate public look at this question (the current size of the field). I look forward to linking people to it!

I disagree with a couple specific points as well as the overall thrust of this post. Thank you for writing it!

A maximizing viewpoint can say that we need to be cautious lest we do something wonderful but not maximally so. But in practice, embracing a pragmatic viewpoint, saving money while searching for the maximum seems bad.

I think I strongly disagree with this because opportunities for impact appear heavy tailed. Funding 2 interventions that are in the 90th percentile is likely less good than funding 1 intervention in the 99th percentile. Given this stat... (read more)

3
Davidmanheim
2y
I agree that we mostly agree. That said, I think that I either disagree with what it seems you recommend operationally, or we're talking past one another. "Funding 2 interventions that are in the 90th percentile is likely less good than funding 1 intervention in the 99th percentile. Given this state of the world, spending much of our resources trying to identify the maximum is worthwhile." Yes, we should do that second thing. But how much of our resources do we spend on identifying what exists? I'd agree that1% of total EA giving going to cause exploration is obviously good, 10% is justifiable, and 50% is not even reasonable. It was probably the right allocation when Givewell was started, but it isn't now, and as a community we've looked at thousands of potential cause areas and interventions, and are colelctively sitting on quite a bit of money, an amount that seems to be increasing over time. Now we need to do things. The question now is whether we care about funding the 99.9% interventions we have, versus waiting for certainty that it's a 99.99% intervention, or a  a 99.999% intervention, and spending to find it, or saving to fund it.  " I think the default of the world is that I donate to a charity in the 50th percentile..." Agreed, and we need to fix that. "...And if I adopt a weak mandate to do lots of good (a non-maximizing frame, or an early EA movement), I will probably identify and donate to a charity in the 90th percentile." And that's where this lost me. In the early EA movement, this was true, and I would have pushed for more research and less giving early on. (But they did that.) And for people who haven't previously been exposed to EA, yes, there's a danger of under-optimizing, though it is mostly mitigated about an hour after looking at Givewell's web site. The community is not at the point of looking at 90th percentile charities now, and continuing to think the things we've found are 90th percentile, and acting that way, and we need to  evaluat

You write:

Another possible reason to argue for a zero-discount rate is that the intrinsic value of humanity increases at a rate greater than the long-run catastrophe rate[19]. This is wrong for (at least) 2 reasons. 

Your footnote is to The Precipice: To quote from The Precipice Appendix E:

by many measures the value of humanity has increased substantially over the centuries. This progress has been very uneven over short periods, but remarkably robust over the long run. We live long lives filled with cultural and material riches that would have seemed l

... (read more)

Welcome to the forum! I am glad that you posted this! And also I disagree with much of it. Carl Shulman already responded explaining why he things the extinction rate approaches zero fairly soon, reasoning I agree with. 

Under a stable future population, where people produce (on average) only enough offspring to replace themselves, a person’s expected number of descendants is equal to the expected length of human existence, divided by the average lifespan (). I estimate this figure is 93[22].

To be consistent, when comparing lives saved in pr

... (read more)

I read this post around the beginning of March this year (~6 months ago). I think reading this post was probably net-negative for my life plans. Here are some thoughts about why I think reading this post was bad for me, or at least not very good. I have not re-read the post since then, so maybe some of my ideas are dumb for obvious reasons. 

I think the broad emphasis on general skill and capacity building often comes at the expense of directly pursuing your goals. In many ways, the post is like “Skill up in an aptitude because in the future this might... (read more)

This is great and I’m glad you wrote it. For what it’s worth, the evidence from global health does not appear to me strong enough to justify high credence (>90%) in the claim “some ways of doing good are much better than others” (maybe operationalized as "the top 1% of charities are >50x more cost-effective than the median", but I  made up these numbers).

The DCP2 (2006) data (cited by Ord, 2013) gives the distribution of the cost-effectiveness of global health interventions. This is not the distribution of the cost-effectiveness of possible dona... (read more)

5
Owen Cotton-Barratt
2y
Yeah I think this is a really good question and would be excited to see that kind of analysis. Maybe I'd make the numerator be "# of charitable $ spent" rather than "# of charities" to avoid having the results be swamped by which areas have the most very small charities.  It might also be pretty interesting to do some similar analysis of how good interventions in different broad areas look on longtermist grounds (although this necessarily involve a lot more subjective judgements).

The edit is key here. I would consider running an AI-safety arguments competition in order to do better outreach to graduate-and-above level researchers to be a form of movement building and one for which crunch time could be in the last 5 years before AGI (although probably earlier is better for norm changes). 

One value add from compiling good arguments is that if there is a period of panic following advanced capabilities (some form of firealarm), then it will be really helpful to have existing and high quality arguments and resources on hand to help... (read more)

2
ThomasW
2y
Aaron didn't link it, so if people aren't aware,  we are running that competition (judging in progress).

I’m a bit confused by this post. I’m going to summarize the main idea back, and I would appreciate it if you could correct me where I’m misinterpreting.

Human psychology is flawed in such a way that we consistently estimate the probability of existential risk from each cause to be ~10% by default. In reality, the probability of existential risk from particular causes is generally less than 10% [this feels like an implicit assumption], so finding more information about the risks causes us to decrease our worry about those risks. We can get more information a... (read more)

5
bgarfinkel
2y
This is a helpful comment - I'll see if I can reframe some points to make them clearer. I'm actually not assuming human psychology is flawed. The post is meant to be talking about how a rational person (or, at least, a boundedly rational person) should update their views. On the probabilities: I suppose I'm implicitly evoking both a subjective notion of probability ("What's a reasonable credence to assign to X happening?" or "If you were betting on X, what betting odds should you be willing to accept?") and a more objective notion ("How strong is the propensity for X to happen?" or "How likely is X actually?" or "If you replayed the tape a billion times, with slight tweaks to the initial conditions, how often would X happen?").[1] What it means for something to pose a "major risk," in the language I'm using, is for the objective probability of doom to be high. For example, let's take existential risks from overpopulation. In the 60s and 70s, a lot of serious people were worried about near-term existential risks from overpopulation and environmental depletion. In hindsight, we can see that overpopulation actually wasn't a major risk. However, this wouldn't have been clear to someone first encountering the idea and noticing how many experts took it seriously. I think it might have been reasonable for someone first hearing about The Population Bomb to assign something on the order of a 10% credence to overpopulation being a major risk. I think, for a small number of other proposed existential risks, we're in a similar epistemic position. We don't yet know enough to say whether it's actually a major risk, but we've heard enough to justify a significant credence in the hypothesis that it is one.[2] If you assign a 10% credence to something not being a major risk, then you should assign a roughly 90% credence to further evidence/arguments helping you see that it's not a major risk. If you become increasingly confident that it's not a major risk, then your credence in

A solution that doesn’t actually work but might be slightly useful: Slow the lemons by making EA-related Funding things less appealing than the alternative.

One specific way to do this is to pay less than industry pays for similar positions: altruistic pay cut. Lightcone, the org Habryka runs, does this: “Our current salary policy is to pay rates competitive with industry salary minus 30%.” At a full-time employment level, this seems like one way to dissuade people who are interested in money, at least assuming they are qualified and hard working enough to ... (read more)

Good question. Short answer: despite being an April Fools post, that post seems to encapsulate much of what Yudkowski actually believes – so the social context is that the post is joking in its tone and content but not so much the attitude of the author; sorry I can't link to anything to further substantiate this. I believe Yudkowski's general policy is to not put numbers on his estimates.

Better answer: Here is a somewhat up-to-date database about predictions about existential risk chances from some folks in the community. You'll notice these are far below... (read more)

2
ada
2y
Thanks for the reply. I had no idea the spread was so wide (<2% to >98% in the last link you mentioned)! I guess the nice thing about most of these estimates is they are still well above the ridiculously low orders of magnitude that might prompt a sense of 'wait, I should actually upper-bound my estimate of humanity's future QALYs in order to avoid getting mugged by Pascal.' It's a pretty firm foundation for longtermism imo.

#17 in the spreadsheet is "How much do charities differ in impact?"

I would love to see an actual distribution of charity cost-effectiveness. As far as I know, that doesn't exist. Most folks rely on Ord (2013) which is the distribution of health interventions, but it says nothing about where charities actually do work. 

Load more