Effective Altruism Forum

All of rgb's Comments + Replies

Postdocs and PhD/MS positions in Farmed Insect Welfare for 2024

rgb7mo10

small correction that Jonathan Birch is at LSE, not QMUL. Lars Chittka, the co-lead of the project, is at QMUL.

Constance Li

7mo

Corrected, thank you!

Panel discussion on AI consciousness with Rob Long and Jeff Sebo

rgb8mo5

You’re correct, Fai - Jeff is not a co-author on the paper. The other participants - Patrick Butlin, Yoshua Bengio, and Grace Lindsay - are.

AMA: Peter Wildeford (Co-CEO at Rethink Priorities)

rgb9mo6

What's something about you that might surprise people who only know your public, "professional EA" persona?

Peter Wildeford

9mo

* I like pop music, like Ariana Grande and Olivia Rodrigo, though Taylor Swift is the Greatest of All Time. I went to the Eras Tour and loved it.
* I have strong opinions about the multiple types of pizza.
* I'm nowhere near as good at coming up with takes and opinions off-the-cuff in verbal conversations as I am in writing. I'm 10x smarter when I have access to the internet.

Why I don't trust forecasters

rgb10mo8

I suggest that “why I don’t trust pseudonymous forecasters” would be a more appropriate title. When I saw the title I expected an argument that would apply to all/most forecasting, but this worry is only about a particular subset

WobblyPanda2

10mo

The idea is that the potential for pseudonymous forecasting makes all forecaster track records suspect

Principles for AI Welfare Research

rgb10mo19

Unsurprisingly, I agree with a lot of this! It's nice to see these principles laid out clearly and concisely:

You write

AI welfare is potentially an extremely large-scale issue. In the same way that the invertebrate population is much larger than the vertebrate population at present, the digital population has the potential to be much larger than the biological population in the future.

Do you know of any work that estimates these sizes? There are various places that people have estimated the 'size of the future' including potential digital moral patients in the long run, but do you know of anything that estimates how many AI moral patients there could be by (say) 2030?

Aaron_Scher

10mo

A few weeks ago I did a quick calculation for the amount of digital suffering I expect in the short term, which probably gets at your question about these sizes. tldr of my thinking on the topic:
* There is currently a global compute stock of ~1.4e21 FLOP/s (each second, we can do about that many floating point operations).
* It seems reasonable to expect this to grow ~115x in the next 10 years based on naively extrapolating current trends in spending and compute efficiency per dollar. That brings us to 1.6e23 FLOP/s in 2033.
* Human brains do about 1e15 FLOP/s (each second, a human brain does about 1e15 floating point operations worth of computation).
* We might naively assume that future AIs will have similar consciousness-compute efficiency to humans. We'll also assume that 63% of the 2033 compute stock is being used to run such AIs (makes the numbers easier).
* Then the number of human-consciousness-second-equivalent AIs that can be run each second in 2033 is 1e23 / 1e15 = 1e8, or 100 million.
* For reference, there are probably around 31 billion land animals being factory farmed each second. I make a few adjustments based on brain size and guesses about the experience of suffering AIs and get that digital suffering in 2033 seems to be similar in scale to factory farming.
* Overall my analysis is extremely uncertain, and I'm unsurprised if it's off by 3 orders of magnitude in either direction. Also note that I am only looking at the short term.

You can read the slightly more thorough, but still extremely rough and likely wrong BOTEC here
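For anyone who wants to poke at the numbers, here is a minimal Python sketch of the arithmetic in the comment above. All inputs are the comment's own figures; the variable names are illustrative, and the brain-size and suffering adjustments mentioned there are not modeled because the comment does not spell them out.

```python
# Rough re-run of the back-of-the-envelope calculation.
# Every input is a figure quoted in the comment; names are hypothetical.
compute_2023 = 1.4e21       # FLOP/s, current global compute stock
compute_2033 = 1.6e23       # FLOP/s, the comment's projected 2033 stock
brain_flops = 1e15          # FLOP/s attributed to one human brain
share_running_ais = 0.63    # assumed share of 2033 compute running such AIs

# Growth factor implied by the two compute-stock figures.
implied_growth = compute_2033 / compute_2023

# Compute devoted to the hypothesized AIs, and how many
# human-brain-equivalents that compute could run at once.
ai_compute = share_running_ais * compute_2033
human_equivalents = ai_compute / brain_flops

print(f"implied growth factor: {implied_growth:.0f}x")
print(f"compute running AIs:   {ai_compute:.2e} FLOP/s")
print(f"human-equivalent AIs:  {human_equivalents:.2e}")
```

Changing `share_running_ais` or `brain_flops` by an order of magnitude moves the result by the same factor, which is one way to see why the comment allows for an error of several orders of magnitude.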

Vasco Grilo

10mo

Hi Robert, Somewhat relatedly, do you happen to have a guess for the welfare range of GPT-4 compared to that of a human? Feel free to give a 90 % confidence interval with as many orders of magnitude as you like. My intuitive guess would be something like a loguniform distribution ranging from 10^-6 to 1, whose mean of 0.07 is similar to Rethink Priorities' median welfare range for bees.
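As a quick check on the quoted mean, here is a small sketch that assumes "loguniform" means the logarithm of the welfare range is uniform between log(10^-6) and log(1). The function name is just illustrative. Under that assumption the mean of a loguniform distribution on [a, b] is (b - a) / ln(b / a).

```python
import math


def loguniform_mean(low: float, high: float) -> float:
    """Mean of a variable whose log is uniform on [log(low), log(high)]."""
    return (high - low) / math.log(high / low)


# Range quoted in the comment: 10^-6 to 1.
mean = loguniform_mean(1e-6, 1.0)
print(f"{mean:.3f}")  # about 0.072, i.e. roughly the 0.07 quoted
```

The median of the same distribution is sqrt(a * b) = 10^-3, so the mean sits well above the median and is driven mostly by the upper end of the range.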

jeffsebo10mo11

No, but this would be useful! Some quick thoughts:

A lot depends on our standard for moral inclusion. If we think that we should include all potential moral patients in the moral circle, then we might include a large number of near-term AI systems. If, in contrast, we think that we should include only beings with at least, say, a 0.1% chance of being moral patients, then we might include a smaller number.
With respect to the AI systems we include, one question is how many there will be. This is partly a question about moral individuation. Insofar as di

...

Common-sense sentience is dubious

rgb1y3

Hi Timothy! I agree with your main claim that "assumptions [about sentience] are often dubious as they are based on intuitions that might not necessarily ‘track’ sentience", shaped as they are by potentially unreliable evolutionary and cultural factors. I also think it's a very important point! I commend you for laying it out in a detailed way.

I'd like to offer a piece of constructive criticism if I may. I'd add more to the piece that answers, for the reader:

what kind of piece am I reading? What is going to happen in it?
why should I care about the centr

...

Timothy Chan

Thanks for the advice :) I added a summary to the post (hopefully making it more readable).

What to think when a language model tells you it's sentient

rgb1y3

Hi Brian! Thanks for your reply. I think you're quite right to distinguish between your flavor of panpsychism and the flavor I was saying doesn't entail much about LLMs. I'm going to update my comment above to make that clearer, and sorry for running together your view with those others.

Brian_Tomasik

No worries. :) The update looks good.

What to think when a language model tells you it's sentient

rgb1y5

Ah, thanks! Well, even if it wasn't appropriately directed at your claim, I appreciate the opportunity to rant about how panpsychism (and related views) don't entail AI sentience :)

Brian_Tomasik1y11

Unlike the version of panpsychism that has become fashionable in philosophy in recent years, my version of panpsychism is based on the fuzziness of the concept of consciousness. My view involves attributing consciousness to all physical systems (including higher-level ones like organisms and AIs) to the degree they show various properties that we think are important for consciousness, such as perhaps a global workspace, higher-order reflection, learning and memory, intelligence, etc. I'm a panpsychist because I think at least some attributes of consciou...

What to think when a language model tells you it's sentient

rgb1y6

The Brian Tomasik post you link to considers the view that fundamental physical operations may have moral weight (call this view "Physics Sentience").

[Edit: see Tomasik's comment below. What I say below is true of a different sort of Physics Sentience view like constitutive micropsychism, but not necessarily of Brian's own view, which has somewhat different motivations and implications]

But even if true, [many versions of] Physics Sentience [but not necessarily Tomasik's] doesn't have straightforward implications about what high-level systems, like or...

Vasco Grilo

Thanks for the clarification! I linked to Brian Tomasik's post to provide useful context, but I wanted to point to a more general argument: we do not understand sentience/consciousness well enough to claim LLMs (or whatever) have null expected moral weight.

What to think when a language model tells you it's sentient

rgb1y1

I like it! I think one thing the post itself could have been clearer on is that reports could be indirect evidence for sentience, in that they are evidence of certain capabilities that are themselves evidence of sentience. To give an example (though it’s still abstract), the ability of LLMs to fluently mimic human speech -> evidence for capability C -> evidence for sentience. You can imagine the same thing for parrots: ability to say “I’m in pain” -> evidence of learning and memory -> evidence of sentience. But what they aren’t are reports of sentience.

so maybe at the beginning: aren’t “strong evidence” or “straightforward evidence”

Zoe Williams

Fixed, thanks!

What to think when a language model tells you it's sentient

rgb1y2

Thanks for the comment. A couple replies:

I want to clarify that these are examples of self-reports about consciousness and not evidence of consciousness in humans.

Self-report is evidence of consciousness in the Bayesian sense (and in common parlance): in a wide range of scenarios, if a human says they are conscious of something, you should have a higher credence than if they do not say they are. And in the scientific sense: it's commonly and appropriately taken as evidence in scientific practice; here is Chalmers's "How Can We Construct a Science of Consci...

Rocket

Thanks for following up and thanks for the references! Definitely agree these statements are evidence; I should have been more precise and said that they're weak evidence / not likely to move your credences in the existence/prevalence of human consciousness.

What to think when a language model tells you it's sentient

rgb1y2

Agree, that's a great pointer! For those interested, here is the paper and here is the podcast episode.

[Edited to add a nit-pick: the term 'meta-consciousness' is not used, it's the 'meta-problem of consciousness', which is the problem of explaining why people think and talk the way they do about consciousness]

What to think when a language model tells you it's sentient

rgb1y1

Thank you!

rgb1y5

I enjoyed this excerpt and the pointer to the interview, thanks. It might be helpful to say in the post who Jim Davies is.

Announcing the Future Fund's AI Worldview Prize

rgb2y1

That may be right - an alternative would be to taboo the word in the post, and just explain that they are going to use people with an independent, objective track record of being good at reasoning under uncertainty.

Of course, some people might be (wrongly, imo) skeptical of even that notion, but I suppose there's only so much one can do to get everyone on board. It's a tricky balance of making it accessible to outsiders while still just saying what you believe about how the contest should work.

Guy Raveh

To be clear, I wrote "superforecasters" not because I mean the word, but because I think the very notion is controversial like you said - for example, I personally doubt the existence of people who can be predictably "good at reasoning under uncertainty" in areas where they have no expertise.

Announcing the Future Fund's AI Worldview Prize

rgb2y19

I think that the post should explain briefly, or even just link to, what a “superforecaster” is. And if possible explain how and why this serves as an independent check.

The superforecaster panel is imo a credible signal of good faith, but people outside of the community may think “superforecasters” just means something arbitrary and/or weird and/or made up by FTX.

(The post links to Tetlock’s book, but not in the context of explaining the panel)

Guy Raveh

I think this would be better than the current state, but really any use of "superforecasters" is going to be extremely off-putting to outsiders.

Schisms are bad, actually: Community breakdown sucks

rgb2y3

I think you mean “schisms”

(p-)Zombie Universe: another X-risk

rgb2y5

You write,

Those who do see philosophical zombies as possible don’t have a clear idea of how consciousness relates to the brain, but they do think...that consciousness is something more than just the functions of the brain. In their view, a digital person (an uploaded human mind which runs on software) may act like a conscious human, and even tell you all about its ‘conscious experience’, but it is possible that it is in fact empty of experience.

It's consistent to think that p-zombies are possible but to think that, given the laws of nature, digital peo...

Reducing nightmares as a cause area

rgb2y3

You might be interested in this LessWrong shortform post by Harri Besceli, "The best and worst experiences you had last week probably happened when you were dreaming." Including a comment from gwern.

Drew Housman

I liked that a lot, thanks for the share. This reminds me that I wanted to note that I also have incredibly positive dreams as well. I have higher dream highs, I am pretty sure, than most of my friends. This is especially true if I am able to lucid dream. There are so many interesting things to note in that post and in Gwern's comment. There's one part of Gwern's post I'd like to pull over here because I think it adds to the conversation around how bad nightmares are (or aren't.) I agree with Harri's observation that some dreams can feel like they are lasting an incredibly long time. And if it's a bad dream that's obviously really bad. Gwern brushes that concern aside by saying: "You can't remember or produce hours of experience corresponding to [the dream], and when you try to intervene by waking people up in lucid dreams or doing tasks, they seem to still be processing time at a normal 1:1 rate." It doesn't seem relevant to me that the awake person is processing time normally. I care about the subjective experience of whatever person is suffering. And if someone has an awful experience that feels like it lasts hours or even days, I think that's worse than a subjectively shorter bad experience. This kind of stuff happens with psychedelics, too.

What if we don't need a "Hard Left Turn" to reach AGI?

rgb2y1

Thanks for the post! Wanted to flag a typo: “ To easily adapt to performing complex and difficult math problems, Minerva has That's not to say that Minerva is an AGI - it clearly isn't.”

Searle vs Bostrom: crucial considerations for EA AI work?

rgb2y11

Well, I looked it up and found a free pdf, and it turns out that Searle does consider this counterargument.

Why is it so important that the system be capable of consciousness? Why isn’t appropriate behavior enough? Of course for many purposes it is enough. If the computer can fly airplanes, drive cars, and win at chess, who cares if it is totally nonconscious? But if we are worried about a maliciously motivated superintelligence destroying us, then it is important that the malicious motivation should be real. Without consciousness, there is no possibility

rgb2y7

Feedback: I find the logo mildly unsettling. I think it triggers my face detector, and I see sharp teeth. A bit like the Radiohead logo.

On the other hand, maybe this is just a sign of some deep unwellness in my brain. Still, if even a small percentage of people get this feeling from the logo, could be worth reconsidering.

Max_Daniel

fwiw it also reminded me of the Radiohead logo.

Searle vs Bostrom: crucial considerations for EA AI work?

rgb2y18

Since the article is paywalled, it may be helpful to excerpt the key parts or say what you think Searle's argument is. I imagine the trivial inconvenience of having to register will prevent a lot of people from checking it out.

I read that article a while ago, but can't remember exactly what it says. To the extent that it is rehashing Searle's arguments that AIs, no matter how sophisticated their behavior, necessarily lack understanding / intentionality / something like that, then I think that Searle's arguments are just not that relevant to work on AI align...

rgb2y29

Just wanted to say that I really appreciated this post. As someone who followed the campaign with interest, but not super closely, I found it very informative about the campaign. And it covered all of the key questions I have been vaguely wondering about re: EAs running for office.

List of lists of EA-related open philosophy research questions

rgb2y7

opinionated (per its title) and non-comprehensive, but "Key questions about artificial sentience: an opinionated introduction" by me:

https://forum.effectivealtruism.org/posts/gFoWdiGYtXrhmBusH/key-questions-about-artificial-sentience-an-opinionated

My Job: EA Office Manager

rgb2y24

I work at Trajan House and I wanted to comment on this:

But a great office gives people the freedom to not worry about what they need for work, a warm environment in which they feel welcome and more productive, and supports them in ways they did not think were necessary.

By these metrics, Trajan House is a really great office! I'm so grateful for the work that Jonathan and the other operations staff do. It definitely makes me happier and more productive.

Trajan House in 2022 is a thriving hub of work, conversation, and fun.

Updates from Leverage Research: history, mistakes and new focus

rgb2y1

Leverage just released a working paper, "On Intention Research". From the post:

Starting in 2017, some of Leverage’s psychology researchers stumbled across unusual effects relating to the importance and power of subtle nonverbal communication. Initially, researchers began by attempting to understand and replicate some surprising effects caused by practitioners in traditions like bodywork and energy healing. Over time researchers investigated a wide range of phenomena in subtle nonverbal communication and developed an explanation for these phenomena accord

rgb2y2

Thanks for the comment! I agree with the thrust of this comment.

Learning more and thinking more clearly about implementation of computation in general, and neural computation in particular, is perennially on my intellectual to-do list.

We don't want to allow just any arbitrary gerrymandered states to count as an adequate implementation of consciousness's functional roles

maybe the neurons printed on each page aren't doing enough causal work in generating the next edition

I agree with the way you've formulated the problem, and the possible solution ...

How many EAs failed in high risk, high reward projects?

Answer by rgb, Apr 26, 2022, 64

Some past examples that come to mind. Kudos to all of the people mentioned for trying ambitious things, and writing up the retrospectives:

Not strictly speaking "EA", but an early effort from folks in the rationality community started an evidence-based medicine organization called MetaMed

Zvi Mowshowitz's post-mortem: https://thezvi.wordpress.com/2015/06/30/the-thing-and-the-symbolic-representation-of-the-thing/

Sarah Constantin's post-mortem: https://docs.google.com/document/d/1HzZd3jsG9YMU4DqHc62mMqKWtRer_KqFpiaeN-Q1rlI/edit

Michael Plant has a post-mo

rgb2y2

Thanks for writing this! Your work sounds super interesting. You write, “But you could be rewarded by the euphoric sense of revelation. Some of that sense may even be authentic; most of it will be fool’s gold.” What are some times you got that euphoric sense in your research for HLI?

JoelMcGuire

I'll assume you're asking about the times in which something was truly revealed to me, and I wasn’t (as is commonly the case) just confused? In that case, I’d say the top realizations are:
1. Most meta-analyses are limited in their usefulness for comparing the effectiveness of interventions, because few papers study where most of the impact happens: over time and in the rest of the recipient’s household or community. If we don’t know what happens over time or to the whole family, I don’t think we can confidently compare the effectiveness of interventions.
2. When replicating existing analyses, appreciating that there are more elegant methods to estimate and communicate cost-effectiveness than what appears to be state of the art in EA circles.
3. Finally, understanding that we don’t have a clear framework for deciding which measures or proxies of wellbeing are the best. There also appear to be straightforward ways to make progress here, but this has been little explored. And in general, we don’t do “philosophical robustness checks” even though our analysis/conclusions in the global wellbeing space often rely heavily on the philosophical view we endorse. My view here is that our global wellbeing priorities won’t conveniently converge across proxies of wellbeing (income, DALYs, SWB) or philosophical views of the badness of death or wellbeing.

I’ll probably try and expand on these more in future posts. Happy to talk more or arrange a call if you’d like!

The pretty hard problem of consciousness

rgb3y12

[Replying separately with comments on progress on the pretty hard problem; the hard problem; and the meta-problem of consciousness]

The meta-problem of consciousness is distinct from both a) the hard problem: roughly, the fundamental relationship between the physical and the phenomenal, and b) the pretty hard problem: roughly, knowing which systems are phenomenally conscious

The meta-problem is c) explaining "why we think consciousness poses a hard problem, or in other terms, the problem of explaining why we think consciousness is hard to explain" (6)

The me...

The pretty hard problem of consciousness

rgb3y6

[Replying separately with comments on progress on the pretty hard problem; the hard problem; and the meta-problem of consciousness]

Progress on the hard problem

I am much less sure of how to think about this than about the pretty hard problem. This is in part because in general, I'm pretty confused about how philosophical methodology works, what it can achieve, and the extent to which there is progress in philosophy. This uncertainty is not in spite of, but probably because of doing a PhD in philosophy! I have considerable uncertainty about these background ...

The pretty hard problem of consciousness

rgb3y9

That's a great question. I'll reply separately with my takes on progress on a) the pretty hard problem, b) the hard problem, and c) something called the meta-problem of consciousness [1].

[1] With apologies for introducing yet another 'problem' to distinguish between, when I've already introduced two! (Perhaps you can put these three problems into Anki?)

Progress on the pretty hard problem

This is my attempt to explain Jonathan Birch's recent proposal for studying invertebrate consciousness. Let me know if it makes rough sense!

The problem with studying anima...

Writing about my job: Research Fellow, FHI

rgb3y2

Oh and I should add: funnily enough, you are on my list of people to reach out to! :D

Writing about my job: Research Fellow, FHI

rgb3y5

Great question, I'm happy to share.

One thing that makes the reaching out easier in my case is that I do have one specific ask: whether they would be interested in (digitally) visiting the reading group. But I also ask if they'd like to talk with me one-on-one about their work. For this ask, I'll mention a paper of theirs that we have read in the reading group, and how I see it as related to what we are working on. And indicate what broad questions I'm trying to understand better, related to their work.

On the call itself, I am a) trying to get a better unde...

Writing about my job: Research Fellow, FHI

rgb3y15

Thanks Darius! It was my pleasure.

Writing about my job: Research Fellow, FHI

rgb3y11

That's a great point. A related point that I hadn't really clocked until someone pointed it out to me recently, though it's obvious in retrospect, is that (EA aside) in an academic department it is structurally unlikely that you will have a colleague who shares your research interests to a large extent. Since it's rare that a department is big enough to have two people doing the same thing, and departments need coverage of their whole field, for teaching and supervision.

Kevin Kuruc

That seems correct to me for the most part, though it might be less inevitable than you suspect, or at least this is my experience in economics. At my university they tried hiring two independent little 'clusters' (one being 'macro-development' which I was in) so I had a few people with similar enough interests to bounce ideas off of. A big caveat is that it's a fragile setup: after 1 left it's now just 2 of us with only loosely related interests. I have a friend in a similarly ranked department that did this for applied-environmental economics, so she has a few colleagues with similar interests. Everything said here is even truer of the top departments if you're a strong enough candidate to land one of those. My sense is that departments are wise enough to recognize the increasing returns to having peers with common interest at the expense of sticking faculty in teaching roles that are outside of their research areas. Though this will obviously vary job-to-job and should just be assessed when assessing whether to apply to a specific job; I just don't think it's universal enough to steer people away from academia.

Writing about my job: Economics Professor

rgb3y1

"I've learned to motivate myself, create mini-deadlines, etc. This is a constant work in progress - I still have entire days where I don't focus on what I should be doing - but I've gotten way better."

What do you think has led to this improvement, aside from just time and practice? Favorite tips / tricks / resources?

Kevin Kuruc

I'm not sure I have much to add aside from things I saw in your post (e.g., morning working, and other Cal Newport-ish tricks). I've found these to be really great. One thing I experimented with pre-pandemic, and am about to re-up, is canceling my WiFi. Obviously during the depth of the pandemic when I had to work full time from home I needed it, but I'm actually calling up my provider tomorrow to drop back off. I still had some data on my phone for a quick email and/or internet check, but this entirely eliminated useless scrolling, streaming, etc., at home that don't bring me joy. I think more people should try this -- maybe I'll write a short post making the case for it. EDIT: I did write that short post up, if anyone's interested.

Retrospective on thinking about my career for a year

rgb4y1

Thanks for this. I was curious about "Pick a niche or undervalued area and become the most knowledgeable person in it." Do you feel comfortable saying what the niche was? Or even if not, can you say a bit more about how you went about doing this?

careersthrowaway

I don't want to share more on the specific field. I did not start with a plan. As I say in the post, I started with writing one or two forum posts on the topic. People thought these were valuable. I read a few books on the topic. I connected with a few people as a result of this, either asking for advice or giving advice. I gave feedback on the writing of others. I focused on the same field during part of my internship, which also helped.

Parallels Between AI Safety by Debate and Evidence Law

rgb4y8

This is very interesting! I'm excited to see connections drawn between AI safety and the law / philosophy of law. It seems there are a lot of fruitful insights to be had.

You write,

The rules of Evidence have evolved over long experience with high-stakes debates, so their substantive findings on the types of arguments that prove problematic for truth-seeking are relevant to Debate.

Can you elaborate a bit on this?

I don't know anything about the history of these rules about evidence. But why think that over this history, these rules have trended to...

Cullen

Thanks for this very thoughtful comment! I think it is accurate to say that the rules of evidence have generally aimed for truth-seeking per se. That is their stated goal, and it generally explains the liberal standard for admission (relevance, which is a very low bar and tracks Bayesian epistemology well), the even more liberal standards for discovery, and most of the admissibility exceptions (which are generally explainable by humans' imperfect Bayesianism). You're definitely right that the legal system as a whole has many goals other than truth-seeking. However, those other goals are generally advanced through other aspects of the justice system. As an example, finality is a goal of the legal system, and is advanced through, among other things, statutes of limitations and repose. Similarly, the "beyond reasonable doubt" standard for criminal conviction is in some sense contrary to truth-seeking but advances the policy preference for underpunishment over overpunishment. You're also right that there are some exceptions to this within evidence law itself, but not many. For example, the attorney–client privilege exists not to facilitate truth-seeking, but to protect the attorney–client relationship. Similarly, the spousal privileges exist to protect the marital relationship. (Precisely because such privileges are contrary to truth-seeking, they are interpreted narrowly. See, e.g., United States v. Aramony, 88 F.3d 1369, 1389 (4th Cir. 1996); United States v. Suarez, 820 F.2d 1158, 1160 (11th Cir. 1987)). And of course, some rules of evidence have both truth-seeking and other policy rationales. Still, on the whole and in general, the rules of evidence are aimed towards truth.

AMA or discuss my 80K podcast episode: Ben Garfinkel, FHI researcher

rgb4y3

Thanks for the great summary! A few questions about it

1. You call mesa-optimization "the best current case for AI risk". As Ben noted at the time of the interview, this argument hasn't yet really been fleshed out in detail. And as Rohin subsequently wrote in his opinion of the mesa-optimization paper, "it is not yet clear whether mesa optimizers will actually arise in practice". Do you have thoughts on what exactly the "Argument for AI Risk from Mesa-Optimization" is, and/or a pointer to the places where, in your opinion,...

abergal

1. Oh man, I wish. :( I do think there are some people working on making a crisper case, and hopefully as machine learning systems get more powerful we might even see early demonstrations. I think the crispest statement of it I can make is "Similar to how humans are now optimizing for goals that are not just the genetic fitness evolution wants, other systems which contain optimizers may start optimizing for goals other than the ones specified by the outer optimizer."

Another related concept that I've seen (but haven't followed up on) is what johnswentworth calls "Demons in Imperfect Search", which basically advocates for the possibility of runaway inner processes in a variety of imperfect search spaces (not just ones that have inner optimizers). This arguably happened with metabolic reactions early in the development of life, greedy genes, managers in companies. Basically, I'm convinced that we don't know enough about how powerful search mechanisms work to be sure that we're going to end up somewhere we want.

I should also say that I think these kinds of arguments feel like the best current cases for AI alignment risk. Even if AI systems end up perfectly aligned with human goals, I'm still quite worried about what the balance of power looks like in a world with lots of extremely powerful AIs running around.

2. Yeah, here I should have said 'new species more intelligent than us'. I think I was thinking of two things here:
* Humans causing the extinction of less intelligent species
* Some folk intuition around intelligent aliens plausibly causing human extinction (I admit this isn't the best example...).

Mostly I meant here that since we don't actually have examples of existentially risky technology (yet), putting AI in the reference class of 'new technology' might make you think it's extremely implausible that it would be existentially bad. But we do have examples of species causing the extinction of lesser species (and scarier intuitions around it), so in the