If you're ever running an event that you are not excited to be part of, something has gone wrong
This seems way too strong to me. Eg, reasonable and effective intro talks feel like they wouldn't be much fun for me to do, yet seem likely to be high value
+1. The heuristic doesn’t always work.
(Though for an intro talk I would probably just modify the heuristic to “is this the kind of intro talk that would’ve actually excited a younger version of me?”)
Really excited to see this post come out! I think this is a really helpful guide to people who want to work on AI Alignment, and would have been pretty useful to me in the past.
This felt like an unusually high quality post in the genre of 'stuff I buy and use', thanks for writing it! I particularly appreciate the nutrition advice, plus actual discussion of your reasoning and epistemic confidences
I did a pure maths undergrad and recently switched to doing mechanistic interpretability work - my day job isn't exactly doing maths, but I find it has a strong aesthetic appeal in a similar way. My job is not to train an ML model (with all the mess and frustration that involves), it's to take a model someone else has trained, and try to rigorously understand what is going on with it. I want to take some behaviour I know it's capable of and understand how it does that, and ideally try to decompile the operations it's running into something human understa... (read more)
the reason the "longtermists working on AI risk" care about the total doom in 15 years is because it could cause extinction and preclude the possibility of a trillion happy sentient beings in the long term. Not because it will be bad for people alive today.
As a personal example, I work on AI risk and care a lot about harm to people alive today! I can't speak for the rest of the field, but I think the argument for working on AI risk goes through if you just care about people alive today and hold beliefs which are common in the field
- see this post I wrote... (read more)
Thanks for the recommendation! I've just finished reading it and really enjoyed it. Note for future readers that the titular "war" only really happens towards the end of the book, and most of it is about set up and exploring the idea of introducing newts to society
No worries, I'm excited to see more people saying this! (Though I did have some eerie deja vu when reading your post initially...)
I'd be curious if you have any easy-to-articulate feedback re why my post didn't feel like it was saying the same thing, or how to edit it to be better?
(EDIT: I guess the easiest object-level fix is to edit in a link at the top to yours, and say that I consider you to be making substantially the same point...)
Inside view feels deeply emotional and tied to how I feel the world to be, independent impression feels cold and abstract
How can we best allocate our limited resources to improve the world? Sub-question: Which resources are worth the effort to optimise the allocation of, and which are not, given that we all have limited time, effort and willpower?
I find this framing most helpful. In particular, for young people, the most valuable resource they have is their future labor. Initially, converting this to money and the money to donations was very effective, but now this is often outcompeted by working directly on high priority paths. But the underlying question remains. And I'd a... (read more)
The complaint that it's confusing jargon is fair. Though I do think the Tetlock sense of the phrase 'inside view' captures something important - my inside view is what feels true to me, according to my personal best guess and internal impressions. Deferring doesn't feel true in the same way, it feels like I'm overriding my beliefs, not like how the world is.
This mostly comes under the motivation point - maybe, for motivation, inside views matter but independent impressions don't? And people differ on how they feel about the two?
One thing I disagree with: the importance of forming inside views for community epistemic health. I think it's pretty important. E.g. I think that ~2 years ago, the arguments for the longterm importance of AGI safety were pretty underdeveloped; that since then lots more people have come out with their inside views about it; and that now the arguments are in much better shape.
I want to push back against this. The aggregate benefit may have been high, but when you divide it by all the people trying, I'm not convinced it's all that high.
Further, that's an ov... (read more)
Fair point re tractability
What argument do you think works on people who already think they're working on important and neglected problems? I can't think of any argument that doesn't just boil down to one of those
Thanks for the post! I broadly agree with the arguments you give, though I think you understate the tensions between promoting earning to give vs direct work.
Personal example: I'm currently doing AI Safety work, and I expect it to be fairly impactful. But I came fairly close to going into finance as it was a safe, stable path I was confident I'd enjoy. And part of this motivation was a fuzzy feeling that donating was still somewhat good. And this made it harder to internalise just how much higher the value from direct work was. Anecdotally, a lot of smart... (read more)
Thanks Neel for sharing your personal experience! I can see how this would be a concern with promoting earning to give too heavily.
However, Michael's post isn't advocating for promoting earning to give; it's about promoting effective giving. This is a really important distinction. GWWC is focused on promoting effective giving more broadly to the wider public and not focused on promoting earning to give as a career path.
Promoting effective giving outside the EA community helps fund important work, provides many people with a strong opportunity to have a big impact, and also brings people into the EA community.
But we want to make sure that the "truth-seeking" norms of this movement stay really really high.
I think there's two similar but different things here - truth-seeking and cause neutrality. Truth-seeking is the general point of 'it's really important to find truth, look past biases, care about evidence, etc', and cause neutrality is the specific form of truth-seeking which says that impact between different causes can differ enormously, and that it's worth looking past cached thoughts and the sunk cost fallacy to be open to moving to other causes.
I think truth-seeking c... (read more)
- I think there's a lot that goes into deciding which people are correct on this, and only saying "AI x-risk and bio x-risk are really important" is missing a bunch of stuff that feels pretty essential to my beliefs that x-risk is the best thing to work on
Can you say more about what you mean by this? To me, 'there's a 1% chance of extinction in my lifetime from a problem that fewer than 500 people worldwide are working on' feels totally sufficient
It's not enough to have an important problem: you need to be reasonably persuaded that there's a good plan for actually making the problem better, for bringing that 1% lower. It's not a universal point of view among people in the field that all or even most research that purports to be AI alignment or safety research is actually decreasing the probability of bad outcomes. Indeed, in both AI and bio it's even worse than that: many people believe that incautious action will make things substantially worse, and there's no easy road to identifying which routes are both safe... (read more)
This is a fair criticism! My short answer is that, as I perceive it, most people writing new EA pitches, designing fellowship curricula, giving EA career advice, etc, are longtermists and give pitches optimised for producing more people working on important longtermist stuff. And this post was a reaction to what I perceive as a failure in such pitches by focusing on moral philosophy. And I'm not really trying to engage with the broader question of whether this is a problem in the EA movement. Now OpenPhil is planning on doing neartermist EA movement buildi... (read more)
I think to the extent you are trying to draw the focus away from longtermist philosophical arguments when advocating for people to work on extinction risk reduction, that seems like a perfectly reasonable thing to suggest (though I'm unsure which side of the fence I'm on).
But I don't want people casually equivocating between x-risk reduction and EA, relegating the rest of the community to a footnote.
This seems like a really exciting project, I look forward to seeing where it goes!
As I understand it, a lot of the difficulty with new medical technology is running big and expensive clinical trials, and going through the process of getting approved by regulators. What's Alvea's plan for getting the capital and expertise necessary to do this?
Ah gotcha. So you're specifically objecting to people who say 'even if there's a 1% chance' based on vague intuition, and not to people who think carefully about AI risk, conclude that there's a 1% chance, and then act upon it?
Ah sorry, the original thing was badly phrased. I meant, a valid objection to x-risk work might be "I think that factory farming is really really bad right now, and prioritise this over dealing with x-risk". And if you don't care about the distant future, that argument seems pretty legit from some moral perspectives? While if you do care about the distant future, you need to answer the question of what the future distribution of animal welfare looks like, and it's not obviously positive. So to convince these people you'd need to convince them that the distribution is positive.
I haven't met anyone who's working on this stuff and says they're deferring on the philosophy (while I feel like I've often heard that people feel iffy/confused about the empirical claims).
Fair - maybe I feel that people mostly buy 'future people have non-zero worth and extinction sure is bad', but may be more uncertain on a totalising view like 'almost all value is in the far future, stuff today doesn't really matter, moral worth is the total number of future people and could easily get to >=10^20'.
I'm sympathetic to something along these lines. But I... (read more)
That's fair pushback. My personal guess is that it's actually pretty tractable to decrease it to eg 0.9x of the original risk, with the collective effort and resources of the movement? To me it feels quite different to think about reducing something where the total risk is (prob=10^-10) x (magnitude = 10^big), vs having (prob of risk=10^-3 ) x (prob of each marginal person making a decrease = 10^-6) x (total number of people working on it = 10^4) x (magnitude = 10^10)
(Where obviously all of those numbers are pulled out of my ass)
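To make that contrast concrete, here's a minimal sketch of the second decomposition in Python, using exactly the made-up numbers above (they're illustrative placeholders from the comment, not estimates):

```python
# Toy expected-value decomposition, using the made-up numbers from the comment above.
# Every figure here is an illustrative placeholder, not an estimate.

prob_of_risk = 1e-3                # chance the catastrophe happens at all
marginal_effect_per_person = 1e-6  # chance each extra person meaningfully reduces it
people_working = 1e4               # total number of people working on it
magnitude = 1e10                   # value at stake if the catastrophe is averted

# The collective effort reduces the risk by a small but non-negligible fraction...
collective_reduction = marginal_effect_per_person * people_working  # = 1e-2

# ...which multiplies out to a substantial expected value, without ever relying on
# a 10^-10-style probability paired with an astronomically large magnitude.
expected_value = prob_of_risk * collective_reduction * magnitude

print(f"collective reduction: {collective_reduction:.0%} of the risk")    # 1%
print(f"expected value of the collective effort: {expected_value:,.0f}")  # 100,000
```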
These arguments appeal to phenomenal stakes implying that, using expected value reasoning, even a very small probability of the bad thing happening means we should try to reduce the risk, provided there is some degree of tractability in doing so.
To be clear, the argument in my post is that we only need the argument to work when 'very small' means 1% or 0.1%, not eg 10^-10. I am much more skeptical about arguments involving 10^-10-like probabilities
I'm curious, do you actually agree with the two empirical claims I make in this post? (1% chance of AI x-risk, and 0.1% chance of bio x-risk, within my lifetime)
Re your final point, I mostly just think they miss the mark by not really addressing the question of what the long-term distribution of animal welfare looks like (I'm personally pretty surprised by the comparative lack of discussion about how likely our Lightcone is to be net bad by the lights of people who put significant weight on animal welfare)
Thanks, this is some great pushback. Strongly upvoted.
Re long-termists will think hard about x-risk, that's a good point. Implicitly I think I'm following the intuition that people don't really evaluate a moral claim in isolation. And that when someone considers how convinced to be by long-termism, they're asking questions like "does this moral system imply important things about my actions?" And that it's much easier to convince them of the moral claim once you can point to tractable action relevant conclusions.
Re target audiences, I think we are imaginin... (read more)
Estimates can be massively off in both directions. Why do you jump to the conclusion of inaction rather than action?
(My guess is that it's sufficiently easy to generate plausible but wrong ideas at the 1% level that you should have SOME amount of inaction bias, but not to take it too far)
To articulate my worries, I suppose it's that this implies a very reductionist and potentially exclusionary idea of doing good; it's sort of "Holy shit, X-risks matters (and nothing else does)". On any plausible conception of EA, we want people doing a whole bunch of stuff to make things better.
I'd actually hoped that this framing is less reductionist and exclusionary. Under total utilitarianism + strong longtermism, averting extinction is the only thing that matters, everything else is irrelevant. Under this framing, averting extinction from AI is, say, m... (read more)
It's not at all clear under this view that it would be worthwhile to pivot your career to AI safety or biorisk, instead of taking the more straightforward route of earning to give to standard near-term interventions.
I'd disagree with this. I think the conversion of money to labour is super inefficient for longtermist work, and so this analogy breaks down. Sure, maybe I should donate to the Maximum Impact Fund rather than LTFF. But it's really hard to convert billions of dollars into useful labour on longtermist stuff. So, as someone who can work on AI S... (read more)
Yep! I think this phenomenon of 'things that are technically all-or-nothing, but it's most useful to think of them as a continuous thing' is really common. Eg, if you want to reduce the number of chickens killed for meat, it helps to stop buying chicken. This lowers demand, which will on average lower the number of chickens killed. But the underlying thing is meat companies noticing and reducing production, which is pretty discrete and chunky and hard to predict well (though not literally all-or-nothing).
Basically any kind of campaign to change minds or achieve social change with some political goal also comes under this. I think AI Safety is about as much a Pascal's Mugging as any of these other things
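To illustrate the chicken example above, here's a minimal toy model (batch size and personal demand are made-up numbers, purely for illustration) of how a chunky, threshold-based production decision still averages out to a smooth per-person effect in expectation:

```python
# Toy model of the "discrete but averages out" point: a producer only cuts
# production when cumulative demand drops by a whole batch. Batch size and
# personal demand are made-up numbers for illustration only.
import random

BATCH = 1_000          # producer adjusts output in batches of 1,000 chickens
PERSONAL_DEMAND = 50   # chickens you'd otherwise buy in a year

def expected_chickens_spared(trials: int = 100_000, seed: int = 0) -> float:
    random.seed(seed)
    total = 0
    for _ in range(trials):
        # From your perspective, where the rest of the market sits relative to
        # the next production threshold is effectively random.
        others = random.randrange(BATCH)
        # Your reduction only occasionally tips the producer over a threshold,
        # but when it does, a whole batch of production is cut.
        batches_cut = (others + PERSONAL_DEMAND) // BATCH
        total += batches_cut * BATCH
    return total / trials

# The chunky mechanism averages out to roughly your own demand in expectation.
print(expected_chickens_spared())  # ≈ 50
```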
Hmm, what would this perspective say to people working on climate change?
If this were to actually be delivered as a pitch I would suggest putting more focus on cognitive biases that lead to inaction
Thanks for the thoughts! Definitely agreed that this could be compelling for some people. IMO this works best on people whose crux is "if this was actually such a big deal, why isn't it common knowledge? Given that it's not common knowledge, this is too weird for me and I am probably missing something".
I mostly make this argument in practice by talking about COVID - IMO COVID clearly demonstrates basically all of these biases with different ways that we under-prepared and bungled the response.
Thanks for the feedback! Yep, it's pretty hard to judge this kind of thing given survivorship bias. I expect this kind of pitch would have worked best on me, though I got into EA long enough ago that I was most grabbed by global health pitches. Which maybe got past my weirdness filter in a way that this one didn't.
I'd love to see what happens if someone tries an intro fellowship based around reading the Most Important Century series!
TL;DR I think that in practice most of these disagreements boil down to empirical cruxes not moral ones. I'm not saying that moral cruxes are literally irrelevant, but that they're second order, only relevant to some people, and only matter if people buy the empirical cruxes, and so should not be near the start of the outreach funnel but should be brought up eventually
Hmm, I see your point, but want to push back against this. My core argument is essentially stemming from an intuition that you have a limited budget to convince people of weird ideas, and tha... (read more)
Hm, I think I have different intuitions about several points.
you have a limited budget to convince people of weird ideas
I'm not sure this budget is all that fixed. Longtermism pretty straightforwardly implies that empirical claims about x-risk are worth thinking more about. So maybe this budget grows significantly (maybe differentially) if someone gets convinced of longtermism. (Anecdotally, this seems true--I don't know any committed longtermist who doesn't think empirical claims about x-risk are worth figuring out, although admittedly there's confoun... (read more)
Suppose it takes $100 billion to increase our chance of completely averting extinction (or the equivalent) by 0.1%. By this, I don't mean averting an extinction event by having it be an event that only kills 98% of people, or preventing the disempowerment of humanity due to AI; I mean that we save the entire world's population. For convenience, I'll assume no diminishing marginal returns. If we only consider the 7 generations of lost wellbeing after the event, and compute $100 billion / (7 * 8 billion * 0.1%), then we get a cost-effectiveness of $1,780 to ... (read more)
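The arithmetic in that back-of-the-envelope estimate is easy to check directly; here's a minimal sketch reproducing it, with all inputs taken as the commenter's illustrative assumptions:

```python
# Reproducing the back-of-the-envelope cost-effectiveness calculation above.
# All inputs are illustrative assumptions from the comment, not real estimates.

cost = 100e9            # $100 billion spent on extinction risk reduction
risk_reduction = 0.001  # 0.1% absolute reduction in extinction risk
generations = 7         # generations of lost wellbeing counted after the event
population = 8e9        # people per generation

expected_lives_saved = generations * population * risk_reduction  # 56 million
cost_per_life = cost / expected_lives_saved

print(f"${cost_per_life:,.0f} per life saved in expectation")  # ≈ $1,786 (the ~$1,780 quoted above)
```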
I'm not saying this consideration is overriding, but one reason you might want moral agreement and not just empirical agreement is that people who agree with you empirically but not morally may be more interested in trading x-risk points for ways to make themselves more powerful.
I don't think this worry is completely hypothetical, I think there's a fairly compelling story where both DeepMind and OpenAI were started by people who agree with a number of premises in the AGI x-risk argument but not all of them.
Fortunately this hasn't happened in bio (yet), at least to my knowledge.
+1, I think this is my current favourite intro to EA
Strongly downvoted. I think a meme review feels in fairly poor taste for this post. I took the tone of Denise's post as an honest, serious and somewhat sad account of her path to having an impact. Tonally, memes feel non-serious, about humour and making light of things. This clashes a lot with the tone of Denise's post, in a way that feels inappropriate to me.
I found the meme review of Aaron Gertler's retirement fun though!
I think this proposal fixes a lot of the problems I'd seen in the earlier CBG program, and I'm incredibly excited to see where it goes. Nice work! EA Stanford and EA Cambridge seem like some of the current groups we have that are closest to Campus Centres, and I've been REALLY impressed with both of their work and all the exciting projects that are coming out of them. I'm very keen to see this scaled to more places!
- Why is it useful to think of AI-influenced coordination failures as a major threat model in the alignment landscape? My intuition would be to think of it as falling under capabilities (since the worry, if I understand it, is that--even if AI systems are aligned with their users--bad things will still happen because coordination is hard?).
This may be a disagreement about semantics. As I see it, my goal as an alignment researcher is to do whatever I can to reduce x-risk from powerful AI. And given my skillset, I mostly focus on how I can do this with technic... (read more)
Thanks for the feedback! Really glad to hear it was helpful de-confusion for people who've already engaged somewhat with AI Alignment, but aren't actively researching in the field, that's part of what I was aiming for.
1. I didn't get much feedback on my categorisation, I was mostly trying to absorb other people's inside views on their specific strand of alignment. And most of the feedback on the doc was more object-level discussion of each section. I didn't get feedback suggesting this was wrong in some substantial way, but I'd also expect it to be considere... (read more)
This seems like a really exciting set of grants! It's great to see EAIF scaling up so rapidly.
Sure. But I think the story there was that Open Phil intentionally split off to pursue this much more aggressive approach, and GiveWell is more traditional charity focused/requires high standards of evidence. And I think having prominent orgs doing each strategy is actually pretty great? They just fit into different niches
I had planned to write a whole post on this and on how to do active grant-making well as a small donor – not sure if I will have time but maybe
I would love to read this post (especially any insights that might transfer to someone with AI Safety expertise, but not much in other areas of EA!). Do you think there's much value in small donors giving to areas they don't know much about? Especially in areas with potential high downside risk like policy. Eg, is the average value of the marginal "not fully funded" policy project obviously positive or negative?
So of course the community collectively gets credit because OpenPhil identifies as EA, but it's worth noting that their "hits based giving" approach diverges substantially from more conventional EA-style (quantitative QALY/cost-effectiveness) analysis and asking what that should mean for the movement more generally.
My impression is that most major EA funding bodies, bar GiveWell, are mostly following a hits based giving approach nowadays. Eg EA Funds are pretty explicit about this. I definitely agree with the underlying point about weaknesses of traditional EA methods, but I'm not sure this implies a deep question for the movement, vs a question that's already fairly internalised
Though maybe "quitting their job and not getting a pension" is meant as a metaphor for "take very big life risks,"
That's fair pushback - a lot of that really doesn't seem that risky if you're young and have a very employable skillset. I endorse this rephrasing of my view, thanks
I guess you're still exposed to SOME increased risk, eg that the tech industry in general becomes much smaller/harder to get into/less well paying, but you're still exposed to risks like "the US pension system collapses" anyway, so this seems reasonable to mostly ignore. (Unless there's a good way of buying insurance against this?)
I think if it turns out that short AI timelines are wrong, those with short timelines should acknowledge it, and the EA community as a whole should seek to understand why we got it so wrong. I will think it odd if those who make repeatedly wrong predictions continue to be taken seriously.
I think this only applies to people who are VERY confident in short timelines. Say you have a distribution over possible timelines that puts 50% probability on <20 years, and 20% probability on >60 years. This would be a really big deal! It's a 50% chance of the world wild... (read more)
Haseeb Qureshi and FTX are both EA aligned donors. I'm fairly skeptical that this is a counterfactual donation match.
There might still be some leverage in some cases, but less than 1:1.
If they have a rule of providing 66% of a charity's budget, surely donations are even more leveraged? $1 to the charity unlocks $2.
Of course, this assumes that additional small donations to the charity will counter-factually unlock further donations from OpenPhil, which is making some strong assumptions about their decision-making
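A minimal sketch of the leverage arithmetic behind that "$1 unlocks $2" claim, under the hypothetical (and, as noted above, quite strong) assumption that the large funder mechanically tops the charity up to a fixed 66% share of its budget:

```python
# Toy illustration of the matching-leverage claim above. It assumes, purely
# hypothetically, that the large funder always tops a charity up to a fixed
# share of its budget and that small donations counterfactually trigger that.

funder_share = 2 / 3    # funder provides 66% of the charity's budget
small_donation = 1.0    # $1 from a small donor

# If the small donation must end up as the remaining 1/3 of the budget increase,
# the total budget increase and the funder's top-up follow directly:
total_increase = small_donation / (1 - funder_share)     # $3
unlocked_from_funder = total_increase - small_donation   # $2

print(f"${small_donation:.0f} from a small donor unlocks ${unlocked_from_funder:.0f} from the large funder")
```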
This post brought me joy, and I would enjoy it being a monthly thing. I'm weakly pro it being on the Front Page, though I expect I'd see it even if it was a Personal Blog. I'd feel sad if memes became common in the rest of the Forum though - having a monthly post feels like a good compromise to me.
Their review for GFI gives the weaknesses as:
We think that GFI’s focus on cell-cultured products could have an enormous impact for farmed animals in the longer term, as cell-cultured food could potentially cause a considerable decrease in demand for farmed animal products. However, we are relatively uncertain about the price-competitiveness of cell-cultured products with conventional animal protein. Furthermore, the majority of impacts that GFI’s work has on animals are more indirect and may happen in the future. As such, the cost effectiveness of their work is... (read more)
Of note: "ACE is not able to share any additional information about any of the anonymous allegations", and yet ACE turned down GFI's offer to investigate the complaints further:
GFI would be happy to participate fully in an investigation of the complaints to better understand and address them, and we offered to hire an external investigator. ACE declined
Which makes it sound as though GFI were willing to make efforts to resolve/investigate these anonymous complaints but ACE were not willing to pursue this.
As Pablo noted, concerns over the uncertain impact of... (read more)
Their review for GFI gives the weaknesses as:
The first of the concerns listed in the quoted paragraph predated this review, so it can't be one of the considerations adduced for demoting GFI. From their 2020 review:
Work on cell-cultured products could have an enormous impact for farmed animals in the long term. If cell-cultured animal products become a competitive alternative, they could reach consumers with various food preferences and attitudes and reduce the consumption of animal products significantly. However, our impression is that it is relatively un... (read more)
Thanks a lot for the post! This felt like one of the rare posts that clearly defines and articulates a thing that feels intuitively true, but which I've never properly thought about before.
Note that OpenAI became a limited profit company in 2019 (2 years into this grant), which I presume made them a much less cost-effective thing to invest in, since they had much better alternative funding sources