204 karma, joined Jan 2023


Interesting - I definitely think this is valuable. I have two small recommendations for the survey:

- Specify in the sugary drinks question whether it only includes commercial, fizzy sugary drinks, or any drinks containing sugar (e.g. coffee with sugar, milkshakes, bubble tea, traditional sweet drinks, etc.). As it is, you give examples of commercial fizzy drinks, but it's a little ambiguous whether other sweet drinks might be included.

- Make it clear that respondents can choose percentages over 100% for the first two options (many people are likely to believe that a life in prison, or a life without any pleasure, is worse than death). I think that your example percentages (e.g. 0.1%, 1%, 10%, 20%, 30%, 100%, etc.) are anchoring people to a particularly low score.

Interesting. I think there are two related concepts here, which I'll call individual modesty and communal modesty. Individual modesty means that an individual defers to the perceived experts (potentially within their community); communal modesty means that the community defers to the relevant external expert opinion. I think EAs tend to have fairly strong individual modesty, but occasionally our communal modesty lets us down.

Here are a few of my observations on the issues EAs are likely to have strong opinions about:

1. Ethics: I'd guess that most individual EAs think they're right about the fundamentals: that consequentialism is just better than the alternatives. I'm not sure whether this is more communal or individual immodesty.
2. Economics/ Poverty: I think EAs tend to defer to smart external economists who understand poverty better than core EAs, but are less modest when it comes to what we should prioritise based on expert understanding. 
3. Effective Giving: Individuals tend to defer to a communal consensus. We're the relevant experts here, I think.
4. General forecasting/ Future: Individuals tend to defer to a communal consensus. We think the relevant class is within our community, so we have low communal modesty. 
5. Animals: We probably defer to our own intuitions more than we should. Or Brian Tomasik. If you're anything like me, you think: "he's probably right, but I don't really want to think about it".
6. Geopolitics: I think that we're particularly bad at communal modesty here - I hear lots of bad memes (especially about China) that seem to be fairly badly informed. But it's also difficult to work out the relevant expert reference class. 
7. AI (doom): Individuals tend to defer to a communal consensus, but tend to lean towards core EA's 3-20% rather than core-LW/Eliezer's 99+%. People broadly within our community (EA/ rationalists) genuinely have thought about this issue more than anyone else, but I think there's a debate about whether we should defer to our pet experts or to more establishment AI people.


I think there's a range of things that could happen with lower-level AGI, with increasing levels of 'fire-alarm-ness' (1-4) but decreasing likelihood. Here's a list; my (very tentative) model is that I expect lots of 1s and a few 2s within my default scenario, and that this will be enough to slow down the process and make our trajectory slightly less dangerous.

Forgive the vagueness, but these are the kind of things I have in mind:

1. Mild fire alarm: 

- Hacking (prompt injections?) within current realms of possibility (but amped up a bit)
- Human manipulation within current realms of possibility (e.g. IRA disinformation at 5× the scale)
- Visible, unexpected self-improvement/ escape (without severe harm)
- Any lethal autonomous weapon use (even if generally aligned) especially by rogue power
- Everyday tech (phones, vehicles, online platforms) doing crazy, but benign misaligned stuff
- Stock market manipulation causing important people to lose a lot of money 

2. Moderate fire alarm:

- Hacking beyond current levels of possibility
- Extreme mass manipulation
- Collapsing financial or governance systems causing minor financial or political crisis
- Deadly use of autonomous AGI in weapons systems by rogue group (killing over 1000 people)
- Misaligned, but less deadly, use in weapons systems
- Unexpected self-improvement/ escape of a system causing multiple casualties/ other chaos
- Attempted (thwarted) acquisition of WMDs/ biological weapons
- Unsuccessful (but visible) attempts to seize political power

3. Major fire alarm:

- Successful attempts to seize political power
- Effective global mass manipulation
- Successful acquisition of WMDs, bioweapons
- Complete financial collapse 
- Complete destruction of online systems; the internet becomes unusable, etc.
- Misaligned, very deadly use in weapons systems 

4. The fire alarm has been destroyed, so now it's just some guy hitting a rock with a scorched fencepost:

- Actual triggering of nuclear/ bio conflict/ other genuine civilisational collapse scenario (destroying AI in the process)

Okay, I think your reference to infinite time periods isn't particularly relevant here (there seems to be a massive difference between 5 and 20 years), but I get your point that short timelines play an important role.

I guess the relevant factors that might be where we have different intuitions are:

  1. How long will this post-agentic-AGI, pre-God-AGI phase last?
  2. How chaotic/ dangerous will it be?
  3. When bad stuff happens, how likely is it to seriously alter the situation? (e.g. pause in AI progress, massive increase in alignment research, major compute limitations, massive reduction on global scientific capacity etc.)

If you hold these assumptions robustly, the most direct answer would be to focus on the kind of beings who are likely to experience greater suffering by default, namely factory-farmed animals, and potentially some wild animals. You should focus on interventions (alternative proteins, vegan advocacy) that are likely to cause these animals not to come into existence, rather than welfarist approaches that improve the lives of animals but keep numbers relatively constant. This is a very popular approach, so you'd be welcome in this part of the EA space.

But you might want to relax your assumptions slightly when considering practical work you could do. Assuming that reducing suffering is your ultimate goal, even if the "best way not to suffer is not to live", it doesn't necessarily follow that the most effective way to reduce suffering (given limited resources) is stopping beings from coming into existence.

For example, an intervention to help people in poor countries detect particularly painful congenital defects before birth and terminate these pregnancies might reduce suffering and satisfy your assumptions, but if it's expensive, it might be more effective to reduce the suffering of existing people, for example, providing relatively cheap pain relief for people with late-stage cancer. 

Or if you could cause x number of factory-farmed chickens to be raised in a free-range/ organic way for the same cost/ resources as stopping y number of factory-farmed chickens being born, there's probably some number for x and y for which you'd choose the first option. 
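The x-versus-y trade-off above can be made concrete with a toy calculation. Everything here is a made-up illustration: the suffering-per-chicken figures are hypothetical assumptions chosen only to show the structure of the comparison, not empirical estimates.

```python
# Illustrative only: all numbers below are hypothetical assumptions,
# not empirical estimates of chicken welfare.

def suffering_averted(chickens_affected: int, suffering_per_chicken: float) -> float:
    """Total suffering units averted by an intervention."""
    return chickens_affected * suffering_per_chicken

# Assume a factory-farmed chicken's life contains 10 units of suffering,
# and that moving it to free-range averts 6 of those units
# (it still suffers somewhat).
SUFFERING_FACTORY = 10.0
SUFFERING_AVERTED_BY_FREE_RANGE = 6.0

# Welfarist option: for a fixed budget, move x chickens to free-range.
x = 5_000
welfarist = suffering_averted(x, SUFFERING_AVERTED_BY_FREE_RANGE)

# Abolitionist option: for the same budget, prevent y factory-farmed
# chickens from existing, averting the full 10 units each.
y = 2_000
abolitionist = suffering_averted(y, SUFFERING_FACTORY)

# With these (made-up) numbers, the welfarist option averts more
# suffering, even though it prevents no births.
print(welfarist, abolitionist)  # 30000.0 20000.0
```

The point is just that once you compare suffering averted per unit of cost, the "prevent existence" option is no longer automatically best; it depends on the ratio x/y and the fraction of suffering each intervention averts.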


Rambling question here. What's the standard response to the idea that very bad things are likely to happen with non-existential AGI before worse things happen with extinction-level AGI?

Eliezer dismissed this as unlikely ("what, self-driving cars crashing into each other?"), and I read his "There's No Fire Alarm" piece, but I'm unconvinced.

For example, we can imagine a range of self-improving, kinda agentic AGIs, from some kind of crappy ChaosGPT let loose online, to a perfect God-level superintelligence optimising for something weird and alien, but perfectly able to function in, conceal itself in and manipulate human systems.

It seems intuitively more likely we'll develop many of the crappy ones first (seems to already be happening). And that they'll be dangerous.

I can imagine flawed, agentic, and superficially self-improving AI systems going crazy online, crashing financial systems, hacking military and biosecurity, taking a shot at mass manipulation, but ultimately failing to displace humanity, perhaps because they fail to operate in analog human systems, perhaps because they're just not that good.

Optimistically, these crappy AIs might function as a warning shot/ fire alarm. Everyone gets terrified, realises we're creating demons, and we're in a different world with regards to AI alignment.

I sympathise, but... For 1), if your negative utilitarianism (NU) is a sincerely held, 'psychologically normal' belief, I think that you can be a very strong NU and still want to pursue totally 'normal' EA goals. For any brand of utilitarianism, greatly reducing or eradicating suffering is a valid and obviously normal goal. Assuming you don't have the capability for 'magic annihilation', there are so many alternatives. Is there a worldview where 'ending your own suffering' is higher expected-value than ending factory farming or treating extreme cancer pain in Sub-Saharan Africa? Preventing S-risks is also a productive way for an NU to work on/ think about an EA topic.

Answer by Dzoldzaya, Mar 13, 2023

I feel that strong negative utilitarianism (we should only consider disutility) is just a non-starter. It doesn't match any of my moral intuitions. 

But a weaker negative utilitarianism is a powerful and potentially valid position. These are my views:

  1. Good things are actually good. Pleasure is usually good. Laughing/ smiling/ dancing/ sex/ rollercoasters/ MDMA trips are actually really good.
  2. Bad things might be a bit worse than good things are good (at least our intuitions might be skewed). But a rudimentary thought experiment can calibrate your scales here - I don't see how you can end up at a strong negative position. 
  3. There is a hedonic treadmill/ normalisation effect, making 'normality' into suffering, but this is not always the case - some hedonistic pleasure can give you a warm glow for a long time, and make you genuinely happier in the long term.
  4. We probably have a weak bias towards pleasure and suffering in our lives being more balanced than they are, and towards believing that our lives are more positive than they actually are. E.g. someone can say that his life is 6/10 on a happiness scale, but he could be totally wrong, and his life could actually not be worth living. 
  5. This bias is unlikely to be so strong that even the happiest-seeming people actually have net-negative lives. So there are almost definitely some net-positive lives in the world.
  6. Lots of human lives are net-negative. My median estimate would be 25% (around 5/10 on a happiness scale), but I have very wide error margins. 
  7. This is just a rounding error compared to animal lives, which are more likely to be net-negative, therefore the world is likely to be net-negative, even from a standard utilitarian position. 
  8. Even under very weak negative utilitarianism, the 'benevolent world-exploder' argument may be valid from a near-termist view, and is compatible with fear of s-risks. But, even to a moderate weak negative utilitarian, it can be countered with an optimistic long-termism where we may be able to both eradicate suffering and normalise increasingly wondrous states of pleasure.
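The "weak vs strong" negative utilitarianism distinction in the list above can be sketched as a single weighting parameter. This is a toy model with hypothetical numbers, just to show how the weight changes the verdict on a given life:

```python
# Toy model of suffering-weighted utilitarianism: suffering is weighted
# more heavily than pleasure by a factor w >= 1. All numbers are
# hypothetical illustrations, not estimates.

def life_value(pleasure: float, suffering: float, w: float) -> float:
    """Net value of a life when suffering is weighted by factor w."""
    return pleasure - w * suffering

# A life with slightly more pleasure than suffering:
p, s = 55.0, 45.0

classical = life_value(p, s, w=1.0)  # standard utilitarianism: net-positive
weak_nu = life_value(p, s, w=2.0)    # weak NU: the same life is net-negative

# Strong NU ("only disutility counts") is the limit as w grows without
# bound: every life containing any suffering at all comes out negative.
print(classical, weak_nu)  # 10.0 -35.0
```

This is why the choice of w does so much work in points 6-8: a modest w > 1 can flip many borderline lives (and hence the aggregate) from net-positive to net-negative without endorsing the strong-NU position.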

Some random thoughts:

  1. Whenever I hear someone (usually Will MacAskill) say: "If utilitarianism is correct..." I don't hear it as a coherent sentence. I don't think there is a fundamental ethical truth that utilitarianism is trying to get at. I'd still happily call myself a utilitarian, as I believe it's the best, most coherent way of thinking about ethics; I even 'bite the bullet' of the repugnant conclusion. But utilitarianism doesn't feel like the kind of thing that can be 'true' or 'false'.
  2. But I feel I agree with the 'realist' side of this paper more than the 'anti-realist'. When I'm 'doing' ethics, I'd rather do it well and consistently. I tend to slightly disagree with Scott Alexander in that post. 
  3. But ethics, especially utilitarianism, is so demanding that you need a way to 'get out of it' - I do that by just 'not doing ethics' - I would be a 'better person' if I did ethics more consistently, but I just don't - I also spend most of my time not even considering utilitarian ethics. I should but don't need to be ethical. I'm bound by utilitarian ethics in a similar way that I'm bound by my own bodily desires, politeness norms, family ties, occasional propensity to turn into a dire wolf etc. 
  4. But I don't think it works in the Scott Alexander/ lizard/ utopia case. If there's a time to take ethics deadly seriously, it's when you've got a zillion potentially net-positive lizard lives on the scales! 

Yep, agreed (I haven't read those books, but I broadly know the story). I wasn't trying to imply that eugenics was the main cause of the one-child policy, but the two are definitely connected. Post-1CP, the state took a really active role in controlling how and when kids were born, compulsory sterilisation (mostly of females, despite vasectomies being safer) became normalised for 'quality and quantity' etc. 

A stronger "eugenics taboo" could plausibly have limited the scope of the policy.
