3663 karma · Joined Sep 2014



    Courting Virgo
    EA Gather Town
    Improving EA tech work


    Topic Contributions

    Hey Ryan, I think your scepticism is a widely held view among EAs, but IMO it overlooks some crucial factors in addition to the considerations Christopher mentions:

    • Focusing just on cost seems like a huge oversimplification. If Musk or someone can set up a space economy that gradually gives people real incentives to fly there and back, it's not that hard to imagine the free market effectively covering this cost many times over. That doesn't have to mean something like 'people on Earth pay to transport materials from the surface of Mars to the surface of Earth'. If you have a bunch of individuals living on the surface for whatever reason (initially scientific research, say), they could make a living in any number of ways such that some of the value is exportable to earth - low gravity research or industry, xenobiology, web development, art, or whatever - and you can bet they'd be spending a very high proportion of that on urgently reducing the number of ways that their environment could kill them.
    • Feedback loops: we know that Earth has a terrible record of keeping important-to-isolate areas actually isolated. You could spend the GDP of a mid-sized nation on creating a network of theoretically isolated bunkers, but if one sleepy resident calls out for pizza, leaves a vent off or whatever, the entire thing could be compromised, at least to biological hazards. On the other hand, space settlements, unlike bunkers, force isolation. Obviously mistakes can still be fatal, but unlike on Earth they're going to have a fast and hard-to-ignore feedback loop if they're doing something existentially risky. On an offworld colony, if you make a serious mistake a bunch of people will probably die very soon after. In an Earth bunker, if you make a serious mistake the whole of the bunker might die at some indefinite point after - which is exactly what it was supposed to safeguard against.
    • Relatedly, lack of any serious proposal to make such disaster shelters on Earth. This is maybe just human motivation, or the 'forcing function' people like Robert Zubrin have described: you put people on a remote frontier where their lives are in danger every day, and they're just going to be a lot more productive than people who vaguely think their project might matter someday.
    • Also relatedly, in-atmosphere biodefence seems hopeless. We seem quite close to it being terrifyingly easy to create extremely viral and extremely lethal pandemics and nowhere near being able to regulate the ecosystem to the degree that 
    • Long-term value: a bunker system is ultimately way more limited as a backup mechanism. At its apotheosis, the residents might be able to help humanity leapfrog some of the way towards the modern era (which may not actually bypass that much of the existential risk, per my argument here and the subsequent post I'm still working on in that series). Whereas the apotheosis of offworld colonisation is basically our end goal - colonising the Virgo supercluster. And it would only take a century or two at most - and possibly only a few decades on optimistic timelines - to far surpass the best defence a bunker system could provide. 
    • 'Self-sustainingness' is more of a spectrum than a hard line. If some 100% lethal pandemic or other such event killed all humans on Earth, an offworld colony would still be able to repopulate it if they had the capacity for a couple of trips back and the wherewithal to outlast the catastrophe. It seems plausible that an offworld colony could start providing meaningful backup (without yet being fully self-sustaining) by 2050-2060.

    If you also think AI timelines might be substantially longer than EAs typically think or that AI could be less 'extinction event' and more 'global catastrophe on the order of taking down the internet' (which seems plausible to me - there are a lot of selection effects in this community for short-timeline-doomers and evidence for some quite strong groupthink) then it starts to look like a reasonable area for further consideration, especially given how little serious scrutiny it's had among EAs to date.

    I've upvoted this comment, but weakly disagree that there's such a shift happening (EVF orgs still seem to be selecting pretty heavily for longtermist projects, the global health and development fund has been discontinued while the LTFF is still around etc), and quite strongly disagree that it would be bad if it is:

    From a longtermist (~totalist classical utilitarian) perspective, there's a huge difference between ~99% and 100% of the population dying, if humanity recovers in the former case, but not the latter.

    That 'if' clause is doing a huge amount of work here. In practice I think the EA community is far too sanguine about our prospects, post-civilisational collapse, of becoming interstellar (which, from a longtermist perspective, is what matters - not 'recovery'). I've written a sequence on this here, and have a calculator which lets you easily explore what the simple model described in post 3 here implies given your beliefs, with an implementation of the more complex model available on the repo. As Titotal wrote in another reply, it's easy to believe 'lesser' catastrophes are many times more likely, so they could very well be where the main expected loss of value lies.

    From a longtermist (~totalist classical utilitarian) perspective, preventing a GCR doesn't differentiate between "humanity prevents GCRs and realises 1% of it's potential" and "humanity prevents GCRs realises 99% of its potential"

    I think I agree with this, but draw a different conclusion. Longtermist work has focused heavily on existential risk, and in practice the risk of extinction, IMO seriously dropping the ball on trajectory changes with little more justification than that the latter are hard to think about. As a consequence they've ignored what seems to me the very real loss of expected unit-value from lesser catastrophes, and the to-me-plausible increase in it from interventions designed to make people's lives better (generally lumping those in as 'shorttermist'). If people are now starting to take other catastrophic risks more seriously, that might be remedied. (Also relevant to your 3rd and 4th points.)

    From a "current generations" perspective, reducing GCRs is probably not more cost-effective than directly improving the welfare of people / animals alive today

    This seems to be treating 'focus only on current generations' and 'focus on Pascalian arguments for astronomical value in the distant future' as the only two reasonable views. David Thorstad has written a lot, I think very reasonably, about reasons why the expected value of longtermist scenarios might actually be quite low, but one might still have considerable concern for the next few generations.

    From a general virtue ethics / integrity perspective, making this change on PR / marketing reasons alone - without an underlying change in longtermist motivation - feels somewhat deceptive.

    Counterpoint: I think the discourse before the purported shift to GCRs was substantially more dishonest. Nanda and Alexander's posts argued that we should talk about x-risk rather than longtermism on the grounds that it might kill you and everyone you know - which is very misleading if you only seriously consider catastrophes that kill 100% of people, and ignore (or conceivably even promote) those that leave >0.01% behind (which, judging by Luisa Rodriguez's work is around the point beyond which EAs would typically consider something an existential catastrophe).

    I basically read Zabel's post as doing the same, not as desiring a shift to GCR focus, but as desiring presenting the work that way, saying 'I’d guess that if most of us woke up without our memories here in 2022 [now 2023], and the arguments about potentially imminent existential risks were called to our attention, it’s unlikely that we’d re-derive EA and philosophical longtermism as the main and best onramp to getting other people to work on that problem' (emphasis mine).

    Nanda, Alexander and Zabel's posts all left a very bad taste in my mouth for exactly that reason.

    There's something fairly disorienting about the community switching so quickly from [quite aggressive] "yay longtermism!" (e.g. much hype around launch of WWOTF) to essentially disowning the word longtermism, with very little mention / admission that this happened or why

    This is as much an argument that we made a mistake ever focusing on longtermism as that we shouldn't now shift away from it. Oliver Habryka (can't find link offhand) and Kelsey Piper are two EAs who've publicly expressed discomfort with the level of artificial support WWOTF received, and I'm much less notable, but happy to add myself to the list of people uncomfortable with the business, especially since at the time its author was a trustee of the charity that was doing so much to promote his career.

    I'm not sure what the solution is - more experimentation seems generally like a good idea, but EA fundmakers seem quite conservative in the way they operate, at least once they've locked in a modus operandi.

    For what it's worth, my instinct is to try a model with more 'grantmakers' who take a more active, product-managery/ownery role, where they make fewer grants, but the grants are more like contracts of employment, such that the grantmakers take some responsibility for the ultimate output (and can terminate a contract like a normal employer if the 'grant recipient' underperforms). This would need a lot more work-hours, but I can imagine it more than paying for itself through the greater security of the grant recipients and the increased accountability for both recipients and grantmakers.

    What talents do you think aren't applicable outside the EAsphere?

    Community building doesn't seem to have that much carryover - that's not to say it's useless, just that it's not going to look anywhere near as good to most employers as something vaguely for-profit equivalent, like being a consultant at some moderately prestigious firm. Research seems comparable. It's unlikely to be taken seriously for academic jobs, and likely to be far too abstract for for-profits. In general, grantees and even employees at small EA orgs get little if any peer support or training budgets, which will stymie their professional development even when they're working in roles that have direct for-profit equivalents (I've written a little about this phenomenon for the specific case of EA tech work here).


    I would really like to see EA funding orgs more explicitly discuss the costs that the uncertainty of their one-grant-at-a-time funding models, plus short notice of (non-)renewal, imposes on so many people in the EA community. I realise EA funding took a big hit last year, but for years before FTX Foundation was announced, 80k were claiming EA was talent-constrained rather than funding-constrained, and that most EAs should not be earning to give. The net result is that there are a bunch of people with EA-relevant talents that aren't particularly applicable outside the EAsphere, who are struggling to make ends meet, or whose livelihood could disappear with little warning.

    After hearing multiple experiences like this it's really hard for me to encourage anyone to go into EA meta work until the landscape gets a lot smoother.

    I think at this point we can amicably disagree, though I'm curious why you think the 'more people = more animals exploited' philosophy applies to people in Africa, but not in the future. One might hope that we learn to do better, but it seems like that hope could be applied to and criticised in either scenario.

    I have no particular reason to think you shouldn't believe in any of those claims, but fwiw I find it quite plausible (though wouldn't care to give particular credences atm) that at least some of them could be bad, eg:

    • Technical AI safety seems to have been the impetus for various organisations who are working on AI capabilities in a way that everyone except them seems to think is net negative (OpenAI, Deepmind, Anthropic, maybe others). Also, if humans end up successfully limiting AI by our own preferences, that could end up being a moral catastrophe all of its own.
    • 'Expanding our moral circle' sounds nice, but without a clear definition of the morality involved it's pretty vague what it means - and with such a definition, it could cash out as 'make people believe our moral views', which doesn't have a great history.
    • Investing for the future could put a great deal of undemocratic power into the hands of a small group of people whose values could shift (or turn out to be 'wrong') over time.

    And all of these interventions just cost a lot of money, something which the EA movement seems very short on recently.


    As it stands I struggle to justify GHD work at all on cluelessness grounds. GiveWell-type analyses ignore a lot of foreseeable indirect effects of the interventions e.g. those on non-human animals.

    I support most of this comment, but strongly disagree with this, or at least think it's much too strong. Cluelessness isn't a categorical property which some interventions have and some don't - it's a question of how much to moderate your confidence in a given decision. Far from being the unanswerable question Greaves suggests, it seems reasonable to me to do any or all of the following:

    1. Assume unknown unknowns pan out to net 0
    2. Give credences on a range of known unknowns
    3. Time-limit the above process in some way, and give an overall best guess expectation for remaining semi-unknowns 
    4. Act based on the numbers you have from above process when you stop
    5. Incorporate some form of randomness in the criteria you investigate
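The numbered steps above can be sketched as a small calculation. This is only an illustrative sketch: the function name, its parameters and the example credences are all hypothetical, chosen to show the shape of the procedure rather than to model any real intervention.

```python
import random

def best_guess_value(known_unknowns, budget, remainder_guess=0.0, seed=0):
    """Return a best-guess expected value from a time-limited investigation.

    known_unknowns: list of (credence, effect_if_true) pairs (hypothetical inputs).
    budget: how many considerations we allow ourselves to investigate (step 3).
    remainder_guess: one overall guess for everything left uninvestigated (step 3).
    """
    rng = random.Random(seed)
    order = list(known_unknowns)
    rng.shuffle(order)                 # step 5: randomise what gets investigated
    investigated = order[:budget]      # step 3: stop when the budget runs out
    unknown_unknowns = 0.0             # step 1: assume these net out to zero
    weighted = sum(p * effect for p, effect in investigated)  # step 2
    return weighted + remainder_guess + unknown_unknowns      # step 4: act on this


# Hypothetical considerations: (credence, effect in arbitrary welfare units).
considerations = [(0.5, 10.0), (0.25, -8.0)]
print(best_guess_value(considerations, budget=2))  # 3.0
```

The output is a number you can act on and compare across causes, which is all the procedure is meant to deliver; nothing in it is specific to near-term or far-future interventions.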

    If you're not willing to do something like the above, you lose the ability to predict anything, including supposedly longtermist interventions, which are all mired in their own uncertainties.

    So while one might come to the view that GHD is in fact bad because of eg the poor meat eater problem, it seems irrational to be agnostic on the question, unless you're comparably agnostic towards every other cause.

    But can you produce a finite upper bound on our lightcone that you're 100% confident nothing can pass? (It doesn't have to be tight.)

    I think Vasco already made this point elsewhere, but I don't see why you need certainty about any specific line to have finite expectation. If for the counterfactual payoff x, you think (perhaps after a certain point) xP(x) approaches 0 quickly enough as x tends to infinity for its sum to converge, it seems like you get finite expectation without ever having absolute confidence in any boundary (this applies to life expectancy, too).
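A minimal numerical sketch of that point, using two hypothetical distributions over positive integer payoffs chosen purely for illustration (neither comes from the thread). In both, every payoff has nonzero probability, so there is no boundary you are 100% confident in. The first has finite expectation anyway; the second is a reminder that the terms have to shrink fast enough, since xP(x) tending to 0 slowly still gives a divergent sum.

```python
def partial_expectation(pmf, upper):
    """Sum x * P(x) over x = 1..upper, i.e. a truncated expectation."""
    return sum(x * pmf(x) for x in range(1, upper + 1))

def thin_tail(x):
    # Hypothetical P(x) = 2**-x: sums to 1, and x * P(x) shrinks geometrically.
    return 2.0 ** -x

def fat_tail(x):
    # Hypothetical P(x) = 1 / (x * (x + 1)): also sums to 1, but
    # x * P(x) = 1 / (x + 1) shrinks only harmonically.
    return 1.0 / (x * (x + 1))

for upper in (10, 100, 10_000):
    print(upper,
          partial_expectation(thin_tail, upper),   # settles near 2
          partial_expectation(fat_tail, upper))    # keeps growing, roughly like log(upper)
```

With the thin tail the truncated expectation stops moving almost immediately, even though no payoff is ever ruled out. With the fat tail it keeps creeping upward however far the cutoff is pushed.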

    Section II from Carlsmith, 2021 is one of the best arguments for acausal influence I'm aware of, in case you're interested in something more convincing. (FWIW, I also thought acausal influence was crazy for a long time, and I didn't find Newcomb's problem to be a compelling reason to reject causal decision theory.)

    Thanks! I had a look, and it still doesn't persuade me, for much the same reasons Newcomb's problem didn't. In roughly ascending importance:

    1. Maybe this is just a technicality, but the claim 'you are exposed to exactly identical inputs' seems impossible to realise with perfect precision. The simulator itself must differ in the two cases. So in the same way that outputs of two instances of a software program being run, even on the same computer in the same environment, can theoretically differ for various reasons (looking at a high enough zoom level they will differ), the two simulations can't be guaranteed identical (Carlsmith even admits this with 'absent some kind of computer malfunction', but just glosses over it). On the one hand, this might be too fine a distinction to matter in practice; on the other, if I'm supposed to believe a wildly counterintuitive proposition instead of a commonsense one that seems to work fine in the real world, based on supposed logical necessity that it turns out isn't logically necessary, I'm going to be very sceptical of the proposition even if I can't find a stronger reason to reject it.
    2. The thought experiment gives no reason why the AI system should actually believe it's in the scenario described, and that seems like a crucial element in its decision process. If in the real world, someone put me in a room with a chalkboard and told me this is what was happening, no matter what evidence they showed, I would have some element of doubt, both about their ability (cf point 1) and, more importantly, about their motivations. If I discovered that the world was as bizarre as in this scenario, it would be at best a coinflip for me that I should take them at face value.
    3. It seems contradictory to frame decision theory as applying to 'a deterministic AI system' whose clones 'will make the same choice, as a matter of logical necessity'. There's a whole free will debate lurking underneath any decision theoretic discussion involving recognisable agents that I don't particularly want to get into - but if you're taking away all agency from the 'agent', it's hard to see what it means to advocate it adopting a particular decision theory. At that point the AI might as well be a rock, and I don't feel like anyone is concerned about which decision theory rocks 'should' adopt. 

    This follows from the theorems I cited, but I didn't include proofs of the theorems here. The proofs are technical and tricky,[1] and I didn't want to make my post much longer or spend so much more time on it. Explaining each proof in an intuitive way could probably be a post on its own.

    I would be less interested to see a reconstruction of a proof of the theorems and more interested to see them stated formally and a proof of the claim that it follows from them. 

    I haven't downvoted it, and I'm sorry you're getting that response for a thoughtful and in-depth piece of work, but I can offer a couple of criticisms I had that have stopped me upvoting it yet because I don't feel like I understand it, mixed in with a couple of criticisms where I feel like I did:

    • Too much work done by citations. Perhaps it's not possible to extract key arguments, but most philosophy papers IME have their core point in just a couple of paragraphs, which you could quote, summarise or refer to more precisely than a link to the whole paper. Most people on this forum just won't have the bandwidth to go digging through all the links.
    • The arguments for infinite prospective utility didn't hold up for me. A spatially infinite universe doesn't give us infinite expectation from our action - even if the universe never ends, our light cone will always be finite. Re Oesterheld's paper, acausal influence seems an extremely controversial notion in which I personally see no reason to believe. Certainly if it's a choice between rejecting that or scrabbling for some alternative to an intuitive approach that in the real world has always yielded reasonable solutions, I'm happy to count that as a point against Oesterheld.
    • Relatedly, some parts I felt like you didn't explain well enough for me to understand your case, eg:
      • I don't see the argument in this post for this: 'So, based on the two theorems, if we assume Stochastic Dominance and Impartiality,[18] then we can’t have Anteriority (unless it’s not worse to add more people to hell) or Separability.' It seemed like you just attempted to define these things and then asserted this - maybe I missed something in the definition?
      • 'You are facing a prospect  with infinite expected utility, but finite utility no matter what actually happens. Maybe  is your own future and you value your years of life linearly, and could live arbitrarily but finitely long, and so long under some possibilities that your life expectancy and corresponding expected utility is infinite.' I don't see how this makes sense. If all possible outcomes have me living a finite amount of time and generating finite utility per life-year, I don't see why expectation would be infinite.
    • Too much emphasis on what you find 'plausible'. IMO philosophy arguments should just taboo that word.

    Appreciate the patient breakdown :)

    This [L/S = 10^7] happens to be bigger than 1, which suggests that targeting the far future is still ~10 million times better than targeting the short term. But this calculation could have come out as less than 1 using other possible inputs. Combined with general model uncertainty, it seems premature to conclude that far-future-focused actions dominate short-term helping. It's likely that the far future will still dominate after more thorough analysis, but by much less than a naive future fanatic would have thought.

    This is more of a sidenote, but given all the empirical and model uncertainty in any far-future oriented work, it doesn't seem like adding a highly speculative counterargument with its own radical uncertainties should meaningfully shift anyone's priors. It seems like a strong longtermist could accept Brian's views at face value and say 'but the possibility of L/S being vastly bigger than 1 means we should just accept the Pascalian reasoning and plow ahead regardless', while a sceptic could point to rapid diminution and say no simulationy weirdness is necessary to reject these views.

    (Sidesidenote: I wonder whether anyone has investigated the maths of this in any detail? I can imagine there being some possible proof by contradiction of RD, along the lines of 'if there were some minimum amount that it was rational for the muggee to accept, a dishonest mugger could learn that and raise the offer beyond it whereas an honest mugger might not be able to, and therefore, when the mugger's epistemics are taken into account, you should not be willing to accept that amount'. Though I can also imagine this might just end up as an awkward integral that you have to choose your values for somewhat arbitrarily.)

    I think Brian's reasoning works more or less as follows. Neglecting the simulation argument, if I save one life, I am only saving one life. However, if F = 10^-16[1] of sentience-years are spent simulating situation like my own, and the future contains N = 10^30 sentience-years, then me saving a life will imply saving F*N = 10^14 copies of the person I saved. I do not think the argument goes through because I would expect F to be super small in this case, such that F*N is similar to 1.
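The arithmetic in the quoted summary can be checked directly. This is just a worked calculation using the F and N given in the quote; the variable names and the break-even figure are added here for illustration, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

F = Fraction(1, 10**16)  # share of sentience-years simulating such situations (from the quote)
N = 10**30               # sentience-years in the future (from the quote)

copies_saved = F * N
print(copies_saved)      # 100000000000000, i.e. 10^14

# For F*N to be "similar to 1", F would have to be about 1/N.
break_even_F = Fraction(1, N)
print(break_even_F * N)  # 1
```

So the two positions in the quote differ by around fourteen orders of magnitude in F, which is the size of the gap the rest of the disagreement turns on.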

    For the record, this kind of thing is why I love Brian (aside from him being a wonderful human) - I disagree with him vigorously on almost every point of detail on reflection, but he always comes up with some weird take. I had either forgotten or never saw this version of the argument, and was imagining the version closer to Pablo's that talks about the limited value of the far future rather than the increased near-term value.

    That said, I still think I can basically C&P my objection. It's maybe less that I think F is likely to be super small, and more that, given our inability to make any intelligible statements about our purported simulators' nature or intentions it feels basically undefined (or, if you like, any statement whatsoever about its value is ultimately going to be predicated on arbitrary assumptions), making the equation just not parse (or not output any value that could guide our behaviour).
