All of Owen Cotton-Barratt's Comments + Replies

I might think of FHI as having borrowed prestige from Oxford. I think it benefited significantly from that prestige. But in the longer run it gets paid back (with interest!).

That metaphor doesn't really work, because it's not that FHI loses prestige when it pays it back -- but I think the basic dynamic of it being a trade of prestige at different points in time is roughly accurate.

I'm worried I'm misunderstanding what you mean by "value density". Could you perhaps spell this out with a stylized example, e.g. comparing two different interventions protecting against different sizes of catastrophe?

2
Vasco Grilo
2d
I guess you are thinking that the period of 1 year I mention above is one over which there is a catastrophe, i.e. a large reduction in population. However, I meant a random unconditioned year. I have now updated "period of 1 year" to "any period of 1 year (e.g. a calendar year)". Population has been growing, so my ratio between the initial and final population will have a high chance of being lower than 1.

I think human extinction over 1 year is extremely unlikely. I estimated 5.93*10^-12 for nuclear wars, 2.20*10^-14 for asteroids and comets, 3.38*10^-14 for supervolcanoes, a prior of 6.36*10^-14 for wars, and a prior of 4.35*10^-15 for terrorist attacks.

Without having dug into them closely, these numbers don't seem crazy to me for the current state of the world. I think that the risk of human extinction over 1 year is almost all driven by some powerful new technology (with residues for the wilder astrophysical disasters, and the rise of some powerful ideol... (read more)

2
Vasco Grilo
2d
To clarify, my estimates are supposed to account for unknown unknowns. Otherwise, they would be many orders of magnitude lower. I found the "Unfortunately" funny!

Makes sense. We may even have both cases in the same tail distribution. The tail distribution of the annual war deaths as a fraction of the global population is characteristic of a power law from 0.001 % to 0.01 %, then it seems to have a dragon king from around 0.01 % to 0.1 %, and then it decreases much faster than predicted by a power law. Since the tail distribution can decay both slower and faster than a power law, I feel like this is still a decent assumption.

I agree we cannot rule out dragon kings (flatter sections of the tail distribution), but this is not enough for saving lives in catastrophes to be more valuable than in normal times. At least for the annual war deaths as a fraction of the global population, the tail distribution still ends up decaying faster than a power law despite the presence of a dragon king, so the expected value density of the cost-effectiveness of saving lives is still lower for larger wars (at least given my assumption that the cost to save a life does not vary with the severity of the catastrophe). I concluded the same holds for the famine deaths caused by the climatic effects of nuclear war.

One could argue we should not only put decent weight on the existence of dragon kings, but also on the possibility that they will make the expected value density of saving lives higher than in normal times. However, this would be assuming the conclusion.

Sorry, I understood that you primarily weren't trying to model effects on extinction risk. But I understood you to be suggesting that this methodology might be appropriate for what we were doing in that paper -- which was primarily modelling effects on extinction risk.

Sorry, this isn't speaking to my central question. I'll try asking via an example:

  • Suppose we think that there's a 1% risk of a particular catastrophe C in a given time period T which kills 90% of people
  • We can today make an intervention X, which costs $Y, and means that if C occurs then it will only kill 89% of people
    • We pay the cost $Y in all worlds, including the 99% in which C never occurs
  • When calculating the cost to save a life for X, do you:
    • A) condition on C, so you save 1% of people at the cost of $Y; or
    • B) don't condition on C, so you save an expected 0
... (read more)
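To make the two options concrete, here is a minimal arithmetic sketch. The population N and cost Y below are hypothetical placeholders; only the 1% figures come from the bullets above.

```python
# Minimal sketch of options A) and B) above, with placeholder numbers.
N = 8e9     # assumed world population (hypothetical placeholder)
Y = 1e9     # assumed cost of intervention X in dollars (hypothetical placeholder)
p_C = 0.01  # probability of catastrophe C during period T (from the example)

lives_saved_given_C = 0.01 * N                     # C kills 89% instead of 90% of people
expected_lives_saved = p_C * lives_saved_given_C   # unconditional expectation

cost_per_life_A = Y / lives_saved_given_C   # A) conditioning on C
cost_per_life_B = Y / expected_lives_saved  # B) not conditioning on C

print(f"A) conditional on C:  ${cost_per_life_A:,.0f} per life saved")
print(f"B) unconditional:     ${cost_per_life_B:,.0f} per life saved")
```

The two answers differ by a factor of 1/p_C, which is what the question about conditioning turns on.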
2
Vasco Grilo
3d
Thanks for clarifying! I agree B) makes sense, and I am supposed to be doing B) in my post. I calculated the expected value density of the cost-effectiveness of saving a life from the product between:

  • A factor describing the value of saving a life (B = k_B (P_i/P_f)^ε_B).
  • The PDF of the ratio between the initial and final population (f = α (P_i/P_f)^(-(α+1))), which is meant to reflect the probability of a catastrophe.
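Writing x for the ratio P_i/P_f, the product of the two factors above works out to (a reconstruction of the notation; the "decreasing in severity" conclusion then holds whenever ε_B < α + 1, which is my reading of the assumption rather than a quotation from the post):

```latex
v(x) \;=\; B(x)\,f(x)
     \;=\; k_B\,x^{\epsilon_B}\cdot\alpha\,x^{-(\alpha+1)}
     \;=\; k_B\,\alpha\,x^{\,\epsilon_B-\alpha-1},
\qquad x \equiv P_i/P_f .
```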

I think if you're primarily trying to model effects on extinction risk, then doing everything via "proportional increase in population" and nowhere directly analysing extinction risk, seems like a weirdly indirect way to do it -- and leaves me with a bunch of questions about whether that's really the best way to do it.

2
Vasco Grilo
3d
I am not necessarily trying to do this. I intended to model the overall effect of saving lives, and I have the intuition that saving a life in a catastrophe (period over which there is a large reduction in population) conditional on it happening is more valuable than saving a life in normal times, so I assumed the value of saving a life increases with the severity of the catastrophe. One can assume preventing extinction is especially important by selecting a higher value for ε_B ("the elasticity of the benefits [of saving a life] with respect to the ratio between the initial and final population").

Re.

Cotton-Barratt 2020 says “it’s usually best to invest significantly into strengthening all three defence layers”:

“This is because the same relative change of each probability will have the same effect on the extinction probability”. I agree with this, but I wonder whether tail risk is the relevant metric. I think it is better to look into the expected value density of the cost-effectiveness of saving a life, accounting for indirect longterm effects as I did. I predict this expected value density to be higher for the 1st layers, which respect a

... (read more)
2
Vasco Grilo
3d
Thanks for all your comments, Owen! My expected value density of the cost-effectiveness of saving a life, which decreases as catastrophe severity increases, is supposed to account for longterm effects like decreasing the risk of human extinction.

I'm worried that modelling the tail risk here as a power law is doing a lot of work, since it's an assumption which makes the risk of very large events quite small (especially since you're taking a power law in the ratio -- aside from the threshold arising from requiring a certain number of humans for a viable population, the structure of the assumption essentially implies that extinction is impossible).
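One way to see the parenthetical point (my gloss, with x the ratio between initial and final population and α the tail exponent): a power law assigns survival probability

```latex
\Pr(X > x) \;=\; x^{-\alpha} \;\longrightarrow\; 0 \quad \text{as } x \to \infty ,
```

so literal extinction (a final population of zero, i.e. x = ∞) receives probability zero under the fitted distribution, and only the minimum-viable-population threshold reintroduces it.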

But we know from (the fancifully named) dragon king theory that the very largest events are often substantially larger than would be predicted by power law extrapolation.

2
Vasco Grilo
3d
Thanks for the critique, Owen! I strongly upvoted it.

Assuming the PDF of the ratio between the initial and final population follows a loguniform distribution (instead of a power law), the expected value density of the cost-effectiveness of saving a life would be constant, i.e. it would not depend on the severity of the catastrophe. However, I think assuming a loguniform distribution for the ratio between the initial and final population majorly overestimates tail risk. For example, I think a population loss (over my period length of 1 year[1]) of 90 % to 99 % (ratio between the initial and final population of 10 to 100) is more likely than a population loss of 99.99 % to 99.999 % (ratio between the initial and final population of 10 k to 100 k), whereas a loguniform distribution would predict both of these to be equally likely.

My reduction in population is supposed to refer to a period of 1 year, but the above only decreases population over longer horizons. I think human extinction over 1 year is extremely unlikely. I estimated 5.93*10^-12 for nuclear wars, 2.20*10^-14 for asteroids and comets, 3.38*10^-14 for supervolcanoes, a prior of 6.36*10^-14 for wars, and a prior of 4.35*10^-15 for terrorist attacks.

Interesting! I did not know about that theory. On the other hand, there are counterexamples. David Roodman has argued the tail risk of solar storms decreases faster than predicted by a power law. I have also found the tail risk of wars decreases faster than predicted by a power law. Do you have a sense of the extent to which the dragon king theory applies in the context of deaths in catastrophes?

[1] I have now clarified this in the post.
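For comparison, under the loguniform alternative mentioned above the PDF is proportional to 1/x on the relevant range, so the same product becomes (again writing x for P_i/P_f; the stated constancy then corresponds to ε_B = 1, which is an inference on my part since the value is not given in this excerpt):

```latex
v(x) \;\propto\; x^{\epsilon_B}\cdot x^{-1} \;=\; x^{\,\epsilon_B-1},
```

which is independent of the severity x exactly when ε_B = 1.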

I'm confused by some of the set-up here. When considering catastrophes, your "cost to save a life" represents the cost to save that life conditional on the catastrophe being due to occur? (I'm not saying "conditional on occurring" because presumably you're allowed interventions which try to avert the catastrophe.)

Understood this way, I find this assumption very questionable:

, since I feel like the effect of having more opportunities to save lives in catastrophes is roughly offset by the greater difficulty of preparing to take advantage of those opportu

... (read more)
2
Vasco Grilo
3d
My language was confusing. By "pre- and post-catastrophe population", I meant the population at the start and end of a period of 1 year, which I now also refer to as the initial and final population. I have now clarified this in the post. I assume the cost to save a life in a given period is a function of the ratio between the initial and final population of the period. I meant to refer to all mechanisms (e.g. prevention, response and resilience) which affect the variation in population over a period.

Habryka identifies himself as the author of a different post which is linked to and being discussed in a different comment thread.

2
Peter Wildeford
8d
Oh ok, thanks! Sorry for my confusion.

Yeah it totally has the same effect. It can just be less natural to analyse, if you think the risk will (or might) decrease a lot following some transition (which is also when the risk will mostly be incurred), but you're less confident about when the transition will occur.

I'm worried we're talking past each other here. We totally might find arrangements that keep the state risk at like 1% -- and in that case then (as Thorstad points out) we expect not to have a very large future (though it could still be decently large compared to the world today).

But if your axiology is (in part) totalist, you'll care a lot whether we actually get to very large futures. I'm saying (agreeing with Thorstad) that these are dependent on finding some arrangement which drives risk very low. Then I'm saying (disagreeing with Thorstad?) that the decision-relevant question is more like "have we got any chance of getting to such a state?" rather than "are we likely to reach such a state?"

4
titotal
11d
Okay, that makes a lot more sense, thank you. I think the talk of transition risks and sail metaphors isn't actually that relevant to your argument here? Wouldn't a gradual and continuous decrease in state risk, like the Kuznets curve shown in Thorstad's paper here, have the same effect?

Ok, I agree with you that state risk is also an important part of the picture. I basically agree that nuclear risk is better understood as a state risk. I think the majority of AI risk is better understood as a transition risk, which was why I was emphasising that.

I guess at a very high level, I think: either there are accessible arrangements for society at some level of technological advancement which drive risk very low, or there aren't. If there aren't, it's very unlikely that the future will be very large. If there are, then there's a question of wheth... (read more)

I guess at a very high level, I think: either there are accessible arrangements for society at some level of technological advancement which drive risk very low, or there aren't. If there aren't, it's very unlikely that the future will be very large. If there are, then there's a question of whether the world can reach such a state before an existential catastrophe.

This reasoning seems off. Why would it have to drive thing to very low risk, rather than to a low but significant level of risk, like we have today with nuclear weapons? Why would it be impossibl... (read more)

Let me be clear about the type signature of the sail metaphor: it's not giving an object-level argument that the risk will drop a long way. I think it's a completely legit question why this one is different. (I'm not confident that it is, but the kinds of reasons I think it may well be are outlined in this post.)

Instead it's saying that it may be more natural to have the object-level conversations about transitions rather than about risk-per-century. Here's a stylized example:

  • Suppose you're confident that putting up the sail will incur a 50% risk, and otherw
... (read more)
8
titotal
12d
Hmm, I definitely think there's an object-level disagreement about the structure of risk here. Take the invention of nuclear weapons for example. This was certainly a "transition" in society relevant to existential risk. But it doesn't make sense to me to analogise it to a risk in putting up a sail. Instead, nuclear weapons are just now a permanent risk to humanity, which goes up or down depending on geopolitical strategy. I don't see why future developments wouldn't work the same way. It seems that since early humanity the state risk has only been increasing further and further as technology develops. I know there are arguments for why it could suddenly drop, but I agree with the linked Thorstad analysis that this seems unlikely.

The main point of my comment above is that "highly uncertain" is enough to support action premised on the possibility of a time of perils.

For what it's worth I think that the ontology of "dropping risk by many orders of magnitude" is putting somewhat too much emphasis on "risk per century" as a natural unit. I think a lot of anthropogenic risk is best understood not as a state risk (think "risk I randomly fall off the side of the boat"), but as a transition risk (think "risk I fall in as I try to put the sail up"). Some of the high risk imagined this centu... (read more)

I think this sail metaphor is more obfuscatory than revealing. If you think that the risk will drop orders of magnitude and stay there, then it's fine to say so, and you should make your object-level arguments for that. Calling it a transition doesn't really add anything: society has been "transitioning" between different states for its entire lifetime, why is this one different?


This work hinges on the assumption that we're not in a time of perils situation. In other work Thorstad argues that the common arguments for thinking we're in a time of perils are uncompelling. I'm not sure I agree (i.e. on balance my inside view supports a time of perils, but I'm not sure that the case for this has ever been spelled out in a watertight way), but fair enough -- it's very healthy and good to poke at foundational assumptions. But he doesn't provide any strong arguments that we aren't in a time of perils. And the arguments presented here rely... (read more)

Thorstad has previously written a paper specifically addressing the time of perils hypothesis, summarised in seven parts here.

One of the points is that just being in a time of perils is not enough to debunk his arguments, it has to be a short time of perils, and the time of perils ending has to drop the risk by many orders of magnitude. These assumptions seem highly uncertain to me. 

From my perspective, therefore, the value of this work is that it justifies that it would be importantly decision-relevant to find strong arguments that we're not in a time of perils situation. That's not hugely surprising, but it's good to get the increased confidence and to have a handle on precisely how it would be decision-relevant.

I think you're right to be more uncomfortable with the counterfactual analysis in cases where you're aligned with the other players in the game. Cribbing from a comment I've made on this topic before on the forum:

I think that counterfactual analysis is the right approach to take on the first point if/when you have full information about what's going on. But in practice you essentially never have proper information on what everyone else's counterfactuals would look like according to different actions you could take.

If everyone thinks in terms of something l... (read more)

I'll give general takes in another comment, but I just wanted to call out how I think that at least for some of your examples the assumptions are unrealistic (and this can make the puzzle sound worse than it is).

Take the case of "The funding of an organization and the people working at the org". In this case the factors must combine in a sub-multiplicative way rather than a multiplicative way. For it's clear that if you double the funding and double the people working at the org you should approximately double the output (rather than quadruple it). I think... (read more)
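One standard functional form with this sub-multiplicative property (an illustration of mine, not something specified in the comment) is a constant-returns Cobb-Douglas combination of funding F and people P:

```latex
O(F, P) \;=\; c\,F^{a}\,P^{\,1-a}, \quad 0 < a < 1
\quad\Longrightarrow\quad
O(2F, 2P) \;=\; 2\,O(F, P),
```

so doubling both inputs doubles output, whereas a naively multiplicative model (O proportional to F·P) would have it quadruple.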

For my part, I especially liked reading the article on moral misdirection, which I think gives a clear explanation of (and name for!) a worryingly-prevalent dynamic. Thanks!

I don't really know (and have had almost no interactions with Alexander). But it would be unsurprising to me if that would have flipped the decision.

Basically: Alexander's views seem compatible with a real felt regret after the controversy about it a year ago, and this being the first time one can talk about it publicly without undermining a grantee. Since the PR costs have by this stage largely(?) been paid, it seems quite plausible that if the cost-effectiveness analysis had come back mildly positive he'd have continued to feel regret about not having averted the past issue, while now thinking it was right to continue support.

6
Habryka
18d
Thanks, that's very helpful, though if you think it's mostly because the PR cost has already been paid, then that does provide little solace under my worldview. Let's assume the PR costs were still ongoing, do you think it would have then flipped the decision?

Overall I feel relatively supportive of more investigation and (especially) postmortem work. I also don't fully understand why more wasn't shared from the EV investigation[1].

However, I think it's all a bit more fraught and less obvious than you imply. The main reasons are:

  • Professional external investigations are expensive
    • Especially if they're meaningfully fact-finding and not just interviewing a few people, I think this could easily run into hundreds of thousands of dollars
    • Who is to pay for this? If a charity is doing it, I think it's important that their
... (read more)

Ah but the genius of it is that you still have all of the information -- it's just also distributed! Each person knows exactly who they voted for, and as a bonus they avoid having to entrust any of their data to a centralized system which could be controlled by any kind of nefarious types. (I heard that the US had issues with election integrity in 2020, and if a major nation can't manage this, I think it's really a bit much to expect a shoestring-budget org like CEA to manage it.)

As an extra benefit, the Proportional Representation can be implemented in a decentralized way -- and people can opt into it on a case-by-case basis, without the approval of any central authority.

To implement this, donors should simply give to the place they believe to be most deserving. You might be concerned that this is antidemocratic! But in fact it's entirely democratic, and you're just taking responsibility for paying out the proportion of the total that you represent. Others can in parallel take responsibility for their shares. Thus in a distributed manner you reach deeply democratic outcomes.

2
Jason
22d
You may lose the information-gathering function though -- in @RedStateBlueState's model, we learn what proportion of the Forum user base voted for Charity X (and what portion of those users' votes it received). That would be difficult to get in your decentralized take -- an org would have to determine ForumUser status, and figure out the share of each user's vote it received.

Of course you're right; my "log uniform" assumption is in a different space than your "Pareto" assumption. I think I need to play around with the scale density notion a bit more until it's properly intuitive.

Thanks! I think this is really helpful.

[Warning: this comment is kind of thinking-out-loud; the ideas are not yet distilled down to their best forms.]

The only thing I want to quibble about so far is your labelling my model as more general. I think it isn't really -- I had a bit of analysis based on the bivariate distribution, but really this was just a variation on the univariate distribution I mostly thought about.

Really the difference between our models is in the underlying distribution they assume. I was assuming something roughly (locally) log-uniform.... (read more)

3
ABlank
23d
Howdy. I appreciate your reply. By the difference in generality i meant the difficulty-based problem selection. (Or the possibility of some other hidden variable that affects the order in which we solve problems.)

On a closer examination of your 2014 post, i don't think this is true. If we look at the example distribution and try to convert it to the language i've used in this post, there's a trick with the scale density concept: Because the benefits of each problem are identical, their cost-effectiveness is the inverse of difficulty, yes. But the spacing of the problems along the cost-effectiveness axis decreases as the cost increases. So the scale density, which would be the cost divided by that spacing, ends up being proportional to the inverse square of cost-effectiveness. This is easier to understand in a spreadsheet. And the inverse square distribution is exactly where i would expect to see logarithmic returns to scale.

As for what distributions actually make sense in real life, i really don't know. That's more for people working in concrete cause areas to figure out than me sitting at home doing math. I'm just happy to provide a straightforward equation for those people to punch their more empirically-informed distributions into.
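A sketch of why an inverse-square scale density gives logarithmic returns, on one reading of the setup above (the notation is mine: c is cost-effectiveness, ρ(c) the scale density, and resources are assumed to be spent on the most cost-effective problems first):

```latex
\rho(c) \propto c^{-2}
\;\Longrightarrow\;
S(c_{*}) = \int_{c_{*}}^{c_{\max}} \rho(c)\,dc \;\propto\; \frac{1}{c_{*}} - \frac{1}{c_{\max}},
\qquad
B(c_{*}) = \int_{c_{*}}^{c_{\max}} c\,\rho(c)\,dc \;\propto\; \ln\frac{c_{\max}}{c_{*}},
```

so, eliminating the spending threshold c_*, total benefit B grows roughly like the logarithm of total spending S.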

Thanks, this makes sense. I hadn't thought about the possibility of committing to have the external investigation provide public answers on specific questions. (And the fact that I hadn't thought of it gives me something to reflect on about how to ideally act during crises.)

  1. In the world with high corruption, I'd have expected more individuals to be scared of what an investigation would turn up.

  2. If the investigation found major issues which weren't publicly reported on, it would increase the fragility of EV's position, since anyone who had seen the investigation could become a whistleblower with a relatively incontrovertible case.

These don't totally exclude "there's a big collusion not to share the issues", but "major internal collusion" without also "was colluding with SBF" is getting to be a narrower target, and I thi... (read more)

Jason
24d

I have the opposite reaction.

Background

To recap, Zach wrote about four months ago:

Mintz found no evidence that anyone at EV (including employees, leaders of EV-sponsored projects, and trustees) was aware of the criminal fraud of which Sam Bankman-Fried has now been convicted.

While we are not publishing any additional details regarding the investigation because doing so could reveal information from people who have not consented to their confidences being publicized and could waive important legal privileges that we do not intend to waive, we recognize

... (read more)

Meta: I'm worried there could be some people-talking-past-each-other here. I never meant to claim that PR concerns shouldn't influence one's decision-making, but that they shouldn't drive one's decision-making. On this view you should certainly be willing to change direction on relatively unimportant issues for PR reasons, but should be somewhat resolute against doing so when it would change what was otherwise the central important thing you wanted to do. 

I defended a similar broad position to this in a post a couple of years back on the perils of opt... (read more)

(Sorry you're getting downvoted, this seems like a productive conversation to me.)

There's a big question about who things are being legible to. I don't think this would make things meaningfully more legible to you or people in your reference class of basically being high context on the org and having a lot of intersections in social circles.

I think there's a large crowd who are closer to "the general public" who only know about things via stuff that's written publicly, and will only ever read a small fraction of that. I think that these kinds of reform hel... (read more)

Yeah, I think you are pointing towards something real here. 

Like, I do think a thing that drove my reaction to this was a perspective in which it was obvious that most people in EA didn't literally actively participate in the FTX fraud. I have encountered very extreme and obviously wrong opinions about this in the public (the comment section of the WaPo article provides many examples of this), and there is some value in engaging with that. 

But I do think that is engaging with a position that is extremely shallow, and the mechanism of it seems lik... (read more)

I think if EV had had a high corruption culture (even if they didn't literally know about any crimes), it would have been much more costly to have an external investigation (and such an investigation would likely have shaken out differently).

This is an interesting argument, thanks for making it. Would you mind explaining in a bit more depth? I would have thought the structure of the investigation, especially including the fact that the results were not published, would limit the cost of the investigation. As far as I'm aware the only thing outside observer... (read more)

Sorry, I wasn't meaning to object to your critique overall. On my impression it's substantively correct that it's a puff piece and slightly disingenuous. (Maybe I should have owned that up-front. I still feel a bit bad about saying it baldly, because Zach is relatively new in a role that I think is often both difficult and thankless; I'd guess he's doing a good job overall and it seems bad if the feedback he gets skews too negative; I think it's probably good if he does more media. But I'm now worried that not saying it will be more confusing/costly, so I'... (read more)

I still think you're overstating the extent to which the things described aren't about institutional reform

I think they kind of are -- but it should be understood that the purpose of the reforms is more about improving legibility than about fundamentally reducing the risk of similar occurrences

So, I feel like you are using "improving legibility" as kind of a euphemism here. I don't currently believe that you think that the changes that CEA has made would make it easier for you or me or the external world to discover whether something like FTX was likely to... (read more)

Habryka
25d

because Zach is relatively new in a role that I think is often both difficult and thankless

Yeah, I agree with this. 

I think Zach definitely has a very difficult job, and I would like to support him in what he is doing. I seem to have disagreements about how to go about it, but that doesn't change the fact that I think he is paying large costs in stress and associated suffering in the pursuit of aims that I do care deeply about, and I think it's important to recognize that despite any disagreements we might have about strategy and tactics. Man in the arena and all that. 

I’m not privy to the details of the assessment that OP did, but I was briefly consulted (as a courtesy) before this decision was made, and I understand that there was a proper cost-benefit analysis driving their decisions here.

Compared to when the original decision was made, a few things look different to me:

  • I originally underestimated the staffing and maintenance costs for the project
    • (I’m still not sure whether there might have been an accessible “shoestring” version)
  • After what happened with FTX, money is more constrained, which means it’s less desirable
... (read more)
5
Habryka
21d
Is your sense that if the cost-effectiveness estimate had come back positive, but not overwhelmingly positive (let's say like a 70th percentile OP grant-dollar in the last year), that this would have flipped the decision?  Given that not vetoing this grant made Alexander's top list of decisions he regrets, mostly because of the negative optics-aspects, I would be surprised if the cost-effectiveness estimate was actually a crux here.

I think it's great to think about what projects should maybe exist and then pitch them! Kudos to you for doing that; it seems potentially one of the highest-value activities on the Forum.

I think that information flows are really important, and in principle projects like this could be really high-value already in the world today. Moreover I agree that the general area is likely to increase in importance as the impacts of language models are more widely felt. But details are going to matter a lot, and I'm left scratching my head a bit over this:

  • When I read t
... (read more)
1
Light_of_Illuvatar
25d
Hi, the general model for the platform would be something akin to a web-based news site (e.g. WIRED, Vox, etc.) and a subreddit combined. There's the human run in depth coverage part, where the work should be done to increase impartiality, but there's also the linklist part which allows community members to "float" content they find interesting without getting bogged down in writing it up, so to speak. The links shared will be opinionated, definitely,  but that should be mitigated by the human coverage, and the limitations of human coverage (speed of updates, long reading time) can hopefully be compensated by the linklist/subreddit portion of the site.

I think you're missing some important ground in between "reflection process" and "PR exercise".

I can't speak for EV or other people then on the boards, but from my perspective the purpose of the legal investigation was primarily about helping to facilitate justified trust. Sam had been seen by many as a trusted EA leader, and had previously been on the board of CEA US. It seemed it wouldn't be unreasonable if people in EA (or even within EV) started worrying that leadership were covering things up. Having an external investigation was, although not a cheap... (read more)

Yep, I think the investigation did screen off some extreme scenarios in which people at CEA were colluding very actively with Sam about his biggest crimes. I at least had very little probability on this, and while it's good to clear up such an extreme scenario, I don't think it has much to do with actually preventing future things like FTX. 

I totally agree there is value in signaling to external stakeholders that CEA is not literally criminal, but I don't think this has anything to do with a claimed "internal reflection process" and "institutional ref... (read more)

I haven't downvoted, but this is attracting several downvotes, and I thought I'd try to articulate some negative feelings I have here:

  • First, as Stefan has noted, the summary seems inaccurate: Zach's article nowhere claims that Sam was never an effective altruist
    • I think it's bad form to put sensationalist takeaways in a summary when they don't appear in the article, and feel not great about the link as a result
    • I do think that seeing it linked in this way primed me to be more negative about the article (and the notes below reflect that)
      • I have mixed feelings
... (read more)
8
Deborah W.A. Foulkes
25d
Phew! It's much harder to write an effective (no pun intended) headline than I thought! :-). Have changed it to include an actual quote, which I hope is sufficiently representative of the article's content.

4 is a great point, thanks.

On 1--3, I definitely agree that I may prudentially prefer some possibilities than others. I've been assuming that from a consequentialist moral perspective the distribution of future outcomes still looks like the one I give in this post, but I guess it should actually look quite different. (I think what's going on is that in some sense I don't really believe in world A, so haven't explored the ramifications properly.)

I support people poking at the foundations of these arguments. And I especially appreciated the discussion of bottlenecks, which I think is an important topic and often brushed aside in these discussions.

That said, I found that this didn't really speak to the reasons I find most compelling in favour of something like the singularity hypothesis. Thorstad says in the second blog post:

If each doubling of intelligence is harder to bring about than the last, then even if all AI research is eventually done by recursively self-improving AI systems, the pace of do

... (read more)

Although I agree that all of these are challenges, I don't really believe they're enough to undermine the basic case. It's not unusual in high-paying industries for some people to make several times as much as their colleagues. So there's potentially lots of room to have higher consumption than you would working at a nonprofit while also giving away more than half your salary.

Empirically my impression is also that people who went into earning to give early tended to stay giving later in their careers 10+ years later; although that's anecdotal rather than data-driven.

I agree with "pretty rationally overall" with respect to general world modelling, but I think that some of the stuff about how it relates to its own values / future selves is a bit of a different magisterium and it wouldn't be too surprising if (1) it hadn't been selected for rationality/competence on this dimension, and (2) the general rationality didn't really transfer over.

One thought is that for something you're describing as a minimal viable takeover AI, you're ascribing it a high degree of rationality on the "whether to wait" question.

By default I'd guess that minimal viable takeover systems don't have very-strong constraints towards rationality. And so I'd expect at least a bit of a spread among possible systems -- probably some will try to break out early whether or not that's rational, and likewise some will wait even if that isn't optimal.

That's not to say that it's not also good to ask what the rational-actor model suggests. I think it gives some predictive power here, and more for more powerful systems. I just wouldn't want to overweight its applicability.

4
Habryka
1mo
Hmm, my guess is by the time a system might succeed at takeover (i.e. has more than like a 5% chance of actually disempowering all of humanity permanently), I expect its behavior and thinking to be quite rational. I agree that there will probably be AIs taking reckless action earlier than that, but in as much as an AI is actually posing a risk of takeover, I do expect it to behave pretty rationally overall.

Yeah I'm arguing that with good reflective governance we should achieve a large fraction of what's accessible.

It's quite possible that that means "not quite all", e.g. maybe there are some trades so that we don't aestivate in this galaxy, but do in the rest of them; but on the aggregative view that's almost as good as aestivating everywhere.

I think in this context the natural way to interpret probabilities (or medians) of probabilities is as something like: "what would I think the probability was if I knew a lot more about the situation and got to think about it a lot more, but wasn't literally omniscient". Of course that isn't fully defined (since it's not specified exactly how much more they get to know), but I think it's approximately meaningful, and can capture something quite useful/important about credal resilience.

Relatedly, I think the journalist's presentation is misleading. I think ... (read more)

I think maybe yes? But I'm a bit worried that "won't react to them" is actually doing a lot of work.

We could chat about more a concrete example that you think fits this description, if you like.

Re:

I just notice that I have a strong intuition, backed up by something that seems to me like a plausible claim: given that myriad actors always contribute to any outcome, it is hard to imagine that there is one (or a very few) individual(s) that does all of the heavy lifting...

I want to note that this property isn't a consequence of a power-law distribution. (It's true of some power laws but not others, depending on the exponent.) I think you're right about this in most cases (though in some domains like theoretical physics I think it's more plausible tha... (read more)
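A small simulation (mine, purely illustrative) of how the exponent matters: for Pareto-distributed contributions, the share of the total coming from the single largest contributor depends strongly on the tail exponent α, so "one individual does most of the heavy lifting" holds for some power laws and not others.

```python
# Illustrative only: share of the total contributed by the largest of n Pareto draws,
# for different tail exponents alpha. Heavier tails (smaller alpha) concentrate more
# of the total in the single top contributor.
import numpy as np

rng = np.random.default_rng(0)

def top_share(alpha: float, n: int = 10_000, trials: int = 200) -> float:
    """Mean share of the total accounted for by the largest of n Pareto(alpha) draws."""
    shares = []
    for _ in range(trials):
        x = rng.pareto(alpha, size=n) + 1.0  # standard Pareto with minimum value 1
        shares.append(x.max() / x.sum())
    return float(np.mean(shares))

for alpha in (0.8, 1.2, 2.0, 3.0):
    print(f"alpha = {alpha}: top contributor ~ {top_share(alpha):.1%} of the total")
```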

Perhaps we could promote the questions:

  • 'How can I help facilitate the most good?', or
  • 'How can I support the most good?'

and not the question:

  • 'How can I do the most good?'

Similar reframes might acknowledge that some efforts help facilitate large benefits, while also acknowledging all do-gooding efforts are ultimately co-dependent, not simply additive*? I like the aims of both of you, including here and here, to capture both insights.

(*I'm sceptical of the simplification that "some people are doing far more than others". Building on Owen's example, any impact... (read more)

Yeah the tone makes sense for a personal blog (and in general the piece makes more sense for an audience who can mostly be expected to know Katja already).

I think it could have signalled more not-being-a-puff-piece by making the frame less centrally about Katja and more about the virtues you wanted to draw attention to. It's something like: those, rather than the person, are the proper object-of-consideration for these large internet audiences. Then you could also mention that the source of inspiration was the person.

2
Nathan Young
1mo
Yeah that seems right. Not sure what options one can click on crossposting to point that out. (I think the forum has a personal blog option, but I'm not sure that's so appropriate on LessWrong)

Extremely relevant for my personal assessment!

For the social fabric stuff it seems more important whether it's legibly not a puff piece. Had I downvoted (and honestly I was closer to upvoting), the intended signal would have been something like "small ding for failing to adequately signal that it's not a puff piece" (such signalling is cheaper for things that actually aren't puff pieces, so asking for it is relatively cheap and does some work to maintain a boundary against actual puff pieces). It would have warranted a bigger ding if I'd thought it was a p... (read more)

2
Nathan Young
1mo
  • How could it have better signalled it wasn't a puff piece?
  • It sort of is a bit of a puff piece. I tried to talk about some negatives but I don't know that it's particularly even handed.
  • I tend to get quite a lot of downvotes in general, so some is probably that.
  • Beyond that, the title is quite provocative - I just used the title on my blog, but I guess I could have chosen something more neutral

I think I was leaning into making my guess sound surprising there, and I had in mind something closer to 100 than 30; it might have been better to represent it as "about 100" or ">50" or something.

The fact that presidential terms are just 4 or 8 years does play into my thinking. For sure, they've typically done other meaningful stuff, but I don't think that typically has such a high impact ratio as their years as president. I generated my ratio by querying my brain for snap judgements about how big a deal it would seem to have [some numbers of president... (read more)

FWIW my guess is that if you compare (lifetime impact of president):(lifetime impact of average member of congress), the ratio would be <100 (but >30).

2
Larks
1mo
I'm surprised you think that low, especially considering the President often will have been a Senator or Governor or top businessman before office, so the longer average term in Congress is not a big advantage. 

I'm definitely a little surprised to hear that you don't think that impact is power-law distributed at all, even ex post. I wonder if it's worth trying to get numerical about this, rather than talk qualitatively about "whether impact is power-law distributed". Because really it's the quantitative ratios that matter rather than the exact nature of the distribution out in the tails (e.g. I doubt the essential disagreement here is about whether it's a power law vs a lognormal).

If you restrict to people who are broadly trying to do good with their work (at lea... (read more)

6
Sarah Weiler
1mo
Appreciate the attempt to make headway on the disagreement! I feel pretty lost when trying to quantify impact at these percentiles. Taking concerns about naive attribution of impact into consideration, I don't even really know where to start to try to come up with numbers here. I just notice that I have a strong intuition, backed up by something that seems to me like a plausible claim: given that myriad actors always contribute to any outcome, it is hard to imagine that there is one (or a very few) individual(s) that does all of the heavy lifting...

"And how much spread do we need to get here in order to justify a lot of attention going into looking for tail-upsides?" -- Also a good question. I think my answer would be: it depends on the situation and how much up- or downsides come along with looking for tail-upsides. If we're cautious about the possible adverse effects of impact maximizing mindsets, I agree that it's often sensible to look for tail-upsides even if they would "only" allow us to double impact.

Then there are some situations/problems where I believe the collective rationality mindset, which looks for "how should I and my fellows behave in order to succeed as a community" rather than "how should I act now to maximize the impact I can have as a relatively direct/traceable outcome from my own action?"

I mostly-disagree with this on pragmatic grounds. I agree that that's the right approach to take on the first point if/when you have full information about what's going on. But in practice you essentially never have proper information on what everyone else's counterfactuals would look like according to different actions you could take.

If everyone thinks in terms of something like "approximate shares of moral credit", then this can help in coordinating to avoid situations where a lot of people work on a project because it seems worth it on marginal impact, ... (read more)

2
Ben Millwood
1mo
Is it at least fair to say that in situations where the other main actors aren't explicitly coordinating with you and aren't aware of your efforts (and, to an approximation, weren't expecting your efforts and won't react to them), you should be thinking more like I suggested?