rosehadshar

I'd be genuinely curious to hear how Cotton-Barratt and Hadshar see this difference. Is it a meaningful distinction? Are these frameworks reconcilable at different scales of analysis? When would we know which better serves long-term flourishing?

I've only skimmed your post, and haven't read what me and Owen wrote in several years, but my quick take is:

We're saying 'within a particular longtermist frame, it's notable that it's still rational to allocate resources to neartermist ends, for instrumental reasons'
- I think you agree with this
- Since writing that essay, I'm now more worried about AI making humans instrumentally obsolete, in a way that would weaken this dynamic a lot (I'm thinking of stuff like the intelligence curse). So I don't actually feel confident this is true any more.
I think you are saying 'but that is not a good frame, and in fact normatively we should care about some of those things intrinsically'
- I agree, at least partially. I don't think we intended to endorse that particular longtermist frame - just wanted to make the argument that even if you have it, you should still care about neartermist stuff. (And actually, caring instrinsically about neartermist stuff is part of what motivated making the argument, iirc.)
- I vibed with some of your writing on this, e.g. "The Tuesday-morning maintenance network isn't preparation for a future we're aiming toward; it is the future, continuously instantiated."
- I'm not a straight out yes - I think Wednesday in a million years might matter much more than this Tuesday morning, and am pretty convinced of some aspects of longtermism. But I agree with you in putting intrinsic value on the present moment and people's experiences in it

So my guess is, you have a fundamental disagreement with some version of longtermism, but less disagreement with me than you thought.

A personal take on why you should work at Forethought (maybe)

rosehadshar2mo10

Thanks Lizka!

Some misc personal reflections:

Working at Forethought has been my favourite job ever, by a decent margin
I spent a couple of years doing AI governance research independently/collaborating with others in an ad hoc way before joining Forethought. I think the quality of my work has been way higher since joining (because I've been working on more important questions than I was able to make headway on solo), and it's also been just a huge win in terms of productivity and attention (the costs of tracking my time, hustling for new projects, managing competing projects etc were pretty huge for me and made it really hard to do proper thinking)

One minor addition from me on why/not to work at Forethought: I think the people working at Forethought care pretty seriously about things going well, and are really trying to make a contribution.

I think this is both a really special strength, and something that has pitfalls:

It's a privilege to work with people who care in this way, and it cuts a lot of the crap that you'd get in organisations that were more oriented towards short term outcomes, status, etc
On the other hand, I sometimes worry about Forethought leaning a bit to heavily on EA-style 'do what's most impactful' vibes. I think this can kill curiosity, and also easily degrades into trying to try/people trying to meet their own psychological needs to make an impact instead of really staring in the face the reality we seem to be living in.
- Other people at Forethought think that we're not leaning into this enough though: most work on AI futures stuff is low quality and won't matter at all, and it's very easy to fill all your time with interesting and pointless stuff. I agree on those failure modes, but disagree about where the right place on the spectrum is.

And then a few notes on the sorts of people I'd be really excited to have apply:

People who are thinking for themselves and building their own models of what's going on. I think this is rare and sorely needed. Some particular sub-groups I want to call out:
- Really smart independent thinkers who want to work on AI macrostrategy stuff but haven't yet had a lot of surface area with the topic or done a lot of research. I think Forethought could be a great place for someone to soak up a lot of the existing thinking on these topics, en route to developing their own agenda.
- Researchers with deep world models on the AI stuff, who think that Forethought is kind of wrong/a lot less good than it could be. The high-level aspiration for Forethought is something like, get the world to sensibly navigate the transition to superintelligence. We are currently 6 researchers, with fairly correlated views: of course we are totally failing to achieve this aspiration right now. But it's a good aspiration, and to the extent that someone has views on how to better address it, I'd love for them to apply.
  - If I got to choose one type of researcher to hire, it would be this one.
  - My hope would be that for many people in this category, Forethought would be able to 'get out of the way': give the person free reign, not entangle them in organisational stuff where they don't want that, and engage with them intellectually to the extent that it's mutually productive.
  - I agree with Lizka that people who think Forethought sucks probably won't want to apply/get hired/enjoy working at Forethought.
People who are working on this stuff already, but hamstrung by not having [a salary/colleagues/an institutional home/enough freedom for research at their current place of work/a manager to support them/etc]. I'd hope that Forethought could be a big win for people in this position, and allow them to unlock a bunch more of their potential.

Intelsat as a Model for International AGI Governance

rosehadshar8mo2

Sorry for the slow response here! Agree that diffusion is an important issue. A few thoughts:

Some forms of diffusion might be actively good, for reducing concentration of power. So it's not clear that we want to straightforwardly prevent tech diffusion
Ways you could reduce tech diffusion within something like Intelsat:
- Limited membership helps
- You could do things like require companies it contracts with to comply with strong infosec, require members not to allow frontier development without strong infosec, require member governments to provide gov-level infosec to frontier developers in their countries
- Intelsat for satellites involved sharing all the technical information. For AGI, it could involve sharing only some forms of information (e.g. weights don't get shared with everyone, but encrypted chunks of the weights are distributed among founder members)
- h/t Will: having many countries part of the multilateral project removes their incentives to try to develop frontier AI themselves (and potentially open-source)

Intelsat as a Model for International AGI Governance

rosehadshar8mo2

Sorry for the slow response here! Agree that diffusion is an important issue. A few thoughts:

Some forms of diffusion might be actively good, for reducing concentration of power. So it's not clear that we want to straightforwardly prevent tech diffusion
Ways you could reduce tech diffusion within something like Intelsat:
- Limited membership helps
- You could do things like require companies it contracts with to comply with strong infosec, require members not to allow frontier development without strong infosec, require member governments to provide gov-level infosec to frontier developers in their countries
- Intelsat for satellites involved sharing all the technical information. For AGI, it could involve sharing only some forms of information (e.g. weights don't get shared with everyone, but encrypted chunks of the weights are distributed among founder members)
- h/t Will: having many countries part of the multilateral project removes their incentives to try to develop frontier AI themselves (and potentially open-source)

Should there be just one western AGI project?

rosehadshar1y5

I agree that it's not necessarily true that centralising would speed up US development!

(I don't think we overlook this: we say "The US might slow down for other reasons. It’s not clear how the speedup from compute amalgamation nets out with other factors which might slow the US down:

Bureaucracy. A centralised project would probably be more bureaucratic.
Reduced innovation. Reducing the number of projects could reduce innovation.")

Interesting take that it's more likely to slow things down than speed things up. I tentatively agree, but I haven't thought deeply about just how much more compute a central project would have access to, and could imagine changing my mind if it were lots more.

How much is 1.8 million years of work?

rosehadshar1y2

Thanks, I think these points are good.

Learning may be bottlenecked by serial thinking time past a certain point, after which adding more parallel copies won't help. This could make the conclusion much less extreme.

Do you have any examples in mind of domains where we might expect this? I've heard people say things like 'some maths problems require serial thinking time', but I still feel pretty vague about this and don't have much intuition about how strongly to expect it to bite.

Fat Tails Discourage Compromise

rosehadshar1y5

Thanks! I'm now unsure what I think.

if you can select from the intersection, you get options that are pretty good along both axes, pretty much by definition.

Isn't this an argument for always going for the best of both worlds, and never using a barbell strategy?

a concrete use case might be more illuminating.

This isn't super concrete (and I'm not if the specific examples are accurate), but for illustrative purposes, what if:

Portable air cleaners score very highly for non-x-risk benefits, and low for x-risk benefits
Interventions which aim to make far-UVC commercially viable look pretty good on both axes
Deploying far-UVC in bunkers scores very highly for x-risk benefits, and very low for non-x-risk benefits

I think a lot of people's intuition would be that the compromise option is the best one to aim for. Should thinking about fat tails make us prefer one or other of the extremes instead?

Fat Tails Discourage Compromise

rosehadshar1y3

This is cool, thanks!

One scenario I am thinking about is how to prioritise biorisk interventions, if you care about both x-risk and non-x-risk impacts. I'm going to run through some thinking, and ask if you think it makes sense:

I think it is hard (but not impossible) to compare between x-risk and non-x-risk impacts
I intuitively think that x-risk and non-x-risk impacts are likely to be lognormally distributed (but this might be wrong)
This seems to suggest that if I want to do the most good, I should max out on on one, even if I care about both equally. I think the intuition for this is something like:
- If x-risk and non-x-risk impacts were normally distributed, you'd expect that there are plenty of interventions which score well on both. The EV for both is reasonably smoothly distributed; it's not very unlikely to draw something which is between 50th and 75th percentile on both, and that's pretty good EV wise.
- But if they are log normal instead, the EV is quite skewed: the best interventions for x-risk and for non-x-risk impacts are a lot better than the next-best. But it's statistically very unlikely that the 99th percentile on one axis is also the 99th on the other
- If I care about EV, but not about whether I get it via x-risk or non-x-risk impacts (I care equally about x-risk and non-x-risk impacts), I should therefore pick the very best interventions on either axis, rather than trying to compromise between them
However, I think that assumes that I know how to identify the very best interventions on one or both axes
- Actually I expect it to be quite hard to tell whether an intervention is 70th or 99th percentile for x-risk/non-x-risk impacts
What should I do, given that I don't know how to identify the very best interventions along either axis?
- If I max out, I may end up doing something which is mediocre on one axis, and totally irrelevant on the other
- If I instead go for the best of both worlds, it seems intuitively more likely that I end up with something which is mediocre on both axes - which is a bit better than mediocre on one and irrelevant on the other
So maybe I should go for the best of both worlds in any case?

What do you think? I'm not sure if that reasoning follows/if I've applied the lessons from your post in a sensible way.

Strongest real-world examples supporting AI risk claims?

rosehadshar2y2

Thanks, really helpful!

What happens on the average day?

rosehadshar2y2

Super cool, thanks for making this!

rosehadshar

Posts 38

Comments57

Posts
38

Comments
57