Max Clarke

Comments (65)

People who read this far seem to have upvoted

I would have expected the opposite corner of the two-axis voting (because I think people don't like the language).

Seems he has ended up giving more to the Democratic Party than to EA lol

There seem to be two different conceptual models of AI risk.

The first is a model like the one in his report "Existential risk from power-seeking AI", in which he lays out a number of things that, if they happen, will cause AI takeover.

The second is a model (which stems from Yudkowsky & Bostrom, and appears more recently in Michael Cohen's work https://www.lesswrong.com/posts/XtBJTFszs8oP3vXic/?commentId=yqm7fHaf2qmhCRiNA ) where we should expect takeover by malign AGI by default, unless certain things happen.

I personally think the second model is much more reasonable. Do you have any rebuttal?

Likewise, I have a post from January suggesting that crypto assets are over-represented in the EA funding portfolio.

Probably the number of people actually pushing the frontier of alignment is more like 30, and for capabilities maybe 3000. If the remaining 270 alignment people can influence those 3000 (biiiig if, but), then the odds aren't that bad.
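
A quick back-of-the-envelope on those numbers (the ~300-person total for alignment is an inference from "30 at the frontier" plus "270 remaining", not a surveyed figure):

```python
frontier_alignment = 30   # guess above for people actually pushing the alignment frontier
other_alignment = 270     # remaining alignment people (implies ~300 total, an inference)
capabilities = 3000       # rough guess for capabilities researchers

# Roughly one non-frontier alignment person per eleven capabilities researchers.
print(capabilities / other_alignment)  # ≈ 11.1
```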

Not sure what Rob is referring to, but there are a fair few examples of orgs' or people's purposes slipping from alignment to capabilities, e.g. OpenAI.

I myself find it surprisingly difficult to focus on ideas that are robustly beneficial to alignment but not to capabilities.

(E.g. I have a bunch of interpretability ideas, but interpretability can only either have no impact on timelines or accelerate them.)

Do you know whether any of the alignment orgs have some kind of alignment research NDA, with a panel that allows alignment-only ideas to be made public but keeps the maybe-capabilities ideas private?

I think this post should probably be edited so that "focus on low-risk interventions first" is bolded in the first sentence and placed right next to the pictures, because the most careless people (possibly like me...) are the ones who will read that and not read the current caveats.

An addendum is then:

  1. If buying-time interventions are conjunctive (i.e. one careless intervention can cancel out the effect of the others), but technical alignment is disjunctive (a single success is enough),

  2. and if the distribution of people performing both kinds of intervention is mostly towards the lower end of thoughtfulness/competence (which imo we should expect),

Then technical alignment is a better recommendation for most people.

In fact, it suggests that the graph in the post should be reversed (but with the axis at the bottom being social competence rather than technical competence). A toy sketch of the conjunctive vs. disjunctive point is below.
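
To make the conjunctive/disjunctive contrast concrete, here is a minimal toy simulation. All of the numbers (actor count, share of careful actors, per-actor probabilities) are made-up assumptions for illustration, not figures from the post:

```python
import random

# Toy model (illustrative only). "Buying time" is treated as conjunctive:
# a single careless actor can cancel everyone else's effect. Technical
# alignment is treated as disjunctive: one solid success is enough.
N_ACTORS = 100              # assumed number of people attempting each intervention
P_CAREFUL = 0.3             # assumed share of actors at the high end of thoughtfulness
P_BOTCH_IF_CARELESS = 0.05  # assumed chance a careless actor actively backfires
P_WIN_IF_CAREFUL = 0.02     # assumed chance a careful actor produces a real result
N_TRIALS = 20_000

def trial() -> tuple[bool, bool]:
    careful_flags = [random.random() < P_CAREFUL for _ in range(N_ACTORS)]
    # Conjunctive: fails if any careless actor botches it badly.
    buying_time_ok = all(
        careful or random.random() >= P_BOTCH_IF_CARELESS for careful in careful_flags
    )
    # Disjunctive: succeeds if at least one careful actor gets a win.
    alignment_ok = any(
        careful and random.random() < P_WIN_IF_CAREFUL for careful in careful_flags
    )
    return buying_time_ok, alignment_ok

results = [trial() for _ in range(N_TRIALS)]
print("P(buying time not cancelled):", sum(bt for bt, _ in results) / N_TRIALS)
print("P(at least one alignment win):", sum(al for _, al in results) / N_TRIALS)
```

With these assumed numbers, one careless botch is almost guaranteed to cancel the buying-time effort, while the alignment side only needs a single success, which is the asymmetry the addendum relies on.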
