Zach Stein-Perlman

Research @ AI Impacts
2885 karma · Joined Nov 2020 · Working (0-5 years) · Berkeley, CA, USA



AI forecasting & strategy at AI Impacts. Blog: Not Optional.


I mean, I don't think all of your conditions are necessary (e.g. "We invent a way for AGIs to learn faster than humans" and "We massively scale production of chips and power") and I think together they carve reality quite far from the joints, such that breaking the AGI question into these subquestions doesn't help you think more clearly [edit: e.g. because compute and algorithms largely trade off, so concepts like 'sufficient compute for AGI' or 'sufficient algorithms for AGI' aren't useful].

Not reading the paper, and not planning to engage in much discussion, and stating beliefs without justification, but briefly commenting since you asked readers to explain disagreement:

I think this framework is bad and the probabilities are far too low, e.g.:

  • We probably already have "algorithms for transformative AGI."
  • The straightforward meaning of "a way for AGIs to learn faster than humans" doesn't seem to be relevant (seems to be already achieved, seems to be unnecessary, seems to be missing the point); e.g. language models are trained faster than humans learn language (+ world-modeling), and AlphaGo Zero went from nothing to superhuman in three days. Maybe you explain this in the paper though.
  • GPT-4 inference is much cheaper than paying humans $25/hr to write similar content.
  • We probably already have enough chips for AGI by 2043 without further scaling up production.

Separately, note that "AI that can quickly and affordably be trained to perform nearly all economically and strategically valuable tasks at roughly human cost or less" is a much higher bar than the-thing-we-should-be-paying-attention-to (which is more like takeover ability; see e.g. Kokotajlo).

Not really, or it depends on what kinds of rules the IAIA would set.

For monitoring large training runs and verifying compliance, see Verifying Rules on Large-Scale NN Training via Compute Monitoring (Shavit 2023).

Some more sketching of auditing with model evals is in Model evaluation for extreme risks (DeepMind 2023).

I disagree. In particular:

  1. Roughly, I think the community isn't able (isn't strong enough?) to both think much about how it's perceived and think well or in-a-high-integrity-manner about how to do good, and I'd favor thinking well and in a high-integrity manner.
  2. I'd guess donating for warm fuzzies is generally an ineffective way to gain influence/status.

(Of course you should be friendly and not waste weirdness points.)

Minor note: I think it's kinda inelegant that your operationalization depends on the kinds of question-answer pairs humans consider rather than asserting something about the counterfactual where you consider an arbitrary question-answer pair for an hour.

Some categories where extraordinary evidence is common, off the top of my head:

  • Sometimes someone knows the answer with high trustworthiness, e.g. I spend an hour thinking about a math problem, fail to determine the answer, check the answer, and massively update toward the textbook's answer and against others.
  • Sometimes you have high credence that the truth will be something you assigned very low credence to, e.g. what a stranger's full name is or who the murderer is or what the winning lottery number is or what the next sentence you hear will be.
    • Maybe you meant to refer only to (binary) propositions (and exclude unprivileged propositions like "the stranger's name is Mark Xu").
  • Sometimes you update to 0 or 1 because of the nature of the proposition, e.g. if the proposition is something like "(conditional on seeing Zach again) when I next see Zach, he will appear (to me) to be wearing a mostly-blue shirt." When you see me, it's impossible not to update infinitely strongly.

Separately, fwiw I endorse the Mark Xu post but agree with you that (there's a very reasonable sense in which) extraordinary evidence is rare for stuff you care about. Not sure you disagree with "extraordinary evidence is common" proponents.

This is quite surprising to me. For the record, I don't believe that the authors believe that "carry out as much productive activity as one of today’s largest corporations" is a good--or even reasonable--description of superintelligence or of what's "conceivable . . . within the next ten years."

And I don't follow Sam's or OpenAI's communications closely, but I seem to have noticed them recently declining to talk about AI as if it's as big a deal as I think they think it is. (Context for those reading this in the future: Sam Altman recently gave congressional testimony which [I think after briefly engaging with it] was mostly good but notable in that Sam focused on weak AI and sometimes actively avoided talking about how big a deal AI will be and x-risk, in a way that felt dishonest.)

(Thanks for engaging.)

This was hard to read, emotionally.

Some parts are good. I'm confused about why OpenAI uses euphemisms like

"it’s conceivable that within the next ten years, AI systems will exceed expert skill level in most domains, and carry out as much productive activity as one of today’s largest corporations."

(And I heard MATS almost had a couple strategy/governance mentors. Will ask them.)

(Again, thanks for being constructive, and in the spirit of giving credit, yay to GovAI, ERA, and CHERI for their summer programs. [This is yay for them trying; I have no knowledge of the programs and whether they're good.])

(Sure. I was mostly just trying to complain but I appreciate you being more constructive. The relevant complaint in response is that AGISF hasn't improved/updated their curriculum much + nobody's made and shared a better one.)
