

I'm really sorry that you and so many others have this experience in the EA community. I don't have anything particularly helpful or insightful to say -- the way you're feeling is understandable, and it really sucks :(

I just wanted to say I'm flattered and grateful that you found some inspiration in that intro talk I gave. These days I'm working on pretty esoteric things, and can feel unmoored from the simple and powerful motivations which brought me here in the first place -- it's touching and encouraging to get some evidence that I've had a tangible impact on people.

I can give a sense of my investment, though I'm obviously an unusual case in multiple ways. I'm a coauthor on the report but I'm not an ARC researcher, and my role as a coauthor was primarily to try to make it more likely that the report would be accessible to a broader audience, which involved making sure my own "dumb questions" were answered in the report.

I kept time logs, and the whole project of coauthoring the report took me ~100 hours. By the end I had one "seed" of an ELK idea but unfortunately didn't flesh it out because other work/life things were pretty hectic. Getting to this "seed" was <30 min of investment.

I think if I had started with the report in hand, it would have taken me ~10 hours to read it carefully enough and ask enough "dumb questions" to get to the point of having the seed of an idea about as good as that idea, and then another ~10 hours to flesh it out into an algorithm + counterexample. I think the probability I'd have won the $5000 prize after that investment is ~50%, making the expected investment 40h. I think there's a non-trivial but not super high chance I'd have won larger prizes, so the $ / hours ratio is significantly better in expectation than $125/hr (since the ceiling for the larger prizes is so much higher).
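For concreteness, the back-of-the-envelope arithmetic above can be written out as a short script. All numbers are the rough estimates stated in the comment (not measured data), and the variable names are just illustrative:

```python
# Sketch of the expected-value estimate above, using the rough numbers from the text.
hours_to_seed = 10        # careful read + "dumb questions" to reach a seed idea
hours_to_flesh_out = 10   # turning the seed into an algorithm + counterexample
p_win = 0.5               # estimated chance this effort wins the $5000 prize
prize = 5000              # base prize in dollars

total_hours = hours_to_seed + hours_to_flesh_out   # 20 hours of actual work
expected_hours_per_win = total_hours / p_win       # 40 hours in expectation per prize won
dollars_per_hour = prize / expected_hours_per_win  # the $125/hr floor mentioned above

print(total_hours, expected_hours_per_win, dollars_per_hour)  # 20 40.0 125.0
```

Since the larger prizes raise the payout without raising the hours much, the expected $/hour is higher than this $125/hr floor.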

My background: 

  • I have a fairly technical background, though I think the right way to categorize me is as "semi-technical" or "technical-literate." I did computer science in undergrad and enjoyed it / did well, but my day-to-day work mainly involves writing. I can do simple Python scripting. I can sometimes, slowly and painfully, do the kinds of algorithms problem sets I did quickly in undergrad.
  • Four years ago I wrote this to explain what I understood of Paul's research agenda at the time.
  • I've been thinking about AI alignment a lot over the last year, and especially have the unfair advantage of getting to talk to Paul a lot.  With that said, I didn't really know much or think much about ELK specifically (which I consider pretty self-contained) until I started writing the report, which was late Nov / early Dec.

ARC would be excited for you to send a short email to elk@alignmentresearchcenter.org with a few bullet points describing your high level ideas, if you want to get a sense for whether you're on the right track / whether fleshing them out would be likely to win a prize.

I was imagining Sycophants as an outer alignment failure, assuming the model is trained with naive RL from human feedback.

Not intended to be expressing a significantly shorter timeline; 15-30 years was supposed to be a range of "plausible/significant probability" which the previous model also said (probability on 15 years was >10% and probability on 30 years was 50%). Sorry that wasn't clear!

(JTBC I think you could train a brain-sized model sooner than my median estimate for TAI, because you could train it on shorter horizon tasks.)

Ah yeah, that makes sense -- I agree that a lot of the reason for low commercialization is local optima, and also agree that there are lots of cool/fun applications that are left undone right now.

To clarify, we are planning to seek more feedback from people outside the EA community on our views about TAI timelines, but we're seeing that as a separate project from this report (and may gather feedback from outside the EA community without necessarily publicizing the report more widely).

Finally, have you talked much to people outside the alignment/effective altruism communities about your report? How have reactions varied by background? Are you reluctant to publish work like this broadly? If so, why? Do you see risks of increasing awareness of these issues pushing unsafe capabilities work?


I haven't engaged much with people outside the EA and AI alignment communities, and I'd guess that very few people outside these communities have heard about the report. I don't personally feel sold that the risks of publishing this type of analysis more broadly (in terms of potentially increasing capabilities work) outweigh the benefits of helping people better understand what to expect with AI and giving us a better chance of figuring out if our views are wrong. However, some other people in the AI risk reduction community who we consulted (TBC, not my manager or Open Phil as an institution) were more concerned about this, and I respect their judgment, so I chose to publish the draft report on LessWrong and avoid doing things that could result in it being shared much more widely, especially in a "low-bandwidth" way (e.g. just the "headline graph" being shared on social media).

Thanks! I'll answer your cluster of questions about takeoff speeds and commercialization in this comment and leave another comment responding to your questions about sharing my report outside the EA community.

Broadly speaking, I do expect that transformative AI will be foreshadowed by incremental economic gains; I generally expect gradual takeoff, meaning I would bet that at some point growth will be ~10% per year before it hits 30% per year (which was the arbitrary cut-off for "transformative" used in my report). I don't think it's necessarily the case; I just think it'll probably work this way. On the outside view, that's how most technologies seem to have worked. And on the inside view, it seems like there are lots of valuable-but-not-transformative applications of existing models on the horizon, and industry giants + startups are already on the move trying to capitalize.

My views imply a roughly ~10% probability that the compute to train transformative AI would be affordable in 10 years or less, which wouldn't really leave time for this kind of gradual takeoff. One reason it's a pretty low number is because it would imply sudden takeoff and I'm skeptical of that implication (though it's not the only reason -- I think there are separate reasons to be skeptical of the Lifetime Anchor and the Short Horizon Neural Network anchor, which drive short timelines in my model).

I don't expect that several generations of more powerful successors to GPT-3 will be developed before we see significant commercial applications of GPT-3; I expect commercialization of existing models and scaleup to larger models to be happening in parallel. There are already various applications online, e.g. AI Dungeon (based on GPT-3), TabNine (based on GPT-2), and this list of other apps. I don't think that evidence OpenAI was productizing GPT-3 would shift my timelines much either way, since I already expect them to be investing pretty heavily in this.

Relative to the present, I expect the machine learning industry to invest a larger share of its resources going forward into commercialization, as opposed to pure R&D: before this point a lot of the models studied in an R&D setting just weren't very useful (with the major exception of vision models underlying self-driving cars), and now they're starting to be pretty useful. But at least over the next 5-10 years I don't think that would slow down scaling / R&D much in an absolute sense, since the industry as a whole will probably grow, and there will be more resources for both scaling / R&D and commercialization.
