Edison

Software Engineer
4 karma

Posts
1

Comments
2

This thread is exactly the kind of scrutiny I was hoping for. I graded "open-source fades" as his clearest miss on a live scorecard I built (https://agiscorecard.com), but your point about distillation muddying the picture makes me wonder whether "wrong" is too strong and "right mechanism, wrong conclusion" would be fairer. Curious where you'd land.

Inspired by this post (and the one-year retrospective Rasool linked), I built a continuously-updated version: a live scorecard grading each prediction as evidence comes in — https://agiscorecard.com

Current tally: 3 on track (capability trajectory, scaling pace, capex), 1 graded wrong (open-source fading: DeepSeek V4 and Qwen are now ~3-6 months behind the frontier), 2 still open (AGI-by-2027, The Project). It also puts his 2027 prediction side by side with those from Metaculus, Samotsvety, Hassabis, and the academic survey median.

Reading the thread here, the open-source verdict seems to be the most contested one — huw and JoshYou make points cutting both ways. I'd genuinely welcome pushback on any grading; the whole thing only works if the verdicts survive scrutiny.