On Tesla: I don't think training a special model for expensive test cars makes sense. They're not going to invest in a method that isn't scalable. The relevant update will come when AI5 ships (reportedly end of this year), with ~9x the memory. I'd be surprised if they don't solve it on that hardware.
On the broader point about predictions failing: I think these were mostly failures of economic reasoning more than failures of AI progress. AI has made enormous progress on both translation and radiology image analysis. What Hinton and others got wrong wasn't the capability prediction, it was assuming the job consisted entirely of the task AI was getting good at. It turns out radiologists do more than read images, translators do more than translate sentences, and AI ends up complementary rather than substitutive. There's probably some Jevons paradox at play too: cheaper translation means more content gets translated, which means more demand for people who can review AI output, catch cultural nuance, and put their name on the final result.
The benchmarks aren't perfect, but they consistently point to rapid progress: METR time horizons, SWE-bench, GDPval, and so on.
On the 90% prediction: my somewhat conservative view is that AI could write 90%+ of production code this year and will next year. But I don't think this will mean immediate mass unemployment for programmers. The job will initially just shift toward review, specification, and technical direction of AI systems. I think most SWE jobs will look more like a technical PM's job next year.
Edit:
Customer support chat is another area with some applicability, but results are mixed.
That study uses a fine-tuned GPT-3 model.
Tesla has shown that scale doesn't confer the benefits that many (including me) had hoped.
This misses a hardware bottleneck: HW4/AI4 has 16GB RAM, capping model size regardless of training data. Tesla made real progress with v12/v13. We won't get a clean test of the scaling hypothesis for self-driving until they ship better inference hardware.
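For a rough sense of what that memory cap means, here's a back-of-envelope sketch; the usable-memory fraction and weight precisions are my assumptions for illustration, not Tesla figures:

```python
# Rough ceiling on how many parameters fit in on-device memory.
# Assumptions (not from Tesla): weights stored at FP16 or INT8, and only
# about half the RAM is usable for weights (the rest goes to activations,
# camera buffers, and other software).

def max_params(ram_gb: float, bytes_per_param: float, usable_fraction: float = 0.5) -> float:
    """Crude upper bound on the parameter count that fits in ram_gb of memory."""
    return ram_gb * 1e9 * usable_fraction / bytes_per_param

for label, bpp in [("FP16", 2.0), ("INT8", 1.0)]:
    hw4 = max_params(16, bpp)       # HW4/AI4: 16GB
    ai5 = max_params(16 * 9, bpp)   # AI5: ~9x the memory (reported)
    print(f"{label}: HW4 ~{hw4 / 1e9:.1f}B params, AI5 ~{ai5 / 1e9:.0f}B params")
```

Under those assumptions the jump is roughly from a single-digit-billion-parameter model to one an order of magnitude larger, which is why the AI5 generation seems like the real test.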
We have already started to cross into the timeframe where some of the megaphones of AGI fever are proving to be false prophets. Dario Amodei, the CEO of Anthropic, predicted in the first quarter of this year that AI would write 90% of software code by the last quarter of this year. Nothing close to that happened.
I think you're reading too much into self-driving as evidence about AI progress generally. Anyone who's used Opus 4.5 in Claude Code over the past month will tell you Amodei wasn't far off on his coding prediction. The tool's creator reports that 100% of his PRs last month were written by Claude. Amodei may have been off by a few months, and may be too optimistic about the speed of adoption, but I still don't think it was such a bad prediction.
I give the MWI a probability of greater than 0.5 of being correct, but as far as I can tell, there isn't any way to generate more value out of it. There isn't any way to create more branches. You can only choose to be intentional and explicit about creating new identifiable branches, but that doesn't mean you've created more branches. The branching happens regardless of human action.
Someone with a better understanding of this please weigh in.
"...cryptocurrencies makes stopping the funding of terrorists basically impossible."
No. Really, really, no. I could talk a lot more about this, but if you think terrorist groups can manage infosec well enough to overcome concerted attacks by the NSA, or Mossad, or FSB, etc., you're fooling yourself.
"Impossible" might be an exaggeration, but it does seem to make it much easier. That's also what the article you link to suggests. Edit: Are you skeptical because of the on/off ramps, the security of terrorist's computer infrastructure or something else?
Setting aside the misunderstanding of conflating nodes with hash power, this is also not true. Hash power is concentrated, so you'd need to somehow convince the biggest mining groups that they don't care about countries keeping their operations legal, and as we've seen, they do. That means they will continue to embrace KYC/AML regulation, and will do whatever else makes their investments go well, including cooperating with nation-states in almost any way you can imagine.
So far, the only serious KYC/AML happens at the level of centralized exchanges. Nation-states cannot enforce KYC/AML at the level of decentralized exchanges. They can also use chain analysis and put pressure on mining groups within their countries to do KYC/AML or to create address "black lists", but so far there hasn't been much political will for this, and it would probably lead to a big backlash from the crypto community. And this becomes impossible for privacy coins such as Zcash and Monero.
I feel like a number of these could maybe fit under a single very large organization. Namely:
Basically, a big EA research university with forecasting, policy research, and ML/AI safety departments.
I'd also add a non-profit and for-profit startup incubator. I think universities would be much better if they made it possible to try something entrepreneurial without having to fully drop out.
In my experience, EAs tend to be pretty dissatisfied with the higher education system, but I interpreted the muted/mixed response to my post on the topic as a sign that my experience might have been biased, or that despite the dissatisfaction, there wasn't any real hunger for change. Or maybe a sense that change was too intractable.
Though I might also have done a poor job at making the case.
My speculative, cynical, maybe unfair take is that most senior EAs are so enmeshed in the higher education system, and have sunk so much time into succeeding in it, that they're incentivized against doing anything too disruptive that might jeopardize their standing within current institutions. And why change how undergrad education is done if you've already gone through it?
The very quick summary: Japan used to be closed off from the rest of the world until 1853, when the US forced it to open up. This triggered major reforms: the shogunate was overthrown, power was restored to the emperor, and in less than a century Japan went from an essentially medieval economic and societal structure to a modern industrial economy.
I don't know of any books exclusively focused on it, but it's analyzed in Why Nations Fail and Political Order and Political Decay.
Re Tesla: My best guess is that they still need a 5-20x improvement in reliability to match human level, and I don't entirely rule out that they'll manage it with AI4. Hard to get good data on this though. It sounds like they were still finalizing the AI5 chip design until quite recently, and I'm not sure it makes sense to spend training budget on a model that can only run on a handful of cars while they're still hoping to squeeze more out of AI4. There's likely an inference overhang here: they've spent years scaling training data and compute while model size stayed fixed, way past the usual theoretical optimal tradeoff point. Lifting the size constraint will probably yield disproportionate gains.
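To illustrate the overhang intuition, here's a sketch that borrows the rough ~20-tokens-per-parameter rule of thumb from LLM scaling work; whether that constant transfers to driving models is an assumption, and the model sizes below are hypothetical:

```python
# Illustration of the "optimal tradeoff" point: under a Chinchilla-style rule
# of thumb (~20 training tokens per parameter at the compute-optimal point),
# the optimal model size grows with the training-data budget. A model size
# pinned by inference hardware therefore falls further behind the optimum as
# data and compute keep scaling. Constants and sizes here are hypothetical.

TOKENS_PER_PARAM = 20  # rough compute-optimal ratio from LLM scaling work

def compute_optimal_params(training_tokens: float) -> float:
    """Compute-optimal parameter count for a given training-data budget."""
    return training_tokens / TOKENS_PER_PARAM

fixed_params = 1e9  # hypothetical model size capped by on-device memory
for tokens in (2e10, 2e11, 2e12):
    optimal = compute_optimal_params(tokens)
    print(f"{tokens:.0e} tokens: optimal ~{optimal / 1e9:.0f}B params, "
          f"{optimal / fixed_params:.0f}x the fixed {fixed_params / 1e9:.0f}B model")
```

The further past that crossover they've pushed data and compute, the more of the accumulated gains a larger model should be able to pick up once the hardware allows it.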
Re the middle stuff: I think we just disagree on how to weigh various evidence from failed predictions (based on narrow models, older models...), various firsthand reports and more recent benchmark results.
Re the 90% prediction: by "conservative" I meant this is like my 5-10th percentile slowest timeline. I've heard from a number of SWEs that they're already basically not writing code, just instructing and reviewing. I'm also uncertain about adoption speed. I'd put it at >50% chance that among SWEs actually using the latest LLMs and tools, AI writes 90%+ of their code in the first half of this year.