This other Ryan Greenblatt is my old account[1]. Here is my LW account.
Account lost to the mists of time and expired university email addresses.
This argument neglects improvements in speed and capability, right? Even if parallel labor and compute are complements, shouldn't we expect it to be possible for increased speed or capability to substitute for compute? (It just isn't possible for AI companies to buy much of this.)
(I'm not claiming this is the biggest problem with this analysis, just noting that it is a problem.)
That might be true, but it doesn't make it not a strawman. I'm sympathetic to thinking it's implausible that Mechanize would be the best thing to do on altruistic grounds even if you share views like those of the founders. (Because there is probably something more leveraged to do, and because some weight should go on cooperativeness considerations.)
From my perspective, a large part of the point of safety policies is that people can comment on the policies in advance and provide some pressure toward better policies. If policies are changed at the last minute, then the world may not have time to understand the change and respond before it is too late.
So, I think it's good to create an expectation/norm that you shouldn't substantially weaken a policy right as it is being applied. That's not to say that a reasonable company would never do this, just that I think it should by default be considered somewhat bad, particularly if there isn't a satisfactory explanation given. In this case, I find the object-level justification for the change somewhat dubious (at least for the AI R&D trigger), and there is also no explanation of why this change was made at the last minute.
Hmm, $10k is maybe too small a size to be worth it, but I might be down to do:
I'd like to bet on a milestone that triggers before it's too late for human intervention if possible, so I've picked this research engineer milestone. We'd presumably have to operationalize further. I'm not sure whether it's worth the time to try to operationalize enough that we could do a bet.
Somewhat relatedly, Anthropic quietly weakened its security requirements about a week ago as I discuss here.
Yes, I'm aware of more formal models with estimates based on expert surveys. Sadly, I don't think this work is public yet.