I downvoted this because I don't think it's independently valuable or separate enough from your existing posts to merit a new post. It would have been better as a comment on your existing posts (and, as I've said on someone else's post about your reviews, I think we're better off consolidating the discussion in one place).
That said, I think the sentiments expressed here are pretty reasonable, and I would probably have upvoted this in comment form.
They posted about their review of Sinergia on the forum already: https://forum.effectivealtruism.org/posts/YYrC2ZR5pnrYCdSLt/sinergia-ace-top-charity-makes-false-claims-about-helping
I suggest we concentrate discussion there and not here.
Someone on the forum said there were ballpark 70 AI safety roles in 2023.
Just to note that the UK AI Security Institute employs more than 50 technical staff by itself (plus however many non-technical staff; I forget), so that number may be due for an update.
This doesn't seem right to me. Among those concerned with the longer-term future, it's popular to expect that future to be populated by emulated humans, which clearly isn't a continuation of humanity's genetic legacy, so I feel pretty confident that it's something else about humanity that people want to preserve against AI. (I'm not here to defend that particular vision of the future beyond noting that people like Holden Karnofsky have written about it, so it's not exactly niche.)
You say that expecting AI to have worse goals than humans would require studying things like the empirically observed goals of AI systems, and similar – sure, so in the absence of those studies, we should delay our replacement until they can be done. Doing those studies is also undermined by the fact that our current ability to reliably determine what an AI is thinking is pretty poor, and it will only get worse as AIs develop their abilities to strategise and lie. Solving these problems would be a major part of what people are looking for in alignment research, and precisely the kind of thing it seems worth delaying AI progress for.
Another opportunity for me to shill my LessWrong writing posing this question: Should we exclude alignment research from LLM training datasets?
I don't have a lot of time to spend on this, but this post has inspired me to take a little time to figure out whether I can propose or implement some controls (likely: making posts visible to logged-in users only) in ForumMagnum (the software underlying the EA Forum, LW, and the Alignment Forum).
Edit: https://github.com/ForumMagnum/ForumMagnum/issues/10345
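To make the idea concrete, here's a minimal sketch of the kind of check I have in mind, in TypeScript since that's what ForumMagnum is written in. The `Post`/`Viewer` types and the `loggedInOnly` flag are hypothetical illustrations, not ForumMagnum's actual schema or API; a real change would go through its existing permissions/filtering layer (see the issue above).

```typescript
// Hypothetical sketch, not ForumMagnum's real API: a server-side
// visibility filter that hides flagged posts from logged-out readers.

interface Post {
  id: string;
  title: string;
  loggedInOnly: boolean; // hypothetical per-post flag
}

interface Viewer {
  userId: string | null; // null means not logged in
}

// Return only the posts this viewer is allowed to see.
function visiblePosts(posts: Post[], viewer: Viewer): Post[] {
  const isLoggedIn = viewer.userId !== null;
  return posts.filter((post) => isLoggedIn || !post.loggedInOnly);
}

// Example usage
const posts: Post[] = [
  { id: "1", title: "Public post", loggedInOnly: false },
  { id: "2", title: "Members-only post", loggedInOnly: true },
];

console.log(visiblePosts(posts, { userId: null }).map((p) => p.title));
// -> ["Public post"]
console.log(visiblePosts(posts, { userId: "abc" }).map((p) => p.title));
// -> ["Public post", "Members-only post"]
```

The important part is that the filter runs server-side, so a logged-out client never receives the restricted posts at all (which is also what would keep them out of naive scrapes).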
See also:
(Perhaps there should be a forum tag for this issue specifically, idk)
Forgive the self-promotion, but here's a related Facebook post I made: