Software engineer since 2010. Left Google in fall '21, now getting into independent AI alignment research.
Habryka, I appreciate you sharing your outputs. Do you have a few minutes to follow up with a brief explanation of your models? A rough/incomplete explanation is fine. But it would help to know a bit more about what you've seen with government-funded research etc. that makes you think this would be net-negative for the world.
Judging by the voting and comments so far (both here and on the LessWrong crosspost), my sense is that many here support this effort, but some definitely have concerns. A few of the concerns are rooted in hardcore skepticism about academic research that I'm not sure is compatible with responding to the RfI at all. Many concerns, though, seem to be about this generating vague NSF grants that are in the name of AI safety but don't actually contribute to the field.
For these latter concerns, I wonder whether we could resolve them by limiting the scope of topics in our NSF responses or giving them enough specificity. For example, what if we convinced the NSF that they should make grants only for mechanistic interpretability projects like the Circuits Thread? This is an area that most researchers in the alignment community seem to agree is useful; we just need a lot more people doing it to make substantial progress. And maybe there is less room to go adrift or mess up this kind of concrete, empirical research compared to some of the more theoretical research directions.
It doesn't have to be just mechanistic interpretability, but my point is, are there ways we could shape or constrain our responses to the NSF like this that would help address your concerns?
I agree that we need to take care with high-fidelity idea transmission, and there is some risk of diluting the field. But I think the reasonable chance of this spurring more good research in AI safety is worth it, even if some money will also be wasted.
One thing that's interesting in the RfI is that it links to something called THE NATIONAL ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT STRATEGIC PLAN: 2019 UPDATE. This PDF outlines a federal committee's strategic plan for dealing with AI. Strategy #4 is "Ensure the Safety and Security of AI Systems", and it says a lot of the "right things". For example, it includes discussion of emergent behavior, goal misspecification, explainability/transparency, and long-term AI safety and value alignment. Whether this will translate into useful action isn't certain, but it's heartening to see some acknowledgment of AI concerns from the US government besides just "develop it before China does".
As for the current funding situation in the EA/AI risk community, I have also heard about this issue of there being too much funding for not enough good researchers/applicants right now. I don't think we should get too used to this dynamic, though. The situation could easily reverse in a short time if awareness of AI risk causes a wave of new research interest, or if 80,000 Hours, the AGI Safety Fundamentals Curriculum, AI Safety Camp, and related programs are able to bring more people into the field. So just because we have a funding glut now doesn't mean we should assume it will continue through 2023, the time period this NSF RfI pertains to.
There are two governance-related proposals in the second EA megaprojects thread. One is to create a really large EA-oriented think tank. The other is essentially EA lobbying, i.e. to put major funding behind political parties and candidates who agree to take EA concerns seriously.
Making one of these megaprojects a reality could get officials in governments to take AGI more seriously and/or get it more into the mainstream political discourse.
Indeed, ai.gov doesn't have even a single mention of the term "AGI".
Andrew Yang made transformative AI a fairly central part of his 2020 presidential campaign. To the OP's point though, I don't recall him raising any alarms about the existential risks of AGI.
Albert Einstein also comes to mind as an agreeable generator. I haven't read his biography or anything, but based on the collage of stories I've heard about him, he never seemed like a very disagreeable person but obviously generated important new ideas.
Dr. Greger from NutritionFacts.org also seems like an agreeable generator. Actually, he may be disagreeable in that he's not shy about pointing out flaws in studies and others' conceptions, but he does it in an enthusiastic, silly, and not particularly abrasive way.
It's interesting that some people may disagree often yet not do so in a disagreeable manner.
Thanks for sharing your expertise and in-depth reply!
Thanks, Linch. I didn't realize I might be treading near information hazards. It's good to know and an interesting point about the pros and cons of having such conversations openly.