How can we trust the findings of EA organisations? It is a genuine possibility that I will change the entire course of my life as a result of the information on 80k hours. I guess many of you already have. Have you checked all of their reasoning? What percentage ought one to check? Or do you trust someone else to have done that "due diligence"?
It's not enough to say they are transparent and seem honest - I know plenty of misguided, transparent, honest people. The issue is that EA organisations might be wrong, not in what they don't know (we cannot avoid being wrong in what we don't know) but in what they do - like a mistaken mathematical proof, their logic might be flawed or their sums might be off. This, by our own logic, would likely have disastrous results.
Frankly, someone needs to be checking their notes, and I am not skilled enough to do it, nor do I want to. I have yet to see this done in regard to, say, 80k hours.
With this in mind, I can imagine three solutions:
Firstly, some sort of independent auditing body. They could read the work of EA organisations, check whether the logic holds, flag areas where decisions seem arbitrary, and so on. We would be paying someone whose main job is to be really on top of this stuff and to tell us if they found anything worrying. Arguably this forum kind of does this job, though A) we are all tremendously biased, and B) are people *really* checking the minutiae? I am not.
Secondly, multiple organisations independently asking the same questions. What if there were another 80k hours (called, say, "Nine Years") which didn't interact with them but sought answers to the same problems? "Nine Years" could publish its research, and we could then read both summaries and investigate the areas where they differ.
Thirdly, publish papers setting out our explanations as if they were mathematical (perhaps in philosophy journals). Perhaps this already happens (if this post takes off I might research it more), but you could publish rigid, testable explanations of the theories which undergird EA as an ideology. Well-being, for instance, seems very poorly defined. I'll explain more if people are interested (read Deutsch's The Beginning of Infinity), but suffice it to say that to avoid staying wrong you want your claims to be definite enough that they can be shown to be wrong, so that you can change them. Is our ideology falsifiable? Sometimes EA seems very vague to me in its explanatory underpinnings. If your explanations can vary easily, it's hard to be wrong, and if you're never wrong, you never get better. I don't know whether journals are the way to go, but they seemed the clearest way to suggest becoming more rigid.
Caveats
I do not know enough about EA - I've read about 20 hours of it in my life. Perhaps mechanisms like these already exist, or you have good reason not to require them.
I recently left religion and for that reason would like to know that I am not fooling myself here also. "Trust EA organisations because they are good" doesn't hold much water since the logic applies elsewhere - "Trust the Church because it is good"?
Summary
I think it would be good to have a mechanism for ensuring that we are not fooling ourselves here. EA redirects a huge number of person-hours, and flaws in it could be catastrophic. I don't know what the right mechanisms are, but I have a few suggestions above and am interested in your own ideas, or in criticisms of the ones here.
An alignment arms race is only bad if there is concomitant capabilities development that would make a wrong alignment protocol counterproductive. Different approaches to alignment can lead to insights into capabilities, and that is something to be concerned about, but that concern is already captured in analyses of capabilities arms-race scenarios.
If there are two or more alignment agencies, but only one of their approaches can be implemented in advanced AI systems as they are developed, each would race to complete its alignment agenda before the other agencies could complete theirs. This rushing could be especially bad if any of them fails to take the time to verify that its approach will actually align AI as intended. In addition, if the competition becomes hostile enough, alignment agencies won't check each other's work in good faith, and in general there won't be enough trust for anyone to let anyone else check the work they've done on alignment.
If one or more of these agencies racing to the finish line doesn't let anyone check its work, and its strategy is invalid or unsound, then implementing that strategy in an AI system would fail to produce alignment when it was expected to. In other words, because of mistakes made along the way, what looks like an alignment competition inadvertently becomes a misalignment race.
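To make the "misalignment race" worry a bit more concrete, here is a minimal toy model. All the numbers and the `expected_outcome` function are invented purely for illustration, not drawn from any actual analysis; the only point is that if external checking catches unsound schemes before deployment, even a modest chance of being wrong can flip the expected value of racing ahead unchecked from positive to negative.

```python
# Toy expected-outcome model of an "alignment race".
# All probabilities and payoffs below are made up for illustration only.

def expected_outcome(p_sound: float, verified: bool) -> float:
    """Crude expected 'safety value' of deploying an alignment scheme.

    p_sound:  probability the scheme is actually valid and sound.
    verified: whether other agencies were allowed to check the work;
              here verification is assumed to catch an unsound scheme
              before it is deployed.
    """
    GOOD = 1.0     # sound scheme deployed: aligned AI
    BAD = -10.0    # unsound scheme deployed while everyone believes it works
    NEUTRAL = 0.0  # unsound scheme caught in review: nothing deployed yet

    if verified:
        return p_sound * GOOD + (1 - p_sound) * NEUTRAL
    return p_sound * GOOD + (1 - p_sound) * BAD

# A rushed, unchecked scheme with a 20% chance of being unsound:
print(expected_outcome(p_sound=0.8, verified=False))  # 0.8*1 + 0.2*(-10) = -1.2
print(expected_outcome(p_sound=0.8, verified=True))   # 0.8*1 + 0.2*0    =  0.8
```

Under these made-up assumptions, the same scheme goes from net-positive to net-negative in expectation simply because no one was allowed to check it, which is the sense in which the competition becomes a misalignment race.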
I'm not saying competition in AI alignment is either good or bad by default. What I am saying is that there appear to be particular conditions under which competition in AI alignment would make things worse, and that those conditions should be avoided. To summarize, it appears to me at least some of them are:
1. Competition in AI alignment becomes a 'race.'
2. One or more agencies in AI alignment themselves become untrustworthy.
3. Even if in principle all AI alignment agencies should be able to trust each other, in practice they end up mistrusting each other.