Peter S. Park

508 · Joined Mar 2022


No, "Women and Effective Altruism" by Keerthana preceded "Brainstorming ways to make EA safer and more inclusive" by Richard.

I agree that in many cases, when bad-faith critics say "EA should not do X" and we stop doing X, the world would actually become worse off.

But making SBF the face of the EA movement was a really bad decision, especially given that he was unilaterally gambling with the whole movement's credibility.

There are robust lessons to be learned in this saga, which will allow us EAs to course-correct and prevent future catastrophic outcomes (to our movement and in general).

I think we EAs need to increasingly prioritize speaking up about concerns like the ones Habryka mentioned.

Even when positive in-group feelings, the fear of ostracism, and uncertainty or risk aversion influence us not to bring up such concerns, we should fight back against this urge, because the concerns, if valid, will likely grow larger and larger until they blow up.

There is very high EV in course correction before the catastrophic failure point.

Strongly agree with all of these points.

On point 2: The EA movement urgently needs more earners-to-give, especially now. One lesson that I think is correct, however, is that we should be wary of making any one billionaire donor the face of the EA movement. The downside risk—a loss of credibility for the whole movement due to unknown information about the billionaire donor—is generally too high.

Mostly it was about Point 3. I think an unconditional norm of only accepting anonymous donations above a certain threshold would be too blunt.

I think a version of Point 3 I would agree with is to have high-contributing donor names not be publicized as a norm (with some possible exceptions). I think this captures most of the benefits of an anonymous donation, and most potential donors who might not be willing to make an anonymous donation would be willing to make a discreet, non-publicized donation.

I strongly agree with the spirit of the reforms being suggested here (although I might have some different opinions on how to implement them). We need large-scale reforms of the EA community's social norms to prevent future risks to movement-wide credibility.

  1. Strongly agree. The fact that net upvotes are the only concrete metric by which EA Forum and LessWrong posts are judged has indeed been suboptimal for one of EA's main goals: to reflect on and adapt our previous beliefs based on new evidence. Reforms designed to increase engagement with controversial posts would be very helpful for our pursuit of this goal. (Disclaimer: Most of my EA Forum posts would rank highly on the "controversial" scale, in that many people upvote and many people downvote them, and the top comment is usually critical and has a lot of net upvotes. I think that we EAs need to increasingly prioritize both posting and engaging with controversial arguments that run contrary to status-quo beliefs, even if it's hard! This is especially true for LessWrong, which arguably doubles as a scientific venue for AI safety research in addition to an EA-adjacent discussion forum.)
  2. Agree, although I think EAs should also be more willing to write and engage with controversial arguments non-anonymously.
  3. Strongly agree in spirit. While a norm of unconditionally refusing non-anonymous donations above a certain threshold might be too blunt, I do think we need better risk management around tying our movement's credibility to a single charismatic billionaire, or a single charismatic individual in general. Given how important our work is, we probably need better risk-management practices in general. (And we EAs already care earnestly about this! I think this is a question not of earnest desire but of optimal implementation.) I also think that many billionaires would actually prefer to donate anonymously or less publicly, because they agree with most, but not all, of EA's principles. Realistically, leaving room for case-by-case decision-making seems helpful.

Thank you so much for your excellent post on the strategy of buying time, Thomas, Akash, and Olivia! I strongly agree that this strategy is necessary, neglected, and tractable.

For practical ideas on how to achieve this (and, in the comments section, a productive debate on the risks of low-quality outreach efforts), please see my related earlier forum post:

Thanks so much for your detailed response, Charles! I really appreciate your honest feedback.

It seems likely that our cruxes are the following. I think that (a) we probably cannot predict the precise moment the AGI becomes agentic and/or dangerous, (b) we probably won't have a strong credence that a specific alignment plan will succeed (say, to the degree of solving ELK or interpretability in the short time we have), and (c) AGI takeoff will be slow enough that secrecy can be a key difference-maker in whether we die or not.

So, I expect we will have alignment plan numbers 1, 2, 3, and so on. We will try alignment plan 1, but it will probably not succeed (and hopefully we can see signs of it not succeeding early enough that we shut it down and try alignment plan 2). If we can safely empirically iterate, we will find an alignment plan N that works.
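The iterate-and-shut-down procedure above can be sketched as a simple loop. This is only an illustrative toy: the predicates `shows_early_failure` and `succeeds` are hypothetical placeholders for hard empirical judgments, not real APIs.

```python
def first_working_plan(plans, shows_early_failure, succeeds):
    """Try alignment plans in order; abandon any plan that shows
    early warning signs of failure, and return the first one that
    works. Both predicates are illustrative stand-ins for difficult
    empirical judgments."""
    for n, plan in enumerate(plans, start=1):
        if shows_early_failure(plan):
            continue  # shut it down and move on to plan n + 1
        if succeeds(plan):
            return n, plan
    return None  # no plan worked: the dangerous case

# Toy usage: plans 1 and 2 show early failure, plan 3 succeeds.
result = first_working_plan(
    plans=["plan-1", "plan-2", "plan-3"],
    shows_early_failure=lambda p: p in ("plan-1", "plan-2"),
    succeeds=lambda p: p == "plan-3",
)
print(result)  # → (3, 'plan-3')
```

The crux, of course, is whether we can iterate *safely*: the loop is only survivable if early warning signs are detectable before a plan fails catastrophically.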

This is risky and we could very well die (although I think the probability is not unconditionally 100%). This is why I think not building AGI is by far the best strategy. (Corollary: I place a lot of comparative optimism on AI governance and coordination efforts.) The above discussion is conditional on trying to build an aligned AGI.

Given these cruxes, I think it is plausible that a Manhattan-Project-esque shift in research norms could preserve the secrecy value of our AI safety plans while keeping ease of research high.

This would definitely be a useful tag for a hostile AI. 

Moreover, the fact that everyone can access tagged posts right now means there are many ways tagged content could still become accessible to an AGI, even if the AGI corporation cooperates with us.

Secrecy is likely to be kept only if the information is known to a small set of people (Conjecture's infohazard document, for example, discusses this well).

Thanks so much, Michael, for your detailed and honest feedback! I really appreciate your time.

I agree that both threat models are real: AI safety research can lose its value when it is not kept secret, and humanity's catastrophic and extinction risk increases if AI capabilities advance faster than valuable AI safety research.

Regarding your point that the Internet has significantly helped accelerate AI safety research, I would say two things. First, what matters is the Internet's effect on valuable AI safety research. If much of the value in AI safety plans requires that they be kept secret from the Internet (e.g., one's plan in a rock-paper-scissors-type interaction), then the current Internet-forum-based research norms may not be increasing the rate of value generation by much. In fact, they may plausibly be decreasing it, in light of the discussion in my post. So, we should vigorously investigate how much of the value in AI safety plans in fact requires secrecy.
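To make the rock-paper-scissors analogy concrete, here is a minimal game-theory sketch (the strategy mixes are invented for illustration): a secret, uniformly random strategy is unexploitable, while the same player with a leaked, biased strategy gets exploited by a best-responding adversary.

```python
# Payoff to the row player in rock-paper-scissors: +1 win, 0 tie, -1 loss.
MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

def expected_payoff(my_mix, their_mix):
    """Expected payoff of my mixed strategy against theirs."""
    return sum(my_mix[m] * their_mix[t] * payoff(m, t)
               for m in MOVES for t in MOVES)

def best_response(known_mix):
    """If my mixed strategy leaks, the adversary plays the pure
    strategy that minimizes my expected payoff."""
    return min(
        ({t: 1.0 if t == move else 0.0 for t in MOVES} for move in MOVES),
        key=lambda their: expected_payoff(known_mix, their),
    )

secret_uniform = {m: 1 / 3 for m in MOVES}           # unexploitable even if leaked
biased = {"rock": 0.5, "paper": 0.3, "scissors": 0.2}  # exploitable once leaked

print(expected_payoff(secret_uniform, best_response(secret_uniform)))  # → 0.0
print(expected_payoff(biased, best_response(biased)))  # negative: exploited
```

The point of the toy model: for this class of plans, the plan's value is not in its cleverness but in the adversary's uncertainty about it, so publishing it destroys the value.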

Second, the fact that AI safety researchers (or, more generally, people in the economy) extensively use the Internet does not falsify the claim that the Internet may be close to net-zero or even net-negative for community innovation. On this view, the Internet is great at enticing people to use it and spend money on it, but not at improving real innovation as measured by productivity gains: it has a redistributive rather than a productive effect on how people spend their time and money. We should therefore expect people (e.g., AI safety researchers) to use the Internet extensively even if it has not increased the rate of real productivity gains.

What matters is the comparison with the counterfactual: if there were an extensive, possibly expensive, and Manhattan-Project-esque change in research norms for the whole community, could ease-of-research be largely kept the same even with secrecy gains? I think the answer may be plausibly “yes.”

How do we estimate the real effect of the Internet on the generation of valuable AI safety research? First, we would need to predict the value of AI safety research, particularly how its value is affected by its secrecy. This effort would be aided by game theory, past empirical evidence of real-world adversarial interactions, and the resolution of scientific debates about what AGI training would look like in the future.

Second, we would need to estimate how much the Internet affects the generation of this value. This effort would be aided by progress studies and other relevant fields of economics and history. 

Edit: Writing to add a link for the case of why the Internet may be overrated for the purposes of real community innovation:
