Wei Dai





The main alternative to truth-seeking is influence-seeking. EA has had some success at influence-seeking, but as AI becomes the locus of increasingly intense power struggles, retaining that influence will become more difficult, and it will tend to accrue to those who are most skilled at power struggles.

Thanks for the clarification. Why doesn't this imply that EA should get better at power struggles (e.g. by putting more resources into learning/practicing/analyzing corporate politics, PR, lobbying, protests, and the like)? I feel like maybe you're adopting the framing of "comparative advantage" too much in a situation where the idea doesn't work well (because the situation is too adversarial / not cooperative enough). It seems a bit like a country, after suffering a military defeat, saying "We're better scholars than we are soldiers. Let's pursue our comparative advantage and reallocate our defense budget into our universities."

Rather, I think its impact will come from advocating for not-super-controversial ideas, but it will be able to generate them in part because it avoided the effects I listed in my comment above.

This part seems reasonable.

I've also updated over the last few years that having a truth-seeking community is more important than I previously thought - basically because the power dynamics around AI will become very complicated and messy, in a way that requires more skill to navigate successfully than the EA community has. Therefore our comparative advantage will need to be truth-seeking.

I'm actually not sure about this logic. Can you expand on why EA having insufficient skill to "navigate power dynamics around AI" implies "our comparative advantage will need to be truth-seeking"?

One problem I see is that "comparative advantage" is not straightforwardly applicable here, because the relevant trade or cooperation (needed for the concept to make sense) may not exist. For example, imagine that EA's truth-seeking orientation causes it to discover and announce one or more politically inconvenient truths (e.g. there are highly upvoted posts about these topics on EAF), which in turn causes other less truth-seeking communities to shun EA and refuse to pay attention to its ideas and arguments. In this scenario, if EA also doesn't have much power to directly influence the development of AI (as you seem to suggest), then how does EA's truth-seeking benefit the world?

(There are worlds in which it takes even less for EA to be shunned, e.g., if EA merely doesn't shun others hard enough. For example, there are currently people pushing for EA to "decouple" from LW/rationality, even though there is very little politically incorrect discussion happening on LW.)

My own logic suggests that too much truth-seeking isn't good either. Would love to see how to avoid this conclusion, but currently can't. (I think the optimal amount is probably a bit higher than the current amount, so this is not meant to be an argument against more truth-seeking at the current margin.)

You probably didn't have someone like me in mind when you wrote this, but it seems a good opportunity to write down some of my thoughts about EA.

On 1, I think despite paying lip service to moral uncertainty, EA encourages too much certainty in the normative correctness of altruism (and more specific ideas like utilitarianism), perhaps attracting people like SBF with too much philosophical certainty in general (such as about how much risk aversion is normative), or even causing such general overconfidence (by implying that philosophical questions in general aren't that hard to answer, or by suggesting how much confidence is appropriate given a certain amount of argumentation/reflection).

I think EA also encourages too much certainty in descriptive assessment of people's altruism, e.g., viewing a philanthropic action or commitment as directly virtuous, instead of an instance of virtue signaling (that only gives probabilistic information about someone's true values/motivations, and that has to be interpreted through the lenses of game theory and human psychology).

On 25, I think the "safe option" is to give people information/arguments in a non-manipulative way and let them make up their own minds. If some critics are using things like social pressure or rhetoric to manipulate people into being anti-EA (as you seem to be implying - I haven't looked into it myself), then that seems bad on their part.

On 37, where has EA messaging emphasized downside risk more? A text search for "downside" and "risk" on https://www.effectivealtruism.org/articles/introduction-to-effective-altruism both came up empty, for example. In general it seems like there has been insufficient reflection on SBF and also AI safety (where EA made some clear mistakes, e.g. with OpenAI, and generally contributed to the current AGI race in a potentially net negative way, but seems to have produced no public reflections on these topics).

On 39, seeing statements like this (which seems overconfident to me) makes me more worried about EA, similar to how my concern about each AI company is inversely related to how optimistic it is about AI safety.

The problem of motivated reasoning is in some ways much deeper than the trolley problem.

The motivation behind motivated reasoning is often to make ourselves look good (in order to gain status/power/prestige). Much of the problem seems to come from not consciously acknowledging this motivation, and therefore not being able to apply system 2 to check for errors in the subconscious optimization.

My approach has been to acknowledge that wanting to make myself look good may be a part of my real or normative values (something like what I would conclude my values are after solving all of philosophy). Since I can't rule that out for now (and also because it's instrumentally useful), I think I should treat it as part of my "interim values", and consciously optimize for it along with my other "interim values". Then if I'm tempted to do something to look good, at a cost to my other values or perhaps counterproductive on its own terms, I'm more likely to ask myself "Do I really want to do this?"

BTW I'm curious what courses you teach, and whether / how much you tell your students about motivated reasoning or subconscious status motivations when discussing ethics.

The CCP's current appetite for AGI seems remarkably small, and I expect them to be more worried that an AGI race would leave them in the dust (and/or put their regime at risk, and/or put their lives at risk), than excited about the opportunity such a race provides.

Yeah, I also tried to point this out to Leopold on LW and via Twitter DM, but no response so far. It confuses me that he seems to completely ignore the possibility of international coordination, as that's the obvious alternative to what he proposes, that others must have also brought up to him in private discussions.

But we’re so far away from having that alternative that pining after it is a distraction from the real world.

For one thing, we could try to make OpenAI/SamA toxic to invest in or do business with, and hope that other AI labs either already have better governance / safety cultures, or are greatly incentivized to improve on those fronts. If we (EA as well as the public in general) give him a pass (treat him as a typical/acceptable businessman), what lesson does that convey to others?

I should add that there may be a risk of over-correcting (focusing too much on OpenAI and Sam Altman), and we shouldn't forget about other major AI labs, how to improve their transparency, governance, safety cultures, etc. This project (Zach Stein-Perlman's AI Lab Watch) seems a good start, if anyone is interested in a project to support or contribute ideas to.

Answer by Wei Dai

I'm also concerned about many projects having negative impact, but think there are some with robustly positive impact:

  1. Making governments and the public better informed about AI risk, including e.g. what x-safety cultures at AI labs are like, and the true state of alignment progress. Geoffrey Irving is doing this at UK AISI and recruiting, for example.
  2. Trying to think of important new arguments/considerations, for example a new form of AI risk that nobody has considered, or new arguments for some alignment approach being likely or unlikely to succeed. (But take care to not be overconfident or cause others to be overconfident.)

Agreed with the general thrust of this post. I'm trying to do my part, despite a feeling of "PR/social/political skills are so far from what I think of as my comparative advantage. What kind of a world am I living in, that I'm compelled to do these things?"

Those low on the spectrum tend to shape the incentives around them proactively to create a culture that rewards what they don’t want to lose about their good qualities.

What percent of people do you think fall into this category? Any examples? Why are we so bad at distinguishing such people ahead of time and often handing power to the easily corrupted instead?
