Wei Dai

The problem of motivated reasoning is in some ways much deeper than the trolley problem.

The motivation behind motivated reasoning is often to make ourselves look good (in order to gain status/power/prestige). Much of the problem seems to come from not consciously acknowledging this motivation, and therefore not being able to apply System 2 to check for errors in the subconscious optimization.

My approach has been to acknowledge that wanting to make myself look good may be part of my real or normative values (something like what I would conclude my values are after solving all of philosophy). Since I can't rule that out for now (and also because it's instrumentally useful), I think I should treat it as part of my "interim values", and consciously optimize for it along with my other "interim values". Then if I'm tempted to do something that makes me look good at a cost to my other values, or that is counterproductive even on its own terms, I'm more likely to ask myself, "Do I really want to do this?"

BTW, I'm curious what courses you teach, and whether (and how much) you tell your students about motivated reasoning or subconscious status motivations when discussing ethics.

The CCP's current appetite for AGI seems remarkably small, and I expect them to be more worried that an AGI race would leave them in the dust (and/or put their regime or their lives at risk) than excited about the opportunity such a race provides.

Yeah, I also tried to point this out to Leopold on LW and via Twitter DM, but have gotten no response so far. It confuses me that he seems to completely ignore the possibility of international coordination, which is the obvious alternative to what he proposes, and which others must also have brought up to him in private discussions.

But we’re so far away from having that alternative that pining after it is a distraction from the real world.

For one thing, we could try to make OpenAI/SamA toxic to invest in or do business with, and hope that other AI labs either already have better governance / safety cultures, or are greatly incentivized to improve on those fronts. If we (EA as well as the public in general) give him a pass (treat him as a typical/acceptable businessman), what lesson does that convey to others?

I should add that there may be a risk of over-correcting (focusing too much on OpenAI and Sam Altman); we shouldn't forget about the other major AI labs and how to improve their transparency, governance, safety cultures, etc. This project (Zach Stein-Perlman's AI Lab Watch) seems like a good start, if anyone is looking for a project to support or contribute ideas to.

Answer by Wei Dai

I'm also concerned about many projects having negative impact, but think there are some with robustly positive impact:

  1. Making governments and the public better informed about AI risk, including e.g. what x-safety cultures at AI labs are like and the true state of alignment progress. Geoffrey Irving is doing this at UK AISI and recruiting, for example.
  2. Trying to think of important new arguments/considerations, for example a new form of AI risk that nobody has considered, or new arguments for some alignment approach being likely or unlikely to succeed. (But take care not to be overconfident or to make others overconfident.)

Agreed with the general thrust of this post. I'm trying to do my part, despite a feeling of "PR/social/political skills are so far from what I think of as my comparative advantage. What kind of a world am I living in, that I'm compelled to do these things?"

Those low on the spectrum tend to shape the incentives around them proactively to create a culture that rewards what they don’t want to lose about their good qualities.

What percentage of people do you think fall into this category? Any examples? And why are we so bad at identifying such people ahead of time, often handing power to the easily corrupted instead?

#5 seems off to me. I don’t know whether OpenAI uses nondisparagement agreements;

Details about OpenAI's nondisparagement agreements have come out.

Unlike FTX, OpenAI has now had a second wave of resignations in protest of insufficient safety focus.

Personally, I think fascism should be more upsetting than woke debate!

I'm not very familiar with Reactionary philosophy myself, but I was suspicious of your use of "fascism" here, so I asked Copilot (based on GPT-4), and it answered:

As an AI, I don’t form personal opinions. However, I can share that Reactionary philosophy and Fascism are distinct ideologies, even though they might share some common elements such as a critique of modernity and a preference for traditional social structures.

Fascism is typically characterized by dictatorial power, forcible suppression of opposition, and strong regimentation of society and of the economy which is not necessarily present in Reactionary philosophy. Reactionaries might advocate for a return to older forms of governance, but this does not inherently involve the authoritarian aspects seen in Fascism.

(Normally I wouldn't chime in on some topic I know this little about, but I suspect others who are more informed might fear speaking up and getting associated with fascism in other people's minds as a result.)

Also, I'm not Scott, but I can share that I'm personally upset with wokeness, not because of how it changed debate, but because of more significant harms to my family and the community we live in (which I described in general terms in this post), to the extent that we're moving halfway across the country to be in a more politically balanced area, where hopefully it has less influence. (Not to mention damage to other institutions I care about, such as academia and journalism.)

(Yes, that is melodramatic phrasing, but I am trying to shock people out of what I think is complacency on this topic.)

I'm not entirely sure what you're referring to by "melodramatic phrasing", but if this is an excuse for using "fascism" to describe "Reactionary philosophy" in order to manipulate people's reactions to it and/or suppress dissent (I've often seen "racism" used this way elsewhere), I think I have to stand against that. If everyone excused themselves from following good discussion norms whenever they felt others were complacent about something, that would be a recipe for disaster.
