Wiki Contributions


Draft report on existential risk from power-seeking AI

I’m focused, here, on a very specific type of worry. There are lots of other ways to be worried about AI -- and even, about existential catastrophes resulting from AI.

Can you talk about your estimate of the overall AI-related x-risk (see here for an attempt at a comprehensive list), as well as total x-risk from all sources? (If your overall AI-related x-risk is significantly higher than 5%, what do you think are the other main sources?) I think it would be a good idea for anyone discussing a specific type of x-risk to also give their more general estimates, for a few reasons:

  1. It's useful for the purpose of prioritizing between different types of x-risk.
  2. Quantification of specific risks can be sensitive to how one defines categories. For example, one might push some kinds of risks out of "existential risk from misaligned AI" and into "AI-related x-risk in general" by defining the former in a narrow way, thereby reducing one's estimate of it. This would be less problematic (e.g., less likely to give the reader a false sense of security) if one also talked about more general risk estimates.
  3. Different people may be more or less optimistic in general, making it hard to compare absolute risk estimates between individuals. Relative risk levels suffer less from this problem.
Concerns with ACE's Recent Behavior

If there are lots of considerations that have to be weighed against each other, then it seems clearly the case that we should decide things on a case-by-case basis, as sometimes the considerations might weigh in favor of downvoting someone for refusing to engage with criticism, and other times they weigh in the other direction. But this seems inconsistent with your original blanket statement, "I don’t think any person or group should be downvoted or otherwise shamed for not wanting to engage in any sort of online discussion."

About online versus offline, I'm confused why you think you'd be able to convey your model offline but not online, as the bandwidth difference between the two doesn't seem large enough that you could do one but not the other. Maybe it's not just the bandwidth but other differences between the two mediums, but I'm skeptical that offline/audio conversations are overall less biased than online/text conversations. If they each have their own biases, then it's not clear what it would mean if you could convince someone of some idea over one medium but not the other.

If the stakes were higher or I had a bunch of free time, I might try an offline/audio conversation with you anyway to see what happens, but it doesn't seem like a great use of our time at this point. (From your perspective, you might spend hours but at most convince one person, which would hardly make a dent if the goal is to change the Forum's norms. I feel like your best bet is still to write a post to make your case to a wider audience, perhaps putting in extra effort to overcome the bias against it if there really is one.)

I'm still pretty curious what experiences led you to think that online discussions are often terrible, if you want to just answer that. Also are there other ideas that you think are good but can't be spread through a text medium because of its inherent bias?

Concerns with ACE's Recent Behavior

(It seems that you're switching the topic from what your policy is exactly, which I'm still unclear on, to the model/motivation underlying your policy, which perhaps makes sense, as if I understood your model/motivation better perhaps I could regenerate the policy myself.)

I think I may just outright disagree with your model here, since it seems that you're not taking into account the significant positive externalities that a public argument can generate for the audience (in the form of more accurate beliefs, about the organizations involved and EA topics in general, similar to the motivation behind the DEBATE proposal for AI alignment).

Another crux may be your statement "Online discussions are very often terrible" in your original comment, which has not been my experience if we're talking about online discussions made in good faith in the rationalist/EA communities (and it seems like most people agree that the OP was written in good faith). I would be interested to hear what experiences led to your differing opinion.

But even when online discussions are "terrible", that can still generate valuable information for the audience, about the competence (e.g., reasoning abilities, PR skills) or lack thereof of the parties to the discussion, perhaps causing a downgrade of opinions about both parties.

Finally, even if your model is a good one in general, it's not clear that it's applicable to this specific situation. It doesn't seem like ACE is trying to "play private" as they have given no indication that they would be or would have been willing to discuss this issue in private with any critic. Instead it seems like they view time spent on engaging such critics as having very low value because they're extremely confident that their own conclusions are the right ones (or at least that's the public reason they're giving).

Concerns with ACE's Recent Behavior

Still pretty unclear about your policy. Why is ACE calling the OP "hostile" not considered "meta-level" and hence not updateable (according to your policy)? What if the org in question gave a more reasonable explanation of why they're not responding, but doesn't address the object-level criticism? Would you count that in their favor, compared to total silence, or compared to an unreasonable explanation? Are you making any subjective judgments here as to what to update on and what not to, or is there a mechanical policy you can write down (that anyone can follow and achieve the same results)?

Also, overall, is your policy intended to satisfy Conservation of Expected Evidence, or not?
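(For readers unfamiliar with the term: Conservation of Expected Evidence is the Bayesian identity that your prior must equal the probability-weighted average of your possible posteriors, so you can't expect evidence to move you in one direction on net. A minimal numerical sketch, with hypothetical numbers chosen only for illustration:)

```python
# Conservation of Expected Evidence: P(H) must equal the expectation,
# over possible observations, of the posterior P(H | observation).
# Toy numbers (purely illustrative, not tied to any real organization):

prior = 0.3            # P(H)
p_e_given_h = 0.8      # P(E | H)
p_e_given_not_h = 0.2  # P(E | not H)

# Total probability of observing E (law of total probability)
p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)

# Posteriors after observing E or not-E (Bayes' rule)
post_e = p_e_given_h * prior / p_e
post_not_e = (1 - p_e_given_h) * prior / (1 - p_e)

# The expected posterior recovers the prior exactly
expected_posterior = p_e * post_e + (1 - p_e) * post_not_e
assert abs(expected_posterior - prior) < 1e-12
```

The relevance here: a policy that updates downward on one kind of response (e.g., an unreasonable explanation) but never allows the corresponding upward update on the complementary response would violate this identity.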

ETA: It looks like MIRI did give at least a short object-level reply to Paul's takeoff speed argument along with a meta-level explanation of why they haven't given a longer object-level reply. Would you agree to a norm that said that organizations have at least an obligation to give a reasonable meta-level explanation of why they're not responding to criticism on the object level, and silence or an unreasonable explanation on that level could be held against them?

Concerns with ACE's Recent Behavior

I would be curious to read more about your approach, perhaps in another venue. Some questions I have:

  1. Do you propose to apply this (not updating when an organization refuses to engage with public criticism) universally? For example would you really not have thought worse of MIRI (Singularity Institute at the time) if it had labeled Holden Karnofsky's public criticism "hostile" and refused to respond to it, citing that its time could be better spent elsewhere? If not, how do you decide when to apply this policy? If yes, how do you prevent bad actors from taking advantage of the norm to become immune to public criticism?
  2. Would you update in a positive direction if an organization does effectively respond to public criticism? If not that seems extremely strange/counterintuitive, but if yes I suspect that might lead to dynamic inconsistencies in one's decision making (although I haven't thought about this deeply).
  3. Do you update on the existence of the criticism itself, before knowing whether or how the organization has chosen to respond?

I guess in general I'm pretty confused about what your proposed policy or norm is, and would appreciate some kind of thought-out exposition.

Concerns with ACE's Recent Behavior

FWIW if I was in a position similar to ACE’s here are a few potential “compromises” I would have explored.

Inferring from the list you wrote, you seem to be under the impression that the speaker in question was going to deliver a talk at the conference, but according to Eric Herboso's top-level comment, "the facebook commenter in question would be on a panel talking about BLM". Also, the following sentence from ACE's Facebook post makes it sound like the only way ACE staff members would attend the conference was if the speaker would not be there at all, which I think rules out all of the compromise ideas you generated.

In fact, asking our staff to participate in an event where a person who had made such harmful statements would be in attendance, let alone presenting, would be a violation of our own anti-discrimination and anti-harassment policy.

Concerns with ACE's Recent Behavior

And suppose we did make introductory spaces "safe" for people who believe that certain types of speech are very harmful, but somehow managed to keep norms of open discussion in other more "advanced" spaces. How would those people feel when they find out that they can't participate in the more advanced spaces without the risk of paying a high subjective cost (i.e., encountering speech that they find intolerable)? Won't many of them think that the EA community has performed a bait-and-switch on them and potentially become hostile to EA? Have people who have proposed this type of solution actually thought things through?

I think it's important to make EA as welcoming as possible to all people, but not by compromising in the direction of safetyism, as I don't see any way that doesn't end up causing more harm than good in the long run.

Concerns with ACE's Recent Behavior

one of the ones I find most concerning are the University of California diversity statements

I'm not sure I understand what you mean here. Do you think other universities are not requiring diversity statements from job applicants, or that the University of California is especially "concerning" in how it uses them? If it's the latter, what do you think the University of California is doing that others aren't? If the former, see this article from two years ago, which states:

Many more institutions are asking her to submit a statement with her application about how her work would advance diversity, equity, and inclusion.

The requests have appeared on advertisements for jobs at all kinds of colleges, from the largest research institutions to small teaching-focused campuses.

(And it seems a safe bet that the trend has continued. See this search result for a quick sense of what universities currently have formal rubrics for evaluating diversity statements. I also checked a random open position (for a chemistry professor) at a university that didn't show up in these results and found that it also requires a diversity statement: "Applicants should state in their cover letter how their teaching, research, service and/or life experiences have prepared them to advance Dartmouth’s commitment to diversity, equity and inclusion.")

Another reason I think academia has been taken over by cancel culture is that I've read many news stories, blog posts, and the like about cancel culture in academia, and I often scan their comment sections for contrary opinions. I have yet to see anyone chime in to say that they're an academic and cancel culture doesn't exist at their institution (which I'd expect to see if it weren't actually widespread), aside from some saying that it doesn't exist as a way of defending it (i.e., that what's happening is just people facing reasonable consequences for their speech acts and doesn't count as cancel culture). I also tried to Google "cancel culture isn't widespread in academia" in case someone wrote an article arguing that, but all the top relevant results are articles arguing that cancel culture is widespread in academia.

Curious if you have any evidence to the contrary, or just thought that I was making too strong a claim without backing it up myself.

Concerns with ACE's Recent Behavior

Can you explain more about this part of ACE's public statement about withdrawing from the conference:

We took the initiative to contact CARE’s organizers to discuss our concern, exchanging many thoughtful messages and making significant attempts to find a compromise.

If ACE was not trying to deplatform the speaker in question, what were these messages about and what kind of compromise were you trying to reach with CARE?

Concerns with ACE's Recent Behavior

but the main EAA Facebook group does not seem like an appropriate place to have them, since it’s one of the first places people get exposed to EAA.

I might agree with you if doing this had no further consequences beyond what you've written, but... quoting an earlier comment of mine:

You know, this makes me think I know just how academia was taken over by cancel culture. They must have allowed “introductory spaces” like undergrad classes to become “safe spaces”, thinking they could continue serious open discussion in seminar rooms and journals, then those undergrads became graduate students and professors and demanded “safe spaces” everywhere they went. And how is anyone supposed to argue against “safety”, especially once its importance has been institutionalized (i.e., departments were built in part to enforce “safe spaces”, which can then easily extend their power beyond “introductory spaces”).
