ofer

Send me anonymous feedback: https://docs.google.com/forms/d/1qDWHI0ARJAJMGqhxc9FHgzHyEFp-1xneyl9hxSMzJP0/viewform

Any type of feedback is welcome, including arguments that a post/comment I wrote is net negative.


Some quick info about me:

I'm Ofer G. and I have a background in computer science (BSc+MSc; my MSc thesis was in NLP and ML, though not in deep learning).

You can also find me on the AI Alignment Forum and LessWrong.

(Feel free to reach out by sending me a PM through this forum.)


ofer's Comments

Conversation on AI risk with Adam Gleave
Gleave thinks discontinuous progress in AI is extremely unlikely:

I'm confused about this point. Did Adam Gleave explicitly say that he thinks discontinuous progress is "extremely unlikely" (or something to this effect)?

From the transcript, I get the sense that a less confident estimate was being made:

Adam Gleave: [...] I don’t see much reason for AI progress to be discontinuous in particular.

Adam Gleave: [...] I don’t expect there to be a discontinuity, in the sense of, we just see this sudden jump.
2019 AI Alignment Literature Review and Charity Comparison
Financial Reserves

You listed important considerations; here are some additional points to consider:

1. As suggested in Seth Baum's comment, a short runway may deter people from joining the org (especially people with larger personal financial responsibilities and opportunity cost).

2. It seems likely that—all other things being equal—orgs with a longer runway are "less vulnerable to Goodhart's law" and generally less prone to optimize for short-term impressiveness in costly ways. Selection effects alone seem sufficient to justify this belief: orgs with a short runway that don't optimize for short-term impressiveness seem less likely to survive.

But exactly how complex and fragile?
The traditional argument for AI alignment being hard is that human value is ‘complex’ and ‘fragile’.

Presumably, many actors will be investing a lot of resources into building the most capable and competitive ML models in many domains (e.g. models for predicting stock prices). It seems to me that the purpose of the field of AI alignment is to make it easier for actors to build such models in a way that is both safe and competitive. AI alignment seems hard to me because using arbitrarily-scaled-up versions of contemporary ML methods—in a safe and competitive way—seems hard.

What metrics may be useful to measure the health of the EA community?

Some more ideas for metrics that might be useful for tracking 'the health of the EA community' (not sure whether they fit in the first category):

How much runway do EA orgs have?

How diverse is the 'EA funding portfolio'? [EDIT: I'm referring here to the diversity of donors rather than the diversity of funding recipients.]
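To make the second metric a bit more concrete, here is a rough sketch of one possible way to quantify donor diversity (the donation figures below are made up, and the Herfindahl-Hirschman index is just one option among many):

```python
# Rough sketch: donor concentration via the Herfindahl-Hirschman index (HHI),
# and its inverse, the "effective number of donors". All figures are made up.
donations = {
    "Donor A": 5_000_000,
    "Donor B": 1_000_000,
    "Donor C": 500_000,
    "Small donors (combined)": 300_000,
}

total = sum(donations.values())
shares = [amount / total for amount in donations.values()]
hhi = sum(s ** 2 for s in shares)   # closer to 1 = more concentrated
effective_donors = 1 / hhi          # higher = more diverse funding base

print(f"HHI: {hhi:.3f}, effective number of donors: {effective_donors:.2f}")
```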

Summary of Core Feedback Collected by CEA in Spring/Summer 2019

Thanks for this helpful explanation!

To clarify my view, I do think there is a large variance in risk among 'long-term future interventions' (such as donating to FHI, or donating to fund an independent researcher with a short track record).

Summary of Core Feedback Collected by CEA in Spring/Summer 2019

Thanks for publishing this!

Respondents mentioned two broad concerns about EA Funds:

...

  1. Funds was targeted to meet the needs of a small set of donors, but was advertised to the entire EA community.

[...]

Many donors may not want their donations going towards “unusual, risky, or time-sensitive projects”, and respondents were concerned that the Funds were advertised to too broad a set of donors, including those for whom the Funds may not have been a good fit.

[...]

we do not currently proactively advertise EA Funds.

I'd be happy to learn more about these considerations/concerns. It seems to me that many of the interventions that are a good idea from a 'long-term future perspective' are unusual, risky, or time-sensitive. Is this an unusual view in the EA sphere?

Does 80,000 Hours focus too much on AI risk?

Is this the case in the AI safety community?

I have no idea to what extent the above factor is influential among the AI safety community (i.e. the set of all (aspiring) AI safety researchers?).

If the reasoning for their views isn't obviously bad, I would guess that it's "cool" to say unpopular or scary but not unacceptable things, because the rationality community has been built in part on this.

(As an aside, I'm not sure what the definition/boundary of the "rationality community" is, but obviously not all AI safety researchers are part of it.)

Does 80,000 Hours focus too much on AI risk?

Thanks for asking.

One factor that seems important is that even a small probability of "very short timelines and a sharp discontinuity" is probably a terrifying prospect for most people. Presumably, people tend to avoid saying terrifying things. Saying terrifying things can be costly, both socially and reputationally (and there's also the possible side effect of, well, making people terrified).

I hope to write a more thorough answer to this soon (I'll update this comment accordingly by 2019-11-20).

[EDIT (2019-11-18): adding the content below]

(I should note that I haven't yet discussed some of the following with anyone else. Also, so far I have had very little one-on-one interaction with established AI safety researchers, so consider the following to be mere intuitions and wild speculations.)

Suppose that some AI safety researcher thinks that 'short timelines and a sharp discontinuity' is likely. Here are some potential reasons that might cause them not to discuss their estimate publicly:

  1. Extending the point above ("people tend to avoid saying terrifying things"):

    • Presumably, most people don't want to come across as extremists.
    • People might be concerned that the most extreme/weird part of their estimate would end up getting quoted a lot in an adversarial manner, perhaps in a somewhat misleading way, for the purpose of dismissing their thoughts and making them look like a crackpot.
    • Making someone update towards such an estimate might put them under a lot of stress, which might have a negative impact on their productivity.
  2. Voicing such estimates publicly might make the field of AI safety more fringe.

    • When the topic of 'x-risks from AI' is presented to a random person, presenting a more severe account of the risks might make it more likely that the person would rationalize away the risks due to motivated reasoning.
    • Being more optimistic probably correlates with others being more willing to collaborate with you. People are probably generally attracted to optimism, and working with someone who is more optimistic is probably a more attractive experience.
    • Therefore, the potential implications of voicing such estimates publicly include:
      • making talented people less likely to join the field of AI safety;
      • making established AI researchers (and other key figures) more hesitant to be associated with the field; and
      • making donors less likely to donate to this cause area.
  3. Some researchers might be concerned that discussing such estimates publicly would make them appear as fear-mongering crooks who are just trying to get funding or better job security.

    • Generally, I suspect that most researchers who work on x-risk reduction would strongly avoid saying anything that could be pattern-matched to "I have this terrifying estimate about the prospect of the world getting destroyed soon in some weird way; and also, if you give me money I'll do some research that will make the catastrophe less likely to happen."
    • Some supporting evidence that those who work on x-risk reduction indeed face the risk of appearing as fear-mongering crooks:
      • Oren Etzioni, a professor of computer science at the University of Washington and the CEO of the Allen Institute for Artificial Intelligence (not to be confused with the Alan Turing Institute), wrote an article for the MIT Technology Review in 2016 (which was summarized by an AI Impacts post in November 2019). In that article, titled "No, the Experts Don’t Think Superintelligent AI is a Threat to Humanity", Etzioni cited the following comment, attributed to an anonymous AAAI Fellow:

        Nick Bostrom is a professional scare monger. His Institute’s role is to find existential threats to humanity. He sees them everywhere. I am tempted to refer to him as the ‘Donald Trump’ of AI.

        Note: at the end of that article there's an update from November 2016 that includes the following:

        I’m delighted that Professors Dafoe & Russell, who responded to my article here, and I seem to be in agreement on three critical matters. One, we should refrain from ad hominem attacks. Here, I have to offer an apology: I should not have quoted the anonymous AAAI Fellow who likened Dr. Bostrom to Donald Trump. I didn’t mean to lend my voice to that comparison; I sincerely apologized to Bostrom for this misstep via e-mail, an apology that he graciously accepted. [...]

      • See also this post by Jessica Taylor from July 2019, titled "The AI Timelines Scam" (a link post for it was posted on the EA Forum), which seems to argue for the (very reasonable) hypothesis that financial incentives have caused some people to voice short-timeline estimates (it's unclear to me what fraction of that post is about AI safety orgs/people, as opposed to AI orgs/people in general).

  4. Some researchers might be concerned that, in order to explain why they have short timelines, they would need to publicly point at approaches that they think might lead to short timelines; drawing more attention to those approaches could itself shorten timelines, in a net-negative way.

  5. If voicing such estimates would make some key people in industry/governments update towards shorter timelines, it might contribute to 'race dynamics'.

  6. If a researcher with such an estimate does not see any of their peers publicly sharing similar estimates, they might reason that sharing their estimate publicly is subject to the unilateralist’s curse. If the researcher has limited time or a limited network, they might opt to "play it safe", i.e. decide not to share their estimate publicly (instead of properly resolving the unilateralist’s curse by privately discussing the topic with others).

Does 80,000 Hours focus too much on AI risk?

There seems to be a large variance in researchers' estimates about timelines and takeoff speed. Pointing to specific writeups that lean one way or another can't give much insight into the distribution of estimates. Also, I think that at least some researchers are less likely to discuss their estimates publicly if they're leaning towards shorter timelines and a discontinuous takeoff, which subjects the public discourse on the topic to a selection bias.
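As a toy illustration of this selection effect, here is a small simulation I sketched (the distribution and the sharing rule are arbitrary assumptions, not estimates of anything real):

```python
import random

random.seed(0)

# Toy model: each researcher holds a "true" timeline estimate (years until
# transformative AI). Assume (arbitrarily) that the shorter the estimate,
# the less likely the researcher is to state it publicly.
true_estimates = [random.lognormvariate(3.0, 0.6) for _ in range(10_000)]

def shares_publicly(estimate_years):
    p_share = min(1.0, estimate_years / 40.0)  # arbitrary sharing rule
    return random.random() < p_share

public_estimates = [e for e in true_estimates if shares_publicly(e)]

def mean(xs):
    return sum(xs) / len(xs)

print(f"mean of all estimates:    {mean(true_estimates):.1f} years")
print(f"mean of public estimates: {mean(public_estimates):.1f} years")
# The publicly observable mean comes out longer than the true mean, purely
# because of who chooses to speak, i.e. the selection bias described above.
```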

So I'm skeptical about the claim that "Most researchers seem to be moving away from a fast takeoff view of AI safety, and are now opting for a softer takeoff view".

Top AI safety researchers are now saying that they expect AI to be safe by default, without further intervention from EA. See here and here.

Again, there seems to be a large variance in researchers' views about this. Pointing to specific writeups can't give much insight into the distribution of views.
