Vael Gates


Comments

No, the same set of ~28 researchers read all of the readings. 

The order of the readings was indeed specified:

  1. Concise overview (Stuart Russell, Sam Bowman; 30 minutes)
  2. Different styles of thinking about future AI systems (Jacob Steinhardt; 30 minutes)
  3. A more in-depth argument for highly advanced AI being a serious risk (Joe Carlsmith; 30 minutes)
  4. A more detailed description of how deep learning models could become dangerously "misaligned" and why this might be difficult to solve with current ML techniques (Ajeya Cotra; 30 minutes)
  5. An overview of different research directions (Paul Christiano; 30 minutes)
  6. A study of what ML researchers think about these issues (Vael Gates; 45 minutes)
  7. Some common misconceptions (John Schulman; 15 minutes)

Researchers had the option to read transcripts where they were available; we said that consuming the content in either form (video or transcript) was fine.

I would love a way to interface with EAGs where (I pay no money, but) I have access to the Swapcard interface and I talk only with people who request meetings with me. I often want to "attend" EAGs in this way, where I don't interface with the conference but am available as a resource if people want to talk to me, for which I'd schedule remote 1:1s over Zoom. It'd be nice to be helpful to people at a time when they're available and can see I'm available on Swapcard. Is there any kind of "virtual, restricted" option like this?

Nice, yeah! I wouldn't have expected a statistically significant difference between means of 5.7 and 5.4 with those standard errors, but it's nice to see it here. 
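For what it's worth, the back-of-the-envelope check I had in mind is roughly the following (a minimal sketch with placeholder standard errors, not the actual ones from the table):

```python
import math

# Hypothetical standard errors for the two readings (placeholders only)
se1, se2 = 0.2, 0.2

# Rough unpaired z-style check on the difference in means
z = (5.7 - 5.4) / math.sqrt(se1**2 + se2**2)
print(round(z, 2))  # ~1.06 with these placeholder SEs, i.e. below the usual ~1.96 cutoff
```

(A paired test on the raw scores could still detect a difference this kind of check misses, since the same readers rated every document.)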

I considered doing a statistical test, and then spent some time googling how to do something like a "3-paired" ANOVA on data that looks like ("s" is subject, "r" is reading):

[s1 r1 "like"] [s1 r1 "agreement"] [s1 r1 "informative"]

[s2 r1 "like"] [s2 r1 "agreement"] [s2 r1 "informative"]

... [s28 r1 "like"] [s28 r1 "agreement"] [s28 r1 "informative"]

[s1 r2 "like"] [s1 r2"agreement"] [s1 r2 "informative"]

[s2 r2 "like"] [s2 r2 "agreement"] [s2 r2 "informative"]

...

because I'd like to do an ANOVA on the raw scores, rather than the means. I did not resolve my confusion about what to do with the 3-paired data (I guess you could lump each subject's data in one column, or do it separately by "like", "agreement", and "informative", but I'm interested in how good each reading is summed across the three metrics). I then gave up and just presented the summary statistics. (You can extract the raw scores from the Appendix if you put some work into it, though, or I could pass along the raw scores, or you could tell me how to do this sort of analysis in Python if you wanted me to do it!)
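Here's roughly what that might look like in Python (a minimal sketch, not the analysis from the post: the ratings below are simulated placeholders, the column names are just what I'd pick, and statsmodels' AnovaRM does the repeated-measures ANOVA):

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
subjects = [f"s{i}" for i in range(1, 29)]   # ~28 readers
readings = [f"r{j}" for j in range(1, 8)]    # 7 readings
metrics = ["like", "agreement", "informative"]

# Long format: one row per (subject, reading, metric) rating on a 1-7 scale.
# Replace the simulated scores with the real raw scores from the Appendix.
rows = [
    {"subject": s, "reading": r, "metric": m, "score": rng.integers(1, 8)}
    for s in subjects for r in readings for m in metrics
]
df = pd.DataFrame(rows)

# Collapse the three metrics into one summed score per subject per reading.
summed = df.groupby(["subject", "reading"], as_index=False)["score"].sum()

# One-way repeated-measures ANOVA: do the summed scores differ by reading?
res = AnovaRM(summed, depvar="score", subject="subject", within=["reading"]).fit()
print(res)

# Alternatively, keep "metric" as a second within-subject factor:
# AnovaRM(df, depvar="score", subject="subject", within=["reading", "metric"]).fit()
```

The summed version matches "how good is each reading across the three metrics"; the two-factor version would also show whether the readings differ more on some metrics than others (the reading-by-metric interaction).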

When I look at these tables, I'm also usually squinting at the median rather than the mean, though I look at both. You can see the distributions in the Appendix, which I like even better. But point taken that it'd be nice to have stats.

This is not directly related, but I would love a way to interface with EAGs where (I pay no money, but) I have access to the Swapcard interface and I talk only with people who request meetings with me. I often want to "attend" EAGs in this way, where I don't interface with the conference (physically or virtually) but am available as a resource if people want to talk to me, for which I'd schedule remote 1:1s over Zoom. It'd be nice to be helpful to people at a time when they're available and can see I'm available on Swapcard, but I don't want to pay to do this (nor make the event organizers pay for my non-physical presence if I do end up applying; I usually want a "virtual only" ticket).

I'm not going to comment too much here, but if you haven't seen my talk (“Researcher Perceptions of Current and Future AI”, first 48m; skip the Q&A) (Transcript), I'd recommend it! Specifically, you want the 23m-48m chunk of that talk, where I talk about the results of interviewing ~100 researchers about AI safety arguments. We're going to publish much more on this interview data within the next month or so, but the major results, which describe some of the AI researchers' cruxes, are there.

(in response to the technical questions)

Mostly n=28 for each document; some had n=29 or n=30. You can see details in the quantitative section of the Appendix. 

The Carlsmith link is to the YouTube talk version, not the full report; we chose materials partly because they were pretty short. 

The application form is actually really restrictive once you open it: when I filled it out, it explicitly instructed applicants not to write any new material and to attach only material previously sent to FTXFF, and it had only a <20-word box and a <150-word box for grant descriptions. Today, when I open the form, even those boxes have disappeared. I think it's meant to be quite a quick form, and they'll reach out for more details later.

Just updated my post on this: https://forum.effectivealtruism.org/posts/8sAzgNcssH3mdb8ya/resources-i-send-to-ai-researchers-about-ai-safety

I have different recommendations for ML researchers / the public / proto-EAs (these audiences are more or less skeptical to begin with, rely on different kinds of evidence, and are more or less willing to entertain weirder or more normative hypotheses), but that post covers some introductory materials. 

If they're a somewhat skeptical ML researcher looking for introductory material, my top recommendation at the moment is “Why I Think More NLP Researchers Should Engage with AI Safety Concerns” by Sam Bowman (2022), 15m (note: stop at the section “The new lab”).
