All of Vael Gates's Comments + Replies

Arkose is seeking an AI Safety Call Specialist who will be speaking with and supporting professors, PhD students, and industry professionals who are interested in AI safety research or engineering.

Salary: $75,000 - $95,000, depending on prior experience and location. This is currently a 9-month-long fixed contract.

Location: Remote (but we strongly prefer candidates who can work in roughly US time zones).

Deadline: 30 March 2024; applications are reviewed on a rolling basis (early applications encouraged).

Learn more on our website, and apply here if you’re interested!

FAQ

This is cool! Why haven't I heard of this?
Arkose has been in soft launch for a while, and we've been focused on email outreach more than public comms. But we're increasingly public and are in communication with other AI safety fieldbuilding organizations!

How big is the team?

3 people: Zach Thomas and Audra Zook are doing an excellent job in operations, and I'm the founder.

How do you pronounce "Arkose"? Where did the name come from?

I think any pronunciation is fine, and it's the name of a rock. We have an SEO goal for arkose.org to surpass t... (read more)

Neat! As someone who's not on the ground and doesn't know much about either initiative, I'm curious what Arcadia's relationship is with the London Initiative for Safe AI (LISA)? Mostly in the spirit of "if I know someone in AI safety in London, in what cases should I recommend them to each?"

3
Joe Hardie
3mo
Thanks for the question! We aren't connected in any official capacity and haven't collaborated on any projects. The events we run are focused on students and young professionals who haven't engaged with AI safety arguments or the community before, while LISA is more focused on those already doing relevant research. As office spaces, the majority of our users attend as individuals (working independently or as the only person from their organisation), whereas LISA hosts organisations and up-skilling programs. Our office has a wider focus than just AI safety, although I expect there is some overlap in the people we would accept, and a small number of users are signed up to both offices.

This is sideways to the main point in the post, but I'm interested in a ticket type that's just "Swapcard / unsupported virtual attendee" where accepted people just get access to Swapcard, which lets them schedule 1-1 online videoconferencing, and that's it.

I find a lot of the value of EAG is in 1-1s, and I'd hope that this would be an option where virtual attendees can get potentially lots of networking value for very little cost.

(Asking because I don't want to pay a lot of money to attend an EAG where I'd mostly be taking on a mentor role, but I would po... (read more)

6
dan.pandori
7mo
This feels like a "be the change you want to see in the world" moment. If you want such an event, it seems like you could basically just make a forum post (or quick take) offering 1:1s?

"For those applying for grants, asking for less money might make you more likely to be funded" 

My guess is that it's good to still apply for lots of money, and then you just may not be funded the full amount? And one can say what one would do with more or less money granted, so that the grantmakers can take that into account in their decision.

I didn't give a disagreement vote, but I do disagree on aisafety.training being the "single most useful link to give anyone who wants to join the effort of AI Safety research", just because there's a lot of different resources out there and I think "most useful" depends on the audience. I do think it's a useful link, but most useful is a hard bar to meet!

3
Linda Linsefors
1y
I agree that it's not the most useful link for everyone. I can see how my initial message was ambiguous about this. What I meant is that, of all the links I know, I expect this one to be the most useful on average. If I meet someone and have a conversation with them and I had to constrain myself to giving them only a single link, I might pick another resource based on their personal situation. But if I wrote a post online or gave a talk to a broad audience, and I had to pick only one link to share, it would be this one.

Not directly relevant to the OP, but another post covering research taste: An Opinionated Guide to ML Research. Also see Rohin Shah's advice about PhD programs (search "Q. What skills will I learn from a PhD?") for some commentary.

Two authors gave me permission to publish their transcripts non-anonymously! Thus:

- Interview with Michael L. Littman

- Interview with David Duvenaud

Whoops, forgot I was the owner. I tried moving those files to the drive folder, but also had trouble with it? So I'm happy to have them copied instead. 

Thanks plex, this sounds great!

1
plex
1y
Oops, forgot to share edit access; I sent you an invitation to the subfolder, so you should be able to move it now. I can also copy it if you'd prefer, but I think having one canonical version is best.

Update: Michael Keenan reports it is now fixed!

Thanks for the bug report, checking into it now. 

1
Vael Gates
1y
Update: Michael Keenan reports it is now fixed!

No, the same set of ~28 authors read all of the readings. 

The order of the readings was indeed specified:

  1. Concise overview (Stuart Russell, Sam Bowman; 30 minutes)
  2. Different styles of thinking about future AI systems (Jacob Steinhardt; 30 minutes)
  3. A more in-depth argument for highly advanced AI being a serious risk (Joe Carlsmith; 30 minutes)
  4. A more detailed description of how deep learning models could become dangerously "misaligned" and why this might be difficult to solve with current ML techniques (Ajeya Cotra; 30 minutes)
  5. An overview of different rese
... (read more)

I would love a way to interface with EAGs where (I pay no money, but) I have access to the Swapcard interface and I talk only with people who request meetings with me. I often want to "attend" EAGs in this way, where I don't interface with the conference but I'm available as a resource if people want to talk to me, for which I will schedule remote 1:1s over Zoom. It'd be nice to be helpful to people at a time when they're available and can see I'm available on Swapcard. Are there any kind of "virtual, restricted" options like this?

Nice, yeah! I wouldn't have expected a statistically significant difference between means of 5.7 and 5.4 with those standard errors, but it's nice to see it here. 

I considered doing a statistical test, and then spent some time googling how to do something like a "3-paired" ANOVA on data that looks like ("s" is subject, "r" is reading):

[s1 r1 "like"] [s1 r1 "agreement"] [s1 r1 "informative"]

[s2 r1 "like"] [s2 r1 "agreement"] [s2 r1 "informative"]

... [s28 r1 "like"] [s28 r1 "agreement"] [s28 r1 "informative"]

[s1 r2 "like"] [s1 r2 "agreement"] [s1 r2 "in... (read more)
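For concreteness, here is a minimal sketch of the kind of two-way repeated-measures ANOVA that data layout suggests, assuming a long-format pandas DataFrame with hypothetical subject / reading / question / score columns and placeholder random ratings rather than the real data:

```python
# Sketch of a repeated-measures ANOVA over data shaped like
# [subject, reading, question ("like" / "agreement" / "informative"), score].
# Assumes a balanced design: every subject rated every reading on every question.
import itertools

import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
subjects = [f"s{i}" for i in range(1, 29)]   # ~28 raters, as in the study
readings = [f"r{i}" for i in range(1, 9)]    # 8 readings
questions = ["like", "agreement", "informative"]

# Hypothetical long-format ratings; in practice, load the real data here.
rows = [
    {"subject": s, "reading": r, "question": q,
     "score": rng.integers(1, 8)}            # placeholder 1-7 ratings
    for s, r, q in itertools.product(subjects, readings, questions)
]
data = pd.DataFrame(rows)

# Both factors (which reading, which question) vary within subjects.
result = AnovaRM(data, depvar="score", subject="subject",
                 within=["reading", "question"]).fit()
print(result)
```

With real responses, the "reading" main effect corresponds to "do the documents differ overall"; pairwise follow-ups could then use paired tests on per-subject means.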

2
Vasco Grilo
1y
Ah, thanks for the suggestion! To be honest, I only have basic knowledge of stats, so I do not know how to do the more complex analysis you described. My (quite possibly flawed) intuition for analysing all questions would be:
* Determine, for each subject, "overall score" = ("score of question 1" + "score of question 2" + "score of question 3")/3.
* If some subjects did not answer all 3 questions, "overall score" = "sum of the scores of the answered questions"/"number of answered questions".
* Calculate the mean and standard error for each of the AI safety materials.
* Repeat the calculation of the p-value as I illustrated above for the pairs of AI safety materials (best, 2nd best), (2nd best, 3rd best), ..., and (2nd worst, worst), or just analyse all possible pairs.
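A minimal sketch of that averaging-then-comparing procedure, again with hypothetical column names (subject / reading / question / score) rather than the real data:

```python
# Sketch of the per-subject averaging and pairwise comparison described above.
# Assumes a long-format DataFrame with hypothetical columns
# subject / reading / question / score.
import pandas as pd
from scipy import stats

def overall_scores(data: pd.DataFrame) -> pd.DataFrame:
    """Per-subject overall score for each reading: the mean of whichever
    questions that subject actually answered (skipped questions drop out)."""
    return (data.groupby(["reading", "subject"])["score"]
                .mean()
                .unstack("reading"))

def summarize(data: pd.DataFrame) -> pd.DataFrame:
    """Mean and standard error of the overall scores for each reading."""
    return overall_scores(data).agg(["mean", "sem"]).T

def compare(data: pd.DataFrame, reading_a: str, reading_b: str):
    """p-value for one pair of readings (Welch's t-test on overall scores)."""
    scores = overall_scores(data)
    return stats.ttest_ind(scores[reading_a].dropna(),
                           scores[reading_b].dropna(),
                           equal_var=False)
```

Since the same raters scored every reading, a paired test (e.g. scipy.stats.ttest_rel or wilcoxon on each subject's difference in overall scores) would arguably be more appropriate than an unpaired one; the sketch above just mirrors the procedure described in the comment.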

This is not directly related, but I would love a way to interface with EAGs where (I pay no money, but) I have access to the Swapcard interface and I talk only with people who request meetings with me. I often want to "attend" EAGs in this way, where I don't interface with the conference (physically or virtually) but I'm available as a resource if people want to talk to me, for which I will schedule remote 1:1s over Zoom. It'd be nice to be helpful to people at a time when they're available and can see I'm available on Swapcard, but I don't want to pay to do this (nor make the event organizers pay for my non-physical presence if I do end up applying-- I usually want a "virtual only" ticket).

I'm not going to comment too much here, but if you haven't seen my talk (“Researcher Perceptions of Current and Future AI” (first 48m; skip the Q&A) (Transcript)), I'd recommend it! Specifically, you want the 23m-48m chunk of that talk, where I'm talking about the results of interviewing ~100 researchers about AI safety arguments. We're going to publish much more on this interview data within the next month or so, but the major results, which describe some AI researchers' cruxes, are there.

(in response to the technical questions)

Mostly n = 28 for each document; some had n = 29 or n = 30. You can see details in the Appendix, quantitative section. 

The Carlsmith link is to the YouTube talk version, not the full report -- we chose materials based on their being pretty short. 

3
[anonymous]
1y
Was each piece of writing read by a fresh set of n researchers (i.e. a total of ~30*8 researchers participated)? I understand the alternative to be that the same ~30 researchers read the 8 pieces of writing. The following question interests me if the latter is true: did you specify in what order they should read the pieces? I expect somebody making their first contact with AIS to have a very path-dependent response. For instance, encountering Carlsmith first and encountering Carlsmith last seem to produce different effects—these effects possibly extending to the researchers' ratings of the other pieces. Unrelatedly, I'm wondering whether researchers were exposed only to the transcripts of the videos, as opposed to the videos themselves.

The application form is actually really restrictive once you open it-- when I filled it out, it explicitly instructed me not to write any new material and to only attach old material that was sent to FTXFF, and it only had a <20-word box and a <150-word box for grant descriptions. Today when I open the form, even those boxes have disappeared. I think it's meant to be a quite quick form, where they'll reach out for more details later.

1
Falk Lieder
1y
Thank you so much for pointing that out, Vael! I had completely overlooked that information. That's really helpful to know.

Just updated my post on this: https://forum.effectivealtruism.org/posts/8sAzgNcssH3mdb8ya/resources-i-send-to-ai-researchers-about-ai-safety

I have different recommendations for ML researchers / the public / proto-EAs (people are more or less skeptical to begin with, rely on different kinds of evidence, and are willing to entertain weirder or more normative hypotheses), but that post covers some introductory materials. 

If they're a somewhat skeptical ML researcher and looking for introductory material, my top recommendation at the moment is “Why I Thin... (read more)

(quick reply to a private doc on interaction effects vs direct effects for existential risks / GCR. They're arguing for more of a focus on interaction effects overall, I'm arguing for mostly work on direct effects. Keeping for my notes.)

In addition to direct effects from AI, bio, nuclear, climate...

...there are also mitigating / interaction effects, which could make these direct effects better or worse. For each of the possible direct risks, mitigating / interaction effects are more or less important. 

For AI, the mainline direct risks that are possibl... (read more)

Continual investment argument for why AGI will probably happen, absent major societal catastrophes, written informally, for my notes:

We’ve been working on AI since the ~1950s, in an era of history that feels normal to us but in fact develops technologies very very very fast compared to most of human existence. In 2012, the deep learning revolution of AI started with AlexNet and GPUs. Deep learning has made progress even faster than the current very fast rate of progress: 10 years later, we have unprecedented and unpredicted progress in large language mode... (read more)

Thanks Gabriel-- super useful step-by-step guide, and also knowledge/skill clarification structure! I usually gesture around vaguely when talking about my skills (I lose track of how much I know compared to others-- the answer is I clearly completed Levels 1-3 then stopped) and when trying to hire other people with related skills. It feels useful to be able to say to someone, e.g., "For this position, I want you to have completed Level 1 and have a very surface-level grasp of Levels 2-4"!

1
Gabriel Mukobi
1y
Ha thanks Vael! Yeah, that seems hard to standardize but potentially quite useful to use levels like these for hiring, promotions, and such. Let me know how it goes if you try it!

Still in progress as always, but this talk covers a lot of it! https://forum.effectivealtruism.org/posts/q49obZkQujkYmnFWY/vael-gates-risks-from-advanced-ai-june-2022

(Unfortunately the part about the insights isn't transcribed-- the first 20m is introduction, the next ~30m is the description you want, and the last 10m is questions.)

 

1
Joseph Bloom
1y
Thanks Vael!

Suggestion for a project from Jonathan Yan: 

Given the Future Fund's recent announcement of the AI Worldview Prize, I think it would be a good idea if someone could create an online group of participants. Such a group can help participants share resources, study together, co-work, challenge each other's views, etc. After the AI Worldview Prize competition ends, if results are good, a global AI safety community can be built and grown from there.

("AI can have bad consequences" as a motivation for AI safety--> Yes, but AI can have bad consequences in meaningfully different ways!)

Here's some frame confusion that I see a lot, that I think leads to confused intuitions (especially when trying to reason about existential risk from advanced AI, as opposed to today's systems):

1. There's (weapons) -- tech like nuclear weapons or autonomous weapons that, if used as intended, involves people dying. (Tech like this exists.)

2. There's (misuse) -- tech that was intended to be anywhere from beneficial <> neutra... (read more)

^ Yeah, endorsed! This is work in (3)-- if you've got the skills and interests, going to work with Josh and Lucius seems like an excellent opportunity, and they've got lots of interesting projects lined up. 

I think my data has insights about 3, and not about 1 and 2! You can take a look at https://www.lesswrong.com/posts/LfHWhcfK92qh2nwku/transcripts-of-interviews-with-ai-researchers to see what 11 interviews look like; I think it'd have to be designed differently to get info on 1 or 2. 

Sounds great; thanks Sawyer! "Reaching out to BERI" was definitely listed in my planning docs for this post; if there's anything that seems obvious to communicate about, happy to take a call, otherwise I'll reach out if anything seems overlapping.

Thanks levin! I realized before I published that I hadn't gotten nearly enough governance people to review this,  and indeed was hoping I'd get help in the comment section.

I'd thus be excited to hear more. Do you have specific questions / subareas of governance that are appreciably benefited by having a background in "economics, political science, legal studies, anthropology, sociology, psychology, and history" rather than a more generic "generalist"-type background (which can include any of the previous, but doesn't depend on any of them)?

I view the ... (read more)

2
tlevin
2y
Great questions, and I guess I agree that generalist skills are probably more important (with one implication being that I'd be less excited about people getting PhDs in these fields than my comment might have implied). Just as an example, since I'm quite new to the field as well: the project I'm currently working on includes a sub-question that I think an actual economist would be able to make much faster progress on: how does the availability of research talent to top technology firms affect their technological progress? My impression is that since a lot of important research projects on e.g. ideas for new treaties, historical analogies, military-strategic options seem to similarly break down into sub-questions that vary on how domain-knowledge-demanding they are, social scientists might be able to have an unusual impact working on the more demanding of these sub-questions.

"Preventing Human Extinction" at Stanford (first year undergraduate course)

Syllabus (2022)

Additional subject-specific reading lists (AI, bio, nuclear, climate) (2022)

@Pablo Could you also update your longtermism list with the syllabus, and with the edit that the class is taught jointly by Steve Luby and Paul Edwards? Thanks, and thanks for keeping this list :) .

Great idea, thank you Vaidehi! I'm pulling this from the Forum and will repost once I get that done. (Update: Was reposted)

I haven't received much feedback on this video yet, so I'm very curious to know how it's received! I'm interested in critiques and things that it does well, so I can refine future descriptions and know who to send this to.

5
Vaidehi Agarwalla
2y
It would be helpful to have the slides / transcript in the post body (I expect you'd get more feedback that way)

I've been finding "A Bird's Eye View of the ML Field [Pragmatic AI Safety #2]" to have a lot of content that would likely be interesting to the audience reading these transcripts. For example, the incentives section rhymes with the type of things interviewees would sometimes say. I think the post generally captures and analyzes a lot of the flavor / contextualizes what it was like to talk to researchers.

This isn't particularly helpful since it's not sorted, but some transcripts with ML researchers: https://www.lesswrong.com/posts/LfHWhcfK92qh2nwku/transcripts-of-interviews-with-ai-researchers

My argument structure within these interviews was basically to ask them these three questions in order, then respond from there. I chose the questions initially, but the details of the spiels accumulated as I talked to researchers and started trying to respond to their comments before they made them.

1. “When do you think we’ll get AGI / capable / generalizable AI / ... (read more)

Indeed! I've actually found that in most of my interviews people haven't thought about the 50+ year future much or heard of AI alignment, given that my large sample is researchers who had papers at NeurIPS or ICML. (The five researchers who were individually selected here had thought about AI alignment uncommonly much, which didn't particularly surprise me given how they were selected.)

A nice followup direction to take this would be to get a list of common arguments used by AI researchers to be less worried about AI safety (or about working on capabilities

... (read more)

I just did a fast-and-dirty version of this study with some of the students I'm TAing for, in a freshman class at Stanford called "Preventing Human Extinction". No promises I got all the details right, in either the survey or the analysis.

—————————————————————————————————

QUICK SUMMARY OF DATA FROM https://forum.effectivealtruism.org/posts/7f3sq7ZHcRsaBBeMD/what-psychological-traits-predict-interest-in-effective

MTurkers (n=~250, having a hard time extracting it from 1-3? different samples):
- expansive altruism (M = 4.4, SD = 1.1)
- effectiveness-focus scale ... (read more)

It's super cool :). I think SERI's funded by a bunch of places (including some university funding, and for sure OpenPhil), but it definitely feels incredible! 

Just wanted to mention that if you were planning on standardizing an accelerated fellowship retreat, it seems definitely worth reaching out to CFAR folks (as mentioned), since they spent a lot of time testing models, including for post-workshop engagement, afaik! Happy to provide names / introductions if desired.

2
DirectedEvolution
2y
That’s a good update, thank you!

Update on my post "Seeking social science students / collaborators interested in AI existential risks" from ~1.5 months ago: 

I've been running a two-month "program" with eight of the students who reached out to me! We've come up with research questions from my original list, and the expectation is that individuals work 9h/week as volunteer research assistants. I've been meeting with each person / group for 30min per week to discuss progress. We're halfway through this experiment, with a variety of projects and progress states-- hopefully you'll see at... (read more)

Update: I've been running a two-month "program" with eight of the students who reached out to me! We've come up with research questions from my original list, and the expectation is that individuals work 9h/week as volunteer RAs. I've been meeting with each person / group for 30min per week to discuss progress. We're halfway through this experiment, with a variety of projects and progress states-- hopefully you'll see at least one EA Forum post up from those students! 

--

I was quite surprised by the interest that this post generated; ~30 people reached... (read more)

I think classes are great, given that they're targeting something you want to learn and you're not unusually self-motivated. They add a lot of structure and force engagement (i.e. homework, problem sets) in a way that's hard to find time / energy for by yourself. You also get a fair amount of guidance and scaffolding information, plus information presented in a pedagogical order! There's a lot of variance due to the skill and time investment of the instructor, the size of the class, the quality of the curriculum, etc. 

But if you DO happen to be very self-driven,... (read more)

(How to do independent study) 

Stephen Casper (https://stephencasper.com/) was giving advice today on how to upskill in research, and suggested doing a "deep dive". 

Deep dive: read 40-50 papers in a specific research area you're interested in going into (e.g. adversarial examples in deep NNs). Take notes on each paper. You'll then have knowledge comparable to that of people working in the area, after which you do a synthesis project where you write something up (could be a lit review, could be more original than that). 

He said he'd trade any class he'd ever taken for one of these deep dives, and they're worth doing even if one takes something like 4 months.  

*cool idea

8
Miranda_Zhang
2y
This sounds like a great idea and aligns with my growing belief that classes are, more often than not, far from the best way to learn.

The comment about counterfactuals makes me think of computational cognitive scientist Tobias Gerstenberg's research (https://cicl.stanford.edu), which focuses a lot on counterfactual reasoning in the physical domain, though he also has work in the social domain. 

I confess to only a surface-level understanding of MIRI's research agenda, so I'm not quite able to connect my understanding of counterfactual reasoning in the social domain to a concrete research question within MIRI's agenda. I'd be happy to hear more though if you had more detail! 

3
Chris Leong
3y
I've written a post on this topic here - https://www.lesswrong.com/posts/9rtWTHsPAf2mLKizi/counterfactuals-as-a-matter-of-social-convention. BTW, I should be clear that my opinions on this topic aren't necessarily a mainstream position.