This is cool! Why haven't I heard of this?
Arkose has been in soft-launch for a while, and we've been focused on email outreach more than public comms. But we're increasingly public, and are in communication with other AI safety fieldbuilding organizations!
How big is the team?
3 people: Zach Thomas and Audra Zook are doing an excellent job in operations, and I'm the founder.
How do you pronounce "Arkose"? Where did the name come from?
I think any pronunciation is fine, and it's the name of a rock. We have an SEO goal for arkose.org to surpass t...
Neat! As someone who's not on the ground and doesn't know much about either initiative, I'm curious what Arcadia's relationship is to the London Initiative for Safe AI (LISA)? Mostly in the spirit of "if I know someone in AI safety in London, in what cases should I recommend them to each?"
This is sideways to the main point in the post, but I'm interested in a ticket type that's just "Swapcard / unsupported virtual attendee" where accepted people just get access to Swapcard, which lets them schedule 1-1 online videoconferencing, and that's it.
I find a lot of the value of EAG is in 1-1s, and I'd hope that this would be an option where virtual attendees can get potentially lots of networking value for very little cost.
(Asking because I don't want to pay a lot of money to attend an EAG where I'd mostly be taking on a mentor role, but I would po...
"For those applying for grants, asking for less money might make you more likely to be funded"
My guess is that it's good to still apply for lots of money, and then you just may not be funded the full amount? And one can say what one would do with more or less money granted, so that the grantmakers can take that into account in their decision.
I didn't give a disagreement vote, but I do disagree on aisafety.training being the "single most useful link to give anyone who wants to join the effort of AI Safety research", just because there's a lot of different resources out there and I think "most useful" depends on the audience. I do think it's a useful link, but most useful is a hard bar to meet!
Not directly relevant to the OP, but another post covering research taste: An Opinionated Guide to ML Research (also see Rohin Shah's advice about PhD programs (search "Q. What skills will I learn from a PhD?") for some commentary).
Small update: Two authors gave me permission to publish their transcripts non-anonymously!
Two authors gave me permission to publish their transcripts non-anonymously! Thus:
Whoops, forgot I was the owner. I tried moving those files to the drive folder, but also had trouble with it? So I'm happy to have them copied instead.
Thanks plex, this sounds great!
No, the same set of ~28 authors read all of the readings.
The order of the readings was indeed specified:
I would love a way to interface with EAGs where (I pay no money, but) I have access to the Swapcard interface and I talk only with people who request meetings with me. I often want to "attend" EAGs in this way, where I don't interface with the conference but I'm available as a resource if people want to talk to me, for which I will schedule remote 1:1s over Zoom. It'd be nice to be helpful to people at a time where they're available and can see I'm available on Swapcard. Are there any kind of "virtual, restricted" options like this?
Nice, yeah! I wouldn't have expected a statistically significant difference between a mean of 5.7 and 5.4 with those standard errors, but it's nice to see it here.
I considered doing a statistical test, and then spent some time googling how to do something like a "3-paired" ANOVA on data that looks like ("s" is subject, "r" is reading):
[s1 r1 "like"] [s1 r1 "agreement"] [s1 r1 "informative"]
[s2 r1 "like"] [s2 r1 "agreement"] [s2 r1 "informative"]
... [s28 r1 "like"] [s28 r1 "agreement"] [s28 r1 "informative"]
[s1 r2 "like"] [s1 r2 "agreement"] [s1 r2 "in...
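For what it's worth, a design like this (each subject rates each reading on three scales) can be analyzed as a two-way repeated-measures ANOVA, with "reading" and "rating type" as within-subject factors. Here's a hedged sketch using statsmodels' `AnovaRM`; the data are simulated 1-7 ratings with made-up factor levels (`r1`-`r3`), purely to illustrate the long format the test expects:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Simulated long-format data: one row per (subject, reading, rating type),
# mirroring the [s r "like"/"agreement"/"informative"] layout above.
rng = np.random.default_rng(0)
rows = []
for s in range(1, 29):                # 28 subjects
    for r in ["r1", "r2", "r3"]:      # readings (hypothetical: 3 shown here)
        for q in ["like", "agreement", "informative"]:
            rows.append({"subject": s, "reading": r, "rating": q,
                         "score": int(rng.integers(1, 8))})  # 1-7 scale
df = pd.DataFrame(rows)

# Two-way repeated-measures ANOVA: both factors are within-subject.
res = AnovaRM(df, depvar="score", subject="subject",
              within=["reading", "rating"]).fit()
print(res)
```

`AnovaRM` requires a balanced design (every subject rated every reading on every scale), so subjects with missing ratings would need to be dropped or the data aggregated first.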
This is not directly related, but I would love a way to interface with EAGs where (I pay no money, but) I have access to the Swapcard interface and I talk only with people who request meetings with me. I often want to "attend" EAGs in this way, where I don't interface with the conference (physically or virtually) but I'm available as a resource if people want to talk to me, for which I will schedule remote 1:1s over Zoom. It'd be nice to be helpful to people at a time where they're available and can see I'm available on Swapcard, but I don't want to pay to do this (nor make the event organizers pay for my non-physical presence if I do end up applying-- I usually want a "virtual only" ticket).
I'm not going to comment too much here, but if you haven't seen my talk (“Researcher Perceptions of Current and Future AI” (first 48m; skip the Q&A) (Transcript)), I'd recommend it! Specifically, you want the timechunk 23m-48m of that talk, where I'm talking about the results of interviewing ~100 researchers about AI safety arguments. We're going to publish much more on this interview data within the next month or so, but the major results, which describe some AI researchers' cruxes, are there.
(in response to the technical questions)
Mostly n=28 for each document; some had n=29 or n=30. You can see details in the Appendix, quantitative section.
The Carlsmith link is to the YouTube talk version, not the full report -- we chose materials partly on the basis of their being pretty short.
Keeping a running list of field-building posts I personally want to keep track of:
Project ideas:
- Akash's: https://forum.effectivealtruism.org/posts/yoP2PN5zdi4EAxdGA/ai-safety-field-building-projects-i-d-like-to-see
- Ryan's: https://www.lesswrong.com/posts/v5z6rDuFPKM5dLpz8/probably-good-projects-for-the-ai-safety-ecosystem
Survey analysis:
- Ash's: https://forum.effectivealtruism.org/posts/SuvMZgc4M8FziSvur/analysis-of-ai-safety-surveys-for-field-building-insights
- Daniel Filan's: https://www.lesswrong.com/posts/rXSBvSKvKdaNkhLeJ/takeaways-from-a-sur...
The application form is actually really restrictive once you open it-- when I filled it out, it explicitly instructed me not to write any new material and to only attach old material that was sent to FTXFF, and it only had a <20-word box and a <150-word box for grant descriptions. Today when I open the form, even those boxes have disappeared. I think it's meant to be quite a quick form, where they'll reach out for more details later.
Just updated my post on this: https://forum.effectivealtruism.org/posts/8sAzgNcssH3mdb8ya/resources-i-send-to-ai-researchers-about-ai-safety
I have different recommendations for ML researchers / the public / proto-EAs (people are more or less skeptical to begin with, rely on different kinds of evidence, and are willing to entertain weirder or more normative hypotheses), but that post covers some introductory materials.
If they're a somewhat skeptical ML researcher and looking for introductory material, my top recommendation at the moment is “Why I Thin...
(quick reply to a private doc on interaction effects vs direct effects for existential risks / GCR. They're arguing for more of a focus on interaction effects overall, I'm arguing for mostly work on direct effects. Keeping for my notes.)
In addition to direct effects from AI, bio, nuclear, climate...
...there are also mitigating / interaction effects, which could make these direct effects better or worse. For each of the possible direct risks, mitigating / interaction effects are more or less important.
For AI, the mainline direct risks that are possibl...
Continual investment argument for why AGI will probably happen, absent major societal catastrophes, written informally, for my notes:
We’ve been working on AI since ~1950s, in an era of history that feels normal to us but in fact develops technologies very very very fast compared to most of human existence. In 2012, the deep learning revolution of AI started with AlexNet and GPUs. Deep learning has made progress even faster than the current very fast rate of progress: 10 years later, we have unprecedented and unpredicted progress in large language mode...
Thanks Gabriel-- super useful step-by-step guide, and also knowledge/skill clarification structure! I usually gesture around vaguely when talking about my skills (I lose track of how much I know compared to others-- the answer is that I clearly completed Levels 1-3, then stopped) and when trying to hire other people with related skills. It feels useful to be able to say to someone e.g. "For this position, I want you to have completed Level 1 and have a very surface-level grasp of Levels 2-4"!
Still in progress as always, but this talk covers a lot of it! https://forum.effectivealtruism.org/posts/q49obZkQujkYmnFWY/vael-gates-risks-from-advanced-ai-june-2022
(Unfortunately the part about the insights isn't transcribed-- the first 20m is introduction, next ~30m is the description you want, last 10m is questions)
Suggestion for a project from Jonathan Yan:
Given the Future Fund's recent announcement of the AI Worldview Prize, I think it would be a good idea if someone could create an online group of participants. Such a group can help participants share resources, study together, co-work, challenge each other's views, etc. After the AI Worldview Prize competition ends, if results are good, a global AI safety community can be built and grown from there.
("AI can have bad consequences" as a motivation for AI safety--> Yes, but AI can have bad consequences in meaningfully different ways!)
Here's some frame confusion that I see a lot, that I think leads to confused intuitions (especially when trying to reason about existential risk from advanced AI, as opposed to today's systems):
1. There's (weapons) -- tech like nuclear weapons or autonomous weapons that, if used as intended, involves people dying. (Tech like this exists)
2. There's (misuse) -- tech that was intended to be anywhere from beneficial <> neutra...
^ Yeah, endorsed! This is work in (3)-- if you've got the skills and interests, going to work with Josh and Lucius seems like an excellent opportunity, and they've got lots of interesting projects lined up.
I think my data has insights about 3, and not about 1 and 2! You can take a look at https://www.lesswrong.com/posts/LfHWhcfK92qh2nwku/transcripts-of-interviews-with-ai-researchers to see what 11 interviews look like; I think it'd have to be designed differently to get info on 1 or 2.
Sounds great; thanks Sawyer! "Reaching out to BERI" was definitely listed in my planning docs for this post; if there's anything that seems obvious to communicate about, happy to take a call, otherwise I'll reach out if anything seems overlapping.
Thanks levin! I realized before I published that I hadn't gotten nearly enough governance people to review this, and indeed was hoping I'd get help in the comment section.
I'd thus be excited to hear more. Do you have specific questions / subareas of governance that appreciably benefit from a background in "economics, political science, legal studies, anthropology, sociology, psychology, and history", rather than a more generic "generalist"-type background (which can include any of the previous, but doesn't depend on any of them)?
I view the ...
"Preventing Human Extinction" at Stanford (first year undergraduate course)
Syllabus (2022)
Additional subject-specific reading lists (AI, bio, nuclear, climate) (2022)
@Pablo Could you also update your longtermism list with the syllabus, and with the edit that the class is taught jointly by Steve Luby and Paul Edwards? Thanks, and thanks for keeping this list :)
Great idea, thank you Vaidehi! I'm pulling this from the Forum and will repost once I get that done. (Update: Was reposted)
I haven't received much feedback on this video yet, so I'm very curious to know how it's received! I'm interested in critiques and things that it does well, so I can refine future descriptions and know who to send this to.
I've been finding "A Bird's Eye View of the ML Field [Pragmatic AI Safety #2]" to have a lot of content that would likely be interesting to the audience reading these transcripts. For example, the incentives section rhymes with the type of things interviewees would sometimes say. I think the post generally captures and analyzes a lot of the flavor / contextualizes what it was like to talk to researchers.
This isn't particularly helpful since it's not sorted, but some transcripts with ML researchers: https://www.lesswrong.com/posts/LfHWhcfK92qh2nwku/transcripts-of-interviews-with-ai-researchers
My argument structure within these interviews was basically to ask them these three questions in order, then respond from there. I chose the questions initially, but the details of the spiels evolved as I talked to researchers and started trying to respond to their comments before they made them.
1. “When do you think we’ll get AGI / capable / generalizable AI / ...
Indeed! I've actually found that in most of my interviews people haven't thought much about the 50+ year future or heard of AI alignment, given that my large sample is researchers who had papers at NeurIPS or ICML. (The five researchers who were individually selected here had thought unusually much about AI alignment, which didn't particularly surprise me given how they were selected.)
...A nice followup direction to take this would be to get a list of common arguments used by AI researchers to be less worried about AI safety (or about working on capabilities
I just did a fast-and-dirty version of this study with some of the students I'm TAing for, in a freshman class at Stanford called "Preventing Human Extinction". No promises I got all the details right, in either the survey or the analysis.
—————————————————————————————————
QUICK SUMMARY OF DATA FROM https://forum.effectivealtruism.org/posts/7f3sq7ZHcRsaBBeMD/what-psychological-traits-predict-interest-in-effective
MTurkers (n=~250, having a hard time extracting it from 1-3? different samples):
- expansive altruism (M = 4.4, SD = 1.1)
- effectiveness-focus scale ...
It's super cool :). I think SERI's funded by a bunch of places (including some university funding, and for sure OpenPhil), but it definitely feels incredible!
Just wanted to mention that if you were planning on standardizing an accelerated fellowship retreat, it seems definitely worth reaching out to CFAR folks (as mentioned), since they spent a lot of time testing models, including for post-workshop engagement, afaik! Happy to provide names / introductions if desired.
Update on my post "Seeking social science students / collaborators interested in AI existential risks" from ~1.5 months ago:
I've been running a two-month "program" with eight of the students who reached out to me! We've come up with research questions from my original list, and the expectation is that individuals work 9h/week as volunteer research assistants. I've been meeting with each person / group for 30min per week to discuss progress. We're halfway through this experiment, with a variety of projects and progress states-- hopefully you'll see at...
Update: I've been running a two-month "program" with eight of the students who reached out to me! We've come up with research questions from my original list, and the expectation is that individuals work 9h/week as volunteer RAs. I've been meeting with each person / group for 30min per week to discuss progress. We're halfway through this experiment, with a variety of projects and progress states-- hopefully you'll see at least one EA Forum post up from those students!
--
I was quite surprised by the interest that this post generated; ~30 people reached...
I think classes are great, given that they're targeting something you want to learn and you're not unusually self-motivated. They add a lot of structure and force engagement (i.e. homework, problem sets) in a way that's hard to find the time / energy for by yourself. You also get a fair amount of guidance and scaffolding, plus information presented in a pedagogical order! There's a lot of variance, though, due to the skill and time investment of the instructor, the size of the class, the quality of the curriculum, etc.
But if you DO happen to be very self-driven,...
(How to independent study)
Stephen Casper (https://stephencasper.com/) was giving advice today on how to upskill in research, and suggested doing a "deep dive".
Deep dive: read 40-50 papers in a specific research area you're interested in going into (e.g. adversarial examples in deep NNs). Take notes on each paper. You'll then have comparable knowledge to people working in the area, after which you do a synthesis project at the end where you write something up (could be lit review, could be more original than that).
He said he'd trade any class he'd ever taken for one of these deep dives, and they're worth doing even if it takes like 4 months.
*cool idea
The comment about counterfactuals makes me think of computational cognitive scientist Tobias Gerstenberg's research (https://cicl.stanford.edu), which focuses heavily on counterfactual reasoning in the physical domain, though he also has work in the social domain.
I confess to only a surface-level understanding of MIRI's research agenda, so I'm not quite able to connect my understanding of counterfactual reasoning in the social domain to a concrete research question within MIRI's agenda. I'd be happy to hear more though if you had more detail!
Arkose is seeking an AI Safety Call Specialist who will be speaking with and supporting professors, PhD students, and industry professionals who are interested in AI safety research or engineering.
Salary: $75,000 - $95,000, depending on prior experience and location. This is currently a 9-month-long fixed contract.
Location: Remote (but we highly prefer candidates to be able to work in roughly US time zones).
Deadline: 30 March 2024, with rolling admission (early applications encouraged).
Learn more on our website, and apply here if you’re interested!