Please help me sense-check my assumptions about the needs of the AI Safety community and related career plans

PeterSlattery

For background and context, see my related series of posts on an approach for AI Safety Movement Building. This is a quick and concise rewrite of the main points in the hope that it will attract better engagement and feedback.

Which of the following assumptions do you agree or disagree with? Follow the links to see some of the related content from my posts.

Assumptions about the needs of the AI Safety community

A lack of people, inputs, and coordination is one of several issues holding back progress in AI Safety. Only a small portion of potential contributors are focused on AI Safety, and current contributors face issues such as limited support, resources, and guidance.
We need more (effective) movement builders to accelerate progress in AI Safety. Utilising diverse professions and skills, effective movement builders can increase contributors, contributions, and coordination within the AI Safety community, by starting, sustaining, and scaling useful projects. They can do so while getting supervision and support from those doing direct work and/or doing direct work themselves.
To increase the number of effective AI Safety movement builders we need to reduce movement building uncertainty. Presently, it's unclear who should do what to help the AI Safety Community or how to prioritise between options for movement building. There is considerable disagreement between knowledgeable individuals in our diverse community. Most people are occupied with urgent object-level work, leaving no one responsible for understanding and communicating the community's needs.
To reduce movement building uncertainty we need more shared understanding. Potential and current movement builders need a sufficiently good grasp of key variables such as contexts, processes, outcomes, and priorities to be able to work confidently and effectively.
To achieve more shared understanding we need shared language. Inconsistencies in vocabulary and conceptualisations hinder our ability to survey and understand the AI Safety community's goals and priorities.

Assumption about the contribution of my series of posts

I couldn't find any foundation of shared language or understanding in AI Safety Movement building to work from, so I created this series of posts to share and sense-check mine as it developed and evolved. Based on this, I now assume:

My post series offers a basic foundation for shared language and understanding in AI Safety Movement building, which most readers agree with. I haven't received much feedback but what I have received has generally been supportive. I could be making a premature judgement here so please share any disagreements you have.

Assumption about career paths to explore

If the above assumptions are valid then I have i) a good understanding of the AI Safety Community and what it needs, and ii) a basic foundation for shared language and understanding in AI Safety Movement building that I can build on. Given my experience with entrepreneurship, community building, and research, I therefore assume:

It seems reasonable for me to explore if I can provide value by using the shared language and understanding to initiate/run/collaborate on projects that help to increase shared understanding & coordination within the AI Safety Community. For instance, this could involve evaluating progress in AI Safety Movement building and/or surveying the community to determine priorities. I will do this while doing Fractional Movement Building (e.g., allocating some of my productive time to movement building and some of my time for direct work/self-education).

Feedback/Sense-checking

Do you agree or disagree with any of the above assumptions? If you disagree then please explain why.

Your feedback will be greatly valued and will help with my career plans.

To encourage feedback I am offering a bounty. I will pay up to 200 USD in Amazon vouchers, shared via email, to up to 10 people who give helpful feedback on this post or my previous posts in the series by 15 April 2023. I will also consider rewarding anonymous feedback left here (but you will need to give me an email address). I will likely share anonymous feedback if it seems constructive and I think other people will benefit from seeing it.

4 Answers sorted by
Top

Yonatan Cale

Mar 27, 2023

Hey,

Some people think community building efforts around AI Safety are net negative, such as the post about "Shutting Down the Lightcone Offices".

I'm not saying they're right (it seems complicated in a way I don't know how to solve), but I do think they're pointing at a real failure mode.

I think that having metrics for community building that are not strongly grounded in a "good" theory of change (such as metrics for increasing the number of people in the field in general) carries extra risk of that kind of failure mode.

Excuse me if I misunderstood what you're saying. I saw you specifically wanted comments and don't have any yet, so I'm erring on the side of sharing my first (maybe wrong) thoughts.

PeterSlattery

Mar 28 2023

Hey Yonatan, thanks for replying, I really appreciate it! Here is a quick response.

I read the comments by Oliver and Ben in "Shutting Down the Lightcone Offices".

I think that they have very valid concerns about AI Safety Movement Building (pretty sure I linked this piece in my article).

However, I don't think that the optimum response to such concern is to stop trying to understand and improve how we do AI Safety Movement building. That seems premature given current evidence.

Instead, I think that the best response here (and everywhere else...

Yonatan Cale

Mar 28 2023

Edit: I just wrote this, it's ~1:30am here, I'm super tired and think this was incoherent. Please be extra picky with what you take from my message; if something doesn't make sense then it's me, not you. I'm still leaving the comment because it sounds like you really want comments.

---

Hey,

TL;DR: This sounds too meta, I don't think I understand many important points of your plan, and I think examples would help.

It involves trusting experts, or deferring to them, or polling them, or having them supervise your work, or other things.

1) This still leaves open questions like "how do you choose those experts". For example, do you do it based on who has the most upvotes on the forum? (I guess not.) Or what happens if you choose "experts" who are making the AI situation WORSE and they tell you they mainly need to hire people to help them?

2) And if you pick experts correctly, then once you talk to one or several of these experts, you might discover a bottleneck that is not at all in having a shared language, but is "they need a LaTeX editor" or "someone needs to brainstorm how to find Nobel prize winners to work on AI Safety" (I'm just making this up here). My point is, my priors are they will give surprising answers. [my priors are from user research, and specifically this]. These are my priors for why picking something like "having a shared language" before talking to them is probably not a good idea (though I shared why I think so, so if it doesn't make sense, totally ignore what I said).

3) Too meta: "Finding a shared language" pattern matches for me (maybe incorrectly!) to solutions like "let's make a graph of human knowledge", which almost always fail (and I think when they work they're unusual). These solutions are... "far" from the problem.

Sorry I'm not so coherent. Anyway, something that might change my mind very quickly is if you'll give me examples of what "language" you might want to create. Maybe you want a term for "safety washing" as an exa

Yonatan Cale

Mar 28 2023

I want to add one more thing: This whole situation, where a large number of possible seemingly-useful actions turn out to be net negative, is SUPER ANNOYING imo. It is absolutely not anything against you, I wish it wasn't this way, etc.

Ah, and also: I wish many others would consult about their ideas in public as you've done here, and you have my personal appreciation for that, fwiw.

PeterSlattery

Mar 29 2023

Thanks! I understand. I am not taking any of this personally and I am enjoying the experience of getting feedback!

PeterSlattery

Mar 29 2023

Hi Yonatan,

Thank you for this! Your comment is definitely readable and helpful. It highlights gaps in my communication and pushes me to think more deeply and explain my ideas better. I've gained two main insights. First, I should be clearer about what I mean when I use terms like "shared language." Second, I realise that I see EA as a well-functioning aggregator for the wisdom of well-calibrated crowds, and want to see something similar to that for AI Safety Movement building.

Now, let me address your individual points, using the quotes you provided:

Quote 1: "This still leaves open questions like "how do you choose those experts". For example, do you do it based on who has the most upvotes on the forum? (I guess not.) Or what happens if you choose "experts" who are making the AI situation WORSE and they tell you they mainly need to hire people to help them?"

Response 1: I agree that selecting experts is a challenge, but it seems better to survey credible experts than to exclude that evidence from the decision-making process. Also, the challenge of ‘who to treat as expert’ applies to EA and decision-making in general. We might later think that some experts were not the best to follow, but it still seems better to pay attention to those who seem expert now as opposed to the alternative of making decisions based on personal intuitions.

Quote 2: And if you pick experts correctly, then once you talk to one or several of these experts, you might discover a bottleneck that is not at all in having a shared language, but is "they need a LaTeX editor" or "someone needs to brainstorm how to find Nobel prize winners to work on AI Safety" (I'm just making this up here). My point is, my priors are they will give surprising answers. [my priors are from user research, and specifically this]. These are my priors for why picking something like "having a shared language" before talking to them is probably not a good idea (though I shared why I think so, so if it doesn't make sens

Yonatan Cale

Mar 29 2023

I want to point out you didn't address my intended point of "how to pick experts". You said you'd survey "credible experts" - who are those? How do you pick them? A more object-level answer would be "by forum karma" (not that I'm saying it's the best answer, but it is more object-level than saying you'd pick the "credible" ones).

----------------------------------------

Yonatan: Peter:

Are these examples of things you think might be useful to add to the language of community building, such as "safety washing" might be an example of something useful?

If so, it still seems too-meta / too-vague. Specifically, it fails to address the problem I'm trying to point out of differentiating positive community building from negative community building.

And specifically, I think focusing on a KPI like "increased contributors" is the way that AI Safety community building accidentally becomes net negative.

See my original comment:

Yonatan Cale

Mar 29 2023

TL;DR: No. (I know this is an annoying unintuitive answer.)

I wouldn't be surprised if 85% of researchers think that it would be a good idea to advance capabilities (or do some research that directly advances capabilities and does not have a "full" safety theory of change), and they'll give you some reason that sounds very wrong to me. I'm assuming you interview anyone who sees themselves as working on "AI Safety".

[I don't actually know if this statistic would be true, but it's a kind of example of how your survey suggestion might go wrong imo]

PeterSlattery

Mar 30 2023

Thanks, that's helpful to know. It's a surprise to me though! You're the first person I have discussed this with who didn't think it would be useful to know which research agendas were more widely supported.

Just to check, would your intuition change if the people being surveyed were only people who had worked at AI organisations, or if you could filter to only see the aggregate ratings from people who you thought were most credible (e.g., these 10 researchers)?

As an aside, I'll also mention that I think it would be a very helpful and interesting finding if we found that 85% of researchers thought that it would be a good idea to advance capabilities (or do some research that directly advances capabilities and does not have a "full" safety theory of change). That would make me change my mind on a lot of things and probably spark a lot of important debate that probably wouldn't otherwise have happened.

PeterSlattery

Mar 30 2023

Thanks for replying:

I want to point out you didn't address my intended point of "how to pick experts". You said you'd survey "credible experts" - who are those? How do you pick them? A more object-level answer would be "by forum karma" (not that I'm saying it's the best answer, but it is more object-level than saying you'd pick the "credible" ones)

Sorry. Again, early ideas, but the credible experts might be people who have published an AI safety paper, received funding to work on AI, and/or worked at an organisation etc. Let me know what you think of that as a sample.

Yonatan: something that might change my mind very quickly is if you'll give me examples of what "language" you might want to create.

Peter would conceptualize movement building in a broad sense e.g., as something involving increased contributors, contribution and coordination, people helping with operations communication and working on it while doing direct work (e.g., via going to conferences etc)

Are these examples of things you think might be useful to add to the language of community building, such as "safety washing" might be an example of something useful?

The bolded terms are broadly examples of things that I want people in the community to conceptualize in similar ways so that we can have better conversations about them (i.e., shared language/understanding). What I mention there and in my posts is just my own understanding, and I'd be happy to revise it or use a better set of shared concepts.

If so, it still seems too-meta / too-vague. Specifically, it fails to address the problem I'm trying to point out of differentiating positive community building from negative community building. And specifically, I think focusing on a KPI like "increased contributors" is the way that AI Safety community building accidentally becomes net negative. See my original comment:

I agree that the shared language will fail to address the problem of differentiating positive community building from neg

Linda Linsefors

Mar 28, 2023

Are you aware of Alignment Ecosystem Development? They have a list of potentially high impact community building projects. If anyone wants to volunteer some time to help with AI Safety community building, you can join their Discord.

Regarding how to prioritise: I would be very worried if there were a consensus around how to rank which projects are highest priority; that would more likely be groupthink than wisdom. I think it's much healthier for everyone to form their own opinion. The sort of coordination I would like to see is different community builders getting to know each other and knowing about each other's projects.

Fractional Movement Building is a good idea. I'm doing this myself, and the same is true for about half of the AI Safety community builders I know. But I would not prescribe it as a one-size-fits-all solution. I am also not claiming that you are suggesting this; it's unclear to me how far you want to take this concept.

You write about the importance of a shared language. This seems useful. Although if you want to contribute to this, maybe create a vocabulary list? In the section about shared language in your previous post you wrote:

This is why I wrote this series of posts to outline and share the language and understanding that I have developed and plan to use if I engage in more direct work.

However, your posts are really long. I'm not going to read all of them. I only read a few bits of your previous post that seemed most interesting. Currently I don't know what vocabulary you are proposing.

Also regarding shared vocabulary, I'd be excited about that if and only if it doesn't become too normative. For example, I think that the INT framework is really good as a starting point, but it has since become too influential. You can't make a single framework that captures everything.

In your previous post you also write:

To determine if they should get involved, they ask questions like: Which movement building projects are generally considered good to pursue for someone like me and which are bad? What criteria should I evaluate projects on? What skills do I need to succeed? If I leave my job would I be able to get funding to work on (Movement Building Idea X)?

I know you are just describing what other people are saying and thinking, so I'm not criticizing you. But other than the last question, these are the wrong questions to ask. I don't want community builders to ask what projects are generally considered good, I want them to ask what projects are good. Also, a focus on evaluation early on seems backwards. Uncertainty about funding is a real issue though. It might be tempting to focus on the community building that is most legibly good, in order to secure a career. But I think that is exactly the road that leads to potentially net negative community building.

PeterSlattery

Mar 29 2023

Thanks, Linda, that was very helpful. I really appreciate that you took the time to respond in such detail.

Quote 1: “Are you aware of Alignment Ecosystem Development? They have a list of potentially high impact community building projects. If anyone wants to volunteer some time to help with AI Safety community building, you can join their Discord.”

No, thank you for sharing. I will check it out. My immediate thought is that 54 options is a lot so I’d like to know which ones are more promising, which sort of ties into my comments below!

Quote 2: “Regarding how ...

Linda Linsefors

Mar 29 2023

As far as I can tell, Fractional Movement Building is the norm almost everywhere. In academia, most workshops are run by researchers for researchers. In all the hobby movements I've been a part of, events are run by hobbyists for hobbyists. Unfortunately I don't have this sort of knowledge of other professional networks, other than EA and academia.

I also think that some amount of FMB is necessary to keep the movement building grounded in what is needed among practitioners. A lot of us organisers ended up becoming organisers because we were researchers or aspiring researchers and noticed an organisational need.

I don't know of any successful movement building that doesn't have some of this grounding. However, I also don't have intimate enough knowledge of all successful movement building to be sure there are no exceptions.

PeterSlattery

Mar 30 2023

Yes, this is a good point and something that I could/should probably mention when I make the case for more fractional movement building. I'll have to think more about the specifics as I get more engagement and experience. Thanks for all the useful input!

Linda Linsefors

Mar 29 2023

That's why there are also a discord and regular calls.

Linda Linsefors

Mar 29 2023

I gave the wrong link before. I meant to post this one: alignment.dev projects · Alignment Ecosystem Development (coda.io). But instead I posted this one: aisafety.community · Alignment Ecosystem Development (coda.io). I've fixed my previous comment now too.

It's also a long list, so your point stands. But it's a list of projects, not a list of groups. I would not send the list of communities to someone new; that's for when you know a bit more about what you want to do and what community you are looking for.

I would give the list of projects to someone looking to help with community building, but more importantly, I'd point them to the Discord, which I did successfully link above: https://discord.gg/dRPdsEhYmY

Rebecca

Mar 29 2023

With the list of projects, it looks like most of them are launched or soft launched and so don’t require further assistance?

Linda Linsefors

Apr 4 2023

Some yes, but some still need more work. I hear some more will be added soon, and others are welcome to add too. There is also a related monthly call you can join for more details. Alignment Ecosystem Development Alignment Ecosystem Dev (google.com)

Linda Linsefors

Mar 29 2023

I'm not saying there is a perfect system for onboarding community builders, just saying that there is something, and you should know about it. There is always more organising work to do, including meta organising. Although in the spirit of FMB, it might be a good idea to do some regular movement building before you do meta movement building?

Linda Linsefors

Mar 29 2023

Oh, no. I'm still sharing the wrong link. This one is the right one: Alignment Ecosystem Development

PeterSlattery

Mar 30 2023

Thanks, there are a lot of good ideas in here!

PeterSlattery

Mar 30 2023

Ok, that's fair.

Although in the spirit of FMB, it might be a good idea to do some regular movement building before you do meta movement building?

Yes, this seems right. I have done a lot of EA movement building and a little AI Safety movement building. I suspect that there is still a lot to be learned from doing more movement building. I plan to do some in a few months, so that should help me to revalidate whether my various models/ideas make sense.

more better

Apr 17 2023

@PeterSlattery I want to push back on the idea about "regular" movement building versus "meta". It sounds like you have a fair amount of experience in movement building. I'm not sure I agree that you went meta here, but if you had, I am not convinced that would be a bad thing, particularly given the subject matter.

I have only read one of your posts so far, but appreciated it. I think you are wise to try to facilitate the creation of a more cohesive theory of change, especially if inadvertently doing harm is a significant risk.

As someone on the periphery and not working in AI safety but who has tried to understand it a bit, I feel pretty confused, as I haven't encountered much in the way of strategy and corresponding tactics. I imagine this might be quite frustrating and demotivating for those working in the field. I agree with the anonymous submission that broader perspectives would likely be quite valuable.

PeterSlattery

Apr 20 2023

Thanks for the thoughts, I really appreciate that you took the time to share them.

Linda Linsefors

Apr 4 2023

I don't want to discourage you in any way. The best person to solve a problem is often the one to spot that problem, so if you see problems and have ideas you should go for it.

However, a consistent problem is that lots of people don't know what resources exist. I think a better recommendation than what I wrote before is to find out what already exists, and then decide what to do. Maybe add missing resources, or help with signal boosting, whichever makes sense.

Also, I'm not claiming to be an expert. I think I know about half of what is going on in AI Safety community building. If you want to get in touch with more community builders, maybe join one of these calls? Alignment Ecosystem Dev (google.com)

There are some different Slacks and Discords too for AIS community building, but not any central one. Having a central one would be good. If you want to coordinate this, I'd support that, conditioned on you having a plan for avoiding this problem: xkcd: Standards

PeterSlattery

Mar 28, 2023

Anonymous submission:

I only skimmed your post so I very likely missed a lot of critical info. That said, since you seem very interested in feedback, here are some claims that are pushing back against the value of doing AI Safety field building at all. I hope this is somehow helpful.

- Empirically, the net effects of spreading MIRI ideas seem to be squarely negative, both from the point of view of MIRI itself (increasing AI development, pointing people towards AGI), and from other points of view.

- The view of AI safety as expounded by MIRI, Nick Bostrom, etc. is essentially an unsolvable problem. To put it in words that they would object to, they believe at some point humanity is going to invent a Godlike machine and this Godlike machine will then shape the future of the universe as it sees fit, perhaps according to some intensely myopic goal like maximizing paperclips. To prevent this from happening, we need to somehow make sure that AI does what we want it to do by formally specifying what we really want in math terms.

The reason MIRI have given up on making progress on this and don't see any way forward is that this is an unsolvable situation.

Eliezer sometimes talks about how the textbook from the future would have simple alignment techniques that work easily but he is simply imagining things. He has no idea what these techniques might be, and simply assumes there must be a solution to the problem as he sees it.

- There are many possibilities of how AI might develop that don't involve MIRI-like situations. The MIRI view essentially ignores economic and social considerations of how AI will be developed. They believe that the economic advantages of a super AI will lead to it eventually happening, but have never examined this belief critically, or even looked at the economic literature on this very big, very publicly important topic that many economists have worked on.

- A lot of abuse and bad behavior has been justified or swept under the rug in the name of 'We must prevent unaligned AGI from destroying the cosmic endowment'. This will probably keep happening for the foreseeable future.

- People going into this field don't develop great option value.

PeterSlattery

Mar 30, 2023

Anonymous submission:

I have pretty strong epistemics against the current approach of “we’ve tried nothing and we’re all out of ideas”. It’s totally tedious seeing reasonable ideas get put forward, some contrarian position get presented, and the community revert to “do nothing”. That recent idea of a co-signed letter about slowing down research is a good example of the intellectual paralysis that annoys me. In some ways it feels built on perhaps a good analytical foundation, but a poor understanding of how humans and psychology and policy change actually work.

Effective Altruism Forum