Meta-EA Needs Models

by Mark Xu, 5th Apr 2021

This is a linkpost for https://markxu.com/meta-ea-models

Thanks to Kuhan Jeyapragasan, Michael Byun, Sydney Von Arx, Thomas Kwa, Jack Ryan, Adam Křivka, and Buck Shlegeris for helpful comments and discussion.

Epistemic status: a bunch of stuff

Sometimes I have conversations with people that go like this:

Me: Feels like all the top people in EA would have gotten into EA anyway? I see a bunch of people talking about how sometimes people get into EA and they agree with the philosophy, but don't know what to do and slowly drift away from the movement. But also there's this other thing they could have done, which is "figure out what to do and then do that." Why didn't they do that instead? More concretely, sometimes I talk to software engineers who want to get into AI safety. Nate Soares was once a software engineer, and he studied a bunch of math and now runs MIRI. This feels like a gap that can't really be solved by creating concrete next steps.

Them: A few things:

  1. Anecdotally (and also maybe some survey data), there are people that you would consider "top EAs" where it feels like they could have not gotten into EA if things were different, e.g. they were introduced by a friend they respected less or they read the wrong introduction. It seems still quite possible that we aren't catching all the "top people."

  2. Even if we can't get people to counterfactually become EAs, we can still speed up their careers. It’s much easier to convince people to change careers in undergrad, when they haven't yet invested a bunch of effort in one path. For example, if you want EAs to get PhDs, then the path becomes much murkier after undergrad.

  3. There are people with other skills that the EA movement needs, such as personal assistants. The same goes for fields like journalism and anthropology, which the EA movement needs but might not get by default because such people aren't attracted to "EA-type-thinking".

  4. Even if we don’t convert people to EA, we can spread EA ideas to people who might make important decisions in the future, like potential employees of large companies, politicians, lawyers, etc. For example, the world would probably be better if all ML engineers were mildly concerned with risks posed by AI. Additionally, people often donate money to charity, and if we can plant the seed of EA in such people, they might differentially donate more money to effective charities.

(This is the part where I just say the things I wanted to say from the beginning but didn't know how to lead into.)

Me: These all seem like good reasons why meta-EA is still valuable, but the main objection I have to them is that they all suggest that meta-EA is operating at different orders of magnitude. For example, if the goal of meta-EA is counterfactual speed versus counterfactual careers, that's like a 10-50x difference in the "number of EA years" you're getting out of an individual.
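As a rough illustration of where a gap like that could come from, here is a minimal sketch of the arithmetic. The 40-year career length and the 1 to 4 year speed-ups are made-up assumptions for illustration, not figures from any survey, and `ea_years_ratio` is a hypothetical helper name.

```python
def ea_years_ratio(career_years: float, speedup_years: float) -> float:
    """EA-years from a counterfactual career divided by EA-years from a speed-up.

    Both inputs are illustrative assumptions, not measured quantities.
    """
    if speedup_years <= 0:
        raise ValueError("speedup_years must be positive")
    return career_years / speedup_years


# Assumption: a counterfactual career is worth about 40 EA-years.
CAREER_YEARS = 40

# Assumption: a counterfactual speed-up is worth 1 to 4 EA-years.
for speedup in (4, 1):
    ratio = ea_years_ratio(CAREER_YEARS, speedup)
    print(f"{speedup}-year speed-up: career is worth {ratio:.0f}x as many EA-years")
```

Under these assumptions the ratio comes out between 10x and 40x. Different guesses for either number would move it, which is part of why the choice of goal matters so much.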

More broadly, it feels like meta-EA has a high-level goal, which is "make the world better, whatever that means", but has a very murky picture of how this resolves into instrumental subgoals. There are certain cruxes, like "how hard is it to get a counterfactual top EA", that we have very little traction on and that potentially have a 10x influence on how impactful meta-EA is.

More concretely, I imagine taking my best guess at the "current plan of meta-EA" and giving it to Paul Graham and him not funding my startup because the plan isn't specific/concrete enough to even check if it's good and this vagueness is a sign that the key assumptions that need to be true for the plan to even work haven't been identified.

GiveWell tries to do a very hard thing: evaluate charities against each other. The way they do this involves a pretty complicated model and a bunch of estimations, but you can look at the model and look at the estimates of the parameters and say "the model seems reasonable and the estimates seem reasonable, so the output must be reasonable." However, the current state of meta-EA is that I don't know how to answer the question of whether action A is better than action B. It's as if we were just looking at charities and estimating "12 goodness" and "10 goodness" directly, which is not how you're supposed to estimate things.

Sometimes, questions are too difficult to answer directly. However, if you’re unable to answer a question, then a sign that you’ve understood the question is your ability to break it down into concrete subquestions that can be answered, each of which is easier to answer than the original top-level question. If you can’t do this, then you’re just thinking in circles.

AI safety isn't much better from a "has concrete answers to questions" perspective, but in AI safety, I can give you a set of subquestions that would help answer the top-level questions, which I feel like I can't do for meta-EA. Like the questions would be something like "what is meta-EA even trying to do? What are some plausible goals? Which one are we aiming for?" and then I would proceed from there.

To me, this is the core problem of meta-EA: figuring out what the goal is. Who are we targeting? What's the end-game? Why do we think movement building is valuable? How does it ground out in actions that concretely make the world better?

Right now it feels like a lot of meta-EA is just taking actions without having models of how those actions lead to increased value in the world. And there are people working on collecting data and figuring out answers to some of these questions, which is great, but it still feels like a lot of people in meta-EA don't have this sense that they're trying to do a very hard thing and should be doing it in a principled way.

(This is the part where I don’t quite know how my interlocutor would respond.)

Them: Yeah, I agree that meta-EA lacks explicit models, but that’s just because it’s trying to solve a much harder problem than GiveWell/AI safety, precisely because there are so many possible goals. In practice, meta-EA doesn’t look like “pick the best goal, then try and pursue it”; it looks more like “observe the world, determine which of the many plausible goals of meta-EA are most tractable given what you’ve observed, then steer towards that.” There isn’t a strong “meta-EA agenda” because such an agenda would be shot to pieces within the week. In general, I’m worried about top-down style explicit reasoning overriding the intuitions of people on the ground by biasing towards calculable/measurable things.

Sure, “plans are useless, but planning is indispensable”, but sometimes, if the world is changing quickly and you are aiming for five different things in a hundred different ways, planning is also not that useful. In practice, it’s just better to do things that are locally valuable instead of trying to back-chain from some victory condition that you’re not even sure is possible. In the AI safety case, this might look like work on neural network interpretability, which is robustly useful in a broad class of scenarios (see An Analytic Perspective on AI Alignment for more discussion).

Overall, I agree that meta-EA could benefit from more explicit models, but it’s important to note that “model-free” work can still be robustly useful.

(This is the part where I conclude by saying these issues are very tricky and confusing and I would be excited about people thinking about them and gathering empirical data.)

6 comments

Thanks for sharing this!

> Feels like all the top people in EA would have gotten into EA anyway?

Possibly you don't endorse this statement and were just using it as an intro, but I think your interlocutor's response (1) is understated: I can't think of any products which don't benefit from having a marketing department. If EA doesn't benefit from marketing (broadly defined), it would be an exceptionally unusual product.

> I imagine taking my best guess at the "current plan of meta-EA" and giving it to Paul Graham and him not funding my startup because the plan isn't specific/concrete enough to even check if it's good and this vagueness is a sign that the key assumptions that need to be true for the plan to even work haven't been identified.

For what it's worth, CEA's plans seem more concrete than mine were when I interviewed at YC. CLR's thoughts on creating disruptive research teams are another thing which comes to mind as having key assumptions which could be falsified.

Also seems relevant that both 80k and CEA went through YC (though I didn't work for 80k back then and don't know all the details).

Terminology comment: Although you refer to Meta EA throughout this post, it seems what you are really talking about is EA community building specifically, as opposed to other Meta EA efforts which could include infrastructure, cross-cutting services to the EA ecosystem, meta research (e.g. global priorities research) etc. Does this sound right, or do you actually also mean other Meta EA activities?

> Anecdotally (and also maybe some survey data), there are people that you would consider "top EAs" where it feels like they could have not gotten into EA if things were different, e.g. they were introduced by a friend they respected less or they read the wrong introduction. It seems still quite possible that we aren't catching all the "top people."

I agree with all of this. In particular, saying "all the people in EA seem like they'd have ended up here eventually" leaves out all the people who also "seem like they'd have ended up here eventually" but... aren't here.

I can think of people like this! I had lots of conversations while I was leading the Yale group. Some of them led to people joining; others didn't; in some cases, people came to a meeting or two and then never showed up again. It's hard to imagine there's no set of words I could have said, or actions I could have taken, that would have converted some people from "leaving after one meeting" to "sticking around" or from "never joining" to "attending a first event out of curiosity".

The Introductory Fellowship is a thing, created and funded by "meta" people, that I think would have "converted" many of those people — if I'd had access to it back in 2014, I think EA Yale could have been twice the size in its first year, because we lost a bunch of people who didn't have anything to "do" or who were stuck toiling away on badly-planned projects because I was a mediocre leader.

*****

I also have at least one friend I think would have been a splendid fit, and who was involved with the community early on, but then had a terrible experience with the person who introduced her to EA (they are no longer a member) and has now soured on everything related to the community (while still holding personal beliefs that are basically EA-shaped, AFAICT). That's the sort of thing that meta/community-building work should clearly prevent if it's going well. 

Had my friend had the bad experience in 2021 rather than nearly a decade earlier, she'd have access to help from CEA, support from several specialized Facebook groups, and a much larger/better-organized community in her area that would [I hope] have helped her resolve things.

> Sometimes, questions are too difficult to answer directly. However, if you’re unable to answer a question, then a sign that you’ve understood the question is your ability to break it down into concrete subquestions that can be answered, each of which is easier to answer than the original top-level question. If you can’t do this, then you’re just thinking in circles.

I am actually working on a post that provides an adaptable framework for decision-making which tries to do this. That being said, I naturally make no guarantees that it will be a panacea (and in fact if there are any meta-EA-specific models being used, I would assume that the framework I'm presenting will be less well tailored to meta-EA specifically).