Some thoughts on EA outreach to high schoolers

The first EuroSPARC was in 2016. Since it targeted 16-19 year olds, my prior would be that participants should still mostly be studying, and not working full-time on EA, or only exceptionally.

Long feedback loops are certainly a disadvantage.

Also, in the meantime ESPR has undergone various changes and is actually not optimising for something like "conversion rate to an EA attractor state".

The case of the missing cause prioritisation research

Quick reaction:

I. I did spend a considerable amount of time thinking about prioritisation (broadly understood).

My experience so far is

  • some of the foundations / sensible low-hanging fruit have been discovered
  • when moving beyond that, I often run into questions which are some sort of "crucial consideration" for prioritisation research, but the research/understanding is often just not there
  • work on these "gaps" often seems more interesting and tractable than some sort of "let's try to ignore this gap and move on" move

A few examples, where in some cases I got to writing something:

  • Nonlinear perception of happiness - if you try to add utility across time-person-moments, it's plausible you should log-transform it (or transform it non-linearly in some other way). Sums and exponentiation do not commute, so this is plausibly a crucial consideration for the part of utilitarian calculations that tries to be based on some sort of empirical observation like "pain is bad".
  • Multi-agent minds and predictive processing - while this is framed as being about AI alignment, the super-short version of why it is relevant for prioritisation is: theories of human values depend on what mathematical structures you use to represent these values. If your prioritisation depends on your values, this is possibly important.
  • Another example could be the style of thought explained in Eliezer's "Inadequate Equilibria". While you may not count it as "prioritisation research", I'm happy to argue the content is crucially important for prioritisation work on institutional change or policy work. I spent some time thinking about "how to overcome inadequate equilibria", which leads to topics from game theory, complex systems, etc.
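The first example above can be made concrete with a toy calculation. This is only a sketch: the report values, the 0-10 scale, and the base-10 logarithmic mapping from reports to underlying intensity are assumptions chosen for illustration, not empirical claims.

```python
# Hypothetical self-reports for two lives of three moments each.
reports_a = [2.0, 2.0, 2.0]
reports_b = [1.0, 1.0, 4.0]


def total_linear(reports):
    """Sum the reports directly, treating the reporting scale as linear."""
    return sum(reports)


def total_log_scale(reports, base=10.0):
    """Assume each report is log_base of the underlying intensity,
    so invert the logarithm before summing (illustrative assumption)."""
    return sum(base ** r for r in reports)


linear_a, linear_b = total_linear(reports_a), total_linear(reports_b)
log_a, log_b = total_log_scale(reports_a), total_log_scale(reports_b)

print("linear totals:", linear_a, linear_b)
print("log-scale totals:", log_a, log_b)
```

Under the linear reading the two lives tie, while under the assumed logarithmic reading the single intense moment dominates the total, so the ranking depends on a choice made before any summing happens.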

II. My guess is there are more people who work in a similar mode: basically trying to 'build as good a world model as you can', dive into the problems you run into, and at the end prioritise informally based on such a model. Typically I would expect such a model to be in parts implicit / be some sort of multi-model ensemble / ...

While this may not create visible outcomes labeled as prioritisation, I think it's an important part of what's happening now.

'Existential Risk and Growth' Deep Dive #2 - A Critical Look at Model Conclusions

I posted a short version of this, but I think people found it unhelpful, so I'm trying to post a somewhat longer version.

  • I have seen some number of papers and talks broadly in the genre of "academic economy"
  • My intuition based on that is, often they seem to consist of projecting complex reality into a space of single-digit real number dimensions and a bunch of differential equations
  • The culture of the field often signals that solving the equations is profound/important, and that how you do the projection "world -> 10d" is less interesting
  • In my view, for practical decision making and world-modelling it's usually the opposite: the really hard and potentially profound part is the projection. Solving the maths is often in some sense easy, at least in comparison to the best maths humans are doing
  • While I overall think the enterprise is worth pursuing, people should in my view have a relatively strong prior that for any conclusions which depend on the "world -> reals" projection there could be many alternative projections leading to different conclusions; while I like the effort in this post to dig into how stable the conclusions are, in my view people who do not have cautious intuitions about the space of "academic economy models" could still easily over-update or trust the robustness too much
  • If people are not sure, an easy test could be something like "try to modify the projection in any way, so the conclusions do not hold". At the same time this will usually not lead to an interesting or strong argument; it's just trying some semi-random moves in the model space. But it can lead to a better intuition.
  • I tried to do a few tests in a cheap and lazy way (e.g. what would this model tell me about running at night on a forested slope?) and my intuition was:
  • I agree with the cautious statement in the present post that the work in the paper represents very weak evidence for the conclusions that follow only from the detailed assumptions of the model. (At the same time it can be an excellent academic economy paper.)
  • I'm more worried about other writing about the results, such as the linked post on Phil's blog, which in my reading signals more of "these results are robust" than is safe
  • Harder and more valuable work is to point to something like "some of the most significant ways in which the projection fails" (aspects of reality you ignored etc.). In this case this was done by Carl Shulman and it's worth discussing further
  • In practice I do have some worries about a meme like 'ah, we don't know, but given we don't know, speeding up progress is likely good' (as proved in this good paper) being created in the EA memetic ecosystem. (To be clear, I don't think the meme would reflect what Leopold or Ben believe.)

Neglected EA Regions

I'm not sure you've read my posts on this topic? (1,2)

In the language used there, I don't think the groups you propose would help people meet the minimum recommended resources, but they do risk creating the appearance that some criteria vaguely in that direction are met.

  • e.g., in my view, the founding group must have a deep understanding of effective altruism and, essentially, the ability to go through the whole effective altruism prioritization framework, taking into account local specifics, to reach conclusions valid for their region. This is basically impossible to implement as a membership requirement in a FB group
  • or strong link(s) to the core of the community ... this is not fulfilled by someone from the core hanging out in many FB groups with otherwise unconnected people

Overall, I think sometimes small obstacles - such as having to find EAs from your country in the global FB group or on EA hub and by other means - are a good thing!

Neglected EA Regions

FWIW the "Why not to rush to translate effective altruism into other languages" post was quite influential, but in my opinion it is often wrong / misleading / advocating a very strong prior on inaction.

Neglected EA Regions

I don't think this is actually neglected

  • in my view, bringing effective altruism into new countries/cultures is, in its initial phases, best understood as strategy/prioritisation research, not as "community building"
    • the importance of this increases with increasing distance (cultural / economic / geographical / ...) from places like Oxford or the Bay

(more on the topic here)

  • I doubt the people who are plausibly good founders would actually benefit from such groups, and even less from some vague coordination via Facebook groups
    • actually I think on the margin, if there are people who would move forward with localization efforts if such FB groups exist and other similar people express interest, and would not do so otherwise, their impact could easily be negative

AI safety scholarships look worth-funding (if other funding is sane)

  • I don't think it's reasonable to think about FHI DPhil scholarships, and even less so RSP, as mainly a funding program. (Maybe ~15% of the impact comes from the funding.)
  • If I understand the funding landscape correctly, both EA funds and LTFF are potentially able to fund a single-digit number of PhDs. Actually, has someone approached these funders with a request like "I want to work on safety with Marcus Hutter, and the only thing preventing me is funding"? Maybe I'm too optimistic, but I would expect such requests to have a decent chance of success.

I'm Buck Shlegeris, I do research and outreach at MIRI, AMA



For example, CAIS and something like the "classical superintelligence in a box" picture disagree a lot on the surface level. However, if you look deeper, you will find many similar problems. A simple-to-explain example: the problem of manipulating the operator, which has (in my view) some "hard core" involving both math and philosophy. You want the AI to communicate with humans in a way which at the same time a) allows the human to learn from the AI if the AI knows something about the world, b) does not let the operator's values be "overwritten" by the AI, and c) does not prohibit moral progress. In CAIS language this is connected to so-called manipulative services.

Or: one of the biggest hits of the past year is the mesa-optimisation paper. However, if you are familiar with prior work, you will notice many of the solutions proposed for mesa-optimisers are similar to, or the same as, solutions previously proposed for so-called 'daemons' or 'misaligned subagents'. This is because the problems partially overlap (the mesa-optimisation framing is clearer and makes a stronger case for "this is what to expect by default"). Also, while on the surface level there is a lot of disagreement between e.g. MIRI researchers, Paul Christiano and Eric Drexler, you will find a "distillation" proposal targeted at the above-described problem in Eric's work from 2015, and many connected ideas in Paul's work on distillation; and while I find it harder to understand Eliezer, I think his work also reflects an understanding of the problem.


For example: you can ask whether the space of intelligent systems is fundamentally continuous or not. (I call this "the continuity assumption".) This is connected to many agendas - if the space is fundamentally discontinuous, this would cause serious problems for some forms of IDA, debate, interpretability & more.

(An example of discontinuity would be the existence of problems which are impossible to meaningfully factorize; there are many more ways in which the space could be discontinuous.)

There are powerful intuitions going both ways on this.

I'm Buck Shlegeris, I do research and outreach at MIRI, AMA

I think the picture is somewhat correct and, surprisingly, we should not be too concerned about the dynamic.

My model for this is:

1) there are some hard and somewhat nebulous problems "in the world"

2) people try to formalize them using various intuitions/framings/kinds of math; also using some "very deep priors"

3) the resulting agendas look at the surface level extremely different, and create the impression you have

but actually

4) if you understand multiple agendas deeply enough, you get a sense of

  • how they are sometimes "reflecting" the same underlying problem
  • if they are based on some "deep priors", how deep those priors are, and how hard they can be to argue about
  • how much they are based on "tastes" and "intuitions" ~ one model of how to think about this is people having boxes comparable to the policy net in AlphaZero: a mental black box which spits out useful predictions, but is not interpretable in language

Overall, given our current state of knowledge, I think running these multiple efforts in parallel is a better approach, with a higher chance of success, than the idea that we should invest a lot in resolving disagreements/prioritizing and that everyone should work on the "best agenda".

This seems to go against a core EA heuristic ("compare the options, take the best") but is actually more in line with rational allocation of resources in the face of uncertainty.
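A toy calculation can illustrate the allocation point. It is a sketch under loudly stated assumptions that are not in the original comment: exactly one of three hypothetical agendas is the "right" framing, with the made-up probabilities below, and effort on an agenda has diminishing returns of the form 1 - exp(-effort).

```python
import math


def expected_success(priors, allocation, rate=1.0):
    """Expected chance of success when agenda i is the right one with
    probability priors[i] and effort r on it succeeds with probability
    1 - exp(-rate * r). The diminishing-returns form is an assumption."""
    return sum(p * (1.0 - math.exp(-rate * r))
               for p, r in zip(priors, allocation))


# Hypothetical credences in three agendas, and a fixed total effort budget.
priors = [0.4, 0.3, 0.3]
budget = 3.0

all_on_best = [budget, 0.0, 0.0]    # "compare the options, take the best"
spread_evenly = [budget / 3.0] * 3  # run the efforts in parallel

concentrated = expected_success(priors, all_on_best)
parallel = expected_success(priors, spread_evenly)

print(f"all on best agenda: {concentrated:.3f}")
print(f"spread in parallel: {parallel:.3f}")
```

In this toy setup spreading effort does better than concentrating on the single most likely agenda. The result depends on the assumed diminishing returns and on the credences not being too lopsided; with linear returns, concentrating on the best option would win.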
