
Introduction

Suppose that at every point in time, we take the action $a^*$ given by:

$$a^* = \operatorname{argmax}_{a \in A} \, E[U \mid a, W]$$

That is, we want to choose the action $a$ in the set of possible actions $A$ which maximizes ($\operatorname{argmax}$) the expected ($E$) utility ($U$) in the world given that action ($a$) and given all our observations and models about the world ($W$).
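In code, the model is just a maximization, as in this minimal sketch; the `world_model` object and its `sample_outcomes` method are hypothetical stand-ins for $E$ and $W$, not anything more:

```python
def expected_utility(action, world_model):
    """Stand-in for E[U | a, W]: average utility over sampled outcomes."""
    outcomes = world_model.sample_outcomes(action, n=1000)  # hypothetical API
    return sum(outcomes) / len(outcomes)

def choose_action(possible_actions, world_model):
    """argmax over a in A of E[U | a, W]."""
    return max(possible_actions, key=lambda a: expected_utility(a, world_model))
```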

In the next sections, I will give a brief example, analyze each of the parts of this model in some detail as they relate to altruism, flesh them out, and then point out where I think some EA organizations and I fall within this model.

I hesitated for a long while about posting this piece, because I thought that it might be perceived as too basic or unsophisticated, and because I'd been working on a related but much more complicated model. And indeed, the below model is basic. However, I've found that it does contribute to my clarity of thought, which I think is valuable.

A brief example

If your utility function $U$ is "eat as much ice cream as possible", then at every point you'd want to choose the action $a$ among the set of possible actions $A$ available to you (buy ice cream, invest in the stock market, work to get more money, etc.) which leads to the most ice cream eaten by you, given all you know about the world.

The moving parts in our model, and what it means to optimize them, are:

  • Our utility function $U$. In this case, this is "eat as many ice creams as possible". Fine-tuning this utility function might involve better defining what an ice cream is, and why eating them is valuable to us.
    • Concrete example: Maybe you reflect on the meaning of ice cream and decide that what you really care about is actually the feeling of contentment while eating ice cream in good company, maximizing sugar intake, or something else.
  • The optimal action $a$, and the set of actions $A$ available to you. Fine-tuning them might involve gaining access to larger or better sets of actions.
    • Concrete example: You make sure to get better grades in school so that your parents don't ground you and limit your range of action.
  • The expected value function $E$. Fine-tuning this might involve becoming a better forecaster and tracking the past record of information sources.
    • Concrete example: You hire a group of superforecasters to predict inflation; the higher the inflation, the less ice-cream you will be able to later buy with your savings.
  • Our knowledge of the world, $W$. Fine-tuning this might involve gaining more information about the world, and having it better organized.
    • Concrete example: You become an expert about the ice cream supply chain, but you also get a subscription to The Economist to be informed about broad trends which may affect your plans.
  • Our decision method, originally $\operatorname{argmax}$. Fine-tuning it might involve choosing a different decision method.
    • Concrete example: Because of your moral uncertainty, you're inclined to quantilize (e.g., to choose an action randomly among the top 5% of actions by expected value) rather than to directly choose the action with the highest expected value; see the sketch just after this list.
    • Concrete example: You fine-tune your $E$ function to also take into account not only the direct utility of actions, but also their value of information. For instance, if your city has 100 ice cream shops, the long-run optimal behavior is probably to visit all of them at some point and choose the best, rather than to always go to the one which was in-expectation best at the beginning (cf. Multi-armed bandits).
    • Concrete example: Because you can't actually evaluate the value of all actions and then choose the most valuable ones, you find yourself making some simplifications.
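As mentioned above, a quantilizer in this sense can be sketched in a few lines; the function names here are mine, purely illustrative:

```python
import random

def quantilize(actions, expected_value, top_fraction=0.05):
    """Choose uniformly at random among the top `top_fraction` of actions
    by expected value, instead of taking the single argmax."""
    ranked = sorted(actions, key=expected_value, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    return random.choice(ranked[:k])
```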

Building Blocks

The choice function (originally $\operatorname{argmax}$)

So you have something like a landscape of the expected value of actions, and you want to find and choose the highest point. Some ways in which you can improve your ability to do this:

  • Having more computing power or intellectual manpower to sift through the landscape of expected values.
  • Having better algorithms; better processes to calculate the value of actions and to choose among them:
    • cheaper algorithms,
    • more accurate algorithms,
    • more scalable algorithms, and in particular, computations that can be reused by many people,
    • etc.
  • Having better "parametrizations" of actions so that you can evaluate groups of actions all at once.`
  • Having better fundamentals

Consider an organization like GiveWell. GiveWell could estimate the value of any charity, but doing so is costly, so it can't just evaluate every charity and choose the best ones. This leads to interesting exploration-exploitation tradeoffs, even if its evaluations of the expected value of any particular charity were perfect (!).
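To make that tradeoff concrete, here is a sketch of a UCB-style evaluation loop under a limited evaluation budget. Everything here (`noisy_evaluation`, the budget) is a hypothetical stand-in, not anything GiveWell actually runs:

```python
import math

def ucb1_charity_search(charities, noisy_evaluation, budget):
    """Explore/exploit over charities when each impact evaluation is costly
    and noisy. `noisy_evaluation(c)` returns one imperfect estimate of
    charity c's value per call."""
    counts = {c: 1 for c in charities}
    means = {c: noisy_evaluation(c) for c in charities}  # evaluate each once
    for t in range(len(charities), budget):
        # Upper confidence bound: favors both high estimates and
        # under-explored charities.
        c = max(charities,
                key=lambda x: means[x] + math.sqrt(2 * math.log(t) / counts[x]))
        means[c] += (noisy_evaluation(c) - means[c]) / (counts[c] + 1)
        counts[c] += 1
    return max(charities, key=lambda x: means[x])
```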

Here, better parametrizations are particularly helpful. By parametrizations, I mean something like dividing the space of options into parts which can be considered in isolation. For example, GiveWell could divide charities into various cause areas, and evaluate swathes of causes (e.g., rare diseases) all at once.

Good parametrizations could lead to efficiency gains; worse parametrizations could lead to confused results. For example, one might feel aversion towards "politics" in general—thinking that it is generally toxic—and as a result discount "better voting mechanisms" as a cause. But perhaps a more fine-grained parametrization would have made a distinction between "ideological or party politics" and "all other politics", and realized that "better voting mechanisms" falls into the second bucket.

With regards to fundamentals, one would want to make sure that one is maximizing over the right thing. For example, one would want to make sure that one isn't, e.g., triple-counting impact; to avoid this, one might want to maximize over Shapley values instead of over counterfactual values. Similarly, one might want to take into account that one is maximizing over an estimate, and adjust for the optimizer's curse.
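The optimizer's curse is easy to see in a small simulation (names are illustrative): even when every option has a true value of zero, the estimate of the apparent winner is systematically too high.

```python
import random

def optimizers_curse_demo(n_options=100, noise_sd=1.0, trials=10_000):
    """Every option has true value 0, but we only observe noisy estimates.
    Returns the average estimate of the apparent winner, which comes out
    around 2.5 noise standard deviations above the true value of 0."""
    winners = [max(random.gauss(0.0, noise_sd) for _ in range(n_options))
               for _ in range(trials)]
    return sum(winners) / trials
```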

Many of these points could also belong in the next section, estimating the utility of actions, or the consequences of actions in general.

the expected ($E$)

In general, to get better predictions (or more accurate expectations), one can either:

  • Improve one's ability to predict the world, or
  • make the world more predictable (e.g., by making it simpler, or by causing that which one predicted to happen).

Various forecasting platforms (such as Metaculus, Hypermind, PredictIt, etc.) provide forecasting capabilities. Robust randomized trials can generate conclusions (and thus predictions) that span longer time periods, and scholarly works, such as the regressions from Acemoglu and Robinson, could provide conclusions that last many generations (though they are not immune to criticism [1]).

However, our current forecasting capabilities feel insufficient in general, particularly because they don't allow for cheap, reliable, longer-term predictions. Some open questions in the area are:

  • How to create forecasts which influence real world decisions
  • How to design collaborative scoring rules which work in practice
  • How to scale prediction markets with real money
  • How to identify capable forecasters
  • To what extent have past long-term predictions proved accurate
  • How to make forecasts cheaper
  • ...

It also feels like there hasn't been much work on forecasting the value of individual actions or projects, or the promisingness of research directions, in such a way that forecasts could be action-guiding.

Note also that forecasts normally require some sort of evaluation or resolution at the end in order for forecasters to be rewarded. This means that as evaluation capabilities increase, so do forecasting capabilities, because anything that can be evaluated could be forecasted in advance.
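Concretely, resolution usually runs through a proper scoring rule. A minimal sketch using the Brier score for binary questions (illustrative; real platforms differ in detail):

```python
def brier_score(forecasts, outcomes):
    """Mean Brier score over resolved binary questions: each forecast is a
    probability in [0, 1], each outcome is 0 or 1. Lower is better, so a
    forecaster's reward can be a decreasing function of this score."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)
```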

utility ($U$)

Advances related to utility functions might be:

  • Designing better specifications or proxies of utility.
  • Discerning which agents are worthy of moral value, and to what extent.
  • Determining whether infinite ethics are plausible, and how to deal with the problems they pose.
  • Coming to terms with various seemingly tricky philosophical problems (e.g., the repugnant conclusion).
  • ...

throughout time

Consider that the utility of an action can be expressed as

$$U = \sum_{i=0}^{\infty} \delta^i \cdot u_i$$

where $u_i$ corresponds to additional utility during year $i$, and $\delta$ is a discount factor, which could correspond to the probability of value drift, the probability of expropriation, the probability of existential risk, irrational bias, or intrinsically caring less about future people and events. Parts of that discount factor might be unavoidable (e.g., the unavoidable probability of a physically unlikely catastrophe, or the practically unavoidable risk of expropriation), but the rest could likely be reduced, which would increase the overall utility.

Once one considers a time dimension, coordination throughout time becomes an additional point of optimization. 

Incidentally, note that because the expected value operator is additive:

$$E\left[\sum_{i} \delta^i \cdot u_i\right] = \sum_{i} \delta^i \cdot E[u_i]$$

which could be a useful decomposition in terms of forecasting, because forecasting systems could forecast the additional expected value of an action for each year, and said predictions could be evaluated year by year.
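A sketch of how a forecasting system might aggregate such year-by-year forecasts (the discount value here is illustrative):

```python
def discounted_expected_utility(yearly_forecasts, discount=0.97):
    """Aggregate per-year forecasts of E[u_i] into E[sum_i d^i * u_i],
    using the additivity of expectation shown above."""
    return sum(discount ** i * e_ui for i, e_ui in enumerate(yearly_forecasts))
```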

of actions ($A$)

Various ways of improving the set of actions ($A$) available to oneself might be:

  • To have larger sets of actions to choose from.
    • To increase the number of actions which you can physically take.
      • E.g.: This normally follows from the accumulation of resources, such as capital or prestige.
      • Name: Pursuing instrumental goals.
    • To increase the number of actions which you can conceive of taking.
      • E.g.: Making "earning to give" or "working on AI-safety" or "create a charity" a thing people can conceive of doing.
      • Name: This could be called "iconoclastic altruism" or "exploratory altruism".
    • To increase the number of people taking actions, i.e., movement building.
  • To have better sets of actions to choose from.
    • E.g.: Being born rich, doing movement building in highly prestigious or affluent organizations, having an upper bound on the terribleness of your actions.
  • To improve your ability to actually take optimal actions.
    • E.g.: Having better mental health, having better incentives or status gradients, having healthier communities with status dynamics that incentivize doing good.
  • ...

¿taken by agents?

In the previous section, I added people kind of as an afterthought. We could make our model more elaborate by having

$$\operatorname{argmax}_{\vec{a} \in \vec{A}} \, E[U \mid \vec{a}, W]$$

where $\vec{a}$ is now a vector of actions, with one index for each person (i.e., $a_i$ denotes an action which could be taken by the $i$-th person, and $A_i$ denotes the set of actions which the $i$-th person could take). Writing $\vec{a} = (a_1, \ldots, a_n)$ and $\vec{A} = A_1 \times \cdots \times A_n$, we could have:

$$\operatorname{argmax}_{(a_1, \ldots, a_n) \in A_1 \times \cdots \times A_n} \, E[U \mid a_1, \ldots, a_n, W]$$
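A brute-force sketch of this joint maximization (illustrative only; real action sets are far too large to enumerate):

```python
from itertools import product

def choose_joint_actions(action_sets, expected_utility):
    """argmax over vectors of actions, one per agent. `action_sets[i]` plays
    the role of A_i, and `expected_utility` stands in for E[U | a_1..a_n, W].
    Exponential in the number of agents."""
    return max(product(*action_sets), key=expected_utility)
```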

This would open new avenues of optimization:

  • Make agents longer lived
  • Make agents more productive
  • Improve coordination between agents, now and throughout time.
  • Make agents more altruistic, so that $\vec{A}$ in general contains more altruistic actions.
  • Get more agents.
  • ...

But perhaps not all actions are carried out by human agents. For example, large bureaucracies, ideologies or nations could be modeled as having their own sets of actions at their disposal. This could be further modeled, and relates to the "improving institutional decision making" cause.

given your knowledge of the world ($W$)

Previously, I was considering forecasting as the art of maximizing accuracy while holding information about the world constant. But one can also improve one's grasp of the state of the world, and have more information with which to make better forecasts.

One particularly useful type of knowledge about the world is a good categorization scheme or parametrization, which allows you to group different things together and evaluate their characteristics at the same time, and thus more easily optimize over a set of options.

Where EA organizations fall in this scheme

There isn't a clear mapping between EA organizations and the parts of this scheme, but overall:

  1. Taking object-level optimal actions: Individual EAs, Good Ventures, object-level EA organizations like the Against Malaria Foundation or Wave.
  2. Estimating the expected value of actions: GiveWell, 80,000 Hours, Animal Charity Evaluators, Open Philanthropy, SoGive, EA Funds, etc.
  3. Attaining clarity about one's values: Global Priorities Institute, Forethought Foundation, Rethink Priorities, Happier Lives Institute, etc.
  4. Fine-tuning agents:
    • More agents: EA local groups.
    • More coordinated agents: CEA (??).
    • More altruistic agents: Founders Pledge, Raising for Effective Giving, Giving What We Can.
    • More rational agents: CFAR, ClearerThinking.
  5. Improving models of the world: Our World in Data, Metaculus, Open Philanthropy, Rethink Priorities, J-PAL, IDInsight, etc.

Each of these points then has various meta-levels. Or, in other words, these can be stacked. For example, one can try to [estimate the expected value] of [more agents] (e.g., the expected value of an additional Giving What We Can pledge), or one can [recruit more agents] in order [to have better models] about [expected value estimates] about [object-level actions] (e.g., by running a forecasting tournament about Open Philanthropy grants).

I see QURI as mostly working on the meta-level of 2. and 5. And I see myself as working on 2., 3. and 5., and maximally away from 4.

Conclusion

Intuitively, the EA community would want to invest in all of these "building blocks", because each of them probably has diminishing returns. For instance, as one gains influence over more and more rational agents, clarity about one's utility function becomes more valuable in comparison. [2]


[1]: Despite criticisms, I do think that there is some core to those studies. For instance, the results of The Persistent Effects of Peru's Mining "Mita" seem relatively robust: the paper looks at extractive institutions which, for bureaucratic reasons, changed discretely at a geographic boundary: "on one side, all communities sent the same percentage of their population, while on the other side, all communities were exempt."

[2]: It also seems to me that considering the optimal distribution of talent and resources among these building blocks is probably more important than considering which has the highest marginal value at any given moment. 

In theory, both approaches should be equivalent—always directing resources to the block with the highest marginal value should lead to the optimal allocation, in which all marginal values are equal. 

But in practice, I imagine that coordination is difficult and includes some noise, and external shocks mean that knowing which block has the highest marginal value provides less information than one might think.

Comments (3)



This is good stuff!

  1. I really like your way of framing abstractions as "parametrizations" of the choice function. Another way to think of this is that you want your ontology of things in the world to consist of abstractions with loose coupling.
  2. For example:
    1. Let's say you're considering eating something, and you have both "eating an apple" and "eating a blueberry muffin" as options. 
    2. Also assume that you don't have a class for "food" that includes a reference to "satiation" such that "if satiated, then food is low expected utility". Instead, that rule is encoded into every class of food separately.
    3. Then you'd have to run both "eating an apple" and "eating a blueberry muffin" into the choice function separately in order to figure out that they are low EV. If instead you had a reasonable abstraction for "food", you could just run the choice function once and not have to bother evaluating subclasses.
  3. Not only does loose coupling help with efficient computation, it also helps with increasing modularity and thereby reducing design debt.
    1. If base-level abstractions are loosely connected, then even if you build your model of the world on top of them, they still have a limited number of dependencies to other abstractions.
    2. Thus, if one of the base-level abstractions has a flaw, you can switch it out without having to refactor large parts of your entire model of the world. 
  4. A loosely coupled ontology also allows for further specialisation of each abstraction, without having to pay costs of compromise for when abstractions have to serve many different functions.

re: footnote 1

The paper The Standard Errors of Persistence, which you cite as a criticism, says the following about the robustness of the Peruvian study:

This study examines differences in household consumption and child stunting on either side of Peru's Mita boundary. It finds that areas which traditionally had to provide conscripted mine labour have household consumption almost 30 per cent lower than on the other side of the boundary. We examine the regression in column 1 of Table 2, which compares equivalent household consumption in a hundred kilometre strip on either side of the boundary with controls for distance to the boundary, elevation, slope and household characteristics. The variable of interest is a dummy for being inside the boundary. We examine here how well the regression explains arbitrary patterns of consumption generated as spatial noise. To do this we take the locations where households live and simulate consumption levels based on median consumption at the points. The original study found a 28 per cent difference in consumption levels across the historic boundary. If we normalize the noise variables to have the same mean and standard deviation as the original consumption data, we get a difference of at least 28 per cent (positive or negative) in 70 per cent of cases.

What do you think of that? In general, it seems that your justification for relative robustness doesn't engage with the critiques at all. My understanding of their major point is that spatial autocorrelations of residuals are unaccounted for and might make noise look significant. The simpler example of a common spurious relationship was, AFAIK, first described in Spurious regressions in econometrics (see this decent-looking blogpost for relevant intuitions).

Note that per Table A1...A3, the authors replace the explanatory variable with noise in every study except in the Mita study, for which they only make their point for the dependent variable. Also, the Mita study isn't present in Figure 8. Not sure why that is.

spatial autocorrelations of residuals are unaccounted for and might make noise look significant

So I sort of understand this point, but not enough to understand if the construction of the noise makes sense. 

In any case, yeah, it looks like it was less robust than I thought.
