
I’m interested in people’s thoughts on:

  1. How valuable would it be for more academics to do research into forecasting?
  2. How valuable would it be for more non-academics to do academic-ish research into forecasting? (E.g., for more people in think tanks or EA orgs to do research on forecasting that's closer in levels of "rigour" to the average paper than to the average EA Forum post.)
  3. What questions about forecasting should be researched by academics, or by non-academics using academic approaches?
    • I imagine this could involve psychological experiments, historical research, conceptual/theoretical/mathematical work, political science literature reviews, etc.

(Note that I don’t mean “What questions should people forecast?” For that, there’s this post.)

Context

Many EAs have appreciated and drawn on Phil Tetlock’s research into forecasting. Some people in the EA community or related communities are now doing non-academic work on forecasting, such as: 

  • building tools for forecasting (e.g. Foretold)
  • running forecasting projects (e.g. Foretell [no relation])
  • experimenting/playing around with different forecasting techniques and methods (e.g. here and here)
  • researching and writing about forecasting (e.g. here and here)

And some people in the EA community or related communities are doing academic work:

But I’m not aware of much academic research on forecasting itself from EAs. I imagine there might be room for more. And two things I’m considering trying to do in future are:

  • A PhD in psychology, focusing on forecasting or other things related to improving institutional decision-making
  • Research on forecasting in a job in an EA organisation or with a grant

But I don’t know how valuable that would be - e.g., maybe non-EA academics are covering this area well already, or maybe what’s really needed is just people in government/business trying to actually implement or make use of forecasting projects. Nor do I know what the important open questions or topics would be. And I haven’t engaged much with the academic literature on forecasting, other than reading Superforecasting, Expert Political Judgement, and various summaries by EAs.

So this leads me to ask the above questions, both for my own benefit and (hopefully) for the benefit of other people who might be a good fit for research into this topic. (To help capture that benefit, I’ve added this post to A central directory for open research questions.)

The 80,000 Hours profile on improving institutional decision-making has some great analysis and ideas on this, but it’s now almost 3 years old, and I’m interested in additional perspectives.

(I also posted these three questions to the Improving institutional decision-making Facebook group, and there was some discussion there.)

Answers

  1. How open are various decision-makers to actually paying attention to forecasts? 
  2. How likely are they to just make the same decision anyway, referring to forecasts when they justify this decision and ignoring them the rest of the time? 
  3. How does this vary for different decision-makers and contexts (e.g., politicians vs civil servants vs funders vs business leaders)? 
  4. How does this vary by different approaches to forecasting (e.g. those surveyed by Beard et al.), different ways of presenting forecasts, and different topics? 

I’m guessing a good amount of research will have already been done on these topics, but I’ve been surprised about such things before.

I imagine these questions could be answered through a mixture of: 

  • surveys of relevant people
  • lab experiments
  • interviews with people who’ve tried implementing forecasting in relevant settings
  • literature reviews of potentially relevant work in political science and political economy (e.g., on what kinds of information are drawn on in political decision-making more generally)
  • probably also other approaches

It seems plausible that these sorts of questions would be better answered not by academic research but by people actually trying to implement forecasting in relevant institutions, getting relevant decision-makers to pay attention to forecasts, and so on.

(This is one of my own ideas about a cluster of questions that might warrant academic/academic-style research. I’d be interested in people’s thoughts on this cluster and the cluster I suggest in a different comment, including regarding how important, tractable, and neglected these questions are.)

Not really answering your question, but there is some recent work attempting to forecast clinical trial results that may be relevant: Can Oncologists Predict the Efficacy of Treatments in Randomized Trials? Kimmelman (the senior author) is doing other work on the topic too (e.g. here). I'm not aware of much published work in this space in a biomedical context. 

My guess is that key decision-makers in medicine (e.g. funders of trials) would not be very open to paying attention to forecasts (even if shown to be accurate to some degree), as there is a very strong culture of relying on data and in particular on RCTs. 

Greenberg 2018 lists and evaluates forecasting scoring rules. Research on additional, more complex metrics that take into account, e.g.:

  • the importance of the questions forecast on (perhaps using an interest score such as the one on Metaculus)
  • the number of questions forecast on (the Brier score becomes distorted when there are only a few forecasts)
  • relative performance compared to other forecasters on similar sets of questions

might be useful for setting incentives right in forecasting tournaments. Prediction markets address the first point through subsidies (e.g. via a logarithmic market scoring rule).
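
To make this concrete, here is a minimal sketch (in Python) of what such a metric could look like. The specific weighting and shrinkage choices are purely illustrative assumptions on my part, not anything proposed by Greenberg 2018 or used by an actual tournament:

```python
def brier_score(probabilities, outcomes):
    """Plain Brier score: mean squared error between forecast probabilities
    and binary outcomes (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)

def adjusted_score(probabilities, outcomes, importance_weights,
                   baseline=0.25, min_n=20):
    """Illustrative importance-weighted Brier score, shrunk toward a baseline
    when a forecaster has answered only a few questions, so that small samples
    are not over-rewarded or over-penalised. All parameters are assumptions."""
    n = len(outcomes)
    weighted = sum(w * (p - o) ** 2
                   for p, o, w in zip(probabilities, outcomes, importance_weights))
    weighted /= sum(importance_weights)
    # Shrinkage: with few forecasts, pull the score toward the baseline
    # (0.25 is the Brier score of a constant 50% forecast).
    shrink = min(n, min_n) / min_n
    return shrink * weighted + (1 - shrink) * baseline

# Example: three forecasts, with the second question rated twice as "interesting".
probs, outcomes, weights = [0.8, 0.3, 0.9], [1, 0, 1], [1.0, 2.0, 1.0]
print(brier_score(probs, outcomes))              # plain Brier score
print(adjusted_score(probs, outcomes, weights))  # importance-weighted, shrunk
```

The shrinkage term just means that a forecaster with only a handful of resolved questions is scored close to the "always 50%" baseline rather than getting an extreme score either way; whether that is the right way to handle small samples is itself a research question.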

  1. How does the resolution and calibration of forecasts vary by the forecasts’ “range” (e.g., whether the forecast is for an event 6 months away vs 3 years away vs 20 years away)?
  2. How does this vary between topics, types of forecasters, etc.?
  3. Do people who are superforecasters for short- or medium-range questions (e.g., those that resolve within 3 years) still do better than average for long-range forecasts?
  4. Are there approaches to forecasting that are especially useful for long-range forecasts? (Approaches that could be considered include those surveyed by Beard et al.)
  5. All the same questions, but now for “extreme” or “extremely rare” vs more “mundane” or “common” events (e.g., forecasts of global catastrophes vs forecasts of more minor disruptions).
  6. All the same questions, but now for forecasts that are long-range and about extreme/extremely rare outcomes.

Background for this cluster of questions: How Feasible Is Long-range Forecasting? 

One rationale for this cluster of questions would be to improve our ability to forecast global catastrophic and existential risks, and/or inform how we interpret those forecasts and how much weight we give them.

This research could use very similar methodologies to those used by Tetlock, just with different forecasts. However, a very important practical limitation is that the research may inherently take many years or decades, since long-range forecasts must first resolve before their accuracy can be assessed.
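
As a rough illustration of how questions 1 and 2 might be operationalised with Tetlock-style data, here is a minimal sketch that groups resolved forecasts by range and compares mean Brier scores across buckets. The data and bucket boundaries are made up for illustration:

```python
from collections import defaultdict

# Hypothetical resolved forecasts: (range in days, forecast probability, outcome 0/1).
forecasts = [
    (120, 0.7, 1), (150, 0.4, 0), (900, 0.6, 0),
    (1100, 0.8, 1), (7300, 0.3, 1), (7400, 0.9, 1),
]

# Arbitrary illustrative buckets.
def bucket(range_days):
    if range_days <= 365:
        return "<= 1 year"
    if range_days <= 365 * 5:
        return "1-5 years"
    return "> 5 years"

by_bucket = defaultdict(list)
for range_days, p, outcome in forecasts:
    by_bucket[bucket(range_days)].append((p - outcome) ** 2)

# Mean Brier score per range bucket; with real data one would also look at
# calibration curves and sample sizes within each bucket.
for name, scores in sorted(by_bucket.items()):
    print(name, sum(scores) / len(scores))
```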

(This post says "Fortunately, we might soon be in a position to learn more about long-range forecasting from the EPJ [Expert Political Judgement] data, since most EPJ forecasts (including most 25-year forecasts) will have resolved by 2022". This might reduce the marginal value of further work on this, but I imagine the marginal value could remain quite high.)

How does the resolution and calibration of forecasts vary by the forecasts’ “range” (e.g., whether the forecast is for an event 6 months away vs 3 years away vs 20 years away)?

I have an (unfinished) essay on the topic using Metaculus and PredictionBook data. The relation between range and accuracy is negative within forecasts on a single question: specifically, a linear regression of Brier score on range in days gives 0.0019x + 0.0105. Of course, I'll look into better statistical analyses if I find time.
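
(For illustration, a regression of that kind could be reproduced roughly as below; the data here are invented, so the coefficients won't match the 0.0019x + 0.0105 figure from the essay.)

```python
import numpy as np

# Hypothetical resolved forecasts: range in days and the Brier score each received.
# Real data would come from Metaculus / PredictionBook exports, as in the essay.
range_days = np.array([30, 90, 180, 365, 730, 1460])
brier = np.array([0.08, 0.10, 0.12, 0.15, 0.21, 0.30])

# Ordinary least squares fit of Brier score on range (degree-1 polynomial).
slope, intercept = np.polyfit(range_days, brier, 1)
print(f"brier ≈ {slope:.6f} * days + {intercept:.4f}")
```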

MichaelA🔸: Thanks for sharing this, I'll try to look it over soon.

niplav: Just beware that I've had feedback from two different people that it's difficult to understand.
Comments

I've just discovered the very recently published Forecasting AI Progress: A Research Agenda. The abstract reads: 

Forecasting AI progress is essential to reducing uncertainty in order to appropriately plan for research efforts on AI safety and AI governance. While this is generally considered to be an important topic, little work has been conducted on it and there is no published document that gives an objective overview of the field. Moreover, the field is very diverse and there is no published consensus regarding its direction. 

This paper describes the development of a research agenda for forecasting AI progress which utilized the Delphi technique to elicit and aggregate experts' opinions on what questions and methods to prioritize. The results of the Delphi are presented; the remainder of the paper follows the structure of these results, briefly reviewing relevant literature and suggesting future work for each topic. Experts indicated that a wide variety of methods should be considered for forecasting AI progress. Moreover, experts identified salient questions that were both general and completely unique to the problem of forecasting AI progress. Some of the highest priority topics include the validation of (partially unresolved) forecasts, how to make forecasting action-guiding and the quality of different performance metrics. While statistical methods seem more promising, there is also recognition that supplementing judgmental techniques can be quite beneficial.

I hope to read the full paper soon, as I imagine it'd contain ideas and insights relevant to my questions (even if the paper's primary focus is forecasting AI specifically).

(There are also some comments on the AIAF cross-post.)
