
Hi everyone! We, Ought, have been working on Elicit, a tool to express beliefs in probability distributions. This is an extension of our previous work on delegating reasoning. We’re experimenting with breaking down the reasoning process in forecasting into smaller steps and building tools that support and automate these steps.

In this specific post, we’re exploring the dynamics of Q&A with distributions by offering to make a forecast for a question you want answered. Our goal is to learn:

  1. Whether people would appreciate delegating predictions to a third party, and what types of predictions they want to delegate
  2. Whether a distribution can more efficiently convey information (or convey different types of information) than text-based interactions
  3. Whether conversing in distributions isolates disagreements or assumptions that may be obscured in text
  4. How to translate the questions people care about or think about naturally into more precise distributions (and what gets lost in that translation)

We also think that making forecasts is quite fun. In that spirit, you can ask us (mainly Amanda Ngo and Eli Lifland) to forecast any continuous question that you want answered. Just make a comment on this post with a question, and we’ll make a distribution to answer it.

Some examples of questions you could ask:


We’ll spend <=1 hour on each one, so you should expect about that much rigor and information density. If there’s context on you or the question that we won’t be able to find online, you can include it in the comment to help us out.

We’ll answer as many questions as we can from now until Monday 8/3. We expect to spend about 10-15 hours on this, so we may not get to all the questions. We’ll post our distributions in the comments below. If you disagree or think we missed something, you can respond with your own distribution for the question.

We’d love to hear people’s thoughts and feedback on outsourcing forecasts, expressing beliefs as probability distributions, or Elicit generally as a tool. If you’re interested in more of what we’re working on, you can also check out the competition we’re currently running on LessWrong to amplify Rohin Shah’s forecast on when the majority of AGI researchers will agree with safety concerns.

Comments (42)



Conditional on OpenAI API generating at least $100M in total revenue for OpenAI, by what year will that happen?

(You might also want to combine this with an estimate of the binary variable of whether it will generate $100M in revenue at all.)

This was really interesting to forecast! Here's my prediction, and my thought process is below. I decomposed this into several questions:

  • Will OpenAI commercialize the API?
    • 94% – this was the intention behind releasing the API, although the potential backlash adds some uncertainty [1]
  • When will OpenAI commercialize the API? (~August/September)
    • They released the API in June and indicated a 2 month beta, so it would begin generating revenue in August/September [2]
  • Will the API reach $100M revenue? (90%)
    • Eliezer is willing to bet there’ll be 1B in revenue from GPT-like services in 2025. This is broader than just the revenue from the OpenAI API, but is also a lot more than 100M
    • A tiny list of industries OpenAI API will affect, to give reference points:
    • It seems absurd to me that OpenAI wouldn’t generate 100M from the API *at some point*, but I’ve adjusted down because it’s easy to be overconfident about exciting tech like this
  • If it does reach $100M, when will it?
    • This article suggests that SaaS companies started in the last 15 years took 8 years to reach $100M ARR
    • The question asks about total revenue not ARR, so this timescale would be a lot shorter
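One way to read the decomposition above, as a minimal sketch: if the 90% is treated as conditional on commercialization (an assumption; the comment does not say so explicitly), the two binary estimates multiply into an unconditional chance of reaching $100M, which can then scale the conditional timing distribution. The function names are hypothetical, and the 0.5 is simply the conditional CDF evaluated at the median date.

```python
def unconditional_probability(p_commercialize: float, p_reach_given_commercialize: float) -> float:
    """Chance of ever reaching the revenue target, assuming the second
    estimate is conditional on the first (an assumption of this sketch)."""
    return p_commercialize * p_reach_given_commercialize


def unconditional_cdf(p_event: float, conditional_cdf_value: float) -> float:
    """Scale a conditional CDF value by the probability the event happens at all."""
    return p_event * conditional_cdf_value


# Figures taken from the decomposition: 94% commercialize, 90% reach $100M.
p_reach = unconditional_probability(0.94, 0.90)

# By definition, the conditional CDF equals 0.5 at the conditional median date.
p_by_median_date = unconditional_cdf(p_reach, 0.5)

print(f"P(ever reaches $100M) ~ {p_reach:.3f}")
print(f"P(reaches $100M by the conditional median date) ~ {p_by_median_date:.3f}")
```

Under these assumptions, the unconditional distribution never sums to 1: the leftover mass is the chance the API never reaches $100M.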

What do you think? Are you more bullish on it generating 100M sooner? (My median was April 17, 2022 – this seems like it could be a bit late, but ultimately I'm not that certain in the 2021 – 2024 range). Here's a blank distribution if you want to make your own!

If a question like the one from Grace et al.'s 2016 survey (note: I cannot find the exact question)

“High-level machine intelligence” (HLMI) is achieved when unaided machines can accomplish every task better and more cheaply than human workers. How many years from now will HLMI be achieved?

was replicated in August 2025 (and had high response rates, etc.), what will the unweighted average of the 50th percentile from the following groups be?

1. AI experts, similar to Grace et al.'s original survey

2. Economists, eg IGM economist panels

3. Attendees of the Economics of AI conference

4. Superforecasters

5. Top 100 users on Metaculus

6. Historians

7. Neuroscientists

8. Long-termist philosophers

9. The EA Survey

10. Employees of Ought

This is a lot of questions, so just pick whichever one you're most excited to answer and/or think is the best reference class! :)

My predictions for:

1. AI researchers

2. Historians

Notes:

1. I chose AI researchers because then I could use Grace et al. as directly as possible, and I chose historians because I expected them to differ the most from AI researchers

2. I worked on this for about 30 min, so it's pretty rough. To make it better, I'd:

a. dig into Grace et al. more (first data, then methods) to learn more about how to interpret the results/what they tell us about the answer to Linch's question

b. read other expert surveys re: when will AGI come (I think AI Impacts has collected these)

(JTBC, I work at Ought)

Conditional on us starting this small grant project in EA Israel, in what year would we terminate the program?

The purpose is to see how likely it is to remain valuable over time (and I hope, and think that's likely the case, that we will terminate it if it stops being cost-effective). 

I think that the distribution is only interesting for this purpose until 2030, and then the probability of it lasting to >= 2030 can collapse to one point.  

Here’s my prediction for this! Awesome proposal, I enjoyed reading it. I wrote up more of my thought process here, but a brief overview:

  • It would help a lot to know the base rate of EA initiatives succeeding past the first year. I couldn’t find any information on this, but it possibly does exist
  • It wasn’t entirely clear to me what the impact you expect from this project is, which made it hard to estimate cost effectiveness.
    • I suspect a lot of the indirect impact (building EA connections, converting researchers to EA philosophies) will take a while to manifest
  • I wanted to know more information about the expected time cost of organizing this, as this would make it less cost effective.

I say this in the post above, but since this may be relevant to decisions you make I want to caveat that I only spent ~1 hour on this! I’d love to hear if you agree/disagree (here’s a blank distribution for the question if you want to create your own).

Fantastic, thanks! 

I've requested access to the doc :) 

(Regarding the platform, I think it would help a bit to clarify things if I could do something like selecting a range with the mouse and have the probability mass of that interval displayed) 

You should be able to access the doc from the link in my comment now! That's useful feedback re: selecting a range and seeing the probability. You can currently see the probability of an interval by defining the interval, leaving the prob blank, and hovering over the bin, but I like the solution you described.

When (if ever) will marijuana be legal for recreational use, or effectively so, across all 50 US states?

Here’s my prediction. Based on this timeline, I started out thinking it would be quite a while (10+ years) before all 50 states legalized recreational marijuana. This paper caused a pretty significant update towards thinking that federal legalization was more likely sooner than I had previously thought. I also found this map useful for getting a quick sense of the current status.

Curious what you think - here’s a blank distribution if you want to make your own.

Yeah — this seems pretty reasonable to me. I'd not thought about this explicitly before, but the rough numbers/boundaries you provide seem quite plausible!

When will there be an AI that can play random computer games from some very large and diverse set (say, a representative sample of Steam) that didn't appear in its training data and do about as well as a casual human player trying the game for the first time?

Here’s my prediction for this. It’s pretty uncertain, and I expect others have perspectives which could narrow the range on this forecast. Some thoughts:

  • Although the same algorithms can be generalized, we’re still at the stage where agents have to be trained on individual games [1] [2] [3] [4]
  • It’s really hard to quickly get a sense of how this will progress and what the challenges are without knowing more about the technical research
  • Given that, my prediction is very uncertain over the range, but bounded by AGI timelines

Does this seem in line with what you expected? Do you know of any good ways to estimate how fast this kind of research will progress? If anyone else has insight that would increase the certainty over a range, you can edit my snapshot or create your own here.

Thanks! It's about what I expected, I guess, but different from my own view (I've got more weight on much shorter timelines). It's encouraging to hear though!

Yeah I could definitely see it being sooner, but didn't find any sources that convinced me it would be more likely in the next 10 years than later – what's driving your shorter timelines?

I have a spreadsheet of different models and what timelines they imply, and how much weight I put on each model. The result is 18% by end of 2026. Then I consider various sources of evidence and update upwards to 38% by end of 2026. I think if it doesn't happen by 2026 or so it'll probably take a while longer, so my median is on 2040 or so.

The most highly weighted model in my spreadsheet takes compute to be the main driver of progress and uses a flat distribution over orders of magnitude of compute. Since it's implausible that the flat distribution should extend more than 18 or so OOMs from where we are now, and since we are going to get 3-5 more OOM in the next five years, that yields 20%.
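The arithmetic behind that figure can be sketched as follows, assuming the flat distribution spans exactly 18 OOMs and that the next five years cover the first 3 to 5 of them (the function name is hypothetical, not something from the spreadsheet).

```python
def mass_covered(ooms_gained: float, total_ooms: float = 18.0) -> float:
    """Probability mass that a uniform distribution over `total_ooms`
    orders of magnitude assigns to the first `ooms_gained` of them."""
    if not 0 <= ooms_gained <= total_ooms:
        raise ValueError("ooms_gained must lie between 0 and total_ooms")
    return ooms_gained / total_ooms


# 3 to 5 more OOMs of compute over the next five years, per the comment.
for gained in (3, 4, 5):
    print(f"{gained} OOMs gained: {mass_covered(gained):.0%} of the mass")
```

Under these assumptions the range works out to roughly 17% to 28%, which brackets the 20% quoted in the comment.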

The biggest upward update from the bits of evidence comes from the trends embodied in transformers (e.g. GPT-3) and also to some extent in AlphaGo, AlphaZero, MuZero: Strip out all that human knowledge and specialized architecture, just make a fairly simple neural net and make it huge, and it does better and better the bigger you make it.

Another big update upward is... well, just read this comment. To me, this comment did not give me a new picture of what was going on but rather confirmed the picture I already had. The fact that it is so highly upvoted and so little objected to suggests that the same goes for lots of people in the community. Now there's common knowledge.

Oh, and to answer your question for why it's more likely shorter than later: Progress right now seems to be driven by compute, and in particular by buying greater and greater quantities of it. In a few years this trend MUST stop, because not even the US government would have enough money to continue the trend of spending an order of magnitude+ more each year. So if we haven't got to crazy AI by 2026 or so, the current paradigm of "just add more compute" will no longer be so viable, and we're back to waiting for new ideas to come along.

Gwern's comment was really helpful to see the different paradigms, thanks for sharing! This reasoning makes sense to me in terms of increasing compute - I could see this pushing me slightly more towards shorter timelines, although I'd want to spend a lot longer researching this.

How many points will I have on Metaculus at the end of 2021?

(Question resolves according to the number listed here on 2021-12-31 23:59:59 GMT.)

I misread the post as asking for a personal forecast. Since I now realize it's possible to ask questions of any type, I would much rather delegate a forecast on an important topic, such as:

How many standard deviations away from the mean will the 1000th human born from stem-cell derived gametes score in a test of cognitive ability taken at the age of 18?

This was pretty difficult to forecast in a limited amount of time, so you should take my prediction with a large grain of salt. Broadly, I thought about this as:

  • How likely is the 1000th baby to involve iterated embryo selection?
    • There’s a lot of controversy around genetic manipulation for ability, and it’s possible that stem cell gamete reproduction is regulated such that you can only use it as an alternative fertility treatment
      • E.g., controversy around the ethics of genetic relationship of parents to children (see this series of papers for an overview)
    • I think 1000 babies is still sufficiently small that it could still be a niche fertility treatment (rather than mass iterated embryo selection), but I could be persuaded otherwise
  • If the 1000th baby does involve iterated embryo selection, what is the gain in IQ we would expect? (IQ seems like the easiest measure of cognitive ability)
    • This is pretty hard to estimate.
    • This paper (Shulman & Bostrom, 2014) suggests a cap of 30 SDs (~300 IQ points). Based on their simulation, choosing one embryo in 10 would lead to 11.5 points (0.8SD) and running 10 generations of choosing 1 in 10 embryos would lead to an increase of 130 points (8.6SD). These estimates may be high – this paper by Karavani, Zuk et al. (2019) suggests the gain from choosing one in 10 embryos is closer to 2.5 IQ points (0.2SD)
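For reference, the point-to-SD conversions quoted above can be checked with a short sketch, assuming the conventional IQ scale with a standard deviation of 15 points (the comment does not state which scale it uses, and the function name is hypothetical).

```python
IQ_SD = 15.0  # assumed standard deviation of the IQ scale, in points


def iq_points_to_sd(points: float, sd: float = IQ_SD) -> float:
    """Convert a gain in IQ points into standard deviations."""
    return points / sd


# Gains quoted in the comment, in IQ points.
quoted_gains = {
    "1 in 10 embryos (Shulman & Bostrom)": 11.5,
    "10 generations of 1 in 10 (Shulman & Bostrom)": 130.0,
    "1 in 10 embryos (Karavani, Zuk et al.)": 2.5,
}

for label, points in quoted_gains.items():
    print(f"{label}: {points} points ~ {iq_points_to_sd(points):.2f} SD")
```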

This was a really interesting question to look into – what motivated you to ask this? Is there anything you think I missed? (here's a blank distribution if you want to make your own).

Thank you, that was informative. I don't think you missed anything, though I haven't myself thought about this question much—that is in part why I was curious to see someone else try to answer it.

I think genetic selection and/or editing has the potential to be transformative, and perhaps even to result in greater-than-human intelligence. Despite this, it's comparatively neglected, both within EA and society at large. So having more explicit forecasts in this area seems pretty valuable.

What will the maximum reduction in the S&P 500 from today's value (July 30, 2020) be over the next 12 months? Having a probability for each maximum percentage reduction would be very helpful for investing now (e.g. whether it makes sense to short). The only related forecast I could find on Good Judgment was for the value of the S&P 500 on June 30, 2021.

Hey, I run a business teaching people how to overcome procrastination (procrastinationplaybook.net is our not yet fully fleshed out web presence).

I ran a pilot program that made roughly $8,000 in revenue by charging 10 people for a premium interactive course. Most of these users came from a couple of webinars that my friends hosted; a couple came from finding my website through the CFAR mailing list and webinars I hosted for my Twitter friends.

The course is ending soon, and I'll spend a couple of months working on marketing and updating the course before the next launch, as well as:

1. Launching a podcast breaking down skills and models and selling short $10 lessons for each of them teaching how to acquire the skill.

2. Creating a sales funnel for my pre-course, which is a do-it-yourself planning course for creating the "perfect procrastination plan". Selling for probably $197

3. Creating the "post-graduate" continuity program after people have gone through the course, allowing people to have a community interested in growth and development, priced from $17/month for basic access to $197 with coaching.

Given those plans for launch in early 2021:

1. What will be my company's revenue in Q1 2021?

2. What will be the total revenue for this company in 2021?

Here’s my Q1 2021 prediction, with more detailed notes in a spreadsheet here. I started out estimating the size of the market, to get reference points. Based on very rough estimates of CEA subscriptions, # of people Effective Altruism Coaching has worked with, and # of people who have gone through a CFAR workshop, I estimated the number of EAs who are interested enough in productivity to pay for a service to be ~8000. The low number of people who have done Effective Altruism Coaching (I estimated 100, but this is an important assumption that could be wrong since I don’t think Lynette has published this number anywhere) suggests a range for your course (which is more expensive) of ~10 to 45 people in Q1. Some other estimates, which are in the spreadsheet linked above, gave me a range of $8,000 to $42,000. I didn’t have enough time to properly look into 2021 as a whole, so I just did a flat 10% growth rate across all the numbers and got this prediction. Interestingly, I notice a pressure to err on the side of optimistic when publicly evaluating people’s companies/initiatives.

Your detailed notes were very helpful in this. I noticed that I wanted more information on:

  • The feedback you got from the first course. How many of them would do it again or recommend it to a friend?
  • More detail on your podcast plans. I didn’t fully understand the $10 lessons – I assumed it was optional $10 lessons attached to each podcast, but this may be wrong.
  • How much you’re focusing on EAs. The total market for productivity services is a lot bigger (here’s an estimate of $1B market value for life coaching, which encompasses productivity coaching).

Do these estimates align with what you're currently thinking? Are there any key assumptions I made that you disagree with? (here are blank distributions for Q1 and 2021 if you want to share what you're currently projecting).

Thanks, this was great!

The estimates seem fair. Honestly, much better than I would expect given the limited info you had, and the assumptions you made (the biggest one that's off is that I don't have any plans to only market to EAs).

Since I know our market is much larger, I use a different forecasting methodology internally which looks at potential marketing channels and growth rates.

I didn't really understand how you were working growth rate into your calculations in the spreadsheet. Maybe just eyeballing what made sense based on the current numbers and the total addressable market?

One other question I have about your platform is that I don't see any way to get the expected value of the density function, which is honestly the number I care most about. Am I missing something obvious?

Yeah, I mostly focused on the Q1 question so didn't have time to do a proper growth analysis across 2021 – I just did 10% growth each quarter and summed that for 2021, and it looked reasonable given the EA TAM. This was a bit of a 'number out of the air,' and in reality I wouldn't expect it to be the same growth rate across all quarters. Definitely makes sense that you're not just focusing on the EA market – the market for general productivity services in the US is quite large! I looked briefly at the subscriptions for top productivity podcasts on Castbox (e.g. Getting Things Done, 5am miracle), which suggests lots of room for growth (although I imagine podcast success is fairly power law distributed).
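The flat-growth shortcut described above amounts to the following, as a sketch. The $8,000 and $42,000 inputs are the ends of the Q1 range mentioned earlier in this thread, used here only for illustration, and the function name is hypothetical.

```python
def annual_revenue(q1_revenue: float, quarterly_growth: float = 0.10, quarters: int = 4) -> float:
    """Sum quarterly revenue over a year, compounding a flat growth rate
    from the Q1 figure (Q1, Q1*1.1, Q1*1.1**2, ...)."""
    return sum(q1_revenue * (1 + quarterly_growth) ** q for q in range(quarters))


# Low and high ends of the Q1 range quoted in the thread.
for q1 in (8_000, 42_000):
    print(f"Q1 = ${q1:,} -> 2021 total ~ ${annual_revenue(q1):,.0f}")
```

With 10% growth each quarter, the yearly total is about 4.64 times the Q1 figure, so the shape of the 2021 distribution is just a rescaled copy of the Q1 one.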

There isn't a way to get the expected value, just the median currently (I had a bin in my snapshot indicating a median of $25,000). I'm curious what makes the expected value more useful than the median for you?

Yeah, I mostly focused on the Q1 question so didn't have time to do a proper growth analysis across 2021

Yeah, I was talking about the Q1 model when I was trying to puzzle out what your growth model was.

There isn't a way to get the expected value, just the median currently (I had a bin in my snapshot indicating a median of $25,000). I'm curious what makes the expected value more useful than the median for you?

A lot of the value of a business's potential growth vectors comes in the tails. For this particular forecast it doesn't really matter because it's roughly bell-curve shaped, but if I were using this as, for instance, a decision-making tool to decide what actions to take, I'd really want to look at which ideas had a small chance of being runaway successes, and how valuable that makes them compared to other options which are surefire, but don't have that chance of tail success. Choosing those ideas isn't likely to pay off on any single idea, but is likely to pay off over the course of a business's lifetime.
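To illustrate the point about tails, here is a sketch with made-up numbers (not taken from either forecast): two hypothetical ideas share the same median payoff, but one has a small chance of runaway success, which only the expected value picks up.

```python
from typing import List, Tuple

Outcomes = List[Tuple[float, float]]  # (probability, payoff) pairs


def expected_value(outcomes: Outcomes) -> float:
    """Probability-weighted average payoff."""
    return sum(p * v for p, v in outcomes)


def median(outcomes: Outcomes) -> float:
    """Smallest payoff at which cumulative probability reaches 50%."""
    cumulative = 0.0
    for p, v in sorted(outcomes, key=lambda pv: pv[1]):
        cumulative += p
        if cumulative >= 0.5:
            return v
    raise ValueError("probabilities must sum to at least 0.5")


# Hypothetical payoff distributions, invented purely for illustration.
surefire = [(0.2, 20_000), (0.6, 25_000), (0.2, 30_000)]
long_shot = [(0.45, 5_000), (0.50, 25_000), (0.05, 1_000_000)]

for name, idea in (("surefire", surefire), ("long shot", long_shot)):
    print(f"{name}: median ${median(idea):,.0f}, expected value ${expected_value(idea):,.0f}")
```

Both ideas have a median of $25,000, yet the long shot's expected value is more than double the surefire one's, which is the information a median-only display hides.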

I just eyeballed the worst to best case for each revenue source (and based on general intuitions about e.g. how hard it is to start a podcast). Yeah, this makes a lot of sense – we've thought about showing expected value in the past so this is a nice +1 to that.

What is the probability that my baby daughter's US passport application will be rejected on account of inadequate photo?

Evidence: The photo looked acceptable to me but my wife, who thought a lot more about it, judged it to be overexposed. It wasn't quite as bad as the examples of overexposure given on the website, but in her opinion it was too close for comfort.

Evidence: The lady at the post office said the photo was fine, but she was rude to us and in a hurry. For example, she stapled it to our application and hustled us through the rest of the process and we were too shy and indecisive to stop her.

When will my daughter's passport arrive? (We are US citizens, applied by mail two weeks ago, application received last week)

Here's my prediction! My median is October 3, 2020. If you want to keep checking in on this, the Bureau of Consular Affairs is helpfully tracking their passport backlog and how many they're processing each week here.

Was this in line with what you were expecting?

Thanks! Yes it is. All I had been doing was looking at that passport backlog, but I hadn't made a model based on it. It's discouraging to see so much probability mass on December, but not too surprising...

Amazing, splendiforously wonderful news! The passport arrived TODAY, August 7!

Assume that I will try to become a GJP registered superforecaster, giving up all the time I currently spend on other platforms and only focusing on making Good Judgment Open predictions to the best of my ability, choosing questions based on whether they'll help me be a superforecaster, not how useful predictions are for the world. Let's say I'll give up after 2 years if I don't make it.

When do I become a superforecaster? Time interval now->2022, with an open upper bound indicating "didn't make it".

If it matters, assume I haven't seen your distribution, but if you also want to do a separate distribution assuming I have seen it, that might be fun (not sure if it would converge though).

My forecast is pretty heavily based on the GoodJudgment article How to Become a Superforecaster. According to it they identify Superforecasters each autumn and require forecasters to have made 100 forecasts (I assume 100 resolved), so now might actually be the worst time to start forecasting. It looks like if you started predicting now the 100th question wouldn't close until the end of 2020. Therefore it seems very unlikely you'd be able to become a Superforecaster in this autumn's batch.

[Note: alexrjl clarified over PM that I should treat this as "Given that I make a decision in July 2020 to try to become a Superforecaster" and not assume he would persist for the whole 2 years.]

Conditional on you eventually becoming a Superforecaster, this left most of my probability mass on you making the 2021 batch, which requires you to both stick with it for over a year and perform well enough to become a Superforecaster. If I were to spend more time on this I would refine my estimates of how likely each of those are.

I assumed that if you didn't make the 2021 batch you'd probably call it quits before the 2022 batch or not be outperforming the GJO crowd by enough to make it, and that even if you did make that batch you might not officially become a Superforecaster before 2023.

Overall I ended up with a 36% chance of you becoming a Superforecaster in the next 2 years. I'm curious to hear if your own estimate would be significantly different.

I had also misunderstood this as being for personal questions. If you prefer, replace me with something like:

A randomly selected person from the top ~50 recently active Metaculus users (so maybe this list or this one), excluding those who are already supers.

You didn't misunderstand! The intention was that you ask any question that's interesting to you, including personal questions. I'm assuming you're more interested in the first question you asked, so I'll answer that unless you feel otherwise :)

Ok awesome, thanks!

How many EA forum posts will there be with greater than or equal to 10 karma submitted in August of 2020?

Here's my forecast. The past is the best predictor of the future, so I looked at past monthly data as the base rate.

I first tried to tease out whether the months with more activity in 2019 were also the months with more activity in 2020. There seemed to be a weak negative correlation, so I figured my base rate should be based only on the past few months of data.

In addition to the past few months of data, I considered that part of the catalyst for record-setting July activity might be Aaron's "Why you should put on the EA Forum" EAGx talk. Due to this possibility, I gave August a 65% chance of exceeding the base rate of 105 posts with >=10 karma.

My numerical analysis is in this sheet.
