elifland

Software Engineer at Ought. Interested in all things EA but especially cause prioritization, forecasting, and AI safety.

Posts

Sorted by New

Wiki Contributions

Comments

Incentivizing forecasting via social media

Overall I like this idea, appreciate the expansiveness of the considerations discussed in the post, and would be excited to hear takes from people working at social media companies.

Thoughts on the post directly

Broadly, we envision i) automatically suggesting questions of likely interest to the user—e.g., questions related to the user’s current post or trending topics—and ii) rewarding users with higher than average forecasting accuracy with increased visibility

I think some version of boosting visibility based on forecasting accuracy seems promising, but I feel uneasy about how it would be implemented. I'm concerned about (a) how forecasting accuracy would be traded off against other qualities and (b) ensuring that measured forecasting accuracy is actually a good proxy.

On (a), I think forecasting accuracy and the qualities it's a proxy for represent a small subset of the space that determines which content I'd like to see promoted; e.g. it seems likely to be only loosely correlated with writing quality. It may be tricky to strike the right balance in terms of how the promotion system works.

On (b):

  1. Promoting and demoting content based on a small sample size of forecasts. In practice it often takes many resolved questions to discern which forecasters are more accurate, and I'm worried that it will be easy to increase/decrease visibility too early (see the sketch after this list).
  2. Even without a small sample size, there may be issues with many of the questions being correlated. I'm imagining a world in which lots of people predict on correlated questions about the 2016 presidential election, then Trump supporters get a huge boost in visibility after he wins because they do well on all of them.
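
To make the sample-size concern concrete, here's a minimal simulation (all numbers hypothetical: events occur 60% of the time, forecaster A predicts a calibrated 60%, forecaster B an overconfident 80%) showing how often mean Brier score ranks the genuinely better forecaster on top after a given number of resolved questions:

```python
import numpy as np

rng = np.random.default_rng(0)

def brier(p, outcomes):
    """Mean Brier score over binary outcomes; lower is better."""
    return np.mean((p - outcomes) ** 2)

n_trials = 2000
for n_questions in [10, 50, 200]:
    a_ranked_better = 0
    for _ in range(n_trials):
        # Both forecasters predict the same set of questions.
        outcomes = rng.random(n_questions) < 0.6
        if brier(0.6, outcomes) < brier(0.8, outcomes):
            a_ranked_better += 1
    print(f"{n_questions:>3} questions: A ranked better "
          f"{a_ranked_better / n_trials:.0%} of the time")
```

With only a handful of resolutions the ranking is not far from a coin flip and only becomes reliable as resolved questions accumulate, which is exactly the promote-too-early worry.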

That said, these issues can be mitigated with iteration on the forecasting feature if the people implementing it are careful and aware of these considerations. 

Generally, it might be best if the recommendation algorithms don’t reward accurate forecasts in socially irrelevant domains such as sports—or reward them less so.

Insofar as the intent is to incentivize people to predict on more socially relevant domains, I agree. But I think forecasting accuracy on sports, etc. is likely strongly correlated with performance in other domains. Additionally, people may feel more comfortable forecasting on things like sports than other domains which may be more politically charged.

My experience with Facebook Forecast compared to Metaculus

I've been forecasting regularly on Metaculus for about 9 months and Forecast for about 1 month.

  1. I don't feel as pressured to regularly go back and update my old predictions on Forecast as on Metaculus, since Forecast is a play-money prediction market rather than a prediction platform. On Metaculus if I predict 60% and the community is at 50%, then don't update for 6 months while the community moves to 95%, I'm at a huge disadvantage in terms of score relative to predictors who did update. But with a prediction market, if I buy shares at 50 cents and the price of the shares goes up to 95 cents, it just helps me. The prediction market structure makes me feel less pressured to continually update on old questions, which has both its positives and negatives but seems good for a social media forecasting structure (see the toy comparison after this list).
  2. The aggregate on Forecast is often decent, but it's occasionally horrible, more egregiously and more often than on Metaculus (e.g. this morning I bought some shares for Kelly Loeffler to win the Georgia senate runoff at as low as ~5 points, implying 5% odds, while election betting odds currently have Loeffler at 62%). The most common reasons I've noticed are:
    1. People misunderstand how the market works and bet on whichever outcome they think is most probable, regardless of the prices.
    2. People don't make the error described in (1) (that I can tell), but are over-confident.
    3. People don't read the resolution criteria carefully.
    4. Political biases. 
    5. There aren't many predictors so the aggregate can be swung easily.
  3. As hinted at in the post, there's an issue with being able to copy the best predictors. I've followed 2 of the top predictors on Forecast and usually agree with their analyses and buy into the same markets with the same positions.
  4. Forecast currently gives points when other people forecast based on your "reasons" (aka comments), and these points are then aggregated on the leaderboard with points gained from actual predictions. I wish there were separate leaderboards for these.
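
A toy comparison of the two incentive structures from point 1 (simplified, hypothetical numbers; this uses a plain log score rather than Metaculus's actual time-averaged, community-relative scoring):

```python
import math

# Scoring-platform case (simplified log score: you earn ln(p) for a
# Yes resolution). A stale 60% forecast vs. a predictor who updated to 95%.
stale_score = math.log(0.60)    # ~ -0.51
updated_score = math.log(0.95)  # ~ -0.05
print(f"stale {stale_score:.2f} vs updated {updated_score:.2f}")

# Play-money market case: buy Yes at 50 cents; if the event resolves Yes,
# each share pays $1.00. The interim price move to 95 cents never hurts
# the holder, so there's no penalty for failing to keep updating.
cost, payout = 0.50, 1.00
print(f"market profit per share: +{payout - cost:.2f}")
```
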
Incentivizing forecasting via social media

The forecasting accuracy of Forecast’s users was also fairly good: “Forecast's midpoint brier score [...] across all closed Forecasts over the past few months is 0.204, compared to Good Judgement's published result of 0.227 for prediction markets.”

For what it's worth, as noted in Nuño's comment this comparison holds little weight when the questions aren't the same or on the same time scales; I'd only weakly update from my prior that real-money prediction markets are much more accurate.
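
To illustrate why cross-platform Brier comparisons mislead, here's a sketch (hypothetical question mixes) showing that even a perfectly calibrated forecaster scores very differently depending on how hard the questions are:

```python
import numpy as np

def brier(forecast, outcomes):
    """Mean Brier score for binary questions; lower is better."""
    return np.mean((forecast - outcomes) ** 2)

rng = np.random.default_rng(0)
n = 10_000
easy = rng.random(n) < 0.9   # "easy" questions: true probability 90%
hard = rng.random(n) < 0.6   # "hard" questions: true probability 60%

# The same (perfectly calibrated) skill produces very different scores.
print(f"calibrated 90% on easy questions: {brier(0.9, easy):.3f}")  # ~0.090
print(f"calibrated 60% on hard questions: {brier(0.6, hard):.3f}")  # ~0.240
```

So a 0.204 vs 0.227 gap could easily be explained by question mix rather than by one platform's forecasters being better.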

Delegate a forecast

My forecast is pretty heavily based on the Good Judgment article How to Become a Superforecaster. According to it, they identify Superforecasters each autumn and require forecasters to have made 100 forecasts (I assume 100 resolved), so now might actually be the worst time to start forecasting. It looks like if you started predicting now, the 100th question wouldn't close until the end of 2020, so it seems very unlikely you'd be able to become a Superforecaster in this autumn's batch.

[Note: alexrjl clarified over PM that I should treat this as "Given that I make a decision in July 2020 to try to become a Superforecaster" and not assume he would persist for the whole 2 years.]

This left most of my probability mass (conditional on you becoming a Superforecaster eventually) on you making the 2021 batch, which requires you to both stick with it for over a year and perform well enough to qualify. If I were to spend more time on this I would refine my estimates of how likely each of those is.

I assumed that if you didn't make the 2021 batch you'd probably call it quits before the 2022 batch, or not be outperforming the GJO crowd by enough to make it; and even if you did make that batch, you might not officially become a Superforecaster before 2023.

Overall I ended up with a 36% chance of you becoming a Superforecaster in the next 2 years. I'm curious to hear if your own estimate would be significantly different.
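
For concreteness, here's roughly how an estimate like that could decompose (the sub-probabilities below are hypothetical round numbers chosen to land near the 36% headline; the comment above doesn't spell them out):

```python
# Path 1: make the 2021 batch.
p_stick_through_2021 = 0.65   # still forecasting actively a year from now
p_qualify_2021 = 0.45         # outperforms the GJO crowd enough, given sticking
p_2021 = p_stick_through_2021 * p_qualify_2021          # ~0.29

# Path 2: miss 2021 but make the 2022 batch in time to count.
p_still_going_2022 = 0.25     # doesn't call it quits after missing 2021
p_qualify_2022 = 0.35
p_named_before_2023 = 0.75    # officially designated within the 2-year window
p_2022 = (1 - p_2021) * p_still_going_2022 * p_qualify_2022 * p_named_before_2023

print(f"P(Superforecaster within 2 years) ~ {p_2021 + p_2022:.2f}")  # ~0.34
```
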

Delegate a forecast

Here's my forecast. The past is the best predictor of the future, so I looked at past monthly data as the base rate.

I first tried to tease out whether monthly activity in 2020 was correlated with monthly activity in 2019. There seemed to be a weak negative correlation, so I figured my base rate should be based on just the past few months of data.

In addition to the past few months of data, I considered that part of the catalyst for record-setting July activity might be Aaron's "Why you should post on the EA Forum" EAGx talk. Due to this possibility, I gave August a 65% chance of exceeding the base rate of 105 posts with >=10 karma.

My numerical analysis is in this sheet.
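
A minimal version of that analysis might look like this (the monthly counts below are made up, though the recent months are picked so the base rate lands near the 105 figure; the real numbers are in the linked sheet):

```python
import numpy as np

# Hypothetical counts of posts with >=10 karma, January through July.
posts_2019 = np.array([61, 57, 63, 55, 62, 58, 60])
posts_2020 = np.array([85, 90, 88, 95, 98, 101, 115])

# Step 1: check whether 2020's month-to-month pattern tracks 2019's.
r = np.corrcoef(posts_2019, posts_2020)[0, 1]
print(f"2019 vs 2020 monthly correlation: {r:.2f}")  # weak/negative here

# Step 2: with no useful seasonal signal, fall back to the recent months
# of 2020 as the base rate for August.
base_rate = posts_2020[-3:].mean()
print(f"base rate: ~{base_rate:.0f} posts")  # ~105
```
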

I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA

I've recently gotten into forecasting and have also been a strategy game enthusiast (at times, addict) at several points in my life. I'm curious about your thoughts on the links between the two:

  • How correlated are skill at forecasting and skill at strategy games?
  • Does playing strategy games make you better at forecasting?
Problem areas beyond 80,000 Hours' current priorities

A relevant Metaculus question about whether the impact of the Effective Altruism movement will still be picked up by Google Trends in 2030 (specifically, whether it will have at least 0.2 times the total interest from 2017) has a community prediction of 70%.

elifland's Shortform

The efforts by https://1daysooner.org/ to use human challenge trials to speed up vaccine development make me think about the potential of advocacy for "human challenge" type experiments in other domains where consequentialists might conclude there hasn't been enough "ethically questionable" randomized experimentation on humans. 2 examples come to mind:

My impression of the nutrition field is that it's very hard to get causal evidence because people won't change their diet at random for an experiment.

Why We Sleep has been a very influential book, but the sleep science research it draws upon is usually observational and/or relies on short time-spans. Alexey Guzey's critique and self-experiment have both cast doubt on its conclusions to some extent.

Getting 1,000 people to sign up and randomly contracting 500 of them to do X for a year, where X is something like being vegan or sleeping for 6.5 hours per day, could be valuable.
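
As a rough sanity check on whether a study of that size could detect anything (the outcome rates and effect sizes below are entirely hypothetical), a standard two-proportion power calculation:

```python
from math import sqrt
from scipy.stats import norm

def power_two_prop(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    pbar = (p1 + p2) / 2
    se_null = sqrt(2 * pbar * (1 - pbar) / n_per_arm)
    se_alt = sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)
    z_crit = norm.ppf(1 - alpha / 2)
    return norm.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)

# 500 vs 500: a large effect on some binary health outcome is detectable,
# a small one is not.
print(f"20% -> 12%: power {power_two_prop(0.20, 0.12, 500):.2f}")  # ~0.93
print(f"20% -> 17%: power {power_two_prop(0.20, 0.17, 500):.2f}")  # ~0.23
```

So 1,000 participants is plausibly enough for large effects on binary outcomes, but not for subtle ones.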

How should longtermists think about eating meat?

I think we have good reason to believe veg*ns will underestimate the cost of not eating meat for others due to selection effects: people for whom it's easier are more likely to both go veg*n and stick with it. Veg*ns generally underestimating the cost and non-veg*ns generally overestimating the cost can both be true.

The cost has been low for me, but the cost varies significantly based on factors such as culture, age, and food preferences. I think that in the vast majority of cases the benefits will still outweigh the costs, and most people would agree when viewed through a non-speciesist lens, but I fear that down-playing the costs too much will discourage people who try to go veg*n and do find it costly. Luckily, this is becoming less of an issue as plant-based substitutes become more widely available.

Why not give 90%?
If I was donating 90% every year, I think my probability of giving up permanently would be even higher than 50% each year. If I had zero time and money left to enjoy myself, my future self would almost certainly get demotivated and give up on this whole thing. Maybe I’d come back and donate a bit less but, for simplicity, let’s just assume that if Agape gives up, she stays given up.

The assumption that if she gives up, she most likely gives up on donating completely doesn't seem obvious to me. I would think it's more likely she scales back to a lower level, which would change the conclusion. It would be helpful to have data to determine which of these intuitions is correct.
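
As a toy illustration of how much the fallback assumption matters (all numbers hypothetical, loosely echoing the post's 50%-per-year give-up rate for a 90% donor):

```python
# Expected fraction of income donated over a 40-year career, where each
# year an active donor "gives up" with probability p_quit and thereafter
# donates at fallback_rate forever.
YEARS = 40

def expected_total(rate, p_quit, fallback_rate):
    total, p_active = 0.0, 1.0
    for _ in range(YEARS):
        total += p_active * rate + (1 - p_active) * fallback_rate
        p_active *= 1 - p_quit
    return total

print(expected_total(0.90, 0.50, 0.00))  # 90%, quits to 0%:   ~1.8 income-years
print(expected_total(0.90, 0.50, 0.10))  # 90%, scales to 10%: ~5.6
print(expected_total(0.20, 0.05, 0.00))  # steady 20% donor:   ~3.5
```

Under the quit-to-zero assumption the modest steady donor wins; if lapsed donors merely scale back, the aggressive donor comes out ahead, so the conclusion flips on exactly this assumption.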

Perhaps we should be encouraging a strategy where people increase their percentage donated by a few percentage points per year until they find the highest sustainable level for them. Combined with a community norm of acceptance for reductions in amounts donated, people could determine their highest sustainable donation level while lowering risk of stopping donations entirely.