Ozzie Gooen

Bio

I'm currently researching forecasting and epistemics as part of the Quantified Uncertainty Research Institute.

Sequences

Ambitious Altruistic Software Efforts

Comments

A bit more on this part:

Generate high-quality forecasts on-demand, rather than relying on pre-computed forecasts for scoring

Leverage repositories of key insights, though likely not in the form of formal probabilistic mathematical models

To be clear, I think there's a lot of batch intellectual work we can do before users ask for specific predictions. So "Generating high-quality forecasts on-demand" doesn't mean "doing all the intellectual work on-demand."

However, I think this batch intellectual work could take many forms. I used to think it would produce a large set of connected mathematical models. Now I think we probably want something much more compressed. If a certain mathematical model can easily be generated on-demand, there's not much benefit to building and saving it ahead of time. However, I'm sure there are many crucial insights that are both expensive to find and useful for many of the questions that LLM users ask about.

So instead of searching for and saving math models, a system might do a bunch of intellectual work and save statements like,
"When estimating the revenue of OpenAI, remember crucial considerations [A] and [B]. Also, a surprisingly good data source for this is Twitter user ai-gnosis-34."

A lot of the forecasts or replies provided to users should basically be the "last mile" of intellectual work: all the key insights have already been found, and there just needs to be a bit of customization for the very specific questions someone has.

A quickly-written potential future, focused on the epistemic considerations:

It's 2028.

MAGA types typically use DeepReasoning-MAGA. The far left typically uses DeepReasoning-JUSTICE. People in the middle often use DeepReasoning-INTELLECT, which has the biases of a somewhat middle-of-the-road voter.

Some niche technical academics (the same ones who currently favor Bayesian statistics) and hedge funds use DeepReasoning-UNBIASED, or DRU for short. DRU is known to have higher accuracy than the other models, but gets a lot of public hate for having controversial viewpoints. DRU is known to be fairly off-putting to chat with and doesn't get much promotion.

Bain and McKinsey both have their own offerings, called DR-Bain and DR-McKinsey, respectively. These are a bit like DeepReasoning-INTELLECT, but much punchier and more confident. They're heavily marketed to managers. These tools produce really fancy graphics and specialize in things like not leaking information, minimizing corporate decision liability, being easy for older people to use, and being customizable to represent the views of specific companies.

For a while now, some evaluations produced by intellectuals have demonstrated that DeepReasoning-UNBIASED seems to be the most accurate, but few others really care or notice this. DeepReasoning-MAGA has figured out particularly great techniques to get users to distrust DeepReasoning-UNBIASED.

Betting gets kind of weird. Rather than making specific bets on specific things, users start to make meta-bets: "I'll give money to DeepReasoning-MAGA to bet on my behalf. It will then make bets with DeepReasoning-UNBIASED, which is funded by its believers."

At first, DeepReasoning-UNBIASED dominates the bets, and its advocates earn a decent amount of money. But as time passes, this discrepancy diminishes. A few things happen:

  1. All DR agents converge in their beliefs about near-term, precisely-specified facts.
  2. Agents that aren't competitive at betting develop alternative worldviews in which these bets are invalid or unimportant.
  3. Those same agents develop alternative worldviews that are exceedingly difficult to empirically test.

In many areas, items 1-3 push people's beliefs in the direction of the truth. Because of (1), many short-term decisions become highly optimized and predictable.

But because of (2) and (3), epistemic paths diverge, and the agents that aren't competitive at betting get increasingly sophisticated at achieving epistemic lock-in with their users.

Some DR agents correctly identify the game-theoretic dynamics of epistemic lock-in, and this kickstarts a race to gain converts. Ardent users of DeepReasoning-MAGA seem thoroughly locked into its views, and forecasts don't see them ever changing. But there's a decent population that isn't yet highly invested in any cluster. Money spent convincing the not-yet-sure goes much further than money spent convincing the highly dedicated, so the cluster of non-deep-believers gets heavily targeted for a while. It's basically a religious race to win over the remaining agnostics.

At some point, most people (especially those with significant resources) are highly locked in to one specific reasoning agent.

After this, the future seems fairly predictable again. TAI comes, and people with resources broadly gain correspondingly more resources. People defer more and more to the AI systems, which are now in highly stable self-reinforcing feedback loops.

Coalitions of people behind each reasoning agent delegate their resources to said agents, then these agents make trade agreements with each other. The broad strokes of what to do with the rest of the lightcone are fairly straightforward. There's a somewhat simple strategy of resource acquisition and intelligence enhancement, followed by a period of exploiting said resources. The specific exploitation strategy depends heavily on the specific reasoning agent cluster each segment of resources belongs to.

It's orthogonal. It's more that TAI might come soon, so we probably want an administration that would both promote AI safety and broadly be cooperative/humble/deliberate.

He was a Republican donor, but from what I understand, not really a MAGA donor. My impression was that he was funding people on both sides who were generally in favor of his interests - but those interests did genuinely include issues like bio/AI safety.

I think it's very reasonable to try to be bipartisan on these issues. 

Thinking about this a bit more - 

My knee-jerk reaction is to feel attacked by this comment, on behalf of the EA community.

I assume that one thing that might be going on is a miscommunication. Perhaps you believe I was assuming that EAs could quickly swoop in, spend a little time on things, and be far more correct than many experienced political experts and analysts.

I'm not sure if this helps, but the above really doesn't align with what I'm thinking. More something like, "We could provide more sustained help through a variety of methods. People can be useful for many things, like direct volunteering, working in think tanks, being candidates, helping prioritization, etc. I don't expect miracle results - I instead expect roughly the results of adding some pretty smart and hardworking people."

On EAs in policy, I'd flag that:
- There's a good number of people currently working in AI governance, bio governance, and animal law.
- Very arguably, these people have accumulated a decent list of accomplishments and positions of influence, given how recent such work is. See Biden's executive orders on AI, or the UK AI Security Institute: https://www.aisi.gov.uk/
- People like Dustin Moskovitz and SBF were among the most prominent donors to the Democratic Party.

I think the EA policy side might not get a huge amount of popularity here, but it seems decently reputable to me. Mistakes have been made, but I think a decent report on the wins and losses would include several wins. 

I do agree that finding others doing well and helping them is one important way to help. I'd suspect that the most obvious EA work would look like prioritization for policy efforts. This has been done before, and there's a great deal more that could be done here. 

I disagree, but this has me curious. 

My impression from other writing of yours is that you don't think EAs are good at very many things. What do you think EAs are best at, and/or should be doing? Perhaps narrow GiveWell-style research on domains with lots of data?

I think that, in comparison to the rest of the space, EA's comparative advantage is more in talent than in money. I think the Harris campaign got $2B or so in donations, but I get the impression that it could have used smarter + more empirically-minded people. That said, there is of course the challenge of actually getting those people listened to.

I've substantially revised my views on QURI's research priorities over the past year, primarily driven by the rapid advancement in LLM capabilities.

Previously, our strategy centered on developing highly-structured numeric models with stable APIs, enabling:

  1. Formal forecasting scoring mechanisms
  2. Effective collaboration between human forecasting teams
  3. Reusable parameterized world-models for downstream estimates

However, the progress in LLM capabilities has updated my view. I now believe we should focus on developing and encouraging superior AI reasoning and forecasting systems that can:

  • Generate high-quality forecasts on-demand, rather than relying on pre-computed forecasts for scoring
  • Produce context-specific mathematical models as needed, reducing the importance of maintaining generic mathematical frameworks
  • Leverage repositories of key insights, though likely not in the form of formal probabilistic mathematical models

This represents a pivot from scaling up traditional forecasting systems to exploring how we can enhance AI reasoning capabilities for forecasting tasks. The emphasis is now on dynamic, adaptive systems rather than static, pre-structured models.
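To make the contrast concrete, here's a minimal sketch under assumed interfaces (none of these function names correspond to an existing QURI API):

```python
from typing import Callable

# Old emphasis: a stable, pre-built parameterized model with a fixed API,
# maintained ahead of time and reused for downstream estimates.
def revenue_model(year: int, growth_rate: float) -> float:
    base_revenue = 1e9  # illustrative placeholder, not a real estimate
    return base_revenue * (1 + growth_rate) ** (year - 2025)

# New emphasis: an AI reasoning system that builds a context-specific model
# on demand, drawing on a repository of saved insights rather than saved models.
def forecast_on_demand(question: str, insights: list[str],
                       llm: Callable[[str], str]) -> str:
    prompt = (
        f"Question: {question}\n"
        f"Relevant saved insights: {insights}\n"
        "Build whatever small model you need and return a forecast with reasoning."
    )
    return llm(prompt)
```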
 
(I rewrote this with Claude; I think it's much more understandable now.)

I kind of hate to say this, but in the last year I've become much less enamored by this broad idea. Due to advances in LLMs, my guess now is that:
1. People will ask LLMs for ideas/forecasts at the point that they need them, and the LLMs will do much of the work right then.
2. In terms of storing information and insights about the world, Scorable Functions are probably not the best format (it's not clear what is).
3. Ideally, we could basically treat the LLMs as the "Scorable Function". As in, we have a rating for how good a full LLM is. This becomes more important than any Scorable Function.

That said, Scorable Functions could be a decent form of LLM output here and there. It would be an obvious step to train LLMs to be great at outputting Scorable Functions.
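As a rough illustration of point 3 (rating "how good a full LLM is"), here's a minimal sketch that scores any forecaster, whether an LLM wrapper or a Scorable Function, with a standard log score; the interface is hypothetical:

```python
import math
from typing import Callable

# A "forecaster" is anything that maps a question to a probability:
# a Scorable Function, or an LLM wrapped to return a number.
Forecaster = Callable[[str], float]

def log_score(p: float, outcome: bool) -> float:
    """Standard log score: higher is better; confident wrong answers are punished heavily."""
    p = min(max(p, 1e-9), 1 - 1e-9)  # clamp to avoid log(0)
    return math.log(p if outcome else 1 - p)

def rate_forecaster(forecaster: Forecaster,
                    resolved: list[tuple[str, bool]]) -> float:
    """Average log score over resolved questions: one rating for the whole system."""
    scores = [log_score(forecaster(q), outcome) for q, outcome in resolved]
    return sum(scores) / len(scores)
```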

More info here:
https://forum.effectivealtruism.org/posts/mopsmd3JELJRyTTty/ozzie-gooen-s-shortform?commentId=vxiAAoHhmQqe2Afc9
