In that case, FTX and other series B funders held about a 14% stake in Anthropic. If FTX is liquidated and someone ends up owning their share, what does it get them? A seat on the board?
A concrete version of this I've been wondering about the last few days: To what extent are the negative results on Debate (single-turn, two-turn) intrinsic to small-context supervision vs. a function of relatively contingent design choices about how people get to interact with the models?
I agree that misuse is a concern. Unlike alignment, I think it's relatively tractable because it's more similar to problems people are encountering in the world right now.
To address it, we can monitor and restrict usage as needed. The same tools that Elicit provides for reasoning can also be used to reason about whether a use case constitutes misuse.
This isn't to say that we might not need to invest a lot of resources eventually, and it's interestingly related to alignment ("misuse" is relative to some values), but it feels a bit less open-ended.
Elicit is using the Semantic Scholar Academic Graph dataset. We're working on expanding to other sources. If there are particular ones that would be helpful, message me?
Have you listened to the 80k episode with Nova DasSarma from Anthropic? They might have cybersecurity roles. The closest we have right now is devops—which, btw, if anyone is reading this comment, we are really bottlenecked on and would love intros to great people.
No, it's that our case for alignment doesn't rest on "the system is only giving advice" as a step. I sketched the actual case in this comment.
Oh, forgot to mention Jonathan Uesato at DeepMind who's also very interested in advancing the ML side of factored cognition.
The things that make submodels easier to align that we’re aiming for:
For AGI there isn't much of a distinction between giving advice and taking actions, so this isn't part of our argument for safety in the long run. But in the time between here and AGI it's better to focus on supporting reasoning to help us figure out how to manage this precarious situation.
To clarify, here’s how I’m interpreting your question:
“Most technical alignment work today looks like writing papers or theoretical blog posts and addressing problems we’d expect to see with more powerful AI. It mostly doesn’t try to be useful today. Ought claims to take a product-driven approach to alignment research, simultaneously building Elicit to inform and implement its alignment work. Why did Ought choose this approach instead of the former?”
First, I think it’s good for the community to take a portfolio approach and for different teams to pur...
We're aiming to shift the balance towards supporting high-quality reasoning. Every tool has some non-zero usefulness for non-central use cases, but it seems unlikely that it will be as useful as tools that were made for those use cases.
I found your factored cognition project really interesting, is anyone still researching this? (besides the implementation in Elicit)
Some people who are explicitly interested in working on it: Sam Bowman at NYU, Alex Gray at OpenAI. On the ML side there’s also work like Selection-Inference that isn’t explicitly framed as factored cognition but also avoids end-to-end optimization in favor of locally coherent reasoning steps.
I’d say what we’re afraid of is that we’ll have AI systems that are capable of sophisticated planning but that we don’t know how to channel those capabilities into aligned thinking on vague complicated problems. Ought’s work is about avoiding this outcome.
At this point we could chat about why it’s plausible that we’ll have such capable but unaligned AI systems, or about how Ought’s work is aimed at reducing the risk of such systems. The former isn’t specific to Ought, so I’ll point to Ajeya’s post Without specific countermeasures, the easiest path to trans...
We built Ergo (a Python library for integrating model-based and judgmental forecasting) as part of our work on forecasting. In the course of this work we realized that for many forecasting questions the bottleneck isn’t forecasting infrastructure per se, but the high-quality research and reasoning that goes into creating good forecasts, so we decided to focus on that aspect.
I’m still excited about Ergo-like projects (including Squiggle!). Developing it further would be a valuable contribution to epistemic infrastructure. Ergo is an MIT-licensed open-source...
Ought is an applied machine learning lab, hiring for:
Our mission is to automate and scale open-ended reasoning. To that end, we’re building Elicit, the AI research assistant. Elicit's architecture is based on supervising reasoning processes, not outcomes. This is better for supporting open-ended reasoning in the short run and better for alignment in the long run.
Over the last year, we built Elicit to support broad reviews of empirical l...
We're also only reporting our current guess for how things will turn out. We're monitoring how Elicit is used and we'll study its impacts and the anticipated impacts of future features, and if it turns out that the costs outweigh the benefits we will adjust our plans.
Are you worried that your work will be used for more likely regrettable things like
- improving the competence of actors who are less altruistic and less careful about unintended consequences (e.g. many companies, militaries and government institutions), and
Less careful actors: Our goal is for Elicit to help people reason better. We want less careful people to use it and reason better than they would have without Elicit, recognizing more unintended consequences and finding actions that are more aligned with their values. The hope is that if we can make go...
Ought co-founder here. There are two ways Elicit relates to alignment broadly construed:
1 - Elicit informs how to train powerful AI through decomposition
Roughly speaking, there are two ways of training AI systems:
We think decomposition may be a safer way to train powerful AI if it can scale as well as end-to-end training.
Elicit is our bet on the compositional approach. We’re testing how feasible it is to decompose large tasks like “figure out the answer to this science question by ...
Speaker here. I haven't reviewed this transcript yet, but shortly after the talk I wrote up these notes (slides + annotations) which I probably endorse more than what I said at the time.
Another potential windfall I just thought of: the kind of AI scientist system discussed by Bengio in this talk (older writeup). The idea is to build a non-agentic system that uses foundation models and amortized Bayesian inference to create and do inference on compositional and interpretable world models. One way this would be used is for high-quality estimates of p(harm|action) in the context of online monitoring of AI systems, but if it could work it would likely have other profitable use cases as well.
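To make the online-monitoring use case a bit more concrete, here is a toy sketch of gating actions on an estimate of p(harm|action). This is not the system from the talk: a simple Beta-Bernoulli posterior per action type stands in for amortized inference over a world model, and the names `HarmEstimate`, `Monitor`, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class HarmEstimate:
    """Beta posterior over p(harm | action) for one action type (toy stand-in)."""
    alpha: float = 1.0  # prior pseudo-count of harmful outcomes
    beta: float = 1.0   # prior pseudo-count of harmless outcomes

    def update(self, harmful: bool) -> None:
        # Conjugate update: add one count to the matching side.
        if harmful:
            self.alpha += 1
        else:
            self.beta += 1

    @property
    def mean(self) -> float:
        # Posterior mean of p(harm | action).
        return self.alpha / (self.alpha + self.beta)


class Monitor:
    """Allows an action only if its estimated p(harm | action) is below a threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.estimates: Dict[str, HarmEstimate] = {}

    def record(self, action: str, harmful: bool) -> None:
        # Log one observed outcome for this action type.
        self.estimates.setdefault(action, HarmEstimate()).update(harmful)

    def allow(self, action: str) -> bool:
        # Unseen actions fall back to the uniform prior (mean 0.5).
        estimate = self.estimates.get(action, HarmEstimate())
        return estimate.mean < self.threshold


monitor = Monitor(threshold=0.3)
for _ in range(8):
    monitor.record("send_email", harmful=False)
for harmful in (True, True, True, False):
    monitor.record("delete_logs", harmful=harmful)
```

With this threshold, an action type with no recorded outcomes is blocked, since the uniform prior puts its estimate at 0.5. Whether that conservative default is right would depend on the deployment.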