We're Ought. We're going to answer questions here on Tuesday, August 9th at 10am Pacific. We may get to some questions earlier, and may continue answering a few more throughout the week.
About us:
- We're an applied AI lab, taking a product-driven approach to AI alignment.
- We're 10 people right now, roughly split between the Bay Area and the rest of the world (New York, Texas, Spain, UK).
- Our mission is to automate and scale open-ended reasoning. We are working on getting AI to be as helpful for supporting reasoning about long-term outcomes, policy, alignment research, AI deployment, etc. as it is for tasks with clear feedback signals.
- We're building the AI research assistant Elicit. Elicit's architecture is based on supervising reasoning processes rather than outcomes, as an implementation of factored cognition. This is better for supporting open-ended reasoning in the short run and better for alignment in the long run.
- Over the last year, we built Elicit to support broad reviews of empirical literature. We're currently expanding to deep literature reviews, then other research workflows, then general-purpose reasoning.
- We're hiring for full-stack, devops, ML, product analyst, and operations manager roles.
We're down to answer basically any question, including questions about our mission, theory of change, work so far, future plans, Elicit, relation to other orgs in the space, and what it's like to work at Ought.
There are and have been a lot of startups working on similar things (AI to assist researchers), going back to IBM's ill-fated Watson. Your demo makes it look very useful and is definitely the most impressive I've seen. I'm deeply suspicious of demos, however.
How can you test if your system is actually useful for researchers?
[One (albeit imperfect) way to gauge utility is to see if people are willing to pay money for it and keep paying money for it over time. However, I assume that is not the plan here. I guess another thing would be to track how much people use it over time or see if they fall away from using it. Another of course would be an RCT, although it's not clear how it would be structured.]
This is a live product - not just a demo! You can use it at elicit.org.
More than 45K users have tried it and ~10K use it each month. Users say that Elicit saves them ~1-2 hours/week. They proactively share positive feedback on places like Twitter and with their colleagues or friends: Elicit’s growth is entirely by word of mouth.
I agree that having people pay for it is one of the strongest indicators of value. We’ll have to balance financial sustainability with the desire to make a high-quality tool accessible.
At some point, we probably will do a more for...