goodgravy

Head of Engineering @ Elicit
70 karma · Joined Apr 2022 · jmsbrdy.com/

Comments (4)

You're welcome!

  1. The goal is for Elicit to be a research assistant, leading to more and higher-quality research. Literature review is only one small part of that: we would like to add functionality like brainstorming research directions, finding critiques, identifying potential collaborators, …

    Beyond that, we believe that factored cognition could scale to lots of knowledge work. Anywhere the tasks are fuzzy, open-ended, or have long feedback loops, we think Elicit (or our next product) could be a fit: journalism, think tanks, policy work.
  2. It is, very much. Answering so-called strength-of-evidence questions accounts for a big chunk of researchers' time today.

Another benefit of our product-driven approach is that we aim to make a positive contribution to the alignment community. By which I mean:

Thanks to amazing prior work in straight alignment research, we already have some idea of the anti-patterns and risks that we all want to avoid. What we're still lacking are safety attractors: alternative approaches which are competitive with, and safer than, the current paradigm.

We want Elicit to be an existence proof that there is a better way to solve certain complex tasks, and for our approach to go on to be adopted by others – because it's in their self-interest, not because it's safe.

In a research assistant setting, you could imagine the top-level task being something like "Was this a double-blind study?", which we might factor into subquestions like:

  • Were the participants blinded?
    • Was there a placebo?
      • Which paragraphs relate to placebos?
        • Does this paragraph state there was a placebo?
    • Did the participants know if they were in the placebo group?
  • Were the researchers blinded?

In this example, by the time we get to the "Does this paragraph state there was a placebo?" level, a submodel is handed a fairly tractable question-answering task over a single paragraph. A typical response might be a confidence level plus text spans pointing to the most relevant phrases.
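
To make the shape of this concrete, here is a minimal sketch in Python of how such a decomposition could be wired up. Everything in it (the Answer type, the keyword-based answer_leaf stand-in, the min-confidence aggregation) is an illustrative placeholder rather than how Elicit is actually implemented, and it collapses the "Which paragraphs relate to placebos?" step into simply running the leaf question over every paragraph:

```python
# Hypothetical sketch of factored cognition for the "double-blind?" example.
# None of these names come from Elicit's codebase; they are placeholders.
from dataclasses import dataclass

@dataclass
class Answer:
    verdict: bool        # best-guess answer to the question
    confidence: float    # 0.0 - 1.0
    spans: list[str]     # supporting text pulled from the paper

# A static decomposition of the top-level question into subquestions.
SUBQUESTIONS = {
    "Was this a double-blind study?": [
        "Were the participants blinded?",
        "Were the researchers blinded?",
    ],
    "Were the participants blinded?": [
        "Was there a placebo?",
        "Did the participants know if they were in the placebo group?",
    ],
    "Was there a placebo?": [
        "Does this paragraph state there was a placebo?",
    ],
}

def answer_leaf(question: str, paragraph: str) -> Answer:
    """Stand-in for a submodel doing QA over one paragraph.
    A trivial keyword heuristic here; a language model in practice."""
    found = "placebo" in paragraph.lower()
    return Answer(found, 0.9 if found else 0.3,
                  [paragraph] if found else [])

def answer(question: str, paragraphs: list[str]) -> Answer:
    """Recursively answer a question by answering its subquestions."""
    if question not in SUBQUESTIONS:
        # Leaf: run the paragraph-level QA over each paragraph, keep the
        # most confident result.
        results = [answer_leaf(question, p) for p in paragraphs]
        return max(results, key=lambda a: a.confidence)
    children = [answer(q, paragraphs) for q in SUBQUESTIONS[question]]
    # Naive aggregation: all subquestions must hold; confidence is the minimum.
    return Answer(
        verdict=all(c.verdict for c in children),
        confidence=min(c.confidence for c in children),
        spans=[s for c in children for s in c.spans],
    )

paper = [
    "Methods: participants received either the drug or a matching placebo.",
    "Outcome assessors were unaware of group assignment.",
]
print(answer("Was this a double-blind study?", paper))
```

The point of the sketch is the structure: the top-level judgement is assembled from small, checkable sub-answers, each of which only ever sees a narrow question and a small amount of text.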

Great question! Yes, this is definitely on our minds as a potential harm of Elicit.

People who end up with one-sided evidence right now probably fall into two loose groups:

  1. People who accidentally end up with it because good reasoning is hard and time-consuming to do.
  2. People who seek it out because they want to bolster a pre-existing belief.

For the first group – the accidental ones – we’re aiming to make good reasoning as easy as (and ideally easier than) finding one-sided evidence. Work we’ve done so far:

  • We have a “possible critiques” feature in Elicit which looks for papers which arrive at different conclusions. These critiques are surfaced – if available – whenever a user clicks in to see more information on a paper.
  • We have avoided using social-standing cues such as citations when evaluating papers. We do expose those data in the app, but don’t – for example – boost papers cited by others. In this way, we hope to surface relevant and diverse papers from a range of authors, whether or not they happen to be famous (see the sketch after this list).
  • At the same time, our chosen initial set of users (professional researchers) are relatively immune to accidentally doing one-sided research, because they care a lot about careful and correct reasoning.
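
As a toy illustration of that second point (not our actual ranking code): keeping citation counts as display-only metadata means the relevance score is computed purely from the query and the paper, here as a plain cosine similarity over hypothetical paper embeddings, with citations never entering the ranking function.

```python
# Hypothetical sketch: rank candidate papers by query relevance only.
from dataclasses import dataclass
import numpy as np

@dataclass
class Paper:
    title: str
    embedding: np.ndarray   # hypothetical semantic embedding of the abstract
    citations: int          # shown in the UI, but never used for ranking

def rank(query_embedding: np.ndarray, papers: list[Paper]) -> list[Paper]:
    def relevance(p: Paper) -> float:
        # Cosine similarity between query and paper; note that
        # p.citations is deliberately absent from this score.
        return float(np.dot(query_embedding, p.embedding) /
                     (np.linalg.norm(query_embedding) *
                      np.linalg.norm(p.embedding)))
    return sorted(papers, key=relevance, reverse=True)
```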

For the second group – the intentional ones – we expect that Elicit might give them a slight advantage over alternative tools right now, but longer-term it probably won’t be any more useful to them than other search tools that use language models with retrieval (e.g. this chatbot). And as Elicit and other tools that care about good epistemics improve, it will become easier to reveal the misleading arguments made by this second group.