Retrospective on thinking about my career for a year

Thanks for this. I was curious about "Pick a niche or undervalued area and become the most knowledgeable person in it." Do you feel comfortable saying what the niche was? Or even if not, can you say a bit more about how you went about doing this?

Parallels Between AI Safety by Debate and Evidence Law

This is very interesting! I'm excited to see connections drawn between AI safety and the law / philosophy of law. It seems there are a lot of fruitful insights to be had.

You write,

"The rules of Evidence have evolved over long experience with high-stakes debates, so their substantive findings on the types of arguments that prove problematic for truth-seeking are relevant to Debate."

Can you elaborate a bit on this?

I don't know anything about the history of these rules about evidence. But why think that over this history, these rules have trended towards truth-seeking per se? I wouldn't be surprised if the rules have evolved to better serve the purposes of the legal system over time, but presumably the relationship between this end and truth-seeking is quite complex. Also, people changing the rules could be mistaken about what sorts of evidence do in fact tend to lead to wrong decisions.

I think all of this is compatible with your claim. But I'd like to hear more!

AMA or discuss my 80K podcast episode: Ben Garfinkel, FHI researcher

Thanks for the great summary! A few questions about it:

1. You call mesa-optimization "the best current case for AI risk". As Ben noted at the time of the interview, this argument hasn't yet really been fleshed out in detail. And as Rohin subsequently wrote in his opinion of the mesa-optimization paper, "it is not yet clear whether mesa optimizers will actually arise in practice". Do you have thoughts on what exactly the "Argument for AI Risk from Mesa-Optimization" is, and/or a pointer to the places where, in your opinion, that argument has been made (aside from the original paper)?

2. I don't entirely understand the remark about the reference class of ‘new intelligent species’. What species are in that reference class? Many species which we regard as quite intelligent (orangutans, octopuses, New Caledonian crows) aren't risky. Probably, you mean a reference class like "new species as smart as humans" or "new 'generally intelligent' species". But then we have a very small reference class and it's hard to know how strong that prior should be. In any case, how were you thinking of this reference class argument?

3. 'The Boss Baby', starring Alec Baldwin, is available for rental on Amazon Prime Video for $3.99. I suppose this is more of a comment than a question.