
I was mildly disappointed in the responses to my last question, so I did a bit of thinking and came up with some answers myself. I'm not super happy with them either, and would love feedback on them, plus variations, new suggestions, etc. The ideas are:

1. I approach someone who has longer timelines than me, and I say: I'll give you $X now if you promise to publicly admit I was right later and give me a platform, assuming you later come to update significantly in my direction.

2. I approach someone who has longer timelines than me, and I say: I'll give you $X now if you agree to talk with me about timelines for 2 hours. I'd like to hear your reasons, and to hear your responses to mine. The talk will be recorded. Then I get one coupon which I can redeem for another 2-hour conversation with you.

3. I approach someone who has longer timelines than me, and I say: For the next 5 years, you agree to put x% of your work-hours into projects of my choosing (perhaps with some constraints, like personal fit). Then, for the rest of my life, I'll put x% of my work-hours into projects of your choosing (constraints, etc.).

The problem with no. 3 is that it's probably too big an ask. Maybe it would work with someone who I already get along well with and can collaborate on projects, whose work I respect and who respects my work.

The point of no. 2 is to get them to update towards my position faster than they otherwise would have. This might happen in the first two-hour conversation, even. (They get the same benefits from me, plus cash, so it should be pretty appealing for sufficiently large X. Plus, I also benefit from the extra information which might help me update towards longer timelines after our talk!) The problem is that maybe forced 2-hour conversations don't actually succeed in that goal, depending on psychology/personality.

A variant of no. 2 would simply be to challenge people to a public debate on the topic. Then the goal would be to get the audience to update.

The point of no. 1 is to get them to give me some of their status/credibility/platform, in the event that I turn out to be probably right. The problem, of course, is that it's up to them to decide whether I'm probably right, and it gives them an incentive to decide that I'm not!


I'm pretty sure I have longer timelines than you. On each of the bets:

  1. I would take this, but also I like to think if I did update towards your position I would say that anyway (and I would say that you got it right earlier if you asked me to do so, to the extent that I thought you got it right for the right reasons or something).
  2. I probably wouldn't take this (unless X was quite high), because I don't really expect either of us to update to the other's position.
  3. I wouldn't take this; I am very pessimistic about my ability to do research that I'm not inside-view excited about (like, my 50% confidence interval is that I'd have 10-100x less impact even in the case where someone with the same timelines as me is choosing the project, if they don't agree with me on research priorities). It isn't necessary that someone with shorter timelines than me would choose projects I'm not excited about, but from what I know about what you care about working on, I think it would be the case here. Similarly, I am pessimistic about your ability to do research on broad topics that I choose on my inside view. (This isn't specific to you; it applies to anyone who doesn't share most of my views.)

Thanks! Yeah, your criticism of no. 3 is correct. As for no. 1, this probably works best for bets with people who wouldn't do this correctly absent a bet, but who would do it correctly with one... which is perhaps a narrow band of people!

How high would you need for no. 2? I might do it anyway, just for the information value. :) My views on timelines haven't yet been shaped by much direct conversation with people like yourself.

Rohin Shah
I'm happy to sell an hour of my time towards something with no impact at $1,000, so that puts an upper bound of $4,000. (Though currently I've overcommitted myself, so for the next month or two it might be ~2x higher.) That being said, I do think it's valuable for people working on AI safety to at least understand each other's positions; if you don't think you can do that re: my position, I'd probably be willing to have that conversation without being paid at all (after the next month or two). And I do expect to understand your position better, though I don't expect to update towards it, so that's another benefit.
OK, thanks. FWIW I expect at least one of us to update at least slightly. Perhaps it'll be me. I'd be interested to know why you disagree--do I come across as stubborn or hedgehoggy? If so, please don't hesitate to say so; I'd be grateful to hear it. I might be willing to pay $4,000, especially if I could think of it as part of my donation for the year. What would you do with the money--donate it? As for time, sure, happy to wait a few months.

Counterfactuals are hard. I wouldn't be committing to donate it. (Also, if I were going to donate it, but it would have been donated anyway, then $4,000 no longer seems worth it if we ignore the other benefits.)

I expect at least one of us to update at least slightly.

I agree with "at least slightly".

I'd be interested to know why you disagree

Idk, empirically when I discuss things with people whose beliefs are sufficiently different from mine, it doesn't seem like their behavior changes much afterwards, even if they say they updated towards X. Similarly, when people talk to me, I often don't see myself making any particular changes to how I think or behave. There's definitely change over the course of a year, but it feels extremely difficult to ascribe that to particular things, and I think it more often comes from reading things that people wrote, rather than talking to them.

OK. Good to hear. I'm surprised to hear that you think my beliefs are sufficiently different from yours. I thought your timelines views were very similar to Ajeya's; well, so are mine! (Also, I've formed my current views mostly in the last 6 months. Had you asked me a year or two ago, I probably would have said something like a median of 20-25 years from now, which is pretty close to your median, I think. This is evidence, I think, that I could change my mind back.) Anyhow, I won't take up any more of your time... for now! Bwahaha! :)

I mostly agree with Rohin's answer, and I'm pretty skeptical overall of AI safety as a cause area, although I have deep uncertainty about this and might hedge by supporting s-risk-focused work.

Are you primarily interested in these trades with people who already prioritize AI safety?

On 3, do you mean you'd start putting x% after the first 5 years?

I think it's plausible you could find people who are undecided between AI safety with short timelines and other cause areas, or between short and long timelines, and pay them enough to work on AI safety for short timelines, since they could address their uncertainty with donations outside of (short-timeline) AI safety. I've worked as a deep learning researcher/engineer to earn-to-give for animal welfare, and I have considered working in AI safety focused on worst-case scenarios (CLR, CRS) or continuing to earn-to-give for animal welfare. I think technical AI safety would be more interesting and motivating than my past work in deep learning, and perhaps more interesting day-to-day than my current plans but less motivating in the long run due to my skepticism. I was preparing to apply to CLR's internship for this past summer, but got an internship offer from Charity Entrepreneurship first and decided to go with that. I know one person who did something similar but went with AI safety instead.

It might be too expensive to pay people interested in earning-to-give enough to earn-to-give in (short timeline) AI safety, if AI safety isn't already one of their top priorities. Also, they don't even have to be EAs; you could find people who would just find the work interesting (e.g. people with graduate degrees in related subjects) but are worried about it not paying enough. You could take out loans to do this, but this kind of defies common sense and sounds pretty crazy to me.

(FWIW, my own price to work on AI safety (short or long timelines) is probably too high now, and, of course, there's the question of whether I'm a good fit, anyway.)

Sorry for the delayed reply. I'm primarily interested in making these trades with people who have a similar worldview to me, because this increases the chance that as a result of the trade they will start working on the things I think are most valuable. I'd be happy to talk with other people too, except that if there's so much inferential distance to cross it would be more for fun than for impact. That said, maybe I'm modelling this wrong.

Yes, for no. 3 I meant after the first 5 years. Good catch.

It sounds like you might be a good fit for this sort of thing! Want to have a call to chat sometime? I'm also interested in doing no. 2 with you...


I think I'd need to read more before we could have a very productive conversation. If you want to point me to some writing that you found most persuasive for short timelines (or you could write a post laying out your reasoning, if you haven't already; this could generate more useful community discussion, too), that would be helpful. I don't want to commit to anything yet, though. I'm also not that well-read on AI safety in general. A few sources of skepticism I have now:

1. Training an agent to be generally competent in interactions with humans and our systems (even virtually, and not just in conversation) could be too slow, or could require more complex simulated data than is feasible. Maybe a new version of GPT will be an AGI but not an agent, and that might come soon; while that could still be very impactful, it might not pose an existential risk. Animals as RL agents have had millions of years of evolution to build strong priors, fitted to real-world environments, into each individual.

2. I'm just skeptical about trying to extrapolate current trends to AGI.

3. On AI risk more generally, I'm skeptical that an AI could acquire and keep enough resources, without the backing of people with access to them, to be very dangerous. It would have to deceive us at least until it's too late for us to cut its access, e.g. by cutting the power or internet (which we can do physically, including by bombing), and I haven't heard of such a scenario that wasn't far-fetched. If we do catch it doing something dangerous, we will cut access. It would need access to powerful weapons to protect its access to resources, or to do much harm before we could cut its access to resources. This seems kind of obvious, though, so I imagine there are some responses from the AI safety community.
Thanks, this is helpful! I'm in the middle of writing some posts laying out my reasoning... but it looks like it'll take a few more weeks at least, given how long it's taken so far. Funnily enough, all three of the sources of skepticism you mention are things that I happen to have written about or am in the process of writing about. This is probably a coincidence. Here are my answers to 1, 2, and 3, or more like teasers of answers:

1. I agree, it could. But it also could not. I think a non-agent AGI would also be a big deal; in fact I think there are multiple potential AI-induced points of no return. (For example, a non-agent AGI could be retrained to be an agent, or could be a component of a larger agenty system, or could be used to research agenty systems faster, or could create a vulnerable world that ends quickly or goes insane.) I'm also working on a post arguing that the millions of years of evolution don't mean shit, and that while humans aren't blank slates they might as well be for purposes of AI forecasting. :)

2. My model for predicting AI timelines (which I am working on a post for) is similar to Ajeya's. I don't think it's fair to describe it as an extrapolation of current trends; rather, it constructs a reasonable prior over how much compute should be needed to get to AGI, and then we update on the fact that the amount of compute we have so far hasn't been enough, and make our timelines by projecting how the price of compute will drop. (So yes, we are extrapolating compute price trends, but those seem fairly solid to extrapolate, given the many decades across which they've held fairly steady, and given that we only need to extrapolate them for a few more years to get a non-trivial probability.)

3. Yes, this is something that's been discussed at length. There are lots of ways things could go wrong. For example, the people who build AGI will be thinking that they can use it for something, otherwise they wouldn't have built it.
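The "prior over required compute, update on compute so far, extrapolate price trends" reasoning in point 2 can be sketched numerically. Everything below is a made-up illustration of the *shape* of the argument: the prior parameters, the 1e24-FLOP "compute so far" figure, and the growth rate are hypothetical placeholders, not Ajeya's actual estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. Broad lognormal prior over training FLOP needed for AGI
#    (working in log10; hypothetical median 1e34, sd of 3 orders of magnitude).
log10_flop_needed = rng.normal(loc=34.0, scale=3.0, size=100_000)

# 2. Update on the observation that compute spent so far (~1e24 FLOP here,
#    a placeholder) hasn't been enough: truncate the prior below that point.
log10_flop_needed = log10_flop_needed[log10_flop_needed > 24.0]

# 3. Extrapolate the compute price trend: assume ~1e24 FLOP is affordable now
#    and affordable compute grows 10x every 5 years (0.2 orders of mag./year).
years_until_affordable = (log10_flop_needed - 24.0) / 0.2

# Implied probability that AGI-scale compute becomes affordable within N years.
p_within_20y = float(np.mean(years_until_affordable <= 20.0))
print(p_within_20y)
```

Changing the prior's location and spread moves `p_within_20y` a lot, which is exactly why the sensitivity analysis mentioned in the next comment matters.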
I guess a few quick responses to each, although I haven't read through your links yet:

1. I think agenty systems in general can still be very limited in how competent they are, due to the same data/training bottlenecks, even if you integrate a non-agential AGI into the system.

2. I did see Ajeya's post and read Rohin's summary. I think there might not be any one most reasonable prior for the compute necessary for AGI (or for whether hitting some level of compute is enough, even given enough data or sufficiently complex training environments), since this requires strong and basically unjustified assumptions about whether current approaches (or the next approaches we come up with) can scale to AGI. Still, this doesn't mean AGI timelines aren't short; it might just mean you should do a sensitivity analysis on different priors when you're thinking of supporting or doing certain work. And, of course, they did do such a sensitivity analysis for the timeline question.

3. In response to this specifically, "As for whether we'd shut it off after we catch it doing dangerous things -- well, it wouldn't do them if it thought we'd notice and shut it off. This effectively limits what it can do to further its goals, but not enough, I think.", what other kinds of ways do you expect it would go very badly? Is it mostly unknown unknowns?
Well, I look forward to talking more sometime! No rush, let me know if and when you are interested. On point no. 3 in particular, here are some relevant parables (a bit lengthy, but also fun to read!):

https://www.lesswrong.com/posts/5wMcKNAwB6X4mp9og/that-alien-message
https://www.lesswrong.com/posts/bTW87r8BrN3ySrHda/starwink-by-alicorn
https://www.gregegan.net/MISC/CRYSTAL/Crystal.html

(I especially recommend this last one; it's less relevant to our discussion but a better story, and it raises some important ethical issues.)

To the extent funding constraints are real, betting on short timelines can be seen as a special case of valuing money now over money later more strongly than other people do.

In that regard, it'd be reasonable to figure out mechanisms for borrowing against your future income. I don't know how difficult this is in practice, but it's plausible you could arrange it with EAs if, for various reasons (e.g. counterparty risk), standard financial institutions won't let you.

Yes. As I explained in my previous post, it's not money I'm after, but rather knowledge and help.

I personally feel skeptical of short AI timelines (though I feel far too confused about this question to have confident views!). I'd definitely be interested in having a call where you try to convince me of this though, if that offer is open to anyone! I expect to find this interesting, so don't care at all about money here.

OK, cool, yes let's talk sometime! Will send pm.
