I'm currently working as a Senior Research Scholar at the Future of Humanity Institute.
Takeaways from some reading about economic effects of human-level AI
I spent some time reading things that you might categorise as “EA articles on the impact of human-level AI on economic growth”. Here are some takeaways from reading these (apologies for not always providing a lot of context / for not defining terms; hopefully clicking the links will provide decent context).
If you're interested in more on this topic, I'd highlight Holden Karnofsky's recent blog series and Tom Davidson's recent Open Phil report as good places to start.
In case it’s useful for other people, here’s the main stuff I (at least partially) read / listened to:
Thanks for this. I think it's really brilliant, and I really appreciate how clearly the details are laid out in the blog and report. It's really cool to be able to see external reviewer comments too.
I found it kind of surprising that there isn't any mention of civilizational collapse etc when thinking about growth outcomes for the 21st century (e.g. in Appendix G, but also apparently in your bottom line probabilities in e.g. Section 4.6 "Conclusion" -- or maybe it's there and I missed it / it's not explicit).
I guess your probabilities for various growth outcomes in Appendix G are conditional on ~no civilizational collapse (from any cause) and ~no AI-triggered fundamental reshaping of society that unexpectedly prevents growth? Or should I read them more as "conditional on ~no civilizational collapse etc other than due to AI", with the probability mass for AI-triggered collapse etc being incorporated into your "AI robots don't have a tendency to drive explosive growth because none of our theories are well-suited for this situation" and/or "an unanticipated bottleneck prevents explosive growth"?
Thanks, that's interesting. I've heard of it but I haven't looked into it.
Causal vs evidential decision theory
I wrote this last Autumn as a private “blog post” shared only with a few colleagues. I’m posting it publicly now (after mild editing) because I have some vague idea that it can be good to make things like this public. Decision theories are a pretty well-worn topic in EA circles and I'm definitely not adding new insights here. These are just some fairly naive thoughts-out-loud about how CDT and EDT handle various scenarios. If you've already thought a lot about decision theory you probably won't learn anything from this.
The last two weeks of the Decision Theory seminars I’ve been attending have focussed on contrasting causal decision theory (CDT) and evidential decision theory (EDT). This seems to be a pretty active area of discussion in the literature - one of the papers we looked at was published this year, and another is yet to be published.
In terms of the history of the field, it seems that Newcomb’s problem prompted a move towards CDT (e.g. in Lewis 1981). I find that pretty surprising because to me Newcomb’s problem provides quite a bit of motivation for EDT, and without weird scenarios like Newcomb’s problem I think I might have taken something like CDT to be the default, obviously correct theory. But it seems like you didn’t need to worry about having a “causal aspect” to decision theories until Newcomb’s problem and other similar problems brought out a divergence in recommendations from (what became known as) CDT and EDT.
I guess this is a very well-worn area (especially in places like LessWrong), but I can’t resist giving my fairly naive take even though I’m sure I’m just repeating what others have said. When I first heard about things like Newcomb’s problem a few years ago I think I was a pretty ardent CDTer, whereas nowadays I am much more sympathetic to EDT.
In Newcomb’s problem, it seems pretty clear to me that one-boxing is the best option, because I’d rather have $1,000,000 than $1000. Seems like a win for EDT.
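As a rough illustration of why the two theories come apart here, the sketch below computes expected payoffs both ways. The 99% predictor accuracy and the function names are assumptions for illustration, not part of the problem as stated above; the $1,000,000 and $1000 figures are the ones from the scenario.

```python
BIG = 1_000_000   # contents of the opaque box if the predictor foresaw one-boxing
SMALL = 1_000     # contents of the transparent box

def edt_value(action, accuracy=0.99):
    """Expected payoff when the action is treated as evidence about the prediction.

    accuracy is a hypothetical probability that the predictor foresaw the action.
    """
    if action == "one-box":
        # The predictor probably foresaw one-boxing, so the opaque box is probably full.
        return accuracy * BIG
    # The predictor probably foresaw two-boxing, so the opaque box is probably empty.
    return (1 - accuracy) * BIG + SMALL

def cdt_value(action, p_full):
    """Expected payoff when the probability p_full that the opaque box is
    already full is held fixed, whatever you now choose."""
    bonus = SMALL if action == "two-box" else 0
    return p_full * BIG + bonus

# Evidential reasoning favours one-boxing by a wide margin.
print(edt_value("one-box") > edt_value("two-box"))

# Causal reasoning favours two-boxing by about $1000 for any fixed p_full.
for p_full in (0.0, 0.5, 1.0):
    print(p_full, cdt_value("two-box", p_full) - cdt_value("one-box", p_full))
```

On these assumptions the evidential calculation prefers one-boxing whenever the predictor is even slightly better than chance (accuracy above about 0.5005), while the causal calculation prefers two-boxing regardless of `p_full`.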
Dicing With Death is designed to cause problems for CDT, and in my opinion it does this very effectively. In Dicing With Death, you have to choose between going to Aleppo and going to Damascus, and you know that whichever you choose, Death will have predicted your choice and be waiting for you (a very bad outcome for you). Luckily, a merchant offers you a magical coin which you can toss to decide where to go, in which case Death won’t be able to predict where you go, giving you a 50% chance of avoiding death. The merchant will charge a small fee for this. However, CDT gets into some strange contortions and as a result recommends against paying for the magical coin, even though the outcome if you pay for the magical coin seems clearly better. EDT recommends paying for the coin: another win for EDT.
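Here is one way the divergence might be sketched numerically. This is my own reconstruction under stated assumptions, not necessarily the argument as it appears in the paper: the utility of surviving (`LIFE`), the fee (`FEE`), and the modelling of CDT as holding fixed a credence `q_aleppo` that Death is already waiting in Aleppo are all hypothetical choices.

```python
LIFE = 1_000.0   # hypothetical utility of surviving
FEE = 1.0        # hypothetical price of the magical coin

def edt_values():
    """Condition on the act: Death predicts a direct choice perfectly,
    but cannot predict the outcome of the coin toss."""
    return {
        "Aleppo": 0.0,
        "Damascus": 0.0,
        "coin": 0.5 * LIFE - FEE,
    }

def cdt_values(q_aleppo):
    """Hold fixed a credence q_aleppo that Death is already in Aleppo."""
    return {
        "Aleppo": (1 - q_aleppo) * LIFE,   # survive only if Death is in Damascus
        "Damascus": q_aleppo * LIFE,       # survive only if Death is in Aleppo
        "coin": 0.5 * LIFE - FEE,          # 50% survival either way, minus the fee
    }

def best(values):
    """Return the option with the highest value (first one wins ties)."""
    return max(values, key=values.get)

print(best(edt_values()))
for q in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(q, best(cdt_values(q)))
```

In this sketch the evidential calculation picks the coin, while the causal calculation never does: for any fixed `q_aleppo`, at least one city offers a survival chance of 50% or more without the fee.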
To me, The Smoking Lesion is a somewhat problematic scenario for EDT. Still, I feel like it’s possible for EDT to do fine here if you think carefully enough.
You could make the following simple model for what happens in The Smoking Lesion: in year 1, no-one knows why some people get cancer and some don’t. In year 2, it’s discovered that everyone who smokes develops cancer, and furthermore there’s a common cause (a lesion) that causes both of these things. Everyone smokes iff they have the lesion, and everyone gets cancer iff they have the lesion. In year 3, following the publication of these results, some people who have the lesion try not to smoke. Either (i) none of them can avoid smoking because the power of the lesion is too strong; or (ii) some of them do avoid smoking, but (since they still have the lesion) they still develop cancer. In case (i), the findings from year 2 remain valid even after everyone knows about them. In case (ii), the findings from year 2 are no longer valid: they just tell you about how the world would have been if the correlation between smoking and cancer wasn’t known.
The cases where you use the knowledge about the year 2 finding to decide not to smoke are exactly the cases where the year 2 finding doesn’t apply. So there’s no point in using the knowledge about the year 2 finding to not smoke: either your not smoking (through extreme self-control etc) is pointless because you still have the lesion and this is a case where the year 2 finding doesn’t apply, or it’s pointless because you don’t have the lesion.
So it seems to me like the right answer is to smoke if you want to, and I think EDT can recommend this by incorporating the fact that if you choose not to smoke purely because of the year 2 finding, this doesn’t give you any evidence about whether you have the lesion (though this is pretty vague and I wouldn’t be that surprised if making it more precise made me realise it doesn’t work).
In general it seems like these issues arise from treating the agent’s decision making process as being removed from the physical world - a very useful abstraction which causes issues in weird edge cases like the ones considered above.
Changing your working to fit the answer
I wrote this last Autumn as a private “blog post” shared only with a few colleagues. I’m posting it publicly now (after mild editing) because I have some vague idea that it can be good to make things like this public. It is quite rambling and doesn't really have a clear point (but I think it's at least an interesting topic).
Say you want to come up with a model for AI timelines, i.e. the probability of transformative AI being developed by year X for various values of X. You put in your assumptions (beliefs about the world), come up with a framework for combining them, and get an answer out. But then you’re not happy with the answer - your framework must have been flawed, or maybe on reflection one of your assumptions needs a bit of revision. So you fiddle with one or two things and get another answer - now it looks much better, close enough to your prior belief that it seems plausible, but not so close that it seems suspicious.
Is this kind of procedure valid? Here’s one case where the answer seems to be yes: if your conclusions are logically impossible, you know that either there’s a flaw in your framework or you need to revise your assumptions (or both).
A closely related case is where the conclusion is logically possible, but extremely unlikely. It seems like there’s a lot of pressure to revise something then too.
But in the right context revising your model in this way can look pretty dodgy. It seems like you’re “doing things the wrong way round” - what was the point of building the model if you were going to fiddle with the assumptions until you got the answer you expected anyway?
I think this is connected to a lot of related issues / concepts:
Presumably, you could put this question of whether and how much to modify your model into some kind of formal Bayesian framework where on learning a new argument you update all your beliefs based on your prior beliefs in the premises, conclusion, and validity of the argument. I’m not sure whether there’s a literature on this, or whether e.g. highly skilled forecasters actually think like this.
In general though, it seems (to me) that there’s something important about “following where the assumptions / model takes you”. Maybe, given all the ways we fall short of being perfectly rational, we should (and I think that in fact we do) put more emphasis on this than a perfectly rational Bayesian agent would. Avoiding having a very strong prior on the conclusion seems helpful here.
One (maybe?) low-effort thing that could be nice would be saying "these are my top 5" or "these are listed in order of how promising I think they are" or something (you may well have done that already and I missed it).
Thanks, I think this is a great topic and this seems like a useful list (although I do find reading through 19 different types of options without much structure a bit overwhelming!).
I'll just ~repost a private comment I made before.
Encouraging and facilitating aspiring/junior researchers and more experienced researchers to connect in similar ways
This feels like an especially promising area to me. I'd guess there are lots of cases where this would be very beneficial for the junior researcher and at least a bit beneficial for the experienced researcher. It just needs facilitation (or something else, e.g. a culture change where people try harder to make this happen themselves, some strong public encouragement to juniors to make this happen, ...). This isn't based on really strong evidence: it's mostly my own (limited) experience, plus the assumption that at least some experienced researchers are similar to me and that there are lots of excellent junior researcher candidates out there (again from first-hand impressions).
Improving the vetting of (potential) researchers, and/or better “sharing” that vetting
This also seems like a big deal and an area where maybe you could improve things significantly with a relatively small amount of effort. I don't have great context here though.
I wrote this last Autumn as a private “blog post” shared only with a few colleagues. I’m posting it publicly now (after mild editing) because I have some vague idea that it can be good to make things like this public.
In this post I’m going to discuss two papers regarding imprecise probability that I read this week for a Decision Theory seminar. The first paper seeks to show that imprecise probabilities don’t adequately constrain the actions of a rational agent. The second paper seeks to refute that claim.
Just a note on how seriously to take what I’ve written here: I think I’ve got the gist of what’s in these papers, but I feel I could spend a lot more time making sure I’ve understood them and thinking about which arguments I find persuasive. It’s very possible I’ve misunderstood or misrepresented the points the papers were trying to make, and I can easily see myself changing my mind about things if I thought and read more.
Also, a note on terminology: it seems like “sharpness/unsharpness” and “precision/imprecision” are used interchangeably in these papers, as are “probability” and “credence”. There might well be subtle distinctions that I’m missing, but I’ll try to consistently use “imprecise probabilities” here.
I imagine there are (at least) several different ways of formulating imprecise probabilities. One way is the following: your belief state is represented by a set of probability functions, and your degree of belief in a particular proposition is represented by the set of values assigned to it by the set of probability functions. You also then have an imprecise expectation: each of your probability functions has an associated expected utility. Sometimes, all of your probability functions will agree on the action that has the highest expected value. In that case, you are rationally required to take that action. But if there’s no clear winner, that means there’s more than one permissible action you could take.
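The formulation above can be sketched in a few lines. This is a minimal illustration under my own assumptions: the belief state is a finite list of probabilities for a single proposition, each action is a pair of payoffs, and "permissible" is read as "best according to at least one probability function in the set" (the text above does not pin this down). The rain bet and all names are hypothetical.

```python
def expected_utility(p, payoffs):
    """payoffs = (utility if the proposition is true, utility if it is false)."""
    if_true, if_false = payoffs
    return p * if_true + (1 - p) * if_false

def best_actions(p, actions):
    """Actions with the highest expected utility under one probability p."""
    values = {name: expected_utility(p, payoffs) for name, payoffs in actions.items()}
    top = max(values.values())
    return {name for name, value in values.items() if value == top}

def permissible_actions(credal_set, actions):
    """Union of the best actions across every probability in the set.

    If the result has a single element, all the probability functions agree
    and that action is required; otherwise there is more than one permissible action.
    """
    permitted = set()
    for p in credal_set:
        permitted |= best_actions(p, actions)
    return permitted

# Hypothetical choice: bet on rain (win 10 if it rains, lose 10 if not) or abstain.
actions = {"bet on rain": (10, -10), "abstain": (0, 0)}

print(permissible_actions([0.6, 0.7, 0.8], actions))  # every function favours the bet
print(permissible_actions([0.3, 0.7], actions))       # the functions disagree
```

With the first belief state every probability function ranks the bet highest, so it is required; with the second, the functions disagree and both actions come out permissible.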
The first paper, Subjective Probabilities should be Sharp, was written in 2010 by Elga. The central claim is that there’s no plausible account of how imprecise probabilities constrain which choices are reasonable for a perfectly rational agent.
The argument centers around a particular betting scenario: someone tells you “I’m going to offer you bet A and then bet B, regarding a hypothesis H”:
Bet A: win $15 if H, else lose $10
Bet B: lose $10 if H, else win $15
You’re free to choose whether to take bet B independently of whether you choose bet A.
Depending on what you believe about H, it could well be that you prefer just one of the bets to both bets. But it seems like you really shouldn’t reject both bets. Taking both bets guarantees you’ll win exactly $5, which is strictly better than the $0 you’ll win if you reject both bets.
But under imprecise probabilities, it’s rationally permissible to have some range of probabilities for H, which implies that it’s permissible to reject both bet A and bet B. So imprecise probabilities permit something which seems like it ought to be impermissible.
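To make the arithmetic concrete, here is a small sketch using the payoffs of bet A and bet B from above. The particular set of probabilities for H (`credal_set`) and the three-way "required / forbidden / optional" labelling are assumptions for illustration, not something taken from Elga's paper.

```python
# Payoffs in dollars, keyed by whether H turns out true.
BET_A = {True: 15, False: -10}
BET_B = {True: -10, False: 15}

def expected_value(bet, p_h):
    """Expected dollar value of a bet given probability p_h for H."""
    return p_h * bet[True] + (1 - p_h) * bet[False]

def verdict(bet, credal_set):
    """'required' if the bet has positive expected value under every probability,
    'forbidden' if negative under every probability, otherwise 'optional'."""
    values = [expected_value(bet, p) for p in credal_set]
    if all(v > 0 for v in values):
        return "required"
    if all(v < 0 for v in values):
        return "forbidden"
    return "optional"

# Taking both bets pays the same amount whether or not H is true.
both = {h: BET_A[h] + BET_B[h] for h in (True, False)}
print(both)

# Hypothetical imprecise belief state about H.
credal_set = [0.1, 0.5, 0.8]
print(verdict(BET_A, credal_set), verdict(BET_B, credal_set))

# With a single sharp probability, at least one of the two bets is always required.
for tenth in range(11):
    p = tenth / 10
    print(p, verdict(BET_A, [p]), verdict(BET_B, [p]))
```

Bet A has positive expected value when the probability of H is above 0.4 and bet B when it is below 0.6, so any single sharp probability requires at least one of the bets. A set of probabilities that spans both thresholds leaves each bet optional, which is how rejecting both becomes permissible.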
Elga considers various rules that might be added to the initial imprecise probabilities-based decision theory, and argues that none of them are very appealing. I guess this isn’t as good as proving that there are no good rules or other modifications, but I found it fairly compelling on the face of it.
The rules that seemed to me most likely to work were Plan and Sequence. Both rules more or less entail that you should accept bet B if you already accepted bet A, in which case rejecting both bets is impermissible and it looks like the theory is saved.
Elga tries to show that these don’t work by inviting us to imagine the case where a particular agent called Sally faces the decision problem. Sally has imprecise probabilities, maximises expected utility and has a utility function that is linear in dollars.
Elga argues that in this scenario it just doesn’t make sense for Sally to accept bet B only if she already accepted bet A - the decision to accept bet B shouldn’t depend on anything that came before. It might do if Sally had some risk averse decision theory, or had a utility function that was concave in dollars - but by assumption, she doesn’t. So Plan and Sequence, which had seemed like the best candidates for rescuing imprecise probabilities, aren’t plausible rules for a rational agent like Sally.
The 2014 paper by Bradley and Steele, Should Subjective Probabilities be Sharp? is, as the name suggests, a response to Elga’s paper. The core of their argument is that the assumptions for rationality implied by Elga’s argument are too strong and that it’s perfectly possible to have rational choice with imprecise probabilities provided that you don’t make these too-strong assumptions.
I’ll highlight two objections and give my view.
In summary, in Subjective Probabilities should be Sharp, Elga illustrates how imprecise probabilities appear to permit a risk-neutral agent with linear utility to make irrational choices. In addition, Elga argues that there aren’t any ways to rescue things while keeping imprecise probabilities. In Should Subjective Probabilities be Sharp?, Bradley and Steele argue that Elga makes some implausibly strong assumptions about what it takes to be rational. I didn't find these arguments very convincing, although I might well have just failed to appreciate the points they were trying to make.
I think it basically comes down to this: for an agent with decision theory features like Sally’s, i.e. no risk aversion and linear utility, the only way to avoid passing up opportunities like making a risk-free $5 by taking bet A and bet B is if you’re always willing to take one side of any particular bet. The problem with imprecise probabilities is that they permit you to refrain from taking either side, which implies that you’re permitted to decline the risk-free $5.
The fan of imprecise probabilities can wriggle out of this by saying that you should be allowed to do things like taking bet B only if you just took bet A - but I agree with Elga that this just doesn’t make sense for an agent like Sally. I think the reason this might look overly demanding on the face of it is that we’re not like Sally - we’re risk averse and have concave utility. But agents who are risk averse or have concave utility are allowed both to sometimes decline bets and to take risk-free sequences of bets, even according to Elga’s rationality requirements, so I don’t think this intuition pushes against Elga’s rationality requirements.
It feels kind of useful to have read these papers, because
Thanks for these comments and for the chat earlier!
*I probably missed some nuance here; please feel free to clarify if so.
Thanks, this was helpful as an example of one way I might improve this process.