I'm currently working as a Senior Research Scholar at the Future of Humanity Institute.
Also, I noticed that Jess Whittlestone wrote some (probably much better) notes on this a few years ago.
Things I took away for myself
Here are some notes I made while reading a transcript of a seminar called You and Your Research by Richard Hamming. (I'd previously read this article with the same name, but I feel like I got something out of reading the seminar transcript too, even though there's a lot of overlap.)
I recently spent some time trying to work out what I think about AI timelines. I definitely don’t have any particular insight here; I just thought it was a useful exercise for me to go through for various reasons (and I did find it very useful!).
As it came out, I "estimated" a ~5% chance of TAI by 2030 and a ~20% chance of TAI by 2050 (the probabilities for AGI are slightly higher). As you’d expect me to say, these numbers are highly non-robust.
When I showed people the plots below, a couple of them commented that they were surprised that my AGI probabilities are higher than my TAI ones, and I now think I didn’t think enough about non-AGI routes to TAI when I did this. I’d now probably increase the TAI probabilities a bit and lower the AGI ones a bit compared to what I’m showing here (by “a bit” I mean maybe a few percentage points).
I generated these numbers by forming an inside view, an outside view, and making some heuristic adjustments. The inside and outside views are ~weighted averages of various forecasts. My timelines are especially sensitive to how I chose and weighted forecasts for my outside view.
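To make the "weighted average of forecasts" step concrete, here is a minimal sketch of that kind of calculation. Every forecast value, every weight, and the 50/50 blend of the two views are made-up placeholders rather than the numbers actually used, and the heuristic adjustment step is left out.

```python
def weighted_average(forecasts, weights):
    """Weighted average of probability forecasts; weights need not sum to 1."""
    if len(forecasts) != len(weights):
        raise ValueError("need exactly one weight per forecast")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(f * w for f, w in zip(forecasts, weights)) / total


# Hypothetical P(TAI by 2050) forecasts and weights: placeholders, not real data.
inside_view = weighted_average([0.10, 0.30], [1, 1])
outside_forecasts = [0.05, 0.20, 0.40]
outside_view = weighted_average(outside_forecasts, [2, 1, 1])

# Same outside-view forecasts, weighted differently (also hypothetical).
outside_view_alt = weighted_average(outside_forecasts, [1, 1, 2])

# Hypothetical 50/50 blend of inside and outside views.
combined = weighted_average([inside_view, outside_view], [1, 1])
combined_alt = weighted_average([inside_view, outside_view_alt], [1, 1])
```

With these placeholder inputs, moving weight from the lowest to the highest outside-view forecast shifts the outside view from 17.5% to 26.25%, which is the kind of sensitivity to forecast choice and weighting mentioned above.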
Here are my timelines in graphical form:
And here they are again alongside some other timelines people have made public:
If you want more detail, there’s a lot more in this google doc. I’ll probably write another shortform post with some more thoughts / reflections on the process later.
I wrote this last Summer as a private “blog post” just for me. I’m posting it publicly now (after mild editing) because I have some vague idea that it can be good to make things like this public. These rambling thoughts come from my very naive point of view (as it was in the Summer of 2020; not to suggest my present day point of view is much less naive). In particular if you’ve already read lots of moral philosophy you probably won’t learn anything from reading this.
Generally, reading various moral philosophy writings has probably made me (even) more comfortable trusting my own intuitions / reasoning regarding what “morality” is and what the “correct moral theory” is.
I think that, when you start engaging with moral philosophy, there’s a bit of a feeling that when you’re trying to reason about things like what’s right and wrong, which moral theory is superior to the other, etc, there are some concrete rules you need to follow, and (relatedly) certain words or phrases have a solid, technical definition that everyone with sufficient knowledge knows and agrees on. The “certain words or phrases” I have in mind here are things like “morally right”, “blameworthy”, “ought”, “value”, “acted wrongly”, etc.
To me right now, the situation seems a bit more like the following: moral philosophers (including knowledgeable amateurs, etc) have in mind definitions for certain words, but these definitions may be more or less precise, might change over time, and differ from person to person. And in making a “moral philosophy” argument (say, writing down an argument for a certain moral theory), the philosopher can use flexibility of interpretation as a tool to make their argument appear more forceful than it really is. Or, the philosopher’s argument might imply that certain things are self-evidently true, and the reader might be (maybe unconsciously) fooled into thinking that this is the case, when in fact it isn’t.
It seems to me now that genuinely self-evident truths are in very short supply in moral philosophy. And, now that I think this is the case, I feel like I have much more licence to make up my own mind about things. That feels quite liberating.
But it does also feel potentially dangerous. Of course, I don’t think it’s dangerous that *I* have freedom to decide what “doing good” means to me. But I might find it dangerous that others have that freedom. People can consider committing genocide to be “doing what is right” and it would be nice to have a stronger argument against this than “this conflicts with my personal definition of what good is”. And, of course, others might well think it’s dangerous that I have the freedom to decide what doing good means.
Now that we’re in this free-for-all, even defining morality seems problematic.
I suppose I can make some observations about myself, like
I guess these things are all in the region of “wanting to improve the lives of others”. This sounds a lot like wanting to do what is morally good / morally praiseworthy, and seems at least closely related to morality.
In some ways, whether I label some of my goals and beliefs as being to do with “morality” doesn’t matter - either way, it seems clear that the academic field of moral philosophy is pretty relevant. And presumably when people talk about morality outside of an academic context, they’re at least sometimes talking about roughly the thing I’m thinking of.
In the below I give a very rough summary of Will MacAskill’s article Are We Living At The Hinge Of History? and give some very preliminary thoughts on the article and some of the questions it raises.
I definitely don’t think that what I’m writing here is particularly original or insightful: I’ve thought about this for no more than a few days, any points I make are probably repeating points other people have already made somewhere, and/or are misguided, etc. This seems like an incredibly deep topic which I feel like I’ve barely scratched the surface of. Also, this is not a focussed piece of writing trying to make a particular point; it’s just a collection of thoughts on a certain topic. (If you want to just see what I think, skip down to "Some thoughts on the issues discussed in the article".)
(note that the article is an updated version of the original EA Forum post Are we living at the most influential time in history?)
The Hinge of History claim (HH): we are among the most influential people ever (past or future). Influentialness is, roughly, how much good a particular person at a particular time can do through direct expenditure of resources (rather than investment)
Two worldviews prominent in longtermist EA imply that HH is true:
The base rates argument
Claim: our prior should be that we’re as likely as anyone else, past or future, to be the most influential person ever (Bostrom’s Self-Sampling Assumption (SSA)). Under this prior, it’s astronomically unlikely that any particular person is the most influential person ever.
Then the question is how much should we update from this prior
Counterargument 1: we only need to be at an enormously influential time, not the most influential, and the implications are ~the same either way
Counterargument 2: the action-relevant thing is the influentialness of now compared to any time we can pass resources on to
The inductive argument
Argument 1: we’re living on a single planet, implying greater influentialness
Argument 2: we’re now in a period of unusually fast economic and tech progress, implying greater influentialness. We can’t maintain the present-day growth rate indefinitely.
MacAskill seems sympathetic to the argument, but says it implies not that today is the most important time, but that the most important time might be some time in the next few thousand years.
A few other arguments for HH are briefly touched on in a footnote: that existential risk / value lock-in lowers the number of future people in the reference class for the influentialness prior; that we might choose other priors that are more favourable for HH; and that earlier people can causally affect more future people.
It kind of feels like there are two somewhat independent things that are most interesting from the article:
Avoiding rejecting the Time of Perils and Bostrom-Yudkowsky views
I think there are a few ways we can go to avoid rejecting the Time of Perils and Bostrom-Yudkowsky views
A nearby alternative is to modify the Time of Perils and Bostrom-Yudkowsky views a bit so that they don’t imply we’re among the most influential people ever. E.g. for the Bostrom-Yudkowsky view we could make the value lock-in a bit “softer” by saying that for some reason, not necessarily known/stated, the lock-in would probably end after some moderate (on cosmological scales) length of time. I’d guess that many people might find a modified view more plausible even independently of the influentialness implications.
I’m not really sure what I think here, but I feel pretty sympathetic to the idea that we should be uncertain about the prior and that this maybe lends itself to having not too strong a prior against the Time of Perils and Bostrom-Yudkowsky views.
On the question of whether to expend resources now or later
The arguments MacAskill discusses suggest that the relevant time frame is the next few thousand years (because the next few thousand years seem, in expectation, to have especially high influentialness, and because it might be effectively impossible to pass our resources further into the future).
It seems like the pivotal importance of priors on influentialness (or similar) then evaporates: it no longer seems that implausible on the SSA prior that now is a good time to expend resources rather than save. E.g. say there’ll be a 20 year period in the next 1000 years where we want to expend philanthropic resources rather than save them to pass on to future generations. Then a reasonable prior might be that we have a 20/1000 = 1 in 50 chance of being in that period. That’s a useful reference point and is enough to make us skeptical about arguments that we are in such a period, but it doesn’t seem overwhelming. In fact, we’d probably want to spend at least some resources now even purely based on this prior.
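As a rough illustration of the arithmetic in the example above, the sketch below computes the 1-in-50 prior and the likelihood ratio that evidence would need to have to bring that prior up to even odds. The uniform-prior assumption is the one from the example; the function names are hypothetical and only for illustration.

```python
from fractions import Fraction


def prior_in_window(window_years, horizon_years):
    """Uniform prior that the present falls in a window of the given length
    located somewhere within the horizon."""
    return Fraction(window_years, horizon_years)


def likelihood_ratio_for_even_odds(prior):
    """Likelihood ratio that evidence needs in order to move `prior` to 50%.

    Posterior odds = prior odds * likelihood ratio, so reaching odds of 1:1
    requires a ratio of (1 - prior) / prior.
    """
    return (1 - prior) / prior


prior = prior_in_window(20, 1000)
needed_ratio = likelihood_ratio_for_even_odds(prior)

print(prior)         # 1/50
print(needed_ratio)  # 49
```

On these numbers, evidence about 49 times more likely if we are in the period than if we are not would be enough to reach even odds. That is a real hurdle, but nothing like the one set by an astronomically small prior.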
In particular, it seems like some kind of detailed analysis is needed, maybe along the lines of Trammell’s model or at least using that model as a starting point. I think many of the arguments in MacAskill’s article should be part of that detailed analysis, but, to stress the point, they don’t seem decisive to me.
This comment by Carl Shulman on the related EA Forum post and its replies has some stuff on this.
In the article, the Inductive Argument is supported by the idea of moral progress: MacAskill cites the apparent progress in our moral values over the past 400 years as evidence for the idea that we should expect future generations to have better moral values than we do. Obviously, whether we should expect moral progress in the future is a really complex question, but I’m at least sympathetic to the idea that there isn’t really moral progress, just moral fashions (so societies closer in time to ours seem to have better moral values just because they tend to think more like us).
Of course, if we don’t expect moral progress, maybe it’s not so surprising that we have very high influentialness: if past and future actors don’t share our values, it seems very plausible on the face of it that we’re better off expending our resources now than passing them off to future generations in the hope they’ll carry out our wishes. So maybe MacAskill’s argument about influentialness should update us away from the idea of moral progress?
But if we’re steadfast in our belief in moral progress, maybe it’s not so surprising that we have high influentialness because we find ourselves in a world where we are among the very few with a longtermist worldview, which won’t be the case in the future as longtermism becomes a more popular view. (I think Carl Shulman might say something like this in the comments to the original EA Forum post)
I like the way the article ends, providing some motivation for the Inductive Argument in a way I find appealing on a gut level:
Just as our powers to grow crops, to transmit information, to discover the laws of nature, and to explore the cosmos have all increased over time, so will our power to make the world better — our influentialness. And given how much there is still to understand, we should believe, and hope, that our descendents look back at us as we look back at those in the medieval era, marvelling at how we could have got it all so wrong.
I wrote this last Summer as a private “blog post” just for me. I’m posting it publicly now (after mild editing) because I have some vague idea that it can be good to make things like this public. These thoughts come from my very naive point of view (as it was in the Summer of 2020; not to suggest my present day point of view is much less naive). In particular if you’ve already read lots of moral philosophy you probably won’t learn anything from reading this. Also, I hope my summaries of other people’s arguments aren’t too inaccurate.
Recently, I’ve been trying to think seriously about what it means to do good. A key part of Effective Altruism is asking ourselves how we can do the most good. Often, considering this question seems to be mostly an empirical task: how many lives will be saved through intervention A, and how many through intervention B? Aside from the empirical questions, though, there are also theoretical ones. One key consideration is what we mean by doing good.
There is a branch of philosophy called moral philosophy which is (partly) concerned with answering this question.
It’s important to me that I don’t get too drawn into the particular framings that have evolved within the academic discipline of moral philosophy, which are, presumably, partly due to cultural or historical forces, etc. This is because I really want to try to come up with my own view, and I think that (for me) the best process for this involves not taking other people’s views or existing works too seriously, especially while I try to think about these things seriously for the first time.
Still, it seems useful to get familiar with the major insights and general way of thinking within moral philosophy, because
I’ve read a couple of Stanford Encyclopedia of Philosophy articles, and a series of posts by Lukas Gloor arguing for moral anti-realism.
I found the Stanford Encyclopedia of Philosophy articles fairly tough going but still kind of useful. I thought the Gloor posts were great.
The Gloor posts have kind of convinced me to take the moral anti-realist side, which, roughly, denies the existence of definitive moral truths.
While I suppose I might consider my “inside view” to be moral anti-realist at the moment, I can easily see this changing in the future. For example, I imagine that if I read a well-argued case for moral realism, I might well change my mind.
In fact, prior to reading Gloor’s posts, I probably considered myself to be a moral realist. I think I’d heard arguments, maybe from Will MacAskill, along the lines that i) if moral anti-realism is true, then nothing matters, whereas if realism is true, you should do what the true theory requires you to do, and ii) there’s some chance that realism is true, therefore iii) you should do what the true theory requires you to do.
Gloor discusses an argument like this in one of his posts. He calls belief in moral realism founded on this sort of argument “metaethical fanaticism” (if I’ve understood him correctly).
I’m not sure that I completely understood everything in Gloor’s posts. But the “fanaticism” label does feel appropriate to me. It feels like there’s a close analogy with the kinds of fanaticism that utilitarianism is susceptible to, for example. An example of that might be a Pascal’s wager type argument - if there’s a finite probability that I’ll get infinite utility derived from an eternal life in a Christian heaven, I should do what I can to maximise that probability.
It feels like something has gone wrong here (although admittedly it’s not clear what), and this Pascal’s wager argument doesn’t feel at all like a strong argument for acting as if there’s a Christian heaven. Likewise, the “moral realist wager” doesn’t feel like a strong argument for acting as if moral realism is true, in my current view.
Gloor also argues that we don’t lose anything worth having by being moral anti-realists, at least if you’re his brand of moral anti-realist. I think he calls the view he favours “pluralistic moral reductionism”.
On his view, you take any moral view (or maybe combination of views) you like. These can (and maybe for some people, “should”) be grounded in our moral intuitions, and maybe use notions of simplicity of structure etc, just as a moral realist might ground their guess(?) at the true moral theory in similar principles. Your moral view is then your own “personal philosophy”, which you choose to live by.
One unfortunate consequence of this view is that you don’t really have any grounds to argue with someone else who happens to have a different view. Their view is only “wrong” in the sense that it doesn’t agree with yours; there’s no objective truth here.
From this perspective, it would arguably be nicer if everyone believed that there was a true moral view that we should strive to follow (even if we don’t know what it is). Especially if you also believe that we could make progress towards that true moral view.
I’m not sure how big this effect is, but it feels like more than nothing. So maybe I don’t quite agree that we don’t lose anything worth having by being moral anti-realists.
In any case, the fact that we might wish that moral realism is true doesn’t (presumably) have any bearing on whether or not it is true.
I already mentioned that reading Gloor’s posts has caused me to favour moral anti-realism. Another effect, I think, is that I am more agnostic about the correct moral theory. Some form of utilitarianism, or at least consequentialism, seems far more plausible to me as the moral realist “one true theory” than a deontological theory or virtue ethics theory. Whereas if moral anti-realism is correct, I might be more open to non-consequentialist theories. (I’m not sure whether this new belief would stand up to a decent period of reflection, though - maybe I’d be just as much of a convinced moral anti-realist consequentialist after some reflection).
Thanks for doing this and for doing the 80k podcast, I enjoyed the episode.
I haven't thought about this angle very much, but it seems like a good angle which I didn't talk about much in the post, so it's great to see this comment.
I guess the question is whether you can take the model, including the optimal allocation assumption, as corresponding to the world as it is plus some kind of (imagined) quasi-effective global coordination in a way that seems realistic. It seems like you're pretty skeptical that this is possible (my own inside view is much less certain about this but I haven't thought about it that much).
One thing that comes to mind is that you could incorporate spending on dangerous tech by individual states for self-defence into the model's hazard rate equation through epsilon - it seems like the risk from this should probably increase with consumption (easier to do it if you're rich), so it doesn't seem that unreasonable. Not sure whether this is getting to the core of the issue you've raised, though.
I suppose you can also think about this through the "beta and epsilon aren't really fixed" lens that I put more emphasis on in the post. It seems like greater / less coordination (generally) implies more / less favourable epsilon and beta, within the model.
Thanks for this, it's pretty interesting to get your perspective as someone who's been (I presume) heavily engaged in the community for some time. I thought your other post on the All-Party Parliamentary Group for Future Generations was awesome, by the way.
You asked for comments including "small" thoughts so here are some from me, for what they're worth. These are my current views which I can easily see changing if I were to think about this more etc.
I think I basically agree that there doesn't seem to have been much progress in cause prioritisation in say the last five years, compared to what you might have hoped for.
(mainly written to clarify my own thoughts:) It seems like you can do cause prioritisation work either by comparing different causes, or by investigating a particular cause (especially a cause that's relatively unknown or poorly investigated), or by doing more "foundational" things like asking "what is moral value anyway?", "how should we compare options under uncertainty", etc.
My impression is that the Effective Altruism community has invested a significant amount of resources into cause prioritisation research, and that the relative lack of progress is because it's hard
The above (which is probably far from comprehensive) seems like a decent fraction of the resources of the "longtermist" part of the community (the part I'm familiar with). I suppose I lean towards wanting a larger fraction of resources allocated to cause prioritisation, but I don't think it's that obvious either way. Anyway, regardless of whether the right fraction of resources has been spent on this, I think it's just very hard and that this explains a lot of what you're describing.
Maybe one reason there's not much work comparing causes in particular is that there's so much uncertainty, which makes it very difficult to do well enough that the output is valuable. In particular
Edit: reading the above you could probably get the impression that I think you're wrong to "raise the alarm" about the need for more / different cause prioritisation, but I don't think that at all. I think I'm pretty sympathetic to most of what you wrote.