[EDIT: Thanks for the questions everyone! Just noting that I'm mostly done answering questions, and there were a few that came in Tuesday night or later that I probably won't get to.]
Hi everyone! I’m Ajeya, and I’ll be doing an Ask Me Anything here. I’ll plan to start answering questions Monday Feb 1 at 10 AM Pacific. I will be blocking off much of Monday and Tuesday for question-answering, and may continue to answer a few more questions through the week if there are ones left, though I might not get to everything.
About me: I’m a Senior Research Analyst at Open Philanthropy, where I focus on cause prioritization and AI. 80,000 Hours released a podcast episode with me last week discussing some of my work, and last September I put out a draft report on AI timelines which is discussed in the podcast. Currently, I’m trying to think about AI threat models and how much x-risk reduction we could expect the “last long-termist dollar” to buy. I joined Open Phil in the summer of 2016, and before that I was a student at UC Berkeley, where I studied computer science, co-ran the Effective Altruists of Berkeley student group, and taught a student-run course on EA.
I’m most excited about answering questions related to AI timelines, AI risk more broadly, and cause prioritization, but feel free to ask me anything!
Hi Ajeya! I"m a huge fan of your timelines report, it's by far the best thing out there on the topic as far as I know. Whenever people ask me to explain my timelines, I say "It's like Ajeya's, except..."
My question is, how important do you think it is for someone like me to do timelines research, compared to other kinds of research (e.g. takeoff speeds, alignment, acausal trade...)
I sometimes think that even if I managed to convince everyone to shift from median 2050 to median 2032 (an obviously unlikely scenario!), it still wouldn't matter much because people's decisions about what to work on are mostly driven by considerations of tractability, neglectedness, personal fit, importance, etc. and even that timelines difference would be a relatively minor consideration. On the other hand, intuitively it does feel like the difference between 2050 and 2032 is a big deal and that people who believe one when the other is true will probably make big strategic mistakes.
Bonus question: Murphyjitsu: Conditional on TAI being built in 2025, what happened? (i.e. how was it built, what parts of your model were wrong, what do the next 5 years look like, what do the 5 years after 2025 look like?)
Thanks so much, that's great to hear! I'll answer your first question in this comment and leave a separate reply for your Murphyjitsu question.
First of all, I definitely agree that the difference between 2050 and 2032 is a big deal and worth getting to the bottom of; it would make a difference to Open Phil's prioritization (and internally we're trying to do projects that could convince us of timelines significantly shorter than in my report). You may be right that it could have a counterintuitively small impact on many individual people's career choices, for the reasons you say, but I think many others (especially early career people) would and should change their actions substantially.
I think there are roughly three types of reasons why Bob might disagree with Alice about a bottom line conclusion like TAI timelines, which correspond to three types of research or discourse contributions Bob could make in this space:
1. Disagreements can come from Bob knowing more facts than Alice about a key parameter, which can allow Bob to make "straightforward corrections" to Alice's proposed value for that parameter. E.g., "You didn't think much about hardware, but I did a solid research project...
An extension of Daniel's bonus question:
If I condition on your report being wrong in an important way (either in its numerical predictions, or via conceptual flaws) and think about how we might figure that out today, it seems like two salient possibilities are inside-view arguments and outside-view arguments.
The former are things like "this explicit assumption in your model is wrong". E.g. I count my concern about the infeasibility of building AGI using algorithms available in 2020 as an inside-view argument.
The latter are arguments that, based on the general difficulty of forecasting the future, there's probably some upcoming paradigm shift or crucial consideration which will have a big effect on your conclusions (even if nobody currently knows what it will be).
Are you more worried about the inside-view arguments of current ML researchers, or outside-view arguments?
I mostly agree with that, with the further caveat that I tend to think the low value reflects not that ML is useless but the inertia of a local optimum where the gains from automation are low because so little else is automated, and vice versa ("automation as colonization wave"). This is part of why, I think, we see broader macroeconomic trends like big-tech productivity pulling away: many organizations are just too incompetent to meaningfully restructure themselves or their activities to take full advantage. Software is surprisingly hard from a social and organizational point of view, and ML more so. A recent example is coronavirus/remote work: it turns out that remote work is in fact totally doable for all sorts of things people swore it couldn't work for - at least when you have a deadly global pandemic solving the coordination problem...
As for my specific tweet, I wasn't talking about making $$$ but just doing cool projects and research. People should be a little more imaginative about applications. Lots of people angst about how they can possibly compete with OA or GB or DM, but the reality is, as crowded as specific research topics like 'yet another efficient Transformer variant' m...
What type of funding opportunities related to AI Safety would OpenPhil want to see more of?
Anything else you can tell me about the funding situation with regards to AI Safety. I'm very confused about why not more people and projects get funded. Is because there is not enough money, or if there is some bottleneck related to evaluation and/or trust?
Looking at the mistakes you've made in the past, what fraction of your (importance-weighted) mistakes would you classify as stemming from insufficient knowledge of facts versus errors of cognition?
And what ratio would you assign for EAs/career EAs in general?
For context, a coworker and I recently had a discussion about, loosely speaking, whether it was more important for junior researchers within EA to build domain knowledge or general skills. Very very roughly, my coworker leaned more toward the former because he thought that EAs had an undersupply of domain knowledge relative to so-called "generalist skills." However, I leaned more toward the latter side of this debate because I weakly believe that more of my mistakes (and more of my most critical mistakes) were due to errors of cognition rather than insufficient knowledge of facts. (Obviously credit assignment is hard in both cases.)
I’d be keen to hear your thoughts about the (small) field of AI forecasting and its trajectory. Feel free to say whatever’s easiest or most interesting. Here are some optional prompts:
One issue I feel the EA community has badly neglected is the probability, given various (including modest) civilizational backslide scenarios, of us still being able to develop (and *actually* developing) the economies of scale needed to become an interstellar species.
To give a single example, a runaway Kessler effect could make putting anything in orbit basically impossible unless governments overcome the global tragedy of the commons and mount an extremely expensive mission to remove enough debris to regain effective orbital access - in a world where we've lost satellite technology and everything that depends on it.
EAs so far seem to have treated 'humanity doesn't go extinct' in scenarios like this as equivalent to 'humanity reaches its interstellar potential', which seems very dangerous to me - intuitively, it feels like there's at least a 1% chance that we wouldn't ever solve such a problem in practice, even if civilisation lasted for millennia afterwards. If so, then we should be treating it as (at least) 1/100th of an existential catastrophe - and a couple of orders of magnitude doesn't seem like that big a deal, especially if there are many more such scenarios than there are extinction-causing ones.
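To make that arithmetic explicit, here is a minimal toy comparison; every probability below is a placeholder I'm choosing for illustration, not an estimate from anyone:

```python
# Toy comparison of expected loss of long-term potential from a direct
# extinction risk vs. a non-extinction GCR (e.g. a Kessler-syndrome-style
# backslide) that carries some chance of permanently blocking an
# interstellar future. All numbers are illustrative placeholders.

p_extinction_scenario = 0.01             # chance the extinction-level catastrophe occurs
p_backslide_scenario = 0.05              # chance the backslide scenario occurs
p_never_recover_given_backslide = 0.01   # the "at least 1%" intuition above

loss_extinction = p_extinction_scenario * 1.0
loss_backslide = p_backslide_scenario * p_never_recover_given_backslide

print(f"Expected loss via extinction route: {loss_extinction:.4f}")
print(f"Expected loss via backslide route:  {loss_backslide:.4f}")
# Here the backslide route is ~20x smaller - within "a couple of orders of
# magnitude" - so it can dominate if there are many more such scenarios
# than extinction-level ones.
```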
Do you have any thoughts on how to model this question in a generalisable way, such that it could give a heuristic for non-literal-extinction GCRs? Or do you think one would need to research specific GCRs to answer it for each of them?
Also a big fan of your report. :)
Historically, what has caused the subjectively biggest-feeling updates to your timelines views? (e.g. arguments, things you learned while writing the report, events in the world).
Thanks! :)
The first time I really thought about TAI timelines was in 2016, when I read Holden's blog post. That got me to take the possibility of TAI soonish seriously for the first time (I hadn't been explicitly convinced of long timelines earlier or anything, I just hadn't thought about it).
Then I talked more with Holden and technical advisors over the next few years, and formed the impression that many technical advisors believed a relatively simple argument: if a brain-sized model could be transformative, then there's a relatively tight chain of reasoning implying it would take roughly X FLOP to train it, which would become affordable in the next couple of decades. That meant that if we placed a moderate probability on the first premise, we should place a moderate probability on TAI in the next couple of decades. This made me take short timelines even more seriously, because I found the biological analogy intuitively appealing and I didn't think that people who confidently disagreed had strong arguments against it.
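To give a sense of the shape of that argument, here is a back-of-the-envelope sketch; the training-FLOP figure, compute price-performance, doubling time, and spending level below are all placeholder assumptions, not numbers from my report:

```python
import math

# Back-of-the-envelope affordability calculation. Every number below is a
# placeholder assumption for illustration, not a figure from the report.
training_flop = 1e29          # hypothetical "X FLOP" to train a brain-sized model
flop_per_dollar_2020 = 1e17   # assumed price-performance of compute in 2020
doubling_time_years = 2.0     # assumed doubling time of FLOP per dollar
willingness_to_pay = 1e9      # dollars a well-resourced actor might spend

cost_2020 = training_flop / flop_per_dollar_2020           # dollars to run it today
doublings_needed = max(0.0, math.log2(cost_2020 / willingness_to_pay))
years_until_affordable = doublings_needed * doubling_time_years

print(f"Cost of the training run in 2020: ${cost_2020:.1e}")
print(f"Affordable within ~{years_until_affordable:.0f} years at ${willingness_to_pay:.0e} of spending")
```

Under these particular placeholders the run becomes affordable in roughly twenty years, which is the basic shape of the "moderate probability within a couple of decades" conclusion.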
Then I started digging into those arguments in mid-2019 for the project that ultimately became the report, and I started to be more skeptical again because it seemed that...
What cause-prioritization efforts would you most like to see from within the EA community?
I'm most interested in forecasting work that could help us figure out how much to prioritize AI risk over other x-risks, for example estimating transformative AI timelines, trying to characterize what the world would look like in between now and transformative AI, and trying to estimate the magnitude of risk from AI.
If a magic fairy gave you 10 excellent researchers from a range of relevant backgrounds who were to work on a team together to answer important questions about the simulation hypothesis, what are the top n research questions you'd be most excited to discover they are pursuing?
I'm afraid I don't have crisp enough models of the simulation hypothesis and related sub-questions to have a top n list. My biggest question is something more like "This seems like a pretty fishy argument, and I find myself not fully getting or buying it despite not being able to write down a simple flaw. What's up with that? Can somebody explain away my intuition that it's fishy in a more satisfying way and convince me to buy it more wholeheartedly, or else can someone pinpoint the fishiness more precisely?" My second biggest question is something like "Does this actually have any actionable implications for altruists/philanthropists? What are they, and can you justify them in a way that feels more robust and concrete and satisfying than earlier attempts, like Robin Hanson's How to Live in a Simulation?"
Thanks for doing this AMA!
I'd be interested to hear about what you or Open Phil include (and prioritise) within the "longtermism" bucket. In particular, I'm interested in things like:
- When you/Open Phil talk about existential risk, are you (1) almost entirely concerned about extinction risk specifically, (2) mostly concerned about extinction risk specifically, or (3) somewhat similarly concerned about extinction risk and other existential risks (i.e., risks of unrecoverable collapse or unrecoverable dystopias)?
- When you/Open Phil talk about longtermism, are you...
How would you define a "cause area" and "cause prioritization", in a way which extends beyond Open Phil?
I'd say that a "cause" is something analogous to an academic field (like "machine learning theory" or "marine biology") or an industry (like "car manufacturing" or "corporate law"), organized around a problem or opportunity to improve the world. The motivating problem or opportunity needs to be specific enough and clear enough that it pays off to specialize in it by developing particular skills, reading up on a body of work related to the problem, trying to join particular organizations that also work on the problem, etc.
Like fields and industries, the boundaries around what exactly a "cause" is can be fuzzy, and a cause can have sub-causes (e.g. "marine biology" is a sub-field of "biology" and "car manufacturing" is a sub-industry within "manufacturing"). But some things are clearly too broad to be a cause: "doing good" is not a cause in the same way that "learning stuff" is not an academic field and "making money" is not an industry. Right now, the cause areas that long-termist EAs support are in their infancy, so they're pretty broad and "generalist"; over time I expect sub-causes to become more clearly defined and deeper specialized expertise to develop within them (e.g. ...
Hi Ajeya, that's a wonderful idea - I have a couple of questions below that are more about how you find working as a Senior Research Analyst and in this area:
What do you love about your role / work?
What do you dislike about your role / work?
What’s blocking you from having the impact you’d like to have?
What is the most important thing you did to get to where you are? (e.g., networking, trying out lots of jobs/internships, continuity at one job, a particular course, etc.)
Regarding forecasts on transformative AI:
I'd be really interested in hearing about the discussions you have with people who have earlier median estimates, and/or what you expect those discussions would resolve around. For example, I saw that the Metaculus crowd has a median estimate of 2035 for fully general AI. Skimming their discussions, they might...
Like Linch says, some of the reason the Metaculus median is lower than mine is probably because they have a weaker definition; 2035 seems like a reasonable median for "fully general AI" as they define it, and my best guess may even be sooner.
With that said, I've definitely had a number of conversations with people who have shorter timelines than me for truly transformative AI; Daniel Kokotajlo articulates a view in this space here. Disagreements tend to be around the following points:
- People with shorter timelines than me tend to feel that the notion of "effective horizon length" either doesn't make sense, or that training time scales sub-linearly rather than linearly with effective horizon length, or that models with short effective horizon lengths will be transformative despite being "myopic." They generally prefer a model where a scaled-up GPT-3 constitutes transformative AI. Since I published my draft report, Guille Costa (an intern at Open Philanthropy) released a version of the model that explicitly breaks out "scaled-up GPT-3" as a hypothesis, which would imply a median of 2040 if all my other assumptions are kept intact. (A toy illustration of the horizon-length scaling point is sketched after this list.)
- They also tend to feel that extrapolations of whe...
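Here is the toy illustration of the horizon-length scaling point mentioned above; the base compute figure and horizon length are placeholder assumptions, not parameters from the report:

```python
# Toy illustration of why the horizon-length scaling assumption matters.
# Both numbers below are placeholders, not parameters from the report.
base_training_flop = 1e27   # assumed compute to train a short-horizon (1-step) model
horizon_length = 1e6        # assumed effective horizon length, in steps per data point

linear_flop = base_training_flop * horizon_length            # linear scaling assumption
sublinear_flop = base_training_flop * horizon_length ** 0.5  # one sub-linear alternative

print(f"Linear scaling:     {linear_flop:.1e} FLOP")
print(f"Sub-linear scaling: {sublinear_flop:.1e} FLOP")
# Under these placeholders the two assumptions differ by ~3 orders of
# magnitude of training compute, which is why this disagreement moves
# timelines by years rather than months.
```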
Note that the definition of "fully general AI" on that Metaculus question is considerably weaker than how Open Phil talks about "transformative AI."
Imagine you win $10B in a donor lottery. What sort of interventions—that are unlikely to be funded by Open Phil in the near future—might you fund with that money?
How much worldview diversification and dividing of capital into buckets do you do within each of the three main cause areas, if at all? For example, I could imagine a divide between short and long AI timelines, or a divide between policy-oriented and research-oriented grants.
I'd be interested to hear whether you think eventually expanding beyond our solar system is necessary for achieving a long period with very low extinction risk (and, if so, your reasons for thinking that).
Context for this question (adapted from this comment):
As part of the discussion of "Effective size of the long-term future" during your recent 80k appearance, you and Rob discussed the barriers to and likelihood of various forms of space colonisation.
During that section, I got the impression that you were implicitly thinking that a stable, low...
Apart from the biological anchors approach, what efforts in AI timelines or takeoff dynamics forecasting—both inside and outside Open Phil—are you most excited about?
What instrumental goals have you pursued successfully?
For thinking about AI timelines, how do you go about choosing the best reference classes to use (see e.g., here and here)?
[I'm not sure if you've thought about the following sort of question much. Also, I haven't properly read your report - let me know if this is covered in there.]
I'm interested in a question along the lines of "Do you think some work done before TAI is developed matters in a predictable way - i.e., better than 0 value in expectation - for its effects on the post-TAI world, in ways that don't just flow through how the work affects the pre-TAI world or how the TAI transition itself plays out? If so, to what extent? And what sort of work?"
An example to illustrate...
To the extent that you have "a worldview" (in scare quotes), what is a short summary of that worldview?
I'm curious about your take on prioritizing between science funding and other causes. In the 80k interview you said: ...
I really appreciated your 80K episode - it was one of my favorites! I created a discussion thread for it.
Some questions - feel free to answer as many as you want:
Any thoughts on the recent exodus of employees from OpenAI?
In your 80,000 Hours interview you talked about worldview diversification. You emphasized the distinction between total utilitarianism vs. person-affecting views within the EA community. What about diversification beyond utilitarianism entirely? How would you incorporate other normative ethical views into cause prioritization considerations? (I'm aware that in general this is basically just the question of moral uncertainty, but I'm curious how you and Open Phil view this issue in practice.)
Most people at Open Phil aren't 100% bought into utilitarianism, but utilitarian thinking has an outsized impact on cause selection and prioritization because, under a lot of other ethical perspectives, philanthropy is supererogatory, so those other ethical perspectives are not as "opinionated" about how best to do philanthropy. It seems that the non-utilitarian perspectives we take most seriously usually don't provide explicit cause prioritization input such as "Fund biosecurity rather than farm animal welfare", but rather provide input about what rules or constraints we should be operating under, such as "Don't misrepresent what you believe even if it would increase expected impact in utilitarian terms."
Hello! I really enjoyed your 80,000 Hours interview, and thanks for answering questions!
1 - Do you have any thoughts about the prudential/personal/non-altruistic implications of transformative AI in our lifetimes?
2 - I find fairness agreements between worldviews unintuitive but also intriguing. Are there any references you'd suggest on fairness agreements besides the Open Phil cause prioritization update?
I've been increasingly hearing advice to the effect that "stories" are an effective way for an AI x-safety researcher to figure out what to work on: that drawing scenarios about how you think it could go well or go poorly, and doing backward induction to derive a research question, is better than traditional methods of finding a research question. Do you agree with this? It seems like the uncertainty when you draw such scenarios is so massive that one couldn't make a dent in it, but do you think it's valuable for AI x-safety researchers to make significant...
Thanks for doing this and for doing the 80k podcast, I enjoyed the episode.
[The following question might just be confused, might not be important, and will likely be poorly phrased/explained.]
In your recent 80k appearance, you and Rob both say that the way the self-sampling assumption (SSA) leads to the doomsday argument seems sort-of "suspicious". You then say that, on the other hand, the way the self-indication assumption (SIA) causes an opposing update also seems suspicious.
But I think all of your illustrations of how updates based on the SIA can seem suspicious involved infinities. And we already know that loads of...
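For concreteness, here is the standard textbook toy version of the SSA/SIA calculation being referred to (my own illustrative numbers, not anything from the podcast):

```python
# Toy doomsday-argument calculation with two hypotheses about the total
# number of humans who will ever live, given equal priors. Numbers are
# the usual illustrative ones, not anything from the podcast.
n_doom_soon = 2e11   # ~200 billion humans ever
n_doom_late = 2e14   # ~200 trillion humans ever
birth_rank = 1e11    # roughly our current birth rank; since it is <= both
                     # hypotheses, its exact value cancels in the ratio below

# SSA: P(having this birth rank | N) = 1/N for rank <= N, so the likelihood
# ratio favors the smaller world.
ssa_likelihood_ratio = (1 / n_doom_soon) / (1 / n_doom_late)   # = 1000 toward doom-soon
ssa_posterior = ssa_likelihood_ratio / (ssa_likelihood_ratio + 1)

# SIA: reweight the priors by the number of observers (bigger worlds are
# proportionally more likely to contain you), which exactly cancels the
# SSA shift in this two-hypothesis case.
sia_prior_ratio = n_doom_soon / n_doom_late                    # = 1/1000
sia_posterior = (sia_prior_ratio * ssa_likelihood_ratio) / (sia_prior_ratio * ssa_likelihood_ratio + 1)

print(f"SSA posterior on doom-soon: {ssa_posterior:.3f}")  # ~0.999
print(f"SIA posterior on doom-soon: {sia_posterior:.3f}")  # 0.500
```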
Hi Ajeya! :) What do you think about open source projects like https://www.eleuther.ai/ that replicate cutting-edge projects like GPT-3 or Alphafold? Speaking as an outsider, I imagine that a lot of AI progress comes from "random" tinkering, and so I wondered if "Discord groups tinkering along" are relevant actors in your strategic landscape.
(I really enjoyed listening to the recent interview!)
Hi Ajeya, thank you for publishing such a massive and detailed report on timelines!! Like other commenters here, I treat it as my go-to reference. Allowing users to adjust the parameters of your model is very helpful for picking out built-in assumptions and being able to update predictions as new developments are made.
In your report you mention that you discount the aggressive timelines in part due to lack of major economic applications of AI so far. I have a few questions along those lines.
Do you think TAI will necessarily be foreshadowed by incremental economic ...
Hi Ajeya, thanks for doing this and for your recent 80K interview! I'm trying to understand what assumptions are needed for the argument you raise in the podcast discussion on fairness agreements: that a longtermist worldview should have been willing to trade all its influence for an ever-larger potential universe. There are two points below; I was wondering if you could comment on whether/how they align with your argument.
My intuition says that the argument requires a prior probability distribution on universe size that has an infinite expectation, rather than just...
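To spell out the distinction being drawn (with my own illustrative toy priors, not anything from the podcast): a prior can assign every finite universe size positive probability and still have either a finite or an infinite expectation, and only the latter makes the "trade everything for a bigger universe" argument go through. A minimal example:

```latex
% Toy priors on universe size S, k = 1, 2, 3, ...
% Heavy-tailed prior: infinite expectation
P(S = 2^k) = 2^{-k}
  \;\Rightarrow\; \mathbb{E}[S] = \sum_{k \ge 1} 2^{-k} \cdot 2^k = \sum_{k \ge 1} 1 = \infty
% Lighter-tailed prior: finite expectation
P(S = 2^k) = 2 \cdot 3^{-k}
  \;\Rightarrow\; \mathbb{E}[S] = \sum_{k \ge 1} 2 \cdot 3^{-k} \cdot 2^k = 2 \sum_{k \ge 1} (2/3)^k = 4
```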
What do you make of Ben Garfinkel's work expressing skepticism that an AI's capacity is separable from its goals, and his broader skepticism of "brain in a box" scenarios?