Most college-educated adults would get well under half of these problems right (the authors used computer science undergraduates as human subjects, and their performance ranged from 40% to 90%).
I think the hardness of the MATH benchmark was somewhat exaggerated. I downloaded the dataset myself and took a look, and came to the conclusion that many -- perhaps most -- of the questions are simple plug-and-chug problems. The reported 40-90% performance among students may have been a result of the time constraints rather than the problems' intrinsic difficulty. In the paper, they wrote:
"To provide a rough but informative comparison to human-level performance, we randomly sampled 20 problems from the MATH test set and gave them to humans. We artificially require that the participants have 1 hour to work on the problems and must perform calculations by hand."
Two key points I want to add to this summary:
The Bay Area rationalist scene is a hive of techno-optimistic libertarians.[1] These people have a negative view of state/government effectiveness at a philosophical and ideological level, so their default perspective is that the government doesn't know what it's doing and won't do anything.
The attitude of expecting very few regulations made little sense to me, because -- as someone who broadly shares these background biases -- my prior is that governments will, by default, regulate any scary new technology that comes out. I just don't expect that regulations will always be thoughtful, or that they will weigh the risks and rewards of new technologies appropriately.
There's an old adage that describes how government sometimes operates in response to a crisis: "We must do something; this is something; therefore, we must do this." Eliezer Yudkowsky himself once said,
So there really is a reason to be allergic to people who go around saying, "Ah, but technology has risks as well as benefits". There's a historical record showing over-conservativeness, the many silent deaths of regulation being outweighed by a few visible deaths of nonregulation. If you're really playing the middle, why not say, "Ah, but technology has benefits as well as risks"?
Can you give some examples where individual humans have a clear strategic decisive advantage (i.e. very low risk of punishment), where the low-power individual isn't at a high risk of serious harm?
Why are we assuming a low risk of punishment? Risk of punishment depends largely on social norms and laws, and I'm saying that AIs will likely adhere to a set of social norms.
I think the central question is whether these social norms will include the norm "don't murder humans". I think such a norm will probably exist, unless almost all AIs are severely misaligned. I think severe misalignment is possible; one can certainly imagine it happening. But I don't find it likely, since people will care a lot about making AIs ethical, and I'm not yet aware of any strong reasons to think alignment will be super-hard.
Correct me if I'm wrong, but it seems like most of these reasons boil down to not expecting AI to be superhuman in any relevant sense
No, I certainly expect AIs will eventually be superhuman in virtually all relevant respects.
Resource allocation is relatively equal (and relatively free of violence) among humans because even humans that don't very much value the well-being of others don't have the power to actually expropriate everyone else's resources by force.
Can you clarify what you are saying here? If I understand you correctly, you're saying that humans have relatively little wealth inequality because there's relatively little inequality in power between humans. What does that imply about AI?
I think there will probably be big inequalities in power among AIs, but I am skeptical of the view that there will be only one AI (or even just a few) that dominates everything else.
I do not think GPT-4 is meaningful evidence about the difficulty of value alignment.
I'm curious: does that mean you also think that alignment research performed on GPT-4 is essentially worthless? If not, why?
I think it's extremely unlikely that GPT-4 has preferences over world states in a way that most humans would consider meaningful, and in the very unlikely event that it does, those preferences almost certainly aren't centrally pointed at being honest, kind, and helpful.
I agree that GPT-4 probably doesn't have preferences in the same way humans do, but it sure appears to be a limited form of general intelligence, and I think future AGI systems will likely share many underlying features with GPT-4, including, to some extent, the cognitive representations inside the system.
I think our best guess of future AI systems should be that they'll be similar to current systems, but scaled up dramatically, trained on more modalities, with some tweaks and post-training enhancements, at least if AGI arrives soon. Are you simply skeptical of short timelines?
re: endogenous response to AI - I don't see how this is relevant once you have ASI.
To be clear, I expect we'll get AI regulations before we get to ASI. I predict that regulations will increase in intensity as AI systems get more capable and start having a greater impact on the world.
Note that we are currently moving at pretty close to max speed, so this is a prediction that the future will be different from the past.
Every industry in history initially experienced little to no regulation. However, after people became more acquainted with the industry, regulations on the industry increased. I expect AI will follow a similar trajectory. I think this is in line with historical evidence, rather than contradicting it.
re: perfectionism - I would not be surprised if many current humans, given superhuman intelligence and power, created a pretty terrible future. Current power differentials do not meaningfully let individual players flip every single other player the bird at the same time.
I agree. If you turned a random human into a god, or a random small group of humans into gods, then I would be pretty worried. However, in my scenario there aren't going to be single AIs that suddenly become gods. Instead, there will be millions of different AIs, which will smoothly increase in power over time. During this time, we will be able to experiment and do alignment research to see what does and doesn't work for making AIs safe. I expect AI takeoff will be fairly diffuse, and AIs will probably be respectful of norms and laws because no single AI can take over the world by itself. Of course, the way I think about the future could be wrong on a lot of specific details, but I don't see a strong reason to doubt the basic picture I'm presenting, as of now.
My guess is that your main objection here is that you think foom will happen, i.e. there will be a single AI that takes over the world and imposes its will on everyone else. Can you elaborate more on why you think that will happen? I don't think it's a straightforward consequence of AIs being smarter than humans.
I'm not totally sure what analogy you're trying to rebut, but I think that human treatment of animal species, as a piece of evidence for how we might be treated by future AI systems that are analogously more powerful than we are, is extremely negative, not positive.
My main argument is that we should reject the analogy itself. I'm not really arguing that the analogy provides evidence for optimism, except in a very weak sense. I'm just saying: AIs will be born into and shaped by our culture; that's quite different from what happened between animals and humans.
I might elaborate on this at some point, but I thought I'd write down some general reasons why I'm more optimistic than many EAs on the risk of human extinction from AI. I'm not defending these reasons here; I'm mostly just stating them.
The second is describing an already-existing phenomenon of cost disease, which, while concerning, has been compatible with high rates of growth and progress over the past 200 years.
I want to add that cost disease is not only compatible with economic growth; cost disease is itself a result of economic growth, at least in the usual sense of the word. The Baumol effect -- which is what people usually mean when they say cost disease -- is simply a side effect of some industries becoming more productive more quickly than others. Essentially the only way to avoid cost disease is to have uniform growth across all industries, and that has basically never happened historically, except during times of total stagnation (in which growth is ~0% in every industry).
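To illustrate the mechanism, here is a minimal numerical sketch; the sector names, growth rates, and the assumption of a single economy-wide wage are made up for illustration, not taken from the discussion above:

```python
# Toy illustration of the Baumol effect with made-up parameters: two sectors
# share a common wage, but only one sector's labor productivity grows. The
# slow sector's unit cost -- and hence its relative price -- rises even
# though nothing about that sector gets worse.
growth = {"manufacturing": 0.03, "services": 0.00}      # assumed annual productivity growth
productivity = {"manufacturing": 1.0, "services": 1.0}  # output per worker-hour

for year in range(51):
    # Competition for workers pushes the economy-wide wage up in line with
    # productivity in the fast-growing sector.
    wage = productivity["manufacturing"]
    # Unit cost in each sector = wage / output per worker-hour.
    unit_cost = {s: round(wage / productivity[s], 2) for s in productivity}
    if year % 10 == 0:
        print(year, unit_cost)
    for s in productivity:
        productivity[s] *= 1 + growth[s]
```

The printout shows services becoming steadily more expensive relative to manufacturing even though services productivity never falls, which is the "cost disease" pattern.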
Even though I disagree with Caplan on x-risks, animal rights, mental illness, free will, and a few other things, I ultimately don't think it's necessarily suspicious for him to hold the most convenient view on a broad range of topics. One can imagine two different ways of forming an ideology: you can start from the ideology and adopt whatever view of each individual question it requires, or you can form your views on the individual questions first and then adopt whatever ideology best fits them.
I predict that, regardless of his own personal history, Bryan Caplan will probably appeal to the second type of reasoning in explaining why his views all seem "convenient". He might say: it's not that the facts are ideologically convenient, but that the ideology is convenient since it fits all the facts. (Although I also expect him to be a bit modest and admit that he might be wrong about the facts.)
Bryan Caplan co-authored a paper critiquing Georgism in 2012. From the blog post explaining the critique,
My co-author Zachary Gochenour and I have a new working paper arguing that the Single Tax suffers from a much more fundamental flaw. Namely: A tax on the unimproved value of land distorts the incentive to search for new land and better uses of existing land. If we actually imposed a 100% tax on the unimproved value of land, any incentive to search would disappear. This is no trivial problem: Imagine the long-run effect on the world’s oil supply if companies stopped looking for new sources of oil.
I can explain our argument with a simple example. Clever Georgists propose a regime where property owners self-assess the value of their property, subject to the constraint that owners must sell their property to anyone who offers that self-assessed value. Now suppose you own a vacant lot with oil underneath; the present value of the oil minus the cost of extraction equals $1M. How will you self-assess? As long as the value of your land is public information, you cannot safely self-assess at anything less than its full value of $1M. So you self-assess at $1M, pay the Georgist tax (say 99%), and pump the oil anyway, right?
There’s just one problem: While the Georgist tax has no effect on the incentive to pump discovered oil, it has a devastating effect on the incentive to discover oil in the first place. Suppose you could find a $1M well by spending $900k on exploration. With a 99% Georgist tax, your expected profits are negative $890k. (.01*$1M-$900k=-$890k)
You might think that this is merely a problem for a handful of industries. But that’s probably false. All firms engage in search, whether or not they explicitly account for it. Take a real estate developer. One of his main functions is to find valuable new ways to use existing land. “This would be a great place for a new housing development.” “This would be a perfect location for a Chinese restaurant.” And so on.
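To make the arithmetic in the quoted example concrete, here is a minimal sketch; the figures are the ones Caplan uses above, while the function name and the integer-percent parameterization are just for illustration:

```python
def profit_from_search(oil_value, exploration_cost, tax_percent):
    """Expected profit from searching for oil when the discovered land's
    unimproved value is taxed at tax_percent."""
    after_tax_value = oil_value * (100 - tax_percent) / 100  # owner keeps the rest
    return after_tax_value - exploration_cost

# Without the tax, searching is worthwhile: $1M - $900k = $100k.
print(profit_from_search(1_000_000, 900_000, tax_percent=0))    # 100000.0

# With a 99% tax on the land's unimproved value, the same search loses money:
# 0.01 * $1M - $900k = -$890k, matching the quote.
print(profit_from_search(1_000_000, 900_000, tax_percent=99))   # -890000.0
```

The tax leaves the pumping decision untouched (the oil is still worth extracting once found), but it flips the sign of the search decision, which is the distortion the paper is pointing at.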
Note: I've edited this post to change my bottom-line TAI arrival distribution slightly. The edit doesn't reflect much of a change in my underlying transformative AI timelines; it mostly reflects a better compromise in how the distribution is visualized.
To make a long story short, I previously put too little probability on TAI arriving between 2027 and 2035 because I wanted the plot to put very low probability on TAI arriving before 2027. Because of the way the Metaculus sliders work, this made it difficult to convey a very rapid increase in my probability after 2026. Now I've decided to compromise in a way that puts what I think is an unrealistically high probability on TAI arriving before 2027.
That said, I have updated a little bit since I wrote this post:
I'll try not to change the post much going forward, so that it can serve as a historical snapshot of how I thought about AI timelines in 2023, rather than a frequently updated document.