Quick takes

In this "quick take", I want to summarize some my idiosyncratic views on AI risk.  My goal here is to list just a few ideas that cause me to approach the subject differently from how I perceive most other EAs view the topic. These ideas largely push me in the direction of making me more optimistic about AI, and less likely to support heavy regulations on AI. (Note that I won't spend a lot of time justifying each of these views here. I'm mostly stating these points without lengthy justifications, in case anyone is curious. These ideas can perhaps inform why I spend significant amounts of my time pushing back against AI risk arguments. Not all of these ideas are rare, and some of them may indeed be popular among EAs.) 1. Skepticism of the treacherous turn: The treacherous turn is the idea that (1) at some point there will be a very smart unaligned AI, (2) when weak, this AI will pretend to be nice, but (3) when sufficiently strong, this AI will turn on humanity by taking over the world by surprise, and then (4) optimize the universe without constraint, which would be very bad for humans. By comparison, I find it more likely that no individual AI will ever be strong enough to take over the world, in the sense of overthrowing the world's existing institutions and governments by surprise. Instead, I broadly expect unaligned AIs will integrate into society and try to accomplish their goals by advocating for their legal rights, rather than trying to overthrow our institutions by force. Upon attaining legal personhood, unaligned AIs can utilize their legal rights to achieve their objectives, for example by getting a job and trading their labor for property, within the already-existing institutions. Because the world is not zero sum, and there are economic benefits to scale and specialization, this argument implies that unaligned AIs may well have a net-positive effect on humans, as they could trade with us, producing value in exchange for our own property and services. Note that my claim here is not that AIs will never become smarter than humans. One way of seeing how these two claims are distinguished is to compare my scenario to the case of genetically engineered humans. By assumption, if we genetically engineered humans, they would presumably eventually surpass ordinary humans in intelligence (along with social persuasion ability, and ability to deceive etc.). However, by itself, the fact that genetically engineered humans will become smarter than non-engineered humans does not imply that genetically engineered humans would try to overthrow the government. Instead, as in the case of AIs, I expect genetically engineered humans would largely try to work within existing institutions, rather than violently overthrow them. 2. AI alignment will probably be somewhat easy: The most direct and strongest current empirical evidence we have about the difficulty of AI alignment, in my view, comes from existing frontier LLMs, such as GPT-4. Having spent dozens of hours testing GPT-4's abilities and moral reasoning, I think the system is already substantially more law-abiding, thoughtful and ethical than a large fraction of humans. Most importantly, this ethical reasoning extends (in my experience) to highly unusual thought experiments that almost certainly did not appear in its training data, demonstrating a fair degree of ethical generalization, beyond mere memorization. It is conceivable that GPT-4's apparently ethical nature is fake. 
Perhaps GPT-4 is lying about its motives to me and in fact desires something completely different than what it professes to care about. Maybe GPT-4 merely "understands" or "predicts" human morality without actually "caring" about human morality. But while these scenarios are logically possible, they seem less plausible to me than the simple alternative explanation that alignment—like many other properties of ML models—generalizes well, in the natural way that you might similarly expect from a human. Of course, the fact that GPT-4 is easily alignable does not immediately imply that smarter-than-human AIs will be easy to align. However, I think this current evidence is still significant, and aligns well with prior theoretical arguments that alignment would be easy. In particular, I am persuaded by the argument that, because evaluation is usually easier than generation, it should be feasible to accurately evaluate whether a slightly-smarter-than-human AI is taking bad actions, allowing us to shape its rewards during training accordingly. After we've aligned a model that's merely slightly smarter than humans, we can use it to help us align even smarter AIs, and so on, plausibly implying that alignment will scale to indefinitely higher levels of intelligence, without necessarily breaking down at any physically realistic point. 3. The default social response to AI will likely be strong: One reason to support heavy regulations on AI right now is if you think the natural "default" social response to AI will lean too heavily on the side of laissez faire than optimal, i.e., by default, we will have too little regulation rather than too much. In this case, you could believe that, by advocating for regulations now, you're making it more likely that we regulate AI a bit more than we otherwise would have, pushing us closer to the optimal level of regulation. I'm quite skeptical of this argument because I think that the default response to AI (in the absence of intervention from the EA community) will already be quite strong. My view here is informed by the base rate of technologies being overregulated, which I think is quite high. In fact, it is difficult for me to name even a single technology that I think is currently clearly underregulated by society. By pushing for more regulation on AI, I think it's likely that we will overshoot and over-constrain AI relative to the optimal level. In other words, my personal bias is towards thinking that society will regulate technologies too heavily, rather than too loosely. And I don't see a strong reason to think that AI will be any different from this general historical pattern. This makes me hesitant to push for more regulation on AI, since on my view, the marginal impact of my advocacy would likely be to push us even further in the direction of "too much regulation", overshooting the optimal level by even more than what I'd expect in the absence of my advocacy. 4. I view unaligned AIs as having comparable moral value to humans: This idea was explored in one of my most recent posts. The basic idea is that, under various physicalist views of consciousness, you should expect AIs to be conscious, even if they do not share human preferences. Moreover, it seems likely that AIs — even ones that don't share human preferences — will be pretrained on human data, and therefore largely share our social and moral concepts. 
Since unaligned AIs will likely be both conscious and share human social and moral concepts, I don't see much reason to think of them as less "deserving" of life and liberty, from a cosmopolitan moral perspective. They will likely think similarly to the way we do across a variety of relevant axes, even if their neural structures are quite different from our own. As a consequence, I am pretty happy to incorporate unaligned AIs into the legal system and grant them some control of the future, just as I'd be happy to grant some control of the future to human children, even if they don't share my exact values. Put another way, I view (what I perceive as) the EA attempt to privilege "human values" over "AI values" as being largely arbitrary and baseless, from an impartial moral perspective. There are many humans whose values I vehemently disagree with, but I nonetheless respect their autonomy, and do not wish to deny these humans their legal rights. Likewise, even if I strongly disagreed with the values of an advanced AI, I would still see value in their preferences being satisfied for their own sake, and I would try to respect the AI's autonomy and legal rights. I don't have a lot of faith in the inherent kindness of human nature relative to a "default unaligned" AI alternative. 5. I'm not fully committed to longtermism: I think AI has an enormous potential to benefit the lives of people who currently exist. I predict that AIs can eventually substitute for human researchers, and thereby accelerate technological progress, including in medicine. In combination with my other beliefs (such as my belief that AI alignment will probably be somewhat easy), this view leads me to think that AI development will likely be net-positive for people who exist at the time of alignment. In other words, if we allow AI development, it is likely that we can use AI to reduce human mortality, and dramatically raise human well-being for the people who already exist. I think these benefits are large and important, and commensurate with the downside potential of existential risks. While a fully committed strong longtermist might scoff at the idea that curing aging might be important — as it would largely only have short-term effects, rather than long-term effects that reverberate for billions of years — by contrast, I think it's really important to try to improve the lives of people who currently exist. Many people view this perspective as a form of moral partiality that we should discard for being arbitrary. However, I think morality is itself arbitrary: it can be anything we want it to be. And I choose to value currently existing humans, to a substantial (though not overwhelming) degree. This doesn't mean I'm a fully committed near-termist. I sympathize with many of the intuitions behind longtermism. For example, if curing aging required raising the probability of human extinction by 40 percentage points, or something like that, I don't think I'd do it. But in more realistic scenarios that we are likely to actually encounter, I think it's plausibly a lot better to accelerate AI, rather than delay AI, on current margins. This view simply makes sense to me given the enormously positive effects I expect AI will likely have on the people I currently know and love, if we allow development to continue.
This is a quick post to talk a little bit about what I’m planning to focus on in the near and medium-term future, and to highlight that I’m currently hiring for a joint executive and research assistant position. You can read more about the role and apply here! If you’re potentially interested, hopefully the comments below can help you figure out whether you’d enjoy the role.

Recent advances in AI, combined with economic modelling (e.g. here), suggest that we might well face explosive AI-driven growth in technological capability in the next decade, where what would have been centuries of technological and intellectual progress on a business-as-usual trajectory occur over the course of just months or years. Most effort to date, from those worried by an intelligence explosion, has been on ensuring that AI systems are aligned: that they do what their designers intend them to do, at least closely enough that they don’t cause catastrophically bad outcomes.

But even if we make sufficient progress on alignment, humanity will have to make, or fail to make, many hard-to-reverse decisions with important and long-lasting consequences. I call these decisions Grand Challenges. Over the course of an explosion in technological capability, we will have to address many Grand Challenges in a short space of time, including, potentially: what rights to give digital beings; how to govern the development of many new weapons of mass destruction; who gets control over an automated military; how to deal with fast-reproducing human or AI citizens; how to maintain good reasoning and decision-making even despite powerful persuasion technology and greatly-improved ability to ideologically indoctrinate others; and how to govern the race for space resources.

As a comparison, we could imagine that explosive growth had occurred in Europe in the 11th century, and that all the intellectual and technological advances that took a thousand years in our actual history occurred over the course of just a few years. It’s hard to see how decision-making would go well under those conditions.

The governance of explosive growth seems to me to be of comparable importance to AI alignment, not dramatically less tractable, and currently much more neglected. The marginal cost-effectiveness of work in this area therefore seems to be even higher than marginal work on AI alignment. It is, however, still very pre-paradigmatic: it’s hard to know what’s most important in this area, what things would be desirable to push on, or even what good research looks like. I’ll talk more about all this in my EAG: Bay Area talk, “New Frontiers in Effective Altruism.”

I’m far from the only person to highlight these issues, though. For example, Holden Karnofsky has an excellent blog post on issues beyond misalignment; Lukas Finnveden has a great post on similar themes here and an extensive and in-depth series on potential projects here. More generally, I think there’s a lot of excitement about work in this broad area that isn’t yet being represented in places like the Forum. I’d be keen for more people to start learning about and thinking about these issues.

Over the last year, I’ve done a little bit of exploratory research into some of these areas; over the next six months, I plan to continue this in a focused way, with an eye toward making this a multi-year focus.

In particular, I’m interested in the rights of digital beings, the governance of space resources, and, above all, the “meta” challenge of ensuring that we have good deliberative processes through the period of explosive growth. (One can think of work on the meta challenge as fleshing out somewhat realistic proposals that could take us in the direction of the “long reflection”.) By working on good deliberative processes, we could thereby improve decision-making on all the Grand Challenges we will face. This work could help with AI safety, too: if we can guarantee power-sharing after the development of superintelligence, that decreases the incentive for competitors to race and cut corners on safety.

I’m not sure yet what output this would ultimately lead to, if I decide to continue work on this beyond the next six months. Plausibly there could be many possible books, policy papers, or research institutes on these issues, and I’d be excited to help make happen whichever of these seem highest-impact after further investigation.

Beyond this work, I’ll continue to provide support for individuals and organisations in EA (such as via fundraising, advice, advocacy and passing on opportunities) in an 80/20 way; most likely, I’ll just literally allocate 20% of my time to this, and spend the remaining 80% on the ethics and governance issues I list above. I expect not to be very involved with organisational decision-making (for example by being on boards of EA organisations) in the medium term, in order to stay focused and play to my comparative advantage.

I’m looking for a joint research and executive assistant to help with the work outlined above. The role involves research tasks such as providing feedback on drafts, conducting literature reviews and small research projects, as well as administrative tasks like processing emails, scheduling, and travel booking. The role could also turn into a more senior role, depending on experience and performance.

Example projects that a research assistant could help with include:

* A literature review on the drivers of moral progress.
* A “literature review” focused on reading through LessWrong, the EA Forum, and other blogs, and finding the best work there related to the fragility of value thesis.
* Case studies on: What exactly happened to result in the creation of the UN, and the precise nature of the UN Charter? What can we learn from it? Similarly for the Kyoto Protocol, the Nuclear Non-Proliferation Treaty, and the Montreal Protocol.
* Short original research projects, such as:
  * Figuring out what a good operationalisation of transformative AI would be, for the purpose of creating an early tripwire to alert the world of an imminent intelligence explosion.
  * Taking some particular neglected Grand Challenge, and fleshing out the reasons why this Grand Challenge might or might not be a big deal.
  * Supposing that the US wanted to make an agreement to share power and respect other countries’ sovereignty in the event that it develops superintelligence, figuring out how we could legibly guarantee future compliance with that agreement, such that the commitment is credible to other countries.

The deadline for applications is February 11th. If this seems interesting, please apply!
I am not under any non-disparagement obligations to OpenAI. It is important to me that people know this, so that they can trust any future policy analysis or opinions I offer. I have no further comments at this time.
Please people, do not treat Richard Hanania as some sort of worthy figure who is a friend of EA. He was a Nazi, and whilst he claims he moderated his views, he is still very racist as far as I can tell. Hanania called for trying to get rid of all non-white immigrants in the US and the sterilization of everyone with an IQ under 90, indulged in antisemitic attacks on the allegedly Jewish elite, and even after his reform was writing about the need for the state to harass and imprison Black people specifically ('a revolution in our culture or form of government. We need more policing, incarceration, and surveillance of black people' https://en.wikipedia.org/wiki/Richard_Hanania). Yet in the face of this, and after he made an incredibly grudging apology about his most extreme stuff (after journalists dug it up), he's been invited to Manifold's events and put on Richard Yetter Chappell's blogroll.

DO NOT DO THIS. If you want people to distinguish benign transhumanism (which I agree is a real thing*) from the racist history of eugenics, do not fail to shun actual racists and Nazis. Likewise, if you want to promote "decoupling" factual beliefs from policy recommendations, which can be useful, do not duck and dive around the fact that virtually every major promoter of scientific racism ever, including allegedly mainstream figures like Jensen, worked with or published with actual literal Nazis (https://www.splcenter.org/fighting-hate/extremist-files/individual/arthur-jensen).

I love most of the people I have met through EA, and I know that, despite what some people say on twitter, we are not actually a secret crypto-fascist movement (nor is longtermism specifically, which, whether you like it or not, is mostly about what its EA proponents say it is about). But there is in my view a disturbing degree of tolerance for this stuff in the community, mostly centered around the Bay specifically. And to be clear, I am complaining about tolerance for people with far-right and fascist ("reactionary" or whatever) political views, not people with any particular personal opinion on the genetics of intelligence. A desire for authoritarian government enforcing the "natural" racial hierarchy does not become okay just because you met the person with the desire at a house party and they seemed kind of normal and chill or super-smart and nerdy.

I usually take a way more measured tone on the forum than this, but here I think real information is given by getting shouty.

*Anyone who thinks it is automatically far-right to think about any kind of genetic enhancement at all should go read some Culture novels, and note the implied politics (or indeed, look up the author's actual die-hard libertarian socialist views). I am not claiming that far-left politics is innocent, just that it is not racist.
You can now import posts directly from Google docs. Plus, internal links to headers[1] will now be mapped over correctly.

To import a doc, make sure it is public or shared with "eaforum.posts@gmail.com"[2], then use the widget on the new/edit post page. Importing a doc will create a new (permanently saved) version of the post, but will not publish it, so it's safe to import updates into posts that are already published. You will need to click the "Publish Changes" button to update the live post.

Everything that previously worked on copy-paste[3] will also work when importing, with the addition of internal links to headers (which only work when importing). There are still a few things that are known not to work:

* Nested bullet points (these are working now)
* Cropped images get uncropped
* Bullet points in footnotes (these will become separate un-bulleted lines)
* Blockquotes (there isn't a direct analog of this in Google docs unfortunately)

There might be other issues that we don't know about. Please report any bugs or give any other feedback by replying to this quick take; you can also contact us in the usual ways.

Appendix: Version history

There are some minor improvements to the version history editor[4] that come along with this update:

* You can load a version into the post editor without updating the live post; previously you could only hard-restore versions
* The version that is live[5] on the post is shown in bold

Here's what it would look like just after you import a Google doc, but before you publish the changes. Note that the latest version isn't bold, indicating that it is not showing publicly.

1. ^ Previously the link would take you back to the original doc; now it will take you to the header within the Forum post as you would expect. Internal links to bookmarks (where you link to a specific text selection) are also partially supported, although the link will only go to the paragraph the text selection is in.
2. ^ Sharing with this email address means that anyone can access the contents of your doc if they have the url, because they could go to the new post page and import it. It does mean they can't access the comments at least.
3. ^ I'm not sure how widespread this knowledge is, but previously the best way to copy from a Google doc was to first "Publish to the web" and then copy-paste from this published version. In particular this handles footnotes and tables, whereas pasting directly from a regular doc doesn't. The new importing feature should be equal to this publish-to-web copy-pasting, so it will handle footnotes, tables, images etc. And it additionally supports internal links.
4. ^ Accessed via the "Version history" button in the post editor.
5. ^ For most intents and purposes you can think of "live" as meaning "showing publicly". There is a bit of a sharp corner in this definition, in that the post as a whole can still be a draft. To spell this out: there can be many different versions of a post body, and only one of these is attached to the post; this is the "live" version. This live version is what shows on the non-editing view of the post. Independently of this, the post as a whole can be a draft or published.
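For anyone who would rather script the sharing step than click through the Google Docs UI, here is a minimal sketch using the Google Drive v3 API. This is an illustration, not part of the Forum feature itself: it assumes you already have OAuth credentials with a Drive scope, and the helper name `share_doc_with_forum` is hypothetical; only the "eaforum.posts@gmail.com" address comes from the announcement above.

```python
# Sketch: grant read access on a Google Doc to the Forum's import address,
# so the import widget can fetch it. Assumes OAuth credentials with a Drive
# scope have already been obtained (e.g. via google-auth-oauthlib).
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

FORUM_IMPORT_ADDRESS = "eaforum.posts@gmail.com"  # address named in the post above

def share_doc_with_forum(doc_id: str, creds: Credentials) -> str:
    """Share the doc (read-only) with the Forum import address; returns the permission id."""
    drive = build("drive", "v3", credentials=creds)
    permission = {
        "type": "user",
        "role": "reader",  # reader access should be enough for importing
        "emailAddress": FORUM_IMPORT_ADDRESS,
    }
    result = (
        drive.permissions()
        .create(fileId=doc_id, body=permission, sendNotificationEmail=False)
        .execute()
    )
    return result["id"]
```

After sharing (whether via the UI or a script like the above), you would still import and publish through the new/edit post page as described in the announcement.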