Karl von Wendt

154 karma · Joined Jun 2022


I'll bring a simple board game about AI safety that I've developed recently in case anyone wants to do an initial test (not the one that was tested on the AISER, which was way too complex and slow ;).

As I have argued here in more detail, we don't need AGI for an amazing future, including curing cancer. We don't have to decide between "all in for AGI" and "full-stop in developing AI". There's a middle ground, and I think it's the best option we have.

Policymakers and people in industry, at least till ChatGPT, had no idea what was going on (e.g. at the AI World Summit 2 months ago, very few people even knew about GPT-3). SOTA large language models are not really properly deployed, so nobody cared about them or even knew about them (till ChatGPT at least).

As you point out yourself, what makes people interested in developing AGI is progress in AI, not the public discussion of potential dangers. "Nobody cared about" LLMs is certainly not true - I'm pretty sure the relevant people watched them closely. That many people still aren't concerned about AGI, or doubt its feasibility, only means that THOSE people will not pursue it, and any public discussion will probably not change their minds. There are others who think very differently, like the people at OpenAI, DeepMind, Google, and (I suspect) a lot of others who communicate less openly about what they do.

I agree that [a common understanding of the dangers] would be something good to have. But the question is: is it even possible to have such a thing? 

I think that within the scientific community, it's roughly possible (but then your book/outreach medium must be highly targeted towards that community). Within the general public, I think that it's ~impossible.

I don't think you can easily separate the scientific community from the general public. Even scientific papers are read by journalists, who often publish about them in a simplified or distorted way. Already there are many alarming posts and articles out there, as well as books like Stuart Russell's "Human Compatible" (which I think is very good and helpful), so it is way too late to keep the lid on the possibility of AGI and its profound impacts (it was probably too late already when Arthur C. Clarke wrote "2001: A Space Odyssey"). Not talking about the dangers of uncontrollable AI for fear that this may lead to certain actors investing even more heavily in the field is both naive and counterproductive in my view.

And I would strongly recommend not publishing your book as long as you haven't done that.

I will definitely publish it, but I doubt very much that it will have a large impact. There are many other writers out there with a much larger audience who write similar books.

I also hope that a lot of people who have thought about these issues have proofread your book because it's the kind of thing that could really increase P(doom) substantially.

I'm currently in the process of translating it into English so I can do just that. I'll send you a link as soon as I'm finished. I'll also invite everyone else in the AI safety community (I'm probably going to post an invite on LessWrong).

Concerning the Putin quote, I don't think that Russia is at the forefront of development, but China certainly is. Xi has said similar things in public, and I doubt very much that we know how much they currently spend on training their AIs. The quotes are not relevant, though; I just mentioned them to make the point that there is already a lot of discussion about the enormous impact AI will have on our future. I really can't see how discussing the risks should be damaging, while discussing the great potential of AGI for humanity should not.

I strongly disagree with "Avoid publicizing AGI risk among the general public" (disclaimer: I'm a science fiction novelist about to publish a novel about AGI risk, so I may be heavily biased). Putin said in 2017 that "the nation that leads in AI will be the ruler of the world". If anyone who could play any role at all in developing AGI (or uncontrollable AI as I prefer to call it) isn't trying to develop it by now, I doubt very much that any amount of public communication will change that. 

On the other hand, I believe our best chance of preventing or at least slowing down the development of uncontrollable AI is a common, clear understanding of the dangers, especially among those who are at the forefront of development. To achieve that, a large amount of communication will be necessary, both within development and scientific communities and in the public. 

I see various reasons for that. One is the availability heuristic: People don't believe there is an AI x-risk because they've never seen it happen outside of science fiction movies and nobody but a few weird people in the AI safety community is talking seriously about it (very similar to climate change a few decades ago). Another reason is social acceptance: As long as everyone thinks AI is great and the nation with the most AI capabilities wins, if you're working on AI capabilities, you're a hero. On the other hand, if most people think that strong AI poses a significant risk to their future and that of their kids, this might change how AI capabilities researchers are seen, and how they see themselves. I'm not suggesting disparaging people working at AI labs, but I think working in AI safety should be seen as "cool", while blindly throwing more and more data and compute at a problem and seeing what happens should be regarded as "uncool".

You could have an AI with some meta-cognition, able to figure out what's good and maximizing it in the same way EAs try to figure out what's good and maximize it with parts of their life.

I'm not sure how that would work, but we don't need to discuss it further; I'm no expert.

I don't think it's a good method and I think you should target a much more specific public but yes, I know what you mean.

What exactly do you think is "not good" about a public discussion of AI risks?

The superintelligence is misaligned with our own objectives but is benign

I don't see how this is possible. There is nothing like "a little misalignment". Keep in mind that creating an unstoppable and uncontrollable AI is a one-shot event that can't be undone and will have extremely wide and long-term effects on everything. If this AI is misaligned even very slightly, the differences between its goals and humanity's will accumulate and increase over time. It's similar to launching a rocket without any steering mechanism with the aim of landing it on Jupiter's moon Europa: You have to set every parameter exactly right or the rocket will miss the target by far. Even the slightest deviation, e.g. an unaccounted-for asteroid passing close to the rocket and altering its course very slightly due to gravitational effects, will completely ruin the mission.

On the other hand, if we manage to build an AGI that is "docile" and "corrigible" (which I doubt very much we can do), this would be similar to having a rocket that can be steered from afar: In this case, I would say it is fully aligned, even if corrections are necessary once in a while.

Should we end up with both - a misaligned and an aligned AGI, or more of them - it is very likely that the worst AGI (from humanity's perspective) will win the battle for world supremacy, so this is more or less the same as just having one misaligned AGI.

My personal view on your subject is that you don't have to work in AI to shape its future. You can also do that by bringing the discussion into the public and creating awareness of the dangers. This is especially relevant, and may even be more effective than a career in an AI lab, if our only chance for survival is to prevent a misaligned AI, at least until we have solved alignment (see my post on "red lines").

The above statement appears to assume that dangerous transformative AI has already been created,

Not at all. I'm just saying that if any AI with external access would be considered dangerous, then the same AI without access should be considered dangerous as well.

The dynamite analogy was of course not meant to be a model for AI. I just wanted to point out that even an inert mass that in principle any child could play with without coming to harm is still considered dangerous, because under certain circumstances it will be harmful. Dynamite + fire = damage, dynamite w/o fire = still dangerous.

Your third argument seems to prove my point: An AI that seems aligned in the training environment turns out to be misaligned if applied outside of the training distribution. If that can happen, the AI should be considered dangerous, even if within the training distribution it shows no signs of it.

Thank you for sharing this! It takes a lot of courage to talk about one's "failures", because we're constantly bombarded with (often fake or incomplete) success stories. Social media tell us we're not beautiful enough, not smart enough, not rich and successful enough. As a management consultant, I learned to pursue "best practice", to learn from these success stories and apply their principles of success to my own projects. It took me a while to figure out that this is completely bogus and almost never works in real life.

I'm 61 now, and my list of "failures" is far longer than yours: I founded four start-ups, none of which became successful. I wrote three novels, none of which got published. I invented more than a hundred board games, none of which was played outside of the circle of my family and friends (who hate being my play-testers by now). I tried to become a musician, songwriter, and poet, and failed miserably at it. I developed a computer game which I published in one of my start-ups, but despite nice reviews we sold only about 5% of the games we produced (these were the 90s, when computer games came in paper boxes with a CD in them). We got lucky, though - the warehouse of our distributor burned down and their insurance covered a part of the production costs. I launched a YouTube channel, posting a video every week for a year, getting me to 238 followers.

To me, all those failures aren't things I wish I hadn't done. They weren't mistakes. They were tries that didn't work out. But at least I did try, and that is a good thing - far better than just doing nothing because you're afraid of failing. On our deathbed, they say, we mostly regret the things we didn't do, rather than our mistakes. So you should be as proud of your so-called "failures" as I am of mine. And, if you can overcome your depression, you should continue trying. Not because you have to, but because you want to - because it's much nicer doing things you believe in than doing what some manager tells you to do.

By the way, my fourth novel got published when I was 47 and became a German bestseller. Today, I am a full-time writer with more than 50 books published (https://karl-olsberg.jimdo.com/english/). A lot of them are flops. Some are not.

Edit: Shortly after writing this, my publisher informed me that they weren't publishing the 5th book of a children's book series, which I had already finished and they had already paid me for, due to lack of success of the first four books, and because of the paper shortage. Well ... I told them that I'm sorry to hear that and that I'll try to think of a better idea for the next series, which is what I'm going to do.

I wish you all the best for your recovery from depression. Two of my sons have had depression, so I know it's a serious burden, but I also know it can be overcome.
