You could have an AI with some meta-cognition, able to figure out what's good and maximize it in the same way EAs try to figure out what's good and maximize it with parts of their lives.
I'm not sure how that would work, but we don't need to discuss it further. I'm no expert.
I don't think it's a good method, and I think you should target a much more specific audience, but yes, I know what you mean.
What exactly do you think is "not good" about a public discussion of AI risks?
The superintelligence is misaligned with our own objectives but is benign
I don't see how this is possible. There is no such thing as "a little misalignment". Keep in mind that creating an unstoppable and uncontrollable AI is a one-shot event that can't be undone and will have extremely wide and long-term effects on everything. If this AI is misaligned even very slightly, the differences between its goals and humanity's will accumulate and grow over time. It's similar to launching a rocket without any steering mechanism with the aim of landing it on Jupiter's moon Europa: You have to set every parameter exactly right or the rocket will miss the target by far. Even the slightest deviation, e.g. an unaccounted-for asteroid passing close to the rocket and altering its course very slightly through its gravity, will completely ruin the mission.
On the other hand, if we manage to build an AGI that is "docile" and "corrigible" (which I doubt very much we can do), this would be similar to having a rocket that can be steered from afar: In this case, I would say it is fully aligned, even if corrections are necessary once in a while.
Should we end up with both a misaligned and an aligned AGI, or more of them, it is very likely that the worst AGI (from humanity's perspective) will win the battle for world supremacy, so this is more or less the same as just having one misaligned AGI.
My personal view on your subject is that you don't have to work in AI to shape its future. You can also do that by bringing the discussion to the public and creating awareness of the dangers. This is especially relevant, and may even be more effective than a career in an AI lab, if our only chance for survival is to prevent a misaligned AI, at least until we have solved alignment (see my post on "red lines").
The above statement appears to assume that dangerous transformative AI has already been created,
Not at all. I'm just saying that if an AI with external access is considered dangerous, then the same AI without access should be considered dangerous as well.
The dynamite analogy was of course not meant to be a model for AI. I just wanted to point out that even an inert mass that in principle any child could play with without coming to harm is still considered dangerous, because under certain circumstances it will be harmful. Dynamite + fire = damage, dynamite without fire = still dangerous.
Your third argument seems to prove my point: An AI that seems aligned in the training environment turns out to be misaligned when applied outside of the training distribution. If that can happen, the AI should be considered dangerous, even if it shows no signs of misalignment within the training distribution.
Thank you for sharing this! It takes a lot of courage to talk about one's "failures", because we're constantly bombarded with (often fake or incomplete) success stories. Social media tell us we're not beautiful enough, not smart enough, not rich and successful enough. As a management consultant, I learned to pursue "best practice", to learn from these success stories and apply their principles of success to my own projects. It took me a while to figure out that this is completely bogus and almost never works in real life.
I'm 61 now, and my list of "failures" is far longer than yours: I founded four start-ups, none of which became successful. I wrote three novels, none of which got published. I invented more than a hundred board games, none of which was played outside the circle of my family and friends (who hate being my play-testers by now). I tried to become a musician, songwriter, and poet, and failed miserably at it. I developed a computer game which I published through one of my start-ups, but despite nice reviews we sold only about 5% of the games we produced (this was the 90s, when computer games came in cardboard boxes with a CD inside). We got lucky, though - our distributor's warehouse burned down and their insurance covered part of the production costs. I launched a YouTube channel, posting a video every week for a year, which got me to 238 followers.
To me, all those failures aren't things I wish I hadn't done. They weren't mistakes. They were tries that didn't work out. But at least I did try, and that is a good thing - far better than just doing nothing because you're afraid of failing. On our deathbed, they say, we mostly regret the things we didn't do, rather than our mistakes. So you should be as proud of your so-called "failures" as I am of mine. And, if you can overcome your depression, you should continue trying. Not because you have to, but because you want to - because it's much nicer doing things you believe in than doing what some manager tells you to do.
By the way, my fourth novel got published when I was 47 and became a German bestseller. Today, I am a full-time writer with more than 50 books published (https://karl-olsberg.jimdo.com/english/). A lot of them are flops. Some are not.
Edit: Shortly after writing this, my publisher informed me that they weren't publishing the 5th book of a children's book series, which I had already finished and they had already paid me for, due to lack of success of the first four books, and because of the paper shortage. Well ... I told them that I'm sorry to hear that and that I'll try to think of a better idea for the next series, which is what I'm going to do.
I wish you all the best for your recovery from depression. Two of my sons have had depression, so I know it's a serious burden, but I also know it can be overcome.
I like "Zukunftsschutz" ("future protection"), suggested by @Moritz K. Hagemann, because it is positive. "Langfristperspektive" ("long-term perspective") would be somewhat more neutral, but it is already used in other contexts.