tl;dr: If your reason for cramming AI knowledge into your brain is stress, then don't do it. You can still be useful, but walk away from the front lines where people are directly responsible.
Disclaimer: 1) This is an obvious problem that has already been noticed and addressed by many LessWrong users; 2) This is not an original solution but rather a specific framing of the problem and some food for thought; 3) I could be gravely mistaken, and your best bet might be putting your all into research after all. However, if that is the case, you might just want to emerge from lurking and actually do something. I wouldn't put all my chips on solely advancing the front lines.
The heroes who will save the world will be alignment researchers: when, dot by dot, the stars will grow dark and a dark dawn will rise, we will all have to buy them a beer. If you are not among this small band of heroes who will guarantee us the best future possible, you may feel an urgent need to promptly join them: I beg of you only to consider whether this is a good idea.
If you think that you should become an alignment researcher in a matter of months,[1] I will not try to stop you. But it's probably worth a few days' worth of cranial compute to establish whether you are exploiting yourself in the best way you could.
I'll set the parameters of the problem. "Becoming an alignment researcher" is a spectrum: the more you learn about alignment, the more capable you are of navigating the front lines of the alignment project. Certainly, understanding the tenets of alignment is a laudable goal; but at what point will you be faced with diminishing returns? If you are not planning on single-handedly solving the alignment problem, are there not better uses of your time?[2]
There are many instrumental goals that serve the terminal goal of completing the alignment project, and they might be more worthy of your time:
- The logistics of the alignment project: educating and hiring all the AI experts out there who could be working directly on the front lines. How many von Neumanns are locked in sub-Saharan Africa who could be shipped to the front lines of alignment in a few years? How many engineers at OpenAI could be hired to work on the alignment problem instead of on the next variation of GPT?
- The economics of the project: financing alignment research, finding an economic incentive for AI companies to stride forward more safely and spend R&D on alignment, and optimizing how many resources in the general economy are spent toward this goal. Money is always useful: what could we use it for exactly? Is open-sourcing the alignment project possible?
- The politics of the project: If you know anybody in politics anywhere, it might be a good idea to try and convince them to pay attention to this AGI thing. If you know anybody who knows anybody in politics anywhere, that's good too. The power of government and its resources already hugely determine what the AI field looks like.
- The PR of the project: More people should probably be taking AGI risk seriously, and it should be prioritized over weaker risks such as climate change,[3] which tend to monopolize all the attention. Misconceptions should be cleared up, the tenets of AI safety should be clearly understood, and the reality of how difficult it is to build AGI and to make it aligned should be well-known. Getting the crowds on our side won't help much directly, because it is alignment researchers, not crowds, who will save the world; but it could increase the number of smart people joining our ranks, and it could help economically and politically as well. This also increases human dignity.
- The mental health aspect of the project: We must keep alignment researchers' minds intact. There are some resources out there addressing this problem, but mental health is extremely specific to each human being, and so having as many "how to be okay" takes as possible is probably a good idea. The current landscape seems to be: "AI progress is careening toward a future in which AI systems are smarter than us, have orthogonal values, and are inscrutable black boxes." The current landscape justly terrifies many of us. Coming up with new truth-based reasons to keep fighting and remain composed is crucial.
- The "you are a human" aspect of the project: This one is slightly different from the last because it is an injunction to take care of your mental health. You are more useful to us when you are not stressed. I won't deny that you are personally responsible for the entire destiny of the universe, because I won't lie to you: but we have no use for a broken tool. Work on the alignment problem only when you're at peace. You are a specifically human general intelligence, which means that your set of determining variables for problem-solving is specifically human. You cannot ignore the parts of your mind that make you human, so spend time on them too.[4]
- The practical project: There are practical things you could be doing right now that might disproportionately increase our odds of survival. Cooking pasta one night for an alignment researcher you know. Fixing their toilet or babysitting their kids. Being friendly to strangers. Picking up trash on the street. Gifting good books to children. I don't know. Making the world a little better will achieve the following: increase our odds of survival by making the front lines more bearable; increase human dignity; diminish distractions; solve part of the "you are a human" problem. Just be a good person. You'll be contributing to the practical project. [5]
- Sit in a room. Your next logical action two minutes from now might not be contributing to the alignment project at all: may I suggest sitting alone in a room? Compose your thoughts! Let your mind roam free. Let it engage in play and take stock. In realities in which the alignment problem is solved, it is most likely thanks to some lone researcher sitting in a room, dancing gracefully between ideas and spotting patterns before inspiration strikes. Boredom can produce miracles, and cramming knowledge often just drives you into burnout. To increase our log odds of survival, just do nothing. You have a lot to think about anyhow.
- The creative solution to the project: One of the many uses of sitting alone in a room is that it is the path to creative solutions. Humans are unimaginably unique, so if you have something you can contribute to the effort that nobody will have thought of before, I urge you to think particularly about that. People much smarter and more knowledgeable than us have thought about this problem, and the meta-problem around it, a lot: your only hope of not straying down some painfully obvious false path is having a unique take on the problem.
If you feel stressed right now and have thus decided to spend your time scrantically[6] reading LessWrong posts about AGI projections and alignment solutions while breathing heavily... just don't. Don't become an alignment scientist today because you are stressed. Don't sacrifice doing what you love, because there's a good chance you can help us by doing what you love. Solve the "you are a human" problem first and then perhaps solve one of the others, so that you are not directly involved on the front lines where responsibility is direct. You are just as responsible for the universe as the rest of us: but you are responsible for results, not effort, and that could mean walking away from the front lines.
Ah, and if you're too stressed: breathe three times using the whole capacity of your lungs, smile at a mirror, then eat some chocolate. I bid you an excellent day. Spend some time looking at flowers or something. Then you can return to heroics.
[1] It's not enough to become an alignment researcher. You must become a useful alignment researcher, which is of course an even harder target to attain.
[2] I'm conflicted because there might be too many people writing and reading on the blog instead of spending hours in solitude attempting to actually find a solution to alignment. I deeply respect people who do the latter, and we need more of them. More on that later.
[3] "Weaker risks" does not mean they should not be addressed: if climate change starts hampering alignment research, for example by slowing down development in poor countries, it should be paid proportional attention. How important minor risks are depends on your AGI forecast model (there are a dozen on LessWrong). But the point is that almost all the existential risk is concentrated in AGI, and so the importance of all other problems should be correlated with their relevance to the alignment problem.
[4] The cool aspect of this problem is that you can't be stressed by it. The other problems are external in nature: but the whole point of this problem is that you must be at peace for it to be solved, meaning that you can't rush your way through it, half-ass it, or have a breakdown while doing it. Take a walk outside or something.
[5] False hope is a dangerous thing and I do not mean to supply that here. If we all recycle our pizza boxes, the world won't be saved. But taking away some distractions and burdens that alignment researchers may be plagued with seems like an excellent use of time. And being a good person is just generally a good thing: i.e. don't drop everything, including your morals, to give your all to the alignment project. There's a lot to say about arrogance of this kind: think of Raskolnikov from Crime and Punishment. AGI is not an excuse for you to forget basic duties.
[6] Rushing around LessWrong, with its abundance of footnotes and references, with the goal of learning something and clarifying the picture for yourself will accomplish nothing but fragmenting your mind and increasing your stress levels. Digest all knowledge.
I'm not so sure these days[1]. The heroes who save the world may well be those who get Microsoft/OpenAI, Google DeepMind and Anthropic to halt their headlong Earth-threatening suicide-race toward AGI. Or those who help create a global moratorium or taboo on AGI that gives us the years (or decades) of breathing space needed for Alignment to be solved. Or those who help craft, enact, and enforce strict global limits on compute and data aimed at preventing AGI-level training runs, and those who adhere to them. Without these, I just don't think there is time for Alignment to be solved. They are now the bottleneck through which the future flows, not Alignment research. (More.)
Also, it might even be the case that Alignment is impossible. In that case, I guess Alignment researchers could be instrumental in providing the theoretical proofs for this, and thus keep the world safe by providing justification for the continuation of an indefinite moratorium on AGI.
Although to be clear, we owe historical alignment researchers a huge debt of gratitude for raising awareness of AI x-risk.
Thank you, Neil. It's highly relevant.