I follow Crocker's rules.
The last person to have a case of smallpox, Ali Maow Maalin, dedicated years of his life to eradicating polio in the region.
On July 22nd, 2013, he died of malaria while traveling again for that work, after polio had been reintroduced to the region.
On mental health:
Since AI systems will likely have a very different cognitive structure from biological humans, it seems quite unlikely that they will develop mental health issues the way humans do. There are some interesting things that happen to the characters that large language models "role-play" as: they switch from helpful to mischievous when the right situation arises.
I could see a future in which AI systems are emulating the behavior of specific humans, in which case they might exhibit behaviors that are similar to the ones of mentally ill humans.
On addiction problems:
If one takes the concept of addiction seriously, wireheading is a failure mode remarkably similar to it.
I am somewhat more hopeful about society at large deciding how to use AI systems: I have the impression that wealth has accelerated moral progress (since people have more slack for caring about others). This becomes especially stark when I read about very poor people in the past and their behavior towards others.
That said, I'd be happier if we found out how to encode ethical progress in an algorithm and just run that, but I'm not optimistic about our chances of finding such an algorithm (if it exists).
There are several plans for this scenario.
I hope this answers the question somewhat :-)
In my conception, AI alignment is the theory of aligning any stronger cognitive system with any weaker cognitive system, allowing for incoherencies and inconsistencies in the weaker system's actions and preferences.
I very much hope that the solution to AI alignment is not one where we have a theory of how to align AI systems to a specific human—that kind of solution seems fraudulent just on technical grounds (far too specific).
I would make a distinction between alignment theorists and alignment engineers/implementors: the former find a theory of how to align any AI system (or set of systems) with any human (or set of humans); the latter take that theoretical solution and apply it to specific AI systems and specific humans.
Alignment theorists and alignment implementors might be the same people, but the roles are different.
This is similar to many technical problems: you might ask someone trying to find a line through a cloud of x/y points, minimizing the distance to each of those points, "But which dataset are you trying to apply the linear regression to?"; the answer is "any".
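To make the analogy concrete, here is a minimal sketch of that dataset-agnostic fitting procedure (ordinary least squares, written from scratch for illustration; the function name and data are my own invention):

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y ~ slope*x + intercept.
    Note: the procedure is defined for *any* cloud of x/y points,
    just as an alignment theory would apply to any system/human pair."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Works on whichever dataset you hand it:
print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))  # (2.0, 1.0)
```

The theorist's deliverable is `fit_line` itself; the implementor's job is choosing and cleaning the particular dataset it gets applied to.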
There are three levels of answers to this question: What the ideal case would be, what the goal to aim for should be, and what will probably happen.
In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.
Technically, I think that AI safety as a technical discipline has no "say" in who the systems should be aligned with. That's for society at large to decide.
I like this idea :-)
I think that there are some tricky questions about comparing across different forecasters and their predictions. If you simply take the Brier score, this can be Goodharted: people can choose the "easiest" questions and get far better scores than those taking on difficult questions.
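The Goodharting worry can be shown with a toy calculation (a sketch with made-up forecasters and probabilities; the Brier score itself is just the mean squared difference between forecast and outcome):

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.
    Lower is better: 0 is perfect, and a constant 0.5 forecast scores 0.25."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Forecaster A only takes near-certain questions and is right every time.
easy_picker = brier_score([0.99, 0.98, 0.97], [1, 1, 1])

# Forecaster B takes genuinely uncertain questions and is well calibrated.
hard_picker = brier_score([0.7, 0.6, 0.4], [1, 1, 0])

print(easy_picker, hard_picker)  # A scores far better despite an easier job
```

Comparing raw scores across the two thus rewards question selection, not forecasting skill, which is why cross-forecaster comparison needs more than a single aggregate number.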
I can think of some attempts to get at this:
🔭 Looking for good book on Octopus Behavior
Criteria: Scientific (which rules out The Soul of an Octopus), up to date (which mostly rules out Octopus: Physiology and Behaviour of an Advanced Invertebrate).
Why: I've heard claims that octopuses are quite intelligent, with claims going so far as to attribute the transmission of knowledge between individuals. I'd like to know more about how similar and different octopus behavior is from human behavior (perhaps shedding light on the space of possible minds/fragility of value).
🔭 Looking for good book/review on Universal Basic Income
Criteria: Book should be ~completely a literature review and summary of current evidence on universal basic income/unconditional cash transfers. I'm not super interested in any moral arguments. The more it talks about actual studies the better. Can be quite demanding statistically.
Why: People have differing opinions on the feasibility/goodness of universal basic income, and there have been a whole bunch of experiments, but I haven't been able to find a good review of that evidence.
🔭 Looking for a good textbook on Cryobiology
Criteria: The more of these properties the textbook has the better. Fundamentals of Cryobiology looks okay but has no exercises.
Why: I have signed up for cryonics, and would like to understand the debate between cryobiologists and cryonicists better.