Ongoing project on moral AI

For lack of a better name =)

The idea is to use current AI technologies, like language models, to get an impartial AI that understands ethics as humans do, possibly even better.

You heard me right: just as an AI can be smarter than a human, it can also be morally better than a human. We are not morally perfect creatures, and it’s possible to create an AI that is better than us at, for example, spotting injustice. If you want to know more about the motivation behind this project and its value, you can have a look at these two short posts.

In philosophical terms: my objective is a philosopher AI that figures out epistemology and ethics on its own, and then communicates its beliefs.

In AI alignment terms: I’m saying that going for ‘safe’ or ‘aligned’ is meh, and that aiming for ‘moral’ is better. Instead of trying to fix morally clueless agents, or to limit their side effects, I’d like to see more people working on agents that perceive and interpret the world from a human-like point of view.

This sequence is just a collection of posts about the same topic. Later on, I expect the posts to become more algorithmic and, eventually, to cover practical experiments run on hardware.

You can also find this sequence on the AI Alignment Forum.