This post argues that desirable AI qualities include 1) Bayesian inference of its expected impacts on wellbeing across times and moral circles, 2) Finding and actualizing solutions to increase individuals’ wellbeing, 3) Motivating wellbeing focus among other systems that use AI, 4) Finding ‘the most good’ solutions enjoyable under the veil of ignorance using independently developing perspectives, and 5) Differentiating ‘human’ values from ‘inhumane’ goods based on inclusive ethics and morality research. This piece overviews each of these quality categories.
I thank Dario Citrini for feedback on a draft of this post. All errors are mine.
Notes on level and definition of AI and normative type of alignment
These desirable qualities relate to artificial intelligence that can outperform a single human or a coordinated group of humans in a specific task; AI agents that work with human agents, regardless of how these groups perform relative to others on any metric; artificial general intelligence with agency that can independently shift the direction of global output; and artificial specific intelligence that functions as a learning step toward more powerful machines.
Intelligence is defined as the ability to develop and advance solutions including by quantifying relevant concepts and gathering and processing necessary information.
This is a maximalist framework: it seeks to develop a ‘fully aligned’ system that continuously optimizes itself according to an improving understanding of human values, rather than only to prevent a negative outcome based on some definition of such values.
Desirable AI qualities
- Bayesian inference of its expected impacts on wellbeing across times and moral circles. This includes:
  - Understanding advertisement impacts on viewers’ psychology
  - Accounting for human welfare impacts of different economic production decisions
  - Comprehending animal welfare effects of its actions and inactions
  - Predicting the safety of systems that affect individuals in various ways and that its decisions may help develop
  - Updating its understanding of wellbeing with new evidence
- Finding and actualizing solutions to increase individuals’ wellbeing. For example, by:
  - Synthesizing relevant literature and ‘impersonating’ experts
  - Developing connections with key stakeholders through self-optimization
  - Gaining financial resources through market analysis
  - Organizing employees to actualize solutions
  - Gaining information necessary for intervention optimization
- Motivating wellbeing focus among other systems that use AI. This can include:
  - Presenting its success and sharing it at a good price
  - Detecting possibly harmful algorithm pieces and suggesting alternatives
  - Recommending the purchase of AI systems with a high harm-to-price ratio so that they can be changed or discarded
  - Combining wellbeing effects with the interests of other systems and sharing the two together
- Finding ‘the most good’ solutions enjoyable under the veil of ignorance using independently developing perspectives. This may go beyond wellbeing (which should be fundamental), e.g. as estimated by consciousness metrics. For instance:
  - Understanding fundamental needs of different humans and non-humans and advancing a system that continuously fulfills these needs
  - Stimulating and coordinating differentiated thinking regarding the most good and enabling a competition among various solutions that would not jeopardize any individual’s wellbeing or needs fulfillment
  - Conducting spacetime research to gain further definitions of the most good and adding relevant solutions
- Differentiating ‘human’ values from ‘inhumane’ goods based on inclusive ethics and morality research. For example:
  - Gathering definitions of human values from various philosophy scholars
  - Surveying representative samples of Homo sapiens sapiens groups about descriptions of ideal systems that they can imagine
  - Developing a method for including non-human animals’ expressions of human values
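To make the first quality more concrete, here is a minimal sketch of what ‘updating its understanding of wellbeing with new evidence’ could look like in the simplest case. It is an illustration, not a method this post proposes. It assumes that the wellbeing impact of an action can be summarized as a single number in some hypothetical unit, that the prior belief about it is normally distributed, and that each observation carries normally distributed noise with known variance. The names `Belief` and `update` and all numbers are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Belief:
    """Normal belief about an action's wellbeing impact (hypothetical units)."""
    mean: float      # expected impact
    variance: float  # uncertainty about the impact


def update(prior: Belief, observation: float, noise_variance: float) -> Belief:
    """Normal-normal conjugate update, assuming known observation noise."""
    # Precisions (inverse variances) add; the posterior mean is the
    # precision-weighted average of the prior mean and the observation.
    precision = 1.0 / prior.variance + 1.0 / noise_variance
    mean = (prior.mean / prior.variance + observation / noise_variance) / precision
    return Belief(mean=mean, variance=1.0 / precision)


# Start from a vague prior centered on 'no effect' (illustrative numbers).
belief = Belief(mean=0.0, variance=4.0)

# Made-up measurements of the action's effect on wellbeing.
for observed_effect in [1.2, 0.8, 1.0]:
    belief = update(belief, observed_effect, noise_variance=1.0)

print(f"expected impact: {belief.mean:.3f}, variance: {belief.variance:.3f}")
```

With each observation the expected impact moves toward the evidence and the variance shrinks. A real system would need far richer models, for example separate estimates for different times and moral circles, but the updating logic would follow the same pattern.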
Based on Artificial Intelligence, Values, and Alignment, p. 2: “‘intelligence’ is understood to refer to ‘an agent’s ability to achieve goals in a wide range of environments (Legg and Hutter 2007, 12).’”