
Summary

  • Wisdom is best conceived of as more intuitive than carefully reasoned; framing it this way helps distinguish ‘wisdom’ from ‘rationality’ or ‘good thinking’.
  • The contents of intuitions, including wise intuitions, are easy to communicate, but their justifications are not, so they largely have to be taken ‘on faith’.
  • It is very important that early AGIs are transparent and that their justifications for any actions they propose can be checked by a human.
  • Therefore, we should prefer to train smart, careful-reasoning AIs rather than inscrutable wisdom-nugget dispensing AIs.
  • Arguably I am unreasonably shifting the goalposts of the essay competition. The more positive framing is that I am “noticing that an old ontology was baking in some problematic assumptions about what was going on”[1] and therefore I am actually being wise!

Wisdom as Intuition

I am sceptical of the given definition of wisdom:

Wisdom (1) - thinking/planning which is good at avoiding large-scale errors.

To me, this connotes logistics experts and military generals and central planners: people who carefully and methodically reason through a problem to succeed at their chosen objective. Instead, I conceptualise wisdom as more about having an insightful intuitive appraisal of a situation, without necessarily doing lots of careful ‘thinking/planning’. So I propose this alternate definition:

Wisdom (2) - the ability to intuitively form good views about important topics.

I particularly want to foreground the ‘intuitive’ part: if someone gives me a 20-premise argument flowchart with credences and Bayesian updates, this (if done well) can be a sign of careful thinking and of being smart, but it isn’t what I want to call ‘wise’. This is related to Kahneman’s classic two-systems model of cognition: I think wisdom is best thought of as mainly a system 1 process where, thanks to someone’s vast knowledge and experience of related situations, they can encounter a new case and quickly come to a view.[2]

Consider AlphaGo. My (amateur) understanding is that there is a ‘policy network’ trained on vast numbers of games that develops an ‘intuition’ for which moves are strong. This is then coupled with a Monte Carlo tree search exploring many possible game continuations. Without the policy network, the tree search alone would be computationally intractable due to the combinatorial explosion of possible games.[3] I claim that ‘wisdom’ is analogous to the policy network, which ~instantly perceives the gestalt of the position and guesses what the top moves would be. This is in contrast to the careful, plodding, systematic reasoning of the tree search, which is an important component of making good decisions, but is more akin to being ‘smart’ or ‘systematic’ than ‘wise’, I think.
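To make the analogy concrete, here is a minimal toy sketch in Python of the division of labour I have in mind. Everything in it is made up for illustration (the function names, the one-ply ‘search’, the pseudo-random stand-ins for the networks); it is not AlphaGo’s actual architecture or code. The point is only that an ‘intuitive’ prior, which does no lookahead at all, prunes the candidate moves so that the same evaluation budget can be spent far more deeply on the few branches that ‘feel’ promising:

```python
import random

# Toy illustration (not AlphaGo's real architecture): an 'intuitive' policy
# prior narrows which moves a brute-force search bothers to evaluate.

def policy_prior(position, legal_moves):
    """Stand-in for a trained policy network: instantly scores moves 'by feel'.
    Here it is just a fixed pseudo-random preference per (position, move)."""
    scores = {m: random.Random(f"{position}|{m}|policy").random() for m in legal_moves}
    total = sum(scores.values())
    return {m: s / total for m, s in scores.items()}  # move -> prior probability

def evaluate(position, move, sample):
    """Stand-in for a rollout / value estimate of the position after a move."""
    return random.Random(f"{position}|{move}|{sample}|value").random()

def search(position, legal_moves, budget, top_k=None):
    """One-ply toy search. With top_k set, the policy prior prunes the
    candidates to its k most 'intuitive' moves before the budget is spent."""
    priors = policy_prior(position, legal_moves)
    candidates = sorted(legal_moves, key=lambda m: priors[m], reverse=True)
    if top_k is not None:
        candidates = candidates[:top_k]  # intuition prunes the tree
    per_move = max(1, budget // len(candidates))  # evaluations per candidate
    scores = {m: sum(evaluate(position, m, i) for i in range(per_move)) / per_move
              for m in candidates}
    return max(scores, key=scores.get), per_move

moves = [f"move_{i}" for i in range(50)]
print(search("some_position", moves, budget=200))           # 4 evaluations per candidate
print(search("some_position", moves, budget=200, top_k=5))  # 40 evaluations per candidate
```

The prior itself never looks ahead; it just says which few branches seem worth deliberating over. That instant, unexplained narrowing-down is the role I am claiming corresponds to wisdom, while the explicit evaluation of continuations corresponds to being smart or systematic.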

Why do these definitional quibbles matter? I think the given definition of wisdom is unhelpfully close to simply ‘good thinking’ or ‘rationality’ (surely ‘avoiding large-scale errors’ is key to reasoning effectively). So to have a conceptual analysis of (artificial) wisdom, I think it is helpful to try to differentiate what is special about wisdom as opposed to intelligence or reasoning ability, and I think ‘intuition’ is the best candidate for this.

Intuition resists Communication

It is easy to communicate the contents of an intuition, but not the justifications for it. AlphaGo’s policy network can write out which moves it thinks are best, but cannot explain why it thinks this.[4] For humans too, it is far easier to convey the fact of an intuition (e.g. ‘person X seems suspicious’) than to explain, or even know ourselves, why our brains formed these views.

We may be able to generate reasons for views we hold intuitively, but often this will be post hoc justification and not the real reasons we formed that view intuitively. Insofar as we do engage in effortful system 2 reasoning about a topic and convey our reasoning to others, this is then no longer communicating a ‘wise’ intuition, but rather formulating a ‘smart’ argument.

AIs should communicate their thought processes accurately

This then leads to my central concern. We may end up in an early-AGI world where AIs are ‘wiser’ than humans - they form better intuitive views on important new topics - but the AIs are not so superhuman that we want to blindly trust their inscrutable judgements. Worries about alignment also mean that even if AIs were vastly superhuman, we may still not want to trust their black-box judgements. Instead, I think our goal should be to develop ‘savant’ AIs that are superhuman at reasoning and have been trained in such a way as to always truthfully convey their internal representations using human-understandable language. We humans could then still form our overall moral judgements, hopefully wise ones, with the reasoning support of superintelligent AIs.

To use a human analogy, I think we should prefer to build a

Peter Singer AI - good at taking commonly shared moral intuitions and systematically working out their implications for right action, with careful, transparent, and checkable reasoning

instead of

Dalai Lama AI - full of wise insights, but with a reasoning process behind those insights that is difficult to follow and verify (insofar as there even is a systematic process)

Eventually, once we are very confident in the trustworthiness and robustness of our AIs, we may want to transition to a ‘sovereign’ AI which is all-knowing and all-wise and makes all important decisions without being constrained by needing to explain its intuitions and reasons to humans. But I think this should not be an early goal. Initially, we want AI systems that are smart but not wise, in that they are good at careful system 2 reasoning which we can comb over and check (or have narrow AIs check for us), while we continue to rely on our own moral intuitions.

So maybe let's not make 'wise' AIs (yet)?

  1. ^

     This quote is from the definition of wisdom in the Essay competition on the Automation of Wisdom and Philosophy, to which I am submitting this piece.

  2. ^

     It has been a while since I read Thinking, Fast and Slow, but I recall there being examples about firefighters knowing by intuition how dangerous a building was, or doctors knowing when a patient was in danger without knowing why they knew, or chess masters struggling to explain why they picked some moves.

  3. ^

     The value network - which estimates a win probability as a function of a given position - is also needed.

  4. ^

     It would be interesting to couple AlphaGo with a language model to allow it to express opinions about moves in words - my guess is that it would be very difficult to get it to provide useful reasoning about why it recommends the moves it does, beyond just post hoc justification.
