
Introduction

In this post I will describe two possible designs for Artificial Wisdom (AW). This post can easily be read as a stand-alone piece; however, it is also part of a series on artificial wisdom. In essence:

Artificial Wisdom refers to artificial intelligence systems which substantially increase wisdom in the world. Wisdom may be defined as "thinking/planning which is good at avoiding large-scale errors," or as “having good terminal goals and sub-goals.” By “strapping” wisdom to AI via AW as AI takes off, we may be able to generate enormous quantities of wisdom which could help us navigate Transformative AI and The Most Important Century wisely.

TL;DR

Artificially wise coaches that improve human wisdom seem like another promising path to AW. Such coaches could have negligible costs, be scalable & personalized, and soon perform at a superhuman level. Wise coaching for certain critical people could be decisive in helping humanity navigate transformative AI wisely.

One path to AW coaches is to create a decentralized system like a wiki or GitHub for wisdom-enhancing use-cases. Users could build up a database of instructions for LLMs to act as AW coaches that help users make difficult decisions, navigate difficult life and epistemic dilemmas, work through values conflicts, achieve career goals, improve relational/mental/physical/emotional well-being, and increase fulfillment/happiness.

One especially wise use-case could be a premortem/postmortem bot that helps people, organizations, and governments to avoid large-scale errors.

Another path to creating an AW coach is to build a new system trained on biographical data, which analyzes and learns to predict which decision-making processes and strategies of humans with various traits in various environments are most effective for achieving certain goals.

Artificial Wisdom Coaches

There are several possible paths for developing AW coaches. After introducing the basic idea, I will briefly outline two of them.

The essential idea is that an AW coach can learn from the best human coaches, therapists, teachers, and other wisdom-relevant data and training mechanisms, but can study orders of magnitude more data, which it never forgets, and can be fine-tuned on a large amount of human outcome data directly tied to the results of its coaching.

Artificially wise coaches will eventually perform much better than human coaches, and will be far more scalable with negligible costs. They will therefore be able to help humans perform at a higher level and not fall as far behind AI, since as AI gets better, humans will also be getting better AW coaches (again, this is the idea of “strapping” human wisdom to AI as AI takes off).

GitWise 

The first way of creating an AW coach is via a decentralized system in which numerous people interested in this concept contribute to gradually building up a database of ways to use LLMs to improve human functioning and increase human wisdom.

This could be something like a wiki, a forum, or a GitHub for wise AI use; we might call it “GitWise.”

This database could include many kinds of content, such as:

  • What background information to share and how to share it
  • Specific prompts
  • Prompt workflows
  • Highly effective use cases
  • Plug-and-play agentic coaches
  • Tips for effectively interacting with AW coaches
  • Etc.

Using the first category above as an example, there might be various processes for listing out all of the personal background information an AW coach needs to help you specifically, whether in a specific use-case or across all use-cases. There could be instructions and examples for listing out your most important life goals, your values, your daily routine, important parts of your life history, the traits you want in a coach, your preferred learning styles, what motivates and demotivates you most, and various details about your occupation, hobbies, social life, etc. There could also be guidance on how exactly to present all of this to an AW coach, whether as a PDF in the context window, as part of its memory, etc.
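For concreteness, here is a minimal sketch (in Python) of what such a background profile might look like and how it could be handed to an LLM-based coach as a system prompt. All field names, the example content, and the `build_system_prompt` helper are hypothetical illustrations, not a prescribed GitWise format.

```python
# A minimal, hypothetical sketch of a user background profile for an AW coach.
# Field names are illustrative only; GitWise contributors would converge on
# whatever structure works best in practice.

import json

user_profile = {
    "life_goals": ["write a novel", "stay close to family", "donate 10% of income"],
    "values": ["honesty", "curiosity", "kindness"],
    "daily_routine": "early riser, deep work 8-12, exercise at 17:00",
    "preferred_coaching_style": "direct but encouraging, Socratic questions",
    "motivators": ["visible progress", "accountability check-ins"],
    "demotivators": ["vague advice", "long lectures"],
    "occupation": "high school teacher",
}

def build_system_prompt(profile: dict) -> str:
    """Turn the structured profile into a system prompt for a coaching session."""
    return (
        "You are a wise, supportive coach. Use the following background about "
        "the user to personalize your advice:\n"
        + json.dumps(profile, indent=2)
    )

# The resulting string could be passed as the system message (or stored via a
# model's long-term memory feature, where available) before coaching begins.
print(build_system_prompt(user_profile)[:200])
```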

Each set of instructions/prompts/workflows/etc. would be rated by users, perhaps across the most relevant dimensions such as effectiveness and user-friendliness. These ratings would also contribute to a contributor’s reputation, so that users can easily find the best instructions and contributors can build reputations. A more advanced version could track more detailed metrics of how effectively each contribution actually improves users’ decisions and outcomes.
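As a rough illustration of the rating side, here is a hypothetical sketch of a minimal GitWise data model: each contribution collects per-dimension ratings, and a contributor's reputation aggregates the ratings of their contributions. The dimension names and aggregation rule are assumptions for the sake of the example.

```python
# A hypothetical sketch of the GitWise rating data model: each contribution
# (prompt, workflow, agentic coach, etc.) collects per-dimension ratings, and
# a contributor's reputation is the average rating across their contributions.

from dataclasses import dataclass, field
from statistics import mean

@dataclass
class Contribution:
    title: str
    author: str
    ratings: dict[str, list[int]] = field(default_factory=dict)  # dimension -> scores

    def add_rating(self, dimension: str, score: int) -> None:
        self.ratings.setdefault(dimension, []).append(score)

    def average(self) -> float:
        all_scores = [s for scores in self.ratings.values() for s in scores]
        return mean(all_scores) if all_scores else 0.0

def reputation(author: str, contributions: list[Contribution]) -> float:
    own = [c.average() for c in contributions if c.author == author and c.ratings]
    return mean(own) if own else 0.0

# Example usage
c = Contribution("Career-decision premortem prompt", "alice")
c.add_rating("effectiveness", 5)
c.add_rating("user_friendliness", 4)
print(c.average(), reputation("alice", [c]))
```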

AW coach instructions could be organized across many categories; for example, instructions could help users make difficult decisions, navigate difficult life and epistemic dilemmas, work through values conflicts, achieve career goals, improve relational/mental/physical/emotional well-being, increase fulfillment/happiness, etc.

As the base model, and hence the AW coach, gets increasingly intelligent and eventually superhumanly intelligent, it will be able to help humans perform at increasingly high levels. As discussed in the first piece, high-performing humans who make fewer large-scale mistakes could be incredibly important in certain existential security/longtermist domains, and effective, happy humans seem good to have in general.

In addition to individual humans, this AW design could also be adapted to apply to teams, for-profit and non-profit organizations, and governments.

Premortem/Postmortem Bot

I was originally going to include this as a separate AW design, but realized I don't have enough technical knowledge to fully flesh it out as its own project, so I'm including it as a sub-idea within the GitWise coach.

The idea is that an LLM could help perform premortems on important projects. It could help think through all of the ways that a project could go wrong at each step, and pre-plan how to avoid or deal with the most likely obstacles and most serious risks.

It could also help perform postmortems on projects that have failed, in which projects are analyzed to see what went wrong, what could have been done differently, the various ways failure modes could have been systematically avoided or resolved, and how to perform better in the future, including redesigning processes to broadly avoid similar classes of problems as effectively and efficiently as possible.

While such a bot could easily be created by giving an LLM instructions on how to help perform a premortem/postmortem, and A/B testing until optimal instructions are found, it would likely be more effective to custom pre-train or fine-tune an LLM to be especially effective at helping perform premortems and postmortems. 
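To make the simpler prompt-based route concrete, here is a minimal sketch of what A/B testing premortem instructions might look like. The two candidate prompts, the 1-5 rating scale, and the helper functions are hypothetical illustrations; in practice, ratings would come from real users of each variant.

```python
# A minimal sketch of prompt-level A/B testing for a premortem bot: two
# candidate instruction sets are assigned to sessions at random, users rate
# how useful each premortem was, and the better-rated variant wins.

import random
from statistics import mean

PREMORTEM_VARIANTS = {
    "A": "Imagine this project has failed one year from now. List the ten most "
         "likely causes of failure, then propose a mitigation for each.",
    "B": "Walk through the project plan step by step. At each step, name what "
         "could go wrong, how likely it is, and how to prevent or recover from it.",
}

ratings: dict[str, list[int]] = {"A": [], "B": []}

def assign_variant() -> str:
    """Randomly assign a coaching session to one instruction variant."""
    return random.choice(list(PREMORTEM_VARIANTS))

def record_rating(variant: str, score: int) -> None:
    """Store a user's 1-5 usefulness rating for the premortem they received."""
    ratings[variant].append(score)

def best_variant() -> str:
    scored = {v: mean(r) for v, r in ratings.items() if r}
    return max(scored, key=scored.get) if scored else "no data yet"

# Example: after many rated sessions, pick the instructions to ship as default.
record_rating(assign_variant(), 4)
print(best_variant())
```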

Perhaps one way to train such a model is to give it historical accounts of goal-directed events (perhaps curated by LLMs), such as individuals’ biographies or stories/public data of companies/non-profits/governments. The model would start with the goal of the entity and try to do a premortem, predicting what might go wrong and how the entity could try to prevent or address each issue. It would then be updated by gradient descent toward what actually went wrong and how it was resolved, if it was successfully resolved (unless the LLM’s own solutions are rated as more effective/efficient).
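As a rough sketch of what a single training example in this scheme might look like, here is one hypothetical fine-tuning record derived from a (made-up) historical account. The field names and example content are assumptions; a real pipeline would use LLMs to extract and curate these fields from biographies and organizational histories at scale.

```python
# A hypothetical sketch of one supervised fine-tuning record for the
# premortem/postmortem bot. The example content is illustrative only.

import json

record = {
    # Input: what the model sees before the outcome is known.
    "prompt": (
        "Entity: a small nonprofit distributing water filters.\n"
        "Goal: reach 50,000 households within two years.\n"
        "Plan: partner with local clinics for distribution.\n"
        "Task: perform a premortem. What might go wrong, and how could it be "
        "prevented or addressed?"
    ),
    # Target: what actually went wrong and how it was (or was not) resolved.
    "completion": (
        "Clinic staff lacked time to explain filter maintenance, so many filters "
        "fell into disuse. The nonprofit resolved this by hiring dedicated "
        "community educators in the second year."
    ),
}

# Records like this could be accumulated in a JSONL file for fine-tuning.
with open("premortem_train.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```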

Premortems and postmortems seem especially useful in avoiding large-scale errors. They allow an entity to do a detailed analysis of everything that could go wrong with their plans, and to effectively learn from what went wrong in previous plans and how they can systematically avoid similar errors in the future. An LLM that has been custom trained to do premortems and postmortems extremely effectively might be able to see many ways that plans could go wrong that most people would miss, using its extensive database of examples and highly developed reasoning regarding premortems/postmortems.

AlphaWise

The second way of training an AW coach is one for which I admit I lack the technical knowledge to know whether it is actually feasible with current technology. Even if this idea is ahead of its time, I think it may still be good to have an abundance of such ideas shovel-ready, since timelines are shrinking rapidly with the progress of AI, and even some advanced ideas may be possible quite soon.

The idea is that you could use biographical knowledge of numerous individuals to create a game board on which these individuals, who possess various traits (including epistemic and decision-making processes), make various decisions within various environments, which in turn increase and decrease various traits and lead to various outcomes.

LLMs could be used to analyze biographies and create a set of scores at various checkpoints in historical people’s lives, estimating how much of each type of trait an individual possesses across time; for example, social support, intellectual ideas, character traits, values, habits, education, social skills, finances and access to various types of resources, cultural access and knowledge, cultural competence of various types, aspects of artistic/scientific/political/etc. pursuits, and so on (good candidate traits could be drawn from psychological and other social science research).

All of these, along with the individual's environment, would be quantified and mapped, as though the person were playing their life on a many-dimensional game board. For example, each of these traits could be scored on a scale of 1-100 (perhaps some traits could be estimated or ignored for some individuals if there was little information), and an updated score for each trait could be given each year (or as often as possible and useful), with the various decisions and decision processes people use that increase or decrease these scores modeled and mapped.
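A minimal sketch of what this "game board" state might look like as a data structure is below. The trait names, the yearly granularity, and the 1-100 scale follow the description above, but every concrete value is an invented placeholder; real scores would be estimated by LLMs from biographical sources.

```python
# A minimal sketch of the "game board" state: per-person, per-year trait scores
# on a 1-100 scale, plus the decisions taken that year. All names and numbers
# are illustrative placeholders.

from dataclasses import dataclass

@dataclass
class YearState:
    year: int
    traits: dict[str, int]          # e.g. {"social_support": 62, "finances": 40}
    decisions: list[str]            # decisions/decision processes used that year
    environment: dict[str, int]     # e.g. {"economic_opportunity": 55}

@dataclass
class LifeTrajectory:
    person: str
    states: list[YearState]
    outcomes: dict[str, int]        # e.g. {"happiness": 70, "societal_benefit": 55}

example = LifeTrajectory(
    person="anonymized_person_001",
    states=[
        YearState(1910, {"social_support": 62, "finances": 40},
                  ["moved cities for work"], {"economic_opportunity": 55}),
        YearState(1911, {"social_support": 48, "finances": 58},
                  ["joined a professional association"], {"economic_opportunity": 60}),
    ],
    outcomes={"happiness": 70, "societal_benefit": 55},
)
```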

The outcomes of the person's life could also be mapped and scored across important values, for example how happy or successful they were, how much they contributed to/detracted from the happiness/success of others around them, how much they benefited or harmed society in various ways, etc.

If enough data could be gathered, and the data could be cleaned up and made legible and reliable enough, the same gradient-descent deep learning techniques used to train chess and Go AI systems, including the policy network, value network, and tree search (and perhaps more relevantly, techniques used in RPGs and strategy games, though I haven't personally studied these much), could be used to model what types of policies are good to use in various types of environments for individuals with various traits pursuing specific life goals.
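To gesture at what the policy/value part could look like, here is a highly simplified sketch loosely analogous to AlphaZero's two heads: the "state" is a fixed-length vector of trait and environment scores, the "policy" is a distribution over a discrete catalogue of decision types, and the "value" predicts the eventual outcome score. This assumes PyTorch, a fixed decision catalogue, and outcomes normalized to [-1, 1]; it omits tree search, which would additionally require a learned model of how decisions change the state.

```python
# A highly simplified, hypothetical policy/value network for AlphaWise.
# Dimensions and the action catalogue are placeholders.

import torch
import torch.nn as nn

NUM_TRAITS = 64        # trait + environment scores, normalized to [0, 1]
NUM_DECISIONS = 200    # discrete catalogue of decision types

class PolicyValueNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(NUM_TRAITS, 256), nn.ReLU(),
                                   nn.Linear(256, 256), nn.ReLU())
        self.policy_head = nn.Linear(256, NUM_DECISIONS)  # which decision to take
        self.value_head = nn.Linear(256, 1)               # predicted outcome score

    def forward(self, state):
        h = self.trunk(state)
        return self.policy_head(h), torch.tanh(self.value_head(h))

net = PolicyValueNet()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

def train_step(states, decisions_taken, outcomes):
    """One supervised step on (state, decision taken, eventual outcome) triples
    extracted from biographical trajectories; outcomes scaled to [-1, 1]."""
    logits, values = net(states)
    policy_loss = nn.functional.cross_entropy(logits, decisions_taken)
    value_loss = nn.functional.mse_loss(values.squeeze(-1), outcomes)
    loss = policy_loss + value_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```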

It is even possible that an AI could learn from self-play. A simulated game-world could be created in which many individuals are generated with randomized characteristics and then play with/against each other, trying to maximize various goals. The AI could try various strategies to maximize outcomes for certain individuals; for example, it could “coach” individuals in the environment to see if it can help them achieve their goals, and learn what type of coaching is most effective for improving specified outcomes for certain individuals.
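A toy sketch of the evaluation loop for such coaching self-play is below: simulated individuals with randomized traits pursue a goal, a candidate coaching policy suggests decisions, and the coach is scored by how much it improves average outcomes over an uncoached baseline. Everything here, including the decision options and "life dynamics," is a deliberately crude stand-in for a far richer simulated environment.

```python
# A toy sketch of evaluating a coaching policy in a simulated population.

import random

def simulate_life(traits, coach=None, years=10):
    """Run one simulated life; return its final outcome score."""
    score = traits["starting_score"]
    for _ in range(years):
        options = ["save", "invest_in_skills", "rest"]
        choice = coach(traits, options) if coach else random.choice(options)
        # Toy dynamics: skill investment pays off in proportion to diligence.
        if choice == "invest_in_skills":
            score += traits["diligence"] * 0.1
        elif choice == "rest":
            score += 0.5
        else:
            score += 0.2
    return score

def simple_coach(traits, options):
    # A candidate coaching policy to be evaluated (and improved) over many runs.
    return "invest_in_skills" if traits["diligence"] > 5 else "rest"

population = [{"starting_score": 10, "diligence": random.uniform(0, 10)}
              for _ in range(1000)]
uncoached = sum(simulate_life(t) for t in population) / len(population)
coached = sum(simulate_life(t, simple_coach) for t in population) / len(population)
print(f"average outcome uncoached={uncoached:.1f}, coached={coached:.1f}")
```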

It seems to me the biggest difficulty here is probably gathering enough usable data, though overall the project seems highly ambitious.

The same concept could be applied to companies, nonprofits, or governments, using historical accounts and records, but also using present-day public data (such as economic data or public government records). Thus, the AW coach could also learn to give advice to these entities in order to achieve certain collective outcomes, creating greater wisdom across all levels and sectors of society.

If there were some other way to bootstrap such an AW coaching system, one that didn't require so much upfront data and intensive pre-training, then once in use it could continuously collect data from users and over a few years or decades build up enough data to give increasingly helpful & wise advice.

Such a system would allow individual humans, companies, nonprofits, and governments to deliberately choose the outcomes they achieve and wisely select between various paths (subgoals) to achieve those outcomes. It would give people greater control to make wise decisions that don't accidentally sacrifice some type of value that is important to them. Each of these entities would be able to get immediate feedback when they are falling into certain predictable traps, and would have access to wise, contextually sensitive advice on the best ways of thinking about various parameters to keep them happy, healthy, functioning, and moving toward positive outcomes across important values with a high degree of certainty.

While I suspect AlphaWise may be a bit of a moon-shot at present, perhaps it could nonetheless inspire a simpler version that is more tractable, and perhaps one day soon something like this will be feasible.

The next post in this series will explore a design for artificial wisdom à la decision forecasting and Futarchy. Or return to the Series on Artificial Wisdom homepage.
