
Introduction

In this post I will describe one possible design for Artificial Wisdom (AW). This post can easily be read as a stand-alone piece; however, it is also part of a series on artificial wisdom. In essence:

Artificial Wisdom refers to artificial intelligence systems which substantially increase wisdom in the world. Wisdom may be defined as "thinking/planning which is good at avoiding large-scale errors," or as “having good terminal goals and sub-goals.” By “strapping” wisdom to AI via AW as AI takes off, we may be able to generate enormous quantities of wisdom which could help us navigate Transformative AI and The Most Important Century wisely.

TL;DR

This AW design involves using advanced forecasting AI to help humans make better decisions. Such a decision forecasting system could help individuals, organizations, and governments achieve their values while maintaining important side constraints and minimizing negative side effects.

An important feature to include in such AW systems is the ability to accurately forecast even minuscule probabilities that an action increases the likelihood of catastrophic risks. The system could refuse to answer and attempt to persuade the user against such actions, and analyses of such queries could be used to better understand the risks humanity faces and to formulate counter-strategies and defensive capabilities.

In addition to helping users select good strategies to achieve values or terminal goals, it is possible such systems could also learn to predict and help users understand what values and terminal goals will be satisfying once achieved.

While such technologies seem likely to be developed, it is questionable whether this is a good thing due to potential dual-use applications, for example use by misaligned AI agents. Therefore, while it is good to use such capabilities wisely if they arise, more research is needed on whether differential technological development of such systems is desirable.

AI Forecasting & Prediction Markets

There has been some buzz lately about the fact that LLMs are now able to perform modestly to moderately well compared to human forecasters. This has even led Metaculus to host an AI bot forecasting benchmark tournament series with $120,000 in prizes.

I think things are just getting started. As LLMs become increasingly good at forecasting, it may soon be possible to automate the work of decision markets, and perhaps even Futarchy.

Forecasting and prediction markets (which use markets to aggregate forecasts) are important because by knowing what is likely to happen in the future, we can more wisely choose our present actions in order to achieve our desired goals. While it is uncertain whether prediction markets could yet help us choose our terminal goals wisely, it seems likely they could help us choose our sub-goals wisely - especially a certain type of prediction market:

Decision Markets & Futarchy

Decision markets and Futarchy were invented by Robin Hanson. In decision markets, there are multiple separate prediction markets, one for each option. The example Robin Hanson always gives is a "fire the CEO" market: market participants predict whether the stock price will go up or down if the CEO is fired, and whether it will go up or down if the CEO is not fired. Depending on which conditional market predicts a higher stock price, a decision can be made about whether or not to fire the CEO.
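To make the mechanism concrete, here is a minimal Python sketch of the decision rule, under hypothetical names and prices (real decision markets also involve settlement details, such as refunding trades on the action not taken, which is only noted in a comment here):

```python
from dataclasses import dataclass

@dataclass
class ConditionalMarket:
    """A prediction market whose trades only settle if its condition occurs."""
    condition: str            # the action, e.g. "fire the CEO"
    predicted_outcome: float  # market price, read as E[stock price | action taken]

def choose_action(markets: list[ConditionalMarket]) -> ConditionalMarket:
    """Pick the action whose conditional market forecasts the best outcome.
    In a real decision market, trades on the actions not taken are refunded,
    so traders are only scored on the branch that actually occurs."""
    return max(markets, key=lambda m: m.predicted_outcome)

markets = [
    ConditionalMarket("fire the CEO", predicted_outcome=102.0),  # hypothetical prices
    ConditionalMarket("keep the CEO", predicted_outcome=98.5),
]
print(choose_action(markets).condition)  # -> fire the CEO
```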

Futarchy, in turn, can roughly be described as a society governed by decision markets. The phrase most associated with Futarchy is "vote on values, bet on beliefs." First, the citizenry votes on values, or elects officials who define various measures of national well-being. Then, the policies most likely to achieve those values or measures of well-being are chosen by betting in decision markets that predict the effects of each policy.

So decision markets and Futarchy can allow people to make better decisions by using the power of forecasting and markets to help us choose the correct sub-goals that will lead us to achieve our terminal goals[1], our values. 

Decision Forecasting AIs

I am not entirely sure why decision markets and Futarchy are not more popular[2], but in any case, one of the largest obstacles could soon be removed once AIs can predict as well as human forecasters. This would largely remove the human labor requirement, making it as easy to generate well-calibrated answers to forecasting questions as asking a question and pressing a button. Such systems would be especially appealing if there are superhuman AI forecasters, or many human-level AI forecasters with diverse strengths and weaknesses, so that a market-like system of AI forecasters could be collectively more reliable than any individual human-level AI forecaster, enabling superhumanly effective decision-making.

If AIs could rapidly and reliably predict which of several courses of action (sub-goals) would fulfill a set of values and achieve certain terminal goals[3] with a high degree of likelihood, and could explain their reasoning (the ability to explain reasoning is one of the requirements of the Metaculus tournament), then humans who utilize such AI forecasting systems could have a huge advantage in achieving their terminal goals reliably and without making mistakes. Hence such decision-forecasting AIs are another path to artificial wisdom.
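As a toy illustration of what such a system might look like, here is a minimal Python sketch that aggregates several stand-in "AI forecasters" and ranks candidate courses of action. The forecaster functions, probabilities, and the median aggregation rule are all assumptions for illustration, not a description of any existing system:

```python
import statistics

def forecast_success(forecasters, action: str, goal: str) -> float:
    """Aggregate several AI forecasters' probabilities that an action achieves
    the goal. The median is a simple, outlier-robust stand-in for a fuller
    market-like aggregation mechanism."""
    return statistics.median(f(action, goal) for f in forecasters)

def rank_actions(forecasters, actions: list[str], goal: str) -> list[tuple[str, float]]:
    """Return candidate actions sorted by forecast probability of success."""
    scored = [(a, forecast_success(forecasters, a, goal)) for a in actions]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Stand-in "forecasters": in practice these would be calls to LLM-based
# forecasting systems that also return their reasoning.
forecasters = [
    lambda action, goal: 0.70 if "pilot" in action else 0.40,
    lambda action, goal: 0.65 if "pilot" in action else 0.50,
    lambda action, goal: 0.80 if "pilot" in action else 0.30,
]
print(rank_actions(forecasters,
                   ["run a small pilot first", "launch immediately"],
                   goal="the product succeeds within a year"))
```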

There would of course still be the need to generate possible courses of action to choose between, though it seems likely AI could eventually help with this as well; in fact, such possibility generation was already mentioned in the previous workflows post, and forecasting AIs could be just one more element in the artificial wisdom workflow system.

Additional Features

It would be very useful if such AIs could learn to automatically generate new predictions that would be useful to us. Perhaps another system could generate a large number of candidate predictions, trying to guess which ones will be useful to users and tractable to predict. Users could then give feedback on which predictions are actually useful, and the system could use this feedback as a training signal to learn to auto-generate progressively more useful predictions, as in the sketch below.
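Here is a minimal sketch of such a feedback loop, with an entirely hypothetical toy "generator" in place of a trained model and a stand-in rule in place of real user ratings:

```python
import random

class QuestionGenerator:
    """Toy generator that learns which question topics users find useful."""
    def __init__(self, topics):
        self.scores = {t: 1.0 for t in topics}  # prior usefulness score per topic

    def propose(self, n: int) -> list[str]:
        # Sample topics in proportion to their learned usefulness scores.
        topics = random.choices(list(self.scores), weights=list(self.scores.values()), k=n)
        return [f"Will progress on {t} exceed expectations this year?" for t in topics]

    def update(self, question: str, rating: float) -> None:
        # Reinforce topics whose questions users rated as useful.
        for topic in self.scores:
            if topic in question:
                self.scores[topic] += rating

generator = QuestionGenerator(["fusion energy", "AI forecasting", "celebrity gossip"])
for _ in range(200):
    for question in generator.propose(n=3):
        rating = 0.0 if "gossip" in question else 1.0  # stand-in for real user feedback
        generator.update(question, rating)

# Useful topics now dominate future proposals; gossip keeps only its prior weight.
print(generator.scores)
```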

Another feature that would seem very wise to include is the ability to accurately forecast even minuscule probabilities of a course of action increasing the likelihood of catastrophic risks. For example, perhaps someone is considering developing a new technology, and the system predicts that the efforts of this specific individual pursuing this project, against the counterfactual, would lead to a 0.1% increase in the chance that humanity ends in an existential catastrophe; furthermore, it estimates that at least 1,000 people are likely to take risks of similar magnitude within the time-frame of vulnerability to such risks.

Perhaps the system could refuse to answer and instead explain the above analysis to the user in a highly persuasive way, perhaps describing a vivid (info-hazard-free) story of how this pursuit could lead to the end of humanity. It could then forward the query and analysis to the creators of the system to inform better understanding and estimates of the risks humanity faces, as well as to help formulate counter-strategies and defensive capabilities.
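A minimal sketch of this kind of guardrail logic, assuming a hypothetical risk-forecasting model and escalation channel (the threshold, function names, and probabilities are illustrative only):

```python
RISK_THRESHOLD = 1e-4  # hypothetical policy: refuse above a 0.01% forecast increase

def handle_query(query, forecast_risk_increase, notify_safety_team, answer):
    """Guardrail sketch: before answering a strategy query, forecast how much
    acting on the advice would raise the probability of catastrophe; refuse
    and escalate to the system's creators if it exceeds a policy threshold."""
    delta_p = forecast_risk_increase(query)
    if delta_p > RISK_THRESHOLD:
        notify_safety_team(query, delta_p)  # feeds risk estimates and counter-strategy work
        return (f"Declining to advise: pursuing this is forecast to raise the "
                f"probability of existential catastrophe by ~{delta_p:.2%}.")
    return answer(query)

# Stand-ins for the real forecasting model and escalation channel:
response = handle_query(
    "How do I develop technology X?",
    forecast_risk_increase=lambda q: 0.001,  # the 0.1% example above
    notify_safety_team=lambda q, p: print(f"escalated: {q} ({p:.2%})"),
    answer=lambda q: "normal advice",
)
print(response)
```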

It is possible such systems could also learn to predict how satisfied we will be once we have achieved the goals or values (including the effects of sub-goals[1]) that we give as our terminal goals and values. Perhaps the artificially wise forecasting system could be designed so that it can give us feedback on our goals and values, explaining why, in certain situations, it predicts we will be unsatisfied or sub-optimally satisfied with the results when we achieve them. It could then give highly intelligent premortem advice on how we might rethink what we are actually aiming for, so that we do not end up regretting getting what we thought we wanted, perhaps suggesting alternative goals we might consider instead and explaining why it predicts these will give better results.

Decision Forecasting AIs at Scale

At scale, people could use such systems to get helpful insights and input to more wisely achieve their goals and values, as could nonprofits, companies, and governments. 

There would of course be serious safety concerns with having such a system run the sub-goals of an entire government, as in Futarchy, and it would be absolutely essential to achieve AI alignment (more below) and to extensively test the system first. But it is encouraging that the mechanism of Futarchy offers a way to separate "voting on values" from "betting on beliefs": these systems could be restricted to improving our ability to forecast and understand the consequences of various decisions, and to helping us determine the best paths to achieve our goals, while we remain in control of the values and terminal goals our decisions are working to achieve. That said, as mentioned above, it could also be nice to have input on which values and terminal goals will actually be most satisfying.

It would be interesting to see what happens when individual humans, nonprofits, or companies use a scaled-down version of Futarchy to make a significant fraction of their decisions. This seems like a good first experiment for interested entities: see what works well and what bugs need to be worked out before moving on to larger experiments.

Again, this is another example of an AW system that is strapped to increases in AI capabilities, giving humans access to increasing power to predict the future and make wise decisions, in parallel with base models becoming more powerful as AI takes off.

Dual-Use Concerns

Of course, such prediction systems could also be used to enhance AI agents. AI agents would then be able to predict the consequences of high-level strategies, making them better at forming plans and achieving the medium-term goals set for them, including anything we explicitly instruct them to adopt as a goal and any side constraints we give them.

This could be good news for enhancing the wisdom and usefulness of AI agents at a certain level of intelligence. However, it seems highly dubious whether such forecasting abilities would have positive consequences as AI agents scale to general intelligence and superintelligence, due to the increasing risk of misalignment with potentially catastrophic failure modes.

Because of this, it seems advisable to be highly cautious when increasing AI forecasting capabilities. And because such technology is dual-use, this topic deserves more research to determine whether it warrants differential technological development or should instead be avoided, advocated against, or developed only under carefully controlled conditions.

That said, if this technology is developed, as it currently appears to be on course to be, it seems highly desirable to adapt it for human use, ensure that humans working toward a safe and positive long-term future can use and benefit from it to the fullest extent safely possible, and advocate that it be developed in as safe a way as possible.

It is encouraging to see the requirement in the Metaculus tournament that forecasting bots be able to explicitly explain the reasoning behind their predictions, increasing interpretability. Yet as bots scale to become much more intelligent, a great deal more probing will be required to make sure such systems are safe and not subtly deceptive or misaligned.

I greatly appreciate feedback, or if you want to learn more about AW, see a full list of posts at the Series on Artificial Wisdom homepage.

  1. ^

    To be clear, terminal goals are not fully separable from sub-goals, but rather, in a sense, include sub-goals. For example, if someone's goal was to live a happy and virtuous life, the sub-goals of this terminal goal would themselves contain a large amount of the value being pursued. Furthermore, as discussed in the introductory piece in the series, it is essential that both terminal goals and sub-goals meet at least minimum acceptability requirements, or better yet are themselves good.

  2. ^

    I believe one reason that decision markets and Futarchy are not more popular is that a large amount of human forecasting interest and participation is required to get such markets off the ground, and at present forecasting is relatively niche. One suggested reason that prediction markets are not more popular is that they are not good savings devices or attractive to gamblers, and so do not attract sufficient capital to be attractive to professional traders without market subsidization; the same article argues that regulation is not the primary obstacle. Of course, in the case of Futarchy there is also the massive obstacle that society as a whole would need to coordinate to transition to government by decision markets, or at least to test the idea on a smaller scale. One reason Robin Hanson mentions is that people in companies who could use such markets don't actually want to know the truth due to office politics; for example, higher-ups possess decision-making power which such markets might take away from them.[4] If the norm were in place, it would be a net benefit to everyone, but since it is not in place, it feels threatening. I believe another reason could be that the vast majority of the population has not heard of these markets, and most who have heard of them are uncertain about them and haven't heard of them being successful in practice, since they haven't been put into practice (the "Matthew Effect").

  3. ^

    Terminal goals which, as discussed in the first post, include both what you want and what you don't want, hence minimizing negative side effects.

  4. ^

    There are some closely related business practices which achieve meritocratic decision-making through other mechanisms, such as Ray Dalio's "believability-weighted decision making".
