Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes

Owen Cotton-Barratt; AI Impacts

This is a linkpost for https://blog.aiimpacts.org/p/essay-competition-on-the-automation

With AI Impacts, we’re pleased to announce an essay competition on the automation of wisdom and philosophy. Submissions are due by July 14th. The first prize is $10,000, and there is a total of $25,000 in prizes available.

Submit an entry

The full announcement text is reproduced here:

Background

AI is likely to automate more and more categories of thinking with time.

By default, the direction the world goes in will be a result of the choices people make, and these choices will be informed by the best thinking available to them. People systematically make better, wiser choices when they understand more about issues, and when they are advised by deep and wise thinking.

Advanced AI will reshape the world, and create many new situations with potentially high-stakes decisions for people to make. To what degree people will understand these situations well enough to make wise choices remains to be seen. To some extent this will depend on how much good human thinking is devoted to these questions; but at some point it will probably depend crucially on how advanced, reliable, and widespread the automation of high-quality thinking about novel situations is.

We believe^[1] that this area could be a crucial target for differential technological development, but is at present poorly understood and receives little attention. This competition aims to encourage and to highlight good thinking on the topics of what would be needed for such automation, and how it might (or might not) arise in the world.

For more information about what we have in mind, see some of the suggested essay prompts or the FAQ below.

Scope

To enter, please submit a link to a piece of writing, not published before 2024. This may be published or unpublished, although if it is selected for a prize we will require publication (at least in pre-print form; optionally on the AI Impacts website) in order to pay out the prize.

There are no constraints on the format — we will accept essays, blog posts, papers^[2], websites, or other written artefacts^[3] of any length. However, we primarily have in mind essays of 500–5,000 words. AI assistance is welcome but its nature and extent should be disclosed. As part of your submission you will be asked to provide a summary of 100–200 words.

Your writing should aim to make progress on a question related to the automation of wisdom and philosophy. A non-exhaustive set of questions of interest, in four broad categories:

Automation of wisdom

What is the nature of the sort of good thinking we want to be able to automate? How can we distinguish the type of thinking it’s important to automate well and early from types of thinking where that’s less important?
What are the key features or components of this good thinking?
- How do we come to recognise new ones?
What are traps in thinking that is smart but not wise?
- How can this be identified in automatable ways?
How could we build metrics for any of these things?

Automation of philosophy

What types of philosophy are language models well-equipped to produce, and what do they struggle with?
What would it look like to develop a “science of philosophy”, testing models’ abilities to think through new questions, with ground truth held back, and seeing empirically what is effective?
What have the trend lines for automating philosophy looked like, compared to other tasks performed by language models?
What types of training/finetuning/prompting/scaffolding help with the automation of wisdom/philosophy?
- How much do they help, especially compared to how much they help other types of reasoning?

Thinking ahead

Considering the research agenda that will (presumably) eventually be needed to automate high quality wisdom/philosophy:
- Which parts of the agenda can we expect to automate in a timely fashion?
- What is the core that we will need humans to address?
- What do we expect the thorny sticking points to be?
Why might this problem be solved “by default”, or fail to be? (from a technical standpoint)
Can we tell concrete stories or vignettes in which the automation of wisdom/philosophy is/isn’t important, to triangulate our understanding of what matters?
What preparatory research could provide the best groundwork for humanity to automate high-quality wisdom/philosophy before it is necessary?
What projects today or in the near future would be valuable to undertake?

Ecosystems

If the world were devoting serious attention to this, what would that look like?
- What incentives on institutional actors could push work onto related but less important questions? Conversely, what could help ensure that work remained well-targeted?
What are the natural institutional homes for this research in the short term?
- Academia? Nonprofits? Frontier AI labs? Elsewhere in industry?
What might be needed (proofs, audits, track record?) to enable humans (decision-makers, voters) and human institutions to correctly trust wise advice from AI systems?
- How could we lay the groundwork for this?
Ideas for catalysing/sustaining this field?
Why might this problem be solved “by default”, or fail to be? (from a social standpoint)

If you’re not sure whether a topic would be within scope, feel free to check with us.

Judging

The judging process will be coordinated by Owen Cotton-Barratt. After shortlisting, entries will be assessed by a panel of judges: Andreas Stuhlmüller, Brad Saad, David Manley, Linh Chi Nguyen, and Wei Dai.

Judging criteria will be:

Does the entry tackle an important facet of the automation of wisdom/philosophy?
Does the entry contain good analysis or valuable new ideas?
Is the writing clear, succinct, and epistemically appropriate?
Does the entry provide something that we are excited to see built upon or explored further?

The prize pool is $25,000, and the prize schedule will be:

$10,000 First Prize
$5,000 Second Prize
4x $2,000 Best-in-Category Prizes
- Judging for these will exclude the overall First and Second Prize winners from consideration
  - So if e.g. the overall First Prize and Second Prize both went to entries in the “Ecosystems” category, then the third-best entry in that category would receive $2,000
4x $500 Runner Up Prizes, for the best entries across any category that did not receive another prize
- For these prizes, the judges may give preference to impressive entries by people at early career stages
  - Whereas judging for the main prizes will — insofar as this is feasible — be blind to the identities and personal characteristics of the authors
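
The schedule above can be read as a small allocation procedure. The sketch below is one hedged reading of it, not the judges' actual process: it assumes each entry has a single hypothetical numeric `score` and exactly one category, and it ignores ties, uncategorised entries, and the early-career preference for Runner Up prizes. The names `Entry` and `allocate_prizes`, and the example entries, are illustrative only.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    title: str     # hypothetical unique identifier for an entry
    category: str  # one of the four categories in the announcement
    score: float   # hypothetical judge score; higher is better


CATEGORIES = [
    "Automation of wisdom",
    "Automation of philosophy",
    "Thinking ahead",
    "Ecosystems",
]


def allocate_prizes(entries):
    """Return {title: prize in dollars} under the simplifying assumptions above."""
    ranked = sorted(entries, key=lambda e: e.score, reverse=True)
    awards = {}

    # Overall First and Second Prize go to the two top-ranked entries.
    for entry, amount in zip(ranked[:2], (10_000, 5_000)):
        awards[entry.title] = amount

    # Best-in-Category: best entry in each category, excluding overall winners.
    for category in CATEGORIES:
        for entry in ranked:
            if entry.category == category and entry.title not in awards:
                awards[entry.title] = 2_000
                break

    # Runner Up: the four best entries that received no other prize.
    for entry in [e for e in ranked if e.title not in awards][:4]:
        awards[entry.title] = 500

    return awards


# Made-up example: the top three entries are all in "Ecosystems".
entries = [
    Entry("E1", "Ecosystems", 9.0),
    Entry("E2", "Ecosystems", 8.0),
    Entry("E3", "Ecosystems", 7.0),
    Entry("W1", "Automation of wisdom", 6.0),
    Entry("P1", "Automation of philosophy", 5.0),
    Entry("T1", "Thinking ahead", 4.0),
    Entry("W2", "Automation of wisdom", 3.0),
    Entry("P2", "Automation of philosophy", 2.0),
    Entry("T2", "Thinking ahead", 1.0),
    Entry("W3", "Automation of wisdom", 0.5),
]

awards = allocate_prizes(entries)
print(awards["E3"])          # 2000: third-best "Ecosystems" entry wins its category
print(sum(awards.values()))  # 25000: matches the stated prize pool
```

In this example the amounts add up as $10,000 + $5,000 + 4 × $2,000 + 4 × $500 = $25,000, which agrees with the prize pool stated above.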

We may contact entrants whose work impresses us about possible further opportunities (e.g. conferences or research positions) on these topics.

Details

Entries should be submitted via this form, which asks for:

Your name and email address
A link to your entry
A 100–200 word summary
Which if any of our four categories your entry falls under
Statement of authorship credit (including AI credit)
A brief description of career stage (so that judges can at their discretion account for this in awarding Runner Up prizes)
Opportunity to opt out of future contact not directly related to this competition
Anything else we should know

You are of course welcome to seek feedback on drafts before submission. Coauthored articles are also very welcome.

The deadline for submissions is midnight anywhere in the world on Sunday 14th July. We hope to complete shortlisting within two weeks of the submission deadline, and contact winners within four weeks of the submission deadline. Winners whose entries are not yet public will have two weeks after we contact them to provide a public version, or agree to us publishing it on the AI Impacts website. Payment will be made by ACH (for US-based winners) or wire transfer (for international winners).
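
For entrants converting the deadline to their own time zone, here is a minimal sketch of the timeline arithmetic. It rests on two assumptions that the announcement does not spell out: that "midnight anywhere in the world" means the end of 14 July in the "Anywhere on Earth" zone (UTC-12), and that the year is 2024. The shortlisting and contact dates are the organisers' stated targets, not guarantees.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "anywhere in the world" is read as Anywhere on Earth (UTC-12).
AOE = timezone(timedelta(hours=-12))

# Assumption: "midnight on Sunday 14th July" means the end of that day, in 2024.
deadline_aoe = datetime(2024, 7, 15, 0, 0, tzinfo=AOE)
deadline_utc = deadline_aoe.astimezone(timezone.utc)

# Targets stated in the announcement, measured from the submission deadline.
shortlist_target = deadline_utc + timedelta(weeks=2)
contact_target = deadline_utc + timedelta(weeks=4)

print(deadline_utc.isoformat())      # 2024-07-15T12:00:00+00:00
print(shortlist_target.date())       # 2024-07-29
print(contact_target.date())         # 2024-08-12
```

If in doubt about the exact cut-off, submitting before the end of 14 July in your own local time is safe under this reading.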

We reserve the right to extend the submission deadline or increase the prize pool without notice. Judges have the right to split prizes in cases of ties, or to not award prizes in the unlikely event that no submissions are found to merit them.

If you want to ask questions about the competition, feel free to comment, or to email essaycompetition@aiimpacts.org

FAQ on the automation of wisdom and philosophy

What’s the basic idea here?

We're interested in the automation of thinking that can help actors to take wise actions (whatever that means) and avoid unwise actions. As an important subcategory, we're interested in the automation of philosophical thinking, and how to avoid practical errors grounded in philosophical mistakes.

What do you want to know about such automation?

We're not certain! We think it's a potentially important area which hasn't received that much attention. We'd like people to explore more of the ideas around this. If we understood more of the contours of when such automation might be helpful (or unhelpful!), that would seem good. If we understood more about what would be necessary for automation, that would seem good. If people developed a sense of things it would be good for someone to do in the world, that's potentially great.

We give a bunch of example questions we'd be interested in people addressing in the essay prompts part of the announcement, but because it seems like a broad area we've preferred to leave the competition fairly open, and wait to see which parts people can make meaningful contributions to.

What do you mean by “wisdom” and “philosophy”?

By “wisdom”, we mean something like “thinking/planning which is good at avoiding large-scale errors”. An archetype of something which is smart-but-not-wise might be a plan full of clever steps which are each individually well-chosen to chain to the previous step in the plan, but which collectively forget why they were doing this, and end up taking actions which are in conflict with the original goal. Wisdom is also what’s needed for noticing that an old ontology was baking in some problematic assumptions about what was going on.

By “philosophy”, we mean something like “the activity of trying to find answers by thinking things through, without the ability to observe answers”. This is close to the sense understood in the academic discipline of philosophy.

We’re not sure if automating these things is most naturally thought of as one topic, two topics, or more …

What threats are you concerned about?

Progress in these areas seems like it could potentially help avoid a number of different issues:

Unwise human actions

Humans sometimes take actions which are predictably unwise (from some perspectives), and which they later regret. Such actions could be really bad if they interact with high-stakes situations. If people had access to trusted, high-wisdom automated advice, this could help them to reduce the rate of these errors.

This might be particularly important around issues coming with the development of AI, as people will be facing very novel situations and be less able to rely on experience.

Human philosophical errors

People sometimes make decisions that are influenced by their philosophical understanding of an issue. This could happen in the future, e.g. around understanding of AI consciousness/rights. Automation of good philosophical work, if achievable, could help people to have deeper understanding by the time they need to make key decisions.

Unwise AI actions

If people empower AI agents, ensuring that they are in some sense wise and not just smart could help to reduce rare damaging actions. In the extreme this could reduce risk of human extinction (imagine an AI system which wipes out humans in order to secure its own power, and later on reflection wishes it hadn't; a wiser system might have avoided taking that action in the first place).

AI philosophical errors

If AI systems become superintelligent and are meaningfully running the world, their stances on philosophical questions could matter. For example, deciding to engage in acausal trade (if it doesn’t actually make sense), or deciding not to (if it does), could be a large and consequential error. Better understanding of the automation of philosophy could help either to lead to more philosophically-competent AI systems, or alternatively could help people to coordinate about which parts of thinking should not be delegated to AI systems.

Is there a particular threat model you’re focused on?

No. We could make some guesses (both about which of the above categories are most concerning, and more concretely what the most concerning threats within them are), but we feel like the whole area is under-explored, and wouldn’t be confident in our guesses. We’d love to see high-quality analysis of this.

The fact that the automation of wisdom/philosophy seems important to better understand for multiple different threats — and also seems like a plausibly useful intervention for improving our ability to handle unknown unknowns — feeds into our desire to see it prioritized more than at present.

Automating wisdom, philosophy — isn’t this all just AI capabilities work?

Maybe! Certainly this is a type of capability (and high performance probably requires significantly advanced general capabilities, relative to today).

However, it seems to us that for a given level of general smarts in a system, the capacity for wisdom or philosophy could keep up with that, or could fail to. We are concerned about worlds where the ability to automate wise actions is outstripped by the ability to automate smart ones. So it seems like it may (at least in part) be a problem of differential technological development. We would be interested in further analysis of this question.

^[1] The precise opinions expressed in this post should not be taken as institutional views of AI Impacts, but as approximate views of the competition organizers. We offer them not because we're sure they're exactly right, but because we think they're pointing in a promising direction and it's more likely to provoke high-quality, interesting entries if we provide some concrete starting points.

^[2] We recognise that the timeline may be on the tight side for thoroughly researched papers. We are very happy to consider papers (and note that most journals accept papers that have been available as pre-prints, e.g. see https://philarchive.org/journals.html for philosophy journals), but for entrants who are targeting academic publication we also welcome people putting the heart of their argument into an essay for the competition and later expanding it into a paper.

^[3] Feel free to use unusual formats if you consider them best for exploring the ideas. For example, we would be happy to receive a fictional business plan or technical roadmap for a hypothetical firm working on a challenge in these areas.

finm, Jun 15 2024 (karma: 3)

~~Just noticed I missed the deadline — will you be accepting late entries?~~

Edit: I had not in fact missed the deadline

rileyharris, Jun 15 2024 (karma: 17)

I am a wizard. I have magically transported you back to June 15th 2024. You will have all your progress so far. The essays are due in one month.

Liav.Koren, Apr 22 2024 (karma: 3)

I know a decent amount about ML and AI safety, have a working knowledge of various philosophy bits and bobs, and consider myself a good to very good writer. I'd be interested in potentially collaborating with someone on this.

SummaryBot, Apr 16 2024 (karma: 3)

Executive summary: An essay competition with $25,000 in prizes aims to encourage thinking on the automation of wisdom and philosophy, which could be crucial for making wise choices in a world reshaped by advanced AI.

Key points:

The competition seeks essays on what is needed to automate high-quality thinking about novel situations, and how this might arise.
Key questions include the nature of good thinking to automate, recognizing new components, identifying traps in smart but unwise thinking, and developing metrics.
Other topics include types of philosophy language models can produce, empirical testing of philosophical abilities, helpful training/prompting approaches, and the likely research agenda.
Essays may also cover what serious attention to this problem would look like, natural institutional homes for the research, enabling trust in AI-generated wise advice, and catalyzing the field.
Judging criteria include importance, quality of ideas and analysis, clarity, and potential for further exploration. Prizes total $25,000.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

Jordan Arel, Aug 14 2024 (karma: 1)

I am wondering if the winners of this contest are going to be publicly announced at some point?

Owen Cotton-Barratt, Aug 14 2024 (karma: 3)

They definitely are! Judge discussions are ongoing, and after that we'll be contacting winners a while before any public announcements, so I'm afraid this won't be imminent, but we are looking forward to getting to talk about the winners publicly.

Jordan Arel, Sep 14 2024 (karma: 1)

Hi, hate to bother you again, just wondering where things are at with this contest?

Owen Cotton-Barratt, Sep 22 2024 (karma: 6)

I've now sent emails contacting all of the prize-winners.

Jordan Arel, Sep 22 2024 (karma: 1)

Great, thank you!

Owen Cotton-Barratt, Sep 14 2024 (karma: 2)

The judging process should be complete in the next few days. I expect we'll write to winners at the end of next week, although it's possible that will be delayed. A public announcement of the winners is likely to be a few more weeks.

Jordan Arel, Jun 25 2024 (karma: 1)

I have been thinking about this kind of thing quite a lot and have several ideas I have been working on. Just to clarify, is it acceptable to have multiple entries, or is there any limit on this?

Owen Cotton-Barratt, Jun 25 2024 (karma: 4)

Multiple entries are very welcome!

[With some kind of anti-munchkin caveat. Submitting your analyses of several different disjoint questions seems great; submitting two versions of largely the same basic content in different styles not so great. I'm not sure exactly how we'd handle it if someone did the latter, but we'd aim for something sensible that didn't incentivise people to have been silly about it.]

mako yass, Jun 6 2024 (karma: 1)

Debate safety is essentially a wisdom-augmenting approach: each AI is attempting to arm the human with the wisdom to assess the arguments (or mechanisms) of the other.

I'd love to see an entry that discusses safety through debate, in a public-facing way. It's an interesting approach that may demonstrate to people outside of the field that making progress here is tractable. Assessing debates between experts is also a pretty important skill for dealing with the geopolitics of safety, so an opportunity to talk about debate in the context of AI would be valuable.
It's also conceivable (to me at least) that some alignment approaches will put ordinary humans in the position of having to referee dueling AI debaters, bidding for their share of the cosmic endowment. Without some pretty good public communication leading up to that, this could produce outcomes that are worse than random.

I might be the first to notice the relevance of debate to this prize, but I'm probably not the right person to write that entry (and I have a different entry planned, discussing mental enhancement under alignment, inevitably retroactively dissolving all prior justifications for racing). So, paging @Rohin Shah, @Beth Barnes, @Liav.Koren

Owen Cotton-Barratt, Jun 6 2024 (karma: 2)

To respond to your parenthetical: if you did write on two topics you'd be welcome to submit both pieces.

(On the object-level: yes, this is on-topic and we'd be very happy to get an entry on it.)
