*Cross-posted from the **AI Alignment Forum**.*

On behalf of __ALTER__ and __Superlinear__, I am pleased to announce a prize of at least^{[1]} 50,000 USD, to be awarded for the best substantial contribution to the __learning-theoretic AI alignment research agenda__ among those submitted before October 1, 2023. Depending on the quality of submissions, the winner(s) may be offered a position as a researcher in ALTER (similar to __this one__), to continue work on the agenda, if they so desire.

Submit __here__.

**Topics**

The research topics eligible for the prize are:

- Studying the mathematical properties of the algorithmic information-theoretic __definition of intelligence__.
- Building and analyzing formal models of value learning based on the above.
- Pursuing any of the __future research directions__ listed in the article on infra-Bayesian physicalism.
- Studying __infra-Bayesian logic__ in general, and its applications to infra-Bayesian reinforcement learning in particular.
- Theoretical study of the behavior of RL agents in __population games__. In particular, understanding to what extent __infra-Bayesianism__ helps to avoid the __grain-of-truth problem__.
- Studying the __conjectures__ relating superrationality to thermodynamic Nash equilibria.
- Studying the theoretical properties of the __infra-Bayesian Turing reinforcement learning__ setting.
- Developing a theory of reinforcement learning with __traps__, i.e. irreversible state transitions. Possible research directions include studying the computational complexity of Bayes-optimality for finite-state policies (in order to avoid the __NP-hardness__ for arbitrary policies) and __bootstrapping__ from a safe baseline policy.

New topics might be added to this list over the year.

**Requirements**

The format of the submission can be either a LessWrong post/sequence or an arXiv paper.

A submission may have one or more authors. If there are multiple authors, they will be considered for the prize as a team, and if they win, the prize money will be split between them either equally or according to their own internal agreement. For the submission to be eligible, its authors must *not* include:

- Anyone employed or supported by ALTER.
- Members of the board of directors of ALTER.
- Members of the panel of the judges.
- First-degree relatives or romantic partners of judges.

In order to win, the submission must be a *substantial* contribution to the mathematical theory of one of the topics above. For this, it must include at least one of:

- A novel theorem, relevant to the topic, which is difficult to prove.
- A novel *unexpected* mathematical definition, relevant to the topic, with an array of natural properties.

Some examples of known results that would have been considered substantial at the time they were published:

- Theorems 1 and 2 in "__RL with imperceptible rewards__".
- Definition 1.1 in "__infra-Bayesian physicalism__", with the various theorems proved about it.
- Theorem 1 in "__Forecasting using incomplete models__".
- Definition 7 in "__Basic Inframeasure Theory__", with the various theorems proved about it.

**Evaluation**

The evaluation will consist of two phases. In the first phase, I will select 3 finalists. In the second phase, each of the finalists will be evaluated by a panel of four judges.

Each judge will score the submission on a scale of 0 to 4. These scores will be added to produce a total score between 0 and 16. If no submission achieves a score of 12 or more, the main prize will not be awarded. Otherwise, the submission with the highest score will be the winner. In case of a tie, the money will be split between the front-runners.
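For concreteness, the scoring rule above can be sketched in a few lines of Python. This is purely illustrative (the function name and input format are my own, not part of the official process): each submission gets one 0-4 score per judge, totals of 12 or more qualify, and a tie at the top splits the prize.

```python
def winners(scores_by_submission, threshold=12):
    """Return the submissions sharing the highest qualifying total score.

    scores_by_submission maps a submission name to its list of per-judge
    scores (each 0-4). An empty result means the main prize is not awarded.
    """
    totals = {name: sum(scores) for name, scores in scores_by_submission.items()}
    qualifying = {name: t for name, t in totals.items() if t >= threshold}
    if not qualifying:
        return []  # no submission reached the threshold
    best = max(qualifying.values())
    # All submissions tied at the top share the prize.
    return [name for name, t in qualifying.items() if t == best]


# Example: A totals 12, B totals 9, C totals 12, so A and C tie and split.
print(winners({"A": [4, 3, 3, 2], "B": [2, 2, 3, 2], "C": [4, 4, 2, 2]}))
```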

The final winner will be announced publicly, but the scores received by various submissions will not.

**Fast Track**

If the prize is awarded, and at least one author of the winning submission is interested in a researcher position in ALTER, they will be considered for it, although this is not an offer or guarantee of employment. In fact, making such hires to continue to advance the agenda is my foremost reason for organizing this prize.

If multiple winning authors are interested in researcher positions, we will consider hiring *at least* one of them. It is also quite likely we will consider hiring all of them, but this depends on our financial and organizational capacity to support that number.

For additional details about the position, see our previous __hiring announcement__.

**Assistance**

We do not promise to provide any guidance or mentorship to contestants. In fact, identifying researchers who can work with minimal guidance is one of the advantages of this process. However, I expect to usually be available to comment on well-written research proposals. Contestants are encouraged to write such proposals in case of any doubt about the eligibility of their research. Moreover, technical questions pertaining to the learning-theoretic agenda can be asked either on the MIRIx Discord server (where either Alex or I often answer them), or as comments on the relevant posts by Diffractor (Alex) or myself. Invites to the server are available to good-faith contestants upon request.

If you wish to contact me about either a research proposal or an invite, please write to rot13 of inarffn@nygre.bet.vy and attach a CV plus any other relevant background about yourself.

Please indicate your interest in working on this prize on Superlinear or comment to find potential collaborators.

Good luck!

^{[1]} Donors can increase the prize pool using the Superlinear platform.