Alignment tax

An alignment tax (sometimes called a safety tax) is the additional cost of making AI aligned, relative to unaligned AI.
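Stated as a formula, the tax is a simple difference in costs. As a minimal formalization (the cost function C is an illustrative placeholder rather than notation from the sources cited below, and "cost" may be measured in money, compute, development time, or forgone performance):

\[
\text{alignment tax} = C(\text{aligned AI}) - C(\text{unaligned AI})
\]

A tax of zero would mean alignment comes at no extra cost; the larger the tax, the stronger the incentive for actors to cut corners on alignment.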

Approaches to the alignment tax

Paul Christiano distinguishes two main approaches for dealing with the alignment tax (Christiano 2020; for a summary, see Shah 2020). One approach seeks to find ways to pay the tax, such as persuading individual actors to pay it or facilitating coordination of the sort that would allow groups to pay it. The other approach tries to reduce the tax, by differentially advancing existing alignable algorithms or by making existing algorithms more alignable.

Bibliography

Askell, Amanda et al. (2021) A general language assistant as a laboratory for alignment, arXiv:2112.00861 [cs].

Christiano, Paul (2020) Current work in AI alignment, Effective Altruism Global, April 3.

Shah, Rohin (2020) A framework for thinking about how to make AI go well, LessWrong, April 15.

Xu, Mark & Carl Shulman (2021) Rogue AGI embodies valuable intellectual property, LessWrong, June 3.

Related entries

AI alignment
