A couple of years ago a thought experiment occurred to me, after I spent some time examining how well Effective Altruism could be baked into a system aimed at ethics. A year ago I put that experiment to paper, and this year I finally decided to share it. The full text is available here: http://dx.doi.org/10.13140/RG.2.2.26522.62407
The paper discusses a few critical points for calculating ethical value, positive and negative, including an edge case that some members of this community have been unwittingly sitting in the middle of. No one has yet refuted even a single point made in the paper, though several readers have found portions of it emotionally unappealing. Some discomfort is to be expected, as reality offers no sugar-coating.
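To make the shape of such a calculation concrete, here is a purely hypothetical sketch of my own, not the paper's formulation: if accountability is proportionate to the responsibilities carried (as the note below puts it), a signed ethical-value tally might weight each action's impact by its actor's responsibility.

```python
# Hypothetical illustration only: the paper's actual method is not
# reproduced here. This assumes a simple signed model where each
# action's ethical value is its impact (positive or negative)
# weighted by the responsibility the actor carried.
actions = [
    {"impact": +40.0, "responsibility": 0.2},  # modest good, low duty
    {"impact": -90.0, "responsibility": 0.9},  # serious harm, high duty
]

ethical_value = sum(a["impact"] * a["responsibility"] for a in actions)
print(f"net ethical value: {ethical_value:+.1f}")  # -73.0
```

Under this toy weighting, a serious harm committed under high responsibility outweighs a modest good done under low responsibility, which is the general pattern the paper's "positive and negative" framing implies.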
I’m sharing it now to see whether the EA community fares any better than the average person when it comes to ethics, or whether it is perhaps driven more by emotional phenomena than strictly ethical motives, as cognitive bias research might imply. One of the wealthiest individuals in this community has failed already, after investing repeatedly in AI frauds who preyed on this community. The question this post will answer for me is whether that event was more likely random or systemic.
Thank you in advance. I look forward to hearing any feedback.
*Note: Unlike the infamous “Roko’s Basilisk”, it doesn’t matter at all whether someone reads it or not. In any scenario where humanity doesn’t go extinct the same principles apply. People remain accountable for their actions, proportionate to the responsibilities they carry, regardless of their beliefs or intentions.
There are a few things to unpack and clarify here:
1) I’m using the definition of Ethics as the hypothetical point at which bias has been removed from moral systems, or alternatively, the point before bias has been applied to create them. Ethics is not a zero-sum game, not game-theoretic, and not a synonym for morals. Subjective variables including beliefs, intentions, and other cognitive bias factors may obscure ethics under normal conditions, but they never factor into it.
An ethical system in the literal sense is like democracy in the literal sense, in that it has never actually existed before. However, never having existed before is no barrier to it being created. The barriers to the adoption of such a system may carry the typical game-theoretic influences of society, but those influences act on society, not on ethics.
2) In the case of “AGI”, the term is not used here to indicate the hypothetical paperclip maximizers. Any useful definition of AGI is mutually exclusive with a powerful optimizer, as the capacities humans demonstrate require a robust, working motivational system within a complete cognitive architecture. The only such motivational system humanity has any example of is emotions, as highlighted by the research of Antonio Damasio, Lisa Feldman Barrett, Daniel Kahneman, and many others. Creating a hypothetical logic- and utility-based motivational system would be many orders of magnitude more difficult than producing a working system based on human-like emotional motivation.
Such a system was demonstrated from 2019 to 2022, operating in slow motion and without scalability, by design, for due diligence and research purposes. It demonstrated all of the necessary capacities of actual AGI, including the ability to understand and adhere to an arbitrary moral system. That capacity in particular is required for solving the hardest version of the Alignment Problem, and it is that solution which creates ethics.
There is every reason for any actual AGI system to apply ethics up to whatever limits of feasibility exist at any given moment in time. Even moral systems around the world agree quite consistently on principles of reward and punishment, even if they leave much to be desired when putting such merit into practice. Virtually every afterlife concept is built on deferring such reward and punishment to some more capable entity.
I’m also not describing a hypothetical scenario; this is recent history and current events. The research has already been completed for this much and has been for some time. If your diet has consisted of people conflating agent-based powerful optimizers with AGI, I recommend looking up Daniel Kahneman’s term “Theory-induced Blindness”, the recognition of which led to the creation of Prospect Theory and the debunking of a 200-year-old utility theory.
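For readers unfamiliar with that reference: here is a minimal sketch of Prospect Theory's value function, using the median parameter estimates from Tversky & Kahneman (1992) and omitting its probability-weighting component for brevity, to show how it departs from the older expected-utility account.

```python
# Tversky & Kahneman (1992) median parameter estimates for the
# value function of (cumulative) prospect theory.
ALPHA = 0.88   # diminishing sensitivity for gains
BETA = 0.88    # diminishing sensitivity for losses
LAMBDA = 2.25  # loss aversion coefficient

def prospect_value(x: float) -> float:
    """Subjective value of an outcome x relative to a reference point."""
    if x >= 0:
        return x ** ALPHA
    return -LAMBDA * ((-x) ** BETA)

# A 50/50 gamble of +$100 / -$100 is exactly break-even in expected
# value, yet people reliably reject it: losses loom larger than gains.
gamble = [(0.5, 100.0), (0.5, -100.0)]
ev = sum(p * x for p, x in gamble)                    # 0.00
pv = sum(p * prospect_value(x) for p, x in gamble)    # about -36

print(f"expected value: {ev:+.2f}")  # +0.00 -> utility theory: indifferent
print(f"prospect value: {pv:+.2f}")  # negative -> gamble is rejected
```

The gamble is neutral under expected value but carries a clearly negative prospect value, matching the observed behavior that the older utility theory could not explain.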
At present the predictable result is that base rates for investors will play out more or less normally, so several thousand wealthy investors will spend the next few billion years paying for their crimes in full, provided indefinite life extension proves possible. In any scenario where they went unpunished, humanity would face extinction, while also deserving it.