Some thoughts on risks from narrow, non-agentic AI

One possibility that maybe you didn't close off (unless I missed it) is "death by feature creep" (more likely "decline by feature creep"). It's somewhat related to the slow-rolling catastrophe, but with the assumption that AI (or systems of agents that include both AI and humans) might be trying to optimize for stability and thus regulate each other, as well as trying to maximize some growth variable (innovation, profit).

Our inter-agent (social, regulatory, economic, political) systems were built by the application of human intelligence, to the point that human intelligence can't comprehend the whole, making it hard to solve systemic problems. So in one possible scenario, humans plus narrow AI might simplify the system at first, but then keep adding features to the system of civilization until it is unwieldy again. (Maybe a superintelligent AGI could figure it out? But if it started adding its own features, then maybe not even it would understand what had evolved.) Complexity can come from competitive pressures, but also from technological innovations. Each innovation stresses the system until the system can assimilate it more or less safely, by means of new regulation (for example, social media that messes up politics unless / until we can break or manage some of its power).

Then, if some kind of feedback loop leading toward civilizational decline begins, general intelligences (humans, if humans are the only general intelligences) might be even less capable of figuring out how to reverse course than they currently are. In a way, this could be narrow AI as just another important technology, marginally complicating the world. But also, we might use narrow AI as tools in AI/AI+humans governance (or perhaps in understanding innovation), and they might be capable of understanding things that we cannot (often things that AI themselves made up), creating a dependency that could contribute in a unique way to a decline.

(Maybe "understand" is the wrong word to apply to narrow AI but "process in a way sufficiently opaque to humans" works and is as bad.)

Being Inclusive

One thought that recurs to me is that there could be two related EA movements, which draw from each other. There would be no official barrier to participating in both (like being on LessWrong and EA Forum at the same time). It would be possible to be a leader in both at the same time (if you have time/energy for it). One of them emphasizes the "effective" in "effective altruists", the other the "altruists". The first would be more like current EA, the second more focused on increasing the (lasting) altruism of the greatest number of people: human-resource focused.

Just about anyone could contribute to the second one, I would think.  It could be a pool of people from which to recruit for the first one, and both movements would share ideas and culture (to an appropriate degree).

James_Banks's Shortform

"King Emeric's gift has thus played an important role in enabling us to live the monastic life, and it is a fitting sign of gratitude that we have been offering the Holy Sacrifice for him annually for the past 815 years."

(source: https://sancrucensis.wordpress.com/2019/07/10/king-emeric-of-hungary/ )

It seems to me like longtermists could learn something from people like this.  (Maintaining a point of view for 800 years, both keeping the values aligned enough to do this and being around to be able to.)

(Also a short blog post by me occasioned by these monks about "being orthogonal to history" https://formulalessness.blogspot.com/2019/07/orthogonal-to-history.html )

The despair of normative realism bot

Moral realism can be useful in letting us know what kind of things should be considered moral.

For instance, if you ground morality in God, you might say: Which God? Well, if we know which one, we might know his/her/its preferences, and that inflects our morality.  Also, if God partially cashes out to "the foundation of trustworthiness, through love", then we will approach knowing and obligation themselves (as psychological realities) in a different way (less obsessive? less militant? or, perhaps, less rigorously responsible?).

Sharon Hewitt Rawlette (in The Feeling of Value) grounds her moral realism in "normative qualia" (which for her is something like "the component of pain that feels unacceptable", or its opposite in pleasure), which leads her to hedonic utilitarianism. Not to preference satisfaction or anything else, but specifically to hedonism.

I think both of the above are best grounded in a "naturalism" (a "one-ontological-world-ism" from my other comment), rather than in anything Enochian or Parfitian.  

The despair of normative realism bot

I can see the appeal in having one ontological world.  What is that world, exactly?  Is it that which can be proven scientifically (in the sense of, through the scientific method used in natural science)?  I think what can be proven scientifically is perhaps what we are most sure is real or true.  But things that we are less certain of being real can still exist, as part of the same ontological world.  The uncertainty is in us, not in the world.  One simplistic definition of natural science is that it is simply rigorous empiricism.   The rigor isn't how we are metaphysically connected with things, rather it's the empirical that does so, the experiences contacting or occurring to observers.  The rigor simply helps us interpret our experiences.

We can have random experiences that don't add up to anything. But maybe whatever experiences give rise to our concept "morality", which we do seem to be able to discuss with some success with other people, and have done so in different time periods, may be rooted in a natural reality (which is not part of the deliverances of "natural science" as "natural" is commonly understood, but which is part of "natural science" if by "natural" we mean "part of the one ontological world"). Morality is something we try hard to make a science of (hence the field of ethics), but which to some extent eludes us. That doesn't mean that there isn't something natural there, only that it's something we have so far not figured out.

What types of charity will be the most effective for creating a more equal society?

Here are some ideas:

The rich have too much money relative to the poor:

Taking money versus eliciting money.

Taking via

  • revolution
  • taxation

Eliciting via

  • shame, pressure, guilt
  • persuasion, psychological skill
  • friendship

Change of culture

  • culture in general
  • elite culture

Targeting elite money

  • money used to steward investments
  • money used for personal spending


Revolutions are risky and can lead to worse governments.

Taxation might work better. (Closing tax haven loopholes.) Building political will for higher taxes on the wealthy. There are people in the US who don't want there to be higher taxes on the wealthy even though it would materially benefit them (culture change opportunity).

Eliciting could be more effective. Social justice culture (OK with shame, pressure, guilt) has philanthropic charities. (Not exactly aligned with EA.) Guerrilla Foundation, Resource Generation. (Already established. You could donate or join now.)

Eliciting via persuasion or psychological tactics sounds like it would appeal to some people to try to do.

Eliciting via friendship: what if a person, or movement, was very good friends with both rich and poor people? Then they could represent the legitimate interests of both to each other in a trustworthy way. I'm not sure anyone is trying this route. Maybe the Giving Pledge counts?

Change of culture. What are the roots of the altruistic mindset? What would help people have, or prepare people to have, values of altruists (a list of such for EA or EA-compatible people; there could be other lists)? Can this be something that gets "in the water" of culture at large? Can culture at large reach into elite culture, or does there have to be a special intervention to get values into elite culture? This sounds more like a project for a movement or set of movements than for a discrete charity.

Elite people have money that they spend on themselves personally -- easy to imagine they could just spend $30,000 a year on themselves and no more, give the balance to charity. But they also have money tied up in investments. Not so easy to ask them to liquidate those investments. If they are still in charge of those investments, then there is an inequality of power, since they can make decisions that affect many people without really understanding the situation of those people. Maybe nationalize industries? But then there can still be an inequality of power between governments and citizens.

If there can be a good flow between citizens and governments, whereby the citizens' voices are heard by the government, then could there be a similar thing between citizens and unelected elite? Probably somebody needs to be in charge of complex and powerful infrastructure, inevitably leading to potential for inequalities of power. Do the elite have an effective norm of listening to non-elite?


You might also consider the effect of AI and genetic engineering, or other technologies, on the problem of creating a more equal society. AI will either be basically under human control, or not. If it is, the humans who control it will be yet another elite. If it isn't, then we have to live with whatever society it comes up with. We can hope that maybe AI will enforce norms that we all really want deep down but couldn't enforce ourselves, like equality.

On the other hand, maybe, given the ability to change our own nature using genetic engineering, we (perhaps with the help of the elite) will choose to no longer care about inequality, only a basic sense of happiness which will be attainable by the emerging status quo.

Expected value theory is fanatical, but that's a good thing

1. I don't know much about probability and statistics, so forgive me if this sounds completely naive (I'd be interested in reading more on this problem, if it's as simple for you as saying "go read X").

Having said that, though, I may have an objection to fanaticism, or something in the neighborhood of it:

  • Let's say there is a suite of short-term payoff, high certainty bets for making things better.
  • And also a suite of long-term payoff, low certainty bets for making things better. (Things that promise "super-great futures".)

You could throw a lot of resources at the low certainty bets, and if the certainty is low enough, you could get to the end of time and say "we got nothing for all that". If the individual bets are low-certainty enough, even if you had a lot of them in your suite you would still have a very high probability of getting nothing for your troubles. (The state of coming up empty-handed.)

That investment could have come at the cost of pursuing the short-term, high certainty suite.

So you might feel regret at the end of time for not having pursued the safer bets, and with that in mind, it might be intuitively rational to pursue safe bets, even with less expected value. You could say "I should pursue high EV things just because they're high EV", and this "avoid coming up empty-handed" consideration might be a defeater for that.

You can defeat that defeater with "no, actually the likelihood of all these high-EV bets failing is low enough that the high-EV suite is worth pursuing."
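To put rough numbers on the "coming up empty-handed" worry and its defeater, here is a minimal sketch. It assumes (a simplification of mine, not something established above) that the bets are independent and that each one succeeds with the same probability p, so the chance that all n of them fail is (1 - p)^n. The function names and the value p = 0.001 are hypothetical, chosen only for illustration.

```python
import math


def prob_all_fail(p: float, n: int) -> float:
    """Chance that n independent bets, each succeeding with probability p, all fail."""
    return (1.0 - p) ** n


def bets_needed(p: float, max_fail: float) -> int:
    """Smallest n for which the chance of every bet failing is at most max_fail."""
    # Solve (1 - p)^n <= max_fail for n; both logs are negative, so the ratio is positive.
    return math.ceil(math.log(max_fail) / math.log(1.0 - p))


if __name__ == "__main__":
    p = 0.001  # hypothetical success chance of a single long-shot bet
    for n in (10, 100, 1000):
        print(f"{n} bets: P(nothing) = {prob_all_fail(p, n):.3f}")
    print("bets needed for P(nothing) <= 0.5:", bets_needed(p, 0.5))
```

Under these assumptions, a suite of 100 such bets still leaves roughly a 90% chance of getting nothing, and it takes on the order of 700 of them before the odds of coming up empty-handed drop to even. Whether the defeater or the counter-defeater wins depends on how p and n actually compare, and correlated bets would change the picture.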

2. It might be equally rational to pursue safety as it is to pursue high EV, it's just that the safety person and the high-EV person have different values.

3. I think in the real world, people do something like have a mixed portfolio, like Taleb's advice of "expose yourself to high-risk, high-reward investments/experiences/etc., and also low-risk, low-reward." And how they do that shows, practically speaking, how much they value super-great futures versus not coming up empty-handed. Do you think your paper, if it got its full audience, would do something like "get some people to shift their resources a little more toward high-risk, high-reward investments"? Or do you think it would have a more radical effect? (A big shift toward high-risk, high-reward? A real bullet-biting, where people do the bare minimum to survive and invest all other resources into pursuing super-high-reward futures?)

Are social media algorithms an existential risk?

(The following is long, sorry about that. Maybe I should have written it up already as a normal post. A one sentence abstract could be: "Social media algorithms could be dangerous as a part of the overall process of leading people to 'consent' to being lesser forms of themselves to further elite/AI/state goals, perhaps threatening the destruction of humanity's longterm potential.")

It seems plausible to me that something like algorithmic behavior modification (social media algorithms are algorithms designed to modify human behavior, to some extent; could be early examples of the phenomenon) could bend human preferences so that future humans freely (or "freely"?) choose things that we (the readers of this comment? reflective humans of 2020?) would consider non-optimal. If you combine that with the possibility of algorithms recommending changes in human genes, it's possible to rewrite human nature (with the consent of humans) into a form that AI (or the elite who control AI) find more convenient. For instance, humans could be simplified so that they consume fewer resources or present less of a political threat. The simplest humans are blobs of pleasure (easily satisfying hedonism) and/or "yes machines" (people who prefer cheap and easy things and thus whose preferences are trivial to satisfy). Whether this technically counts as existential risk, I'm not sure. It might be considered a "destruction of humanity's longterm potential". Part of human potential is the potential of humans to be something.

I suggest "freely" might ought to be in quotes for two reasons. One is the "scam phenomenon". A scammer can get a mark into a mindset in which they do things they wouldn't ordinarily do. (Withdraw a large sum of money from their bank account and give it to the scammer, just because the scammer asks for it.) The scammer never puts a gun to the mark's head. They just give them a plausible-enough story, and perhaps build a simple relationship, skillfully but not forcefully suggesting that the mark has something to gain from giving, or some obligation compelling it. If after "giving" the money, the mark wises up and feels regret, they might appeal to the police. Surely they were psychologically manipulated. And they were, they were in a kind of dream world woven by the scammer, who never forced anything but who drew the mark into an alternate reality. In some sense what happened was criminal, a form of theft. But the police will say "But it was of your own free will." The police are somewhat correct in what they say. The mark was "free" in some sense. But in another sense, the mark was not. We might fear that an algorithm (or AI) could be like a sophisticated scammer, and scam the human race, much like some humans have scammed large numbers of humans before.

The second reason is that adoption of changes (notably technology, but also social changes), of which changing human genes would be an example, and of which accepting algorithmic behavior modification could be another, is only in a limited sense a satisfaction of the preferences of humans, or the result of their conscious decision. In the S-shaped curve of adoption, there are early adopters, late/non-adopters, and people in the middle. Early adopters probably really do affirm the innovations they adopt. Late or non-adopters probably really do have some kind of aversion to them. These people have true opinions about innovations. But most people, in the middle of the graph, are incentivized to a large extent by "doing whatever looks popular, is becoming popular, or pretty clearly has become and will stay popular". So technological adoption, or the adoption of any other innovation, is not necessarily something we as a whole species truly prefer or decide for, but there's enough momentum that we find ourselves falling in line.

I think more likely than the extreme of "blobs of pleasure / yes machines" are people who lack depth, are useless, and live in a VR dream world. On some, deeper, level they would be analogous to blobs/yes machines, but their subjective experience, on a surface level, would be more recognizably human. Their lives would be positive on some level and thus would be such that altruistic/paternalistic AI or AI-controlling elite could feel like they were doing the right thing by them. But their lives would be lacking in dimensions that perhaps AI or AI-controlling elite wouldn't think of including in their (the people's, or even the elite's/AI's own) experience. The people might not have to pay a significant price for anything and thus never value things (or other people) in a deeper way. They might be incapable of desiring anything other than "this life", such as a "spiritual world" (or something like a "spiritual world", a place of greater meaning) (something the author of Brave New World or Christians or Nietzscheans would all object to). In some objective sense, perhaps capability -- toward securing your own well-being, capability in general, behaving in a significant way, being able to behave in a way that really matters -- is something that is part of human well-being (and so civilization is both progress and regress as we make people who are less and less capable of, say, growing their own food, because of all the conveniences and safety we build up). We could further open up the thought that there is some objective state of affairs, something other than human perceptions of well-being or preference-satisfaction, which constitutes part of human well-being. Perhaps to be rightly related to reality (properly believing in God, or properly not believing in God, as the case may be).

So we might need to figure out exactly what human well-being is, or if we can't figure it out in advance for the whole human species (after all, each person has a claim to knowing what human well-being is), then try to keep technology and policy from doing things that hamper the ability of each person to come to discover and to pursue true human well-being. One could see in hedonism and preferentialism a kind of attempt at value agnosticism: we no longer say that God (a particular understanding of God), or the state, or some sacred site is the Real Value, we instead say "well, we as the state will support you or at least not hinder you in your preference for God, the state, or the sacred site, whatever you want, as long as it doesn't get in the way of someone else's preference -- whatever makes you happy". But preferentialism and hedonism aren't value-agnostic if they start to imply through their shaping of a person's experience "none of your sacred things are worth anything, we're just going to make you into a blob of pleasure who says yes, on most levels, with a veneer of human experience on the surface level of your consciousness." I think that a truly value-agnostic state/elite/AI might ought to try to maximize "the ability for each person to secure their own decision-making ability and basic physical movement", which could be taken as a proxy for the maximization of each person's agency and thus their ability to discover and pursue true human well-being. And to make fewer and fewer decisions for the populace, to try to make itself less and less necessary from a paternalistic point of view. Rather than paternalism, adopt a parental view -- parents tend to want their children to be capable, and to become, in a sense, their equals. All these are things that altruists who might influence the AI-controlling elite in the coming decades or centuries, or those who might want to align AI, could take into account.

We might be concerned with AI alignment, but we should also be concerned with the alignment of human civilization. Or the non-alignment, the drift of it. Fast take-off AI can give us stark stories where someone accidentally misaligns an AI to a fake utility function and it messes up human experience and/or existence irrevocably and suddenly -- and we consider that a fate to worry about and try to avoid. But slow take-off AI (I think) would/will involve the emergence of a bunch of powerful Tool AIs, each of which (I would expect) would be designed to be basically controllable by some human and to not obviously kill anyone or cause comparably clear harm (analogous to design of airplanes, bridges, etc.) -- that's what "alignment" means in that context [correct me if I'm wrong]; none of which are explicitly defined to take care of human well-being as a whole (something a fast-takeoff aligner might consciously worry about and decide about); no one of which rules decisively; all of which would be in some kind of equilibrium reminiscent of democracy, capitalism, and the geopolitical world. They would be more a continuation of human civilization than a break with it. Because the fake utility function imposition in a slow takeoff civilizational evolution is slow and "consensual", it is not stark and we can "sleep through it". The fact that Nietzsche and Huxley raised their complaints against this drift long ago shows that it's a slow and relatively steady one, a gradual iteration of versions of the status quo, easy for us to discount or adapt to. Social media algorithms are just a more recent expression of it.

Deliberate Consumption of Emotional Content to Increase Altruistic Motivation

I like the idea of coming up with some kind of practice to retrain yourself to be more altruistic. There should be some version of that idea that works, and maybe exposing yourself to stories / imagery / etc. about people / animals who can be helped would be part of that.

One possibility is that such images could become naturally compelling for people (and thus would tend to be addictive or obsession-producing, because of their awful compellingness) -- for such people, this practice is probably bad, sometimes (often?) a net bad. But for other people, the images would lose their natural compellingness, and would have to be consumed deliberately.

In our culture we don't train ourselves to deliberately meditate on things, so it feels "culturally unrealistic", like something you can't expect of yourself and the people around you. (Or perhaps some subtle interplay of environmental influences on how we develop as "processors of reality" when we're growing up is to blame.) I feel like that part of me is more or less irrevocably closed over (maybe not an accurate sentiment, but a strong one). But in other cultures (not so much in the contemporary West), deliberate meditation was / is a thing. For instance people used to (maybe still do) meditate on the death of Jesus to motivate their love of God.
