
In this post, I ask readers what motivates them, give an example answer (i.e. a line of reasoning that motivates me) and, as a by-product, jot down some rudimentary considerations about AGI safety derived from this exercise. Happy reading!

(Posting triggered by Lizka's post. Thank you!)


I am curious to learn more about what motivates other readers of this forum. In 2015, the moral philosophy arguments for effective altruism relied heavily on utilitarianism, and I wonder whether this has evolved. If I remember correctly, the moral philosophy discussions in Rationality: From AI to Zombies were cut short by the Is-Ought problem. I am also curious to hear whether anyone relies on a solution to the Is-Ought problem in their own moral philosophy or motivational framework.

I am looking for “applied” philosophy rather than theoretical argumentation about philosophical approaches, though I welcome links between theory and practice. In brief, what drives you, fundamentally? When you ask yourself “why am I putting so much effort into that goal?”, what arguments do you give yourself?


To illustrate the type of answer I am looking for, here is a line of reasoning that motivates me:

1. We cannot deduce from real observation whether the existence of reality has a purpose; we cannot deduce what we should do (~the Is-Ought problem). Note here the assumed definition that ‘purpose of reality’ = ‘what we should do’.

2. Current models of reality rely on the occurrence of a phenomenon that does not respect causality-as-my-brain-understands-it and ‘resulted’ in reality (~the Big Bang occurred but we don’t know why, and when we discover why, we won’t know why that, etc.)

3. Given this cosmological uncertainty, it is possible that the (literally metaphysical) laws/processes governing that precursor phenomenon apply around reality. (For example, if you are into simulation arguments: whatever launched our simulation is still around ‘outside’ the simulation)

4. And therefore it is possible that in the future, a similar causality-violating phenomenon strikes again and gives a purpose to reality (sim version: whatever launched our sim intervenes to give a hint about what the purpose of the simulation is)

5. Or, regardless of points 2-4, maybe through more research & philosophy, we will figure out a way around the Is-Ought problem down the road and find a convincing moral philosophy, a deterministic purpose telling us what we should do.

6. By point 4 or point 5, there is a chance we’ll eventually get to know the purpose of reality i.e. know what we should do.

7. This purpose can take many forms, from “break free from the sim” or “capture the flag” to “beat heat death” or “create as many paperclips as possible”. Ultimately, the assumption here is that we find something informative about “what we should do” in a deterministic way. So for example “42” or “blue” would not qualify as purposes, in this reasoning (and we’d therefore have to wait/search/philosophise longer to get something informative).

8. Since it is something we should do, regardless of what it is, sentience should probably equip itself with as much deliberate control (incl. understanding) over its environment as possible, in order to be better able to “do” whatever it is in a deliberate way, and therefore to be ready to fulfill that potential upcoming purpose ASAP once it is discovered. There is also the meta implication that sentience should invest in its ability to equip itself better/faster for deliberately controlling its environment (analogous to Omohundro’s basic AI drives arguments, cf. the post-script below). The term “deliberate” is important: by “deliberate control”, I don’t mean just “control”, which would be more akin to ~“conquering the universe and expanding the frontier of knowledge”, but the ability to deliberately direct this control over resources/knowledge towards the fulfillment of the purpose sentience sets itself to achieve, whichever it is. (So, not only better sensors and actuators for society, but also better decision-making processes and more wisdom in using these sensors and actuators.)

9. So maximizing sentience’s deliberate control over its environment seems like a good starting point for fulfilling the purpose of reality (whatever that purpose is); it seems like a good instrumental objective for what we should do.

10. This is true unless the purpose is explicitly something on the spectrum of “do not maximize deliberate control over the environment” (which would be a surprisingly specific goal, so surprising in fact that I’d get suspicious: it probably means that whatever deus ex machina provides us with the goal is afraid of something. In any case, if that is the purpose and you spent eons maximizing control, you have enough control to simply give it up straight away by letting everything go as soon as you discover this was the wrong thing to do.)

11. This also applies if there are competing purposes (~imagine a weird situation where the sim is fought over by multiple dei ex machina: whatever the purpose or combination of purposes you end up electing to fulfill, you are still in a better position to achieve the compromise/the most likely true purpose if you control your environment).

12. If points 2-9 have a 0% chance of being correct, you are left with point 1, where nothing matters, and therefore wasting your time pursuing sentient deliberate control over the environment does not matter either, and at least it keeps your mind busy. If points 2-9 have a non-zero chance of being correct, you should care and try to maximize sentience’s deliberate control over its environment (something like ~Pascal’s Wager?)

13. If point 1 is incorrect and there is already a way to deduce a goal for reality and an answer to what we should do, do share :^) 
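
As a rough illustration only, the wager in point 12 can be sketched as a toy expected-value comparison. Nothing below comes from an established model: the function names and the payoff numbers are hypothetical, chosen only to show the structure of the argument (indifference when the chance of a purpose is exactly zero, a strict preference for building control as soon as it is above zero).

```python
# Toy sketch of point 12. All names and payoff values are illustrative
# assumptions, not measurements or an established model.

PAYOFF_WITH_CONTROL = 1.0     # assumed: a purpose is discovered and sentience can act on it
PAYOFF_WITHOUT_CONTROL = 0.2  # assumed: a purpose is discovered but little can be done
PAYOFF_NO_PURPOSE = 0.0       # point 1: if there is no purpose, nothing matters


def expected_value(p_purpose: float, payoff_if_purpose: float) -> float:
    """Expected payoff of a strategy, given the chance that a purpose is discovered."""
    return p_purpose * payoff_if_purpose + (1.0 - p_purpose) * PAYOFF_NO_PURPOSE


def control_is_strictly_better(p_purpose: float) -> bool:
    """True if maximizing deliberate control beats not doing so in expectation."""
    ev_control = expected_value(p_purpose, PAYOFF_WITH_CONTROL)
    ev_no_control = expected_value(p_purpose, PAYOFF_WITHOUT_CONTROL)
    return ev_control > ev_no_control


if __name__ == "__main__":
    for p in (0.0, 1e-9, 0.5):
        print(p, control_is_strictly_better(p))
```

Under these assumptions the comparison only depends on the payoff with control being higher than the payoff without it; point 10 is the case where that assumption fails.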


A longer-than-expected post-script on considerations for AGI safety:

a. Points 6-9 made me think that an AI system that is still very uncertain about its goal (e.g. because it was trained through inverse reinforcement learning) could still exhibit Omohundro’s basic AI drives if it weakly 'expects' to have a goal in the future. So, basic AI drives arise not only for “almost any goals”, as written in Omohundro’s 2008 paper, but also for weakly positive expectations of any goal. As a result, risk arises from the mere ‘conceptualization’ by a powerful AI system that it may eventually have a purpose. This might be true at the encoding of the objective function (though I am not sure about that), but it is definitely true at more advanced stages where an AI system expects to be given instructions or to be applied to various tasks.

b. Point 10 made me think that, at the design level, there is a need for explicit strong rewards for minimizing the impact potential/Omohundro’s AI drives, even in AI systems that do not know the rest of their reward functions/their goals. There is also a need for making the relative weighting of that impact minimization immutable, even if the AI system can alter its code (~so that we can be sure that, contrary to me, the AI system is not suspicious of that goal).

c. Overall, writing down this reasoning made me think that blocking this “expectation of a goal” by design at an advanced stage, or mitigating the behaviors resulting from that expectation (e.g. “inactivity as a default” by design, until uncertainty is reduced), is more important than I thought before writing it down. Solving the question of what the AI system should aim for might be secondary to solving Omohundro’s basic AI drives.






My justification is pretty simple:

  1. I like being happy and not having malaria and eating food.

  2. I appear to be fundamentally similar to other people.

  3. Therefore, other people probably want to be happy and not have malaria and have food to eat.

  4. I don’t appear to be special, so my interests shouldn’t be prioritized more than my fair share.

  5. Therefore I should help other people more than I help myself because there are more of them and they need more help.

I was interested to see the suggestion that rational discussions of value are cut short by the is-ought gap. This has been an influential view, but I have a different angle.

We should acknowledge that normative judgements have a different semantic nature from factual ones. When we use normative words such as 'ought' and 'good', we make judgements relative to ends or other criteria. Factual judgements report on facts in the world; normative judgements report on relations between objects and criteria. Judgements of practical reason, of how we ought to act, are about our means and our ends.

But we can and do reason about both means and ends. Judgements of practical reason range from the certain 'You ought to turn right to get to the station' to the unknowable 'Ought I to take this job for my long-run happiness?' There are better and worse ends: welfare is clearly more important than grass-counting, and there are strong arguments why welfare is a better end than national glory.

Among ends, happiness seems to have a special place. We are creatures with valenced experience and we are directly aware that our own enjoyment is good and our suffering is bad. Reason seems to require us to expand the circle to also consider the enjoyment and suffering of other creatures. If nothing else, extremes of enjoyment and suffering surely matter.

My concern for reducing S-risks is based largely on self-interest. There was this LessWrong post on the implications of worse-than-death scenarios. As long as there is a >0% chance of eternal oblivion being false and there being a risk of experiencing something resembling eternal hell, it seems rational to try to avert this risk, simply because of its extreme disutility. If Open Individualism turns out to be the correct theory of personal identity, there is a convergence between self-interest and altruism, because I am everyone.

The dilemma is that it does not seem possible to continue living as normal when considering the prevention of worse-than-death scenarios. If it is agreed that anything should be done to prevent them, then Pascal's Mugging seems inevitable. Suicide speaks for itself, and even the other two options, if taken seriously, would change your life. What I mean by this is that it would seem rational to completely devote your life to these causes. It would be rational to do anything to obtain money to donate to AI safety, for example, and you would be obliged to sleep for exactly nine hours a day to improve your mental condition, increasing the probability that you will find a way to prevent the scenarios. I would be interested in hearing your thoughts on this dilemma and whether you think there are better ways of reducing the probability.

Great post on this topic. I used to think about where obligations come from. I came to no satisfying conclusion; the is-ought gap is a real killer xD

My motivation used to come from identifying as a utilitarian, until I read the Replacing Guilt EA sequence https://forum.effectivealtruism.org/s/a2LBRPLhvwB83DSGq/p/Syz3Fiqn5rBqhePiz. It really made me do a 180, from things like "I should donate to charity" and "I should be vegan" to "I want to see the world a better place, these actions help make that true, thus I want to do them" (which I think is personally much healthier for me).

I think your points 3 and 4 are a "big IF true" sort of thing, and the points that depend on them remind me of a "Pascal's wager" argument.

The link doesn't work for me


