Tor_Barstad

Comments

EA's Image Problem

Academics must gain something from spending ages thinking and studying ethics, be it understanding of the arguments, knowledge of more arguments or something else. I think this puts them in a better position than others and should make others tentative in saying that they're wrong.

Btw, I agree with this in the sense that I'd rather have a random ethicist make decisions about an ethical question than a random person.

I'd definitely be interested to hear more :)

Great! I'm writing a text about this, and I'll add a comment with a reference to it when the first draft is finished :)

Your explanation for disagreeing with certain academics is that they have different starting intuitions. But does this account for the fact that academics can revise/abandon intuitions because of broader considerations? Even if you're right, why do you think your intuitions are more reliable than theirs?

A reasonable question, and I'll try to give a better account of my reasons for this in my next comment, since the text may help in giving a picture of where I'm coming from. I will say in my defence, though, that I do have at least some epistemic modesty with regard to this, although not as much as I think you would consider reasonable. While what I think of as probably being the best outcomes from an "objective" perspective corresponds to some sort of hedonistic utilitarianism, I do not, and do not intend to, ever work towards outcomes that don't also take other ethical concerns into account, and I hope to achieve a future that is very good from the perspective of many ethical viewpoints (rights of persons, fairness, etc) - partly because of epistemic modesty.

EA's Image Problem

Thanks for a thoughtful response.

Likewise :)

My worry is the idea that we can get round this problem by evaluating the arguments ourselves. We're not special. Academics just evaluate the arguments, like we would, but understand them better. The only way I can see myself being justified in rejecting their views is by showing they're biased. So maybe my point wasn't "the academics are right, so narrow consequentialism is wrong" but "most people who know much more about this than us don't think narrow consequentialism is right, so we don't know it's right".

That's a reasonable worry, but where the field of ethics as a whole is concerned I would be much more worried about trusting the judgment of the average ethicist over ours.

I would also agree that the "we are not special"-assumption seems like a reasonable best guess for how things are in the absence of evidence for or against (although, at the risk of violating your not-coming-across-as-smug-and-arrogant recommendation, I’m genuinely unsure about whether it's correct or not).

I've also thought a lot about ethics, and have been doing so since childhood. Admittedly, I haven't read most of the philosophical texts that have been written about these topics (neither have most professional ethicists, I suppose, but I've certainly read far less than they have). I have read a significant amount though, enough to have heard most or all of the memorable arguments I know of repeated several times. Also, perhaps more surprisingly: I'm somewhat confident that I've never heard an argument against my opinions about ethics (that is, not the specific issues, but the abstract issues) that was both (1) not based on axiomatic assumptions/intuitions I disagree with and (2) something I hadn't already thought of (of course, I may have forgotten, but it also seems like something that would have been memorable). Examples where criterion #2 was met but #1 wasn't include things like "the repugnant conclusion" (it doesn't seem repugnant to me at all, so it never occurred to me that this should be seen as a possible counterargument). Philosophy class was a lot of "oh.. so that argument has a name" (and also a lot of “what? do people find that a convincing argument against utilitarianism?”).

For all I know this could be the experience of many with opinions different from mine also, but if so, it suggests that intuitions and/or base assumptions may be the determining factor for many, as opposed to knowledge and understanding of arguments presented by differing sides. My suspicion is that the main contributor to the current stalemate in philosophical debates is that people have different intuitions and commitments. Some ethicists realize that utilitarianism in some circumstances would require us to prioritise other children to the extent that we let our own children starve, and say "reductio ad absurdum". I realize the same thing, and say "yes, of course" (and if I don't act on that, it's because I have other urges and commitments beyond doing what I think is best, not because I don't think doing so could be the best thing from a non-partial point of view).

My best guess would be that most ethicists don’t understand the arguments surrounding my views better than I do, but that they know a lot more than I do about views that are based on assumptions I don't agree with or am unconfident about (and about specific non-abstract issues they work with). But I'm not 100% sure about this, and it would be interesting to test.

In the short story Three Worlds Collide, one of the species the space-travelers meet evolved to see the eating of children as a terminal value. This doesn't seem to me like something that is necessarily implausible (after all, evolution doesn't pass the ethical intuitions it gives us through an ethics review board). I can absolutely imagine alien ethicists viewing hedonistic utilitarianism as a reductio ad absurdum because it doesn't allow for the eating of conscious children.

While we have turned out much better than the hypothetical baby-eating aliens, I don't think it's a ridiculous example to bring up. I once talked on Facebook with a person taking a PhD in ethics who disagreed that we should care about the suffering of wild animals (my impression was that I was backing him into a corner where he would have to either change previously stated positions or admit that he didn't fully believe in logic, but at some point he didn't continue the discussion). And you'll find ethicists who see punishment of wrongdoers as a terminal value (I obviously see the use of punishment as an instrumental value).

A reasonable question to ask of me would be: so if you think people's ethical intuitions are unreliable, isn't that also true of your own?

Well, that's the thing. The views that I'm confident in are the ones that aren't based on core ethical intuitions (although they overlap with my ethical intuitions), but can be deduced from things that aren’t ethical intuitions, as well as principles such as logical consistency and impartiality (I know I’m being unspecific here, and can expand on this if anyone wants me to). I could have reasoned myself to these views even if I were a complete psychopath. And the views I'm most confident in are the ones that don't even rely on my beliefs about what I want for myself (that is, I'm much more sure about the conscious experience I have if tortured being inherently bad than I am about e.g. whether it inherently matters if my beliefs about reality correspond with reality). My impression is that this commitment to being sceptical of ethical intuitions in this way is something that isn't shared by all (or even the majority?) of ethicists.

Anyway, I think it would be stupid of me to go on a lot longer since this is a comment and not something that will be read by a lot of people, but I felt an urge to give at least some account of why I think like I do. To summarize: I’m not so sure that the average ethicist understands the relevant arguments better than the EAs who have reflected the most about this, and would be very unsurprised if the opposite was the case. And I think ethicists having other opinions than ‘narrow consequentialism’ is more about them having a commitment to other ethical intuitions, and lacking some of the commitments to “impartiality” that I suspect narrow consequentialists often have, as opposed to them having arguments that narrow consequentialist EAs haven’t considered or don’t understand. But I’m really not sure about this - if people think I’m wrong I’m interested in hearing about it, and looking more into this is definitely on my todo-list.

It would be interesting if comprehensive studies were done, or tools were made, in order to identify what differences of opinion are caused by, to which degree philosophers belonging to one branch of ethical theory are logically consistent and to which degree they understand the arguments of other branches, etc. Debates about these kinds of things can often be frustrating and inefficient, so I hope that we in the future will be able to make progress.

EA's Image Problem

Starting a long debate about moral philosophy would be relevant here, but also out of place, so I'll refrain.

But what do you mean by "Refrain from posting things that assume that consequentialism is true"? That it's best to refrain from posting things that assume that values like e.g. justice aren't ends-in-themselves, or refrain from posting things that assume that consequences and their quantity are important?

If it is something more like the latter, I would ask myself if this would be to pursue the goal of popularity by diminishing a part of the movement that is among the main foundations of what makes it valuable.

Would you e.g. suggest that people refrain from referring to scope insensitivity as if it's a cognitive bias? http://lesswrong.com/lw/hw/scope_insensitivity/, http://lesswrong.com/lw/hx/one_life_against_the_world/

Lots of things are philosophically controversial. The question of whether slavery is a bad thing has renowned philosophers on both sides. I haven't looked into it much, but I suppose that the anti-slavery movement at some point was going against the majority opinion of the "experts" with nothing speaking in favour of their view except specific arguments concerning the issue in question. I haven't given it a lot of thought, but I suppose that if being uncontroversial among "experts" is a good measure of reasonableness, then even today we should be more open to the possible importance of acting in accordance with theistic holy texts.

Don't get me wrong: I am aware that there is a pluralism of ethical theories that motivate EAs. I appreciate people motivated by other ethical assumptions than my own and their good deeds, and wouldn't want EA to be considered a narrow-consequentialism-only movement where non-consequentialists aren't welcomed. That being said: While parts of EA's appeal are independent of the moral theory I agree with, other parts that I consider important are very much not. It's hard to think of any more fundamental assumptions in the reasoning behind e.g. why far future concerns are important.

While I try to make decisions that aren't deontologically outrageous, and make sense both from the perspective of "broad" and "not-so-broad" utilitarianism, it's clearly the case that if Immanuel Kant is right then a lot of the EA-relevant decisions I make are pointless. While Kantians who care about EA should be welcomed into the movement, and while it's good not to rely only on consequentialist reasoning when it's not necessary, I think that encouraging all EAs to speak as if Kant and other philosophers with a complete disregard for consequentialism might be correct would be asking a lot.

While avoiding unnecessary alienation is good, I observe that the way for a movement to succeed isn't always to cave in (although it sometimes may be). Proponents of evolutionary theory don't concede that some species may be created by God, people arguing in favour of vaccines don't concede that the scientific method may be useless, etc.

I also honestly think that the word rational is a good description of the approach EA takes to doing good in a way that clearly isn't the case for many other ways of going about it (by most reasonable definitions of the word). The effective altruism way of going about things IS far superior to a lot of alternatives, and while tactfulness is a good thing, avoiding saying things that imply that this is the case does not seem to me like a good strategy. At least not in all cases.

You raise some interesting perspectives about an important topic, and my comment only concerns a fraction of your post. Many of the suggestions you raise seem good and wouldn't come at the expense of anything important :) I'm not at all certain about any of the strategic concerns that I comment upon here, so take it only as my vague and possibly wrong perspective.

The first talk of this video feels relevant: https://vimeo.com/136877104

What is the expected effect of poverty alleviation efforts on existential risk?

I'm not aware of careful analysis having been done on the topic.

One thing speaking in favour of it increasing existential risk is that it may lead to faster technological progress, which in turn could give less time for research on things that specifically benefit safety, of the kind that MIRI and FHI are doing. I'm thinking that more rich people in previously poor countries would make it more profitable for western countries to invest in R&D, and that these previously poor countries would fund proportionally less x-risk research than what takes place in the west (this is not an obvious assumption, but it is my suspicion).

But as mentioned by others here there are risks pointing in the other direction also.

I don't have an opinion myself as to which direction the effect on x-risk is, but I suspect the effect on x-risk from donating to GiveWell is of negligible importance compared to the effect of whether or not you donate to x-risk-related work (assuming, as I do, that x-risk research and work directed specifically at x-risk can have a significant impact on x-risk). Your donation to aid projects seems unlikely to have a significant effect on the speed of global development as seen as a fraction of the current speed, but the number of people working on x-risk is small and thus it's easier to affect the size of it by a significant fraction.

Might wireheaders turn into paperclippers?

“I think the major issue here is that you seem to be taking moral realism for granted and assume that if we look hard enough, morality will reveal itself to us in the cosmos. I'm a moral anti-realist, and I'm unable to conceive of what evidence for moral realism would even look like.”

That may be a correct assessment.

I think that like all our knowledge about anything, statements about ethics rest on unproven assumptions, but that there are statements about some states of the world being preferable to others that we shouldn’t have less confidence in than many of the mathematical and metaphysical axioms we take for granted.

That being said, I do realize that there are differences between statements about preferences and statements about physics or mathematics. A child-torture-maximizing alien species could have a self-consistent view of morality with no internal logical contradictions, and would not be proven wrong by interaction with reality in the way interaction with reality can show some ideas about physics and mathematics to be wrong.

I don’t think moral law is somehow ingrained into the universe and will be found by any mind once sufficiently intelligent, but I do think that we are right to consider certain experiences as better to occur than not occur and certain experiences as worse to occur than not occur, and that we should consider ways of thinking that lead us to accept statements which entail statements that are in logical contradiction with this as wrong.

To summarise some of my views that I think are relevant to your original post:

  • I don’t expect every being above a certain intelligence-level to be conscious (although I don’t dismiss the possibility), and I certainly don’t think every satisfaction of a reward function has value.
  • I’m unsure about how much or little progress we will make in our understanding of consciousness, but it’s not at all intuitively clear to me that it should be an unrealistic problem to solve (even with today's limited intelligence and tools for reasoning we’re not totally clueless).
  • If we don’t get a better understanding of consciousness I think making inferences about the possible consciousness of other structures by noticing differences from and similarities with our own brains will be a very central tool, and it may be that the best way to go is to fill much of the universe with structures that are similar to human brains having positive lives/experiences, but avoid structures that, if plausible theories of consciousness are true, could be very bad (like e.g. computer simulations of suffering brains).
  • For all I know, “selective pressures to become less like humans and more like paperclippers” could be something to worry about.
  • While I think likeness-to-humans can be a useful heuristic for avoiding getting things wrong and ensuring a future that’s valuable, I think it is unreasonable to make the assumption that conscious experiences are valuable only insofar as they are similar to those of humans.

Might wireheaders turn into paperclippers?

So a bit of a late answer here :)

"Is this a problem? I don't think humor is inherently valuable. It happens to be valuable to humans, but an alternate world in which it weren't valuable seems acceptable."

If a species has conscious experiences that all are of a kind that we are familiar with, but they lack our strongest and most valued experiences, and devalue these because they follow a strict the-less-similar-to-us-the-less-valuable-policy, then I think that’s regrettable. If they themselves and/or beings they create don’t laugh at jokes but have other positive experiences/feelings in place of this, then whether it is a problem depends on the quality and quantity of these other experiences/feelings.

Just in case I've been imprecise in describing my own position: All I would be confident in claiming is that there are experiences that are positive (it is better for them to exist than not exist), experiences that are negative (it would be better if they didn't exist), and collections of experiences that have higher value than other experiences (the experience of a pinprick is preferable to the experience of being burned alive, one experience of being burned alive is preferable to a thousand identical experiences of being burned alive, etc).

"Completely disagree. They'd be in disagreement with my values, but there's no way to show that they're objectively wrong."

Would you say the same thing if I brought forward an example of an alien species that doesn't recognise that it's bad when humans have the conscious experiences they have when they're being tortured? Given that they don't have corresponding conscious experiences themselves, this seems to follow from the methodology of thinking about consciousness that you describe.

Whether we consider the foundation of morals to be objective or not, and what we would mean by objective, is something we could discuss, but if we suppose that we can’t reasonably talk about “being right” about moral questions then that doesn’t seem to me to undermine my point of view any more than it undermines the point of your post.

“What they "want"? Just like paperclippers "want" paperclips? "Chemical occurrences" is an even more implausible framing. I doubt they'd have any analogue of dopamine, etc.”

You say “they”, but if I am interpreted to refer to any specific physical structure, this is by accident. I don’t presuppose that structures/beings that are created for the sake of their consciousness should be based on other neurotransmitters than ours. Biological brains are the only structures that I’m confident are conscious (the more similar to humans they are the more confident I am). The point I’m trying to communicate is that we may be able to deduce with moderate-to-high confidence whether or not a structure is conscious and whether the experiences in question are positive, even when we haven’t experienced them ourselves. We can e.g. argue that rewarding brain stimulation probably is a positive experience for a rat (https://www.youtube.com/watch?v=7HbAFYiejvo), not because we ourselves have rat brains or have experienced such stimulation, but because the chemical occurrences seem to correspond with what’s happening in the brain of a happy human, and because the rats act in a way that signals that they want more of it (and the correspondence between wanting something and positive feelings probably is similar to that of a human brain, since these parts of human and rat brains probably work in similar ways).

“Maybe some states are better but only because of degree, e.g. developing purer heroin. I don't think anyone could convince me that a certain configuration of, say, helium is more valuable than a human mind.”

With regard to physical structures based on a completely different chemical underpinning than the human brain that have more value than a conscious human, I’m unsure if there will be arguments in the future that will convince me of the likelihood or unlikelihood of this, but I don’t assume that there necessarily will (I really hope that we come to grips with how consciousness works, but I’m genuinely unsure about whether or not it’s likely that we will).

Good to hear that you are open to possibly acknowledging conscious states that are more valuable than the ones we have now if they are “the same” experience but with a higher “degree” :) If I interpreted that correctly it’s different from and better than the view I interpreted as being described in the main post (which I interpreted as asserting positive feelings that are more intense than the human experience as always being less valuable).

“I'm not sure what impartial means in this context. This is a discussion of values, so "impartial" is a contradiction.”

Here is an excerpt from an unfinished text of mine where I try to describe what I mean by impartial (I acknowledge that the concept is a bit fuzzy, but I don’t think it is a contradiction when used in the sense I mean it):


Many people agree with the principles of logic, among them that true statements cannot be logically inconsistent. There are also principles beyond those of logic that many consider to be a part of rational thinking, like e.g. Occam's razor. In my mind an essential aspect of thinking honestly and rationally about morality is to be impartial in the way you think. A loose description of what I mean by an impartial way of thinking would be that a mind that has the same knowledge as you but is in different circumstances from you wouldn’t reach conclusions that are logically contradictory with your conclusions.

Take the example of a soldier fighting for Germany in World War I, and a soldier fighting on the opposing side. They are both doing something that tends to feel right for humans; namely being on the side of their country. Their tribe. But given that the goals of one of the soldiers rely on the assumption “Germany winning World War I is good”, while the other soldier has assumptions that imply “Germany winning World War I is not good”, the principle of no logical contradictions dictates that they cannot both be right.

If you in one setting (be that living in a specific country or time period, belonging to a specific species, etc) reach one opinion, but in another setting would have reached a contradictory opinion using the same way of thinking, then this suggests that your way of thinking isn’t impartial.

We should remember to ask ourselves: Which action would we have chosen for ourselves if we were spectating from the outside? If we didn’t belong to any nation or species? If we were neither born nor unborn? If we knew everything we know, but wouldn’t be affected by the action chosen, and weren’t affiliated with anyone in any way?

For example, we know that there are children in the world who are dying from poverty-related causes that could be saved at a cost of some hundred dollars per person. Meanwhile, in my home country Norway, many people are upset that we don’t spend more money on refurbishing swimming pools and sport facilities. But if we were impartial observers, which action would we consider best?

Always choosing the action that from an impartial point of view has the best consequences may be too much to expect of ourselves, but we should still be aware of which actions we would have considered to have the best consequences if we were impartial observers.


Here is another excerpt from the same text:


We could imagine a group of aliens that are conscious, and have some feelings in common with us. Let’s say that they get the same kind of enjoyment that we do out of friendship and sexual gratification, but that they aren’t familiar with the positive experiences we get out of romantic relationships, art, movies and literature, eating a good meal, eating ice cream or sweets, games, humour, music, learning, etc.

One could imagine this alien species observing us, and deeming parts of our existence that we consider valuable and meaningful as meaningless. “Sure”, they could say, “we can see the value of these beings experiencing the kind of experiences that we value for ourselves, but why should these other kinds of consciousness that we’re not familiar with have any value?”.

We could also imagine that the aliens have the same kind of negative experience as us when being hit or cut with sharp objects, but are totally foreign to the discomforts we would experience if burned alive or drowned. “It would be a tragedy beyond imagining if the universe was filled with conscious experiences of being stabbed and hit”, they could say, “but we see no reason why we should try to minimize the kinds of experiences humans experience when burning alive or drowning”.

The mistake these aliens are making is to not assume, or even think it a possibility, that there are experiences worth valuing or avoiding outside of the range of experiences they know. When we evaluate structures that might be conscious, and might have experiences that are different from the ones we are familiar with, we should try to think in a way that wouldn’t lead us to make the same mistakes as the aliens in this thought-experiment if we only knew the kinds of experiences that they knew.


Does this conception of partiality/impartiality make at least some sense in your mind?

Might wireheaders turn into paperclippers?

It appears to me that if we were a species that didn't have [insert any feeling we care about, e.g. love, friendship, humour or the feeling of eating tasty food], and someone then invented it, then many people would think of it as not being valuable. The same would go for some alien species that has different kinds of conscious experiences from us trying to evaluate our experiences. I'm convinced that they would be wrong in not valuing our experiences, and I think this shows that that way of thinking leads to mistakes. Would you agree with this (but perhaps still think it's the best policy because there's no better option)?

I agree that analysing the conscious experiences of others, especially those with minds that are very different from ours, isn't straightforward, and that we very well might not ever understand the issue completely. But it seems likely to me that we, especially if aided by superintelligence, could be able to make a solid case for why some minds have conscious experiences that are better than ours (and are unlikely to be bad). Strong indicators could include what the minds want themselves, how different chemical occurrences in our brains correlate with which experiences we value/prefer, etc. While similarities to our own minds make it easier for us to make judgments about the value of a mind's consciousness with confidence, it could be that we find that there are states of being that probably are more valuable than that of a biological human. Would you agree?

It seems entirely plausible that there are conscious experiences that can be perceived to be much more profound/meaningful than anything experienced by current biological humans, and that there could be experiences that are as intensely positive as the experiences of torture are negative to us. Would you agree?

My own stance on utilitronium and post-humans is that I wouldn't take a stance today in regards to specific non-human-like designs/structures, but suspect that if we created conscious beings/stuff based on good thinking about consciousness with the main goal of maximising for positive/meaningful experience, and set aside some small or large fraction of the universe we colonise to this, it would be likely to make our civilisation more valuable by orders of magnitude than if all minds experienced the human experience.

If we, based on self-interest or on other feelings, are uncomfortable about where our thinking about what's valuable leads us, we could compromise by using much of the matter in the universe we get hold of in the way impartial thinking tells us is best, and some other part or fraction in a way that fits the egoistic interests of the human species and/or makes us feel fuzzy inside. If we think that some forms of minds or conscious matter are likely to have extreme value (and don't plausibly have negative value), but we are genuinely unsure if this is the case, then a reasonable solution could be to dedicate some fraction of the matter we get hold of to this kind of structure, and another to that kind of structure, etc.

Permanent Societal Improvements

An important topic!

Potentially influencing lock-in is certainly among my motivations for wanting to work on AI friendliness, and doing things that could have a positive impact on a potential lock-in has a lot speaking for it I think (and many of these things, such as improving the morality of the general populace, or creating tools or initiatives for thinking better about such questions, are things that could have significant positive effects also if no lock-in occurs).

As to the example of having more children out of far-future concerns, I think this could go the other way also (although I don't necessarily think that it would - I really don't know). If we e.g. reach a solution where it is decided that all humans have certain rights, can reproduce, etc, but also decide that all or a fraction of the matter in the universe we have little need for is used to increase utility in more efficient ways (e.g. by creating utilitronium or by creating non-human sentient beings with positive and meaningful existences), then a larger human population could lead to less of that.

How much does work in AI safety help the world?

Cool idea and initiative to make such a calculator :) Although it doesn't quite reflect how I make estimations myself (I might make a more complicated calculator of my own at some point that does).

The way I see it, the work that is done now will be the most valuable per person, and the number of people working on this towards the end may not be so indicative (nine women cannot make a baby in a month, etc).

I am Nate Soares, AMA!

So as I understand it, what MIRI is doing now is to think about theoretical issues and strategies and write papers about this, in the hope that the theory you develop can be made use of by others?

Does MIRI think of ever:

  1. Developing AI yourselves at some point?
  2. Creating a goal-alignment/safety framework to be used by people developing AGI? (Where e.g. reinforcement learners or other AI components can be "plugged in", but in some sense are abstracted away.)

Also (feel free to skip this part of the question if it is too big/demanding):

Personally, I have a goal of progressing the field of computer-assisted proofs by making them more automated and by making the process of making them more user-friendly. The system would be made available through a website where people can construct proofs and see the proofs, but the components of the system would also be made available for use elsewhere. One of the goals would be to make it possible and practical to construct claims that are in natural language and are made using components of natural language, but also have an unambiguous logical notation (probably in Martin-Löf type theory). The hope would be that this could be used for rigorous proofs about self-improving AI, and that the technologies/code-base developed and the vocabulary/definitions/claims/proofs in the system could be of use for a goal-alignment/safety framework.

(Anyone reading this who is interested in hearing more could get in touch with me, and/or take a look at this document:
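As a rough illustration of what pairing a natural-language claim with an unambiguous formal statement could look like, here is a minimal sketch in Lean 4. It is only an assumption-laden example and not the system described above: Lean's type theory is related to, but not the same as, Martin-Löf type theory, and the names `IsEven` and `even_add_even` are hypothetical, chosen just for this sketch.

```lean
-- Natural-language claim: "The sum of two even natural numbers is even."

-- Hypothetical definition for this sketch: n is even if n = 2 * k for some k.
def IsEven (n : Nat) : Prop := ∃ k, n = 2 * k

-- Formal counterpart of the claim, with a machine-checked proof.
theorem even_add_even (a b : Nat) (ha : IsEven a) (hb : IsEven b) :
    IsEven (a + b) :=
  match ha, hb with
  | ⟨i, hi⟩, ⟨j, hj⟩ =>
    -- Witness: i + j. After rewriting, both sides are 2 * i + 2 * j.
    ⟨i + j, by rw [hi, hj, Nat.mul_add]⟩
```

The comment line carries the natural-language reading, while the theorem statement fixes its meaning exactly; a proof checker accepts the claim only if the proof term actually establishes that statement.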

https://docs.google.com/document/d/1GTTFO7RgEAJxy8HRUprCIKZYpmF4KJiVAGRHXF_Sa70/edit)

If I got across what it is that I'm hoping to make; does it sound like this could be useful to the field of AI safety / goal alignment? Or are you unsure? Or does it seem like my understanding of what the field needs is flawed to some degree, and that my efforts in all probability would be better spent elsewhere?
