All of amc's Comments + Replies

I don't understand this. Have you written about this or have a link that explains it?

Sorry I don't have a link. Here's an example that's a bit more spelled out (but still written too quickly to be careful):

Suppose there are two possible worlds, S and L (e.g. "short timelines" and "long timelines"). You currently assign 50% probability to each. You invest in actions which help with either until your expected marginal returns from investment in either are equal. If the two worlds have the same returns curves for actions, then you'll want a portfolio which is split 50/50 across the two (if you're the only investor; otherwise you'll wa...
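To make the equal-marginal-returns condition concrete, here is a minimal numerical sketch (my own illustration; the log returns curve and the fixed budget of 1 are assumptions, not anything stated in the comment above). With identical diminishing-returns curves in both worlds, maximizing expected returns over the split is the same as equalizing expected marginal returns at an interior optimum, and 50/50 credence then gives a 50/50 portfolio.

```python
# Minimal sketch of the two-worlds portfolio argument. Assumptions for
# illustration only: both worlds share the returns curve f(x) = log(1 + x),
# and a fixed budget of 1 is split between actions that only pay off in S
# and actions that only pay off in L.
import numpy as np

def optimal_split(p_short, budget=1.0, grid=100001):
    """Share of the budget going to the 'short timelines' world S that
    maximizes expected returns  p*f(x) + (1 - p)*f(budget - x).
    At an interior optimum this is exactly the point where expected
    marginal returns from the two worlds are equal."""
    x = np.linspace(0.0, budget, grid)
    expected = p_short * np.log1p(x) + (1.0 - p_short) * np.log1p(budget - x)
    return x[np.argmax(expected)]

print(optimal_split(0.5))  # -> 0.5: 50/50 credence, identical curves => 50/50 split
print(optimal_split(0.6))  # -> 0.8: more credence in S shifts the portfolio toward S
```

How far the split shifts as your credence moves away from 50/50 depends on the shape of the returns curve; the 50/50 result itself only needs the two curves to be the same.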

I tried to figure out whether MIRI’s directions for AI alignment were good, by reading a lot of stuff that had been written online; I did a pretty bad job of thinking about all this.

I'm curious about why you think you did a bad job at this. Could you roughly explain what you did and what you should have done instead?

If you can manage it, head to the Seattle Secular Solstice on Dec 10, 2016. Many of us from Vancouver are going.

we prioritize research we think would be useful in less optimistic scenarios as well.

I don't think I've seen anything from MIRI on this before. Can you describe or point me to some of this research?

2
Jessica_Taylor
8y
I agree with Nate that there isn’t much public on this yet. The AAMLS agenda is predicated on a relatively pessimistic scenario: perhaps we won’t have much time before AGI (and therefore not much time for alignment research), and the technology AI systems are based on won’t be much more principled than modern-day deep learning systems. I’m somewhat optimistic that it’s possible to achieve good outcomes in some pessimistic scenarios like this one.
2
So8res
8y
There’s nothing very public on this yet. Some of my writing over the coming months will bear on this topic, and some of the questions in Jessica’s agenda are more obviously applicable in “less optimistic” scenarios, but this is definitely a place where public output lags behind our private research. As an aside, one of our main bottlenecks is technical writing capability: if you have technical writing skill and you’re interested in MIRI research, let us know.

Notice that the narrowest possible offset is avoiding an action. This perfectly undoes the harm one would have done by taking the action. Every time I stop myself from doing harm I can think of myself as buying an offset of the harm I would have done for the price it cost me to avoid it.

I think your arguments against offsetting apply to all actions. The conclusion would be to never avoid doing harm unless it's the cheapest way to help.

1
ClaireZabel
8y
Yep. Except I think this would be most of the time, since people tend to dislike it when you harm others in big or unusual ways, and doing so is often illegal. So at the very least you frequently take hits to your reputation (and the reputation of EA, theoretically) and effectiveness when you cause big, unusual harms.

The End of History Illusion sounds like what you're looking for.