Precisely. And supporting subsidized contraception is a long way away from both the formal definition of eugenics and its common understanding.
I feel that saying "subsidized contraception is not eugenics" is rhetorically better and more accurate than this approach.
>Most people endorse some form of 'eugenics'
No, they don't. It is akin to saying "most people endorse some form of 'communism'." We can point to a lot of overlap between theoretical communism and values that most people endorse; this doesn't mean that people endorse communism. That's because communism covers a lot more stuff, including a lot of historical examples and some related atrocities. Eugenics similarly covers a lot of historical examples, including some atrocities (not only in fascist countries), and this is what the term means to most people -...
Thanks, that makes sense.
I've been aware of those kinds of issues; what I'm hoping is that we can get a framework that includes these subtleties automatically (eg by having the AI learn them from observations or from human-published papers) without having to put it all in by hand ourselves.
Hey there! It is a risk, but the reward is great :-)
An AI that is aware that value is fragile will behave in a much more cautious way. This gives a different dynamic to the extrapolation process.
Most of the alignment research pursued by other EA groups (eg Anthropic, Redwood, ARC, MIRI, the FHI,...) would be useful to us if successful (and vice versa: our research would be useful for them). Progress in inner alignment, logical uncertainty, and interpretability is always good.
A fast increase in AI capabilities might result in a superintelligence before our work is ready. If the top algorithms become less interpretable than they are today, this might make our work harder.
Whole brain emulations would change things in ways that are hard to predict, and could make our approach either less or more successful.
A problem here is that values that are instrumentally useful can become terminal values that humans value for their own sake.
For example, equality under the law is very useful in many societies, especially modern capitalistic ones; but a lot of people (myself included) feel it has strong intrinsic value. In more traditional and low-trust societies, the tradition of hospitality is necessary for trade and other exchanges; yet people come to really value it for its own sake. Family love is evolutionarily adaptive, yet also something we value.
So just because some value has developed from a suboptimal system does not mean that it isn't worth keeping.
Nick Bostrom's "Superintelligence" is an older book, but still a good overview. Stuart Russell's "Human Compatible" is a more modern take. I touch upon some of the main issues in my talk here. Paul Christiano's excellent "What Failure Looks Like" tackles the argument from another angle.
Comment copied to new "Stuart Armstrong" account:
Different approaches. ARC, Anthropic, and Redwood seem to be more in the "prosaic alignment" field (see eg Paul Christiano's post on that). ARC seems to be focusing on eliciting latent knowledge (getting human-relevant information out of the AI that the AI knows but has no reason to inform us of). Redwood is aligning text-based systems and hoping to scale up. Anthropic is looking at a lot of interlocking smaller problems that will (hopefully) be of general use for alignment. MIRI seems to focus on some key f...
Comment copied to new "Stuart Armstrong" account:
Interesting! And nice to see ADT make an appearance ^_^
I want to point to where ADT+total utilitarianism diverges from SIA. Basically, SIA has no problem with extreme "Goldilocks" theories - theories that imply that only worlds almost exactly like the Earth have inhabitants. These theories are a priori unlikely (complexity penalty) but SIA is fine with them (if one theory is "only the Earth has life, but has it with certainty", while another is "every planet has life with some probability", t...
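The SIA weighting being gestured at can be sketched numerically. This is my own illustrative toy model, not the comment's actual calculation: all the priors, planet counts, and probabilities below are assumed for the example. SIA multiplies each theory's prior by its expected number of observers and renormalizes, so a Goldilocks theory whose observers exist with certainty gets no penalty beyond its (already small) complexity-penalized prior.

```python
# Toy SIA update over two hypothetical theories (all numbers assumed):
#  - "goldilocks": heavily complexity-penalized, but guarantees one
#    observer-bearing planet (Earth) with certainty.
#  - "simple": every one of N planets independently has life with
#    probability p, so the expected observer count is N * p.
N = 10**9            # number of planets (assumed)
p = 1e-9             # per-planet chance of life under "simple" (assumed)

priors = {
    "goldilocks": 1e-6,        # complexity penalty makes it a priori unlikely
    "simple": 1.0 - 1e-6,
}
expected_observers = {
    "goldilocks": 1.0,         # only Earth, but with certainty
    "simple": N * p,           # ~1 inhabited planet in expectation
}

# SIA: posterior ∝ prior × expected number of observers
unnormalized = {t: priors[t] * expected_observers[t] for t in priors}
total = sum(unnormalized.values())
sia_posterior = {t: v / total for t, v in unnormalized.items()}
print(sia_posterior)
```

With these particular numbers both theories expect about one observer, so SIA's posterior just equals the prior: the Goldilocks theory survives with exactly its complexity-penalized weight, which is the sense in which SIA is "fine with" such theories.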
I argue that it's entirely the truth, the way that the term is used and understood.