Yeah, I would be in favor of interaction in simulated environments -- others might disagree, but I don't think this influences the general argument very much, as I don't think leaving some matter for computers would reduce the number of brains by more than an order of magnitude or so.
Having a superintelligence aligned to normal human values seems like a big win to me!
Not super sure what this means, but the 'normal human values' outcome, as I've defined it, hardly contributes to the EV calculation at all compared to the utopia outcome. If you disagree with this, please look at the math and let me know if I made a mistake.
Yep, I didn't initially understand you. That's a great point! This means the framework I presented in this post is wrong. I now agree with your statement:
the EV of partly utilitarian AI is higher than that of fully utilitarian AI.
I think the framework in this post can be modified to incorporate this, and the conclusions are similar. The quantity that dominates the utility calculation is now the expected representation of utilitarianism in the AGI's values. The two handles become:

(1) The probability of misalignment.

(2) The expected representation of utilitarianism in the moral parliament, conditional on alignment.

The conclusion of the post, then, should be something like "interventions that increase (2) might be underrated" instead of "interventions that increase the probability of fully utilitarian AGI are underrated."
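To make the revised framework concrete, here is a minimal sketch of the EV calculation it implies. All numbers are hypothetical placeholders, not estimates from the post:

```python
# Illustrative only: the revised framework says EV is driven by
# P(alignment) times the expected representation of utilitarianism
# in the AGI's values, conditional on alignment. Numbers are made up.
p_alignment = 0.5                   # 1 minus handle (1), the misalignment probability
expected_util_representation = 0.1  # handle (2), conditional on alignment
value_fully_utilitarian = 1e15      # utility of a fully utilitarian outcome (placeholder)

ev = p_alignment * expected_util_representation * value_fully_utilitarian
```

The point of the sketch is that interventions raising `expected_util_representation` scale EV linearly, which is why the post's conclusion shifts to interventions that increase (2).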
Yep, thanks for pointing that out! Fixed it.
...I haven't seen much discussion about the downsides of delaying
I'm not sure how your first point relates to what I was saying in this post, but I'll take a guess. I said something about how investing in capabilities at Anthropic could be good. An upside would be increasing the probability that EAs end up controlling a superintelligent AGI in the future. The downside is that it could shorten timelines, though hopefully this can be mitigated by keeping all of the research under wraps (which is what they are doing). This is a controversial issue, though. I haven't thought much about whether the upsides outweigh the downsides, but the argument in this post made me believe the upsides are larger than I previously thought.
Also I'm not sure about outcome 1 having zero utility...
It doesn't matter which outcome you assign zero utility to, as long as the relative values are the same: if one utility function is a positive affine transformation of another, the two produce equivalent decisions.
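This invariance is easy to check directly. A quick sketch, with hypothetical outcome labels and utilities, showing that a positive affine transformation (u' = a·u + b, a > 0) leaves the chosen option unchanged:

```python
# Hypothetical utilities for the three outcomes discussed in the post.
utilities = {"misaligned": 0.0, "normal_values": 1.0, "utopia": 1e15}

def best_option(u):
    """Return the outcome with the highest utility."""
    return max(u, key=u.get)

# Positive affine transformation: u' = a*u + b, with a > 0.
a, b = 3.0, -7.0
shifted = {k: a * v + b for k, v in utilities.items()}

# The decision is unchanged, even though the zero point moved.
assert best_option(utilities) == best_option(shifted)
```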
I agree with Zach Stein-Perlman. I did some BOTECs to justify this (see 'evaluating outcome 3'). If a reasonable candidate for a 'partially utilitarian AI' leads to an outcome with 10 billion happy humans per star on average, then an AI that uses every last joule of energy to produce positive experiences would generate at least ~10^15 times more utility.
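One rough way to see where a factor like 10^15 could come from (order-of-magnitude only, assuming a Sun-like star and brain-scale power per experience):

```python
# Hedged BOTEC: how many brain-equivalents could a star's full output power,
# compared to 10 billion biological humans? All figures are rough.
solar_luminosity_w = 3.8e26   # power output of a Sun-like star, in watts
brain_power_w = 20.0          # approximate power draw of a human brain, in watts
happy_humans_per_star = 1e10  # the 'partially utilitarian' scenario

brain_equivalents = solar_luminosity_w / brain_power_w   # ~2e25
ratio = brain_equivalents / happy_humans_per_star        # ~2e15
```

So even this crude energy accounting lands above 10^15, consistent with the BOTEC in the post.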
These are great! I'll add that you should be careful not to overbook yourself. I would leave an hour and a half in the middle of the day open in case you want to take a nap.
This could be helpful. Maybe posting questions on the EA forum and allowing the debate to happen in the comments could be a good format for this.
Got it! I edited the point about in-person reading so that it provides a more accurate portrayal of what you all are doing.