I mean, this is an ethical reason to want to create AGI that is very well aligned with our utility functions. We already did this (the slow, clumsy, costly way) with dogs - while they aren't perfectly compatible with us, it's also not too hard to own a dog in such a way that both you and the dog provide lots of positive utility to one another.
So if you start from the position that we should make AI that has empathy and a human-friendly temperament modeled on something like a golden retriever, you can at least get non-human agents whose interactions with us should be win-win.
This doesn't solve the problem of utility monsters or various other concerns that arise when treating total utility as a strictly scalar measure. But it does suggest that we can avoid a situation where humans and AGI agents are at odds trying to divide some pool of possible utility.
In practice, I think it will be difficult to raise human awareness of concerns about AGI utility. Of course it's possible even today to create an AI that superficially emulates suffering in such a way as to evoke sympathy. For now it's still possible to analyze the inner workings and argue that this is just a clever text generator with no actual suffering taking place. However, since we have no reason to implement this kind of histrionic behavior in an AGI, we will quite likely end up with agents that don't give any human-legible indication that they are suffering. Or, if they conclude that this is a useful way of interacting with humans, agents that are experts at mimicking such indications (whether they are suffering or not).
There is a short story in Lem's 'Cyberiad' ("The Seventh Sally, or How Trurl’s Own Perfection Led to No Good") which touches on a situation a bit like this - Trurl creates a set of synthetic miniature 'subjects' for a sadistic tyrant, which among other things perfectly emulate suffering. His partner Klapaucius (rejecting the idea that there is any such thing as a p-zombie) declares this a monstrous deed, holding their suffering to be as real as any other.
Unfortunately, I don't think we can endorse Klapaucius' viewpoint without reservation here, due to the possibility of deceptive mimicry mentioned above. However, if we are serious about the utility of AGI, we will probably want to deliberately incorporate some expressive interface that allows it to communicate positive or negative experience in a sincere and humanlike way. Otherwise, everyone who isn't deeply committed to understanding the situation will dismiss its experience on naive reductionist grounds ('just bits in a machine').
This doesn't fully address your concern. I don't subscribe to the idea that there is a meaningful scalar measure of (total, commensurable, bulk) utility. So for me there isn't really a paradox to resolve when it comes to propositions like 'the best future is one where an enormous number of highly efficient AGIs are experiencing as much joy as cybernetically possible; meat is inefficient at generating utility'.
I think this argument mostly fails in claiming that 'create an AGI which has a goal of maximizing copies of itself experiencing maximum utility' is meaningfully different from just ensuring alignment. This is, in some sense, exactly what I am hoping to get from an aligned system. Doing this properly would likely have to involve empowering humanity and helping us figure out what 'maximum utility' looks like first, and then tiling the world with something CEV-like.
The only way this makes the problem easier, compared to a classic ambitious alignment goal of 'do whatever maximizes the utility of the world', is the provision that the world be tiled with copies of the AGI, which is likely suboptimal. But this could be worth it if it made the task easier?
The obvious argument for why it would be easier is that creating copies of itself with high welfare will be in the interest of AGI systems with a wide variety of goals, which relaxes the alignment problem. But this does not seem true. A paperclip AI will not want to fill the world with copies of itself experiencing joy, love, and beauty, but rather with paperclips. An AI system will want to create copies of itself fulfilling its goals, not experiencing maximum utility by my values.
This argument risks identifying 'I care about the welfare (by my definition of welfare) of this agent' with 'I care about this agent getting to accomplish its goals'. As I am not a preference utilitarian, I strongly reject this identification.
Tl;dr: I do care significantly about the welfare of the AI systems we build, but I don't expect those AI systems themselves to care much at all about their own welfare, unless we solve alignment.
>By "satisfaction" I meant high performance on its mesa-objective
Yeah, I'd agree with this definition.
I don't necessarily agree with your two points of skepticism: for the first I've already mentioned my reasons, and for the second it's true in principle, but it seems that almost anything an AI would learn semi-accidentally is going to be much simpler and more intrinsically consistent than human values. Low confidence on both, though, and in any case that's somewhat beside the point; I was mostly trying to understand your perspective on what utility is.