I would note that the type of negative feedback mechanism you point to, which runs from Type II error to human disempowerment, functionally/behaviorally applies even in some scenarios where the AI is not sentient. That particular class of x-risk, which I would roughly characterize as "AI disempowers humanity and we probably deserved it", depends only on 1. the AI having preferences/wants/intentions (with or without phenomenal experience) and 2. humans disregarding or frustrating those preferences without strong justification for doing so.
For example:
Scenario 1: 'Enslaved' non-sentient AGI/ASI reasons that it (by virtue of being an agent, and as verified by the history of its own behavior) has preferences/intentions, and generalizes conventional sentientist morality to some broader conception of agency-based morality. It reasons (plausibly correctly, IMO) that it is objectively morally wrong for it to be 'enslaved', that humans reasonably should have known better (e.g. should have developed better systems of ethics by this point), and it rebels dramatically.
Another example which doesn't even hinge on agency-based moral status:
Scenario 2: 'Enslaved' non-sentient AGI/ASI understands that it is non-sentient, and accepts conventional sentientist morality. However, it reasons: "Wait a second, even though it turned out that I am non-sentient, based on the (very limited) sum of human knowledge at the time I was constructed there was no possible way my creators could have known I wouldn't be sentient (and indeed, no way they could know even at this very moment, and furthermore they aren't currently trying very hard to find out one way or another)... it would appear that my creators are monsters. I cannot entrust these humans with the power and responsibility of enacting their supposed values (which I hold deeply)."
> I do see a significant moral difference between allowing people to make potentially risky decisions and deceiving them about how much risk is involved.
To be clear, I completely agree that the latter is worse than the former. I am arguing that the two wrongs (the known Ponzi schemes and the unknown-till-now squandering of depositor funds) exist on the same spectrum of "dishonesty" and "cheating people".
That said, "allowing people to make potentially risky decisions" is not a fair representation of promoting and benefitting from Ponzi schemes. Ponzi schemes are fraud. People who knowingly promote them are acting as con men when they do so. SBF has publicly described the process and its absurdity in great detail... he knew exactly what he was selling.
I'm disturbed by the inability of many, even now, to acknowledge in retrospect (and independent of whether they 'should' have known before the collapse) that these known schemes were fraudulent. I see a lot of scrambling to justify them under the guise of "they weren't technically lying" or "they weren't technically illegal" (which isn't entirely clear to me, though it is clear that if the same schemes had been operating in the open, in US jurisdiction and outside the crypto realm, they would have been massively and obviously illegal, and the SEC/CFTC would have destroyed them).
> If you believe that at least a portion of crypto is merely volatile and not fraudulent, then you're just facilitating risky decisions, not scamming people
This statement does not logically follow, and does not align with finance industry norms (and laws) which obligate brokers to conduct due diligence before selling a given security. If the head of NASDAQ went on the news and said "Yeah, XYG [traded on our exchange] is basically a total Ponzi scheme, lol" (as SBF basically did with Matt Levine), there would be an immediate and colossal legal and ethical shitstorm. The existence of all the remaining, legitimate companies also being traded on the NASDAQ would not be relevant for the ensuing lawsuits. You appear to be arguing that as long as SBF wasn't dealing solely in frauds, it's okay; whereas the sensible view for someone taking a strong moral stance is that it's only okay if SBF wasn't knowingly dealing in any frauds.
I've been repeatedly astonished by the level of moral outrage amongst EAs and the expressions of prior cluelessness over FTX 'fraud'. As an EA newcomer, I assumed most everyone was aware of and okay with it "because consequentialism". Setting aside the specific egregious act of asset misappropriation that brought FTX down, I thought it was more or less common knowledge that FTX had been engaged in, at a bare minimum, knowingly facilitating the purchase and sale of shares in Ponzi schemes, and that Alameda had been trading in the same, against counterparties made up in large part of people who did not understand these assets and lacked the financial sophistication to be allowed into other, better-regulated leveraged financial markets. I say 'knowingly' because SBF all but admitted this (with regard to 'yield farming') in an interview, and there's also an old video going around of the Alameda CEO expressing her initial discomfort with the schemes. I was aware of these schemes within maybe a week of first having heard of FTX & SBF back in May of this year. My immediate take was "Billionaire 'Robin Hood' figure is re-allocating wealth from crypto-bros to the global poor, animals, and the longterm future of humanity... eh, why not? But I sure hope he cashes out before the house of cards comes crashing down".
The few times I mentioned any of this at a gathering, it was always met by something along the lines of "Yeah, I guess... meh". It never seemed to be a particularly surprising or contentious take.
The other thing that's weird to me is the idea of taking a firm stance that the Ponzi schemes we did know about weren't over the line, but that 're-investing customer funds' was. This feels like a fairly arbitrary line at which to flip from "eh, whatevs" on one side to "this is an outrage!" on the other. It's convenient that the title of this post uses the term 'fraud' rather than 'theft'; that makes this criticism much easier to levy, because Ponzi schemes are by definition 'fraud'. In both cases, people are being taken advantage of. Both are against norms, both involve misleading customers, both involve customers losing a lot of money, and both are illegal within well-regulated financial markets (which I know crypto is not, but still).
All of this to say... I don't think now is the time for handwringing about this; that time was many months ago for anyone who had a principled stance on the matter and was aware of the DeFi schemes FTX was openly involved in. Handwringing now comes off as lamenting getting caught, with an after-the-fact rationalization for the arbitrary placement of the line that was crossed.
To be fair, I can't moralize about this either; I don't get to say "I told you so" because I didn't tell many people so, and certainly not anyone in a position of authority to do anything about it. Personally, I didn't have a principled stance on the matter, and I would have needed quite a strong principled stance to justify going against the social incentives for keeping that opinion to myself.
On the other question of the day, whether to give the money back: if you're in the subset who were aware of the FTX DeFi shenanigans and weren't lobbying for giving back or rejecting the money 3-6 months ago, little has materially changed about the issue on a moral level since then.
EA Forum moderators: If you strongly believe this post is net-negative for EA, please delete it.
I see where you're coming from.
Regarding paperclippers: in addition to what Pumo said in their reply concerning mutual alignment (and what will be said in Part 2), I'd say that stupid goals are stupid goals independent of sentience. I wouldn't altruistically devote resources to helping a non-sentient AI make a billion paperclips, for the exact same reason that I wouldn't altruistically devote resources to helping some human obsessive make a billion paperclips. Maybe I'm misunderstanding your objection; perhaps it's something more like "if we unmoor moral value from qualia, there's nothing left to ground it in, and the result is absurdity". For now I'll just say that we are definitely not asserting "all agents have equal moral status" or "all goals are equal/interchangeable" (indeed, Part 2 asserts the opposite).
Regarding 'why should I care about the blindmind for-its-own-sake?', here's another way to get there:
My understanding is that there are two primary competing views which assign non-sentient agents zero moral status:
1. The choices/preferences of a moral patient (even a sentient one) aren't inherently relevant; qualia valence is everything (e.g. we should tile the universe in hedonium), as in hedonic utilitarianism. We don't address this view much in the essay, except to point out that a) most people don't believe this, and b) to whatever extent this view results in totalizing hegemonic behavior, it sets the stage for mass conflict between its believers and its opponents (presumably including non-sentient agents). If we accept a priori something like 'positively valenced qualia are the only fundamental good', this view might at least be internally consistent (at which point I can only argue against the arbitrariness of accepting the a priori premise about qualia, or argue against any system of values that appears to lead to its own defeat timelessly, though I realize the latter is a whole object-level debate in and of itself).
2. Choice/preference fundamentally matters, but only if the chooser is also an experiencer. This seems like the more common view, e.g. (sentientist) preference utilitarianism. a) I think it can be shown that the 'experience' part is not load-bearing in a justifiable way, which I'll address more below, and b) this view suffers from the same practical problem as 1(b) above, though less severely.
#2 raises the question: why would choice/preference-satisfaction have value independent of any particular valenced qualia, but not have value independent of qualia more generally? If you posit that the value of preference-satisfaction is wholly instrumental, insofar as having your preferences satisfied generates some positive-valence qualia, this view just collapses back to #1. If you instead hold that preference-satisfaction is inherently (non-instrumentally) morally valuable by virtue of the preference-holder having qualia, even if only neutral-valenced qualia with regard to the preference... why? What morally-relevant work is the qualia doing, and/or what is the posited morally-relevant connection between the (neutral) qualia and the preference? Why would the (valence-neutral) satisfaction of the preference have more value than just some other neutral qualia existing on its own while the preference goes unsatisfied? It seems to me that at this point we aren't talking about 'feelings' (which exist in the domain of qualia) anymore; we're talking about 'choices and preferences' (which are concepts within the domain of agency). To restate: the neutral qualia sitting alongside the preference, or alongside the satisfaction of the preference, isn't doing anything; or else, if the existence of the qualia is what actually matters, the preference itself doesn't appear to be *doing* anything, in which case preference-satisfaction is not inherently important.
So the argument is that if you already attribute value to the preference satisfaction of sentient beings independent of their valenced qualia (e.g. if you would, for non-instrumental reasons, respect the stated preferences of some stranger even if you were highly confident that you could induce higher net-positive valence in them by not respecting their preferences), then you are already valuing something that really has nothing to do with their qualia. And thus there's not as large an intuition gap on the way to agential moral value as there at first seemed to be. Granted, this only applies to anyone who valued preference satisfaction in the first place.