Exploring consciousness, AI alignment, moral psychology and how they interact in decision-making.
Always happy to chat!
Asking whether a process is "close enough [to the brain] to produce the same effect" implicitly begs the question - i.e. assumes consciousness is biological.
P-zombies who wouldn't describe their sensations in terms like "qualia" would likely have evolutionary fitness equal to humans'. I don't know if they're possible, but I think the thought experiment demonstrates evolution wasn't optimizing for consciousness. Therefore, we shouldn't ask "is such a system sufficiently close to the brain" but "is it sufficiently close to the processes that happen to make the brain (phenomenally) conscious".
In general, there isn't agreement about any correlate of consciousness within philosophy of mind - there are well-regarded thinkers who claim it's not real (Frankish) or that it's the basic substance of the universe (Goff). I think it's possible consciousness is similar to, say, intelligence or humor, which means you need a complex system to meaningfully implement it. However, I think it's unlikely that "complexity itself" is what gives rise to consciousness - e.g. sunspots are very complex (~unpredictable interaction of many elements).
I'm not convinced by Anil Seth's narrative about our biases in mind attribution.
I've been to his talk where he summarized these points. He talked about our inherent tendency to emotionally relate to entities that can use language. Later, he presented a picture of a transistor and a picture of a monkey and asked which seems more conscious on priors.
The prime mechanism by which humans decide whether an entity is valuable and conscious is empathy. We evolved to feel empathy - that is, to model "what it is like to be them" - towards entities that have faces, limbs, fur and a squishy body. We feel a lot of empathy for pets and babies - entities that can't use language. And we feel zero empathy for the Chinese room.
The argument relies a lot on trying to depict computers as something rigid, cold and dead and life as something interesting, warm and energetic. This works well for our empathy module but does not convince me as a philosophical argument.
I'm curious whether there's any definition of the brain's processes as "non-algorithmic" that doesn't end up in Russellian monism (which I'm inclined to support but suspect Seth isn't). Aren't the laws of physics themselves an algorithm? I see autopoiesis as the most interesting connection between consciousness and life, but precisely when you find a clear conceptualization like this, it becomes unclear what's supposed to be non-algorithmic about it.
We have seen an order-of-magnitude increase in interest in AI alignment, according to Google Trends. Part of it (the July peak) can be attributed to Grok's behavior (see my little analysis). The YouTube channel AI in Context correctly identified this opportunity and swiftly released a viral video explaining how the incident connects to alignment. The September peak might be attributed to the release of If Anyone Builds It.
Fortunately, the WWOTF link still works: https://whatweowethefuture.com/wp-content/uploads/2023/06/Climate-Change-Longtermism.pdf
Alternatively, it loads a little faster on Web Archive: https://web.archive.org/web/20250426191314/https://whatweowethefuture.com/wp-content/uploads/2023/06/Climate-Change-Longtermism.pdf
I disagree with your argumentation but agree there's quite a significant (e.g. 6.5%) chance that you're correct about the thesis that consciousness has causal efficacy through quantum indeterminacy and that this might be helpful for alignment.
However, my take is that if the effects were very significant and similarly straightforward, they would be scientifically detectable even with very simple, fun experiments like the Global Consciousness Project. It's hard to imagine "selection" operating across possibly infinite universes and planets over billions of years - but if you manage to imagine it, the "coincidences" that brought about life can easily be explained by the anthropic principle.
I see this as a more general lesson: people are often overconfident about a theory because they can't imagine an alternative. When it comes to consciousness, the whole debate comes down to the extent to which something that seems impossible to imagine is a failure of imagination vs a failure of the theory. Personally, I give most weight to Russellian monism, but I definitely recommend leaving some room for reductionism, especially if you don't see how anyone could possibly believe it - as was the case for me before I deeply engaged with the reductionist literature.
But I'm glad whenever people aren't afraid to be public about weird ideas - someone should be trying this and I'm really curious whether e.g. Nirvanic AI finds anything.
The MechaHitler incident seems to have worked as something of a warning shot: Google interest in AI risk has reached an absolute all-time high. Trump's AI plan came out on the same day, but the comparison suggests Grok accounts for ~70% of the peak.
I can't quite dismiss the possibility that the interest was instead driven by the new Chinese AI norms - Chinese users have to rely on VPNs, so the geographic data isn't trustworthy. However, if this were true, I would expect the number of Google searches for AI risk in Chinese to be higher than roughly zero (link).
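If anyone wants to poke at the numbers themselves, here's a minimal sketch of the kind of comparison I mean, assuming the pytrends library; the keywords, timeframe and the peak-vs-baseline heuristic are illustrative placeholders, not the exact queries behind my analysis:

```python
# Minimal sketch of a Google Trends comparison via pytrends (assumed installed:
# pip install pytrends). Keywords and timeframe are illustrative placeholders.
from pytrends.request import TrendReq

pytrends = TrendReq(hl="en-US", tz=0)

# Worldwide weekly interest for two related terms over 2025.
pytrends.build_payload(["AI risk", "AI alignment"], timeframe="2025-01-01 2025-10-01")
interest = pytrends.interest_over_time()

# Crude peak attribution: compare the peak week against the median of the weeks before it.
series = interest["AI risk"]
peak_week = series.idxmax()
baseline = series.loc[:peak_week].iloc[:-1].median()
print(f"Peak week: {peak_week.date()}, peak / baseline ratio: {series.max() / max(baseline, 1):.1f}")

# Re-running build_payload with a Chinese-language keyword would show whether
# searches in Chinese contribute meaningfully - country-level geo data is less
# reliable here because of VPN use.
```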
I think an objective ordering does imply "one should", so I subscribe to moral realism. However, I've recently come to appreciate the importance of your insistence that the "should" part is kind of fake - i.e. it means something like "action X is objectively the best way to create the most value from the point of view of all moral patients" but it doesn't imply that an ASI that figures out what is morally valuable will be motivated to act on it.
(Naively, it seems like if morality is objective, there's basically a physical law formulated as "you should do actions with characteristics X". Then, it seems like a superintelligence that figures out all the physical laws internalizes "I should do X". I think this is wrong mainly because, in human brains, that sentence deceptively seems to imply "I want to do X" (or perhaps "I want to want X"), whereas it actually means "Provided I want to create maximum value from an impartial perspective, I want to do X". In my own case, the kind of argument for optimism about AI doom that @Bentham's Bulldog advocated on Doom Debates seemed a bit more attractive before I truly spelled this out in my head.)
My impression is that CEA's goal is to fund the meta cause area and the main goal of local groups is to organize events. While funding is hard to democratize unless you convince some billionaire, democratizing the organizations that run events is trivial. [Edit: Also, while it makes sense to organize local events directly based on the local community's preferences / demand, I think it makes sense to take a more top-down (principles-oriented) approach when it comes to distributing funding, because the "demand side" here comprises every person on the planet who appreciates money.]
But now I do realize that in my head, I equated CEA with OpenPhil's wing for the meta cause area, which might not be accurate. I also feel good about democratizing CEA if I imagine it implemented as an indirect democracy (i.e. with local organizations voting instead of every EA member). This probably moves me towards the middle of the poll - i.e. I would be in favor of this kind of democracy. Indirect democracy would reduce the problem of uninformed voters, the problem of having to deal with problems publicly, and the problem of the imbalance in the level of reflection between the average member and highly engaged members.
I endorse the temperature approach. I'm not sure illusionists would accept the question "What's the % probability that an entity is conscious?" as meaningful, but maybe a similar question could indeed be universally accepted, like "Taking your pain from being poked by a needle as intensity 1, what's your central estimate of the intensity of suffering experienced in scenario X?"
Just to clarify, my argument didn't concern classical p-zombies but what I call "honest p-zombies" - intelligent humanoid entities capable of metacognition but without any intuition similar to our phenomenal intuitions.