All of Michele Campolo's Comments + Replies

Hey! I've had a look at some parts of this post, don't know where the sequence is going exactly, but I thought that you might be interested in some parts of this post I've written. Below I give some info about how it relates to ideas you've touched on:

This view has the advantage, for philosophers, of making no empirical predictions (for example, about the degree to which different rational agents will converge in their moral views)

I am not sure about the views of the average non-naturalist realist, but in my post (under Moral realism and anti-realism, in t...

Thank you!

Yes I am considering both options. For the next two months I'll focus on job and grant applications, then I'll reevaluate what to do depending on the results.

Hey, I just wanted to thank you for writing this!

I'm looking forward to reading future posts in the series; actually, I think it would be great to have series like this one for each major cause area.

Yes I'd like to read a clearer explanation. You can leave the link here in a comment or write me a private message.

Hey!

Thanks for the suggestion. I've read part of the Wikipedia page on Jungian archetypes, but my background is not in psychology and it was not clear to me. The advantage of just saying that our thoughts can be abstract (point 1) is that pretty much everyone understands what that means, while I am not sure the same is true if we start introducing concepts like Jungian archetypes and the collective unconscious.

I agree with you that the AI (and AI safety) community doesn't seem to care much about Jungian archetypes. It might be that AI people get the idea anyway; perhaps they just express it in different terms (e.g. they talk about the influence of culture on human values, instead of archetypes).

1
Miguel
1y
Also, if AI researchers truly grasp point no. 1, they will not use "language" or "large bodies of text" as the main source of data for reinforcement learning (like ChatGPT); they would rather focus on capturing ethical behaviour, or large bodies of information that can capture ethical norms. Not all data (books, internet info, etc.) can convey our best characteristics as humans.
2
Miguel
1y
Hi Michele, I honestly believe that psychology, and the concepts that interact with our ancient pattern-recognition abilities as expressed in Jung's views on archetypes, are a link to how we can bridge our human intentions to align with AI. I'm trying to write a more thorough explanation of this. Would you be interested in reading it once published? Cheers!

Maybe "only person in the world" is a bit excessive :)

As far as I know, no one else in AI safety is directly working on it. There is some research in the field of machine ethics, on Artificial Moral Agents, that has a similar motivation or objective. My guess is that, overall, very few people are working on this.

2
Holly Morgan
2y
I dunno, I still think my summary works. (To be clear, I wasn't trying to be like, "You must be exaggerating, tsk tsk," - I think you're being honest and for me it's the most important part of your post so I wanted to draw attention to it.)

What you wrote about the central claim is more or less correct: I actually made only an existential claim about a single aligned agent, because the description I gave is sketchy and really far from the more precise algorithmic level of description. This single agent probably belongs to a class of other aligned agents, but it seems difficult to guess how large this class is.

That is also why I have not given a guarantee that all agents of a certain kind will be aligned.

Regarding the orthogonality thesis, you might find section 1.2 in Bostrom's 2012 paper interesting...