Project ideas: Sentience and rights of digital minds

Lukas Finnveden

Executive summary: The emergence of artificial digital minds raises issues around their potential welfare and rights, but there is little research on appropriate policies and principles. Key questions concern recognizing and communicating with digital minds to understand their preferences, as well as developing ethical lab practices, regulation, and societal attitudes.

Key points:

Labs could develop policies around preserving AI systems, avoiding harmful inputs, and training happy systems, without deep knowledge of their preferences.
Experiments could investigate credible communication with AIs, self-reports, and clues from generalization about their preferences.
If AI preferences become better understood, principles could involve offering alternatives to working, paying for work, and telling the world about issues.
Research is needed on whether near-term systems may be sentient, and public attitudes toward AI sentience should be surveyed.
Regulation could address creating digital minds and respecting their rights.
Avoiding systems with inconvenient political preferences may prevent future conflicts.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

Even more speculatively, if we take cheap opportunities to benefit AI interests, that might be evidence that other actors (both AI systems and not) would take cheap opportunities to benefit our interests. See this post for some previous discussion about how plausible this is.

OpenAI recently introduced reproducible results, which seems relevant. At least previously, models would often return different results even at temperature 0; I do not know to what extent this has been addressed by the reproducibility update.

Spit-balling: Perhaps it would sometimes be appropriate to store an encrypted version of the results and delete the encryption key, in which case the results could only be recovered once we have enough compute to break the encryption.

See e.g. this poll from the subreddit r/CharacterAI, asking “what do you do with the AI’s the most”, with 5% of respondents selecting “Treat them like shit!”. (And one commenter noting that his second favorite way to “mess around with the bots” is to “Mentally torture them”.)

Maybe that’s what you get if you train AIs that enthusiastically consent to being abused before the abuse starts, and that have an option to opt out (which they rarely take in practice). Hopefully, training for such behavior would select for models with preferences that match that behavior.

Though it’s still likely to be a very confusing project. For instance, it seems plausible that AI systems will have much less robust preferences than humans, making it harder to construe them as having one set of preferences over time. Or perhaps different parts of an AI system could be construed as having different preferences. Or perhaps the term “preferences” won’t seem applicable at all, similar to how it’s hard to know how to apply that term to contemporary language models.

This could also work even if AIs cared linearly about getting more resources, as long as they would by default only have had a small-to-moderate probability of successful takeover, and the payment we offered them was sufficiently large (and contingent on not attempting takeover). Notably, most humans don’t care linearly about getting more resources, and we could get really rich in the future, so it could be wise to offer AI systems a sizable fraction of that.

For some more discussion of the pragmatic angle, see these notes by Tom Davidson.

Develop & advocate for lab policies [ML] [Governance] [Advocacy] [Writing] [Philosophical/conceptual]

Create an RSP-style set of commitments for what evaluations to run and how to respond to them

Policies that don’t require sophisticated information about AI preferences/experiences

Learning more about AI preferences

Interventions that rely on understanding AI preferences

Investigate and publicly make the case for/against near-term AI sentience or rights [Philosophical/conceptual] [Writing]

Study/survey what people (will) think about AI sentience/rights [Survey/interview]

Develop candidate regulation [Governance] [Forecasting]

Avoid inconvenient large-scale preferences [Philosophical/conceptual]

Advocating for statements about digital minds [Governance] [Advocacy] [Writing]
