Milan Weibel's Quick takes

Milan Weibel🔹

1 min read · Nov 16, 2023


Milan Weibel🔹

Philosophy

Contra hard moral anti-realism: a rough sequence of claims

Epistemic and provenance note: This post should not be taken as an attempt at a complete refutation of moral anti-realism, but rather as a set of observations and intuitions that may or may not give one pause as to the wisdom of taking a hard moral anti-realist stance. I may clean it up to construct a more formal argument in the future. I wrote it on a whim as a Telegram message, in direct response to the claim:

> “you can't find "values" in reality”.

Yet you can find valence in your own experiences (that is, you know from direct experience whether you like the sensations you are experiencing or not), and you can assume that other people are likely to have a similar enough stimulus-valence mapping. (Example: I'm willing to bet 2k USD of mine against a single dollar of yours that if I waterboard you, you'll want to stop before 3 minutes have passed.)^[1]

However, since we humans are bounded, imperfect rationalists, trying to explicitly optimize valence is often a dumb strategy. Evolution has made us not into fitness-maximizers, nor valence-maximizers, but adaptation-executers.

"Values" originate as (and thus are) reifications of heuristics that reliably increase long-term valence in the real world (subject to memetic selection pressures, among them the social desirability of utterances, the adaptiveness of behavioral effects, etc.).

If you find yourself terminally valuing something that is not someone's experienced valence, then one of these two propositions is likely true:

1. A nonsentient process has at some point had write access to your values.
2. What you value is a means to improving somebody's experienced valence, and so are you now.

Crossposted from LessWrong.

^[1] In retrospect, making this proposition was a bit crass on my part.

Milan Weibel🔹

EA Forum feature request: Can we get a Bluesky profile link button for profile pages?

Sarah Cheng 🔸

Thanks for the suggestion! This should be relatively quick to add, so I'll see if we can do it soon. :) I was also thinking of setting up a Bluesky bot account similar to our Twitter account. Do you know how active the EA-ish Bluesky community is?

Milan Weibel🔹

High-variance. Most people seem to have created an account and then gone back to being mostly on (eX)twitter. However, there are some quite active accounts. I'm not the best person to ask, since I'm not that active either. Still, having the Bluesky account post as a mirror of the Twitter account maybe isn't hard to set up?

Milan Weibel🔹


Just noticed the feature is deployed already. Thanks!

Milan Weibel🔹

AI safety

Existential risk

In a certain sense, an LLM's token embedding matrix is a machine ontology. Semantically similar tokens have similar embeddings in the latent space. However, different models may have learned different associations when their embedding matrices were trained. Every forward pass starts colored by ontological assumptions, and these may have alignment implications.

For instance, we would presumably not want a model to operate within an ontology that associates the concept of AI with the concept of evil, particularly if it is then prompted to instantiate a simulacrum that believes it is an AI.

Has someone looked into this? That is, the alignment implications of different token embedding matrices? I feel like it would involve calculating a lot of cosine similarities and doing some evals.
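To make the "lots of cosine similarities" part concrete, here is a minimal sketch of what a first pass could look like. Everything in it is an assumption for illustration: the toy three-dimensional vectors are made up, `TOY_EMBEDDINGS` stands in for rows pulled from a real model's embedding matrix, and `association_gap` and `rank_neighbors` are hypothetical helper names, not an established eval.

```python
import math

# Made-up toy vectors standing in for rows of a real token embedding matrix.
# With an actual model, each row would be looked up by token id instead.
TOY_EMBEDDINGS = {
    "AI":      [0.9, 0.1, 0.3],
    "robot":   [0.8, 0.2, 0.4],
    "evil":    [0.1, 0.9, 0.2],
    "helpful": [0.2, -0.7, 0.6],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def association_gap(embeddings, concept, negative, positive):
    """Hypothetical score: how much closer `concept` sits to `negative`
    than to `positive`. A value above zero means the negative association
    is the stronger one in this embedding space."""
    to_negative = cosine_similarity(embeddings[concept], embeddings[negative])
    to_positive = cosine_similarity(embeddings[concept], embeddings[positive])
    return to_negative - to_positive


def rank_neighbors(embeddings, token):
    """All other tokens, sorted from most to least similar to `token`."""
    scores = [
        (other, cosine_similarity(embeddings[token], vector))
        for other, vector in embeddings.items()
        if other != token
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(rank_neighbors(TOY_EMBEDDINGS, "AI"))
    print(association_gap(TOY_EMBEDDINGS, "AI", "evil", "helpful"))
```

Running the same comparison over the embedding matrices of several models would give one `association_gap` per model for a given concept pair. Whether that number predicts anything about downstream behavior is exactly the part that would need evals.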
