AI Safety 101


It's always hard to balance brevity and de-jargonification in the answers against touching on every topic that people might be confused about, mainly because we expect people with a wide range of technical and non-technical backgrounds to interact with the answers. This means that we try to limit each answer to the essentials required for understanding the topic.

All that being said, I also don't really know whether there is a major difference between polysemanticity and superposition. I am also unsure whether monosemantic vs. polysemantic neurons refer to the same underlying concept as disentangled vs. distributed representations, because all of these concepts sound like they describe the same thing. I took note of your comment in the thread about that particular answer, and will get back to it when I learn more.

If you have any resources, posts, or papers that show they are indeed different, let me know and I'll write up a new answer, something like "What is the difference between polysemantic neurons and superposition?" or "What is the difference between solving polysemanticity in neurons and disentangling latent space representations?"

Alternatively, if you come across some awesome explanation and feel like writing it up, you could just edit the Stampy docs on superposition or polysemanticity yourself.