Eleni_A

Pursuing a doctoral degree (e.g. PhD)
Seeking work
161 · Joined Jun 2022

Bio

PhD student. AI safety, cognitive science, history and philosophy of science/technology. 

Sequences (3)

Alignment Theory Series
Between pure reason and effectiveness
HPS FOR AI SAFETY

Comments (5)

Topic Contributions (1)

Both Redwood and Anthropic have labs and do empirical work. This is also an example of experimental work: https://twitter.com/Karolis_Ram/status/1540301041769529346

I don't think it's restricted to agentic technologies; my model applies to all technologies that involve risk. My toy example: even producing a knife requires the designer to think about its dangers in advance and propose precautions.

Five types of people on AI risk:

  1. Wants AGI as soon as possible, ignores safety.
  2. Wants AGI, but primarily cares about alignment.
  3. Doesn't understand AGI/doesn't think it'll happen anytime during her lifetime; thinks about robots that might take people's jobs.
  4. Understands AGI, but thinks the timelines are long enough not to worry about it right now.
  5. Doesn't worry about AGI; thinks that getting locked in to our choices and "normal accidents" are both more important/risky/scary.

Thank you, that's great. I'd be keen to start a project on this. Whoever is interested, please DM me and we can start brainstorming, form a group, etc.