From my understanding, AI alignment is difficult because AI models would struggle to understand human values.

Could progress in technologies like brain-computer interfaces (which Neuralink is working on) and brain organoids (essentially human brain tissue grown in a lab) allow us to connect AI models to human brains, and thereby help them understand human values?

Is anyone working on this as a research direction for AI Alignment?

Comments (1)

I am personally not convinced that brain-computer interfaces are useful for alignment; Robert Long has an alternative take here.

The fundamental problem, as I see it, is that

  1. Giving unaligned AI systems access to your neural state is bad™, and
  2. "Merging" with AI systems is under-defined.

I'd love to see an actual explanation for how brain-computer interfaces would be useful for alignment.

Additionally, I object to the claim that "AI alignment is difficult because AI models would struggle to understand human values". Under my best understanding, AI alignment is about making cognition aimable at all: the hard part is getting a powerful system to reliably pursue any intended target in the first place, not getting it to know what humans value.