ojorgensen

Pursuing a graduate degree (e.g. Master's)
24 · Joined Jan 2021

Bio

Incoming AI MSc student at Imperial College London. I helped run the EA student group at St Andrews during my undergrad.

Comments (3)

Disagreements about Alignment: Why, and how, we should try to solve them

Thanks for reading the post, Catherine! I like this list a lot, and I agree that the claim ‘sub-AGI evidence of alignment doesn’t tell us about AGI alignment’ is the key one to address here.

I think that trying to evaluate research agendas might still be important given this. We may struggle to verify the most general version of the claim above, but we might make progress if we restrict ourselves to analysing the kinds of evidence generated by specific research agendas. Hence, if we pose the claim in the context of specific research agendas (e.g. "to what extent does interpretability give us evidence of alignment in AGI systems?"), the question might become more tractable, although this is offset by having to answer more questions!

Disagreements about Alignment: Why, and how, we should try to solve them

Thanks for reading the post, Oscar! I'm going to reply to both of your comments here. I haven't thought a lot about when one should start "steering" in their career, but I think starting with an approach focussed on rowing makes a lot of sense.

Addressing the idea that steering is less important if we can just fund all possible research agendas: I don't think this necessarily holds. It seems that we are talent-constrained to at least some extent, so every researcher focussed on a hopeless or implausible research agenda is one who isn't working on a plausible one. Thus, even with lots of funding, steering is still important.

20 Critiques of AI Safety That I Found on Twitter

Could someone explain the “e/acc” in some of these? I haven’t seen it before.