Or what should I read to understand this?
It seems like some people expect descendants of large language models to pose a risk of becoming superintelligent agents. (By ‘descendants’ I mean models produced by adding scale and making non-radical architectural changes: GPT-N.)
I accept that there’s no reason in principle why LLM intelligence (performance on tasks) should be capped at the human level.
But I don’t see why I should believe that at some point language models would develop agency or goal-directed behaviour, where they start trying to achieve things in the real world instead of continuing their ‘output predicted text’ behaviour.