In Human Compatible, Stuart Russell makes an argument that I have heard him make repeatedly (I believe on the 80K podcast and the FLI conversation with Steven Pinker). He suggests a pretty bold and surprising claim:
[C]onsider how content-selection algorithms function on social media... Typically, such algorithms are designed to maximize click-through, that is, the probability that the user clicks on presented items. The solution is simply to present items that the user likes to click on, right? Wrong. The solution is to change the user's preferences so that they become more predictable. A more predictable user can be fed items that they are likely to click on, thereby generating more revenue. People with more extreme political views tend to be more predictable in which items they will click on... Like any rational entity, the algorithm learns how to modify the state of its environment—in this case, the user's mind—in order to maximize its own reward. The consequences include the resurgence of fascism, the dissolution of the social contract that underpins democracies around the world, and potentially the end of the European Union and NATO. Not bad for a few lines of code, even if it had a helping hand from some humans. Now imagine what a really intelligent algorithm would be able to do.
I don't doubt that in principle this can and must happen in a sufficiently sophisticated system. What I'm surprised by is the claim that it is happening now. In particular, I would think that modifying human behavior to make people more predictable is pretty hard to do, so that any gains in predictive accuracy for algorithms available today would be swamped by (a) noise and (b) the gains from presenting the content that someone is more likely to click on given their present preferences.
To be clear, I also don't doubt that there might be pieces of information algorithms can show people to make their behavior more predictable. Introducing someone to a new YouTube channel they have not encountered might make them more likely to click its follow-up videos, so that an algorithm has an incentive to introduce people to channels that lead predictably to their wanting to watch a number of other videos. But this is not the same as changing preferences. He seems to be claiming, or at least very heavily implying, that the algorithms change what people want, holding the environment (including information) constant.
Is there evidence for this (especially empirical evidence)? If so, where could I find it?
Facebook has at least experimented with using deep reinforcement learning to adjust its notifications according to https://arxiv.org/pdf/1811.00260.pdf . Depending on which exact features they used for the state space (i.e. if they are causally connected to preferences), the trained agent would at least theoretically have an incentive to change user's preferences.
The fact that they use DQN rather than a bandit algorithm seems to suggest that what they are doing involves at least some short term planning, but the paper does not seem to analyze the experiments in much detail, so it is unclear whether they could have used a myopic bandit algorithm instead. Either way, seeing this made me update quite a bit towards being more concerned about the effect of recommender systems on preferences.