While I’ve been working on my Law-Following AI Sequence, researchers from Stanford University released a very interesting paper, with an accompanying dataset and models, called “Pile of Law.” Pile of Law contains notable (and, in my opinion, encouraging) evidence about the feasibility of constructing Law-Following AI (“LFAI”) systems, as I have defined the term.
Relevant Paper Contents
The Pile of Law paper focuses most directly on the law and ethics of dataset compilation for NLP, including such issues as copyright, privacy preservation, and bias and toxicity management. As the authors correctly note, legal systems face their own versions of these problems when publishing publicly available legal sources. Since legal systems’ solutions to those problems are implicit in the distribution of legal data, the authors hypothesized that training LLMs on such data could cause the models to learn those solutions, and thereby avoid the “need to reinvent the law.”
In a series of small experiments, the researchers tried to learn “contextual privacy rules,” such as whether to pseudonymize a party’s name, from legal corpora. In Case Study 1, an LLM trained on immigration data correctly learned to preferentially pseudonymize the names of asylees, refugees, and victims of torture. Case Study 2 similarly showed that training an LLM on Pile of Law improved the model’s ability to correctly pseudonymize names in “sensitive and highly personal” court cases.
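To make the idea of a learned “contextual privacy rule” concrete, here is a minimal toy sketch. It is not the authors’ method: the paper trains language models on real legal text, whereas this swaps in a tiny bag-of-words naive Bayes classifier over a handful of invented sentences. The corpus, the labels, and the function names (`train`, `pseudonymize_score`, `should_pseudonymize`) are all hypothetical and exist only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus (invented for illustration, not from Pile of Law):
# each item is (context sentence, whether the party's name was pseudonymized).
TRAIN = [
    ("the applicant seeks asylum after fleeing persecution", True),
    ("the respondent is a refugee and a victim of torture", True),
    ("the applicant fears torture if removed", True),
    ("the company appeals a tax assessment", False),
    ("the officer testified about the contract dispute", False),
    ("the agency published the tax regulation", False),
]


def train(examples):
    """Count words per label; this is the whole 'model' in this sketch."""
    counts = {True: Counter(), False: Counter()}
    docs = Counter()
    for text, label in examples:
        counts[label].update(text.split())
        docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab


def pseudonymize_score(text, model):
    """Naive Bayes log-odds that a name in this context gets a pseudonym."""
    counts, docs, vocab = model
    totals = {label: sum(counts[label].values()) for label in (True, False)}
    score = math.log(docs[True] / docs[False])  # prior log-odds
    for word in text.split():
        if word not in vocab:
            continue  # words never seen in training carry no signal
        # Add-one (Laplace) smoothing so unseen word/label pairs are nonzero.
        p_true = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_false = (counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_true / p_false)
    return score


MODEL = train(TRAIN)


def should_pseudonymize(text):
    """Recommend a pseudonym when the learned log-odds are positive."""
    return pseudonymize_score(text, MODEL) > 0


print(should_pseudonymize("the applicant is a victim of torture"))
print(should_pseudonymize("the company published a tax regulation"))
```

The point of the sketch is only that the rule is never written down: the recommendation comes from how names were treated in the example contexts, which is the kind of implicit signal the paper reports finding in legal corpora at far larger scale.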
Implications for LFAI
In addition to contributing a useful new dataset to the field, Pile of Law provides hints that LFAI is a tractable near-term research direction. As the authors say, “Pile of Law encodes signals about privacy standards that can be learned to produce more nuanced recommendations about filtering.” This accords nicely with one of the driving beliefs behind LFAI: that law and LLM safety have natural, untapped synergies due to the volume, structure, and political legitimacy of legal data. When I first began thinking about LFAI theoretically, I expected LLMs fine-tuned on legal data to both (a) behave better (by legal standards) in certain ways, and (b) have the ability to augment the legal compliance of AI systems as a whole. The Pile of Law paper provides empirical evidence for (a), and the authors indeed suggest that such systems could be integrated into data workflows to accomplish (b).
Pile of Law primarily analyzed data privacy and toxicity. LFAI as a long-term safety measure is more ambitious, with the goal of creating AI systems or modules that learn legal rules and help conform agentic AI systems to law. Pile of Law shows a very weak form of legal rule-learning, insofar as the fine-tuned models pick up on contextual law-derived trends in legal data. But full LFAI would need to go well beyond this, to incorporate much more data about both facts and law and explicitly analyze the legal consequences of an agent’s behavior, rather than relying on implicit learning of probabilistic trends. Full LFAI would also need to be embedded into, and constrain, agentic AI systems, which would require nontrivial engineering. Thus, I cannot claim that Pile of Law is a major vindication of or achievement in LFAI. Still, its empirical evidence makes me more bullish on a major premise of LFAI, which had previously been mainly theoretical: there are significant and underexplored synergies between legal corpora, AI safety, and large language models. I am very thankful to the authors for their work, and for releasing the dataset to enable further explorations along this line.
Peter Henderson & Mark S. Krass et al., Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset 2–3 (“When releasing internal documents concerning individuals, courts and governments have long struggled to balance transparency against the inclusion of private or offensive content. Model creators now face a similar struggle: what content to filter before pretraining a large language model on the data.”). ↩︎
Id. at 3. The authors “do not take the position that legal rules are optimal nor monolithic,” while noting the legitimating procedural benefits of relying on legal sources. See id. I have made or plan to make similar points in the LFAI sequence. ↩︎
Id. at 6–7. Cf. 8 CFR § 208.6(a). ↩︎
Henderson & Krass et al. at 7. ↩︎
Though admittedly I had not specifically thought of learning privacy and toxicity rules from legal corpora! ↩︎
See id. at 7 (“These experiments show that the Pile of Law encodes signals about privacy standards that can be learned to produce more nuanced recommendations about filtering. Such contextualized filters may help ensure that generative models strike the right balance between accuracy and privacy protection, for example by accurately distinguishing benign releases of names and contact information (e.g., in response to queries about government officials) from harmful ones (sensitive circumstances where harm is plausible).”). ↩︎
Toxicity is examined in id. § 4.2. ↩︎