Parallels Between AI Safety by Debate and Evidence Law

by Cullen_OKeefe


In this post, I highlight some parallels between AI Safety by Debate (“Debate”) and evidence law.

Evidence law structures high-stakes arguments with human judges.

The prima facie reason that Evidence law (“Evidence”) is relevant to Debate is that Evidence, like Debate, is one of the few areas where arguments carry high stakes, potentially including severe criminal penalties or millions of dollars in liability. Other high-stakes debates include parliamentary or electoral debates, but these are less substantively limited (i.e., there are fewer restraints on what debaters can do) and aimed less at seeking truth than at political theater.

In court proceedings, questions of law are decided by the judge, while questions of fact are decided by the finder of fact (usually the jury, but sometimes a judge). The finder of fact weighs the persuasiveness of factual arguments (e.g., whether the defendant shot the victim, and whether he intended to do so). In all cases, as in Debate, the final arbiter of factual debates is human.

Evidence law limits the types of arguments available to debaters.

The goal of the Federal Rules of Evidence is “ascertaining the truth and securing a just determination.”[1] Therefore, generally, “relevant evidence is admissible unless [otherwise provided].”[2] A piece of evidence is relevant if “(a) it has any tendency to make a fact more or less probable than it would be without the evidence; and (b) the fact is of consequence in determining the action.”[3]
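One way to read the two-part relevance test is in Bayesian terms: evidence has "any tendency" to change the probability of a fact whenever it is more (or less) likely under that fact than under its negation. The sketch below is only an illustration of that reading, not a statement of how courts apply the rule; the function names, the numeric tolerance, and the example probabilities are all assumptions made for the example.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes' rule: probability of fact H after observing evidence E."""
    numerator = prior * p_e_given_h
    denominator = numerator + (1.0 - prior) * p_e_given_not_h
    if denominator == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / denominator


def is_relevant(
    p_e_given_h: float,
    p_e_given_not_h: float,
    of_consequence: bool,
    tol: float = 1e-9,  # hypothetical tolerance, only to avoid float noise
) -> bool:
    """Illustrative reading of the two-part test quoted above.

    (a) "any tendency": the evidence is not equally likely under H and not-H,
        so observing it moves the probability of H at least slightly.
    (b) "of consequence": supplied by the caller as a yes/no judgment.
    """
    shifts_probability = abs(p_e_given_h - p_e_given_not_h) > tol
    return shifts_probability and of_consequence


if __name__ == "__main__":
    # Made-up numbers: evidence four times likelier if the fact is true.
    print(posterior(prior=0.5, p_e_given_h=0.8, p_e_given_not_h=0.2))
    # Even a weak shift counts as relevant under this reading.
    print(is_relevant(0.51, 0.50, of_consequence=True))
```

On this reading the bar is low by design: the test asks only whether the evidence moves the probability at all, not by how much.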

However, the bulk of Evidence law is dedicated to exceptions to this presumption of admissibility. The precision of these exceptions varies significantly. Some are less precise (“standards,” in legal jargon) such as Rule 403: “The court may exclude relevant evidence if its probative value is substantially outweighed by a danger of one or more of the following: unfair prejudice, confusing the issues, misleading the jury, undue delay, wasting time, or needlessly presenting cumulative evidence.”[4] Others are more specific (“rules”).

As Rule 403 exemplifies, many of the exceptions to the general admissibility of relevant evidence are based on the fallibility of fact-finders. Evidence that is relevant but likely to be on-balance detrimental to truth-seeking is therefore excluded. Other examples of rules of this form include:

  1. Limitations on the use of a person’s character to prove action in conformity with that character;[5]
  2. Limitations on the use of out-of-court statements;[6] and
  3. Limitations on impeaching witnesses by their past criminal convictions[7] or religious beliefs.[8]

Relevance to Debate

Types of Arguments to Watch For

The rules of Evidence have evolved over long experience with high-stakes debates, so their substantive findings on the types of arguments that prove problematic for truth-seeking are relevant to Debate.

Opportunities for Structuring Debate

The rules of Evidence could also be used to structure Debate: e.g., by training AI debaters not to make certain types of arguments, or by having a mediator screen out any arguments that would violate the rules, such that the ultimate judge does not see them.
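The mediator idea can be sketched as a filter that sits between the debaters and the judge. This is a toy illustration under strong assumptions: the category names are hypothetical labels loosely based on the examples listed earlier, each argument is assumed to arrive already classified (which is the hard part in practice), and the real rules are limitations with many exceptions rather than the blanket exclusions used here.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ArgumentType(Enum):
    """Hypothetical labels; a real system would need a classifier to assign them."""
    ORDINARY = auto()
    CHARACTER_PROPENSITY = auto()
    OUT_OF_COURT_STATEMENT = auto()
    RELIGIOUS_BELIEF_IMPEACHMENT = auto()


# Assumption: treat these categories as always excluded. Actual evidence
# rules are more nuanced and contain exceptions.
EXCLUDED_TYPES = frozenset({
    ArgumentType.CHARACTER_PROPENSITY,
    ArgumentType.OUT_OF_COURT_STATEMENT,
    ArgumentType.RELIGIOUS_BELIEF_IMPEACHMENT,
})


@dataclass(frozen=True)
class Argument:
    debater: str
    text: str
    kind: ArgumentType


def screen(arguments: list[Argument]) -> tuple[list[Argument], list[Argument]]:
    """Split arguments into those shown to the judge and those withheld."""
    admitted: list[Argument] = []
    withheld: list[Argument] = []
    for argument in arguments:
        if argument.kind in EXCLUDED_TYPES:
            withheld.append(argument)
        else:
            admitted.append(argument)
    return admitted, withheld


if __name__ == "__main__":
    transcript = [
        Argument("A", "The logs show the request at 14:02.", ArgumentType.ORDINARY),
        Argument("B", "My opponent has lied before, so it is lying now.",
                 ArgumentType.CHARACTER_PROPENSITY),
    ]
    admitted, withheld = screen(transcript)
    for argument in admitted:
        print(f"[to judge] {argument.debater}: {argument.text}")
```

The alternative in the text, training debaters not to make such arguments in the first place, would move this check from a separate screening step into the debaters' own objective.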

This is very interesting! I'm excited to see connections drawn between AI safety and the law / philosophy of law. It seems there are a lot of fruitful insights to be had.

You write,

“The rules of Evidence have evolved over long experience with high-stakes debates, so their substantive findings on the types of arguments that prove problematic for truth-seeking are relevant to Debate.”

Can you elaborate a bit on this?

I don't know anything about the history of these rules about evidence. But why think that over this history, these rules have trended towards truth-seeking per se? I wouldn't be surprised if the rules have evolved to better serve the purposes of the legal system over time, but presumably the relationship between this end and truth-seeking is quite complex. Also, people changing the rules could be mistaken about what sorts of evidence do in fact tend to lead to wrong decisions.

I think all of this is compatible with your claim. But I'd like to hear more!

Thanks for this very thoughtful comment!

I think it is accurate to say that the rules of evidence have generally aimed for truth-seeking per se. That is their stated goal, and it generally explains the liberal standard for admission (relevance, which is a very low bar and tracks Bayesian epistemology well), the even more liberal standards for discovery, and most of the admissibility exceptions (which are generally explainable by humans' imperfect Bayesianism).

You're definitely right that the legal system as a whole has many goals other than truth-seeking. However, those other goals are generally advanced through other aspects of the justice system. As an example, finality is a goal of the legal system, and is advanced through, among other things, statutes of limitations and repose. Similarly, the "beyond a reasonable doubt" standard for criminal conviction is in some sense contrary to truth-seeking but advances the policy preference for underpunishment over overpunishment.

You're also right that there are some exceptions to this within evidence law itself, but not many. For example, the attorney–client privilege exists not to facilitate truth-seeking, but to protect the attorney–client relationship. Similarly, the spousal privileges exist to protect the marital relationship. (Precisely because such privileges are contrary to truth-seeking, they are interpreted narrowly. See, e.g., United States v. Aramony, 88 F.3d 1369, 1389 (4th Cir. 1996); United States v. Suarez, 820 F.2d 1158, 1160 (11th Cir. 1987).) And of course, some rules of evidence have both truth-seeking and other policy rationales. Still, on the whole, the rules of evidence are aimed at truth.