A new review of Ajeya Cotra's Forecasting TAI with biological anchors (see also update here), written by David Roodman in April 2020, has been added to the folder of public reviews for Cotra's report.

Roodman's summary:

I think my main critical reaction is about the draft report’s ecumenical approach. It puts non-zero weight on several different frameworks which, conditional on the various parameter choices favored in the report, contradict one another. This mixing of distributions expresses a kind of radical uncertainty: not only are we unsure about the parameter values within each framework; we’re also unsure about which framework is most right.

This set-up is pragmatic and humble, but… I think in principle the ecumenism discards useful information, by not imposing the restriction that the various frameworks agree. In principle, they are all measuring the same thing. In pure Bayesian reasoning, if one has several uncertain measurements of the same value, each represented by a probability distribution, then one combines these primary measurements by multiplying them pointwise and rescaling the result to have total integral one. This contrasts with the pointwise averaging performed in the draft report, which is the mathematical expression of ecumenism.

In Bayesian reasoning, if two distributions for the same parameter are normal, then their combination is too; its mean is the average of the two primary means, weighting by the respective precisions (inverse variances). Weirdly, if the two primary means are far apart, so that the two distributions hardly overlap, then their combination can pop up in the no-man’s-land between them. The intuition is that the combined distribution centers on the least unlikely estimate given what we know.

I make that mathematical point less to argue for a mechanical implementation of Bayesian mixing of different perspectives than to advocate for an informal didactic that aims at unification. What is the least implausible way to reconcile the large disagreements between different frameworks? Could answers to that question help us settle on a single, favored framework, perhaps one that synthesizes ideas from more than one?

That impulse ultimately led me to favor a single framework that fuses elements from several in the draft report. The idea is to model two training levels at once, of parameters and hyperparameters. Training of parameters corresponds to the training of a single neural network, or the learning a sentient organism undergoes during maturation. Hyperparameter training corresponds to the design space exploration that AI researchers engage in and, in the biological realm, to evolution. Each parameter training run may involve huge numbers of small parameter updates; each in turn serves a single hyperparameter training step…


I disagree with Roodman's criticism quoted here. Cotra's approach involves estimating that there's an X% chance that the first achievable TAI will look like A, a Y% chance like B, and so on. Some anchors (e.g., short-horizon neural network and long-horizon neural network) are obviously incompatible; whatever the future looks like, they won't both describe the first achievable TAI. Multiplying them is clearly not meaningful; Roodman's proposed "restriction that the various frameworks agree" makes no sense. (Multiplying them would be correct if Cotra's different anchors represented something like different information-sources on necessary-and-sufficient-conditions-for-TAI, but that's not what her anchors represent.)

(I suspect I may be missing something.)

Roodman's proposed "restriction that the various frameworks agree" makes no sense.

I'm with you. I think Roodman must disagree with the idea of assigning probabilities to different, and necessarily conflicting, models of the world, but to me this seems like an odd position to take. I might also be missing something.

I think the approach of Cotra's that is criticised here could be interpreted as Bayesian model averaging. That seems fine to me. Maybe Roodman disagrees, but if so I think he needs to expand on why.
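
To make the two combination rules concrete, here is a minimal sketch contrasting pointwise averaging (a mixture, as in model averaging) with the pointwise multiply-and-rescale rule from Roodman's summary. The two normal distributions, their means and spreads, and the grid are made-up illustrative numbers, not values from Cotra's report.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def normalize(density, dx):
    """Rescale grid values so they integrate to one (Riemann sum)."""
    total = sum(density) * dx
    return [d / total for d in density]

def grid_mode(xs, density):
    """Grid point with the highest density."""
    return max(zip(density, xs))[1]

# Hypothetical inputs: two "frameworks" that disagree about the same quantity.
DX = 0.01
XS = [20.0 + i * DX for i in range(3001)]  # grid from 20 to 50
framework_a = [normal_pdf(x, 30.0, 2.0) for x in XS]
framework_b = [normal_pdf(x, 40.0, 2.0) for x in XS]

# Pointwise average with equal weights: a two-humped mixture.
mixture = normalize([0.5 * a + 0.5 * b for a, b in zip(framework_a, framework_b)], DX)

# Pointwise product, rescaled: a single hump between the two inputs.
product = normalize([a * b for a, b in zip(framework_a, framework_b)], DX)

print("mixture mode:", grid_mode(XS, mixture))
print("product mode:", grid_mode(XS, product))
```

The mixture keeps both humps and puts little mass in between, while the product concentrates in the gap between the two inputs. Which rule is appropriate depends on whether the inputs are rival hypotheses (mixture) or independent measurements of one quantity (product), which is the point under dispute in this thread.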

If we’re looking at AI meeting any one of multiple thresholds, you could take the minimum of the random variables representing the dates on which the thresholds are passed. If it's supposed to meet all of them, you'd take the maximum. You could also take mins or maxes over chosen subsets of the dates, and mix between those options probabilistically by sampling from each distribution.
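
A rough Monte Carlo sketch of this idea, under assumptions that are mine rather than the report's: two hypothetical thresholds whose crossing years are drawn independently from made-up normal distributions, and a hypothetical 50/50 weight between the "any" and "all" readings.

```python
import random
import statistics

def sample_dates(rng, n):
    """Draw n joint samples of (year A is passed, year B is passed).

    Both distributions are illustrative placeholders, sampled independently.
    """
    return [(rng.gauss(2045, 10), rng.gauss(2060, 15)) for _ in range(n)]

def mix_any_all(rng, draws, weight_any):
    """Per sample, use min (any threshold) with probability weight_any, else max (all)."""
    return [min(pair) if rng.random() < weight_any else max(pair) for pair in draws]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
draws = sample_dates(rng, 20000)

any_threshold = [min(pair) for pair in draws]   # first threshold to be passed
all_thresholds = [max(pair) for pair in draws]  # last threshold to be passed
mixed = mix_any_all(rng, draws, weight_any=0.5)

print("median year, any threshold: ", statistics.median(any_threshold))
print("median year, all thresholds:", statistics.median(all_thresholds))
print("median year, 50/50 mix:     ", statistics.median(mixed))
```

If the thresholds were correlated, the pairs would need to be drawn jointly rather than independently, which would change the min and max distributions.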

(Maybe this was already done? It's been a while since I thought about the report.)

In Bayesian reasoning, if two distributions for the same parameter are normal, then their combination is too; its mean is the average of the two primary means, weighting by the respective precisions (inverse variances).

I think this refers to the inverse-variance method. I am not sure under which conditions it should be applied, but it minimises the variance of a weighted mean of 2 estimates of the same variable of interest.
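
For what it's worth, the standard setting for inverse-variance weighting is two independent, unbiased estimates of the same quantity with known variances. Here is a small sketch of the formula from the quote, with illustrative numbers and function names of my own choosing. The same mean and variance come out of multiplying two normal densities and rescaling.

```python
def combine_normals(mean1, var1, mean2, var2):
    """Precision-weighted combination of two normal estimates of one quantity."""
    precision1 = 1.0 / var1
    precision2 = 1.0 / var2
    combined_var = 1.0 / (precision1 + precision2)
    combined_mean = combined_var * (precision1 * mean1 + precision2 * mean2)
    return combined_mean, combined_var

def weighted_mean_variance(w, var1, var2):
    """Variance of w*x1 + (1-w)*x2 when x1 and x2 are independent."""
    return w ** 2 * var1 + (1.0 - w) ** 2 * var2

# Equal variances: the combined mean is the plain average and the variance halves.
print(combine_normals(30.0, 4.0, 40.0, 4.0))

# Unequal variances: scan weights to see where the variance is smallest.
var1, var2 = 1.0, 4.0
weights = [i / 100 for i in range(101)]
best_w = min(weights, key=lambda w: weighted_mean_variance(w, var1, var2))
inverse_variance_w = (1.0 / var1) / (1.0 / var1 + 1.0 / var2)
print(best_w, inverse_variance_w)
```

The scan over weights is only a numerical check that the inverse-variance weight is the variance-minimising one. It does not settle whether the frameworks in the report are independent measurements of the same thing.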