Forecasting transformative AI: what's the burden of proof?

So I'd much rather people focus on the claim that "AI will be really, really big" than "AI will be bigger than anything else which comes afterwards".

I think AI is much more likely to make this the most important century than to be "bigger than anything else which comes afterwards." Analogously, the 1000 years after the IR are likely to be the most important millennium even though it seems basically arbitrary whether you say the IR is more or less important than AI or the agricultural revolution. In all those cases, the relevant thing is that a significant fraction of all remaining growth and technological change is likely to occur in the period, and many important events are driven by growth or tech change.

The answer to this question could change our estimate of P(this is the most important century) by an order of magnitude

I think it's more likely than not that there will be future revolutions as important as TAI, but there's a good probability that AI leads to enough acceleration that a large fraction of future revolutions occur in the same century. There's room for debate over the exact probability and timeline for such acceleration, but I think there's no real way to argue for anything as low as 10%.

All Possible Views About Humanity's Future Are Wild

We were previously comparing two hypotheses:

  1. HoH-argument is mistaken
  2. Living at HoH

Now we're comparing three:

  1. "Wild times"-argument is mistaken
  2. Living at a wild time, but HoH-argument is mistaken
  3. Living at HoH

"Wild time" is almost as unlikely as HoH. Holden is trying to suggest it's comparably intuitively wild, and it has pretty similar anthropic / "base rate" force.

So if your arguments look solid, "All futures are wild" makes hypothesis 2 look kind of lame/improbable---it has to posit a flaw in an argument, and also that you are living at a wildly improbable time. Meanwhile, hypothesis 1 merely has to posit a flaw in an argument, and hypothesis 3 merely has to posit HoH (which is only somewhat more to swallow than a wild time).

So now if you are looking for errors, you probably want to focus on errors in the argument that we are living at a "wild time." Realistically, I think you probably need to reject the possibility that the stars are real and that it is possible for humanity to spread to them. In particular, it's not too helpful to e.g. be skeptical of some claim about AI timelines or about our ability to influence society's trajectory.

This is kind of philosophically muddled because (I think) most participants in this discussion already accept a simulation-like argument that "Most observers like us are mistaken about whether it will be possible for them to colonize the stars." If you set aside the simulation-style arguments, then I think the "all futures are wild" correction is more intuitively compelling.

(I think if you tell people "Yes, our good skeptical epistemology allows us to be pretty confident that the stars don't exist" they will have a very different reaction than if you tell them "Our good skeptical epistemology tells us that we aren't the most influential people ever.")

Taboo "Outside View"

I do think my main impression of insect <-> simulated robot parity comes from very fuzzy evaluations of insect motor control vs simulated robot motor control (rather than from any careful analysis, of which I'm a bit more skeptical though I do think it's a relevant indicator that we are at least trying to actually figure out the answer here in a way that wasn't true historically). And I do have only a passing knowledge of insect behavior, from watching youtube videos and reading some book chapters about insect learning. So I don't think it's unfair to put it in the same reference class as Rodney Brooks' evaluations to the extent that his was intended as a serious evaluation.

Taboo "Outside View"

The Nick Bostrom quote (from here) is:

In retrospect we know that the AI project couldn't possibly have succeeded at that stage. The hardware was simply not powerful enough. It seems that at least about 100 Tops is required for human-like performance, and possibly as much as 10^17 ops is needed. The computers in the seventies had a computing power comparable to that of insects. They also achieved approximately insect-level intelligence.

I would have guessed this is just a funny quip, in the sense that (i) it sure sounds like it's just a throw-away quip, no evidence is presented for those AI systems being competent at anything (he moves on to other topics in the next sentence), "approximately insect-level" seems appropriate as a generic and punchy stand in for "pretty dumb," (ii) in the document he is basically just thinking about AI performance on complex tasks and trying to make the point that you shouldn't be surprised by subhuman performance on those tasks, which doesn't depend much on the literal comparison to insects, (iii) the actual algorithms described in the section (neural nets and genetic algorithms) wouldn't plausibly achieve insect-level performance in the 70s since those algorithms in fact do require large training processes (and were in fact used in the 70s to train much tinier neural networks).

(Of course you could also just ask Nick.)

I also think it's worth noting that the prediction in that section looks reasonably good in hindsight. It was written right at the beginning of resurgent interest in neural networks (right before Yann LeCun's paper on MNIST with neural networks). The hypothesis "computers were too small in the past so that's why they were lame" looks like it was a great call, and Nick's tentative optimism about particular compute-heavy directions looks good. I think overall this is a significantly better take than mainstream opinions in AI. I don't think this literally affects your point, but it is relevant if the implicit claim is "And people talking about insect comparisons were led astray by these comparisons."

I suspect you are more broadly underestimating the extent to which people used "insect-level intelligence" as a generic stand-in for "pretty dumb," though I haven't looked at the discussion in Mind Children and Moravec may be making a stronger claim. I'd be more inclined to tread carefully if some historical people tried to actually compare the behavior of their AI system to the behavior of an insect and found it comparable, as in posts like this one (it's not clear to me how such an evaluation would have suggested insect-level robotics in the 90s or even today; I think the best that can be said is that today it seems compatible with insect-level robotics in simulation). I've seen Moravec use the phrase "insect-level intelligence" to refer to the particular behaviors of "following pheromone trails" or "flying towards lights," so I might also read him as referring to those behaviors in particular. (It's possible he is underestimating the total extent of insect intelligence, e.g. discounting the complex motor control performed by insects, though I haven't seen him do that explicitly and it would be a bit off brand.)

ETA: While I don't think 1990s robotics could plausibly be described as "insect-level," I actually do think that the linked post on bee vision could plausibly have been written in the 90s and concluded that computer vision was bee-level, it's just a very hard comparison to make and the performance of the bees in the formal task is fairly unimpressive.

Issues with Using Willingness-to-Pay as a Primary Tool for Welfare Analysis

Ironically, although cost-benefit analysts generally ignore the diminishing marginal benefit of money when they are aggregating value across people at a single date, their main case for discounting future commodities is founded on this diminishing marginal benefit. 

I think the "main" (i.e. econ 101) case for time discounting (for all policy decisions other than determining savings rates) is roughly the one given by Robin here.

I don't think there is a big incongruity here. Questions about diminishing returns to wealth become relevant when trying to determine what savings rate might be socially optimal. Analogously, questions about diminishing returns to wealth become relevant when we ask about what level of redistribution might be socially optimal, even if most economists would prefer to bracket them for most other policy discussions.

Issues with Using Willingness-to-Pay as a Primary Tool for Welfare Analysis

For governments who have the option to tax, WTP has obvious relevance as a way of comparing a policy to a benchmark of taxation+redistribution. I tentatively think that an idealized state (representing any kind of combination of its constituents' interests) ought to use a WTP analysis for almost all of its policy decisions. I wrote some opinionated thoughts here.

It's less clear if this is relevant for a realistic state, and the discussion becomes more complex. I think it depends on a question like "what is the role of cost-effectiveness analysis in contexts where it is a relatively minor input into decision-making?" I think realistically there will be different kinds of cost-benefit analyses for different purposes. Sometimes WTP will be appropriate but probably not most of the time. When those other analyses depend on welfare, I expect they can often be productively framed as "WTP x (utility/$)" with some reasonable estimate for utility/$. But even that abstraction will often break down in cases where WTP is hard-to-observe or beneficiaries are irrational or whatever.

I think for a philanthropist WTP isn't compelling as a metric, and should usually be combined with an explicit estimate of (utility/$). I don't think I've seen philanthropists using WTP in this way and certainly wouldn't expect to see someone suggesting that handing money to rich people is more effective since it can be done with lower overhead.

Draft report on existential risk from power-seeking AI

A 5% probability of disaster isn't any more or less confident/extreme/radical than a 95% probability of disaster; in both cases you're sticking your neck out to make a very confident prediction.

"X happens" and "X doesn't happen" are not symmetrical once I know that X is a specific event. Most things at the level of specificity of "humans build an AI that outmaneuvers humans to permanently disempower them" just don't happen.

The reason we are even entertaining this scenario is because of a special argument that it seems very plausible. If that's all you've got---if there's no other source of evidence than the argument---then you've just got to start talking about the probability that the argument is right.

And the argument actually is a brittle and conjunctive thing. (Humans do need to be able to build such an AI by the relevant date, they do need to decide to do so, the AI they build does need to decide to disempower humans notwithstanding a prima facie incentive for humans to avoid that outcome.)

That doesn't mean this is the argument or that the argument is brittle in this way---there might be a different argument that explains in one stroke why several of these things will happen. In that case, it's going to be more productive to talk about that.

(For example, in the context of the multi-stage argument undershooting success probabilities, it's that people will be competently trying to achieve X and most of the uncertainty is in estimating how hard and how effectively people are trying---which is correlated across steps. So you would do better by trying to go for the throat and reason about the common cause of each success, and you will always lose if you don't see that structure.)

And of course some of those steps may really just be quite likely and one shouldn't be deterred from putting high probabilities on highly-probable things. E.g. it does seem like people have a very strong incentive to build powerful AI systems (and moreover the extrapolation suggesting that we will be able to build powerful AI systems is actually about the systems we observe in practice and already goes much of the way to suggesting that we will do so). Though I do think that the median MIRI staff-member's view is overconfident on many of these points.

Dutch anti-trust regulator bans pro-animal welfare chicken cartel

Is your impression that if customers were willing to pay for it, then that wouldn't be sufficient cause to say that it benefited customers? (Does that mean that e.g. a standard ensuring that children's food doesn't cause discomfort also can't be protected, since it benefits customers' kids rather than customers themselves?)

Dutch anti-trust regulator bans pro-animal welfare chicken cartel

These cases are also interesting for alignment agreements between AI labs, and it's interesting to see it playing out in practice. Cullen wrote about this here much better than I will.

Roughly speaking, if individual consumers would prefer to use a riskier AI (because costs are externalized) then it seems like an agreement to make AI safer-but-more-expensive would run afoul of the same principles as this chicken-welfare agreement.

On paper, there are some reasons that the AI alignment case should be easier than the chicken-welfare case: (i) using unsafe AI hurts non-customer humans, and AI customers care more about other humans than they do about chickens, (ii) deploying unaligned AI actually likely hurts other AI customers in particular (since they will be the main ones competing with the unaligned but more sophisticated AI). So it's likely that every individual AI customer would benefit.

Unfortunately, it seems like the same thing could be true in the chicken case---every individual customer could prefer the world with the welfare agreement---and it wouldn't change the regulator's decision.

For example, suppose that Dutch consumers eat 100 million chickens a year, 10/year for each of 10 million customers. Customer surveys discover that customers would only be willing to pay $0.01 for a chicken to have more space and a slightly longer life, but that these reforms increase chicken prices by $1. So the regulator strikes down the reform.

But with welfare standards in place, each customer pays an extra $10/year for chicken and 100 million chickens have improved lives, a cost to that customer of about $0.0000001 per chicken helped, roughly 100,000 times lower than their WTP. (This is the same dynamic described here.) So every chicken consumer prefers the world where the standards are in place, despite not being willing to pay money to improve the lives of the tiny number of chickens they eat personally. This seems to be a very common reaction to discussions of animal welfare ("what difference does my consumption make? I can't change the way most chickens are treated...")
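The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration using the hypothetical numbers above; the variable names are made up for the sketch, and valuing every chicken at the same surveyed $0.01 (including chickens eaten by other people) is an assumption the example makes implicitly rather than anything a real survey established.

```python
# Hypothetical figures from the example above (not real market data).
chickens_per_year = 100_000_000   # chickens eaten by all Dutch consumers
customers = 10_000_000            # number of chicken consumers
wtp_per_chicken = 0.01            # dollars a customer would pay per improved chicken
price_increase = 1.00             # dollars added to each chicken by the reforms

chickens_per_customer = chickens_per_year // customers
extra_cost_per_customer = chickens_per_customer * price_increase

# Framing 1: the customer only counts the chickens they eat personally.
value_bought_alone = chickens_per_customer * wtp_per_chicken

# Framing 2: the customer pays the same extra amount, but the standard
# improves the lives of all chickens, not just their own.
# ASSUMPTION: the surveyed WTP applies linearly to every chicken.
cost_per_chicken_helped = extra_cost_per_customer / chickens_per_year
value_under_standard = chickens_per_year * wtp_per_chicken

print(f"extra cost per customer per year: ${extra_cost_per_customer:.2f}")
print(f"value of own chickens' welfare:   ${value_bought_alone:.2f}")
print(f"cost per chicken helped:          ${cost_per_chicken_helped:.7f}")
print(f"WTP / cost per chicken helped:    {wtp_per_chicken / cost_per_chicken_helped:,.0f}x")
print(f"value of all chickens' welfare:   ${value_under_standard:,.0f}")
```

Under the first framing the extra cost exceeds what the customer says the improvement is worth, which is the comparison the survey captures. Under the second framing the same payment buys a far larger improvement, so the sign of the comparison flips, provided the linear-WTP assumption is anywhere close to right.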

Because the number of chicken-eaters is so large, the relevant question in the survey should be "Would you prefer that someone else pay $X in order to improve chicken welfare?", making a tradeoff between two strangers. That's the relevant question for them, since the welfare standards mostly affect other people.

Analogously, if you ask AI consumers "Would you prefer to have an aligned AI, or a slightly more sophisticated unaligned AI?" they could easily all say "I want the more sophisticated one," even if every single human would be better off if there were an agreement to make only aligned AI. If an anti-trust regulator used the same standard as in this case, it seems like they would throw out an alignment agreement because of that, even knowing that doing so would make every single human worse off.

I still think in practice AI alignment agreements would be fine for a variety of reasons. For example, I think if you ran a customer survey it's likely people would say they prefer to use aligned AI even if it would disadvantage them personally, because public sentiment towards AI is very different and the regulatory impulse is stronger. (Though I find it hard to believe that anything would end up hinging on such a survey, and even more strongly I think it would never come to this because there would be much less political pressure to enforce anti-trust.)

Alternatives to donor lotteries

I guess I wouldn't recommend the donor lottery to people who wouldn't be happy entering a regular lottery for their charitable giving

Strong +1.

If I won a donor lottery, I would consider myself to have no obligation whatsoever towards the other lottery participants, and I think many other lottery participants feel the same way. So it's potentially quite bad if some participants are thinking of me as an "allocator" of their money. To the extent there is ambiguity in the current setup, it seems important to try to eliminate that.
