titotal

Computational Physicist
8893 karma

Bio

I'm a computational physicist, I generally donate to global health.  I am skeptical of AI x-risk and of big R Rationalism, and I intend on explaining why in great detail. 

Comments
746

This is a very important question to be asking. 

This analysis seems to be entirely based on METR's time horizon research. I think that research is valuable, but it raises the concern that any findings may be an artifact of particular quirks of METR's approach; you describe some of those quirks here. 

Are you aware of any alternative groups that have explored this question? It feels to me like it's not a question you explicitly need time horizons to answer. 

Yes, but in doing so the uncertainty in both A and B matters, and showing that A is lower variance than B doesn't show that E[benefits(A)] > E[benefits(B)]. Even if benefits(B) are highly uncertain and we know benefits(A) extremely precisely, it can still be the case that benefits(B) are larger in expectation.
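To make that concrete with a toy example (the numbers here are mine, purely illustrative): suppose cause A delivers a known 10 units of benefit, while cause B delivers either 0 or 30 units with equal probability. B is far noisier, but its expectation is still higher.

```python
# Toy illustration (hypothetical numbers): a high-variance option can still
# have the larger expected benefit.
import random

benefit_a = 10.0  # cause A: benefit known with certainty


def benefit_b():
    # cause B: highly uncertain, 0 or 30 units with equal probability
    return random.choice([0.0, 30.0])


samples = [benefit_b() for _ in range(100_000)]
expected_b = sum(samples) / len(samples)

print(f"E[benefits(A)] = {benefit_a:.1f}")   # 10.0
print(f"E[benefits(B)] ~ {expected_b:.1f}")  # ~15.0, larger despite the variance
```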

 

If you properly account for uncertainty, you should pick the certain cause over the uncertain one even if a naive EV calculation says otherwise, because you aren't accounting for the selection process involved in picking the cause. I'm writing an explainer for this, but if I'm reading the optimiser's curse paper right, a rule of thumb is that if cause A is 10 times more certain than cause B, cause B should be downweighted by a factor of 100 when comparing them. 
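A minimal sketch of the kind of adjustment I mean (toy numbers of my own, not the paper's worked example): treat each naive EV estimate as a noisy observation and shrink it toward a common prior. The weight on the estimate scales with one over its variance, so an estimate with 10x the standard deviation gets roughly 100x less weight relative to the prior.

```python
# Sketch of the Bayesian shrinkage behind the optimiser's curse correction
# (illustrative numbers only; not taken from the paper).

def shrunk_estimate(prior_mean: float, prior_sd: float,
                    estimate: float, estimate_sd: float) -> float:
    """Posterior mean for a normal prior and normally distributed estimation error."""
    prior_prec = 1.0 / prior_sd ** 2
    est_prec = 1.0 / estimate_sd ** 2
    return (prior_prec * prior_mean + est_prec * estimate) / (prior_prec + est_prec)


prior_mean, prior_sd = 0.0, 1.0
# Cause A: a fairly precise estimate of 5 units of good per dollar.
print(shrunk_estimate(prior_mean, prior_sd, estimate=5.0, estimate_sd=1.0))   # ~2.5
# Cause B: the same headline estimate, but 10x noisier -> shrunk almost to the prior.
print(shrunk_estimate(prior_mean, prior_sd, estimate=5.0, estimate_sd=10.0))  # ~0.05
```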

Generally, the scientific community does not go around arguing that drastic measures should be taken on the basis of a single novel study. Mainly, what a single novel study produces is a wave of new studies on the same subject, to check that the results are valid and that the assumptions hold up to scrutiny. Hence why that claimed room-temperature superconductor was so quickly debunked.

I do not see similar efforts in the AI safety community. The studies by METR are great first forays into difficult subjects, but I see barely any scrutiny or follow-up from other researchers. And people accept much worse scholarship, like AI2027, at face value for seemingly no reason. 

I have experience in both academia and EA now, and I believe that the scholarship and skeptical standards in EA are substantially worse. 

Taken literally, "accelerationist" implies that you think the technology isn't currently progressing fast enough, and that some steps should be taken to make it go faster. This seems a bit odd, because one of your key arguments (that I actually agree with) is that we learn to adapt to technology as it rolls out. But obviously it's harder to adapt when change is super quick, compared to gradual progress. 

How fast do you think AI progress should be going, and what changes should be made to get there?

"I think Eric has been strong about making reasoned arguments about the shape of possible future technologies, and helping people to look at things for themselves."

I guess this is kind of my issue, right? He's been quite strong at putting forth arguments about the shape of the future that were highly persuasive and yet turned out to be badly wrong.[1] I'm concerned that this does not seem to have affected his epistemic authority in these sorts of circles. 

You may not be "deferring" to Drexler, but you are singling out his views as singularly important (you have not made similar posts about anybody else[2]). There are hundreds of people discussing AI at the moment, a lot of them with a lot more expertise, and a lot of whom have not been badly wrong about the shape of the future. 

Anyway, I'm not trying to discount your arguments either; I'm sure you have found valuable stuff in his work. But if this post is making a case for reading Drexler despite him being difficult, I'm allowed to make the counterargument. 

  1. ^

    In answer to your footnote: If more than one of those things occurs in the next thirty years, I will eat a hat. 

  2. ^

    If this is the first in a series, feel free to discount this.

Drexler's previous predictions seem to have gone very poorly. This post evaluated the 30-year predictions a group of seven futurists made in 1995, and Drexler came in last: he predicted that by 2026 we would have complete Drexlerian nanotech assemblers, be able to reanimate cryonic suspendees, have uploaded minds, and have a substantial portion of our economy outside the solar system. 

Given this track record of extremely poor long-term prediction, why should I be interested in the predictions that Drexler makes today? I'm not trying to shit on Drexler as a person (and he has had a positive influence in inspiring scientists), but it seems like his epistemological record is not very good. 

I'm broadly supportive of this type of initiative, and it seems like it's definitely worth a try (the downsides seem low compared to the upsides). However, I suspect that, as with most apparently good ideas, scrutiny will turn up problems. 

One issue I can think of: in this analysis, a lot of the company's competitive advantage comes from the good reputation of the charitable foundation running it. However, running a large company competitively sometimes involves making tough, unpopular decisions, like laying off portions of your workforce. So I don't think the assumption that the charity-owned company can act exactly like a regular company necessarily holds up: doing so risks eroding the reputational advantage that the competitive edge depends on. 

I have many disagreements, but I'll focus on one: I think point 2 is in contradiction with points 3 and 4. To put it plainly: the "selection pressures" go away pretty quickly if we don't have reliable methods of knowing or controlling what the AI will do, or of preventing it from doing noticeably bad stuff. That applies to the obvious stuff, like an AI trying to prematurely go Skynet, but it also applies to more mundane stuff, like getting an AI to act reliably more than 99% of the time. 

I believe that if we manage to control AI enough to make widespread rollout feasible, then it's pretty likely we've already solved alignment well enough to prevent extinction. 

I'm not super excited about revisiting the model, to be honest, but I'll probably take a look at some point. 

What I'd really like to see, and what I haven't noticed from a quick look through the update, is some attempt to validate the model against actual data. For example, I think METR comes off looking pretty good right now with their exponential model of horizon growth, which has held up for nearly a year post-publication. The AI2027 model's prediction of superexponential growth has not. So I think they have to make a pretty strong case for why I should trust the new model. 
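For reference, the distinction I'm drawing, with made-up parameters just to show the shape of the two model families (these are not METR's or AI2027's fitted values): an exponential model keeps the doubling time of the task horizon fixed, whereas a superexponential model shrinks the doubling time with each doubling, so the curve runs away much faster.

```python
# Illustrative comparison of exponential vs. superexponential horizon growth.
# All parameters are hypothetical placeholders.

def exponential_horizon(months: float, h0: float = 1.0, doubling_time: float = 7.0) -> float:
    """Horizon (hours) with a fixed doubling time."""
    return h0 * 2 ** (months / doubling_time)


def superexponential_horizon(months: float, h0: float = 1.0,
                             doubling_time: float = 7.0, shrink: float = 0.85) -> float:
    """Horizon (hours) where each successive doubling takes `shrink` times as long."""
    h, t, d = h0, 0.0, doubling_time
    while t + d <= months:
        t += d
        h *= 2
        d *= shrink
    return h * 2 ** ((months - t) / d)


for m in (0, 12, 24, 36):
    print(m, round(exponential_horizon(m), 1), round(superexponential_horizon(m), 1))
```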

I think the problem here is that novel approaches are substantially more likely to fail, because they are untested and unproven. That isn't a big deal in areas where you can try lots of things and sift through the results, but in something like an election you only get feedback once a year or so. Worse, the feedback is extremely murky, so you don't know whether it was your intervention or something else that produced the outcome you care about. 
