
David Mathers🔸

5722 karma

Bio

Superforecaster, former philosophy PhD, Giving What We Can member since 2012. Currently trying to get into AI governance. 

Posts: 11

Comments: 671

Working on AI isn't the same as doing EA work on AI to reduce X-risk. Most people working in AI are just trying to make AI more capable and reliable. There probably is a case for saying that "more reliable" is actually EA X-risk work in disguise, even if unintentionally, but it's definitely not obvious this is true.

"Any sort of significant credible evidence of a major increase in AI capabilities, such as LLMs being able to autonomously and independently come up with new correct ideas in science, technology, engineering, medicine, philosophy, economics, psychology, etc"

Just in the spirit of pinning people to concrete claims: would you count progress on Frontier Math Tier 4, like, say, models hitting 40%*, as evidence that this is not so far off for mathematics specifically? (To be clear, I think it is very easy to imagine models that are doing genuinely significant research maths but still can't reliably be a personal assistant, so I am not saying this would be strong evidence of near-term AGI or anything like that.) Frontier Math Tier 4 questions allegedly require some degree of "real" mathematical creativity and were designed by actual research mathematicians, including in some cases Terry Tao (EDIT: that is, he supplied some Frontier Math questions; I'm not sure if any were Tier 4), so we're not talking cranks here. Epoch claim some of the problems can take experts weeks. If you wouldn't count this as evidence that genuine AI contributions to research mathematics might be no more than 6-7 years off, what, if anything, would you count as evidence of that? If you don't like Frontier Math Tier 4 as an early warning sign, is that because:

1) You think it's not really true that the problems require real creativity, and you don't think "uncreative" ways of solving them will ever get you to the point of being able to do actual research mathematics that could get into good journals.

2) You just don't trust that the models weren't trained on the test set, because there was a scandal about OpenAI having access to the answers. (Though, as I've said, the current state of the art is a Google model.)

3) 40% is too low, something like 90% would be needed for a real early warning sign. 

4) In principle, this would be a good early warning sign if, for all we knew, RL scaling could continue for many more orders of magnitude; but since we know it can't continue for more than a few, it isn't, because by the time you're hitting a high level on Frontier Math Tier 4, you're hitting the limits of RL scaling and can't improve further.

Of course, maybe you think the metric is fine, but you just expect progress to stall well before scores are high enough to be an early warning sign of real mathematical creativity, because of limits to RL-scaling? 
*Current best is some version of Gemini at 18%. 

Yeah, it's a fair objection that even answering the "why" question like I did presupposes that EAs are wrong, or at least merely luckily right. (I think this is a matter of degree, and that EAs overrated the imminence of AGI and the risk of takeover on average, but it's still at least reasonable to believe AI safety and governance work can have very high expected value for roughly the reasons EAs do.) But I was responding to Yarrow, who does think that EAs are just totally wrong, so I guess really I was saying "conditional on a sociological explanation being appropriate, I don't think it's as LW-driven as Yarrow thinks", although LW is undoubtedly important.

Can you say more about what makes something "a subjective guess" for you? When you say there's well under a 0.05% chance of AGI in 10 years, is that a subjective guess?

Like, suppose I am asked, as a pro forecaster, to say whether the US will invade Syria after a US military build-up involving aircraft carriers in the Eastern Med. I look for newspaper reports of signs of this, look up the base rate of how often the US bluffs with a military build-up rather than invading, and then make a guess as to how likely an invasion is. Is that "a subjective guess", or am I relying on data?

What about if I am doing what AI 2027 did and trying to predict when LLMs will match human coding ability on the basis of current data? Suppose I use the METR data like they did, and I do the following. I assume that if AIs are genuinely able to complete 90% of real-world tasks that take human coders 6 months, then they are likely as good at coding as humans. I project the METR data out to find the date when we will hit 6-month tasks, theoretically, if the trend continues. But then, instead of stopping and saying that is my forecast, I remember that benchmark performance is generally a bit misleading in terms of real-world competence, and remember METR found that AIs often couldn't complete more realistic versions of the tasks which the benchmark counted them as passing. (Couldn't find a source for this claim, but I remember seeing it somewhere.) I decide that the point when models hit a 90% completion rate on real-world 6-month tasks should maybe be a couple more doubling times of METR's 90% time-horizon metric further out, and I move my forecast for human-level coders to, say, 15 months after the original to reflect this. Am I making a subjective guess, or relying on data? When I made the adjustment to reflect issues about construct validity, did that make my forecast more subjective? If so, did it make it worse, or did it make it better? I would say better, and I think you'd probably agree, even if you still think the forecast is bad.
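For concreteness, here is a toy version of that extrapolation-plus-adjustment. The specific numbers (the current time horizon, the doubling time, what counts as a "6-month task") are placeholder assumptions for illustration only, not METR's actual figures; with an assumed ~7-month doubling time, two extra doublings comes out at roughly the 15-month shift I mentioned.

```python
import math

# Toy version of the extrapolation described above.
# All numbers are placeholder assumptions, not METR's actual figures.
current_horizon_hours = 2.0        # assumed current 90%-success time horizon, in hours
doubling_time_months = 7.0         # assumed doubling time of that horizon, in months
target_horizon_hours = 6 * 30 * 8  # "6-month task": ~6 months of 8-hour working days

# Naive trend-line forecast: months until the horizon reaches the target, if doubling continues.
doublings_needed = math.log2(target_horizon_hours / current_horizon_hours)
naive_forecast_months = doublings_needed * doubling_time_months

# Construct-validity adjustment: push the date out by roughly two extra doublings,
# to reflect benchmarks overstating real-world competence.
adjusted_forecast_months = naive_forecast_months + 2 * doubling_time_months

print(f"naive: {naive_forecast_months:.0f} months; adjusted: {adjusted_forecast_months:.0f} months")
```

The adjustment is just an explicit shift of the trend-line date, which is why I'd say it makes the forecast better rather than merely "more subjective".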

The geopolitical example above is not particularly hypothetical. I genuinely get paid to do this for Good Judgment, and not ONLY by EA orgs, although often it is by them. We don't know who the clients are, but some questions have been clearly commercial in nature and of zero EA interest.

I'm not particularly offended* if you think this kind of "get allegedly expert forecasters, rather than or as well as domain experts, to predict stuff" is nonsense. I do it because people pay me and it's great fun, rather than because I have seriously investigated its value. But I do disagree with the idea that this is distinctively a Less Wrong rationalist thing. There's a whole history of relatively well-known work on it by the American political scientist Philip Tetlock that I think began when Yudkowsky was literally still a child. It's out of that work that Good Judgment, the org for which I work as a forecaster, comes, not anything to do with Less Wrong. It's true that LessWrong rationalists are often enthusiastic about it, but that's not all that interesting on its own. (In general, many Yudkowskian ideas actually seem derived from quite mainstream sources on rationality and decision-making to me. I would not reject them just because you don't like what LW does with them. Bayesian epistemology is a real research program in philosophy, for example.)


*Or at least, I am trying my best not to be offended, because I shouldn't be, but of course I am human, and objectivity about something I derive status and employment from is hard. Though I did have a cool conversation at the last EAG London with a very good forecaster who thought it was terrible that Open Phil put money into forecasting, because it just wasn't very useful or important.

I don't think EAs' AI focus is a product only of interaction with Less Wrong (not claiming you said otherwise), but I do think people outside the Less Wrong bubble tend to be less confident AGI is imminent, and in that sense less "cautious".

I think EAs' AI focus is largely a product of the fact that Nick Bostrom knew Will and Toby when they were founding EA, and was a big influence on their ideas. Of course, to some degree this might be indirect influence from Yudkowsky, since he was always interacting with Nick Bostrom, but it's hard to know in which direction the influence flowed here. I was around in Oxford during the embryonic stages of EA, and while I was not involved (beyond being a GWWC member), I did have the odd conversation with people who were involved, and my memory is that even then people were talking about X-risk from AI as a serious contender for the best cause area, as early as at least 2014, and maybe a bit before that. They (EDIT: by "they" here I mean "some people in Oxford, I don't remember who"; I don't know when Will and Toby specifically first interacted with LW folk) were involved in discussion with LW people, but I don't think they got the idea FROM LW. It seems more likely to me that they got it from Bostrom and the Future of Humanity Institute, who were just down the corridor.

What is true is that Oxford people have genuinely expressed much more caution about timelines. I.e. in What We Owe the Future, published as late as 2022, Will is still talking about how AGI might be more than 50 years away, but also how "it might come soon - within the next fifty or even twenty years." (If you're wondering what evidence he cites, it's the Cotra bioanchors report.) His discussion primarily emphasizes uncertainty about exactly when AGI will arrive, and how we can't be confident it's not close. He cites a figure from an Open Phil report guessing an 8% chance of AGI by 2036*. I know your view is that this is all wildly wrong still, but it's quite different from what many (not all) Less Wrong people say, who tend to regard 20 years as a long timeline. (Maybe Will has updated to shorter timelines since, of course.)

I think there is something of a divide between people who believe strongly in a particular set of LessWrong-derived ideas about the imminence of AGI, and another set of people who are mainly driven by something like "we should take positive EV bets with a small chance of paying off, and do AI stuff just in case AGI arrives soon" (see the toy sketch below). Defending the point about taking positive EV bets with only a small chance of pay-off is what a huge amount of the academic work on Longtermism at the GPI in Oxford was about. (This stuff has definitely been subjected to severe levels of peer-reviewed scrutiny, as it keeps showing up in top philosophy journals with rejection rates of, like, 90%.)

*This is more evidence that people were prepared to bet big on AI risk long before the idea that AGI is actually imminent became as popular as it is now. I think people just rejected the idea that useful work could only be done when AGI was definitely near and we had near-AGI models.
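To make the "positive EV bet with a small chance of paying off" framing concrete, here is a toy sketch of the arithmetic. All the numbers are made-up assumptions purely for illustration, not anyone's actual estimates:

```python
# Toy expected-value comparison for a low-probability, high-stakes bet.
# All numbers are made-up placeholders, not actual estimates.
p_agi_soon = 0.01              # assumed (small) probability that AGI arrives soon
value_if_agi_soon = 1_000_000  # assumed value of safety/governance work in that world (arbitrary units)
value_otherwise = 0            # assume the work is worth nothing if AGI is not soon
cost_of_work = 1_000           # assumed opportunity cost of doing the work

expected_value = (p_agi_soon * value_if_agi_soon
                  + (1 - p_agi_soon) * value_otherwise
                  - cost_of_work)

print(expected_value)  # 9000.0 in this toy case: positive despite the 1% probability
```

The point is just that the bet can be positive EV even when the probability is small, provided the payoff is large enough relative to the cost; nothing in that framing requires confidence that AGI is imminent.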

People vary a lot in how they interpret terms like "unlikely" or "very unlikely" in % terms, so I think >10% is not all that obvious. But I agree that it is evidence they don't think the whole idea is totally stupid, and that they regard a relatively low probability of near-term AGI as still extremely worth worrying about.

I don't think it's clear, absent further argument, that there has to be a 10% chance of full AGI in the relatively near future to justify the currently high valuations of tech stocks. New, more powerful models could be super-valuable without being able to do all human labour. (For example, if they weren't so useful working alone, but they made human workers in most white-collar occupations much more productive.) And you haven't actually provided evidence that most experts think there's a 10% chance the current paradigm will lead to AGI. Though that point is a bit of a nitpick if 24% of experts think it will, since I agree that's likely enough to justify EA money/concern. (Maybe the survey had some "don't knows", though?)

"i don't believe very small animals feel pain, and if they do my best guess would be it would be thousands to millions orders of magnitude less pain than larger animals."

I'll repeat what regular readers of the forum are bored of me saying about this. As a philosophy of consciousness PhD, I barely ever heard the idea that small animals are conscious but their experiences are way less intense. At most, it might be a consequence of integrated information theory, but not one I ever saw discussed, and most people in the field don't endorse that one theory anyway. I cannot think of any other theory which implies this, or any philosophy of mind reason to think it is so. It seems very suspiciously like it is just something EAs say to avoid commitments to prioritizing tiny animals that seem a bit mad. Even if we take seriously the feeling that those commitments are a bit mad, there are any number of reasons that could be true apart from "small conscious brains have proportionally less intense experiences than large conscious brains."

The whole idea also smacks to me of the picture on which pain is literally a substance, like water or sand, that the brain somehow "makes" using neurons as an ingredient, in the way that combining two chemicals might make a third via a reaction, where how much of the product you get out depends on how much you put in. On mind-body dualist views this picture might make some kind of surface sense, though it'll get a bit complicated once you start thinking about the possibility of conscious aliens without neurons. But on more popular physicalist views of consciousness, this picture is just wrong: conscious pain is not stuff that the brain makes.

Nor does it particularly seem "commonsense" to me. A dog has a somewhat smaller brain than a human, but I don't think most people think that their dog CAN feel pain, but feels somewhat less pain than it appears to, because its brain is a bit smaller than a person's. Of course, it could be that intensity is the same once you hit a certain brain size, no matter how much you then scale up, but starts to drop off proportionately once you hit a certain level of smallness; but that seems pretty ad hoc.

I think when people say it is rapidly decreasing they may often mean that the % of the world's population living in extreme poverty is declining over time, rather than that the total number of people living in extreme poverty is going down?

