Glad to see this series up! Tons of great points here.
One thing I would add is a that I think the analysis about fragility of value and intervention impact has a structural problem. Supposing that the value of the future is hyper-fragile as a combination of numerous multiplicative factors, you wind up thinking the output is extremely low value compared to the maximum, so there's more to gain. OK.
But a hypothesis of hyper-fragility along these lines also indicates that after whatever interventions you make you will still get numerous multiplicative factors ...
I think there's some talking past each other happening.
I am claiming that there are real coordination problems that lead even actors who believe in a large amount of AI risk to think that they need to undertake risky AI development (or riskier) for private gain or dislike of what others would do. I think that dynamic will likely result in future governments (and companies absent government response) taking on more risk than they otherwise would, even if they think it's quite a lot of risk.
I don't think that most AI companies or governments would want...
"Second, the primary benefits—higher incomes and earlier biomedical breakthroughs—are also broadly shared; they are not gated to the single lab that crosses the finish line first."
If you look at the leaders of major AI companies you see people like Elon Musk and others who are concerned with getting to AGI before others who they distrust and fear. They fear immense power in the hands of rivals with conflicting ideologies or in general.
OpenAI was founded and funded in significant part based on Elon Musk's fear of the consequences of the Google leaders...
Right, those comments were about the big pause letter, which while nominally global in fact only applied at the time to the leading US lab, and even if voluntarily complied with would not affect the PRC's efforts to catch up in semiconductor technology, nor Chinese labs catching up algorithmically (as they have partially done).
On my view the OP's text citing me left out the most important argument from the section they linked: the closer and tighter an AI race is at the international level as the world reaches strong forms of AGI and ASI, the less slack there is for things like alignment. The US and Chinese governments have the power to prohibit their own AI companies from negligently (or willfully) racing to create AI that overthrows them, if they believed that was a serious risk and wanted to prioritize stopping it. That willingness will depend on scientific and political effo...
Absent an agreement with enough backing it to stick, slowdown by the US tightens the international gap in AI and means less slack (and less ability to pause when it counts) and more risk of catastrophe in the transition to AGI and ASI.
I agree this mechanism seems possible, but it seems far from certain to me. Three scenarios where it would be false:
I have two views in the vicinity. First, there's a general issue that human moral practice generally isn't just axiology, but also includes a number of elements that are built around interacting with other people with different axiologies, e.g. different ideologies coexisting in a liberal society, different partially selfish people or family groups coexisting fairly while preferring different outcomes. Most flavors of utilitarianism ignore those elements, and ceteris paribus would, given untrammeled power, call for outcomes that would be ruinous for ~all c...
Physicalists and illusionists mostly don't agree with the identification of 'consciousness' with magical stuff or properties bolted onto the psychological or cognitive science picture of minds. All the real feelings and psychology that drive our thinking, speech and action exist. I care about people's welfare, including experiences they like, but also other concerns they have (the welfare of their children, being remembered after they die), and that doesn't hinge on magical consciousness that we, the physical organisms having this conversation, would have ...
(I understand you are very busy this week, so please feel free to respond later.)
Re desires, the main upshot of non-dualist views of consciousness I think is responding to arguments that invoke special properties of conscious states to say they matter but not other concerns of people.
I would say that consciousness seems very plausibly special in that it seems very different from other types of things/entities/stuff we can think or talk or have concerns about. I don't know if it's special in a "magical" way or some other way (or maybe not special at all...
Here’s a fairly safe prediction: most of the potential harm from AI is potential harm to nonhuman animals.
I would think for someone who attended an AI, Animals, and Digital Minds conference it should look like an extremely precarious prediction, as AIs will likely immensely outnumber nonhuman animals, and could have much more of most features we could use in measuring 'harm'?
Rapid fire:
I agree that people should not focus on nuclear risk as a direct extinction risk (and have long argued this), see Toby's nuke extinction estimates as too high, and would assess measures to reduce damage from nuclear winter to developing neutral countries mainly in GiveWell-style or ordinary CBA terms, while considerations about future generations would favor focus on AI, and to a lesser extent bio.
However, I do think this wrongly downplays the effects on our civilization beyond casualties and local damage of a nuclear war that wrecks the current nucl...
Thank you for the comment Bob.
I agree that I also am disagreeing on the object-level, as Michael made clear with his comments (I do not think I am talking about a tiny chance, although I do not think the RP discussions characterized my views as I would), and some other methodological issues besides two-envelopes (related to the object-level ones). E.g. I would not want to treat a highly networked AI mind (with billions of bodies and computation directing them in a unified way, on the scale of humanity) as a millionth or a billionth of the welfare of ...
Lots of progress on AI, alignment, and governance. This sets up a position where it is likely that a few years later there's an AI capabilities explosion and among other things:
Not much new on that front besides continuing to back the donor lottery in recent years, for the same sorts of reasons as in the link, and focusing on research and advising rather than sourcing grants.
A bit, but more on the willingness of AI experts and some companies to sign the CAIS letter and lend their voices to the view 'we should go forward very fast with AI, but keep an eye out for better evidence of danger and have the ability to control things later.'
My model has always been that the public is technophobic, but that 'this will be constrained like peaceful nuclear power or GMO crops' isn't enough to prevent a technology that enables DSA and OOMs (and nuclear power and GMO crops exist, if AGI exists somewhere that place outgrows the rest of the w...
Thanks, I was referring to this as well, but should have had a second link for it as the Rethink page on neuron counts didn't link to the other post. I think that page is a better link than the RP page I linked, so I'll add it in my comment.
I'm not planning on continuing a long thread here, I mostly wanted to help address the questions about my previous comment, so I'll be moving on after this. But I will say two things regarding the above. First, this effect (computational scale) is smaller for chickens but progressively enormous for e.g. shrimp or lobster or flies. Second, this is a huge move and one really needs to wrestle with intertheoretic comparisons to justify it:
I guess we should combine them using a weighted geometric mean, not the weighted mean as I did above.
Suppose we...
I can't speak for Open Philanthropy, but I can explain why I personally was unmoved by the Rethink report (and think its estimates hugely overstate the case for focusing on tiny animals, although I think the corrected version of that case still has a lot to be said for it).
Luke says in the post you linked that the numbers in the graphic are not usable as expected moral weights, since ratios of expectations are not the same as expectations of ratios.
...However, I say "naively" because this doesn't actually work, due to two-envelope effects...whenev
Thanks for your discussion of the Moral Weight Project's methodology, Carl. (And to everyone else for the useful back-and-forth!) We have some thoughts about this important issue and we're keen to write more about it. Perhaps 2024 will provide the opportunity!
For now, we'll just make one brief point, which is that it’s important to separate two questions. The first concerns the relevance of the two envelopes problem to the Moral Weight Project. The second concerns alternative ways of generating moral weights. We considered the two envelopes problem at some...
...It is not unthinkably improbable that an elephant brain where reinforcement from a positive or negative stimulus adjust millions of times as many neural computations could be seen as vastly more morally important than a fruit fly, just as one might think that a fruit fly is much more important than a thermostat (which some suggest is conscious and possesses preferences). Since on some major functional aspects of mind there are differences of millions of times, that suggests a mean expected value orders of magnitude higher for the elephant if you put a bit
(I'm not at Rethink Priorities anymore, and I'm not speaking on their behalf.)
Rethink's work, as I read it, did not address that central issue, that you get wildly different results from assuming the moral value of a fruit fly is fixed and reporting possible ratios to elephant welfare as opposed to doing it the other way around.
(...)
Rethink's discussion of this almost completely sidestepped the issue in my view.
RP did in fact respond to some versions of these arguments, in the piece Do Brains Contain Many Conscious Subsystems? If So, Should We Act Di...
This consideration is something I had never thought of before and blew my mind. Thank you for sharing.
Hopefully I can summarize it (assuming I interpreted it correctly) in a different way that might help people who were as befuddled as I was.
The point is that, when you have probabilistic weight to two different theories of sentience being true, you have to assign units to sentience in these different theories in order to compare them.
Say you have two theories of sentience that are similarly probable, one dependent on intelligence and one depend...
Thanks for elaborating, Carl!
Luke says in the post you linked that the numbers in the graphic are not usable as expected moral weights, since ratios of expectations are not the same as expectations of ratios.
Let me try to restate your point, and suggest why one may disagree. If one puts weight w on the welfare range (WR) of humans relative to that of chickens being N, and 1 - w on it being n, the expected welfare range of:
One can value research and find it informative or worth doing without being convinced of every view of a given researcher or team. Open Philanthropy also sponsored a contest to surface novel considerations that could affect its views on AI timelines and risk. The winners mostly present conclusions or considerations on which AI would be a lower priority, but that doesn't imply that the judges or the institution changed their views very much in that direction.
At large scale, Information can be valuable enough to buy even if it only modestly adjusts pro...
Thank you for engaging. I don’t disagree with what you’ve written; I think you have interpreted me as implying something stronger than what I intended, and so I’ll now attempt to add some colour.
That Emily and other relevant people at OP have not fully adopted Rethink’s moral weights does not puzzle me. As you say, to expect that is to apply an unreasonably high funding bar. I am, however, puzzled that Emily and co. appear to have not updated at all towards Rethink’s numbers. At least, that’s the way I read:
...
- We don’t use Rethink’s moral weights.
- Our cur
- there not being enough practically accessible matter available (even if we only ever need a finite amount), and
This is what I was thinking about. If I need a supply of matter set aside in advance to be able to record/receive an answer, no finite supply suffices. Only an infinite brain/tape, or infinite pile of tape making resources, would suffice.
If the resources are created on demand ex nihilo, and in such a way that the expansion processes can't be just 'left on' you could try to jury rig around it.
Now, there’s an honest and accurate genie — or God or whoever’s simulating our world or an AI with extremely advanced predictive capabilities — that offers to tell you exactly how will turn out.[9] Talking to them and finding out won’t affect or its utility, they’ll just tell you what you’ll get.
This seems impossible, for the possibilities that account for ~all the expected utility (without which it's finite)? You can't fit enough bits in a human brain or lifetime (or all accessible galaxies, or whatever). Your brain would...
Alone and directly (not as a contributing factor to something else later), enough below 0.1% that I evaluate nuclear interventions based mainly on their casualties and disruption, not extinction. I would (and have) support them in the same kind of metric as GiveWell, not in extinction risk.
In the event of all-out WMD war (including with rogue AGI as belligerent) that leads to extinction nukes could be a contributing factor combined with bioweapons and AI (strategic WMD war raises the likelihoods of multiple WMDs being used together).
>It's plausible humans will go extinct from AI. It's also plausible humans will go extinct from supervolcanoes.
Our primitive and nontechnological ancestors survived tens of millions of years of supervolcano eruptions (not to mention mass extinctions from asteroid/comet impacts) and our civilization's ability to withstand them is unprecedentedly high and rapidly increasing. That's not plausible, it's enormously remote, well under 1/10,000 this century.
The thing to see is if the media attention translates into action with more than a few hundred people working on the problem as such rather than getting distracted, and government prioritizing it in conflict with competing goals (like racing to the precipice). One might have thought Covid-19 meant that GCBR pandemics would stop being neglected, but that doesn't seem right. The Biden administration has asked for Congressional approval of a pretty good pandemic prevention bill (very similar to what EAs have suggested) but it has been ...
I feel like this does not really address the question?
A possible answer to Rockwell's question might be "If we have 15000 scientists working full-time on AIS, then I consider AIS to no longer be neglected" (this is hypothetical, I do not endorse it. And its also not as contextualized as Rockwell would want it).
But maybe I am interpreting the question too literally and you are making a reasonable guess what Rockwell wants to hear.
I actually do every so often go over the talks from the past several EAGs on Youtube and find it does better. Some important additional benefits are turning on speedup and subtitles, being able to skip forward or bail more easily if the talk turns out bad, and not being blocked from watching two good simultaneous talks.
In contrast, a lot of people really love in-person meetings compared to online video or phone.
I disagree with the idea that short AI timelines are not investable (although I agree interest rates are a bad and lagging indicator vs AI stocks). People foreseeing increased expectations of AI sales as a result of scaling laws, shortish AI timelines, and the eventual magnitude of success have already made a lot of money investing in Nvidia, DeepMind and OpenAI. Incremental progress increases those expectations, and they can increase even in worlds where AGI winds up killing or expropriating all investors so long as there is some expectation of enough inv...
Carl, I agree with everything you're saying, so I'm a bit confused about why you think you disagree with this post.
This post is a response to the very specific case made in an earlier forum post, where they use a limited scenario to define transformative AI, and then argue that we should see interest rates rising if if traders believe that scenario to be near.
I argue that we can't use interest rates to judge if said, specific scenario is near or not. That doesn't mean there are no ways to bet on AI (in a broader sense). Yes, when tech firms are tradi...
If you haven't read this piece by Ajeya Cotra, Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover I would highly recommend it. Some of the post on AI alignment here (aimed at a general audience) might also be helpful.
I was going from this: "The DICE baseline emissions scenario results in 83 million cumulative excess deaths by 2100 in the central estimate. Seventy-four million of these deaths can be averted by pursuing the DICE-EMR optimal emissions path." I didn't get into deaths vs DALYs (excess deaths among those with less life left to live), chances of scenarios, etc, and gave 'on the order of' for slack.
"But I don't see why we're talking about scale. Are you defining neglectedness as a ratio of <people potentially killed in worst case>/<dollars spent>?"...
In this 2022 ML survey the median credence on extinction-level catastrophe from AI is 5%, with 48% of respondents giving 10%. Some generalist forecaster platforms put numbers significantly lower, some forecasting teams or researchers with excellent forecasting records and more knowledge of the area put more (with I think the tendency being for more information to yield higher forecasts, and my own expectation). This scale looks like hundreds of millions of deaths or equivalent this century to me, although certainly many disagree. The argu...
I'm not sure this is super relevant to our core disagreement (if we have one), but how are you counting this? Glancing at that article, it looks like a pessimistic take on climate change's harm puts excess deaths at around 10m per year, and such damage would persist much more than 10 years.
But I don't see why we're talking about scale. Are you defining neglectedness as a ratio of <people potentially killed in worst case>/<dollars spent>?
How exactly could that be true?
Because coders who don't work explicitly on AI alignment sti...
$200B includes a lot of aid aimed at other political goals more than humanitarian impact, , with most of a billion people living at less than $700/yr, while the global economy is over $100,000B and cash transfer programs in rich countries are many trillions of dollars. That's the neglectedness that bumps of global aid interventions relative to local rich country help to the local relative poor.
You can get fairly arbitrarily bad cost-effectiveness in any area by taking money and wasting on it things that generate less value than the money. E.g. spending 99.9% on digging holes and filling them in, and 0.1% on GiveDirectly. But just handing over the money to the poor is a relevant attainable baseline.
Calling an area neglected because a lot of money is spent badly sounds like a highly subjective evaluation that's hard to turn into a useful principle. Sure, $200B annually is a small proportion of the global economy, but so is almost any cause area you can describe. From a quick search, the World Bank explicitly spends slightly more than a tenth of that on climate change, one of the classically 'non-neglected' evaluands of EA. It's hard to know how to compare these figures, since they obviously omit a huge number of other projects, but I doubt the WB cons...
Helping the global poor is neglected, and that accounts for most bednet outperformance. GiveDirectly, just giving cash, is thought by GiveWell/GHW to be something like 100x better on direct welfare than rich country consumption (although indirect effects reduce that gap), vs 1000x+ for bednets. So most of the log gains come from doing stuff with the global poor at all. Then bednets have a lot of their gains as positive externalities (protecting one person also protects others around them), and you're left with a little bit of 'being more confident about be...
Here's an example of a past case where a troll (who also trolled other online communities) made up multiple sock-puppet accounts, and assorted lies about sources for various arguments trashing AI safety, e.g. claiming to have been at events they were not and heard bad things, inventing nonexistent experts who supposedly rejected various claims, creating fake testimonials of badness, smearing people who discovered the deception, etc.
Came here to cite the same thing! :)
Note that Dustin Moskovitz says he's not a longtermist, and "Holden isn't even much of a longtermist."

So my intuition is that the two main important updates EA has undergone are "it's not that implausible that par-human AI is coming in the next couple of decades" and "the world is in fact dropping the ball on this quite badly, in the sense that maybe alignment isn't super hard, but to a first approximation no one in the field has checked."
(Which is both an effect and a cause of updates like "maybe we can figure stu...
If investors with $1T thought AGI soon, and therefore tried to buy up a portfolio of semiconductor, cloud, and AI companies (a much more profitable and capital-efficient strategy than betting on real interest rates) they could only a buy a small fraction of those industries at current prices. There is a larger pool of investors who would sell at much higher than current prices, balancing that minority.
Yes, it's weighted by capital and views on asset prices, but still a small portion of the relevant capital trying to trade (with risk and years in advance) o...
That there are particular arguments for decisions like bednets or eating sandwiches to have expected impacts that scale with the scope of the universes or galactic civilizations. E.g. the more stars you think civilization will be able to colonize, or the more computation that will be harvested, the greater your estimate of the number of sims in situations like ours (who will act the same as we do, so that on plausible decision theories we should think of ourselves as setting policy at least for the psychologically identical ones). So if you update to...
This sort of estimate is in general off by many orders of magnitude for thinking about the ratio of impact between different interventions when it only considers paths to very large numbers for the intervention under consideration, and not to reference interventions being compared against. For example, the expected number of lives saved from giving a bednet is infinite. Connecting to size-of-the-accessible-universe estimates, perhaps there are many simulations of situations like ours at an astronomical scale, and so our decisions will be replicated a...
Thanks Will!