All of CarlShulman's Comments + Replies

[Duplicate comment removed.]

[This comment is no longer endorsed by its author]
2
William_MacAskill
Ah, I responded on the other comment, here.  

Glad to see this series up! Tons of great points here.

One thing I would add is that I think the analysis of fragility of value and intervention impact has a structural problem. Supposing that the value of the future is hyper-fragile as a combination of numerous multiplicative factors, you wind up thinking the output is extremely low value compared to the maximum, so there's more to gain. OK.

But a hypothesis of hyper-fragility along these lines also indicates that after whatever interventions you make you will still get numerous multiplicative factors ... (read more)
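To make the structural point concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the value of the future is a product of k independent factors each uniform on [0, 1], and that an intervention can push one factor all the way to its maximum; none of these numbers come from the post.

```python
def expected_value(k, factors_fixed=0):
    """Expected value of the future, modeled (as an illustrative assumption) as the
    product of k independent factors, each uniform on [0, 1], so each contributes
    1/2 in expectation. `factors_fixed` of them are pushed to their maximum of 1."""
    return 0.5 ** (k - factors_fixed)

for k in (2, 5, 100):
    base = expected_value(k)
    after = expected_value(k, factors_fixed=1)
    print(f"k={k:>3}: baseline {base:.2e} of max, "
          f"after fixing one factor {after:.2e} of max, gain {after - base:.2e}")
```

With k = 100 the baseline is ~8e-31 of the maximum, and even after a fully successful intervention on one factor you are still only ~1.6e-30 of the maximum: the remaining factors eat almost all of the gain. With k = 5 the same intervention captures a few percent of the maximum, the "small number of multiplicative factors" sweet spot described in the reply below.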

4
William_MacAskill
Thanks! And it’s great to see you back on here!

Well, it depends on how many multiplicative factors. If 100, then yes. If 5, then maybe not. So maybe the sweet spot for impact is where value is multiplicative, but with a relatively small number of multiplicative factors. And you could act to make the difference in worlds in which society has already gotten all-but-one of the factors correct. Or act such that all the factors are better, in a correlated way.

Great - I make a similar argument in Convergence and Compromise, section 5. (Apologies that the series is so long and interrelated!) I’ll quote the whole thing at the bottom of this comment. Here I want to emphasise the distinction between two ways in which it could be “easy” to get things right: (i) mostly-great futures are a broad target because of the nature of ethics (e.g. bounded value at low bounds); (ii) (some) future beings will converge on the best views and promote them. (This essay (No Easy Eutopia) is about (i), and Convergence and Compromise is about (ii).)

W r t (ii)-type reasons, I think this argument works. I don’t think it works w r t (i)-type reasons, though, because of questions around intertheoretic comparisons. On (i)-type reasons, it’s easier to get to a meaningful % of the optimum because of the nature of ethics (e.g. value is bounded rather than unbounded). But then we need to compare the stakes across different theories. And normalising at the difference in value between 0 and 100% would be a big mistake; it seeming “natural” is just an artifact of the notation we’ve used. We discuss the intertheoretic comparisons issue in section 3.5 of No Easy Eutopia.

And here's Convergence and Compromise, section 5:

I think there's some talking past each other happening. 

I am claiming that there are real coordination problems that lead even actors who believe in a large amount of AI risk to think that they need to undertake risky AI development (or riskier) for private gain or dislike of what others would do. I think that dynamic will likely result in future governments (and companies absent government response) taking on more risk than they otherwise would, even if they think it's quite a lot of risk.

I don't think that most AI companies or governments would want... (read more)
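To illustrate the kind of coordination problem being claimed here, a minimal two-actor payoff sketch in Python; the payoff numbers are invented for illustration and are not taken from either commenter.

```python
# Invented payoffs: each actor fears both catastrophe risk from an uncoordinated
# race and a disliked rival gaining a decisive lead if it pauses unilaterally.
PAYOFF = {
    ("race", "race"): 6,    # both race: extra shared catastrophe risk
    ("race", "pause"): 13,  # private edge from being ahead
    ("pause", "race"): 5,   # the rival you distrust gets the lead
    ("pause", "pause"): 10, # coordinated slowdown: most slack for safety
}

def best_response(other_choice):
    return max(("race", "pause"), key=lambda mine: PAYOFF[(mine, other_choice)])

for other in ("race", "pause"):
    print(f"if the other actor plays {other:>5}, best response is {best_response(other)}")
# Racing is the best response either way, yet mutual racing (6 each) leaves both
# worse off than a mutual pause (10 each): the tragedy-of-the-commons structure.
```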

2
Matthew_Barnett
Perhaps I overstated some of my claims or was unclear. So let me try to be more clear about my basic thesis. First of all, I agree that in the most basic model of the situation, being slightly ahead of a competitor can be the decisive factor between going bankrupt and making enormous profits. This creates a significant personal incentive to race ahead, even if doing so only marginally increases existential risk overall. As a result, AI labs may end up taking on more risk than they would in the absence of such pressure. More generally, I agree that without competition—whether between states or between AI companies—progress would likely be slower than it currently is. My main point, however, is that these effects are likely not strong enough to justify the conclusion that the socially optimal pace of AI R&D is meaningfully slower than the current pace we in fact observe. In other words, I’m not convinced that what's rational from an individual actor’s perspective diverges greatly from what would be rational from a collective or societal standpoint. This is the central claim underlying my objection: if there is no meaningful difference between what is individually rational and what is collectively rational, then there is little reason to believe we are facing a tragedy-of-the-commons scenario as suggested in the post. To sketch a more complete argument here, I would like to make two points: First, while some forces incentivize speeding up AI development, others push in the opposite direction. Measures like export controls, tariffs, and (potentially) future AI regulations can slow down progress. In these cases, the described dynamic flips: the global costs of slowing down are shared, while the political rewards—such as public credit or influence—are concentrated among the policymakers or lobbyists who implement the slowdown. Second, as I’ve mentioned, a large share of both the risks and benefits of AI accrue directly to those driving its development. This alignment

"Second, the primary benefits—higher incomes and earlier biomedical breakthroughs—are also broadly shared; they are not gated to the single lab that crosses the finish line first."

If you look at the leaders of major AI companies you see people like Elon Musk and others who are concerned with getting to AGI before others who they distrust and fear. They fear immense power in the hands of rivals with conflicting ideologies or in general. 

OpenAI was founded and funded in significant part based on Elon Musk's fear of the consequences of the Google leaders... (read more)

4
Matthew_Barnett
Musk is a vivid example of the type of dynamic you're describing, but he’s also fairly unusual in this regard. Sundar Pichai, Satya Nadella, and most other senior execs strike me as more like conventional CEOs: they want market share, profits, and higher margins, but they're not seeking the kind of hegemonic control that would justify accepting a much higher p(doom). If the dominant motive is ordinary profit maximization rather than paranoid power-seeking, then my original point stands: both the upside (huge profit streams) and the downside (self-annihilation) accrue to the people pushing AI forward, so the private incentives already internalize a large chunk of the social calculus. Likewise, it's true that governments often seek hegemonic control over the world in a way that creates destructive arms races, but even in the absence of such motives, there would still be a strong desire among humans to advance most technologies to take advantage of the benefits.  The most important fact here is that AI has an enormous upside: people would still have strong reasons to aggressively seek it to obtain life extension and extraordinary wealth even in the absence of competitive dynamics—unless they were convinced that the risk from pursuing that upside was unacceptably high (which is an epistemic consideration, not a game-theoretic trap). You may have misread me here. I'm not claiming that AI labs are motivated by a desire to create broad prosperity. They certainly do care about "power," but the key question is whether they're primarily driven by the type of zero-sum, adversarial power-seeking you described. I'm skeptical that this is the dominant motive. Instead, I think ordinary material incentives likely play a larger role. The unilateralist’s curse is primarily worrisome when the true value of an initiative is negative; for good projects it usually helps them proceed. Moreover, if good projects can be vetoed (e.g., via regulators), this creates a reverse curse that ca

Right, those comments were about the big pause letter, which while nominally global in fact only applied at the time to the leading US lab, and even if voluntarily complied with would not affect the PRC's efforts to catch up in semiconductor technology, nor Chinese labs catching up algorithmically (as they have partially done).

Sure, these are possible. My view above was about expectations. #1 and #2 are possible, but look less likely to me. There's some truth to #3, but the net effect is still gap-closing, and the slowing tends to fall earlier (when it is less impactful) rather than later.

On my view the OP's text citing me left out the most important argument from the section they linked: the closer and tighter an AI race is at the international level as the world reaches strong forms of AGI and ASI, the less slack there is for things like alignment. The US and Chinese governments have the power to prohibit their own AI companies from negligently (or willfully) racing to create AI that overthrows them, if they believed that was a serious risk and wanted to prioritize stopping it. That willingness will depend on scientific and political effo... (read more)

1
Nate Sharpe
Thanks for the thoughtful feedback Carl, I appreciate it. This is one of my first posts here so I'm unsure of the norms - is it acceptable/preferred that I edit the post to add that point to the bulleted list in that section (and if so, do I add an "edited to add" or similar tag), or just leave it to the comments for clarification? I hope the bulk of the post made it clear that I agree with what you're saying - a pause is only useful if it's universal, and so what we need to do first is get universal agreement among the players that matter on why, when, and how to pause.
5
Holly Elmore ⏸️ 🔸
(Makes much more sense if you were talking about unilateral pauses! The PauseAI pause is international, so that's just how I think of Pause.)

Absent an agreement with enough backing it to stick, slowdown by the US tightens the international gap in AI and means less slack (and less ability to pause when it counts) and more risk of catastrophe in the transition to AGI and ASI.

I agree this mechanism seems possible, but it seems far from certain to me. Three scenarios where it would be false:

  • One country pauses, which gives the other country a commanding lead with even more slack than anyone had before.
  • One country pauses, and the other country, facing reduced incentives for haste, also pauses.
  • One cou
... (read more)

I have two views in the vicinity. First, there's a general issue that human moral practice generally isn't just axiology, but also includes a number of elements that are built around interacting with other people with different axiologies, e.g. different ideologies coexisting in a liberal society, different partially selfish people or family groups coexisting fairly while preferring different outcomes. Most flavors of utilitarianism ignore those elements, and ceteris paribus would, given untrammeled power, call for outcomes that would be ruinous for ~all c... (read more)

3
Michael St Jules 🔸
(You may be aware of these already, but I figured they were worth sharing if not, and for the benefit of other readers.) Some "preference-affecting views" do much better on these counts and can still be interpreted as basically utilitarian (although perhaps not based on "axiology" per se, depending on how that's characterized). In particular:
  1. Object versions of preference views, as defended in Rabinowicz & Österberg, 1996 and van Weeldon, 2019. These views are concerned with achieving the objects of preferences/desires, essentially taking on everyone's preferences/desires like moral views weighed against one another. They are not (necessarily) concerned with having satisfied preferences/desires per se, or just having more favourable attitudes (like hedonism and other experientialist views), or even objective/stance-independent measures of "value" across outcomes.[1]
  2. The narrow and hard asymmetric view of Thomas, 2019 (for binary choices), applied to preferences/desires instead of whole persons or whole person welfare. In binary choices, if we add a group of preferences/desires and assume no other preference/desire is affected, this asymmetry is indifferent to the addition of the group if their expected total value (summing the value in favourable and disfavourable attitudes) is non-negative, but recommends against it if their expected total value is negative. It is also indifferent between adding one favourable attitude and another even more favourable attitude. Wide views, which treat contingent counterparts as if they're necessary, lead to replacement.
  3. Actualism, applied to preferences instead of whole persons or whole person welfare (Hare, 2007, Bykvist, 2007, St. Jules, 2019, Cohen, 2020, Spencer, 2021, for binary choices).
  4. Dasgupta's view, or other modifications of the above views in a similar direction, for more than two options to choose from, applied to preferences instead of whole persons or whole person welfare. This can avoid repugnance

Physicalists and illusionists mostly don't agree with the identification of 'consciousness' with magical stuff or properties bolted onto the psychological or cognitive science picture of minds. All the real feelings and psychology that drive our thinking, speech and action exist. I care about people's welfare, including experiences they like, but also other concerns they have (the welfare of their children, being remembered after they die), and that doesn't hinge on magical consciousness that we, the physical organisms having this conversation, would have ... (read more)

(I understand you are very busy this week, so please feel free to respond later.)

Re desires, the main upshot of non-dualist views of consciousness, I think, is in responding to arguments that invoke special properties of conscious states to say those states matter but other concerns of people do not.

I would say that consciousness seems very plausibly special in that it seems very different from other types of things/entities/stuff we can think or talk or have concerns about. I don't know if it's special in a "magical" way or some other way (or maybe not special at all... (read more)

Here’s a fairly safe prediction: most of the potential harm from AI is potential harm to nonhuman animals.

I would think that, for someone who attended an AI, Animals, and Digital Minds conference, this should look like an extremely precarious prediction, as AIs will likely immensely outnumber nonhuman animals, and could have much more of most features we could use in measuring 'harm'?

2
Zachary Brown🔸
Thanks for the comment. I was clearly too quick with that opening statement. Perhaps in part I let my epistemic guard down there out of general frustration at the neglectedness of the topic, and a desire to attract some attention with a bold opener. So much harm could accrue to nonhuman animals relative to humans, and I really want more discussion on this. PLF is -- I've argued, anyway -- a highly visible threat to the welfare of zillions, but rarely mentioned. I hope you'll forgive an immodest but emotional claim. I've edited the opener and the footnote to be more defensible, in response to this comment. I actually don't believe, in the median scenario, that AIs are likely to both outnumber sentient animals and have a high likelihood of suffering, but I don't really want that to be the focus of this piece. And either way, I don't believe that with high certainty: in that respect, the statement was not reflective of my views.

Rapid fire:
 

  • Nearterm extinction risk from AI is wildly closer to total AI x-risk than the nuclear analog
  • My guess is that nuclear war interventions powerful enough to be world-beating for future generations would look tremendous in averting current human deaths, and most of the WTP should come from that if one has a lot of WTP related to each of those worldviews
  • Re suspicious convergence, what do you want to argue with here? I've favored allocation on VOI and low-hanging fruit on nuclear risk not leveraging AI related things in the past less than 1% of
... (read more)
2
Vasco Grilo🔸
Thanks for following up! Sorry for the lack of clarity. Some thoughts:
  • The 15.3 M$ that grantmakers aligned with effective altruism have influenced, aiming to decrease nuclear risk, seem mostly optimised to decrease the nearterm damage caused by nuclear war (especially the spending on nuclear winter), not the more longterm existential risk linked to permanent global totalitarianism.
  • As far as I know, there has been little research on how a minor AI catastrophe would influence AI existential risk (although wars over Taiwan have been wargamed). Looking into this seems more relevant than investigating how a non-AI catastrophe would influence AI risk.
  • The risk from permanent global totalitarianism is still poorly understood, so research on this and how to mitigate it seems more valuable than efforts focussing explicitly on nuclear war. There might well be interventions to increase democracy levels in China which are more effective at decreasing that risk than interventions aimed at ensuring that China does not become the sole global hegemon after a nuclear war.
  • I guess most of the risk from permanent global totalitarianism does not involve any major catastrophes. As a data point, the Metaculus community predicts an AI dystopia is 5 (= 0.19/0.037) times as likely as a paperclipalypse by 2050.
More broadly, which pieces would you recommend reading on this topic? I am not aware of substantial blogposts, although I have seen the concern raised many times.

I agree that people should not focus on nuclear risk as a direct extinction risk (and have long argued this), see Toby's nuke extinction estimates as too high, and would assess measures to reduce damage from nuclear winter to developing neutral countries mainly in GiveWell-style or ordinary CBA terms, while considerations about future generations would favor focus on AI, and to a lesser extent bio. 

However, I do think this wrongly downplays the effects on our civilization beyond casualties and local damage of a nuclear war that wrecks the current nucl... (read more)

2
Vasco Grilo🔸
Thanks for sharing your thoughts, Carl! Thanks for mentioning these points. Would you also rely on ordinary CBAs to assess interventions to decrease the direct damage of nuclear war? I think this would still make sense. At the same time, the nearterm extinction risk from AI also misses most of the existential risk from AI? I guess you are implying that the ratio between nearterm extinction risk and total existential risk is lower for nuclear war than for AI. Related to your point above, I say that: Regarding: Note I mention right after this that: You say that: I agree these are relevant considerations. On the other hand:
  • The US may want to attack China in order not to relinquish its position as global hegemon.
  • I feel like there has been little research on questions like:
    • How much it would matter if powerful AI was developed in the West instead of China (or, more broadly, in a democracy instead of an autocracy).
    • The likelihood of lock-in.
On the last point, your piece is a great contribution, but you say: However, the likelihood of lock-in is crucial to assess the strength of your points. I would not be surprised if the chance of an AI lock-in due to a nuclear war was less than 10^-8 this century. In terms of nuclear war indirectly causing extinction: In contrast, if powerful AI caused extinction, control over the future would arguably permanently be lost. Agreed. Is there any evidence for this? Makes sense. If GiveWell's top charities are not a cost-effective way of improving the longterm future, then decreasing starvation in low income countries in a nuclear winter may be cost-effective in terms of saving lives, but has seemingly negligible impact on the longterm future too. Such countries just have too little influence on transformative technologies.

Thank you for the comment Bob.

I agree that I also am disagreeing on the object-level, as Michael made clear with his comments (I do not think I am talking about a tiny chance, although I do not think the RP discussions characterized my views as I would), and some other methodological issues besides two-envelopes (related to the object-level ones).  E.g. I would not want to treat a highly networked AI mind (with billions of bodies and computation directing them in a unified way, on the scale of humanity) as a millionth or a billionth of the welfare of ... (read more)

9
Michael St Jules 🔸
(Speaking for myself only.) FWIW, I think something like conscious subsystems (in huge numbers in one neural network) is more plausible by design in future AI. It just seems unlikely in animals because all of the apparent subjective value seems to happen at roughly the highest level where everything is integrated in an animal brain. Felt desire seems to (largely) be motivational salience, a top-down/voluntary attention control function driven by high-level interpretations of stimuli (e.g. objects, social situations), so relatively late in processing. Similarly, hedonic states depend on high-level interpretations, too. Or, according to Attention Schema Theory, attention models evolved for the voluntary control of attention. It's not clear what the value would be for an attention model at lower levels of organization before integration. And evolution will select against realizing functions unnecessarily if they have additional costs, so we should provide a positive argument for the necessary functions being realized earlier or multiple times in parallel that overcomes or doesn't incur such additional costs. So, it's not that integration necessarily reduces value; it's that, in animals, all the morally valuable stuff happens after most of the integration, and apparently only once or in small number. In artificial systems, the morally valuable stuff could instead be implemented separately by design at multiple levels. EDIT: I think there's still crux about whether realizing the same function the same number of times but "to a greater degree" makes it more morally valuable. I think there are some ways of "to a greater degree" that don't matter, and some that could. If it's only sort of (vaguely) true that a system is realizing a certain function, or it realizes some but not all of the functions possibly necessary for some type of welfare in humans, then we might discount it for only meeting lower precisifications of the vague standards. But adding more neurons ju

Lots of progress on AI, alignment, and governance. This sets up a position where it is likely that a few years later there's an AI capabilities explosion and among other things:
 

  • Mean human wealth skyrockets, while AI+robots make cultured meat and substitutes, as well as high welfare systems (and reengineering biology) cheap relative to consumers' wealth; human use of superintelligent AI advisors leads to global bans on farming with miserable animals and/or all farming
  • Perfect neuroscientific and psychological knowledge of humans and animals, combined w
... (read more)

Not much new on that front besides continuing to back the donor lottery in recent years, for the same sorts of reasons as in the link, and focusing on research and advising rather than sourcing grants.

A bit, but more on the willingness of AI experts and some companies to sign the CAIS letter and lend their voices to the view 'we should go forward very fast with AI, but keep an eye out for better evidence of danger and have the ability to control things later.'

My model has always been that the public is technophobic, but that 'this will be constrained like peaceful nuclear power or GMO crops' isn't enough to prevent a technology that enables DSA and OOMs (and nuclear power and GMO crops exist, if AGI exists somewhere that place outgrows the rest of the w... (read more)

I don't want to convey that there was no discussion, thus my linking the discussion and saying I found it inadequate and largely missing the point from my perspective. I made an edit for clarity, but would accept suggestions for another.

 

1
Michael St Jules 🔸
Your edit looks good to me. Thanks!

I have never calculated moral weights for Open Philanthropy, and as far as I know no one has claimed that. The comment you are presumably responding to began by saying I couldn't speak for Open Philanthropy on that topic, and I wasn't.

Thanks, I was referring to this as well, but should have had a second link for it as the Rethink page on neuron counts didn't link to the other post. I think that page is a better link than the RP page I linked, so I'll add it in my comment.

5
Michael St Jules 🔸
(Again, not speaking on behalf of Rethink Priorities, and I don't work there anymore.) (Btw, the quote formatting in your original comment got messed up with your edit.) I think the claims I quoted are still basically false, though? Do Brains Contain Many Conscious Subsystems? If So, Should We Act Differently? explicitly considered a conscious subsystems version of this thought experiment, focusing on the more human-favouring side when you normalize by small systems like insect brains, which is the non-obvious side often neglected. There's a case that conscious subsystems could dominate expected welfare ranges even without intertheoretic comparisons (but also possibly with), so I think we were focusing on one of the strongest and most important arguments for humans potentially mattering more, assuming hedonism and expectational total utilitarianism. Maximizing expected choiceworthiness with intertheoretic comparisons is controversial and only one of multiple competing approaches to moral uncertainty. I'm personally very skeptical of it because of the arbitrariness of intertheoretic comparisons and its fanaticism (including chasing infinities, and lexically higher and higher infinities). Open Phil also already avoids making intertheoretic comparisons, but was more sympathetic to normalizing by humans if it were going to.

I'm not planning on continuing a long thread here, I mostly wanted to help address the questions about my previous comment, so I'll be moving on after this. But I will say two things regarding the above. First, this effect (computational scale) is smaller for chickens but progressively enormous for e.g. shrimp or lobster or flies.  Second, this is a huge move and one really needs to wrestle with intertheoretic comparisons to justify it:

I guess we should combine them using a weighted geometric mean, not the weighted mean as I did above. 

Suppose we... (read more)

7
Vasco Grilo🔸
Fair, as this is outside of the scope of the original post. I noticed you did not comment on RP's neuron counts post. I think it would be valuable if you commented there about the concerns you expressed here, or did you already express them elsewhere in another post of RP's moral weight project sequence? I agree that is the case if one combines the 2 wildly different estimates for the welfare range (e.g. one based on the number of neurons, and another corresponding to RP's median welfare ranges) with a weighted mean. However, as I commented above, using the geometric mean would cancel the effect. Is this a good analogy? Maybe not:
  • Broadly speaking, giving the same weight to multiple estimates only makes sense if there is wide uncertainty with respect to which one is more reliable. In the example above, it would make sense to give negligible weight to all metrics except for the aggregate mass. In contrast, there is arguably wide uncertainty with respect to what are the best models to measure welfare ranges, and therefore distributing weights evenly is more appropriate.
  • One particular model on which we can put lots of weight is that mass is straightforwardly additive (at least at the macro scale). So we can say the mass of all humans equals the number of humans times the mass per human, and then just estimate this for a typical human. In contrast, it is arguably unclear whether one can obtain the welfare range of an animal by e.g. just adding up the welfare range of its individual neurons.
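As a quick illustration of why the pooling rule matters here, the following sketch uses invented numbers (not RP's or Carl's figures): two equally weighted estimates of the human:chicken welfare-range ratio, one huge and one modest.

```python
from math import prod

estimates = [400_000, 3]  # invented: a neuron-count-style ratio vs. a behaviour-based ratio
weights = [0.5, 0.5]

arithmetic = sum(w * x for w, x in zip(weights, estimates))
geometric = prod(x ** w for w, x in zip(weights, estimates))

print(f"weighted arithmetic mean: {arithmetic:,.0f}")  # ~200,001 -- the huge estimate dominates
print(f"weighted geometric mean:  {geometric:,.0f}")   # ~1,095 -- the huge estimate largely cancels
```

This is the crux of the exchange: under a weighted arithmetic mean the view assigning a huge ratio dominates even at modest weight, while a weighted geometric mean largely cancels it, so the choice between them (and its justification under moral uncertainty) does most of the work.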
CarlShulman
164

I can't speak for Open Philanthropy, but I can explain why I personally was unmoved by the Rethink report (and think its estimates hugely overstate the case for focusing on tiny animals, although I think the corrected version of that case still has a lot to be said for it).
 
Luke says in the post you linked that the numbers in the graphic are not usable as expected moral weights, since ratios of expectations are not the same as expectations of ratios.
 

However, I say "naively" because this doesn't actually work, due to two-envelope effects...whenev

... (read more)
4
MichaelDickens
It seems to me that the naive way to handle the two envelopes problem (and I've never heard of a way better than the naive way) is to diversify your donations across two possible solutions to the two envelopes problem:
  • donate half your (neartermist) money on the assumption that you should use ratios to fixed human value
  • donate half your money on the assumption that you should fix the opposite way (e.g. fruit flies have fixed value)
Which would suggest donating half to animal welfare and probably half to global poverty. (If you let moral weights be linear with neuron count, I think that would still favor animal welfare, but you could get global poverty outweighing animal welfare if moral weight grows super-linearly with neuron count.) Plausibly there are other neartermist worldviews you might include that don't relate to the two envelopes problem, e.g. an "only give to the most robust interventions" worldview might favor GiveDirectly. So I could see an allocation of less than 50% to animal welfare.

Thanks for your discussion of the Moral Weight Project's methodology, Carl. (And to everyone else for the useful back-and-forth!) We have some thoughts about this important issue and we're keen to write more about it. Perhaps 2024 will provide the opportunity!

For now, we'll just make one brief point, which is that it’s important to separate two questions. The first concerns the relevance of the two envelopes problem to the Moral Weight Project. The second concerns alternative ways of generating moral weights. We considered the two envelopes problem at some... (read more)

It is not unthinkably improbable that an elephant brain, where reinforcement from a positive or negative stimulus adjusts millions of times as many neural computations, could be seen as vastly more morally important than a fruit fly, just as one might think that a fruit fly is much more important than a thermostat (which some suggest is conscious and possesses preferences). Since on some major functional aspects of mind there are differences of millions of times, that suggests a mean expected value orders of magnitude higher for the elephant if you put a bit

... (read more)

(I'm not at Rethink Priorities anymore, and I'm not speaking on their behalf.)

Rethink's work, as I read it, did not address that central issue, that you get wildly different results from assuming the moral value of a fruit fly is fixed and reporting possible ratios to elephant welfare as opposed to doing it the other way around. 

(...)

Rethink's discussion of this almost completely sidestepped the issue in my view.

RP did in fact respond to some versions of these arguments, in the piece Do Brains Contain Many Conscious Subsystems? If So, Should We Act Di... (read more)

Gil
30

This consideration is something I had never thought of before and blew my mind. Thank you for sharing.

Hopefully I can summarize it (assuming I interpreted it correctly) in a different way that might help people who were as befuddled as I was. 

The point is that, when you give probabilistic weight to two different theories of sentience being true, you have to assign units to sentience in these different theories in order to compare them.

Say you have two theories of sentience that are similarly probable, one dependent on intelligence and one depend... (read more)

Thanks for elaborating, Carl!

Luke says in the post you linked that the numbers in the graphic are not usable as expected moral weights, since ratios of expectations are not the same as expectations of ratios.

Let me try to restate your point, and suggest why one may disagree. If one puts weight w on the welfare range (WR) of humans relative to that of chickens being N, and 1 - w on it being n, the expected welfare range of:

  • Humans relative to that of chickens is E("WR of humans"/"WR of chickens") = w*N + (1 - w)*n.
  • Chickens relative to that of humans is E("WR
... (read more)
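To see the problem with concrete (made-up) numbers, take w = 0.5, N = 1000 and n = 1:

  • Holding chicken welfare fixed: E("WR of humans"/"WR of chickens") = 0.5*1000 + 0.5*1 = 500.5, so humans look roughly 500 times as important as chickens.
  • Holding human welfare fixed: E("WR of chickens"/"WR of humans") = 0.5*(1/1000) + 0.5*1 ≈ 0.5, so chickens look about half as important as humans.

Since 1/0.5 = 2 is nowhere near 500.5, the implied ratio depends enormously on which quantity is held fixed before taking expectations (E[X/Y] is not 1/E[Y/X]), which is the two-envelopes issue Carl is raising.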

One can value research and find it informative or worth doing without being convinced of every view of a given researcher or team.  Open Philanthropy also sponsored a contest to surface novel considerations that could affect its views on AI timelines and risk. The winners mostly present conclusions or considerations on which AI would be a lower priority, but that doesn't imply that the judges or the institution changed their views very much in that direction.

At large scale, information can be valuable enough to buy even if it only modestly adjusts pro... (read more)

Thank you for engaging. I don’t disagree with what you’ve written; I think you have interpreted me as implying something stronger than what I intended, and so I’ll now attempt to add some colour.

That Emily and other relevant people at OP have not fully adopted Rethink’s moral weights does not puzzle me. As you say, to expect that is to apply an unreasonably high funding bar. I am, however, puzzled that Emily and co. appear to have not updated at all towards Rethink’s numbers. At least, that’s the way I read:

  • We don’t use Rethink’s moral weights.
    • Our cur
... (read more)
  1. there not being enough practically accessible matter available (even if we only ever need a finite amount), and

This is what I was thinking about. If I need a supply of matter set aside in advance to be able to record/receive an answer, no finite supply suffices. Only an infinite brain/tape, or infinite pile of tape making resources, would suffice. 

If the resources are created on demand ex nihilo, and in such a way that the expansion processes can't be just 'left on', you could try to jury-rig around it.

1
Michael St Jules 🔸
The resources wouldn't necessarily need to be created on demand ex nihilo either (although that would suffice), but either way, we're forced into extremely remote possibilities — denying our current best understanding of physics — and perhaps less likely than infinite accessible resources (or other relevant infinities). That should be enough to say it's less conservative than actual infinities and make your point for this particular money pump, but it again doesn't necessarily depend on actual infinities. However, some people actually assign 0 probability to infinity (I think they're wrong to do so), and some of them may be willing to grant this possibility instead. For them, it would actually be more conservative. The resources could just already exist by assumption in large enough quantities by outcome in the prospect (at least with nonzero probability for arbitrarily large finite quantities). For example, the prospect could be partially about how much information we can represent to ourselves (or recognize). We could be uncertain about how much matter would be accessible and how much we could do with it. So, we can have uncertainty about this and may not be able to put an absolute hard upper bound on it with certainty, even if we could with near-certainty, given our understanding of physics and the universe, and our confidence in them. And this could still be the case conditional on no infinities. So, we could consider prospects with extremely low probability heavy tails for how much we could represent to ourselves, which would have the important features of St Petersburg prospects for the money pump argument. It’s also something we'd care about naturally, because larger possible representations would tend to coincide with much more possible value. St Petersburg prospects already depend on extremely remote possibilities to be compelling, so if you object to extremely low probabilities or instead assign 0 probability to them (deny the hypothetical), then you can

I personally think unbounded utility functions don't work; I'm not claiming otherwise here. The comment above is about the thought experiment.

Now, there’s an honest and accurate genie — or God or whoever’s simulating our world or an AI with extremely advanced predictive capabilities — that offers to tell you exactly how the prospect will turn out.[9] Talking to them and finding out won’t affect the prospect or its utility, they’ll just tell you what you’ll get.


This seems impossible, for the possibilities that account for ~all the expected utility (without which it's finite)? You can't fit enough bits in a human brain or lifetime (or all accessible galaxies, or whatever). Your brain would... (read more)
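One way to make this precise, assuming the standard St Petersburg payoffs (utility 2^n with probability 2^-n for n = 1, 2, ...) rather than whatever exact prospect the post uses:

  • Each outcome n contributes 2^-n * 2^n = 1 to the expected utility, so the expectation is infinite, and for any finite cutoff B the outcomes with n > B still contribute infinitely much.
  • Reporting which outcome occurred takes about log2(n) bits, which is unbounded in n, so a listener with B bits of memory can distinguish at most 2^B outcomes. The outcomes they cannot be told about are exactly the ones carrying essentially all (indeed infinitely much) of the expected utility.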

2
Michael St Jules 🔸
My post also covers two impossibility theorems that don't depend on anyone having arbitrary precision or unbounded or infinite representations of anything:[1]
  1. Stochastic Dominance, Anteriority and Impartiality are jointly inconsistent.
  2. Stochastic Dominance, Separability and Compensation (Impartiality) are jointly inconsistent.
The proofs are also of course finite, and the prospects used have finite representations, even though they represent infinitely many possible outcomes and unbounded populations.
[1] The actual outcome would be an unbounded (across outcomes) representation of itself, but that doesn't undermine the argument.
1
Michael St Jules 🔸
It wouldn't have to definitely be infinite, but I'd guess it would have to be expandable to arbitrarily large finite sizes, with the size depending on the outcome to represent, which I think is also very unrealistic. I discuss this briefly in my Responses section. Maybe not impossible if we're dealing with arbitrarily long lives, because we could keep expanding over time, although there could be other practical physical limits on this that would make this impossible, maybe requiring so much density that it would collapse into a black hole?

Alone and directly (not as a contributing factor to something else later), enough below 0.1% that I evaluate nuclear interventions based mainly on their casualties and disruption, not extinction. I would (and have) supported them on the same kind of metric as GiveWell, not on extinction risk.

In the event of an all-out WMD war (including with a rogue AGI as belligerent) that leads to extinction, nukes could be a contributing factor combined with bioweapons and AI (strategic WMD war raises the likelihood of multiple WMDs being used together).

3
Vasco Grilo🔸
Thanks for the reply! I agree the direct/nearterm extinction risk posed by nuclear war alone would be quite low (maybe of the order of 10^-6 in the next 100 years), but I wonder whether it could still meaningfully decrease the value of the longterm future if thousands of nukes are detonated. Models S and E of Denkenberger 2022 estimate that a full-scale nuclear war would decrease such value by 24 % and 7 %. I think these are too high, but guess a nuclear war involving thousands of nukes might still increase indirect/longterm extinction risk to a significant extent. So I would say they are not directly comparable to GiveWell's top charities. Maybe you think they are comparable given the low likelihood of civilisation collapse, and the flow-through effects of saving lives (which might include decreasing longterm extinction risk)?

>It's plausible humans will go extinct from AI. It's also plausible humans will go extinct from supervolcanoes. 

Our primitive and nontechnological ancestors survived tens of millions of years of supervolcano eruptions (not to mention mass extinctions from asteroid/comet impacts) and our civilization's ability to withstand them is unprecedentedly high and rapidly increasing. That's not plausible, it's enormously remote, well under 1/10,000 this century.

2
NickLaing
I agree with what I think you intend to say, but in my mind plausible = any chance at all.

I think there are whole categories of activity that are not being tried by the broader world, but that people focused on the problem attend to, with big impacts in both bio and AI. It has its own diminishing returns curve.

The thing to see is if the media attention translates into  action with more than a few hundred people working on the problem as such rather than getting distracted, and government prioritizing it in conflict with competing goals (like racing to the precipice). One might have thought Covid-19 meant that GCBR pandemics would stop being neglected,  but  that doesn't seem right. The Biden administration has asked for Congressional approval of a pretty good pandemic prevention bill (very similar to what EAs have suggested)  but it has been ... (read more)

2
Vasco Grilo🔸
Hi Carl, Do you have any thoughts on how the expected impact of the few hundred people working most directly on AGI safety compares with that of the rest of the world (on mitigating the risks from advanced AI)? I suppose a random person from the few hundred people will have much greater (positive/negative) impact than a random person, but it is not obvious to me that we can round the (positive/negative) impact of the rest of the world (on mitigating the risks from advanced AI) to zero, and increased awareness of the rest of the world will tend to increase its expected impact (for better or worse). In terms of bio risk, even if covid did not result in more people working towards fighting existential pandemics, it may still have increased the number of people working on pandemics, which arguably has positive flow-through effects to mitigate the existential ones. In addition, covid may have increased the amount of resources that would be directed towards fighting a pandemic if one happens. It is unclear to me how large these effects are relative to a given increase in the number of people working most directly on reducing the risk from existential pandemics.

I feel like this does not really address the question?

A possible answer to Rockwell's question might be "If we have 15000 scientists working full-time on AIS, then I consider AIS to no longer be neglected" (this is hypothetical, I do not endorse it. And it's also not as contextualized as Rockwell would want it).

But maybe I am interpreting the question too literally and you are making a reasonable guess what Rockwell wants to hear.

I actually do every so often go over the talks from the past several EAGs on Youtube and find it does  better. Some important additional benefits are turning on speedup and subtitles, being able to skip forward or bail more easily if the talk turns out bad, and not being blocked from watching two good simultaneous talks.

In contrast, a lot of people really love in-person meetings compared to online video or phone.

4
Jakob
Yes, in isolation I see how that seems to clash with what Carl is saying. But that’s after I’ve granted the limited definition of TAI (x-risk or explosive, shared growth) from the former post. When you allow for scenarios with powerful AI where savings still matter, the picture changes (and I think that’s a more accurate description of the real world). I see that I could’ve been more clear that this post was a case of “even if blindly accepting the (somewhat unrealistic) assumptions of another post, their conclusions don’t follow”, and not an attempt at describing reality as accurately as possible

I disagree with the idea that short AI timelines are not investable (although I agree interest rates are a bad and lagging indicator vs AI stocks). People foreseeing increased expectations of AI sales as a result of scaling laws, shortish AI timelines, and the eventual magnitude of success have already made a lot of money investing in Nvidia, DeepMind and OpenAI. Incremental progress increases those expectations, and they can increase even in worlds where AGI winds up killing or expropriating all investors so long as there is some expectation of enough inv... (read more)

Carl, I agree with everything you're saying, so I'm a bit confused about why you think you disagree with this post.

This post is a response to the very specific case made in an earlier forum post, where they use a limited scenario to define transformative AI, and then argue that we should see interest rates rising if traders believe that scenario to be near. 

I argue that we can't use interest rates to judge if said, specific scenario is near or not. That doesn't mean there are no ways to bet on AI (in a broader sense). Yes, when tech firms are tradi... (read more)

If you haven't read this piece by Ajeya Cotra, Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover I would highly recommend it. Some of the post on AI alignment here (aimed at a general audience) might also be helpful.

2
Amber Dawn
Thanks, I'll check out the Cotra post. I have skimmed some of the Cold Takes posts and not found where he addresses the specific confusions I have above.

This tweet seems like vague backtracking on the long timelines.

Well, Musk was the richest, and he notably pulled out, and then the money seems mostly not to have manifested. I haven't seen a public breakdown of the commitments those sorts of statements were based on.

1
MarkusAnderljung
Semafor reporting confirms your view. They say Musk promised $1bn and gave $100mn before pulling out. 

The kinds of examples people used to use to motivate frame-problem stories in the days of GOFAI in the 20th century are routinely solved by AI systems today. 

2
Lixiang
Interesting, well maybe I'm off base then.

I was going from this: "The DICE baseline emissions scenario results in 83 million cumulative excess deaths by 2100 in the central estimate. Seventy-four million of these deaths can be averted by pursuing the DICE-EMR optimal emissions path." I didn't get into deaths vs DALYs (excess deaths among those with less life left to live), chances of scenarios, etc, and gave 'on the order of' for slack.

"But I don't see why we're talking about scale. Are you defining neglectedness as a ratio of <people potentially killed in worst case>/<dollars spent>?"... (read more)

3
Arepo
I'm happy to leave it there, but to clarify I'm not claiming 'no difference in the type of work they do', but rather 'no a priori reason to write one group off as "not concerned with safety"'.

In this 2022 ML survey the median credence on extinction-level catastrophe from AI is 5%, with 48% of respondents giving 10%.  Some generalist forecaster platforms put numbers significantly lower, some  forecasting teams or researchers with excellent forecasting records and more knowledge of the area put more (with I think the tendency being for more information to yield higher forecasts, and my own expectation).  This scale looks like hundreds of millions of deaths or equivalent this century to me, although certainly many disagree. The argu... (read more)
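As a rough sanity check on "hundreds of millions of deaths or equivalent", using the current world population of roughly 8 billion as an assumption: 0.05 × 8×10^9 ≈ 4×10^8, so a 5% credence in an extinction-level catastrophe already corresponds to around 400 million expected deaths this century, before counting sub-extinction catastrophes or anything beyond presently living people.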

order 100M deaths.

I'm not sure this is super relevant to our core disagreement (if we have one), but how are you counting this? Glancing at that article, it looks like a pessimistic take on climate change's harm puts excess deaths at around 10m per year, and such damage would persist much more than 10 years.

But I don't see why we're talking about scale. Are you defining neglectedness as a ratio of <people potentially killed in worst case>/<dollars spent>?

How exactly could that be true?

Because coders who don't work explicitly on AI alignment sti... (read more)

$200B includes a lot of aid aimed at political goals other than humanitarian impact, with most of a billion people living at less than $700/yr, while the global economy is over $100,000B and cash transfer programs in rich countries are many trillions of dollars. That's the neglectedness that bumps up global aid interventions relative to local rich-country help for the local relative poor. 

You can get fairly arbitrarily bad cost-effectiveness in any area by taking money and wasting it on things that generate less value than the money, e.g. spending 99.9% on digging holes and filling them in, and 0.1% on GiveDirectly. But just handing over the money to the poor is a relevant attainable baseline.

Calling an area neglected because a lot of money is spent badly sounds like a highly subjective evaluation that's hard to turn into a useful principle. Sure, $200B annually is a small proportion of the global economy, but so is almost any cause area you can describe. From a quick search, the World Bank explicitly spends slightly more than a tenth of that on climate change, one of the classically 'non-neglected' evaluands of EA. It's hard to know how to compare these figures, since they obviously omit a huge number of other projects, but I doubt the WB cons... (read more)

Helping the global poor is neglected, and that accounts for most bednet outperformance. GiveDirectly, just giving cash, is thought by GiveWell/GHW to be something like 100x better on direct welfare than rich country consumption (although indirect effects reduce that gap), vs 1000x+ for bednets. So most of the log gains come from doing stuff with the global poor at all. Then bednets have a lot of their gains as positive externalities (protecting one person also protects others around them), and you're left with a little bit of 'being more confident about be... (read more)
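Putting those round numbers on a log scale makes the decomposition explicit: bednets at ~1000x rich-country consumption is log10(1000) = 3 orders of magnitude, of which log10(100) = 2 come simply from directing resources to the global poor at all (the GiveDirectly baseline), leaving only about one order of magnitude (a factor of ~10) for the choice of bednets over cash, much of it the positive externalities just mentioned.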

9
Arepo
Official development assistance is nearly $200 billion annually. I think if that's going to be called 'neglected', the term needs some serious refinement. People have compared various development interventions like antiretroviral drugs for HIV, which have the same positive externalities and (at least according to a presentation Toby Ord gave a few years ago) still show something like a 100-fold difference in expected outcomes compared to AMF.

Here's an example of a past case where a troll (who also trolled other online communities) made up multiple sock-puppet accounts, and assorted lies about sources for various arguments trashing AI safety, e.g. claiming to have been at events they were not and heard bad things, inventing nonexistent experts who supposedly rejected various claims, creating fake testimonials of badness, smearing people who discovered the deception, etc. 

https://www.openphilanthropy.org/research/three-key-issues-ive-changed-my-mind-about/

Came here to cite the same thing! :) 

Note that Dustin Moskovitz says he's not a longtermist, and "Holden isn't even much of a longtermist."


So my intuition is that the two main important updates EA has undergone are "it's not that implausible that par-human AI is coming in the next couple of decades" and "the world is in fact dropping the ball on this quite badly, in the sense that maybe alignment isn't super hard, but to a first approximation no one in the field has checked."

(Which is both an effect and a cause of updates like "maybe we can figure stu... (read more)

  1. But the stocks are the more profitable and capital-efficient investment, so that's  where you see effects first on market prices (if much at all) for a given number of traders buying the investment thesis. That's the main investment on this basis I see short timelines believers making (including me), and has in fact yielded a lot of excess returns since EAs started to identify it in the 2010s.
  2. I don't think anyone here is arguing against the no-trade theorem, and that's not an argument that prices will never be swayed  by anything, but that you ca
... (read more)

If investors with $1T thought AGI was coming soon, and therefore tried to buy up a portfolio of semiconductor, cloud, and AI companies (a much more profitable and capital-efficient strategy than betting on real interest rates), they could only buy a small fraction of those industries at current prices. There is a larger pool of investors who would sell at much higher than current prices, balancing that minority.

Yes, it's weighted by capital and views on asset prices, but still a small portion of the relevant capital trying to trade (with risk and years in advance) o... (read more)

9
basil.halperin
  1. We would welcome engagement from you regarding our argument that stock prices are not useful for forecasting timelines (the sign is ambiguous and the effect noisy).
  2. You offer what is effectively a fully general argument against market prices ever being swayed by anything -- a bit more on this point here. Price changes do not need to be driven by volume! (cf. the no-trade theorem, for the conceptual idea)
  3. I'm not sure if this is exactly your point about prediction markets (or if you really want to talk about total capital, on which see again #2), but: Sovereign debt markets are orders of magnitude larger than PredictIt or other political prediction markets. These are not markets where individual traders are capped to $600 max positions and shorting is limited (or whatever the precise regulations are)! Finding easy trades in these markets is ...not easy.

They still have not published. You can email Jan Brauner and Fabienne Sandkuehler for it.

Expected lives saved and taken are both infinite, yes.

That there are particular arguments for decisions like bednets or eating sandwiches to have expected impacts that scale with the scope of the universes or galactic civilizations. E.g. the more stars you think civilization will be able to colonize, or the more computation that will be harvested, the greater your estimate of the number of sims in situations like ours (who will act  the same as we do, so that on plausible decision theories we should think of ourselves as setting policy at least for the psychologically identical ones). So if you update to... (read more)

3
Jordan Arel
Ah yes! I think I see what you mean. I hope to research topics related to this in the near future, including in-depth research on anthropics, as well as on what likely/desirable end-states of the universe are (including that we may already be in an end-state simulation) and what that implies for our actions. I think this could be a 3rd reason for acting to create a high amount of well-being for those close to you in proximity, including yourself.

This sort of estimate is in general off by many orders of magnitude for thinking about the ratio of impact between different interventions when it only considers paths to very large numbers for the intervention under consideration, and not for the reference interventions being compared against. For example, the expected number of lives saved from giving a bednet is infinite. Connecting this to size-of-the-accessible-universe estimates, perhaps there are many simulations of situations like ours at an astronomical scale, and so our decisions will be replicated a... (read more)

5
Fermi–Dirac Distribution
I want to point out something that I find confusing. This can't be true unless your credence that you're killing an infinite number of lives by buying a bednet is exactly zero, right? Otherwise -- if your credence is, say, 10^-(10^10^10^10^10) -- then the expected number of lives saved is undefined. Am I thinking about this correctly?
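Spelled out, this is just the arithmetic of mixed infinite outcomes: if E[lives saved] = (+∞)·p₊ + (−∞)·p₋ + (finite terms), then whenever both p₊ > 0 and p₋ > 0 the sum has the form ∞ − ∞ and is undefined rather than +∞, no matter how astronomically small p₋ is; only p₋ = 0 exactly yields an infinite (rather than undefined) expectation.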
5
Linch
I agree with the rest of your comment, but I'm a bit confused about this phrasing. 
6
Jordan Arel
Hey Carl! Thanks for your comment. I am not sure I understand. Are you arguing something like “comparing x-risk interventions to other inventions such as bed nets is invalid because the universe may be infinite, or there may be a lot of simulations, or some other anthropic reason may make other interventions more valuable”?