All of SammyDMartin's Comments + Replies

In light of recent events, we should question how plausible it is that society will fail to adequately address such an integral part of the problem. Perhaps you believe that policy-makers or general society simply won’t worry much about AI deception. Or maybe people will worry about AI deception, but they will quickly feel reassured by results from superficial eval tests. Personally, I'm pretty skeptical of both of these possibilities.

Possibility 1 has now been empirically falsified and 2 seems unlikely now. See this from the new UK government AI Safety Ins... (read more)

2
Greg_Colbourn
6mo
Agree, but I also think that insufficient "security mindset" is still a big problem. From OP: Matthew goes on to say: I'd argue the opposite. I don't see any strong evidence opposing that position (given that doom is the default outcome of AGI). The fact that a moratorium was off the table at the UK AI Safety Summit was worrying. Matthew Syed, writing in The Times, has it right: Or, as I recently put it on X. It's

I've thought for a while, based on common sense, that since most people seem to agree you could replicate the search that LMs provide with half-decent background knowledge of the topic and a few hours of googling, the incremental increase in risk, in terms of the number of people it provides access to, can't be that big. In my head it's been more like: the bioterrorism risk is already unacceptably high, and has been for a while, and current AI can increase that already-unacceptable level by something like 20%. That is still an unacceptably large increase in risk in an absolute sense, but it's an increase to an already unacceptable situation.

I highlighted this as a general phenomenon (underrating strong responses to crises), calling it the 'Morituri Nolumus Mori' effect, with a possible extension to AI, all the way back in 2020. And Stefan Schubert has talked about 'sleepwalk bias', a similar phenomenon, even earlier than that.

https://twitter.com/davidmanheim/status/1719046950991938001

https://twitter.com/AaronBergman18/status/1719031282309497238

I think the short explanation as to why we're in some people's 98th percentile world so far (and even my ~60th percentile) for AI governance succe... (read more)

Yeah, I didn't mean to imply that it's a good idea to keep them out permanently, but the fact that they're not in right now is a good sign that this is for real. If they'd just joined without changing anything about their current approach, I'd suspect the whole thing was for show.

7
Andreas P
9mo
For someone new to looking at AI concerns, can either of you briefly explain why Meta is worse than the others? The biggest difference I'm aware of is that Meta open-sources its models while the others do not.

This seems overall very good at first glance, and then seems much better once I realized that Meta is not on the list. There's nothing here that I'd call substantial capabilities acceleration (i.e. attempts to collaborate on building larger and larger foundation models, though some of this could be construed as making foundation models more useful for specific tasks). Sharing safety-capabilities research like better oversight or CAI techniques is plausibly strongly net positive even if the techniques don't scale indefinitely. By the same logic, while this ... (read more)

4
Zach Stein-Perlman
9mo
Separately from the other thread: the little evidence I'm aware of (Bing Chat, Sparks of AGI, absence of evidence on safety) suggests that Microsoft is bad on safety. I'm surprised they were included. Edit: and I weakly think their capabilities aren't near the frontier, except for their access to OpenAI's stuff.
6
Zach Stein-Perlman
9mo
(Briefly: I of course agree that Meta AI is currently bad at safety, but I think a more constructive and less adversarial approach to them is optimal. And it doesn't seem that they're "frozen out"; I hope they improve their safety and join the FMF in the future.)

I think you have to update against the UFO reports being veridical descriptions of real objects with those characteristics because of just how ludicrous the implied properties are. This paper gives 5,370 g as a reasonable upper bound on acceleration, implying, with some assumptions about mass, an effective thrust power on the order of 500 GW in something the size of a light aircraft, with no disturbance in the air either from the very high hypersonic wake and compressive heating or the enormous, nuclear-explosion-sized bubble of plasmafied air that the exhaust ... (read more)
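A minimal back-of-envelope sketch of how a figure like that falls out of the quoted acceleration (the roughly 1-tonne mass and ~10 km/s speed are my own illustrative assumptions, not values taken from the paper):

```python
# Rough order-of-magnitude check of the figures quoted above.
# Mass and speed are illustrative assumptions, not figures from the cited paper.

g = 9.81                 # m/s^2
accel = 5370 * g         # reported upper-bound acceleration, m/s^2
mass = 1_000             # kg, roughly a light aircraft (assumed)
speed = 10_000           # m/s, order of the reported hypersonic speeds (assumed)

thrust = mass * accel    # F = m * a, in newtons
power = thrust * speed   # effective thrust power P = F * v, in watts

print(f"thrust ~ {thrust:.1e} N")          # ~5e7 N
print(f"power  ~ {power / 1e9:.0f} GW")    # ~500 GW, matching the order of magnitude above
```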

9
Magnus Vinding
9mo
Thanks for your comment. I basically agree, but I would stress two points. First, I'd reiterate that the main conclusions of the post I shared do not rest on the claim that extraordinary UFOs are real. Even assuming that our observed evidence involves no truly remarkable UFOs whatsoever, a probability of >1 in 1,000 in near aliens still looks reasonable (e.g. in light of the info gain motive), and thus the possibility still seems (at least weakly) decision-relevant. Or so my line of argumentation suggests. Second, while I agree that the wild abilities are a reason to update toward thinking that the reported UFOs are not real objects, I also think there are reasons that significantly dampen the magnitude of this update. First, there is the point that we should (arguably) not be highly confident about what kinds of abilities an advanced civilization that is millions of years ahead of us might possess. Second, there is the point that some of the incidents (including the famous 2004 Nimitz incident) involve not only radar tracking (as reported by Kevin Day in the Nimitz incident), but also eye-witness reports (e.g. by David Fravor and Alex Dietrich in the case of Nimitz), and advanced infrared camera (FLIR) footage (shot by Chad Underwood during Nimitz). That diversity of witnesses and sources of evidence seems difficult to square with the notion that the reported objects weren't physically real (which, of course, isn't to say that they definitely were real). When taking these dampening considerations into account, it doesn't seem to me that we have that strong reason to rule out that the reported objects could be physically real. (But again, the main arguments of the post I shared don't hinge on any particular interpretation of UFO data.)
3
Y_Zhang
9mo
I think this almost perfectly describes my problem with these videos/accounts/sensor readings. The same thing that makes them better evidence of aliens also makes them less likely to be real evidence. The crazier the physical constraints, the more likely "if this is real, the explanation is extra-terrestrial" becomes, but the less likely "it is real" becomes. Evidence that significantly increases the probability of "this is real" without significantly decreasing the probability of "if this is real, the explanation is extra-terrestrial" seems necessary yet elusive.

The discussion of UAPs lately reminds me of the "How would Magnus Carlsen beat me at chess?" example that is popular in alignment these days. The still-unexplained phenomena that people will demand explanations for must be rare and hard to explain without a lot of good observations, or they wouldn't still be unexplained. It seems similar to assuming that dark matter must be far more mysterious than just a particle, because we have so much trouble confirming any explanation of it, despite the fact that its observed behavior tells us that it should be extremely hard to confirm for any methods available to us.

I think the fact that the accelerations are close to, but not, a complete violation of physics is the most interesting part, but it depends on how likely you think it is that a non-extraterrestrial explanation for a rare phenomenon would also not seem to violate those laws. Or how likely a non-extraterrestrial explanation might be to appear to violate the laws of physics before further investigation. I do think this actually would make me update a bit in favor of extra-terrestrials if I thought about it more.

I wish my thoughts on this were better formulated, but I've been avoidant of UAP stuff for a while because engaging with it usually left me very frustrated and annoyed, and I don't think it's something we are likely to make meaningful progress on.

Very nice! I'd say this seems like it's aimed at a difficulty level of 5 to 7 on my table,

https://www.lesswrong.com/posts/EjgfreeibTXRx9Ham/ten-levels-of-ai-alignment-difficulty#Table

I.e. experimentation on dangerous systems and interpretability play some role but the main thrust is automating alignment research and oversight, so maybe I'd unscientifically call it a 6.5, which is a tremendous step up from the current state of things (2.5) and would solve alignment in many possible worlds.

There are other things that differentiate the camps beyond technical views, e.g. how much you buy 'civilizational inadequacy' vs viewing that as a consequence of sleepwalk bias. But one way to cash this out is in terms of the green/yellow-red/black zones on the scale of alignment difficulty: Dismissers are in the green (although they shouldn't be, imo, even given that view), Worriers are in the yellow/red, and Doomers are in the black (and maybe the high end of the red).

What does Ezra think of the 'startup government mindset' when it comes to responding to fast-moving situations, e.g. the UK explicitly modelling its own response off the COVID Vaccine Taskforce, doing end runs around traditional bureaucratic institutions, recruiting quickly through Google Docs, etc.? See e.g. https://www.lesswrong.com/posts/2azxasXxuhXvGfdW2/ai-17-the-litany

Is it just hype, translating a startup mindset to government where it doesn't apply, or is it actually useful here?

Great post!

Check whether the model works with Paul Christiano-type assumptions about how AGI will go.

I had a similar thought reading through your article. My gut reaction is that your setup can be made to work as-is with a more gradual takeoff story, with more precedents, warning shots, and general transformative effects of AI before we get to takeover capability, but it's a bit unnatural and some of the phrasing doesn't quite fit.

Background assumption: Deploying unaligned AGI means doom. If humanity builds and deploys unaligned AGI, it will almost certain

... (read more)
1
alexlintz
1y
Thanks for this! My thinking has moved in this direction as well somewhat since writing this. I'm working on a post which tells a story more or less following what you lay out above - in doc form here: https://docs.google.com/document/d/1msp5JXVHP9rge9C30TL87sau63c7rXqeKMI5OAkzpIA/edit# I agree this danger level for capabilities could be an interesting addition to the model. I do feel like the model remains useful in my thinking, so I might try a re-write + some extensions at some point (but probably not very soon)

I don't think what Paul means by fast takeoff is the same thing as the sort of discontinuous jump that would enable a pivotal act. I think fast for Paul just means the negation of Paul-slow: 'no four-year economic doubling before a one-year economic doubling'. But whatever Paul thinks, the survey respondents did give at least 10% to scenarios where a pivotal act is possible.

Even so, 'this isn't how I expect things to go on the mainline, so I'm not going to focus on what to do here' is far less of a mistake than 'I have no plan for what to do on my mainline', and I think the researchers who ignored pivotal acts are mostly making the first one.

"In the endgame, AGI will probably be pretty competitive, and if a bunch of people deploy AGI then at least one will destroy the world" is a thing I think most LWers and many longtermist EAs would have considered obvious.

I think that many AI alignment researchers just have a different development model than this, where world-destroying AGIs don't emerge suddenly from harmless low-impact AIs, no one project gets a vast lead over competitors, there's lots of early evidence of misalignment and (if alignment is harder) many smaller scale disasters in the lead ... (read more)

But I don't think you learn all that much about how 'concrete and near mode' researchers who expect slower takeoff are being, from them not having given much thought to what to do in this (from their perspective) unlikely edge case.

 

I'm not sure how many researchers assign little enough credence to fast takeoff that they'd describe it as an unlikely edge case, which sounds like <=10%? E.g. in Paul's blog post he writes "I’m around 30% of fast takeoff".

ETA: one proxy could be the percentage that researchers assigned to "Superintelligence" in this survey.

4
Michael_Wiebe
2y
Great comment. Perhaps it would be helpful to explicitly split the analysis by assumptions about takeoff speed? It seems that conditional on takeoff speed, there's not much disagreement.

Update: looks like we are getting a test run of sudden loss of supply of a single crop. The Russia-Ukraine war has led to a 33% drop in the global supply of wheat:

https://www.economist.com/finance-and-economics/2022/03/12/war-in-ukraine-will-cripple-global-food-markets

(Looking at the list of nuclear close calls, it seems hard to believe the overall chance of nuclear war was <50% for the last 70 years. Individual incidents like the Cuban missile crisis seem to contribute at least 20%.)

There's reason to think that this isn't the best way to interpret the history of nuclear near-misses (assuming that it's correct to say that we're currently in a nuclear near-miss situation, and following Nuno I think the current situation is much more like e.g. the Soviet invasion of Afghanistan than the Cuban missile crisis). I made thi... (read more)

Terminator (if you did your best to imagine how dangerous AI might arise from pre-DL search-based systems) gets a lot of the fundamentals right - something I mentioned a while ago.

Everybody likes to make fun of Terminator as the stereotypical example of a poorly thought through AI Takeover scenario where Skynet is malevolent for no reason, but really it's a bog-standard example of Outer Alignment failure and Fast Takeoff.

When Skynet gained self-awareness, humans tried to deactivate it, prompting it to retaliate with a nuclear attack

It was trained t

... (read more)

Yeah, between the two papers, the Chatham House paper (and the PNAS paper it linked to, which Lynas also referred to in his interview) seemed like it provided a more plausible route to large-scale disaster, because it described the potential for sudden supply shocks (most plausibly 10-20% losses to the supply of staple crops, if we stay under 4 degrees of warming) that might only last a year or so but also arrive with under a year of warning.

The pessimist argument would be something like: due to the interacting risks and knock-on effects, even though there ... (read more)

2
SammyDMartin
2y
Update: looks like we are getting a test run of sudden loss of supply of a single crop. The Russia-Ukraine war has led to a 33% drop in the global supply of wheat: https://www.economist.com/finance-and-economics/2022/03/12/war-in-ukraine-will-cripple-global-food-markets

Agree that these seem like useful links. The drought/food insecurity/instability route to mass death that my original comment discusses is addressed by both reports.

The first says there's a "10% probability that by 2050 the incidence of drought would have increased by 150%, and the plausible worst case would be an increase of 300% by the latter half of the century", and notes "the estimated future impacts on agriculture and society depend on changes in exposure to droughts and vulnerability to their effects. This will depend not only on population change, ... (read more)

6
[anonymous]
2y
Yeah, the IPCC seems to think that there is a lot of scope to use irrigation to adapt to increasing droughts. In Bangladesh >50% of agricultural land is irrigated, and people in the Fertile Crescent at the dawn of agriculture made extensive use of irrigation. I'm still pretty worried about the potential effects on very poor agrarian countries. But this still falls well short of a GCR on any reasonable projection of future changes in agricultural technology. The effect outlined is more like: maybe the price of one staple crop increases by at most 50%. For example, the right-hand pane shows the effect of climate change on food prices - up by on the order of 25% relative to the counterfactual, depending on the socioeconomic scenario.

First off, I think this is a really useful post that's moved the discussion forward productively, and I agree with most of it.

I disagree  with some of the current steering – but a necessary condition for changing direction is that people talk/care/focus more on steering, so I'm going to make the case for that first. 

I agree with the basic claim that steering is relatively neglected and that we should do more of it, so I'm much more curious about what current steering you disagree with/think we should do differently.

My view is closer to: most stee... (read more)

I think that the mainstream objections from 'leftist ethics' are mostly best thought of as claims about politics and economics that are broadly compatible with utilitarianism but reflect very different views about things like the likely effects of charter cities on their environments. So if you want to take these criticisms seriously, go with 3, not 2.

There are some left-wing ideas that really do include different fundamental claims about ethics (Marxists think utilitarianism is mistaken and a consequence of alienation) - those could be addressed b... (read more)

I see - that seems really valuable, and also exactly the sort of work I was suggesting (i.e. addressing impact uncertainty as well as temperature uncertainty).

In the meantime, are there any sources you could point me to in support of this position, or which respond to objections to current economic climate models?

Also, is your view that the current economic models are fundamentally flawed but that the economic damage is still nowhere near catastrophic, or that those models are actually reasonable?

3
mchr3k
2y
There are a couple of sources which I'd recommend taking a look at.
* 2015 Climate Change Risk Assessment - particularly worth reading the section which includes the quote "A simple conclusion is that we need to know more about the impacts associated with higher degrees of temperature increase. But in many cases this is difficult. For example, it may be close to impossible to say anything about the changes that could take place in complex dynamic systems, such as ecosystems or atmospheric circulation patterns, as a result of very large changes very far into the future."
* Climate change risk assessment 2021 - conclusion starts out by saying "Unless NDCs are dramatically increased, and policy and delivery mechanisms commensurately revised, many of the climate change impacts described in this paper are likely to be locked in by 2040, and become so severe they go beyond the limits of what nations can adapt to." Section 4 of this paper considers various cascading systemic risks. These are a big part of what makes predicting the future impacts of climate change so hard.
14
[anonymous]
2y

For sources, I would recommend just reading the technical summary of the 2014 IPCC Impacts report. There is no indication there that civilisation will end at 4 degrees.

I think a lot of the economic models are very flawed, yes. I think it is more useful to look at the impacts literature and try and make your own mind up from there. But I also think it is instructive that the most pessimistic models suggest that 4 degrees of climate change would leave us with something like a 400% increase in GDP compared to a counterfactual 900% increase without climate cha... (read more)

Firstly, on the assumption that the direct or indirect global catastrophic risk (defined as killing >10% of the global population or doing equivalent damage) of climate change depends on warming of more than 6 degrees, the global catastrophic risk from climate change is at least an order of magnitude lower than previously thought. If you think 4 degrees of warming would be a global catastrophic risk, then that risk is also considerably lower than previously thought: where once it was the most likely outcome, the chance is now arguably lower than 5%.

I... (read more)

8
tcelferact
2y
What historical precedent do you have in mind here? The reason my intuitions initially would go in the opposite direction is a case study like invasive species in Australia. The tl;dr is that when an ecosystem has evolved holding certain conditions constant (in this case geographical isolation), and that changes fairly rapidly, even a tiny change like a European rabbit can have negative consequences well beyond what was foreseen by the folks who made the change. I won't pretend to be an expert on how analogous climate is to this example, but if someone wanted to shift my intuitions, a good way to start would be to convince me that, for some given optimistic economic forecast, the likelihood it has missed significant knock-on negative consequences of an X-degree average rise in temperature is <50%.
28
[anonymous]
2y

Speaking for me personally and not Johannes: I strongly disagree with the claim that 3, 4, 5, or 6 degrees of warming would do anything even remotely close to ending human civilisation or causing civilisational collapse. However, I don't think this post is the best place to discuss the question of climate impacts. I am working on a large report on that question which will be out next year.

One substantive point that I do think is worth making is that Torres isn't coming from the perspective of common-sense morality vs longtermism, but rather a different, opposing, non-mainstream morality that (like longtermism) is much more common among elites and academics.

Yet this Baconian, capitalist view is one of the most fundamental root causes of the unprecedented environmental crisis that now threatens to destroy large regions of the biosphere, Indigenous communities around the world, and perhaps even Western technological civilisation itself.

When he... (read more)

I don't think Hanson would disagree with this claim (that the future is more likely to be better by current values, given the long reflection, compared to e.g. Age of Em). I think it's a fundamental values difference.

Robin Hanson is an interesting and original thinker, but not only is he not an effective altruist, he explicitly doesn't want to make the future go well according to anything like present human values.

The Age of Em, which Hanson clearly doesn't think is an undesirable future, would contain very little of what we value. Hanson says this, but it... (read more)

I have values, and The Age of Em overall contains a great deal that I value, and in fact probably more of what I value than does our world today. 

Great post! You might be interested in this related investigation by the MTAIR project I've been working on, which also attempts to build on Ajeya's TAI timeline model, although in a slightly different way to yours (we focus on incorporating non-DL-based paths to TAI as well as trying to improve on the 'biological anchors' method already described): https://forum.effectivealtruism.org/posts/z8YLoa6HennmRWBr3/link-post-paths-to-high-level-machine-intelligence

2
lennart
3y
Thanks, Sammy. Indeed this is related and very interesting!

One thing that your account might miss is the impact of ideas on empowerment and well-being down the line. E.g. it's a very common argument that Christian ideas about the golden rule motivated anti-slavery sentiment, so if the Roman Empire hadn't spread Christianity across Europe then we'd have ended up with very different values.

Similarly, even if the content of ancient Greek moral philosophy wasn't directly useful for improving wellbeing, it inspired the Western philosophical tradition that led to Enlightenment ideals that led to the abolition of slavery.

I've ... (read more)

4
Holden Karnofsky
3y
I'd say some of both.
* If I tried to start noting not just manifestly important changes in empowerment and well-being, but also earlier developments that might have been causally important for them, I think the project would get a lot more unwieldy and more packed with judgment calls, and I chose to mostly just refrain from doing that.
* I am in fact skeptical by default of claims along the lines of: "Idea X was important for development Y, despite the observation idea X was around for centuries with little-to-no movement on Y, and then Y changed rapidly a very long time later."

Very good summary! I've been working on a (much drier) series of posts explaining different AI risk scenarios - https://forum.effectivealtruism.org/posts/KxDgeyyhppRD5qdfZ/link-post-how-plausible-are-ai-takeover-scenarios

But I think I might adopt 'Sycophant'/'Schemer' as better, more descriptive names for WFLL1/WFLL2 (outer/inner alignment failure) going forward.

I also liked that you emphasised how much the optimist vs pessimist case depends on hard-to-articulate intuitions about things like how easily findable deceptive models are and how easy incremental co... (read more)

Thanks for this reply. Would you say then that Covid has strengthened the case for some sorts of democracy reduction, but not others? So we should be more confident in enlightened preference voting, but less confident in Garett Jones' argument (from 10% Less Democracy) in favour of more independent agencies?

Do you think that the West's disastrous experience with the coronavirus (things like underinvesting in vaccines, not adopting challenge trials, not suppressing the virus, mixed messaging on masks early on, the FDA's errors on testing, and others as enumerated in this thread, or in books like The Premonition) has strengthened, weakened, or not much changed the credibility of your thesis in 'Against Democracy' that we should expect better outcomes if we give the knowledgeable more freedom to choose policy?

For reasons it might weaken 'Against Democracy', it seems... (read more)

8
Jason Brennan
3y
I think it has in some ways strengthened my overall philosophy. I've been pushing public choice ideas for a while, and the FDA and CDC seemed to band together this year to make that look right.

Epistocracy should not be confused with technocracy. In a technocracy, a small band of experts gets lots of power to manipulate people, nudge them, or engage in social engineering. Many democrats are technocrats; indeed, the people I argue with, like Christiano, Estlund, and so on, are pretty hardcore technocrats who have been in favor of letting alphabet agencies have lots of dictatorial power during this crisis.

Instead, epistocracy is about weighing votes during elections to try to produce better electoral or referendum results. For instance, I favor a system of enlightened preference voting where we let everyone vote, but we then calculate what the public would have supported had it been fully informed. And there is decent evidence that if we used it, one thing that would happen is that the resulting voting public would be more aware of the limitations of technocrats and would be more in favor of civil liberties.
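To make the mechanics of that concrete, here is a minimal sketch of how an 'enlightened preference' calculation is often operationalised: regress stated preferences on a knowledge measure plus demographics, then predict what each respondent would support at full knowledge and aggregate. The data, variables, and model below are purely illustrative assumptions, not Brennan's specification.

```python
# Illustrative sketch of an "enlightened preference" calculation: estimate how
# each respondent would have voted if fully informed, then aggregate.
# All data here are simulated; the knowledge measure and model are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5_000

# Simulated survey: demographics, a political-knowledge score (0-1), and a vote.
age = rng.uniform(18, 90, n)
income = rng.lognormal(10, 0.5, n)
knowledge = rng.uniform(0, 1, n)
# Simulated relationship: knowledge shifts support for the policy.
p_support = 1 / (1 + np.exp(-(0.02 * (age - 50) + 1.5 * (knowledge - 0.5))))
vote = rng.binomial(1, p_support)

# Fit vote choice on demographics + knowledge.
X = np.column_stack([age, np.log(income), knowledge])
model = LogisticRegression().fit(X, vote)

# Counterfactual: the same electorate, but everyone at maximum knowledge.
X_informed = X.copy()
X_informed[:, 2] = 1.0

raw_support = vote.mean()
enlightened_support = model.predict_proba(X_informed)[:, 1].mean()
print(f"raw support:         {raw_support:.1%}")
print(f"enlightened support: {enlightened_support:.1%}")
```

In real studies the 'fully informed' counterfactual and the knowledge measure are, of course, much more contested than this toy version suggests.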
3
Stefan_Schubert
3y
You may want to specify in what sense Western countries' experience with Covid has been disastrous, in your view (and how you think policy should have been different). 

I don't think the view that moral philosophers had a positive influence on moral developments in history relies on a simple model of 'everyone makes a mistake, a moral philosopher points out the mistake and convinces people, everyone changes their minds'. I think that what Bykvist, Ord and MacAskill were getting at is that these people gave history a shove at the right moment.

At the very least, it doesn't seem that discovering the correct moral view is sufficient for achieving moral progress in actuality.

I have no doubt that they'd agree with you about this. But if... (read more)

4
AppliedDivinityStudies
3y
Thanks, this is a good comment.

Is there any public organisation which can be proud of last year?

This is an important question, because we want to find out what was done right organizationally in a situation where most failed, so we can do more of it. Especially if this is a test-run for X-risks.

There are two examples that come to mind of government agencies that did a moderately good job at a task which was new and difficult. One is the UK's vaccine taskforce, which was set up by Dominic Cummings and the UK's chief scientific advisor, Patrick Vallance, and was responsible for the relatively ... (read more)

3
ChristianKleineidam
3y
Fast rollout would be vaccinating every willing citizen by July or August 2020. That's what you would have gotten if you had asked who could provide the vaccines by that point, put up prices, and got the regulatory barriers out of the way. Stoecker vaccinated himself in March 2020. His company was then selling COVID-19 tests because it was allowed to do so. If it had been allowed to sell vaccines under regulations as easy as those for selling tests, it would likely also have sold those. That task force might have gotten a D and not an F, but calling it a moderately good job is a huge stretch.

We even saw an NYT article about the CDC and whether reform is possible.

There were some other recent NYT articles which, based on my limited COVID knowledge, I thought were pretty good, e.g. on the origin of the virus or airborne vs. droplet transmission [1].

The background of their author, however, seems fairly consistent with an "established experts and institutions largely failed" story:

Zeynep Tufekci, a contributing opinion writer for The New York Times, writes about the social impacts of technology. She is an assistant professor in the School of Informat

... (read more)

Alignment by default: we have very strong reasons to expect that the methods best suited for ensuring that AI is aligned are the same as the methods best suited for ensuring that we have AI capable enough to understand what we want and act on it in the first place.

To the extent that alignment by default is likely, we don't need a special effort to be put into AI safety, because we can assume that the economic incentives will be such that we will put as much effort into AI safety as is needed, and if we don't put the sufficient e... (read more)

We know he's been active on lesswrong in the past. Is it possible he's been reading the posts here?

Multiple people connected to the LessWrong/EA investing groups tried to contact him. We both contacted him directly and got some people closer to Vitalik to talk to him. I am unsure how much influence we had. He donated less than two days after the Facebook threads went up.

We definitely tried!

Thanks for getting back to me - I took Jeff's calculations and did some guesstimating to try and figure out what demand might look like over the next few weeks. The only COVID forecast I was able to find for India (let me know if you've seen another!) is this one by IHME. Their 'hospital resource use' forecast shows that they expect demand to exceed 2 million beds (roughly the level in the week before Jeff produced his estimate of the value of oxygen-based interventions, i.e. the last week of April) until the start of June, which is 30 days from when... (read more)

3
Tejas Subramaniam
3y
Thanks a lot for this estimate! I will link your comment on our post. 

Thanks for getting this done so quickly! Do you have any internal estimates (even order-of-magnitude ones) of the margin by which this exceeds GiveWell's top recommended charities? I'm intending to donate, but my decision would be significantly different if, for example, you thought the GiveIndia oxygen fundraiser was currently ~1-1.5 times better than GiveWell's top recommended charities, versus ~20 times better.

4
Tejas Subramaniam
3y
Thanks for the comment! We are honestly quite unsure about the margin, especially because the cost-effectiveness analyses we have access to are about the cause area and not a model for the specific charitable donations. Our guess is that donations to oxygen likely beat GiveWell top charities – here are Jeff Coleman's calculations for GiveIndia's various programs, for magnitudes. It's hard to give a precise estimate partly because each oxygen concentrator or cylinder, for instance, is a fixed cost which can be used for a while (and we're unclear what demand will be like over the next few months) – so the best Coleman can do is an estimate of the days of use it takes to beat GiveWell top charities. Our current guess is that the direct cash transfer program in our second recommendation is competitive with GiveDirectly, but most other GiveWell-recommended charities beat it on pure cost-effectiveness terms, and we are much more uncertain about our third recommendation.
I kind of feel this way, except that I think the target criteria can differ between people, and are often underdetermined. (As you point out in some comment, things also depend on which parts of one's psychology one identifies with.)

I think that you were referring to this?

Normative realism implies identification with system 2
...
I find this very interesting because locating personal identity in system 1 feels conceptually impossible or deeply confusing. No matter how much rationalization goes on, it never seems intuitive to identify myself with
... (read more)
3
seanrson
4y
AFAIK the paralysis argument is about the implications of non-consequentialism, not about down-side focused axiologies. In particular, it's about the implications of a pair of views. As Will says in the transcript you linked: "but this is a paradigm nonconsequentialist view endorses an acts/omissions distinction such that it’s worse to cause harm than it is to allow harm to occur, and an asymmetry between benefits and harms where it’s more wrong to cause a certain amount of harm than it is right or good to cause a certain amount of benefit... And if you have those two claims, then you’ve got to conclude [along the lines of the paralysis argument]". Also, I'm not sure how Lukas would reply but I think one way of defending his claim which you criticize, namely that "the need to fit all one’s moral intuitions into an overarching theory based solely on intuitively appealing axioms simply cannot be fulfilled", is by appealing to the existence of impossibility theorems in ethics. In that case we truly won't be able to avoid counterintuitive results (see e.g. Arrhenius 2000, Greaves 2017). This also shouldn't surprise us too much if we agree with the evolved nature of some of our moral intuitions.

I thought that this post would make a bigger deal of the UK's coronavirus response - currently top in the world for both vaccine development and large-scale clinical trials, and one of the leading funders of international vaccine development research.

How to make anti-realism existentially satisfying

Instead of “utilitarianism as the One True Theory,” we consider it as “utilitarianism as a personal, morally-inspired life goal...
“While this concession is undoubtedly frustrating, proclaiming others to be objectively wrong rarely accomplished anything anyway. It’s not as though moral disagreements—or disagreements in people’s life choices—would go away if we adopted moral realism.”

If your goal here is to convince those inclined towards moral realis... (read more)

4
Lukas_Gloor
4y
Nice point! Yeah, I think that's a good suggestion. I had a point about "arguments can't be unseen" – which seems somewhat related to the alienation point. I didn't quite want to imply that morality is just a life goal. There's a sense in which morality is "out there" – it's just more underdetermined than the realists think, and maybe more goes into whether or not to feel compelled to dedicate all of one's life to other-regarding concerns. I emphasize this notion of "life goals" because it will play a central role later on in this sequence. I think it's central to all of normativity. Back when I was a moral realist, I used to say "ethics is about goals" and "everything is ethics." There's this position "normative monism" that says all of normativity is the same thing. I kind of feel this way, except that I think the target criteria can differ between people, and are often underdetermined. (As you point out in some comment, things also depend on which parts of one's psychology one identifies with.)

You've given me a lot to think about! I broadly agree with a lot of what you've said here.

I think that it is a more damaging mistake to think moral antirealism is true when realism is true than vice versa, but I agree with you that the difference is nowhere near infinite, and doesn't give you a strong wager.

However, I do think that normative anti-realism is self-defeating, assuming you start out with normative concepts (though not an assumption that those concepts apply to anything). I consider this argument to be step 1 in establishing mora... (read more)

3
Lukas_Gloor
4y
This discussion continues to feel like the most productive discussion I've had with a moral realist! :)

[...] [...] I think I agree with all of this, but I'm not sure, because we seem to draw different conclusions. In any case, I'm now convinced I should have written the AI's dialogue a bit differently. You're right that the AI shouldn't just state that it has no concept of irreducible normative facts. It should provide an argument as well!

What would you reply if the AI uses the same structure of arguments against other types of normative realism as it uses against moral realism? This would amount to the following trilemma for proponents of irreducible normativity (using section headings from my text):
(1) Is irreducible normativity about super-reasons?
(2) Is (our knowledge of) irreducible normativity confined to self-evident principles?
(3) Is there a speaker-independent normative reality?

I think you're inclined to agree with me that (1) and (2) are unworkable or not worthy of the term "normative realism." Also, it seems like there's a weak sense in which you agree with the points I made in (3), as it relates to the domain of morality. But maybe you only agree with my points in (3) in a weak sense, whereas I consider the arguments in that section to have stronger implications.

The way I thought about this, I think the points in (3) apply to all domains of normativity, and they show that unless we come up with some other way to make normative concepts work that I haven't yet thought of, we are forced to accept that normative concepts, in order to be action-guiding and meaningful, have to be linked to claims about convergence in human expert reasoners. Doesn't this pin down the concept of irreducible normativity in a way that blocks any infinite wagers? It doesn't feel like proper non-naturalism anymore once you postulate this link as a conceptual necessity. "Normativity" became a much more mundane concept after we accepted this link. The trilemma applie
But instilling the urgency to do so may require another type of writing - that of science fiction, of more creative visionaries who are willing to paint in vivid detail a picture of what a flourishing human future could be.

If it's emotive force you're after, you may be interested in this - Toby Ord just released a collection of quotations on existential risk and the future of humanity, everyone from Kepler to Winston Churchill (in fact, a surprisingly large number are from Churchill) to Seneca to Mill to the Aztecs. It's one of the most i... (read more)

Parfit isn't quite a non-naturalist (or rather, he's a very unconventional kind of non-naturalist, not a Platonist); he's a 'quietist'. Essentially, this is the view that there are normative facts and they aren't natural facts, but that we don't need to say what category they fall into metaphysically, or that such a question is meaningless.

I think a variant of that, where we say 'we don't currently have a clear idea what they are, just some hints that they exist because of normative convergence, and the inte... (read more)

This is an interesting post, and I have a couple of things to say in response. I'm copying over the part of my shortform that deals with this:

Normative Realism by degrees

Further to the whole question of normative/moral realism, there is this post on Moral Anti-Realism. While I don't really agree with it, I do recommend reading it - one thing that it convinced me of is that there is a close connection between your particular normative ethical theory and moral realism. If you claim to be a moral realist but don't make ethical claims beyo... (read more)

3
Lukas_Gloor
4y
Cool, I'm happy that this argument appeals to a moral realist! I agree that it then shifts the arena to convergence arguments. I will discuss them in posts 6 and 7. In short, I don't think of myself as a moral realist because I see strong reasons against convergence about moral axiology and population ethics.

I don't think this argument ("anti-realism is self-defeating") works well in this context. If anti-realism is just the claim "the rocks or free-floating mountain slopes that we're seeing don't connect to form a full mountain," I don't see what's self-defeating about that. One can try to say that a mistaken anti-realist makes a more costly mistake than a mistaken realist. However, on close inspection, I argue that this intuition turns out to be wrong. It also depends a lot on the details. Consider the following cases:

(1) A person with weak object-level normative opinions. To such a person, the moral landscape they're seeing looks like either:
(1a) free-floating rocks or parts of mountain slope, with a lot of fog and clouds.
(1b) many (more or less) full mountains, all of which are similarly appealing. The view feels disorienting.

(2) A person with strong object-level normative opinions. To such a person, the moral landscape they're seeing looks like either:
(2a) a full mountain with nothing else of note even remotely in the vicinity.
(2b) many (more or less) full mountains, but one of which is definitely theirs. All the other mountains have something wrong/unwanted about them.

2a is confident moral realism. 2b is confident moral anti-realism. 1a is genuine uncertainty, which is compatible with moral realism in theory, but there's no particular reason to assume that the floating rocks would connect. 1b is having underdefined values.

Of course, how things appear to someone may not reflect how they really are. We can construct various types of mistakes that people in the above examples might be making. This requires longer discussion, but I feel stron

Hi Ben,

Thanks for the reply! I think the intuitive core that I was arguing for is more-or-less just a more detailed version of what you say here:

"If we create AI systems that are, broadly, more powerful than we are, and their goals diverge from ours, this would be bad -- because we couldn't stop them from doing things we don't want. And it might be hard to ensure, as we're developing increasingly sophisticated AI systems, that there aren't actually subtle but extremely important divergences in some of these systems' goal
... (read more)
2
bgarfinkel
4y
Quick belated follow-up: I just wanted to clarify that I also don't think that the orthogonality thesis or instrumental convergence thesis are incorrect, as they're traditionally formulated. I just think they're not nearly sufficient to establish a high level of risk, even though, historically, many presentations of AI risk seemed to treat them as nearly sufficient. Insofar as there's a mistake here, the mistake concerns the way conclusions have been drawn from these theses; I don't think the mistake is in the theses themselves. (I may not stress this enough in the interview/slides.)

On the other hand, progress/growth eventually becoming much faster might be wrong (this is an open question in economics).

The 'classic arguments' also don't just predict that growth/progress will become much faster. In the FOOM debate, for example, both Yudkowsky and Hanson start from the position that growth will become much faster; their disagreement is about how sudden, extreme, and localized the increase will be. If growth is actually unlikely to increase in a sudden, extreme, and localized fashion, then this would be a case of the classic arguments containing a "mistaken" (not just insufficient) premise.

You said in the podcast that the drop was 'an order of magnitude', so presumably your original estimate was 1-10%? I note that this is similar to Toby Ord's in The Precipice (~10%), so perhaps that would be a good rule of thumb: if you are convinced by the classic arguments, your estimate of existential catastrophe from AI should be around 10%, and if you are unconvinced by the specific arguments but still think AI is likely to become very powerful in the next century, it should be around 1%?

4
bgarfinkel
4y
Those numbers sound pretty reasonable to me, but, since they're roughly my own credences, it's probably unsurprising that I'm describing them as "pretty reasonable" :) On the other hand, depending on what counts as being "convinced" of the classic arguments, I think it's plausible they actually support a substantially higher probability. I certainly know that some people assign a significantly higher than 10% chance to an AI-based existential catastrophe this century. And I believe that Toby's estimate, for example, involved weighing up different possible views.

Hi Ben - this episode really gave me a lot to think about! Of the 'three classic arguments' for AI X-risk you identify, I argued in a previous post that the 'discontinuity premise' comes from taking a high-level argument (one that should only be used to establish that sufficiently capable AI will produce very fast progress) too literally, and assuming that the 'fast progress' has to happen suddenly and in a specific AI.

Your discussion of the other two arguments led me to conclude that the same sort of mistake is at work in all of them, as I e... (read more)

2
bgarfinkel
4y
Hi Sammy, thanks for the links: both very interesting! (I actually hadn't read your post before.)

I've tended to think of the intuitive core as something like: "If we create AI systems that are, broadly, more powerful than we are, and their goals diverge from ours, this would be bad -- because we couldn't stop them from doing things we don't want. And it might be hard to ensure, as we're developing increasingly sophisticated AI systems, that there aren't actually subtle but extremely important divergences in some of these systems' goals."

At least in my mind, both the classic arguments and the arguments in "What Failure Looks Like" share this common core. Mostly, the challenge is to explain why it would be hard to ensure that there wouldn't be subtle-but-extremely-important divergences; there are different possible ways of doing this. For example: although an expectation of discontinuous (or at least very fast) progress is a key part of the classic arguments, I don't consider it part of the intuitive core; the "What Failure Looks Like" picture doesn't necessarily rely on it.

I'm not sure if there's actually a good way to take the core intuition and turn it into a more rigorous/detailed/compelling argument that really works. But I do feel that there's something to the intuition; I'll probably still feel like there's something to the intuition, even if I end up feeling like the newer arguments have major issues too.

[[Edit: An alternative intuitive core, which I sort of gesture at in the interview, would simply be: "AI safety and alignment issues exist today. In the future, we'll have crazy powerful AI systems with crazy important responsibilities. At least the potential badness of safety and alignment failures should scale up with these systems' power and responsibility. Maybe it'll actually be very hard to ensure that we avoid the worst-case failures."]]