Spencer Becker-Kahn



I think Neel makes a good point.

And to me the sort of 'other' elephant in the room is the value of Wytham Abbey beyond thinking of it purely as an investment, i.e. in the comment that you link, Owen Cotton-Barratt tried to explain something about his belief in the value of specialist venues that can host conferences, workshops, researcher meetings etc. and that are committed to promoting a certain flavour of "open-ended intellectual exploration". One can obviously reasonably disagree with him (and admittedly it is very difficult to figure out the monetary value of this stuff), but it probably at least deserves rebutting explicitly?

Perhaps at the core there is a theme here that comes up a lot which goes a bit like: Clearly there is a strong incentive to 'work on' any imminent and unavoidable challenge whose resolution could require or result in "hard-to-reverse decisions with important and long-lasting consequences". Current x-risks have been established as sort of the 'most obvious' such challenges (in the sense that making wrong decisions potentially results in extinction, which obviously counts as 'hard-to-reverse' and the consequences of which are 'long-lasting'). But can we think of any other such challenges or any other category of such challenges? I don't know of any others that I've found anywhere near as convincing as the x-risk case, but I suppose that's why the example project on case studies could be important?

Another thought I had is kind of: Why might people who have been concerned about x-risk from misaligned AI pivot to asking about these other challenges? (I'm not saying Will counts as 'pivoting' but just generally asking the question). I think one question I have in mind is: Is it because we have already reached a point of small (and diminishing) returns from putting today's resources into the narrower goal of reducing x-risk from misaligned AI?

I think there's something quite interesting here...I feel like one of the main things I see in the post is sort of the opposite of the intended message.

(I realise this is an old post now but I've only just read it and - full disclosure - I've ended up reading it now because I think my skepticism about AI risk arguments is higher than it's been for a long time and so I'm definitely coming at it from that point of view).

If I may paraphrase a bit flippantly, I think that one of the messages is sort of supposed to be: 'just because the early AI risk crowd were very different from me and kind of irritating(!), it doesn't mean that they were wrong' and so 'sometimes you need to pay attention to messages coming from outside of your subculture'.

But actually what happens in the narrative is that you only start caring about AI risk when an old friend who 'felt like one of your own' - and who was "worried" - manages to make you "feel viscerally" about it. So it wasn't that, without direct intervention from either 'tribe', you actually sat down with the arguments/data and understood things logically. Nor was it that you, say, found a set of AI/technology/risk experts to defer to. It was that someone with whom you had more of an affinity made you feel like we should care more and take it seriously. This sounds sort of like the opposite of the intended message, does it not? I.e. it sounds like more attention was paid to an emotional appeal from an old friend than to whatever arguments were available at the time.

I would suggest that this post probably shouldn't have quotation marks in the title. It makes it look like there is a quote from Sunak that includes the phrase "existential threats", but that isn't the case.

I appreciate the comment.

I'll be honest, I'm probably not going to go back through all of the quotations now and give the separate posts they come from. Karnofsky did put the whole series into a single pdf which can function as a single source (but as I say, I haven't gone through again and checked myself).

I do recognize the bit you are quoting though - yes that is indeed where I got my quotations for that part from - and I did think a bit about the issue you are bringing up. To me, this seems like a fair paraphrase/description of Karnofsky in that section:

a) He's developed a mindset in which he critically examines things like this with as much rigour as possible;


b) He has made a lot of investment in examining this thesis.

So it seemed - and still seems - to me that he is implying that he has invested a lot of time in critically examining this thesis with as much rigour as possible.

However, I would not have used the specific phrasing that JoshuaBlake used, i.e. I deliberately did not say something like 'Karnofsky claims to have written these posts with as much rigour as possible'. So I do think that one could try to give him the benefit of the doubt and say something like:

'Although he spent a lot of time examining the thesis with as much rigor as possible, it does not necessarily follow that he wrote the posts in a way that shows that. So criticising the writing in the posts is kind of an unfair way to attack his use of rigour.'

But to me this just seemed like I would be trying a bit too hard to avoid criticising him. This comes back to some of the points in my post: I am suggesting that his posts are not written in a way that invites clear criticism, despite his claim that this is one of his main intentions. And I suggest that Karnofsky's rhetorical style aims for the best of both worlds: he wants his readers to think that he has thought about this very rigorously and critically for a long time but also - wherever it seems vague or wrong - to give him the benefit of the doubt and say 'well it was never meant to be taken too seriously or to be 100% rigorous, they're just blog posts etc.'.

(Just a note: Of course you are free to go into more detail in comments but I'm not sure I have much bandwidth to devote to writing long replies.)


I don’t really understand the response of asking me what I would have done. I would find it very tricky to put myself in the position of someone who is writing the posts but who also thinks what I have said I think.

Otherwise, I don’t think you’re making unreasonable points, but I do think that in the piece itself I already tried to directly address much of what you’re saying.

I.e. you talk about his ideas being hard to define or hard to precisely explain but that we all kind of get the intention. Among other things, I write about how his vagueness allows him to lump together many versions of his claims under one heading, which hides the true level of disagreement and uncertainty that is likely to exist about any particular precise version of a claim. And about how it means he can focus on persuading you (and potentially persuading you to act or donate etc.) above figuring out what’s true.

(It feels weird to me to have to repeat that we should be skeptical of - or at least very careful with - ideas/arguments that seem to continue to be “hard to explain” but that “we all kinda get what we mean”.)

And I don’t expect it to actually be written like analytic philosophy either: e.g. one of my points here is that it isn’t reasonable to suppose that he is unaware of the standards of academic philosophy and so it doesn’t feel right for him to suggest that he is using a high level of rigour etc.

Thanks for the comment.

Yeah, it wasn't too clear to me how to think about using the community tag but I decided to go with it in the end. This exchange however makes it look like people tend to disagree and think I shouldn't have used it. Hmmm, I'm not sure.

My mind comes back to points like this very often:

We cannot help but be reminded of Frank H. Westheimer's advice to his research students: “Why spend a day in the library when you can learn the same thing by working in the laboratory for a month?”

It's an - or perhaps yet another - example of how sometimes the Bay Area/SE/entrepreneurial mindset is almost diametrically opposed to certain mindsets coming from academia and how this community is trying to balance them or get the best of both worlds (which isn't a stupid thing to try to do per se, it just seems like sometimes it's very tricky). In the spirit of the former you kinda want to move fast (and break things), but the latter wants you to remember the virtues of deliberately taking the time to demonstrate how thorough and methodical you are being (and partly so that you don't, say, squander your resources by running a foreseeably dud experiment).

The case for risk that you sketch isn't the only case that one can lay out, but if we are focussing on this case, then your response is not unreasonable. But do you want to give up or do you want to try? The immediate response to your last suggestion is surely: Why devote limited resources to some other problem if this is the one that destroys humanity anyway?

You might relate to the following recent good posts:

Given the similarities, why are there so many different organizations? How is an outsider supposed to know what makes each of them unique? ... What makes these different from each other?

You are not the first person to ask questions like this. And I feel like I often hear people trying to make databases of researchers and orgs or give overviews of the field etc. But the situation you describe just seems...normal, to me. As in, in most industries and areas, even pretty niche things will have many different organizations with slightly different outlooks and no central way to navigate them and learn about them. And most research fields in academia are like this; there isn't a great way to get to know everyone in the field, where they work, what they work on etc. It just takes years of talking to people, reading things, meeting them at conferences etc., and you just slowly build up the picture.

I don't think it's silly or bad to ask what you're asking, that's not what I'm saying/I may well be wrong anyway, but to my mind the situation seems like it wouldn't necessarily have a good 'explanation' that someone can just hand to you, and the usual way to find out the sorts of things you want to find out would be to just continue immersing yourself in the field (from what I hear from most people who start figuring out this field, ~15 hours is not very long).
