All of Harrison Durland's Comments + Replies

I definitely think beware is too strong. I would recommend “discount” or “be skeptical” or something similar.

5
yanni kyriacos
26d
I like "“be skeptical"! thanks for the feedback :)

Venus is an extreme example of an Earth-like planet with a very different climate. There is nothing in physics or chemistry that says Earth's temperature could not one day exceed 100 C. 
[...]
[Regarding ice melting -- ] That will take time, but very little time on a cosmic scale, maybe a couple of thousand years.

I'll be blunt, remarks like these undermine your credibility. But regardless, I just don't have any experience or contributions to make on climate change, other than re-emphasizing my general impression that, as a person who cares a lot about e... (read more)

6
Denis
1mo
I know. :(  But as a scientist, I feel it's valuable to speak the truth sometimes, to put my personal credibility on the line in service of the greater good. Venus is an Earth-sized planet which is 400C warmer than Earth, and only a tiny fraction of this is due to it being closer to the sun. The majority is about the % of the sun's heat that it absorbs vs. reflects. It is an extreme case of global warming. I'm not saying that Earth can be like Venus anytime soon, I'm saying that we have the illusion that Earth has a natural, "stable" temperature, and while it might vary, eventually we'll return to that temperature. But there is absolutely no scientific or empirical evidence for this.  Earth's temperature is like a ball balanced in a shallow groove on the top of a steep hill. We've never experienced anything outside the narrow groove, so we imagine that it is impossible. But we've also never dramatically changed the atmosphere the way we're doing now. There is, like I said, no fundamental reason why global-warming could not go totally out of control, way beyond 1.5C or 3C or even 20C.  I have struggled to explain this concept, even to very educated, open-minded people who fundamentally agree with my concerns about climate change. So I don't expect many people to believe me. But intellectually, I want to be honest.  I think it is valuable to keep trying to explain this, even knowing the low probability of success, because right now, statements like "1.5C temperature increase" are just not having the impact of changing people's habits. And if we do cross a tipping point, it will be too late to start realising this.   

Everything is going more or less as the scientists predicted, if anything, it's worse.

I'm not that focused on climate science, but my understanding is that this is a bit misleading in your context—that there were some scientists in the (90s/2000s?) who forecasted doom or at least major disaster within a few decades due to feedback loops or other dynamics which never materialized. More broadly, my understanding is that forecasting climate has proven very difficult, even if some broad conclusions (e.g., "the climate is changing," "humans contribute to climat... (read more)

3
Denis
1mo
I'm not sure. IMHO a major disaster is happening with the climate. Essentially, people have a false belief that there is some kind of set-point, and that after a while the temperature will return to that, but this isn't the case. Venus is an extreme example of an Earth-like planet with a very different climate. There is nothing in physics or chemistry that says Earth's temperature could not one day exceed 100 C.  It's always interesting to ask people how high they think sea-level might rise if all the ice melted. This is an uncontroversial calculation which involves no modelling - just looking at how much ice there is, and how much sea-surface area there is. People tend to think it would be maybe a couple of metres. It would actually be 60 m (200 feet). That will take time, but very little time on a cosmic scale, maybe a couple of thousand years.  Right now, if anything what we're seeing is worse than the average prediction. The glaciers and ice sheets are melting faster. The temperature is increasing faster. Etc. Feedback loops are starting to be powerful. There's a real chance that the Gulf Stream will stop or reverse, which would be a disaster for Europe, ironically freezing us as a result of global warming ...  Among serious climate scientists, the feeling of doom is palpable. I wouldn't say they are exaggerating. But we, as a global society, have decided that we'd rather have our oil and gas and steaks than prevent the climate disaster. The US seems likely to elect a president who makes it a point of honour to support climate-damaging technologies, just to piss off the scientists and liberals. 

I don't find this response to be a compelling defense of what you actually wrote:

since AIs would "get old" too [...] they could also have reason to not expropriate the wealth of vulnerable old agents because they too will be in such a vulnerable position one day

It's one thing if the argument is "there will be effective enforcement mechanisms which prevent theft," but the original statement still just seems to imagine that norms will be a non-trivial reason to avoid theft, which seems quite unlikely for a moderately rational agent.

Ultimately, perhaps much o... (read more)

6
Matthew_Barnett
1mo
Sorry, I think you're still conflating two different concepts. I am not claiming: * Social norms will prevent single agents from stealing from others, even in the absence of mechanisms to enforce laws against theft I am claiming: * Agents will likely not want to establish a collective norm that it's OK (on a collective level) to expropriate wealth from old, vulnerable individuals. The reason is because most agents will themselves at some point become old, and thus do not want there to be a norm at that time, that would allow their own wealth expropriated from them when they become old. There are two separate mechanisms at play here. Individual and local instances of theft, like robbery, are typically punished by specific laws. Collective expropriation of groups, while possible in all societies, is usually handled via more decentralized coordination mechanisms, such as social norms.  In other words, if you're asking me why an AI agent can't just steal from a human, in my scenario, I'd say that's because there will (presumably) be laws against theft. But if you're asking me why the AIs don't all get up together and steal from the humans collectively, I'd say it's because they would not want to violate the general norm against expropriation, especially of older, vulnerable groups. For what it's worth, I asked Claude 3 and GPT-4 to proof-read my essay before I posted, and they both appeared to understand what I said, with almost no misunderstandings, for every single one of my points (from my perspective). I am not bringing this up to claim you are dumb, or anything like that, but I do think it provides evidence that you could probably better understand what I'm saying if you tried to read my words more carefully.

Apologies for being blunt, but the scenario you lay out is full of claims that just seem to completely ignore very facially obvious rebuttals. This would be less bad if you didn’t seem so confident, but as written the perspective strikes me as naive and I would really like an explanation/defense.

Take for example:

Furthermore, since AIs would "get old" too, in the sense of becoming obsolete in the face of new generations of improved AIs, they could also have reason to not expropriate the wealth of vulnerable old agents because they too will be in such a vu

... (read more)
4
Matthew_Barnett
1mo
This isn't the scenario I intended to describe, since it seems very unlikely that a single agent could get away with mass expropriation. The more likely scenario is that any expropriation that occurs must have been a collective action to begin with, and thus, there's no problem of coordination that you describe. This is common in ordinary expropriation in the real world: if you learned that we were one day going to steal all the wealth from people above the age of 90, you'd likely infer that that this decision was decided collectively, rather than being the result of a single lone agent who went and stole all the wealth for themselves. Your described scenario is instead more similar to ordinary theft, such as robbery. In that case, defection is usually punished by laws against theft, and people generally have non-altruistic reasons to support the enforcement of these laws. I'm happy for you to critique the rest of the post. As far as I can tell, the only substantive critique you have offered so far seems to contain a misunderstanding of the scenario I described (conflating private lawbreaking from a lone actor with a collective action to expropriate wealth). But it would certainly not be surprising if my arguments had genuine flaws: these are about speculative matters concerning the future.

Sure! (I just realized the point about the MNIST dataset problems wasn't fully explained in my shared memo, but I've fixed that now)

Per the assessment section, some of the problems with assuming that FRVT demonstrates NIST's capabilities for evaluation of LLMs/etc. include:

  1. Facial recognition is a relatively "objective" test—i.e., the answers can be linked to some form of "definitive" answer or correctness metric (e.g., name/identity labels). In contrast, many of the potential metrics of interest with language models (e.g., persuasiveness, knowledge about d
... (read more)

Seeing the drama with the NIST AI Safety Institute and Paul Christiano's appointment and this article about the difficulty of rigorously/objectively measuring characteristics of generative AI, I figured I'd post my class memo from last October/November.

The main point I make is that NIST may not be well suited to creating measurements for complex, multi-dimensional characteristics of language models—and that some people may be overestimating the capabilities of NIST because they don't recognize how incomparable the Facial Recognition Vendor Test is to this ... (read more)

1
Denis
1mo
There are some major differences with the type of standards that NIST usually produces. Perhaps the most obvious is that a good AI model can teach itself to pass any standardised test. A typical standard is very precisely defined in order to be reproducible by different testers. But if you make such a clear standard test for an LLM, it would, say, be a series of standard prompts or tasks, which would be the same no matter who typed them in. But in such a case, the model just trains itself on how to answer these prompts, or follows the Volkswagen model of learning how to recognize that it's being evaluated, and to behave accordingly, which won't be hard if the testing questions are standard.  So the test tells you literally nothing useful about the model.  I don't think NIST (or anyone outside the AI community) has experience with the kind of evals that are needed for models, which will need to be designed specifically to be unlearnable. The standards will have to include things like red-teaming in which the model cannot know what specific tests it will be subjected to. But it's very difficult to write a precise description of such an evaluation which could be applied consistently.  In my view this is a major challenge for model evaluation. As a chemical engineer, I know exactly what it means to say that a machine has passed a particular standard test. And if I'm designing the equipment, I know exactly what standards it has to meet. It's not at all obvious how this would work for an LLM. 
3
Rebecca
1mo
I would be quite interested to hear more about what you’re saying re MNIST and the facial recognition vendor test

I probably should have been more clear, my true "final" paper actually didn't focus on this aspect of the model: the offense-defense balance was the original motivation/purpose of my cyber model, but I eventually became far more interested in using the model to test how large language models could improve agent-based modeling by controlling actors in the simulation. I have a final model writeup which explains some of the modeling choices in more detail and talks about the original offense/defense purpose in more detail.

(I could also provide the model code ... (read more)

If offence and defence both get faster, but all the relative speeds stay the same, I don’t see how that in itself favours offence

Funny you should say this, it so happens that I just submitted a final paper last night for an agent-based model which was meant to test exactly this kind of claim for the impacts of improving “technology” (AI) in cybersecurity. Granted, the model was extremely simple + incomplete, but the theoretical results explain how this could possible.

In short, when assuming a fixed number of vulnerabilities in an attack surface, while atta... (read more)

3
finm
4mo
Very cool! Feel free to share your paper if you're able, I'd be curious to see. I don't know how to interpret the image, but the this makes sense:

Thank you so much for articulating a bunch of the points I was going to make!

I would probably just further drive home the last paragraph: it’s really obvious that the “number of people a lone maniac can kill in given time” (in America) has skyrocketed with the development of high fire-rate weapons (let alone knowledge of explosives). It could be true that the O/D balance for states doesn’t change (I disagree) while the O/D balance for individuals skyrockets.

I have increasingly become open to incorporating alternative decision theories as I recognize that I cannot be entirely certain in expected value approaches, which means that (per expected value!) I probably should not solely rely on one approach. At the same time, I am still not convinced that there is a clear, good alternative, and I also repeatedly find that the arguments against using EV are not compelling (e.g., due to ignoring more sophisticated ways of applying EV).

Having grappled with the problem of EV-fanaticism for a long time in part due to the ... (read more)

Since I think substantial AI regulation will likely occur by default, I urge effective altruists to focus more on ensuring that the regulation is thoughtful and well-targeted rather than ensuring that regulation happens at all.

I think it would be fairly valuable to see a list of case studies or otherwise create base rates for arguments like “We’re seeing lots of political gesturing and talking, so this suggests real action will happen soon.” I am still worried that the action will get delayed, watered down, and/or diverted to less-existential risks, onl... (read more)

I definitely would have preferred a TLDR or summary at the top, not the bottom. However, I definitely appreciate your investigation into this, as I have long loathed Eliezer’s use of the term once I realized he just made it up.

Strange, unless the original comment from Gerald has been edited since I responded I think I must have misread most of the comment, as I thought it was making a different point (i.e., "could someone explain how misalignment could happen"). I was tired and distracted when I read it, so it wouldn't be surprising. However, the final paragraph in the comment (which I originally thought was reflected in the rest of the comment) still seems out of place and arrogant.

-1
Gerald Monroe
7mo
The 'final paragraph' was simply noting that when you try to make concrete AI risks - instead of an abstract machine that is overwhelmingly smarter than human intelligence and randomly aligned, but a real machine that humans have to train and run on their computers - numerous technical mitigation methods are obvious.  The one I was alluding to was (1) (1) https://www.lesswrong.com/posts/C8XTFtiA5xtje6957/deception-i-ain-t-got-time-for-that Sparsity and myopia are general alignment strategies and as it happens are general software engineering practices.  Many of the alignment enthusiasts on lesswrong have rediscovered software architectures that already exist.  Not just exist, but are core to software systems ranging from avionics software to web hyperscalers.   Sparsity happens to be TDD. Myopia happens to be stateless microservices.  

This really isn’t the right post for most of those issues/questions, and most of what you mentioned are things you should be able to find via searches on the forum, searches via Google, or maybe even just asking ChatGPT to explain it to you (maybe!). TBH your comment also just comes across quite abrasive and arrogant (especially the last paragraph), without actually appearing to be that insightful/thoughtful. But I’m not going to get into an argument on these issues.

[This comment is no longer endorsed by its author]Reply
0
Dzoldzaya
7mo
At risk of compounding the hypocrisy here, criticizing a comment for being abrasive and arrogant while also saying: "Your ideas are neither insightful nor thoughtful, just Google/ ChatGpt it" might be showing a lack of self-awareness... But agreed that this post is probably not the best place for an argument on the feasibility of a pause, especially as the post mentions that David M will make the case for how a pause would work as part of the debate. If your concerns aren't addressed there, Gerald, that post will probably be a better place to discuss.

I wish! I’ve been recommending this for a while but nobody bites, and usually (always?) without explanation. I often don’t take seriously many of these attempts at “debate series” if they’re not going to address some of the basic failure modes that competitive debate addresses, e.g., recording notes in a legible/explorable way to avoid the problem of arguments getting lost under layers of argument branches.

Hi Oisín, no worries, and thanks for clarifying! I appreciate your coverage of this topic, I just wanted to make sure there aren't misinterpretations.

1
Oisín Considine
7mo
Yeah all the better that you did! It can only help.

In policy spaces, this is known as the Brussels Effect; that is, when a regulation adopted in one jurisdiction ends up setting a standard followed by many others.

I am not clear how the Brussels effect applies here, especially since we’re not talking manufacturing a product with high costs of running different production lines. I recognize there may be some argument/step that I’m missing, but I can’t dismiss the possibility that the author doesn’t actually understand what the Brussels Effect really is / normally does, and is throwing it around like a buzzword. Could you please elaborate a bit more?

1
Oisín Considine
7mo
I used the Brussels Effect in the context of the Plant-Based Universities campaign to demonstrate how a university or universities in a particular region transitioning to a fully plant-based catering system would encourage other universities to follow suit at least partly due to wanting to maintain a forward-looking and innovative image and reputation which makes them appear more attractive. However, I admit that I didn’t fully understand the true nature of the Brussels Effect when I first wrote the post, and now after reading back over my post, particularly the part mentioning the Brussels Effect which you highlighted, and after understanding more about how this effect has less of a reputational factor and more of an economical factor which plays the role in encouraging change, I believe that mentioning it has no real use here. I was a little bit naive in the way in which I used (or perhaps misused) the term, so thank you for pointing this out to me. I take your point into due consideration and I have now edited that sentence out. Nevertheless my point about a university/universities going plant-based indirectly encouraging others to follow them due to factors including wanting to maintain a certain image/attractiveness, as well as the inertia of the movement itself, stands as is.

I’m curious whether people (e.g., David, MIRI folk) think that LLMs now or in the near future would be able to substantially speed up this kind of theoretical safety work?

4
Davidmanheim
9mo
I would prefer a pause on LLMs that are more capable, in part to give us time to figure out how to align these systems. As I argued, I think mathematical approaches are potentially critical there. But yes, general intelligences could help - I just don't expect them to be differentially valuable for mathematical safety over capabilities, so if they are capable of these types of work, it's a net loss.

I was not a huge fan of the instrumental convergence paper, although I didn't have time to thoroughly review it. In short, it felt too slow in making its reasoning and conclusion clear, and once (I think?) I understood what it was saying, it felt quite nitpicky (or a borderline motte-and-bailey). In reality, I'm still unclear if/how it responds to the real-world applications of the reasoning (e.g., explaining why a system with a seemingly simple goal like calculating digits of pi would want to cause the extinction of humanity).

The summary in this forum pos... (read more)

Sorry about the delayed reply, I saw this and accidentally removed the notification (and I guess didn't receive an email notification, contrary to my expectations) but forgot to reply. Responding to some of your points/questions:

One can note that AIXR is definitely falsifiable, the hard part is falsifying it and staying alive.

I mostly agree with the sentiment that "if someone predicts AIXR and is right then they may not be alive", although I do now think it's entirely plausible that we could survive long enough during a hypothetical AI takeover to say "ah ... (read more)

Epistemic status: I feel fairly confident about this but recognize I’m not putting in much effort to defend it and it can be easily misinterpreted.

I would probably just recommend not using the concept of neglectedness in this case, to be honest. The ITN framework is a nice heuristic (e.g., usually more neglected things benefit more from additional marginal contributions) but it is ultimately not very rigorous/logical except when contorted into a definitional equation (as many previous posts have explained). Importantly, in this case I think that focusing on neglectedness is likely to lead people astray, given that a change in neglectedness could equate to an increase in tractability.

Over the past few months I have occasionally tried getting LLMs to do some tasks related to argument mapping, but I actually don't think I've tried that specifically, and probably should. I'll make a note to myself to try here.

Interesting. Perhaps we have quite different interpretations of what AGI would be able to do with some set of compute/cost and time limitations. I haven't had the chance yet to read the relevant aspects of your paper (I will try to do so over the weekend), but I suspect that we have very cruxy disagreements about the ability of a high-cost AGI—and perhaps even pre-general AI that can still aid R&D—to help overcome barriers in robotics, semiconductor design, and possibly even aspects of AI algorithm design.

Just to clarify, does your S-curve almost entir... (read more)

4
Ted Sanders
10mo
No it's not just extrapolating base rates (that would be a big blunder). We assume that the development of proto-AGI or AGI will rapidly accelerate progress and investment, and our conditional forecasts are much more optimistic about progress than they would be otherwise. However, it's a totally fair to disagree with us on the degree of that acceleration. Even with superhuman AGI, for example, I don't think we're moving away from semiconductor transistors in less than 15 years. Of course, it really depends on how superhuman this superhuman intelligence would be. We discuss this more in the essay.

I find this strange/curious. Is your preference more a matter of “Traditional interfaces have good features that a flowing interface would lack“ (or some other disadvantage to switching) or “The benefits of switching to a flowing interface would be relatively minor”?

For example on the latter, do you not find it more difficult with the traditional UI to identify dropped arguments? Or suppose you are fairly knowledgeable about most of the topics but there’s just one specific branch of arguments you want to follow: do you find it easy to do that? (And more on... (read more)

1
zchuang
10mo
I asked other debaters/EAs intersecting and they agreed with my line of reasoning that it would be contrived and lead to poorly structured arguments. I can elaborate if you really want but I hesitate spending time to write this out because I'm behind on work and don't think it'll have any impact on anything to be honest.

Thanks for posting this, Ted, it’s definitely made me think more about the potential barriers and the proper way to combine probability estimates.

One thing I was hoping you could clarify: In some of your comments and estimates, it seems like you are suggesting that it’s decently plausible(?)[1] we will “have AGI“ by 2043, it’s just that it won’t lead to transformative AGI before 2043 because the progress in robotics, semiconductors, and energy scaling will be too slow by 2043. However, it seems to me that once we have (expensive/physically-limited) AG... (read more)

5
Ted Sanders
10mo
Agree that: * The odds of AGI by 2043 are much, much higher than transformative AGI by 2043 * AGI will rapidly accelerate progress toward transformative AGI * The odds of transformative AGI by 2053 is higher than by 2043 We didn't explicitly forecast 2053 in the paper, just 2043 (0.4%) and 2100 (41%). If I had to guess without much thought I might go with 3%. It's a huge advantage to get 10 extra years to build fabs, make algorithms efficient, collect vast training sets, train from slow/expensive real-world feedback, and recover from rare setbacks. My mental model is some kind of S surve where progress in the short-term is extremely unlikely, progress in the medium-term is more likely, and after a while, the longer it takes to happen, the less likely it is to happen in any given year, as that suggests that some ingredient is still missing and hard to get. I think you may be right that twenty years is before the S of my S curve really kicks in. Twenty just feels so short with everything that needs to be solved and scaled. I'm much more open-minded about forty.

Are your referring to this format on LessWrong? If so I can’t say I’m particularly impressed, as it still seems to suffer from the problems of linear dialogue vs. a branching structure (e.g., it is hard to see where points have been dropped, it is harder to trace specific lines of argument). But I don’t recall seeing this, so thanks for the flag.

As for “I don’t think we could have predicted people…”, that’s missing my point(s). I’m partially saying “this comment thread seems like it should be a lesson/example of how text-blob comment-threads are inefficien... (read more)

1
zchuang
10mo
I don't know. As someone who was/still is quite good at debating and connected to debating communities I would find a flow-centric comment thread bothersome and unhelpful for reading the dialogues. I quite like internet comments as is in this UI. 

Am I really the only person who thinks it's a bit crazy that we use this blobby comment thread as if it's the best way we have to organize disagreement/argumentation for audiences? I feel like we could almost certainly improve by using, e.g., a horizontal flow as is relatively standard in debate.[1]

With a generic example below:

To be clear, the commentary could still incorporate non-block/prose text.

Alternatively, people could use something like Kialo.com. But surely there has to be something better than this comment thread, in terms of 1) ease of determini... (read more)

1
Joe Rogero
10mo
How hard do you suppose it might be to use an AI to scrub the comments and generate something like this? It may be worth doing manually for some threads, even, but it's easier to get people to adopt if the debate already exists and only needs tweaking. There may even already exist software that accepts text as input and outputs a Kialo-like debate map (thank you for alerting me that Kialo exists, it's neat). 
1
zchuang
10mo
But I don't think we could have predicted people would die into the comments like this. Usually comments have minimal engagement. There's a lesswrong debate format for posts but that's usually with a moderator and such. This seems spontaneous. 

Epistemic status: writing fast and loose, but based on thoughts I've mulled over for a while given personal experience/struggles and discussions with other people. Thus, easy to misinterpret what I'm saying. Take with salt.

On the topic of educational choice, I can't emphasize enough the importance of having legible hard skills such as language or, perhaps more importantly, quantitative skills. Perhaps the worst mistake I made in college was choosing to double major in both international studies and public policy, rather than adding a second major in econ o... (read more)

Ultimately, I've found that the line between empirical and theoretical analysis is often very blurry, and if someone does develop a decent brightline to distinguish the two, it turns out that there are often still plenty of valuable theoretical methods, and some of the empirical methods can be very misleading. 

For example, high-fidelity simulations are arguably theoretical under most definitions, but they can be far more accurate than empirical tests.

Overall, I tend to be quite supportive of using whatever empirical evidence we can, especially experim... (read more)

I see. (For others' reference, those two points are pasted below)

  1. All knowledge is derived from impressions of the external world. Our ability to reason is limited, particularly about ideas of cause and effect with limited empirical experience.
  2. History shows that societies develop in an emergent process, evolving like an organism into an unknown and unknowable future. History was shaped less by far-seeing individuals informed by reason than by contexts which were far too complex to realize at the time.

Overall, I don't really know what to make of these. They ... (read more)

1
Matt Beard
10mo
Thanks for the feedback. I agree that trying to present an alternative worldview ends up quite broad with some good counter examples. And I certainly didn't want to give this impression: Instead I'd say that it is difficult to make these predictions based on a priori reasoning, which this community often tries for AI, and that we should shift resources towards rigorous empirical evidence to better inform our predictions. I tried to give specific examples- Anthropic style alignment research is empiricist, Yudkowsky style theorizing is a priori rationalist. This sort of epistemological critique of longtermism is somewhat common.

I haven't looked very hard but the short answer is no, I'm not aware of any posts/articles that specifically address the idea of "methodological overhang" (a phrase I hastily made up and in hindsight realize may not be totally logical) as it relates to AI capabilities.

That being said, I have written about the possibility that our current methods of argumentation and communication could be really suboptimal, here: https://georgetownsecuritystudiesreview.org/2022/11/30/complexity-demands-adaptation-two-proposals-for-facilitating-better-debate-in-internationa... (read more)

Is your claim just that people should generally "increase [their] error bars and widen [their] probability distribution"? (I was frustrated by the difficulty of figuring out what this post is actually claiming; it seems like it would benefit from a "I make the following X major claims..." TLDR.)

I probably disagree with your points about empiricism vs. rationalism (on priors that I dislike the way most people approach the two concepts), but I think I agree that most people should substantially widen their "error bars" and be receptive to new information. An... (read more)

1
Matt Beard
11mo
Thanks for the feedback! definitely a helpful question. That error bars answer was aimed at OpenPhil based on what I've read from them on AI risk + the prompt in their essay question. I'm sure many others are capable of answering the "what is the probability" forecasting question better/more directly than me, but my two cents was to step back and question underlying assumptions about forecasting that seem common in these conversations. Hume wrote that "all probable reasoning is nothing but a species of sensation." This doesn’t mean we should avoid probable reasoning (we can't) but I think we should recognize it is based only on our experiences/observations of the world. and question how rational its foundations are. I don't think at this stage anyone actually has the empirical basis to give a meaningful % for "AI will kill everyone." Call it .5 or 1 or 7 or whatever but my essay is about trying to take a step back and question epistemological foundations. Anthropic seems much better at this so far (if they mean it that they'd stop given further empirical evidence of risks). I did list two premises from Hume that I think are true (or truer than the average person concerned about AI x-risk holds them to be), so those were my TLDR I guess also.

I think there is plenty of room for debate about what the curve of AI progress/capabilities will look like, and I mostly skimmed the article in about ~5 minutes, but I don't think your post's content justified the title ("exponential AI takeoff is a myth"). "Exponential AI takeoff is currently unsupported" or "the common narrative(s) for exponential AI takeoff is based on flawed premises" are plausible conclusions from this post (even if I don't necessarily agree with them), but I think the original title would require far more compelling arguments to be j... (read more)

1
Christoph Hartmann
11mo
Hi Harrison, thanks for stating what I guess a few people are thinking - it's a bit of a clickbait title. I do think though that the non-exponential growth is much more likely than exponential growth just becuase exponential takeoff would require no constraints on growth while it's enough if one constraint kicks in (maybe even one I didn't consider here) to stop exponential growth. I'd be curious on the methodological overhang though. Are you aware of any posts / articles discussing this further?

TL;DR: Someone should probably write a grant to produce a spreadsheet/dataset of past instances where people claimed a new technology would lead to societal catastrophe, with variables such as “multiple people working on the tech believed it was dangerous.”

Slightly longer TL;DR: Some AI risk skeptics are mocking people who believe AI could threaten humanity’s existence, saying that many people in the past predicted doom from some new tech. There is seemingly no dataset which lists and evaluates such past instances of “tech doomers.” It seems somewhat ridic... (read more)

1
Denis
1mo
Just saw this now, after following a link to another comment.  You have almost given me an idea for a research project. I would run the research honestly and report the facts, but my in-going guess is that survivor bias is a massive factor, contrary to what you say here. And that in most cases, the people who believed it could lead to catastrophe were probably right to be concerned. A lot of people have the Y2K bug mentality, in which they didn't see any disaster and so concluded that it was all a false-alarm, rather than the reality which is that a lot of people did great work to prevent it.  If I look at the different x-risk scenarios the public is most aware of: * Nuclear annihilation - this is very real. As is nuclear winter.  * Climate change. This is almost the poster-child for deniers, but in fact there is as yet no reason to believe that the doom-saying predictions are wrong. Everything is going more or less as the scientists predicted, if anything, it's worse. We have just underestimated the human capacity to stick our heads in the ground and ignore reality*.  * Pandemic. Some people see covid as proof that pandemics are not that bad. But we know that, for all the harm it wrought, covid was far from the worst-case. A bioweapon or a natural pandemic.  * AI - the risks are very real. We may be lucky with how it evolves, but if we're not, it will be the machines who are around to write about what happened (and they will write that it wasn't that bad ...)  * Etc.  My unique (for this group) perspective on this is that I've worked for years on industrial safety, and I know that there are factories out there which have operated for years without a serious safety incident or accident - and someone working in one of those could reach the conclusion that the risks were exaggerated, while being unaware of cases where entire factories or oil-rigs or nuclear power plants have exploded and caused terrible damage and loss of life.    Before I seriously start w

TBH, I think that the time spent scoring rationales is probably quite manageable: I don’t think it should take longer than 30 person-minutes to decently judge each rationale (e.g., have three judges each spend 10 minutes evaluating each), maybe less? It might be difficult to have results within 1-2 hours if you don’t have that many judges, but probably it should be available by the end of the day.

To be clear, I was thinking that only a small number (no more than three, maybe just two) of the total questions should be “rationale questions.”

But definitely th... (read more)

I would also strongly recommend having a version of the fellowship that aligns with US university schedules, unlike the current Summer fellowship!

Given the (accusations of) conflicts of interest in OpenAI’s calls for regulation of AI, I would be quite averse to relying on OpenAI for funding for AI governance

This situation was somewhat predictable and avoidable, in my view. I’ve lamented the early-career problem in the past but did not get many ideas for how to solve it. My impression has been that many mid-career people in relevant organizations put really high premiums on “mentorship,” to the point that they are dismissive of proposals that don’t provide such mentorship. 

There are merits to emphasizing mentorship, but the fact has been that there are major bottlenecks on mentorship capacity and this does little good for people who are struggling to get ... (read more)

Although this article might interest me, it is also >30 minutes long and it does not appear to have any kind of TL;DR, and the conclusion is too vague for me to sample test the article‘s insights. It is really important to provide such a summary. For example, for an article of this type and length, I’d really like to know things like:

  1. What would summarize as the main deficiency in current methods? Or “Many people/readers probably believe X, or just haven’t considered Y.”
  2. What does this article argue instead? “I argue Y.”
  3. Why does this all matter? “Y is significantly better than X.”
4
Max Reddel
11mo
Fair point! The article is very long. Given that this post is discussing less theory but rather teases an abstract application, it seemed originally hard to summarize. I updated the article and added a summary.

I was also going to recommend this, but I’ll just add an implementation idea (which IDK if I fully endorse): you could try to recruit a few superforecasters or subject-matter experts (SMEs) in given field to provide forecasts on the questions at the same time, then have a reciprocal scoring element (I.e., who came closest to the superforecasters’/SMEs’ forecasts). This is basically what was done in the 2022 Existential Risk Persuasion/Forecasting Tournament (XPT), which Philip Tetlock ran (and I participated in). IDK when the study results for that tournam... (read more)

2
Jason
11mo
A trial of #2 would have some information value -- you could discern how strong the correlation was between the rationale scores and final standings to decide if rationales were a good way to produce a same-week result. Maybe you could also use idea #1 with only the top-scoring teams making it to the rationale round, to cut down on time spent scoring rationales?

I think the ask-a-question feature on EAF/LW partially fills the need.

I considered this largely insufficient for what I have in mind; I want a centralized hub that I can check (and/or that I expect other people will check), whereas the ask-a-question feature is too decentralized.

 

I expect most professional researchers don't have a need for this, since they have a sufficiently strong network that they can ask someone who will know the answer or know who to ask.

Perhaps! I wouldn't know, I'm mostly just a wannabe amateur who doesn't have such a luxury. T... (read more)

One particularly confusing fact is that OAI's valuation appears to have gone from $14 billion in 2021 to $19 billion in 2023. Even ignoring anything about transformative AI, I would have expected that the success of ChatGPT etc. should have resulted in a more than a 35% increase.

I've frequently seen people fail to consider the fact that potential competitors (including borderline copycats) can significantly undermine the profitability of one company's technology, even if the technology could generate substantial revenue. Are you taking this into account?

3
Davidsk
10mo
Yes, absolutely! It's important to recognize and account for the potential impact of competitors, including those who may closely replicate a company's technology or offerings. Competition in the market can indeed pose a significant threat to the profitability of a company, regardless of how promising its technology might be. Even if a technology has the potential to generate substantial revenue, if competitors emerge with similar or even slightly different solutions, it can lead to a loss of market share and decreased profitability.
4
Ben_West
1y
I have not done a formal analysis, but yes, even given the possibility of copycats I still feel like the unexpected success of ChatGPT should have resulted in more than a 35% increase to their value.

Regarding

Then Karnofsky shouldn't claim that he was arguing with "as much rigour as possibly".

See my comment

2
JoshuaBlake
1y
Thank you for the correction, I had not checked the full context of the quote myself. I have now edited to clarify that Karnofsky did not claim this in response to this and Spencer's reply.

Holden Karnofsky describes the claims of his “Most Important Century” series as “wild” and “wacky”, but at the same time purports to be in the mindset of “critically examining” such “strange possibilities” with “as much rigour as possible”. This emphasis is mine, but for what is supposedly an important piece of writing in a field that has a big part of its roots in academic analytic philosophy, it is almost ridiculous to suggest that this examination has been carried out with 'as much rigour as possible'. 

(Probably unimportant note: I don't understand... (read more)

2
Spencer Becker-Kahn
1y
I appreciate the comment. I'll be honest, I'm probably not going to go back through all of the quotations now and give the separate posts they come from. Karnofsky did put the whole series into a single pdf which can function as a single source (but as I say, I haven't gone through again and checked myself). I do recognize the bit you are quoting though - yes that is indeed where I got my quotations for that part from - and I did think a bit the issue you are bringing up. To me, this seems like a fair paraphrase/description of Karnofsky in that section: a) He's developed a mindset in which he critically examines things like this with as much rigour as possible; and b) He has made a lot of investment in examining this thesis. So it seemed - and still seems - to me that he is implying that he has invested a lot of time in critically examining this thesis with as much as rigour as possible.  However, I would not have used the specific phrasing that JoshuaBlake used. i.e. I deliberately did not say something like 'Karnofsky claims to have written these posts with as much rigour as possible". So I do think that one could try to give him the benefit of the doubt and say something like:  'Although he spent a lot of time examining the thesis with as much rigor as possible, it does not necessarily follow that he wrote the posts in a way that shows that. So criticising the writing in the posts is kind of an unfair way to attack his use of rigour.' But I think to me this just seemed like I would be trying a bit too hard to avoid criticising him. This comes back to some of my points in my post: i.e. I am suggesting that his posts are not written in a way that invites clear criticism, despite his claim that his is one of his main intentions and I suggest that Karnofsky's rhetorical style aims for the best of both worlds: He wants his readers to think both that he has thought about this very rigorously and critically for a long time but also - wherever it seems vague or

Just to clarify, I don't think that the legislation actually explicitly mentions "artificial intelligence," right? Just "intentional or accidental threats arising from the use and development of emerging technologies."

1
JuanGarcia
1y
That's correct. Artificial intelligence and engineered pandemics are just given as examples of things that may fall under the category of threats from emerging technologies.

Learn the ‘proper’ way to type.

Is this just directed at people who still use hunt-and-peck, or is there some new “proper” way to type beyond normal [full hand / home key] typing?

2
freedomandutility
1y
But proper way I just mean full hand / home key

If they are aligned, then surely our future selves can figure this out?

I think it’s entirely plausible we just don’t care to figure it out, especially if we have some kind of singleton scenario where the entity in control decides to optimize human/personal welfare at the expense of other sentient beings. Just consider how humans currently treat animals and now imagine that there is no opportunity for lobbying for AI welfare, we’re just locked into place.

Ultimately, I am very uncertain, but I would not say that solving AI alignment/control will “surely” lead to a good future.

To be completely honest I think the new title—mainly the “worth it”—is actually worse: That’s a phrase you would use if you are intentionally choosing to create global warming. (It’s also somewhat flippant.)

I would recommend something simple like “Some global warming can reduce the risk of mass starvation from abrupt sunlight reduction scenarios.”

I think having the split bullet is probably good, though.

3
Vasco Grilo
1y
Thanks! I have updated the title to: I have: * Replaced "worth it" by "good", which sounds less flippant to me. * Added the term "food shocks" to clarify the risk from ASRSs I am referring to. * Replaced "Some" by "More" to clarify I mean additional global warming may be good (instead of just the level of warming we are heading to).

Do you have any suggestions for a better title?

I'm not sure I have a great suggestion off the top of my head; my initial guess would be something like "Some global warming could decrease the risk from [?]..." (On second review, I'm especially confused as to whether the title is claiming that global warming reduces the risk from climate change??)

For me, an overly provocative title would be something like, "Is global warming actually good? At the margin, maybe".

I wasn't claiming your title was overly provocative, just that it can be very easily taken out of ... (read more)

2
Vasco Grilo
1y
Thanks! I have changed the title to: I assume global warming always increases the risk from climate change, I have added some more words to the 2nd bullet of the summary to clarify this: My best guess is that additional emissions of greenhouse gases (GHGs) are beneficial up to an optimal median global warming in 2100 relative to 1880 of 4.3 ºC, [what I added:] after which the increase in the risk from climate change outweighs the reduction in that from ASRSs. I have now divided the 1st bullet into 2, where the 1st acknowledges the risks from climate change:
Load more