Venus is an extreme example of an Earth-like planet with a very different climate. There is nothing in physics or chemistry that says Earth's temperature could not one day exceed 100 C.
[...]
[Regarding ice melting -- ] That will take time, but very little time on a cosmic scale, maybe a couple of thousand years.
I'll be blunt: remarks like these undermine your credibility. But regardless, I just don't have any experience or contributions to make on climate change, other than re-emphasizing my general impression that, as a person who cares a lot about e...
Everything is going more or less as the scientists predicted, if anything, it's worse.
I'm not that focused on climate science, but my understanding is that this is a bit misleading in your context—that there were some scientists in the (90s/2000s?) who forecasted doom or at least major disaster within a few decades due to feedback loops or other dynamics which never materialized. More broadly, my understanding is that forecasting climate has proven very difficult, even if some broad conclusions (e.g., "the climate is changing," "humans contribute to climat...
I don't find this response to be a compelling defense of what you actually wrote:
since AIs would "get old" too [...] they could also have reason to not expropriate the wealth of vulnerable old agents because they too will be in such a vulnerable position one day
It's one thing if the argument is "there will be effective enforcement mechanisms which prevent theft," but the original statement still just seems to imagine that norms will be a non-trivial reason to avoid theft, which seems quite unlikely for a moderately rational agent.
Ultimately, perhaps much o...
Apologies for being blunt, but the scenario you lay out is full of claims that seem to completely ignore facially obvious rebuttals. This would be less bad if you didn’t seem so confident, but as written the perspective strikes me as naive and I would really like an explanation/defense.
Take for example:
...Furthermore, since AIs would "get old" too, in the sense of becoming obsolete in the face of new generations of improved AIs, they could also have reason to not expropriate the wealth of vulnerable old agents because they too will be in such a vu
Sure! (I just realized the point about the MNIST dataset problems wasn't fully explained in my shared memo, but I've fixed that now)
Per the assessment section, some of the problems with assuming that FRVT demonstrates NIST's capabilities for evaluation of LLMs/etc. include:
Seeing the drama around the NIST AI Safety Institute and Paul Christiano's appointment, as well as this article about the difficulty of rigorously/objectively measuring characteristics of generative AI, I figured I'd post my class memo from last October/November.
The main point I make is that NIST may not be well suited to creating measurements for complex, multi-dimensional characteristics of language models—and that some people may be overestimating the capabilities of NIST because they don't recognize how incomparable the Facial Recognition Vendor Test is to this ...
I probably should have been more clear: my true "final" paper actually didn't focus on this aspect of the model. The offense-defense balance was the original motivation/purpose of my cyber model, but I eventually became far more interested in using the model to test how large language models could improve agent-based modeling by controlling actors in the simulation. I have a final model writeup which explains some of the modeling choices and discusses the original offense/defense purpose in more detail.
(I could also provide the model code ...
If offence and defence both get faster, but all the relative speeds stay the same, I don’t see how that in itself favours offence
Funny you should say this: it so happens that I just submitted a final paper last night for an agent-based model which was meant to test exactly this kind of claim about the impacts of improving “technology” (AI) in cybersecurity. Granted, the model was extremely simple + incomplete, but the theoretical results explain how this could be possible.
In short, when assuming a fixed number of vulnerabilities in an attack surface, while atta...
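Since the model itself isn't reproduced here, the following is only a toy sketch under assumptions of my own (a fixed pool of vulnerabilities, Poisson-style discovery, and a common "tech" factor scaling both sides equally), not the actual model from the paper:

```python
import random

def race(n_vulns, attacker_rate, defender_rate, tech=1.0, dt=0.01, rng=None):
    """One race over a fixed pool of vulnerabilities (illustrative only).

    Both sides' discovery rates are scaled by the same 'tech' factor.
    The attacker wins by hitting any still-unpatched vulnerability;
    the defender wins by patching the whole pool first.
    Returns (attacker_won, elapsed_time).
    """
    rng = rng or random.Random()
    unpatched = n_vulns
    t = 0.0
    while unpatched > 0:
        t += dt
        # Attacker's chance of hitting an unpatched vulnerability shrinks
        # as the defender patches more of the fixed pool.
        if rng.random() < attacker_rate * tech * dt * (unpatched / n_vulns):
            return True, t
        if rng.random() < defender_rate * tech * dt:
            unpatched -= 1  # defender patches one vulnerability
    return False, t
```

In this sketch, scaling `tech` leaves each side's win probability roughly unchanged (relative speeds are the same), but every race resolves proportionally faster, so the number of successful attacks per unit of calendar time can still rise.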
Thank you so much for articulating a bunch of the points I was going to make!
I would probably just further drive home the last paragraph: it’s really obvious that the “number of people a lone maniac can kill in a given time” (in America) has skyrocketed with the development of high fire-rate weapons (let alone knowledge of explosives). It could be true that the O/D balance for states doesn’t change (I disagree) while the O/D balance for individuals skyrockets.
I have increasingly become open to incorporating alternative decision theories as I recognize that I cannot be entirely certain in expected value approaches, which means that (per expected value!) I probably should not solely rely on one approach. At the same time, I am still not convinced that there is a clear, good alternative, and I also repeatedly find that the arguments against using EV are not compelling (e.g., due to ignoring more sophisticated ways of applying EV).
Having grappled with the problem of EV-fanaticism for a long time in part due to the ...
Since I think substantial AI regulation will likely occur by default, I urge effective altruists to focus more on ensuring that the regulation is thoughtful and well-targeted rather than ensuring that regulation happens at all.
I think it would be fairly valuable to see a list of case studies or otherwise create base rates for arguments like “We’re seeing lots of political gesturing and talking, so this suggests real action will happen soon.” I am still worried that the action will get delayed, watered down, and/or diverted to less-existential risks, onl...
I definitely would have preferred a TLDR or summary at the top, not the bottom. However, I definitely appreciate your investigation into this, as I have loathed Eliezer’s use of the term ever since I realized he just made it up.
Strange, unless the original comment from Gerald has been edited since I responded I think I must have misread most of the comment, as I thought it was making a different point (i.e., "could someone explain how misalignment could happen"). I was tired and distracted when I read it, so it wouldn't be surprising. However, the final paragraph in the comment (which I originally thought was reflected in the rest of the comment) still seems out of place and arrogant.
This really isn’t the right post for most of those issues/questions, and most of what you mentioned are things you should be able to find via searches on the forum, searches via Google, or maybe even just asking ChatGPT to explain it to you (maybe!). TBH your comment also just comes across as quite abrasive and arrogant (especially the last paragraph), without actually appearing to be that insightful/thoughtful. But I’m not going to get into an argument on these issues.
I wish! I’ve been recommending this for a while but nobody bites, and usually (always?) without explanation. I often don’t take seriously many of these attempts at “debate series” if they’re not going to address some of the basic failure modes that competitive debate addresses, e.g., recording notes in a legible/explorable way to avoid the problem of arguments getting lost under layers of argument branches.
Hi Oisín, no worries, and thanks for clarifying! I appreciate your coverage of this topic, I just wanted to make sure there aren't misinterpretations.
In policy spaces, this is known as the Brussels Effect; that is, when a regulation adopted in one jurisdiction ends up setting a standard followed by many others.
I am not clear on how the Brussels Effect applies here, especially since we’re not talking about manufacturing a product with high costs of running different production lines. I recognize there may be some argument/step that I’m missing, but I can’t dismiss the possibility that the author doesn’t actually understand what the Brussels Effect really is / normally does, and is throwing it around like a buzzword. Could you please elaborate a bit more?
I’m curious whether people (e.g., David, MIRI folk) think that LLMs now or in the near future would be able to substantially speed up this kind of theoretical safety work?
I was not a huge fan of the instrumental convergence paper, although I didn't have time to thoroughly review it. In short, it felt too slow in making its reasoning and conclusion clear, and once (I think?) I understood what it was saying, it felt quite nitpicky (or a borderline motte-and-bailey). In reality, I'm still unclear if/how it responds to the real-world applications of the reasoning (e.g., explaining why a system with a seemingly simple goal like calculating digits of pi would want to cause the extinction of humanity).
The summary in this forum pos...
Sorry about the delayed reply, I saw this and accidentally removed the notification (and I guess didn't receive an email notification, contrary to my expectations) but forgot to reply. Responding to some of your points/questions:
One can note that AIXR is definitely falsifiable, the hard part is falsifying it and staying alive.
I mostly agree with the sentiment that "if someone predicts AIXR and is right then they may not be alive", although I do now think it's entirely plausible that we could survive long enough during a hypothetical AI takeover to say "ah ...
Epistemic status: I feel fairly confident about this but recognize I’m not putting in much effort to defend it and it can be easily misinterpreted.
I would probably just recommend not using the concept of neglectedness in this case, to be honest. The ITN framework is a nice heuristic (e.g., usually more neglected things benefit more from additional marginal contributions) but it is ultimately not very rigorous/logical except when contorted into a definitional equation (as many previous posts have explained). Importantly, in this case I think that focusing on neglectedness is likely to lead people astray, given that a change in neglectedness could equate to an increase in tractability.
Over the past few months I have occasionally tried getting LLMs to do some tasks related to argument mapping, but I actually don't think I've tried that specifically, and probably should. I'll make a note to myself to try here.
Interesting. Perhaps we have quite different interpretations of what AGI would be able to do with some set of compute/cost and time limitations. I haven't had the chance yet to read the relevant aspects of your paper (I will try to do so over the weekend), but I suspect that we have very cruxy disagreements about the ability of a high-cost AGI—and perhaps even pre-general AI that can still aid R&D—to help overcome barriers in robotics, semiconductor design, and possibly even aspects of AI algorithm design.
Just to clarify, does your S-curve almost entir...
I find this strange/curious. Is your preference more a matter of “Traditional interfaces have good features that a flowing interface would lack” (or some other disadvantage to switching) or “The benefits of switching to a flowing interface would be relatively minor”?
For example on the latter, do you not find it more difficult with the traditional UI to identify dropped arguments? Or suppose you are fairly knowledgeable about most of the topics but there’s just one specific branch of arguments you want to follow: do you find it easy to do that? (And more on...
Thanks for posting this, Ted, it’s definitely made me think more about the potential barriers and the proper way to combine probability estimates.
One thing I was hoping you could clarify: In some of your comments and estimates, it seems like you are suggesting that it’s decently plausible(?)[1] we will “have AGI” by 2043, it’s just that it won’t lead to transformative AGI before 2043 because the progress in robotics, semiconductors, and energy scaling will be too slow by 2043. However, it seems to me that once we have (expensive/physically-limited) AG...
Are you referring to this format on LessWrong? If so, I can’t say I’m particularly impressed, as it still seems to suffer from the problems of linear dialogue vs. a branching structure (e.g., it is hard to see where points have been dropped, and it is harder to trace specific lines of argument). But I don’t recall seeing this, so thanks for the flag.
As for “I don’t think we could have predicted people…”, that’s missing my point(s). I’m partially saying “this comment thread seems like it should be a lesson/example of how text-blob comment-threads are inefficien...
Am I really the only person who thinks it's a bit crazy that we use this blobby comment thread as if it's the best way we have to organize disagreement/argumentation for audiences? I feel like we could almost certainly improve by using, e.g., a horizontal flow as is relatively standard in debate.[1]
With a generic example below:
To be clear, the commentary could still incorporate non-block/prose text.
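As a rough toy sketch of what a flow buys you (my own illustration, not a representation of any existing tool): each argument thread occupies a row, each round of responses a column, and a dropped argument shows up as an empty cell.

```python
# Toy "horizontal flow": rows are argument threads, columns are successive
# replies. A None cell makes a dropped argument immediately visible.
flow = {
    "Claim A": ["Rebuttal to A", "Defense of A"],
    "Claim B": ["Rebuttal to B", None],  # defense of B was dropped
    "Claim C": [None, None],             # C was never answered
}

def dropped(flow, round_index):
    """Return the argument threads left unanswered in a given column."""
    return [claim for claim, replies in flow.items()
            if round_index < len(replies) and replies[round_index] is None]

print(dropped(flow, 1))  # → ['Claim B', 'Claim C']
```

The point is that in a blobby comment thread you have to reconstruct this grid in your head, whereas a flow makes the gaps legible at a glance.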
Alternatively, people could use something like Kialo.com. But surely there has to be something better than this comment thread, in terms of 1) ease of determini...
Epistemic status: writing fast and loose, but based on thoughts I've mulled over for a while given personal experience/struggles and discussions with other people. Thus, easy to misinterpret what I'm saying. Take with salt.
On the topic of educational choice, I can't emphasize enough the importance of having legible hard skills such as language or, perhaps more importantly, quantitative skills. Perhaps the worst mistake I made in college was choosing to double major in both international studies and public policy, rather than adding a second major in econ o...
Ultimately, I've found that the line between empirical and theoretical analysis is often very blurry, and if someone does develop a decent brightline to distinguish the two, it turns out that there are often still plenty of valuable theoretical methods, and some of the empirical methods can be very misleading.
For example, high-fidelity simulations are arguably theoretical under most definitions, but they can be far more accurate than empirical tests.
Overall, I tend to be quite supportive of using whatever empirical evidence we can, especially experim...
I see. (For others' reference, those two points are pasted below)
- All knowledge is derived from impressions of the external world. Our ability to reason is limited, particularly about ideas of cause and effect with limited empirical experience.
- History shows that societies develop in an emergent process, evolving like an organism into an unknown and unknowable future. History was shaped less by far-seeing individuals informed by reason than by contexts which were far too complex to realize at the time.
Overall, I don't really know what to make of these. They ...
I haven't looked very hard but the short answer is no, I'm not aware of any posts/articles that specifically address the idea of "methodological overhang" (a phrase I hastily made up and in hindsight realize may not be totally logical) as it relates to AI capabilities.
That being said, I have written about the possibility that our current methods of argumentation and communication could be really suboptimal, here: https://georgetownsecuritystudiesreview.org/2022/11/30/complexity-demands-adaptation-two-proposals-for-facilitating-better-debate-in-internationa...
Is your claim just that people should generally "increase [their] error bars and widen [their] probability distribution"? (I was frustrated by the difficulty of figuring out what this post is actually claiming; it seems like it would benefit from a "I make the following X major claims..." TLDR.)
I probably disagree with your points about empiricism vs. rationalism (on priors that I dislike the way most people approach the two concepts), but I think I agree that most people should substantially widen their "error bars" and be receptive to new information. An...
I think there is plenty of room for debate about what the curve of AI progress/capabilities will look like, and I mostly skimmed the article in ~5 minutes, but I don't think your post's content justified the title ("exponential AI takeoff is a myth"). "Exponential AI takeoff is currently unsupported" or "the common narrative(s) for exponential AI takeoff is based on flawed premises" are plausible conclusions from this post (even if I don't necessarily agree with them), but I think the original title would require far more compelling arguments to be j...
TL;DR: Someone should probably write a grant to produce a spreadsheet/dataset of past instances where people claimed a new technology would lead to societal catastrophe, with variables such as “multiple people working on the tech believed it was dangerous.”
Slightly longer TL;DR: Some AI risk skeptics are mocking people who believe AI could threaten humanity’s existence, saying that many people in the past predicted doom from some new tech. There is seemingly no dataset which lists and evaluates such past instances of “tech doomers.” It seems somewhat ridic...
TBH, I think that the time spent scoring rationales is probably quite manageable: I don’t think it should take longer than 30 person-minutes to decently judge each rationale (e.g., have three judges each spend 10 minutes evaluating each), maybe less? It might be difficult to have results within 1-2 hours if you don’t have that many judges, but probably it should be available by the end of the day.
To be clear, I was thinking that only a small number (no more than three, maybe just two) of the total questions should be “rationale questions.”
But definitely th...
I would also strongly recommend having a version of the fellowship that aligns with US university schedules, unlike the current Summer fellowship!
Given the (accusations of) conflicts of interest in OpenAI’s calls for regulation of AI, I would be quite averse to relying on OpenAI for funding for AI governance.
This situation was somewhat predictable and avoidable, in my view. I’ve lamented the early-career problem in the past but did not get many ideas for how to solve it. My impression has been that many mid-career people in relevant organizations put really high premiums on “mentorship,” to the point that they are dismissive of proposals that don’t provide such mentorship.
There are merits to emphasizing mentorship, but the fact has been that there are major bottlenecks on mentorship capacity and this does little good for people who are struggling to get ...
Although this article might interest me, it is also >30 minutes long, it does not appear to have any kind of TL;DR, and the conclusion is too vague for me to sample-test the article's insights. It is really important to provide such a summary. For example, for an article of this type and length, I’d really like to know things like:
I was also going to recommend this, but I’ll just add an implementation idea (which IDK if I fully endorse): you could try to recruit a few superforecasters or subject-matter experts (SMEs) in a given field to provide forecasts on the questions at the same time, then have a reciprocal scoring element (i.e., who came closest to the superforecasters’/SMEs’ forecasts). This is basically what was done in the 2022 Existential Risk Persuasion/Forecasting Tournament (XPT), which Philip Tetlock ran (and I participated in). IDK when the study results for that tournam...
I think the ask-a-question feature on EAF/LW partially fills the need.
I considered this largely insufficient for what I have in mind; I want a centralized hub that I can check (and/or that I expect other people will check), whereas the ask-a-question feature is too decentralized.
I expect most professional researchers don't have a need for this, since they have a sufficiently strong network that they can ask someone who will know the answer or know who to ask.
Perhaps! I wouldn't know, I'm mostly just a wannabe amateur who doesn't have such a luxury. T...
One particularly confusing fact is that OAI's valuation appears to have gone from $14 billion in 2021 to $19 billion in 2023. Even ignoring anything about transformative AI, I would have expected that the success of ChatGPT etc. should have resulted in more than a 35% increase.
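For concreteness, here is a quick check of the figures quoted above (my own calculation, not from the original comment):

```python
# Valuations cited in the comment: $14B (2021) and $19B (2023).
old_valuation = 14e9
new_valuation = 19e9
increase = (new_valuation - old_valuation) / old_valuation
print(f"{increase:.1%}")  # → 35.7%
```

So the quoted figures imply an increase of roughly 36%, which is why the comment frames ~35% as surprisingly low.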
I've frequently seen people fail to consider the fact that potential competitors (including borderline copycats) can significantly undermine the profitability of one company's technology, even if the technology could generate substantial revenue. Are you taking this into account?
Regarding
Then Karnofsky shouldn't claim that he was arguing with "as much rigour as possible".
See my comment
Holden Karnofsky describes the claims of his “Most Important Century” series as “wild” and “wacky”, but at the same time purports to be in the mindset of “critically examining” such “strange possibilities” with “as much rigour as possible”. This emphasis is mine, but for what is supposedly an important piece of writing in a field that has a big part of its roots in academic analytic philosophy, it is almost ridiculous to suggest that this examination has been carried out with 'as much rigour as possible'.
(Probably unimportant note: I don't understand...
Just to clarify, I don't think that the legislation actually explicitly mentions "artificial intelligence," right? Just "intentional or accidental threats arising from the use and development of emerging technologies."
Learn the ‘proper’ way to type.
Is this just directed at people who still use hunt-and-peck, or is there some new “proper” way to type beyond normal [full hand / home key] typing?
If they are aligned, then surely our future selves can figure this out?
I think it’s entirely plausible we just don’t care to figure it out, especially if we have some kind of singleton scenario where the entity in control decides to optimize human/personal welfare at the expense of other sentient beings. Just consider how humans currently treat animals and now imagine that there is no opportunity for lobbying for AI welfare, we’re just locked into place.
Ultimately, I am very uncertain, but I would not say that solving AI alignment/control will “surely” lead to a good future.
To be completely honest, I think the new title—mainly the “worth it”—is actually worse: that’s a phrase you would use if you were intentionally choosing to create global warming. (It’s also somewhat flippant.)
I would recommend something simple like “Some global warming can reduce the risk of mass starvation from abrupt sunlight reduction scenarios.”
I think having the split bullet is probably good, though.
Do you have any suggestions for a better title?
I'm not sure I have a great suggestion off the top of my head; my initial guess would be something like "Some global warming could decrease the risk from [?]..." (On second review, I'm especially confused as to whether the title is claiming that global warming reduces the risk from climate change??)
For me, an overly provocative title would be something like, "Is global warming actually good? At the margin, maybe".
I wasn't claiming your title was overly provocative, just that it can be very easily taken out of ...
I definitely think beware is too strong. I would recommend “discount” or “be skeptical” or something similar.