Thanks, I found this post helpful, especially the diagram.
What (if any) is the overlap of cooperative AI […] and AI safety?
One thing I’ve thought about a little is the possibility of there being a tension wherein making AIs more cooperative in certain ways might raise the chance that advanced collusion between AIs breaks an alignment scheme that would otherwise work.[1]
I’ve not written anything up on this and likely never will; I figure here is as good a place as any to leave a quick comment pointing to the potential problem, appreciating that it’s
Hard to tell from the information given. Two sources saying an unknown number of people are threatening to resign could just mean that two people are disgruntled and might themselves resign.
Hmm, okay, so it sounds like you’re arguing that even if we measure the curvature of our observable universe to be negative, it could still be the case that the overall universe is positively curved and therefore finite? But surely your argument should be symmetric, such that you should also believe that if we measure the curvature of our observable universe to be positive, it could still be the case that the overall universe is negatively curved and thus infinite?
Thanks for replying, I think I now understand your position a bit better. Okay, so if your concern is around measurements only being finitely precise, then my exactly-zero example is not a great one, because I agree that it’s impossible to measure the universe as being exactly flat.
Maybe a better example: if the universe’s large-scale curvature is either zero or negative, then it necessarily follows that it’s infinite.
—(I didn’t give this example originally because of the somewhat annoying caveats one needs to add. Firstly, in the flat case, that the ...
Hi Vasco, I’m having trouble parsing your comment. For example, if the universe’s large-scale curvature is exactly zero (and the universe is simply connected), then by definition it’s infinite, and I’m confused as to why you think it could still be finite (if this is what you’re saying; apologies if I’m misinterpreting you).
I’m not sure what kind of a background you already have in this domain, but if you’re interested in reading more, I’d recommend first going to the “Shape of the universe” Wikipedia page, and then, depending on your mileage, lecture...
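For reference, here is a minimal summary of the standard classification I’m relying on, assuming a simply connected universe with constant large-scale spatial curvature (the caveats mentioned in my earlier comment apply):

```latex
% Spatial geometry of a simply connected FRW universe, by the sign of curvature k
\begin{aligned}
k = +1 &: \text{ spherical (positively curved)}  &&\Rightarrow \text{ finite volume} \\
k = 0  &: \text{ Euclidean (flat)}               &&\Rightarrow \text{ infinite} \\
k = -1 &: \text{ hyperbolic (negatively curved)} &&\Rightarrow \text{ infinite}
\end{aligned}
```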
I’m confused about why you think forecasting orgs should be trying to acquire commercial clients.[1] How do you see this as being on the necessary path for forecasting initiatives to reduce x-risk, contribute to positive trajectory change, etc.? Perhaps you could elaborate on what you mean by “real-world impact”?
COI note: I work for Metaculus.
The main exception that comes to mind, for me, is AI labs. But I don’t think you’re talking about AI labs in particular as the commercial clients forecasting orgs should be aiming for?
What you describe in your first paragraph sounds to me like a good updating strategy, except I would say that you’re not updating your “natural independent opinion,” you’re updating your all-things-considered belief.
Related short posts I recommend—the first explains the distinction I’m pointing at, and the second shows how things can go wrong if people don’t track it:
AI x-risk is unique because humans would be replaced by other beings, rather than completely dying out. This means you can't simply apply a naive argument that AI threatens total extinction of value
Paul Christiano wrote a piece a few years ago about ensuring that misaligned ASI is a “good successor” (in the moral value sense),[1] as a plan B to alignment (Medium version; LW version). I agree it’s odd that there hasn’t been more discussion since.[2]
...Here's a non-exhaustive list of guesses for why I think EAs haven't historically been sympathetic ...
Individually, altruists [...] can make a habit of asking themselves and others what risks they may be overlooking, dismissing, or downplaying.
Institutionally, we can rearrange organizational structures to take these individual tendencies into account, for example by creating positions dedicated to or focused on managing risk.
I’ve been surprised by how this seems to be a bit of a blind spot in our community.[1] I’ve previously written a couple of comments—excerpted below—on this theme, about the state of community building. These garnered a decent numb...
Objection 1: It is unlikely that there will be a point at which a unified agent will be able to take over the world, given the existence of competing AIs with comparable power
For what it’s worth, the Metaculus crowd forecast for the question “Will transformative AI result in a singleton (as opposed to a multipolar world)?” is currently “60%”. That is, forecasters believe it’s more likely than not that there won’t be competing AIs with comparable power, which runs counter to your claim.
(I bring this up seeing as you make a forecasting-based argument for your claim.)
Following on from your saner world illustration, I’d be curious to hear what kind of a call to action you might endorse in our current world.
I personally find your writings on metaphilosophy, and the closely related problem of ensuring AI philosophical competence, persuasive. In other words, I think this area has been overlooked, and that more people should be working in it given the current margin in AI safety work. But I also have a hard time imagining anyone pivoting into this area, at present, given that:[1]
Yes, this is a fair point; Holden has discussed these dangers a little in “Digital People Would Be An Even Bigger Deal”. My bottom-line belief, though, is that mind uploads are still significantly more likely to be safe than ML-derived ASI, since uploaded minds would presumably work, and act, much more similarly to (biological) human minds. My impression is that others also hold this view? I’d be interested if you disagree.
To be clear, I rank moratorium > mind uploads > ML-derived ASI, but I think it’s plausible that our strategy portfolio shoul...
Hmm, based on what you’ve said here—and I acknowledge that what you’ve said is a highly compressed version of your experience, thus I may well be failing to understand you (and I apologize in advance if I mischaracterize your experience)—I think I’m not quite seeing how this refutes my framing? I accept that my type-A/B framing rounds off a bunch of nuance, but to me, within that framing, it sounds like you’re type-A?
Like, I’m not sure how long the transition period was for you, and I expect different people’s transition periods will vary considerably, but...
Thanks for your comments, both. I agree that the personal versus universal statements distinction is noteworthy (and missing from my take above).
Shower thought: A lot of the talking past each other that happens between vegan and non-vegan[1] EAs[2] might come from selection effects plus typical mind fallacy.[3]
Let’s say there are two types of people: type-A, for whom a vegan diet imposes little or no costs, and type-B, for whom a vegan diet imposes substantial costs (in things like health, productivity,[4] social life). My hunch is that most long-time vegans are type-A, while most type-B people who try going vegan bounce.
Now, to a type-A vegan who doesn’t realize type-B is a thing, t...
This doesn't super resonate with my experience. I haven't really seen anyone argue for "veganism is costly for everyone". I feel like the debate has always been between "for some people veganism is very costly" and "veganism is very cheap for everyone (if they just try properly)".
Like, it's not like anyone is arguing that there should be no vegan food at EAG, or that all EAs should be carnivores. Maybe I am missing something here and there are places where people are talking past each other in the way you describe, but e.g. recent conversations with ...
Just to give your final point some context: the average in-depth research project by Rethink Priorities reportedly costs $70K-$100K. So, if this AI Impacts survey cost $138K in participant compensation, plus some additional amount for things like researcher time, then it looks like this survey was roughly two to three times as expensive as the average research project in its approximate reference class.
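To make the comparison explicit, here is the rough arithmetic; the figure for costs beyond participant compensation is a made-up placeholder, not a reported number:

```python
# Rough cost comparison (illustrative; the extra-cost figure is an assumption, not a reported number)
avg_rp_project = (70_000 + 100_000) / 2   # midpoint of the reported $70K-$100K range
survey_compensation = 138_000             # reported participant compensation
assumed_extra_costs = 80_000              # hypothetical allowance for researcher time, overhead, etc.

survey_total = survey_compensation + assumed_extra_costs
print(survey_total / avg_rp_project)      # ~2.6x the average project cost under these assumptions
```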
I haven’t thought hard about whether the costs of EA-funded research make sense in general, but I thought I’d leave this comment so that readers don’t go a...
I’m curious, since it sounds like MIRI folks may have thought about this, if you have takes on how best to allocate marginal effort between pushing for cooperation-to-halt-AI-progress on the one hand, and accelerating cognitive enhancement (e.g., mind uploading) on the other?[1]
Like, I see that you list promoting cooperation as a priority, but to me, based on your footnote 3, it doesn’t seem obvious that promoting cooperation to buy ourselves time is a better strategy at the margin than simply working on mind uploading.[2] (At least, I don’t see this ...
Upvoted because I’m a fan of people summarizing and signal-boosting literature that bears on EA priorities. Disagree-voted because I’m not convinced that any of the observations or considerations put forward support the headline claim that only mammals and birds are sentient.
Also, I’m pretty sure that octopuses do play? A quick search appears to confirm this: “Octopuses like to play” (BBC).
I mention this in response to the part of the post that reads: “Other animals like fish, reptiles, and octopuses do not engage in sensation-seeking or play and so do not have those internal conscious experiences.”
(ETA: I see that in their closing section, OP acknowledges some uncertainty here, listing as a question for further investigation: “Is it really true that fish, shrimp, octopuses, and other animals of particular concern do not engag...
I'm sure ~everyone involved considers nuclear war a negative-sum game. (They likely still think it's preferable to win a nuclear war than to lose it, but they presumably think the "winner" doesn't gain as much as the "loser" loses.)
On top of this, I imagine most involved view not fighting a nuclear war as preferable to fighting and winning. (In other words, a nuclear war is not only negative on net, but negative for everyone.)[1]
...Yeah, my sense is multiple countries will upgrade their arsenals soon. I'm legitimately uncertain whether this will on net increa
Writing in a personal capacity.
Hi Geoffrey, I think you raise a very reasonable point.
There’s some unfortunate timing at play here: 3/7 of the active mod team—Lizka, Toby, and JP—have been away at a CEA retreat for the past ~week, and have thus mostly been offline. In my view, we would have ideally issued a proper update by now on the earlier notice: “For the time being, please do not post personal information that would deanonymize Alice or Chloe.”
In lieu of that, I’ll instead publish one of my comments from the moderators’ Slack thread, along w...
I think what this comes down to for me is: If Kat Woods’ Forum username was pseudonymous, would we have taken down Ben’s post? (Or otherwise removed all references to Kat by her real name?)
If the answer to this is “yes,” then I don’t think Alice+Chloe should be deanonymized.
I do not like the incentive structure that this would create if adopted. Kat did not get to look at this particular drama and decide whether she wanted it discussed under a real or pseudonymous username. Her decision point was when she created her forum account however many years ...
Will - thanks very much for sharing your views, and some of the discussion amongst the EA Forum moderators.
These are tricky issues, and I'm glad to see that they're getting some serious attention, in terms of the relative costs, benefits, and risks of different possible policies.
I'm also concerned about 'setting a precedent of first-mover advantage'. A blanket policy of first-mover (or first-accuser) anonymity would incentivize EAs to make lots of allegations before the people they're accusing could make counter-allegations. That seems likely to create massive problems, conflicts, toxicity, and schisms within EA.
Thanks for sharing this!
I had a bunch of thoughts on this situation, enough that I wrote them up as a post. Unfortunately your response came out while I was writing and I didn't see it, but I think it doesn't change much?
In addition to your three paths forward, I see a fourth one: you extend the policy to have the moderators (or another widely-trusted entity) make decisions on when there should be exceptions in cases like this, and write a bit about how you'll make those decisions.
(I only came across this post because I saw your comment; agree that it's breathtakingly moving :') If the author sees this: I appreciate you, and I too would definitely subscribe.)
Writing in a personal capacity.
“An update to our policies on revealing personal information on the Forum” covers some of what you’re asking about, I think, although the framing there is more about revealing private vs public info than about “How substantiated is substantiated enough?” The most relevant part:
...
- We think a very good norm is to check unverified rumors or claims before sharing them — especially if they might be damaging or if they relate to sensitive or stigmatized topics.
- If you’re not sure whether you should check something (or how to check
Yeah, this is a reasonable thing to ask. So, the “if we have reason to believe that someone is violating norms around voting” clause is intentionally vague, I believe, because if we gave more detail on the kinds of checks/algorithms we have in place for flagging potential violations, then this could help would-be miscreants commit violations that slip past our checks.
(I’m a bit sad that the framing here is adversarial, and that we can’t give users like you more clarification, but I think this state of play is the reality of running an online forum.)
If it helps, though, the bar for looking into a user’s voting history is high. Like, on average I don’t think we do this more than once or twice per month.
Writing in a personal capacity; I haven't run this by other mods.
Hi, just responding to these parts of your comment:
I think people might reasonably (though wrongly) assume that forum mods are not monitoring accounts at this level of granularity, and thus believe that their voting behavior is private.
...
Frankly, I don’t love that mods are monitoring accounts at this level of granularity. (For instance, knowing this would make me less inclined to put remotely sensitive info in a forum dm.)
We include some detail on what would lead moderators to look into a us...
Hey folks, a reminder to please be thoughtful as you comment.
The previous Nonlinear thread received almost 500 comments; many of these were productive, but there were also some more heated exchanges. Following Forum norms—in a nutshell: be kind, stay on topic, be honest—is probably even more important than usual in charged situations like these.
Discussion here could end up warped towards aggression and confusion for a few reasons, even if commenters are generally well intentioned:
Edited to add: My objection to John’s comment in what I write below lies with the “deranged” part. If John had instead said something like “unnecessary” or “overly escalatory/ad hominem,” then I would not have responded. But “deranged” — dictionary definition: “completely unable to think clearly or behave in a controlled way, especially because of mental illness” (source) — which I take to be John implying that the direction Kat has gone in is so completely nonsensical that there can’t possibly be a reasonable explanation, struck me as sufficiently inaccurate...
Retaliation is bad. If you think doing X is bad, then you shouldn't do X, even if you're 'only doing it to make the point that doing X is bad'.
the actions he [SBF] was convicted of are nearly universally condemned by the EA community
I don’t think that observing lots of condemnation and little support is all that much evidence for the premise you take as given—that SBF’s actions were near-universally condemned by the EA community—compared to meaningfully different hypotheses like “50% of EAs condemned SBF’s actions.”
There was, and still is, a strong incentive to hide any opinion other than condemnation (e.g., support, genuine uncertainty) over SBF’s fraud-for-good ideology, out of legiti...
I notice I’m confused by what Anders says about the offence-defence balance.
The argument, as I understand it, is that in the far future there’ll be a lot of space—lightyears, perhaps—between warring factions/civilizations. Offensive attacks therefore won’t work well because, with all the distance the offensive weapons need to cover, the defenders will have plenty of time to block or move out of the way.
But… this relies on the defenders seeing the weapons approaching, no? And I would expect weapons of the far future to travel at or very close to the speed o...
This isn’t an isolated incident either, by the way. See my write-up on the errors in the math in the quantum physics sequences
The average physics textbook contains multiple errors of the form “forgot to include the normalization term.”[1] To me it seems incorrect to bring up Eliezer's error as strong evidence that he doesn't know what he's talking about, without acknowledging the base rate of similar errors.
(I recognize that you also point to Eliezer's economics errors. I haven't read Inadequate Equilibria or the post critiquing it, so I can't comme...
Importance of the digital minds stuff compared to regular AI safety; how many early-career EAs should be going into this niche? What needs to happen between now and the arrival of digital minds? In other words, what kind of a plan does Carl have in mind for making the arrival go well? Also, since Carl clearly has well-developed takes on moral status, what criteria he thinks could determine whether an AI system deserves moral status, and to what extent.
Additionally—and this one's fueled more by personal curiosity than by impact—Carl's beliefs on consciousne...
It took me just under 5 minutes.
The percentages I inputted were best guesses based on my qualitative impressions. If I'd been more quantitative about it, then I expect my allocations would have been better—i.e., closer to what I'd endorse on reflection. But I didn't want to spend long on this, and figured that adding imperfect info to the commons would be better than adding no info.
Thank you for engaging. I don’t disagree with what you’ve written; I think you have interpreted me as implying something stronger than what I intended, and so I’ll now attempt to add some colour.
That Emily and other relevant people at OP have not fully adopted Rethink’s moral weights does not puzzle me. As you say, to expect that is to apply an unreasonably high funding bar. I am, however, puzzled that Emily and co. appear to have not updated at all towards Rethink’s numbers. At least, that’s the way I read:
...
- We don’t use Rethink’s moral weights.
- Our cur
Fair points, Carl. Thanks for elaborating, Will!
- We don’t use Rethink’s moral weights.
- Our current moral weights, based in part on Luke Muehlhauser’s past work, are lower. We may update them in the future; if we do, we’ll consider work from many sources, including the arguments made in this post.
Interestingly and confusingly, if one fits distributions to Luke's 2018 guesses for the 80% prediction intervals of the moral weight of various species, one gets mean moral weights close to or larger than 1:
It is also worth noting that Luke seemed very much willing...
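As a sketch of the kind of fitting described above: take a single hypothetical 80% prediction interval for a species' moral weight (the bounds below are made up for illustration, not Luke's actual 2018 numbers), fit a lognormal to it, and read off the mean:

```python
import numpy as np

# Fit a lognormal to a hypothetical 80% prediction interval [p10, p90] for a moral weight.
# These bounds are illustrative placeholders, NOT Luke Muehlhauser's actual 2018 guesses.
p10, p90 = 0.05, 2.0

z90 = 1.2816                       # standard-normal quantile at the 90th percentile
mu = (np.log(p10) + np.log(p90)) / 2
sigma = (np.log(p90) - np.log(p10)) / (2 * z90)

mean = np.exp(mu + sigma**2 / 2)   # lognormal mean exceeds the median when sigma is large
print(round(mean, 2))              # ~0.89 here; wide, right-skewed intervals push the mean toward 1 or above
```

The point is just that a right-skewed distribution fit to a wide interval has a mean well above its median, which is how modest-sounding interval guesses can translate into mean moral weights near or above 1.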
Here, you say, “Several of the grants we’ve made to Rethink Priorities funded research related to moral weights.” Yet in your initial response, you said, “We don’t use Rethink’s moral weights.” I respect your tapping out of this discussion, but at the same time I’d like to express my puzzlement as to why Open Phil would fund work on moral weights to inform grantmaking allocation, and then not take that work into account.
One can value research and find it informative or worth doing without being convinced of every view of a given researcher or team. Open Philanthropy also sponsored a contest to surface novel considerations that could affect its views on AI timelines and risk. The winners mostly present conclusions or considerations on which AI would be a lower priority, but that doesn't imply that the judges or the institution changed their views very much in that direction.
At large scale, information can be valuable enough to buy even if it only modestly adjusts pro...
The "EA movement", however you define it, doesn't get to control the money and there are good reasons for this.
I disagree, for the same reasons as those given in the critique of the post you cite. Tl;dr: Trades have happened in EA where many people have cast aside careers with high earning potential to work on object-level problems. I think these people should get a say over where EA money goes.
I think the 20% figure, albeit a step in the right direction, is in reality a lot less impressive than it first sounds. From OpenAI's "Introducing Superalignment" post:
We are dedicating 20% of the compute we’ve secured to date over the next four years to solving the problem of superintelligence alignment.
I expect that 20% of OpenAI's 2023 compute will be but a tiny fraction of their 2027 compute, given that training compute has been growing by something like 4.2x/year.
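A quick back-of-the-envelope version of that point, taking the ~4.2x/year growth figure at face value and treating "compute secured to date" as roughly 2023-era compute:

```python
# If training compute grows ~4.2x/year, 20% of 2023-era compute is a small slice of 2027-era compute.
growth_per_year = 4.2
years = 4                               # 2023 -> 2027
pledged_fraction_of_2023 = 0.20

growth_factor = growth_per_year ** years                    # ~311x more compute by 2027
fraction_of_2027 = pledged_fraction_of_2023 / growth_factor
print(f"{fraction_of_2027:.2%}")                            # ~0.06% of 2027-era compute
```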
If you go to “Account settings”—which is the second from bottom item when you hover over your user avatar in the top right—then go to “Notifications” then “Private messages,” you can make it so that you’ll receive an email when someone messages you on this forum, to whichever email you used to sign up. (This is in fact the default setting, I believe.) I think as long as you don’t deactivate your account, you can thus ensure that you’ll be notified if someone tries to contact you in the future, even if you’re not actively monitoring your account.
I sometimes post (narrow) reading lists on the forum. Are those actually helpful to anyone?
For what it's worth, I found your "AI policy ideas: Reading list" and "Ideas for AI labs: Reading list" helpful,[1] and I've recommended the former to three or four people. My guess would be that these reading lists have been very helpful to a couple or a few people rather than quite helpful to lots of people, but I'd also guess that's the right thing to be aiming for given the overall landscape.
...Why don't there exist better reading lists / syllabi, especially be
For anyone interested in watching a dramatic reconstruction of this incident, go to timestamp 43:30–47:05 of The Man Who Saved The World. (I recommend watching at 1.5x speed.)
On the last point: I take this potassium iodide supplement, which I recommend. (I take one capsule every two days.)
I looked into iodine supplements a while ago—the two main forms are potassium iodide and seaweed/kelp. Two or three legit-looking articles I found said that seaweed/kelp-based iodine supplements can contain very different amounts of iodine to what they say on their labels, which seems like a good reason to be wary. (Additionally, a couple other articles claimed that seaweed/kelp supplements contain high levels of toxic heavy metals, but the ve...
According to the debate week announcement, Scott Alexander will be writing a summary/conclusion post.
One thing the AI Pause Debate Week has made salient to me: there appears to be a mismatch between the kind of slowing that on-the-ground AI policy folks talk about, versus the type that AI policy researchers and technical alignment people talk about.
My impression from talking to policy folks who are in or close to government—admittedly a sample of only five or so—is that the main[1] coordination problem for reducing AI x-risk is about ensuring the so-called alignment tax gets paid (i.e., ensuring that all the big labs put some time/money/effort i...
We're issuing a warning for this comment for breaking our Forum norm on civility. We don't think it was meant to be insulting, based on Linch's previous Twitter poll (created months ago) and the fact that he himself is not a native speaker. However, we think the stark difference between the Twitter poll and responses here shows that this comment was widely taken as insulting, even if that wasn't the intent. (I certainly saw it that way before reading the Twitter poll.)
A subsequent comment ("I at least made an effort to understand the language when I immigr...
Directionally, I agree with your points. On the last one, I'll note that counting person-years (or animal-years) falls naturally out of empty individualism as well as open individualism, and so the point goes through under the (substantively) weaker claim of “either open or empty individualism is true”.[1]
(You may be interested in David Pearce's take on closed, empty, and open individualism.)
For the casual reader: The three candidate theories of personal identity are empty, open, and closed individualism. Closed is the common sense view, but most people w
The problem with this position is that the Black Hole Era—at least, the way the “Five Ages of the Universe” article you link to defines it—only starts after proton decay has run to (effective) completion,[1] which means that all matter will be in black holes, which means that conscious beings will not exist to farm black holes for their energy. (I do, however, agree that life is in theory not dependent on luminous stars, and so life could continue beyond the Stelliferous Era and into the Degenerate Era, which adds many years.)
Whether proton decay wi
I just came across this old comment by Wei Dai which has aged well, for unfortunate reasons.
I think a healthy dose of moral uncertainty (and normative uncertainty in general) is really important to have, because it seems pretty easy for any ethical/social movement to become fanatical or to incur a radical element, and end up doing damage to itself, its members, or society at large. (“The road to hell is paved with good intentions” and all that.)
Perhaps this old comment from Rohin Shah could serve as the standard link?
(Note that it’s on the particular case of recommending people do/don’t work at a given org, rather than the general case of praise/criticism, but I don’t think this changes the structure of the argument other than maybe making point 1 less salient.)
Excerpting the relevant part:
...