All of Will Aldred's Comments + Replies

Good post, and I strongly agree. My preferred handle for what you’re pointing at is ‘integrity’. Quoting @Habryka (2019):

I think of integrity as a more advanced form of honesty […] Where honesty is the commitment to not speak direct falsehoods, integrity is the commitment to speak truths that actually ring true to yourself, not ones that are just abstractly defensible to other people. It is also a commitment to act on the truths that you do believe, and to communicate to others what your true beliefs are.

(In this frame, What We Owe the Future, for exa... (read more)

3
Jamie_Harris
Wait why was What We Owe the Future low integrity (according to this definition of integrity, which strikes me as an unusual usage, for what it's worth)?

There was some discussion of this a few months ago: see here.

Although, maybe your main point—which I agree the existing discussion doesn’t really have answers to—is, “How, if at all, should we be getting ahead of things and proactively setting a framing for the social media conversation that will surely follow (as opposed to just forming some hypotheses over what that conversation will be like, but not particularly doing anything yet)? Who within our community should lead these efforts? How high priority is this compared to other forms of improvi... (read more)

[sorry I’m late to this thread]

@William_MacAskill, I’m curious which (if any) of the following is your position?

1.

“I agree with Wei that an approach of ‘point AI towards these problems’ and ‘listen to the AI-results that are being produced’ has a real (>10%? >50%?) chance of ending in moral catastrophe (because ‘aligned’ AIs will end up (unintentionally) corrupting human values or otherwise leading us into incorrect conclusions).

And if we were living in a sane world, then we’d pause AI development for decades, alongside probably engagi... (read more)

3
Wei Dai
See also this post, which occurred to me after writing my previous reply to you.
4
Wei Dai
The answer would depend a lot on what the alignment/capabilities profile of the AI is. But one recent update I've made is that humans are really terrible at strategy (in addition to philosophy), so if there was no way to pause AI, it would help a lot to get good strategic advice from AI during crunch time, which implies that maybe AI strategic competence > AI philosophical competence in importance (subject to all the usual disclaimers like dual use and how to trust or verify its answers). My latest LW post has a bit more about this. (By "strategy" here I especially mean "grand strategy" or strategy at the highest levels, which seems more likely to be neglected, versus "operational strategy" or strategy involved in accomplishing concrete tasks, which AI companies are likely to prioritize by default.)

So for example if we had an AI that's highly competent at answering strategic questions, we could ask it "What questions should I be asking you, or what else should I be doing with my $1B?" (but this may have to be modified based on things like how much we can trust its answers of various kinds, how good it is at understanding my values/constraints/philosophies, etc.). If we do manage to get good and trustworthy AI advice this way, another problem would be how to get key decision makers (including the public) to see and trust such answers, as they wouldn't necessarily think to ask such questions themselves nor by default trust the AI answers. But that's another thing that a strategically competent AI could help with.

BTW, your comment made me realize that it's plausible that AI could accelerate strategic thinking and philosophical progress much more relative to science and technology, because the latter could become bottlenecked on feedback from reality (e.g., waiting for experimental results) whereas the former seemingly wouldn't be. I'm not sure what implications this has, but want to write it down somewhere. One thought I have here is that AIs could give very differe

Nice!

One quibble: IMO, the most important argument within ‘economic dominance,’ which doesn’t appear in your list (nor really in the body of your text), is Wei Dai’s ‘AGI will drastically increase economies of scale’.

4
rosehadshar
Thanks for the quibble, seems big if true! And agreed it is not something that I was tracking when writing the article. A few thoughts:

* I am fairly unsure if the economies of scale point is actually right. Some reasons for doubt:
  * Partly I'm thinking of Drexler's CAIS arguments and intuitions that ecosystems of different specialised systems will outcompete monoculture.
  * Partly I'm looking at AI development today.
  * Partly the form of the economies of scale argument seems to be 'one constraint on human economies of scale is coordination costs between humans. So if those are removed, economies of scale will go to infinity!' But there may well be other trade-offs that you reach at higher levels. For example, I'd expect that you lose out on things like creativity/innovation, and that you run higher risks of correlated failures, vulnerabilities, etc.
* Assuming it is true, it doesn't seem like the most important argument within economic dominance to me:
  * The most natural way of thinking about it for me is that AGI increasing economies of scale is a subset of outgrowing the world (where the general class is 'having better AI enables growing to most of the economy', and the economies of scale sub-class is 'doing that via using copies of literally the same AI, such that you get more economies of scale').
  * Put another way, I think the economies of scale thing only leads to extreme power concentration in combination with a big capabilities gap. If lots of people have similarly powerful AI systems, and can use them to figure out that they'd be best off by using a single system to do everything, then I don't see any reason why one country would dominate. So it doesn't seem like an independent route to me, it's a particular form of a route that is causally driven by another factor.

Interested in your takes here!
Will Aldred (Moderator Comment)

Mod here. It looks like this thread has devolved into a personal dispute with only tangential relevance to EA. I’m therefore locking the thread.

Those involved, please don’t try to resurrect the dispute elsewhere on this forum; we may issue bans if we see that happening.

A moderator has deactivated replies on this comment

Richard’s ‘Coercion is an adaptation to scarcity’ post and follow-up comment talk about this (though ofc maybe there’s more to Richard’s view than what’s discussed there). Select quotes:

What if you think, like I do, that we live at the hinge of history, and our actions could have major effects on the far future—and in particular that there’s a significant possibility of existential risk from AGI? I agree that this puts us in more of a position of scarcity and danger than we otherwise would be (although I disagree with those who have very high cre

... (read more)

"I don't want to encourage people to donate (even to the same places as I did) unless you already have a few million dollars in assets"

I do see advantages of the abundance mindset, but your threshold is extremely high - it excludes nearly everyone in developed countries, let alone the world. Plenty of people without millions of dollars of assets have an abundance mindset (including myself).

FYI readers, here is Habryka’s response to this post over on LessWrong, if you haven’t seen it.

Relatedly, @MichaelDickens shallow-reviewed Horizon just under a year ago—see here.[1] Tl;dr: Michael finds that Horizon’s work isn’t very relevant to x-risk reduction; Michael believes Horizon is net-negative for the world (credence: 55%).

(On the other hand, it was Eth, Perez and Greenblatt—i.e., people whose judgement I respect—who recommended donating to Horizon in that post Mikhail originally commented on. So, I overall feel unsure about what to think.)

  1. ^

    See also ensuing discussion here.

6
MichaelDickens
I've seen a number of people I respect recommend Horizon, but I've never seen any of them articulate a compelling reason why they like it. For example in that comment you linked in the footnote, I found the response pretty unpersuasive (which is what I said in my follow-up comment, which got no reply). Absence of evidence is evidence of absence, but I have to weigh that against the fact that so many people seem to like Horizon. A couple weeks ago I tried reaching out to Horizon to see if they could clear things up, but they haven't responded. Although even if they did respond, I made it apparent that the answer I'm looking for is "yes Horizon is x-risk-pilled", and I'm sure they could give that answer even if it's not true.

Fyi, the Forum team has experimented with LLMs for tagging posts (and for automating some other tasks, like reviewing new users), but so far none have been accurate enough to rely on. Nonetheless, I appreciate your comment, since we weren’t really tracking the transparency/auditing upside of using LLMs.

6
Owen Cotton-Barratt
That makes sense!  (I'm curious how much you've invested in giving them detailed prompts about what information to assess in applying particular tags, or even more structured workflows, vs just taking smart models and seeing if they can one-shot it; but I don't really need to know any of this.)
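For readers wondering what a "more structured workflow" for tag-applying might look like, here is a purely hypothetical sketch; the `call_llm` helper and the tag rubrics are assumptions for illustration, not the Forum team's actual setup:

```python
# Hypothetical sketch of a "more structured workflow" for LLM tag-applying:
# rather than asking a model to one-shot all tags, assess each candidate tag
# separately against a short written rubric, and only apply tags it affirms.
# `call_llm` is an assumed stand-in for whatever chat-completion client is
# actually used; the tag rubrics below are illustrative, not the real ones.

from typing import Callable, Dict, List

TAG_RUBRICS: Dict[str, str] = {
    "AI safety": "Apply if the post is primarily about risks from advanced AI or their mitigation.",
    "Animal welfare": "Apply if the post is primarily about the wellbeing of non-human animals.",
    "Forum meta": "Apply if the post is about the EA Forum itself (features, norms, moderation).",
}

def tag_post(post_text: str, call_llm: Callable[[str], str]) -> List[str]:
    """Return the subset of tags whose rubric the model affirms for this post."""
    applied: List[str] = []
    for tag, rubric in TAG_RUBRICS.items():
        prompt = (
            f"Rubric for the tag '{tag}': {rubric}\n\n"
            f"Post:\n{post_text[:4000]}\n\n"
            "Does the rubric apply to this post? Answer YES or NO."
        )
        if call_llm(prompt).strip().upper().startswith("YES"):
            applied.append(tag)
    return applied
```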

Beyond the specifics (which Vasco goes into in his reply): These tweets are clearly not serious/principled/good-faith criticisms. If we are constantly moderating what we say to try to make sure that we don’t possibly give trolls any ammunition, then our discourse is forever at the mercy of those most hostile to the idea of doing good better. That’s not a good situation to be in. Far better, I say, to ignore the trolling.

-1
Henry Howard🔸
Saying crazy but philosophically valid things is fine as long as it's useful. Many of our current morals would have looked crazy 300 years ago, so I'm glad people spoke up. Nematode welfare is not a productive conversation. The conclusions are clearly not tenable, the uncertainties too broad, the key questions (is a nematode life net good or bad?) unanswerable. What is the purpose?
4
NickLaing
I think this is a false binary. For sure we don't have to always be at the "mercy" of trolls, but we can be wise about what ideas to toss into any given public sphere at any point in time.

I agree with ‘within dedicated discussions and not on every single animal welfare post,’ and I think Vasco should probably take note, here.

However, I’m not really sure what you mean by reputational risk—whose reputation is at risk?

Generally speaking, I very much want people to be saying what they honestly believe, both on this forum and elsewhere. Vasco honestly believes that soil animal welfare outweighs farmed animal welfare, and he has considered arguments for why he believes this, and so I think it’s valuable for him to say the things he says (so long ... (read more)

2
Vasco Grilo🔸
Thanks, Will. I definitely agree I should not be commenting about soil animals on every post about animal welfare. I have not been doing this, although I think most people would like me to bring up soil animals less frequently. I have been trying to focus on more prominent posts, and ones from people who I think may be more open to it.

"if soil animals are conscious". Nitpick: certainty of consciousness is not needed. An (expected) welfare per animal-year which is not very low is enough, and I suppose this follows from a probability of sentience which is not very low. By sentience, I mean experiencing positive or negative experiences. Consciousness includes neutral experiences, so it does not necessarily imply sentience.

I am confident effects on soil animals matter for people endorsing something like the welfare ranges presented in Table 8.6 of Bob Fischer's book about comparing animal welfare across species. I estimate effects on soil animals would still be much larger than those on the target beneficiaries for a welfare per animal-year of exactly 0 for animals with fewer neurons than those considered in Bob's book, and welfare per animal-year for animals with at least as many neurons as shrimp (the animal with the least neurons for which the welfare range is estimated in the book) proportional to "number of neurons as a fraction of that of humans"^0.19, which explains 78.6 % of the variance in the estimates for the welfare range presented in the book. I calculate soil ants and termites have 2.91 and 1.16 times as many neurons as shrimp, so effects on them would still be relevant.

I get the following increase in the welfare of soil ants and termites as a fraction of the increase in the welfare of the target beneficiaries for an exponent of 0.19 (the chicken welfare corporate campaigns would decrease animal welfare):
* For cage-free corporate campaigns, -20.4.
* For buying beef, 3.31 M.
* For broiler welfare corporate campaigns, -321.
* For GiveWel
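To spell out the neuron-count scaling Vasco describes, here is a minimal sketch; the 0.19 exponent and the shrimp-relative neuron ratios are taken from his comment, and everything else is illustrative:

```python
# Minimal sketch of the scaling in the comment above: welfare range taken to be
# proportional to (neuron count)^0.19, expressed relative to shrimp so that only
# the neuron ratios Vasco states are needed.

EXPONENT = 0.19  # fit cited in the comment (explains 78.6% of variance)

# Neuron counts relative to shrimp, as stated in the comment.
neurons_relative_to_shrimp = {
    "shrimp": 1.0,
    "soil ant": 2.91,
    "termite": 1.16,
}

for animal, ratio in neurons_relative_to_shrimp.items():
    welfare_range_vs_shrimp = ratio ** EXPONENT
    print(f"{animal}: welfare range ~ {welfare_range_vs_shrimp:.2f}x shrimp's")

# Because the exponent is small, a ~3x difference in neuron count changes the
# implied welfare range by only ~20%, which is why ants and termites remain
# relevant under this model.
```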

By reputational risk I mean that an organization like Hive, whose stated purpose is to be "Your global online hub for farmed animal advocates," could be undermined by their platform being spammed with arguments suggesting helping farmed animals is a bad idea.

I don't think discussions about whether what your entire platform is doing is even net positive are best had on an organizational Slack forum. It could demotivate people who are passionate about helping farmed animals.

Perhaps very uncertain philosophical questions can be discussed on other forums, t... (read more)

Just want to quickly flag that you seem to have far more faith in superforecasters’ long-range predictions than do most people who have worked full-time in forecasting, such as myself.

@MichaelDickens’ ‘Is It So Much to Ask?’ is the best public writeup I’ve seen on this (specifically, on the problems with Metaculus’ and FRI XPT’s x-risk/extinction forecasts, which are cited in the main post above). I also very much agree with:

Excellent forecasters and Superforecasters™ have an imperfect fit for long-term questions

Here are some reasons why we might expect lo

... (read more)
2
Arepo
You might be right re forecasting (though someone willing in general to frequently bet on 2% scenarios manifesting should fairly quickly outperform someone who frequently bets against them - if their credences are actually more accurate).  The two jobs you mention only refer to 'loss of control' as a single concern among many - 'risks with security implications, including the potential of AI to assist with the development of chemical and biological weapons, how it can be used to carry out cyber-attacks, enable crimes such as fraud, and the possibility of loss of control.' I'm not claiming that these orgs don't or shouldn't take the lesser risks and extreme tail risks seriously (I think they should and do), but denying the claim that people who 'think seriously' about AI risks necessarily lean towards high extinction probabilities.

There’s an old (2006) Bostrom paper on ~this topic, as well as Yudkowsky’s ‘Anthropic Trilemma’ (2009) and Wei Dai’s ‘Moral Status of Independent Identical Copies’ (2009). Perhaps you’re remembering one of them?

(Bostrom disagrees with the second paragraph you cite, as far as I can tell. He writes: ‘If a brain is duplicated so that there are two brains in identical states, are there then two numerically distinct phenomenal experiences or only one? There are two, I argue.’)

Will Aldred: 50% agree

I don’t know much about nematodes, mites or springtails in particular, but I agree that, when thinking about animal welfare interventions, one should be accounting for effects on wild animals.

(As Vasco says, these effects plausibly reverse the sign of factory farming—especially cattle farming—from negative to positive. I’m personally quite puzzled as to why this isn’t a more prominent conversation/consideration amongst the animal welfare community. (Aside from Vasco’s recent work, has ~any progress been made in the decade since Shulman and Tomasi... (read more)

5
Vasco Grilo🔸
Thanks, Will. There is a series from @Michael St Jules 🔸 on human impacts on animals.

This post did generate a lot of pushback. It has more disagree votes than agree votes, the top comment by karma argues against some of its claims and is heavily upvoted and agree-voted, and it led to multiple response posts including one that reaches the opposite conclusion and got more karma & agree votes than this one.

I agree that this somewhat rebuts what Raemon says. However, I think a large part of Raemon’s point—which your pushback doesn’t address—is that Bentham’s post still received a highly positive karma score (85 when Raemon came u... (read more)

[resolved]

Meta: I see that this poll has closed after one day. I think it would make sense for polls like this to stay open for seven days, by default, rather than just one?[1] I imagine this poll would have received another ~hundred votes, and generated further good discussion, had it stayed open for longer (especially since it was highlighted in the Forum Digest just two hours prior).

@Sarah Cheng

  1. ^

    I’m unsure if OP meant for this poll to close so soon. Last month, when I ran some polls, I found that a bunch of them ended up closing after the

... (read more)
6
Sarah Cheng 🔸
Ugh I agree yeah, thanks for flagging this! I re-opened the poll by manually updating it in the db, and we should increase the default duration of polls.

Yeah, thanks for pointing this out. With the benefit of hindsight, I’m seeing that there are really three questions I want answers to:

1.  Have you been voting in line with the guidelines (whether or not you’ve literally read them)?

2a. Have you literally read the guidelines? (In other words, have we succeeded in making you aware of the guidelines’ existence?)

2b. If you have read the guidelines, to what extent can you accurately recall them? (In other words, conditional on you knowing the guidelines exist, to what extent have we succeeded at drilling th

... (read more)

‘Relevant error’ is just meant to mean a factual error or mistaken reasoning. Thanks for pointing out the ambiguity, though, we might revise this part.

Thanks, yeah, I like the idea of guidelines popping up while hovering. (Although, I’m unsure if the rest of the team like it, and I’m ultimately not the decision maker.) If going this route, my favoured implementation, which I think is pretty aligned with what you’re saying, is for the popping up to happen in line with a spaced repetition algorithm. That is, often enough—especially at the beginning—that users remember the guidelines, but hopefully not so often that the pop ups become redundant and annoying.
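As a purely illustrative sketch of what such a spaced-repetition schedule could look like (the intervals, growth factor, and cap below are made-up placeholders, not a committed design):

```python
# Hypothetical sketch of a spaced-repetition schedule for the guidelines pop-up:
# show it often at first, then back off each time it is shown, so reminders fade
# out for users who have seen the guidelines repeatedly. All constants are
# made-up placeholders, not a committed design.

from datetime import datetime, timedelta
from typing import Optional

BASE_INTERVAL = timedelta(days=1)  # first reminder ~a day after signing up
GROWTH_FACTOR = 3                  # each showing triples the next interval
MAX_REMINDERS = 5                  # stop reminding after a handful of showings

def should_show_guidelines(times_shown: int,
                           last_shown: Optional[datetime],
                           now: datetime) -> bool:
    """Return True if the voting-guidelines pop-up is due for this user."""
    if times_shown >= MAX_REMINDERS:
        return False
    if last_shown is None:
        return True
    next_due = last_shown + BASE_INTERVAL * (GROWTH_FACTOR ** times_shown)
    return now >= next_due
```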

Before reading this quick take, how familiar were you with this forum’s voting guidelines?
[Poll responses shown on a scale from "not at all" to "very"]

The Forum moderation team (which includes myself) is revisiting thinking about this forum’s norms. One thing we’ve noticed is that we’re unsure to what extent users are actually aware of the norms. (It’s all well and good writing up some great norms, but if users don’t follow them, then we have failed at our job.)

Our voting guidelines are of particular concern,[1] hence this poll. We’d really appreciate you all taking part, especially if you don’t usually take part in polls but do take part in voting. (We worry that the ‘silent majority’ of... (read more)

2
NickLaing
What does "there's a relevant error" mean exactly for downvite?
5
akash 🔸
Random idea: for new users and/or users with less than some threshold level of karma and/or users who use the forum infrequently, Bulby pops up with a little banner that contains a tl;dr on the voting guidelines. Especially good if the banner pops up when a user hovers their cursor over the voting buttons. 
5
Isaac Dunn
I wasn't sure if I was, but reading the guidelines matched my guess of what they would say, so I think I was familiar with them. 

Nice post (and I only saw it because of @sawyer’s recent comment—underrated indeed!). A separate, complementary critique of the ‘warning shot’ idea, made by Gwern (in reaction to 2023’s BingChat/Sydney debacle, specifically), comes to mind (link):

One thing that the response to Sydney reminds me of is that it demonstrates why there will be no 'warning shots' (or as Eliezer put it, 'fire alarm'): because a 'warning shot' is a conclusion, not a fact or observation.

One man's 'warning shot' is just another man's "easily patched minor bug of no im

... (read more)

Hmm, I think there’s some sense to your calculation (and thus I appreciate you doing+showing this calculation), but the $6.17 conclusion—specifically, “engagement time would drop significantly if users had to pay 6.17 $ per hour they spend on the EA Forum, which suggests the marginal cost-effectiveness of running the EA Forum is negative”—strikes me as incorrect.

What matters is by how much engaging with the Forum raises altruistic impact, which, insofar as this impact can be quantified in dollars, is far, far higher than what one would be willing and ... (read more)

4
Vasco Grilo🔸
Thanks for sharing your thoughts, Will! I agree what matters is the additional altruistic impact caused by engaging with the Forum. However, I think my point holds as long as people have accurate views about how to maximise their altruistic impact.

For example, if you believed "factual impact of your marginal hour on the Forum" - "counterfactual impact of this hour" < "impact of donating 100 $[1] to the organisation or project you consider the most cost-effective", and using the Forum cost 100 $/h, I think you would have a greater altruistic impact by your own lights by spending less time on the Forum, and donating the savings. Do you agree?

Analogously, if the user spending the marginal hour on the Forum believed "factual impact of their marginal hour on the Forum" - "counterfactual impact of this hour" < "impact of donating 6.17 $ to the organisations or projects they consider the most cost-effective", and using the Forum cost 6.17 $, I think they would have a greater altruistic impact by their own lights by spending less time on the Forum, and donating the savings. In this case, if the marginal user-hour cost 6.17 $ to the Forum team[2], I believe they would also increase altruistic impact in the eyes of the user of the marginal hour by spending less time generating engagement on the Forum, and donating the savings to the organisations or projects that user considers the most cost-effective.

  1. ^ Implying the marginal cost-effectiveness of your time on the Forum is 25 % (= 100/400) of the past cost-effectiveness.
  2. ^ I guess it costs more due to increasing user-hours becoming more difficult as user-hours increase.
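To make the decision rule above concrete, here is a minimal sketch; the $6.17/hour figure comes from Vasco's estimate, while the impact numbers in the example are made-up placeholders:

```python
# Minimal sketch of the decision rule Vasco describes: spend the marginal hour
# on the Forum only if its net impact (Forum impact minus the counterfactual
# use of that hour) exceeds the impact of donating what that hour costs to run.
# All dollar-equivalent impact figures below are illustrative placeholders.

COST_PER_USER_HOUR = 6.17  # $/hour, from Vasco's estimate in the post above

def marginal_hour_worthwhile(forum_hour_impact: float,
                             counterfactual_hour_impact: float,
                             impact_per_dollar_donated: float) -> bool:
    """True if the marginal Forum hour beats 'skip it and donate the cost'."""
    net_forum_impact = forum_hour_impact - counterfactual_hour_impact
    donation_impact = COST_PER_USER_HOUR * impact_per_dollar_donated
    return net_forum_impact > donation_impact

# Example with made-up numbers: an hour on the Forum worth $30 of impact,
# versus $20 for the next-best use of that hour, and $2 of impact per $1
# donated to the user's preferred organisation.
print(marginal_hour_worthwhile(30.0, 20.0, 2.0))  # False: 10 < 12.34
```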

Note: Long-time power user of this forum, @NunoSempere, has just rebooted the r/forecasting subreddit. How that goes could give some info re. the question of “to what extent can a subreddit host the kind of intellectual discussion we aim for?”

(I’m not aware of any subreddits that meet our bar for discussion, right now—and I’m therefore skeptical that this forum should move to Reddit—but that might just be because most subreddit moderators aren’t aiming for the same things as this forum’s moderators. r/forecasting is an interesting experiment beca... (read more)

Relevant reporting from Sentinel earlier today (May 19):

Forecasters estimated a 28% chance (range, 25-30%) that the US will pass a 10-year ban on states regulating AI by the end of 2025.

28% is concerningly high—all the more reason for US citizens to heed this post’s call to action and get in touch with your Senators. (Thank you to those who already have!)

(Current status is: “The bill cleared a key hurdle when the House Budget Committee voted to advance it on Sunday [May 18] night, but it still must undergo a series of votes in the House befo... (read more)

Inspired by the last section of this post (and by a later comment from Mjreard), I thought it’d be fun—and maybe helpful—to taxonomize the ways in which mission or value drift can arise out of the instrumental goal of pursuing influence/reach/status/allies:

Epistemic status: caricaturing things somewhat

Never turning back the wheel

In this failure mode, you never lose sight of how x-risk reduction is your terminal goal. However, in your two-step plan of ‘gain influence, then deploy that influence to reduce x-risk,’ you wait too long to move onto ste... (read more)

6
Mjreard
Great write up. I think all three are in play and unfortunately kind of mutually reinforcing, though I'm more agnostic about how much of each. 

For what it’s worth, I find some of what’s said in this thread quite surprising.

Reading your post, I saw you describing two dynamics:

  1. Principles-first EA initiatives are being replaced by AI safety initiatives
  2. AI safety initiatives founded by EAs, which one would naively expect to remain x-risk focused, are becoming safety-washed (e.g., your BlueDot example)

I understood @Ozzie’s first comment on funding to be about 1. But then your subsequent discussion with Ozzie seems to also point to funding as explaining 2.[1]

While Open Phil has opinions within AI s... (read more)

I think OP and grantees are synced up on xrisk (or at least GCRs) being the terminal goal. My issue is that their instrumental goals seem to involve a lot of deemphasizing that focus to expand reach/influence/status/number of allies in ways that I worry lend themselves to mission/value drift. 

Do those other meditation centres make similarly extreme claims about the benefits of their programs? If so, I would be skeptical of them for the same reasons. If not, then the comparison is inapt.

Why would the comparison be inapt?

A load-bearing piece of your argument (insofar as I’ve understood it) is that most of the benefit of Jhourney’s teachings—if Jhourney is legit—can be conferred through non-interactive means (e.g., YouTube uploads). I am pointing out that your claim goes against conventional wisdom in this space: these other meditation centres bel... (read more)

2
Yarrow Bouchard 🔸
Sorry, this is an incredibly late reply in a (by Internet standards) ancient comment thread.

My point is about differentiation. If Jhourney is saying their work confers benefits on approximately the same level as the many meditation centres you can find all over the place, then I have no qualms with that claim. If Jhourney, or someone else, is saying that Jhourney's work confers benefits far, far higher than any or almost any other meditation centre or retreat on Earth, then I'm skeptical about that.

Transcendental Meditation or TM is an organization that claims far, far higher benefits from its techniques than other forms of meditation, insists on in-person teaching, and charges a very high fee. It's viewed by some people as essentially a scam and some people as a sort of luxury product that is not particularly differentiated from the commodity product. I'm not saying Jhourney is like Transcendental Meditation, I'm just noting that similar claims have been made in the area of meditation before with a clear financial self-interest to make these claims, and the claims have not been borne out. So, there is a certain standard of evidence a company like Jhourney has to rise above, a certain level of warranted skepticism it has to overcome.

What is the interactive or personalized aspect of the online “retreats”? Why couldn’t they be delivered as video on-demand (like a YouTube playlist), audio on-demand (like a podcast), or an app like Headspace or 10% Happier?

I mean, Jhourney is far from the only organisation that offers online retreats. Established meditation centres like Gaia House, Plum Village, and Deconstructing Yourself—to name but a few—all offer retreats online (as well as in person).

If Jhourney’s house blend of jhana meditation makes you more altruistic, why wouldn’t th

... (read more)
2
Yarrow Bouchard 🔸
Do those other meditation centres make similarly extreme claims about the benefits of their programs? If so, I would be skeptical of them for the same reasons. If not, then the comparison is inapt.

If I had developed a meditation program that I really thought did what Jhourney is claiming their meditation program does, I would not be approaching it this way. I would try to make the knowledge as widely accessible as I could as quickly as possible. Jhourney has been doing retreats for over two years. What's the hold up?

Transcendental Meditation (TM)'s stated justification for their secrecy and high prices is that TM requires careful, in-person, one-on-one instruction. What's Jhourney's justification for not making instructional videos or audio recordings that anyone can buy for, say, $70? Could it be just commercial self-interest? But, in that case, why hasn't the jhana meditation encouraged them to prize altruism more? Isn't that supposed to be one of the effects?

I'm willing to make some allowance for personal self-interest and for the self-interest of the business, of course. But selling $70 instructional materials to millions of people would be a good business. And the Nobel Peace Prize comes with both a $1 million cash prize and a lot of fame and acclaim. Similarly, the Templeton Prize comes with $1.4 million in cash and some prestige. There are other ways to capitalize on fame and esteem, such as through speaking engagements. So, sharing a radical breakthrough in jhana meditation with the world has strong business incentives and strong personal self-interest incentives. Why not do it?

The simplest explanation is that they don't actually have the "product" they're claiming to have. Or, to put it another way, the "product" they have is not as differentiated from other meditation programs as they're claiming and does not reliably produce the benefits they're claiming it reliably produces.

I’m not Holly, but my response is that getting a pause now is likely to increase, rather than decrease, the chance of getting future pauses. Quoting Evan Hubinger (2022):

In the theory of political capital, it is a fairly well-established fact that ‘Everybody Loves a Winner.’ That is: the more you succeed at leveraging your influence to get things done, the more influence you get in return. This phenomenon is most thoroughly studied in the context of the ability of U.S. presidents to get their agendas through Congress—contrary to a naive mode

... (read more)

It seems like I interpreted this question pretty differently to Michael (and, judging by the votes, to most other people). With the benefit of hindsight, it probably would have been helpful to define what percentage risk the midpoint (between agree and disagree) corresponds to?[1] Sounds like Michael was taking it to mean ‘literally zero risk’ or ‘1 in 1 million,’ whereas I was taking it to mean 1 in 30 (to correspond to Ord’s Precipice estimate for pandemic x-risk).

(Also, for what it’s worth, for my vote I’m excluding scenarios where a misaligne... (read more)

2
Toby Tremlett🔹
This is helpful. If this was actually for a debate week, I'd have made it 'more than 5% extinction risk this century' and (maybe) excluded risks from AI.

Meta: I’m seeing lots of blank comments in response to the DIY polls. Perhaps people are thinking that they need to click ‘Comment’ in order for their vote to count? If so, PSA: your vote counted as soon as you dropped your slider. You can simply close the pop-up box that follows if you don’t also mean to leave a comment.

Happy voting!

2
Toby Tremlett🔹
I've let @Will Howard🔹 know - people probably don't see the cross/ don't intuitively see the cross as doing what it does. 
Will Aldred: 89% agree

Consequentialists should be strong longtermists

For me, the strongest arguments against strong longtermism are simulation theory and the youngness paradox (as well as yet-to-be-discovered crucial considerations).[1]

(Also, nitpickily, I’d personally reword this poll from ‘Consequentialists should be strong longtermists’ to ‘I am a strong longtermist,’ because I’m not convinced that anyone ‘should’ be anything, normatively speaking.)

  1. ^

    I also worry about cluelessness, though cluelessness seems just as threatening to neartermist interventions

... (read more)
4
Toby Tremlett🔹
I'm a pretty strong anti-realist but this is one of the strongest types of shoulds for me. I.e. 'If you want to achieve the best consequences, then you should expect the majority of affectable consequences to be in the far future.' Seems like the kind of thing that could be true or false on non-normative grounds, and would normatively ground a 'should' if you are already committed to consequentialism. In the sense that believing "I should get to Rome as fast as possible" and "The fastest way to get to Rome is to take a flight" grounds a 'should' for "I should take a flight to Rome".

[Good chance you considered my idea already and rejected it (for good reason), but stating it in case not:]

For these debate week polls, consider dividing each side up into 10 segments, rather than 9? That way, when someone votes, they’re agreeing/disagreeing by a nice, round 10 or 20 or 30%, etc., rather than by the kinda random amounts (at present) of 11, 22, 33%?
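(To spell out the arithmetic: with 9 segments per side each notch is 100/9 ≈ 11.1%, whereas with 10 segments each notch is a round 10%. A quick illustrative check, purely as a sketch:)

```python
# Quick check of the slider increments, assuming each side of the agree/disagree
# scale is divided into equal segments running up to 100%.
for n_segments in (9, 10):
    steps = [round(100 * i / n_segments, 1) for i in range(1, n_segments + 1)]
    print(n_segments, steps)
# 9  -> [11.1, 22.2, 33.3, ..., 100.0]
# 10 -> [10.0, 20.0, 30.0, ..., 100.0]
```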

2
Will Howard🔹
Good idea, I hadn't thought of that! I've changed it to 10 like you suggested (will be deployed in ~10 mins at time of writing)

I think Holly’s claim is that these people aren’t really helping from an ‘influencing the company to be more safety conscious’ perspective, or a ‘solving the hard parts of the alignment problem’ perspective. They could still be helping the company build commercially lucrative AI.

Hmm, I’m not a fan of this Claude summary (though I appreciate your trying). Below, I’ve made a (play)list of Habryka’s greatest hits,[1] ordered by theme,[2][3] which might be another way for readers to get up to speed on his main points:

Leadership

Reputation[5]

Funding

Impact

  1. ^

    ‘Great

... (read more)

I’m not sure if this hits what you mean by ‘being ineffective to be effective’, but you may be interested in Paul Graham’s ‘Bus ticket theory of genius’.

Will Aldred (Moderator Comment)

The moderation team is issuing @Eugenics-Adjacent a 6-month ban for flamebait and trolling.

I’ll note that Eugenics-Adjacent’s posts and comments have been mostly about pushing against what they see as EA groupthink. In banning them, I do feel a twinge of “huh, I hope I’m not making the Forum more like an echo chamber.” However, there are tradeoffs at play. “Overrun by flamebait and trolling” seems to be the default end state for most internet spaces: the Forum moderation team is committed to fighting against this default.

All in all, we think the ... (read more)

I’m also announcing this year’s first debate week! We'll be discussing whether, on the margin, we should put more effort into reducing the chances of avoiding human extinction or increasing the value of futures where we survive.

Nice! A couple of thoughts:

1.

In addition to soliciting new posts for the debate week, consider ‘classic repost’-ing relevant old posts, especially ones that haven’t been discussed on the Forum before?

Tomasik’s ‘Risks of astronomical future suffering’ comes to mind, as well as Assadi’s ‘Will humanity choose its future?’ and Anthis’s ... (read more)

+1. I appreciated @RobertM’s articulation of this problem for animal welfare in particular:

I think the interventions for ensuring that animal welfare is good after we hit transformative AI probably look very different from interventions in the pretty small slice of worlds where the world looks very boring in a few decades.

If we achieve transformative AI and then don’t all die (because we solved alignment), then I don’t think the world will continue to have an “agricultural industry” in any meaningful sense (or, really, any other traditional industry;

... (read more)
2
Tristan Katz
Animal welfare guy tuning in. My own take is that the majority of the world actually is almost entirely indifferent about animal suffering, so if AI tries to reflect global values (not just the values of the progressive, elite Silicon Valley bubble) there is a real risk that it will be indifferent to animal suffering. Consider how foie gras is still legal in most countries, or bullfighting, both of which are totally unnecessary. And those are just examples from western countries. I think it's very likely that TAI will lock in only a very mild concern for animal welfare. Or perhaps, concern for animal welfare in certain contexts (e.g. pets), and none in others (e.g. chickens). Maybe that will lead to a future without factory farming, but it will lead to a future with unnecessary animal suffering nonetheless.

What I'm not sure about is: how do we ensure that TAI locks in a strong valuation of animal welfare? One route is to try to change how much society cares about animal welfare, and hope that TAI then reflects that. I guess this is the hope of many animal advocates. But I admit that seems too slow to work at this stage, so I agree that animal advocates should probably prioritize trying to influence those developing AI right now.

Agree. Although, while the Events dashboard isn’t up to date, I notice that the EAG team released the following table in a post last month, which does have complete 2024 data:

EAG applicant numbers were down 42% from 2022 to 2024,[1] which is a comparable decline to that in monthly Forum users (down 35% from November 2022’s peak to November 2024).[2]

To me, this is evidence that the dropping numbers are driven by changes in the larger zeitgeist rather than by any particular thing the Events or Online team is doing (as @Jason surmises in his... (read more)

6
Sarah Cheng 🔸
Thanks for sharing some data here! I think the picture is more complicated than it seems (isn't it always), though I'm not super confident about that. A couple points:

* I think one relevant factor here is that (I believe) the Events and Groups teams rely more on funding to scale, and so when funding became less available they (I think) made an explicit decision to spend less. Funding doesn't affect Forum usage nearly as much (for example, we've almost always had one content manager on the Forum Team).
* I mentioned in a comment I just wrote earlier that my understanding is that traffic to 80k resources has not declined.
* Actually you may be interested to read my whole comment that I linked to, since I think it adds some context relevant to this thread.

Bug report (although this could very well be me being incompetent!):

The new @mention interface doesn’t appear to take users’ karma into account when deciding which users to surface. This has the effect of showing me a bunch of users with 0 karma, none of whom are the user I’m trying to tag.[1] (I think the old interface showed higher-karma users higher up?)

More importantly, I’m still shown the wrong users even when I type in the full username of the person I’m trying to tag—in this case, Jason. [Edit: I’ve tried @ing some other users, now, and I’ve fo... (read more)

OP has provided very mixed messages around AI safety. They've provided surprisingly little funding / support for technical AI safety in the last few years (perhaps 1 full-time grantmaker?), but they have seemed to provide more support for AI safety community building / recruiting

Yeah, I find myself very confused by this state of affairs. Hundreds of people are being funneled through the AI safety community-building pipeline, but there’s little funding for them to work on things once they come out the other side.[1]

As well as being suboptimal from the ... (read more)

8
Ozzie Gooen
Yea, I think this setup has been incredibly frustrating downstream. I'd hope that people from OP with knowledge could publicly reflect on this, but my quick impression is that some of the following factors happened:

1. OP has had major difficulties/limitations around hiring in the last 5+ years. Some of this is lack of attention, some is that there aren't great candidates, some is a lack of ability. This affected some cause areas more than others. For whatever reason, they seemed to have more success hiring (and retaining talent) for community than for technical AI safety.
2. I think there's been some uncertainties / disagreements about how important / valuable current technical AI safety organizations are to fund. For example, I imagine if this were a major priority for those in charge of OP, more could have been done.
3. OP management seems to be a bit in flux now. Lost Holden recently, hiring a new head of GCR, etc.
4. I think OP isn't very transparent and public with explaining their limitations/challenges publicly.
5. I would flag that there are spots at Anthropic and Deepmind that we don't need to fund, that are still good fits for talent.
6. I think some of the Paul Christiano-connected orgs were considered a conflict-of-interest, given that Ajeya Cotra was the main grantmaker.
7. Given all of this, I think it would be really nice if people could at least provide warnings about this. Like, people entering the field should be strongly warned that the job market is very limited. But I'm not sure who feels responsible / well placed to do this.

While not a study per se, I found the Huberman Lab podcast episode ‘How Smartphones & Social Media Impact Mental Health’ very informative. (It’s two and a half hours long, mostly about children and teenagers, and references the study(ies) it draws from, IIRC.)

0
Charlie_Guthmann
Ty!

For previous work, I point you to @NunoSempere’s ‘Shallow evaluations of longtermist organizations,’ if you haven’t seen it already. (While Nuño didn’t focus on AI safety orgs specifically, I thought the post was excellent, and I imagine that the evaluation methods/approaches used can be learned from and applied to AI safety orgs.)

4
Mikolaj Kniejski
Thanks! I saw that post. It's an excellent approach. I'm planning to do something similar, but less time-consuming and limited. The range of theories of change that are pursued in AIS is limited and can be broken down into:

* Evals
* Field-building
* Governance
* Research

Evals can be measured by quality and number of evals, and relevance to x-risks. It seems pretty straightforward to differentiate a bad eval org from a good eval org—engaging with major labs, having a lot of evals, and a relation to existential risks. Field-building—having a lot of participants who do awesome things after the project. Research—I argue that the number of citations is also a good proxy for the impact of a paper. It's definitely easy to measure and is related to how much engagement a paper received. In the absence of any work done to bring the paper to the attention of key decision makers, it's very related to the engagement. I'm not sure how to think about governance. Take this with a grain of salt.

EDIT: Also, I think that engaging the broader ML community with AI safety is extremely valuable, and citations tell us if an organization is good at that. Another thing that would be good to review is to ask about the transparency of organizations, how they estimate their own impact, and so on - this space is really unexplored and this seems crazy to me. The amount of money that goes into AI safety is gigantic and it would be worth exploring what happens with it.

I hope in the future there will be multiple GV-scale funders for AI GCR work, with different strengths, strategies, and comparative advantages

(Fwiw, the Metaculus crowd prediction on the question ‘Will there be another donor on the scale of 2020 Good Ventures in the Effective Altruist space in 2026?’ currently sits at 43%.)

Epistemic status: strong opinions, lightly held

I remember a time when an org was criticized, and a board member commented defending the org. But the board member was factually wrong about at least one claim, and the org then needed to walk back wrong information. It would have been clearer and less embarrassing for everyone if they’d all waited a day or two to get on the same page and write a response with the correct facts.

I guess it depends on the specifics of the situation, but, to me, the case described, of a board member making one or two incorrect cl... (read more)

0
David_Moss
I agree that it depends on the situation, but I think this would often be quite a lot worse in real, non-ideal situations. In ideal communicative situations, mistaken information can simply be corrected at minimal cost. But in non-ideal situations, I think one will often see things like:

* Mistaken information gets shared and people spend time debating or being confused about the false information
* Many people never notice or forget that the mistaken information got corrected and it keeps getting believed and shared
* Some people speculate that the mistaken claims weren't innocently shared, but that the board member was being evasive/dishonest
* People conclude that the organization / board is incompetent and chaotic because they can't even get basic facts right

Fwiw, I think different views about this ideal/non-ideal distinction underlie a lot of disagreements about communicative norms in EA.
Answer by Will Aldred

Open Phil has seemingly moved away from funding ‘frontier of weirdness’-type projects and cause areas; I therefore think a hole has opened up that EAIF is well-placed to fill. In particular, I think an FHI 2.0 of some sort (perhaps starting small and scaling up if it’s going well) could be hugely valuable, and that finding a leader for this new org could fit in with your ‘running specific application rounds to fund people to work on [particularly valuable projects].’

My sense is that an FHI 2.0 grant would align well with EAIF’s scope. Quoting fro... (read more)

2
Jason
One possible concern with this idea is that the project would probably take a lot of funding to launch. With Open Phil's financial distancing from EA Funds, my guess is that EAIF may often not be in the ideal position to be an early funder of a seven-figure-a-year project, by which I mean one that comes on board earlier than individual major funders. I can envision some cases in which EAIF might be a better fit for seed funding, such as cases where funding would allow further development or preliminary testing of a big-project proposal to the point it could be better evaluated by funders who can consistently offer mid-six figures plus a year. It's unclear how well that would describe something like the FHI/West proposal, though. I could easily be wrong (or there could already be enough major funder interest to alleviate the first paragraph concern), and a broader discussion about EAIF's comparative advantages / disadvantages for various project characteristics might be helpful in any event.
6
hbesceli
Thanks for the suggestion - I read the proposal a while ago, and hadn't thought about it recently, so it's good to be reminded of it again.

We haven't decided against funding projects like this. (EAIF's grantmaking historically has been very passive - eg. the projects that we end up considering for funding have been determined by the applications we received. And we haven't received any strong applications in the 'FHI of the West' ballpark - at least as far as I'm aware.)

Thanks for clarifying!

Be useful for research on how to produce intent-aligned systems

Just checking: Do you believe this because you see the intent alignment problem as being in the class of “complex questions which ultimately have empirical answers, where it’s out of reach to test them empirically, but one may get better predictions from finding clear frameworks for thinking about them,” alongside, say, high energy physics?

2
Owen Cotton-Barratt
Yep.