TFD

Moderators have a hard job, and I think it can't be entirely on moderation to drive the culture of a website. A lot of the work has to be on the users. Starting with moderation issues though, I have a couple of ideas:

Clarity: I think moderation benefits from simplicity and clarity. It isn't a good sign when you are taking mod action against someone but can't really explain why because it's too difficult or would take too long. I feel like that indicates that the underlying rules/principles aren't really clear or simple enough. It is hard for people to adapt to comply with complicated and unclear rules, and the road to motivated reasoning is also paved with vague principles that are easily applied differently to different situations.

Proportionality: This one goes in both directions. I think sometimes it would be better for mods to step in early but with a lighter touch, something like "this seems to be getting a bit heated/unproductive, friendly reminder to everyone to keep it civil". An ounce of prevention and all that.

For what users can do:

Stick to the topic/don't go meta: Stay grounded in the discussion. Try not to import assumptions based on previous arguments with people vaguely on the same "side" as who you are talking to; focus on their arguments. Try to make the discussion more specific rather than more general. Try not to take your argument in a meta direction: don't talk about what arguments are good in the abstract or focus too much on claiming that the other person's arguments are an example of a general phenomenon. Try to respond to their claims specifically.

Try to stay calm: It is common for people to feel a bit uncomfortable when faced with strong disagreements, including feeling that the person they are talking to is being unfair or mean in some way. The problem is that if both people go along with this feeling it often leads to a bad place. If one person temporarily lapses into a less friendly tone but the other person stays calm, sometimes the conversation is salvageable. Try to be that calm person sometimes, in hopes that when you are the one who isn't calm, the person you are talking to can cover for you a little.

You never have an obligation to respond: I think some of the stress in arguing online comes from a feeling of being trapped, like you will be judged if the person you are talking to gets the last word. We should try to cultivate a culture where it is okay to respond within your own time limitations. If you get the sense the person you are talking to is feeling tapped out, you can openly raise this issue or try to take your foot off the gas a little.

An idea

I will also take this chance to float an idea I've been thinking about recently. I've been calling this "epistolary debates" in my head. The idea is inspired by legal briefs in courts, where the parties submit sequences of written statements that respond to each other.

I think this format, where the "debaters" write longer form content over an extended period of time would be an improvement in a lot of ways. I imagine this as follows:

1. Participants message each other privately to align on a topic and definitions, agreeing in advance on the general parameters of what they would like to discuss, who will write first, and timelines/length goals.
2. First person writes an essay/"opening statement" laying out their position.
3. Second person starts writing their response, aiming to finish within a certain time frame (e.g. a week, a month).
4. First person writes their response to the second person's response.
5. Repeat as desired.

I think a benefit of this format is that extending the time over which the discussion occurs cuts down on stress and helps people set aside snarkier bits that feel right in the moment but aren't a good idea on further reflection. It also lets people work a little more on explaining their views more fully and clearly, and potentially cuts down on the conversation wandering off from the main topic.

ETA: I also agree with your last paragraph that for the specific case of people interested in AI safety/related issues, it is very important to have openness to discussing the topic in a way that isn't dismissive and seriously engages with the arguments, even if you sometimes feel frustrated with how these conversations play out.

Thanks!

Yeah, I'm trying to maintain openness to different possibilities on the time issue to an extent since I don't really know what happened. If I had to venture a guess (which could obviously be wrong), I'd say something like this:

Other forum users who got frustrated with your posts/comments reach out to the mod team privately, mod team has extended discussions amongst themselves, decides on the soft ban, and then reaches out to you to tell you. If this is what happened, I can imagine that it did take a reasonable amount of time, and it's also understandable that the mod team would want to incorporate feedback from users, but I would say this is a mistake on the part of the mod team if that's what happened. If you're talking here about the value of being able to reach out privately, wouldn't this be the time to do that, before going to the soft ban? If you're not making the decision lightly and discussing a lot and writing Google Docs, couldn't you copy some examples of problematic comments or posts from these documents fairly easily?

I think a process where there is a lot of back channelling has a significant risk of filter bubble/echo chamber issues as you mention, similar to what Habryka calls the "LinkedIn attractor" in the post linked above.

I'm generally of the belief that people should work on hashing out their own disagreements more rather than escalating to meta/moderation issues. Taking moderation actions because someone writes a lot of "hard to dispel misunderstandings" seems like it is extremely likely to be filtered through the lens of what a mod already agrees/disagrees with, and so is likely to be unevenly applied against critics.

Yarrow's Quick takes

TFD · 18h · 7

Although I think you and I would have several disagreements on the AI topic, I will put my vote in, to the extent anyone cares, that a ban was not justified in this case. There are things you've written that annoy me or that I'd have said differently, but in general I don't think these are anywhere close to warranting a ban (or frankly the level of downvotes some of your comments have gotten). I also think in several discussions you've been involved in that went unproductively, the people you were responding to or who responded to you are at least equally to blame and sometimes have behaved worse, including clearly uncivil behavior (e.g. clearly intentionally insulting phrasing, unreasonable accusations of bad faith etc.).

Yarrow's Quick takes

TFD · 18h · 19

One section from that post raises the concept of 'Asymmetric effort ratios'. This is definitely part of our moderation decision. At one point, if I remember correctly, you wrote almost a fifth of the words on the Forum in a week. You are very productive of long comments, which are often packed with difficult to dispel misunderstandings. This is part of why a rate-limit was the solution we arrived at. In small doses, you can be a valuable contributor, but without limit, it becomes unfairly taxing on your interlocutors.

I feel like the issue addressed in the Said Saga is somewhat the opposite of what it seems like Yarrow does. A paradigmatic example of the bad behavior attributed to Said was posting extremely short comments, such as just saying "examples?". It's obviously extremely easy to type a one-word comment, and the possibility of adding additional examples of something a poster describes isn't exactly a deep insight. The asymmetry is that it is very easy for Said to post a comment like that but would take a lot of effort for the original poster to respond providing examples, giving rise to the complaint that Said wasn't willing to put in equal effort.

This doesn't seem to be true at all of Yarrow, and producing a large quantity of words on the forum seems like it is actually contrary to the type of behavior that Said was alleged to have engaged in. Part of the issue is that lots of Said's comments were super brief! Yarrow also seems to write plenty of top level posts and many of their comments are on their own posts. Again, this is the exact opposite of part of the issue identified with Said, where a critical part of the asymmetry is about the dynamic of commenting vs posting.

Yarrow also doesn't seem to do the "just asking questions" style that Habryka claims to identify in Said's comments, which I think is also a big part of why karma might not be sufficient. People may be hesitant to downvote comments on a post that are just a question, thus creating a possibility of negative behavior that systematically evades the karma system. But Yarrow seems to be pretty open in their criticism and often writes extended comments (not just one or two sentence questions), and you identify that the issue seems to be present in many of their downvoted comments. Why is the karma system not sufficient in that case? If the concern is posters feeling the need to respond to comments, I think that is mitigated strongly when those comments are downvoted.

In general, it seems like Yarrow actually does many of the things that supposedly would have improved Said's commenting, and the issue is essentially the opposite (writing too much rather than too little).

In both cases it seems like lots of moderator effort was devoted towards a specific user, which I definitely think is a reasonable thing for moderators to react to. At a certain point moderators can't just be expending infinite time and effort on stuff that is going on with one user.

At the same time, it does seem to me that there is another asymmetry present here, where the voting/moderator attention/banning etc. is being deployed more harshly towards someone who is a critic/disagrees with popular ideas in the community vs people who express popular ideas or aren't critical. I think reviewing the context of Yarrow's posts/comments and the reactions/responses to them on the forum strongly suggests this. There are definitely times when Yarrow reacts in ways that I don't see as ideal and which I can understand people having issues with, but many times this is in response to other forum users also behaving questionably, often more egregiously than Yarrow. It seems likely to me that Yarrow being critical of popular ideas in the community is an important factor in the different responses.

Why I care

I feel like I semi-frequently find myself defending people who I probably have significant disagreements with (such as in this case with Yarrow) on here/lesswrong because it seems like there is a tendency within the EA/rationality community to make convoluted meta-arguments for why critics are doing something bad/"insufficiently truthseeking"/"burning the commons"-y, and I think this is bad and not productive. These communities are indeed much more open to criticism than a lot of online communities (where disagreeing with popular ideas will just get you instantly insulted and/or banned), but I still think a more sophisticated defense mechanism against critics is alive and well.

I'm particularly interested/concerned about this because I believe that we will need pretty substantial policy actions on AI, and that this will likely require convincing people who have very different worldviews of certain things about AI. I view conversations on places like the EA forum or lesswrong as good testing grounds for how this might go, but in a place where it should be substantially easier than it would be in the policy arena. In my mind, if people are struggling to have conversations about AI with Yarrow (who is likely going to be more pleasant to discuss this issue with than 90% of people who don't already agree), I don't think that is a good sign for how this community is approaching that challenge of getting a diverse coalition on-side on the AI issue.

You Aren't in Charge of the Overton Window; Politics Is Not Interior Design

TFD · 1mo · -5

The best cause will disappoint you: An intro to the optimisers curse

TFD · 3mo · 2

Good post. I have two general themes I'd like to comment on:

Analogies for cause prioritization

Your analysis covers several perspectives on this phenomenon. If we focus on the "actual performance" perspective, this is pretty similar to multi-armed bandits. One pattern that I think is present in strategies for these types of problems is the idea of spreading out actions across the different possibilities (explore vs exploit and all that). It wouldn't necessarily make sense to commit to one "arm" (or cause) early on when information is low. This "spreading out" across options is one way of dealing with uncertainty.
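To illustrate the explore vs exploit idea, here is a minimal sketch of an epsilon-greedy strategy, which is one simple bandit strategy among many. Everything in it is an assumption made up for illustration: the function name `epsilon_greedy`, the three "true" payoffs, the Gaussian noise, and the 10% exploration rate. It isn't meant as a model of actual cause prioritization.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a bandit with Gaussian rewards (sd = 1).

    true_means are hypothetical payoffs the strategy cannot see directly.
    Returns (pull counts per arm, estimated mean per arm).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running average for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates


# Hypothetical payoffs: the third arm is best, but that is unknown up front.
counts, estimates = epsilon_greedy([1.0, 1.2, 2.0])
print(counts)
print([round(e, 2) for e in estimates])
```

The point of the sketch is that every arm keeps getting some pulls even late in the run, so an early unlucky draw on a good option can still be corrected.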

A similar idea comes up in another potential analogy for cause prioritization, financial investing. We can think about optimizing a portfolio and its allocation to achieve good returns relative to risk, rather than trying to pick the single highest return asset. Thus we get concepts like diversification.

I find this stock-picking analogy helpful for thinking about how "neglectedness" is often treated in practice. I've often found myself skeptical of arguments for and from neglectedness, and I feel the way it is applied in practice doesn't really align with the classic "diminishing returns" conception. I think the way neglectedness is treated in practice ends up being more like how an investor with a high risk tolerance might view a risky asset. Riskier assets are expected to have higher returns; investors with lower risk tolerance would saturate low-risk/high-return options quickly, leaving riskier investments "neglected". Thus an investor with high risk tolerance can find good opportunities that would be unappealing to other less risk tolerant investors by going to higher risk assets. I think this captures the spirit of what "neglected" cause areas have often looked like in EA: more speculative, but where some EAs have a strong feeling that they could have outsized impact.

If I can read between the lines a bit, under this analogy EA pivoting more into AI is kind of like an investor who wants higher returns putting more of their portfolio in small cap growth stocks that are riskier but which the investor thinks will result in higher return. One downside of this is decreased diversification. Another possible option would be to hold a more diversified portfolio but use leverage.
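As a rough worked example of the diversification and leverage point, here is a small sketch. It assumes four uncorrelated assets with identical (made-up) expected returns and volatilities, and it ignores borrowing costs entirely, so the numbers are only illustrative. The function name `portfolio_stats` is hypothetical.

```python
import math


def portfolio_stats(weights, means, stdevs):
    """Expected return and standard deviation of a portfolio.

    Assumes the assets are uncorrelated, so variances simply add.
    Weights summing to more than 1 represent leverage (cost of borrowing ignored).
    """
    exp_return = sum(w * m for w, m in zip(weights, means))
    variance = sum((w * s) ** 2 for w, s in zip(weights, stdevs))
    return exp_return, math.sqrt(variance)


# Hypothetical assets: 8% expected return, 30% volatility each.
means = [0.08] * 4
stdevs = [0.30] * 4

concentrated = portfolio_stats([1.0, 0.0, 0.0, 0.0], means, stdevs)
diversified = portfolio_stats([0.25] * 4, means, stdevs)
leveraged = portfolio_stats([0.5] * 4, means, stdevs)  # 2x leverage

print(concentrated)  # same return as diversified, twice the volatility
print(diversified)
print(leveraged)     # same volatility as concentrated, twice the return
```

Under these assumptions, the diversified portfolio with 2x leverage has the same volatility as holding a single asset but double the expected return, which is the sense in which leverage is an alternative to concentrating in riskier assets. Real assets are correlated and borrowing isn't free, so the actual gap would be smaller.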

In-model vs Out-of-model robustness

The problem is not limited to cases with trials and noisy statistics, because the error does not have to arise from random chance. Problems with assumptions, bad guesses, even math errors will equally get you cursed. If anything, I would expect causes that lack empirical experimental data to be more cursed, not less.

I think this gets at a distinction that is worth calling out, in-model vs out-of-model robustness.

In my experience with cost-benefit analysis, both reading EA related ones and in industry, it is fairly common to propose a "median" scenario and also a "pessimistic" scenario, and provide estimates for these cases. The point is usually that since even the pessimistic scenario looks good, the analysis shows that the proposed intervention is robustly beneficial. This has a two-fold problem:

First, usually the reason to think that the "pessimistic" scenario is "pessimistic" is just that it uses parameter values that reduce the estimated benefit below the "median" scenario. It's unclear sometimes why that means the estimate is robustly lower than the actual benefit. This is the in-model robustness.

Despite the fact that I think this is an issue, sometimes it may be perceived as (or actually be) a somewhat unfair critique. All models are wrong, we have to use what we have to make estimates. This can result in polarized views of what an estimate shows. For a person who likes the intervention and has a gut feeling it is good, the "median" estimate makes a ton of sense and this seems like a very reasonable approach. For a skeptic, it seems prone to over-estimation for the reasons you highlight in the post. Moving the parameters so that your estimate is 25% lower doesn't turn garbage into non-garbage.

However, there is another source of error lurking in the background. What about costs that you haven't included? The potential for the intervention to backfire that isn't considered in any scenario? The hidden assumption that hasn't been tested in the "pessimistic" scenario? This is out-of-model robustness.

I think the polarization when it comes to in-model robustness causes proponents or fans of an idea or intervention to over-estimate robustness even when in-model robustness is high, because they implicitly credit the (perceived) in-model robustness to the out-of-model robustness.

In my view, the whole "rule high stakes in, not out" idea in practice will result in systematically doing this a lot, which I think makes it a bad heuristic for approaching these types of situations. One way to think about this is it encourages us to focus on specific high-volatility "assets" and thus lacks diversification.

Beware of non-evidence-based argumentation

TFD · 4mo · 1

In one of my comments above, I say this:

I will caveat this by saying that in my opinion it makes sense for estimation purposes to discount or shrink estimates of highly uncertain quantities, which I think many advocates of AI as a cause fail to do and can be fairly criticized for. But the issue is a quantitative one, and so can come out either way. I think there is a difference between saying that we should heavily shrink estimates related to AI due to their uncertainty and lower quality evidence, vs saying that they lack any evidence whatsoever.

I feel like my position is consistent with what you have said; I just view this as part of the estimation process. When I say "E[benefits(A)] > E[benefits(B)]" I am assuming these are your best all-inclusive estimates including regularization/discounting/shrinking of highly variable quantities. In fact I think it's also fine to use things other than expected value or in general use approaches that are more robust to outliers/high-variance causes. As I say in the above quote, I also think it is a completely reasonable criticism of AI risk advocates that they fail to do this reasonably often.

If you properly account for uncertainty, you should pick the certain cause over the uncertain one even if a naive EV calculation says otherwise

This is sometimes correct, but the math could come out that the highly uncertain cause area is preferable after adjustment. Do you agree with this? That's really the only point I'm trying to make!
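To show what I mean by the math coming out either way, here is a minimal sketch of one standard way to shrink a noisy estimate: the posterior mean under a normal prior with normally distributed estimation error. The prior, the noise levels, and the raw estimates are all made-up numbers for illustration, and `shrink` is just a hypothetical helper name.

```python
def shrink(estimate, noise_sd, prior_mean, prior_sd):
    """Posterior mean for a normal prior and normal estimation noise.

    The noisier the estimate, the more it is pulled toward the prior mean.
    """
    weight = prior_sd**2 / (prior_sd**2 + noise_sd**2)
    return prior_mean + weight * (estimate - prior_mean)


# Hypothetical prior over "true benefit" shared by all causes.
PRIOR_MEAN, PRIOR_SD = 5.0, 10.0

# A well-evidenced cause: modest estimate, little noise.
certain = shrink(10.0, 1.0, PRIOR_MEAN, PRIOR_SD)

# Two versions of a speculative cause: same (large) noise, different raw estimates.
uncertain_big = shrink(100.0, 20.0, PRIOR_MEAN, PRIOR_SD)
uncertain_small = shrink(25.0, 20.0, PRIOR_MEAN, PRIOR_SD)

print(round(certain, 2))          # barely shrunk
print(round(uncertain_big, 2))    # heavily shrunk, still ahead of `certain`
print(round(uncertain_small, 2))  # heavily shrunk, now behind `certain`
```

With these particular numbers the speculative cause keeps only 20% of its distance from the prior, yet a raw estimate of 100 still ends up ahead of the well-evidenced cause while a raw estimate of 25 ends up behind it. Which case applies is a quantitative question about the inputs, not something settled by the uncertainty alone.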

I don't think the difference here comes down to one side which is scientific and rigorous and loves truth against another that is biased and shoddy and just wants to sneak their policies through in an underhanded manner with no consideration for evidence or science. Analyzing these things is messy, and different people interpret evidence in different ways or weigh different factors differently. To me this is normal and expected.

I'd be very interested to read your explainer, it sounds like it addresses a valid concern with arguments for AI risk that I also share.

Beware of non-evidence-based argumentation

TFD · 4mo · 3

If you believe that evidence that does not withstand scrutiny (that is, evidence that does not meet basic quality standards, contains major methodological errors, is statistically insignificant, is based on fallacious reasoning, or any other reason why the evidence is scrutinized) is evidence that we should use, then you are advocating for pseudoscience. The expected value of benefits based on such evidence is near zero.

I don't think evidence which is based on something other than "high-quality studies that withstand scrutiny" is pseudoscience. You could have moderate-quality studies that withstand scrutiny, or you could have preliminary studies which are suggestive but which haven't been around long enough for scrutiny to percolate up. I don't think these things have near zero evidential value.

This is my issue with your use of the term "scientific evidence" and related concepts. Its role in the argument is mostly rhetorical, having the effect of characterizing other arguments or positions as not worthy of consideration without engaging with the messy question of what value various pieces of evidence actually have. It causes confusion and results in you equivocating about what counts as "evidence".

My view, and where we seem to disagree, is that I think there are types of evidence other than "high-quality studies that withstand scrutiny" and pseudoscience. Look, I agree that if something has basically zero evidential value we can reasonably round that off to zero. But "limited evidence" isn't the same as near-zero evidence. I think there is a category of evidence between pseudoscience/near-zero evidence and "high-quality studies that withstand scrutiny". When we don't have access to the highest quality evidence, it is acceptable in my view to make policy based on the best evidence that we have, including if it is in that intermediate category. This is the same argument made in the quote from the report.

The quoted text implies that the evidence would not be sufficient under normal circumstances

This is exactly what I mean when I say this approach results in you equivocating. In your OP, you explicitly claim that this quote argues that evidence is not something that is needed. You clarify in your comments with me and in a clarification at the top of your post that only "high-quality studies that withstand scrutiny" really count as evidence as you use the term. The fact that you are using the word "evidence" in this way is causing you to misinterpret the quoted statement. The quote is saying that even if we don't have the ideal, high-quality evidence that we would like and that might be needed for us to be highly confident and establish a strong consensus, in situations of uncertainty it is acceptable to make policy based on more limited or moderate evidence. I share this view and think it is reasonable and not pseudoscientific or somehow a claim that evidence of some kind isn't required.

If the amount of evidence was sufficient, there would be no question about what is the correct action.

Uncertainty exists! You can be in a situation where the correct decision isn't clear because the available information isn't ideal. This is extremely common in real-world decision making. The entire point of this quote and my own comments is that when these situations arise the reasonable thing to do is to make the best possible decision with the information you have (which might involve trying to get more information) rather than declaring some policies off the table because they don't have the highest quality evidence supporting them. Making decisions under uncertainty means making decisions based on limited evidence sometimes.

Beware of non-evidence-based argumentation

TFD · 4mo · 2

Where have I ever claimed that there is no evidence worth considering?

In your OP, you write:

In this post, I've criticized non-evidence-based arguments, which hangs on the idea that evidence is something that is inherently required. Yet it has become commonplace to claim the opposite. One example of this argument is presented in the International AI Safety Report

You then quote the following:

Given sometimes rapid and unexpected advancements, policymakers will often have to weigh potential benefits and risks of imminent AI advancements without having a large body of scientific evidence available. In doing so, they face a dilemma. On the one hand, pre-emptive risk mitigation measures based on limited evidence might turn out to be ineffective or unnecessary. On the other hand, waiting for stronger evidence of impending risk could leave society unprepared or even make mitigation impossible – for instance if sudden leaps in AI capabilities, and their associated risks, occur.

Your summary of the quoted text is inaccurate. You claim that this is an argument that evidence is not something that is inherently required, but the quote says no such thing. Instead, it references "a large body of scientific evidence" and "stronger evidence" vs "limited evidence". This quote essentially makes the same argument I do above. How can we square the differences in these interpretations?

In response to me, you write:

In my post, I referred to the concept of "evidence-based policy making". In this context, evidence refers specifically to rigorous, scientific evidence, as opposed to intuitions, unsubstantiated beliefs and anecdotes. Scientific evidence, as I said, referring to high-quality studies corroborated by other studies.

You have also added as a clarification to your OP:

Clarification (29.1.2026): In this post, I use the term "evidence" in the context of "evidence-based policy making".

So, as used in your post, "evidence" means "rigorous, scientific evidence, as opposed to intuitions, unsubstantiated beliefs and anecdotes". This is why I find your reference to "scientific evidence" frustrating. You draw a distinction between two categories of evidence and claim policy should be based on only one. I disagree: I think policy should be based on all available evidence, including intuition and anecdote ("unsubstantiated belief" obviously seems definitionally not evidence). I also think your argument relies heavily on contrasting with a hypothetical highly rigorous body of evidence that isn't often achieved, which is why I have pointed out what I see as the "messiness" of lots of published scientific research.

The distinction you draw and how you define "evidence" results in an equivocation. Your characterization of the quote above only makes sense if you are claiming that AI risk can only claim to be "evidence-based" if it is backed by "high-quality studies that withstand scrutiny". In other words, as I said in one of my comments:

It seems like the core of your argument is saying that there is a high burden of proof that hasn't been met.

So, where do we disagree? As I say immediately after:

I agree that arguments for short timelines haven't met a high burden of proof but I don't believe that there is such a burden.

I believe that we should compare E[benefits(AI)] with E[benefits(GHD)] and any other possible alternative cause areas, with no area having any specific burden of proof. The quality of the evidence plays out in taking those expectations. Different people may disagree on the results based on their interpretations of the evidence. People might weigh different sources of evidence differently. But there is no specific burden to have "high-quality studies that withstand scrutiny", although this obviously weighs in favor of a cause that does have those studies. I don't think having high quality studies amounts to "style points". What I think would amount to "style points" is if someone concluded that E[benefits(AI)] > E[benefits(GHD)] but went with GHD anyway because they think AI is off limits due to the lack of "high-quality studies that withstand scrutiny" (i.e. if there is a burden of proof where "high-quality studies that withstand scrutiny" are required).
