<sharing it here, too...>

~ a community of Bayes-enthusiasts fumbles statistical inference ~

TL;DR — Industry uses Dirichlet Process and SAS, NOT Bayes. Bayes is persistently *wrong* and leaves out a great deal of important information (likelihood distributions across populations, confidence intervals, cost-of-error weighting). Supposed ‘rationalists’ cling to Bayes as the Ultimate Truth, without knowing enough Mathematics to know they’re wrong.

“Oh, well my Prior was <preferred assumption>, but I guess I have to update with that one data-point that wandered into my life.” — multiple ‘Rationalists’ in my year of invading their gatherings

A weird thing is happening in the Bay Area, slowly creeping into the Zeitgeist: a group of non-mathematicians have decided they found the BEST statistical technique ever, and they want to use it to understand the whole world… but their technique is 260 YEARS OLD, and we’ve done a LOT better since then. It’s called Bayes’ Theorem, published in 1763 — literally 260 candles this year.

Let’s get a sense of just how outdated and bizarre it is to insist you have the One-True-Method when it’s 260 years old: back in 1763, when Bayes was published, there was another new-fangled invention sweeping Europe — the Dutch Plough. That’s the plough still used today by the Amish. Literally, relying on Bayes to draw conclusions is like farming with an Amish plough; it’s hilariously inadequate, and completely dismissed by industry.

That quote at the top is an amalgam of multiple conversations with the Effective Altruists and Astral Codex Ten ‘Rationalists’ (they made that term up to describe themselves); it’s a persistent theme in their conversations. And, it’s not even the *correct* use of Bayes! Let’s see why:

In Bayes’ Theorem, you begin with a Prior. These Rationalists pick the Prior that they *prefer*. Neutral Bayesian Priors, however, are the average of all possible assumptions, NOT your preferred place to start. These folks’ first step is a disastrous error. Then, when they say “I guess I should update my Prior…” Wait! Why in the world would you ever feel confident about a belief when the ONLY thing you have is a Prior? A Prior is, by definition, the state of “no information,” when one should have intellectual humility, not certainty!

Then, they are updating their Bayesian estimate using… a *few* examples? The Rationalists repeatedly rely upon sparse evidence, while claiming certainty, as if “Statistically Significant Sample Size” just isn’t a thing. Bayes doesn’t *need* statistical significance, apparently! Finally, the examples they use are culled from personal experience. I hope I don’t have to explain to anyone why we need to collect a random sample from representative sub-populations. The supposedly rational Bayes-fans fail on every possible count.
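To make the sample-size point concrete, here’s a minimal sketch in Python (scipy), assuming the simplest two-bucket case, a flat prior, and one made-up confirming data-point. The point is that a single observation barely moves a genuinely neutral starting point:

```python
# A minimal sketch: start from a flat ("no information") prior over a proportion
# and update on ONE hypothetical confirming data-point.
from scipy.stats import beta

prior = beta(1, 1)        # uniform prior: every proportion equally plausible
posterior = beta(2, 1)    # after exactly one confirming observation

print(prior.mean(), posterior.mean())   # 0.5 -> ~0.667
print(posterior.interval(0.95))         # still spans roughly 0.16 to 0.99
```

One data-point nudges the mean, but the 95% interval still covers nearly the whole range; nothing about that warrants certainty.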

So, if they correct those mistakes, can they then rely on Bayes to find their precious truths? Nope. Bayes is consistently wrong, reliably. That’s why industry doesn’t use it. They’d lose money. Dirichlet lets them make money, because it works better. That’s a stronger proof, empirically, than all the rationalizations of their community’s prominent Bayes-trumpeters: a fiction writer and a psych counselor, both of whom lack relevant experience with statistical analysis software and techniques.

In particular, the blog of that psych counselor, “Astral Codex Ten”, has a tag-line: it quotes Bayes’ Theorem, and follows by saying “all else is commentary.” Everyone who reads his blog, and who then DOESN’T check what statistical techniques are used in the real world, stays there as part of the community. They have self-selected for a community of people who call Bayes the be-all-end-all, all of them agreeing they’re right, and they don’t know that they’re horribly wrong… because they don’t check!

Think about this for a moment: if you state Bayes’ Theorem, and then claim “all else is commentary” while recommending readers use Bayes, you are implicitly claiming “NO further improvements in statistical analysis have occurred in the 260 years since Bayes was published; Student-t Distributions, Lévy Distributions, they don’t even need to exist!” That’s the core tenet of the Bay Area Rationalists’ luminary, addicted to Bayes.

Wait, so why and how is Dirichlet such an improvement?

Let’s imagine you took a survey in some big city, and found (unsurprisingly) a majority of Democrats — it was a 60/40 split, on the nose. That sample’s split is also the “maximum likelihood” estimate of the underlying Population. Said another way, “The real-world population which is most likely to give you a 60/40 sample is a 60/40 population.” But does that make 60/40 your best guess for the real population? No.

Imagine each possible population, one at a time. There’s the 100% Democrat population, first — what is the *likelihood* of such a population producing a 60/40 sample? Zero. What about 99% Democrat? Well, then it’ll depend upon how *many* people you surveyed, but there is just a tiny chance the real population is 99% Democrat! Keep doing that, for every population, all the way to 99% Republican, then 100% Republican. Whew! Now, you have a *likelihood* distribution, the “likelihood of population X generating sample Y.”
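Here’s a minimal sketch of that sweep in Python (scipy), with made-up numbers: 60 Democrats in a 100-person sample, and candidate populations running from 0% to 100% Democrat:

```python
# A minimal sketch: for every candidate "true" population, compute the
# likelihood that it would produce the observed 60/40 sample of 100 people.
import numpy as np
from scipy.stats import binom

n_surveyed, dem_count = 100, 60          # hypothetical sample
p_grid = np.linspace(0.0, 1.0, 1001)     # candidate populations: 0% ... 100% Democrat

likelihood = binom.pmf(dem_count, n_surveyed, p_grid)

print(p_grid[np.argmax(likelihood)])     # the peak sits at 0.6, exactly the sample split
```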

When we look at this distribution for data that falls in two buckets (D/R), we’ll notice something: the *peak* likelihood is at 60/40, but there’s ALSO a bunch of probability-mass on the 50/50 side of the curve, creating a tilt in the overall probability. While the ‘mode’ of the likelihood distribution is still the 60/40 estimate, the actual ‘mean’ of that distribution is closer to 50/50, every time! You *should* expect that the true population is closer to an *equal division* among buckets. When you collect more samples, you narrow that distribution of likelihoods, so you see less drift toward 50/50. That’s the reason you want a ‘statistically significant sample size’.
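Under a flat prior, that normalized likelihood curve is just a Beta distribution (the two-bucket special case of the Dirichlet), so the mode-versus-mean gap, and how it shrinks with sample size, is easy to check. A minimal sketch with made-up sample sizes:

```python
# A minimal sketch: the mode stays at the raw 60/40 split, but the MEAN sits
# closer to 50/50, and the gap (and the spread) shrink as the sample grows.
from scipy.stats import beta

for dems, reps in [(6, 4), (60, 40), (600, 400)]:
    post = beta(dems + 1, reps + 1)     # distribution over the Democrat share, flat prior
    mode = dems / (dems + reps)         # the raw maximum-likelihood 60/40 estimate
    print(f"n={dems + reps}: mode={mode:.3f}  mean={post.mean():.3f}  sd={post.std():.3f}")

# n=10:   mode=0.600  mean=0.583  sd=0.137
# n=100:  mode=0.600  mean=0.598  sd=0.048
# n=1000: mode=0.600  mean=0.600  sd=0.015
```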

Let’s look at that other aspect Dirichlet possesses, which Bayes wholly lacks: Confidence!

When you look at the likelihood of each population, the chance of it producing your observed sample, you can also ask: “How far AWAY from our best guess would we need to place boundaries, such that we include 95% of the possible populations’ likelihoods within our bounds?” That’s called your Confidence Interval! You may have only learned the trimmed-down simplicities and z-score tables in your Stat 101 class, but there’s a reason why they can claim confidence: that interval of population-estimates contains 95% of the likelihood-distribution’s probability-mass!
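Continuing the same sketch (the hypothetical 60/40 sample of 100 under a flat prior), that interval falls straight out of the likelihood distribution:

```python
# A minimal sketch: an equal-tailed interval containing 95% of the
# probability-mass of the distribution over population-estimates.
from scipy.stats import beta

post = beta(61, 41)                               # 60 D / 40 R under a flat prior
low, high = post.ppf(0.025), post.ppf(0.975)
print(f"95% interval: {low:.3f} to {high:.3f}")   # roughly 0.50 to 0.69
```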

Finally, let’s consider “the cost of being wrong”. Bayes doesn’t balance your prediction according to the cost of being wrong; Dirichlet’s distribution over potential populations can simply be *weighted* by the cost of each error-distance, and the estimate that minimizes that expected cost will “minimize the COST of being WRONG.” You can even use costs which are discontinuous or are ranges, producing high and low bounds and nuanced thresholds of risk. Definitely better than Bayes.
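A minimal sketch of that weighting, with a completely made-up asymmetric cost (overestimating the Democrat share hurts three times as much as underestimating it by the same amount):

```python
# A minimal sketch: weight each candidate population by the cost of being wrong
# about it, then pick the estimate that minimizes the EXPECTED cost.
import numpy as np
from scipy.stats import beta

p_grid = np.linspace(0.001, 0.999, 999)
posterior = beta(61, 41).pdf(p_grid)
posterior /= posterior.sum()                     # normalize over the grid

def expected_cost(guess):
    err = guess - p_grid
    # hypothetical cost: overestimating is 3x as costly as underestimating
    cost = np.where(err > 0, 3.0 * err, -err)
    return float(np.sum(cost * posterior))

best = min(p_grid, key=expected_cost)
print(f"cost-minimizing estimate: {best:.3f}")   # pulled below the ~0.60 peak
```

With a squared-error cost you would get the plain mean back; the asymmetric cost drags the answer toward the cheaper side of being wrong.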

Now, Dirichlet isn’t even the be-all-end-all… it was published in 1973, 50 years old THIS year! SAS has held trade secrets since the ’70s, and invests 2.5x more into R&D than the TECH-industry average! If you want to pass muster for pharmaceuticals in front of the FDA, you send all your data to SAS. It’s required, because they’re soooo damn GOOD! So, unless you work at SAS (which has the highest profits per employee hour of all companies on Earth, and has expanded consistently since 1976… consistently rated one of the best employers on the planet…) then you DON’T know the be-all-end-all statistical technique — and neither do Scott Alexander or Eliezer Yudkowsky, as much as they’d like you to believe otherwise. Just for reference, when “you think you’re right BECAUSE you don’t know enough to know you’re wrong,” that’s called the Dunning-Kruger Effect, dear Rationalists.

Comments



While I expect some EAs and rationalists do actually use Bayes formally in their analysis, a lot of its use is informal, using language associated with Bayes to communicate an approach to updating beliefs.

From this informal perspective, clarity and conciseness matter far more than empirical robustness.

"From this informal perspective, clarity and conciseness matters far more than empirical robustness."

Then you are admitting my critique: "Your community uses excuses, to allow themselves a claim of epistemic superiority, when they are actually using a technique which is inadequate and erroneous." Yup. Thanks for showing me and the public your community's justification for using wrong techniques while claiming you're right. Screenshot done!

Why is it inadequate to use language associated with Bayes in an informal analysis? Are you suggesting that when people communicate about their beliefs in day-to-day conversation, they should only do so after using Dirichlet or another related process? Can you see how that is, in fact, extremely impractical? Can you see how it is rational to take into account the costs and benefits of using a particular technique, and that, while empirical robustness may be overwhelmingly important in some contexts, it is not always rational to use a method when the costs of using it are too high?

Please keep taking screenshots! I'm sure you wouldn't want to mislead your audience by only showing part of the discussion out of context :)

You're welcome to side with convenience; I am not commanding you to perform Dirichlet. Yet! If you take that informality, you give up accuracy. You become MoreWrong, and should not be believed as readily as you would like.

Oh, you entirely missed my purpose: I was sharing this with your community, as a courtesy. I publish on different newsletters online, and I wrote for that audience ABOUT your community. And the fact that you're not interested in learning about Dirichlet, when it's industry-standard (demonstrating its superiority empirically, not with anecdotes you find palatable), speaks for itself. So, no, I don't plan to present myself in a way you approve of, as a prerequisite to you noticing that Bayes is outdated by 260 years of improvements. Dirichlet, logically, would NOT have been published and adopted in 1973 and since, if it were in fact inferior to Bayes.

You evidence the same spurious assumptions and lack of attention to core facts: Dirichlet is an improvement, obviously, by coming along later and being adopted generally. I also addressed the key information which Dirichlet provides, and which Bayes' Theorem is incapable of generating: a Likelihood Distribution across possible Populations, the resultant Confidence Interval, and the weighting of your estimate to Minimize the Cost of being Wrong. Those are all key, valuable pieces of information that Bayes' Theorem will not give you on its own. When Scott Alexander claims "Bayes' Theorem; all else is commentary," he leaves out critical, incomparable improvements in our understanding.
