Yarrow Bouchard 🔸

1299 karma · Joined · Canada · strangecosmos.substack.com

Bio

Pronouns: she/her or they/them. 

I got interested in effective altruism back before it was called effective altruism, back before Giving What We Can had a website. Later on, I got involved in my university EA group and helped run it for a few years. Now I’m trying to figure out where effective altruism can fit into my life these days and what it means to me.

Sequences
2

Criticism of specific accounts of imminent AGI
Skepticism about near-term AGI

Comments
531

Topic contributions
2

I’ll say just a little bit more on the topic of the precautionary principle for now. I have a complex, multi-part argument on this that takes more explaining than I’ll attempt here; I’ve covered much of it in previous posts and comments. The three main points I’d make in relation to the precautionary principle and AGI risk are:

  • Near-term AGI is highly unlikely, much less than a 0.05% chance in the next decade

  • We don’t have enough knowledge of how AGI will be built to usefully prepare now

  • As knowledge of how to build AGI is gained, investment into preparing for AGI becomes vastly more useful, such that the benefits of investing resources into preparation at higher levels of knowledge totally overwhelm the benefits of investing resources at lower levels of knowledge

The point of the FTX comparison is that, in the wake of the FTX collapse, many people in EA were eager to reflect on the collapse and try to see if there were any lessons for EA. In the wake of the AI bubble popping, people in EA could either choose to reflect in a similar way, or they could choose not to. The two situations are analogous insofar as they are both financial collapses and both could lead to soul-searching. They are disanalogous insofar as the AI bubble popping won’t affect EA funding and won’t associate EA in the public’s mind with financial crimes or a moral scandal. 

It’s possible that, in the wake of the AI bubble popping, nobody in EA will try to learn anything. I fear that possibility. The comparisons I made to Ray Kurzweil and Elon Musk show that it is entirely possible to avoid learning anything, even when you ought to. So, EA could go multiple different ways with this; I’m just saying that what I hope will happen is the sort of reflection that happened post-FTX.

If the AI bubble popping wouldn’t convince you that EA’s focus on near-term AGI has been a mistake — or at least convince you to start seriously reflecting on whether it has been or not — what evidence would convince you? 

I think it’s fair to criticize Yudkowsky and Soares’ belief that there is a very high probability of AGI being created within ~5-20 years because that is a central part of their argument. The purpose of the book is to argue for an aggressive global moratorium on AI R&D. For such a moratorium to make sense, probabilities need to be high and timelines need to be short. If Yudkowsky and Soares believed there was an extremely low chance of AGI being developed within the next few decades, they wouldn’t be arguing for the moratorium. 

So, I think Oscar is right to notice and critique this part of their argument. I don’t think it’s fair to say Oscar is critiquing a straw man. 

You can respond with a logical, sensible appeal to the precautionary principle: shouldn’t we prepare anyway, just in case? First, I would say that even if this is the correct response, it doesn’t make Oscar’s critique wrong or not worth making. Second, arguments about whether AGI will be safe or unsafe, easy or hard to align, and what to do to prepare for it all depend on specific assumptions about how AGI will be built. So, this is not actually a separate question from the topic Oscar raised in this post.

It would be nice if there were something we could do just in case, to make any potential future AGI system safer or easier to align, but I don’t see how we can do this in advance of knowing what technology or science will be used to build AGI. So, the precautionary principle response doesn’t add up, either, in my view.

Eliezer Yudkowsky forecasts a 99.5% chance of human extinction from AGI "well before 2050", unless we implement his proposed aggressive global moratorium on AI R&D. Yudkowsky deliberately avoids giving more than a vague AGI forecast, but he often strongly hints at a timeline. For example, in December 2022, he tweeted:

Pouring some cold water on the latest wave of AI hype:  I could be wrong, but my guess is that we do *not* get AGI just by scaling ChatGPT, and that it takes *surprisingly* long from here.  Parents conceiving today may have a fair chance of their child living to see kindergarten.

In April 2022, when Metaculus’ forecast for AGI was in the 2040s and 2050s, Yudkowsky harshly criticized Metaculus for having too long a timeline and not updating it downwards fast enough.

In his July 2023 TED Talk, Yudkowsky said:

At some point, the companies rushing headlong to scale AI will cough out something that's smarter than humanity. Nobody knows how to calculate when that will happen. My wild guess is that it will happen after zero to two more breakthroughs the size of transformers.

In March 2023, during an interview with Lex Fridman, Fridman asked Yudkowsky what advice he had for young people. Yudkowsky said:

Don’t expect it to be a long life. Don’t put your happiness into the future. The future is probably not that long at this point.

In that segment, he also said, "we are not in the shape to frantically at the last minute do decades’ worth of work."

After reading these examples, do you still think Yudkowsky only believes that AGI is "not unlikely to be built in the future", "if not in 5 then maybe in 50 years"?
 

I can’t thank titotal enough for writing this post and for talking to the Forecasting Research Institute about the error described in this post.

I’m also incredibly thankful to the Forecasting Research Institute for listening to and integrating feedback from me and, in this case, mostly from titotal. It’s not nothing to be responsive to criticism and correction. I can only express appreciation for people who are willing to do this. Nobody loves criticism, but the acceptance of criticism is what it takes to move science, philosophy, and other fields forward. So, hallelujah for that.

I want to be clear that, as titotal noted, we’re just zeroing in here on one specific question discussed in the report, out of 18 total. It’s unfortunate that you can work hard on something large in scope, get it almost entirely correct (I haven’t reviewed the rest of the report, but I’ll give it the benefit of the doubt), and then watch the discussion focus on the one mistake you made. I don’t want research or writing to be a thankless task that only elicits criticism, and I want to be thoughtful about how to raise criticism in the future.

For completeness, so readers have a full picture: I made three distinct and independent criticisms of this survey question and how it was reported. First, I noted that the probability of the rapid scenario was reported as an unqualified probability, rather than as the probability of the scenario being the best matching of the three — “best matching” is the wording the question used. The Forecasting Research Institute was quick to accept this point and promise to revise the report.
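To illustrate that distinction, here is a minimal sketch with deliberately made-up numbers — nothing below comes from the report; the credences are purely hypothetical:

```python
# Made-up numbers, purely to illustrate the distinction — not from the report.
# A panelist's credence about which of the three scenarios 2030 will *best match*:
p_best_match = {"rapid": 0.10, "moderate": 0.55, "slow": 0.35}  # sums to 1 by construction

# The same panelist's credence that each scenario's described capabilities
# actually materialize — a different question, with no constraint to sum to 1:
p_materializes = {"rapid": 0.04, "moderate": 0.30, "slow": 0.25}

# Reporting the 0.10 as an unqualified "probability of the rapid scenario"
# conflates these two quantities.
print(p_best_match["rapid"], "vs", p_materializes["rapid"])  # 0.1 vs 0.04
```

The point is only that the two numbers can come apart, so the wording of the report matters.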

Second, I raised the problem around the intersubjective resolution/metaprediction framing that titotal describes in this post. After a few attempts, I passed the baton to titotal, figuring that titotal’s reputation and math knowledge would make them more convincing. The Forecasting Research Institute has now revised the report in response, as well as their EA Forum post about the report.

Third, the primary issue I raised in my original post on this topic is a potential anchoring effect or question-wording bias in the survey question.[1] The slow progress scenario is extremely aggressive and optimistic about the amount of progress in AI capabilities between now and the end of 2030. I would personally guess the probability of AI gaining the sort of capabilities described in the slow progress scenario by the end of 2030 is significantly less than 0.1%, or 1 in 1,000. I imagine most AI experts would say it’s unlikely if presented with the scenario in isolation and asked directly about its probability.

For example, here is what is said about household robots in the slow progress scenario:

By the end of 2030 in this slower-progress future, AI is a capable assisting technology for humans; it can … conduct relatively standard tasks that are currently (2025) performed by humans in homes and factories.

Also:

Meanwhile, household robots can make a cup of coffee and unload and load a dishwasher in some modern homes—but they can’t do it as fast as most humans and they require a consistent environment and occasional human guidance.

Even Metaculus, which is known to be aggressive and optimistic about AI capabilities, and which is heavily used by the effective altruist and LessWrong communities, where belief in near-term AGI is strong, puts the median date for the question “When will a reliable and general household robot be developed?” in mid-2032. The resolution criteria for the Metaculus question are compatible with the sentence in the slow progress scenario, although they also stipulate many details that the slow progress scenario does not.

An expert panel surveyed in 2020 and 2021 was asked, “[5/10] years from now, what percentage of the time that currently goes into this task can be automated?” and answered 47% for dish washing in 10 years, i.e., in 2030 or 2031. I find this framing somewhat confusing — what does it mean for 47% of the time involved in dish washing to be automated? — but it suggests the baseline scenario in the LEAP survey involves contested claims, not just things we can take for granted.

Adam Jonas, a financial analyst at Morgan Stanley with a track record of being extremely optimistic about AI and robotics (sometimes mistakenly so), and whose forecasts the financial world reads as aggressive and optimistic, predicts that a “general-purpose humanoid” robot for household chores will require “technological progress in both hardware and AI models, which should take about another decade”, meaning around 2035. So, even an optimist on Wall Street seems to be less optimistic than the LEAP survey’s slow progress scenario.

If the baseline scenario is more optimistic about AI capabilities progress than Metaculus, the results of a previous expert survey, and a Wall Street analyst on the optimistic end of the spectrum, then it seems plausible that the baseline scenario is already more optimistic than what the LEAP panelists would have given as their median forecast had they been asked in a different way. It seems far too aggressive as a baseline scenario. This makes it hard to know how to interpret the panelists' answers (in addition to the interpretive difficulty raised by the problem described in titotal's post above).

  1. ^

    I have also used the term “framing effect” to describe this before — following the Forecasting Research Institute and AI Impacts — but on rechecking the definition of that term in psychology, it seems to refer specifically to framing the same information as positive or negative, which doesn’t apply here.

Update #2: titotal has published a full breakdown of the error involving the intersubjective resolution/metaprediction framing of the survey question. It’s a great post that explains the error very well. Many thanks to titotal for taking the time to write the post and for talking to the Forecasting Research Institute about this. Thanks again to the Forecasting Research Institute for revising the report and this post.

Do you stand by your accusation of bad faith?

Your accusation of bad faith seems to rest on your view that the constraints the laws of physics impose on space travel make an alien invasion or attack extremely improbable. Such an event may indeed be extremely improbable, but the laws of physics do not say so.

I have to imagine that you are referring to the speeds of spacecraft and the distances involved. The Milky Way is a disc about 100,000 light-years in diameter and roughly 1,000 light-years thick. NASA’s Parker Solar Probe has travelled at 0.064% of the speed of light; let’s round that down to 0.05% for simplicity. At 0.05% of the speed of light, the Parker Solar Probe could travel between the two farthest points in the Milky Way in 200 million years.

That means that if the maximum speed of spacecraft in the galaxy were limited to the top speed of NASA’s fastest space probe today, an alien civilization that reached an advanced stage of science and technology — perhaps including things like AGI, advanced nanotechnology/atomically precise manufacturing, cheap nuclear fusion, interstellar spaceships, and so on — more than 200 million years ago would have had plenty of time to establish a presence in every star system of the Milky Way. At 1% of the speed of light, the window of time shrinks to 10 million years, and so on.

Spacecraft designs that credible scientists and engineers have thought Earth could actually build in the near future include a light-sail probe that would supposedly travel at 15-20% of the speed of light. Such a probe could traverse the diameter of the Milky Way in under 1 million years at top speed. Acceleration and deceleration complicate the picture somewhat, but the fundamental idea still holds.
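For anyone who wants to check the arithmetic, here’s a minimal sketch; the 100,000 light-year diameter and the speeds are the round numbers quoted above, and the calculation deliberately ignores acceleration and deceleration:

```python
# Rough crossing times for the Milky Way's ~100,000 light-year diameter at a
# constant speed. At a speed of f * c, covering d light-years takes d / f years.
GALAXY_DIAMETER_LY = 100_000

def crossing_time_years(fraction_of_c: float) -> float:
    return GALAXY_DIAMETER_LY / fraction_of_c

for label, f in [
    ("0.05% of c (Parker Solar Probe, rounded down)", 0.0005),
    ("1% of c", 0.01),
    ("15% of c (proposed light-sail probe)", 0.15),
]:
    print(f"{label}: {crossing_time_years(f):,.0f} years")

# Prints:
# 0.05% of c (Parker Solar Probe, rounded down): 200,000,000 years
# 1% of c: 10,000,000 years
# 15% of c (proposed light-sail probe): 666,667 years
```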

If there are alien civilizations in our galaxy, we don’t have any clear, compelling scientific reason to think they wouldn’t be many millions of years older than our civilization. The Earth formed 4.5 billion years ago, so if a habitable planet elsewhere in the galaxy had formed just 10% earlier and put life there on the same trajectory as on ours, the aliens would be 450 million years ahead of us — plenty of time to reach everywhere in the galaxy.

The Fermi paradox has been considered and discussed by people working in physics, astronomy, rocket/spacecraft engineering, SETI, and related fields for decades. There is no consensus on the correct resolution to the paradox. Certainly, there is no consensus that the laws of physics resolve it.

So, if I’m understanding your reasoning correctly — that surely I must be behaving in a dishonest or deceitful way, i.e., engaging in bad faith, because obviously everyone knows the constraints the laws of physics impose on space travel make an alien attack on Earth extremely improbable — then your accusation of bad faith seems to rest on a mistake.

Thanks for giving me the opportunity to talk about this because the Fermi paradox is always so much fun to talk about.

My list is very similar to yours. I believe items 1, 2, 3, 4, and 5 have already been achieved to substantial degrees and we continue to see progress in the relevant areas on a quarterly basis. I don't know about the status of 6.

It’s hard to know what "to substantial degrees" means. That sounds very subjective. Without the "to substantial degrees" caveat, it would be easy to prove that 1, 3, 4, and 5 have not been achieved, and fairly straightforward to make a strong case that 2 has not been achieved.

For example, it is simply a fact that Waymo vehicles have a human in the loop — Waymo openly says so — so Waymo has not achieved Level 4/5 autonomy without a human in the loop. Has Waymo achieved Level 4/5 autonomy without humans in the loop "to a substantial degree"? That seems subjective. I don’t know what "to a substantial degree" means to you, and it might mean something different to me, or to other people.

Humanoid robots have not been deployed in any profitable new applications in recent years, as far as I’m aware. Again, I don’t know what achieving this "to a substantial degree" might mean to you.

I would be curious to know what progress you think has been made recently on the fundamental research problems I mentioned, or what the closest examples are to LLMs engaging in the sort of creative intellectual act I described. I imagine the examples you have in mind are not something the majority of AI experts would agree fit the descriptions I gave.

For clarity on item 1, AI company revenues in 2025 are on track to cover 2024 costs, so on a product basis, AI models are profitable; it's the cost of new models that pull annual figures into the red. I think this will stop being true soon, but that's my speculation, not evidence, so I remain open that scaling will continue to make progress towards AGI, potentially soon.

We should distinguish here between gold mining and selling picks and shovels. I’m talking about applications of LLMs and AI tools that are profitable for end users. Nvidia is extremely profitable because it sells GPUs to AI companies. In theory, AI companies could become profitable by selling AI models as a service (e.g., API tokens, subscriptions) to businesses. But would those business customers then see any profit from the use of LLMs (or other AI tools)? That’s what I’m talking about. Nvidia is selling picks and shovels, and to some extent even the AI companies are selling picks and shovels. Where’s the gold?

The six-item list I gave comprises things that — each on its own, but especially in combination — would go a long way toward convincing me that I’m wrong and that my near-term AGI skepticism is a mistake. When you say your list is similar, I’m not quite sure what you mean. Do you mean that if those things didn’t happen, that would convince you that the probability or level of credence you assign to near-term AGI is way too high? I was trying to ask you what evidence would convince you that you’re wrong.

This is directly answered in the post. Edit: Can you explain why you don’t find what is said about this in the post satisfactory?
