This was originally written in December 2021. I think it's good practice to run posts by orgs before making them public, and I did in this case. This has some benefits: orgs aren't surprised, they can prepare a response in advance if they want to, they can point out errors before things become public, etc, and I think it's generally worth doing. In this case Leverage folks pointed out some errors, omissions, and bad phrasing in my post, which I've fixed, and I'm thankful for their help. Pre-publication review does also have downsides, however, and in this case as the email conversations grew to 10k+ words over three weeks I ran out of time and motivation.

A month ago I came across this in my list of blog drafts and decided to publish it as-is with a note at the top explaining the situation. This means that it doesn't cover any more recent Leverage developments, including their Experiences Inquiry Report and On Intention Research paper, both published in April 2022. I shared this post again with Leverage, and while I've made edits in response to their feedback they continue to disagree with my conclusion.

In the original pre-publication discussion with Geoff, one of the topics was whether we could make our disagreement more concrete with a bet. For example, research that launches a new subfield generally gets lots of citations, such as the Concrete Problems paper (1,644 citations at 6 years), and if Leverage 1.0's research ends up having this kind of foundational impact this could be a clear way to tell. When I gave Leverage a second pre-publication heads up, Geoff and I talked more and we were able to nail down some terms: if a Leverage paper drawing primarily on their pre-2019 research has 100+ citations from people who've never worked for Leverage by 2032-10-01, then I'll donate $100 to a charity of Geoff's choosing; if not then Geoff will do the same for a charity of my choosing. I've listed this on my bets page.

In 2011, several people I knew through Boston-area effective altruism and rationality meetups started an organization called Leverage Research. Their initial goal was to "make the world a much better place, using the most effective means we can", and they worked on a wide range of projects, but they're probably best known for trying to figure out how to make people more productive/capable/successful by better understanding how people think and interact. They initially lived and worked in a series of shared houses, first in Brooklyn and then in Oakland; I visited the latter for an evening in early 2014. The core project ("Leverage 1.0") disintegrated in 2019, with some portions continuing, including Paradigm (training/coaching) and Leverage 2.0 (early stage science). In this post I'm only looking at Leverage 1.0, and specifically at their psychology research program.

In mid-December 2021, Leverage's former head of operations, Cathleen, wrote In Defense of Attempting Hard Things: and my story of the Leverage ecosystem (LW comments), giving a detailed history with extensive thoughts on many aspects of the project. I remember her positively from my short 2014 visit, and I'm really glad she took the time to write this up.

There are many directions from which people could approach Leverage 1.0, but the one that I'm most interested in is lessons for people considering attempting similar things in the future.

My overall read of Cathleen's post is that she (and many other ex-Leverage folks) view the project as one where a group of people took an unorthodox approach to research, making many deep and important discoveries about how people think and relate to each other. I've read the Connection Theory paper and the four research reports Geoff has published (see Appendix 3), however, and I don't see anything in them that backs up these claims about the originality and insight of their psychology research. While there are a range of reasons why people might not write up even novel and valuable results, I think the most likely explanation is that there weren't discoveries on the level they're describing.

Geoff is still gradually writing up Leverage 1.0-era results, so it's possible that something will come out later that really is impressive. While this isn't what I'm expecting, if it happens I'll need to retract most of what follows. [2022-09: this is essentially what Geoff and I bet on above.]

If there weren't any big psychology research breakthroughs, however, why would they think there were? Putting together my reading of Cathleen's post (Appendix 1), Larissa's post (Appendix 2), and a few other sources (Appendix 3), here's what I see as the most likely story: "The core problem was that Leverage 1.0 quickly became much too internally focused. After their Connection Theory research did not receive the kind of positive response they were hoping for they stopped seeing publishing as a way to get good feedback. With an always-on dynamic and minimal distinction between living and working space during their formative years, their internal culture, practice, and body of shared knowledge diverged from mainstream society, academia, and the communities they branched from. They quickly got to where they felt people outside the group didn't have enough background to evaluate what they were doing. Without enough deep external engagement, however, it was too hard for them to tell if their discoveries were actually novel or valuable. They ended up putting large amounts of effort into research that was not just illegible, but not very useful. They gave up their best sources of external calibration so they could move faster, but then, uncalibrated, put lots of effort into things that weren't valuable."

[2022-09: I'm not saying that the people researching psychology at Leverage were poor thinkers. Instead my model is that when people are operating without good feedback loops they very often do work that isn't useful but believe that it is. This is part of why I was pessimistic on circa-2015 AI safety work (and is still a reason I'm skeptical of a lot of AI safety work today) and worried about a dynamic of inattentive funders for meta-EA projects (also still a problem). Similarly, I think the replication crisis was primarily a problem of researchers thinking they had meaningful feedback loops when they didn't.]

In assessing a future research project I wouldn't take "that looks a lot like Leverage" as any sort of strong argument: Leverage 1.0 was a large effort over many years, encompassing many different approaches. Instead I would specifically look at its output and approach to external engagement: if they're not publishing research I would take that as a strong negative signal for the project. Likewise, in participating in a research project I would want to ensure that we were writing publicly and opening our work to engaged and critical feedback.

Appendix 1: some extracts from Cathleen's post that crystallized the above for me:

  • "[T]he pace of discovery and development and changes in the structure and composition of the team was too fast to allow for people to actually keep up unless they were in the thick of it with us"

  • "From the outside (and even sometimes from the inside) this would look like unproductive delusion, but in fact it was intentional and managed theoretical exploration. And it led to an enormous amount of what many in the group came away believing were accurate and groundbreaking theories of how the mind works and how a personality is shaped by life."

  • "For a small group of untrained people to independently derive/discover so much in a handful of years does, I think, indicate something quite unusual about Geoff's ability to design a productive research program."

  • "I think it's worth pausing to appreciate just how bad the conflict leading up to the dissolution, as well as the dissolution itself, was for a number of people who had been relying on the Leverage ecosystem for their life plans: their friends, their personal growth, their livelihood, their social acceptance, their romantic prospects, their reputations, their ability to positively impact the world."

  • The entire "What to do when society is wrong about something?" section.

Appendix 2: the same for Larissa's post:

  • "From the outside, Leverage's research was understandably confusing because they were prioritising moving through a wide range of research areas as efficiently as possible rather than communicating the results to others. This approach was designed to allow them to cover more ground with their research and narrow in quickly on areas that seemed the most promising."

  • "Notably, Leverage's focus was never particularly on sharing research externally. Sometimes this was because it was a quick exploration of a particular avenue or seemed dangerous to share. Often though it was a time trade-off. It takes time to communicate your research well, and this is especially challenging when your research uses unusual methodology or starting assumptions."

  • "[T]here was a trade-off in time spent conducting research versus time spent communicating it. As we didn't invest time early on in communicating about our work effectively, it only became harder over time as we built up our models and ontologies."

  • "One of the additional adverse effects of our poor public communication is that when Leverage staff have interacted with people, they often didn't understand our work and had a lot of questions and concerns about it. While this was understandable, I think it sometimes led staff to feel attacked which I suspect, in some cases, they handled poorly, becoming defensive and perhaps even withdrawing from engaging with people in neighbouring communities. If you don't build up relationships and discuss updates to your thinking inferential distance builds up, and it becomes easy to see some distant, amorphous organisation rather than a collection of people."

Appendix 3: earlier public discussion of Leverage 1.0, which I've also drawn on in trying to understand what happened:

  • January 2012: Geoff, Leverage's primary founder and Executive Director, writes Introducing Leverage Research. The post directs people to the website for more information about their research, which has a link to download Connection Theory: the Current Evidence. The comments have a lot of skeptical discussion of Connection Theory, with back-and-forth from Geoff.

  • September 2012: Peter writes A Critique of Leverage Research's Connection Theory. His conclusion is that the evidence presented is pretty weak and that it's in conflict with a lot of what we do know about psychology. The comments again have good engagement. At some point before Alyssa's 2014 post (below) Leverage removed the Connection Theory paper from their site. [2022-10: Cathleen tells me this was in the 2013 redesign of their site, and linked me to before and after captures.]

  • April 2014: Alyssa writes The Problem With Connection Theory, digging deeper into some of the claims of the Connection Theory paper. She argues that the paper generally oversells its evidence, and highlights that several predictions which one would not normally judge as correct are counted as positive evidence. Jasen, a Leverage employee, responds in the comments to say Alyssa is criticizing an obsolete document.

  • January 2015: Evan comments with his understanding of why Leverage hasn't shared much publicly, including that he thinks "Leverage Research perceives it as difficult to portray their research at any given time in granular detail. That is, Leverage Research is so dynamic an organization at this point that for it to maximally disclose the details of its current research would be an exhaustive and constant effort."

  • [2022-10: Cathleen tells me that in June 2016 she asked the Internet Archive to exclude Leverage so that people would focus on the new content on their website. Because the site isn't included in the Archive, I wasn't able to evaluate the historical content of their website in putting together this post or formulating my hypothesis above. She also linked me to a pair of captures, one from 2013 and another from 2018 showing blog posts published in 2016. I haven't evaluated these captures.]

  • August 2018: Ryan writes Leverage Research: reviewing the basic facts anonymously as "throwaway", and then following up as "anonymoose" (both of which he's since publicly confirmed). His high-level point is that Leverage seemed to have produced very little given the amount of time and money put into the project. Geoff replies that he had been planning to publish some of their results shortly.

  • November 2019: Larissa, Leverage's incoming communications person, posts Updates from Leverage Research: history, mistakes and new focus, expanding on a comment she had made in September. She discusses history, dissolution of the original project, and current plans. I was especially interested in her discussion of the causes and effects of Leverage's approach to external engagement.

  • December 2020 through October 2021: Geoff links a series of four "Leverage 1.0 Research Reports" on his personal site, three on consensus (1, 2, 3) and one on intelligence amplification (4). I haven't seen any discussion of these. I'm very glad he wrote them up and made them public, but I also don't see in them the kind of breakthroughs I would expect from how Cathleen wrote about Leverage 1.0's work.

  • September 2021: Someone anonymous notices that Geoff is fundraising, and posts Common knowledge about Leverage Research 1.0. They argue that Leverage was a harmful "high demand group". Lots of different perspectives in the comments.

  • September 2021: Larissa posts Updates from Leverage Research: History and Recent Progress. In the section on Leverage's Exploratory Psychology Program she discusses their plans to release psychological research tools over the next few months.

  • October 2021: An anonymous former Leverage employee writes about their experience there, and how it "really mismatched the picture of Leverage described by" the 'Common Knowledge' post.

  • October 2021: Zoe, one of their former researchers, posts about her experience there. See the corresponding LessWrong post for discussion. Also see Geoff's response.

  • December 2021: Jonathan, another former researcher, posts Leverage Research: Context, Analysis, and Takeaway. The "Utopic Mania and Closed-offness" section was the most interesting to me, but it is sufficiently metaphorical that I don't really understand it.

  • December 2021: Cathleen, Leverage's former COO, writes In Defense of Attempting Hard Things: and my story of the Leverage ecosystem. This is the article that prompted my post, and I'm sad about how little coverage and consideration it received compared to some of the less informative posts above.

Comment via: facebook






Partly unrelated: at first, I thought the title meant that we should research deprioritizing external communication. It took me a while to understand it meant that research is/was deprioritizing external communication.

Edit: The post has excellent nuance, and I make no claim to support or defend Leverage specifically (idk them). My comment is intended more generally, and my disagreement concerns two points:

  1. "The core problem was that Leverage 1.0 quickly became much too internally focused."
  2. "If they're not publishing research I would take that as a strong negative signal for the project."

You make several points, but I just want to respond to my impression that you're trying to anchor wayward researchers or research groups to the "main paradigm" to decrease the chance that they'll be wrong. I'm pretty strongly against this.

In a common-payoff game (like EA research), we all share the fruits of major discoveries regardless of who makes the discovery. So we should heavily prioritise sensitivity over specificity. It doesn't matter how many research groups are wildly wrong, as long as at least one research group figures out how to build the AI that satisfies our values with friendship and ponies. So when you're trying to rein in researchers instead of letting them go off and explore highly variable crazy stuff, you're putting all your eggs in one basket (the most respectable paradigm). Researchers are already heavily incentivised to research what other people are researching (the better to have a lively conversation!), so we do not need additional incentives against exploration.

The value distribution of research fruits is fat-tailed (citation needed). Strategies that are optimal for sampling normal distributions are unlikely to be optimal for fat tails. Sampling for outliers means that you should rely more on theoretical arguments, variability, and exploration, because you can't get good data on the outliers--the only data that matters. If you insist on being legible and scientific, so you optimise your strategy based on the empirical data you can collect, you're being fooled into mediocristan again.

Lemme cite a paper in network epistemology so I can fake looking like I know what I'm talking about,

“However, pure populations of mavericks, who try to avoid research approaches that have already been taken, vastly outperform the other strategies. Finally, we show that, in mixed populations, mavericks stimulate followers to greater levels of epistemic production, making polymorphic populations of mavericks and followers ideal in many research domains.”[1]
-- Epistemic landscapes and the division of cognitive labor

That said, I also advocate against explorers being allowed to say

But I'm virtuously doing high-variance exploration, so I don't need to worry about your rigorous schmigorous epistemology!

Explorers need to be way more epistemologically vigilant than staple researchers pursuing the safety of existing paradigms. If you leave your harbour to sail out into the open waters, that's not a good time to forget your sextant, or pretend you'll be a better navigator without studying the maps that do exist.

  1. ^

    FWIW, I think conclusions from network-epistemological computer simulations are extremely weak evidence about what we as an irl research community should do, and I mainly benefit from them because they occasionally reveal patterns that help with analysing real-life phenomena. The field exists at all--despite their obviously irrelevant "experiments"--because it makes theoretical speculation seem more technical, impressive, professional.

It doesn't matter how many research groups are wildly wrong, as long at least one research group figures out how to build the AI that satisfies our values with friendship and ponies.

Sort of? In your hypothetical there are two ways your research project could go once you believe you've succeeded:

  1. You go and implement it, or

  2. You figure out how to communicate your results to the rest of the industry.

If you go with (1) then it's really important that you get things right, and if you've disconnected yourself from external evaluation I think there's a large chance you haven't. I'd much prefer to see (2), except now you do need to communicate your results in detail so the rest of the world can evaluate and so you didn't gain that much by putting off the communication until the end.

I'll also make a stronger claim, which is that communication improves your research and chances of success: figuring out how to communicate things to people who don't have your shared context makes it a lot clearer which things you actually don't understand yet.

trying to rein in researchers instead of letting them go off and explore highly variable crazy stuff

I'm not sure why you think I'm advocating avoiding high-variability lines of research? I'm saying research groups should make public updates on their progress to stay grounded, not that they should only take low-risk bets.

I edited my original comment to point out my specific disagreements. I'm now going to say a selection of plausibly false-but-interesting things, and there's much more nuance here that I won't explicitly cover because that'd take too long. It's definitely going to seem very wrong at first glance without the nuance that communicates the intended domain.

I feel like I'm in a somewhat similar situation to Leverage, only in the sense that I feel like having to frequently publish would hinder my effectiveness. It would make it easier for others to see the value of my work, but in my own estimation that trades off against maximising actual value.

This isn't generally the case for most research, and I might be delusional (ime 10%) to think it's the case for my own, but I should be following the gradient of what I expect will be the most usefwl. It would be selfish of me to do the legible thing motivated just by my wish for people to respect me.

The thing I'm arguing for is not that people like me shouldn't publish at all, it's that we should be very reluctant to punish gambling sailors for a shortage of signals. They'll get our attention once they can demonstrate their product.

The thing about having to frequently communicate your results is that it incentivises you to adopt research strategies that let you publish frequently. This usually means forward-chaining to incremental progress without much strategic guidance. Plus, if you get into the habit of spending your intrinsic motivation on distilling your progress to the community, now your brain's shifted to searching for ideas that fit into the community, instead of aiming your search to solve the highest-priority confusion points in your own head.

To be an effective explorer, you have to get to the point where you can start to iterate on top of your own ideas. If you timidly "check in" with the community every time you think you have a novel thought, before you let yourself stand on it in order to explore further down the branch, then 1) you're wasting their time, and 2) no one's ever gonna stray far from home.

When you go from—

        A) "huh, I wonder how this thing works, and how it fits into other things I have models of."

—to—

        B) "hmm, the community seems to behave as if Y is true, but I have a suspicion that ¬Y, so I should research it and provide them with information they find valuable."

—then a pattern for generating thoughts will mostly be rewarded based on your prediction about whether the community is likely to be persuaded by those thoughts. This makes it hard to have intrinsic motivation to explore anything that doesn't immediately seem relevant to the community.

And while B is still reasonably aligned with producing value as long as the community is roughly as good at evaluating the claims as you are, it breaks down for researchers who are much better than their expected audience at what they specialise in. If the most competent researchers have brains that optimise for communal persuasiveness, they're wasting their potential when they could be searching for ideas that optimise for persuading themselves--a much harder criterion to meet given that they're more competent.

I think it's unhealthy to–within your own brain–constantly try to "advance the communal frontier". Sure, that could ultimately be the goal, but if you're greedily and myopically only able to optimise for specifically that at every step, then that is like a chess player who's compulsively only able to look for checkmate patterns–unable to see forks that merely win material or positional advantage.

How frequently do you have to make your progress legible to measurable or consensus criteria? How lenient is your legibility loop?

I'm not saying it's easy to even start trying to feel intrinsic motivation for building models in your own mind based on your own criteria for success, but being stuck in a short legibility loop certainly doesn't help.

If you've learned to play an instrument, or studied painting under a mentor, you may have heard the advice "you need to learn to trust in your own sense of aesthetics." Think of the kid who, while learning the piano, expectantly looks to their parent after every key they press. They're not learning to listen. Sort of like a GAN with a discriminator trusted so infrequently that it never learns anything. Training to both generate and discriminate within yourself, using your own observations, will be pretty embarrassing at first, but you're running a much shorter feedback loop.

Maybe we're talking about different timescales here? I definitely think researchers need to be able to make progress without checking in with the community at every step, and most people won't do well to try and publish their progress to a broad group, say, weekly. For a typical researcher in an area with poor natural feedback loops I'd guess the right frequency is something like:

  1. Weekly: high-context peers (internal colleagues / advisor / manager)

  2. Quarterly: medium-context peers (distant internal colleagues / close external colleagues)

  3. Yearly: low-context peers and the general world

(I think there are a lot of advantages to writing for these, including being able to go back later, though there are also big advantages to verbal interaction and discussion.)

I think Leverage was primarily short on (3); from the outside I don't know how much of (2) they were doing and I have the impression they were investing heavily in (1).

Roughly agreed. Although I'd want to distinguish between feedback and legibility-requirement loops. One is optimised for making research progress, the other is optimised for being paid and respected.

When you're talking to your weekly colleagues, you have enough shared context and trust that you can ramble about your incomplete intuitions and say "oops, hang on" multiple times in an exposition. And medium-context peers are essential for sanity-checking. This is more about actually usefwl feedback than about paying a tax on speed to keep yourself legible to low-context funders.

Thank you for chatting with me! ^^

(I'm only trying to talk about feedback here as it relates to research progress, not funding etc.)

Ah, but part of my point is that they're inextricably linked--at least for pre-paradigmatic research that requires creativity and doesn't have cheap empirical-legible measures of progress. Shorter legibility loops put a heavy tax on the speed of progress, at least for the top of the competence distribution. I can't make very general claims here given how different research fields and groups are, but I don't want us to be blind to important considerations.

There are deeper models behind this claim, but one point is that the "legibility loops" you have to obey to receive funding require you to compromise between optimisation criteria, and there are steeper invisible costs there than people realise.

Edit: I mostly retract this comment. I skimmed and didn't read the post carefully (something one should never do before leaving a negative comment) and interpreted it as "Leverage wasn't perfect, but it is worth trying to make Leverage 2.0 work or have similar projects with small changes". On rereading, I see that Jeff's emphasis is more on analyzing and quantifying the failure modes than on salvaging the idea. 

That said, I just want to point out that (at least as far as I understand it), there is a significant collection of people within and around EA who think that Leverage is a uniquely awful organization which suffered a multilevel failure extremely reminiscent of your run-of-the-mill cult (not just for those who left it, but also for many people who are still in it), which soft-core threatens members to avoid negative publicity, exerts psychological control on members in ways that seem scary and evil. This is context that I think some people reading the sterilized publicity around Leverage will lack.

There are many directions from which people could approach Leverage 1.0, but the one that I'm most interested in is lessons for people considering attempting similar things in the future.

I think there's a really clear lesson here: don't.

I'll elaborate: Leverage was a multilevel failure. A fundamentally dishonest and charismatic leader. A group of people very convinced that their particular chain of flimsy inferences led them to some higher truth that gave them advantages over everyone else. A frenzied sense of secrecy and importance. Ultimately, psychological harm and abuse.

It is very clearly a negative example, and if someone is genuinely trying to gain some positive insight into a project from "things they did right" (or noticeably imitate techniques from that project), that would make me significantly less likely to think of them as being on the right track.

There are examples of better "secret projects" - the Manhattan project as well as other high-security government organizations, various secret revolutionary groups like the early US revolutionaries, the abolitionist movement and the underground railroad, even various pro-social masonic orders. Taking as one's go-to example of something to emulate an organization that significantly crossed the line into cult territory (or at least into Aleister Crowley-level grandiosity around a bad actor) would indicate to me a potentially enlarged sense of self-importance, an emphasis on deference and exclusivity ("being on our team") instead of competence and accountability, and a lack of emphasis on appropriate levels of humility and self-regulation.

To be clear, I believe in decoupling and don't think it's wrong to learn from bad actors. But with such a deeply rotten track record, and so many decent organizations that are better than it along all parameters, Leverage is perhaps the clearest example I have heard of in the EA/LW community of a situation where people should just "say oops" and stop looking for ways to gain any value from it (other than as a cautionary tale).

attempting similar things in the future

I intended this a bit more broadly than you seem to have interpreted it; I'm trying to include exploratory research groups in general.

gain any value from it (other than as a cautionary tale)

That is essentially what this post is: looking in detail at one specific way I think things went wrong, and thinking about how to avoid this in the future.

I expect tradeoffs around how much you should prioritize external communication will continue to be a major issue for research groups!

Fair enough. I admit that I skimmed the post quickly, for which I apologize, and part of this was certainly a knee-jerk reaction to even considering Leverage as a serious intellectual project rather than a total failure as such, which is not entirely fair.  But I think maybe a version of this post I would significantly prefer would first explain your interest in Leverage specifically: that while they are a particularly egregious failure of the closed-research genre, it's interesting to understand exactly how they failed and how the idea of a fast, less-than-fully transparent think tank can be salvaged. It does bother me that you don't try to look for other examples of organizations that do some part of this more effectively, and I have trouble believing that they don't exist. It reads a bit like an analysis of nation-building that focuses specifically on the mistakes and complexities of North Korea without trying to compare it to other less awful entities.

worth trying to make Leverage 2.0 work

Note that Leverage 2.0 is a thing, and seems to be taking a very different approach towards the history of science, with regular public write-ups.

It seems like you're misreading Jeff's post. Perhaps deliberately. I will prefer it if people on this forum do this less.

[This comment is no longer endorsed by its author]

Certainly not deliberately. I'll try to read it more carefully and update my comment 

Thanks. I've retracted my comment since I think it's too harsh. <3

Thanks! But I see your point

That said, I just want to point out that (at least as far as I understand it), there is a significant collection of people within and around EA who think that Leverage is a uniquely awful organization which suffered a multilevel failure extremely reminiscent of your run-of-the-mill cult (not just for those who left it, but also for many people who are still in it), which soft-core threatens members to avoid negative publicity and exerts psychological control on members in ways that seem scary and evil. This is context that I think some people reading the sterilized publicity around Leverage will lack.

I can’t comment on whether rumors like this still persist in the EA community, but to the degree that they do, I think there is now a substantial amount of available information that allows for a more nuanced picture of the organization and the people involved.

Two of the best, in my view, are Cathleen’s post and our Inquiry Report. Both posts are quite lengthy, but as you seem passionate about this topic, they may nevertheless be worth reading.

I think it’s fair to say that the majority of people involved in Leverage would strongly disagree with your characterization of the organization. As someone who works at Leverage and was friends with many of the people involved previously, I can say that your characterization strongly mismatches my experience.

Meta note: I think it encourages in-group/out-group dynamics on the Forum when known individuals are identified only by their first names, and at a minimum I would like to see e.g. Geoff, Larissa, and Catherine named in full at least once in this post.

I've intentionally used only first names for everyone in the post, including for individuals who are not well known, to make this post less likely to show up when searching anyone's name.

I just wanted to say thank you for doing this Jeff. I sympathize with Rockwell Schwartz’s general point, but since Cathleen’s post asks that people not use her full name or name her former colleagues I appreciate you taking this seriously.

(For clarity, I don’t mind people using my full name. It’s my forum username and very easily found e.g. on Leverage’s website. But I currently work at Leverage Research and decided to work there knowing full well how some people in EA react when the topic of Leverage comes up. The same is not true of everyone, and I think individuals who have not chosen to be public figures should be allowed to live in peace should they wish to).

That makes sense, and I wasn't familiar with Cathleen's request or the general aims of quasi-anonymity here. I think it is useful to specify that you are intentionally not using full names, because otherwise the likely assumption is that these are people one should know, which contributes to my above concern.

Instead I would specifically look at its output and approach to external engagement: if they're not publishing research I would take that as a strong negative signal for the project. Likewise, in participating in a research project I would want to ensure that we were writing publicly and opening our work to engaged and critical feedback.

I'm curious about why your conclusion is about the importance of public engagement instead of about the importance (and difficulty) of setting up good feedback loops for research.

It seems to me that it is possible to have good feedback loops without good public engagement (e.g., the Manhattan Project) and good public engagement without good feedback loops (e.g., many areas of academic research). But, whereas important research progress seems possible in the former case, it seems all but impossible in the latter case.

I think feedback loops are the important thing, but public engagement is a powerful way to strengthen them, and Leverage seems to have suffered from deprioritizing it.

In the example of the Manhattan Project, they were studying and engineering physical things, which makes it a lot harder to be wrong about whether you're making progress. My understanding is also that they brought a shockingly high fraction of the experts in the field into the project, which might mean you could get some of what you'd normally get from public presentation internally?

The degree to which public presentation is likely to strengthen your feedback loops seems to depend quite a lot on the state of the field that you are investigating. In highly functional fields like those found in modern physics, it seems quite likely to be helpful. In less functional fields or those with fewer relevant researchers, this seems less helpful.

To my mind, one strong consideration in favor of publicly presenting your research if you're working in a less functional field is that even if you're right, causing future researchers to build on your work is extremely difficult. Indeed, promising research avenues that are presented publicly die all the time (e.g., muscle reading or phlogiston; cf. Chang in Is Water H2O?). Presenting your research publicly is the best way to engage with other researchers and ensure that, if you do succeed, a research tradition can be built on top of your work.


Larissa from Leverage Research here. I think there might be an interesting discussion to be had about the relationship between feedback loops, external communication (engaging with your main external audiences), and public communication (trying to communicate ideas to the wider public).

For a lot of the history of scientific development, sharing research, let alone widely distributing it, was expensive and rare. Early discoveries in the history of electricity, for example, were nonetheless still made, often by researchers who shared little until they had a complete theory or a new instrument to display. Often the feedback loops were simply direct engagement with the phenomena themselves. Only in more recent history has it become cheap and easy enough to widely share research that this has become the norm. Similarly, as a couple of people have mentioned in the comments, there are more recent examples of groups that have done great research while having little external engagement: Lockheed Martin and the Manhattan Project being two well-known examples.

This suggests that it is feasible to have feedback loops while doing little external communication of any kind. During Leverage 1.0[1], people relied more on feedback from their own experiences, interactions with teammates' experiences and views, workshops, and coaching.

That said, we do believe (for reasons independent of research feedback loops) that it was a mistake to not do more external communication in the past, which is why this is something Leverage Research has focused on since 2019. More recently, we have also come to think that it is important to try to communicate to the wider public (in ways that can be broadly understood) as opposed to just your core audience or peer group. One reason for this is that if projects are only communicated about, and criticisms only accepted in, the language of the particular group that developed them, it's easy for blindspots to remain until it is too late. (I recommend Glen Weyl's "Why I'm Not A Technocrat" for a more detailed treatment of this topic.)

For anyone interested in some of our other reflections on public engagement, I recommend reading our 2019-2020 annual report or our Experiences Inquiry Report. The former is Leverage Research's first annual report since the re-organization in 2019, and one topic we discuss is our new focus on external engagement. The latter shares findings from our inquiry last year into the experiences of former collaborators during Leverage 1.0. To see our engagement efforts today, I recommend checking out our website, subscribing to our newsletter, or following us on Twitter or Medium.

For those interested in the exploratory psychology research Jeff mentions, we recommend reading our write-up from earlier this year covering our 2017-2019 Intention Research and keeping an eye on our Exploratory Psychology Research Program page. We are currently working on two pieces: one on risks from introspection (we discuss this a bit on Twitter here), and one on Belief Reporting (an introspective tool developed during Leverage 1.0). We're also thinking of sharing a few documents written pre-2019 that relate to introspection techniques. These would perhaps be less accessible for a wider audience unfamiliar with our introspective tools, but may nonetheless be of interest to those who want to dive deeper into our introspective research. All of this will be added to our website when completed.

Finally, I just wanted to thank Jeff for engaging with us in a discussion of his post. Although we disagreed on some things and it ended up being a lengthy discussion, I do feel like I came to understand a bit more of where the disagreement stemmed from, and the post was improved through the process. This seems valuable, so I would like to see that norm encouraged.


  1. ^

    As context, "Leverage 1.0" is the somewhat clumsy term I introduced as shorthand for the decentralized research collaboration between a few organizations from 2011 to 2019 that's commonly referred to as "Leverage," to distinguish it from Leverage Research, the organization since 2019, which looks very different.

Do you have recommendations for the most enlightening concepts, models, or data produced by Leverage that haven't been filtered for public sensibilities? I want to efficiently evaluate whether I wish to dive deeper, and if a write-up is wastefully optimised for professionalism, I take that as a strong signal against the likelihood that there's significant value here.

I'm not interested in whether you produce research that I expect others will expect to be praised for praising me for approving of, I just want to know if it can help me understand stuff.
