"Starting a company is like chewing glass. Eventually, you start to like the taste of your own blood."

Building a new organization is extremely hard. It's hard when you've done it before, even several times. It's even harder the first time.

Some new organizations are very similar to existing organizations. The founders of the new org can go look at all the previous close-by examples, learn from them, copy their playbook, and avoid their mistakes. If your org is shaped like a Y-combinator company, you can spend dozens of hours absorbing high-quality, expert-crafted content which has been tested and tweaked and improved over hundreds of companies and more than a decade. You can do a 15-minute interview to go work next to a bunch of the best people who are also building your type of org, and learn by looking over their shoulder and troubleshooting together. You get to talk to a bunch of people who have actually succeeded at building an org-like-yours.

How likely is org building success, in this premier reference class, rich with prior examples to learn from, with a tried and true playbook, a tight community of founder peers, the advice of many people who have tried to do your kind of thing and won? 

5%. 
https://pitchbook.com/news/articles/y-combinator-accelerator-success-rate-unicorns

An AI safety lab is not the same as a Y-combinator company.  

It is. WAY. FUCKING. HARDER.

The Y-combinator crowd has a special category for orgs which are trying to build something that requires more than ~any minor research breakthrough: HARD tech.

Yet the vast majority of these Hard Tech companies are actually building on top of an academic field which basically has the science figured out. Ginkgo Bioworks did not need to figure out the principles of molecular biology, nor the tools and protocols of genetic engineering. They took a decades-old, well-developed paradigm and worked within it to incrementally build something new.

How does this look for AI safety?

And how about timing? Y-combinator reference class companies take a long time to build. Growing headcount slowly, running lean: absolutely essential if you are stretching out your last funding round over 7 years to iterate your way from a 24-hour livestream TV show of one guy's life to a game streaming company.

Remind me again, what are your timelines?

I could keep going on this for a while. People? Fewer. Funding? Monolithic. Advice from the winners? HA.

Apply these updates to our starting reference class success rate of

ONE. IN. TWENTY.

Now count the AI safety labs. 

Multiply by ~3.  

That is roughly the number of people who are not the subject of this post.

For all the rest of us, consider several criticisms and suggestions, which it was not feasible to run by the subjects of this post before publication:
0. Nobody knows what they are fucking doing when founding and running an AI safety lab and everyone who says they do is lying to you.
1. Nobody has ever seen an organization which has succeeded at this goal.
2. Nobody has ever met the founder of such an organization, nor noted down their qualifications.
3. If the quote at the top of this post doesn't evoke a visceral sense memory for you, consider whether you have an accurate mental picture of what it looks like and feels like to be succeeding at this kind of thing from the inside. Make sure you imagine having fully internalized that FAILURE IS YOUR FAULT and no one else's, and that you are defining success correctly. (I believe it should be "everyone doesn't die" rather than "be highly respected for your organization's contributions" or "avoid horribly embarrassing mistakes".)
4. If that last bit feels awful and stress inducing, I expect that is because it is. Even for, and especially for, the handfuls of people who are not the subjects of this post. So much so that I'm guessing that whatever it is that allows people to say "yes" to that responsibility is the ~only real qualification for adding a one to the number of AI safety labs we counted earlier.
5. You have permission. You do not need approval. You are allowed to do stupid things, have no relevant experience, be an embarrassing mess, and even ~*~fail to respond to criticism~*~
6. Some of us know what it looks like to be chewing glass, and we have tasted our own blood. We know the difference between the continuous desperate dumpster fires and the real mistakes. We will be silently cheering you through the former and grieving with you on the latter. Sometimes we will write you a snarky post under a pseudonym when we really should be sleeping. 

522 companies went through Y-combinator over the last year. Imagine that.

Thank you for reading this love letter to the demeaning occupation of desperately trying. It's addressed to you, if you'd like.

Comments

I am confused about what your claims are, exactly (or what you’re trying to say). 

One interpretation, which makes sense to me, is the following:

“Starting an AI safety lab is really hard and we should have a lot of appreciation for people who are doing it. We should also cut them some more slack when they make mistakes because it is really hard and some of the things they are trying to do have never been done before.” (This isn’t a direct quote)

I really like and appreciate this point. Speaking for me personally, I too often fall into the trap of criticising someone for doing something not perfectly and not 1. Appreciating that they have tried at all and that it was potentially really hard, and 2. Criticising all the people who didn’t do anything and chose the safe route. There is a good post about this: Invisible impact loss (and why we can be too error-averse).

In addition, I think it could be a valid point to say that we should be more understanding if e.g. the research agendas of AIS labs are/were off in the past as this is a problem that no one really knows how to solve and that is just very hard. I don’t really feel qualified to comment on that.  

 

Your post could also be claiming something else:

“We should not criticise / should have a very high bar for criticizing AI safety labs and their founders (especially not if you yourself have not started an AIS lab). They are doing something that no one else has done before, and when they make mistakes, that is way understandable because they don’t have anyone to learn from.” (This isn’t a direct quote)

For instance, you seem to claim that the reference class of people who can advise people working on AI safety is some group whose size is the number of AI safety labs multiplied by 3. (This is what I understand your point to be if I look at the passage that starts with “Some new organizations are very similar to existing organizations. The founders of the new org can go look at all the previous close-by examples, learn from them, copy their playbook, and avoid their mistakes.” and ends in “That is roughly the number of people who are not the subject of this post.”)

If this is what you want to say, I think the message is wrong in important ways. In brief: 

  1. I agree that when people work on hard and important things, we should appreciate them, but I disagree that we should avoid criticism of work like this. Criticism is important precisely when the work matters. Criticism is important when the problems are strange and people are probably making mistakes. 
  2. The strong version of “they’re doing something that no one else has done before … they don’t have anyone to learn from” seems to take a very narrow reference class for a broad set of ways to learn from people. You can learn from people who aren’t doing the exact thing that you’re doing.

 

1. A claim like: “We should not criticise / should have a very high bar for criticizing AI safety labs / their founders (especially not if you yourself have not started an AIS lab).”

As stated above, I think it is important to appreciate people for trying at all, and it’s useful to notice that work not getting done is a loss. That being said, criticism is still useful. People are making mistakes that others can notice. Some organizations are less promising than others, and it’s useful to make those distinctions so that we know which to work in or donate to. 

In a healthy EA/LT/AIS community, I want people to criticise other organisations, even if what they are doing is very hard and has never been done before. E.g. you could make the case that what OP, GiveWell, and ACE are doing has never been done before (although it is slightly unclear to me what exactly “doing something that has never been done before” means), and I don’t think anyone would say that those organisations should be beyond criticism. 

This ties nicely into the second point I think is wrong: 

2. A claim like: “they’re doing something that no one else has done before … they don’t have anyone to learn from”

A quote from your post:

The founders of the new org can go look at all the previous close-by examples, learn from them, copy their playbook, and avoid their mistakes. If your org is shaped like a Y-combinator company, you can spend dozens of hours absorbing high-quality, expert-crafted content which has been tested and tweaked and improved over hundreds of companies and more than a decade. You can do a 15-minute interview to go work next to a bunch of the best people who are also building your type of org, and learn by looking over their shoulder and troubleshooting together. You get to talk to a bunch of people who have actually succeeded at building an org-like-yours. … How does this look for AI safety? … Apply these updates to our starting reference class success rate of ONE. IN. TWENTY. Now count the AI safety labs. Multiply by ~3.


A point I think you’re making:  

“They are doing something that no one else has done before [build a successful AI safety lab], and therefore, if they make mistakes, that is way understandable because they don’t have anyone to learn from.”

It is true that the closer your organisation is to an already existing org/cluster of orgs, the more you will be able to copy. But just because you’re working on something new that no one has worked on (or your work is different in other important aspects), it doesn’t mean that you cannot learn from other organisations, their successes and failures. For things like having a healthy work culture, talent retention, and good governance structures, there are examples in the world that even AIS labs can learn from.

I don’t understand the research side of things well enough to comment on whether/how much AIS labs could learn from e.g. academic research or for-profit research labs working on problems different from AIS. 


 

Hey, sorry I'm in a rush and couldn't read your whole comment. I wanted to jump in anyway to clarify that you're totally right to be confused about what my claims are. I wasn't trying to make claims, really, I was channelling an emotion I had late at night into a post that I felt compelled to hit submit on. Hence: "love letter to the demeaning occupation of desperately trying"

I really value the norms of discourse here, their carefulness, modesty, and earnestness. From the skim of your comment I'm guessing that after a closer read I'd think it was a great example of that, which I appreciate.

I don't expect I'll manage to rewrite this post in the way which makes everything I believe clear (and I'm not sure that would be very valuable for others if I did).

FWIW, I mostly read the core message of this post as: "you should start an AI safety lab. What are you waiting for? ;)".

The post felt to me like debunking reasons people might feel they aren't qualified to start an AI safety lab.

I don't think this was the primary intention though. I feel like I came away with that impression because of the Twitter contexts in which I saw this post referenced.

Seems like academic research groups would be a better reference class than YC companies for most alignment labs.

If they're trying to build an org that scales a lot and is funded by selling products, YC companies are a good reference class, but if they're an org of researchers working somewhat independently or collaborating on hard technical problems, funded by grants, that sounds much more similar to an academic research group.

Unsure how to define success for an academic research group, any ideas? They seem to more often be exploratory and less goal-oriented.

As someone who did recently set up an AI safety lab, success rates have certainly been on my mind. It's certainly challenging, but I think the reference class we're in might be better than it seems at first.

I think a big part of what makes succeeding as a for-profit tech start-up challenging is that so many other talented individuals are chasing the same, good ideas. For every Amazon there are 1000s of failed e-commerce start-ups. Clearly, Amazon did something much better than the competition. But what if Amazon didn't exist? What if there was a company that was a little more expensive, and had longer shipping times? I'd wager that company would still be highly successful.

Far fewer people are working on AI safety. That's a bad thing, but it does at least mean that there's more low-hanging fruit to be picked. I agree with [Adam Binks](https://forum.effectivealtruism.org/posts/PJLx7CwB4mtaDgmFc/critiques-of-non-existent-ai-safety-labs-yours?commentId=eLarcd8no5iKqFaNQ) that academic labs might be a better reference class. But even there, AI safety has had far less attention paid to it than e.g. developing treatments for cancer or unifying quantum mechanics and general relativity.

So overall it's far from clear to me that it's harder to make progress on AI safety than to solve outstanding challenge problems in academia, or to build a $1 bn+ company.

Thanks for writing this. It felt a bit like an AI safety version of Roosevelt's Man in the Arena:

It is not the critic who counts; not the man who points out how the strong man stumbles, or where the doer of deeds could have done them better. The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood; who strives valiantly; who errs, who comes short again and again, because there is no effort without error and shortcoming; but who does actually strive to do the deeds; who knows great enthusiasms, the great devotions; who spends himself in a worthy cause; who at the best knows in the end the triumph of high achievement, and who at the worst, if he fails, at least fails while daring greatly, so that his place shall never be with those cold and timid souls who neither know victory nor defeat.

I'm honestly not sure whether this is an argument in support of AI labs or against?

it's roughly in support of AI labs, particularly scrappier ones.

~65% of Charity Entrepreneurship charities are at least moderately successful, with half of those being very successful. They're probably a closer reference class, being donor-funded organisations run by EAs for impact.

One way in which AI safety labs are different from the reference class of Y-combinator startups is in their impact. Conditioned on the median Forum user's assessment of X-risk from AI, the leader of a major AI safety lab probably has more impact than the median U.S. senator, Fortune 500 CEO, or chief executive of smaller regional or even national governments, etc. Those jobs are hard in their own ways, but we expect and even encourage an extremely high amount of criticism.

I am not suggesting that is the proper reference class for leaders of AI labs that have raised at least $10MM . . . and I don't think it is. But I think the proper scope of criticism is significantly higher than for (e.g.) the median CEO whose company went through Y Combinator.[1] If a startup CEO messes up and their company explodes, the pain is generally going to be concentrated in the company's investors, lenders, and employees . . . a small number of people, each of whom consented to bearing that risk to a significant extent. If I'm not one of those people, my standing to complain about the startup CEO's mistakes is significantly constrained.

In contrast, if an AI safety lab goes off the rails and becomes net-negative, that affects us all (and future generations). Even if the lab is merely ineffective, its existence would have drained fairly scarce resources (potential alignment researchers and EA funding) from others in the field.

I definitely agree that people need to be sensitive to how hard running an AI safety lab is, but also want to affirm that the idea of criticism is legitimate.

 

  1. ^

    To be clear, I don't think Anneal's post suggests that this is the reference class for deciding how much criticism of AI lab leaders is warranted. However, since I didn't see a clear reference class, I thought it was worthwhile to discuss this one.

Fail early, fail often. Many little dooms are good. One big doom is not so good. 
