


Note that saying "this isn't my intention" doesn't prevent net negative effects of a theory of change from applying. Otherwise, doing good would be a lot easier. 

I also highly recommend clarifying what exactly you're criticizing, i.e. the philosophy, the movement norms or some institutions that are core to the movement. 

Finally, I usually find criticism from people who are a) at the core of the movement and b) highly truth-seeking to be the most relevant for improving the movement, so if you're trying to improve the movement, you may want to focus on these people. Relevant criticisms from outside the movement do exist, but they usually lack context and thus fail to address some key trade-offs that the movement cares about. 

Here's a small list of people I would be excited to hear from on EA flaws and their recommendations for change: 

  • Rob Bensinger 
  • Eli Lifland 
  • Ozzie Gooen 
  • Nuno Sempere
  • Oliver Habryka


Thanks for publishing this; I also had a draft on it lying around somewhere!

Answer by simeon_c · Jan 02, 2023 · 31

I work every day from about 9:30am to 1am, with about 3h off on average and a 30 min walk which helps me brainstorm. Technically this is ~12*7=84h. The main reasons are that 1) I want us not to die and 2) I think that there are increasing marginal returns on working hours in a lot of situations, mostly because in a lot of domains the winner takes all even if they're only 10% better than the others, and because you accumulate more expertise and knowledge in a single person, which gives access to increasingly rare skills.

Of that, I would say that I lose about 15h to unproductive or barely productive work (e.g. Twitter or working on stuff which is not on my to-do list). I also spend about 10 to 15h a week in calls.
The rest of it (from 50 to 60h) is productive work.
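
As a rough sanity check of the arithmetic above, here is a minimal Python sketch. The variable and function names are illustrative, and it assumes that the 30 min walk is subtracted along with the 3h off to reach the ~12h/day figure.

```python
# Rough check of the weekly-hours estimate quoted in the answer above.
# All names are illustrative; the figures come from the text.

SPAN_HOURS = 15.5   # 9:30am to 1am
OFF_HOURS = 3.0     # "about 3h off on average"
WALK_HOURS = 0.5    # assumption: the walk is not counted as work time


def weekly_hours(span: float, off: float, walk: float, days: int = 7) -> float:
    """Hours worked per week given a daily span and daily breaks."""
    return (span - off - walk) * days


weekly = weekly_hours(SPAN_HOURS, OFF_HOURS, WALK_HOURS)

UNPRODUCTIVE = 15.0                  # "about 15h" lost per week
CALLS_LOW, CALLS_HIGH = 10.0, 15.0   # "10 to 15h a week in calls"

productive_low = weekly - UNPRODUCTIVE - CALLS_HIGH
productive_high = weekly - UNPRODUCTIVE - CALLS_LOW

print(f"weekly total: {weekly:.0f}h")
print(f"productive range: {productive_low:.0f}h to {productive_high:.0f}h")
```

Under these assumptions the remainder falls within the "50 to 60h" range stated above.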

My productivity (& thus productive work time) has increased hugely over the past 3 months (the first three months in which I have been able to fully decide how to allocate my time). The total number of hours I work has increased a bit (around +1h/day) in the last 3 months, mostly thanks to optimizations of my sleep & schedule.

Can you share the passive time tracking tools you're using?

"Nobody cared about" LLMs is certainly not true - I'm pretty sure the relevant people watched them closely.

What do you mean by "the relevant people"? I would love for us to talk about specifics here and operationalize what we mean. I'm pretty sure E. Macron hasn't thought deeply about AGI (i.e. has never thought for more than 1h about timelines), and I'm at 50% that if he had any deep understanding of what changes it will bring, he would already be racing. Likewise for Israel, a country with a strong track record of becoming a leader in technologies that are crucial for defense. 


That many people aren't concerned about AGI or doubting its feasibility by now only means that THOSE people will not pursue it, and any public discussion will probably not change their minds. 

I think that here you wrongly assume that people have even understood what the implications of AGI are, and that they can't update at all once the first systems start being deployed. What you say could be true if you think that most of your arguments hold because of ChatGPT. I think it's quite plausible that since ChatGPT, and probably even more so in 2023, there will be deployments that make almost everyone who matters aware of AGI. I don't yet have a good sense of how policymakers have updated.


Already there are many alarming posts and articles out there, as well as books like Stuart Russell's "Human Compatible" (which I think is very good and helpful), so keeping the lid on the possibility of AGI and its profound impacts is way too late 

Yeah, this part makes me realize that a lot of the debate should happen on specifics rather than at a high level as we're doing here. So chatting about your book in particular will be helpful for that. 

I'm currently in the process of translating it to English so I can do just that. I'll send you a link as soon as I'm finished. I'll also invite everyone else in the AI safety community (I'm probably going to post an invite on LessWrong).

Great! Thanks for doing that!


while discussing the great potential of AGI for humanity should not.

FYI, I don't think that's true. 



Regarding our whole discussion, I realized I didn't mention a fairly important argument: a major failure mode, specifically regarding risks, is the following reaction from ~any country: "Omg, China is developing bad AGIs, so let's develop safe AGIs first!".

This can happen in two ways: 

  • Misuse as the mainline scenario that people are envisioning. Basically, if you're mostly concerned about misuse, racing to be the first to have the AGI makes sense. And because misuse is way easier to understand than accidental risk, I expect this to be ~the default. 
  • Overestimating one's competence. Even if you believe in accidental X-risks from AGI, you could still race because you think you're better than the others, and that could increase the chances of X-risk. 



Thanks a lot for engaging with my arguments. I still think that you're substantially overconfident about the positive aspects of communicating AGI X-risks to the general public, but I appreciate that you took the time to consider and respond to my arguments. 

Hey Misha! Thanks for the comment!

I am quite confused about what probabilities here mean, especially with prescriptive sentences like "Build the AI safety community in China" and "Beware of large-scale coordination efforts."

As I wrote in note 2, what I'm claiming here is that each statement is more likely to be true under these timelines than under the other timelines. But how could I make it clearer without bothering too much? Maybe by putting note 2 under the table in italics?

I also disagree with the "vibes" of probability assignment to a bunch of these, and the lack of clarity on what these probabilities entail makes it hard to verbalize these.

I see. I hesitated over the trade-off between (1) "put no probabilities" and (2) "put vague probabilities", because I feel that the second gives a lot more signal on how confident I am in what I say and allows people to disagree more fruitfully, but at the same time it gives a "seriousness" signal which is not good when the predictions are not actual predictions.

Do you think that putting no probabilities would have been better? 


By "I also disagree with the vibes of probability assignment to a bunch of these", do you mean that it seems over/underconfident in a bunch of ways when you try to do a similar exercise? 

Haha, you probably don't realize it, but "you" is actually 4 people: Amber Dawn for the first draft of the post, me (Simeon) for the ideas, the table and the structure of the post, and me, Nicole Nohemi & Felicity Riddel for the partial rewriting of the draft to make it clearer.

So the credits are highly distributed! And thanks a lot, it's great to hear that! 

I think that our disagreement comes from what we mean by "regulating and directing it." 

My rough model of what usually happens in national governments (and not the EU, which is a lot more independent from its citizens than the typical national government) is that there are two scenarios: 

  1. Scenario 1, in which national governments regulate or act on something nobody cares about (in particular, not the media). That leaves a lot of degrees of freedom and the possibility of doing fairly ambitious things (cf. Secret Congress).
  2. Scenario 2, in which national governments regulate things that many people care about and that attract attention; then nothing gets done, most measures are fairly weak, etc. In this scenario my rough model is that national governments do the smallest thing that satisfies their electorate + key stakeholders. 


I feel like we're extremely likely to be in scenario 2 regarding AI, and thus that no significant measure will be taken, which is why I put the emphasis on "no strong [positive] effect" on AI safety. So basically I feel like the best you can probably do in national policy is something like "prevent them from doing bad things" (which is really good if it's a big risk) or "do mildly good things". But to me, it's quite unlikely that we go from a world where we die to a world where we don't die thanks to a theory of change which is focused on national policy. 

The EU AI Act is a bit different in that, as I said above, the EU is much less tied to the daily worries of citizens and thus operates under fewer constraints. So I think that it's indeed plausible that the EU does something ambitious on GPAIS, but unfortunately I think it's unlikely that the US will replicate something locally, and the EU compliance mechanisms are not super likely to cut the worst risks for UK and US companies.

Regulating the training of these models is different and harder, but even that seems plausible to me at some point

I think that it's plausible but not likely, and given that it would be the intervention that would cut the most risks, I tend to prefer corporate governance which seems significantly more tractable and neglected to me. 


Out of curiosity, could you refer to a specific event you'd expect to see "if we get closer to substantial leaps in capabilities"? I think that it's a useful exercise to disagree fruitfully on timelines and I'd be happy to bet on some events if we disagree on one.

Thanks for your comment!

A couple of remarks: 

  1. Regulations that cut X-risks are strong regulations: My sense is that regulations that really cut X-risks at least a bit are pretty "strong", i.e. in the reference class of "Constrain labs to airgap and box their SOTA models while they train them" or "Any model which is trained must be trained following these rules/applying these tests". So what matters in terms of regulation is "will governments take such actions?" and my best guess is no, at least not without public opinion caring a lot about it. Do you have in mind an example of regulation which is a) useful and b) softer than that? 
  2. Additional state funding would be bad by default: I think that the default of "more funding goes towards AGI" is that it accelerates capabilities more (e.g. backing some labs so that they move faster than China, things like that), which is why I'm not super excited about increasing the amount of funding governments put into AGI. But to the extent that there WILL be some funding, it's nice to steer it towards safety research. 


And finally, I like the example you gave on cyber. The point I was making was something like "Your theories of change for pre-2030 timelines shouldn't rely too much on national government policy" and my understanding of what you're saying is something like "that may be right, but national governments are still likely to have a lot of (bad by default) influence, so we should care about them". 

I basically had in mind this kind of scenario where states don't do the research themselves but are backing some private labs to accelerate their own capabilities, and it makes me more worried about encouraging states to think about AGI. But I don't put that much weight on these scenarios yet.

How confident are you that governments will get involved in meaningful private-public collaboration around AGI by 2030? A way of operationalizing that could be "A national government spends more than $1 billion in a single year on a collaboration with a lab with the goal of accelerating research on AGI". 

If you believe that it's >50%, that would definitely update me towards "we should still invest a significant share of our resources in national policy, at least in the UK and the US, so that they don't make really bad moves". 
