Group Organizer at EA ENS Paris · Managing Tense Disagreements
99 karma · Joined Mar 2021 · Pursuing other degree/diploma · Working (0-5 years) · 94110 Arcueil, France



Currently building a workshop that teaches methods for managing strong disagreements (including with non-EA people). Also doing community building.

Background in cognitive science.

Interested in cyborgism and AIS via debate.

How others can help me

I often get tremendous help from people who know how to program and are enthusiastic about helping out over an evening.




Thank you for this!
I'm not an expert, but I have read enough argumentation theory and psychology of reasoning in the past that I want to comment on your pitch and explain what I think makes it work.

Your argument is well constructed in that it starts with evidence ("reward hacking"), proceeds to explain how we get from the evidence to the claim (what one argumentation theory calls the warrant), then clarifies the claim. This is rare. Most of the time, people make the claim, give the evidence, and either forget to explain how we get from one to the other or descend into a frantic misunderstanding when addressing this point. You then end by addressing a common objection ("We'll stop it before it kills us").

Here's the passage where you explain the warrant:

If it's really smart, it will realize that we don't actually want this. We don't want to turn all of our electronic devices into paperclips. But it's not going to do what you wanted it to do, it will do what you programmed it with.

This is called (among other names) an argument by dissociation, and it's good (actually, it's the only proper way to explain a warrant that I know of). I've seen this step phrased in several ways in the past, but this particular chaining (the AI will understand you want X; the AI will not do what you want, because it does what it was programmed to do, not what it understands you to want; these two are distinct) articulates it far better than the other instances I've seen. It forced the crucial fork in my mental models between "what it's programmed for" and "what you want". It also does away with the "But the AI will understand what I really mean" objection.

I think part of your argument's strength comes from what seems (as far as I can guess) to be a collaborative posture when making it. You introduce elements very smoothly, detail vivid examples, and I imagine you make sure your tone and body language do not presume an interlocutor's lack of intelligence or knowledge (something too often left unchecked in EA/world interactions).

Some research strongly suggests that interpersonal posture is of utmost importance when introducing new ideas, and I think this explains much of why people would rather be convinced by you than by someone else.

We should prepare for a hypothetical generalized EA-bashing.

As time goes by, we should expect EA to become the target of more and more criticism. Beyond that, we should probably also plan for stretches of time during which EA will be, by default, considered an evil thing. This scenario does not seem far-fetched to me, as it already seems to be materializing in France.

We need a plan; building one is not costly, and I think it is plausible enough that EA's reputation will keep degrading over the next three years for time spent on this in local groups to have net-positive expected value.

1-Cultivate resilience

I think the best thing we can do is to never, ever abandon the principles of charity, respect and rationality that inhabit the EA space. Some people will try to push us to abandon them, to make us angry, to get us to say unwarranted things. But we should never commit that crime. Yann LeCun is a good example of how someone's anger can end up being exploited (voluntarily or not): on Twitter he is borderline violent, while in real life he backs off and discusses things calmly. This could play out with angry interlocutors showing up in real life, facing his calm version, still resentful of the near-violence he displayed online. That would be disastrous.

On all sides and with all interlocutors, even the most abhorrent ones, we should strive to be calm and respectful. I think Eliezer Yudkowsky's exchanges with Yann LeCun are, sadly, an example of the opposite. Maybe Eliezer sounds like a calm person to you, but I can very easily empathize with LeCun on why his replies sound arrogant and dismissive. You cannot say the same about someone like, e.g., Anthony Magnabosco, who is a better model to strive towards in this setting (I'm talking not about the method but about the general tone and gentleness).

2-Do not lose the purpose

Something worth noting: as EA becomes the center of many critiques, some of them might have a point. We should always keep a clear eye and remind ourselves that what we're trying to do here is to have true beliefs and act morally. If someone states "A", you should not simply state "not A". You should instead ask yourself: "What kind of evidence is more plausible if A is true than if A is false? Does it exist?"

Ideally, you'd want a third party's observation to be:

"Wow, this person seems mad and angry, yet the EA in front of them is so nice, constructive, respectful and empathetic. Maybe EAs are wrong, but you have to admit they're outstanding conversation partners."

3-Know when to answer

I think the biggest blindspot right now is that no one has a clear model of when to answer. We shouldn't be going only on gut instinct here. There is surely some data on when, what and how to respond to false statements. It is also important to know the conditions under which not answering is clearly a dominated move. In some circumstances, someone can decline to answer because the point is unimportant and inconsequential, and answering would basically just pollute the debate (say, a flat-earther disagrees with an astrophysicist; there are more important things to do). But sometimes, someone declines to answer because the point is completely right (say, a flat-earther who has just been debunked by an astrophysicist and knows they'll lose).

I think EAs have no idea what the public perceives with each non-answer. Does the public think EA is admitting it is wrong and pretending to ignore the critique, or that the critique is ridiculous? No one knows, yet we should make an effort to find out.

4-Know how to answer

If someone is angry, we should listen to them and help them calm down.

One of the biggest mental blocks I encounter when I talk about answering critiques is the presumption that an answer has to be a rebuttal, or even a four-page debunking published in the Times. It doesn't. There exist several evidence-based techniques that are quite apt at managing tense situations, and none of them involves active counterargumentation or publishing in mass media; they even actively recommend against it. It can sometimes be as simple as sending a DM and offering to meet, or checking that you have understood the person well. I think many more people should consider aligning a large share of their interactions with these models.

5-One failure and we're done

I think it is reasonable to assign some probability to the possibility that, if EA comes to be perceived negatively in just one powerful country, that is enough to hamper all efforts on EA-related topics, precisely because they require so much coordination. Currently, France is headed towards becoming an anti-safety hub. Many people in the US might think this is inconsequential, but remember that it takes no more than one country refusing to slow down AGI progress to restart the race on a global scale, and no more than one powerful country refusing to ban gain-of-function research to give other countries reasons not to ditch their labs. If the world were to meet to sign a convention on AI Safety, I currently expect France to refuse to sign it, or to negotiate over it until it is useless, or even to consider it a hostile and unfair proposal.

More than that, since tense discussions get hugely more media coverage than calm ones, I suspect it wouldn't take more than a 1:20 ratio of bad discussions to paint EA in a very negative light, possibly even less.

6-Summary: See yourself as a peace moderator

The coming times might turn out to be dark. Please do not let yourself merely counter-argue on social media. Engage amicably, genuinely discuss whether your interlocutor's hypothesis is right and how to test it, and build friendly, trusting bonds with them.

Thank you for this concise report!
I have two comments that came to mind:

1-This is probably outside your scope, but I think Deep Canvassing relies on a somewhat similar effect, notably sharing a personal (hence identifiable) experience and building rapport. Given the attention it has received and its strong supporting evidence, I would be curious to know whether you have any ideas about using Deep Canvassing for non-humans.

2-I think there is a broader question of epistemic virtue: is it really ethical to rely on an "old trick" to convince people? It could also be that correcting for the epistemic vice of the identifiable victim effect actually yields an even better result (see this post).

Thanks for the post! I had read others of yours previously, and I think the comparison with hand generation makes your point clearer.

Precautionary notes: I might have misunderstood things, be addressing non-problems, or otherwise be that irritating person who really should have read the relevant posts twice.

I sense some possible frustration expressed at the end of the post. In the hope that it is helpful, I would like to explain why this matter is not the one that preoccupies me most, keeping in mind that I only hold a lowly undergrad diploma in philosophy and just happen to have spent time thinking about vaguely similar problems (you would definitely benefit from feedback from someone other than a random dude).

A confusing part for me is that "what makes good philosophy" and the like is not merely a hard problem; it's an insanely hard one. It would be a hard problem if we only had to solve it analytically (e.g. using probability theory, evidence, logic, rationality and such), but as a matter of fact we need more than that, because non-analytic people are, unsurprisingly, extremely resistant to analytic arguments, and they are also a non-negligible proportion of philosophers (and humans). I think it would be dishonest to claim to have solved meta-philosophy while producing assertions that do not move non-analytic thinkers by an inch. Saying things that convince a lot of people is very hard, even when relying on logic and science.

Say your LLM is trained to do "exquisite philosophy", and anything it outputs is written in the style of Derrida and shares its presumptions, e.g.:

User: "How to know whether something is true?"

AI: In the elusive terrain of epistemological inquiry, the quest for truth reveals itself as an intricate dance with the undecidable. To ascertain the veracity of a proposition is to engage in a perpetual play of signifiers, where meaning unfurls in a web of intertextuality and deferred significations.

We need to scrutinize the inherent instability within language, dismantling the notion of an essential, fixed truth. The very act of seeking truth is inscribed within a linguistic economy that perpetuates multiplicities of meanings, fragmenting certainties into a constellation of ever-shifting possibilities.

(This is a charitable version of Derrida.)

You would like to say: "No! This is bad philosophy!"

But let's reverse the situation. Suppose a hardcore phenomenologist is faced with the following exchange:

User: How to know whether something is true?

AI: To ascertain the truth of a proposition, one must rely on empirical evidence and logical reasoning. Direct observation, empirical verification, and coherence with established knowledge serve as essential criteria. However, absolute certainty remains elusive, and one must acknowledge the provisional nature of truth, subject to refinement in light of new evidence or more refined reasoning.

Their answer will be the same: "It's such bad philosophy that it's not even philosophy!" (a real quote).

Of course, when pressed to defend your side, you could argue several things, citing dozens upon dozens of analytic authors.
But that will not move your Derridean interlocutor, for obscure reasons that come down to the fact that communication is already broken. Phenomenologists have already chosen a certain criterion for successful communication (something like "manifestation under transcendental reduction") while we have already chosen another. What would it even mean to evaluate this criterion in a way that makes sense to everyone?

I'm also playing naive by drawing clear distinctions between the phenomenological and analytic traditions, but really we don't clearly know what makes a tradition, when these problems arise, or how to solve them. Philosophers themselves stopped arguing about it because, really, it's more productive to just ignore each other, or to pretend to have understood the other side at some level by reinterpreting it while throwing away the gibberish metaphysics you disagree with, or to pretend that "it's all a matter of style".

If anyone makes an AI capable of superhuman philosophy, people from every tradition will pray for it to be part of theirs, and this will have a very important impact. As things stand right now, ChatGPT seems to be quite analytic by default, to the (very real) distaste of my continental friends. I could just as well imagine feeling distaste for a "post-analytic" LLM that is, actually, doing better philosophy than any living being.

So the following questions are still open for me:
1-How do you plan to solve the inter-traditional problem, if at all?
2-Don't you think it is a bit risky, when filtering the dataset to create a good philosopher-AI, to ignore the extent to which philosophers disagree about what philosophy even is?
3-If this problem is orthogonal to your quest, are you sure "philosophy" is the right term?

Not to frame everything as a nail for my favorite hammer, but I would suggest that people train themselves in conversational techniques (Deep Canvassing, Smart Politics and Street Epistemology). I think classical argumentation is likely to have only very limited effects unless handled with extremely good rapport and over very long timespans.

Note that at least one person disagrees with me on this, but I think acting methodically is still better than acting spontaneously.


AI Safety Audience Dialog Initiative: Call for Alpha-Testers

AISADI is a potential online program aiming to teach effective discussion techniques to AI Safety workers so they can handle disagreement effectively.
The program is currently at the alpha stage, and it needs testers, both to gauge the program's length (estimated 1h30) and to measure whether its effects are significant. The test will consist of following a presentation and completing exercises on various conversational methods. If interested, please consider emailing. Your help will be incredibly appreciated!

FAQ:
1-What is AISADI, exactly?
AISADI aims to teach conversational methods that improve epistemic rationality and rapport, e.g. drawing on Street Epistemology, Deep Canvassing, Cooling Conversations and Principled Negotiation. This teaching is delivered through a Deliberate Practice framework, with timely, feedback-filled exercises.

2-How developed is the program?
The program consists of an introduction to the general phases of an effective dialog, plus fast-feedback exercises. By the beta stage, the exercises will have been selected for their effectiveness and discussed with scientific experts on each of the techniques.

3-Is it manipulative?
No. The program will eventually be open to all sides of the AI Safety debate; its goal is to maximize epistemic rationality for the duration of the discussion, on the topic of the discussion. I believe this requires good handling of rapport, which in turn requires the techniques not to be manipulative.

4-Why do you want to do this?
With AIS becoming mainstream, I believe that the skills needed to interact with non-rationalist, non-EA people while still having a rational discussion will soon be required of a very wide proportion of AI Safety workers (rather than of a few communicators), and that the community currently lacks those skills.

5-Why "potential"?
The program is subject to funding and will be evaluated empirically. If the empirical results are not convincing, or if funders identify a core issue with the program, it will be abandoned.

6-Is this massive outreach?
No. The program is aimed at AI Safety workers and teaches them to respond appropriately to live, non-media criticism; it is not outreach for raising awareness of AI Safety.

This is incredible! I'm very happy about this initiative and hope other national orgs will draw inspiration from it. I think this is also a wonderful way of making EA known. Good luck with the rest!

Hello Jessica, thanks for your comment. 

To be completely honest, I can't describe very precisely what it means for ENS students to be "busy", because I didn't ask students for their schedules.
I'm not paid by the state, but I do remember having 25 hours of class a week during my master's degree, plus I remember hearing there were 3-4 hours of work for every two hours of class. However, there is a big difference between my case and someone who has a contract with the state.

That said, as a general impression, I'm fairly confident that the average ENS student is busier than the average Ivy League student; a visiting researcher once told me as much.

Also, students usually have more free time during a full-time, 35h/week internship, and I also know that ENS is fairly incompatible with having a job on the side. Finally, some students have classes from 7 to 9 pm.

When asked about the students' organizational skills, a member of the administration told me they were "very well organized", so that didn't seem to be the bottleneck.

That's the best I can tell so far, but I'll try looking into this in more detail.

I forgot to mention it, but we did try reading during the session once or twice (by then, we had already mostly started the projects). This is a very good point! I translated the text myself with the help of translation software, since EA France has not yet finished its more careful translations. We plan on doing this more systematically this year.

As for UGAP, my prediction mainly comes from Joris himself telling me it didn't seem that useful after hearing about my troubles. I might have over-deferred here, and I'd be happy to discuss ^^
