I'm a PhD student at the Center for Human-Compatible AI (CHAI) at UC Berkeley. I edit and publish the Alignment Newsletter, a weekly publication with recent content relevant to AI alignment. In the past, I ran the EA UC Berkeley and EA at the University of Washington groups.


Coherence arguments imply a force for goal-directed behavior

I respond here; TL;DR is that I meant a different thing than the thing Katja is responding to.

Layman’s Summary of Resolving Pascallian Decision Problems with Stochastic Dominance

In a working paper, Christian Tarsney comes up with a clever resolution to this conflict

Fwiw, I was expecting that the "resolution" would be an argument for why you shouldn't take the wager.

If you do consider it a resolution: if Alice said she would torture a googol people if you didn't give her $5, would you give her the $5? (And if so, would you keep doing it if she kept upping the price, after you had already paid it?)

Is this a good way to bet on short timelines?

Counterfactuals are hard. I wouldn't be committing to donate it. (Also, if I were going to donate it, but it would have been donated anyway, then $4,000 no longer seems worth it if we ignore the other benefits.)

I expect at least one of us to update at least slightly.

I agree with "at least slightly".

I'd be interested to know why you disagree

Idk, empirically when I discuss things with people whose beliefs are sufficiently different from mine, it doesn't seem like their behavior changes much afterwards, even if they say they updated towards X. Similarly, when people talk to me, I often don't see myself making any particular changes to how I think or behave. There's definitely change over the course of a year, but it feels extremely difficult to ascribe that to particular things, and I think it more often comes from reading things that people wrote, rather than talking to them.

Is this a good way to bet on short timelines?

I'm happy to sell an hour of my time towards something with no impact at $1,000, so that puts an upper bound of $4,000. (Though currently I've overcommitted myself, so for the next month or two it might be  ~2x higher.)

That being said, I do think it's valuable for people working on AI safety to at least understand each other's positions; if you don't think you can do that re: my position, I'd probably be willing to have that conversation without being paid at all (after the next month or two). And I do expect to understand your position better, though I don't expect to update towards it, so that's another benefit.

Is this a good way to bet on short timelines?

I'm pretty sure I have longer timelines than you. On each of the bets:

  1. I would take this, but also I like to think if I did update towards your position I would say that anyway (and I would say that you got it right earlier if you asked me to do so, to the extent that I thought you got it right for the right reasons or something).
  2. I probably wouldn't take this (unless X was quite high), because I don't really expect either of us to update to the other's position.
  3. I wouldn't take this; I am very pessimistic about my ability to do research that I'm not inside-view excited about (like, my 50% confidence interval is that I'd have 10-100x less impact even in the case where someone with the same timelines as me is choosing the project, if they don't agree with me on research priorities). It isn't necessary that someone with shorter timelines than me would choose projects I'm not excited about, but from what I know about what you care about working on, I think it would be the case here. Similarly, I am pessimistic about your ability to do research on broad topics that I choose on my inside view. (This isn't specific to you; it applies to anyone who doesn't share most of my views.)
Avoiding Munich's Mistakes: Advice for CEA and Local Groups

Yeah, I think I agree with everything you're saying. I think we were probably thinking of different aspects of the situation -- I'm imagining the sorts of crusades that were given as examples in the OP (for which a good faith assumption seems straightforwardly wrong, and a bad faith assumption seems straightforwardly correct), whereas you're imagining other situations like a university withdrawing affiliation (where it seems far more murky and hard to label as good or bad faith).

Also, I realize this wasn't clear before, but I emphatically don't think that making threats is necessarily immoral or even bad; it depends on the context (as you've been elucidating).

Avoiding Munich's Mistakes: Advice for CEA and Local Groups

I agree with parts of this and disagree with other parts.

First off:

First, if she is acting in good faith, pre-committing to refuse any compromise for 'do not give in to bullying' reasons means one always ends up at ones respective BATNAs even if there was mutually beneficial compromises to be struck.

Definitely agree that pre-committing seems like a bad idea (as you could probably guess from my previous comment).

Second, wrongly presuming bad faith for Alice seems apt to induce her to make a symmetrical mistake presuming bad faith for you. To Alice, malice explains well why you were unwilling to even contemplate compromise, why you considered yourself obliged out of principle  to persist with actions that harm her interests, and why you call her desire to combat misogyny bullying and blackmail.

I agree with this in the abstract, but for the specifics of this particular case, do you in fact think that online mobs / cancel culture / groups who show up to protest your event without warning should be engaged with on a good faith assumption? I struggle to imagine any of these groups accepting anything other than full concession to their demands, such that you're stuck with the BATNA regardless.

(I definitely agree that if someone emails you saying "I think this speaker is bad and you shouldn't invite him", and after some discussion they say "I'm sorry but I can't agree with you and if you go through with this event I will protest / criticize you / have the university withdraw affiliation", you should not treat this as a bad faith attack. Afaik this was not the case with EA Munich, though I don't know the details.)


Re: the first five paragraphs: I feel like this is disagreeing on how to use the word "bully" or "threat", rather than anything super important. I'll just make one note:

Alice is still not a bully even if her motivating beliefs re. Bob are both completely mistaken and unreasonable. She's also still not a bully even if Alice's implied second-order norms are wrong (e.g. maybe the public square would be better off if people didn't stridently object to hosting speakers based on their supposed views on topics they are not speaking upon, etc.)

I'd agree with this if you could reasonably expect to convince Alice that she's wrong on these counts, such that she then stops doing things like

(e.g.) protest this event, stridently criticise the group in the student paper for hosting him, petition the university to withdraw affiliation

But otherwise, given that she's taking actions that destroy value for Bob without generating value for Alice (except via their impact on Bob's actions), I think it is fine to think of this as a threat. (I am less attached to the bully metaphor -- I meant that as an example of a threat.)

Avoiding Munich's Mistakes: Advice for CEA and Local Groups

Yeah, I'm aware that is the emotional response (I feel it too), and I agree the game theoretic reason for not giving in to threats is important. However, it's certainly not a theorem of game theory that you always do better if you don't give in to threats, and sometimes giving in will be the right decision.

we will find you and we will make sure it was not worth it for you, at the cost of our own resources

This is often not an option. (It seems pretty hard to retaliate against an online mob, though I suppose you could randomly select particular members to retaliate against.)

Another good example is bullying. A child has ~no resources to speak of, and bullies will threaten to hurt them unless they do X. Would you really advise this child not to give in to the bully?

(Assume for the sake of the hypothetical the child has already tried to get adults involved and it has done ~nothing, as I am told is in fact often the case. No, the child can't coordinate with other children to fight the bully, because children are not that good at coordinating.)

Avoiding Munich's Mistakes: Advice for CEA and Local Groups

It seems like you believe that one's decision of whether or not to disinvite a speaker should depend only on one's beliefs about the speaker's character, intellectual merits, etc. and in particular not on how other people would react.

Suppose that you receive a credible threat that if you let already-invited person X speak at your event, then multiple bombs would be set off, killing hundreds of people. Can we agree that in that situation it is correct to cancel the event?

If so, then it seems like at least in extreme cases, you agree that the decision of whether or not to hold an event can depend on how other people react. I don't see why you seem to assume that in the EA Munich case, the consequences are not bad enough that EA Munich's decision is reasonable.

Some plausible (though not probable) consequences of hosting the talk:

  • Protests disrupting the event (this has previously happened to a local EA group)
  • Organizers themselves get cancelled
  • Most members of the club leave due to risk of the above or disagreements with the club's priorities

At least the first two seem quite bad, there's room for debate on the third.

In addition, while I agree that the extremes of cancel culture are in fact very harmful for EA, it's hard to argue that disinviting a speaker is anywhere near the level of any of the examples you give. Notably, they are not calling for a mob to e.g. remove Robin Hanson from his post, they are simply cancelling one particular talk that he was going to give at their venue. This definitely does have a negative impact on norms, but it doesn't seem obvious to me that the impact is very large.

Separately, I think it is also reasonable for a random person to come to believe that Robin Hanson is not arguing in good faith.

(Note: I'm still undecided on whether or not the decision itself was good or not.)

Getting money out of politics and into charity

I'm super excited that you're doing this! It's something I've wanted to exist for a long time, and I considered doing it myself a few years ago. It definitely seems like the legal issues are the biggest hurdle. Perhaps I'm being naively optimistic, but I was at least somewhat hopeful that you could get the political parties to not hate you, by phrasing it as "we're taking away money from the other party".

I'm happy to chat about implementation details, unfortunately I'm pretty busy and can't actually commit enough time to help with, you know, actual implementation. Also unfortunately, it seems I have a similar background to you, and so wouldn't really complement your knowledge very well.

If I were to donate to politics (which I could see happening), I would very likely use this service if it existed.

Load More