One thing that bugged me when I first got involved with EA was the extent to which the community seemed hesitant to spend lots of money on stuff like retreats, student groups, dinners, compensation, etc. despite the cost-benefit analysis seeming to favor doing so pretty strongly. I know that, from my perspective, I felt like this was some evidence that many EAs didn't take their stated ideals as seriously as I had hoped—e.g. that many people might just be trying to act in the way that they think an altruistic person should rather than really carefully thinking…
My anecdotal experience hiring is that I get many more prospective candidates saying something like "if this is so important why isn't your salary way above market rates?" than "if you really care about impact, why are you offering so much money?" (Though both sometimes happen.)
Precisely. Also, the frugality of past EA creates a selection effect, so probably there is a larger fraction of anti-frugal people outside the community (and among people who might be interested) than we would expect from looking inside it.
Great point! I think each spending strategy has its pitfalls related to signalling.
I think this correlates somewhat with people's knowledge of and engagement with economics, and with political lean. "Frugal altruism" will probably attract more left-leaning people, while "spending altruism" will probably attract more right-leaning people.
I agree that it’s possible to be unthinkingly frugal. It’s also possible to be unthinkingly spendy. Both seem bad, because they are unthinking. A solution would be to encourage EA groups to practice good thinking together, and to showcase careful thinking on these topics.
I like the idea of having early EA intro materials and university groups that teach BOTECs, cost-benefit analysis, and grappling carefully with spending decisions.
This kind of training, however, trades off against time spent learning about e.g. AI safety and biosecurity.
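For concreteness, here's a minimal sketch of the kind of BOTEC I have in mind, for a hypothetical retreat; every number and variable name below is made up purely for illustration, not taken from any real group's budget:

```python
# Minimal BOTEC sketch for a hypothetical retreat (all numbers are made up).
attendees = 30
cost_per_attendee = 500          # venue, food, travel (USD)
total_cost = attendees * cost_per_attendee

p_counterfactual_shift = 0.05    # chance an attendee meaningfully changes plans
value_of_shift = 50_000          # rough value of one such change (USD-equivalent)
expected_benefit = attendees * p_counterfactual_shift * value_of_shift

print(f"Cost: ${total_cost:,}")
print(f"Expected benefit: ${expected_benefit:,.0f}")
print(f"Benefit/cost ratio: {expected_benefit / total_cost:.1f}")
```

The point isn't the particular numbers; it's that writing the estimate down makes the load-bearing assumptions easy to see and argue about.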
Academic projects are definitely the sort of thing we fund all the time. I don't know if the sort of research you're doing is longtermist-related, but if you have an explanation of why you think your research would be valuable from a longtermist perspective, we'd love to hear it.
Since it was brought up to me, I also want to clarify that EA Funds can fund essentially anyone, including:
I'm one of the grant evaluators for the LTFF and I don't think I would have any qualms with funding a project 6-12 months in advance.
To be clear, I agree with a lot of the points that you're making—the point of sketching out that model was just to show the sort of thing I'm doing; I wasn't actually trying to argue for a specific conclusion. The actual correct strategy for figuring out the right policy here, in my opinion, is to carefully weigh all the different considerations like the ones you're mentioning, which—at the risk of crossing object and meta levels—I suspect to be difficult to do in a low-bandwidth online setting like this.
Maybe it'll still be helpful to just give my take us…
I think you're imagining that I'm doing something much more exotic here than I am. I'm basically just advocating for cooperating on what I see as a prisoner's-dilemma-style game (I'm sure you can also cast it as a stag hunt or make some really complex game-theoretic model to capture all the nuances—I'm not trying to do that there; my point here is just to explain the sort of thing that I'm doing).
Consider:
A and B can each choose:
And they each have utility function…
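Since the specifics are cut off above, here is the generic prisoner's-dilemma payoff structure as an illustration of the kind of game being gestured at (illustrative payoffs, not the actual utility functions from the truncated model):

$$
\begin{array}{c|cc}
 & B \text{ cooperates} & B \text{ defects} \\ \hline
A \text{ cooperates} & (2,\,2) & (0,\,3) \\
A \text{ defects} & (3,\,0) & (1,\,1)
\end{array}
$$

Defecting dominates for each player in isolation, yet mutual cooperation (2, 2) beats mutual defection (1, 1), which is why pre-commitments or norms to cooperate can leave everyone better off.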
(It seems that you're switching the topic from what exactly your policy is, which I'm still unclear on, to the model/motivation underlying your policy. That perhaps makes sense, since if I understood your model/motivation better I could perhaps regenerate the policy myself.)
I think I may just outright disagree with your model here, since it seems that you're not taking into account the significant positive externalities that a public argument can generate for the audience (in the form of more accurate beliefs about the organizations involved and EA topics i…
For example, would you really not have thought worse of MIRI (the Singularity Institute at the time) if it had labeled Holden Karnofsky's public criticism "hostile" and refused to respond to it, on the grounds that its time could be better spent elsewhere?
To be clear, I think that ACE calling the OP “hostile” is a pretty reasonable thing to judge them for. My objection is only to judging them for the part where they don't want to respond any further. So as for the example, I definitely would have thought worse of MIRI if they had labeled Holden's criticisms as “hostile”…
I'm still pretty unclear about your policy. Why is ACE calling the OP "hostile" not considered "meta-level" and hence not updateable (according to your policy)? What if the org in question gave a more reasonable explanation of why they're not responding, but didn't address the object-level criticism? Would you count that in their favor, compared to total silence, or compared to an unreasonable explanation? Are you making any subjective judgments here as to what to update on and what not to, or is there a mechanical policy you can write down (that anyone can follow…
That's a great point; I agree with that.
I disagree, obviously, though I suspect that little will be gained by hashing it out further here. To be clear, I have certainly thought about this sort of issue in great detail as well.
I would be curious to read more about your approach, perhaps in another venue. Some questions I have:
It clearly is actual, boring, normal, Bayesian evidence that they don't have a good response. It's not overwhelming evidence, but someone declining to respond sure is screening off the worlds where they had a great, low-inferential-distance reply that was cheap to shoot off and addressed all the concerns. Of course I am going to update on that.
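To spell out the structure of that update in odds form (this is just standard Bayes, nothing specific to this case):

$$
\frac{P(\text{good reply exists}\mid\text{silence})}{P(\text{no good reply}\mid\text{silence})}
= \frac{P(\text{silence}\mid\text{good reply exists})}{P(\text{silence}\mid\text{no good reply})}
\times \frac{P(\text{good reply exists})}{P(\text{no good reply})}
$$

Silence is more likely in worlds where no good reply exists, so the middle likelihood ratio is below 1 and the posterior odds on "they had a good reply" go down; how large the update is depends on how often orgs with good replies stay silent anyway (time costs, policies like the one being proposed here, etc.).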
I think that you need to be quite careful with this sort of naive-CDT-style reasoning. Pre-commitments/norms against updating on certain types of evidence can be quite valuable—it is just not the case that you should…
To be clear, I think it's perfectly reasonable for you to want ACE to respond if you expect that information to be valuable. The question is what you do when they don't respond. The response in that situation that I'm advocating for is something like “they chose not to respond, so I'll stick with my previous best guess” rather than “they chose not to respond, therefore that says bad things about them, so I'll update negatively.” I think that the latter response is not only corrosive in terms of pushing all discussion into the public sphere even when that makes it much worse, but it also hurts people's ability to feel comfortably holding onto non-public information.
“they chose not to respond, therefore that says bad things about them, so I'll update negatively.” I think that the latter response is not only corrosive in terms of pushing all discussion into the public sphere even when that makes it much worse, but it also hurts people's ability to feel comfortably holding onto non-public information.
This feels wrong from two perspectives:
Yeah, I downvoted because it called the communication hostile without any justification for that claim. The comment it is replying to doesn't seem at all hostile to me, and asserting that it is feels like it's violating some pretty important norms about not escalating conflict and engaging with people charitably.
Yeah—I mostly agree with this.
I think it's pretty important for people to make themselves available for communication.
Are you sure that they're not available for communication? I know approximately nothing about ACE, but I'd be surprised if they wouldn't be willing to talk to you after e.g. sending them an email.
Yeah, I am really not sure. I will consider sending them an email. My guess is they are not interested in talking to me in a way that would later on allow me to write up what they said publicly, which would reduce the value of their response quite drastically to me. If they are happy to chat and allow me to write things up, then I might be able to make the time, but…
I also think there's a strong tendency for goalpost-moving with this sort of objection—are you sure that, if they had said more things along those lines, you wouldn't still have objected?
I think I would still have found it pretty sad for them not to respond, because I really do care about our public discourse and this issue feels important to me, but I would feel substantially less bad about it, and probably would only have mild-downvoted the comment instead of strong-downvoting it.
What I have a problem with is the notion that we should…
Why was this response downvoted so heavily? (This is not a rhetorical question—I'm genuinely curious what the specific reasons were.)
As Jakub has mentioned above, we have reviewed the points in his comment and fully support Anima International’s wish to share their perspective in this thread. However, Anima’s description of the events above does not align with our understanding of the events that took place, primarily within points 1, 5, and 6.
This is relevant, useful information.
The most time-consuming part of our commitment to Representation, Equity…
I didn't downvote (because, as you say, it's providing relevant information), but I did have a negative reaction to the comment. I think the generator of that negative reaction is roughly: the vibe of the comment seems more like a political attempt to close down the conversation than an attempt to cooperatively engage. I'm reminded of "missing moods"; it seems like there's a legitimate position of "it would be great to have time to hash this out, but unfortunately we find it super time-consuming, so we're not going to", but it would naturally come with a…
I downvoted because it called the communication hostile without any justification for that claim. The comment it is replying to doesn't seem at all hostile to me, and asserting that it is feels like it's violating some pretty important norms about not escalating conflict and engaging with people charitably.
I also think I disagree that orgs should never be punished for not wanting to engage in any sort of online discussion. We have shared resources to coordinate, and as a social network without clear boundaries, it is unclear how to make progress on many of the…
I'd personally love to get more Alignment Forum content cross-posted to the EA Forum. Maybe some sort of automatic link-posting? Though that could pollute the EA Forum with a lot of link posts that probably should be organized separately somehow. I'd certainly be willing to start cross-posting my research to the EA Forum if that would be helpful.
Instinctively, I wish that discussion on these posts could all happen on the Alignment Forum, but since who can join is limited, having discussion here as well could be nice.
I don't know whether every single post should be posted here, but it would be nice to at least have occasional posts summarizing the best recent AF content. This might look like just crossposting every new issue of the Alignment Newsletter, which is something I may start doing soon.
Glad you enjoyed it!
So, I think that what you're describing (a model with a pseudo-aligned objective pretending to have the correct objective) is a good description of deceptive alignment specifically. The inner alignment problem is a more general term that encompasses any way in which a model might be running an optimization process for a different objective than the one it was trained on.
In terms of empirical examples, there definitely aren't good ones of deceptive alignment right now, for the reason you mentioned, though whether…
This thread on LessWrong has a bunch of information about precautions that might be worth taking.
Google, by contrast, is notoriously the opposite—for example emphasizing just trying lots of crazy, big, ambitious, expensive bets (e.g. their "10x" philosophy). Also see how Google talked about frugality in 2011.