356 karma · Joined May 2021




I deeply appreciate the degree to which this comment acknowledges issues and provides alternative organizations that may be better in specific respects. It has given me substantial respect for LTFF.

This feels like a "be the change you want to see in the world" moment. If you want such an event, it seems like you could basically just make a forum post (or quick take) offering 1:1s?

I think that basically all of these are being pursued and many are good ideas. I would be less put off if the post title was 'More people should work on aligning profit incentives with alignment research', but suggesting that no one is doing this seems off base.

This is what I got after a few minutes of Google search (not endorsing any of the links beyond that they are claiming to do the thing described).

AI Auditing:

Model interpretability:

Monitoring and usage:

Future Endowment Fund sounds a lot like an impact certificate:

I agree that 'utilitarianism' often gets elided into meaning a variation of hedonic utilitarianism. I would like to hold philosophical discourse to a higher bar. In particular, once someone mentions hedonic utilitarianism, I'm going to hold them to the standard of separating out hedonic utilitarianism and preference utilitarianism, for example.

I agree hedonic utilitarians exist. I'm just saying the utilitarians I've talked to always add more terms than pleasure and suffering to their utility function. Most are preference utilitarians.

I feel like 'valuism' is redefining utilitarianism, and the contrasts to utilitarianism don't seem very convincing. For instance, you define valuism as noticing what you intrinsically value and trying to take effective action to increase that. This seems identical to a utilitarian whose utility function is composed of what they intrinsically value.

I think you might be defining utilitarianism such that utilitarians are only allowed to care about one thing? Which is sort of true, in that utilitarianism generally advocates converting everything into a common scale, but that common scale can measure multiple things. My utility function includes happiness, suffering, beauty, and curiosity as terms. This is totally fine, and a normal part of utilitarian discourse. Most utilitarians I've talked to are total preference utilitarians; I've never met a pure hedonistic utilitarian.

Likewise, I'm allowed to maintain my happiness and mental health as an instrumental goal for maximizing utility. This doesn't mean that utilitarianism is wrong; it just means we can't pretend we can be utility-maximizing soulless robots. I feel like there is a post on folks realizing this at least every few months. Which makes sense! It's an important realization!

Also, utilitarianism doesn't need objective morality any more than any other moral philosophy, so I didn't understand your objection there.

This comment came across as unnecessarily aggressive to me.

The original post is a newsletter that seems to be trying to paint everyone in their best light. That's a nice thing to do! The epistemic status of the post (hype) also feels pretty clear already.

Thank them for the comment, and then link to this thread?

As someone who went through the CEA application process, I wholeheartedly endorse this. I was also really impressed with CEA's approach to the process, and their surprising willingness to give feedback & advice through it.

[It ended up being a mutually bad fit. I've spent my whole career as a C++ backend engineer at a FAANG and I like working in person, and that doesn't align super well with a small remote-first org that has a lot of frontend needs.]

It feels weird to me to hear that something is terrible to think. It might be terrible that we're only alive because not everyone has the option to kill everyone else instantly, but it's also true. Thinking true thoughts isn't terrible.

If everyone has a button that could destroy all life on the planet, I feel like it's unrealistic to expect that button to remain unpressed for more than a few hours. The most misanthropic person on Earth is very, very misanthropic. I'm not confident that many people would press the button, but the whole thing is that it only takes one.

Given that people currently don't have such a button, it seems easier to think about how we can prevent that button from existing, rather than how we could make everyone agree not to press the button. The button is a power no one should have.
