498 karma · Joined May 2020


EA and AI safety

Conceptual alignment research at MIRI



Yeah, a more quantitative survey sounds like a useful thing to have, although I don't have concrete plans to do this currently.

I'm slightly wary of causing 'survey fatigue' by constantly emailing AI safety people with surveys, but this seems like something that wouldn't be too fatiguing.

Not exactly, but it seems useful to know what other people have done if you want to do similar work to them. 

Obviously with all the standard hedges that we don't want everyone doing exactly the same thing and thinking the same way.

That is definitely part of studying math. The thing I was trying to point to is the process of going from an idea or intuition to something that you can write in math. For example, in linear algebra you might have a feeling about some property of a matrix but then you actually have to show it with math. Or more relevantly, in Optimal Policies Tend to Seek Power it seems like the definition of 'power' came from formalizing what properties we would want this thing called 'power' to have. 

But I'm curious to hear your thoughts on this, and if you think there are other useful ways to develop this 'formalization' skill.

I got to the same stage (and also didn't get in) and had the same experience as you. I was definitely a bit sad about not getting in, but I did appreciate the call and feedback.

Some construction megaprojects might count. I'm thinking of the Notre-Dame Cathedral, which took about 100 years to complete.

This might not really count because the choir was completed after about 20 years. I'm also not sure if it was meant to take so long.

One example would be Benjamin Franklin bequeathing $2,000 each to Boston and Philadelphia, which could only be spent after 200 years.

This sounds like an almost exact description of the EA Hotel (CEEALAR), which is mentioned in the post. I think it does a pretty decent job of selecting for 'genuine EA' people.

For the MIRI Conversations, some people have said they'll pay at least some money for this: https://twitter.com/lxrjl/status/1463845239664394240

How worried are people actually about suffering in neural networks/artificial minds? 

(My impression is that this is a fun thing to talk about, but won't be that useful for a long time)

Updating Moral Beliefs

Imagine there is a box with a ball inside it, and you believe the ball is red. But you also believe that in the future you will update your belief and think that the ball is blue (the ball is a normal, non-color-changing ball). This seems like a very strange position to be in, and you should just believe that the ball is blue now.

This is an example of how we should deal with beliefs in general; if you think in the future you will update a belief in a specific direction then you should just update now.
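
One way to make this more precise, as a rough sketch: assume beliefs are probabilities that get updated by conditioning on evidence. The symbols here are my own notation for illustration: H is the claim (e.g. 'the ball is red'), E is whatever evidence you will see in the future, and e ranges over its possible values. Then your expected future belief works out to your current belief:

```latex
\begin{align*}
\mathbb{E}\left[ P(H \mid E) \right]
  &= \sum_{e} P(E = e)\, P(H \mid E = e) \\
  &= \sum_{e} P(H,\, E = e) \\
  &= P(H)
\end{align*}
```

So under these assumptions, if you expect your future credence in H to be lower than your current one, your current credence is too high and you should lower it now. Note that this only rules out expecting a change in a specific direction; it is fine to expect that your belief will move without knowing which way.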

I think the same principle applies to moral beliefs. If you think that in the future you'll believe that it's wrong to do something, then you should believe that it's wrong now.

As an example of this, if you think that in the future you'll believe eating meat is wrong, then you sort of already believe eating meat is wrong. I was in exactly this position for a while: I thought that in the future I would stop eating meat, while continuing to eat meat. A similar case is deliberately remaining ignorant about something because learning about it would change your moral beliefs. If you're avoiding learning about factory farming because you think it would cause you to believe eating factory-farmed meat is bad, then on some level you already believe that.

Another case of this is in politics when a politician says it's 'not the time' for some political action but in the future it will be. This is 'fine' if it's 'not the time' due to political reasons, such as the electorate not reelecting the politician. But I don't think it's consistent to say an action is currently not moral, but will be moral in the future. Obviously this only works if the action now and in the future are actually equivalent. 
