196 karma · Joined Mar 2022




Right now it seems like AI safety is more of a scene (centered around the Bay Area) than a research community. If you want to attract great scientists and mathematicians (or even mediocre scientists and mathematicians), something even more basic than coming up with good "nerd-sniping" problems is changing this. There are many people who could do good work in technical AI safety, but would not fit in socially with EAs and rationalists.

I'm sympathetic to arguments that formal prepublication peer review is a waste of time. However, I think formalizing and writing up ideas to academic standards, such that they could be submitted for peer review, is definitely a very good use of time. It's not even that hard, and there should be more of it. This would be one step towards making a more bland, boring, professionalized research community where a wider variety of people might want to be involved.

I think a better analogy than "ICBM engineering" might be "all of aeronautical engineering and also some physicists studying fluid dynamics". If you were an anti-nuclear protester and you went and yelled at an engineer who runs wind tunnel simulations to design cars, they would see this as strange and unfair. This is true even though there might be some dual use where aerodynamics simulations are also important for designing nuclear missiles.

I think this would be counterproductive. Most ML researchers don't think the research they personally work on is dangerous, or could be dangerous, or contributes to a research direction that could be dangerous, and most of them actually are right about this. There are all kinds of things people work on that are not on the critical path to dangerous AI.

Some basic knowledge of (relatively) old-school probabilistic graphical models, along with basic understanding of variational inference. Not that graphical models are going to be used directly for any SOTA models any more, but the mathematical formalism is still very useful.

For example, understanding how inference on a graphical model works motivates the control-as-inference perspective on reinforcement learning. This is useful for understanding things like decision transformers, or this post on how to interpret RLHF on language models.

It would also be essential background to understand the causal incentives research agenda.

So the same tools come up in two very different places, which I think makes a case for their usefulness.

This is in some sense math-heavy, and some of the concepts are pretty dense, but without many mathematical prerequisites. You have to understand basic probability (how expected values and log likelihoods work, mental comfort going between  and  notation), basic calculus (like "set the derivative = 0 to maximize"), and be comfortable algebraically manipulating sums and products.
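To give a flavor of the kind of manipulation involved, here is the standard identity behind variational inference, written as a sketch. The symbols are my own choice for illustration: x is the observed data, z the latent variables, and q(z) any approximating distribution.

```latex
\begin{aligned}
\log p(x)
  &= \mathbb{E}_{q(z)}\big[\log p(x)\big] \\
  &= \mathbb{E}_{q(z)}\!\left[\log \frac{p(x, z)}{p(z \mid x)}\right] \\
  &= \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p(x, z)}{q(z)}\right]}_{\text{ELBO}}
   + \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{q(z)}{p(z \mid x)}\right]}_{\mathrm{KL}\left(q(z)\,\|\,p(z \mid x)\right)}
\end{aligned}
```

Since the KL term is nonnegative, the first term is a lower bound on log p(x), and maximizing it over q is the same as pushing q(z) towards the posterior p(z | x). Every step uses only expected values, logs of products, and the definition of conditional probability.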

Answer by anonymous6, Dec 29, 2022

Each chapter of Russell & Norvig's textbook "Artificial Intelligence: A Modern Approach" ends with historical notes. These are probably sparser than you want, but they are good and cover a very broad array of topics. The 4th edition of the book is decently up to date (for the time being!).

Trying to "do as the virtuous agent would do" (or maybe "do things for the sake of being a good person") seems to be a really common problem for people.

Ruthless consequentialist reasoning totally short-circuits this, which I think is a large part of its appeal. You can be sitting around in this paralyzed fog, agonizing over whether you're "really" good or merely trying to fake being good for subconscious selfish reasons, feeling guilty for not being eudaimonic enough -- and then somebody comes along and says "stop worrying and get up and buy some bednets", and you're free.

I'm not philosophically sophisticated enough to have views on metaethics, but it does seem sometimes that the main value of ethical theories is therapeutic, so different contradictory ethical theories could be best for different people and at different times of life.

I would be inclined to replace “not thinking carefully” with “not thinking formally”. In real life everything tends to have exceptions, and this is most people’s presumption, so they don’t feel a need to reserve language for truly universal claims, which in that setting are never meaningful.

Some people have practice in thinking about formal systems, where truly universal statements are meaningful, and where using different language to draw fine distinctions is important (“always” vs “with probability 1” vs “with high probability” vs “likely”).

Trying to push the second group’s norms on the first group might be tough even if perhaps it would be good.

I think when most people say “unequivocally” and “all”, they almost always mean “still maybe some exceptions” and “almost all”. If you don’t need to make mathematical/logical statements, which most people don’t, then reserving these words to act as universal quantifiers is not very useful. I used to be annoyed by this but I’ve learned to accept it.

Here's one set of lecture notes (I don't claim they're the best, they're just the first I found quickly): https://lucatrevisan.github.io/40391/lecture12.pdf

Keywords to search for other sources would be "multiplicative weight updates", "follow the leader", "follow the regularized leader".

Note that this is for what's sometimes called the "experts" setting, where you get full feedback on the counterfactual actions you didn't take. But the same approach basically works with some slight modification for the "bandit" setting, where you only get to see the result of what you actually did.
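For concreteness, here is a minimal sketch of multiplicative weight updates in the "experts" setting. It is not taken from the linked notes: the function name `hedge`, the learning rate `eta`, and the toy losses are all made up for illustration, and losses are assumed to lie in [0, 1].

```python
import math


def hedge(loss_rounds, eta=0.5):
    """Multiplicative weights (Hedge) with full feedback.

    loss_rounds: list of loss vectors, one loss in [0, 1] per expert per round.
    Returns (total expected loss of the algorithm, final distribution over experts).
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n  # start with uniform weights
    total_expected_loss = 0.0
    for losses in loss_rounds:
        z = sum(weights)
        probs = [w / z for w in weights]
        # Expected loss if we sample an expert according to probs.
        total_expected_loss += sum(p * l for p, l in zip(probs, losses))
        # Full feedback: every expert is updated, including ones we did not follow.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    z = sum(weights)
    return total_expected_loss, [w / z for w in weights]


# Toy data (hypothetical): expert 0 never incurs loss, the other two always do.
rounds = [[0.0, 1.0, 1.0]] * 20
alg_loss, final_probs = hedge(rounds)
```

On this toy input the distribution concentrates on expert 0, and the algorithm's total expected loss stays small compared with the 20 rounds played. In the bandit setting you only observe the loss of the action you took, so the usual modification (as in Exp3) is to replace the loss vector with an importance-weighted estimate: the observed loss divided by the probability of having chosen that action, and zero for everything else.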


I have the feeling people sometimes just disappear even if we already agreed to have a call or to meet up (but for example did not agree on the time yet).

This is stereotypically seen as something people in California do, and is complained about by East Coasters. Both people will agree to get coffee or lunch at some point, and then neither ever follows up. Maintaining the ambiguity is considered polite. Overrepresentation of Bay Area residents might be the explanation here.
