**TL;DR:** I'm trying to either come up with a new promising AIS direction or decide (based on my inside view and not based on trust) that I strongly believe in one of the existing proposals. Is there some ML background that I better get? (and if possible: why do you think so?)

# I am not asking how to be employable

I know there are other resources on that, and I'm not currently trying to be employed.

# Examples of seemingly useful things I learned so far (I want more of these)

- GPT's training involves loss only for the next token (relevant for myopia)
- Neural networks have a TON of parameters (relevant for why interpretability is hard)
- GPT has a limited number of tokens in the prompt, plus other problems that I can imagine trying to solve[1] (relevant for solutions like "chain of thought")
- The vague sense that nobody has a good idea for why things work, and the experience of trying different hyperparameters and learning only in retrospect which of them did well.

(Please correct me if something here was wrong)
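On the first point, the next-token loss can be made concrete with a tiny sketch. This is an illustration of the idea, not real GPT training code: at each position the model outputs a distribution over the vocabulary, and the only supervision is the negative log-probability assigned to the actual next token. The logits here are made-up numbers.

```python
import math

def softmax(logits):
    # numerically stable softmax over a list of logits
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def next_token_loss(logits_per_position, tokens):
    # logits_per_position[i] is the model's output after reading tokens[:i+1];
    # the target at position i is tokens[i+1] (labels shifted left by one).
    losses = []
    for i in range(len(tokens) - 1):
        probs = softmax(logits_per_position[i])
        losses.append(-math.log(probs[tokens[i + 1]]))
    return sum(losses) / len(losses)

# vocabulary of size 3, sequence of 3 tokens
tokens = [0, 2, 1]
logits = [
    [0.1, 0.2, 2.0],  # after token 0: favors token 2 (the true next token)
    [0.0, 3.0, 0.0],  # after token 2: favors token 1 (the true next token)
    [0.0, 0.0, 0.0],  # after the last token: there is no target, so unused
]
loss = next_token_loss(logits, tokens)
print(loss)  # low, since both predictions favored the true next token
```

Note that the final position contributes nothing: there is no "next token" to predict, which is exactly why the loss says nothing about anything beyond one step ahead.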

# I'm asking because I'm trying to decide what to learn next in ML

if anything.

# My background

- Almost no ML/math background: I spent about 13 days catching up on both. What I did so far is something like "build and train GPT-2 (plus some PyTorch)", and learning about some other architectures.
- I have a lot of non-ML software engineering experience.

# Thanks!

[1]: I don't intend to advance ML capabilities.

Reading ML papers seems like a pretty useful skill. They're quite different from blog posts, less philosophical and more focused on proving that their methods actually result in progress. Over time, they'll teach you what kinds of strategies tend to succeed in ML, and which ones are more difficult, helping you build an inside view of different alignment agendas. MLSS has a nice list of papers, and weeks 3 through 5 of AGISF have particularly good ones too.

Why is this post being downvoted?

You can tell me anonymously:

https://docs.google.com/forms/d/e/1FAIpQLSca6NOTbFMU9BBQBYHecUfjPsxhGbzzlFO5BNNR1AIXZjpvcw/viewform

Some basic knowledge of (relatively) old-school probabilistic graphical models, along with a basic understanding of variational inference. Graphical models aren't going to be used directly in any SOTA models any more, but the mathematical formalism is still very useful.
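The central identity of variational inference, for reference (this is the standard textbook decomposition, not something specific to this answer): for data $x$ and latents $z$, the log evidence splits into the ELBO plus a KL term, so maximizing the ELBO over $q$ simultaneously approximates the posterior and lower-bounds the evidence.

```latex
% For any distribution q(z):
\log p(x)
  = \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p(x, z)}{q(z)}\right]}_{\text{ELBO}}
  + \mathrm{KL}\big(q(z) \,\|\, p(z \mid x)\big)
  \;\ge\; \text{ELBO},
```

since the KL term is always nonnegative.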

For example, understanding how inference on a graphical model works motivates the control-as-inference perspective on reinforcement learning. This is useful for understanding things like decision transformers, or this post on how to interpret RLHF on language models.
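A one-line sketch of the control-as-inference move (the standard formulation, stated here as background rather than taken from this post): introduce a binary "optimality" variable $O$ for a trajectory $\tau$, give it a likelihood proportional to exponentiated reward, and condition on $O = 1$.

```latex
p(O = 1 \mid \tau) \;\propto\; \exp\big(R(\tau)\big)
\quad\Longrightarrow\quad
p(\tau \mid O = 1) \;\propto\; p(\tau)\,\exp\big(R(\tau)\big)
```

So "act near-optimally" becomes "do posterior inference over trajectories", which is exactly the bridge that makes things like decision transformers and the inference view of RLHF legible.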

It would also be essential background to understand the causal incentives research agenda.

So the same tools come up in two very different places, which I think makes a case for their usefulness.

This is in some sense math-heavy, and some of the concepts are pretty dense, but without many mathematical prerequisites. You have to understand basic probability (how expected values and log likelihoods work, mental comfort going between E and ∫ notation), basic calculus (like "set the derivative = 0 to maximize"), and be comfortable algebraically manipulating sums and products.
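As a worked example of the "set the derivative = 0 to maximize" skill at exactly this level (a standard exercise, not from the original answer): maximum likelihood for a coin with bias $p$, after seeing $k$ heads in $n$ flips.

```latex
% Log-likelihood of k heads in n flips:
\ell(p) = k \log p + (n - k)\log(1 - p)

% Set the derivative to zero:
\frac{d\ell}{dp} = \frac{k}{p} - \frac{n - k}{1 - p} = 0
\quad\Longrightarrow\quad
\hat{p} = \frac{k}{n}
```

If that manipulation feels routine, the math in this material should be within reach.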