TL;DR: I'm trying to either come up with a new, promising AI safety (AIS) direction or decide (based on my own inside view, not on trust) that I strongly believe in one of the existing proposals. Is there some ML background I should get first? (And if possible: why do you think so?)
I am not asking how to be employable
I know there are other resources on that, and I'm not currently trying to be employed.
Examples of seemingly useful things I learned so far (I want more of these)
- GPT's training loss comes only from next-token prediction: at each position, the model is scored on the single token that follows (relevant for myopia)
- Neural networks have a TON of parameters (relevant for why interpretability is hard)
- GPT can only take a limited number of tokens in its prompt (the context window), plus other problems that I can imagine trying to solve (relevant for solutions like "chain of thought")
- The vague sense that nobody has a good idea for why things work, and the experience of trying different hyperparameters and learning only in retrospect which of them did well.
(Please correct me if anything here is wrong)
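To make the first bullet concrete, here is a minimal sketch of how I understand the next-token loss, in plain Python without any ML library. The function name `next_token_loss` and the toy inputs are my own illustrative choices, not taken from any real codebase, and real implementations work on batched tensors. The point is only that the prediction at position t is scored against the token at position t+1 and nothing further ahead.

```python
import math

def next_token_loss(logits, tokens):
    """Average cross-entropy of predicting tokens[t + 1] from logits[t].

    logits: one list of vocabulary scores per position (hypothetical toy format).
    tokens: list of integer token ids, same length as logits.
    """
    n = len(tokens) - 1  # the last position has no "next token" to predict
    total = 0.0
    for t in range(n):
        row = logits[t]
        # Numerically stable log-sum-exp over the vocabulary.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        # Negative log-probability assigned to the token that actually follows.
        total += log_z - row[tokens[t + 1]]
    return total / n

# Toy example: 4 positions, vocabulary of 4, all scores equal (a uniform guess).
tokens = [0, 1, 2, 3]
uniform_logits = [[0.0, 0.0, 0.0, 0.0] for _ in tokens]
loss = next_token_loss(uniform_logits, tokens)
```

With uniform scores the loss equals the log of the vocabulary size, which is a handy sanity check. Note that the scores at the final position never enter the loss, because there is no following token to compare them with.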
I'm asking because I'm trying to decide what to learn next in ML
- Almost no ML/math background: I spent about 13 days catching up on both. What I've done so far is something like "build and train GPT-2 (plus learn some PyTorch)" and learn about some other architectures.
- I have a lot of non-ML software engineering experience.
I don't intend to advance ML capabilities