Topic Contributions


SERI ML Alignment Theory Scholars Program 2022

I worry that the current format of this program might filter out promising candidates who are risk-averse. Specifically, granting candidates the actual research opportunity only "Assuming all goes well" is a lot of risk to take on. For driven undergraduates, a summer opportunity falling through is costly, and they might not apply just because of this uncertainty. 

Currently your structure resembles PhD programs that admit students to a specific lab (where students who turn out not to be a good fit may be dropped and then have to scramble to find an alternative placement).

Maybe a better model for this program is PhD programs that admit a strong cohort of students. Instead of one two-week research sprint, you could have 2-3 shorter research sprints ("rotations"). From a student's perspective this would probably lower the probability of being dropped (since every mentor would have to dislike them for this to happen).  

What you're currently doing seems like a fine option for you with little downside for the students if: 

1) "Assuming all goes well" means >90% of students continue on with research

2) The projects are sufficiently disjoint that it's unlikely a student is going to be a good fit for more than one project (I think this is probably false but you know more than me, and maybe you think it's true) 

3) 2-week research sprints are much more valuable than 1-week research sprints (I am not convinced of this but maybe you are)

If not all of these hold, I argue it might be better to do rotations or find other ways to make this less risky for candidates. 

Another idea to avoid filtering out risk-averse candidates: you could promise that candidates who don't get matched with a mentor can at least do <some other project>; for example, they could be paid to distill AI Safety materials

Introducing the ML Safety Scholars Program

Can undergraduates who already know ML skip weeks 1-2? Can undergraduates who already know DL skip weeks 3-5?

Introducing the ML Safety Scholars Program

You may already have this in mind but—if you are re-running this program in summer 2023, I think it would be a good idea to announce this further in advance.

Most problems fall within a 100x tractability range (under certain assumptions)

I was in the process of writing a comment trying to debunk this. My counterexample didn't work, so now I'm convinced this is a pretty good post. This is a nice way of thinking about ITN quantitatively. 

The counterexample I was trying to make might still be interesting for some people to read as an illustration of this phenomenon. Here it is:

Scale "all humans" trying to solve "all problems" down to "a single high school student" trying to solve "math problems". Then tractability (measured as % of problem solved / % increase in resources) for this person to solve different math problems is as follows:

  • A very large arithmetic question like "find 123456789123456789^2 by hand" requires ~10 hours to solve
  • A median international math olympiad question probably requires ~100 hours of studying to solve 
  • A median research question requires an undergraduate degree (~2000 hours) and then specialized studying (~1000 hours) to solve
  • A really tough research question takes a decade of work (~20,000 hours) to solve
  • A way-ahead-of-its-time research question (think, maybe, developing ML theory results before there were even computers) I could see taking 100,000+ hours of work 

Here tractability varies by 4 orders of magnitude (10-100,000 hours) if you include all kinds of math problems. If you exclude very easy or very hard things (as Thomas was describing) you end up with 2 orders of magnitude (~1000-100,000 hours). 
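The orders-of-magnitude claim above can be sanity-checked with a quick calculation. This is just a sketch: the hour figures are the rough guesses from the list, not measured data, and tractability is treated as inversely proportional to hours required.

```python
import math

# Rough hour estimates from the list above (guesses, not data)
hours = {
    "large arithmetic by hand": 10,
    "median IMO problem": 100,
    "median research question": 3_000,   # ~2000 (degree) + ~1000 (specialized study)
    "tough research question": 20_000,
    "ahead-of-its-time question": 100_000,
}

# With tractability ~ 1 / hours, the spread across all problem types,
# in orders of magnitude, is:
full_spread = math.log10(max(hours.values()) / min(hours.values()))
print(full_spread)  # log10(100_000 / 10) = 4.0
```

Dropping the very easy and very hard extremes shrinks the ratio accordingly, which is how you recover the ~2 orders of magnitude mentioned above.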

'Dropping out' isn't a Plan

I think the diagram which differentiates "Stay in school" versus "Drop out" before further splitting actually makes some sense. The way I read that split, it is saying "Stay in school" versus "Do something strange".  

In some cases it might be helpful, in the abstract, to figure out the pros and cons of staying in school before recursing down the "Drop out" path. Otherwise, you could imagine the pro/con lists for ORGs 1-3 having a lot of repetition: "Not wasting time taking useless required classes" is a pro for all three, "Losing out on connections / credentials" is a con for all three, etc. 

What's the best machine learning newsletter? How do you keep up to date?

Yannic Kilcher's YouTube channel profiles fairly recent papers and "ML news" events. The videos on papers run 30-60 minutes, so they are more in-depth than reading an abstract and less time-consuming than reading the paper yourself. The "ML news" videos are less technical but still a good way to keep up to date on what DeepMind, Meta, NVIDIA, etc. are up to. 

Free money from New York gambling websites

You must be located in New York or another eligible state while signing up and making the bets.


Just to confirm -- do these bets require New York residency, or just being physically present in New York? What forms of identification are requested -- does it have to be a New York state ID (e.g. driver's license)? 

Most successful EA elevator pitches?

I often run into the problem of EA coming up in casual conversation and not knowing exactly how to explain what it is, and I know many others run into this problem as well.


Not rigorously tested or peer-reviewed, but this is an approach I've found works decently. The audience is a "normal person".

My short casual pitch of EA:

"Effective altruism is about doing research to improve the effectiveness of philanthropy. Researchers can measure the effects of different interventions, like providing books versus providing malaria nets. GiveWell, an effective altruist charity evaluator, has identified a few high-impact interventions: malaria medicine and nets, vitamin A supplements, encouraging childhood vaccinations, and so on."

If I have a couple more sentences to introduce a bit of longtermism:

"There is also a part of effective altruism which is concerned with preventing future catastrophes. Climate change is one well-known example. Another example is global catastrophic biological risks—as we saw with COVID-19, pandemics can cause a lot of harm, so effective altruists see research in biosecurity and pandemic prevention as highly effective. There is also the field of "AI Safety", which is based on the premise that AI systems will become more prevalent in the future, so it is important we thoroughly research their capabilities before deploying them. The unifying theme here is a "longtermist" worldview—the idea that we can do good things now which will have positive effects on the far future."

The ideas that make up this pitch are:

  • Start with broadly accepted premises ("AI systems will become more prevalent in the future") before putting the EA spin on it ("so we need to do AI safety research"). This principle also applies to writing abstracts.
  • Sacrifice precision in definitions of concepts for the sake of getting the intuitive idea across. For example, describing longtermism as "doing things which positively affect the future" does not perfectly capture the concept, but it's an easier starting point than "future-people are just as morally relevant as present-people". 

These principles can similarly be applied to simply describe AI safety, animal welfare, etc.

Think about EA alignment like skill mastery, not cult indoctrination

When I say "repeating talking points", I am thinking of: 

  1. Using cached phrases and not explaining where they come from. 
  2. Conversations which go like
    • EA: We need to think about expanding our moral circle, because animals may be morally relevant. 
    • Non-EA: I don't think animals are morally relevant though.
    • EA: OK, but if animals are morally relevant, then quadrillions of lives are at stake.

(2) is kind of a caricature as written, but I have witnessed conversations like these in EA spaces. 

My evidence for this claim comes from my personal experience watching EAs talk to non-EAs, and listening to non-EAs talk about their perception of EA. The total number of data points in this pool is ~20. I would say that I don't have exceptionally many EA contacts compared to most EAs, but I do make a particular effort to seek out social spaces where non-EAs are looking to learn about EA. Thinking back on these experiences, and which conversations went well and which didn't, is what inspired me to write this short post.

Ultimately my anecdotal data can't support any statistical statements about the EA community at large. The purpose of this post is more to describe two mental models of EA alignment and to advocate for the "skill mastery" perspective. 
