Simon Skade

66 karma


Teaching myself about AI safety and thinking about how to save the world.

Like to have a chat? Me too! Please reach out / book a meeting:


I think it is important to keep in mind that we are not very funding constrained. It may be OK to have some false positives; false negatives may often be worse, so I wouldn't be too careful.

I think grantmaking is probably still too reluctant to fund stuff that has a low chance of high impact, especially when grantmakers are uncertain because the applicants aren't EAs.
For example, I told a very exceptional student (who has like 1 in a million problem solving capability) to apply for the Atlas Fellowship, although I don't know him well, because from my limited knowledge it increases the chance that he will work on alignment from 10% to 20-25%, and the $50k is easily worth it.
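To make the arithmetic behind that estimate explicit, here is a minimal sketch. The function name is my own, and the only inputs are the numbers from the example above (a $50k grant, a 10% baseline, and a 20-25% chance afterwards). It just divides the grant by the added probability, so it says nothing about how much an alignment researcher is actually worth.

```python
def cost_per_expected_researcher(grant_usd, p_before, p_after):
    """Grant cost divided by the added probability of the desired outcome."""
    uplift = p_after - p_before
    if uplift <= 0:
        raise ValueError("the grant must raise the probability")
    return grant_usd / uplift


# Numbers taken from the example above; everything else is illustrative.
low = cost_per_expected_researcher(50_000, 0.10, 0.25)   # optimistic uplift
high = cost_per_expected_researcher(50_000, 0.10, 0.20)  # pessimistic uplift
print(f"${low:,.0f} to ${high:,.0f} per additional expected alignment researcher")
```

Under those numbers the grant costs roughly $333k to $500k per additional expected alignment researcher, which is the figure you would then compare against your own estimate of that person's impact.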

Though of course accepting more false positives causes more people who only pretend to do something good to apply, which isn't easy to handle for our current limited number of grantmakers. We definitely need to scale up grantmaking capacity anyway.

I think that non-EAs should know that they can get funding if they do something good/useful. You shouldn't need to pretend to be an EA to get funding, and defending against people who only pretend to do good projects seems easier in many cases, e.g. you can often just start by giving a little funding and promise more funding later if they show progress.

(I also expect that we/AI-risk-reduction will get much more funding as the problem becomes better known/acknowledged. I'd guess >$100B in 2030, so I don't think funding will ever become a bottleneck, but I'm not totally sure of course.)

Of the alternative important skills you mentioned, I think many are highly correlated, and I think the relevant stuff roughly boils down to rationality (and perhaps also ambition).

Being rational itself is also correlated with being an EA and with being intelligent, and overall I think intelligence and rationality (and ambition) are traits that are really strong predictors of impact.

The impact curve is very heavy-tailed, and smarter people can have OOMs more impact than people with 15 IQ points less. So no, I don't think EA is focusing too much on smart people; indeed, it would surprise me if it had reached a level where it wouldn't be good to focus even more on intelligence. (I don't claim to have argued for this sufficiently, but it is true in my world model.)

(Not sure if this has been suggested before, but) you should be able to sort comments by magic (the way posts are sorted on the frontpage) or in some other way that better combines the top and new properties for comments. Otherwise good new contributions are read far too rarely, so only very few people will read and upvote them, while the first comments directly receive many upvotes and so get even more upvotes later. Still, upvotes tell you a bit about which comments are good, and not everyone wants to read everything.

I would definitely use it myself, but I would strongly suggest also making it the default way comments are sorted.

(That wouldn't totally remove bad dynamics, but it would be a start.)
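To illustrate what combining top and new could look like, here is a rough sketch of a time-decayed score in the style of the well-known Hacker News ranking formula. I don't know the Forum's actual "magic" formula, so everything here is an assumption: the `Comment` class, the function names, and the `gravity` and `offset_hours` parameters are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    text: str
    karma: int
    age_hours: float


def decayed_score(comment, gravity=1.15, offset_hours=2.0):
    """Karma divided by a power of age, so older comments need more karma.

    gravity and offset_hours are hypothetical tuning knobs, not Forum settings.
    """
    return (comment.karma + 1) / (comment.age_hours + offset_hours) ** gravity


def sort_comments(comments, **params):
    """Return comments ordered by decayed score, highest first."""
    return sorted(comments, key=lambda c: decayed_score(c, **params), reverse=True)


comments = [
    Comment("early, heavily upvoted", karma=40, age_hours=72.0),
    Comment("new and good", karma=5, age_hours=1.0),
    Comment("old, ignored", karma=1, age_hours=72.0),
]
ranked = sort_comments(comments)
for c in ranked:
    print(f"{decayed_score(c):.3f}  {c.text}")
```

With these made-up numbers the new comment ranks above the early, heavily upvoted one, while the old comment with little karma still ends up last, so upvotes keep mattering. A higher `gravity` favours new comments more strongly; a value of 0 reduces to plain sorting by karma.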

Related to the post and very related to this comment is this post:

The post claims that saying things like "should", "good" or "bad" (or other words that carry moral judgement) can often lead to bad reasoning because you fail to anticipate the actual consequences. (I recommend reading this post, or at least the last two sections; the sentence here isn't really a good summary.)

Actually, some replacements suggested in this post may not help in some cases:

Someone in the EA community should do a specific thing

More people in the EA community should do a specific thing

EA would be better if it had a certain property

An issue is underemphasized by many people in the community, or by large EA institutions

The problem isn't that those words are always bad, but that you need to say more specifically why something is good or bad, or you might miss something. Therefore those sentences should be followed by a "because" or "otherwise" or preceded by a reason, like in this very sentence.

Of course, in some truly obvious cases it is ok to just use "good" or "bad" or synonyms without a more explicit reason, but those words should be warning signs, so you can see if you didn't reason well. 

(Have you noticed how often I used "good", "bad" or "should" in this comment, and what the fundamental reasons were that I didn't bother justifying and just accepted as good or bad?) (Also, replacing "good" with something like "useful" or other synonyms doesn't help; they should still be warning signs.)

AGI will (likely) be quite different from current ML systems.

I'm afraid I disagree with this. For example, if this were true, interpretability from Chris Olah or the Anthropic team would be automatically doomed; Value Learning from CHAI would also be useless, our predictions about forecasting that we use to convince people of the importance of AI Safety equally so.

Wow, the "quite" wasn't meant that strongly, though I agree that I should have expressed myself a bit more clearly/differently. And the work of Chris Olah, etc. isn't useless anyway. Yeah, AGI won't run on transformers and a lot of what we found won't be that useful, but we still get experience in how to figure out the principles, and some principles will likely transfer. And AGI forecasting is hard but certainly not useless/impossible, though you do have high uncertainties.

Breakthroughs only happen when one understands the problem in detail, not when people float around vague ideas.

Breakthroughs happen when one understands the problem deeply. I think I agree with the "not when people float around vague ideas" part, though I'm not sure what you mean by that. If you mean "academic philosophy has a problem", then I agree. If you mean "there is no way Einstein could derive special or general relativity mostly from thought experiments", then I disagree, though you do indeed need to be skilled to use thought experiments. I don't see any bad kind of "floating around with vague ideas" in the AI safety community, but I'm happy to hear concrete examples from you where you think academic methodology is better!
(And I do btw. think that we need that Einstein-like reasoning, which is hard, but otherwise we basically have no chance of solving the problem in time.)

What academia does is to ask for well defined problems and concrete solutions. And that's what we want if we want to progress.

I still don't see why academia should be better at finding solutions. It can find solutions to easy problems. That's why so many people in academia are goodharting all the time. Finding easy subproblems whose solutions would allow us to solve AI safety is (very likely) much harder than solving those subproblems.

Notice also that Shannon and many other people coming up with breakthroughs did so in academic ways.

Yes, in history there were some Einsteins in academia who could even solve hard problems, but those are very rare, and getting those brilliant not-goodharting people to work on AI safety is uncontroversially good, I would say. But there might be better/easier/faster options than building the academic field of AI safety to find those people and make them work on AI safety.

Still, I'm not saying it's a bad idea to promote AI safety in academia. I'm just saying it won't nearly suffice to solve alignment, not by a long shot.

(I think the bottom of your comment isn't as you intended it to be.)

I must say I strongly agree with Steven.

  1. If you are saying academia has a good track record, then I must say (1) that's wrong for stuff like ML, where in recent years much (arguably most) relevant progress has been made outside of academia, and (2) it may have a good track record over the long history of science, and when you say it's good at solving problems, sure, I think it might solve alignment in 100 years, but we need it in 10, and academia is slow. (E.g. read Yudkowsky's sequence on science if you don't think that academia is slow.)
  2. Do you have some reason why you think that a person can make more progress in academia than elsewhere? I agree that academia has people, and it's good to get those people, but academia has badly shaped incentives, like (from my other comment): "Academia doesn't have good incentives to make that kind of important progress: You are supposed to publish papers, so you (1) focus on what you can do with current ML systems, instead of focusing on more uncertain longer-term work, and (2) goodhart on some subproblems that don't take that long to solve, instead of actually focusing on understanding the core difficulties and how one might address them." So I expect a person can make more progress outside of academia. Much more, in fact.
  3. Some important parts of the AI safety problem seem to me like they don't fit well into academic work. There are of course exceptions, people in academia who can make useful progress here, but they are rare. I am not that confident in this, as my understanding of AI safety isn't that deep, but I'm not just making this up. (EDIT: This mostly overlaps with the first two points I made, that academia is slow and that there are bad incentives, and maybe some other minor considerations about why excellent people (e.g. John Wentworth) may rather choose not to work in academia. What I'm saying is that I think that AI safety is a problem where those obstacles are big obstacles, whereas there might be other fields where those obstacles aren't thaaat bad.)

There is the EA Forum feature suggestion thread for such things. An app may be a special case because it is a rather big feature, but I still think it fits there.

We won't solve AI safety by just throwing a bunch of (ML) researchers at it.

AGI will (likely) be quite different from current ML systems. Also, work on aligning current ML systems won't be that useful, and generally what we need is not small advancements, but we rather need breakthroughs. (This is a great post for getting started on understanding why this is the case.)

We need a few Paul Christiano-level researchers who build a very deep understanding of the alignment problem and can then make huge advances far more than we need many still-great-but-not-that-extraordinary researchers.

Academia doesn't have good incentives to make that kind of important progress: You are supposed to publish papers, so you (1) focus on what you can do with current ML systems, instead of focusing on more uncertain longer-term work, and (2) goodhart on some subproblems that don't take that long to solve, instead of actually focusing on understanding the core difficulties and how one might address them.

I think paradigms are partially useful and we should probably create some for specific approaches to AI safety, but I think the default paradigms that would develop in academia are probably pretty bad, so the resulting research wouldn't be that useful.

Promoting AI safety in academia is probably still good, but for actually preventing existential risk, we need some other way of creating incentives to usefully contribute to AI safety. I don't know yet how to best do it, but I think there are better options.

Getting people into AI safety without arguing about x-risk seems nice, but mostly because I think this strategy is useful for convincing people of x-risk later, so they then can work on important stuff.

Another advantage of an app may be that you could download posts in case you go somewhere without Internet access, but I think this is rare and not a sufficient reason to create an app either.

Why should there be one? The EA Forum website works great on mobile. So my guess is that there is no EA Forum app because it's not needed / wouldn't be that useful, except perhaps for app notifications, but that doesn't seem that important.
