Researcher at MIRI. http://shlegeris.com/


The academic contribution to AI safety seems large

Thanks for writing this post--it was useful to see the argument written out so I could see exactly where I agreed and disagreed. I think lots of people agree with this but I've never seen it written up clearly before.

I think I place substantial weight (30% or something) on you being roughly right about the relative contributions of EA safety and non-EA safety. But I think it's more likely that the penalty on non-EA safety work is larger than you think. 

I think the crux here is that I think AI alignment probably requires really focused attention, and research done by people who are trying to do something else will probably end up not being very helpful for some of the core problems.

It's a little hard to evaluate the counterfactuals here, but I'd much rather have the contributions from EA safety than from non EA safety over the last ten years.

I think that it might be easier to assign a value to the discount factor by assessing the total contributions of EA safety and non-EA safety. I think that EA safety does something like 70% of the value-weighted work, which suggests a much bigger discount factor than 80%.


Assorted minor comments:

But this is only half of the ledger. One of the big advantages of academic work is the much better distribution of senior researchers: EA Safety seems bottlenecked on people able to guide and train juniors

Yes, but those senior researchers won't necessarily have useful things to say about how to do safety research. (In fact, my impression is that most people doing safety research in academia have advisors who don't have very smart thoughts on long term AI alignment.)

None of those parameters is obvious, but I make an attempt in the model (bottom-left corner).

I think the link is to the wrong model?

A cursory check of the model

In this section you count nine safety-relevant things done by academia over two decades, and then note that there were two things from within EA safety last year that seem more important. This doesn't seem to mesh with your claim about their relative productivity.

The academic contribution to AI safety seems large

MIRI is not optimistic about prosaic AGI alignment and doesn't put much time into it.

How strong is the evidence of unaligned AI systems causing harm?
Answer by BuckJul 23, 202011

I don’t think the evidence is very good; I haven’t found it more than slightly convincing. I don’t think that the harms of current systems are a very good line of argument for potential dangers of much more powerful systems.

Intellectual Diversity in AI Safety

I'm curious what your experience was like when you started talking to AI safety people after already coming to come of your own conclusions. Eg I'm curious if you think that you missed major points that the AI safety people had spotted which felt obvious in hindsight, or if you had topics on which you disagreed with the AI safety people and think you turned out right.

Are there lists of causes (that seemed promising but are) known to be ineffective?
Answer by BuckJul 09, 20208

In an old post, Michael Dickens writes:

The closest thing we can make to a hedonium shockwave with current technology is a farm of many small animals that are made as happy as possible. Presumably the animals are cared for by people who know a lot about their psychology and welfare and can make sure they’re happy. One plausible species choice is rats, because rats are small (and therefore easy to take care of and don’t consume a lot of resources), definitively sentient, and we have a reasonable idea of how to make them happy.
Thus creating 1 rat QALY costs $120 per year, which is $240 per human QALY per year.
This is just a rough back-of-the-envelope calculation so it should not be taken literally, but I’m still surprised by how cost-inefficient this looks. I expected rat farms to be highly cost-effective based on the fact that most people don’t care about rats, and generally the less people care about some group, the easier it is to help that group. (It’s easier to help developing-world humans than developed-world humans, and easier still to help factory-farmed animals.) Again, I could be completely wrong about these calculations, but rat farms look less promising than I had expected.

I think this is a good example of something seeming like a plausible idea for making the world better, but which turned out to seem pretty ineffective.

Concern, and hope

What current controversy are you saying might make moderate pro-SJ EAs more wary of SSC?

Concern, and hope

I have two complaints: linking to a post which I think was made in bad faith in an attempt to harm EA, and seeming to endorse it by using it as an example of a perspective that some EAs have.

I think you shouldn't update much on what EAs think based on that post, because I think it was probably written in an attempt to harm EA by starting flamewars.

EDIT: Also, I kind of think of that post as trying to start nasty rumors about someone; I think we should generally avoid signal boosting that type of thing.

KR's Shortform

I'd be interested to see a list of what kinds of systematic mistakes previous attempts at long-term forecasting made.

Also, I think that many longtermists (eg me) think it's much more plausible to successfully influence the long run future now than in the 1920s, because of the hinge of history argument.

Concern, and hope

Many other people who are personally connected to the Chinese Cultural Revolution are the people making the comparisons, though. Eg the EA who I see posting the most about this (who I don't think would want to be named here) is Chinese.

Concern, and hope

I think that both the Cultural Revolution comparisons and the complaints about Cultural Revolution comparisons are way less bad than that post.

Load More