richard_ngo

Former AI safety research engineer, now PhD student in philosophy of ML at Cambridge. I'm originally from New Zealand but have lived in the UK for 6 years, where I did my undergrad and master's degrees (in Computer Science, Philosophy, and Machine Learning). Blog: thinkingcomplete.blogspot.com

Sequences

EA Archives Reading List

Comments

richard_ngo's Shortform

There's an old EA forum post called "Effective Altruism is a question (not an ideology)" by Helen Toner, which I think has been pretty influential.*

But I was recently thinking about how the post rings false for me personally. I know that many people in EA are strongly motivated by the idea of doing the most good. But I was personally first attracted to an underlying worldview composed of stories about humanity's origins, the rapid progress we've made, the potential for the world to be much better, and the power of individuals to contribute to that; from there, given potentially astronomical stakes, altruism is a natural corollary.

I think that leaders in EA organisations are more likely to belong to the former category: people inspired by EA as a question. But as I discussed in this post, there can be a tradeoff between interest in EA itself versus interest in the things EA deems important. Personally I prioritise making others care about the worldview more than making them care about the question: caring about the question pushes you to do the right thing in the abstract, but caring about the worldview seems better at pushing you towards its most productive frontiers. This seems analogous to how the best scientists are more obsessed with the thing they're studying than with the downstream effects of their research.

Anyway, take all this with a grain of salt; it's not a particularly firm opinion, just one personal perspective. But one longstanding EA I was talking to recently found it surprising, so I thought it'd be worth sharing in case others do too. 


* As one datapoint: since the EA forum has been getting more users over time, a given karma score is more impressive the older a post is. Helen's post is twice as old as any other post with comparable or higher karma, making it a strong outlier.

Why should we *not* put effort into AI safety research?

Drexler's CAIS framework attacks several of the premises underlying standard AI risk arguments (although iirc he also argues that CAIS-specific safety work would be valuable). Since his original report is rather long, here are two summaries.

Should you do a PhD in science?

I suspect 1/3 is a significant overestimate, since US universities attract people who did their PhDs all over the world.

Why AI is Harder Than We Think - Melanie Mitchell

I was pleasantly surprised by this paper (given how much dross has been written on this topic). My thoughts on the four fallacies Mitchell identifies:

Fallacy 1: Narrow intelligence is on a continuum with general intelligence

This is hard to evaluate, since Mitchell only discusses it very briefly. I do think that people underestimate the gap between solving tasks with near-infinite data (like StarCraft) and solving low-data tasks. But saying that GPT-3 isn't a step towards general intelligence also seems misguided, given the importance of few-shot learning.

Fallacy 2: Easy things are easy and hard things are hard

I agree that Moravec's paradox is important and underrated. But it also cuts the other way: if chess and Go turned out to be easy for machines, then we should be open to the possibility that maths and physics are too.

Fallacy 3: The lure of wishful mnemonics

This is true and important. My favourite example is "planning" in AI: tree search algorithms are radically different from human planning, which operates over abstractions, yet this is hard to see because we use the same word for both.
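To make this concrete, here's a minimal sketch (mine, not Mitchell's) of what "planning" typically amounts to in game-playing AI: depth-limited negamax search, which just enumerates concrete successor positions and backs up a score. The GameState interface (is_terminal, evaluate, legal_moves, apply) is a hypothetical stand-in for whatever game representation you have; nothing here resembles the abstraction-based planning humans do.

```python
# A minimal sketch of naive game-tree "planning": depth-limited negamax without pruning.
# `GameState` and its methods are hypothetical placeholders, not a real library.

def negamax(state, depth):
    """Value of `state` for the player to move, searching `depth` plies ahead."""
    if depth == 0 or state.is_terminal():
        return state.evaluate()               # static score of one concrete position
    best = float("-inf")
    for move in state.legal_moves():          # expand every concrete successor
        best = max(best, -negamax(state.apply(move), depth - 1))
    return best

def best_move(state, depth):
    """Pick the move whose resulting position searches best."""
    return max(state.legal_moves(),
               key=lambda move: -negamax(state.apply(move), depth - 1))
```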

Fallacy 4: Intelligence is all in the brain

This is the one I disagree with most, because "embodied cognition" is a very slippery concept. What does it mean? "The representation of conceptual knowledge is ... multimodal"? Okay, but CLIP is multimodal.
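To illustrate that point (my example, not from the paper): with the Hugging Face transformers wrapper around CLIP, a single model scores an image against text prompts, i.e. images and text share one representation space. The model name and image URL below are just the standard values from the library's documentation example.

```python
import requests
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# One model, two modalities: CLIP embeds images and text into a shared space.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open(requests.get(
    "http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
texts = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
# Image-text similarity scores, softmaxed over the candidate captions.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(texts, probs[0].tolist())))
```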

"Thoughts are inextricably associated with perception, action, and emotion." Okay, but RL agents have perceptions and actions. And even if the body plays a crucial role in human emotions, it's a big leap to claim that disembodied agents therefore can't develop emotions.

Under this fallacy, Mitchell also discusses AI safety arguments by Bostrom and Russell. I agree that early characterisations of AIs as "purely rational" were misguided. Mitchell argues that AIs will likely also have emotions, cultural biases, a strong sense of selfhood and autonomy, and a commonsense understanding of the world. This seems plausible! But note that none of these directly solves the problem of misaligned goals. Sociopaths have all these traits, but we wouldn't want them to have superhuman intelligence.

This does raise the question: can early arguments for AI risk be reformulated to rely less on this "purely rational" characterisation? I think so; in fact, that's what I tried to do in this report.

Some quick notes on "effective altruism"

Well, my default opinion is that we should keep things as they are; I don't find the arguments against "effective altruism" particularly persuasive, and name changes at this scale are pretty costly.

Insofar as people want to keep their identities small, there are already a bunch of other terms they can use, like longtermist, environmentalist, or animal rights advocate. So it seems like the point of having a term like EA on top of that is to identify a community. And saying "I'm part of the effective altruism community" softens the term a bit.

around half of the participants (including key figures in EA) said that they don’t self-identify as "effective altruists"

This seems like the most important point to think about; relatedly, I remember being surprised, when I interned at FHI, by how many people there don't identify as effective altruists. That seems indicative of some underlying problem, which is worth investigating directly. As a first step, it'd be good to hear more from people who have reservations about identifying as effective altruists. I've just made a top-level question about it, plus an anonymous version; if that describes you, I'd be interested to see your responses!

Some quick notes on "effective altruism"

I think the "global priorities" label fails to escape several of the problems that Jonas argued the EA brand has. In particular, it sounds arrogant for someone to say that they're trying to figure out global priorities. If I heard of a global priorities forum or conference, I'd expect it to have pretty strong links with the people actually responsible for implementing global decisions; if it were actually just organised by a bunch of students, then they'd seem pretty self-aggrandizing.

The "priorities" part may also suggest to others that they're not a priority. I expect "the global priorities movement has decided that X is not a priority" seems just as unpleasant to people pursuing X as "the effective altruism movement has decided that X is not effective".

Lastly, "effective altruism" to me suggests both figuring out what to do, and then doing it. Whereas "global priorities" only has connotations of the former.

Proposed Longtermist Flag

What would you think about the same flag with the sun removed?

Might make it look a little unbalanced, but I kinda like that: longtermism is itself unbalanced in its focus on the future.

Some preliminaries and a claim

I didn't phrase this as clearly as I should have, but it seems to me that there are two separate issues here: firstly whether group X's views are correct, and secondly whether group X uses a methodology that is tightly coupled to reality (in the sense of having tight feedback loops, or making clear predictions, or drawing on a lot of empirical evidence).

I interpret your critique of EA roughly as the claim that a lack of tight methodological coupling to reality leads to a lack of correctness. My critique of the posts you linked is also that they lack tight methodological coupling to reality, in particular because they rely on high-level abstractions. I'm not confident whether this means they're actually wrong, but it still seems like a problem.

Some preliminaries and a claim

I claim that the Effective Altruism and Bay Area Rationality communities have collectively decided that they do not need to participate in tight feedback loops with reality in order to have a huge, positive impact.

I am somewhat sympathetic to this complaint. However, I also think that many of the posts you linked are themselves phrased in terms of very high-level abstractions which aren't closely coupled to reality, and in some ways exacerbate the sort of epistemic problems they discuss. So I'd rather like to see a more careful version of these critiques.

Contact with reality

Yes, I think I still have these concerns; if it turned out that I'd had extreme cognitive biases all along, then I would want them removed even if doing so didn't improve my understanding of the world. It would feel similar to being told that I'd lived my whole life in a (pleasant) dreamlike fog, and now had the opportunity to wake up. Perhaps this is the same instinct that motivates meditation? I'm not sure.
