Effective Altruism Forum

What are the most underrated posts & comments of 2022, according to you?

by peterhartree

Jan 1 2023 · 1 min read · 25 comments

53

Building effective altruism · Community · Effective Altruism Forum

Share your views in the comments!

To make this clear and easy to follow, please use these guidelines:

Use the template below.
Post as many items as you want.
One item per comment, so that it's easy for people to read and react.
(Optional, encouraged) Highlight at least one of your own contributions.

If you need some inspiration, open your EA Forum Wrapped and scroll to the bottom of your "Strong Upvoted" list.

Template

Title:
Author:
URL:
Why it's good:

If you're sharing an underrated comment, set the title to "[Username] on [topic]".

Mentioned in

87 · Posts from 2022 you thought were valuable (or underrated)

Comments (25)

Buck · Jan 1 2023 · 16

Title: Paul Christiano on how you might get consequentialist behavior from large language models
Author: Paul Christiano
URL: https://forum.effectivealtruism.org/posts/dgk2eLf8DLxEG6msd/how-would-a-language-model-become-goal-directed?commentId=cbJDeSPtbyy2XNr8E
Why it's good: I think lots of people are very wrong about how LLMs might lead to consequentialist behavior, and Paul's comment here is my favorite attempt at answering this question. I think that this question is extremely important.

Lizka · Jan 2 2023 · 9

Title: Most problems fall within a 100x tractability range (under certain assumptions)

Author: Thomas Kwa

URL: https://forum.effectivealtruism.org/posts/4rGpNNoHxxNyEHde3/most-problems-fall-within-a-100x-tractability-range-under

Why it's good: I don't have time to write a long comment, but this is one of the posts where my first reaction was, "no way," and then I went and checked the math and was convinced. Since then, I've often thought of it and brought it up — I often hear people say or imply, "sure, the scope of X is huge and it's really neglected, but maybe it's massively more intractable..." without an explanation for why X is so unusual.

Ben_West🔸 · Jan 4 2023 · 4

Thanks for sharing this; I hadn't read it before and I found it persuasive.

[anonymous] · Jan 3 2023 · 7

EA-Aligned Political Activity in a US Congressional Primary: Concerns and Proposed Changes by Carolina_EA (78 karma)
Why it's good: I am so, so appreciative when people share detailed, good-faith takes on parts of EA from perspectives that we rarely get such insight into. (Similar posts with at least the same karma, covering disagreement, agreement, praise, indifference, justification, advice: Podcast: The Left and Effective Altruism with Habiba Islam, A subjective account of what it's like to join an EA-aligned org without previous EA knowledge, Notes From a Pledger, Some unfun lessons I learned as a junior grantmaker, Some observations from an EA-adjacent (?) charitable effort.)

Lizka · Jan 2 2023 · 7

Title: Epistemic Legibility - "Tl;dr: being easy to argue with is a virtue, separate from being correct."

Author: Elizabeth

URL: https://forum.effectivealtruism.org/posts/oRx3LeqFdxN2JTANJ/epistemic-legibility

Why it's good:

Things that make texts or discussions more "epistemically legible" include: making clear what you actually believe, making clear the evidence you're really basing your beliefs on (don't just search for a random hyperlink for a claim; say why you really believe this, even if it's "gut feeling" or "anecdotal evidence"), making the logical steps of your argument clear (don't just list assorted evidence and a conclusion; explain what leads to what and how), using examples, picking a vocabulary that's appropriate to your audience, and writing the argument down.[1] I also think that summaries or outlines help.

So why is this important?

I think a lot of discussions are confused for a bunch of reasons. One of these is that it's hard to understand exactly what other parties are saying and why, both because communicating clearly is hard and because we have a tendency to want to hedge and protect our views (e.g. because of impostor syndrome), which makes it harder for others to see how we might be wrong.

For instance, I might want to say, "I think a lot of discussions are confused for a bunch of reasons," and walk away — especially if I find a good hyperlink for "confused" or something. But that would make it very hard to argue with me. I didn't explain what "confused" really means to me, I didn't list specific reasons, I didn't say which discussions, or approximately how many of them. (So what do you argue with? "No, I think very few discussions are 'confused'?") I could, instead, write something more specific, like "I think that too many posts on the Forum (for my taste) lead to discussions that misinterpret the claims of the related post or are arguing about details or logical connections that aren't actually relevant. This happens for a bunch of reasons, some of which I could list, but I'm focusing on a specific thing here that is one of what I'd guess are the top 10 reasons, and here's how that happens..." This is still pretty vague, but I think it's better. You can now say, "Here's my list of 10 reasons that contribute to this phenomenon, and epistemic illegibility doesn't get into the top 10 — which do you think are less important?" And I imagine that this leads to a more productive discussion.

There are downsides and costs to being more specific or epistemically legible like this — Elizabeth's post acknowledges them, and notes that not everything should necessarily be epistemically legible. For instance, the rewritten claim above is messier and longer than the original one. (Although I don't think this always has to be true.) But on the margin, I think I'd prefer more posts that are messier and even longer if they're also more epistemically legible. And I really like the specific suggestions on how to be legible.

Or, as Elizabeth puts it,

If I hear an epistemically legible argument, I have a lot of options. I can point out places I think the author missed data that impacts their conclusion, or made an illogical leap. I can notice when I know of evidence supporting their conclusions that they didn’t mention. I can see implications of their conclusions that they didn’t spell out. I can synthesize with other things I know, that the author didn’t include.
If I hear an illegible argument, I have very few options. Perhaps the best case scenario is that it unlocks something I already knew subconsciously but was unable to articulate, or needed permission to admit. This is a huge service! But if I disagree with the argument, or even just find it suspicious, my options are kind of crap. I write a response of equally low legibility, which is unlikely to improve understanding for anyone. Or I could write up a legible case for why I disagree, but that is much more work than responding to a legible original, and often more work than went into the argument I’m responding to, because it’s not obvious what I’m arguing against. I need to argue against many more things to be considered comprehensive. If you believe Y because of X, I can debate X. If you believe Y because …:shrug:… I have to imagine every possible reason you could do so, counter all of them, and then still leave myself open to something I didn’t think of. Which is exhausting.
I could also ask questions, but the more legible an argument is, the easier it is to know what questions matter and the most productive way to ask them.
I could walk away, and I am in fact much more likely to do that with an illegible argument. But that ends up creating a tax on legibility because it makes one easier to argue with, which is the opposite of what I want.

I also love reasoning transparency, but feel like it gets at the quality in a different way, with a different emphasis. And I've also been using "butterfly idea" a lot.

[1] I like this essay a lot, on this point: Putting Ideas into Words. The excerpt is copied from a different place where I shared the essay, so I don't remember how relevant it is here specifically.
Writing about something, even something you know well, usually shows you that you didn't know it as well as you thought. Putting ideas into words is a severe test. The first words you choose are usually wrong; you have to rewrite sentences over and over to get them exactly right. And your ideas won't just be imprecise, but incomplete too. Half the ideas that end up in an essay will be ones you thought of while you were writing it. Indeed, that's why I write them.
Once you publish something, the convention is that whatever you wrote was what you thought before you wrote it. These were your ideas, and now you've expressed them. But you know this isn't true. You know that putting your ideas into words changed them. And not just the ideas you published. Presumably there were others that turned out to be too broken to fix, and those you discarded instead.
[...]
Putting ideas into words doesn't have to mean writing, of course. You can also do it the old way, by talking. But in my experience, writing is the stricter test. You have to commit to a single, optimal sequence of words. Less can go unsaid when you don't have tone of voice to carry meaning. And you can focus in a way that would seem excessive in conversation.
[...]

MattBall · Jan 5 2023 · 6

Title: Working with the Beef Industry for Chicken Welfare

Author: RobertY

URL: https://forum.effectivealtruism.org/posts/5XKAsEBMuxiycTHL7/working-with-the-beef-industry-for-chicken-welfare

Why it's good: Correct focus on a source of immense, totally unnecessary suffering, with outside-the-box thinking to help mitigate the suffering. Thanks, Robert!

technicalities · Jan 3 2023 · 4

Title: The long reflection as the great stagnation

Author: Larks

URL: https://forum.effectivealtruism.org/posts/o5Q8dXfnHTozW9jkY/the-long-reflection-as-the-great-stagnation

Why it's good: Powerful attack on a cherished institution. I don't necessarily agree on the first order, but on the second order people will act up and ruin the Reflection.

John_Maxwell · Jan 2 2023 · 4

Title: Prizes for ML Safety Benchmark Ideas

Author: Joshc, Dan H

URL: https://forum.effectivealtruism.org/posts/jo7hmLrhy576zEyiL/prizes-for-ml-safety-benchmark-ideas

Why it's good: Benchmarks have been a big driver of progress in AI. Benchmarks for ML safety could be a great way to drive progress in AI alignment, and get people to switch from capabilities-ish research to safety-ish research. The structure of the prize looks good: They're offering a lot of money, there are still over 6 months until the submission deadline, and all they're asking for is a brief write-up. Thinking up benchmarks also seems like the sort of problem that's a good fit for a prize. My only gripe with the competition is that it doesn't seem widely known, hence posting here.

John_Maxwell · Jan 2 2023 · 4

I wonder if a good standard rule for prizes is that you want a marketing budget which is at least 10-20% the size of the prize pool, for buying ads on podcasts ML researchers listen to or subreddits they read or whatever. Another idea is to incentivize people to make submissions publicly, so your contest promotes itself.

peterhartree · Jan 1 2023 · 4

Title:

Getting on a different train: can Effective Altruism avoid collapsing into absurdity?

Author:

Peter McLaughlin

URL:

https://forum.effectivealtruism.org/posts/8wWYmHsnqPvQEnapu/getting-on-a-different-train-can-effective-altruism-avoid

Why it's good:

McLaughlin highlights a problem for people who want to say that scale matters, and also avoid the train to crazy town.

It's not clear how anyone actually gets off the train to crazy town. Once you allow even a little bit of utilitarianism in, the unpalatable consequences follow immediately. The train might be an express service: once the doors close behind you, you can’t get off until the end of the line.
As Richard Y. Chappell has put it, EAs want ‘utilitarianism minus the controversial bits’. Yet it’s not immediately clear how the models and decision-procedures used by Effective Altruists can consistently avoid any of the problems for utilitarianism: as examples above illustrate, it’s entirely possible that even the simplest utilitarian premises can lead to seriously difficult conclusions.

Tyler Cowen wrote a paper on this problem in 1996, called ‘What Do We Learn from the Repugnant Conclusion?’. McLaughlin's post opens with an excellent summary.

The upshot:

For any moral theory with universal domain where utility matters at all, either the marginal value of utility diminishes rapidly (asymptotically) towards zero, or considerations of utility come to swamp all other values.

Uh oh!

peterhartree · Jan 1 2023 · 2

My take: perhaps the more principled among us should make room for more messy fudges in our thought. Cluster thinking, bounded commensurability and two-thirds utilitarianism for the win.

[anonymous] · Jan 3 2023 · 3

Paper summary: The Epistemic Challenge to Longtermism (Christian Tarsney) by Global Priorities Institute, EJT (37 karma)
Why it's good: An attempt to put numbers on the much-disputed tractability of longtermism (I also really appreciate summaries).

[anonymous] · Jan 3 2023 · 3

The Possibility of Microorganism Suffering by Elias Au-Yeung (46 karma)
Why it's good: I think there should be more investigations into potential welfarist priorities that sound super weird. (Similarly, though higher-karma: Do Brains Contain Many Conscious Subsystems? If So, Should We Act Differently?, Reducing nightmares as a cause area.)

technicalities · Jan 3 2023 · 2

Title: Forecasting Newsletter: April 2222

Author: Nuno

URL: https://forum.effectivealtruism.org/posts/xnPhkLrfjSjooxnmM/forecasting-newsletter-april-2222

Why it's good: Incredible density of gags. Some of the in-jokes are so clever that I had to think all day to get them; some are so niche that no one except Nuno and the target could possibly laugh.

[anonymous] · Jan 3 2023 · 2

Some Carl Sagan quotations by finm (77 karma)
Why it's good: I really appreciate inspiring posts. (Higher-karma posts on this topic that I really appreciated: Open call for EA stories, 🎨 Altruist Dreams - a collaborative art piece from EAG SF, Good things that happened in EA this year, The Spanish-Speaking Effective Altruism community is awesome.)

[anonymous] · Jan 3 2023 · 2

Consequentialism and Cluelessness by Richard Y Chappell (27 karma)
Why it's good: Raises and defends an important point that I think would release a lot of people from cluelessness-induced paralysis if more widely shared, namely that Option A can still have higher expected value than Option B despite us having no clue what many of the consequences will be, because these invisible consequences speak neither for nor against either option. (Another important point that I wish were better known is that Person-affecting intuitions can often be money pumped, although this post was essentially a repeat of a post from 2020.)

[anonymous] · Jan 3 2023 · 2

Space governance - problem profile by finm, 80000_Hours (65 karma)
Why it's good: A new entry in 80,000 Hours' list of the most pressing world problems, one that is huge in scale and severely neglected from a longtermist perspective, although admittedly not especially tractable (which is why I'd love to see more posts like Influencing United Nations Space Governance).

[anonymous] · Jan 3 2023 · 2

Observations of community building in Asia, a 🧵 by Vaidehi Agarwalla (36 karma)
Why it's good: Concrete ideas for improving diversity in EA. (Similar posts I particularly appreciated with more karma: Top down interventions that could increase participation and impact of Low and Middle Income Countries in EA, EA career guide for people from LMICs, A few more relevant categories to think about diversity in EA.)

[anonymous] · Jan 3 2023 · 2

Criticism of the main framework in AI alignment by Michele Campolo (34 karma)
Why it's good: Explores an area of AGI alignment that I think is under-discussed, namely the possibility of using AGI for direct moral progress. (Other under-discussed areas with more karma: AGI and Lock-In, Robert Long on Why You Might Want To Care About Artificial Sentience, ‘Dissolving’ AI Risk – Parameter Uncertainty in AI Future Forecasting, Steering AI to care for animals, and soon.)

[anonymous] · Jan 3 2023 · 2

Some core assumptions of effective altruism, according to me by peterhartree (85 karma)
Why it's good: I think people should focus on EA's core assumptions much more than they do and I think this list is pretty accurate (and more accurate than the list it is responding to).

peterhartree · Jan 1 2023 · 2

Title:

Bernard Williams: Ethics and the limits of impartiality

Author:

peterhartree

URL:

https://forum.effectivealtruism.org/posts/G6EWTrArPDf74sr3S/bernard-williams-ethics-and-the-limits-of-impartiality

Why it's good:

Derek Parfit saw Bernard Williams as his most important antagonist. Parfit was obsessed with Williams’ “Internal & External Reasons” paper for several decades.

My post introduces some of Bernard Williams’ views on metaphilosophy, metaethics and reasons.

What are we doing when we do moral philosophy? How should this self-understanding inform our practice of philosophy, and what we might hope to gain from it?

According to Williams:

Moral philosophy is about making sense of the situation in which we find ourselves, and deciding what to do about it.

Williams wants to push back against a “scientistic” trend in moral philosophy, and against philosophers who exhibit “a Platonic contempt for the human and the contingent in the face of the universal”. Such philosophers believe that:

if there were an absolute conception of the world, a representation of it which was maximally independent of perspective, that would be better than more perspectival or locally conditioned representations of the world.

And, relatedly:

that offering an absolute conception is the real thing, what really matters in the direction of intellectual authority

Williams thinks there’s another way. It may not give us everything we want, but perhaps it’s all we can have.

If the post leaves you wanting more, I got into related themes on Twitter last night, in conversation with The Ghost of Jeremy Bentham after some earlier exegetical mischief. Scroll down and click “Show replies”.

peterhartree · Jan 2 2023 · 3

P.S. If you don't like the Bernard Williams stuff, I'd love to hear your quick thoughts on why.

He is a divisive figure, especially in Oxford philosophy circles. But Parfit was correct to take him seriously.

His book "Ethics & The Limits of Philosophy" is often recommended as the place to start.

Michelle_Hutchinson · Jan 2 2023 · 14

My main hesitation on this would be that I never really figured out how the difference between plausible meta-ethical theories was decision-relevant. (I'm not sure if that counts as not liking it though - still interesting!)

peterhartree · Jan 1 2023 · 2

Title:

EA’s brain-over-body bias, and the embodied value problem in AI alignment

Author:

Geoffrey Miller

URL:

https://forum.effectivealtruism.org/posts/zNS53uu2tLGEJKnk9/ea-s-brain-over-body-bias-and-the-embodied-value-problem-in

Why it's good:

Embodied cognition is a hot topic in cognitive science. Are AI safety people overlooking this?

From Geoffrey’s introduction:

Evolutionary biology and evolutionary medicine routinely analyze our bodies’ biological goals, fitness interests, and homeostatic mechanisms in terms of how they promote survival and reproduction. However the EA movement includes some ‘brain-over-body biases’ that often make our brains’ values more salient than our bodies’ values. This can lead to some distortions, blind spots, and failure modes in thinking about AI alignment. In this essay I’ll explore how AI alignment might benefit from thinking more explicitly and carefully about how to model our embodied values.

Big, if there’s something to it. But the piece received one three-word comment...

IanDavidMoss · Jan 3 2023 · -2

I'll bite on the invitation to nominate my own content. This short piece of mine spent little time on the front page and didn't seem to capture much attention, either positive or negative. I'm not sure why, but I'd love for the ideas in it to get a second look, especially by people who know more about the topic than I do.

Title: Leveraging labor shortages as a pathway to career impact? [note: question mark was added today to better reflect the intended vibe of the post]

Author: Ian David Moss

URL: https://forum.effectivealtruism.org/posts/xdMn6FeQGjrXDPnQj/leveraging-labor-shortages-as-a-pathway-to-career-impact

Why it's good: I think it surfaces an important and rarely-discussed point that could have significant implications for norms and practices around EA community-building and career guidance if it were determined to be valid.
