As a special case of not over-optimizing things, I want to point out that an optimizing approach can have a bunch of subtle negative effects in social contexts. The slogan is something like "optimizers make bad friends" ... people don't like being treated as a means rather than an end, and if they get the vibe from social interactions that you're trying to steer them into something, then they may react badly.

Here I'm referring to things like:

  • "How do I optimize this conversation to get X to say yes?"
  • "How do I persuade Y of this particular fact?"
  • "How can I seem particularly cool/impressive to these people?"

... where it's not clear that the other side are willing collaborators in your social objectives. 

So I don't mean to include optimizing for things like:

  • "How can I give the clearest explanation of this thing, so people aren't confused about what I'm saying?"
  • "How can I help Z to feel comfortable and supported?"

I think that people (especially smart people) are often pretty good at getting a vibe that someone is trying to steer them into something (even from relatively little data). When people do notice, it's often correct for them to treat you as adversarial and penalize you for this. This is for two reasons:

  1. A Bayesian/epistemic reason: people who are optimizing for particular outcomes selectively share information which pushes towards those outcomes. So if you think someone is doing [a lot of] this optimization, your best estimate of the true strength of the position they're pushing should be [much] lower than if you think they're optimizing less (given otherwise the same observations).
    • Toy example: if Alice and Bob are students each taking 12 subjects, and you randomly find out Alice's chemistry grade and it's a B+, and you hear Bob bragging that his history grade is an A-, you might guess that Alice is overall a stronger student, since it's decently likely that Bob chose his best grade to brag about.
  2. An incentives reason: we want to shape the social incentive landscape so that people aren't rewarded for trying to manipulate us. If we just do the Bayesian response, it will still often be correct for people to invest some effort in trying to manipulate us (in theory they know more than we do about how much information to reveal to leave us with the best impression after Bayesian updating).
    • I don't think the extra penalty for incentive shaping needs to be too big, but I do think it should be nonzero.
    • Actually, their update should be larger, to compensate for the times that they didn't notice what was happening; this is essentially the same argument as given in Sections III and IV of the (excellent) Integrity for Consequentialists.
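
The toy example in the Bayesian reason above can be checked with a small simulation. This is only a sketch under made-up assumptions: the grade model (a latent ability on a 4-point scale plus independent per-subject noise), the function name `disclosure_gaps`, and every numeric parameter are hypothetical and not from the post. It compares how much a revealed grade overstates a student's true average when the grade is picked at random (Alice's case) versus when it is the student's best of 12 (Bob's case, if he is bragging selectively).

```python
import random
import statistics


def disclosure_gaps(n_students=20000, n_subjects=12, seed=0):
    """Average amount by which a revealed grade overstates a student's true mean.

    Returns (gap when a random grade is revealed, gap when the best grade is revealed).
    All numeric parameters are illustrative assumptions, not taken from the post.
    """
    rng = random.Random(seed)
    random_gaps = []
    best_gaps = []
    for _ in range(n_students):
        # Hypothetical model: a latent ability on a 4-point GPA-style scale,
        # plus independent noise for each subject.
        ability = rng.gauss(3.0, 0.4)
        grades = [ability + rng.gauss(0.0, 0.5) for _ in range(n_subjects)]
        true_mean = statistics.fmean(grades)
        # Alice's case: the grade you happen to learn is a random one.
        random_gaps.append(rng.choice(grades) - true_mean)
        # Bob's case: the grade you hear about is the best of the twelve.
        best_gaps.append(max(grades) - true_mean)
    return statistics.fmean(random_gaps), statistics.fmean(best_gaps)


random_gap, best_gap = disclosure_gaps()
print(f"random grade overstates the mean by {random_gap:+.2f} on average")
print(f"best grade overstates the mean by   {best_gap:+.2f} on average")
```

Under these assumed parameters, a randomly revealed grade is an unbiased guide to the student's average, while the best-of-12 grade overstates it by more than the 0.4-point difference between a B+ (3.3) and an A- (3.7). How large the discount should be in practice depends on how noisy grades are and how likely it is that Bob really did pick his best one.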

Correspondingly, I think that we should be especially cautious of optimizing in social contexts to try to get particular responses out of people who we hope will be our friends/allies. (This is importantly relevant for community-building work.)

Comments (1)



I think that people (especially smart people) are often pretty good at getting a vibe that someone is trying to steer them into something (even from relatively little data). ...we want to shape the social incentive landscape so that people aren't rewarded for trying to manipulate us.

I studied lobbying in Washington, DC, learning from US trade diplomats, and we were taught that this 5 billion industry benefits decisionmakers by sharing research that is biased in various ways, from which they can make decisions that are unbiased with respect to their own values.[1] So, 'smartness,' if it is interpreted as direct decisionmaking privilege, can be positively correlated with accepting what could be perceived as manipulation.

Also, people who are 'smart' in their ability to process, connect, or repeat a lot of information to give the 'right' answers[2] but who do not think critically about the structures which they thus advance may be relatively 'immune' to negative perceptions of manipulation, due to the norms of these structures. These people can be more comfortable if they perceive 'steering' or manipulation, because they could be against 'submitting' to a relatively unaggressive entity. So, in this case, manipulation[3] can be positively correlated with (community builders') individual consideration in a relationship.

'Specific' objective optimization needs to be avoided only among people who are 'smart' in an emotional/reasoning sense and would not[4] engage in a dialogue.[5] These people would perceive manipulation negatively[6] and would not support community builders in developing (yet) better ways of engaging people with various viewpoints on doing good effectively.[7]

Still, many people in EA may not mind some manipulation,[8] because they are intrinsically motivated to do good effectively, and there are few alternatives for doing so. This is not to say that 'specific' optimization should not be avoided where that is possible, but developing this skill can be deprioritized relative to advancing community building projects that attract intrinsically motivated individuals or make changes[9] where the changemakers perceive some 'unfriendliness.'

I would like to ask you if you think that some EA materials that optimize for agreement with a specific thesis, which community builders would use, should be edited, further explained, or discouraged.[10]

  1. ^

    See Allard, 2008 for further discussion on the informational value of privately funded lobbying. 

  2. ^

    Including factually right answers or those which they assess as best for their further social or professional status progress.

  3. ^

    ideally while its use is acknowledged and possibly the discussant is implicitly included in its critique

  4. ^

    or, the discussion would be set up in a way that prevents dialogue

  5. ^

    of course, regardless of their decisionmaking influence

  6. ^

    also due to their limited ability to contribute

  7. ^

    or anything else relevant to EA or the friendship

  8. ^

    For example, a fellow introductory EA fellowship participant pointed out that the comparison between the effectiveness of the treatment of Kaposi sarcoma and information for high-risk groups to prevent HIV/AIDS makes sense because a skin mark is much less serious than HIV/AIDS, but this did not discourage anyone from engagement.

  9. ^

    such as vegan lunches in a canteen because community builders optimize for the canteen managers agreeing that this should be done

  10. ^

    for example, see my recent comment on the use of stylistic devices to attract attention and limit critical thinking
