rime

@ only nothingness above

262 karmaJoined Apr 2023

Interests:

Animal welfareWild animal welfareMetascienceResearch methodsAI safetyAI alignmentMechanism designLongtermismExistential riskS-risk

Bio

Flowers are selective about the pollinators they attract. Diurnal flowers must compete with each other for visual attention, so they use colours to crowd out their neighbours. But flowers with nocturnal anthesis are generally white, as they aim only to outshine the night.

Posts
5

Sorted by New

rime's Quick takes

rime

· 3y ago · 1m read

Notes on cultivating good incentives

rime

· 3y ago · 2m read

Briefly how I've updated since ChatGPT

rime

· 3y ago · 3m read

Forum suggestion: profile wikis

rime

· 3y ago · 3m read

What can we do now to prepare for AI sentience, in order to protect them from the global scale of human sadism?

rime

· 3y ago · 3m read

Comments
54

Sustainable fishing policy increases fishing, and demand reductions might, too

rime2y3

Just the arguments in the summary are really solid.^[1] And while I wasn't considering supporting sustainability in fishing anyway, I now believe it's more urgent to culturally/semiotically/associatively separate between animal welfare and some strands of "environmentalism". Thanks!

Alas, I don't predict I will work anywhere where this update becomes pivotal to my actions, but my practically relevant takeaway is: I will reproduce the arguments from this post (and/or link it) in contexts where people are discussing conjunctions/disjunctions between environmental concerns and animal welfare.

Hmm, I notice that (what I perceive as) the core argument generalizes to all efforts to make something terrible more "sustainable". We sometimes want there to be high price of anarchy (long-run) wrt competing agents/companies trying to profit from doing something terrible. If they're competitively "forced" to act myopically and collectively profit less over the long-run, this is good insofar as their profit correlates straightforwardly with disutility for others.^[2]

It doesn't hold in cases where what we care about isn't straightforwardly correlated with their profit, however. E.g. ecosystems/species are disproportionately imperiled by race-to-the-bottom-type incentives, because they have an absorbing state at 0.

(Tagging @niplav, because interesting patterns and related to large-scale suffering.)

^{^}
Also just really interesting argument-structure which I hope I can learn to spot in other contexts.
^{^}
EDIT: Another way of framing this is that it reduces the amount of slack they have to optimize their exploitation with.

RyanCarey's Quick takes

rime3y4

Paying people for what they do works great if most of their potential impact comes from activities you can verify. But if their most effective activities are things they have a hard time explaining to others (yet have intrinsic motivation to do), you could miss out on a lot of impact by requiring them instead to work on what's verifiable.

Perhaps funders should consider granting motivated altruists multi-year basic income. Now they don't have to compromise^[1] between what's explainable/verifiable vs what they think is most effective—they now have independence to purely pursue the latter.

Bonus point: People who are much more competent than you at X^[2] will probably behave in ways you don't recognise as more competent. If you could, they wouldn't be much more competent. Your "deference limit" is the level of competence above which you stop being able to reliable judge the difference between experts.

If good research is heavy-tailed & in a positive selection-regime, then cautiousness actively selects against features with the highest expected value.

^{^}
Consider how the cost of compromising between optimisation criteria interacts with what part of the impact distribution you're aiming for. If you're searching for a project with top p% impact and top p% explainability-to-funders, you can expect only p^2 of projects to fit both criteria—assuming independence.
But I think it's an open question how & when the distributions correlate. One reason to think they could sometimes be anticorrelated is that the projects with the highest explainability-to-funders are also more likely to receive adequate attention from profit-incentives alone.
If you're doing conjunctive search over projects/ideas for ones that score above a threshold for multiple criteria, it matters a lot which criteria you prioritise most of your parallel attention on to identify candidates for further serial examination. Try out various examples here & here.
^{^}
At least for hard-to-measure activities where most of the competence derives from knowing what to do in the first place. I reckon this includes most fields of altruistic work.

rime's Quick takes

rime3y3

Philosophy

Second-best theories & Nash equilibria

A general frame I often find comes in handy while analysing systems is to look for look for equilibria, figure out the key variables sustaining it (e.g., strategic complements, balancing selection, latency or asymmetrical information in commons-tragedies), and well, that's it. Those are the leverage points to the system. If you understand them, you're in a much better position to evaluate whether some suggested changes might work, is guaranteed to fail, or suffers from a lack of imagination.

Suggestions that fail to consider the relevant system variables are often what I call "second-best theories". Though they might be locally correct, they're also blind to the broader implications or underappreciative of the full space of possibilities.

(A) If it is infeasible to remove a particular market distortion, introducing one or more additional market distortions in an interdependent market may partially counteract the first, and lead to a more efficient outcome.

(B) In an economy with some uncorrectable market failure in one sector, actions to correct market failures in another related sector with the intent of increasing economic efficiency may actually decrease overall economic efficiency.

Examples

The allele that causes sickle-cell anaemia is good because it confers resistance against malaria. (A)
- Just cure malaria, and sickle-cell disease ceases to be a problem as well.
Sexual liberalism is bad because people need predictable rules to avoid getting hurt. (B)
- Imo, allow people to figure out how to deal with the complexities of human relationships and you eventually remove the need for excessive rules as well.
We should encourage profit-maximising behaviour because the market efficiently balances prices according to demand. (A/B)
- Everyone being motivated by altruism is better because market prices only correlate with actual human need insofar as wealth is equally distributed. The more inequality there is, the less you can rely on willingness-to-pay to signal urgency of need. Modern capitalism is far from the global-optimal equilibrium in market design.
If I have a limp in one leg, I should start limping with my other leg to balance it out. (A)
- Maybe the immediate effect is that you'll walk more efficiently on the margin, but don't forget to focus on healing whatever's causing you to limp in the first place.
Effective altruists seem to have a bias in favour of pursuing what's intellectually interesting & high status over pursuing the boringly effective. Thus, we should apply an equal and opposite skepticism of high-status stuff and pay more attention to what might be boringly effective. (A)
- Imo, rather than introducing another distortion in your motivational system, just try to figure out why you have that bias in the first place and solve it at its root. Don't do the equivalent of limping on both your legs.
I might edit in more examples later if I can think of them, but I hope the above gets the point across.

Downsides of Small Organizations in EA

rime3y3

I mused about this yesterday and scribbled some thoughts on it on Twitter here.

"When should you pool all resources into your current best bet vs spread them across N ~independent parallel plans each of which will have lower probability of success?"

Investing marginal resources (workers, in this case) into your single most promising approach might have diminishing returns due to A) limited low-hanging fruits for that approach, B) making it harder to coordinate, and C) making it harder to think original thoughts due to rapid internal communication & Zollman effects. But marginal investment may also have increasing returns due to D) various scale-economicsy effects.

There are many more factors here, including stuff you mention. The math below doesn't try to capture any of this, however. It's supposed to work as a conceptual thinking-aid, not something you'd use to calculate anything important with.

A toy-model heuristic is to split into separate approaches iff the extra independent chances counterbalance the reduced probability of success for your top approach.

One observation is that the more dependent/serial steps ( $k)$ your plan has, the more it matters to maximise general efficiencies internally ( $c$ ), since that gets exponentially amplified by $k$ .^[1]

^{^}
You can view this as a special case of Ahmdal's argument. If you want. Because nobody can stop you, and all you need to worry about is whether it works profitably in your own head.

20 concrete projects for reducing existential risk

rime3y3

The problem is that if you select people cautiously, you miss out on hiring people significantly more competent than you. The people who are much higher competence will behave in ways you don't recognise as more competent. If you were able to tell what right things to do are, you would just do those things and be at their level. Innovation on the frontier is anti-inductive.

If good research is heavy-tailed & in a positive selection-regime, then cautiousness actively selects against features with the highest expected value.^[1]

That said, "30k/year" was just an arbitrary example, not something I've calculated or thought deeply about. I think that sum works for a lot of people, but I wouldn't set it as a hard limit.

^{^}
Based on data sampled from looking at stuff. :P Only supposed to demonstrate the conceptual point.
Your "deference limit" is the level of competence above your own at which you stop being able to tell the difference between competences above that point. For games with legible performance metrics like chess, you get a very high deference limit merely by looking at Elo ratings. In altruistic research, however...

Strawmen, steelmen, and mithrilmen: getting the principle of charity right

rime3y2

The problem with strawmanning and steelmanning isn't a matter of degree, and I don't think goldilocks can be found in that dimension at all. If you find yourself asking "how charitable should I be in my interpretation?" I think you've already made a mistake.

Instead, I'd like to propose a fourth category. Let's call it.. uhh.. the "blindman"! ^^

The blindman interpretation is to forget you're talking to a person, stop caring about whether they're correct, and just try your best to extract anything usefwl from what they're saying.^[1] If your inner monologue goes "I agree/disagree with that for reasons XYZ," that mindset is great for debating or if you're trying to teach, but it's a distraction if you're purely aiming to learn. If I say "1+1=3" right now, it has no effect wrt what you learn from the rest of this comment, so do your best to forget I said it.

For example, when I skimmed the post "agentic mess", I learned something I thought was exceptionally important, even though I didn't actually read enough to understand what they believe. It was the framing of the question that got me thinking in ways I hadn't before, so I gave them a strong upvote because that's my policy for posts that cause me to learn something I deem important--however that learning comes about.

Likewise, when I scrolled through a different post, I found a single sentence^[2] that made me realise something I thought was profound. I actually disagree with the main thesis of the post, but my policy is insensitive to such trivial matters, so I gave it a strong upvote. I don't really care what they think or what I agree with, what I care about is learning something.

^{^}
"What they believe is tangential to how the patterns behave in your own models, and all that matters is finding patterns that work."
From a comment on reading to understand vs reading to defer/argue/teach.
^{^}
"The Waluigi Effect: After you train an LLM to satisfy a desirable property , then it's easier to elicit the chatbot into satisfying the exact opposite of property $P$ ."

Are we confident that superintelligent artificial intelligence disempowering humans would be bad?

Answer by rimeJun 10, 20237

Here's my just-so story for how humans evolved impartial altruism by going through several particular steps:

First there was kin selection evolving for particular reasons related to how DNA is passed on. This selects for the precursors to altruism.
With ability to recognise individual characteristics and a long-term memory allowing you to keep track of them, species can evolve stable pairwise reputations.
This allows reciprocity to evolve on top of kin selection, because reputations allow you to keep track of who's likely to reciprocate vs defect.
More advanced communication allows larger groups to rapidly synchronise reputations. Precursors of this include "eavesdropping", "triadic awareness",^[1] all the way up to what we know as "gossip".
This leads to indirect reciprocity. So when you cheat one person, it affects everybody's willingness to trade with you.
There's some kind of inertia to the proxies human brains generalise on. This seems to be a combination of memetic evolution plus specific facts about how brains generalise very fast.
If altruistic reputation is a stable proxy for long enough, the meme stays in social equilibrium even past the point where it benefits individual genetic fitness.
In sum, I think impartial altruism (e.g. EA) is the result of "overgeneralising" the notion of indirect reciprocity, such that you end up wanting to help everybody everywhere.^[2] And I'm skeptical a randomly drawn AI will meet the same requirements for that to happen to them.

^{^}
"White-faced capuchin monkeys show triadic awareness in their choice of allies":
"...contestants preferentially solicited prospective coalition partners that (1) were dominant to their opponents, and (2) had better social relationships (higher ratios of affiliative/cooperative interactions to agonistic interactions) with themselves than with their opponents."
You can get allies by being nice, but not unless you're also dominant.
^{^}
For me, it's not primarily about human values. It's about altruistic values. Whatever anything cares about, I care about that in proportion to how much they care about it.

Distancing EA from rationality is foolish

rime2y28

Um, I did not know about "came in fluffer" until I googled it now, inspired by your post. I'm not English, so I thought "fluffer" meant some type of costume, and that some high-status person showed up somewhere in it. My innocence didn't last long.

I'm not against sexual activities, per se, but do you really want to highlight and reinforce that as a salient example of "Rationality culture"?

Mr Beast is now officially engaged in effective giving!

rime2y3

In addition to the title being somewhat misleading, I think it reinforces a very "either in or out" narrative re effective altruism. Very good object-level news, though!

Edit: The title was updated, and I have no issue. : )

rime

Bio

Posts 5

Comments54

Second-best theories & Nash equilibria

Examples

Posts
5

Comments
54