Bio

There are critical gaps in the accessibility and affordability of mental health services worldwide: In some countries, you have to wait for years for therapy; in others, health insurance covers at most one session per month; and in still others, therapies for particular conditions like cluster B personality disorders are virtually nonexistent.

We want to leverage LLMs to fill these gaps and complement regular therapy. Our product is in development. We've based it on Gemini and want it to interface with widely used messaging apps, so users can interact with it like they would with a friend or coach.

I’ve previously founded or worked for several charities and spent a few years in earning to give for work on invertebrate welfare and s-risks from AI.

You can get up to speed on my thinking at Impartial Priorities.

Sequences
3

Welfare Biology and AI
Impact Markets
Researchers Answering Questions

Comments
605

My current practical ethics

The question often comes up how we should make decisions under epistemic uncertainty and normative diversity of opinion. Since I need to make such decisions every day, I had to develop a personal system, however inchoate, to assist me.

A concrete (or granite) pyramid

My personal system can be thought of like a pyramid.

  1. At the top sits some sort of measurement of success. It's highly abstract and impractical. Let's call it the axiology. This is really a collection of all axiologies I relate to, including the amount of frustrated preferences and suffering across our world history. This also deals with hairy questions such as how to weigh Everett branches morally and infinite ethics.
  2. Below that sits a kind of mission statement. Let's call it the ethical theory. It's just as abstract, but it is opinionated about the direction in which to push our world history. For example, it may desire a reduction in suffering, but for others this floor needn't be consequentialist in flavor.
  3. Both of these abstract floors of the pyramid are held up by a mess of principles and heuristics at the ground floor level to guide the actual implementation.

The ground floor

The ground floor of principles and heuristics is really the most interesting part for anyone who has to act in the world, so I won't further explain the top two floors. 

The principles and heuristics should be expected to be messy. That is, I think, because they are by necessity the result of an intersubjective process of negotiation and moral trade (positive-sum compromise) with all the other agents and their preferences. (This should probably include acausal moral trades like Evidential Cooperation in Large Worlds.)

It should also be expected to be messy because these principles and heuristics have to satisfy all sorts of awkward criteria:

  1. They have to inspire cooperation or at least not generate overwhelming opposition.
  2. They have to be easily communicable so people at least don't misunderstand what you're trying to achieve and call the police on you. Ideally, people will understand your goal well enough that they want to join you.
  3. They have to be rapidly actionable, sometimes for split-second decisions.
  4. They have to be viable under imperfect information.
  5. They have to be psychologically sustainable for a lifetime.
  6. They have to avoid violating laws.
  7. And many more.

Three types of freedom

But really that leaves us still a lot of freedom (for better or worse):

  1. There are countless things that we can do that are highly impactful and hardly violate anyone's preferences or expectations.
  2. There are also plenty of things that don't violate any preferences or expectations once we get to explain them.
  3. Finally, there are many opportunities for positive-sum moral trade.

These suggest a particular stance toward other activists:

  1. If someone is trying to achieve the same thing you're trying to achieve, maybe you can collaborate.
  2. If someone is trying to achieve something other than what you're trying to achieve, but you think their goals are valuable, don't stand in their way. In particular, it may sometimes feel like doing nothing (to further or hinder their cause) is a form of “not standing in their way.” But if your peers are actually collaborating with them to some extent, doing nothing (or collaborating less) can cause others to also reduce their collaboration and can prevent key threshold effects from taking hold. So the true neutral position is to try to understand how much you need to collaborate toward the valuable goal so it would not have been achieved sooner without you. This is usually very cheap to do and has a chance to get runaway threshold effects rolling.
  3. If someone is trying to achieve something that you consider neutral, the above may still apply to some extent because perhaps you can still be friends. And for reasons of Evidential Cooperation in Large Worlds. (Maybe you'll find that their (to you) neutral thing is easy to achieve here and that other agents like them will collaborate back elsewhere where your goal is easy to achieve.)
  4. Finally, if someone is trying to achieve something that you disapprove of… Well, that's not my metier, temperamentally, but this is where compromise can generate gains from moral trade.

Very few examples

In my experience, principles and heuristics are best identified by chatting with friends and generalizing from their various intuitions.

  1. Charitable donations are total anarchy. Mostly, you can just donate wherever the fluff you want, and (unless you're Open Phil) no one will throw stones through your windows in retaliation. You can just optimize directly for your goals – except, Evidential Cooperation in Large Worlds will still make strong recommendations here, but what they are is still a bit underexplored.
  2. Even if you're not an animal welfare activist yourself, you're still well-advised to cooperate with behavior change to avert animal suffering to the extent expected by your peers. (And certainly to avoid inventing phony reasons to excuse your violation of these expectations. These might be even more detrimental to moral progress and the rationality waterline.)
  3. If you want to spend time with someone but they behave outrageously unempathetically toward you or someone else, you can cut ties with them even though, strictly speaking, this does not imply that no positive-sum trade is possible with them.
  4. Trying to systematically put people in powerful positions can arouse suspicion and actually make it harder to put people in powerful positions. Trying to systematically put people into the sorts of positions they find fulfilling might put as many people in powerful positions and make their lives easier too. (Or training highly conscientious people in how to dare to accept responsibility so it's not just those who don't care who self-select into powerful positions.)
  5. And hundreds more…

Various non-consequentialist ethical theories can come in handy here to generate further useful principles and heuristics. That is probably because they are attempts at generalizing from the intuitions of certain authors, which puts them almost on par (to the extent to which these authors are relatable to you) with generalizations from the intuitions of your friends.

(If you find my writing style hard to read, you can ask Claude to rephrase the message into a style that works for you.)

Ohhh! I love that post! Gotta link it to my excessively guilty friends! <3 

Over the past 12 years, I almost always avoided applying for any jobs in effective altruism – though they did often seem like dream jobs – because:

  1. I was afraid I might not be the best candidate, and if, by chance, I replace the best candidate, my work would not only be a waste but outright harmful. I'm the sort of person who's afraid to drive a car for fear of hurting someone, and the funding allocation can affect the lives of billions or trillions of beings, so any mistakes I could make would vastly outstrip any harm I could do with a car if I tried.
  2. That the best candidate might not make up for that harm in some other job that they do instead because they might be more socially motivated than me and not fall back on earning to give if they don't find a charity job but rather value drift and do some mainstream stuff – in the worst case AI capabilities.
  3. That I can survive many years of earning to give without value-drifting because I have managed to do that in the past (a USP I should capitalize on because the counterfactual of the money that I earn at a random company is very low impact, so I can generate great counterfactual impact rather than a bit of marginal impact that I'd get at a charity).
  4. That applying for a job, being considered the best candidate, and then not living up to the expectations would feel deeply humiliating – I'd feel like a fraud, feel guilty for the harm I've caused, feel ashamed of having betrayed all the people at the organization, feel like I can never live it down or risk running into any of them ever again at conferences and such.
  5. That sometimes they end up hiring a world-renowned researcher like Carl Shulman, and then I'd feel ashamed of even having considered applying because just the thought of it already feels hubristic.

The upshot for me was:

  1. To apply to places that have the funding and management capacity to hire everyone above some bar.
  2. To time the boom and bust cycles in EA and go into grantmaking at the start of the boom cycles and ETG during the bust cycles.
  3. To trade off funding vs. management capacity, and try to contribute to grantmaking in a way that doesn't come at a cost in management capacity, e.g., not as an employee, at times when management capacity is more limiting than funding. Or to create management capacity.
  4. To wait for someone to reach out to me unprompted to apply for a role, because then they've already decided that I might be a good fit and thus taken some of the terrible responsibility off my shoulders.
  5. A friend of mine does a lot of drunk driving and usually goes far above the speed limit. I sometimes meet with friends for optional pastime activities. The risk that I catch a cold and my performance is degraded after such a hangout is vastly more severe than the risk of my friend's drunk speeding because of all the hundreds of thousands of lives it might affect. Conversely, my friend hasn't killed a single person yet. So I'm in no position to judge her.

Meanwhile rejections were not a problem for me, so it's not really “rejection sensitivity.” I talked to my friends about how I'm expected to react to them, and their advice was helpful. If I had dared to apply for particularly responsible roles, a rejection would've been a relief. After all, rejection is a return to safety. It's just mixed with the shame over whatever mistakes I must've made in front of the interviewers. I considered not going to any conferences anymore where I might run into them, but my friends told me that's unnecessary. And it's true because when I decided not to hire people at my companies, I didn't want them to avoid me afterwards even if they've made mistakes in the interviews.

But more recently I've updated in the following ways:

  1. I heard from some hiring managers I'm friends with that even at their EA charities some applicants lie about their qualifications. It's like these people want to cause mayhem and destruction on a global scale by replacing better candidates. Naturally, these friends didn't hire the liars, but who knows how many slipped through and didn't get caught. I imagine these people are rare, so the probability that I'll replace someone better than me is higher but the severity of getting replaced by a liar is worse. So there's a tradeoff I didn't consider.
  2. It seems erroneous to think that someone can likely be a better candidate than me and yet value drift more easily than me. It's not impossible, but it's a strange convergence of mildly contradictory traits that I should thus have discounted. Besides, small-scale 1:1 compassion has a strong pull for me, so I run a risk of value-drifting away from high scalability impact to something like therapy.
  3. If I implicitly compare myself to Carl Shulman's polished outputs from the past decades, I set an unrealistically high bar and will necessarily feel lacking. It stands to reason that the actual best candidate will also make mistakes and that even Carl Shulman has made mistakes. Plus, there are not enough Carl Shulmans for every position at every organization.
  4. I've practiced imagining that somehow I've ended up committing horrendous war crimes on a global scale and how I would process the guilt and try to make it up to all the families I've destroyed. It helps me make my fears concrete like that and think through, step by step, how I would manage such a situation responsibly.
  5. If you're smart enough, self-deceptions won't be obviously false, sometimes not false at all, but they'll be selective and suspiciously purposeful.
  6. I have some information that HR doesn't have but HR also has some information that I don't have. Ideally, we collaborate to make the optimal decision.

Finally, for anyone struggling with similar difficulties in the face of overwhelming responsibility, here's a small example of someone processing his responsibility for a terrible accident.

Exactly. I care that the ideas in the post are good. Who does the actual “typing” is irrelevant for me.

And ghostwriting is nothing new. It's somewhat common practice for busy top researchers to verbally discuss topics with more junior researchers who still have a good grasp on what they're talking about and then for the junior researchers to write these verbal discussions down in the form of an easily digestible article.

Perhaps there are reasons why ghostwriting should be disclosed or not, but I don't see why ghostwriting by an AI is a special case that deserves more attention than human ghostwriting.

Right now, I’m leaving my job to go start a new charity, working on a very narrow intervention.

What is it? I'm curious!

Oof! That's a big variation! Thanks for flagging that!

The purpose of the whole first article is to lay out all these perspectives and make mine transparent. Originally I wanted to just write the sequence, but Claude suggested adding a whole 'nother post in front to introduce people to the ethical foundations.

We've had the discussion of ethical systems on this article. One of my core values is cooperativeness, and while I intuitively started out as a classic utilitarian, I found it at odds with many common and widely held value systems and not in ways where I would've been ready to bite the bullet. E.g., it maps to Quiverfull, which is a rather extreme outlier. I also don't bite the bullet on the Repugnant Conclusion.

But most people don't even think in terms of unbounded maximization or minimization, and certainly not in terms of classic utilitarianism. It's such a niche view out there. Even if I talk about implications that I actually like, like tiling the universe with blazingly fast, happy computations, people look at me weird. If I want to engage in moral trades with them, I have to adjust to a wholly different viewpoint. So it was probably around 2013–14 or so that I felt compelled to abandon classic utilitarianism as extreme, niche, and untenable.

I never settled on anything new after that but rather tried to figure out what the smallest common denominator is of the largest possible subset of all ethical preferences that I could think of.

The smallest common denominator that is widely shared is that suffering is bad. Lots of people care about lots of other stuff on top, and sometimes it's not the most important thing for them, but most people seem to be in agreement that suffering is bad. That's also why I got so interested in sadism, an important exception to the rule.

So I think by focusing just on “suffering is bad, let's reduce it” (and the particular framing around preferences that I find more intuitive), I can make my articles relevant to the widest audience possible. The Quiverfulls will have quibbles with it, the sadists will hate it, but maybe some 90+% of the world will benefit from them.

You adapt them to the other species. We can do the same with much better information with our friends: One friend enjoys tomato sauce; another friend doesn't enjoy it. You want to get pizza for all three of you, and it's supposed to be a surprise, so instead of being like, “Pizza has tomato sauce on it, and my friend doesn't like it, so I can't make any inference about whether they want pizza and should [get pizza for them anyway, get pizza only for myself and the other friend],” you can make adjustments like, “Pizza has tomato sauce on it, and my friend doesn't like it, so let's ask whether they can make pizza without tomato sauce.”

E.g., I give this example of eusocial vs. solitary insects. So when I wonder whether it's stressful for an insect to be caught in my flat and not finding the way out (despite some nutritious stuff I have lying around), my guess is that it's more likely stressful for the eusocial one who wants to get back to their hive than for the solitary one. When I had a fly over for a week or so in 2021, I gave her a name, enjoyed her company, and didn't worry much about her feeling trapped (I did leave a window open when I was awake), but when I had a bee over a few days ago, I was more concerned and helped the bee find their way out.

Likewise, humans can have some 0–3 children, cows some 5–10, and cats some 60–100 over their lifetimes. They all parent and alloparent. Turtles more like 1–2k, and no (allo-) parenting. So (just based on this data) I imagine that the loss of a child is almost as bad for a cow as it is for a human, that cats grieve somewhat less, and turtles not at all.

Or when it comes to pain, I mention in my last article that it probably doesn't make sense to have very strong pain signals, when an animal cannot react to them. So for nematodes it's not very useful to experience pain strongly; for a fly it may be very useful.

In your thought experiment there is a disconnect there that is crucial for my work: Antifrustrationism, as a form of preference utilitarianism, is all about the preferences of the individuals. If someone can fulfill their own preferences, there's nothing for me to do. If someone can tell me how I can help them fulfill their preferences, I'll happily do things for them that strike me as odd, like unusual kinks, because even though I don't share them, I trust their ability to know and communicate their preferences.

I only run into problems in cases where there is a power differential but the beings cannot communicate their preferences, e.g., because they are in the future, far away in the universe, or it's hard/impossible to build the sorts of experiments where one could elicit revealed preferences. Those are the cases where it gets uncomfortable because I need to make guesses and inferences. 

It's like with a caring mother of an infant who can't talk yet: The infant can't help themselves (yet), so she has to do it, but to do that well, she has to infer the preferences rather than asking what they are.

With the aliens, the situation is reversed. They are powerful but really dumb and could just ask how they can really help us, and NPP is not an obvious lever for a K-strategist species. But we have this problem with aging. Maybe they'll stop by Earth 100k years ago, largely fail to communicate with us, but study our biology and notice that this aging thing really sucks, especially in the last couple of decades of an animal's life. So just leave a time capsule with information on how to reverse aging that's designed in such a way that they hope we can decipher it once we have the necessary foundational technologies.

A confusing thing about my experience is that the truth of the no-self state is intellectually ineluctable to me and yet my perception is still filtered by selfhood. Sometimes I get a burst of spite-fueled energy that almost collapses into fatalism, and I remind myself that I chose this, that I can just go back if I don't like it, but that I want to keep reading this chapter of my life, as it were, because it's exciting! But even when I introspect, there's a perspective there – someone has called it the watcher?
