6000 karma · Joined Dec 2017


Former AI safety research engineer, now AI governance researcher at OpenAI. Blog: thinkingcomplete.blogspot.com




> I don't follow. I get that acting on low-probability scenarios can let you get in on neglected opportunities, but you don't want to actually get the probabilities wrong, right?

I reject the idea that all-things-considered probabilities are "right" and inside-view probabilities are "wrong", because you should very rarely be using all-things-considered probabilities when making decisions, for reasons of simple arithmetic (as per my example). Tell me what you want to use the probability for and I'll tell you what type of probability you should be using.

You might say: look, even if you never actually use all-things-considered probabilities in the real world, at least in theory they're still normatively ideal. But I reject that too—see the Anthropic Decision Theory paper for why.

> If it informs you that EA beliefs on some question have been unusual from the get-go, it makes sense to update the other way, toward the distribution of beliefs among people not involved in the EA community.

I'm a bit confused by this. Suppose that EA has a good track record on an issue where its beliefs have been unusual from the get-go. For example, I think that by temperament EAs tend to be more open to sci-fi possibilities than others, even before having thought much about them; and that over the last decade or so we've increasingly seen sci-fi possibilities arising. Then I should update towards deferring to EAs because it seems like we're in the sort of world where sci-fi possibilities happen, and it seems like others are (irrationally) dismissing these possibilities.

On a separate note: I currently don't think that epistemic deference as a concept makes sense, because defying a consensus has two effects that are often roughly the same size: it means you're more likely to be wrong, and it means you're creating more value if right.* But if so, then using deferential credences to choose actions will systematically lead you astray, because you'll neglect the correlation between likelihood of success and value of success.

Toy example: your inside view says your novel plan has a 90% chance of working, and if it does it'll earn $1000; experts think it has a 10% chance of working, and if it does it'll earn $100. Suppose you place as much weight on your own worldview as on the experts'. Incorrect calculation: your all-things-considered credence in your plan working is 50%, your all-things-considered estimate of the value of success is $550, so your all-things-considered expected value of the plan is $275. Better calculation: your worldview says that the expected value of your plan is $900, the experts think the expected value is $10, average these to get an expected value of $455—much more valuable than in the incorrect calculation!
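The two calculations can be written out explicitly (a sketch; the numbers come from the toy example above, and the variable names are my own):

```python
# Two ways of combining an inside view with expert opinion,
# placing equal weight on each worldview.
inside_p, inside_value = 0.9, 1000   # your view: 90% chance, $1000 payoff
expert_p, expert_value = 0.1, 100    # experts: 10% chance, $100 payoff
w = 0.5                              # weight on your own worldview

# Incorrect: average credences and values separately, then multiply.
# This drops the correlation between probability and payoff.
avg_p = w * inside_p + (1 - w) * expert_p               # 0.5
avg_value = w * inside_value + (1 - w) * expert_value   # 550.0
incorrect_ev = avg_p * avg_value                        # 275.0

# Better: compute each worldview's expected value, then average those.
better_ev = (w * (inside_p * inside_value)
             + (1 - w) * (expert_p * expert_value))     # 455.0

print(incorrect_ev, better_ev)  # 275.0 455.0
```

The gap between the two numbers is exactly the neglected correlation term: the worldview that assigns a higher probability of success also assigns a higher payoff to success.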

Note that in the latter calculation we never actually calculated any "all-things-considered credences". For this reason I now only express such credences with a disclaimer like "but this shouldn't be taken as action-guiding".


* A third effect which might be bigger than either of them: it motivates you to go out and try stuff, which will give you valuable skills and make you more correct in the future.

Thanks! I'll update it to include the link.

> Only when people face starvation, illness, disasters, or warfare can they learn who they can really trust.

Isn't this approximately equivalent to the claim that trust becomes much more risky/costly under conditions of scarcity?

> only under conditions of local abundance do we see a lot of top-down hierarchical coercion

Yeah, this is an interesting point. I think my story here is that we need to talk about abundance at different levels. E.g. at the highest level (will my country/civilization survive?) you should often be in scarcity mindset, because losing one war is disastrous. Whereas at lower levels (e.g. will my city survive?) you can have more safety: your city is protected by your country (and your family is protected by your city, and you're protected by your family, and so on).

And so even when we face serious threats, we need to apply coercion only at the appropriate levels. AI is a danger on a civilizational level; but the best way to deal with danger on a civilizational level is via cultivating abundance at the level of your own community, since that's the only way it'll be able to make a difference at that higher level.

FYI I prefer "AI governance" over "AI strategy" because I think the latter pushes people towards trying to just sit down and think through arbitrarily abstract questions, which is very hard (especially for junior people). Better to zoom in more, as I discuss in this post.

> I can notice that Open Philanthropy's funding comes from one person

One person may well have multiple different parts, or subscribe to multiple different worldviews!

> asking oneself how much one values outcomes in different cause areas relative to each other, and then pursuing a measure of aggregate value with more or less vigor

I think your alternative implicitly assumes that, as a single person, you can just "decide" how much you value different outcomes. Whereas in fact I think of worldview diversification as a pretty good approximation of the process I'd go through internally if I were asked this question.

I agree that this, and your other comment below, both describe unappealing features of the current setup. I'm just pointing out that in fact there are unappealing outcomes all over the place, and that just because the equilibrium we've landed on has some unappealing properties doesn't mean that it's the wrong equilibrium. Specifically, the more you move towards pure maximization, the more you run into these problems; and as Holden points out, I don't think you can get out of them just by saying "let's maximize correctly".

(You might say: why not a middle ground between "fixed buckets" and "pure utility maximization"? But note that having a few buckets chosen based on standard cause prioritization reasoning is already a middle ground between pure utility maximization and the mainstream approach to charity, which does way less cause prioritization.)

One reason to see "dangling" relative values as principled: utility functions are equivalent (i.e. they produce the same preferences over actions) up to a positive affine transformation. This is also why we often use voting systems to make decisions when people's preferences clash, rather than trying to extract a metric of utility that can be compared across people.
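The affine-invariance point can be illustrated with a minimal sketch (the utilities and action names here are illustrative, not from the source):

```python
# A positive affine transformation a*u + b (with a > 0) of a utility
# function leaves the preference ordering over actions unchanged.
actions = ["A", "B", "C"]
u = {"A": 3.0, "B": 1.0, "C": 2.0}   # arbitrary illustrative utilities

def transformed(a, b):
    """Apply the affine transformation a*u + b to every action's utility."""
    return {act: a * u[act] + b for act in actions}

def ranking(util):
    """Order actions from most to least preferred under a utility dict."""
    return sorted(actions, key=lambda act: util[act], reverse=True)

v = transformed(a=10.0, b=-7.0)      # any a > 0 and any b will do
assert ranking(u) == ranking(v)      # identical preferences: ['A', 'C', 'B']
```

Because the scale and zero point carry no information about preferences, there is no privileged way to compare one person's utility units against another's, which is the gap that voting systems route around.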

The Pareto improvements aren't about worldview diversification, though. You can see this because you have exactly the same problem under a single worldview, if you keep the amount of funding constant per year. You can solve this by letting each worldview donate to, or steal from, its own budget in other years.

I do think trade between worldviews is good in addition to that, to avoid the costs of lending/borrowing; the issue is that you need to be very careful when you're relying on the worldviews themselves to tell you how much weight to put on them. So for example, if every year the AI risk worldview gets more and more alarmed, then it might "borrow" more and more money from the factory farming worldview, with the promise to pay back whenever it starts getting less alarmed. But the whole point of doing the bucketing in the first place is so that the factory farming worldview can protect itself from the possibility of the AI risk worldview being totally wrong/unhinged, and so you can't assume that the AI risk worldview is just as likely to update down as to update upwards.

You could avoid this by having the AI risk worldview make concrete predictions about what would make it more or less alarmed. But now this is far from a "simple" scheme, it requires doing a bunch of novel intellectual work to pin down the relevant predictions.

Re the U-turn on criminal justice: no matter what scheme you use, you should expect it to change whenever you have novel thoughts about the topic. Worldview diversification isn't a substitute for thinking!

I don't think this is actually a problem, for roughly the reasons described here. I.e. worldview diversification can be seen as a way of aggregating the preferences of multiple agents—but this shouldn't necessarily look like maximizing any given utility function.

I also talk about this here.
