All of Chris Smith's Comments + Replies

I largely agree with what you said in this comment, though I'd say the line between data collection and data processing is often blurred in real-world scenarios. 

I think we are talking past each other (not in a bad faith way though!), so I want to stop myself from digging us deeper into an unproductive rabbit hole.

Just saw this comment, I'm also super late to the party responding to you!

It actually seems to me it might have been worth emphasising more, as I think a casual reader could think this post was a critique of formal/explicit/quantitative models in particular.

Totally agree! Honestly, I had several goals with this post, and I almost completely failed on two of them:

  • Arguing why utilitarianism can't be the foundation of ethics.
  • Without talking much about AI, explaining why I don't think people in the EA community are being reasonable when they suggest there's a de
... (read more)

Just found this post, coming in to comment a year late. Thanks Michael for the thoughtful post and Ozzie for the thoughtful comments!

I'm not saying that these are easy to solve, but rather, there is a mathematical strategy to generally fix them in ways that would make sense intuitively. There's no better approach than to try to approximate the mathematical approach, or go with an approach that in-expectation does a decent job at approximating the mathematical approach.

I might agree with you about what's (in some sense) mathematically possible (in pri... (read more)

2
MichaelA
3y
Hmm, I feel like you may be framing things quite differently to how I would, or something. My initial reaction to your comment is something like:

It seems useful to conceptually separate data collection from data processing, where by the latter I mean using that data to arrive at probability estimates and decisions. I think Bayesianism (in the sense of using Bayes' theorem and a Bayesian interpretation of probability) and "math and technical patches" might tend to be part of the data processing, not the data collection. (Though they could also guide what data to look for. And this is just a rough conceptual divide.)

When Ozzie wrote about going with "an approach that in-expectation does a decent job at approximating the mathematical approach", he was specifically referring to dealing with the optimizer's curse. I'd consider this part of data processing.

Meanwhile, my intuitions (i.e., gut reactions) and what experts say are data. Attending to them is data collection, and then we have to decide how to integrate that with other things to arrive at probability estimates and decisions. I don't think we should see ourselves as deciding between either Bayesianism and "math and technical patches" or paying attention to my intuitions and domain experts. You can feed all sorts of evidence into Bayes' theorem.

I doubt any EA would argue we should form conclusions from "Bayesianism and math alone", without using any data from the world (including even their intuitive sense of what numbers to plug in, or whether people they share their findings with seem skeptical). I'm not even sure what that'd look like. And I think my intuitions or what domain experts say can very easily be made sense of as valid data within a Bayesian framework. Generally, my intuitions and experts are more likely to indicate X is true in worlds where X is true than where it's not. This effect is stronger when the conditions for intuitive expertise are met, when experts' incentives seem to be wel

(I used to work for GiveWell)

Hey Ben,

I'm sympathetic to a lot of the points you make in this post, but I think your conclusions are far more negative than is reasonable.

Here's the stuff I largely agree with you on:

-The opportunities to save lives w/ global health interventions probably aren't nearly as easy as Singer's thought experiment suggests

-Entities other than GiveWell use GiveWell's estimates without the appropriate level of nuance and detail about where the estimates come from and how uncertain they are

-There's not an... (read more)

That's interesting—and something I may not have considered enough. I think there's a real possibility that there could be excessive quantification in some areas of EA but not enough of it in other areas.

For what it's worth, I may have made this post too broad. I wanted to point out a handful of issues that I felt all kind of fell under the umbrella of "having excessive faith in systematic or mathematical thinking styles." Maybe I should have written several posts on specific topics that get at areas of disagreement a bit more concretely. I might get around to those posts at some point in the future.

FWIW, as someone who was and is broadly sympathetic to the aims of the OP, my general impression agrees with "excessive quantification in some areas of EA but not enough of it in other areas."

(I think the full picture has more nuance than I can easily convey, e.g. rather than 'more vs. less quantification' it often seems more important to me how quantitative estimates are being used - what role they play in the overall decision-making or discussion process.)

Again, none of this is to say that Bayesianism is fundamentally broken or that high-level Bayesian-ish things like "I have a very skeptical prior so I should not take this estimate of impact at face value" are crazy.

As a real world example:

Venture capitalists frequently fund things that they're extremely uncertain about. It's my impression that Bayesian calculations rarely play into these situations. Instead, smart VCs think hard and critically and come to conclusions based on processes that they probably don't fully understand themselves.

It could be that VCs have just failed to realize the amazingness of Bayesianism. However, given that they're smart & there's a ton of money on the table, I think the much more plausible explanation is that hardcore Bayesianism wouldn't lead to better results than whatever it is that successful VCs actually do.

4
kbog
5y
I interned for a VC, albeit a small and unknown one. Sure, they don't do Bayesian calculations, if you want to be really precise. But they make extensive use of quantitative estimates all the same. If anything, they are cruder than what EAs do. As far as I know, they don't bother correcting for the optimizer's curse! I never heard it mentioned.

VCs don't primarily rely on the quantitative models, but other areas of finance do. If what they do is OK, then what EAs do is better. This is consistent with what finance professionals told me about the financial modeling that I did.

Plus, this is not about the optimizer's curse. Imagine that you told those VCs that they were no longer choosing which startups are best, instead they now have to select which ones are better-than-average and which ones are worse-than-average. The optimizer's curse will no longer interfere. Yet they're not going to start relying more on explicit Bayesian calculations. They're going to use the same way of thinking as always.

And explicit Bayesian calculation is rarely used by anyone anywhere. Humans encounter many problems which are not about optimizing, and they still don't use explicit Bayesian calculation. So clearly the optimizer's curse is not the issue. Instead, it's a matter of which kinds of cognition and calculation people are more or less comfortable with.
5
Chris Smith
5y
Again, none of this is to say that Bayesianism is fundamentally broken or that high-level Bayesian-ish things like "I have a very skeptical prior so I should not take this estimate of impact at face value" are crazy.

It's always worth entertaining multiple models if you can do that at no cost. However, doing that often comes at some cost (money, time, etc). In situations with lots of uncertainty (where the optimizer's curse is liable to cause significant problems), it's worth paying much higher costs to entertain multiple models (or do other things I suggested) than it is in cases where the optimizer's curse is unlikely to cause serious problems.

3
kbog
5y
I don't agree. Why is the uncertainty that comes from model uncertainty - as opposed to any other kind of uncertainty - uniquely important for the optimizer's curse? The optimizer's curse does not discriminate between estimates that are too high for modeling reasons, versus estimates that are too high for any other reason. The mere fact that there's more uncertainty is not relevant, because we are talking about how much time we should spend worrying about one kind of uncertainty versus another. "Do more to reduce uncertainty" is just a platitude; we always want to reduce uncertainty.

Hey Kyle, I'd stopped responding since I felt like we were well beyond the point where we were likely to convince one another or say things that those reading the comments would find insightful.

I understand why you think "good prior" needs to be defined better.

As I try to communicate (but may not quite say explicitly) in my post, I think that in situations where uncertainty is poorly understood, it's hard to come up with priors that are good enough that choosing actions based on explicit Bayesian calculations will lead to better outcomes than choosing actions based on a combination of careful skepticism, information gathering, hunches, and critical thinking.
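To give a rough sense of what I mean (purely illustrative numbers, not from any real evaluation): under two priors that could each look defensible, the same explicit calculation can end up favoring different options.

```python
# Rough sketch with made-up numbers: how much a Bayesian adjustment discounts a
# speculative estimate depends heavily on the prior, and two priors that both seem
# defensible can flip which option the explicit calculation favors.

def posterior_mean(estimate, noise_sd, prior_mean, prior_sd):
    """Posterior mean of the true value under a normal prior and normal estimation noise."""
    w = prior_sd**2 / (prior_sd**2 + noise_sd**2)  # weight placed on the raw estimate
    return w * estimate + (1 - w) * prior_mean

speculative_estimate, speculative_noise_sd = 8.0, 4.0  # poorly understood opportunity
well_understood_value = 3.0                            # alternative we can estimate reliably

for prior_mean, prior_sd in [(0.0, 1.0), (2.0, 5.0)]:  # two "reasonable-looking" priors
    adjusted = posterior_mean(speculative_estimate, speculative_noise_sd, prior_mean, prior_sd)
    choice = "speculative option" if adjusted > well_understood_value else "well-understood option"
    print(f"prior N({prior_mean}, {prior_sd}^2): adjusted estimate {adjusted:.2f} -> fund the {choice}")
```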

2
kbog
5y
Explicit Bayesian calculation is a way of choosing actions based on a combination of careful skepticism, information gathering, hunches, and critical thinking. (With math too.) I'm guessing you mean we should use intuition for the final selection, instead of quantitative estimates. OK, but I don't see how the original post is supposed to back it up; I don't see what the optimizer's curse has to do with it.
9
Chris Smith
5y
As a real world example: Venture capitalists frequently fund things that they're extremely uncertain about. It's my impression that Bayesian calculations rarely play into these situations. Instead, smart VCs think hard and critically and come to conclusions based on processes that they probably don't fully understand themselves.

It could be that VCs have just failed to realize the amazingness of Bayesianism. However, given that they're smart & there's a ton of money on the table, I think the much more plausible explanation is that hardcore Bayesianism wouldn't lead to better results than whatever it is that successful VCs actually do.

I'd also be excited to see more people in the EA movement doing the sort of work that I think would put society in a good position for handling future problems when they arrive. E.g., I think a lot of people who associate with EA might be awfully good at pushing for progress in metascience/open science or promoting a free & open internet.

1
Liam_Donovan
5y
A recent example of this happening might be EA LTF Fund grants to various organizations trying to improve societal epistemic rationality (e.g. by supporting prediction markets)

Thanks for raising this.

To be clear, I'm still a huge fan of GiveWell. GiveWell only shows up in so many examples in my post because I'm so familiar with the organization.

I mostly agree with the points Holden makes in his cluster thinking post (and his other related posts). Despite that, I still have serious reservations about some of the decision-making strategies used both at GW and in the EA community at large. It could be that Holden and I mostly agree, but other people take different positions. It could be that Holden and I agree about a lo... (read more)

AGB
5y

Fair enough. I remain in almost-total agreement, so I guess I'll just have to try and keep an eye out for what you describe. But based on what I've seen within EA, which is evidently very different to what you've seen, I'm more worried about little-to-zero quantification than excessive quantification.

5
Chris Smith
5y
I'd also be excited to see more people in the EA movement doing the sort of work that I think would put society in a good position for handling future problems when they arrive. E.g., I think a lot of people who associate with EA might be awfully good at pushing for progress in metascience/open science or promoting a free & open internet.

Just to be clear, much of the deworming work supported by people in the EA community happens in areas where worm infections are more intense or are caused by worm species other than Trichuris & Ascaris. However, I believe a non-trivial amount of deworming done by charities supported by the EA community occurs in areas w/ primarily light infections from those worms.

Sure. To be clear, I think most of what I'm concerned about applies to prioritization decisions made in highly-uncertain scenarios. So far, I think the EA community has had very few opportunities to look back and conclusively assess whether highly-uncertain things it prioritized turned out to be worthwhile. (Ben makes a similar point at https://www.lesswrong.com/posts/Kb9HeG2jHy2GehHDY/effective-altruism-is-self-recommending.)

That said, there are cases where I believe mistakes are being made. For example, I think mass deworming in areas where almost ... (read more)

8
Chris Smith
5y
Just to be clear, much of the deworming work supported by people in the EA community happens in areas where worm infections are more intense or are caused by worm species other than Trichuris & Ascaris. However, I believe a non-trivial amount of deworming done by charities supported by the EA community occurs in areas w/ primarily light infections from those worms.

I think it's super exciting—a really useful application of probability!

I don't know as much as I'd like to about Tetlock's work. My understanding is that the work has focused mostly on geopolitical events where forecasters have been awfully successful. Geopolitical events are a kind of thing I think people are in an OK position for predicting—i.e. we've seen a lot of geopolitical events in the past that are similar to the events we expect to see in the future. We have decent theories that can explain why certain events came to pas... (read more)

2
MichaelA
4y
I think I agree with everything you've said there, except that I'd prefer to stay away from the term "Knightian", as it seems to be so often taken to refer to an absolute, binary distinction. It seems you wouldn't endorse that binary distinction yourself, given that you say "Knightian-ish", and that in your post you write:

But I think, whatever one's own intentions, the term "Knightian" sneaks in a lot of baggage and connotations. And on top of that, the term is interpreted in so many different ways by different people. For example, I happened to have recently seen events very similar to those you contrasted against cases of Knightian-ish uncertainty used as examples to explain the concept of Knightian uncertainty (in this paper):

So I see the term "Knightian" as introducing more confusion than it's worth, and I'd prefer to only use it if I also give caveats to that effect, or to highlight the confusions it causes. Typically, I'd prefer to rely instead on terms like more or less resilient, precise, or (your term) hazy probabilities/credences. (I collected various terms that can be used for this sort of idea here.)

[I know this comment is very late to the party, but I'm working on some posts about the idea of a risk-uncertainty distinction, and was re-reading your post to help inform that.]

I'm struggling to understand how your proposed new group avoids the optimizer's curse, and I'm worried we're already talking past each other. To be clear, I don't believe there's something wrong with Bayesian methods in the abstract. Those methods are correct in a technical sense. They clearly work in situations where everything that matters can be completely quantified.

The position I'm taking is that the scope of real-world problems that those methods are useful for is limited because our ability to precisely qua... (read more)

3
kbog
5y
Because I'm not optimizing! Of course it is still the case that the highest-scoring estimates will probably be overestimates in my new group. The difference is, I don't care about getting the right scores on the highest-scoring estimates. Now I care about getting the best scores on all my estimates. Or to phrase it another way, suppose that the intervention will be randomly selected rather than picked from the top. Well yes, but I think the methods work better than anything else for all these scenarios.
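To make the contrast concrete, here's a quick simulation sketch (purely illustrative numbers, not anyone's real estimates): the estimate behind a top-ranked pick is inflated on average, while the estimate behind a randomly selected pick is not.

```python
# Quick simulation sketch (illustrative numbers only): each project has an unknown true
# value and a noisy estimate. Picking the project with the highest estimate makes that
# estimate an overestimate on average (the optimizer's curse); picking a project at
# random does not.
import numpy as np

rng = np.random.default_rng(0)
n_projects, n_trials, noise_sd = 20, 10_000, 1.0

bias_top, bias_random = [], []
for _ in range(n_trials):
    true_values = rng.normal(0.0, 1.0, n_projects)                   # unknown true impacts
    estimates = true_values + rng.normal(0.0, noise_sd, n_projects)  # noisy evaluations
    top = int(np.argmax(estimates))                                  # fund the best-looking project
    rand = int(rng.integers(n_projects))                             # fund a random project
    bias_top.append(estimates[top] - true_values[top])
    bias_random.append(estimates[rand] - true_values[rand])

print(f"mean(estimate - true value), top-ranked pick: {np.mean(bias_top):+.2f}")    # clearly positive
print(f"mean(estimate - true value), random pick:     {np.mean(bias_random):+.2f}")  # roughly zero
```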

Thanks Max! That paper looks interesting—I'll have to give it a closer read at some point.

I agree with you that how the reliability of assessments varies between options is crucial.

Can you expand on how you would directly estimate the reliability of charity evaluations? I feel like there are a lot of realistic situations where this would be extremely difficult to do well.

4
kbog
5y
I mean do the adjustment for the optimizer's curse. Or whatever else is in that paper. I think talk of doing things "well" or "reliably" should be tabooed from this discussion, because no one has any coherent idea of what the threshold for 'well enough' or 'reliable enough' means or is in this context. "Better" or "more reliable" makes sense.
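For concreteness, here is a sketch of one standard form of that adjustment (made-up figures, and not necessarily the exact procedure in the paper being referenced): treat each estimate as noisy evidence about the true value and shrink it toward a prior before ranking, so noisier estimates are discounted more.

```python
# Sketch of one standard correction for the optimizer's curse (made-up figures; not
# necessarily the exact procedure in the paper being referenced): shrink each noisy
# estimate toward a common prior before ranking, so options with noisier estimates
# are discounted more heavily.

def shrink(estimate, noise_sd, prior_mean=0.0, prior_sd=1.0):
    """Posterior mean of the true value under a normal prior and normal estimation noise."""
    w = prior_sd**2 / (prior_sd**2 + noise_sd**2)
    return w * estimate + (1 - w) * prior_mean

raw_estimates = {              # (estimated value, how noisy we think the estimate is)
    "Charity A": (5.0, 3.0),   # big claimed impact, but very rough evaluation
    "Charity B": (3.5, 0.5),   # smaller claimed impact, well-understood evaluation
    "Charity C": (4.0, 2.0),
}

adjusted = {name: shrink(est, sd) for name, (est, sd) in raw_estimates.items()}
for name, value in sorted(adjusted.items(), key=lambda kv: -kv[1]):
    print(f"{name}: raw {raw_estimates[name][0]:.1f} -> adjusted {value:.2f}")
# After shrinkage, Charity B's well-grounded 3.5 outranks Charity A's rougher 5.0.
```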

Thanks for the detailed comment!

I expect we’ll remain in disagreement, but I’ll clarify where I stand on a couple of points you raised:

“Optimizer's curse only matters when comparing better-understood projects to worse-understood projects, but you are talking about "prioritizing among funding opportunities that involve substantial, poorly understood uncertainty."

Certainly, the optimizer’s curse may be a big deal when well-understood projects are compared with poorly-understood projects. However, I don’t think it’s the case that all projects ... (read more)

5
kbog
5y
"Footing" here is about the robustness of our credences, so I'm not sure that we can really be ignorant of them. Yes different projects in a poorly understood domain will have different levels of poorly understood uncertainty, but it's not clear that this is more important than the different levels of uncertainty in better-understood domains (e.g. comparisons across Givewell charities). What do you mean by reliable? Yes, but it's very hard to attack any particular prior as well. Yes I know but again it's the ordering that matters. And we can correct for optimizer's curse, and we don't know if these corrections will overcorrect or undercorrect. "The problem" should be precisely defined. Identifying the correct intervention is hard because the optimizer's curse complicates comparisons between better- and worse-substantiated projects? Yes we acknowledge that. And you are not just saying that there's a problem, you are saying that there is a problem with a particular methodology, Bayesian probability. That is very unclear. This is just a generic bucket of "stuff that makes estimates more accurate, sometimes" without any more connection to the optimizer's curse than to any other facets of uncertainty. Let's imagine I make a new group whose job is to randomly select projects and then estimate each project's expected utility as accurately and precisely as possible. In this case the optimizer's curse will not apply to me. But I'll still want to evaluate things with multiple models, learn more and use proxies such as social capacity. What is some advice that my group should not follow, that Givewell or Open Philanthropy should follow? Aside from the existing advice for how to make adjustments for the Optimizer's Curse. If you want, you can define some set of future updates (e.g. researching something for 1 week) and specify a probability distribution for your belief state after that process. I don't think that level of explicit detail is typically necessary though. Y

It's definitely an interesting phenomenon & worth thinking about seriously.

Any procedures for optimizing for expected impact could go wrong if the value of long-term alliances and relationships isn't accounted for.

Thanks Milan—I probably should have been a bit more detailed in my summary.

Here are the main issues I see:

-The optimizer's curse is an underappreciated threat to those who prioritize among causes and programs that involve substantial, poorly understood uncertainty.

-I think EAs are unusually prone to wrong-way reductions: a fallacy where people try to solve messy, hard problems with tidy, formulaic approaches that actually create more issues than they resolve.

--I argue that trying to turn all uncertainty into something like numeric probability estimat... (read more)