What I'm saying is that if you believe that x-risk is 0.1%, then you think we're at least one in a million.
I think you're saying "if you believe that x-risk this century is 0.1%, then survival probability this century is 99.9%, and for total survival probability over the next trillion years to be 0.01%, there can be at most 9200 centuries with risk that high over the next trillion years (.999^9200=0.0001), which means we're in (most generously) a one-in-one-million century, as a trillion years is 10 billion centuries, which divided by ten thousand is a million." That seem right?
Then, if the expected cost-effectiveness of the best opportunities varies substantially over time, there will be just one point in time at which your philanthropy will have the most impact, and you should try to max out your philanthropy at that time period, donating all your philanthropy at that time if you can.
Tho I note that the only way one would ever take such opportunities, if offered, is by developing a view of what sorts of opportunities are good that is sufficiently motivating to actually take action at least once every few decades.
For example, when the most attractive opportunity so far appears in year 19 of investing and assessing opportunities, will our patient philanthropist direct all their money towards it, and then start saving again? Will they reason that they don't have sufficient evidence to overcome their prior that year 19 is not more attractive than the years to come? Will they say "well, I'm following the Secretary Problem solution, and 19 is less than 70/e, so I'm still in info-gathering mode"?
They won't, of course, know which path had higher value in their particular world until they die, but it seems to me like most of the information content of a strategy that waits to pull the trigger is in when it decides to pull the trigger, and this feels like the least explicit part of your argument.
Compare to investing, where some people are fans of timing the market, and some people are fans of dollar-cost-averaging. If you think the attractiveness of giving opportunities is going to be unpredictably volatile, then doing direct work or philanthropy ever year is the optimal approach. If instead you think the attractiveness of giving opportunities is predictably volatile, or predictably stable, then doing patient philanthropy makes more sense.
What seems odd to me is simultaneously holding the outside view sense that we have insufficient evidence to think that we're correctly assessing a promising opportunity now, and having the sense that we should expect that we will correctly assess the promising opportunities in the future when they do happen.
Now that the world has experienced COVID-19, everyone understands that pandemics could be bad
I found it somewhat surprising how quickly the pandemic was polarized politically; I am curious whether you expect this group to be partisan, and whether that would be a positive or negative factor.
[A related historical question: what were the political party memberships of members of environmental groups in the US across time? I would vaguely suspect that it started off more even than it is today.]
I felt confused about why I was presented with a fully general argument for something I thought I indicated I already considered.
In my original comment, I was trying to resolve the puzzle of why something would have to appear edgy instead of just having fewer filters, by pointing out the ways in which having unshared filters would lead to the appearance of edginess. [On reflection, I should've been clearer about the 'unshared' aspect of it.]
you didn't want to voice unambiguous support for the view that the comment wordings were in fact not easy to improve on given the choice of topic.
I'm afraid this sentence has too many negations for me to clearly point one way or the other, but let me try to restate it and say why I made a comment:
The mechanistic approach to avoiding offense is to keep track of the ways things you say could be interpreted negatively, and search for ways to get your point across while not allowing for any of the negative interpretations. This is a tax on saying anything, and it especially taxes statements on touchy subjects, and the tax on saying things backpropagates into a tax on thinking them.
When we consider people who fail at the task of avoiding giving offense, it seems like there are three categories to consider:
1. The Blunt, who are ignoring the question of how the comment will land, and are just trying to state their point clearly (according to them).
2. The Blithe, who would put effort into rewording their point if they knew how to avoid giving offense, but whose models of the audience are inadequate to the task.
3. The Edgy, who are optimizing for being 'on the line' or in the 'plausible deniability' region, where they can both offend some targets and have some defenders who view their statements as unobjectionable.
While I'm comfortable predicting those categories will exist, confidently asserting that someone falls into any particular category is hard, because it involves some amount of mind-reading (and I think the typical mind fallacy makes it easy to think people are being Edgy, because you assume they see your filters when deciding what to say). That said, my guess is that Hanson is Blunt instead of Edgy or Blithe.
Comparing trolley accidents to rape is pretty ridiculous for a few reasons:
I think you're missing my point; I'm not describing the scale, but the type. For example, suppose we were discussing racial prejudice, and I made an analogy to prejudice against the left-handed; it would be highly innumerate of me to claim that prejudice against the left-handed is as damaging as racial prejudice, but it might be accurate of me to say both are examples of prejudice against inborn characteristics, are perceived as unfair by the victims, and so on.
And so if you're not trying to compare expected trauma, and just come up with rules of politeness that guard against any expected trauma above a threshold, setting the threshold low enough that both "prejudice against left-handers" and "prejudice against other races" are out doesn't imply that the damage done by both are similar.
That said, I don't think I agree with the points on your list, because I used the reference class of "vehicular violence or accidents," which is very broad. I agree there's an important disanalogy in that 'forced choices' like in the trolley problem are highly atypical for vehicular accidents, most of which are caused by negligence of one sort or another, and that trolleys themselves are very rare compared to cars, trucks, and trains, and so I don't actually expect most sufferers of MVA PTSD to be triggered or offended by the trolley problem. But if they were, it seems relevant that (in the US) motor vehicle accidents are more common than rape, and lead to more cases of PTSD than rape (at least, according to 2004 research; I couldn't quickly find anything more recent).
I also think that utilitarian thought experiments in general radiate the "can't be trusted to abide by norms" property; in the 'fat man' or 'organ donor' variants of the trolley problem, for example, the naive utilitarian answer is to murder, which is also a real risk that could make the conversation include an implicit threat.
I'm a bit puzzled why it has to be edgy on top of just talking with fewer filters.
Presumably every filter is associated with an edge, right? Like, the 'trolley problem' is a classic of philosophy, and yet it is potentially traumatic for the victims of vehicular violence or accidents. If that's a group you don't want to upset or offend, you install a filter to catch yourself before you do, and when seeing other people say things you would've filtered out, you perceive them as 'edgy'. "Don't they know they shouldn't say that? Are they deliberately saying that because it's edgy?"
[A more real example is that a friend once collected a list of classic examples and thought experiments, and edited all of the food-based ones to be vegan, instead of the original food item. Presumably the people who originally generated those thought experiments didn't perceive them as being 'edgy' or 'over the line' in some way.]
but also some element of deliberate provocation.
I read a lot of old books; for example, it's interesting to contrast the 1934 and 1981 editions of How to Win Friends and Influence People. Deciding to write one of the 'old-version' sentences in 2020 would probably be seen as a deliberate provocation, and yet it seems hugely inconsistent to see Dale Carnegie as out to deliberately provoke people.
Now, I'm not saying Hanson isn't deliberately edgy; he very well might be. But there are a lot of ways in which you might offend someone, and it takes a lot of computation to proactively notice and prevent all of them, and it's very easy to think your filters are "common knowledge" or "obvious" when in fact they aren't. As a matter of bounded computation, thoughts spent on filters are thoughts not spent on other things, and so there is a real tradeoff here, where the fewer filters are required the more thoughts can be spent on other things, but this is coming through a literal increase in carelessness.
Benjamin Franklin, in his will, left £1,000 pounds each to the cities of Boston and Philadelphia, with the proviso that the money should be invested for 100 years, with 25 percent of the principal to be invested for a further 100 years.
Also of note is that he gave conditions on the investments; the money was to be lent to married men under 25 who had finished an apprenticeship, with two people willing to co-sign the loan for them. So in that regard it was something like a modern microlending program, instead of just trying to maximize returns for benefits in the future.
Presumably there are two categories of heuristics, here: ones which relate to actual difficulties in discerning the ground truth, and ones which are irrelevant or stem from a misunderstanding. I think it seems bad that this list implicitly casts the heuristics as being in the latter category, and rather than linking to why each is irrelevant or a misunderstanding it does something closer to mocking the concern.
For example, I would decompose the "It's not empirically testable" heuristic into two different components. The first is something like "it's way easier to do good work when you have tight feedback loops, and a project that relates to a single shot opportunity without a clear theory simply cannot have tight feedback loops." This was the primary reason I stayed away from AGI safety for years, and still seems to me like a major challenge to research work here. [I was eventually convinced that it was worth putting up with this challenge, however.]
The second is something like "only trust claims that have been empirically verified", which runs into serious problems with situations where the claims are about the future, or running the test is ruinously expensive. A claim that 'putting lamb's blood on your door tonight will cause your child to be spared' is one that you have to act on (or not) before you get to observe whether or not it will be effective, and so whether or not this heuristic helps depends on whether or not it's possible to have any edge ahead of time on figuring out which such claims are accurate.
I certainly don't think agents "should" try to achieve outcomes that are impossible from the problem specification itself.
I think you need to make a clearer distinction here between "outcomes that don't exist in the universe's dynamics" (like taking both boxes and receiving $1,001,000) and "outcomes that can't exist in my branch" (like there not being a bomb in the unlucky case). Because if you're operating just in the branch you find yourself in, many outcomes whose probability an FDT agent is trying to affect are impossible from the problem specification (once you include observations).
And, to be clear, I do think agents "should" try to achieve outcomes that are impossible from the problem specification including observations, if certain criteria are met, in a way that basically lines up with FDT, just like agents "should" try to achieve outcomes that are already known to have happened from the problem specification including observations.
As an example, If you're in Parfit's Hitchhiker, you should pay once you reach town, even though reaching town has probability 1 in cases where you're deciding whether or not to pay, and the reason for this is because it was necessary for reaching town to have had probability 1.