Thanks! I'm now unsure what I think.
If you can select from the intersection, you get options that are pretty good along both axes, pretty much by definition.
Isn't this an argument for always going for the best of both worlds, and never using a barbell strategy?
A concrete use case might be more illuminating.
This isn't super concrete (and I'm not sure if the specific examples are accurate), but for illustrative purposes, what if:
I think a lot of people's intuition would be that the compromise option is the best one to aim for. Should thinking about fat tails make us prefer one or other of the extremes instead?
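To make the kind of comparison I have in mind concrete, here's a minimal toy simulation. Everything in it is made up purely for illustration (the three hypothetical interventions, the lognormal distributions, all the parameters, and the assumption that value scales linearly with how much budget you put in); none of it comes from your post. It compares putting the whole budget into a "compromise" option that's moderately good on both axes against a barbell that splits the budget between two specialists, each with a fat-tailed upside on one axis:

```python
# Toy sketch of the "compromise vs barbell" question. All numbers are invented.
# Three hypothetical interventions, scored on two axes (say, x-risk impact and
# non-x-risk impact, in arbitrary but comparable units):
#   A: specialised on axis 1, fat-tailed upside there, ~nothing on axis 2
#   B: specialised on axis 2, fat-tailed upside there, ~nothing on axis 1
#   C: the "compromise", moderately good on both axes, thinner-tailed
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # Monte Carlo draws

# Fat-tailed upside for the specialists (lognormal with a heavy sigma).
a_axis1 = rng.lognormal(mean=0.0, sigma=2.0, size=n)
b_axis2 = rng.lognormal(mean=0.0, sigma=2.0, size=n)

# The compromise option: more modest, thinner-tailed payoff on both axes.
c_axis1 = rng.lognormal(mean=0.5, sigma=0.5, size=n)
c_axis2 = rng.lognormal(mean=0.5, sigma=0.5, size=n)

# Valuing both axes equally, total value is just the sum across axes.
# (Assumes value scales linearly with the share of budget spent.)
barbell = 0.5 * a_axis1 + 0.5 * b_axis2  # split the budget between A and B
compromise = c_axis1 + c_axis2           # put the whole budget into C

print("E[barbell]              =", barbell.mean())
print("E[compromise]           =", compromise.mean())
print("P(barbell > compromise) =", (barbell > compromise).mean())
```

Obviously which strategy looks better depends entirely on the distributions you plug in; the point of the sketch is just to separate "better in expectation" from "better in most draws", which seems to be where the fat-tails consideration bites.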
This is cool, thanks!
One scenario I am thinking about is how to prioritise biorisk interventions, if you care about both x-risk and non-x-risk impacts. I'm going to run through some thinking, and ask if you think it makes sense:
What do you think? I'm not sure whether that reasoning follows, or whether I've applied the lessons from your post in a sensible way.
From Specification gaming examples in AI:
Thanks, this is really interesting.
One follow-up question: who are safety managers? How are they trained, what's their seniority in the org structure, and what sorts of resources do they have access to?
In the bio case it seems that in at least some jurisdictions and especially historically, the people put in charge of this stuff were relatively low-level administrators, and not really empowered to enforce difficult decisions or make big calls. From your post it sounds like safety managers in engineering have a pretty different role.
Thanks for the kind words!
Can you say more about how either of your two worries works for industrial chemical engineering?
Also curious whether you know anything about the legislative basis for such regulation in the US. My impression from the bio standards is that it's pretty hard to get laws passed in the US, so if there are laws for chemical engineering, it would be interesting to understand why those were feasible whereas the bio ones weren't.
Good question.
There's a little bit on how to think about the XPT results in relation to other forecasts here (not much). Extrapolating from there to Samotsvety in particular:
I also haven't looked in detail at the respective resolution criteria, but at first glance the forecasts seem relatively hard to compare directly. (I agree with you, though, that the discrepancy is large enough to suggest a substantial disagreement were the two groups to forecast the same question; I just expect it will be hard to work out how large.)
Thanks, I think these points are good.
Do you have any examples in mind of domains where we might expect this? I've heard people say things like 'some maths problems require serial thinking time', but I still feel pretty vague about this and don't have much intuition about how strongly to expect it to bite.