Yes, I suppose I am trying to divide tasks/projects up into two buckets based on whether they require high context, value-alignment, strategic thinking, and EA-ness. And I think my claim was/is that UI design is comparatively easy to outsource to someone without much of the relevant context and values, and that the comparative advantage of the higher-context people is therefore to do things that are harder to outsource to lower-context people. But I know ~nothing about UI design; maybe being higher-context is actually super useful there.
Nice post! I agree moral errors aren't only a worry for moral realists. But they do seem especially concerning for realists, as the moral truth may be very hard to discover, even for superintelligences. For antirealists, the first 100 years of a long reflection may get you most of the way to where your views will converge after a billion years of reflecting on your values. But the first 100 years of a long reflection are less guaranteed to get you close to the realist moral truth. So a 100-year reflection is, e.g., 90% likely to avoid massive moral errors for antirealists, but maybe only 40% likely to do so for realists.
--
Often when there are long lists like this, I find it useful for my conceptual understanding to try to create some structure to fit each item into; here is my attempt.
A moral error is making a moral decision that is quite suboptimal. This can happen if:
--
Some minor points:
Great set of posts (including the 'how far' and 'how sudden' related ones). I only skimmed the parts I had read drafts of, but still have a few comments, mostly minor:
1. Accelerating progress framing
We define “accelerating AI progress” as “each increment of capability advancement (e.g. GPT-3 → GPT-4) happens more quickly than the last”.
I am a bit skeptical of this definition, both because it is underspecified and because I'm not sure it is pointing at the most important thing.
2. Max rate of change
Theoretical limits for the speed of progress are 100X as fast as recent progress.
It would be good to flag in the main text that the justification for this is in Appendix 2 (initially I thought it was a bare assertion). Also, it is interesting that in @kokotajlod's scenario the 'wildly superintelligent' AI maxes out at a 1-million-fold AI R&D speedup; I commented to them on a draft that this seemed implausibly high to me. I have no particular take on whether 100x is too low or too high as the theoretical max, but it would be interesting to work out why there is this Forethought vs AI Futures difference.
3. Error in GPT OOMs calculations
4. Physical limits
Regarding the effective physical limits of each feedback loop, it is perhaps worth noting that your estimate for the chip production feedback loop is very well grounded and high-confidence, as we know more or less exactly the energy output of the sun, whereas the other two are highly speculative. Which is fine; they are just quite different types of estimates, so we should remember to rely far less on the latter.
5. End of the transition period
(Finally, Fn2 is missing a link.)
Thanks for this, I hadn't thought much about the topic and agree it seems more neglected than it should be. But I am probably overall less bullish than you (as operationalised by, e.g., what fraction of people in the existential risk field should be making this a significant focus: I am perhaps closer to 5% than your 30% at present).
I liked your flowchart on 'Inputs in the AI application pipeline,' so using that framing:
In terms of which applications to focus on, my guess is that epistemic tools and coordination-enabling tools will mostly be built by default (though of course, as you note, additional effort can still speed them up somewhat). E.g. politicians, business leaders, and academics would all presumably love to have better predictions for which policies will be popular, what facts are true, which papers will replicate, etc. And negotiation tools might be quite valuable for e.g. negotiating corporate mergers and deals.
So my take is that probably a majority of the game here is in 'automated AI safety/governance/strategy' because there will be less corporate incentive here, and it is also our comparative advantage to work on.
Overall, I agree differential AI tool development could be very important, but I think the focus should mainly be on providing high-quality training data and RLHF for automated AI safety research, which is somewhat narrower than what you describe.
I'm not sure how much we actually disagree though, would be interested in your thoughts!
Throughout, I use 'us' to refer broadly to EA/longtermist/existential security type folks.
So if we take as given that I am at 53% and Alice is at 45%, that gives me some reason to do longtermist outreach, and gives Alice some reason to try to stop me, perhaps by making moral trades with me that get us both more of what we value. In this case, cluelessness doesn't bite, as Alice and I are still taking action towards our longtermist ends.
However, I think what you are claiming, or at least the version of your position that makes most sense to me, is that both Alice and I would be making a failure of reasoning if we assign these specific credences, and that we should both be 'suspending judgement'. And if I grant that, then yes, it seems cluelessness bites, as neither Alice nor I know at all what to do now.
So it seems to come down to whether we should be precise Bayesians.
Re judgement calls, yes I think that makes sense, though I'm not sure it is such a useful category. I would think there is just some spectrum of arguments/pieces of evidence from 'very well empirically grounded and justified' through 'we have some moderate reason to think so' to 'we have roughly no idea', and I think what we are labelling judgement calls sits towards the far right of this spectrum. But surely there isn't a clear cut-off point.
I think there are a lot of thorny definitional issues here that stop this set of issues from boiling down neatly to a 1D spectrum. But overall, extinction prevention will likely have a far broader coalition supporting it, while making the future large and amazing is far less popular, since most people aren't very ambitious about spreading flourishing through the universe (though I tentatively am).
I also think average utilitarianism doesn't seem very plausible. I was just using it as an example of a non-linear theory (though, as Will notes, if any individual is linear in resources then so is the world as a whole, just with a smaller derivative).
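To spell out the point I take Will to be making, here is a minimal sketch, assuming average utilitarianism over a fixed population of $N$ individuals (the notation is my own):

$$V = \frac{1}{N}\sum_{i=1}^{N} u_i(r_i), \qquad u_j(r_j) = a + b\,r_j \;\Rightarrow\; \frac{\partial V}{\partial r_j} = \frac{b}{N}.$$

So if individual $j$'s utility is linear in their resources with slope $b$, world value is still linear in the resources given to $j$, just with the smaller slope $b/N$.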
Interesting, is this the sort of thing you have in mind? It at least seems similar to me, and I remember thinking that post got at something important.
A bull case for convergence:
But I agree we shouldn't bank on convergence!
Yeah, I think I agree with all this. I suppose that since 'we' have the AI policy/strategy training data anyway, that seems relatively low-effort and high-value to do; but yes, if we could somehow get access to the private notes of a bunch of international negotiators, that also seems very valuable! Perhaps actually asking top forecasters to record their working and meetings to use as training data later would be valuable, though I assume many people already do this by default (tagging @NunoSempere). Of course, having better forecasting AIs seems more dual-use than some of the other AI tools.