Thanks for the thoughtful responses!
a guy who attempted a home-built nuclear reactor
Ha!
a licensing system around AGIs,
Well, I have in mind something more like banning the pursuit of a certain class of research goals.
the fact that evidence is fairly easy to come by and that intent with those is fairly easy to prove.
Hm. This provokes a further question:
Are there successful regulations that can apply to activity that is both purely mental (I mean, including speech, but not including anything more kinetic), and also is not an intention to commit a ...
I agree that "defer less" is good advice for EAs, but that's because EAs are especially deferential, and also especially care about getting things right and might actually do something sane about it. I think part of doing something sane about it is to have a detailed model of deference.
Epistemic deference is just obviously parasitic, a sort of dreadful tragedy of the commons of the mind.
I don't think this is right. One has to defer quite a lot; what we do is, appropriately, mostly deferring, in one way or another. The world is so complicated, there's so much information to process, and our problems are high-context (that is, require compressing and abstracting from a lot of information). Also, coordination is important.
I think a blunt-force "just defer less" is therefore not viable. Instead, having a more detailed understanding of w...
The Logical Induction paper involved multiple iterations of writing and reviewing, during which we refined the notation, terminology, proof techniques, theorem statements, etc. We also had a number of others comment on various drafts, pointing out wrong or unclear parts.
One horse-sized duck AI. For one thing, the duck is the ultimate (route) optimization process: you can ride it on land, sea, or air. For another, capabilities scale very nonlinearly in size; the neigh of even 1000 duck-sized horse AIs does not compare to the quack of a single horse-sized duck AI. Most importantly, if you can safely do something with 100 opposite-sized AIs, you can safely do the same thing with one opposite-sized AI.
In all seriousness though, we don't generally think in terms of "proving the friendliness" of an AI system. When do...
In my current view, MIRI’s main contributions are (1) producing research on highly-capable aligned AI that won’t be produced by default by academia or industry; (2) helping steer academia and industry towards working on aligned AI; and (3) producing strategic knowledge of how to reduce existential risk from highly-capable AI. I think (1) and (3) are MIRI’s current strong suits. This is not easy to verify without technical background and domain knowledge, but at least for my own thinking I’m impressed enough with these points to find MIRI very worthwhile to...
Scott Garrabrant’s logical induction framework feels to me like a large step forward. It provides a model of “good reasoning” about logical facts using bounded computational resources, and that model is already producing preliminary insights into decision theory. In particular, we can now write down models of agents that use logical inductors to model the world, and in some cases these agents learn to have sane beliefs about their own actions, other agents’ actions, and how those actions affect the world. This, despite the usual obstacles to self-modeling...
Ah, yeah, that sounds close to what I'm imagining, thank you.