Working on compute governance.
Happy to chat about anything, just reach out.
A few quick ideas:
1. On the methods side, I find the potential use of LLMs/AI as research participants in psychology studies interesting (not necessarily safety-related). This may sound ridiculous at first, but the resulting studies are genuinely interesting; a rough sketch of the setup follows this list.
From my post on studying AI-nuclear integration with methods from psychology:
[Using] LLMs as participants in a survey experiment, something that is seeing growing interest in the social sciences (see Manning, Zhu, & Horton, 2024; Argyle et al., 2023; Dillion et al., 2023; Grossmann et al., 2023).
2. You may be interested in, or get good ideas from, the Large Language Model Psychology research agenda (safety-focused). I haven't gone through it, so this is not an endorsement.
3. Then there are comparative analyses of human and LLM behavior. E.g., the Human vs. Machine paper (Lamparth, 2024) compares human and LLM decision-making in a wargame. I'm doing something similar with a nuclear decision-making simulation, but it's not in paper/preprint form yet.
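To make item 1 concrete, here is a minimal sketch of administering a single survey item to an LLM "participant". It assumes the OpenAI Python client and an API key in the environment; the persona, item wording, and model name are illustrative placeholders of mine, not taken from the cited papers.

```python
# Minimal sketch: one survey item, one LLM "participant".
# Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

# Hypothetical persona and item, for illustration only.
PERSONA = "You are a 34-year-old teacher. Answer surveys honestly and concisely."
ITEM = (
    "On a scale from 1 (strongly disagree) to 7 (strongly agree), how much do "
    "you agree with: 'New technologies generally make the world safer.' "
    "Respond with a single number."
)

def ask_participant(persona: str, item: str, model: str = "gpt-4o-mini") -> str:
    """Present one survey item to an LLM conditioned on a persona."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": persona},
            {"role": "user", "content": item},
        ],
        temperature=1.0,  # keep sampling noise, as responses in a human sample vary
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Repeat the item to approximate a (very) small sample of responses.
    answers = [ask_participant(PERSONA, ITEM) for _ in range(5)]
    print(answers)
```

In real studies you would of course vary personas across "respondents" and validate the answers against human data, as work in this area typically does.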
Thanks Clare! Your comment was super informative and thorough.
One thing I would lightly dispute is the claim that 360 feedback is easily gameable. Anecdotally, I have the impression that people with malevolent traits (“psychopaths” here) often have trouble remaining “undiscovered” and so have to constantly move or change social circles.
Of course, almost by definition I wouldn’t know any psychopaths who are still undiscovered. But 360 feedback could help catch the “discoverable” subgroup, for whom the test is not easily gameable.
Any thoughts?
Glad it was useful!
I don't feel qualified to give advice on teaching a language to small kids, although I do have a few thoughts. Please take them with a grain of salt, as I've never done this.
I'm assuming you mean your kids, not kids in a classroom? If this is the case:
That's all I could think of. That said, I think a quick Google/YouTube search might uncover much more valuable guidance on this!
Thank you for researching this; this is incredibly valuable.
I noticed that the OUS-Impartial Beneficence subscale correlates strongly with expansive altruism and effectiveness focus. Maybe I skipped over it, but do your results say whether this OUS subscale had higher predictive power than your two new factors?
Thank you for writing this. This is a really useful insight that I’ll keep in mind as I engage more with IIDM: I have definitely focused disproportionately on adding good processes rather than on eliminating bad ones. That may be because I’m not very familiar with common processes within institutions, as my studies have so far focused only on individual decision-making/rationality.
Below are a few quick thoughts on that.
Following your Putin-EU example, I wonder how much of Russia’s nimbleness comes from one man holding so much decision-making power, which might enable both quick decision-making and democratic backsliding.
Although you could argue that quicker experimentation might pay off in the long run, I would worry that modern states with too few checks and balances run a higher risk of solo actors making catastrophically bad decisions. At the same time, I worry that vast bureaucracies can fail to make important decisions at all, which could be equally catastrophic.
I agree, as you say, that the need for “caution and consensus vs. experimentation and accountability” depends on the institution and the decision to be made. I’m also not aware of attempts to describe when exactly you would want more of the former vs. the latter.
If you (or others) have good resources on eliminating bad processes/bureaucracy, I’d love to see them.
Mostly the meat-eater problem, but also cost-effectiveness analyses and higher neglectedness on priors.