Thanks for this!
My thinking has also moved somewhat in this direction since writing this. I'm working on a post which tells a story more or less following what you lay out above. It's in doc form here: https://docs.google.com/document/d/1msp5JXVHP9rge9C30TL87sau63c7rXqeKMI5OAkzpIA/edit#
I agree this danger level for capabilities could be an interesting addition to the model.
I do feel like the model remains useful in my thinking, so I might try a re-write plus some extensions at some point (but probably not very soon).
Hey Mathilde! Thanks for your thoughtful comment. Curious to hear the mechanism behind eating too many healthy things leading to your issues.
Also, interesting about the supplements, hadn't heard that before. I am a bit ignorant on these things but try to offset that by buying the more expensive versions of supplements when trying them for the first time.
Heartening to hear that you figured it out after a few years!
In response to an earlier version of this question (since taken down) weeatquince responded with the following helpful comment:
Regulatory type interventions (pre-deployment):
Defence in depth type interventions (post-deployment):
I see a lot of talk with digital people about making copies but wouldn't a dominant strategy (presuming more compute = more intelligence/ability to multitask) be to just add compute to any given actor? In general, why copy people when you can just make one actor, who you know to be relatively aligned, much more powerful? Seems likely, though not totally clear, that having one mind with 1000 compute units would be strictly better for seeking power than 100 minds with 10 compute units each.
For example, companies might compete with one another to have the smartest/most able CEO by giving them more compute. The marginal benefit of more intelligence might be really high such that Tim Cook being 1% more intelligent than Mark Zuckerberg could mean Apple becomes dominant. This would trigger an intense race for compute. The same should go for governments. At some point we should have a multipolar superintelligence scenario but with human minds.
That seems true for many cases (including some I described) but you could also have a contingent of forward-looking digital people who are optimizing hard for future bliss (a potentially much more appealing prospect than expansion or procreation). Seems unclear that they would necessarily be interested in this being widespread.
Could also be that digital people find that more compute = more bliss without any bounds. Then there is plenty of interest in the rat race with the end goal of monopolizing compute. I guess this could matter more if there were just one or a few relatively powerful digital people. Then you could have similar problems as you would with AGI alignment. E.g. infrastructure profusion in order to better reward hack. (very low confidence in these arguments)
One thing that seems interesting to consider for digital people is the possibility of reward hacking. While humans certainly have quite a complex reward function, once we have a full understanding of the human mind (a very good understanding could be a prerequisite for digital people anyway), we should be able to figure out how to game it.
A key idea here is that humans have built-in limiters on their pleasure. For example, if we eat good food, that feeling of pleasure must subside quickly or else we'll just sit around satisfied until we die of hunger. Digital people need not have limits on pleasure. They would have none of the obstacles that we have to experiencing constant pleasure (e.g. we only have so much serotonin, money, and stamina, so we can't just be on heroin all the time). Drugs and video games are our rudimentary and imperfect attempts to reward hack ourselves. Clearly we already have this desire. By becoming digital we could actually do it all the time and there would be no downsides.
This would bring up some interesting dynamics. Would the first ems have the problem of quickly becoming useless to humans as they turn to wireheading instead of interacting with humans? Would pleasure-seekers just be kind of socially evolved away from and some reality fundamentalists would get to drive the future while many ems sit in bliss? Would reward-hacked digital humans care about one another? Would they want to expand? If digital people optimize for personal 'good feelings' probably that won't need to coincide with interacting with the real world except so as to maintain the compute substrate, right?
I did my master's thesis evaluating Kremer's paper from the '90s, which makes the case for the more people->more growth->more people feedback loop. My thesis essentially supports Ben's post from a while ago (https://forum.effectivealtruism.org/posts/CWFn9qAKsRibpCGq8/does-economic-history-point-toward-a-singularity) [fyi I did work with Ben on this project] in arguing that, with radiocarbon data (which I hold is much better than the guesstimate data Kremer uses), the more people->more growth relationship doesn't seem to hold. In terms of population, it seems growth was much less steady than previously assumed. There are basically a few jumps, lots of stagnation (e.g. China's population seems to have stagnated for thousands of years after the Neolithic revolution), and no clear overall pattern in long-term growth until the past few hundred years.
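To make the feedback loop concrete, here is a minimal sketch of what the "more people->more growth" relationship would look like if it did hold. It assumes the simplest possible form (the growth rate is proportional to the population level), and everything in it is hypothetical: the function names, the starting population, and the constant `g` are made up for illustration, the series is synthetic, and none of it comes from the thesis or its replication files.

```python
def simulate(p0, g, steps):
    """Synthetic population series where each step adds g * P^2,
    i.e. the per-step growth rate is g * P (assumed functional form)."""
    pops = [p0]
    for _ in range(steps):
        p = pops[-1]
        pops.append(p + g * p * p)
    return pops


def growth_rates(pops):
    """Per-step proportional growth rate, (P_next - P) / P."""
    return [(b - a) / a for a, b in zip(pops, pops[1:])]


def ols_slope(xs, ys):
    """Slope of a one-variable least-squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Hypothetical parameters, chosen only so the series stays small.
pops = simulate(p0=1.0, g=0.01, steps=50)
rates = growth_rates(pops)

# Regress the growth rate on the population level at the start of each step.
slope = ols_slope(pops[:-1], rates)
print(slope)
```

On synthetic data like this the fitted slope just recovers `g`, because the relationship is built in. The empirical question is whether real population series show a clearly positive slope at all, which is the part I'm arguing doesn't seem to hold.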
There are tons of caveats to my results listed in the thesis, and I haven't read your paper so I'm not sure how much it even matters, but I hope this contributes something! I'll add one more caveat: the paper is not super well-done (hence my previous hesitancy to post). I was sick for much of my thesis-writing period and also working part-time, so much of it was rushed through toward the end. If it seems useful I can dredge up my notes on what I think might be wrong with it and send you the data (I actually have decently clean replication files in R). If I remember correctly, the main results all hold; it's mostly just minor things that need fixing. I've been meaning to clean it up and post it properly but I'm not sure whether that's ever going to happen, hence my posting it now.
With all that in mind, here's the thesis! https://docs.google.com/document/d/1pVzrTikeoRRO3WvU5x01nOEyf_USPUg-FcrqTGwUVR8/edit#
Feel free to reach out if you'd like to have a chat about this!
Cool, thanks for the feedback everyone! I haven't done much thinking about root causes vs symptoms, but I agree that, especially with mental health, 'root cause' isn't really a useful term given the complexity.
I changed up that last recommendation a bunch to get rid of the symptom/root cause dichotomy:
"[revised] Try a bunch of other things. There are a lot of medications and pills you can take which have relatively low downsides and which can potentially be game-changers. This includes things like antidepressants, various supplements, nootropics, or other medication. Again, it's probably worth thinking of these as abnormally good lottery tickets. Expect most to fail but eventually something might really work. [see comments section for more on how to think about treating symptoms vs root causes]"
Oh thanks! I'll update that.
Oh yeah, I think you're right on that! I shouldn't have been so down on symptom-reducing treatment. It does seem clearly better to fix root causes, but given they can be so hard to fix, the best solution is often to treat symptoms (and in some cases, like mental health, that can help improve the root cause as well). I'll change that language so it's more positive on those.