Founder of CEEALAR (née the EA Hotel; ceealar.org)
Looking for more projects like these
CEEALAR (formerly the EA Hotel) is looking for funding to cover operations from Jan 2021 onward.
Sorry if this isn’t as polished as I’d hoped. Still a lot to read and think about, but posting as I won’t have time now to elaborate further before the weekend. Thanks for doing the AMA!
It seems like a crux that you have identified is how “sudden emergence” happens. How would a recursive self-improvement feedback loop start? Increasing optimisation capacity is a convergent instrumental goal. But how exactly is that goal reached? To give the most pertinent example - what would the nuts and bolts of it be for it happening in an ML system? It’s possible to imagine a sufficiently large pile of linear algebra enabling recursive chain reactions of improvement in both algorithmic efficiency and size (e.g. capturing all global compute -> nanotech -> converting Earth to Computronium), even more so since GPT-3. But what would the trigger be for setting it off?
Does the above summary of my take of this chime with yours? Do you (or anyone else reading) know of any attempts at articulating such a “nuts-and-bolts” explanation of “sudden emergence” of AGI in an ML system?
Or maybe there would be no trigger? Maybe a great many arbitrary goals would lead to sufficiently large ML systems brute-force stumbling upon recursive self-improvement as an instrumental goal (or mesa-optimisation)?
Responding to some quotes from the 80,000 Hours podcast:
“It’s not really that’s surprising, I don’t have this wild destructive preference about how they’re arranged. Let’s say the atoms in this room. The general principle here is that if you want to try and predict what some future technology will look like, maybe there is some predictive power you get from thinking about X percent of the ways of doing this involve property P. But it’s important to think about where there’s a process by which this technology or artifact will emerge. Is that the sort of process that will be differentially attracted to things which are let’s say benign? If so, then maybe that outweighs the fact that most possible designs are not benign.”
What mechanism would make AI be attracted to benign things? Surely only human direction? But to my mind the whole Bostrom/Yudkowsky argument is that it FOOMs out of human control (and e.g. converts everything into Computronium as a convergent instrumental goal).
“There’s some intuition of just the gap between something that’s going around and let’s say murdering people and using their atoms for engineering projects and something that’s doing whatever it is you want it to be doing seems relatively large.”
This reads like a bit of a strawman. My intuition for the problem of instrumental convergence is that in many take-off scenarios the AI will want to perform (a lot) more computation, and the way it will do this is by converting all available matter to Computronium (with human-existential collateral damage). From what I’ve read, you don’t directly touch on such scenarios. I would be interested to hear your thoughts on them.
“my impression is that you typically won’t get behaviours which are radically different or that seem like the system’s going for something completely different.”
Whilst you might not typically get radically different behaviours, in the cases where ML systems do fail, they tend to fail catastrophically (in ways that a human never would)! This also fits in with the notion of hidden proxy goals from “mesa optimisers” being a major concern (as well as accurate and sufficient specification of human goals).
Have you had any responses from Bostrom or Yudkowsky to your critiques?
I'm thinking that for me it would be something like 1/100 of a year! Maybe 1/10 tops. And for those such as the OP who think that "there's just no one inside to suffer" - would you risk making such a swap (with a high multiple) if it was somehow magically offered to you?
Pretty grim thought experiment - but I wonder: what amount of living as a chicken, or pig, on a factory farm would people trade for a year of extra healthy (human) life?
Assume that you would have the consciousness of the chicken or pig during the experience (memories of your previous life would be limited to what a chicken or pig could comprehend), and that you would have some kind of memory of the experience afterwards (although there would be none if chickens and pigs aren't sentient). Also assume that you wouldn't lose any time in your real life (say it was run as a very fast simulation, but you subjectively still experienced the time you specified).
Edit: there's another thought experiment along the same lines in MichaelStJules' comment here.
Maybe also that talk of preventing a depression is an information hazard while we are at the stage of the pandemic where all-out lockdown is the biggest priority for most of the richest countries. In a few weeks, when the epidemics in the US and Western Europe are under control and lockdown can be eased with massive testing, tracing and isolating of cases, it would make more sense to talk freely about boosting the economy again. In the meantime, we should be calling for governments to take up the slack with stimulus packages, which they seem to be doing already.
I don't know that this is still or ever really was part of the mission of the EA Hotel (now CEEALAR), but one of the things I really appreciated about it from my fortnight stay there was that it provided a space for EA-aligned folks to work on things without the pressure to produce legible results. This to me seems extremely valuable because I believe many types of impact are quantized such that no impact is legible until a lot of things fall into place and you get a "windfall" of impact all at once.
Yes, this was a significant consideration in my founding of the project. We also acknowledge it where we have collated outputs. And whilst we have had a good amount of support (see histogram here), I feel that many potential supporters have been holding back, waiting for the windfall (we have struggled with a short runway over the last year).
Makes sense from the point of view of killing germs, and temperatures being tolerable for us also being tolerable for germs. My intuition is that it's easier to get dirt (which contains germs) off hands with warmer water (similar to how it's easier to wash dishes with warmer water).
Alcohol-based hand sanitiser is also good. Often better than hand washing in practice as very few people actually wait for the water to get warm, or spend 20 seconds lathering the soap.
Props for putting in the work to keep this organization alive and well. It's a wonderful asset to the EA community. :)
I agree that the extra E is a bit jarring at first (someone else has pointed this out too). I worry that without it it's too similar to CEA though; and the "Enabling" also seems useful in helping to describe what we do.