Justin Olive



Why I disagree that this video is insightful/entertaining: The YouTuber quite clearly has very little knowledge of the subject they are discussing - it's actually quite reasonable for the Zoom CEO to simply say that fixing hallucinations will "occur down the stack", given that they are not the ones developing AI models, and would instead be building the infrastructure and environments that the AI systems operate within.

From what I watched of the video, she also completely misses the real reason that the CEO's claims are ridiculous: if you have an AI system with a level of capability that allows it to replicate a person's actions in the workplace, then why would we go to the extra effort of having Zoom calls between these AI clones?

I.e., it would be much more efficient to build information systems that align with the strengths & comparative advantages of the AI systems - presumably this would not involve having "realistic clones of real human workers" talking to each other, but rather a network of AI systems that communicate using protocols and data formats that are designed to be as robust and efficient as possible.

FWIW if I were the CEO of Zoom, I'd be pushing hard on the "Human-in-the-loop" idea. E.g. building in features that allow you to send out AI agents to fetch information and complete tasks in real time as you're having meetings with your colleagues. That would actually be a useful product that helps keep Zoom interesting and relevant.

With regard to AI progress stalling, I think it depends on what you mean by "stalling", but I think this is basically impossible if you mean "literally will not meaningfully improve in a way that is economically useful".

When I first learned how modern AI systems worked, I was astonished at how absurdly simple and inefficient they are. In the last ~2 years there has been a move towards things like MoE architectures & RNN hybrids, but this is really only scratching the surface of what is possible with more complex architectures. We should expect a steady stream of algorithmic improvements that will push down inference costs and make more real-world applications viable. There's also Moore's Law, but everyone already talks about that quite a lot.

Also, if you buy the idea that "AI systems will learn tasks that they're explicitly trained for", then incremental progress is almost guaranteed. I think it's hilarious that everyone in industry and Government is very excited about general-purpose AI and its capacity for automation, but there is basically no large-scale effort to create high-quality training data to expedite this process.

The fact that pre-training + chatbot RLHF is adequate to build a system with any economic value is dumb luck. I would predict that if we actually dedicated a not-insignificant chunk of society's efforts towards training DL systems to perform important tasks, we would make quite a lot of progress very quickly. Perhaps a central actor like the CCP will do this at some stage, but until then we should expect incremental progress as small-scale efforts gradually build up datasets and training environments.

An obvious answer that I rarely see is to seek roles in public service (civil service).

Basically anyone who is wasting away in private industry can 2x their positive impact by going into public service and fighting for efficiency and effectiveness. Although most government roles don't relate to a specific "EA cause area", the budgets that Governments deal with are mind-boggling; as a graduate working in public service, I developed the investment framework for an education infrastructure program with $200 million in recurring funding. That is more money than any grantmaker will touch in the course of their career. $10 million in Government is a rounding error.

EA is hard, and so most want to take the easy path by applying for cool/fun jobs that have direct impact and then calling it a day. But the reality is that there are already many public institutions doing work that benefits the broader good, and that those institutions are filled with bureaucrats who couldn't give a stuff about impact. You can make the world a better place purely through the fact that you give a shit.

However, public service is a grind; it requires fortitude, pragmatism and social skills, so it isn't for everyone (and certainly not for me).

Thanks for extracting that quote about PFAS, this is really the main point for me.

In the contamination remediation industry (which I have some familiarity with via my partner), PFAS seems to be considered the bogeyman of contaminants (for enviro and health reasons).

I can imagine an alternative headline that highlights how AMF et al. have been handing out bednets containing PFAS. Doesn't seem like it would go down well either.

Perhaps we just need to accept that this is an R&D problem that needs to be solved ASAP, and respond accordingly.

Excellent post, well done putting this together.

In particular, scenario #1 (military / authoritarian accident) has been my main concern in AI x-risk or s-risk. When I first encountered the field of AI safety, I was quite confused by how much attention was focused on big tech relative to global actors with unambiguous intentions to create disturbing types of AI systems that could easily evolve into something catastrophically bad.

I found scenario 2 quite interesting also, although I find the particular sequence of events less plausible in a world where many organisations have access to powerful AI. I don't think this reduces the risks involved in giving advanced AI the ability to develop and execute business strategies, I just think it's much harder to predict how things might go wrong given the complexity involved in that type of world.

I'll also just add that I found scenario #3 to be very interesting, and although I personally consider it to be somewhat far-fetched, I commend your creativity (and courage). I'd be very surprised if an EA could pull that sort of thing off, but perhaps I'm missing information about how embedded EA culture is within Silicon Valley.

Overall I also like the analysis of why x-risks from "paperclipper" type systems are probably unlikely. My personal take on this is that LLMs might be particularly useful in this regard. I think the idea of creating AGI from RL is what underpinned a lot of the pre-ChatGPT x-risk discussions. Now the conversation has somewhat shifted toward a lack of interpretability in LLMs and why this is a bad thing, but I still believe a shift toward LLM-based AGI might be a good thing.

My rationale is that LLMs seem to inherently understand human values in a way that I think would be quite difficult to match with a pure RL agent. This understanding of human values is obviously imperfect, and will require further improvements, but at least it provides an obvious way to avoid a paperclipper scenario.

For example, you can literally tell GPT to act ethically and positively, and you can be fairly certain that it will do things that pretty much always align with that request. Of course, if you try to make it do bad stuff, it will, but that certainly doesn't seem to be the default.

This seems to be in contrast to the more Yudkowskian approach, which assumes that advanced AI will be catastrophically misaligned by default. LLMs seem to provide a way to avoid that; extrapolating forward from OpenAI's efforts so far, my impression is that if we asked Auto-ChatGPT-9 to not kill everyone, it would actually do a pretty good job. To be fair, I'm not sure you could say the same thing about a future version of AlphaGo that has been trained to manage a large corporation, which was, until recently, what many imagined advanced AI would look like.

I hope you found some of that brain-dump insightful. Would be keen to hear what your thoughts are on some of those points.

Thanks very much for the recommendation, I'll do that now.

Hi Geoffrey

Thanks for the kind words.

I did have a bit of a think about what the implications are for finding feasible AI governance solutions, and here's my personal take:

If it is true that 'inhibitive' governance measures (perhaps like those that are in effect at Google) cause ML engineers to move to more dangerous research zones, I believe it might be prudent to explore models of AI governance that 'accelerate' progress towards alignment, rather than slow down the progression towards misalignment.

My general argument would be as follows:

If we assume that it will be infeasible to buy out or convince most of the ML engineers on the planet to intrinsically value alignment, then it means that global actors with poor intentions (e.g. imperialist autocracies) will benefit from a system where well-intentioned actors have created a comparatively frustrating & unproductive environment for ML engineers. I.e. not only will they have a more efficient R&D pipeline due to lower restrictions, they may also have better capacity to hire & retain talent over the long term.

One possible implication from this assertion is that the best course of action is to initiate an AI-alignment Manhattan project that focuses on working towards a state of 'stabilisation' in the geopolitical/technology realm. The intention of this is to change the structure of the AI ecosystem so that it favours 'aligned' AI by promoting progress in that area, rather than accidentally proliferating 'misaligned' AI by stifling progress in 'pro-alignment' zones.

I find this conclusion fairly disturbing and I hope there's some research out there that can disprove it.

Reductionist utilitarian models are like play-dough. They're fun and easy to work with, but useless for doing anything complicated and/or useful.

Perhaps in 100-200 years our understanding of neurobiology or psychometrics will be good enough for utilitarian modelling to become relevant to real life, but until then I don't see any point getting on the train.

The fact that intelligent, well-meaning individuals are wasting their time thinking about the St Petersburg paradox is ironically un-utilitarian; that time could be used to accomplish tasks which actually generate wellbeing.