This is a submission to the Future Fund's AI Worldview Prize. It was submitted through our submission form, and was not posted on the EA Forum, LessWrong, or the AI Alignment Forum. We are posting copies/linkposts of such submissions on the EA Forum.
Author: Sharan Babu
This article is a submission to the FTX Future Fund AI worldview prize.
Content structure: near-term predictions, followed by longer-horizon predictions.
In the next 5 years, there will be a huge surge in vertical AI models, each aimed at a narrow set of tasks. It is much easier to vet data quality and periodically refresh a model's knowledge for a particular domain than for the entire Internet. **
- Deep learning models with larger context windows/memory.
- Theorists will run software **. Today we speak of a large divide between academia and practical/production skills, but this divide will soon be gone. Take the case of ‘citizen data scientists’: people who are not from the domain of statistics or analytics but who train models for data analysis. With advances in natural-language interfaces, people with strong domain/subject-matter expertise will draw more output from computer systems.
- Bias in models will soon be largely eliminated. We will soon have techniques to categorically visualize, and to control or prune, the weights/concepts a model has learned.
- Scoped AGI systems: for example, a program that can be taught new tasks through explicit instructions and put to work on a use case like desktop automation (clicking and typing). **
- Replacement of prompt engineering with ‘task definition’.
- Language models that are not susceptible to prompt injection.
- Applications that leverage a Large Language Model (LLM) as a queryable snapshot of the Internet and a basic layer of intelligence. **
- Emergence of security companies for large models.** Example: charting directed graphs that display a large model's action-space probabilities for a given input, and how this probabilistic space changes as the model sees new data or its learned weights are changed.
- A basic intelligence layer could go rogue only if it is trained on poor, contrived data; is not supervised by humans in any manner after training; and is given the ability to perform a complementary set of tasks. Example: imagine an AI stock trader meant to take user inputs like “buy only dips” or “make bold buys” and act accordingly, but which is also taught how to control the OS (make file changes) and use the web. After consecutive bad decisions, there is a chance the model takes the following actions: restrict computer access to itself by issuing OS commands -> change its initial prompt to purely maximize profit -> continue to trade. This is now problematic; the user has to cut the power source, block the link between the trader and the bank account/wallet, or use equivalent means.
- Remember how, recently, some models on Hugging Face had malicious code embedded in their weight files, so that loading the model also ran some predetermined code. On a similar note, this is why there will be a rise in security companies able to simulate a model's action choices and determine whether a model is safe to use (model alignment).
1. Predictions for ‘AGI will be developed by January 1, 2043’
AGI definition: A computer system able to perform any task that a human can.
- We are 20 years away from 2043, which is a considerable time for development in computer science.
- Advances in quantum computing and photonic deep learning can make computation exponentially faster.
- Learning algorithms better than current ones like gradient descent will emerge. Shouldn’t a learning algorithm be agnostic to class imbalance? Fundamental problems like this will be solved by new learning algorithms.
- Network disentanglement (differentiated learning of concepts) and latent-space walks will increase in use and improve the state of AI by leaps and bounds.
- Deep learning networks learned convolutional kernels; similarly, they could learn activation functions dynamically too. This could enable partial activation of neurons and hence a relative reduction in the number of artificial neurons required.
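The learned-activation idea already has a minimal existing instance: PReLU, where the negative-side slope `alpha` is trained by gradient descent alongside the weights. A toy single-input sketch (class and method names are mine, for illustration):

```python
class LearnablePReLU:
    """Toy per-neuron activation f(x) = x for x > 0, alpha * x otherwise,
    with the slope alpha trained like any other weight. A minimal
    instance of a "learned" activation function."""

    def __init__(self, alpha=0.25):
        self.alpha = alpha

    def forward(self, x):
        return x if x > 0 else self.alpha * x

    def backward_alpha(self, x):
        # df/d(alpha) is 0 on the positive side, x on the negative side.
        return 0.0 if x > 0 else x

    def step(self, x, upstream_grad, lr=0.1):
        # One SGD update of alpha, given dL/df flowing back from the loss.
        self.alpha -= lr * upstream_grad * self.backward_alpha(x)
```

The post's prediction goes further than this fixed parametric form: the shape of the activation itself would be learned, which is what could let neurons activate only partially.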
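For the class-imbalance point above, today's stopgap is to reweight the loss rather than change the learning algorithm. A sketch of inverse-frequency weighting, using the common convention w_c = N / (K * n_c) (one convention among several, not a standard the post references):

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights w_c = N / (K * n_c): every class then
    contributes equal total weight to the loss, so gradient descent is
    no longer dominated by the majority class. This is a workaround,
    not the imbalance-agnostic learning algorithm the post envisions."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}
```

That such manual correction is needed at all is the "fundamental problem" the bullet argues a better learning algorithm would remove.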
With the above lines of thought in mind, I would suggest a subjective probability of 80% for AGI being developed by January 1, 2043.
2. Predictions for “P(misalignment x-risk|AGI)”: Conditional on AGI being developed by 2070, humanity will go extinct or drastically curtail its future potential due to loss of control of AGI
- Platforms where information is shared will become more responsible. Multi-factor authentication will appear at multiple points while using an application. For example, a social media app might ask the user to upload a live selfie each time a post is created. Rules and frameworks like this might decrease misinformation if implemented and followed diligently.
- Why is it assumed that consensus among AI agents would be so high? Imagine one AGI with access to an extensive network of compute and another with access to movable robot bodies, where the former commands the latter to do something. This is a case of comparing apples to oranges; it is therefore not a given that multiple AGI agents will comply with one another.
- Similar to how the FDA approves medicines, central entities like app stores will evolve and adopt new standards.**
- Once AI reaches the position of an absolutely perfect assistant, why would humans (or at least large groups of humans) still work on advancing it?
- If an AGI is willing to accept its initial knowledge set, then it would likely be willing to accept new ones as well. This means non-AGI intellectuals could potentially fool it, because the search space for validating a new data point might be too large.
- Unique protocols in the future: if a large number of people agree that a server has to be shut down, then it will be. If such protocols and legislation arrive in time, the risk would be minimized to a large extent.
Taking all these points into consideration, I put my subjective probability for P(misalignment x-risk|AGI) at 0.1–1%.
** — Companies that enable this would be great investments for the Future Fund.