Money can't continue scaling like this.
Or can it? https://www.wsj.com/tech/ai/sam-altman-seeks-trillions-of-dollars-to-reshape-business-of-chips-and-ai-89ab3db0
This seems to underrate the arguments for Malthusian competition in the long run.
If we develop the technical capability to align AI systems with any conceivable goal, we'll start by aligning them with our own preferences. Some people are saints, and they'll make omnibenevolent AIs. Other people might have more sinister plans for their AIs. The world will remain full of human values, with all the good and bad that entails.
But current human values do not maximize our reproductive fitness. Maybe one human will start a cult devoted to sending s...
You may have seen this already, but Tony Barrett is hiring an AI Standards Development Researcher. https://existence.org/jobs/AI-standards-dev
I agree they definitely should’ve included unfiltered LLMs, but it’s not clear that this significantly altered the results. From the paper:
“In response to initial observations of red cells’ difficulties in obtaining useful assistance from LLMs, a study excursion was undertaken. This involved integrating a black cell—comprising individuals proficient in jailbreaking techniques—into the red-teaming exercise. Interestingly, this group achieved the highest OPLAN score of all 15 cells. However, it is important to note that the black cell started and concluded ...
Hey, I've found this list really helpful, and the course that comes with it is great too. I'd suggest watching the course lecture video for a particular topic, then reading a few of the papers. Adversarial robustness and Trojans are the ones I found most interesting. https://course.mlsafety.org/readings/
What is Holden Karnofsky working on these days? He was writing publicly on AI for many months in a way that seemed to suggest he might start a new evals organization or a public advocacy campaign. He took a leave of absence to explore these kinds of projects, then returned as OpenPhil's Director of AI Strategy. What are his current priorities? How closely does he work with the teams that are hiring?
We appreciate the feedback!
China has made several efforts to preserve their chip access, including smuggling, buying chips that are just under the legal limit of performance, and investing in their domestic chip industry.
I fully agree that this was an ambiguous use of “China.” We should have been more specific about which actors are taking which actions. I’ve updated the text to the following:
...NVIDIA designed a new chip with performance just beneath the thresholds set by the export controls in order to legally sell the chip in China. Other chips have been s
When people distinguish between alignment and capabilities, I think they’re often interested in the question of what research is good vs. bad for humanity. Alignment vs. capabilities seems insufficient to answer that more important question. Here’s my attempt at a better distinction:
There are many different risks from AI. Research can reduce some risks while exacerbating others. "Safety" and "capabilities" are therefore incorrectly reductive. Research should be assessed by its distinct impacts on many different risks and benefits. If a research direc...
Not as much as we'll know when his book comes out next month! For now, his cofounder Reid Hoffman has said some reasonable things about legal liability and rogue AI agents, though he's not expressing concern about x-risks:
...We shouldn’t necessarily allow autonomous bots functioning because that would be something that currently has uncertain safety factors. I’m not going to the existential risk thing, just cyber hacking and other kinds of things. Yes, it’s totally technically doable, but we should venture into that space with some care.
For example, sel
Here’s a fault tree analysis: https://arxiv.org/abs/2306.06924
Review of risk assessment techniques that could be used: https://arxiv.org/abs/2307.08823
Applying ideas from systems safety to AI: https://arxiv.org/abs/2206.05862
Applying ideas from systems safety to AI (part 2): https://arxiv.org/abs/2302.02972
Applying AI to ideas from systems safety (lol): https://arxiv.org/abs/2304.01246
Hey, great opportunity! It looks like a lot of these opportunities are in-person. Do you know if there are a substantial number of remote opportunities?
I’d be curious about what happens after 10. How long do biological humans survive? How long can they be said to be “in control” of AI systems such that some group of humans could change the direction of civilization if they wanted to? How likely is deliberate misuse of AI to cause an existential catastrophe, relative to slowly losing control of society? What are the positive visions of the future, and which are the most negative?
Yep, I think those kinds of interventions make a lot of sense. The natural selection paper discusses several of those kinds of interventions in sections 4.2 and 4.3. The Turing Trap also makes an interesting observation about US tax law: automating a worker with AI would typically reduce a company's tax burden. Bill de Blasio, Mark Cuban, and Bill Gates have all spoken in favor of a robot tax to fix that imbalance.
That’s a good point! Joe Carlsmith makes a similar step by step argument, but includes a specific step about whether the existence of rogue AI would lead to catastrophic harm. Would have been nice to include in Bengio’s.
Carlsmith: https://arxiv.org/abs/2206.13353
Ah okay, if it doesn't delay your graduation then I'd probably lean more towards CS. Self study can be great, but I've found classes really valuable too in getting more rigorous. Of course there's a million factors I'm not aware of -- best of luck in whichever you choose!
Hey, tough choice! Personally I’d lean towards PPE. Primarily that’s driven by the high opportunity cost of another year in school. Which major you choose seems less important than finding something you love and doing good work in it a year sooner.
Two other factors: First, you can learn AI outside of the classroom fairly well, especially since you can already program. I’m an economics major who’s taken a few classes in CS and done a lot of self-study, and that’s been enough to work on some AI research projects. Second, policy is plausibly more important fo...
Very interesting article. Some forecasts of AI timelines (like BioAnchors) are premised on compute efficiency continuing to progress as it has for the last several decades. Perhaps these arguments are less forceful against 5-10 year timelines to AGI, but they're still worth exploring.
I'm skeptical of some of the headwinds you've identified. Let me go through my understanding of the various drivers of performance, and I'd be curious to hear how you think of each of these.
Parallelization has driven much of the recent progress in effective compute...
I built a preliminary model here: https://colab.research.google.com/drive/108YuOmrf18nQTOQksV30vch6HNPivvX3?authuser=2
It’s definitely too simple to treat as strong evidence, but it shows some interesting dynamics. For example, levels of alignment rise at first, then rapidly fall when AI deception skills exceed human oversight capacity. I sent it to Tyler and he agreed: cool, but not actual evidence.
If anyone wants to work on improving this, feel free to reach out!
Some argue that the computational demands of deep learning coupled with the end of Moore's Law will limit AI progress. The most convincing counterargument in my opinion is that algorithms could become much more efficient in using compute. Historically, every 9 months algorithmic improvements have halved the amount of compute necessary to achieve a given level of performance in image classification. AI is currently being used to improve the rate of AI progress (including to improve hardware), meaning full automation could further speed up AI progress. ...
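To make the 9-month halving figure concrete, here's a rough sketch of what it would imply if the trend continued smoothly. That's a big assumption: the figure comes from image classification, and nothing guarantees it holds elsewhere or going forward. The function names are just mine for illustration.

```python
def compute_fraction(months_elapsed: float, halving_months: float = 9.0) -> float:
    """Fraction of the original compute needed to reach the same performance.

    Assumes a smooth exponential trend: the compute required halves every
    `halving_months` months (9 is the image-classification figure cited above).
    """
    return 0.5 ** (months_elapsed / halving_months)


def efficiency_gain(months_elapsed: float, halving_months: float = 9.0) -> float:
    """How many times less compute is needed after `months_elapsed` months."""
    return 1.0 / compute_fraction(months_elapsed, halving_months)


if __name__ == "__main__":
    # Extrapolations only; these are illustrative, not forecasts.
    for years in (1, 3, 5, 10):
        gain = efficiency_gain(12 * years)
        print(f"{years:>2} years: roughly {gain:,.1f}x less compute for the same performance")
```

The point is just that a steady halving compounds quickly, so even if hardware progress stalled, algorithmic efficiency alone could keep effective compute growing for a while.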
Yep, this is a totally reasonable question. People have worked on it before: https://www.brookings.edu/research/aligned-with-whom-direct-and-social-goals-for-ai-systems/
Many people concerned with existential threats from AI believe that the hardest technical challenge is aligning an AI to do any specific thing at all. They argue that we will have little control over the goals and behavior of superhuman systems, and that solving the problem of aligning AI with any one human will eliminate much of the existential risk associated with AI. See here and here for ex...
I completely agree with you and think that’s what will happen. Eliezer might disagree but many others would agree with you.
I think your perception is spot on. The labs that are advancing towards AGI the fastest also profess to care about safety and do research on safety. Within the alignment field, many people believe that many other people's research agendas are useless. There are varying levels of consensus about different questions -- many people are opposed to racing towards AGI, and research directions like interpretability and eliciting latent knowledge are rarely criticized -- but in many cases, making progress on AI safety requires having inside view opinions about what's important and useful.
Hi Arturo. Thank you for the thoughtful and detailed assessment of the AI risk literature. Here are a few other sources you might be interested in reading:
Yep that's a fair argument, and I don't have a knockdown case that predicting human generated data will result in great abilities.
One bit of evidence is that people used to be really pessimistic that scaling up imitation would do anything interesting. This paper was a popular knockdown arguing language models could never understand the physical world, but most of the substantive predictions of that line of thinking have been wrong, and those people have largely retreated to semantics debates about the meaning of "understanding". Scaling has gone furth...
Wish I knew! Corporations and countries are shaped by the same survival of the fittest dynamic, and they’ve turned out less than perfect but mostly fine. AI could be far more intelligent though, and it seems unlikely that our current oversight mechanisms would naturally handle that case. Technical alignment research seems like the better path.
Google’s challenge is that language models will eat up the profit margins of search. They currently make a couple of pennies per search, and that’s what it would cost to integrate ChatGPT into search.
Microsoft seems happy to use Bing as a loss leader to break Google’s monopoly on search. Over time, the cost of running language models will fall dramatically, making the business model viable again.
Google isn’t far behind the cutting edge of language models — their PaLM is 3x bigger than GPT-3 and beats it in many academic benchmarks. But they don’t want to p...
I think it’s because predicting exactly what someone will say is more difficult than just sounding something like them. Eliezer Yudkowsky wrote about it here: https://www.lesswrong.com/posts/nH4c3Q9t9F3nJ7y8W/gpts-are-predictors-not-imitators
Tamera Lanham is excited about this and is doing research on it: https://www.lesswrong.com/posts/FRRb6Gqem8k69ocbi/externalized-reasoning-oversight-a-research-direction-for
I’d say alignment research is not going very well! There have been successes in areas that help products get to market (e.g. RLHF) and on problems of academic interest that leave key problems unsolved (e.g. adversarial robustness), but there are several “core problems” that have not seen much progress over the years.
Good overview of this topic: https://www.forourposterity.com/nobodys-on-the-ball-on-agi-alignment/
I don’t think the orthogonality thesis would have predicted GPT models, which become intelligent by mimicking human language, and learn about human values as a byproduct. The orthogonality thesis says that, in principle, any level of intelligence can be combined with any goal, but in practice the most intelligent systems we have are trained by mimicking human concepts.
On the other hand, after you train a language model, you can ask it or fine-tune it to pursue any goal you like. It will use human concepts that it learned from pretraining on natural language, but you can give it a new goal.
That argument sounds right to me. A recent paper made a similar case: https://arxiv.org/abs/2303.16200
Those all seem like important risks to me, but I’d estimate the highest x-risk from agentic systems that learn to seek power or wirehead, especially after a transition to very rapid economic or scientific progress. If AI progresses slowly or is only a tool used by human operators, x-risk seems much lower to me.
Good recent post on various failure modes: https://www.lesswrong.com/posts/mSF4KTxAGRG3EHmhb/ai-x-risk-approximately-ordered-by-embarrassment
This post by Katja Grace makes a lot of interesting arguments against AI x-risk: https://www.lesswrong.com/posts/LDRQ5Zfqwi8GjzPYG/counterarguments-to-the-basic-ai-x-risk-case
Forecasting new technologies is always difficult, so personally I have a lot of uncertainty about the future of AI, but I think this post is a good overview of some considerations: https://www.cold-takes.com/where-ai-forecasting-stands-today/
Global poverty needs plenty of tech talent; GiveDirectly and IDinsight might be good organizations to check out.
Hi! If you're still interested, I think one of the most important questions in AI is how quickly it will transform the world. Effects on GDP are one way of measuring that impact, and growth theory is full of tools for forecasting growth. If this is interesting to you, here's one path you could work on:
Agreed, this is ridiculous. You should take down the contest.
Your chances of a successful attack are very low. It takes years for information to be scraped from the internet, trained into a model, and deployed to production. GPT-4 has a knowledge cutoff of September 2021. If future models have the same delay, you won't see results for a year and a half.
The more likely outcome is press coverage about how AI safety folks are willing to hold society hostage in order to enforce their point of view. See this takedown piece on Eliezer Yudkowsky, and ...
Yep that's a good point. Here's one source on it: funding amounts definitely increased throughout the 2010s. An alternative explanation could be that valuations have increased more than funding amounts. There's some data to support this, but you'd need a more careful comparison of startups within the same reference class to be sure.
Startups would be another good reference class. VCs are incentivized to scale as fast as possible so they can cash out and reinvest their money, but they rarely give a new organization as much money as Redwood received.
Startups usually receive a seed round of ~$2M cash to cover the first year or two of business, followed by ~$10M for Series A to cover another year or two. Even Stripe, a VC wunderkind that’s raised billions privately while scaling to thousands of employees around the world, began with $2M for their first year, $38M for the next three years ...
Thanks, this is helpful. One thing to flag is that I wouldn't find the 2012-2014 numbers very convincing; my impression is that VC funding increased a lot until 2022, and 2021 was a year where capital was particularly cheap, for reasons that in hindsight were not entirely dissimilar to why longtermist EA was (relatively) well-funded in the last two years.
Thanks, fixed!