All of aogara's Comments + Replies

Some unfun lessons I learned as a junior grantmaker

“People are often grateful to you for granting them money. This is a mistake.”

How would you recommend people react when they receive a grant? Simply saying thank you seems polite and standard etiquette, but I agree that it misportrays the motives of the grantmaker and invites concerns of patronage and favoritism.

[half-baked idea] It seems reasonable to thank someone for the time they spent evaluating a grant, especially if you also do so when the grant is rejected (though this may be harder). I think it is reasonable to thank people for doing their job even (maybe especially?) when you are not the primary beneficiary of that job and thanks is not their reason for doing it.
We Ran an AI Timelines Retreat

Really cool! I was hoping to attend but had to be home for a family event. Would be super interested to see any participants summarize their thoughts on AI timelines, or a poll of the group's opinions. 

DeepMind’s generalist AI, Gato: A non-technical explainer

Sounds like Decision Transformers (DTs) could quickly become powerful decision-making agents. Some questions about them for anybody who's interested: 

DT Progress and Predictions

Outside Gato, where have decision transformers been deployed? Gwern shows several good reasons to expect that performance could quickly scale up (self-training, meta-learning, mixture of experts, etc.). Do you expect the advantages of DTs to improve state of the art performance on key RL benchmark tasks, or are the long-term implications of DTs more difficult to measure? Focusi... (read more)

LW4EA: Some cruxes on impactful alternatives to AI policy work

Cool arguments on the impact of policy work for AI safety. I find myself agreeing with Richard Ngo’s support of AI policy given the scale of government influence and the uncertain nature of AI risk. Here are a few quotes from the piece.

How AI could be influenced by policy experts:

in a few decades (assuming long timelines and slow takeoff) AIs that are less generally intelligent than humans will be causing political and economic shockwaves, whether that's via mass unemployment, enabling large-scale security breaches, designing more destructive weapons, psycho

... (read more)
DeepMind’s generalist AI, Gato: A non-technical explainer

This is a terrific distillation, thanks for sharing! I really like the final three sections with implications for short-term, long-term, and policy risks. 

For example, in 2019 the U.S. Food and Drug Administration issued a proposed regulatory framework for AI/ML-based software used in health care settings. Less than a week ago, the U.S. Justice Department and the Equal Employment Opportunity Commission released guidance and technical assistance documents around avoiding disability discrimination when using AI for hiring decisions.

These are some great ... (read more)

Sort forum posts by: Occlumency (Old & Upvoted)

I agree, upvotes do seem a bit inflated. This creates an imbalance between new and old users that continually grows as existing users rack up more upvotes over time. That can be good for preserving culture and norms, but some recalibration could help make the site more welcoming to new users.

In general, I think it would be nice if each upvote counted for roughly 1 karma. Will MacAskill’s most recent post received over 500 karma from only 250 voters, which might exaggerate the reach of the... (read more)

The users with the highest karma come from a range of different years, and the two highest joined in 2017 and 2019. I don't think it's too much of a problem.
A hypothesis for why some people mistake EA for a cult

Hey Aman, thanks for the post. It does seem a bit outdated that the top picture for altruism is a French painting from hundreds of years ago. EA should hope to change the cultural understanding of doing good from something that's primarily religious or spiritual to something that can be much more scientific and well-informed. 

I do think some of the accusations of EA being a cult might go a bit deeper. There aren't many other college clubs that would ask you to donate 10% of your income or determine your career plans based on their principles. O... (read more)

Thanks for the comment! I agree with your points--there are definitely elements of EA, whether they're core to EA or just cultural norms within the community, that bear stronger resemblances to cult characteristics. My main point in this post was to explore why someone who hasn't interacted with EA before (and might not be aware of most of the things you mentioned) might still get a cult impression. I didn't mean to claim that the Google search results for "altruism" are the most common reason why people come away with a cult impression. Rather, I think that they might explain a few perplexing cases of cult impressions that occur before people become more familiar with EA. I should have made this distinction clearer, thanks for pointing it out :)
EA will likely get more attention soon

This is a great point. As one example of growing mainstream coverage, here’s a POLITICO Magazine piece on Carrick Flynn’s Congressional campaign. It gives a detailed explanation of effective altruism and longtermism, and seems like a great introduction to the movement for somebody new. The author sounds like he might have collaborated with CEA, but if not, maybe someone should reach out?

What are your recommendations for technical AI alignment podcasts?

AXRP by Daniel Filan from CHAI is great, and The Gradient is another good one with both AI safety and general interest AI topics.

The best $5,800 I’ve ever donated (to pandemic prevention).

Coverage of this post from The Hill on April 24th:  

Many of Flynn’s donors are involved in an online forum called Effective Altruism, a group that analyzes how best to spend money on philanthropic efforts. Their conclusion, according to some of the posts backing Flynn, has been that spending a few million on a congressional race could result in billions in spending on pandemic preparedness by the federal government.

Flynn is “the first person to ever run for US congress on a platform of preventing future pandemics,” wrote one user, Andrew Snyder-Beatti

... (read more)
Why Helping the Flynn Campaign is especially useful right now

Donated because of this post. Thanks for sharing and good luck to Carrick.

Axby's Shortform

Hey, this is a great question with good context for potential answers too. If you don’t get any substantive responses here, I’d suggest posting as a question on the frontpage — the shortforms really don’t get much visibility.

Information security considerations for AI and the long term future

Great overview of an important field for AI safety, thanks for sharing. A few questions if you have the time:

First, what secrets would be worth keeping? Most AI research today is open source, with methods described in detailed papers and code released on GitHub. That which is not released is often quickly reverse-engineered: OpenAI’s GPT-3 and DALL-E 2 systems, for example, both have performant open-source implementations. On the other hand, many government and military applications seemingly must be confidential. 

What kinds of AI research is kept secr... (read more)

Jeffrey Ladish (19d):
I agree that a lot of the research today by leading labs is being published. I think the norms are slowly changing, at least for some labs. Deciding not to (initially) release the model weights of GPT-2 was a big change in norms iirc, and I think the trend towards being cautious with large language models has continued. I expect that as these systems get more powerful, and the ways they can be misused get more obvious, norms will naturally shift towards less open publishing. That being said, I'm not super happy with where we're at now, and I think a lot of labs are being pretty irresponsible with their publishing.

The dual-use question is a good one, I think. Offensive security knowledge is pretty dual-use, yes. Pen testers can use their knowledge to illegally hack if they want to. But the incentives in the US are pretty good regarding legal vs. illegal hacking, less so in other countries. I'm not super worried about people learning hacking skills to protect AGI systems only to use those skills to cause harm -- mostly because the offensive security area is already very big / well resourced. In terms of using AI systems to create hacking tools, that's an area where I think dual-use concerns can definitely come into play, and people should be thoughtful & careful there.

I liked your shortform post. I'd be happy to see people apply infosec skills towards securing nuclear weapons (and in the biodefense area as well). I'm not very convinced this would mitigate risk from superintelligent AI, since nuclear weapons would greatly damage infrastructure without killing everyone, and thus not be very helpful to eliminating humans imo. You'd still need some kind of manufacturing capability in order to create more compute, and if you have the robotics capability to do this then wiping out humans probably doesn't take nukes - you could do it with drones or bioweapons
There are currently more than 100 open EA-aligned tech jobs

Thanks for sharing. It seems like the most informed people in AI Safety have strongly changed their views on the impact of OpenAI and DeepMind compared to only a few years ago. Most notably, I was surprised to see ~all of the OpenAI safety team leave for Anthropic. This shift and the reasoning behind it have been fairly opaque to me, although I try to keep up to date. Clearly there are risks with publicly criticizing these important organizations, but I'd be really interested to hear more about this update from anybody who understands it.

Big EA and Its Infinite Money Printer

Very important perspective from someone on the front lines of recruiting new EAs. Thanks for sharing!

How effective is sending a pre-interview project for a job you want?

Take-home projects are a great opportunity to show your skills. If possible, I would ask if there's a work trial before inventing your own non-solicited project.

My GWWC donations: Switching from long- to near-termist opportunities?

This makes a lot of sense to me. Personally I'm trying to use my career to work on longtermism, but focusing my donations on global poverty. A few reasons, similar to what you outlined above:

  • I don't want to place all my bets on longtermism. I'm sufficiently skeptical of arguments about AI risk, and sufficiently averse to pinning all my personal impact on a low-probability high-EV cause area, that I'd like to do some neartermist good with my life. Also, this
  • Comparatively speaking, longtermism needs more people and global poverty needs more cash. Give
... (read more)

I came to say the same thing. I was (not that long ago) working on longtermist stuff and donating to neartermist stuff (animal welfare). I think this is not uncommon among people I know.

Free-spending EA might be a big problem for optics and epistemics

There are lots of ways to accurately predict a job applicant’s future success. See the meta-analysis linked below, which finds general mental ability tests, work trials, and structured interviews all to be more predictive of future overall job performance than unstructured interviews, peer ratings, or reference checks.

I’m not a grantmaker and there are certainly benefits to informal networking-based grants, but on the whole I wish EA grantmaking relied less on social connections to grantmakers and more on these kinds of objective evaluations.

Meta-analysis ... (read more)

How much current animal suffering does longtermism let us ignore?

Strongly agreed, and I think it’s one of the most important baseline arguments against AI risk. See Linch’s motivated reasoning critique of effective altruism:

I agree that theorizing is more fun than agonizing (for EA types), but I feel like the counterfactual should be theorizing vs theorizing, or agonizing vs agonizing.

Theorizing: Speaking for myself, I bounced off of both AI safety and animal welfare research, but I didn't find animal welfare research less intellectually engaging, nor less motivating, than AI safety research. If anything the tractability and sense of novel territory makes it more motivating. Though maybe I'd find AI safety research more fun if I'm better at math. (I'm doing my current researc... (read more)

Will FTX Fund publish results from their first round?

Tyler Cowen’s Emergent Ventures fast grants program also publishes the funded projects with short summaries of their work. Seems like a very good idea, though maybe not the highest priority for the FTX team right now.

Why not offer a multi-million / billion dollar prize for solving the Alignment Problem?

Yeah that's a good point. Another hack would be training a model on text that specifically includes the answers to all of the TruthfulQA questions. 

The real goal is to build new methods and techniques that reliably improve truthfulness over a range of possible measurements. TruthfulQA is only one such measurement, and performing well on it does not guarantee a significant contribution to alignment capabilities. 

I'm really not sure what the unhackable goal looks like here. 

My colleagues have often been way too nice about reading group papers, rather than the opposite. (I’ll bet this varies a ton lab-to-lab.)
aogara's Shortform

Fun fact: For 20 years at the peak of the Cold War, the US nuclear launch code was “00000000”.

…Are you freaking kidding me??? EAs at the top level of DOE, please!

H/t: Gavin Leech

Why not offer a multi-million / billion dollar prize for solving the Alignment Problem?

For example, TruthfulQA is a quantitative benchmark for measuring the truthfulness of a language model. Achieving strong performance on this benchmark would not alone solve the alignment problem (or anything close to that), but it could potentially offer meaningful progress towards the valuable goal of more truthful AI.

This could be a reasonable benchmark for which to build a small prize, as well as a good example of the kinds of concrete goals that are most easily incentivized.

Here’s the paper:

I like the TruthfulQA idea/paper a lot, but I think incentivizing people to optimize against it probably wouldn't be very robust, and non-alignment-relevant ideas could wind up making a big difference. Just one of several issues: The authors selected questions adversarially against GPT-3—i.e., they oversampled the exact questions GPT-3 got wrong—so, simply replacing GPT-3 with something equally misaligned but different, like Gopher, should yield significantly better performance. That's really not something you want to see in an alignment benchmark.
Why not offer a multi-million / billion dollar prize for solving the Alignment Problem?
Answer by aogara, Apr 17, 2022

The main challenge seems to be formulating the goal in a sufficiently specific way. We don’t currently have a benchmark that would serve as a clear indicator of solving the alignment problem. Right now, any proposed solution ends up being debated by many people who often disagree on the solution’s merits.

FTX Future Fund listed AI Alignment Prizes on their ideas page and would be interested in funding them. Given that, it seems like coming up with clear targets for AI safety research would be very impactful.

My solution to this problem (originally posted here) is to run builder/breaker tournaments:

  • People sign up to play the role of "builder", "breaker", and/or "judge".
  • During each round of the tournament, triples of (builder, breaker, judge) are generated. The builder makes a proposal for how to build Friendly AI. The breaker tries to show that the proposal wouldn't work. ("Builder/breaker" terminology from this report.) The judge moderates the discussion.
    • Discussion could happen over video chat, in a Google Doc, in a Slack channel, or whatever. Personall
... (read more)
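As a sketch of how the pairing step could work (entirely my illustration; the proposal above doesn't specify a mechanism, and all names here are hypothetical), one way to generate (builder, breaker, judge) triples from signups:

```python
import random

def make_triples(builders, breakers, judges, seed=0):
    """Randomly pair signups into (builder, breaker, judge) triples.

    People may sign up for several roles, but no one is given two
    roles in the same triple.
    """
    rng = random.Random(seed)
    builders, breakers, judges = (list(x) for x in (builders, breakers, judges))
    for pool in (builders, breakers, judges):
        rng.shuffle(pool)

    triples = []
    for b in builders:
        # Pick a breaker and a judge distinct from the roles already
        # filled in this triple; skip the builder if none remain.
        br = next((x for x in breakers if x != b), None)
        if br is None:
            continue
        j = next((x for x in judges if x not in (b, br)), None)
        if j is None:
            continue
        breakers.remove(br)
        judges.remove(j)
        triples.append((b, br, j))
    return triples
```

This is only the matchmaking step; scoring, judging criteria, and repeated rounds would sit on top of it.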
Harrison Durland (1mo):
^ I am not super familiar with the history of “solve X problem and win Y reward” prizes, but the examples I can recall all involved solutions that were testable and relatively easy to objectively specify. With the alignment problem, it seems plausible that some proposals could be found to likely “work” in theory, but getting people to agree on the right metrics seems difficult, and if it goes poorly we might all die.
The Effective Institutions Project is hiring

Fantastic to see such strong progress on the institutional decision making front. Hoping that all goes well, and that EA’s newfound riches might even enable better funding for your hiring plans.

aogara's Shortform

Collected Thoughts on AI Safety

Here are of some of my thoughts on AI timelines:

And here are some thoughts on other AI Safety topics:

Generally speaking, I believe in longer timelines ... (read more)

How effective is sending a pre-interview project for a job you want?

Of course Warren, hope it’s helpful! I had a strong sense of what each company was looking for before investing time in a project. Usually this came from talking with them first, though in the case of AI Impacts it came from a public call for collaborators on the 80K podcast. I also always submit a normal job application, and I would generally only do a work project after speaking with someone and learning what they’re looking for. (When I have a dream job that I know a ton about, then I’m more inclined to take the ti... (read more)

Thanks for expanding! I know that some hiring processes in tech involve take home projects so I’m wondering how that played out if you had any of those despite doing a non-solicited project for them already?
aogara's Shortform

Update on Atlas Fellowship: They've extended their application period by one week! Good decision for getting more qualified applications into the pipeline. I wonder how many applications they've received overall. 

How effective is sending a pre-interview project for a job you want?

I got a four month work trial at AI Impacts after spending ~20 hours on an unsolicited pre-interview project, parts of which were later published on their website. I’m not sure if I would’ve gotten the interview otherwise; I was an undergraduate with no experience in AI at the time.

20 hours is definitely overkill, but in general, my goal in interviews and work trials is to ask lots of specific questions about what the employer needs and figure out how I can provide it. You can describe their problem and your specific skills in a PowerPoint or simply in you... (read more)

Thanks for sharing your experience! A couple of follow-up questions:

  • Did you send this unsolicited pre-interview project without talking to anyone at these companies? What were the responses like?
  • How did the results change between targeting small companies vs larger companies?
  • To clarify, you only sent an email with your pre-interview project without submitting an application?
Effective data science projects

Hey, I think this is a great idea. Credo AI is an organization working on data science-type projects for AI safety, maybe one of their projects could give you inspiration?

aogara's Shortform

Hey Evan, these are definitely stronger points against short timelines if you believe in slow takeoff, rather than points against short timelines in a hard takeoff world. It might come as no surprise that I think slow takeoff is much more likely than hard takeoff, with the Comprehensive AI Systems model best representing what I would expect. A short list of the key arguments there:

... (read more)
aogara's Shortform

Concerns with BioAnchors Timelines

A few points on the Bio Anchors framework, and why I expect TAI to require much more compute than used by the human brain:

1. Today we routinely use computers with as much compute as the human brain. Joe Carlsmith’s OpenPhil report finds the brain uses between 10^13 and 10^17 FLOP/s. He points out that Nvidia's V100 GPU retailing for $10,000 currently performs 10^14 FLOP/s. 

2. Ajeya Cotra’s Bio Anchors report shows that AlphaStar's training run used 10^23 FLOP, the equivalent of running a human brain-sized computer wit... (read more)
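As a rough illustrative calculation (my arithmetic, not from either report), the two figures above can be combined to express AlphaStar's training compute in "brain-years":

```python
# Compare AlphaStar's training compute to brain-equivalent compute,
# using the figures quoted above. Illustrative only.
ALPHASTAR_TRAIN_FLOP = 1e23   # Bio Anchors' estimate for AlphaStar's training run
BRAIN_FLOP_PER_S = 1e14       # mid-range of Carlsmith's 1e13-1e17 FLOP/s estimate

seconds = ALPHASTAR_TRAIN_FLOP / BRAIN_FLOP_PER_S
years = seconds / (3600 * 24 * 365)
print(f"{years:.1f} brain-years of compute")  # ~31.7 years
```

On these (highly uncertain) numbers, one RL training run already consumed decades of brain-equivalent compute without producing general intelligence.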

aogara's Shortform

I strongly disagree with the claim that there is a >10% chance of TAI in the next 10 years. Here are two small but meaningful pieces of why I have much longer AI timelines. 

Note that TAI is here defined as one or both of: (a) any 5 year doubling of real global GDP, or (b) any catastrophic or existential AI failures.  

Market Activity

Top tech companies do not believe that AI takeoff is around the corner. Mark Zuckerberg recently saw many of his top AI research scientists leave the company, as Facebook has chosen to acquire Oculus and bet on the ... (read more)

These are thoughtful data points, but consider that they may just be good evidence for hard takeoff rather than soft takeoff.

What I mean is that most of these examples show a failure of narrow AIs to deliver on some economic goals. In soft takeoff, we expect to see things like broad deployment of AIs contributing to massive economic gains and GDP doublings in short periods of time well before we get to anything like AGI.

But in hard takeoff, failure to see massive success from narrow AIs could happen due to regulations and other barriers (or it could just b... (read more)

The Vultures Are Circling

Would really appreciate links to Twitter threads or any other publicly available versions of these conversations. Appreciate you reporting what you’ve seen but I haven’t heard any of these conversations myself.

I sent a DM to the author asking if they could share examples. If you know of any, please DM me!

Yes to links showing where these conversations about gaming the system are happening! 

Surely this is something that should be shared directly with all funders as well? Are there any (in)formal systems in place for this?

aogara's Shortform

Appreciate and agree with both of these comments. I’ve made a brief update to the original post to reflect it, and hope to respond in more detail soon.

aogara's Shortform

The investment in advertising, versus the consumption-style spending on GiveDirectly? Just meant to compare the impact of the two. The first’s impact would come by raising more money to eventually be donated, the second is directly impactful, so I’d like to think about which is a better use of the funds.

I feel anxious that there is all this money around. Let's talk about it

Thanks for the corrections, fixed. I agree that the hits-based justification could work out, just would like to see more public analysis of this and other FTX initiatives.

aogara's Shortform

Some thoughts on FTX copied from this thread:

One way to approach this would simply be to make a hypothesis (i.e. the bar for grants is being lowered, we're throwing money at nonsense grants), and then see what evidence you can gather for and against it.

Thinking about FTX and their bar for funding seems very important. I'm thrilled that so much money is being put towards EA causes, but a few early signs have been concerning. Here are two considerations on the hypothesis that FTX has a lower funding bar than previous EA funding.

First, it seems that FTX would l... (read more)


First, it seems that FTX would like to spend a lot more a lot faster than has been the EA consensus for a long time. ... It also strikes against recent work on patient philanthropy, which is supported by Will MacAskill's argument that we are not living in the most influential time in human history. 

I don't think fast spending in and of itself strikes against patient longtermism: see Owen Cotton-Barratt's post “'Patient vs urgent longtermism' has little direct bearing on giving now vs later”.

Chris Leong (2mo):
I suggest caution with trying to compare the Future Fund's investments against donating to global poverty without engaging with the longtermist worldview. This worldview could be right or wrong, but it is important to engage with it to understand why FTX might consider these investments worthwhile.

Another part of the argument is that there is currently an absurd amount of money per effective altruist. This might not matter for global poverty, where much of the work can be outsourced, but it is a much bigger problem for many projects in other areas. In this scenario, it might make sense to spend absurd-seeming amounts of money to grow the pool of committed members, at least if this really is the bottleneck, particularly if you believe that certain projects need to be completed on short timelines.

I agree being situated in the Bahamas is less than deontologically spotless, but I don't believe that avoiding the negative PR is worth billions of dollars, and I don't see it as a particularly egregious moral violation, nor do I see this as significantly reducing trust in EA or FTX.
The second sentence seems to be confusing investment with consumption.

It also strikes against recent work on patient philanthropy, which is supported by Will MacAskill's argument that we are not living in the most influential time in human history.


Note that patient philanthropy includes investing in resources besides money that will allow us to do more good later; e.g. the linked article lists "global priorities research" and "Building a long-lasting and steadily growing movement" as promising opportunities from a patient longtermist view.

Looking at the Future Fund's Areas of Interest, at least 5 of the 10 strike me as... (read more)

Two factual nitpicks:

  1. The fellowship gives $50k to 100 fellows, a total of $5.5 mil.

  2. The money's not described by AF as "no strings attached." From their FAQ:

Scholarship money should be treated as “professional development funding” for award winners. This means the funds could be spent on things like professional travel, textbooks, technology, college tuition, supplementing unpaid internships, and more.

Students will receive ongoing guidance to manage and effectively spend their scholarship funds.

For Fellows ($50,000), a (taxed) amount is placed in a trust fund

... (read more)
5Rockwell Schwartz2mo
FYI, this just links to this same Forum post for me.
Critique of OpenPhil's macroeconomic policy advocacy

Would you say that OpenPhil's grants in 2021 were negative impact, but that many of their previous grants were positive impact? You demonstrate quite convincingly that the 2021 grants were negative impact (if they had impact at all), pushing us from supporting employment and consumption post-Covid to triggering inflation. But OpenPhil's macroeconomic policy grants date back to 2014, when the case for more expansionary monetary policy was much stronger. 

The monetary consensus was significantly more hawkish during the recovery from the 2008 recession. T... (read more)

My vague sense is that you're right and that until 2021, the program was very good and beneficial and helped with a faster recovery. I commend OpenPhil for engaging with this, and I agree that we should evaluate the impact of the whole program; I just don't have the capacity and means to do that. My vague sense is that the program would come out net positive on the whole. However, if there are large negative effects of current high inflation (like lots of populists getting elected, and we can causally attribute this to high inflation), then it ... (read more)

$1 to extend an infant's life by one day?
Answer by aogara, Mar 28, 2022

Hey, glad you're interested in donating! 

The Against Malaria Foundation distributes bednets to protect people from being bitten by mosquitos that carry malaria, a deadly disease that kills hundreds of thousands of people each year. GiveWell provides a cost-effectiveness analysis based on their extensive research on the charity and, while they strongly caution against taking these estimates as fact, their estimate is that $1 will extend an infant's life by at least two days. (Calculation: 60 years of additional life * 365 days / $9,064 maximum cost per... (read more)
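The truncated calculation above can be reproduced from the figures it quotes (a rough sketch; GiveWell's actual model is far more detailed):

```python
# Days of infant life extended per dollar, using the quoted figures:
# 60 years of additional life and a $9,064 maximum cost per life saved.
YEARS_GAINED = 60
COST_PER_LIFE = 9_064  # dollars, quoted maximum cost per life saved

days_per_dollar = YEARS_GAINED * 365 / COST_PER_LIFE
print(f"{days_per_dollar:.1f} days of life per dollar")  # ~2.4 days
```

So the "at least two days per dollar" figure falls directly out of the two quoted numbers.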

Aogara, what a perfect reply! This thoughtful estimate of the Against Malaria Foundation's expected benefit was thoroughly informative. The similar estimates for Malaria Consortium, New Incentives, Helen Keller International, and GiveWell's Maximum Impact Fund have sound evidence of extending the healthy life of an infant, on average, by about one day for one dollar.
Meditations on careers in AI Safety

+1 to all of this. Sounds like a very tough decision. If it were me, I would probably choose quality of life and stick with the startup. (Might also donate to areas that are more funding constrained like global development and animal welfare.)

Thanks for making concrete bets @aogara :)
How To Become A Professional Software Developer

Nice post. One suggestion for an area of specialization: web development. Building front-ends and back-ends for websites seems like one of the areas of software engineering with the hottest hiring market and lowest bar to entry. Many coding bootcamps focus on teaching web development skills during a 12-week course, and then immediately recommend applicants apply for a job. If you can build a nice-looking full-stack webapp with an API connection to a backend database that's hosted on e.g. Heroku and has the full code visible on GitHub, you have a good chance of being hired as a web developer. From there, you can branch into many other fields of software engineering. 

Yonatan Cale (2mo):
I agree with everything you said about web/fullstack development (upvoted!), I'd just like to push back on "hottest hiring market" as an important consideration. I know this may be controversial, hear me out: it is pretty hard to pick an area of software engineering today where it will be hard to find a job. Picking a slightly hotter area won't make much difference. (And worse: it's a question of supply and demand. Fullstack devs are also relatively common.)

Anyway, my point is that I think this consideration is overrated, and more importantly, distracts some people from something else. What's underrated in my opinion (within EA)? Personal fit within software development. This is worthy of an entire post, I think. TL;DR:

  1. My usefulness as a developer is very much affected by my skill.
  2. The speed I build skill is very much affected by how much my job interests me.

(Please remember this is only a TL;DR.)

My disclaimer would be "if you think you're going to choose a subdomain of software where there is too little demand, feel free to ask about it". For example, I wouldn't recommend learning Pascal. But in practice, from the actual conversations I had with EA devs, none of them aimed in a direction I thought was bad. Still, fullstack is great, has market demand, has EA demand, and in my personal opinion is very fun as well
Milan Griffes on EA blindspots

Hey Peter, on your last point, I believe the clearest paths from AI to x-risk run directly through either nuclear weapons or bioweapons. Not sure if the author believes the same, but here’s some thoughts I wrote up on the topic:

Yes, I have a similar position that early-AGI risk runs through nuclear mostly.  I wrote my thoughts on this here: When Bits Split Atoms

Peter Wildeford (2mo):
Thanks I'll take a look!
Milan Griffes on EA blindspots

Yeah, understandable but I would also push back. Mining / buying your own uranium and building a centrifuge to enrich it and putting it into a missile is difficult for even rogue nations like Iran. An advanced AI system might just be lines of code in a computer that can use the internet and output text or speech, but with no robotics system to give it physical capacity. From that point of view, building your own nukes seems much more difficult than hacking into an existing ICBM system.

The Future Fund’s Project Ideas Competition

Strongly agree with this. There are only a handful of weapons that threaten catastrophe to Earth’s population of 8 billion. When we think about how AI could cause an existential catastrophe, our first impulse shouldn’t be to think of “new weapons we can’t even imagine yet”. We should secure ourselves against the known credible existential threats first.

Wrote up some thoughts about doing this as a career path here:
