Effective Altruism Forum

All of Bogdan Ionut Cirstea's Comments + Replies

There have been numerous scandals within the EA community about how working for top AGI labs might be harmful. So, when are we going to have this conversation: contributing in any way to the current US admin getting (especially exclusive) access to AGI might be (very) harmful?

[cross-posted from X and LessWrong]

Open Philanthropy: Our Progress in 2023 and Plans for 2024

Bogdan Ionut Cirstea · 2y · 3

From the linked post:

As a result of our internal process, we decided to keep that new higher bar, while also aiming to roughly double our GCR spending over the next few years — if we can find sufficiently cost-effective opportunities.

At first glance, this seems potentially 'wildly conservative' to me, if I think of what this implies for the AI risk mitigation portion of the funding and how this intersects with (shortening) timelines [estimates].

My impression from looking briefly at recent grants is that probably <= $150M was spent by O...

lukeprog

Re: why our current rate of spending on AI safety is "low." At least for now, the main reason is lack of staff capacity! We're putting a ton of effort into hiring (see here) but are still not finding as many qualified candidates for our AI roles as we'd like. If you want our AI safety spending to grow faster, please encourage people to apply!

Could someone help me understand why it's so difficult to solve the alignment problem?

Bogdan Ionut Cirstea · 3y · 1

I'll also note that the role of the Constitution in Constitutional AI (https://www.anthropic.com/index/claudes-constitution) seems quite related to your 3rd paragraph.

Could someone help me understand why it's so difficult to solve the alignment problem?

Answer by Bogdan Ionut Cirstea · Jul 23, 2023 · 3

I think you're on to something and some related thoughts are a significant part of my research agenda. Here are some references you might find useful (heavily biased towards my own thinking on the subject), numbered by paragraph in your post:

There's a lot of accumulated evidence of significant overlap between LM and human linguistic representations; scaling laws of this phenomenon seem favorable, and LM embeddings have also been used as a model of shared linguistic space for transmitting thoughts during communication. I interpret this as suggesting outer alig...

What new psychology research could best promote AI safety & alignment research?

Answer by Bogdan Ionut Cirstea · Jul 14, 2023 · 4

There seems to be a nascent field in academia of using psychology tools/methods to understand LLMs, e.g. https://www.pnas.org/doi/10.1073/pnas.2218523120; it might be interesting to think about the intersection of this with alignment e.g. what experiments to perform, etc.

Maybe more on the neuroscience side, I'd be very excited to see (more) people think about how to build a neuroconnectionist research programme for alignment (I've also briefly mentioned this in the linkpost).

Timothy Chan

Another relevant article on "machine psychology" https://arxiv.org/abs/2303.13988 (interestingly, it's by a co-author of Peter Singer's first AI paper)

We're no longer "pausing most new longtermist funding commitments"

Bogdan Ionut Cirstea · 3y · 5

Maybe, though e.g. combined with

it would still result in a high likelihood of very short timelines to superintelligence (there can be inconsistencies between Metaculus forecasts, e.g. with

as others have pointed out before). I'm not claiming we should only rely on these Metaculus forecasts or that we should only plan for [very] short timelines, but I'm getting the impression the community as a whole and OpenPhil in particular haven't really updated their spending plans with respect to these considerations (or at...

We're no longer "pausing most new longtermist funding commitments"

Bogdan Ionut Cirstea · 3y · 38

Can you comment a bit more on how the specific numbers of years (20 and 50) were chosen? Aren't those intervals [very] conservative, especially given that AGI/TAI timeline estimates have shortened for many? E.g., if one took seriously the predictions from

wouldn't it be reasonable to also have scenarios under which you might want to spend at least the AI risk portfolio in something like 5-10 years instead? Maybe this is covered somewhat by 'Of course, we can adjust our spending rate over time', but I'd still be curious to hear more ...

Holden Karnofsky

Aiming to spend down in less than 20 years would not obviously be justified even if one’s median for transformative AI timelines were well under 20 years. This is because we may want extra capital in a “crunch time” where we’re close enough to transformative AI for the strategic picture to have become a lot clearer, and because even a 10-25% chance of longer timelines would provide some justification for not spending down on short time frames. This move could be justified if the existing giving opportunities were strong enough even with a lower bar. That may end up being the case in the future. But we don’t feel it’s the case today, having eyeballed the stack rank.

Imma🔸

Elsewhere, Holden makes this remark about the optimal timing of donations:

And in footnote 13:

I'm taking the quote out of context a little bit here. I don't know if Holden's guess that giving opportunities will increase is one of OpenPhil's reasons to spend at a low rate. There might be other reasons. Also, Holden is talking about individual donations here, not necessarily about OpenPhil spending. I'm adding it here because it might help answer the question "Why is the spending rate so low relative to AI timelines?" even though it's only tangentially relevant.

MichaelDickens · 3y · 13

That question's definition of AGI is probably too weak—it will probably resolve true a good deal before we have a dangerously powerful AI.

Robi Rahman🔸 · 3y · 19

Can the people who agreement-downvoted this explain themselves? Bogdan has a good point: if we really believe in short timelines to transformative AI, we should either be spending our entire AI-philanthropy capital endowment now or possibly investing it in something that will be useful after TAI exists. What does not make sense is trying to set up a slow funding stream for 50 years of AI alignment research if we'll have AGI in 20 years.

(Edit: the comment above had very negative net agreement when I wrote this.)

EA & LW Forums Weekly Summary (19 - 25 Sep 22')

Bogdan Ionut Cirstea · 3y · 3

Thanks, this series of summaries is great! Minor correction: DeepMind released Sparrow (not OpenAI).

Zoe Williams

Thanks! Fixed

[$20K In Prizes] AI Safety Arguments Competition

Bogdan Ionut Cirstea · 4y · 1

'One metaphor for my headspace is that it feels as though the world is a set of people on a plane blasting down the runway:

And every time I read commentary on what's going on in the world, people are discussing how to arrange your seatbelt as comfortably as possible given that wearing one is part of life, or saying how the best moments in life are sitting with your family and watching the white lines whooshing by, or arguing about whose fault it is that there's a background roar making it hard to hear each other.

I don't know where we're actually heading, o...

[$20K In Prizes] AI Safety Arguments Competition

Bogdan Ionut Cirstea · 4y · 1

'If you know the aliens are landing in thirty years, it’s still a big deal now.' (Stuart Russell)

[$20K In Prizes] AI Safety Arguments Competition

Bogdan Ionut Cirstea · 4y · 1

'Before the prospect of an intelligence explosion, we humans are like small children playing with a bomb. Such is the mismatch between the power of our plaything and the immaturity of our conduct. Superintelligence is a challenge for which we are not ready now and will not be ready for a long time. We have little idea when the detonation will occur, though if we hold the device to our ear we can hear a faint ticking sound. For a child with an undetonated bomb in its hands, a sensible thing to do would be to put it down gently, quickly back out of the room,...

[$20K In Prizes] AI Safety Arguments Competition

Bogdan Ionut Cirstea · 4y · 1

'Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion,' and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make provided that the machine is docile enough to tell us how to keep it under control.' (I. J. Good)

[$20K In Prizes] AI Safety Arguments Competition

Bogdan Ionut Cirstea · 4y · 2

'The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.' (Eliezer Yudkowsky)

[$20K In Prizes] AI Safety Arguments Competition

Bogdan Ionut Cirstea · 4y · 0

'You can't fetch the coffee if you're dead' (Stuart Russell)

Career Advice: Philosophy + Programming -> AI Safety

Answer by Bogdan Ionut Cirstea · Mar 19, 2022 · 7

Consider applying for https://www.eacambridge.org/agi-safety-fundamentals

tcelferact

Thanks, I'm now on their mailing list!

Nines of safety: Terence Tao’s proposed unit of measurement of risk

Bogdan Ionut Cirstea · 4y · 4

If I remember correctly (from 'The Precipice') 'Unaligned AI ~1 in 50 1.7' should actually be 'Unaligned AI ~1 in 10 1'.

[anonymous] · 4y · 2

Thanks for pointing this out! Should be fixed now
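
The two figures in the correction above can be cross-checked with a short sketch. It assumes the usual reading of the unit, where k nines of safety means a probability of safety of 1 - 10^(-k), so the number of nines is -log10(risk). The function name `nines_of_safety` is hypothetical and only for illustration.

```python
import math


def nines_of_safety(risk: float) -> float:
    """Convert a probability of catastrophe into 'nines of safety'.

    Assumption: k nines of safety corresponds to a safety probability
    of 1 - 10**(-k), so k = -log10(risk).
    """
    if not 0.0 < risk <= 1.0:
        raise ValueError("risk must be in (0, 1]")
    return -math.log10(risk)


# The two risk levels mentioned in the comment above.
for label, risk in [("1 in 50", 1 / 50), ("1 in 10", 1 / 10)]:
    print(f"{label}: {nines_of_safety(risk):.1f} nines")
```

Under that assumption, a 1 in 50 risk comes out at about 1.7 nines and a 1 in 10 risk at 1 nine, which matches the pairing of numbers in both the original and the corrected table entry.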

Why AI alignment could be hard with modern deep learning

Bogdan Ionut Cirstea · 4y · 12

From https://www.cold-takes.com/supplement-to-why-ai-alignment-could-be-hard/ : 'A model about as powerful as a human brain seems like it would be ~100-10,000 times larger than the largest neural networks trained today, and I think could be trained using an amount of data and computation that -- while probably prohibitive as of August 2021 -- would come within reach after 15-30 years of hardware and algorithmic improvements.' Is it safe to assume that this is an updated, shorter timeline compared to https://www.alignmentforum.org/posts/KrJfoZzpSDpnrv...

Ajeya

Not intended to be expressing a significantly shorter timeline; 15-30 years was supposed to be a range of "plausible/significant probability" which the previous model also said (probability on 15 years was >10% and probability on 30 years was 50%). Sorry that wasn't clear! (JTBC I think you could train a brain-sized model sooner than my median estimate for TAI, because you could train it on shorter horizon tasks.)

What kind of event, targeted to undergraduate CS majors, would be most effective at getting people to work on AI safety?

Answer by Bogdan Ionut Cirstea · Sep 19, 2021 · 6

Encouraging them to apply to the next round of the AGI Safety Fundamentals program https://www.eacambridge.org/agi-safety-fundamentals might be another idea. The curriculum there can also provide inspiration for reading group materials.

Forecasting Newsletter: August 2021

Bogdan Ionut Cirstea · 4y · 1

'CSET-Foretell forecasts were quoted by Quanta Magazine (a) on whether VC funding for tech startups will dry up' - the linked article seems to come from Quartz, not Quanta Magazine.

NunoSempere

Thanks. fixed.

Forecasting transformative AI: the "biological anchors" method in a nutshell

Bogdan Ionut Cirstea · 4y · 20

I was very surprised by the paragraph: 'However, I also have an intuitive preference (which is related to the "burden of proof" analyses given previously) to err on the conservative side when making estimates like this. Overall, my best guesses about transformative AI timelines are similar to those of Bio Anchors.' especially in context and especially because of the use of the term 'conservative'. I would have thought that the conservative assumption to make would be shorter timelines (since less time to prepare). If I remember correctly, Toby Ord discusse...

Holden Karnofsky

There are contexts in which I'd want to use the terms as you do, but I think it is often reasonable to associate "conservatism" with being more hesitant to depart from conventional wisdom, the status quo, etc. In general, I have always been sympathetic to the idea that the burden of proof/argumentation is on those who are trying to raise the priority of some particular issue or problem. I think there are good reasons to think this works better (and is more realistic and conducive to clear communication) than putting the burden of proof on people to ignore some novel issue / continue what they were doing.

What EA projects could grow to become megaprojects, eventually spending $100m per year?

Answer by Bogdan Ionut Cirstea · Aug 14, 2021 · 4

I think aligning narrow superhuman models could be one very valuable megaproject and this seems scalable to >= $100 million, especially if also training large models (not just fine-tuning them for safety). Training their own large models for alignment research seems to be what Anthropic plans to do. This is also touched upon in Chris Olah's recent 80k interview.