All of Akash's Comments + Replies

Akash
4mo

Congrats to Zach! This seems mostly intended as a "quick update/celebratory post", but I feel like there's a missing mood that I want to convey in this comment. Note that my thoughts mostly come from an AI Safety perspective, so they may be less relevant for folks who focus on other cause areas.

My impression is that EA is currently facing an unprecedented amount of PR backlash, as well as some solid internal criticism from core EAs who are now distancing themselves from EA. I suspect this will continue into 2024. Some examples:

  • EA has acq
... (read more)

I strongly agree that being associated with EA in AI policy is increasingly difficult (as many articles and individuals' posts on social media can attest), in particular in Europe, DC, and the Bay Area. 

I appreciate Akash's comment, and at the same time, I understand that the purpose of this post is not to ask for people's opinions about what CEA's priorities should be, so I won't go into too much detail. I want to highlight that I'm really excited for Zach Robinson to lead CEA!

With my current knowledge of the situation in three different jurisdictions,... (read more)

5
Chris Leong
4mo
These are all good points, but I suspect it could be a mistake for EA to focus too much on PR. It's very important to listen carefully to people's concerns, but I also think we need the confidence to forge our own path.

Do you know anything about the strategic vision that Zach has for CEA? Or is this just meant to be a positive endorsement of Zach's character/judgment? 

(Both are useful; just want to make sure that the distinction between them is clear). 

I appreciate the comment, though I think there's a lack of specificity that makes it hard to figure out where we agree/disagree (or more generally what you believe).

If you want to engage further, here are some things I'd be excited to hear from you:

  • What are a few specific comms/advocacy opportunities you're excited about//have funded?
  • What are a few specific comms/advocacy opportunities you view as net negative//have actively decided not to fund?
  • What are a few examples of hypothetical comms/advocacy opportunities you've been excited about?
  • What do you think
... (read more)
Answer by Akash, Dec 16, 2023

I expect that your search for a "unified resource" will be unsatisfying. I think people disagree enough on their threat models/expectations that there is no real "EA perspective".

Some things you could consider doing:

  • Have a dialogue with 1-2 key people you disagree with.
  • Pick one perspective (e.g., Paul's worldview, Eliezer's worldview) and write about where you disagree with it.
  • Write up a "Matthew's worldview" doc that focuses more on explaining what you expect to happen and isn't necessarily meant as a "counterargument" piece.

Among the questions you... (read more)

Tom Barnes
4mo

I agree there's no single unified resource. Having said that, I found Richard Ngo's "five alignment clusters" pretty helpful for bucketing different groups & arguments together. Reposting below:

  1. MIRI cluster. Think that P(doom) is very high, based on intuitions about instrumental convergence, deceptive alignment, etc. Does work that's very different from mainstream ML. Central members: Eliezer Yudkowsky, Nate Soares.
  2. Structural risk cluster. Think that doom is more likely than not, but not for the same reasons as the MIRI cluster. Instead, this cluster f
... (read more)
9
Ryan Greenblatt
4mo
I agree that there is no real "EA perspective", but it seems like there could be a unified doc that a large cluster of people end up roughly endorsing. E.g., I think that if Joe Carlsmith wrote another version of "Is Power-Seeking AI an Existential Risk?" in the next several years, then it's plausible that a relevant cluster of people would end up thinking this basically lays out the key arguments and makes the right arguments. (I'm unsure what I currently think about the old version of the doc, but I'm guessing I'll think it misses some key arguments that now seem more obvious.)

Thanks for this overview, Trevor. I expect it'll be helpful. I also agree with your recommendations for people to consider working at standard-setting organizations and other relevant EU offices.

One perspective that I see missing from this post is what I'll call the advocacy/comms/politics perspective. Some examples of this with the EU AI Act:

  • Foundation models were going to be included in the EU AI Act, until France and Germany (with lobbying pressure from Mistral and Aleph Alpha) changed their position.
  • This initiated a political/comms battle between those
... (read more)
2
tlevin
4mo
(Cross-posting from LW) Thanks for these thoughts! I agree that advocacy and communications is an important part of the story here, and I'm glad for you to have added some detail on that with your comment. I’m also sympathetic to the claim that serious thought about “ambitious comms/advocacy” is especially neglected within the community, though I think it’s far from clear that the effort that went into the policy research that identified these solutions or work on the ground in Brussels should have been shifted at the margin to the kinds of public communications you mention. I also think Open Phil’s strategy is pretty bullish on supporting comms and advocacy work, but it has taken us a while to acquire the staff capacity to gain context on those opportunities and begin funding them, and perhaps there are specific opportunities that you're more excited about than we are.

For what it’s worth, I didn’t seek significant outside input while writing this post and think that's fine (given the alternative of writing it quickly, posting it here, disclaiming my non-expertise, and getting additional perspectives and context from commenters like yourself). However, I have spoken with about a dozen people working on AI policy in Europe over the last couple months (including one of the people whose public comms efforts are linked in your comment) and would love to chat with more people with experience doing policy/politics/comms work in the EU.

We could definitely use more help thinking about this stuff, and I encourage readers who are interested in contributing to OP’s thinking on advocacy and comms to do any of the following:

  • Write up these critiques (we do read the forums!);
  • Join our team (our latest hiring round specifically mentioned US policy advocacy as a specialization we'd be excited about, but people with advocacy/politics/comms backgrounds more generally could also be very useful, and while the round is now closed, we may still review general applications)

I'm excited to see the EAIF share more about their reasoning and priorities. Thank you for doing this!

I'm going to give a few quick takes– happy to chat further about any of these. TLDR: I recommend (1) getting rid of the "principles-first" phrase & (2) issuing more calls for proposals focused on the specific projects you want to see (regardless of whether or not they fit neatly into an umbrella term like "principles-first")

  • After skimming the post for 5 minutes, I couldn't find a clear/succinct definition of what "principles-first" actually mean
... (read more)
6
Linch
4mo
Just noting that this is EAIF, not LTFF.

Personally, I still think there is a lot of uncertainty around how governments will act. There are at least some promising signs (e.g., UK AI Safety Summit) that governments could intervene to end or substantially limit the race toward AGI. Relatedly, I think there's a lot to be done in terms of communicating AI risks to the public & policymakers, drafting concrete policy proposals, and forming coalitions to get meaningful regulation through. 

Some folks also have hope that internal governance (lab governance) could still be useful. I am not as opt... (read more)

5
Isaac Dunn
5mo
I think that trying to get safe concrete demonstrations of risk by doing research seems well worth pursuing (I don't think you were saying it's not).

I would be interested in seeing your takes about why building runway might be more cost-effective than donating.

Separately, if you decide not to go with 10% because you want to think about what is actually best for you, I suggest you give yourself a deadline. Like, suppose you currently think that donating 10% would be better than status quo. I suggest doing something like “if I have not figured out a better solution by Jan 1 2024, I will just do the community-endorsed default of 10%.”

I think this protects against some sort of indefinite procrastination. (Obviously less relevant if you never indefinitely procrastinate on things like this, but my sense is that most people do at least sometimes).

6
calebp
5mo
(To be clear, I do donate; I just haven't signed the pledge, and I'm confused about how much I am already donating.)

I think the main things are:

  • Whilst I think donating now > donating in the future + interest, the cost of waiting to donate is fairly low (if you're not worried about value drift).
  • I can think of many situations in the past where an extra $10k would have been extremely useful to me to move to more impactful work.
  • I don't think that it always makes sense for funders to give this kind of money to people in my position.
  • I now have friends who could probably do this for me, but it has some social cost.
  • I think it's important for me to be able to walk away from my job without worrying about personal finances.
  • My job has a certain kind of responsibility that sometimes makes me feel uneasy, and being able to walk away without having another reason not to seems important.
  • I think I've seen several EAs make poor decisions from a place of poor personal finance and unusual financial security strategies. I think the epistemic effects of worrying about money are pretty real for me.

Also:

  • If I were trying to have the most impact with my money via donations, I think I would donate to various small things that I sometimes see that funders aren't well positioned to fund. This would probably mean saving my money and not giving right now.
  • (I think that this kind of strategy is especially good for me as I have a good sense of what funders can and can't fund - I think people tend to overestimate the set of things funders can't fund.)
  • I don't see why the GWWC 10% number should generalise well to my situation. I don't think it's a bad number. I don't weigh the community prior very strongly relative to my inside view here.

I think it’s good for proponents of RSPs to be open about the sorts of topics I’ve written about above, so they don’t get confused with e.g. proposing RSPs as a superior alternative to regulation. This post attempts to do that on my part. And to be explicit: I think regulation will be necessary to contain AI risks (RSPs alone are not enough), and should almost certainly end up stricter than what companies impose on themselves.

Strong agree. I wish ARC and Anthropic had been more clear about this, and I would be less critical of their RSP posts if they were ... (read more)

9
evhub
6mo
Cross-posted from LessWrong. It's hard to take anything else you're saying seriously when you say things like this; it seems clear that you just haven't read Anthropic's RSP. I think that the current conditions and resulting safeguards are insufficient to prevent AI existential risk, but to say that it doesn't make them clear is just patently false. The conditions under which Anthropic commits to pausing in the RSP are very clear. In big bold font on the second page it says: And then it lays out a series of safety procedures that Anthropic commits to meeting for ASL-3 models or else pausing, with some of the most serious commitments here being: And a clear evaluation-based definition of ASL-3: This is the basic substance of the RSP; I don't understand how you could have possibly read it and missed this. I don't want to be mean, but I am really disappointed in these sorts of exceedingly lazy takes.

Excited to see this team expand! A few [optional] questions:

  1. What do you think were some of your best and worst grants in the last 6 months?
  2. What are your views on the value of "prosaic alignment" relative to "non-prosaic alignment?" To what extent do you think the most valuable technical research will look fairly similar to "standard ML research", "pure theory research", or other kinds of research?
  3. What kinds of technical research proposals do you think are most difficult to evaluate, and why?
  4. What are your favorite examples of technical alignment research fr
... (read more)

When should someone who cares a lot about GCRs decide not to work at OP?

I agree that there are several advantages of working at Open Phil, but I also think there are some good answers to "why wouldn't someone want to work at OP?"

Culture, worldview, and relationship with labs

Many people have an (IMO fairly accurate) impression that OpenPhil is conservative, biased toward inaction, prefers maintaining the status quo, and generally favors maintaining positive relationships with labs.

As I've gotten more involved in AI policy, I've updated mor... (read more)

7
tlevin
6mo
(I began working for OP on the AI governance team in June. I'm commenting in a personal capacity based on my own observations; other team members may disagree with me.)

FWIW I really don’t think OP is in the business of preserving the status quo. People who work on AI at OP have a range of opinions on just about every issue, but I don't think any of us feel good about the status quo! People (including non-grantees) often ask us for our thoughts about a proposed action, and we’ll share if we think some action might be counterproductive, but many things we’d consider “productive” look very different from “preserving the status quo.” For example, I would consider the CAIS statement to be pretty disruptive to the status quo and productive, and people at Open Phil were excited about it and spent a bunch of time finding additional people to sign it before it was published.

I agree that OP has an easier time recruiting than many other orgs, though perhaps a harder time than frontier labs. But at risk of self-flattery, I think the people we've hired would generally be hard to replace — these roles require a fairly rare combination of traits. People who have them can be huge value-adds relative to the counterfactual!

I basically disagree with this. There are areas where senior staff have strong takes, but they'll definitely engage with the views of junior staff, and they sometimes change their minds. Also, the AI world is changing fast, and as a result our strategy has been changing fast, and there are areas full of new terrain where a new hire could really shape our strategy. (This is one way in which grantmaker capacity is a serious bottleneck.)
6
SiebeRozendal
6mo
Wow lots of disagreement here - I'm curious what the disagreement is about, if anyone wants to explain?

Adding this comment over from the LessWrong version. Note Evan and others have responded to it here.

Thanks for writing this, Evan! I think it's the clearest writeup of RSPs & their theory of change so far. However, I remain pretty disappointed in the RSP approach and the comms/advocacy around it.

I plan to write up more opinions about RSPs, but one I'll express for now is that I'm pretty worried that the RSP dialogue is suffering from motte-and-bailey dynamics. One of my core fears is that policymakers will walk away with a misleadingly positive impress... (read more)

Thanks! A few quick responses/questions:

I think presumably the pause would just be for that company's scaling—presumably other organizations that were still in compliance would still be fine.

I think this makes sense for certain types of dangerous capabilities (e.g., a company develops a system that has strong cyberoffensive capabilities. That company has to stop but other companies can keep going).

But what about dangerous capabilities that have more to do with AI takeover (e.g., a company develops a system that shows signs of autonomous replication, manipu... (read more)

7
evhub
6mo
I wrote up a bunch of my thoughts on this in more detail here.
2
evhub
6mo
What should happen there is that the leading lab is forced to stop and try to demonstrate that e.g. they understand their model sufficiently such that they can keep scaling. Then:

  • If they can't do that, then the other labs catch up and they're all blocked on the same spot, which if you've put your capabilities bars at the right spots, shouldn't be dangerous.
  • If they can do that, then they get to keep going, ahead of other labs, until they hit another blocker and need to demonstrate safety/understanding/alignment to an even greater degree.

@evhub can you say more about what you envision a governmentally-enforced RSP world would look like? Is it similar to licensing? What happens when a dangerous capability eval goes off— does the government have the ability to implement a national pause?

Aside: IMO it's pretty clear that the voluntary-commitment RSP regime is insufficient, since some companies simply won't develop RSPs, and even if lots of folks adopted RSPs, the competitive pressures in favor of racing seem like they'd make it hard for anyone to pause for >a few months. I was surprised/di... (read more)

5
evhub
6mo
I think presumably the pause would just be for that company's scaling—presumably other organizations that were still in compliance would still be fine. That's definitely my position, yeah—and I think it's also ARC's and Anthropic's position. I think the key thing with the current advocacy around companies doing this is that one of the best ways to get a governmentally-enforced RSP regime is for companies to first voluntarily commit to the sort of RSPs that you want the government to later enforce.

@tlevin I would be interested in you writing up this post, though I'd be even more interested in hearing your thoughts on the regulatory proposal Thomas is proposing.

Note that both of your points seem to be arguing against a pause, whereas my impression is that Thomas's post focuses more on implementing a national regulatory body.

(I read Thomas's post as basically saying like "eh, I know there's an AI pause debate going on, but actually this pause stuff is not as important as getting good policies. Specifically, we should have a federal agency that does li... (read more)

One thing I appreciate about both of these tests is that they seem to (at least partially) tap into something like "can you think for yourself & reason about problems in a critical way?" I think this is one of the most important skills to train, particularly in policy, where it's very easy to get carried away with narratives that seem popular or trendy or high-status.

I think the current zeitgeist has gotten a lot of folks interested in AI policy. My sense is that there's a lot of potential for good here, but there are also some pretty easy ways for thi... (read more)

I agree with Zach here (and I’m also a fan of Holly). I think it’s great to spotlight people whose applications you’re excited about, and even reasonable for the tone to be mostly positive. But I think it’s fair for people to scrutinize the exact claims you make and the evidence supporting those claims, especially if the target audience consists of potential donors.

My impression is that the crux is less about “should Holly be funded” and more about “were the claims presented precise,” and, more broadly, “how careful should future posts be when advertising possible candidates.”

6
Austin
9mo
Yeah idk, this just seems like a really weird nitpick, given that you both like Holly's work...? I'm presenting a subjective claim to begin with: "Holly's track record is stellar", based on my evaluation of what's written in the application plus external context. If you think this shouldn't be funded, I'd really appreciate the reasoning; but I otherwise don't see anything I would change about my summary.
7
pseudonym
9mo
Just another +1 to Zach and Akash. A related point, speaking for myself: I'm less likely to fund these projects based on their descriptions if those descriptions are mainly hype, because I may not have the time and energy to evaluate how much they are hyped, and I don't know Manifund's track record well enough to defer. On the other hand, if I have reason to believe these are well-calibrated statements, then I'm more likely to be happy to defer in future. Don't feel like you should change your approach based on one individual's preferences, but I just thought this might be a useful data point.

Congratulations on launching!

On the governance side, one question I'd be excited to see Apollo (and ARC evals & any other similar groups) think/write about is: what happens after a dangerous capability eval goes off? 

Of course, the actual answer will be shaped by the particular climate/culture/zeitgeist/policy window/lab factors that are impossible to fully predict in advance.

But my impression is that this question is relatively neglected, and I wouldn't be surprised if sharp newcomers were able to meaningfully improve the community's thinking on this. 

Excited to see this! I'd be most excited about case studies of standards in fields where people didn't already have clear ideas about how to verify safety.

In some areas, it's pretty clear what you're supposed to do to verify safety. Everyone (more-or-less) agrees on what counts as safe.

One of the biggest challenges with AI safety standards will be the fact that no one really knows how to verify that a (sufficiently-powerful) system is safe. And a lot of experts disagree on the type of evidence that would be sufficient.

Are there examples of standards in oth... (read more)

1
Ariel G.
11mo
Yes, medical robotics is one I was involved in. Though there, the answer is often just to wait for the first product to hit the market (there is nothing quite there yet doing fully autonomous surgery), and then copy their approach. As is, the medical standards don't cover much ML, and so the companies have to come up with the reasoning themselves for convincing the FDA in the audit. Which in practice means many companies just don't risk it, and do something robotic but surgeon-controlled, or use classical algorithms instead of deep learning.
6
Koen Holtman
11mo
While overcoming expert disagreement is a challenge, it is not one that is as big as you think. TL;DR: Deciding not to agree is always an option.

To expand on this: the fallback option in a safety standards creation process, for standards that aim to define a certain level of safe-enough, is as follows. If the experts involved cannot agree on any evidence-based method for verifying that a system X is safe enough according to the level of safety required by the standard, then the standard being created will simply, and usually implicitly, declare that there is no route by which system X can comply with the safety standard. If you are required by law, say by EU law, to comply with the safety standard before shipping a system into the EU market, then your only legal option will be to never ship that system X into the EU market. For AI systems you interact with over the Internet, this 'never ship' translates to 'never allow it to interact over the Internet with EU residents'.

I am currently in the JTC21 committee which is running the above standards creation process to write the AI safety standards in support of the EU AI Act, the Act that will regulate certain parts of the AI industry, in case they want to ship legally into the EU market. ((Legal detail: if you cannot comply with the standards, the Act will give you several other options that may still allow you to ship legally, but I won't get into explaining all those here. These other options will not give you a loophole to evade all expert scrutiny.))

Back to the mechanics of a standards committee: if a certain AI technology, when applied in a system X, is well known to make that system radioactively unpredictable, it will not usually take long for the technical experts in a standards committee to come to an agreement that there is no way that they can define any method in the standard for verifying that X will be safe according to the standard. The radioactively unsafe cases are the easiest cases to handle. Th
2
Ben Stewart
11mo
Maybe there's something in early cybersecurity? I.e. we're not really sure precisely how people could be harmed through these systems (like the nascent internet), but there's plenty of potential in the future?

Glad to see this write-up & excited for more posts.

I think these are three areas that MATS has handled well. I'd be especially excited to hear more about areas where MATS thinks it's struggling, MATS is uncertain, or where MATS feels like it has a lot of room to grow. Potential candidates include:

  • How is MATS going about talent selection and advertising for the next cohort, especially given the recent wave of interest in AI/AI safety?
  • How does MATS intend to foster (or recruit) the kinds of qualities that strong researchers often possess?
  • How does MATS de
... (read more)
3
Ryan Kidd
1y
  • We broadened our advertising approach for the Summer 2023 Cohort, including a Twitter post and a shout-out on Rob Miles' YouTube and TikTok channels. We expected some lowering of average applicant quality as a result but have yet to see a massive influx of applicants from these sources. We additionally focused more on targeted advertising to AI safety student groups, given their recent growth. We will publish updated applicant statistics after our applications close.
  • In addition to applicant selection and curriculum elements, our Scholar Support staff, introduced in the Winter 2022-23 Cohort, supplement the mentorship experience by providing 1-1 research strategy and unblocking support for scholars. This program feature aims to:
    • Supplement and augment mentorship with 1-1 debugging, planning, and unblocking;
    • Allow air-gapping of evaluation and support, improving scholar outcomes by resolving issues they would not take to their mentor;
    • Solve scholars’ problems, giving more time for research.
  • Defining "good alignment research" is very complicated and merits a post of its own (or two, if you also include the theories of change that MATS endorses). We are currently developing scholar research ability through curriculum elements focused on breadth, depth, and epistemology (the "T-model of research"):
    • Breadth-first search (literature reviews, building a "toolbox" of knowledge, noticing gaps);
    • Depth-first search (forming testable hypotheses, project-specific skills, executing research, recursing appropriately, using checkpoints);
    • Epistemology (identifying threat models, backchaining to local search, applying builder/breaker methodology, babble and prune, "infinite-compute/time" style problem decompositions, etc.).
  • Our Alumni Spotlight includes an incomplete list of projects we highlight. Many more past scholar projects seem promising to us but have yet to meet our criteria for inclusion here. Watch this space.
  • Since Summer 2022,

Clarification: I think we're bottlenecked by both, and I'd love to see the proposals become more concrete. 

Nonetheless, I think proposals like "Get a federal agency to regulate frontier AI labs like the FDA/FAA" or even "push for an international treaty that regulates AI the way the IAEA regulates atomic energy" are "concrete enough" to start building political will behind them. Other (more specific) examples include export controls, compute monitoring, licensing for frontier AI models, and some others on Luke's list.

I don't think any of t... (read more)

I don't actually think the implementation of governance ideas is mainly bottlenecked by public support; I think it's bottlenecked by good concrete proposals. And to the extent that it is bottlenecked by public support, that will change by default as more powerful AI systems are released.

I appreciate Richard stating this explicitly. I think this is (and has been) a pretty big crux in the AI governance space right now.

Some folks (like Richard) believe that we're mainly bottlenecked by good concrete proposals. Other folks believe that we have concrete proposa... (read more)

I can see a worldview in which prioritizing raising awareness is more valuable, but I don't see the case for believing "that we have concrete proposals". Or at least, I haven't seen any; could you link them, or explain what you mean by a concrete proposal?

My guess is that you're underestimating how concrete a proposal needs to be before you can actually muster political will behind it. For example, you don't just need "let's force labs to pass evals", you actually need to have solid descriptions of the evals you want them to pass.

I also think that recent e... (read more)

Lots of awesome stuff requires AGI or superintelligence. People think LLMs (or stuff LLMs invent) will lead to AGI or superintelligence.

So wouldn’t slowing down LLM progress slow down the awesome stuff?

8
Zach Stein-Perlman
1y
Yeah, that awesome stuff. My impression is that most people who buy "LLMs --> superintelligence" favor caution despite caution slowing awesome stuff. But this thread seems unproductive.

I think more powerful (aligned) LLMs would lead to more awesome stuff, so caution on LLMs does delay other awesome stuff.

I agree with the point that "there's value that can be gained from figuring out how to apply systems at current capabilities levels" (AI summer harvest), but I wouldn't go as far as "you can almost have the best of both worlds." It seems more like "we can probably do a lot of good with existing AI, so even though there are costs of caution, those costs are worth paying, and at least we can make some progress applying AI to pressing world problems while we figure out alignment/governance." (My version isn't catchy though, oops).

6
Zach Stein-Perlman
1y
Sure. Often when people talk about awesome stuff they're not referring to LLMs. In this case, there's no need to slow down the awesome stuff they're talking about.

I appreciate that this post acknowledges that there are costs to caution. I think it could've gone a bit further in emphasizing how these costs, while large in an absolute sense, are small relative to the risks.

The formal way to do this would be a cost-benefit analysis on longtermist grounds (perhaps with various discount rates for future lives). But I think there's also a way to do this in less formal/wonky language, without requiring any longtermist assumptions.

If you have a technology where half of experts believe there's a ~10% chance of extinction, th... (read more)

Imagine: would you board an airplane if 50% of airplane engineers who built it said there was a 10% chance that everybody on board dies?

In the context of the OP, the thought experiment would need to be extended.

"Would you risk a 10% chance of a deadly crash to go to [random country]" -> ~100% of people reply no.

"Would you risk a 10% of a deadly crash to go to a Utopia without material scarcity, conflict, disease?" -> One would expect a much more mixed response.

The main ethical problem is that in the scenario of global AI progress, everyone is forced to board the plane, irrespective of their preferences.

Answer by Akash, Apr 27, 2023

The impression I get is that lots of people are like “yeah, I’d like to see more work on this & this could be very important” but there aren’t that many people who want to work on this & have ideas.

Is there evidence that funding isn’t available for this work? My loose impression is that mainstream funders would be interested in this. I suppose it’s an area where it’s especially hard to evaluate the promisingness of a proposal, though.

Reasons people might not be interested in doing this work: — Tractability — Poor feedback loops — Not many others in... (read more)

On the letter itself, though, I have a bunch of uncertainties around whether a six month pause right now would actually help

I share many of your concerns, though I think on balance I feel more enthusiastic about the six-month pause. (Note that I'm thinking about a six-month pause on frontier AI development that is enforced across the board, at least in the US, and I'm more confused about a six-month pause that a few specific labs opt-in to). 

I wonder if this relates more to an epistemic difference (e.g., the actual credence we put on the six-month pau... (read more)

I think it's great that you're releasing some posts that criticize/red-team some major AIS orgs. It's sad (though understandable) that you felt like you had to do this anonymously. 

I'm going to comment a bit on the Work Culture Issues section. I've spoken to some people who work at Redwood, have worked at Redwood, or considered working at Redwood.

I think my main comment is something like you've done a good job pointing at some problems, but I think it's pretty hard to figure out what should be done about these problems. To be clear, I think the post m... (read more)

Hi Akash,

Thank you for sharing your thoughts & those concrete action items - I agree it would be nice to have a set of recommendations in an ideal world.

This post took at least 50 hours (collectively) to write, and was delayed in publishing by a few days due to busy schedules. I think if we had more time, I would have shared the final version with a small set of non-redwood beta reviewers for comments which would have caught things like this (and e.g. Nunos' comment).

We plan to do this for future posts (if you're reading this and would like to give com... (read more)

Akash
1y

I think I agree with a lot of the specific points raised here, but I notice a feeling of wariness/unease around the overall message. I had a similar reaction to Haydn's recent "If your model is going to sell, it has to be safe" piece. Let me try to unpack this:

On one hand, I do think safety is important for the commercial interests of labs. And broadly being better able to understand/control systems seems good from a commercial standpoint.

My biggest reservations can be boiled down into two points: 

  1. I don't think that commercial incentives will be enoug
... (read more)
2
deep
1y
My personal response would be as follows:

  1. As Leopold presents it, the key pressure here that keeps labs in check is societal constraints on deployment, not perceived ability to make money. The hope is that society's response has the following properties:
    1. thoughtful, prominent experts are attuned to these risks and demand rigorous responses
    2. policymakers are attuned to (thoughtful) expert opinion
    3. policy levers exist that provide policymakers with oversight / leverage over labs
  2. If labs are sufficiently thoughtful, they'll notice that deploying models is in fact bad for them! Can't make profit if you're dead. *taps forehead knowingly*
    1. but in practice I agree that lots of people are motivated by the tastiness of progress, pro-progress vibes, etc., and will not notice the skulls.

Counterpoints to 1: Good regulation of deployment is hard (though not impossible in my view).

  • reasonable policy responses are difficult to steer towards
  • attempts at raising awareness of AI risk could lead to policymakers getting too excited about the promise of AI while ignoring the risks
  • experts will differ; policymakers might not listen to the right experts

Good regulation of development is much harder, and will eventually be necessary. This is the really tricky one IMO. I think it requires pretty far-reaching regulations that would be difficult to get passed today, and would probably misfire a lot. But doesn't seem impossible, and I know people are working on laying groundwork for this in various ways (e.g. pushing for labs to incorporate evals in their development process).
Answer by Akash, Mar 25, 2023

I don’t know the answer, but I suggest asking this on LessWrong as well.

1
jacquesthibs
1y
Yeah, that’s where I asked first. No responses in the thread, but someone DMd me to suggest looking at universities: https://www.lesswrong.com/posts/kH2uJeQZMnEBKG9GZ/can-independent-researchers-get-a-sponsored-visa-for-the-us.

Some thoughts I had while reading that I expect you'd agree with:

  1. There is probably a lot of overlap in the kinds of interventions that (some) AI safety folks would be on board with and the kinds of interventions that (some) AI ethics folks would be on board with. For example, it seems like (many people in) both groups have concerns about the rate of AI progress and would endorse regulations/policies that promote safe/responsible AI development.
  2. Given recent developments in AI, and apparent interest in regulation that promotes safety, it seems like now might
... (read more)
1
tamgent
1y
Interesting that you don't think the post acknowledged your second collection of points. I thought it mostly did.

  1. The post did say it was not suggesting to shut down existing initiatives. So where people disagree on (for example) which evals to do, they can just do the ones they think are important and then both kinds get done. I think the post was identifying a third set of things we can do together, and this was not specific evals, but more about big narrative alliance when influencing large/important audiences. The post also suggested some other areas of collaboration, on policy and regulation, and some of these may relate to evals so there could be room for collaboration there, but I'd guess that more demand, funding, infrastructure for evals helps both kinds of evals.
  2. Again I think the post addresses this issue: it talks about how there is this specific set of things the two groups can work on together that is both in their interest to do. It doesn't mean that all people from each group will only work on this new third thing (coalition building), but if a substantial number do, it'll help. I don't think the OP was suggesting a full merger of the groups. They acknowledge the 'personal and ethical problems with one another; [and say] that needn’t translate to political issues'. The call is specifically for political coalition building.
  3. Again I don't think the OP is calling for a merger of the groups. They are calling for collaborating on something.
  4. OK the post didn't do this that much, but I don't think every post needs to and I personally really liked that this one made its point so clearly. I would read a post which responds to this with some counterarguments with interest so maybe that implies I think it'd benefit from one too, but I wouldn't want a rule/social expectation that every post lists counterarguments as that can raise the barrier to entry for posting and people are free to comment in disagreements and write counter posts.

'It's harder to maintain good epistemics and strong reasoning + reasoning transparency in large coalitions of groups who have different worldviews/goals. ("We shouldn't say X because our allies in AI ethics will think it's weird.") I don't think "X is bad for epistemics" means "we definitely shouldn't consider X", but I think it's a pretty high cost that often goes underappreciated/underacknowledged'

This is probably a real epistemic cost in my view, but it takes more than identifying a cost to establish that forming a coalition with people with different g... (read more)

3
GideonF
1y
Just quickly on that last point: I recognise there is a lot of uncertainty (hence the disclaimer at the beginning). I didn't go through the possible counterarguments because the piece was already so long! Thanks for your comment though, and I will get to the rest of it later!

Exciting! Have you considered linkposting this to LessWrong? (Some technical AIS folks and AI governance folks check LW more than EAF)

1
OscarD
1y
Thanks! I see it is posted on LW, but as a separate post: https://www.lesswrong.com/posts/uhFo7XeaRHL7b9ioh/announcing-the-era-cambridge-summer-research-fellowship (I assume it doesn't much matter that it isn't link-posted, however if you think it is valuable to fix I can ask Nandini to)

Thanks! I found this context useful. 

Thanks for this context. Is it reasonable to infer that you think that OpenAI would've got a roughly-equally-desirable investment if OP had not invested? (Such that the OP investment had basically no effect on acceleration?)

Yes that’s my position. My hope is we actually slowed acceleration by participating but I’m quite skeptical of the view that we added to it.

Personally, I see this as a misunderstanding, i.e. that OP helped OpenAI to come into existence and it might not have happened otherwise.

I think some people have this misunderstanding, and I think it's useful to address it.

With that in mind, much of the time, I don't think people who are saying "do those benefits outweigh the potential harms" are assuming that the counterfactual was "no OpenAI." I think they're assuming the counterfactual is something like "OpenAI has less money, or has to take somewhat less favorable deals with investors, or has to do som... (read more)

Unless it's a hostile situation (as might happen with public cos/activist investors), I don't think it's actually that costly. At seed stage, it's just kind of normal to give board seats to major “investors”, and you want to have a good relationship with both your major investors and your board.

The attitude Sam had at the time was less "please make this grant so that we don't have to take a bad deal somewhere else, and we're willing to 'sell' you a board seat to close the deal" and more "hey would you like to join in on this? we'd love to have you. no worries if not."

Ah, thanks for the summary of the claim. Apologies for misunderstanding this in my initial skim. 

I agree with Rameon's comment.

I appreciate the red-team of a commonly-discussed AI governance proposal. I'm confused about the section on race dynamics:

However, if it is true that the primary effect of the Windfall Clause is to concentrate money/power in the hands of AGI startup CEOs, this argument will likely not work. Rather, from the point of view of the CEOs, the spoils available to winning are now even larger than before, because you will get a massive slush fund for political pet projects, and the losers will not even get the consolation prize of their 401(k) doing so well.

How ar... (read more)

2
Larks
1y
I think Ben West correctly captured my primary argument, so I'll just address a minor subpoint.

Firstly, I'm not sure where you're getting '0.0000000000000000001%'. MSFT has a total sharecount of around 7.5bn, so if you own a single share ($248.59 at time of writing) that's over 100,000,000,000x more than you suggest. Even if you own a single share of an S&P500 ETF, and hence only a fractional share of MSFT, you're still off by many orders of magnitude.

Secondly, it's unclear which of two points in the post you're referring to, partly because instead of actually quoting a section of the post you made up a fake quote. One argument, about egalitarianism, mentions GOOGL and MSFT: This argument is a relative argument. With over half of all Americans having some exposure to the stock market, even though the distribution is far from equal, it is far less concentrated than a world where the CEO directly controls the windfall. I don't think this is a super important argument though, as I mention, because I am a lot more concerned about AI race dynamics than egalitarianism.

The second argument, about CEO incentives and race dynamics, mentions 401(k)s: I expect AGI firm CEOs, which are the people whose incentives we primarily care about, to be reasonably wealthy, either from previous entrepreneurship or just compensation from the job, so they would likely have significant wealth invested in the stock market. The "Rich and Powerful People who own way more Microsoft and Google than everyone else [and] are the ones who will disproportionately benefit" include the competitor CEOs (and VCs, management teams, etc.) whose incentives determine how bad AI races will be. This argument is not about egalitarianism or fairness or inclusion; it is simply about race dynamics. For this purpose, inequality is actually advantageous, as it increases the rewards for the most influential actors for not racing.

But I don't actually regard either of these points as the key part of

I think the claim is:

  1. No windfall clause: the CEO gets 10% of profits to spend on whatever they want (because they own 10% of the company)
  2. Windfall clause: the CEO gets 5% of profits to spend on whatever they want (because they own 10% of the company and half of the profits are distributed to shareholders) + 50% of profits to spend on charity (because 50% of profits go to the windfall, which they choose how to distribute)
  3. And maybe CEOs prefer 5% personal + 50% charity to 10% personal
  4. So the windfall clause makes racing dynamics worse

Can you say more about which longtermist efforts you're referring to?

I think a case can be made, but I don't think it's an easy (or clear) case.

My current impression is that Yudkowsky & Bostrom's writings about AGI inspired the creation of OpenAI/DeepMind. And I believe FTX invested a lot in Anthropic and OP invested a little bit (in relative terms) into OpenAI. Since then, there have been capabilities advances and safety advances made by EAs, and I don't think it's particularly clear which outweighs the other.

It seems unclear to me what the sign of these effect... (read more)

Is most of the AI capabilities work here causally downstream of Superintelligence, even if Superintelligence may have been (heavily?) influenced by Yudkowsky? Both Musk and Altman recommended Superintelligence, although Altman has also directly said Yudkowsky has accelerated timelines the most:

https://twitter.com/elonmusk/status/495759307346952192?lang=en

https://blog.samaltman.com/machine-intelligence-part-1

https://twitter.com/sama/status/1621621724507938816

If things had stayed in the LW/Rat/EA community, that might have been best. If Yudkowsky hadn't written ... (read more)

Answer by Akash, Feb 11, 2023

A lot of longtermist effort is going into AI safety at the moment. I think it's hard to make the case that something in AI safety has legibly or concretely reduced AI risk, since (a) the field is still considered quite pre-paradigmatic, (b) the risk comes from systems that are more powerful than the ones we currently have, and (c) even in less speculative fields, research often takes several years before it is shown to legibly help anyone.

But with those caveats in mind, I think:

  1. The community has made some progress in understanding possible risks and threat
... (read more)
3
Ben Snodin
1y
Thanks for these! I think my general feeling on these is that it's hard for me to tell if they actually reduced existential risk. Maybe this is just because I don't understand the mechanisms for a global catastrophe from AI well enough. (e.g. because of this, linking to Neel's longlist of theories for impact was helpful, so thank you for that!) E.g. my impression is that some people with relevant knowledge seem to think that technical safety work currently can't achieve very much.  (Hopefully this response isn't too annoying -- I could put in the work to understand the mechanisms for a global catastrophe from AI better, and maybe I will get round to this someday)

I think it's also easy to make the case that longtermist efforts have increased x-risk from artificial intelligence, with the money and talent that grew some of the biggest hype machines in AI (DeepMind, OpenAI) coming from longtermist places.

It's possible that EA has shaved a couple of counterfactual years off the time to catastrophic AGI, compared to a world where the community wasn't working on it.

+1 on questioning/interrogating opinions, even opinions of people who are "influential leaders."

I claim people who are trying to use their careers in a valuable way should evaluate organizations/opportunities for themselves

My hope is that readers don't come away with "here is the set of opinions I am supposed to believe" but rather "ah here is a set of opinions that help me understand how some EAs are thinking about the world." Thank you for making this distinction explicit.

Disagree that these are mostly characterizing the Berkeley community (#1 and #2 see... (read more)

I appreciated the part where you asked people to evaluate organizations for themselves. But it was in the context of "there are organizations that aren't very good, but people don't want to say they are failing," which to me implies that a good way to do this is to get people "in the know" to tell you if they are the failing ones or not. It implies there is some sort of secret consensus on what is failing and what isn't, and if not for the fact that people are afraid to voice their views, you could clearly know which were "failing." This could be partially t... (read more)

I'd be excited about posts that argued "I think EAs are overestimating AI x-risk, and here are some aspects of EA culture/decision-making that might be contributing to this."

I'm less excited about posts that say "X thing going on EA is bad", where X is a specific decision that EAs made [based on their estimate of AI x-risk]. (Unless the post is explicitly about AI x-risk estimates).

Related: Is that your true rejection?

Thanks for writing this, Eli. I haven't read WWOTF and was hoping someone would produce an analysis like this (especially comparing The Precipice to WWOTF).

I've seen a lot of people posting enthusiastically about WWOTF (often before reading it) and some of the press that it has been getting (e.g., cover of TIME). I've felt conflicted about this.

On one hand, it's great that EA ideas have the opportunity to reach more people.

On the other hand, I had a feeling (mostly based on quotes from newspaper articles summarizing the book) that WWOTF doesn't feature AI ... (read more)

Not yet, and I'm to blame. I've been focusing on a different project recently, which has demanded my full attention. 

Will plan to announce the winners (and make winning entries public, unless authors indicated otherwise) at some point this month.

+1. The heuristic doesn’t always work.

(Though for an intro talk I would probably just modify the heuristic to “is this the kind of intro talk that would’ve actually excited a younger version of me?”)

Thanks for writing this, Emma! Upvoted :)

Here's one heuristic I heard at a retreat several months ago: "If you're ever running an event that you are not excited to be part of, something has gone wrong."

Obviously, it's just a heuristic, but I actually found it to be a pretty useful one. I think a lot of organizers spend time hosting events that feel more like "teaching" rather than "learning together or working on interesting unsolved problems together." 

And my impression is that the groups that have fostered more of a "let's learn together and do thin... (read more)

If you're ever running an event that you are not excited to be part of, something has gone wrong

This seems way too strong to me. E.g., reasonable and effective intro talks feel like they wouldn't be much fun for me to do, yet seem likely to be high value.

9
emma-w1
2y
Thanks, Akash! I haven't thought a lot about how this might apply to larger-scale events like retreats, but this makes a lot of sense to me. Somewhat unrelatedly, I think it'd be nice to have a catalog / google doc of workshops that are all about skill-building (or "learning together or working on interesting unsolved problems together," as you put it). I felt like the Bright Futures Retreat had a lot of good examples of this. University group organizers would then have a good reference of what types of events are useful, and they can either 1) use the catalog as inspiration for spin-off workshops, 2) host workshops from the catalog as one-off events during term, or 3) compile their favorites for a retreat. This would likely save them a lot of time and diminish the amount of ops-related work they do.

Thank you for writing this, Ben. I think the examples are helpful, and I plan to read more about several of them.

With that in mind, I'm confused about how to interpret your post and how much to update on Eliezer. Specifically, I find it pretty hard to assess how much I should update (if at all) given the "cherry-picking" methodology:

Here, I’ve collected a number of examples of Yudkowsky making (in my view) dramatic and overconfident predictions concerning risks from technology.

Note that this isn’t an attempt to provide a balanced overview of Yudkows

... (read more)

I think the effect should depend on your existing view. If you've always engaged directly with Yudkowsky's arguments and chose the ones that convinced you, there's nothing to learn. If you thought he was a unique genius and always assumed you weren't convinced of things because he understood things you didn't know about, and believed him anyway, maybe it's time to dial it back. If you'd always assumed he's wrong about literally everything, it should be telling for you that OP had to go 15 years back for good examples.

Writing this comment actually helped me understand how to respond to the OP myself.

Hey, Jay! Judging is underway, and I'm planning to announce the winners within the next month. Thanks for your patience, and sorry for missing your message.

I was keen to check out the winning entries to this contest, but I'm wondering if I missed the announcement, and I can't seem to find it anywhere. Have the entries been made public somewhere?
