New & upvoted

Posts tagged community

Quick takes

One problem in AI/policy/ethics is that it seems like we sometimes have inconsistent preferences. For example, I could want to have a painting in my office and simultaneously not want to have a painting in my office. This is a problem because classical (preference) logic can’t really deal with contradictions, so it relies on the assumption that they don't exist. Since it seems like we sometimes experience them, I created a non-classical preference logic that can deal with them. Because effective altruism is all about logically satisfying preferences, I considered putting it on this forum, but decided against it since it's a bit too abstract. For those of you who are interested in logic/philosophy, you can read it here.
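The linked post isn't reproduced here, but one standard non-classical device for containing contradictions is Belnap–Dunn four-valued logic, which tracks evidence for and evidence against a claim separately. A minimal sketch (my own illustration, not necessarily the system in the linked post):

```python
# Sketch of a paraconsistent (Belnap-Dunn four-valued) treatment of
# preferences. A value tracks, independently, whether there is evidence
# FOR a preference and evidence AGAINST it, so "want a painting" and
# "don't want a painting" can coexist without every other conclusion
# following (no explosion). Illustrative only.

from typing import NamedTuple

class V(NamedTuple):
    pro: bool  # evidence supporting the preference
    con: bool  # evidence against the preference

T, F = V(True, False), V(False, True)
BOTH, NEITHER = V(True, True), V(False, False)  # contradiction / no info

def neg(a: V) -> V:
    # Negation swaps the evidence for and against.
    return V(a.con, a.pro)

def conj(a: V, b: V) -> V:
    # "a and b" is supported iff both are; opposed if either is opposed.
    return V(a.pro and b.pro, a.con or b.con)

painting = BOTH  # I want it and I don't
tidy_desk = T

# The contradiction stays contained: combining it with an unrelated
# preference doesn't poison that preference.
print(conj(painting, tidy_desk))  # V(pro=True, con=True)
print(conj(neg(painting), tidy_desk))
```

Classical logic would let `painting` and `neg(painting)` together entail anything; here the contradictory value `BOTH` just propagates locally.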
Sam Bankman-Fried's trial is scheduled to start October 3, 2023, and Michael Lewis’s book about FTX comes out the same day. My hope and expectation is that neither will be focused on EA,[1] but several people have recently asked me whether they should prepare anything, so I wanted to quickly record my thoughts.

The Forum feels like it’s in a better place to me than when FTX declared bankruptcy: the moderation team at the time was Lizka, Lorenzo, and myself, but it is now six people, and they’ve put in place a number of processes to make it easier to deal with a sudden growth in the number of heated discussions. We have also made a number of design changes, notably to the community section. CEA has also improved our communications and legal processes so we can be more responsive to news if we need to (though some of the constraints mentioned here are still applicable).

Nonetheless, I think there’s a decent chance that viewing the Forum, Twitter, or news media could become stressful for some people, and you may want to preemptively create a plan for engaging with that in a healthy way.

1. ^ This market is thinly traded but is currently predicting that Lewis’s book will not explicitly assert that Sam misused customer funds because of “ends justify the means” reasoning.
Vasili Arkhipov is discussed less on the EA Forum than Petrov is (see also this thread of less-discussed people). I thought I'd post a quick take describing that incident.

Arkhipov & the submarine B-59’s missile

On October 27, 1962 (during the Cuban Missile Crisis), the Russian diesel-powered submarine B-59 started experiencing[1] nearby depth charges from US forces above them; the submarine had been detected and US ships seemed to be attacking. The submarine’s air conditioning was broken,[2] CO2 levels were rising, and B-59 was out of contact with Moscow. Two of the senior officers on the submarine, thinking that a global war had started, wanted to launch their “secret weapon,” a 10-kiloton nuclear torpedo. The captain, Valentin Savitsky, apparently exclaimed: “We’re gonna blast them now! We will die, but we will sink them all — we will not become the shame of the fleet.”

The ship was authorized to launch the torpedo without confirmation from Moscow, but all three senior officers on the ship had to agree.[3] Chief of staff of the flotilla Vasili Arkhipov refused. He convinced Captain Savitsky that the depth charges were signals for the Soviet submarine to surface (which they were) — if the US ships really wanted to destroy the B-59, they would have done it by now. (Part of the problem seemed to be that the Soviet officers were used to different signals than the ones the Americans were using.) Arkhipov calmed the captain down[4] and got him to surface the submarine to get orders from the Kremlin, which eventually defused the situation.

(Here's a Vox article on the incident.)

The B-59 submarine.

1. ^ Vadim Orlov described the impact of the depth charges as like being inside an oil drum getting struck with a sledgehammer.
2. ^ Temperatures were apparently above 45ºC (113ºF).
3. ^ The B-59 was apparently the only submarine in the flotilla that required three officers’ approval in order to fire the “special weapon” — the others required only the captain and the political officer to approve the launch.
4. ^ From skimming some articles and first-hand accounts, it seems unclear whether the captain just had an outburst and then accurately wanted to follow protocol (and use the missile), or whether he was truly reacting irrationally/emotionally because of the incredibly stressful environment. Accounts conflict a bit, and my sense is that orders around using the missile were unclear and overly permissive (or even encouraging of using the missile).
Chevron deference is a legal doctrine that limits the ability of courts to overrule federal agencies. It's increasingly being challenged, and may be narrowed or even overturned this year. This would greatly limit the ability of, for example, a new regulatory agency on AI governance to function effectively. More:

  • This argues it would lead to regulatory chaos, and not simply deregulation.
  • This describes the Koch network's influence on Clarence Thomas. The Kochs are behind the upcoming challenge to Chevron.
A brief thought on 'operations' and how it is used in EA (a topic I find myself occasionally returning to). It struck me that operations work and non-operations work (within the context of EA) map very well onto the concept of staff and line functions. Line functions are those that directly advance an organization's core work, while staff functions are those that do not. Staff functions have advisory and support roles; they help the line functions. Staff functions are generally things like accounting, finance, public relations/communications, legal, and HR. Line functions are generally things like sales, marketing, production, and distribution.

The details will vary depending on the nature of the organization, but I find this a somewhat useful framework for bridging concepts between EA and the broader world. It also helps illustrate how little information is conveyed if I tell someone I work in operations. Imagine 'translating' that into non-EA verbiage as 'I work in a staff function.' Unless the person I am talking to already has a very good understanding of how my organization works, they won't know what I actually do.

Recent discussion



The EA Mexico Residency Fellowship marked a significant milestone in bringing together individuals committed to effective altruism (EA) worldwide, focusing on Spanish speakers and individuals from underrepresented backgrounds.  This post serves as an overview of the program's outcomes and areas for improvement. By sharing our experiences, we aim to provide valuable insights for future organizers of similar initiatives.

Quick Facts

  • 102 participants from 25 countries, comprising 11 fellows and 91 visitors. 36% of the fellows identified themselves as female or non-binary. 32% of the fellows were from LATAM.
  • Generated 714 new connections at an average cost of 371 USD per connection.
  • Weekly cost for hosting a participant was approximately 602 USD.
  • 42.5% of participants rated the fellowship 10/10.
  • 30% of respondents valued the fellowship at over $10,000.
  • 40% of participants would consider relocating to Mexico City for an established

Confidence level: I’m a computational physicist working on nanoscale simulations, so I have some understanding of most of the things discussed here, but I am not specifically an expert on the topics covered, so I can’t promise perfect accuracy.

I want to give a huge thanks to Professor Philip Moriarty of the University of Nottingham for answering my questions about the experimental side of mechanosynthesis research.


A lot of people are highly concerned that a malevolent AI or insane human will, in the near future, set out to destroy humanity. If such an entity wanted to be absolutely sure they would succeed, what method would they use? Nuclear war? Pandemics?

According to some in the x-risk community, the answer is this: The AI will invent molecular nanotechnology, and then kill...

Hey, thanks for engaging. I saved the AGI theorizing for last because it's the most inherently speculative: I am highly uncertain about it, and everyone else should be too. I would dispute that "a million superintelligences exist and cooperate with each other to invent MNT" is a likely scenario, but even given that, my guess would still be no.

The usual disclaimer that the following is all my personal guesses as a non-experimentalist and non-future-knower:

If we restrict to diamondoid, my credence would be very low, somewhere in the 0 to 10% range. The "diamondoid massively parallel builds diamondoid and everything else" process is intensely challenging: we only need one step to be unworkable for the whole thing to be kaput, and we've already identified some potential problems (tips sticking together, hydrogen hitting, etc.). With all materials available, my credence is very likely (above 95%) that something self-replicating that is more impressive than bacteria and viruses is possible, but I have no idea how impressive the limits of possibility are.

I'd agree that this is almost certain conditional on 1. To be clear, all forms of bonds are "exploiting quantum physics", in that they are low-energy configurations of electrons interacting with each other according to quantum rules. The answer to the sticky fingers problem, if there is one, will almost certainly involve the bonds we already know about, such as using weaker van der Waals forces to stick and unstick atoms, as I think is done in biology?

As for the limiting factor: in the case of the million years of superintelligences, it would probably be a long search over a gargantuan set of materials, and a gargantuan set of possible designs and approaches, to identify ones that are theoretically promising, test them with computational simulations to whittle them down, and then experimentally create each material and each approach and test them all in turn. The galaxy cluster would be able to optimize each st

With all materials available, my credence is very likely (above 95%) that something self-replicating that is more impressive than bacteria and viruses is possible, but I have no idea how impressive the limits of possibility are.

Much of the (purported) advantage of diamondoid mechanisms is that they're (meant to be) stiff enough to operate deterministically with atomic precision. Without that, you're likely to end up much closer to biological systems—transport is more diffusive, the success of any step is probabilistic, and you need a whole ecosystem of mec... (read more)

Really well-written post, thanks. I know how hard it is to explain this kind of difficult stuff, and I now understand a LOT more about nano-stuff in general than I did 20 minutes ago. And I know this isn't a very EA comment, but I can't believe that one of the main guys in this story is Professor Moriarty...

Having a high-level overview of the AI safety ecosystem seems like a good thing, so I’ve created an Anki deck to help people familiarise themselves with the 167 key organisations, projects, and programs currently active in the field.

Why Anki?

Anki is a flashcard app that uses spaced repetition to help you efficiently remember information over the long term. It’s useful for learning and memorising all kinds of things – it was the main tool I used to learn German within a year – and during the time that I’ve been testing out this deck I feel like it’s already improved my grasp of the AI safety landscape.
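For context on why spaced repetition is efficient: Anki's scheduler descends from the SM-2 algorithm, which grows the review interval multiplicatively by an "ease" factor that adjusts with each self-rating. A simplified sketch (illustrative, not Anki's actual implementation):

```python
# Minimal sketch of an SM-2-style spaced-repetition scheduler.
# Anki uses a modified SM-2; this version simplifies lapses and
# the early-review intervals for illustration.

def review(interval_days: float, ease: float, quality: int):
    """Return (new_interval, new_ease) after one review.

    quality: 0-5 self-rating; below 3 counts as a lapse.
    """
    if quality < 3:
        # Lapse: restart at 1 day and penalise the ease factor.
        return 1.0, max(1.3, ease - 0.2)
    # Standard SM-2 ease update: good answers raise ease, hesitant ones lower it.
    ease = max(1.3, ease + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    # First successful review -> 1 day; afterwards grow multiplicatively.
    interval_days = 1.0 if interval_days < 1 else interval_days * ease
    return interval_days, ease

interval, ease = 0.0, 2.5  # a brand-new card
for q in [5, 5, 4]:
    interval, ease = review(interval, ease, q)
    print(round(interval, 1), round(ease, 2))
```

The multiplicative growth is the key point: well-remembered cards quickly reach intervals of weeks or months, so daily review time concentrates on the cards you are about to forget.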

What you’ll learn

The deck is based on data from the AI Existential Safety Map run by AED – if you’re not...

Thank you for making this deck! I'm an avid Anki user and will start on this today. I'm curious about how useful decks like these would be for people. I'm going through AGISF right now and making a bunch of cards for studying and the thought of publishing them for others came up, but wasn't sure if it was worth the effort (to polish them for public use). @Bryce Robertson, any thoughts on this? Were you approached to do this or did you come up with your own reasons as to why you started this project? If this is something that would be valuable for other resources I'd be quite excited to work on this.

The idea came about because I was looking for ways I could use Anki beyond language learning and figured this could be useful, then decided that if it seems useful for me then presumably for others too.

When I told a few people I was working on this, I generally didn’t get particularly excited feedback. It seemed like this may at least to some degree be because people are sceptical as to the quality of shared decks, which is partly why I put a lot of time into making this one as well-designed as possible.

That’s also the reason I would personally be keen to ... (read more)


  • Edit: The original deadline was October 6th, 2023, but we reviewed applications on a rolling basis and – as of September 25 – we have now received and reviewed enough applications, so we've closed the role. We are sorry about this, and we hope to see your application in a future hiring round.
  • Epoch is hiring a part-time Associate Data Analyst to annotate papers informing our research and stakeholders’ policy decisions. 
  • This is a fully remote independent contractor role. Contracts will be set up as 1-month, renewable independent contractor agreements.
  • Compensation ranges from $20-$25 USD per hour, estimated at 5-25 hours of work per week.
  • This is an excellent opportunity for those interested in doing impactful work while gaining exposure to cutting-edge machine learning advancements.
  • Comment below or email if you have

We just finished hiring a data analyst for October. It's possible that we'll hire another candidate in the future, but the position is not currently taking applications.

At EAGs I often find myself having roughly the same 30 minute conversation with university students who are interested in policy careers and want to test their fit.

This post will go over two cheap tests, each possible to do over a weekend, that you can do to test your fit for policy work.

I am by no means the best person to be giving this advice, but I received feedback that my advice was helpful, and I'm not going to let go of an opportunity to act old and wise. A lot of it is based on what worked for me when I wanted to break into the field a few years ago. Get other perspectives too! Contradictory input in the comments from people with more seniority is...

This is a linkpost for

Lightcone Infrastructure (the organization that grew from and houses the LessWrong team) has just finished renovating a 7-building physical campus that we hope to use to make the future of humanity go better than it would otherwise.

We're hereby announcing that it is generally available for bookings. We offer preferential pricing for projects we think are good for the world, but to cover operating costs, we're renting out space to a wide variety of people/projects.

How do I express interest?

 What kinds of things can I run here?

  • Team Retreats (15-80 people)
    • We offer cozy nooks with firepits, discussion rooms with endless whiteboards, plus lodging. The space is very modular, and we can
This looks amazing!! Honest question though (sorry): Do people think there's a moral difference between Lightcone's campus ("house party venue...crossed with Disneyland") and EVF's abbey ("castle")? There are obviously multiple reasons why the EVF purchase looks worse at first glance:

  • Large English venues generally look more extravagant
  • EVF is - for better or worse - closely associated with ideas like frugality, charity, sacrifice, etc., while Lightcone explicitly owns their engagement in "the capitalist enterprise" for the greater good
  • Lightcone spoke about the campus prominently even before it was open for bookings, with their preferred framing and explanation, before critics had the opportunity to do it for them
  • EVF was already in some people's bad books

Many EAs also made the meta objection that bad optics should have been reason enough not to do it. But this was partly self-fulfilling the more loudly and hyperbolically the objection was voiced, and hopefully we can all agree that the extent to which optics should guide our decisions is at least a hotly contested topic in EA (rather than a scandalous signal of incompetence when an org leans one way rather than the other).

So I'm curious if community opinion is generally that the EVF purchase was "bad" and the Lightcone purchase was "good"? If so, and if the explanation is that third wave EA is too large for majority opinion to be that much more nuanced than the public's, I want to double-check that orgs are very aware of this now so that they can factor it in to future decisions. And if people think that the EVF purchase was bad for another reason, I think that would also be helpful to know.[1]

1. ^ Sorry if I've missed something obvious and important here! This is all just one person's impression of how two similar initiatives played out and it's very plausible to me that I've missed a key part of the picture.

So I'm curious if community opinion is generally that the EVF purchase was "bad" and the Lightcone purchase was "good"?

I didn't get the sense that there's a community consensus. Some people were very vocal in their outrage, but many thought the castle – as a convenient event venue that could be resold at some point later to make up part of the costs – might be totally fine, or is at least defensible even if it was a mistake.

It could be that "third wave EA" will contain a norm of accepting that people have different takes on things like that. For exampl... (read more)

Felix Wolf
2740 Telegraph Ave, Berkeley (94705), California. I also was not able to find it directly on the website.

TL;DR: I argue for two main theses:

  1. [Moderate-high confidence] It would be better to aim for a conditional pause, where a pause is triggered based on evaluations of model ability, rather than an unconditional pause (e.g. a blanket ban on systems more powerful than GPT-4).
  2. [Moderate confidence] It would be bad to create significant public pressure for a pause through advocacy, because this would cause relevant actors (particularly AGI labs) to spend their effort on looking good to the public, rather than doing what is actually good.

Since mine is one of the last posts of the AI Pause Debate Week, I've also added a section at the end with quick responses to the previous posts.

Which goals are good?

That is, ignoring tractability and just assuming that we succeed at the...

A conditional pause fails to prevent x-risk if either:

  • The AI successfully exfiltrates itself (which is what's needed to defeat rollback mechanisms) during training or evaluation, but before deployment. 
  • The AI successfully sandbags the evaluations. (Note that existing conditional pause proposals depend on capability evaluations, not alignment evaluations.)

Another way evals could fail is if they work locally but it's still too late in the relevant sense because even with the pause mechanism kicking in (e.g., "from now on, any training runs that use 0.2x... (read more)

Hm, or that we get lucky in terms of the public's response being a good one given the circumstances, even if I don't expect the discourse to be nuanced. It seems like a reasonable stance to think that a crude reaction of "let's stop this research before it's too late" is appropriate as a first step, and that it's okay to worry about other things later on. The negatives you point out are certainly significant, so if we could get a conditional pause set up through other channels, that seems clearly better! But my sense is that it's unlikely we'd succeed at getting ambitious measures in place without some amount of public pressure. (For what it's worth, I think the public pressure is already mounting, so I'm not necessarily saying we have to ramp up the advocacy side a lot – I'm definitely against forming PETA-style anti-AI movements.)

It also matters how much weight you give to person-affecting views (I've argued here for why I think they're not unreasonable). If we can delay AI takeoff for five years, that's worth a lot from the perspective of currently-existing people! (It's probably also weakly positive or at least neutral from a suffering-focused longtermist perspective, because everything seems uncertain from that perspective and a first-pass effect is delaying things from getting bigger; though I guess you could argue that particular s-risks are lower if more alignment-research-type reflection goes into AI development.)

Of course, buying a delay that somewhat (but not tremendously) worsens your chances later on is a huge cost to upside-focused longtermism. But if we assume that we're already empirically pessimistic on that view to begin with, then it's an open question how a moral parliament between worldviews would bargain things out. Certainly the upside-focused longtermist faction should get important concessions like "try to ensure that actually-good alignment research doesn't fall under the type of AI research that will be prohibited." My all-things-conside
Sounds like we roughly agree on actions, even if not beliefs (I'm less sold on fast / discontinuous takeoff than you are). As a minor note, to keep incentives good, you could pay evaluators / auditors based on how much performance they are able to elicit. You could even require that models be evaluated by at least three auditors, and split up payment between them based on their relative performances. In general it feels like there's a huge space of possibilities that has barely been explored.
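The proportional-payment idea above can be sketched in a few lines (a toy illustration with made-up numbers, not a worked-out mechanism design):

```python
# Toy sketch: split a fixed evaluation budget between auditors in
# proportion to how much capability each one managed to elicit from
# the model. Scores and budget here are invented for illustration.

def split_payment(budget: float, elicited_scores: dict[str, float]) -> dict[str, float]:
    total = sum(elicited_scores.values())
    if total == 0:
        # No one elicited anything: split evenly so auditors still get paid.
        n = len(elicited_scores)
        return {name: budget / n for name in elicited_scores}
    # Otherwise, pay each auditor in proportion to their elicited score.
    return {name: budget * score / total for name, score in elicited_scores.items()}

payments = split_payment(90_000, {"auditor_a": 0.50, "auditor_b": 0.30, "auditor_c": 0.10})
print(payments)
```

The incentive point is that each auditor's payoff rises with the capability they uncover, so understating a model's abilities costs them money, and redundancy across three auditors makes sandbagging by any one of them less decisive.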

A theory of change explicitly articulates the cause-and-effect steps for how a project or organization can turn inputs into a desired impact on the world (i.e. it’s their theory of how they’ll make a change). They generally include the following sections:

  • Inputs / activities: What the project or organization does to create change (e.g. “distribute bednets”)
  • Outputs: The tangible effects generated by the inputs (e.g. “beneficiaries have access to malaria nets”)
  • Intermediate outcomes: The outputs’ effects, including benefits for the beneficiary, (e.g. “malaria nets are used” and "reduced incidence of malaria")
  • Impact: What we’re ultimately solving, and why the intermediate outcomes matter (e.g. “lives saved”)

Best practices when crafting a theory of change (i.e. for creators):

  • Invest sufficiently in understanding the problem context (i.e. understanding the needs and incentives of the beneficiaries and other stakeholders, as well as barriers to change and the economic & political context)
  • Map the causal pathway backwards from impact to activities
  • Question every causal step (is it clear why A should cause B? how might it fail?)
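The backwards-mapping practice can be made concrete with a small data sketch: build the chain from impact back to activities, attach the assumption behind each causal link, and then read it forwards. (Names and fields here are my own illustration.)

```python
# Toy sketch of a theory of change as a causal chain, built backwards
# from impact to activities. Field names are illustrative, using the
# bednet example from this entry.
from dataclasses import dataclass

@dataclass
class Step:
    level: str        # "activity", "output", "outcome", or "impact"
    description: str
    assumption: str   # what must hold for this step to cause the next

# Start from the impact and work backwards, questioning each causal link.
chain = [
    Step("impact", "lives saved", "malaria is a leading local cause of death"),
    Step("outcome", "malaria nets are used; reduced incidence of malaria",
         "nets are used consistently and correctly"),
    Step("output", "beneficiaries have access to malaria nets",
         "nets actually reach households"),
    Step("activity", "distribute bednets", "funding and logistics are in place"),
]

# Present the chain in causal (forward) order: activities -> impact.
for step in reversed(chain):
    print(f"{step.level}: {step.description}  [assumes: {step.assumption}]")
```

Forcing each link's assumption into an explicit field is the point of the exercise: a step with no defensible `assumption` is exactly where the theory of change is likely to fail.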

Hallmarks of an excellent theory of change (i.e. for reviewers):

  • A focused suite of activities
  • The evidence and assumptions behind each step are explicitly named
  • The relative confidence of each step is clear
  • It is clear who the actor is in each step

Common mistakes to avoid in theories of change are:

  • Not making fundamental impact the goal (e.g. stopping at ‘increased immunizations’ instead of ‘improved health’)
  • Being insufficiently detailed: (a) making large leaps between each step, (b) combining multiple major outcomes into one step (e.g. ‘government introduces and enforces regulation’).
  • Setting and forgetting (instead of regularly iterating on it)
  • Not building your theory of change into a measurement plan

From: Nailing the basics – Theories of change — EA Forum, May 15.