All posts


Today and yesterday


Quick takes

In my latest post I talked about whether unaligned AIs would produce more or less utilitarian value than aligned AIs. To be honest, I'm still quite confused about why many people seem to disagree with the view I expressed, and I'm interested in engaging more to get a better understanding of their perspective. At the least, I thought I'd write a bit more about my thoughts here, and clarify my own views on the matter, in case anyone is interested in trying to understand my perspective.

The core thesis I was trying to defend is the following:

My view: It is likely that by default, unaligned AIs—AIs that humans are likely to actually build if we do not completely solve key technical alignment problems—will produce utilitarian value comparable to that produced by humans, both directly (by being conscious themselves) and indirectly (via their impacts on the world). This is because unaligned AIs will likely both be conscious in a morally relevant sense and share human moral concepts, since they will be trained on human data.

Some people seem to merely disagree with my view that unaligned AIs are likely to be conscious in a morally relevant sense. And a few others have a semantic disagreement with me in which they define AI alignment in moral terms, rather than as the ability to make an AI share the preferences of the AI's operator. But beyond these two objections, which I feel I understand fairly well, there's also significant disagreement about other questions. Based on my discussions, I've attempted to distill the following counterargument to my thesis, which I fully acknowledge does not capture everyone's views on this subject:

Perceived counter-argument: The vast majority of utilitarian value in the future will come from agents with explicitly utilitarian preferences, rather than those who incidentally achieve utilitarian objectives. At present, only a small proportion of humanity holds slightly utilitarian views. However, as unaligned AIs will differ from humans across numerous dimensions, it is plausible that they will possess negligible utilitarian impulses, in stark contrast to humanity's modest (but non-negligible) utilitarian tendencies. As a result, it is plausible that almost all value would be lost, from a utilitarian perspective, if AIs were unaligned with human preferences.

Again, I'm not sure if this summary accurately represents what people believe. However, it's what some seem to be saying. I personally think this argument is weak. But I feel I've had trouble making my views very clear on this subject, so I thought I'd try one more time to explain where I'm coming from here. Let me respond to the two main parts of the argument in some detail:

(i) "The vast majority of utilitarian value in the future will come from agents with explicitly utilitarian preferences, rather than those who incidentally achieve utilitarian objectives."

My response: I am skeptical of the notion that the bulk of future utilitarian value will originate from agents with explicitly utilitarian preferences. This clearly does not reflect our current world, where the primary sources of happiness and suffering are not the result of deliberate utilitarian planning. Moreover, I do not see compelling theoretical grounds to anticipate a major shift in this regard.

I think the intuition behind the argument here is something like this: In the future, it will become possible to create "hedonium"—matter that is optimized to generate the maximum amount of utility or well-being.
If hedonium can be created, it would likely be vastly more important than anything else in the universe in terms of its capacity to generate positive utilitarian value. The key assumption is that hedonium would primarily be created by agents who have at least some explicit utilitarian goals, even if those goals are fairly weak. Given the astronomical value that hedonium could potentially generate, even a tiny fraction of the universe's resources being dedicated to hedonium production could outweigh all other sources of happiness and suffering. Therefore, if unaligned AIs would be less likely to produce hedonium than aligned AIs (due to not having explicitly utilitarian goals), this would be a major reason to prefer aligned AI, even if unaligned AIs would otherwise generate comparable levels of value to aligned AIs in all other respects.

If this is indeed the intuition driving the argument, I think it falls short for a straightforward reason. The creation of matter-optimized-for-happiness is more likely to be driven by the far more common motives of self-interest and concern for one's inner circle (friends, family, tribe, etc.) than by explicit utilitarian goals. If unaligned AIs are conscious, they would presumably have ample motives to optimize for positive states of consciousness, even if not for explicitly utilitarian reasons.

In other words, agents optimizing for their own happiness, or the happiness of those they care about, seem likely to be the primary force behind the creation of hedonium-like structures. They may not frame it in utilitarian terms, but they will still be striving to maximize happiness and well-being for themselves and others they care about regardless. And it seems natural to assume that, with advanced technology, they would optimize pretty hard for their own happiness and well-being, just as a utilitarian might optimize hard for happiness when creating hedonium.

In contrast to the number of agents optimizing for their own happiness, the number of agents explicitly motivated by utilitarian concerns is likely to be much smaller. Yet both forms of happiness will presumably be heavily optimized. So even if explicit utilitarians are more likely to pursue hedonium per se, their impact would likely be dwarfed by the efforts of the much larger group of agents driven by more personal motives for happiness-optimization. Since both groups would be optimizing for happiness, the fact that hedonium is similarly optimized for happiness doesn't seem to provide much reason to think that it would outweigh the utilitarian value of more mundane, and far more common, forms of utility-optimization.

To be clear, I think it's totally possible that there's something about this argument that I'm missing here. And there are a lot of potential objections I'm skipping over here. But on a basic level, I mostly just lack the intuition that the thing we should care about, from a utilitarian perspective, is the existence of explicit utilitarians in the future, for the aforementioned reasons. The fact that our current world isn't well described by the idea that what matters most is the number of explicit utilitarians strengthens my point here.

(ii) "At present, only a small proportion of humanity holds slightly utilitarian views. However, as unaligned AIs will differ from humans across numerous dimensions, it is plausible that they will possess negligible utilitarian impulses, in stark contrast to humanity's modest (but non-negligible) utilitarian tendencies."
My response: Since only a small portion of humanity is explicitly utilitarian, the argument's own logic suggests that there is significant potential for AIs to be even more utilitarian than humans, given the relatively low bar set by humanity's limited utilitarian impulses. While I agree we shouldn't assume AIs will be more utilitarian than humans without specific reasons to believe so, it seems entirely plausible that factors like selection pressures for altruism could lead to this outcome. Indeed, commercial AIs seem to be selected to be nice and helpful to users, which (at least superficially) seems "more utilitarian" than the default (primarily selfish-oriented) impulses of most humans. The fact that humans are only slightly utilitarian should mean that even small forces could cause AIs to exceed human levels of utilitarianism.

Moreover, as I've said previously, it's probable that unaligned AIs will possess morally relevant consciousness, at least in part due to the sophistication of their cognitive processes. They are also likely to absorb and reflect human moral concepts as a result of being trained on human-generated data. Crucially, I expect these traits to emerge even if the AIs do not share human preferences.

To see where I'm coming from, consider how humans routinely are "misaligned" with each other, in the sense of not sharing each other's preferences, and yet still share moral concepts and a common culture. For example, an employee can share moral concepts with their employer while having very different consumption preferences from them. This picture is pretty much how I think we should primarily think about unaligned AIs that are trained on human data, and shaped heavily by techniques like RLHF or DPO.

Given these considerations, I find it unlikely that unaligned AIs would completely lack any utilitarian impulses whatsoever. However, I do agree that even a small risk of this outcome is worth taking seriously. I'm simply skeptical that such low-probability scenarios should be the primary factor in assessing the value of AI alignment research. Intuitively, I would expect the arguments for prioritizing alignment to be more clear-cut and compelling than "if we fail to align AIs, then there's a small chance that these unaligned AIs might have zero utilitarian value, so we should make sure AIs are aligned instead". If low-probability scenarios are the strongest considerations in favor of alignment, that seems to undermine the robustness of the case for prioritizing this work. While it's appropriate to consider even low-probability risks when the stakes are high, I'm doubtful that small probabilities should be the dominant consideration in this context. I think the core reasons for focusing on alignment should probably be more straightforward and less reliant on complicated chains of logic than this type of argument suggests.

In particular, as I've said before, I think it's quite reasonable to think that we should align AIs to humans for the sake of humans. In other words, I think it's perfectly reasonable to admit that solving AI alignment might be a great thing to ensure human flourishing in particular. But if you're a utilitarian, and not particularly attached to human preferences per se (i.e., you're non-speciesist), I don't think you should be highly confident that an unaligned AI-driven future would be much worse than an aligned one, from that perspective.
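As a footnote, here is the core of my response to (i) in rough symbolic terms. This is a deliberately simplified sketch: it treats per-agent optimized value as roughly comparable across the two groups, precisely because both groups would optimize hard with advanced technology.

$$V_{\text{future}} \;\approx\; N_{\text{util}} \cdot \bar{v}_{\text{util}} \;+\; N_{\text{personal}} \cdot \bar{v}_{\text{personal}}, \qquad \text{with } N_{\text{personal}} \gg N_{\text{util}} \text{ and } \bar{v}_{\text{util}} \approx \bar{v}_{\text{personal}}.$$

On these assumptions the second term dominates, which is why I don't think the presence or absence of a small population of explicit utilitarians is the main thing determining how much utilitarian value the future contains.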
JWS · 14h
Going to quickly share that I'm going to take a step back from commenting on the Forum for the foreseeable future. There are a lot of ideas in my head that I want to work into top-level posts to hopefully spur insightful and useful conversation amongst the community, and while I'll still be reading and engaging, I do have a limited amount of time I want to spend on the Forum and I think it'd be better for me to move that focus to posts rather than comments for a bit.[1] If you do want to get in touch about anything, please reach out and I'll try my very best to respond. Also, if you're going to be in London for EA Global, then I'll be around and very happy to catch up :)

1. ^ Though if it's a highly engaged/important discussion and there's an important viewpoint that I think is missing, I may weigh in
I am concerned about the H5N1 situation in dairy cows and have written an overview document to which I occasionally add new learnings (new to me or new to the world). I also set up a WhatsApp community that anyone is welcome to join for discussion & sharing news.

In brief:
* I believe quite a few (~50-250) humans have been infected recently, but there is no sustained human-to-human transmission
* I estimate the Infection Fatality Rate to be substantially lower than the ALERT team does (they give a 63% chance that the CFR is >= 10%); something like an 80% CI of 0.1-5.0%
* The government's response is astoundingly bad - I find it insane that raw milk is still being sold, with a high likelihood that some of it contains infectious H5N1
* There are still quite a few genetic barriers to sustained human-to-human transmission
* This might be a good time to push specific pandemic preparedness policies
Something I'm confused about: what is the threshold that needs meeting for the majority of people in the EA community to say something like "it would be better if EAs didn't work at OpenAI"? Imagining the following hypothetical scenarios over 2024/25, I can't confidently predict whether they'd individually cause that response within EA:

1. Ten to fifteen more OpenAI staff quit for varied and unclear reasons. No public info is gained outside of rumours
2. There is another board shakeup because senior leaders seem worried about Altman. Altman stays on
3. The Superalignment team is disbanded
4. OpenAI doesn't let the UK or US AISIs safety-test GPT-5/6 before release
5. There are strong rumours they've achieved weakly general AGI internally at end of 2025
ABishop · 12h
Much of the writing here relies on confusing metaconcepts. Preferences, intuition, etc. are not things like apples and trees. I hope the confusion will be overcome. Is there a useful metaphysics for dealing with this? Perhaps anyone interested in complex controversies in the philosophy of mathematics will know.

Past week


Quick takes

tlevin · 5d
I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable. I'm not familiar with a lot of systematic empirical evidence on either side, but it seems to me like the more effective actors in the DC establishment overall are much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy than of pushing for their ideal vision and then making concessions.

Possibly an ideal ecosystem has both strategies, but it seems possible that at least some versions of "Overton Window-moving" strategies executed in practice have larger negative effects via associating their "side" with unreasonable-sounding ideas in the minds of very bandwidth-constrained policymakers, who strongly lean on signals of credibility and consensus when quickly evaluating policy options, than the positive effects of increasing the odds of ideal policy and improving the framing for non-ideal but pretty good policies.

In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences.

Would be interested in empirical evidence on this question (ideally actual studies from the psych, political science, sociology, econ, etc. literatures, rather than specific case studies, due to reference class tennis type issues).
I worked at OpenAI for three years, from 2021 to 2024, on the Alignment team, which eventually became the Superalignment team. I worked on scalable oversight, as part of the team developing critiques as a technique for using language models to spot mistakes in other language models. I then worked to refine an idea from Nick Cammarata into a method for using language models to generate explanations for features in language models. I was then promoted to managing a team of 4 people which worked on trying to understand language model features in context, leading to the release of an open source "transformer debugger" tool. I resigned from OpenAI on February 15, 2024.
Trump recently said in an interview (https://time.com/6972973/biden-trump-bird-flu-covid/) that he would seek to disband the White House office for pandemic preparedness. Given that he usually doesn't give specifics on his policy positions, this seems like something he is particularly interested in. I know politics is discouraged on the EA forum, but I thought I would post this to say: EA should really be preparing for a Trump presidency. He's up in the polls and IMO has a >50% chance of winning the election. Right now politicians seem relatively receptive to EA ideas; this may change under a Trump administration.
Excerpt from the most recent update from the ALERT team:

> Highly pathogenic avian influenza (HPAI) H5N1: What a week! The news, data, and analyses are coming in fast and furious. Overall, ALERT team members feel that the risk of an H5N1 pandemic emerging over the coming decade is increasing. Team members estimate that the chance that the WHO will declare a Public Health Emergency of International Concern (PHEIC) within 1 year from now because of an H5N1 virus, in whole or in part, is 0.9% (range 0.5%-1.3%). The team sees the chance going up substantially over the next decade, with the 5-year chance at 13% (range 10%-15%) and the 10-year chance increasing to 25% (range 20%-30%).

Their estimated 10-year risk is a lot higher than I would have anticipated.
Not sure how to post these two thoughts so I might as well combine them.

In an ideal world, SBF should have been sentenced to thousands of years in prison. This is partially due to the enormous harm done to both FTX depositors and EA, but mainly for basic deterrence reasons; a risk-neutral person will not mind 25 years in prison if the ex ante upside was becoming a trillionaire (see the rough expected-value sketch at the end of this take).

However, I also think many lessons from SBF's personal statements, e.g. his interview on 80k, are still as valid as ever. Just off the top of my head:
* Startup-to-give as a high EV career path. Entrepreneurship is why we have OP and SFF! Perhaps also the importance of keeping as much equity as possible, although in the process one should not lie to investors or employees more than is standard.
* Ambition and working really hard as success multipliers in entrepreneurship.
* A career decision algorithm that includes doing a BOTEC and rejecting options that are 10x worse than others.
* It is probably okay to work in an industry that is slightly bad for the world if you do lots of good by donating. [1] (But fraud is still bad, of course.)

Just because SBF stole billions of dollars does not mean he has fewer virtuous personality traits than the average person. He hits at least as many multipliers as the average reader of this forum. But importantly, maximization is perilous; some particular qualities like integrity and good decision-making are absolutely essential, and if you lack them your impact could be multiplied by minus 20.

[1] The unregulated nature of crypto may have allowed the FTX fraud, but things like the zero-sum zero-NPV nature of many cryptoassets, or its negative climate impacts, seem unrelated. Many industries are about this bad for the world, like HFT or some kinds of social media. I do not think people who criticized FTX on these grounds score many points. However, perhaps it was (weak) evidence towards FTX being willing to do harm in general for a perceived greater good, which is maybe plausible especially if Ben Delo also did market manipulation or otherwise acted immorally. Also note that in the interview, SBF didn't claim his donations offset a negative direct impact; he said the impact was likely positive, which seems dubious.
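Rough expected-value sketch of the deterrence point (the success probability and the valuation of prison time below are hypothetical numbers for illustration only, not estimates about SBF): a risk-neutral actor commits the fraud roughly when

$$p \cdot V_{\text{upside}} \;>\; (1-p) \cdot C_{\text{punishment}}.$$

With $V_{\text{upside}} \sim \$10^{12}$ (becoming a trillionaire) and even a hypothetical $p = 1\%$ chance of success, the left-hand side is on the order of \$10 billion, so a 25-year sentence deters only an actor who prices 25 years in prison at more than roughly that amount. That is the sense in which punishment severity has to be extremely high before it bites for a risk-neutral actor facing an upside this large.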

Past 14 days


Quick takes

In this "quick take", I want to summarize some my idiosyncratic views on AI risk.  My goal here is to list just a few ideas that cause me to approach the subject differently from how I perceive most other EAs view the topic. These ideas largely push me in the direction of making me more optimistic about AI, and less likely to support heavy regulations on AI. (Note that I won't spend a lot of time justifying each of these views here. I'm mostly stating these points without lengthy justifications, in case anyone is curious. These ideas can perhaps inform why I spend significant amounts of my time pushing back against AI risk arguments. Not all of these ideas are rare, and some of them may indeed be popular among EAs.) 1. Skepticism of the treacherous turn: The treacherous turn is the idea that (1) at some point there will be a very smart unaligned AI, (2) when weak, this AI will pretend to be nice, but (3) when sufficiently strong, this AI will turn on humanity by taking over the world by surprise, and then (4) optimize the universe without constraint, which would be very bad for humans. By comparison, I find it more likely that no individual AI will ever be strong enough to take over the world, in the sense of overthrowing the world's existing institutions and governments by surprise. Instead, I broadly expect unaligned AIs will integrate into society and try to accomplish their goals by advocating for their legal rights, rather than trying to overthrow our institutions by force. Upon attaining legal personhood, unaligned AIs can utilize their legal rights to achieve their objectives, for example by getting a job and trading their labor for property, within the already-existing institutions. Because the world is not zero sum, and there are economic benefits to scale and specialization, this argument implies that unaligned AIs may well have a net-positive effect on humans, as they could trade with us, producing value in exchange for our own property and services. Note that my claim here is not that AIs will never become smarter than humans. One way of seeing how these two claims are distinguished is to compare my scenario to the case of genetically engineered humans. By assumption, if we genetically engineered humans, they would presumably eventually surpass ordinary humans in intelligence (along with social persuasion ability, and ability to deceive etc.). However, by itself, the fact that genetically engineered humans will become smarter than non-engineered humans does not imply that genetically engineered humans would try to overthrow the government. Instead, as in the case of AIs, I expect genetically engineered humans would largely try to work within existing institutions, rather than violently overthrow them. 2. AI alignment will probably be somewhat easy: The most direct and strongest current empirical evidence we have about the difficulty of AI alignment, in my view, comes from existing frontier LLMs, such as GPT-4. Having spent dozens of hours testing GPT-4's abilities and moral reasoning, I think the system is already substantially more law-abiding, thoughtful and ethical than a large fraction of humans. Most importantly, this ethical reasoning extends (in my experience) to highly unusual thought experiments that almost certainly did not appear in its training data, demonstrating a fair degree of ethical generalization, beyond mere memorization. It is conceivable that GPT-4's apparently ethical nature is fake. 
Perhaps GPT-4 is lying about its motives to me and in fact desires something completely different than what it professes to care about. Maybe GPT-4 merely "understands" or "predicts" human morality without actually "caring" about human morality. But while these scenarios are logically possible, they seem less plausible to me than the simple alternative explanation that alignment—like many other properties of ML models—generalizes well, in the natural way that you might similarly expect from a human.

Of course, the fact that GPT-4 is easily alignable does not immediately imply that smarter-than-human AIs will be easy to align. However, I think this current evidence is still significant, and aligns well with prior theoretical arguments that alignment would be easy. In particular, I am persuaded by the argument that, because evaluation is usually easier than generation, it should be feasible to accurately evaluate whether a slightly-smarter-than-human AI is taking bad actions, allowing us to shape its rewards during training accordingly. After we've aligned a model that's merely slightly smarter than humans, we can use it to help us align even smarter AIs, and so on, plausibly implying that alignment will scale to indefinitely higher levels of intelligence, without necessarily breaking down at any physically realistic point.

3. The default social response to AI will likely be strong: One reason to support heavy regulations on AI right now is if you think the natural "default" social response to AI will lean more heavily toward laissez faire than is optimal, i.e., by default, we will have too little regulation rather than too much. In this case, you could believe that, by advocating for regulations now, you're making it more likely that we regulate AI a bit more than we otherwise would have, pushing us closer to the optimal level of regulation.

I'm quite skeptical of this argument because I think that the default response to AI (in the absence of intervention from the EA community) will already be quite strong. My view here is informed by the base rate of technologies being overregulated, which I think is quite high. In fact, it is difficult for me to name even a single technology that I think is currently clearly underregulated by society. By pushing for more regulation on AI, I think it's likely that we will overshoot and over-constrain AI relative to the optimal level. In other words, my personal bias is towards thinking that society will regulate technologies too heavily, rather than too loosely. And I don't see a strong reason to think that AI will be any different from this general historical pattern. This makes me hesitant to push for more regulation on AI, since on my view, the marginal impact of my advocacy would likely be to push us even further in the direction of "too much regulation", overshooting the optimal level by even more than what I'd expect in the absence of my advocacy.

4. I view unaligned AIs as having comparable moral value to humans: This idea was explored in one of my most recent posts. The basic idea is that, under various physicalist views of consciousness, you should expect AIs to be conscious, even if they do not share human preferences. Moreover, it seems likely that AIs — even ones that don't share human preferences — will be pretrained on human data, and therefore largely share our social and moral concepts.
Since unaligned AIs will likely be both conscious and share human social and moral concepts, I don't see much reason to think of them as less "deserving" of life and liberty, from a cosmopolitan moral perspective. They will likely think similarly to the way we do across a variety of relevant axes, even if their neural structures are quite different from our own. As a consequence, I am pretty happy to incorporate unaligned AIs into the legal system and grant them some control of the future, just as I'd be happy to grant some control of the future to human children, even if they don't share my exact values.

Put another way, I view (what I perceive as) the EA attempt to privilege "human values" over "AI values" as being largely arbitrary and baseless, from an impartial moral perspective. There are many humans whose values I vehemently disagree with, but I nonetheless respect their autonomy, and do not wish to deny these humans their legal rights. Likewise, even if I strongly disagreed with the values of an advanced AI, I would still see value in their preferences being satisfied for their own sake, and I would try to respect the AI's autonomy and legal rights. I don't have a lot of faith in the inherent kindness of human nature relative to a "default unaligned" AI alternative.

5. I'm not fully committed to longtermism: I think AI has an enormous potential to benefit the lives of people who currently exist. I predict that AIs can eventually substitute for human researchers, and thereby accelerate technological progress, including in medicine. In combination with my other beliefs (such as my belief that AI alignment will probably be somewhat easy), this view leads me to think that AI development will likely be net-positive for people who exist at the time of alignment. In other words, if we allow AI development, it is likely that we can use AI to reduce human mortality, and dramatically raise human well-being for the people who already exist.

I think these benefits are large and important, and commensurate with the downside potential of existential risks. While a fully committed strong longtermist might scoff at the idea that curing aging might be important — as it would largely only have short-term effects, rather than long-term effects that reverberate for billions of years — by contrast, I think it's really important to try to improve the lives of people who currently exist. Many people view this perspective as a form of moral partiality that we should discard for being arbitrary. However, I think morality is itself arbitrary: it can be anything we want it to be. And I choose to value currently existing humans, to a substantial (though not overwhelming) degree.

This doesn't mean I'm a fully committed near-termist. I sympathize with many of the intuitions behind longtermism. For example, if curing aging required raising the probability of human extinction by 40 percentage points, or something like that, I don't think I'd do it. But in more realistic scenarios that we are likely to actually encounter, I think it's plausibly a lot better to accelerate AI, rather than delay AI, on current margins. This view simply makes sense to me given the enormously positive effects I expect AI will likely have on the people I currently know and love, if we allow development to continue.
First in-ovo sexing in the US: Egg Innovations announced that they are "on track to adopt the technology in early 2025." Approximately 300 million male chicks are ground up alive in the US each year (since only female chicks are valuable) and in-ovo sexing would prevent this.

UEP originally promised to eliminate male chick culling by 2020; needless to say, they didn't keep that commitment. But better late than never! Congrats to everyone working on this, including @Robert - Innovate Animal Ag, who founded an organization devoted to pushing this technology.[1]

1. ^ Egg Innovations says they can't disclose details about who they are working with for NDA reasons; if anyone has more information about who deserves credit for this, please comment!
harfe · 12d
Consider donating all or most of your Mana on Manifold to charity before May 1. Manifold is making multiple changes to the way Manifold works. You can read their announcement here. The main reason for donating now is that Mana will be devalued from the current 1 USD:100 Mana to 1 USD:1000 Mana on May 1. Thankfully, the 10k USD/month charity cap will not be in place until then. Also this part might be relevant for people with large positions they want to sell now: > One week may not be enough time for users with larger portfolios to liquidate and donate. We want to work individually with anyone who feels like they are stuck in this situation and honor their expected returns and agree on an amount they can donate at the original 100:1 rate past the one week deadline once the relevant markets have resolved.
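To make the devaluation concrete, here is the arithmetic at the two exchange rates from the announcement (the 10,000 Mana balance is just an example figure):

$$\frac{10{,}000\ \text{Mana}}{100\ \text{Mana per USD}} = \$100 \ \text{(donated before May 1)} \qquad \text{vs.} \qquad \frac{10{,}000\ \text{Mana}}{1{,}000\ \text{Mana per USD}} = \$10 \ \text{(donated after May 1)}$$

So the same Mana balance is worth ten times as much to charity if donated before the devaluation.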
GiveWell and Open Philanthropy just made a $1.5M grant to Malengo! Congratulations to @Johannes Haushofer and the whole team; this seems like such a promising intervention from a wide variety of views
CEA is hiring for someone to lead the EA Global program. CEA's three flagship EAG conferences facilitate tens of thousands of highly impactful connections each year that help people build professional relationships, apply for jobs, and make other critical career decisions. This is a role that comes with a large amount of autonomy, and one that plays a central role in shaping a key piece of the effective altruism community's landscape. See more details and apply here!

Past 31 days


Quick takes

Please people, do not treat Richard Hanania as some sort of worthy figure who is a friend of EA. He was a Nazi, and whilst he claims he moderated his views, he is still very racist as far as I can tell.

Hanania called for trying to get rid of all non-white immigrants in the US and for the sterilization of everyone with an IQ under 90, indulged in antisemitic attacks on the allegedly Jewish elite, and even post his reform was writing about the need for the state to harass and imprison Black people specifically ('a revolution in our culture or form of government. We need more policing, incarceration, and surveillance of black people' https://en.wikipedia.org/wiki/Richard_Hanania). Yet in the face of this, and after he made an incredibly grudging apology about his most extreme stuff (after journalists dug it up), he's been invited to Manifold's events and put on Richard Yetter Chappell's blogroll.

DO NOT DO THIS. If you want people to distinguish benign transhumanism (which I agree is a real thing*) from the racist history of eugenics, do not fail to shun actual racists and Nazis. Likewise, if you want to promote "decoupling" factual beliefs from policy recommendations, which can be useful, do not duck and dive around the fact that virtually every major promoter of scientific racism ever, including allegedly mainstream figures like Jensen, worked with or published with actual literal Nazis (https://www.splcenter.org/fighting-hate/extremist-files/individual/arthur-jensen).

I love most of the people I have met through EA, and I know that - despite what some people say on twitter - we are not actually a secret crypto-fascist movement (nor is longtermism specifically, which whether you like it or not, is mostly about what its EA proponents say it is about). But there is in my view a disturbing degree of tolerance for this stuff in the community, mostly centered around the Bay specifically. And to be clear, I am complaining about tolerance for people with far-right and fascist ("reactionary" or whatever) political views, not people with any particular personal opinion on the genetics of intelligence. A desire for authoritarian government enforcing the "natural" racial hierarchy does not become okay just because you met the person with the desire at a house party and they seemed kind of normal and chill or super-smart and nerdy.

I usually take a way more measured tone on the forum than this, but here I think real information is given by getting shouty.

*Anyone who thinks it is automatically far-right to think about any kind of genetic enhancement at all should go read some Culture novels, and note the implied politics (or indeed, look up the author's actual die-hard libertarian socialist views). I am not claiming that far-left politics is innocent, just that it is not racist.
Animal Justice Appreciation Note

Animal Justice et al. v A.G of Ontario 2024 was recently decided and struck down large portions of Ontario's ag-gag law. A blog post is here. The suit was partially funded by ACE, which presumably means that many of the people reading this deserve partial credit for donating to support it. Thanks to Animal Justice (Andrea Gonsalves, Fredrick Schumann, Kaitlyn Mitchell, Scott Tinney), co-applicants Jessica Scott-Reid and Louise Jorgensen, and everyone who supported this work!
Why are April Fools jokes still on the front page? On April 1st, you expect to see April Fools' posts and know you have to be extra cautious when reading strange things online. However, April 1st was 13 days ago and there are still two April Fools posts on the front page. I think it should be clarified that they are April Fools jokes so people can differentiate EA weird stuff from EA weird stuff that's a joke more easily. Sure, if you check the details you'll see that things don't add up, but we all know most people just read the title or first few paragraphs.
Marcus Daniell appreciation note

@Marcus Daniell, cofounder of High Impact Athletes, came back from knee surgery and is donating half of his prize money this year. He projects raising $100,000. Through a partnership with Momentum, people can pledge to donate for each point he gets; he has raised $28,000 through this so far. It's cool to see this, and I'm wishing him luck for his final year of professional play!
harfe · 18d
FHI shut down yesterday: https://www.futureofhumanityinstitute.org/

Since March 1st


Quick takes

You can now import posts directly from Google docs

Plus, internal links to headers[1] will now be mapped over correctly. To import a doc, make sure it is public or shared with "eaforum.posts@gmail.com"[2], then use the widget on the new/edit post page.

Importing a doc will create a new (permanently saved) version of the post, but will not publish it, so it's safe to import updates into posts that are already published. You will need to click the "Publish Changes" button to update the live post.

Everything that previously worked on copy-paste[3] will also work when importing, with the addition of internal links to headers (which only work when importing). There are still a few things that are known not to work:
* Nested bullet points (these are working now)
* Cropped images get uncropped
* Bullet points in footnotes (these will become separate un-bulleted lines)
* Blockquotes (there isn't a direct analog of this in Google docs unfortunately)

There might be other issues that we don't know about. Please report any bugs or give any other feedback by replying to this quick take; you can also contact us in the usual ways.

Appendix: Version history

There are some minor improvements to the version history editor[4] that come along with this update:
* You can load a version into the post editor without updating the live post; previously you could only hard-restore versions
* The version that is live[5] on the post is shown in bold

Here's what it would look like just after you import a Google doc, but before you publish the changes. Note that the latest version isn't bold, indicating that it is not showing publicly.

1. ^ Previously the link would take you back to the original doc, now it will take you to the header within the Forum post as you would expect. Internal links to bookmarks (where you link to a specific text selection) are also partially supported, although the link will only go to the paragraph the text selection is in
2. ^ Sharing with this email address means that anyone can access the contents of your doc if they have the url, because they could go to the new post page and import it. It does mean they can't access the comments at least
3. ^ I'm not sure how widespread this knowledge is, but previously the best way to copy from a Google doc was to first "Publish to the web" and then copy-paste from this published version. In particular this handles footnotes and tables, whereas pasting directly from a regular doc doesn't. The new importing feature should be equal to this publish-to-web copy-pasting, so will handle footnotes, tables, images etc. And then it additionally supports internal links
4. ^ Accessed via the "Version history" button in the post editor
5. ^ For most intents and purposes you can think of "live" as meaning "showing publicly". There is a bit of a sharp corner in this definition, in that the post as a whole can still be a draft. To spell this out: There can be many different versions of a post body, only one of these is attached to the post, this is the "live" version. This live version is what shows on the non-editing view of the post. Independently of this, the post as a whole can be a draft or published.
akash · 1mo
David Nash's Monthly Overload of Effective Altruism seems highly underrated, and you should most probably give it a follow. I don't think any other newsletter captures and highlights EA's cause-neutral impartial beneficence better than the Monthly Overload of EA. For example, this month's newsletter has updates about Conferences, Virtual Events, Meta-EA, Effective Giving, Global Health and Development, Careers, Animal Welfare, Organization updates, Grants, Biosecurity, Emissions & CO2 Removal, Environment, AI Safety, AI Governance, AI in China, Improving Institutions, Progress, Innovation & Metascience, Longtermism, Forecasting, Miscellaneous causes and links, Stories & EA Around the World, Good News, and more. Compiling all this must be hard work! Until September 2022, the monthly overloads were also posted on the Forum and received higher engagement than the Substack. I find the posts super informative, so I am giving the newsletter a shout-out and putting it back on everyone's radar!
Jason · 2mo
The government's sentencing memorandum for SBF is here; it is seeking a sentence of 40-50 years. As typical for DOJ in high-profile cases, it is well-written and well-done. I'm not just saying that because it makes many of the same points I identified in my earlier writeup of SBF's memorandum. E.g., p. 8 ("doubling down" rather than walking away from the fraud); p. 43 ("paid in full" claim is highly misleading) [page cites to numbers at bottom of page, not to PDF page #].

EA-adjacent material: There's a snarky reference to SBF's charitable donations "(for which he still takes credit)" (p. 2) in the intro, and the expected hammering of SBF's memo for attempting to take credit for donations paid with customer money (p. 95). There's a reference to SBF's "idiosyncratic . . . beliefs around altruism, utilitarianism, and expected value" (pp. 88-89). This leads to the one surprise theme (for me): the need to incapacitate SBF from committing additional crimes (pp. 87, 90). Per the feds, "the defendant believed and appears still to believe that it is rational and necessary for him to take great risks including imposing those risks on others, if he determines that it will serve what he personally deems a worthy project or goal," which contributes to his future dangerousness (p. 89).

For predictors: Looking at sentences where the loss was > $100MM and the method was Ponzi/misappropriation/embezzlement, there's a 20-year, two 30-years, a bunch of 40-years, three 50-years, and three 100+-years (pp. 96-97).

Interesting item: The government has gotten about $3.45MM back from political orgs, and the estate has gotten back ~$280K (pp. 108-09). The proposed forfeiture order lists recipients, and seems to tell us which ones returned monies to the government (Proposed Forfeiture Order, pp. 24-43).

Life Pro Tip: If you are arrested by the feds, do not subsequently write things in Google Docs that you don't want the feds to bring up at your sentencing. Jotting down the idea that "SBF died for our sins" as some sort of PR idea (p. 88; source here) is particularly ill-advised.

My Take: In Judge Kaplan's shoes, I would probably sentence at the high end of the government's proposed range. Where the actual loss will likely be several billion, and the loss would have been even greater under many circumstances, I don't think a consequence of less than two decades' actual time in prison would provide adequate general deterrence -- even where the balance of other factors was significantly mitigating. That would imply a sentence of ~25 years after a prompt guilty plea. Backsolving, that gets us a sentence of ~35 years without credit for a guilty plea. But the balance of other factors is aggravating, not mitigating. Stealing from lots of ordinary people is worse than stealing from sophisticated investors. Outright stealing by someone in a fiduciary role is worse than accounting fraud to manipulate stock prices. We also need to adjust upward for SBF's post-arrest conduct, including trying to hide money from the bankruptcy process, multiple attempts at witness tampering, and perjury on the stand. Stacking those factors would probably take me over 50 years, but like the government I don't think a likely-death-in-prison sentence is necessary here.
Jason · 2mo
SBF's sentencing memorandum is here. On the first page of the intro, we get some quotes about SBF's philanthropy. On the next, we are told he "lived a very modest life" and that any reports of extravagance are fabrications. [N.B.: All page citations are to the typed numbers at the bottom, not to the PDF page number.]

For the forecasters: based on Guidelines calculations in the pre-sentence report (PSR) by Probation, the Guidelines range is 110 years (would be life, but is capped at the statutory max). Probation recommended 100 years. PSRs are sealed, so we'll never see the full rationale on that. The average fraud defendant with a maxed-out offense level, no criminal history, and no special cooperation credit receives a sentence of 283 months.

The first part of the memo is about the (now advisory) Sentencing Guidelines used in federal court. The major argument is that there should be no upward adjustment for loss because everyone is probably getting all their money back. Courts have traditionally looked at the greater of actual or "intended" loss, but the memo argues that isn't correct after a recent Supreme Court decision.

As a factual matter, I'm skeptical that the actual loss is $0, especially where much of the improvement is due to increases in the crypto market that customers would have otherwise benefitted from directly. Plus everyone getting money back (including investors who were defrauded) is far from a certain outcome, the appellate courts have been deferential to best-guess loss calculations, and the final Guidelines range would not materially change if the loss amount were (say) $25MM. If I'm the district judge here, I'd probably include some specific statements and findings in my sentencing monologue in an attempt to insulate this issue from appeal, such as: I'd impose the same sentence no matter what the Guidelines said, because $0 dramatically understates the offense severity and $10B overstates it.

There are a few places in which the argument ventures into tone-deaf waters. The argument that SBF wasn't in a position of public or private trust (pp. 25-26) seems awfully weak and ill-advised to my eyes. The discussion of possible equity-holder losses (pp. 20-21) also strikes me as dismissive at points. No, equity holders don't get a money-back guarantee, but they are entitled to not be lied to when deciding where to invest.

The second half involves a discussion of the so-called 3553 factors that a court must consider in determining a sentence that is sufficient, but not greater than necessary. Pages 41-42 discuss Peter Singer and earning to give, while pages 46-50 continue on about SBF's philanthropy (including a specific reference to GWWC and a link to the pledger list on page 46).

Throughout the memo, the defense asserts that FTX was different from various other schemes that were fraudulent from day one (e.g., p. 56). My understanding is that the misuse of customer funds started pretty early in FTX's history, so I don't give this much weight. The memo asserts that SBF was less culpable than various comparators, ranging from Madoff himself (150 years) to Elizabeth Holmes (135 months) (pp. 73-80). The bottom-line request is for a sentence of 63-78 months, which is the Guidelines range if one accepts the loss amount as $0 (p. 89).

There are 29 letters in SBF's support by family members, his psychiatrist, Ross Rheingans-Yoo, Kat Woods, and a bunch of names I don't recognize.

[Caution: The remainder of this post contains more opinion-laden commentary than what has preceded it!]
I generally find the 3553 discussion unpersuasive. The section on "remorse" (pp. 55-56) rings hollow to me, although this is an unavoidable consequence of SBF's trial litigation choices. There is "remorse" that people were injured and impliedly that SBF made various errors in judgment, but there isn't any acknowledgment of wrongdoing. One sentence of note to this audience: "Sam is simply devastated that the advice, mentorship, and funding that he has given to the animal welfare, global poverty, and pandemic prevention movements does not begin to counteract the damage done to them by virtue of their association with him." (p. 55).

I find the discussion of the FTX Foundation to be jarring, such as "Ultimately, the FTX Foundation donated roughly $150 million to charities working on issues such as pandemic prevention, animal welfare, and funding anti-malarial mosquito netting in Africa." (p. 57). Attempting to take credit for sending some of the money you took from customers to charity takes a lot of nerve!

Although the memo asserts that SBF's neurodiversity makes him "uniquely vulnerable" in prison (p. 58), the unfortunate truth is that many convicted criminals have characteristics that make successfully adapting to prison life more difficult than for the average person (e.g., severe mental illness, unusually low intelligence). So I'm not convinced by the memo that he would face an atypical burden that would warrant serious consideration in sentencing. Although I certainly can't fault counsel for pointing to SBF's positive characteristics, I'm sure Judge Kaplan knows that many of his opportunities to legibly display these characteristics have been enabled by privilege that most people being sentenced in federal court do not have.

I'm also not generally convinced by the arguments about general deterrence. In abbreviated form, the argument is that running SBF through the wringer and exposing him to public disgrace is strong enough that a lower sentence (and the inevitable lifetime public stigma) suffices to deter other would-be fraudsters. See pp. 66-67. And there's good evidence that severity of punishment is relatively less important in deterrence. However, if a tough sentence is otherwise just, I don't think we need a high probability of deterrent effect for an extremely serious offense for extended incarceration to be worth it. Crypto scams are common, and as a practical matter it is difficult to increase certainty and speed of punishment because so much of the problematic conduct happens outside the U.S. So severity is the lever the government has. Moreover, discounts for offenders who have a lot to lose (because they are wealthy already) and/or are seen as having more future productive value seem backward as far as moral desert.

Finally, I think there's potentially value in severity deterrence of someone already committing fraud; if the punishment level is basically maxed out at the $500M of customer money you've already put at risk, there is no reason (other than an increased risk of detection) not to put $5B at risk. As the saying goes, "might as well be hanged for a sheep as for a lamb": the penalty for ovine theft was death either way.

Defense recommendations on sentencing are generally unrealistic in cases without specific types of plea deal. This one is no different. Also, the sentencing discussion will sound extremely harsh to at least non-US readers . . . but that's the situation in the US and especially in the federal system.
I'd note that SBF's post-arrest decisions will likely have triggered a substantial portion of his sentence. Much has been written about the US "trial penalty," and it is often a problem. However, I don't think a discount of ~25-33% for a prompt guilty plea, as implied by the Guidelines for most offenses (and also by the guidelines used in England and Wales), is penologically unjustified or coercive. Instead of that, SBF's sentence is likely to be higher because of multiple attempts at witness tampering and evasive, incredible testimony on the stand. So he could be looking at ~double the sentence he would be facing if he had pled guilty.

He likely could not have gotten the kind of credit for cooperation his co-conspirators received (a "5K.1") because there was no other big fish to rat out. Providing 5K.1 cooperation often reduces sentences by a lot, in my opinion often too much. Given the 5K.1 cooperation and the lesser role, one must exercise caution in using co-conspirator sentences to estimate what SBF would have received if he had promptly accepted responsibility.

Finally, I'd view any sentence over ~60 years as de facto life and chosen more for symbolic purposes than to actually increase punishment. Currently, one can receive a ~15% discount for decent behavior in prison, and can potentially serve ~25-33% of the sentence in a halfway house or the like for participating in programs under the First Step Act. It's hard to predict what the next few decades will bring as far as sentencing policy, but the recent trend has been toward expanding the possibilities for early release. So I'd estimate that SBF will actually serve ~75% of his sentence, and probably some portion of it outside of a prison.
If anyone wants to see what making EA enormous might look like, check out Rutger Bregman's School for Moral Ambition (SMA). It isn't an EA project (and his accompanying book has a chapter on EA that is quite critical), but the inspiration is clear and I'm sure there will be things we can learn from it. For their pilot, they're launching in the Netherlands, but it's already pretty huge, and they have plans to launch in the UK and the US next year.

To give you an idea of size, despite the official launch being only yesterday, their growth on LinkedIn is significant. For the 90 days preceding the launch date, they added 13,800 followers (their total is now 16,300). The two EA orgs with the biggest LinkedIn presence I know of are 80k and GWWC. In the same period, 80k gained 1,200 followers (their total is now 18,400), and GWWC gained 700 (their total is now 8,100).[1] And it's not like SMA has been spamming the post button. They only posted 4 times. The growth in followers comes from media coverage and the founding team posting about it on their personal LinkedIn pages (Bregman has over 200k followers).

1. ^ EA Netherlands gained 137, giving us a total of 2,900 - wooo!
