All posts

New & upvoted

Today and yesterday

Frontpage Posts

Quick takes

We should expect the incentives and culture of AI-focused companies to make them uniquely terrible for producing safe AGI.
From a “safety from catastrophic risk” perspective, I suspect an “AI-focused company” (e.g. Anthropic, OpenAI, Mistral) is abstractly pretty close to the worst possible organizational structure for getting us towards AGI. I have two distinct but related reasons:
1. Incentives
2. Culture
From an incentives perspective, consider realistic alternative organizational structures to “AI-focused company” that nonetheless have enough firepower to host successful multibillion-dollar scientific/engineering projects:
1. As part of an intergovernmental effort (e.g. CERN’s Large Hadron Collider, the ISS)
2. As part of a governmental effort of a single country (e.g. Apollo Program, Manhattan Project, China’s Tiangong)
3. As part of a larger company (e.g. Google DeepMind, Meta AI)
In each of those cases, I claim that there are stronger (though still not ideal) organizational incentives to slow down, pause/stop, or roll back deployment if there is sufficient evidence or reason to believe that further development can result in major catastrophe. In contrast, an AI-focused company has every incentive to go ahead on AI when the case for pausing is uncertain, and minimal incentive to stop or even take things slowly.
From a culture perspective, I claim that without knowing any details of the specific companies, you should expect AI-focused companies to be more likely than plausible contenders to have the following cultural elements:
1. Ideological AGI Vision: AI-focused companies may have a large contingent of “true believers” who are ideologically motivated to make AGI at all costs.
2. No Pre-existing Safety Culture: AI-focused companies may have minimal or no strong “safety” culture where people deeply understand, have experience in, and are motivated by a desire to avoid catastrophic outcomes.
The first one should be self-explanatory.
The second one is a bit more complicated, but basically I think it’s hard to have a safety-focused culture just by “wanting it” hard enough in the abstract, or by talking a big game. Instead, institutions (relatively) have more of a safe & robust culture if they have previously suffered the (large) costs of not focusing enough on safety. For example, engineers who aren’t software engineers understand fairly deep down that their mistakes can kill people, and that their predecessors’ fuck-ups have indeed killed people (think bridges collapsing, airplanes falling, medicines not working, etc). Software engineers rarely have such experience. Similarly, governmental institutions have institutional memories of major historical fuckups, in a way that new startups very much don’t.
I expect (~ 75%) that the decision to "funnel" EAs into jobs at AI labs will become a contentious community issue in the next year. I think that over time more people will think it is a bad idea. This may have PR and funding consequences too.
The following is a collection of long quotes from Ozy Brennan's post On John Woolman (which I stumbled upon via Aaron Gertler) that spoke to me. Woolman was clearly what David Chapman would call mission-oriented with respect to meaning of and purpose in life; Chapman argues instead for what he calls "enjoyable usefulness", which is I think healthier in ~every way ... it just doesn't resonate. All bolded text is my own emphasis, not Ozy's. ---------------------------------------- > As a child, Woolman experienced a moment of moral awakening: ... [anecdote] > > This anecdote epitomizes the two driving forces of John Woolman’s personality: deep compassion and the refusal to ever cut himself a moment of slack. You might say “it was just a bird”; you might say “come on, Woolman, what were you? Ten?” Woolman never thought like that. It was wrong to kill; he had killed; that was all there was to say about it. > > When Woolman was a teenager, the general feeling among Quakers was that they were soft, self-indulgent, not like the strong and courageous Quakers of previous generations, unlikely to run off to Massachusetts to preach the Word if the Puritans decided once again to torture Quakers for their beliefs, etc. Woolman interpreted this literally. He spent his teenage years being like “I am depraved, I am evil, I have not once provoked anyone into whipping me to death, I don’t even want to be whipped to death.” > > As a teenager, Woolman fell in with a bad crowd and committed some sins. What kind of sins? I don’t know. Sins. He's not telling us: > > > “I hastened toward destruction,” he writes. “While I meditate on the gulf toward which I travelled … I weep; mine eye runneth down with water.” > > In actuality, Woolman’s corrupting friends were all... Quakers who happened to be somewhat less strict than he was. We have his friends' diaries and none of them remarked on any particular sins committed in this period. 
Biographers have speculated that Woolman was part of a book group and perhaps the great sin he was reproaching himself for was reading nonreligious books. He may also have been reproaching himself for swimming, skating, riding in sleighs, or drinking tea. > > Woolman is so batshit about his teenage wrongdoing that many readers have speculated about the existence of different, non-Quaker friends who were doing all the sins. However, we have no historical evidence of him having other friends, and we have a fuckton of historical evidence of Woolman being extremely hard on himself about minor failings (or “failings”). > > Most people who are Like That as teenagers grow out of it. Woolman didn’t. He once said something dumb in Weekly Meeting and then spent three weeks in a severe depression about it. He never listened to nonreligious music, read fiction or newspapers, or went to plays. He once stormed down to a tavern to tell the tavern owner that celebrating Christmas was sinful. ---------------------------------------- > ... if Woolman were just an 18th century neurotic, no one would remember him. We care about him because of his attitude about slavery.  > > When Woolman was 21, his employer asked him to write a bill of sale for an enslaved woman. Woolman knew it was wrong. But his employer told him to and he was scared of being fired. Both Woolman’s employer and the purchaser were Quakers themselves, so surely if they were okay with it it was okay. Woolman told both his master and the purchaser that he thought that Christians shouldn't own enslaved people, but he wrote the bill. > > After he wrote the bill of sale Woolman lost his inner peace and never really recovered it. He spent the rest of his life struggling with guilt and self-hatred. He saw himself as selfish and morally deficient. ... > > Woolman worked enough to support himself, but the primary project of his life was ending slavery. 
He wrote pamphlet after pamphlet making the case that slavery was morally wrong and unbiblical. He traveled across America making speeches to Quaker Meetings urging them to oppose slavery. He talked individually with slaveowners, both Quaker and not, which many people criticized him for; it was “singular”, and singular was not okay. ...  > > It is difficult to overstate how much John Woolman hated doing anti-slavery activism. For the last decade of his life, in which he did most of his anti-slavery activities, he was clearly severely depressed. ... Partially, he hated the process of traveling: the harshness of life on the road; being away from his family; the risk of bringing home smallpox, which terrified him.  > > But mostly it was the task being asked of Woolman that filled him with grief. Woolman was naturally "gentle, self-deprecating, and humble in his address", but he felt called to harshly condemn slaveowning Quakers. All he wanted was to be able to have friendly conversations with people who were nice to him. But instead, he felt, God had called him to be an Old Testament prophet, thundering about God’s judgment and the need for repentance. ...  > > Woolman craved approval from other Quakers. But even Quakers personally opposed to slavery often thought that Woolman was making too big a deal about it. There were other important issues. Woolman should chill. His singleminded focus on ending slavery was singular, and being singular was prideful. Isn’t the real sin how different Woolman’s abolitionism made him from everyone else? > > Sometimes he persuaded individual people to free their slaves, but successes were few and far between. Mostly, he gave speeches and wrote pamphlets as eloquently as he could, and then his audience went “huh, food for thought” and went home and beat the people they’d enslaved. Nothing he did had any discernible effect. > > ... Woolman spent much of his time feeling like a failure. 
If he were better, if he followed God’s will more closely, if he were kinder and more persuasive and more self-sacrificing, then maybe someone would have lived free who now would die a slave, because Woolman wasn’t good enough. The modern version of this is probably what Thomas Kwa wrote about here: > I think that many people new to EA have heard that multipliers like these exist, but don't really internalize that all of these multipliers stack multiplicatively. ... If she misses one of these multipliers, say the last one, ... Ana is losing out on 90% of her potential impact, consigning literally millions of chickens to an existence worse than death. To get more than 50% of her maximum possible impact, Ana must hit every single multiplier. This is one way that reality is unforgiving.  ---------------------------------------- > From one perspective, Woolman was too hard on himself about his relatively tangential connection to slavery. From another perspective, he is one of a tiny number of people in the eighteenth century who has a remotely reasonable response to causing a person to be in bondage when they could have been free. Everyone else flinched away from the scale of the suffering they caused; Woolman looked at it straight. Everyone else thought of slaves as property; Woolman alone understood they were people. > > Some people’s high moral standards might result in unproductive self-flagellation and the refusal to take actions because they might do something wrong. But Woolman derived strength and determination from his high moral standards. When he failed, he regretted his actions and did his best to change them. At night he might beg God to fucking call someone else, but the next morning he picked up his walking stick and kept going. > > And the thing he was doing mattered. Quaker abolitionism wasn’t inevitable; it was the result of hard work by specific people, of whom Woolman was one of the most prominent. 
If Woolman were less hard on himself, many hundreds if not thousands of free people would instead have been owned things that could beaten or raped or murdered with as little consequence as I experience from breaking a laptop. ---------------------------------------- An aside (doubling as warning) on mission orientation, quoting Tanner Greer's Questing for Transcendence: > ... out of the lands I’ve lived and roles I’ve have donned, none blaze in my memory like the two years I spent as a missionary for the Church of Jesus Christ. It is a shame that few who review my resume ask about that time; more interesting experiences were packed into those few mission years than in the rest of the lot combined. ... I doubt I shall ever experience anything like it again. I cannot value its worth. I learned more of humanity’s crooked timbers in the two years I lived as missionary than in all the years before and all the years since. > > Attempting to communicate what missionary life is like to those who have not experienced it themselves is difficult. ... Yet there is one segment of society that seems to get it. In the years since my service, I have been surprised to find that the one group of people who consistently understands my experience are soldiers. In many ways a Mormon missionary is asked to live something like a soldier... [they] spend years doing a job which is not so much a job as it is an all-encompassing way of life.  > > The last point is the one most salient to this essay. It is part of the reason both many ex-missionaries (known as “RMs” or “Return Missionaries” in Mormon lingo) and many veterans have such trouble adapting to life when they return to their homes. ... Many RMs report a sense of loss and aimlessness upon returning to “the real world.” They suddenly find themselves in a society that is disgustingly self-centered, a world where there is nothing to sacrifice or plan for except one’s own advancement. 
For the past two years there was a purpose behind everything they did, a purpose whose scope far transcended their individual concerns. They had given everything—“heart, might, mind and strength“—to this work, and now they are expected to go back to racking up rewards points on their credit card? How could they? > > The soldier understands this question. He understands how strange and wonderful life can be when every decision is imbued with terrible meaning. Things which have no particular valence in the civilian sphere are a matter of life or death for the soldier. Mundane aspects of mundane jobs (say, those of the former vehicle mechanic) take on special meaning. A direct line can be drawn between everything he does—laying out a sandbag, turning off a light, operating a radio—and the ability of his team to accomplish their mission. Choice of food, training, and exercise before combat can make the difference between the life and death of a soldier’s comrades in combat. For good or for ill, it is through small decisions like these that great things come to pass. > > In this sense the life of the soldier is not really his own. His decisions ripple. His mistakes multiply. The mission demands strict attention to things that are of no consequence in normal life. So much depends on him, yet so little is for him. > > This sounds like a burden. In some ways it is. But in other ways it is a gift. Now, and for as long as he is part of the force, even his smallest actions have a significance he could never otherwise hope for. He does not live a normal life. He lives with power and purpose—that rare power and purpose given only to those whose lives are not their own. > > ... It is an exhilarating way to live. > > This sort of life is not restricted to soldiers and missionaries. Terrorists obviously experience a similar sort of commitment. So do dissidents, revolutionaries, reformers, abolitionists, and so forth. What matters here is conviction and cause. 
If the cause is great enough, and the need for service so pressing, then many of the other things—obedience, discipline, exhaustion, consecration, hierarchy, and separation from ordinary life—soon follow. It is no accident that great transformations in history are sprung from groups of people living in just this way. Humanity is both at its most heroic and its most horrifying when questing for transcendence.
Help clear something up for me: I am extremely confused (theoretically) about how we can simultaneously have: 1. An Artificial Superintelligence 2. That superintelligence being controlled by humans (therefore creating misuse or concentration-of-power issues) My intuition is that once it reaches a particular level of power it will be uncontrollable. Unless people are saying that we can have models 100x more powerful than GPT-4 without them having any agency?

Past week

Frontpage Posts

Quick takes

Congratulations to the EA Project For Awesome 2024 team, who managed to raise over $100k for AMF, GiveDirectly and ProVeg International by submitting promotional/informational videos to the project. There's been an effort to raise money for effective charities via Project For Awesome since 2017, and it seems like a really productive effort every time. Thanks to all involved! 
FAQ: “Ways the world is getting better” banner
The banner will only be visible on desktop. If you can't see it, try expanding your window. It'll be up for a week.
How do I use the banner?
1. Click on an empty space to add an emoji.
2. Choose your emoji.
3. Write a one-sentence description of the good news you want to share.
4. Link an article or forum post that gives more information.
If you’d like to delete your entry, click the cross that appears when you hover over it. It will be deleted for everyone.
What kind of stuff should I write? Anything that qualifies as good news relevant to the world's most important problems. For example, see Ben West’s recent quick takes (1, 2, 3). Avoid posting partisan political news, but the passage of relevant bills and policies is on topic.
Will my entry be anonymous? All submissions are displayed without your Forum name, so they are ~anonymous to users. However, the usual moderation norms still apply. Additionally, we may remove duplicates or borderline trollish submissions. This is an experiment, so we reserve the right to moderate heavily if necessary.
Ask any other questions you have in the comments below. Feel free to DM me with feedback or comments.
This could be a long slog but I think it could be valuable to identify the top ~100 OS libraries and identify their level of resourcing to avoid future attacks like the XZ attack. In general, I think work on hardening systems is an underrated aspect of defending against future highly capable autonomous AI agents.
Common prevalence estimates are often wrong. Example: snakebites and my experience reading Long Covid literature. Both institutions like the WHO and academic literature appear to be incentivized to exaggerate. I think the Global Burden of Disease might be a more reliable source, but have not looked into it. I advise everyone using prevalence estimates to treat them with some skepticism and look up the source.
With another EAG nearby, I thought now would be a good time to push out this draft-y note. I'm sure I'm missing a mountain of nuance, but I stand by the main messages:
"Keep Talking"
I think there are two things EAs could be doing more of, on the margin. They are cheap, easy, and have the potential to unlock value in unexpected ways.
Talk to more people
I say this 15 times a week. It's the most no-brainer thing I can think of, with a ridiculously low barrier to entry; it's usually net-positive for one party while often only drawing on unproductive hours of the other. Almost nobody would be where they are without the conversations they had. Some anecdotes:
- A conversation led to both parties discovering a good mentor-mentee fit, leading to one dropping out of a PhD, being mentored on a project, and becoming an alignment researcher.
- A first conversation led to more conversations which led to more conversations, one of which illuminated a new route to impact which this person was a tremendously good fit for. They're now working as a congressional staffer.
- A chat with a former employee gave an applicant insight about a company they were interviewing with and helped them land the job (many, many such cases).
- A group that is running a valuable fellowship programme germinated from a conversation between three folks who previously were unacquainted (the founders) (again, many such cases).
Make more introductions to others (or at least suggest who they should reach out to)
By hoarding our social capital we might leave ungodly amounts of value on the table. Develop your instincts and learn to trust them! Put people you speak with in touch with other people who they should speak with -- especially if they're earlier in their discovery of using evidence and reason to do more good in the world.
(By all means, be protective of those whose time is 2 OOMs more precious; but within +/- 1, let's get more people connected: exchanging ideas, improving our thinking, illuminating truth, building trust.) At EAG, at the very least, point people to others they should be talking to. The effort in doing so is so, so low, and the benefits could be massive.

Past 14 days

Frontpage Posts

Quick takes

I worked at OpenAI for three years, from 2021 to 2024, on the Alignment team, which eventually became the Superalignment team. I worked on scalable oversight, as part of the team developing critiques as a technique for using language models to spot mistakes in other language models. I then worked to refine an idea from Nick Cammarata into a method for using language models to generate explanations for features in language models. I was then promoted to managing a team of 4 people which worked on trying to understand language model features in context, leading to the release of an open source "transformer debugger" tool. I resigned from OpenAI on February 15, 2024.
I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable. I'm not familiar with a lot of systematic empirical evidence on either side, but it seems to me like the more effective actors in the DC establishment overall are much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy than of pushing for their ideal vision and then making concessions. Possibly an ideal ecosystem has both strategies, but it seems possible that at least some versions of "Overton Window-moving" strategies executed in practice have larger negative effects via associating their "side" with unreasonable-sounding ideas in the minds of very bandwidth-constrained policymakers, who strongly lean on signals of credibility and consensus when quickly evaluating policy options, than the positive effects of increasing the odds of ideal policy and improving the framing for non-ideal but pretty good policies. In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences. Would be interested in empirical evidence on this question (ideally actual studies from psych, political science, sociology, econ, etc literatures, rather than specific case studies due to reference class tennis type issues).
Trump recently said in an interview that he would seek to disband the White House office for pandemic preparedness. Given that he usually doesn't give specifics on his policy positions, this seems like something he is particularly interested in. I know politics is discouraged on the EA forum, but I thought I would post this to say: EA should really be preparing for a Trump presidency. He's up in the polls and IMO has a >50% chance of winning the election. Right now politicians seem relatively receptive to EA ideas; this may change under a Trump administration.
In my latest post I talked about whether unaligned AIs would produce more or less utilitarian value than aligned AIs. To be honest, I'm still quite confused about why many people seem to disagree with the view I expressed, and I'm interested in engaging more to get a better understanding of their perspective. At the least, I thought I'd write a bit more about my thoughts here, and clarify my own views on the matter, in case anyone is interested in trying to understand my perspective. The core thesis that I was trying to defend is the following view: My view: It is likely that by default, unaligned AIs (AIs that humans are likely to actually build if we do not completely solve key technical alignment problems) will produce comparable utilitarian value to humans, both directly (by being conscious themselves) and indirectly (via their impacts on the world). This is because unaligned AIs will likely be conscious in a morally relevant sense, and they will likely share human moral concepts, since they will be trained on human data. Some people seem to merely disagree with my view that unaligned AIs are likely to be conscious in a morally relevant sense. And a few others have a semantic disagreement with me in which they define AI alignment in moral terms, rather than as the ability to make an AI share the preferences of the AI's operator.  But beyond these two objections, which I feel I understand fairly well, there's also significant disagreement about other questions. Based on my discussions, I've attempted to distill the following counterargument to my thesis, which I fully acknowledge does not capture everyone's views on this subject: Perceived counter-argument: The vast majority of utilitarian value in the future will come from agents with explicitly utilitarian preferences, rather than those who incidentally achieve utilitarian objectives. At present, only a small proportion of humanity holds partly utilitarian views. 
However, as unaligned AIs will differ from humans across numerous dimensions, it is plausible that they will possess negligible utilitarian impulses, in stark contrast to humanity's modest (but non-negligible) utilitarian tendencies. As a result, it is plausible that almost all value would be lost, from a utilitarian perspective, if AIs were unaligned with human preferences. Again, I'm not sure if this summary accurately represents what people believe. However, it's what some seem to be saying. I personally think this argument is weak. But I feel I've had trouble making my views very clear on this subject, so I thought I'd try one more time to explain where I'm coming from here. Let me respond to the two main parts of the argument in some amount of detail: (i) "The vast majority of utilitarian value in the future will come from agents with explicitly utilitarian preferences, rather than those who incidentally achieve utilitarian objectives." My response: I am skeptical of the notion that the bulk of future utilitarian value will originate from agents with explicitly utilitarian preferences. This clearly does not reflect our current world, where the primary sources of happiness and suffering are not the result of deliberate utilitarian planning. Moreover, I do not see compelling theoretical grounds to anticipate a major shift in this regard. I think the intuition behind the argument here is something like this: In the future, it will become possible to create "hedonium"—matter that is optimized to generate the maximum amount of utility or well-being. If hedonium can be created, it would likely be vastly more important than anything else in the universe in terms of its capacity to generate positive utilitarian value. The key assumption is that hedonium would primarily be created by agents who have at least some explicit utilitarian goals, even if those goals are fairly weak. 
Given the astronomical value that hedonium could potentially generate, even a tiny fraction of the universe's resources being dedicated to hedonium production could outweigh all other sources of happiness and suffering. Therefore, if unaligned AIs would be less likely to produce hedonium than aligned AIs (due to not having explicitly utilitarian goals), this would be a major reason to prefer aligned AI, even if unaligned AIs would otherwise generate comparable levels of value to aligned AIs in all other respects. If this is indeed the intuition driving the argument, I think it falls short for a straightforward reason. The creation of matter-optimized-for-happiness is more likely to be driven by the far more common motives of self-interest and concern for one's inner circle (friends, family, tribe, etc.) than by explicit utilitarian goals. If unaligned AIs are conscious, they would presumably have ample motives to optimize for positive states of consciousness, even if not for explicitly utilitarian reasons. In other words, agents optimizing for their own happiness, or the happiness of those they care about, seem likely to be the primary force behind the creation of hedonium-like structures. They may not frame it in utilitarian terms, but they will still be striving to maximize happiness and well-being for themselves and others they care about regardless. And it seems natural to assume that, with advanced technology, they would optimize pretty hard for their own happiness and well-being, just as a utilitarian might optimize hard for happiness when creating hedonium. In contrast to the number of agents optimizing for their own happiness, the number of agents explicitly motivated by utilitarian concerns is likely to be much smaller. Yet both forms of happiness will presumably be heavily optimized. 
So even if explicit utilitarians are more likely to pursue hedonium per se, their impact would likely be dwarfed by the efforts of the much larger group of agents driven by more personal motives for happiness-optimization. Since both groups would be optimizing for happiness, the fact that hedonium is similarly optimized for happiness doesn't seem to provide much reason to think that it would outweigh the utilitarian value of more mundane, and far more common, forms of utility-optimization. To be clear, I think it's totally possible that there's something about this argument that I'm missing here. And there are a lot of potential objections I'm skipping over here. But on a basic level, I mostly just lack the intuition that the thing we should care about, from a utilitarian perspective, is the existence of explicit utilitarians in the future, for the aforementioned reasons. The fact that our current world isn't well described by the idea that what matters most is the number of explicit utilitarians, strengthens my point here. (ii) "At present, only a small proportion of humanity holds partly utilitarian views. However, as unaligned AIs will differ from humans across numerous dimensions, it is plausible that they will possess negligible utilitarian impulses, in stark contrast to humanity's modest (but non-negligible) utilitarian tendencies." My response: Since only a small portion of humanity is explicitly utilitarian, the argument's own logic suggests that there is significant potential for AIs to be even more utilitarian than humans, given the relatively low bar set by humanity's limited utilitarian impulses. While I agree we shouldn't assume AIs will be more utilitarian than humans without specific reasons to believe so, it seems entirely plausible that factors like selection pressures for altruism could lead to this outcome. 
Indeed, commercial AIs seem to be selected to be nice and helpful to users, which (at least superficially) seems "more utilitarian" than the default (primarily selfish-oriented) impulses of most humans. The fact that humans are only slightly utilitarian should mean that even small forces could cause AIs to exceed human levels of utilitarianism. Moreover, as I've said previously, it's probable that unaligned AIs will possess morally relevant consciousness, at least in part due to the sophistication of their cognitive processes. They are also likely to absorb and reflect human moral concepts as a result of being trained on human-generated data. Crucially, I expect these traits to emerge even if the AIs do not share human preferences.  To see where I'm coming from, consider how humans routinely are "misaligned" with each other, in the sense of not sharing each other's preferences, and yet still share moral concepts and a common culture. For example, an employee can share moral concepts with their employer while having very different consumption preferences from them. This picture is pretty much how I think we should primarily think about unaligned AIs that are trained on human data, and shaped heavily by techniques like RLHF or DPO. Given these considerations, I find it unlikely that unaligned AIs would completely lack any utilitarian impulses whatsoever. However, I do agree that even a small risk of this outcome is worth taking seriously. I'm simply skeptical that such low-probability scenarios should be the primary factor in assessing the value of AI alignment research. Intuitively, I would expect the arguments for prioritizing alignment to be more clear-cut and compelling than "if we fail to align AIs, then there's a small chance that these unaligned AIs might have zero utilitarian value, so we should make sure AIs are aligned instead". 
If low probability scenarios are the strongest considerations in favor of alignment, that seems to undermine the robustness of the case for prioritizing this work. While it's appropriate to consider even low-probability risks when the stakes are high, I'm doubtful that small probabilities should be the dominant consideration in this context. I think the core reasons for focusing on alignment should probably be more straightforward and less reliant on complicated chains of logic than this type of argument suggests. In particular, as I've said before, I think it's quite reasonable to think that we should align AIs to humans for the sake of humans. In other words, I think it's perfectly reasonable to admit that solving AI alignment might be a great thing to ensure human flourishing in particular. But if you're a utilitarian, and not particularly attached to human preferences per se (i.e., you're non-speciesist), I don't think you should be highly confident that an unaligned AI-driven future would be much worse than an aligned one, from that perspective.
Excerpt from the most recent update from the ALERT team: Highly pathogenic avian influenza (HPAI) H5N1: What a week! The news, data, and analyses are coming in fast and furious. Overall, ALERT team members feel that the risk of an H5N1 pandemic emerging over the coming decade is increasing. Team members estimate that the chance that the WHO will declare a Public Health Emergency of International Concern (PHEIC) within 1 year from now because of an H5N1 virus, in whole or in part, is 0.9% (range 0.5%-1.3%). The team sees the chance going up substantially over the next decade, with the 5-year chance at 13% (range 10%-15%) and the 10-year chance increasing to 25% (range 20%-30%). Their estimated 10-year risk is a lot higher than I would have anticipated.

Past 31 days

Community 7
Building effective altruism 6
Career choice 4
Opinion 3
Announcements and updates 3
Postmortems & retrospectives 3

Frontpage Posts


Quick takes

In this "quick take", I want to summarize some of my idiosyncratic views on AI risk. My goal here is to list just a few ideas that cause me to approach the subject differently from how I perceive most other EAs view the topic. These ideas largely push me in the direction of making me more optimistic about AI, and less likely to support heavy regulations on AI. (Note that I won't spend a lot of time justifying each of these views here. I'm mostly stating these points without lengthy justifications, in case anyone is curious. These ideas can perhaps inform why I spend significant amounts of my time pushing back against AI risk arguments. Not all of these ideas are rare, and some of them may indeed be popular among EAs.) 1. Skepticism of the treacherous turn: The treacherous turn is the idea that (1) at some point there will be a very smart unaligned AI, (2) when weak, this AI will pretend to be nice, but (3) when sufficiently strong, this AI will turn on humanity by taking over the world by surprise, and then (4) optimize the universe without constraint, which would be very bad for humans. By comparison, I find it more likely that no individual AI will ever be strong enough to take over the world, in the sense of overthrowing the world's existing institutions and governments by surprise. Instead, I broadly expect unaligned AIs will integrate into society and try to accomplish their goals by advocating for their legal rights, rather than trying to overthrow our institutions by force. Upon attaining legal personhood, unaligned AIs can utilize their legal rights to achieve their objectives, for example by getting a job and trading their labor for property, within the already-existing institutions. Because the world is not zero sum, and there are economic benefits to scale and specialization, this argument implies that unaligned AIs may well have a net-positive effect on humans, as they could trade with us, producing value in exchange for our own property and services. 
Note that my claim here is not that AIs will never become smarter than humans. One way of seeing how these two claims are distinguished is to compare my scenario to the case of genetically engineered humans. By assumption, if we genetically engineered humans, they would presumably eventually surpass ordinary humans in intelligence (along with social persuasion ability, and ability to deceive etc.). However, by itself, the fact that genetically engineered humans will become smarter than non-engineered humans does not imply that genetically engineered humans would try to overthrow the government. Instead, as in the case of AIs, I expect genetically engineered humans would largely try to work within existing institutions, rather than violently overthrow them. 2. AI alignment will probably be somewhat easy: The most direct and strongest current empirical evidence we have about the difficulty of AI alignment, in my view, comes from existing frontier LLMs, such as GPT-4. Having spent dozens of hours testing GPT-4's abilities and moral reasoning, I think the system is already substantially more law-abiding, thoughtful and ethical than a large fraction of humans. Most importantly, this ethical reasoning extends (in my experience) to highly unusual thought experiments that almost certainly did not appear in its training data, demonstrating a fair degree of ethical generalization, beyond mere memorization. It is conceivable that GPT-4's apparently ethical nature is fake. Perhaps GPT-4 is lying about its motives to me and in fact desires something completely different than what it professes to care about. Maybe GPT-4 merely "understands" or "predicts" human morality without actually "caring" about human morality. But while these scenarios are logically possible, they seem less plausible to me than the simple alternative explanation that alignment—like many other properties of ML models—generalizes well, in the natural way that you might similarly expect from a human. 
Of course, the fact that GPT-4 is easily alignable does not immediately imply that smarter-than-human AIs will be easy to align. However, I think this current evidence is still significant, and aligns well with prior theoretical arguments that alignment would be easy. In particular, I am persuaded by the argument that, because evaluation is usually easier than generation, it should be feasible to accurately evaluate whether a slightly-smarter-than-human AI is taking bad actions, allowing us to shape its rewards during training accordingly. After we've aligned a model that's merely slightly smarter than humans, we can use it to help us align even smarter AIs, and so on, plausibly implying that alignment will scale to indefinitely higher levels of intelligence, without necessarily breaking down at any physically realistic point. 3. The default social response to AI will likely be strong: One reason to support heavy regulations on AI right now is if you think the natural "default" social response to AI will lean more heavily toward laissez-faire than is optimal, i.e., by default, we will have too little regulation rather than too much. In this case, you could believe that, by advocating for regulations now, you're making it more likely that we regulate AI a bit more than we otherwise would have, pushing us closer to the optimal level of regulation. I'm quite skeptical of this argument because I think that the default response to AI (in the absence of intervention from the EA community) will already be quite strong. My view here is informed by the base rate of technologies being overregulated, which I think is quite high. In fact, it is difficult for me to name even a single technology that I think is currently clearly underregulated by society. By pushing for more regulation on AI, I think it's likely that we will overshoot and over-constrain AI relative to the optimal level. 
In other words, my personal bias is towards thinking that society will regulate technologies too heavily, rather than too loosely. And I don't see a strong reason to think that AI will be any different from this general historical pattern. This makes me hesitant to push for more regulation on AI, since on my view, the marginal impact of my advocacy would likely be to push us even further in the direction of "too much regulation", overshooting the optimal level by even more than what I'd expect in the absence of my advocacy. 4. I view unaligned AIs as having comparable moral value to humans: This idea was explored in one of my most recent posts. The basic idea is that, under various physicalist views of consciousness, you should expect AIs to be conscious, even if they do not share human preferences. Moreover, it seems likely that AIs — even ones that don't share human preferences — will be pretrained on human data, and therefore largely share our social and moral concepts. Since unaligned AIs will likely be both conscious and share human social and moral concepts, I don't see much reason to think of them as less "deserving" of life and liberty, from a cosmopolitan moral perspective. They will likely think similarly to the way we do across a variety of relevant axes, even if their neural structures are quite different from our own. As a consequence, I am pretty happy to incorporate unaligned AIs into the legal system and grant them some control of the future, just as I'd be happy to grant some control of the future to human children, even if they don't share my exact values. Put another way, I view (what I perceive as) the EA attempt to privilege "human values" over "AI values" as being largely arbitrary and baseless, from an impartial moral perspective. There are many humans whose values I vehemently disagree with, but I nonetheless respect their autonomy, and do not wish to deny these humans their legal rights. 
Likewise, even if I strongly disagreed with the values of an advanced AI, I would still see value in their preferences being satisfied for their own sake, and I would try to respect the AI's autonomy and legal rights. I don't have a lot of faith in the inherent kindness of human nature relative to a "default unaligned" AI alternative. 5. I'm not fully committed to longtermism: I think AI has an enormous potential to benefit the lives of people who currently exist. I predict that AIs can eventually substitute for human researchers, and thereby accelerate technological progress, including in medicine. In combination with my other beliefs (such as my belief that AI alignment will probably be somewhat easy), this view leads me to think that AI development will likely be net-positive for people who exist at the time of alignment. In other words, if we allow AI development, it is likely that we can use AI to reduce human mortality, and dramatically raise human well-being for the people who already exist. I think these benefits are large and important, and commensurate with the downside potential of existential risks. While a fully committed strong longtermist might scoff at the idea that curing aging might be important — as it would largely only have short-term effects, rather than long-term effects that reverberate for billions of years — by contrast, I think it's really important to try to improve the lives of people who currently exist. Many people view this perspective as a form of moral partiality that we should discard for being arbitrary. However, I think morality is itself arbitrary: it can be anything we want it to be. And I choose to value currently existing humans, to a substantial (though not overwhelming) degree. This doesn't mean I'm a fully committed near-termist. I sympathize with many of the intuitions behind longtermism. 
For example, if curing aging required raising the probability of human extinction by 40 percentage points, or something like that, I don't think I'd do it. But in more realistic scenarios that we are likely to actually encounter, I think it's plausibly a lot better to accelerate AI, rather than delay AI, on current margins. This view simply makes sense to me given the enormously positive effects I expect AI will likely have on the people I currently know and love, if we allow development to continue.
Animal Justice Appreciation Note Animal Justice et al. v A.G of Ontario 2024 was recently decided and struck down large portions of Ontario's ag-gag law. A blog post is here. The suit was partially funded by ACE, which presumably means that many of the people reading this deserve partial credit for donating to support it. Thanks to Animal Justice (Andrea Gonsalves, Fredrick Schumann, Kaitlyn Mitchell, Scott Tinney), co-applicants Jessica Scott-Reid and Louise Jorgensen, and everyone who supported this work!
Why are April Fools jokes still on the front page? On April 1st, you expect to see April Fools' posts and know you have to be extra cautious when reading strange things online. However, April 1st was 13 days ago and there are still two posts that are April Fools posts on the front page. I think it should be clarified that they are April Fools jokes so people can differentiate EA weird stuff from EA weird stuff that's a joke more easily. Sure, if you check the details you'll see that things don't add up, but we all know most people just read the title or first few paragraphs.
Marcus Daniell appreciation note @Marcus Daniell, cofounder of High Impact Athletes, came back from knee surgery and is donating half of his prize money this year. He projects raising $100,000. Through a partnership with Momentum, people can pledge to donate for each point he gets; he has raised $28,000 through this so far. It's cool to see this, and I'm wishing him luck for his final year of professional play!
First in-ovo sexing in the US Egg Innovations announced that they are "on track to adopt the technology in early 2025." Approximately 300 million male chicks are ground up alive in the US each year (since only female chicks are valuable) and in-ovo sexing would prevent this.  UEP originally promised to eliminate male chick culling by 2020; needless to say, they didn't keep that commitment. But better late than never!  Congrats to everyone working on this, including @Robert - Innovate Animal Ag, who founded an organization devoted to pushing this technology.[1] 1. ^ Egg Innovations says they can't disclose details about who they are working with for NDA reasons; if anyone has more information about who deserves credit for this, please comment!

Since March 1st

Community 6
Building effective altruism 5
April Fools' Day 4
FTX collapse 3
Animal welfare 3
Announcements and updates 2

Frontpage Posts

Quick takes

Please people, do not treat Richard Hanania as some sort of worthy figure who is a friend of EA. He was a Nazi, and whilst he claims he moderated his views, he is still very racist as far as I can tell. Hanania called for trying to get rid of all non-white immigrants in the US and the sterilization of everyone with an IQ under 90, indulged in antisemitic attacks on the allegedly Jewish elite, and even post his reform was writing about the need for the state to harass and imprison Black people specifically ('a revolution in our culture or form of government. We need more policing, incarceration, and surveillance of black people'). Yet in the face of this, and after he made an incredibly grudging apology about his most extreme stuff (after journalists dug it up), he's been invited to Manifold's events and put on Richard Yetter Chappell's blogroll. DO NOT DO THIS. If you want people to distinguish benign transhumanism (which I agree is a real thing*) from the racist history of eugenics, do not fail to shun actual racists and Nazis. Likewise, if you want to promote "decoupling" factual beliefs from policy recommendations, which can be useful, do not duck and dive around the fact that virtually every major promoter of scientific racism ever, including allegedly mainstream figures like Jensen, worked with or published with actual literal Nazis. I love most of the people I have met through EA, and I know that, despite what some people say on twitter, we are not actually a secret crypto-fascist movement (nor is longtermism specifically, which whether you like it or not, is mostly about what its EA proponents say it is about.) But there is in my view a disturbing degree of tolerance for this stuff in the community, mostly centered around the Bay specifically. And to be clear I am complaining about tolerance for people with far-right and fascist ("reactionary" or whatever) political views, not people with any particular personal opinion on the genetics of intelligence. 
A desire for authoritarian government enforcing the "natural" racial hierarchy does not become okay, just because you met the person with the desire at a house party and they seemed kind of normal and chill or super-smart and nerdy.  I usually take a way more measured tone on the forum than this, but here I think real information is given by getting shouty.  *Anyone who thinks it is automatically far-right to think about any kind of genetic enhancement at all should go read some Culture novels, and note the implied politics (or indeed, look up the author's actual die-hard libertarian socialist views.) I am not claiming that far-left politics is innocent, just that it is not racist. 
You can now import posts directly from Google docs Plus, internal links to headers[1] will now be mapped over correctly. To import a doc, make sure it is public or shared with ""[2], then use the widget on the new/edit post page: Importing a doc will create a new (permanently saved) version of the post, but will not publish it, so it's safe to import updates into posts that are already published. You will need to click the "Publish Changes" button to update the live post. Everything that previously worked on copy-paste[3] will also work when importing, with the addition of internal links to headers (which only work when importing). There are still a few things that are known not to work: * Nested bullet points (these are working now) * Cropped images get uncropped * Bullet points in footnotes (these will become separate un-bulleted lines) * Blockquotes (there isn't a direct analog of this in Google docs unfortunately) There might be other issues that we don't know about. Please report any bugs or give any other feedback by replying to this quick take, you can also contact us in the usual ways. Appendix: Version history There are some minor improvements to the version history editor[4] that come along with this update: * You can load a version into the post editor without updating the live post, previously you could only hard-restore versions * The version that is live[5] on the post is shown in bold Here's what it would look like just after you import a Google doc, but before you publish the changes. Note that the latest version isn't bold, indicating that it is not showing publicly: 1. ^ Previously the link would take you back to the original doc, now it will take you to the header within the Forum post as you would expect. Internal links to bookmarks (where you link to a specific text selection) are also partially supported, although the link will only go to the paragraph the text selection is in 2. 
^ Sharing with this email address means that anyone can access the contents of your doc if they have the url, because they could go to the new post page and import it. It does mean they can't access the comments at least 3. ^ I'm not sure how widespread this knowledge is, but previously the best way to copy from a Google doc was to first "Publish to the web" and then copy-paste from this published version. In particular this handles footnotes and tables, whereas pasting directly from a regular doc doesn't. The new importing feature should be equal to this publish-to-web copy-pasting, so will handle footnotes, tables, images etc. And then it additionally supports internal links 4. ^ Accessed via the "Version history" button in the post editor 5. ^ For most intents and purposes you can think of "live" as meaning "showing publicly". There is a bit of a sharp corner in this definition, in that the post as a whole can still be a draft. To spell this out: There can be many different versions of a post body, only one of these is attached to the post, this is the "live" version. This live version is what shows on the non-editing view of the post. Independently of this, the post as a whole can be a draft or published.
David Nash's Monthly Overload of Effective Altruism seems highly underrated, and you should most probably give it a follow. I don't think any other newsletter captures and highlights EA's cause-neutral impartial beneficence better than the Monthly Overload of EA. For example, this month's newsletter has updates about Conferences, Virtual Events, Meta-EA, Effective Giving, Global Health and Development, Careers, Animal Welfare, Organization updates, Grants, Biosecurity, Emissions & CO2 Removal, Environment, AI Safety, AI Governance, AI in China, Improving Institutions, Progress, Innovation & Metascience, Longtermism, Forecasting, Miscellaneous causes and links, Stories & EA Around the World, Good News, and more. Compiling all this must be hard work! Until September 2022, the monthly overloads were also posted on the Forum and received higher engagement than the Substack. I find the posts super informative, so I am giving the newsletter a shout-out and putting it back on everyone's radar!
The government's sentencing memorandum for SBF is here; it is seeking a sentence of 40-50 years. As is typical for DOJ in high-profile cases, it is well-written and well-done. I'm not just saying that because it makes many of the same points I identified in my earlier writeup of SBF's memorandum. E.g., p. 8 ("doubling down" rather than walking away from the fraud); p. 43 ("paid in full" claim is highly misleading) [page cites to numbers at bottom of page, not to PDF page #]. EA-adjacent material: There's a snarky reference to SBF's charitable donations "(for which he still takes credit)" (p. 2) in the intro, and the expected hammering of SBF's memo for attempting to take credit for donations paid with customer money (p. 95). There's a reference to SBF's "idiosyncratic . . . beliefs around altruism, utilitarianism, and expected value" (pp. 88-89). This leads to the one surprise theme (for me): the need to incapacitate SBF from committing additional crimes (pp. 87, 90). Per the feds, "the defendant believed and appears still to believe that it is rational and necessary for him to take great risks including imposing those risks on others, if he determines that it will serve what he personally deems a worthy project or goal," which contributes to his future dangerousness (p. 89). For predictors: Looking at sentences where the loss was > $100MM and the method was Ponzi/misappropriation/embezzlement, there's a 20-year, two 30-years, a bunch of 40-years, three 50-years, and three 100+-years (pp. 96-97). Interesting item: The government has gotten about $3.45MM back from political orgs, and the estate has gotten back ~$280K (pp. 108-09). The proposed forfeiture order lists recipients, and seems to tell us which ones returned monies to the government (Proposed Forfeiture Order, pp. 24-43). Life Pro Tip: If you are arrested by the feds, do not subsequently write things in Google Docs that you don't want the feds to bring up at your sentencing. 
Jotting down the idea that "SBF died for our sins" as some sort of PR idea (p. 88; source here) is particularly ill-advised.  My Take: In Judge Kaplan's shoes, I would probably sentence at the high end of the government's proposed range. Where the actual loss will likely be several billion, and the loss would have been even greater under many circumstances, I don't think a consequence of less than two decades' actual time in prison would provide adequate general deterrence -- even where the balance of other factors was significantly mitigating. That would imply a sentence of ~25 years after a prompt guilty plea. Backsolving, that gets us a sentence of ~35 years without credit for a guilty plea. But the balance of other factors is aggravating, not mitigating. Stealing from lots of ordinary people is worse than stealing from sophisticated investors. Outright stealing by someone in a fiduciary role is worse than accounting fraud to manipulate stock prices. We also need to adjust upward for SBF's post-arrest conduct, including trying to hide money from the bankruptcy process, multiple attempts at witness tampering, and perjury on the stand. Stacking those factors would probably take me over 50 years, but like the government I don't think a likely-death-in-prison sentence is necessary here.
SBF's sentencing memorandum is here. On the first page of the intro, we get some quotes about SBF's philanthropy. On the next, we are told he "lived a very modest life" and that any reports of extravagance are fabrications. [N.B.: All page citations are to the typed numbers at the bottom, not to the PDF page number.]  For the forecasters: based on Guidelines calculations in the pre-sentence report (PSR) by Probation, the Guidelines range is 110 years (would be life, but is capped at the statutory max). Probation recommended 100 years. PSRs are sealed, so we'll never see the full rationale on that. The average fraud defendant with a maxed-out offense level, no criminal history, and no special cooperation credit receives a sentence of 283 months. The first part of the memo is about the (now advisory) Sentencing Guidelines used in federal court. The major argument is that there should be no upward adjustment for loss because everyone is probably getting all their money back. Courts have traditionally looked at the greater of actual or "intended" loss, but the memo argues that isn't correct after a recent Supreme Court decision.  As a factual matter, I'm skeptical that the actual loss is $0, especially where much of the improvement is due to increases in the crypto market that customers would have otherwise benefitted from directly. Plus everyone getting money back (including investors who were defrauded) is far from a certain outcome, the appellate courts have been deferential to best-guess loss calculations, and the final Guidelines range would not materially change if the loss amount were (say) $25MM. If I'm the district judge here, I'd probably include some specific statements and findings in my sentencing monologue in an attempt to insulate this issue from appeal. Such as: I'd impose the same sentence no matter what the Guidelines said, because $0 dramatically understates the offense severity and $10B overstates it. 
There are a few places in which the argument ventures into tone-deaf waters. The argument that SBF wasn't in a position of public or private trust (pp. 25-26) seems awfully weak and ill-advised to my eyes. The discussion of possible equity-holder losses (pp. 20-21) also strikes me as dismissive at points. No, equity holders don't get a money-back guarantee, but they are entitled to not be lied to when deciding where to invest. The second half involves a discussion of the so-called 3553 factors that a court must consider in determining a sentence that is sufficient, but not greater than necessary. Pages 41-42 discuss Peter Singer and earning to give, while pages 46-50 continue on about SBF's philanthropy (including a specific reference to GWWC and link to the pledger list on page 46). Throughout the memo, the defense asserts that FTX was different from various other schemes that were fraudulent from day one (e.g., p. 56). My understanding is that the misuse of customer funds started pretty early in FTX's history, so I don't give this much weight. The memo asserts that SBF was less culpable than various comparators, ranging from Madoff himself (150 years) to Elizabeth Holmes (135 months) (pp. 73-80). The bottom-line request is for a sentence of 63-78 months, which is the Guidelines range if one accepts the loss amount as $0 (p. 89). There are 29 letters in SBF's support by family members, his psychiatrist, Ross Rheingans-Yoo, Kat Woods, and a bunch of names I don't recognize. [Caution: The remainder of this post contains more opinion-laden commentary than what has preceded it!] I generally find the 3553 discussion unpersuasive. The section on "remorse" (pp. 55-56) rings hollow to me, although this is an unavoidable consequence of SBF's trial litigation choices. There is "remorse" that people were injured and impliedly that SBF made various errors in judgment, but there isn't any acknowledgment of wrongdoing. 
One sentence of note to this audience: "Sam is simply devastated that the advice, mentorship, and funding that he has given to the animal welfare, global poverty, and pandemic prevention movements does not begin to counteract the damage done to them by virtue of their association with him." (p. 55). I find the discussion of the FTX Foundation to be jarring, such as "Ultimately, the FTX Foundation donated roughly $150 million to charities working on issues such as pandemic prevention, animal welfare, and funding anti-malarial mosquito netting in Africa." (p. 57). Attempting to take credit for sending some of the money you took from customers to charity takes a lot of nerve! Although the memo asserts that SBF's neurodiversity makes him "uniquely vulnerable" in prison (p. 58), the unfortunate truth is that many convicted criminals have characteristics that make successfully adapting to prison life more difficult than for the average person (e.g., severe mental illness, unusually low intelligence). So I'm not convinced by the memo that he would face an atypical burden that would warrant serious consideration in sentencing. Although I certainly can't fault counsel for pointing to SBF's positive characteristics, I'm sure Judge Kaplan knows that many of his opportunities to legibly display these characteristics have been enabled by privilege that most people being sentenced in federal court do not have. I'm also not generally convinced by the arguments about general deterrence. In abbreviated form, the argument is that running SBF through the wringer and exposing him to public disgrace is strong enough that a lower sentence (and the inevitable lifetime public stigma) suffices to deter other would-be fraudsters. See pp. 66-67. And there's good evidence that severity of punishment is relatively less important in deterrence. 
However, if a tough sentence is otherwise just, I don't think we need a high probability of deterrent effect for an extremely serious offense for extended incarceration to be worth it. Crypto scams are common, and as a practical matter it is difficult to increase certainty and speed of punishment because so much of the problematic conduct happens outside the U.S. So severity is the lever the government has. Moreover, discounts for offenders who have a lot to lose (because they are wealthy already) and/or are seen as having more future productive value seem backward as far as moral desert.  Finally, I think there's potentially value in severity deterrence of someone already committing fraud; if the punishment level is basically maxed out at the $500M customer money you've already put at risk, there is no reason (other than an increased risk of detection) not to put $5B at risk. As the saying goes, "might as well be hanged for a sheep as for a lamb" as the penalty for ovine theft was death either way. Defense recommendations on sentencing are generally unrealistic in cases without specific types of plea deal. This one is no different. Also, the sentencing discussion will sound extremely harsh to at least non-US readers . . . but that's the situation in the US and especially in the federal system. I'd note that SBF's post-arrest decisions will likely have triggered a substantial portion of his sentence. Much has been written about the US "trial penalty," and it is often a problem. However, I don't think a discount of ~25-33% for a prompt guilty plea, as implied by the Guidelines for most offenses (also by the guidelines used in England and Wales) is penologically unjustified or coercive. Instead of that, SBF's sentence is likely to be higher because of multiple attempts at witness tampering and evasive, incredible testimony on the stand. So he could be looking at ~double the sentence he would be facing if he had pled guilty.  
He likely could not have gotten the kind of credit for cooperation his co-conspirators received (a "5K.1") because there was no other big fish to rat out. Providing 5K.1 cooperation often reduces sentences by a lot, in my opinion often too much. Given the 5K.1 cooperation and the lesser role, one must exercise caution in using co-conspirator sentences to estimate what SBF would have received if he had promptly accepted responsibility. Finally, I'd view any sentence over ~60 years as de facto life and chosen more for symbolic purposes than to actually increase punishment. Currently, one can receive a ~15% discount for decent behavior in prison, and can potentially serve ~25-33% of the sentence in a halfway house or the like for participating in programs under the First Step Act. It's hard to predict what the next few decades will bring as far as sentencing policy, but the recent trend has been toward expanding the possibilities for early release. So I'd estimate that SBF will actually serve ~75% of his sentence, and probably some portion of it outside of a prison.
