In 2018, the Center for Humane Technology's "Time Well Spent" campaign probably contributed to Apple's Screen Time and Google's Digital Wellbeing features. These features seem deliberately hampered, given how easy it is to dismiss the reminders to get off your phone. I wonder if this problem is hard to gain real traction on because tech companies, especially ones that earn revenue from advertising, are actively motivated against reducing screen time.
Great piece overall! I'm hoping AI risk assessment and management processes can be improved.
Anthropic found that Claude 3 didn't trigger AI Safety Level 3 (ASL-3) for CBRN, but gave it a 30% chance of doing so within three months.
It was a 30% chance of crossing the Yellow Line threshold (which requires building harder evals), not the ASL-3 threshold.
The plant-based foods industry should make low-phytoestrogen soy products.
Soy is an excellent plant-based protein. It's also a source of phytoestrogens called isoflavones, which some men online worry have feminizing properties (cf. "soy boy"). I think the effect of isoflavones is low at moderate consumption (e.g., one 3.5 oz block of tofu per day), but it could be significant if the average American were to replace the majority of their meat consumption with soy-based products.
Fortunately, isoflavones in soy don't have to be an issue. Low-isoflavone products are ...
Harris was the one personally behind the voluntary AI safety commitments of July 2023. Here's a press release from the White House:
...The Vice President’s trip to the United Kingdom builds on her long record of leadership to confront the challenges and seize the opportunities of advanced technology. In May, she convened the CEOs of companies at the forefront of AI innovation, resulting in voluntary commitments from 15 leading AI companies to help move toward safe, secure, and transparent development of AI technology. In July, the Vice President convened
I'm surprised the video doesn't mention cooperative AI and avoiding conflict among transformative AI systems, as this is (apparently) a priority of the Center on Long-Term Risk, one of the main s-risk organizations. See Cooperation, Conflict, and Transformative Artificial Intelligence: A Research Agenda for more details.
I wouldn't consider factory farming to be an instance of astronomical suffering, as bad as the practice is, since I don't think the suffering from one century of factory farming exceeds hundreds of millions of years of wild animal suffering. However, perhaps it could be an s-risk if factory farming somehow continues for a billion years. For reference, here is the definition of s-risk from a 2017 talk by CLR:
“S-risk – One where an adverse outcome would bring about severe suffering on a cosmic scale, vastly exceeding all suffering that has existed on Earth so far.”
Thanks for your comment, Jackson! I've removed my post since it seems that it was too confusing. One message that I meant to convey is that the imaginary nuclear company essentially does not have any safety commitments currently in effect ("we aren't sure yet how to operate our plant safely") and is willing to accept any number of deaths below 10,000 people, despite adopting this "responsible nuclear policy."
I think another promising intervention would be to persuade God to be a conditional annihilationist or support universal reconciliation with Christ. Abraham successfully negotiated conditions with God regarding the destruction of Sodom and Gomorrah with just a few sentences. Imagine what we could do with rigorous and prayerful BOTEC analyses! Even if there is a small chance of this succeeding, the impact could be incredible in expectation.
Great post! I've written a paper along similar lines for the SERI Conference in April 2023 here, titled "AI Alignment Is Not Enough to Make the Future Go Well." Here is the abstract:
...AI alignment is commonly explained as aligning advanced AI systems with human values. Especially when combined with the idea that AI systems aim to optimize their world based on their goals, this has led to the belief that solving the problem of AI alignment will pave the way for an excellent future. However, this common definition of AI alignment is somewhat idealistic and mis
It would be bad to create significant public pressure for a pause through advocacy, because this would cause relevant actors (particularly AGI labs) to spend their effort on looking good to the public, rather than doing what is actually good.
I think I can reasonably model the safety teams at AGI labs as genuinely trying to do good. But I don't know that the AGI labs as organizations are best modeled as trying to do good, rather than optimizing for objectives like outperforming competitors, attracting investment, and advancing exciting capabilities – subjec...
I don't know that the AGI labs as organizations are best modeled as trying to do good, rather than optimizing for objectives like outperforming competitors, attracting investment, and advancing exciting capabilities – subject to some safety-related concerns from leadership.
I will go further -- it's definitely the latter one for at least Google DeepMind and OpenAI; Anthropic is arguable. I still think that's a much better situation than having public pressure when the ask is very nuanced (as it would be for alignment research).
For example, I'm currently gla...
I think GiveWell shouldn’t be modeled as wanting to recommend organizations that save as many current lives as possible. I think a more accurate way to model them is “GiveWell recommends organizations that are [within the Overton Window]/[have very sound data to back impact estimates] that save as many current lives as possible.”
This is correct if you look at GiveWell's criteria for evaluating donation opportunities. GiveWell’s highly publicized claim “We search for the charities that save or improve lives the most per dollar” is somewhat misleading g...
What do you think are the main reasons behind wanting to deploy your own model instead of using an API? Some reasons I can think of:
For anyone interested, the Center for AI Safety is offering up to $500,000 in prizes for benchmark ideas: SafeBench (mlsafety.org)
Where do you draw the line between AI startups that do vs. don't contribute excessively to capabilities externalities and existential risk? I think you're right that your particular startup wouldn't have a significant effect on accelerating timelines. But if we're thinking about AI startups in general, one could be another OpenAI or Adept, which probably have more of an effect on timelines.
I could imagine that even if one's startup isn't working on scaling and making models generally smarter, a relatively small amount of applications work to make them more use...
I'm curious whether the reason why EA may be perceived as a cult while, e.g., environmentalist and social justice activism are not, is primarily that the concerns of EA are much less mainstream.
I appreciate the suggestions on how to make EA less cultish, and I think they are valuable to implement, but I don't think they would have a significant effect on public perception of whether EA is a cult.
I agree, that seems concerning. Ultimately, since the AI developers are designing the AIs, I would guess that they would try to align the AI to be helpful to the users/consumers or to the concerns of the company/government, if they succeed at aligning the AI at all. As for your suggestions "Alignment with whoever bought the AI? Whoever users it most often? Whoever might be most positively or negatively affected by its behavior? Whoever the AI's company's legal team says would impose the highest litigation risk?" – these all seem plausible to me.
On the sepa...
But I sometimes have a fear in the back of my mind that some of the attendees who are intrigued by these ideas are later going to look up effective altruism, get the impression that the movement’s focus is just about existential risks these days, and feel duped. Since EA pitches don’t usually start with longtermist ideas, it can feel like a bait and switch.
To avoid the feeling of a bait and switch, I think one solution is to introduce existential risk in the initial pitch. For example, when introducing my student group Effective Altruism at Georgia T...
I think AI alignment isn't really about designing AI to maximize for the preference satisfaction of a certain set of humans. I think an aligned AI would look more like an AI which:
Thanks for writing this! There's been a lot of interest in EA community building, but I think one of the most valuable parts of EA community building is basically just recruiting – e.g., notifying interested people about relevant opportunities and inspiring people to apply for impactful opportunities. A lot of potential talent isn't looped in with a local EA group or the EA community at all, however, so I think more professional recruiting could help a lot with solving organizational bottlenecks.
Technical note: I think we need to be careful to note the difference in meaning between extinction and existential catastrophe. When Joseph Carlsmith talks about existential catastrophe, he doesn't necessarily mean all humans dying; in this report, he's mainly concerned about the disempowerment of humanity. Following Toby Ord in The Precipice, Carlsmith defines an existential catastrophe as "an event that drastically reduces the value of the trajectories along which human civilization could realistically develop". It's not straightforward to translat...
Here's my proposal for a contest description. Contest problems #1 and 2 are inspired by Richard Ngo's Alignment research exercises.
...AI alignment is the problem of ensuring that advanced AI systems take actions which are aligned with human values. As AI systems become more capable and approach or exceed human-level intelligence, it becomes harder to ensure that they remain within human control instead of posing unacceptable risks.
One solution to AI alignment proposed by Stuart Russell, a leading AI researcher, is the assistance game, also called a cooperativ
Some quick thoughts:
As a countervailing perspective, Dan Hendrycks thinks that it would be valuable to have automated moral philosophy research assistance to "help us reduce risks of value lock-in by improving our moral precedents earlier rather than later" (though I don't know if he would endorse this project). Likewise, some AI alignment researchers think it would be valuable to have automated assistance with AI alignment research. If EAs could write a nice EA Forum post just by giving GPT-EA-Forum a nice prompt and revising the resulting post, that could help EAs save time...
Distributed computing seems to be a skill in high demand among AI safety organizations. Does anyone have recommendations for resources to learn about it? Would it look like using the PyTorch Distributed package or something like a microservices architecture?
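For what it's worth, here's a rough guess at what "using the PyTorch Distributed package" tends to look like in practice – a minimal data-parallel training sketch, assuming a single node with multiple GPUs launched via `torchrun`. The toy `Linear` model and hyperparameters are placeholders for illustration, not anything a safety org actually runs.

```python
# Minimal sketch of data-parallel training with torch.distributed (DDP),
# assuming a single-node, multi-GPU job launched with `torchrun`.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, WORLD_SIZE, LOCAL_RANK, MASTER_ADDR/PORT per process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    device = f"cuda:{local_rank}"

    # Each process holds a full model replica; DDP all-reduces gradients.
    model = torch.nn.Linear(10, 1).to(device)
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    for _ in range(10):
        x = torch.randn(32, 10, device=device)  # stand-in for a sharded batch
        y = torch.randn(32, 1, device=device)
        loss = torch.nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()  # gradients are synchronized across processes here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Run with something like `torchrun --nproc_per_node=4 train.py`: each GPU gets its own process, and gradients are averaged during `backward()`. Whether an org also needs a microservices-style architecture (e.g., for model serving) seems like a separate question from multi-GPU/multi-node training.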
I feel somewhat concerned that, after reading your repeated writing saying "use your AGI to (metaphorically) burn all GPUs", someone might actually try to do so when their AGI isn't actually aligned or powerful enough to pull it off without causing catastrophic collateral damage. At the very least, the suggestion encourages AI race dynamics – because if you don't make AGI first, someone else will try to burn all your GPUs! – and makes the AI safety community seem thoroughly supervillain-y.
Points 5 and 6 suggest that soon after someone develops AGI for the first time, th...
Quoting Scott Alexander here:
...I agree it's not necessarily a good idea to go around founding the Let's Commit A Pivotal Act AI Company.
But I think there's room for subtlety somewhere like "Conditional on you being in a situation where you could take a pivotal act, which is a small and unusual fraction of world-branches, maybe you should take a pivotal act."
That is, if you are in a position where you have the option to build an AI capable of destroying all competing AI projects, the moment you notice this you should update heavily in favor of short timelines
Thanks for writing this! I've seen Hilary Greaves' video on longtermism and cluelessness in a couple university group versions of the Intro EA Program (as part of the week on critiques and debates), so it's probably been influencing some people's views. I think this post is a valuable demonstration that we don't need to be completely clueless about the long-term impact of presentist interventions.
For future submissions to the Red Teaming Contest, I'd like to see posts that are much more rigorously argued than this. I'm not concerned about whether the arguments are especially novel.
My understanding of the key claim of the post is, EA should consider reallocating some more resources from longtermist to neartermist causes. This seems plausible – perhaps some types of marginal longtermist donations are predictably ineffective, or it's bad if community members feel that longtermism unfairly has easier access to funding – but I didn't find the four reaso...
Thanks for the reply. Let me just address the things I think are worth responding to.
For future submissions to the Red Teaming Contest, I'd like to see posts that are much more rigorously argued than this. I'm not concerned about whether the arguments are especially novel.
Ouch. My humble suggestion: maybe be more friendly to outsiders, especially ones who are supportive and warm, when your movement has a reputation for being robotic/insular? Or just say "I don't want anyone who is not part of the movement to comment." Because that is the very obvious ...
Some quick thoughts:
I see two new relevant roles on the 80,000 Hours job board right now:
Here's an excerpt from Anthropic's job posting. It's looking for basic familiarity with deep learning and mechanistic interpretability, but mostly nontechnical skills.
...In this role you would:
- Partner closely with the interpre
You might want to share this project idea in the Effective Environmentalism Slack, if you haven't already done so.
Is the application form "EAGxBerkeley, India & Future Forum Organizing Team Expression of Interest" supposed to have questions asking about whether you're interested in organizing the Future Forum? I don't see any; I only see questions about EAGxBerkeley and EAGxIndia.
From my experience with running EA at Georgia Tech, I think the main factors are:
I think I was primarily concerned that negative information about the campaign could get picked up by the media. Thinking it over now though, that motivation doesn't make sense for not posting about highly visible negative news coverage (which the media would have already been aware of) or not posting concerns on a less publicly visible EA platform, such as Slack. Other factors for why I didn't write up my concerns about Carrick's chances of being elected might have been that:
Before the election was decided, I agreed with the overall point that donating, phone banking, or door-knocking for the campaign seemed quite valuable. At the same time, I want to mention a couple of critiques I have (copied from my comment on "Some potential lessons from Carrick's Congressional bid").
Overall, I agree with Habryka's comment that "negative evidence on the campaign would be 'systematically filtered out'". Although I maxed out donations to the primary campaign and phone banked a bit for the campaign, I had a number of concerns about the campaign that I never saw mentioned in EA spaces. However, I didn't want to raise these concerns for fear that this would negatively affect Carrick's chances of winning the election.
Now that Carrick's campaign is over, I feel more free to write my concerns. These included:
Another introductory post about why one may want to care about insect welfare: Does Insect Suffering Bug You? - Faunalytics (Jesse Gildesgame, 2016).
...Recently, activists have started campaigning against silk because they believe the production process is cruel to silkworms. Many people respond to these campaigns with skepticism: who cares about silkworms? It’s easy to feel for the chinchillas, foxes, and other furry mammals used in fur clothing. But insects like silkworms are a harder sell. It seems crazy to grant moral consideration to a bug.
Nonetheless, t
The Qualia Research Institute might be funding-constrained, but it's questionable whether it's doing good work; for example, see this comment about its Symmetry Theory of Valence.
See Ask MIT Climate: Why do some people call climate change an “existential threat”?