Standard AI systems are optimizers: they ‘look’ through the possible actions they could take and pick the one that maximizes what they care about. This can be dangerous, because an AI that maximizes in this way needs to care about exactly the same things that humans care about, which is really hard[1]. If you tell a human to calculate as many digits of pi as possible within a year, they’ll do ‘reasonable’ things towards that goal. An optimizing AI might work out that it could calculate many more digits in a year by taking over another supercomputer; since this is the most effective action, it looks very attractive to the AI.

Quantilizers are a different approach. Instead of maximizing, they randomly choose from among the most effective possible actions.

They work like this:

  1. Start with a goal, and a set of possible actions
  2. Predict how useful each action will be for achieving the goal
  3. Rank the actions from the most to the least useful
  4. Pick randomly from the top fraction only (e.g., the top 10%)

This avoids cases where the AI chooses extreme actions to maximize the goal. The AI chooses somewhat helpful actions instead.
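
To make those four steps concrete, here is a minimal sketch in Python. Everything in it is illustrative: the action list, the scores, and the `top_fraction` parameter are stand-ins rather than parts of any real system.

```python
import random

def quantilize(actions, predicted_utility, top_fraction=0.1):
    """Pick an action at random from the top fraction of actions,
    ranked by how useful we predict each one to be."""
    # Steps 2-3: predict how useful each action is and rank them, best first.
    ranked = sorted(actions, key=predicted_utility, reverse=True)
    # Step 4: keep only the top fraction (always at least one action)...
    cutoff = max(1, int(len(ranked) * top_fraction))
    # ...and choose randomly from that slice instead of taking the single best.
    return random.choice(ranked[:cutoff])

# Toy usage with made-up scores, purely for illustration:
plans = {"optimise the algorithm": 8, "buy a faster CPU": 7,
         "run the job overnight": 6, "ask a colleague for help": 5}
print(quantilize(list(plans), plans.get, top_fraction=0.5))  # one of the top two
```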

It does leave one question: how do we make the list of possible actions in the first place? One suggestion is to ask a lot of humans to attempt the task, and to train an AI on their attempts so that it can generate actions it thinks a human would take. That list can then be used as the input to our quantilizer, as sketched below.
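
A rough sketch of that idea, assuming we have some model trained to imitate human attempts (the `generative_model.sample` interface here is invented purely for illustration):

```python
def human_like_actions(generative_model, task_description, n_samples=1000):
    """Sample candidate actions from a model trained to imitate what
    humans would do for this task (hypothetical interface)."""
    return [generative_model.sample(task_description) for _ in range(n_samples)]

# These candidates would then feed straight into the quantilizer sketched above:
# candidates = human_like_actions(imitation_model, "calculate digits of pi")
# chosen = quantilize(candidates, predicted_utility, top_fraction=0.1)
```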

This does make quantilizers less effective, of course: firstly because they pick less effective actions overall, and secondly because they are restricted to actions a human might take. But this might be worth the reduced risk. Indeed, depending on your risk tolerance, you can change the fraction of top actions the quantilizer will consider, making it more effective but riskier, or vice versa.
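
In terms of the sketch above, that knob is just `top_fraction` (the names and numbers here are again illustrative):

```python
# Smaller fraction: closer to a pure optimizer, more capable but riskier.
risky_choice = quantilize(candidates, predicted_utility, top_fraction=0.01)
# Larger fraction: closer to "what a typical human attempt looks like", safer but weaker.
safe_choice = quantilize(candidates, predicted_utility, top_fraction=0.5)
```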

So quantilizers trade some capability in exchange for greater safety, helping to avoid unintended consequences. Because the pool they pick from contains lots of mild actions and very few extreme ones, the chance of them doing something extreme or unexpected is minuscule.

Quantilizers are a proposed safer approach to AI goals. By randomly choosing from a selection of the top options, they avoid extreme behaviors that could cause harm. More research is needed, but quantilizers show promise as a way to build AI systems that are beneficial but limited in scope. They provide an alternative to goal maximization, which can be dangerous, though for now they are purely theoretical.

  1. ^

    Humans care about an awful lot of different things, even just one human!
