
This essay for the Open Philanthropy AI Worldviews Contest is targeted at Question 1.

AIs have recently accrued some impressive successes, raising both hopes and concerns that artificial general intelligence (AGI) may be just decades away. I will argue that the probability of an AGI, as described in the contest prompt, arising by 2043 is extremely low. 

 

 Specification gaming and school

 

In Richard Feynman’s memoir Surely You’re Joking, Mr. Feynman! the physicist recalls his 1951 experiences visiting a Brazilian university and advising Brazilian professors and government officials on science education. He explains how the university he visited managed to “teach students physics” without giving them any real understanding. The students learned to associate physics words with definitions and memorized equations. They could accurately state physics laws on exams, and they could solve the types of problems they’d been taught to solve. They looked like they were learning.

 

But, as Feynman found, the students couldn’t stretch their knowledge to unfamiliar problem types, and they couldn’t solve even the most basic real-world physics problems by applying the equations they knew to physical objects. Of course, these students couldn’t have even begun to push research into new areas - the job of a working physicist. The students trained in the Brazilian system of the 1950s were one step away from learning the scientific subject matter and two steps away from learning to actually do science.1

 

Today’s chatbots have been compared to high school students in their writing abilities, and this seems very impressive for a machine. But many high school students are plagiarists; to some extent, so are many adult writers. Instead of copying whole sentences, advanced plagiarists cut and paste fragments of sentences from multiple sources. The more sophisticated ones go further, rearranging phrases and choosing near-synonyms for as many words as possible to reduce the chance that the teacher will discover the sources (i.e. paraphrase plagiarism). Or they develop an essay that closely mimics the novel ideas and arguments of a published work without containing any of the same words (i.e. conceptual plagiarism).

 

Instead of learning to actually formulate an argument based on their own ideas, high school plagiarists engage in specification gaming. They learn to game teachers’ and plagiarism checkers’ abilities to recognize when they are not developing their own work. They learn to ape the skills the teachers want them to learn by honing advanced ways of disguising the derivativeness of their writing. 

 

The problem with this isn’t just that the students borrow others’ work without attribution. It’s that the students don’t learn to think for themselves. These students are “learning” something, but what they’re learning is a low-level skill that’s only useful within the “game” of school. And the game, to some extent, defeats the purpose of education. 

 

Earlier this year, writer Alex Kantrowitz discovered that a partially plagiarized version of his article had been published without crediting him, and that the “author” was an AI alias.2 Several other companies and individuals have been caught using AIs to produce articles or stories that consist of lightly-altered sentences drawn from uncredited source materials. Researchers have found that generative AIs engage in both paraphrase and conceptual plagiarism as well as reproducing memorized text (verbatim plagiarism).3 Similar issues exist with AIs that generate art or music. In 2022, a research team prompted an art AI to output very close copies of dozens of images (some photos, some artwork) from its training data.4 The company behind the same art AI is the subject of several copyright-infringement lawsuits.

 

Leaving aside the intellectual property issues and the effects on student learning, these cases tell us something about generative AIs’ actual capabilities and likely trajectory. Free near-copies of artists’ work will be as valuable as forgeries (like a “newly discovered” fake Picasso) have always been. Creating such forgeries is a skill, yes, but developing that skill is not an indication that one is on the way to becoming a Picasso-level artist. 

 

When AI development focuses on convincing humans that a machine is intelligent, the result may say more about human psychology (i.e. how to effectively deceive humans) than about the subject matter the AI works with. This deception would most likely be unintentional; in other words, the AI presents humans with outputs that make it easy for us to fool ourselves. The harder the problem an AI is tasked with, the easier, by comparison, this kind of deception might be. In a subject-matter realm in which no AI is close to developing real capability, all AIs that appear to succeed will do so by deception.

 

[Image: eyespots on the rear end of a Eupemphix nattereri frog. Photo by Felipe Gomes.]

 

Confusing Definitions

 

The meanings of “data,” “intelligence,” and “general intelligence” are not pinned down, so these words are liable to be abused for profit.

 

Could the biggest AI-related risks come from a mismatch between different meanings of these words, rather than from AIs’ growing capabilities?

 

“Data”

The inputs that power generative AIs are commonly called “training data,” but they really don’t fit the usual definition of “data.” Most people don’t think of the Mona Lisa or the full text of a novel as a data point, but the “training data” that generative AIs use consists of people’s ideas, innovations, creative works, and professional output. The rampant cases of AI-facilitated plagiarism and forgery imply that these AIs are using their “training data” more as a source of raw material than as a source of training.

 

“Intelligence” and “general intelligence”

Moving on to other types of AIs, it's easy to see game-playing AIs’ successes at games like Go as evidence that they are becoming “intelligent.”

 

Game-playing machines’ successes are impressive, but they don’t necessarily indicate skills that transfer to the real-world environments that our human intelligence handles. We can precisely tell an algorithm what “winning” a game means and let it play millions of games in a uniform environment. The games in question are played entirely inside a computer (or entered into a computer) and don’t involve changing environments, uncertainty around labels and borders, or the need to track down and collect information in the real world. Carefully-designed AIs can produce novel solutions for manmade puzzles and games even if they’re highly complex, but this says nothing about their performance in an environment of real-world decision-making or investigation.

 

In real-world environments, unlike in a game, we rarely know what “winning” is or what “the correct answer” is. If we think we know, we may be wrong in cases different from the ones we’ve considered — even if we don’t know there’s a difference.

 

Case study 1

 

In 2018, the AI system AlphaFold beat out 97 competitors in accurately predicting the structures of recently solved proteins, thus winning the biennial CASP competition. AlphaFold was trained on the Protein Data Bank (PDB), a large database of proteins solved mostly through crystallography. This AI system has gained recognition as an important advance in structural biology.

 

The first issue is that the PDB is a biased sample of proteins: those whose production, purification, and crystallization were tractable, and those that come from species that humans are most interested in and/or have an easy time studying in the lab. AlphaFold can mostly interpolate within the range of its training data. We don’t really know how far we can extrapolate from known protein structures to, say, the proteins of extremophiles, or proteins with low homology to any protein in the PDB, or proteins that haven’t been solved because they tend to kill the bacterial cells in which we try to grow them.
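
To illustrate the general point about interpolation versus extrapolation, here is a toy Python sketch (a generic curve-fitting example with invented data, not AlphaFold or protein structures): a model fit only on a narrow, biased slice of inputs can look accurate inside that slice while failing badly outside it.

```python
# Toy illustration only: a model trained on a narrow, biased sample
# interpolates well inside that range but extrapolates poorly outside it.
import numpy as np

rng = np.random.default_rng(0)

def true_relationship(x):
    # The "ground truth" the model never fully observes.
    return np.sin(x)

# Training inputs drawn only from a narrow range (analogous to the kinds
# of proteins that happen to crystallize easily).
x_train = rng.uniform(0.0, 3.0, size=200)
y_train = true_relationship(x_train)

# Fit a simple polynomial model to the biased sample.
model = np.poly1d(np.polyfit(x_train, y_train, deg=5))

x_inside = np.linspace(0.5, 2.5, 50)    # interpolation: inside training range
x_outside = np.linspace(5.0, 8.0, 50)   # extrapolation: outside training range

err_inside = np.mean(np.abs(model(x_inside) - true_relationship(x_inside)))
err_outside = np.mean(np.abs(model(x_outside) - true_relationship(x_outside)))

print(f"mean error inside the training range:  {err_inside:.4f}")   # small
print(f"mean error outside the training range: {err_outside:.4f}")  # very large
```

Accuracy measured inside the training range says little about how the same model behaves on inputs unlike anything it was trained on.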

 

Another issue with using PDB is that not all proteins adopt a single, static structure in nature. The up-and-coming fields of intrinsically disordered proteins (IDPs), intrinsically disordered regions (IDRs), and bi-stable “fold-switching” proteins are revealing that many proteins are fully or partly disordered or can change between different structures. Current estimates are that over 50% of eukaryotic proteins contain at least one long disordered segment, and these features appear to be key to many proteins’ functions.

 

Scientists who work with IDPs/IDRs and fold-switching proteins caution that AlphaFold (and its successor AlphaFold2) may predict spurious structures for these proteins.5 Scientists working on proteins that switch between two or more structures report that AlphaFold will typically find just one of the structures, or an intermediate structure, but will report the prediction with high confidence. Scientists are using AlphaFold to (carefully) study IDPs and IDRs, and AlphaFold seems to do well at marking these regions as “low-confidence” predictions. However, the authors of a 2021 paper caution that AlphaFold users inexperienced with IDPs/IDRs may be “tempted to draw” incorrect inferences that have “zero physical meaning,” and that the AlphaFold process can introduce artifacts that are especially severe for IDRs.6 Because IDPs/IDRs are still a lesser-known field, some scientists today may be looking at AlphaFold’s static outputs with less than the appropriate skepticism.
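
As a concrete illustration of reading such outputs cautiously, here is a minimal Python sketch that flags long runs of low-confidence residues in a prediction’s per-residue confidence (pLDDT) track, which are often treated as possible disordered regions rather than reliable static structure. The threshold, helper function, and scores are illustrative assumptions, not part of any AlphaFold tooling.

```python
# Minimal sketch: flag long stretches of low-confidence residues in a
# hypothetical per-residue pLDDT track. Low pLDDT (commonly below ~50) is
# often read as "possibly disordered" rather than as a trustworthy fold.

def flag_possible_disorder(plddt_scores, threshold=50.0, min_run=10):
    """Return (start, end) index ranges of long runs of low-confidence residues."""
    runs, start = [], None
    for i, score in enumerate(plddt_scores):
        if score < threshold:
            if start is None:
                start = i                      # a low-confidence run begins
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))    # close out a long enough run
            start = None
    if start is not None and len(plddt_scores) - start >= min_run:
        runs.append((start, len(plddt_scores) - 1))
    return runs

# Hypothetical protein: a confidently predicted domain followed by a long
# low-confidence tail that deserves extra skepticism.
scores = [92.0] * 120 + [38.0] * 40
print(flag_possible_disorder(scores))  # [(120, 159)]
```

A flag like this is a prompt for further human investigation, not a verdict.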

 

This is not to diminish the accomplishments of those who’ve worked on AlphaFold; it really is an important advance that puts multiple recent discoveries in protein biology and molecular evolution to work. We can use AlphaFold and similar AIs well by understanding where they perform better and worse, where nobody knows the “ground truth,” where we may be wrong about it, or where we may be extrapolating too broadly.

 

If humans had turned the task of investigating protein structures over to an AI (or to scientists who rely heavily on an AI) before humans had discovered IDPs and bi-stable proteins, we might never have known about these important groups. It took human intelligence to make the discoveries leading to the recognition of IDPs/IDRs: the hands-on work of trying and failing to crystallize some groups of proteins, and the innovative thinking needed to notice where the dominant paradigm was breaking down and to recognize why that mattered.

 

Case study 2

 

AI programs for medical image analysis have been under development for years. A few of these programs are currently used in the clinic, though reception has been mixed. To the extent that we can define what constitutes a “normal” or “abnormal” imaging result, and to the extent that representative and accurately-labeled imaging datasets are available, it should be possible to eventually develop useful AIs that replicate physicians’ judgments regarding images.

 

But image evaluation is only one step in a larger diagnostic process, the same step filled by laboratory blood tests and the like. Doctors who understand a lab test’s limitations can make better use of it, which is why clinical leaders like Catherine Lucey, MD of UC San Francisco advocate placing less weight on test and imaging results in isolation and instead interpreting them within a Bayesian framework (for example, using Fagan nomograms, also called “Bayesian ladders”), with the goal of improving medical decision-making.
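
For readers unfamiliar with Fagan nomograms, the arithmetic they encode is simple Bayesian updating in odds form. Here is a minimal Python sketch with purely illustrative numbers (not drawn from any real test):

```python
# Minimal sketch of the Bayesian updating a Fagan nomogram performs:
# convert a test's sensitivity and specificity into likelihood ratios,
# then move from a pre-test probability to a post-test probability.

def post_test_probability(pre_test_prob, sensitivity, specificity, positive_result=True):
    lr_positive = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds
    lr_negative = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds
    lr = lr_positive if positive_result else lr_negative

    pre_odds = pre_test_prob / (1.0 - pre_test_prob)  # probability -> odds
    post_odds = pre_odds * lr                         # Bayes' rule in odds form
    return post_odds / (1.0 + post_odds)              # odds -> probability

# Illustrative test with 90% sensitivity and 90% specificity:
# if the disease is rare in this patient group (pre-test probability 2%),
# a positive result still leaves the diagnosis far from certain (~16%)...
print(round(post_test_probability(0.02, 0.90, 0.90, positive_result=True), 3))
# ...whereas the same positive result in a high-suspicion patient (50%)
# pushes the probability to about 90%.
print(round(post_test_probability(0.50, 0.90, 0.90, positive_result=True), 3))
```

The same reasoning applies whether the “test” is a blood assay or an AI’s read of an image: the value of the result depends on the context it lands in.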

 

From this perspective, if AIs eventually become capable of replacing doctors in medical image analysis, allowing the AIs to take over diagnosis would be a step in the wrong direction. Instead, the advent of radiology AIs would be analogous to the mid-1800s arrival of laboratory glucose tests, which replaced doctors skilled at tasting urine in the diagnosis of diabetes. The laboratory glucose tests eventually proved more accurate than the human tongue, and most human doctors probably didn’t mind giving up that role.

 

But any lab test has its limitations. To decide whether a test is worth running for a particular patient, doctors need to consider the prior probability that the patient has a particular disease based on an exam and interview, other diseases that might explain the patient’s symptoms, the available tests’ sensitivity and specificity, the consequences of a false positive or false negative, the uncertainty around the definition of “normal,” differences among populations, and the cases where the test is known to fail. The same considerations, plus more, apply to AI image analysis.

 

A different kind of “learning”

The “learning” that some AIs go through may be better compared to the “learning” of the human immune system, which learns to recognize self versus non-self and to attack pathogens, than to the learning a human does. The kind of “intelligence” an immune system has could even be called “general,” since it eventually becomes successful at many different tasks, from destroying cancer cells to neutralizing different pathogens to causing allergic reactions against pollen to rejecting transplanted organs.

 

Likewise, the kind of “learning” AIs do can make them very good at some tasks or even a range of tasks. However, calling that “intelligence” or “general intelligence” is a metaphor that is more likely to mislead than to help. It makes little more sense than calling the immune system a “general intelligence,” except as a marketing term to attract funders.

 

AI trainers, handlers and data labelers are the cognitive engine behind many AIs

 

Next, I’d like to look at another way observers might overestimate current AIs’ capabilities (both quantitatively and qualitatively) and thus overestimate their prospects for developing certain capabilities in the future.

 

Quite a few AI companies have reportedly used a combination of AIs and human handlers to deliver what customers needed, even while customers and members of the public thought an AI was fully responsible.7,8,9 Even AIs that don’t need support after they’re deployed often require a massive number of worker-hours to get up and running, including dataset development, labeling and sorting data, training, and giving feedback. In the case of generative AI, these inputs are in addition to the work of artists, writers, and musicians whose contributions are scraped as “data.” In some cases, these human contributions look less like training and more like doing the work for the AI (albeit ahead of time).

 

To understand whether human support is like “training wheels” or will be needed permanently, we need a clear-eyed view of what is actually happening. Today, some companies pass off much of the cognitive work in setting up and running their AIs to contractors or people working through task services like Mechanical Turk; other companies have used prison labor or outsourced to low-income countries. Tech researcher Saiph Savage has documented how AI company staff often underestimate the time required to complete “microwork” tasks passed to contractors. The money offered for microwork tasks is often so poorly aligned to the time needed that these AI workers’ average earnings are around $2 an hour (even for workers in the US).10 The lack of feedback makes it hard for AI companies, and even harder for outside observers, to gauge how many hours of human cognitive work are actually going in. 

 

When deployed in the workplace, some current AIs may surreptitiously borrow time and cognitive resources from their human “coworkers” too. One of the criticisms of the (now defunct) IBM Watson Health AI was that doctors and nurses had to spend large amounts of time inputting their medical knowledge and patient data into the AI tools, with little usable payoff.11 Clinicians have similar complaints about other AIs that are widely deployed in healthcare today. 

 

I’ve had similar enough experiences with expensive but poorly functional laboratory robots and corporate software to understand clinicians’ frustrations with these tools. When a technology appears to replace human workers, sometimes it actually pushes the work onto other humans. Managers can easily miss or underestimate the contributions by humans, making it appear that the technology (whether AI or otherwise) is more successful and “smarter” than it actually is. Humans who wind up as AI handlers may experience their jobs as less meaningful and more frustrating. For humans involved (by choice or not) in any stage of AI development or deployment, having a machine take credit for one’s work will come with an emotional impact and possibly a financial penalty.

 

Current estimates of the limitations on making AIs “smarter” may be biased by underestimates of how much today’s AIs rely on humans to do their cognitive work, either in advance or in real time. To know whether an AI is actually accomplishing useful work, we need to compare its time savings to the hours put in by all humans who worked on the project and all humans who help the AI function day-to-day, including contractors, people at outsourcing sites, people at client organizations, and end users, and we need to compare its contributions to the opportunity costs from those people not spending their time on something else. In some cases, the human input for today’s AIs is vastly greater than any output we can expect. 

 

Some AIs are status quo engines that will slow down human advancement if deployed inappropriately 

 

Modern society has a great predilection to base new programs and solutions on existing situations, even if those situations are pathological. 

 

Some examples:

  1. Breastfeeding has health benefits, but breastfed babies typically gain weight more slowly than formula-fed babies. The CDC’s percentile charts that track infant weight were based on data from a study conducted in the 1970s, when formula feeding was very common. Pediatricians used these charts as a standard to gauge the growth of most US babies born from the late 1970s until 2010. A 2006 panel convened by the CDC and other groups found that some doctors were inappropriately advising parents to switch from breastfeeding to formula feeding to help their babies “catch up” with the growth charts. In 2010, these concerns led the CDC to recommend that pediatricians switch to using newer WHO growth charts based on breastfed babies.12
  2. In 2009, the National Oceanic and Atmospheric Administration (NOAA) rolled out new “catch-share” regulations that were supposed to prevent overfishing in the New Bedford, MA fishery. The total fish harvest would be capped, and each fishing business would be allotted a piece of the pie based on its catch of each vulnerable species in past years. The result was that companies that had already been doing the most damage to vulnerable species had the highest allocations, and those companies were able to further increase their shares by driving more responsible fishermen out of business.13
  3. In hospitals across the US, a healthcare algorithm decided which patients’ health conditions were severe enough that they should be offered extra care. Problems surfaced in 2019, when researchers revealed that the algorithm was much less likely to offer extra care to black patients than to equally sick white patients.14 The algorithm used the money spent on each patient in past years to predict which patients were “sicker,” which biased its decisions against patients who had been undertreated due to poor healthcare access in the past. Because the algorithm allocated services away from patients for whom money and access had been limiting factors in their health, and toward patients for whom they had not been, it probably ensured that money was spent in the least efficient way possible. (A toy sketch of this proxy-label effect appears just below.)
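
Here is a toy Python sketch (invented numbers, not the actual hospital system or its data) of how ranking patients by a proxy label such as past spending can disadvantage patients whose spending was held down by poor access, even though the model contains no explicit group variable:

```python
# Toy illustration of proxy-label bias: ranking patients by past spending
# when spending systematically understates the needs of an underserved group.
# All quantities are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# True underlying sickness, identically distributed in both groups.
sickness = rng.normal(loc=0.0, scale=1.0, size=n)
underserved = rng.random(n) < 0.3   # 30% of patients had limited access to care

# Observed proxy label: past spending tracks sickness, but underserved
# patients spent systematically less for the same level of sickness.
spending = 2.0 * sickness - 1.5 * underserved + rng.normal(scale=0.5, size=n)

# Offer extra care to the patients with the highest "predicted need,"
# where need is effectively ranked by the spending proxy.
cutoff = np.quantile(spending, 0.97)        # top 3% are offered extra care
selected = spending >= cutoff

# Among equally (very) sick patients, who actually gets selected?
very_sick = sickness > 1.5
print(f"selection rate, very sick & underserved: {selected[very_sick & underserved].mean():.1%}")
print(f"selection rate, very sick & well-served: {selected[very_sick & ~underserved].mean():.1%}")
```

The point is not the particular numbers but the mechanism: when the label itself encodes past inequities, optimizing against it propagates them forward.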

 

AIs didn’t originate this, but because many AIs rely on past data to make predictions, basing the future on the past or present is something they’re naturally good at. Similar examples of “AI bias” influenced by current and historical realities or by human behavior are popping up throughout healthcare and in other areas of our society. Many people are working on reducing bias in AI technology and in algorithms more broadly, but the problem might be inherent to any system that relies on making associations in existing data rather than going out and investigating the causes of problems.15,16

 

If AIs can’t be made to think outside the data, more advanced AIs may just be better status quo engines, skilled at propagating harmful situations forward in time. 

 

In the corporate world, AI could easily feed right into many business leaders’ inclination to throw new, heavily advertised technology at a problem instead of understanding it. Unless AIs can investigate a problem hands-on, they can only use the existing data that humans chose to collect and feed them.

 

And alignment between the AI itself and the goals of its developers is not enough. As with traditional product design, the way developers understand real-world environments and the needs of users is often far from reality. The Lean Manufacturing and broader Lean Thinking movements have facilitated real gains in value precisely by teaching people to open up communication, investigate the real roots of problems hands-on, and align goals and actions not only within a company but with customers and vendors. To improve our thinking and action, we should be learning to break out of our status-quo-is-correct thinking patterns instead of reinforcing them and making them harder to detect.

 

To work with some AIs, humans will have to dumb themselves down

 

In high school English classes today, students and schools are rewarded for producing essays that please automated essay-scoring systems. These systems tend to give low scores to the work of certain famous authors, but high scores to students who’ve learned to write shallow essays that use lots of big words in long sentences, even if the essays contain little actual reasoning or insight.17 In other words, when algorithms are not capable of scoring our brightest kids’ tests, teachers are tasked with making kids dumber. Automated scoring systems for school subjects besides English are being developed.
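
To see how easy this kind of gaming can be, consider a toy scoring function in Python (purely illustrative, not any real essay-scoring product) that rewards long words in long sentences:

```python
# Toy "essay scorer" that rewards long sentences full of long words.
# A bag of big words with no argument can outscore a short, clear passage.

def naive_essay_score(text):
    sentences = [s for s in text.split(".") if s.strip()]
    words = text.split()
    avg_sentence_len = len(words) / max(len(sentences), 1)
    avg_word_len = sum(len(w.strip(",.")) for w in words) / max(len(words), 1)
    return avg_sentence_len * avg_word_len   # higher is "better"

clear = "Feynman saw that the students could recite the laws. They could not apply them."
gamed = ("Multifaceted epistemological considerations notwithstanding, paradigmatic "
         "conceptualizations necessitate comprehensive interdisciplinary contextualization")

print(f"clear, reasoned passage: {naive_essay_score(clear):.0f}")   # lower score
print(f"empty word salad:        {naive_essay_score(gamed):.0f}")   # higher score
```

Real scoring systems are more elaborate than this, but the structural problem described in footnote 17 is the same: once students know what the machine measures, the measure stops tracking what teachers care about.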

 

AIs may be able to displace workers who were trained in “rote” educational programs, but the reason is that standardized testing and rote learning aren’t the best ways to help people develop real creativity, advanced thinking skills, or hands-on problem investigation. An AI recently passed a standardized medical licensing exam, but a written test doesn’t capture all aspects of knowledge, and the difference between a new med-school graduate and an experienced doctor is precisely the real-world learning that an algorithm inside a machine can’t do. The more similar our educational systems are to the 1950s Brazilian physics program Feynman visited, the easier our work will be for an AI to replace. But the issue here isn’t really the AI. 

 

Generative AIs, too, trade in a secondary market of ideas, so their work will be more and more self-referential unless humans keep infusing real innovation. But this could become more difficult. If generative AI intellectual property law is worked out in favor of the AI companies, writers who share very innovative ideas, artists or musicians with a very distinctive style, or photographers who capture unique objects or scenes may be the most prone to direct infringement. The most innovative creators will be susceptible to having their work devalued to the greatest degree and could be driven out of the marketplace. It will become difficult to originate and share anything that contains true originality, unless it can be securely kept away from the AIs.18

 

In scientific research, over-production of low-quality journal articles and bad meta-analyses are already problems, and people in the field of metascience are working on solving them. Attempts to use AIs to carry out investigations and publish papers will only make the scientific “replication crisis” and similar issues worse. Even in most areas where we think we know what “correct” is, we should be questioning our definitions of “correct,” not fossilizing them.

 

Like automatic telephone switchboards and calculating machines before them, AIs are much faster and better at specific tasks than humans are. The problem is that AIs are getting better at miming human-like behaviors at a time when their actual capabilities cover only a tiny fraction of the full range of what humans can do. If societal forces exacerbated by AI deployment lead us to devalue the work that is outside AIs’ range, we will constrain our thinking and development.

 

As with any type of tool, we can make better and safer use of AIs if we’re clear-eyed about their limitations and keep humans in the driver’s seat. The graduates of the Brazilian physics program Feynman encountered could be dangerous, if someone hired a bunch of them for high positions in a nuclear project on the strength of their apparent skill in physics. But the danger would be very different from that of a brilliant and malicious physicist wresting control of the same project.

 

 

In summary:

  • Some of the seemingly most impressive AIs are cognitive freeloaders, using the cognition of human trainers, users, and creators behind the scenes. 
  • Like other expensive and unneeded technologies before them, some AIs could feed into broken processes and contribute to the further degradation of the worker experience. Other AIs can make useful contributions, but if deployed in the wrong ways, they could hamper human educational, intellectual, and creative endeavors.
  • Like other algorithms that rely on historical or current data, AIs can give human users the impression that the status quo is “correct.” This can exacerbate the human and institutional tendency to propagate the status quo, even if it is pathological.
  • Talking about machine learning algorithms in terms of human “general intelligence” is an unhelpful metaphor, because it implies that the danger in AIs is that they will get too smart. The real danger is that we (or some of us) overestimate their capabilities and trust AIs in places they’re not suited for.

 

 

 

References and Notes

 

  1. “Surely You’re Joking, Mr. Feynman!”: Adventures of a Curious Character by Richard P. Feynman, 1985.
  2. https://www.bigtechnology.com/p/a-writer-used-ai-to-plagiarize-me
  3. https://pike.psu.edu/publications/www23.pdf
  4. https://arxiv.org/pdf/2212.03860.pdf
  5. https://www.nlm.nih.gov/research/researchstaff/labs/porter/pdfs/chakravarty_AF2.pdf
  6. https://www.sciencedirect.com/science/article/pii/S0022283621004411
  7. https://www.wired.com/story/not-always-ai-that-sifts-through-sensitive-info-crowdsourced-labor/  
  8. https://www.bloomberg.com/news/articles/2016-04-18/the-humans-hiding-behind-the-chatbots?leadSource=uverify%20wall  
  9. https://www.vice.com/en/article/xweqbq/microsoft-contractors-listen-to-skype-calls
  10. https://www.technologyreview.com/2020/12/11/1014081/ai-machine-learning-crowd-gig-worker-problem-amazon-mechanical-turk/
  11. https://www.statnews.com/2017/09/05/watson-ibm-cancer/
  12. https://www.cdc.gov/mmwr/preview/mmwrhtml/rr5909a1.htm
  13. https://hakaimagazine.com/features/last-trial-codfather/ This allocation system probably increased overfishing. It turns out that the companies most willing to overfish vulnerable populations for short-term gain were also willing to deceive regulators. The companies owned by fraudster Carlos Rafael (aka The Codfather) circumvented the new regulations by mislabeling fish and misreporting how much they were catching. Carlos Rafael had nearly taken over the New Bedford fishery by the time he was caught.
  14. https://www.science.org/doi/10.1126/science.aax2342
  15. http://ziadobermeyer.com/wp-content/uploads/2019/09/measurement_aer.pdf
  16. http://ziadobermeyer.com/wp-content/uploads/2021/08/Predicting-A-While-Hoping-for-B.pdf
  17. https://www.salon.com/2013/09/30/computer_grading_will_destroy_our_schools/ Referring to high school students gaming essay-grading algorithms, the author writes: “One obvious problem is that if you know what the machine is measuring, it is easy to trick it. You can feed in an “essay” that it is actually a bag of words (or very nearly so), and if those words are SAT-vocab-builders arranged in long sentences with punctuation marks at the end, the computer will give you a good grade. The standard automated-essay-scoring-industry response to this criticism is that anyone smart enough to figure out how to trick the algorithm probably deserves a good grade anyway.” Similar developments in several areas could lead to circular “humans-gaming-machines-gaming humans” situations that would suck up time and energy.
  18. There is some hope that this will change. The FTC has forced several companies to undergo “algorithmic disgorgement”: deletion of improperly collected photos, videos, or personal information and destruction of the algorithms developed using those data (https://jolt.richmond.edu/files/2023/03/Goland-Final.pdf). Thus far, however, these enforcement actions have focused on privacy violations, not on unauthorized use of music, art, or other professional work. Perhaps greater use of algorithmic disgorgement and requirements that companies obtain “opt-in” permission before using personal data or copyrighted works in AI training could be a way to avert some of the harms discussed here.