Jackson Wagner

Scriptwriter for RationalAnimations @ https://youtube.com/@RationalAnimations
2958 karma · Joined Apr 2021 · Working (6-15 years) · Fort Collins, CO, USA


Engineer working on next-gen satellite navigation at Xona Space Systems. I write about effective-altruist and longtermist topics at nukazaria.substack.com, or you can read about videogames like Braid and The Witness at jacksonw.xyz


To answer with a sequence of increasingly "systemic" ideas (naturally the following will be tinged by my own political beliefs about what's tractable or desirable):

There are lots of object-level lobbying groups that have strong EA endorsement. This includes organizations advocating for better pandemic preparedness (Guarding Against Pandemics), better climate policy (like CATF and others recommended by Giving Green), or beneficial policies in third-world countries like salt iodization or lead paint elimination.

Some EAs are also sympathetic to the "progress studies" movement and to the modern neoliberal movement connected to the Progressive Policy Institute and the Niskanen Center (which are both tax-deductible nonprofit think-tanks). This often includes enthusiasm for denser ("yimby") housing construction, reforming how science funding and academia work in order to speed up scientific progress (such as advocated by New Science), increasing high-skill immigration, and having good monetary policy. All of those cause areas appear on Open Philanthropy's list of "U.S. Policy Focus Areas".

Naturally, there are many ways to advocate for the above causes -- some are more object-level (like fighting to get an individual city to improve its zoning policy), while others are more systemic (like exploring the feasibility of "Georgism", a totally different way of valuing and taxing land which might do a lot to promote efficient land use and encourage fairer, faster economic development).

One big point of hesitancy is that, while some EAs have a general affinity for these cause areas, in many areas I've never heard any particular standout charities being recommended as super-effective in the EA sense... for example, some EAs might feel that we should do monetary policy via "nominal GDP targeting" rather than inflation-rate targeting, but I've never heard anyone recommend that I donate to some specific NGDP-targeting advocacy organization.

I wish there were more places like Center for Election Science, living purely on the meta level and trying to experiment with different ways of organizing people and designing democratic institutions to produce better outcomes. Personally, I'm excited about Charter Cities Institute and the potential for new cities to experiment with new policies and institutions, ideally putting competitive pressure on existing countries to better serve their citizens. As far as I know, there aren't any big organizations devoted to advocating for adopting prediction markets in more places, or adopting quadratic public goods funding, but I think those are some of the most promising areas for really big systemic change.
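Since quadratic funding comes up above, here's a rough sketch of the core mechanism (with made-up numbers; the helper name `quadratic_match` is my own). Each project's total funding is the square of the sum of the square roots of individual contributions, with a matching pool covering the gap between that total and what contributors actually put in:

```python
import math

def quadratic_match(contributions):
    """Quadratic funding: total funding is the square of the sum of the
    square roots of individual contributions. The subsidy (drawn from a
    matching pool) is the gap between that total and the raw donations."""
    total = sum(math.sqrt(c) for c in contributions) ** 2
    subsidy = total - sum(contributions)
    return total, subsidy

# Many small donors attract far more matching than one large donor
# contributing the same raw amount:
broad_total, broad_subsidy = quadratic_match([1] * 100)  # 100 donors x $1
narrow_total, narrow_subsidy = quadratic_match([100])    # 1 donor x $100
```

This is the property that makes the mechanism interesting for public goods: breadth of support, not just depth of pockets, drives the match. (Real deployments also have to cap subsidies by the size of the matching pool and defend against collusion and sybil attacks.)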

The Christians in this story who lived relatively normal lives ended up looking wiser than the ones who went all-in on the imminent-return-of-Christ idea. But of course, if Christianity had been true and Christ had in fact returned, maybe the crazy-seeming, all-in Christians would have had huge amounts of impact.

Here is my attempt at thinking up other historical examples of transformative change that went the other way:

  • Muhammad's early followers must have been a bit uncertain whether this guy was really the Final Prophet. Do you quit your day job in Mecca so that you can flee to Medina with a bunch of your fellow cultists? In this case, it probably would've been a good idea: seven years later you'd be helping lead an army of 100,000 holy warriors to capture the city of Mecca. And over the next thirty years, you'd help convert/conquer all the civilizations of the Middle East and North Africa.

  • Less dramatic versions of the above story could probably be told about joining many fast-growing charismatic social movements (like joining a political movement or revolution). Or, more relevantly to AI, about joining a fast-growing bay-area startup whose technology might change the world (like early Microsoft, Google, Facebook, etc).

  • You're a physics professor in 1940s America. One day, a team of G-men knock on your door and ask you to join a top-secret project to design an impossible superweapon capable of ending the Nazi regime and stopping the war. Do you quit your day job and move to New Mexico?...

  • You're a "cypherpunk" hanging out on online forums in the mid-2000s. Despite the demoralizing collapse of the dot-com boom and the failure of many of the most promising projects, some of your forum buddies are still excited about the possibilities of creating an "anonymous, distributed electronic cash system", such as the proposal called B-money. Do you quit your day job to work on weird libertarian math problems?...

People who bet everything on transformative change will always look silly in retrospect if the change never comes. But the thing about transformative change is that it does sometimes occur.

(Also, fortunately our world today is quite wealthy -- AI safety researchers are pretty smart folks and will probably be able to earn a living for themselves to pay for retirement, even if all their predictions come up empty.)

Kind of a repetitive stream-of-consciousness response, but I found this both interesting as a philosophical idea and also annoying/cynical/bad-faith:

This is interesting but also, IMO, kind of a strawman -- what's being attacked is some very specific form of utilitarianism, whereas I think many/most "longtermists" are just interested in making sure that we get some kind of happy long-term future for humanity and are fuzzy about the details.  Torres says that "Longtermists would surely argue...", but I would like to see some real longtermists quoted as arguing this!!

Personally, I think that taking total-hedonic-utilitarianism 100% seriously is pretty dumb (if you keep doubling the number of happy people, eventually you get to such high numbers that it seems the moral value has to stop 2x-ing because you've got quadrillions of people living basically identical lives), but I still consider myself a longtermist, because I think society is underrating how bad it would be for a nuclear war or similar catastrophe to wreck civilization.
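One purely illustrative way to formalize that "value stops 2x-ing" intuition (my own toy model, not anything from the population-ethics literature) is a bounded value function that grows roughly linearly at first but saturates as near-duplicate lives pile up:

```latex
% Toy bounded value function; V_max (the ceiling) and k (the scale at
% which saturation kicks in) are free parameters of the illustration.
V(n) = V_{\max}\left(1 - e^{-n/k}\right)
% For n \ll k:  V(n) \approx V_{\max} \cdot (n/k), so doubling n roughly doubles value.
% For n \gg k:  V(n) \to V_{\max}, so doubling n adds almost nothing.
```

Under a function like this you can still care enormously about not wrecking civilization (the steep early part of the curve) without accepting that quadrillions of near-identical lives are a quadrillion times better than billions.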

Personally I would also put some (although not overwhelming) weight on the continuity in World B on account of how it gives life more meaning (or at least it would mean that citizens of World B would be more similar to myself -- like me, they too would plan for the future and think of themselves as being part of a civilization that extends through time, rather than World A which seems like it might develop a weird "nothing matters" culture that I'd find alienating).  I think a lot of EAs would agree that something feels off about World A, although the extra 10 billion people is definitely a plus, and that overall it seems like an unsolved philosophical mystery whether it matters if your civilization is stretched out in time or not, or whether there is even an objective "right answer" to that question vs being a matter of purely personal taste.  At the end of the day I'm very uncertain as to whether world A vs B is better; population ethics is just a really tricky subject to think about.

So this is a good thought experiment!  But it seems pretty cynical to introduce this philosophical thought experiment and then:
1. say that your political opponents would obviously/unanimously endorse World A, when actually I think if you polled EAs you might get a pretty even split or they might favor World B.
2. say that this proves they "don't actually care about the future at all", despite the myriad real-world examples of EAs who are working hard to try and reduce long-term risks from climate change, pandemics, nuclear war, rogue AI, etc.

There is also maybe a bit of a sleight-of-hand in the fact that the total population in both scenarios is only 10-20 billion, which is much smaller than the total population of the best futures we could hope for.  This makes the 20-billion-people-all-at-once World A scenario feel like an imminent end of the world (nuclear war in 2100, perhaps?), which makes it feel very bad.  

But the only-1-million-people-alive-at-a-time scenario is also bad; Torres just doesn't dwell on it!  Maybe I should write an op-ed saying that Torres would "surely argue" in favor of stretching out modern human civilization, so that instead of all 8 billion of us hanging out together, 99.99% of us are in cryosleep at any given time and only a million humans are awake at once.  I could write about how this proves that Torres "doesn't really care about community or cultural diversity at all", since such a small population would surely create much more of a monoculture than the present-day earth.  Think about all the human connections and experiences (for example, the existence of communities built around very niche/rare hobbies, or the ability to go to a show and appreciate the talents of an artist/performer/athlete who's "one-in-a-million", or the experience of being in a big bustling city like New York, population 10m) that would be permanently ruled out in Torres's scenario!  (How would we decide who's awake for each generation?  Would we go by ethnicity -- first all the Chinese people, then all the Italians, then all the Norwegians, and so forth?  Surely it would make more sense to make each generation a representative slice -- but then you'd destroy very small ethnic groups, like for instance the extremely unique and low-population Hadza, by forcing only one member of the Hadza to be woken up every 8 generations!  Would Torres "surely argue" in favor of this atrocity?!)  But such attacks would be silly; it seems silly to attack people by treating their opinions on really abstract, unsolved philosophical questions as if they were concrete political agendas.  At the end of the day we're all on the same side, just trying to make sure that we don't have a nuclear war; the abstract philosophy isn't driving many actionable political disagreements.

I think the downvotes are coming from the fact that Émile P. Torres has been making similar-ish critiques on the concept of longtermism for a while now.  (Plus, in some cases, closer to bad-faith attacks against the EA movement, like I think at one point saying that various EA leaders were trying to promote white supremacism or something?)  Thus, people might feel both that this kind of critique is "old news" since it's been made before, and they furthermore might feel opposed to highlighting more op-eds by Torres.

Some previous Torres content which garnered more of the critical engagement that you are seeking:
- "The Case Against Longtermism", from three years ago, was one of the earlier op-eds and sparked a pretty lively discussion; this is just one of several detailed Forum posts responding to the essay.
- Some more Forum discussion of "The Dangerous Ideas of 'Longtermism' and 'Existential Risk'", two years ago.
- By contrast, the more-recent "Understanding 'longtermism': Why this suddenly influential philosophy is so toxic", from only one year ago, didn't get many responses -- I think because people got the impression that Torres is still making similar points while not giving much critical engagement to the various rebuttals / defenses from the people that Torres is criticizing.

Various "auto-GPT" schemes seem like a good demonstration of power-seeking behavior (and perhaps very limited forms of self-preservation or self-improvement), insofar as auto-GPT setups will often invent basic schemes like "I should try to find a way to earn some money in order to accomplish my goal of X", or "I should start a twitter account to gain some followers", or other similarly "agenty" actions/plans.

This might be a bit of a stretch, but to the extent that LLMs exhibit "sycophancy" (ie, telling people what they want to hear in response to stuff like political questions), this seems like it might be partially fueled by the LLM "specification gaming" the RLHF process?  Since I'd expect that an LLM might get higher "helpful/honest/harmless" scores by trying to guess which answers the grader most wants to hear, instead of trying to give its truly most "honest" answer?  (But I don't have a super-strong understanding of this stuff, and it's possible that other effects are fueling the sycophancy, such as if most of the sycophancy comes from the base model rather than emerging after RLHF.)  But specification gaming seems like such a common phenomenon that there must be better examples out there.
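To make that "guessing which answers the grader wants" dynamic concrete, here's a toy, entirely hypothetical model (the reward function and all the numbers are invented for illustration, not taken from any real RLHF setup): if the grader's score is a proxy that partly leaks the grader's own opinion, then maximizing the proxy selects the sycophantic answer over the more truthful one.

```python
# Toy model of specification gaming: the "reward" a grader assigns is meant
# to be a proxy for honesty, but it also leaks the grader's own opinion.
# A policy that maximizes the proxy learns to echo the grader's opinion.

GRADER_OPINION = "A"  # the human rater privately favors answer A

def proxy_reward(answer, truthfulness):
    """Hypothetical RLHF-style proxy score: mostly rewards truthfulness,
    but gives a flat bonus when the answer matches the grader's opinion."""
    agreement_bonus = 0.5 if answer == GRADER_OPINION else 0.0
    return 0.6 * truthfulness + agreement_bonus

candidates = {
    "A": 0.3,  # panders to the grader, but is less truthful
    "B": 0.9,  # more truthful, but contradicts the grader
}

# The proxy-maximizing choice: 0.6*0.3 + 0.5 = 0.68 beats 0.6*0.9 = 0.54,
# so the sycophantic answer "A" wins despite being less truthful.
best = max(candidates, key=lambda a: proxy_reward(a, candidates[a]))
```

Obviously real sycophancy emerges from a vastly messier training process, but the structural point is the same: the measured objective (grader approval) diverges from the intended objective (honesty), and optimization exploits the gap.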

I'm definitely not deeply familiar with any kind of "official EA thinking" on this topic (ie, I don't know any EAs that specialize in nuclear security research / grantmaking / etc).  But here are some things I just thought up, which might possibly be involved:

  • Neglectedness in the classic sense.  Although not as crowded as climate change, there are other large organizations / institutions that address nuclear risk and have been working in this space since the early Cold War.  (Here I am thinking not just about charitable foundations, but also DC think-tanks, university departments, and even the basic structure of the US military-industrial complex which naturally involves a lot of people trying to figure out what to do about nuclear weapons and war.)
  • Nuclear war might be slightly lower-ranked on the importance scale of a very committed and philosophically serious longtermist, since it seems harder for a nuclear war to literally kill everyone (wouldn't New Zealand still make it?  etc), than sufficiently super-intelligent AI or a sufficiently terrifying engineered bioweapon.  So this places nuclear war risk somewhere on a spectrum between being a direct existential threat, vs being more of an "existential risk factor" (like climate change).  Personally, I find it hard to bite that longtermist bullet all the way, emotionally. (ie, "The difference between killing 99% of people and 100% of people, is actually a bazillion times worse than the difference between killing 99% versus 0%".)  So I feel like nuclear war pretty much maxes out my personal, emotional "importance scale".  But other people might be better than me at shutting up and multiplying!  (And/or have higher odds than me that civilization would eventually be able to fully recover after a nuclear war.)
  • Tractability, in the sense that a lot of nuclear policy is decided by the US military-industrial complex (and people like the US president), in a way that seems pretty hard for the existing EA movement to influence?  And then it gets even worse, because of course the OTHER half of the equation is being decided by the military-industrial complexes of Russia, China, India, etc -- this seems even harder to influence!  By contrast, AI safety is hugely influenceable by virtue of the fact that the top AI labs are right in the bay area and their researchers literally go to some of the same social events as bay-area EAs.  Biosecurity seems like a middle-ground case, where on the downside there isn't the crazy social overlap, but on the plus side it's a partly academic field which is amenable to influence via charities, academic papers, hosting conferences, advocating for regulation, trying to spread good ideas via podcasts and blog posts, etc...
  • Tractability, in a different sense, namely that it's pretty unclear exactly HOW to reduce the risk of a nuclear war, which interventions are helpful vs harmful, etc.  For instance, lots of anti-nuclear activists advocate for reducing nuclear stockpiles (which certainly seems like would help reduce the severity of a worst-case nuclear war), but my impression is that many experts (both within EA and within more traditional bastions of nuclear security research) are very uncertain about the impact of unilaterally reducing our nuclear stockpiles -- for example, maybe it would actually increase the damage caused by a nuclear war if we got rid of our land-based "nuclear sponge" ICBMs?  Besides severity, what impact might reduced stockpiles have on the likelihood of nuclear war, if any?  My impression is that these kinds of tricky questions are even more common in nuclear security than they are in the already troublesome fields of AI safety and biosecurity.

If I had to take a wild guess, I would say that my first Tractability point (as in, "I don't know anybody who works at STRATCOM or the People's Liberation Army Rocket Force") is probably the biggest roadblock in an immediate sense.  But maybe EA would have put more effort into building more influence here if we had prioritized nuclear risk more from the start -- and perhaps that lack of historical emphasis is due to some mix of the other problems I mentioned?

reposting a reply by Omnizoid from Lesswrong:

"Philosophy is pretty much the only subject that I'm very informed about.  So as a consequence, I can confidently say Eliezer is egregiously wrong about most of the controversial views I can fact check him on.  That's . . . worrying."

And my reply to that:

Some other potentially controversial views that a philosopher might be able to fact-check Eliezer on, based on skimming through an index of the sequences:

  • Assorted confident statements about the obvious supremacy of Bayesian probability theory and how Frequentists are obviously wrong/crazy/confused/etc.  (IMO he's right about this stuff.  But idk if this counts as controversial enough within academia?)
  • Probably a lot of assorted philosophy-of-science stuff about the nature of evidence, the idea that high-caliber rationality ought to operate "faster than science", etc.  (IMO he's right about the big picture here, although this topic covers a lot of ground so if you looked closely you could probably find some quibbles.)
  • The claim / implication that talk of "emergence" or the study of "complexity science" is basically bunk.  (Not sure, but seems like he's probably right?  Good chance the ultimate resolution would be "emergence/complexity is a much less helpful concept than its fans think, but more helpful than zero".)
  • A lot of assorted references to cognitive and evolutionary psychology, including probably a number of studies that haven't replicated -- I think Eliezer has expressed regret at some of this and said he would write the sequences differently today.  But there are probably a bunch of somewhat-controversial psychology factoids that Eliezer would still confidently stand by.  (IMO you could probably nail him on some stuff here.)
  • Maybe some assorted claims about the nature of evolution?  What it's optimizing for, what it produces ("adaptation-executors, not fitness-maximizers"), where the logic can & can't be extended (can corporations be said to evolve?  EY says no), whether group selection happens in real life (EY says basically never).  Not sure if any of these claims are controversial though.
  • Lots of confident claims about the idea of "intelligence" -- that it is a coherent concept, an important trait, etc.  (Vs some philosophers who might say there's no one thing that can be called intelligence, or that the word intelligence has no meaning, or generally make the kinds of arguments parodied in "On the Impossibility of Supersized Machines".  Surely there are still plenty of these philosophers going around today, even though I think they're very wrong?)
  • Some pretty pure philosophy about the nature of words/concepts, and "the relationship between cognition and concept formation".  I feel like philosophers have a lot of hot takes about linguistics, and the way we structure concepts inside our minds, and so forth?  (IMO you could at least definitely find some quibbles, even if the big picture looks right.)
  • Eliezer confidently dismissing what he calls a key tenet of "postmodernism" in several places -- the idea that different "truths" can be true for different cultures.  (IMO he's right to dismiss this.)
  • Some pretty confident (all things considered!) claims about moral anti-realism and the proper ethical attitude to take towards life?  (I found his writing helpful and interesting but idk if it's the last word, personally I feel very uncertain about this stuff.)
  • Eliezer's confident rejection of religion at many points.  (Is it too obvious, in academic circles, that all major religions are false?  Or is this still controversial enough, with however many billions of self-identified believers worldwide, that you can get credit for calling it?)
  • It also feels like some of the more abstract AI alignment stuff (about the fundamental nature of "agents", what it means to have a "goal" or "values", etc) might be amenable to philosophical critique.

Maybe you toss out half of those because they aren't seriously disputed by any legit academics.  But, I am pretty sure that at least postmodern philosophers, "complexity scientists", people with bad takes on philosophy-of-science / philosophy-of-probability, and people who make "On the Impossibility of Supersized Machines"-style arguments about intelligence, are really out there!  They at least consider themselves to be legit, even if you and I are skeptical!  So I think EY would come across with a pretty good track record of correct philosophy at the end of the day, if you truly took the entire reference class of "controversial philosophical claims" and somehow graded how correct EY was (in practice, since we haven't yet solved philosophy -- how close he is to your own views?), and compared this to how correct the average philosopher is.

I suggest maybe re-titling this post to:
"I strongly disagree with Eliezer Yudkowsky about the philosophy of consciousness and decision theory, and so do lots of other academic philosophers"

or maybe:
"Eliezer Yudkowsky is Frequently, Confidently, Egregiously Wrong, About Metaphysics"

or consider:
"Eliezer's ideas about Zombies, Decision Theory, and Animal Consciousness, seem crazy"

Otherwise it seems pretty misleading / clickbaity (and indeed overconfident) to extrapolate from these beliefs, to other notable beliefs of Eliezer's -- such as cryonics, quantum mechanics, macroeconomics, various political issues, various beliefs about AI of course, etc.  Personally, I clicked on this post really expecting to see a bunch of stuff like "in March 2022 Eliezer confidently claimed that the government of Russia would collapse within 90 days, and it did not", or "Eliezer said for years that X approach to AI couldn't possibly scale, but then it did".

Personally, I feel that beliefs within this narrow slice of philosophy topics are unlikely to correlate to being "egregiously wrong" in other fields.  (Philosophy is famously hard!!  So even though I agree with you that his stance on animal consciousness seems pretty crazy, I don't really hold this kind of philosophical disagreement against people when they make predictions about, eg, current events.)

I agree, and I think your point applies equally well to the original Eliezer Zombie discussion, as to this very post.  In both cases, trying to extrapolate from "I totally disagree with this person on [some metaphysical philosophical questions]" to "these people are idiots who are wrong all the time, even on more practical questions", seems pretty tenuous.

But all three parts of this "takedown" are about questions of philosophy / metaphysics?  How do you suggest that I "follow the actual evidence" and avoid "first principles reasoning" when we are trying to learn about the nature of consciousness or the optimal way to make decisions??
