All posts

New & upvoted

Week of Monday, 27 June 2022

Building effective altruism 24
Community 18
AI safety 18
Existential risk 11
Cause prioritization 9
AI alignment 9

Frontpage Posts


Personal Blogposts

Quick takes

Comments on Jacy Reese Anthis' Some Early History of EA (archived version).

Summary: The piece could give the reader the impression that Jacy, Felicifia, and THINK played a comparably important role to the Oxford community, Will, and Toby, which is not the case. I'll follow the chronological structure of Jacy's post, focusing first on 2008-2012, then 2012-2021. Finally, I'll discuss the "founders" of EA and sum up.

2008-2012

Jacy says that EA started as the confluence of four proto-communities: 1) SingInst/rationality, 2) GiveWell/OpenPhil, 3) Felicifia, and 4) GWWC/80k (or the broader Oxford community). He also gives honorable mentions to randomistas and other Peter Singer fans. Great - so far I agree. What is important to note, however, is the contributions that these various groups made. For the first decade of EA, most of EA's key community institutions came from (4) - the Oxford community, including GWWC, 80k, and CEA - and secondly from (2), although GiveWell seems to me to have been more of a grantmaking entity than a community hub. Although the rationality community provided many key ideas and introduced many key individuals to EA, the institutions it ran, such as CFAR, were mostly oriented toward its own "rationality" community.

Finally, Felicifia is discussed at greatest length in the piece, and Jacy clearly has a special affinity for it, based on his history there, as do I. He goes as far as to describe the 2008-12 period as a history of "Felicifia and other proto-EA communities". Although I would love to take credit for the development of EA in this period, I consider Felicifia to have had the third- or fourth-largest role in "founding EA" of the groups on this list. I understand its role as roughly analogous to the one currently played (in 2022) by the EA Forum, as compared to those of CEA and OpenPhil: it provides a loose social scaffolding that extends to parts of the world that lack any other EA organisation. It therefore provides some interesting ideas and leads to the discovery of some interesting people, but it is not where most of the work gets done.

Jacy largely discusses the Felicifia Forum as a key component, rather than the Felicifia group-blog. However, once again, this is not quite what I would focus on. I agree that the Forum contributed a useful social-networking function to EA. However, I suspect we will find that more of the important ideas originated on Seth Baum's Felicifia group-blog and that more of the big contributors started there. Overall, I think the emphasis on the blog should be at least as great as that on the forum.

2012 onwards

Jacy describes how he co-founded THINK in 2012 as the first student network explicitly focused on this emergent community. What he neglects to discuss is that the GWWC and 80,000 Hours student networks already existed, focusing on effective giving and impactful careers. He also mentions that a forum post dated to 2014 discussed the naming of CEA, but fails to note that the events described in that post occurred in 2011, culminating in the name "effective altruism" being selected for that community in December 2011. So steps had already been taken toward having an "EA" moniker and an EA organisation before THINK began.

Co-founders of EA

To wrap things up, let's get to the question of how this history connects to the "co-founding" of EA.

> Some people including me have described themselves as “co-founders” of EA. I hesitate to use this term for anyone because this has been a diverse, diffuse convergence of many communities. However, I think insofar as anyone does speak of founders or founding members, it should be acknowledged that dozens of people worked full-time on EA community-building and research since before 2012, and very few ideas in EA have been the responsibility of one or even a small number of thinkers. We should be consistent in the recognition of these contributions.

There may have been more, but only three people come to mind who have described themselves as co-founders of EA: Will, Toby, and Jacy. For Will and Toby, this makes absolute sense: they were the main ringleaders of the main group (the Oxford community) that started EA, and they founded the main institutions there. The basis for considering Jacy among the founders, however, is that he was around in the early days (as were a couple of hundred others), and that he started one of the three main student groups - the latest, and least important, among them. In my view, it's not a reasonable claim to have made.

Having said that, I agree that it is good to emphasise that as the "founders" of EA, Will and Toby only did a minority - perhaps 20% - of the actual work involved in founding it. Moreover, I think there is a related, interesting question: if Will and Toby had not founded EA, would it have happened otherwise? The groundswell of interest that Jacy describes suggests to me an affirmative answer: a large group of people were already becoming increasingly interested in areas relating to applied utilitarianism, and increasingly connected with one another, via GiveWell, academic utilitarian research, Felicifia, utilitarian Facebook groups, and other mechanisms. I lean toward thinking that something like an EA movement would have happened one way or another, although its characteristics might have been different.
12 · Jacy · 2y · 0
Brief Thoughts on the Prioritization of Quality Risks

This is a brief shortform post to accompany "The Future Might Not Be So Great." These are just some scattered thoughts on the prioritization of quality risks, not quite relevant enough to go in the post itself. Thanks to those who gave feedback on the draft of that post, particularly on this section.

> People ask me to predict the future, when all I want to do is prevent it. Better yet, build it. Predicting the future is much too easy, anyway. You look at the people around you, the street you stand on, the visible air you breathe, and predict more of the same. To hell with more. I want better.
>
> ⸻ Ray Bradbury (1979)

I present a more detailed argument for the prioritization of quality risks (particularly moral circle expansion) over extinction risk reduction (particularly through certain sorts of AI research) in Anthis (2018), but here I will briefly note some thoughts on importance, tractability, and neglectedness. Two related EA Forum posts are “Cause Prioritization for Downside-Focused Value Systems” (Gloor 2018) and “Reducing Long-Term Risks from Malevolent Actors” (Althaus and Baumann 2020). Additionally, at this early stage of the longtermist movement, the top priorities for population and quality risk may largely intersect. Both issues suggest foundational research on topics such as the nature of AI control and likely trajectories of the long-term future, community-building of thoughtful do-gooders, and field-building of institutional infrastructure to use for steering the long-term future.

Importance

One important application of the EV of human expansion is to the "importance" of population and quality risks. Importance can be operationalized as the good done if the entire cause succeeded in solving its corresponding problem, such as the good done by eliminating or substantially reducing extinction risk, which is effectively zero if the EV of human expansion is zero and effectively negative if the EV of human expansion is negative. The importance of quality risk reduction is clearer, in the sense that the difference in quality between possible futures is clearer than the difference between extinction and non-extinction, and larger, in the sense that while population risk entails only the zero-to-positive difference between human extinction and non-extinction (or, between zero population and some positive number of individuals), quality risk entails the difference between the best quality humans could engender and the worst, across all possible population sizes. This is arguably a weakness of the framework, because we could categorize the quality risk cause area as smaller in importance (say, an increase of 1 trillion utils, i.e., units of goodness), and it would tend to become more tractable as we narrow the category.
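One minimal way to make the importance comparison above concrete is sketched below. The notation is illustrative and not from the post: $V(q, n)$ is assumed to denote the value of a future with average moral quality $q$ and population size $n$, with extinction corresponding to $V = 0$.

```latex
\documentclass{article}
\usepackage{amsmath, amssymb}
\begin{document}
% Illustrative formalization of the "Importance" comparison above.
% Notation (assumed, not from the post): V(q,n) = value of a future with
% average moral quality q and population size n; extinction gives V = 0.
\[
  I_{\text{population}} \;=\; \mathbb{E}[V(q,n)] - 0 \;=\; \mathbb{E}[V(q,n)],
\]
so eliminating extinction risk does roughly zero good if $\mathbb{E}[V(q,n)] = 0$
and negative good if $\mathbb{E}[V(q,n)] < 0$, whereas
\[
  I_{\text{quality}} \;=\; \max_{q,\; n>0} V(q,n) \;-\; \min_{q,\; n>0} V(q,n),
\]
which spans the best to the worst quality futures across all nonzero population
sizes and is large regardless of the sign of $\mathbb{E}[V(q,n)]$.
\end{document}
```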
Tractability

The tractability difference between population and quality risk seems the least clear of the three criteria. My general approach is thinking through the most likely "theories of change" or paths to impact and assessing them step by step. For example, one commonly discussed extinction risk reduction path to impact is "agent foundations": building mathematical frameworks and formally proving claims about the behavior of intelligent agents, which would then allow us to build advanced AI systems more likely to do what we tell them to do, and then using these frameworks to build AGI or persuading the builders of AGI to use them.

Quality-risk-focused AI safety strategies may be more focused on the outer alignment problem, ensuring that an AI’s objective is aligned with the right values, rather than just the inner alignment problem, ensuring that all actions of the AI are aligned with the objective.[1] Also, we can influence quality by steering the “direction” or “speed” of the long-term future, approaches with potentially very different impact, hinging on factors such as the distribution of likely futures across value and likelihood (e.g., Anthis 2018c; Anthis and Paez 2021).

One argument that I often hear on the tractability of trajectory changes is that changes need to "stick" or "persist" over long periods. It is true that there needs to be a persistent change in the expected value (i.e., the random variable or time series regime of value in the future), but I frequently hear the claim that there needs to be a persistent change in the realization of that value. For example, if we successfully broker a peace deal between great powers, neither the peace deal itself nor any other particular change in the world has to persist in order for this to have high long-term impact. The series of values itself can have arbitrarily large variance, such as it being very likely that the peace deal is broken within a decade. For a sort of change to be intractable, it needs to not just lack persistence, but to rubber band (i.e., create opposite-sign effects) back to its counterfactual. For example, if brokering a peace deal causes an equal and opposite reaction of anti-peace efforts, then that trajectory change is intractable. Moreover, we should not only consider rubber banding but also dominoing (i.e., creating same-sign effects), perhaps because this peace deal inspires other great powers to follow suit even if this particular deal is broken. There is much of this potential energy in the world waiting to be unlocked by thoughtful actors. The tractability of trajectory change has been the subject of research at Sentience Institute, including our historical case studies and Harris' (2019) "How Tractable Is Changing the Course of History?" (A short, purely illustrative simulation of the persistence point above appears after the footnote at the end of this post.)

Neglectedness

The neglectedness difference between population and quality risk seems the clearest. There are far more EAs and longtermists working explicitly on population risks than on quality risks (i.e., risks to the moral value of individuals in the long-term future). Two nuances for this claim: First, it may not be true for other relevant comparisons. For example, many people in the world are trying to change social institutions, such as different sides of the political spectrum trying to pull public opinion towards their end of the spectrum; this group seems much larger than the people focused explicitly on extinction risks, and there are many other relevant reference classes. Second, it is not entirely clear whether extinction risk reduction and quality risk reduction face higher or lower returns to being less neglected (i.e., more crowded). It may be that so few people are focused on quality risks that marginal returns are actually lower than they would be if there were more people working on them (i.e., increasing returns).

---

1. In my opinion, there are many different values involved in developing and deploying an AI system, so the distinction between inner and outer alignment is rarely precise in practice. Much of identifying and aligning with "good" or "correct" values can be described as outer alignment.
In general, I think of AI value alignment as a long series of mechanisms: from the causal factors that create human values (which themselves can be thought of as objective functions), to a tangled web of objectives in each human brain (e.g., values, desires, preferences), to a tangled web of social objectives aggregated across humans (e.g., voting, debates, parliaments, marketplaces), to a tangled web of objectives communicated from humans to machines (e.g., material values in game-playing AI, training data, training labels, architectures), to a tangled web of emergent objectives in the machines (e.g., parametric architectures in the neural net, (smoothed) sets of possible actions in domain, (smoothed) sets of possible actions out of domain), and finally to the machine actions (i.e., what it actually does in the world). We can reasonably refer to the alignment of any of these objects with any of the other objects in this long, tangled continuum of values. Two examples of outer alignment work that I have in mind here are Askell et al. (2021), "A General Language Assistant as a Laboratory for Alignment," and Hobbhahn et al. (2022), "Reflection Mechanisms as an Alignment Target: A Survey."
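As a rough illustration of the persistence point in the Tractability section, here is a minimal simulation sketch. Everything in it is assumed for illustration and not from the post: the "peace deal" framing, the yearly break probability, and the payoff numbers are made up. It only shows that a one-time intervention can durably shift the expected value of a noisy future even when the proximate change rarely persists, and how a "rubber band" backlash differs.

```python
import random

# Illustrative simulation (made-up numbers): a one-time intervention, e.g.
# brokering a peace deal, can shift the *expected* long-run value even if the
# deal itself rarely persists, whereas a "rubber band" backlash pulls the
# trajectory back toward the counterfactual.

YEARS = 100
DEAL_BREAK_PROB = 0.15   # each year, 15% chance the deal itself collapses
FOLLOW_ON_BOOST = 0.5    # durable follow-on effects (norms, relationships) per year
RUBBER_BAND = -0.5       # backlash that cancels the follow-on gains

def simulate(intervene: bool, rubber_band: bool, seed: int) -> float:
    """Total value over YEARS for one random realization of the future."""
    rng = random.Random(seed)
    total_value = 0.0
    deal_alive = intervene
    for _ in range(YEARS):
        value = rng.gauss(0.0, 5.0)           # large background variance
        if deal_alive:
            value += 1.0                      # direct benefit while the deal holds
            if rng.random() < DEAL_BREAK_PROB:
                deal_alive = False            # the deal itself does not persist
        if intervene:
            value += FOLLOW_ON_BOOST          # persistent shift in expected value
            if rubber_band:
                value += RUBBER_BAND          # backlash erases the shift
        total_value += value
    return total_value

def mean_value(intervene: bool, rubber_band: bool, runs: int = 10_000) -> float:
    """Monte Carlo estimate of the expected total value."""
    return sum(simulate(intervene, rubber_band, seed) for seed in range(runs)) / runs

if __name__ == "__main__":
    print(f"baseline:     {mean_value(False, False):8.1f}")
    print(f"intervention: {mean_value(True, False):8.1f}")
    print(f"rubber band:  {mean_value(True, True):8.1f}")
```

With these made-up numbers, the intervention run ends up well above baseline even though the deal itself typically collapses within about seven years, while the rubber-band run drifts back toward the baseline, which is the sense in which only the latter kind of change is intractable.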
Looking for help: what's the opposite of counterfactual reasoning -- in other words, when EAs encourage counterfactual reasoning, what do they discourage? I ask because I'm writing about good epistemic practices and mindsets, and I am trying to structure my writing as a list of opposites (scout mindset vs soldier mindset, numerical vs verbal reasoning, etc.). Would it be correct to say that counterfactual reasoning has no real opposite, and that the appropriate contrast is instead "counterfactual reasoning done well vs. counterfactual reasoning done badly"?
2 · Gavin · 2y · 0
"Effective Accelerationism" (Kent Brockman: I for one welcome our Vile Offspring.)
2 · sawyer · 2y · 0
Today is Asteroid Day. From the website:

> Asteroid Day as observed annually on 30 June is the United Nations sanctioned day of public awareness of the risks of asteroid impacts. Our mission is to educate the public about the risks and opportunities of asteroids year-round by hosting events, providing educational resources and regular communications to our global audience on multiple digital platforms.

I didn't know about this until today. Seems like a potential opportunity for more general communication on global catastrophic risks.