Nah, I'm pretty sure the difference there is "Steve thinks that Jacob is way overestimating the difficulty of humans building AGI-capable learning algorithms by writing source code", rather than "Steve thinks that Jacob is way underestimating the difficulty of computationally recapitulating the process of human brain evolution".
For example, for the situation that you're talking about (I called it "Case 2" in my post) I wrote "It seems highly implausible that the programmers would just sit around for months and years and decades on end, waiting patiently for the outer algorithm to edit the inner algorithm, one excruciatingly-slow step at a time. I think the programmers would inspect the results of each episode, generate hypotheses for how to improve the algorithm, run small tests, etc." If the programmers did just sit around for years not looking at the intermediate training results, yes, I expect the project would still succeed sooner or later. I just very strongly expect that they wouldn't sit around doing nothing.
Love this idea! For all the writers here, I'd like to notify you about the EA Creatives and Communicators slack. You can use it to connect with other writers and maybe give feedback and bounce ideas off each other!
(Please let me know or downvote if this was inappropriate of me to comment, as it could be considered advertising.)
Once again, I think I agree, although I think there are some rationality/decision-making projects that are popular but not very targeted or value-oriented. Does that seem reasonable?
It does, and I admittedly wrote that part of the comment before fully understanding your argument about classifying the development of general-use decision-making tools as being value-neutral. I agree that there has been a nontrivial focus on developing the science of forecasting and other approaches to probability management within EA circles, for example, and that those would qualify as value-neutral using your definition, so my earlier statement that value-neutral is "not really a thing" in EA was unfair.
If I were to draw this out, I would add power/scope of institutions as a third axis or dimension (although I would worry about presenting a false picture of orthogonality between power and decision quality). The impact of an institution would then be related to the relevant volume of a rectangular prism, not the relevant area of a rectangle.
Yeah, I also thought of suggesting this, but think it's problematic as well. As you say, power/scope is correlated with decision quality, although more on a long-term time horizon than in the short term and more for some kinds of organizations (corporations, media, certain kinds of nonprofits) than others (foundations, local/regional governments). I think it would be more parsimonious to just replace decision quality with institutional capabilities on the graphs and to frame DQ in the text as a mechanism for increasing the latter, IMHO. (Edited to add: another complication is that the line between institutional capabilities that come from DQ and capabilities that come from value shift is often blurry. For example, a nonprofit could decide to change its mission in such a way that the scope of its impact potential becomes much larger, e.g., by shifting to a wider geographic focus. This would represent a value improvement by EA standards, but it also means that it might open itself up to greater possibilities for scale from being able to access new funders, etc.)
Would you mind if I added an excerpt from this or a summary to the post?
No problem, go ahead!
Thanks for sharing this concern, which is very reasonable.
One of the things that motivated us to run this contest was the desire to have a lot of interesting new content on the Forum — we want people checking in regularly to see new stories, commenting on submissions, and digging through the archives even after the contest is over.
If we used a standard "read submissions in private, publish the best" model, we'd be missing out on that, even if we still achieved our other goal of "find a few really top-notch things to share".
But I do acknowledge that this presents authors with a conundrum if they want to publish stories elsewhere. Would the following arrangement be fine?
This seems like it opens up the chance to submit the story elsewhere again (since no one will be able to read it on the Forum anymore). And if it doesn't end up getting published elsewhere, you can just go back to your draft post and hit "publish".
Would this work, or do you think something ever having been published, even if it disappeared again, would make it impossible to submit to some/many places?
Especially for referrals, since there may be very many.
I didn't get the intuition behind the initial formulation:
What exactly is that supposed to represent? And what was the basis for assigning numbers to the contingency matrix in the two example cases you've considered?
Good question. I'd say "one poem per post", unless the poems are quite short or linked in some way that makes it seem more natural to put multiple in one post. But it's up to you; hard to go too far wrong.
Your new setup seems less likely to have morally relevant valence. Essentially the more the setup factors out valence-relevant computation (e.g. by separating out a module, or by accessing an oracle as in your example) the less likely it is for valenced processing to happen within the agent.
Just to be explicit here, I'm assuming estimates of goal achievement are valence-relevant. How generally this is true is not clear to me.
Thanks for the link. I’ll have to do a thorough read through your post in the future. From scanning it, I do disagree with much of it; many of those points of disagreement were laid out by previous commenters. One point I didn’t see brought up: IIRC the biological anchors paper suggests we will have enough compute to do evolution-type optimization before the end of the century. So even if we grant your claim that learning to learn is much harder to directly optimize for, I think it’s still a feasible path to AGI. Or perhaps you think evolution-like optimization takes more compute than the biological anchors paper claims?
Interesting. Trying to include an entire story in a comment, rather than giving it its own post, seems pretty unwieldy to me as a reading experience. But we'll keep an eye on how many submissions come in, and take action if they really seem to be overwhelming the front page.
So personally, I would prefer for entries to be replies to a top-level post, and maybe for winners to be reposted as top-level posts.
But I will hide it for myself for now.
Good question! I think it's definitely high impact and will give a couple anecdotes to illustrate it below:
Looking back on the last ten years at the highest-impact changes of mind I had in that period, around 30% of them came from conversations at EAGs, directly leading to my founding two of the charities I've started.
This isn't counting other benefits, including meeting my best friend of the last 6 years (who introduced me to my romantic partner), and hiring some of my best hires.
A second anecdote: I remember this question coming up years back, along with the point that it seemed self-serving; if attending were truly high impact, then you'd fund tickets for other people to go. Somebody found this persuasive and funded other people's tickets, including one person who ended up being one of our best hires and who has since gone on to start his own charity that is quite high impact.
Of course, it's hard to pin down the counterfactuals, but I think for a lot of these there's a very high chance they wouldn't have happened otherwise.
Thank you for this response! I think I largely agree with you, and plan to add some (marked) edits as a result. More specifically,
On the 80K problem profile:
"I don't think the value-neutral version of IIDM is really much of a thing in the EA community"
Side note, on "a core tenet of democracy is the idea that one citizen's values and policy preferences shouldn't count more than another's"
"It looks like you're essentially using decision quality as a proxy for institutional power, and then concluding that intentions x capability = outcomes."
About "the distinction between stated values and de facto values for institutions"
"The professional world is incredibly siloed, and it's not hard at all for me to imagine that ostensibly publicly available resources and tools that anyone could use would, in practice, be distributed through networks that ensure disproportionate adoption by well-intentioned individuals and groups. I believe that something like this is happening with Metaculus, for example."
On your note about "generic-strategy": Apologies for that, and thank you for pointing it out! I'll make some edits.
Note: I now realize that I have basically inverted normal comment-response formatting in this response, but I'm too tired to fix it right now. I hope that's alright!
Once again, thank you for this really detailed comment and all the feedback-- I really appreciate it!
You're right, I stand corrected.
Wow! It's really great to see such an in-depth response to the definitional and foundational work that's been taking place around IIDM over the past year, plus I love your hand-drawn illustrations! As the author or co-author of several of the pieces you cited, I thought I'd share a few thoughts and reactions to different issues you brought up. First, on the distinctions and delineations between the value-neutral and value-oriented paradigms (I like those labels, by the way):
I appreciated your thought-provoking exploration of the two indirect pathways to impact you proposed. Regarding the second pathway (selecting which institutions will survive and flourish), I would propose that an additional complicating factor is that non-value-aligned institutions may be less constrained by ethical considerations in their option set, which could give them an advantage over value-aligned institutions from the standpoint of maximizing power and influence.
I did have a few critiques about the section on directly improving the outcomes of institutions' decisions:
While overall I tend to agree with you that a value-oriented approach is better, I don't think you give a fair shake to the argument that "value-aligned institutions will disproportionately benefit from the development of broad decision-making tools." It's important to remember that improving institutional decision-making in the social sector and especially from an EA perspective is a very recent concept. The professional world is incredibly siloed, and it's not hard at all for me to imagine that ostensibly publicly available resources and tools that anyone could use would, in practice, be distributed through networks that ensure disproportionate adoption by well-intentioned individuals and groups. I believe that something like this is happening with Metaculus, for example.
One final technical note: you used "generic-strategy" in a different way than we did in the "Which Institutions?" post—our definition imagines a specific organization that is targeted through a non-specific strategy, whereas yours imagines a specific strategy not targeted to any specific organization. I agree that the latter deserves its own label, but suggest a different one than "generic-strategy" to avoid confusion with the previous post.
I've focused mostly on criticisms here for the sake of efficiency, but I really was very impressed with this article and hope to see more writing from you in the future, on this topic and others!
I'm actually pretty happy for this warning to spread; it's not a big problem now(?), but will be if growth continues. Vigilance is the way to make the critique untrue.
OTOH you don't necessarily want to foreground it as the first theme of EA, or even the main thing to worry about.
...it seems like your argument is saying "(A) and (B) are both really hard to estimate, and they're both really low likelihood—but neither is negligible. Thus, we can't really know whether our interventions are helping. (With the implicit conclusion being: thus, we should be more skeptical about attempts to improve the long-term future)"
Thanks, that is a fairly accurate summary of one of the crucial points I am making, except I would also add that the difficulty of estimation increases with time. And this is a major concern here because the case for longtermism rests precisely on there being a greater and greater number of humans (and other sentient independent agents) as the horizon of time expands.
Sometimes we can't know the probability distribution of (A) vs. (B), but sometimes we can do better-than-nothing estimates, and for some things (e.g., some aspects of X-risk reduction) it seems reasonable to try.
Fully agree that we should try, but the case for longtermism remains rather weak until we have some estimates and bounds that can be reasonably justified.
Great points again!
I have only cursorily examined the links you've shared (I've bookmarked them for later), but I hope the central thrust of what I am saying does not depend too strongly on close familiarity with their contents.
A few clarifications are in order. I am really not sure about AGI timelines, and that's why I am reluctant to attach any probability to them. For instance, the only reason I believe that there is a less than 50% chance that we will have AGI in the next 50 years is that we have not seen it yet, and it seems rather unlikely to me that the current directions will lead us there. But that is a very weak justification. What I do know is that there has to be some radical qualitative change for artificial agents to go from excelling in narrow tasks to developing general intelligence.
That said, it may seem like nit-picking but I do want to draw the distinction between "not significant progress" and "no progress at all" towards AGI. Not only am I stating the former, I have no doubt that we have made incredible progress with algorithms in general. I am less convinced about how much those algorithms help us get closer towards an AGI. (In hindsight, it may turn out that our current deep learning approaches such as GANs contain path-breaking proto-AGI ideas /principles, but I am unable to see it that way).
If we consider a scale of 0-100 where 100 represents AGI attainment and 0 is some starting point in the 1950s, I have no clear idea whether the progress we've made thus far is close to 5 or 0.5 or even 0.05. I have no strong arguments to justify one or the other because I am way too uncertain about how far the final stage is.
There can also be no question with respect to the other categories of progress you have highlighted, such as compute power, infrastructure, and large datasets; indeed, I see these as central to the remarkable performance we have come to witness with deep learning models.
My perspective is that, while I acknowledge plenty of progress in understanding several processes in the brain (signal propagation, the mapping of specific sensory stimuli to neuronal activity, theories of how brain wiring at birth may encode several learning algorithms), these constitute piecemeal knowledge and still seem quite a few strides removed from the bigger question: how do we attain high-level cognition, develop abstract thinking, and become able to reason and solve complex mathematical problems?
Sorry if I'm misunderstanding.
"isn't there an infinite degree of freedom associated with a continuous function?"
I'm a bit confused by this; are you saying that the only possible AGI algorithm is "the exact algorithm that the human brain runs"? The brain is wired up by a finite number of genes, right?
I agree that we don't necessarily have to reproduce the exact wiring or the functional relation in order to create a general intelligence (which is why I mentioned the equivalence classes).
Finite number of genes implies finite steps/information/computation (and that is not disputable of course) but the number of potential wiring options in the brain and functional forms between input and output is exponentially large. (It is in principle, infinite, if we want to reproduce the exact function, but we both agree that that may not be necessary). Pure exploratory search may not be feasible and one may make the case that with appropriate priors and assuming some modular structure of the brain, the search space will reduce considerably, but still how much of a quantitative grip do we have on this? And how much rests on speculation?
This was my initial reaction, that suspiciousness of existing forecasts can justify very wide error bars but not certainty in >50 year timelines. But then I realized I didn't understand what probability OP gave to <50 years timelines, which is why I asked a clarifying question first.
All writing is covered by copyright!
Not writing whose author died more than 70 years ago, if I understand correctly.
(Which is not a hypothetical example if people are planning to excerpt Kipling).
I actually have another question. I submitted a Kipling poem as a recommendation for the contest. (It's my first post, so it's currently awaiting moderation.) If I find more EA-themed Kipling poems (which, given the poet, would not be surprising), should I add them to the first post, submit them in batches or make individual posts for each of them?
Also, completely separate question: should I try to err on the side of submitting or not submitting a marginal work, written either by me or by someone else? How do you want to weight the tradeoff between being 'deluged by irrelevant nonsense' and 'people whose work you might be interested in don't submit'?
So, I have multiple comments.
First, as an EA person, I want to thank you, because I think this is a great idea, and I very much approve. I think the amazing power of fiction to change people's minds has been an occasional but important force throughout history; the claim that Uncle Tom's Cabin got the abolitionist movement to the mainstream seems historically plausible enough to agree that this is, indeed, a useful thing for the EA movement to do, and one I highly approve of.
But the second is as an SF&F author who tries to get his stories published.
As far as I can tell, the reason most SF&F fiction contests don't use the 'post on forum, award money to best' model is that anything once published becomes a 'reprint', reprints can't be sold elsewhere except at a few places and usually at a 7x markdown, and anything that can show up in Google searches counts as published. If you send it to the editor's work e-mail address, or post it on a password-protected forum only for authors, it's 'not published'. But here, it's 'published'.
So submitting a story to this contest means that I can't sell it anywhere else if it doesn't win the prize, whereas when submitting stories elsewhere I can try, try again.
On the one hand, it is probably worth me doing this because I believe in effective altruism and the cost to me is pretty negligible, given that individual short stories don't actually sell for all that much. On the other hand, I think the public posting might mean that the profit motive is pointing in the opposite direction from the one you want.
AlphaGo has a human-created optimizer, namely MCTS. Normally people don't use the term "mesa-optimizer" for human-created optimizers.
Then maybe you'll say "OK there's a human-created search-based consequentialist planner, but the inner loop of that planner is a trained ResNet, and how do you know that there isn't also a search-based consequentialist planner inside each single run through the ResNet?"
Admittedly, I can't prove that there isn't. I suspect that there isn't, because there seems to be no incentive for that (there's already a search-based consequentialist planner!), and also because I don't think ResNets are up to such a complicated task.
Hi Mats! That sounds splendid!
Meanwhile I’ve set up my wiki, started thinking about the structure of the template I’d like to use for the project pages, and have started reading up on your Google Docs. It’s impressive how thoroughly you’ve already evaluated your project concept!
My “idea foundry” project itself will have its own page in its wiki with more information on my future plans. That’ll make it easier to reflect on whether the whole thing is sustainable. I haven’t thought about it sufficiently myself. I’ll only publish individual pages once I have proofread them for possible info hazards and have gotten feedback from some trusted friends too.
… as well as a list of past/failed projects or lessons learned from projects
Yeah, and there are also a lot of ostensibly brilliant project ideas in various lists that I think are subtly deleterious. No one has attempted to realize them yet (at least the ones I vaguely recall and to my knowledge) but a project database with just a bit more detailed thinking may help to keep it that way. (Or else may inspire someone to come up with a way to realize the project in a way that avoids the subtly deleterious bits.)
… as long as you're OK with your wiki being separate from the project database
Totally. It feels like so far I’ve been wholly unconvinced by some 95+% of project ideas I’ve read about, so those should not end up on your platform. It would just be valuable – or essential – to be able to promote the top of the shortlist to potential founders.
My only slight hesitation for your approach is the effort involved in development and upkeep: we would rather offer a lower-value solution (just a list of ideas) that we can guarantee can be maintained than a higher-value solution (detailed wiki with required fields for each project idea) that has a large chance of being abandoned after a while.
I’m worried about that too. I’d be willing to risk it, pending further thinking. An alleviating factor is that the detailed reviews would be reserved for a small shortlist of projects. Most of them would just get a quick stub summary and the reason why I didn’t prioritize them.
I’ve read that you’re perfectly open to (for-profit) social enterprises and of course early-stage project in need of cofounders. But I see the term “volunteer” a lot in the materials. It has these particular associations with low commitment, low responsibility, no salary, nonprofits, etc. Is it the best synonym for the job? None of the alternatives I can think of is quite broad enough either – cofounder, collaborator, partner, talent, … – but I imagine that such word choices can influence what the platform will end up being used for. A platform for “cofounder matching” may end up being used for more high-value work than one for “volunteer matching,” maybe some sort of “Task Y” notwithstanding. But I’ve also heard that someone had the impression that cofounder matching is not a current bottleneck, which I found surprising.
I’ll get in touch through one of the channels you recommended.
While I like the story, I wouldn't recommend it for the contest, for spoilery reasons. Putting them into ROT13:
Vafgrnq bs bccbfvat Fhcrezna qverpgyl, Yrk Yhgube vasvygengrf gur znffvir RN nccnenghf ohvyg nebhaq hfvat Fhcrezna rssrpgviryl naq trgf n wbo nf Fhcrezna'f cflpuvngevfg, gurerol (nf orpbzrf pyrne va gur raq) nyybjvat uvz gb pbageby uvf jbefg rarzl gb qb uvf jvyy - n pevfvf gung jbhyqa'g unir rkvfgrq vs Fhcrezna unq fhpprffshyyl xrcg n frperg vqragvgl, be fgnlrq orybj gur enqne.
Vafbsne nf gur fgbel unf n zbeny, vg vf gung crbcyr jub qrfver cbjre jvyy frrx gb gnxr pbageby bire nal naq nyy fbheprf bs vg, ertneqyrff bs jung gur cbjre vf ynoryrq nf orvat sbe. Guvf vf n avpr yrffba, ohg V fhfcrpg gung vg vf tbvat gb or ernq nf 'RN [va cnegvphyne] vf ihyarenoyr gb uvwnpxvat ol znyribyrag sbeprf orpnhfr vg vf pbafrdhragvnyvfg, RN naq pbafrdhragvnyvfz ner gurersber obgu onq' juvpu vf abg ernyyl n zrffntr jr jnag gb fcernq.
Just want to be clear, the main post isn't about analyzing eigenmodes with EEG data. It's very funny that when I am intellectually honest enough to say I don't know about one specific EEG analysis that doesn't exist and is not referenced in the main text, people conclude that I don't have expertise to comment on fMRI data analysis or the nature of neural representations.
Meanwhile QRI does not have expertise to comment on many of the things they discuss, but they are super confident about everything and in the original posts especially did not clearly indicate what is speculation versus what is supported by research.
I continue to be unconvinced with the arguments laid out, but I do think both the tone of the conversation and Mike Johnson's answers improved after he was criticized. (Correlation? Causation?)
GPT-3 is of that form, but AlphaGo/MuZero isn't (I would argue).
I don't see why. The NNs in AlphaGo and MuZero were trained using some SGD variant (right?), and SGD variants can theoretically yield mesa-optimizers.
I see what you mean, and again I have some sympathy for the argument that it's very difficult to be confident about a given probability distribution in terms of both positive and negative consequences. However, to summarize my concerns here, I still think that even if there is a large amount of uncertainty, there is typically still reason to think that some things will have a positive expected value: preventing a given event (e.g., a global nuclear war) might have a ~0.001% of making existence worse in the long-term (possibility A), but it seems fair to estimate that preventing the same event also has a ~0.1% chance of producing an equal amount of long-term net benefit (B). Both estimates can be highly uncertain, but there doesn't seem to be a good reason to expect that (A) is more likely than (B).
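As a toy illustration (with hypothetical numbers, and assuming the good and bad long-term outcomes are of equal magnitude V), the expected value of preventing the event would be roughly 0.001 x V - 0.00001 x V = 0.00099 x V > 0, so even with wide uncertainty on both probabilities, the sign of the estimate doesn't obviously flip unless we have a specific reason to think (A) outweighs (B).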
My concern thus far has been that it seems like your argument is saying "(A) and (B) are both really hard to estimate, and they're both really low likelihood—but neither is negligible. Thus, we can't really know whether our interventions are helping. (With the implicit conclusion being: thus, we should be more skeptical about attempts to improve the long-term future)" (If that isn't your argument, feel free to clarify!). In contrast, my point is "Sometimes we can't know the probability distribution of (A) vs. (B), but sometimes we can do better-than-nothing estimates, and for some things (e.g., some aspects of X-risk reduction) it seems reasonable to try."
“Life can only be understood backwards; but it must be lived forwards.” ― Søren Kierkegaard
Source: https://frasimondo.com/frasi-bellissime/
I wrote my response (as a clueless non-ivory-tower non-academic) to Crary's second incarnation here: "Crary avoids explaining her arguments against Effective Altruism"... let me know if you want the "next 45%" sequel.
Thanks for this post! I love a good research agenda. Some other relevant bits of work:
You're welcome! It's certainly one of his less prominent stories.
I lead some of DeepMind's technical AGI safety work, and wanted to add two supporting notes:
Thank you for sharing. I agree with adding a somewhat commercial dimension to research (possibly not all research). It can inspire a better-balanced incentive structure, accelerate the process, and possibly attract private funding (without corroding one’s research integrity, process, and outcomes). I have only regained interest in STEM (as an enthusiast) this year and keep coming across recurring issues with the process and a dearth of funding. The ones that feel pertinent to me: difficulty in funding research (in general, and in such a capital-abundant period) outside the generally expected areas of a field, some corrosive politics, and the desire to succeed in each research project as if it were your last (but for the wrong reasons).
I think we can and should do better. I am working on something.
But more immediately:
I didn't quite follow. What's the reasoning for claiming this?
From the definition of the four variables, the following equivalence can be deduced:
Thanks for posting this! I had read some of their other stuff, but hadn't come across this story.
I commented on a draft of this post. I haven't re-read it in full, so I don't know to what degree my comments were incorporated. Based on a quick glance it seems they weren't, so I thought I'd copy the main comments I left on that draft. My main point is that I think inserting regional groups into the funding landscape would likely worsen rather than improve the funding situation. I still think regional groups seem promising for other reasons.
Some of my comments (copy-paste, quickly written):
[Regarding applying for funding:] At a high level, my guess would be that this solution would increase overhead and friction in distributing money, rather than reducing it. I think setting up lots of regional grantmakers is a lot of work
That said, I think regional groups can be very useful and valuable for other reasons. Just don't really think they should do grantmaking.
I'm worried about different regional groups applying inconsistent quality service, and/or inconsistent criteria in distributing money
I think we should think of ways to address the psychological issue of people being afraid, rather than building a lot of structure around this
I think [the EAIF would] have a pretty easy time setting up more scalable systems [once there is a much larger number of groups]
E.g. we could set up more standardized, faster processes for grant applications that fit certain categories that can be quickly reviewed by less senior people. The bottleneck for setting up such a system is having a sufficient number of applications for it to be worth doing
You also need to build the infrastructure for making the payments themselves efficiently, doing the financial accounting, running an entity, tax reporting, etc. – (…)
I think people routinely underestimate the time cost of running a legal entity with a lot of activity. I wish people generally try really hard to eliminate any unnecessary operational busywork. Instead, we should focus relentlessly on the EA content and promising people, and use very pragmatic fast solutions for handling admin things
They can be blissful or terrifying depending on where in the brain they occur. I thought it was pretty well understood that locality is what determines the experience, not harmonics of the seizure. Even if harmonics have something to do with it, I wouldn't say that experiences during seizures are evidence in favor of STV.
Sweet! I've messaged him.
This is a very interesting paper, and while it covers a lot of ground that I have described in the introduction, the actual cubic growth model used has a number of limitations, perhaps the most significant of which is that it considers the causal effect of an intervention to diminish over time and converge towards some inevitable state: more precisely, it assumes that P_t(S | A) - P_t(S | B) → 0 as t → ∞, where S is some desirable future state and A and B are some distinct interventions at present.
Please correct me if I am wrong about this.
However, the introduction considers not just interventions fading out in terms of their ability to influence future events but often the sheer unpredictability of them. In fact, much like I did, the idea from chaos theory is cited:
...we know on theoretical grounds that complex systems can be extremely sensitive to initial conditions, such that very small changes produce very large differences in later conditions (Lorenz, 1963; Schuster and Just, 2006). If human societies exhibit this sort of “chaotic” behavior with respect to features that determine the long-term effects of our actions (to put it very roughly), then attempts to predictably influence the far future may be insuperably stymied by our inability to measure the present state of the world with arbitrary precision.
But the model does not consider any of these cases.
In any case, by the author's own analysis (which is based on a large number of assumptions), there are several scenarios where the outcome is not favorable to the longtermist.
Again, interesting work, but this modeling framework is not very persuasive to begin with (regardless of which way the final results point).
I'm not very familiar with the Center for Data Innovation, thank you for pointing this out!
I included their response because its author is familiar with EA and the response is well reasoned. I also felt it would be healthy to include a perspective and set of concerns vastly different from my own, as the post is already biased by my choice of focus.
That being said, I haven't gotten the best impression from some of the Center for Data Innovation's research. As far as I can tell, their widely cited analysis, which projects the Act to cost €31 billion, has a flaw in its methodology that inflates the estimate. In their defense, their cost analysis is also conservative in other ways, leading to a lower number than might be reasonable.
They model the situation, run the calculation and end up with 10^-12 and then say the probability is 10^-12.
Consider that if you're aggregating expert predictions, you might be generating probabilities too soon. Instead you could for instance interview the subject-matter experts, make the transcript available to expert forecasters, and then aggregate the probabilities of the latter. This might produce more accurate probabilities.
I find most justifications and arguments made in favor of a timeline of less than 50 years to be rather unconvincing.
If we don't have convincing evidence in favor of a timeline <50 years, and we also don't have convincing evidence in favor of a timeline ≥50 years, then we just have to say that this is a question on which we don't have convincing evidence of anything in particular. But we still have to take whatever evidence we have and make the best decisions we can. ¯\_(ツ)_/¯
(You don't say this explicitly but your wording kinda implies that ≥50 years is the default, and we need convincing evidence to change our mind away from that default. If so, I would ask why we should take ≥50 years to be the default. Or sorry if I'm putting words in your mouth.)
I am simply not able to understand why we are significantly closer to AGI today than we were in the 1950s
Lots of ingredients go into AGI, including (1) algorithms, (2) lots of inexpensive chips that can do lots of calculations per second, (3) technology for fast communication between these chips, (4) infrastructure for managing large jobs on compute clusters, (5) frameworks and expertise in parallelizing algorithms, (6) general willingness to spend millions of dollars and roll custom ASICs to run a learning algorithm, (7) coding and debugging tools and optimizing compilers, etc. Even if you believe that we've made no progress whatsoever on algorithms since the 1950s, we've made massive progress in the other categories. I think that alone puts us "significantly closer to AGI today than we were in the 1950s": once we get the algorithms, at least everything else will be ready to go, and that wasn't true in the 1950s, right?
But I would also strongly disagree with the idea that we've made no progress whatsoever on algorithms since the 1950s. Even if you think that GPT-3 and AlphaGo have absolutely nothing whatsoever to do with AGI algorithms (which strikes me as an implausibly strong statement, although I would endorse much weaker versions of that statement), that's far from the only strand of research in AI, let alone neuroscience. For example, there's a (IMO plausible) argument that PGMs and causal diagrams will be more important to AGI than deep neural networks are. But that would still imply that we've learned AGI-relevant things about algorithms since the 1950s. Or as another example, there's a (IMO misleading) argument that the brain is horrifically complicated and we still have centuries of work ahead of us in understanding how it works. But even people who strongly endorse that claim wouldn't also say that we've made "no progress whatsoever" in understanding brain algorithms since the 1950s.
Sorry if I'm misunderstanding.
isn't there an infinite degree of freedom associated with a continuous function?
I'm a bit confused by this; are you saying that the only possible AGI algorithm is "the exact algorithm that the human brain runs"? The brain is wired up by a finite number of genes, right?
Note: This message came out of a conversation with u/AppliedDivinityStudies and therefore contains a mix of opinions from the two of us, even though I use "I" throughout. All mistakes can be attributed to me (An1lam) though.
Really appreciate you all running this program and writing this up! That said, I disagree with a number of the conclusions in the write-up and worry that if neither I nor anyone else speaks up with our criticisms, people will get the (in my opinion) wrong idea about bottlenecks to more longtermist entrepreneurship.
At a high level, many of my criticisms stem from my sense that the program didn't lean into the "entrepreneurship" component very hard, and as a result ended up looking a lot like typical EA activities (nothing wrong with typical EA activities).
First, I strongly disagree with the implicit conclusion that fostering LE requires lots of existing LE entrepreneurs, specifically:
Hundreds of people expressed interest in doing LE, but a very small number of these (1-3 dozen) had backgrounds in both longtermism and entrepreneurship. There were few people that we thought could pull off very ambitious projects.
And also:
Talent pool is larger than expected, but less senior.
If there existed a large pool of LE entrepreneurs with the right skills, there'd be a less pressing need for this sort of program. I get that you're wary of analogies to tech startups due to downside risk, but to the degree one wants to foster an ecosystem, taking a risk on at least some more junior people seems pretty necessary. Even within the EA ecosystem, my sense is that people who founded successful orgs often hadn't done this before. E.g., as far as I know Nick Bostrom hadn't founded an FHI 0.0 before founding the current instantiation of FHI. Same for GiveWell, CEA, etc. Given that, the notion that doing LE entrepreneurship requires "backgrounds in both longtermism and entrepreneurship" seems like too restrictive a filter.
Second, without examples it's a little hard to discuss, but I feel like the concern about downside risk is real but overblown. It's definitely an important difference between LE entrepreneurship and traditional startups to be mindful of, but I question whether it's being used to justify an extreme form of the precautionary principle that says funders shouldn't fund ideas with downside risks, instead of the (more reasonable, IMO) principle of funding +EV things or trying to ensure the portfolio of projects has positive EV.
Third, I think some of the assumptions about what types of activities should take precedence for LE entrepreneurship deserve re-examining. As I alluded to above, it seems like the activities you say matter most for LE entrepreneurship, "research, strategic thinking, forecasting long-run consequences, and introspection [rather] than finding product-market fit", are suspiciously similar to "typical EA activities". From my perspective, it could instead be interesting to try to take some of the startup gospel around iteration, getting things out into the wild sooner rather than later, etc. seriously and adapt it to LE entrepreneurship, rather than starting from the (appearance of the) assumption that you have very little to learn from that world. This isn't fully charitable, but I have the sense that EA has a lot of people who gravitate towards talking/strategizing/coordinating and getting other people to do things but sometimes shy away from "actually doing things" themselves. I view an LE entrepreneurship incubator as an opportunity to reward or push more people towards the "actually doing things" part. Part of this may also be that I'm a bit confused about where the boundary between normal and LE entrepreneurship lies. In my mind, SpaceX, fusion startups, and psychedelics research would all qualify as examples of LE entrepreneurship with limited downside risk, or at least not existential downside risks. Would you agree that these qualify as good examples?
Fourth, you mention advisors but only say a few by name. I'm 1) curious whether any of these advisors were experienced entrepreneurs and 2) interested in whether you considered getting advisors only adjacent to EA but very experienced as entrepreneurs. As an example, at least one founder of Wave is a successful EA-aligned entrepreneur who I can only imagine has wisdom to impart about entrepreneurship. I don't live in the Bay Area, but I have the sense that there are quite a few other EA-adjacent founders there who might also be interested in advising a program like this.
Fifth, this is more low-level, but I still don't really understand the skepticism of a YC-like incubator for LE entrepreneurship. It seems like your arguments boil down to 1) the current pool is small and 2) the requirements are different. But on 1, when YC started, the pool of entrepreneurs was smaller too! Such a program can help to increase the size of that pool. On 2, I agree that a literal copy of YC would have the issues you describe, but I'd imagine a YC-like program blending the two communities' thinking styles in a way that gets most of the benefits of each while avoiding the downsides. As an aside, we are also very supportive of longtermists doing YC, but for slightly different reasons. This may also be related to the confusion about what qualifies as LEE.
Summarizing, my goal in writing this comment is not just to criticize the program. Instead, I worry that by highlighting the need for experience and the overwhelming risk of harm, the write-up as-is might discourage would-be LE entrepreneurs from trying something. I hope that my comment can help provide a counterweight to that.
Thank you for writing this summary!
I wanted to share this new website about the AI Act we have set up together with colleagues at the Future of Life Institute: https://artificialintelligenceact.eu/. You can find the main text, annexes, some analyses of the proposal, and the latest developments on the site. Feel free to get in touch if you'd like to discuss the proposal or have suggestions for the website. We'd like it to be a good resource for the general public but also for people interested in the regulation more closely.
Sounds like you should cross-post it, then!
I'd recommend an excerpt + link to the full post, or sharing full text if you get Dylan's participation (I imagine he'd be happy to have his work entered in the contest for free).
Yes, thanks — we need to clarify this.
The best move here is likely an excerpt + backlink, as we've done for e.g. Vox articles, and as news organizations and blogs do all the time for things they quote. I'll clarify this in the contest rules later today. (Ideally, I'd still love to have full-text versions of things on the Forum, but I'll specify that people should ask authors for permission before going that far.)
This is great!
I'm a first-year doctor in Australia. I'm giving half my income to charity this year (https://henryach.com/workathon). Over here I'm on track to make about 95,000-100,000 AUD (~56,000-62,000 EUR) before tax for a year of ~50-hour weeks. Not quite as much as your job in Switzerland, but then this is a first-year internship wage, and residency wages are slightly higher.
Have you considered locum work? I don't know if there's much of this in Switzerland but it's big in Australia as so much of the country is rural and it's hard to entice doctors out there other than by paying them huge amounts for short stints. Hourly rates as a locum are usually double or more the usual rate. I work with a lot of UK doctors currently doing this.
How about Dylan Balfour's 'Pascal's Mugging Strikes Again'? It's great.
I think you're right - I can't find anything under "fair use" that involves pasting someone else's story onto the Forum without their permission, even if you link back to it.
I don't understand how "the exception is writing covered by copyright". All writing is covered by copyright!
The method of submitting someone else's work seems problematic if I understand it right - it sounds like a breach of copyright.
Thank you! I learned too from the examples.
One question:
In particular, that the best approach for practical rationality involves calculating things out according to each of the probabilities and then aggregating from there (or something like that), rather than aggregating first.
I am confused about this part. I think I said exactly the opposite? You need to aggregate first, then calculate whatever you are interested in. Otherwise you lose information (because eg taking the expected value of the individual predictions loses information that was contained in the individual predictions, about for example the standard deviation of the distribution, which depending on the aggregation method might affect the combined expected value).
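To make that concrete, here is a minimal sketch (with made-up numbers, and with the two aggregation methods chosen purely for illustration) of how the spread of the individual forecasts interacts with the choice of aggregation method:

```python
# Two hypothetical expert forecasts for the same event.
p = [0.01, 0.50]

# Aggregating by taking the arithmetic mean of the probabilities.
arithmetic_mean = sum(p) / len(p)  # 0.255

# Aggregating by taking the geometric mean of the odds, then converting back.
odds = [x / (1 - x) for x in p]
geometric_mean_odds = (odds[0] * odds[1]) ** 0.5
geometric_aggregate = geometric_mean_odds / (1 + geometric_mean_odds)  # ~0.09

# The two methods disagree substantially precisely because the individual
# forecasts are spread far apart; collapsing them into a single summary too
# early throws that information away.
print(arithmetic_mean, geometric_aggregate)
```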
What am I not seeing?
A tag is probably enough, but you could also maybe ask people to put some copy about the contest at the top of each submission?
Why hide stuff from newbies? They are here to see the forum, and this is a cool EA thing happening on the forum.
Excellent overview, and I completely agree that the AI Act is an important policy for AI governance.
One quibble: as far as I know, the Center for Data Innovation is just a lobbying group for Big Tech - I was a little surprised to see it listed in "public responses from various EA and EA Adjacent organisations".
I agree with a lot of this. In particular, that the best approach for practical rationality involves calculating things out according to each of the probabilities and then aggregating from there (or something like that), rather than aggregating first. That was part of what I was trying to show with the institution example. And it was part of what I was getting at by suggesting that the problem is ill-posed — there are a number of different assumptions we are all making about what these probabilities are going to be used for and whether we can assume the experts are themselves careful reasoners etc. and this discussion has found various places where the best form of aggregation depends crucially on these kinds of matters. I've certainly learned quite a bit from the discussion.
I think if you wanted to take things further, then teasing out how different combinations of assumptions lead to different aggregation methods would be a good next step.
I see what you mean, though you will find that scientific experts often end up endorsing probabilities like these. They model the situation, run the calculation and end up with 10^-12 and then say the probability is 10^-12. You are right that if you knew the experts were Bayesian and calibrated and aware of all the ways the model or calculation could be flawed, and had a good dose of humility, then you could read more into such small claimed probabilities — i.e. that they must have a mass of evidence they have not yet shared. But we are very rarely in a situation like that. Averaging a selection of Metaculus forecasters may be close, but is quite a special case when you think more broadly about the question of how to aggregate expert predictions.
This talk and paper discusses what I think are some of your concerns about growing uncertainty over longer and longer horizons.
Hey, can you manage the project on github and, like, make issues and break up the stuff you have planned into chunks? That way, people can help out with stuff if they have time. Or maybe you can look for someone else who is interested in working on this?
Several good points made by Linch, Aryeh, and steve2152.
As for making my skepticism more precise in terms of probability, it's less about me having a clear sense of timeline predictions that are radically different from those who believe that AGI will explode upon us in the next few decades, and more about the fact that I find most justifications and arguments made in favor of a timeline of less than 50 years to be rather unconvincing.
For instance, having studied and used state-of-the-art deep learning models, I am simply not able to understand why we are significantly closer to AGI today than we were in the 1950s. General intelligence requires something qualitatively different from GPT-3 or AlphaGo, and I have seen literally zero evidence that any AI systems comprehend things even remotely close to how humans operate.
Note that the last point (namely, that AI should understand objects, events, and relations like humans do) is not as such a requirement for AGI, but it does make me skeptical of people who cite these examples as evidence of the progress we've made towards such a general intelligence.
I have looked at Holden's post and there are several things that are not clear to me. Here is one: there appears to be a lot of focus on the number of computations, especially in comparison to the human brain, and while I have little doubt that artificial systems would surpass those limitations (if they have not already done so), the real question is decoding the nature of the wiring and the functional form of the relation between the inputs and outputs. Perhaps there is something I am not getting here but (at least in principle) isn't there an infinite degree of freedom associated with a continuous function? Even if one argued that we can define equivalence classes of similar functions (made rigorous), does that still not leave us with an extremely large number of possibilities?
You're completely correct about a couple of things, and not only am I not disputing them, they are crucial to my argument: first, that I am focusing on only one side of the distribution, and second, that the scenarios I am referring to (the WW2 counterfactual or nuclear war) are improbable.
Indeed, as I have said, even if the probability of the future scenarios I am positing is of the order of 0.00001 (which makes it improbable), that can hardly be the grounds to dismiss the argument in this context simply because longtermism appeals precisely to the immense consequences of events whose absolute probability is very low.
At the risk of quoting out of context:
If we increase the odds of survival at one of the filters by one in a million, we can multiply one of the inputs for C by 1.000001.
So our new value of C is 0.01 x 0.01 x 1.000001 = 0.0001000001
New expected time remaining for civilization = M x C = 10,000,010,000
In much the same way, it's absolutely correct that I am referring to one side of the distribution; however, it is not because the other side does not exist or is not relevant, but rather because I want to highlight the magnitude of uncertainty and how it expands with time.
It follows also that I am in no way disputing (and my argument is somewhat orthogonal to) the different counterfactuals for WW2 you've outlined.
This seems correct and a valid point to keep in mind - but it cuts both ways. It makes sense to reduce your credence when you recognize that expert judgment here is less informed than you originally thought. But by the same token, you should probably reduce your credence in your own forecasts being correct, at least to the extent that they involve inside view arguments like, "deep learning will not scale up all the way because it's missing xyz." The correct response in this case will depend on how much your views depend on inside view arguments about deep learning, of course. But I suspect that at least for a lot of people the correct response is to become more agnostic about any timeline forecast, their own included, rather than to think that since the experts aren't so reliable here, therefore I should just trust my own judgement.
most contemporary progress on AI happens by running base-optimizers which could support mesa-optimization
GPT-3 is of that form, but AlphaGo/MuZero isn't (I would argue).
I'm not sure how to settle whether your statement about "most contemporary progress" is right or wrong. I guess we could count how many papers use model-free RL vs model-based RL, or something? Well anyway, given that I haven't done anything like that, I wouldn't feel comfortable making any confident statement here. Of course you may know more than me! :-)
If we forget about "contemporary progress" and focus on "path to AGI", I have a post arguing against what (I think) you're implying at Against evolution as an analogy for how humans will create AGI, for what it's worth.
Ideally we'd want a method for identifying valence which is more mechanistic than mine, in the sense that it lets you identify valence in a system just by looking inside the system, without looking at how it was made.
Yeah I dunno, I have some general thoughts about what valence looks like in the vertebrate brain (e.g. this is related, and this) but I'm still fuzzy in places and am not ready to offer any nice buttoned-up theory. "Valence in arbitrary algorithms" is obviously even harder by far. :-)
Hmm, I sort of agree with this. When I run back-of-the-envelope calculations on the value of information you can gain from "gold standard" studies or models on questions of potential interest in developed-world contexts (e.g., high-powered studies of zinc for common cold symptoms, modeling how better ventilation can stop airborne disease spread at airports, some stuff on social platforms/infrastructures for testing vaccines, maybe some stuff on chronic fatigue), it naively seems like high-quality but simple research (but not implementation) on developed-world health (including but not limited to the traditional purview of public health) is plausibly competitive with GiveWell-style global health charities even after accounting for the 100x-1000x multiplier.
I think the real reason people don't do this more is because we're limited more here on human capital than on $s. In particular, people with a) deep health backgrounds and b) strong EA alignment have pretty strong counterfactuals in working or attempting to work on either existential biorisk reduction or public health research for developing world diseases, both of which are probably more impactful (for different reasons).
Awesome work! I remember when Ivan mentioned your project to me. Really cool to see it come to fruition. I like the idea of a central data repository and would benefit from it. I think that having an accompanying visualisation like this could add value to the annual EA survey data.
I also think that creating data visualisations could help to increase the dissemination and impact of EA research. I'd like to see more work there too.
That's a solid idea. We could also set a default filter alongside the default Personal Blog filter, so that newcomers don't see the fiction unless they choose to see it (though they'll still be able to see it at the tag page if someone links them to it). I'll talk to the tech team and see if that's reasonable.
In the couple of past cases where people have shared fiction here, it's been on the frontpage and people haven't generally seemed to mind.
Presumably we are expecting a much higher volume than in the past. It might be a bit strange for newcomers to the movement who, expecting to find a forum for serious idea discussion, instead find themselves on a strange version of AO3.
edit: perhaps entrants should have [Creative Writing Entry] as the start of their title, so it is easy to distinguish on the frontpage?
In the couple of past cases where people have shared fiction here, it's been on the frontpage and people haven't generally seemed to mind. It's also quite easy to filter out all the submissions if you want — just do this:

I have generally been quite skeptical about the view that we are on the cusp of a revolution that will lead us to artificial general intelligence in the next 50 years or so.
Can you clarify what you mean by this? Does "quite skeptical" mean
I think there is <20% probability that we'll have AGI in <50 years
or
I think there is <1% probability that we'll have AGI in <50 years
or
I think there is <0.01% probability that we'll have AGI in <50 years
or something else?
Language is quite imprecise, numbers can't resolve uncertainty in the underlying phenomenon, but they help a lot in clarifying and making the strength of your uncertainty more precise.
I feel like the main reasons you shouldn't trust forecasts from subject matter experts are something like:
So like you and steve2152 I'm at least somewhat skeptical of putting too much faith in expert forecasts.
However, in contrast, I feel like a lack of theoretical understanding of current ML can't be that strong evidence against trusting experts here, for the very simple reason that, by conservation of expected evidence, this would imply that we ought to place more trust in forecasts from experts who do have a theoretical understanding of their models. And this seems wrong because (among other reasons) it would've been wrong 50 years ago to trust experts on GOFAI for their AI timelines!
First of all, I'm really excited for this contest! Using fiction to communicate EA messages has always seemed a priori plausible to me (along the lines of, e.g., 4.2 here), and I'm excited to see various different takes on it!
Certainly the success of introductions like HPMOR lends additional nontrivial evidence to this theory, so I'm excited to see more experiments like this one and others.
Secondly, really cool that CEA is taking the initiative to encourage these things.
How do I submit content?
All stories must be published on the EA Forum and tagged with Creative Writing Contest.
We want lots of people to read and discuss your submissions — we think the Forum will be a really fun place if good stories start showing up. However, we won’t use upvotes or comments as part of our process for choosing a winner.
If you’re wary of sharing your work in public, remember that winning the contest guarantees your work being shared in public (with many, many people). That said, you are welcome to use a pseudonym if you’d prefer!
I think I personally will have a preference for fiction to not show up as top-level posts on the Forum, unless they've been previously vetted as unusually good or they're unusually culturally significant. But obviously a) different people have different tastes, and b) this is your forum!
Hi Denis, thank you for your message and your offer to contribute; it is welcome. Since we are just starting out, we still haven't built all the capabilities we have envisioned. For example, and as mentioned above, we were planning a list of tractable problems and project ideas to guide potential project leaders, as well as a list of past/failed projects and lessons learned, to ensure the community as a whole is not just spinning its wheels (e.g., this metaproject has had similar iterations in the past...). But your idea for a wiki that not only provides problem areas and project ideas but also provides thought-through analyses, roadmaps, required-skills lists, available resources, and community input is a huge improvement over our current plan. So I don't think the issue of not having project leaders identified upfront would be a critical problem, as long as you're OK with your wiki being separate from the project database. Ideally, entrepreneurial EAs will find your project write-ups on Impact CoLabs and then create a project from them (or people whose ideas are screened out of the platform as low-impact can be directed to those pre-vetted ideas).
We definitely want the ultimate version of Impact CoLabs to be the central node for project creation, and we want the resources we provide to reflect that. The goal is to be a higher-volume, lower-touch, top-of-the-funnel solution compared to incubators/accelerators like Charity Entrepreneurship or other upcoming startup factories. But even if we are not going to shepherd projects personally and diligently, that doesn't mean we can't try to provide as detailed and well-researched guidance as possible.
My only slight hesitation about your approach is the effort involved in development and upkeep: we would rather offer a lower-value solution (just a list of ideas) that we can guarantee will be maintained than a higher-value solution (a detailed wiki with required fields for each project idea) that has a large chance of being abandoned after a while. So it all depends on volunteer interest in contributing and/or how we set it up. We would love to chat about this more. If you want to take this offline, we recommend filling out our new team member form so we can give you more background info on the project, or alternatively you can just email info@impactcolabs.com.
Have you read https://www.cold-takes.com/where-ai-forecasting-stands-today/ ?
I do agree that there are many good reasons to think that AI practitioners are not AI forecasting experts, such as the fact that they're, um, obviously not—they generally have no training in it and have spent almost no time on it, and indeed they give very different answers to seemingly-equivalent timelines questions phrased differently. This is a reason to discount the timelines that come from AI practitioner surveys, in favor of whatever other forecasting methods / heuristics you can come up with. It's not per se a reason to think "definitely no AGI in the next 50 years".
Well, maybe I should just ask: What probability would you assign to the statement "50 years from today, we will have AGI"? A couple examples:
In that example, Alice has ~5 min of time to give feedback to Bob; in Toby's case the senior researchers are (in aggregate) spending at least multiple hours providing feedback (where "Bob spent 15 min talking to Alice and seeing what she got excited about" counts as 15 min of feedback from Alice). That's the major difference.
I guess one way you could interpret Toby's advice is to simply get a project idea from a senior person, and then go work on it yourself without feedback from that senior person -- I would disagree with that particular advice. I think it's important to have iterative / continual feedback from senior people.
This sounds like a really valuable project!
I’ve been thinking about helping to set up some sort of EA incubator ecosystem. My contribution could be to collect, organize, prioritize, and roadmap all the project ideas that are floating around. I’d apply some sort of process along the lines of that of Charity Entrepreneurship but with a much more longtermist focus. I’ve been envisioning this in the form of a wiki with a lot of stub articles for project ideas that didn’t pass the shallow review phase and a few comprehensive articles that compile (1) detailed thinking on robustness, importance, tractability, etc.; (2) notes from interviews with domain experts; (3) a roadmap for how the project might be realized; (4) descriptions of the sorts of skills and resources it will require; (5) talent, funding, and other buy-in that is maybe already interested; and (6) a comment section for discussions. (Jan’s process could be part of this too.) Since this would take the format of a wiki, I could easily add other editors to contribute to it too. I wouldn’t make it fully publicly editable though. Ideally, there’d also be a forum post for each top project that is automatically updated when the wiki changes and whose comments are displayed on the wiki page too.
My main worry is that the final product will just collect dust until it is hopelessly outdated.
So I’ve been wondering whether there are maybe synergies here, e.g., along the lines where I do the above, and your platform can in the end reduce the risk that nothing ever comes of the top project ideas?
I’ve only spot-checked a few of your current projects, but it seems to me that they typically have project owners, whereas my projects would typically start out with no one doing them and at most vague buy-in of the sort “People X and Y are tentatively interested in funding such a project, and person Z has considered starting this project but is now working on something else because they couldn’t find a cofounder.” Do you think that would be a critical problem?
Nice, thanks for that info! I'll check out that post soon, and might reach out to you with questions at some point.
Informative and well-reasoned talk, thank you. One thrust of this talk is encouraging fellow humans to do research into expanding moral patienthood to invertebrate animals. Any suggestions or resources for going beyond the presumed human baseline of moral patienthood within the animal kingdom, or even beyond life at all, as panpsychists do?
For decades I've struggled with the seeming intractability, given current technology limitations, of minimizing harm to life in general given the need for energy to just survive, and I've long presumed that using criteria that focus on the animal kingdom (let alone wild animals) is reflective of a deeper speciesism we would do well to eliminate, and quite likely a "crucial consideration." So, eager to learn more or receive feedback.
This seems to be an issue of only considering one side of the possibility distribution. I think it’s very arguable that a post-nuclear-holocaust society is just as likely, if not more likely, to be more racist/sexist, more violent or suspicious of others, more cruel to animals (if only because our progress in, e.g., lab-grown meat will be undone), etc. in the long term. This is especially the case if history just keeps going through cycles of civilizational collapse and rebuilding, in which case we might have to suffer for hundreds of thousands of years (and subject animals to that many more years of cruelty) until we finally develop a civilization that is capable of maximizing human/sentient flourishing (assuming we don’t go extinct!).
You cite the example of post-WW2 peace, but I don’t think it’s that simple:
there were many wars afterwards (e.g., the Korean War, Vietnam); they just weren’t as global in scale. Thus, WW2 may have been more of a peak outlier at a unique moment in history.
It’s entirely possible WW2 could have led to another, even worse war—we just got lucky. (Consider how people thought WW1 would be the war to end all wars because of its brutality, only for WW2 to follow a few decades later.)
Inventions such as nuclear weapons, the strengthening of the international system in terms of trade and diplomacy, the disenchantment with fascism/totalitarianism (with the exception of communism), and a variety of other factors seem to have helped prevent a WW3; the brutality of WW2 was not the only factor.
Ultimately, I still think the argument that seemingly horrible things like nuclear holocausts (or the Holocaust) or world wars are more likely to produce good outcomes in the long term is generally improbable. (I just wish someone more familiar with longtermism would contribute.)
For more on "Example intervention: Funding EAs to work at think tanks", see here. That post and those notes are specific to the US system; I'm not sure it would work (or at least work the same way) in other systems. Think tanks are also much bigger parts of the policy research ecosystem in the US than in other countries. I'm a big fan of this model, but I'm not sure anyone has checked whether it could work outside of the US context.
A couple of other caveats:
Think tanks tend to have more flexibility than academia in what they write about, as their reports don’t have to pass peer-review, fit into established journals, etc.
I don't think this is true. Think tank researchers indeed face fewer journal/peer review constraints, but they have some additional ones, especially perceptions of policy relevance. There are academic journals/conferences for most topics, but you're going to have a hard time finding a think tank interested in speculative longtermist research. My guess is a large majority (probably >75%) of EA researchers (even those who would self-identify as being interested in "policy") would have a rather hard time with think tank constraints.
apparently some (or many?) think tanks are able and willing to essentially just accept funding for a specific person to work on a specific topic (with the funder deciding on the person and the topic).
From a think tank perspective, there is a big difference between flexible individual-level funding and individual-level funding to work on a specific topic from a specific perspective. Most think tanks are very sensitive about the optics of being "bought" by outside interests. They're fine with outside funding and eager for free labor, but I think many (especially reputable/high-quality) think tanks would not want to accept someone who comes in saying "I come to you from X funder and they want me to write Y and Z." The easiest way to get around this issue is joining a think tank that has overlapping interests (e.g. if you want to work on nuclear nonproliferation, you can join the Nuclear Threat Initiative or the Arms Control Association teams already working on that issue).
Thanks, David. In light of this comment, I now lean towards renaming the entry resilient food. Michael, what do you think?
Draft and re-draft (and re-draft). The writing should go through many iterations. You make drafts, you share them with a few people, you do something else for a week. Maybe nobody has read the draft, but you come back and you’ve rejuvenated your wonderful capacity to look at the work and know why it’s terrible.
Kind of related to this: giving a presentation about the ideas in your article is something that you can use as a form of a draft. If you can't get anyone to listen to a presentation, or don't want to give one quite yet, you can pick some people whose opinion you value and just make a presentation where you imagine that they're in the audience.
I find that if I'm thinking of how to present the ideas in a paper to an in-person audience, it makes me think about questions like "what would be a concrete example of this idea that I could start the presentation with, that would grab the audience's attention right away". And then if I come up with a good way of presenting the ideas in my article, I can rewrite the article to use that same presentation.
(Unfortunately, I myself have mostly taken this advice in its reverse form. I've first written a paper and then given a presentation of it afterwards, at which point I've realized that this is actually what I should have said in the paper itself.)
I'm confused about your FAQ's advice here. Some quotes from the longer example:
Let’s say that Alice is an expert in AI alignment, and Bob wants to get into the field, and trusts Alice’s judgment. Bob asks Alice what she thinks is most valuable to work on, and she replies, “probably robustness of neural networks”. [...] I think Bob should instead spend some time thinking about how a solution to robustness would mean that AI risk has been meaningfully reduced. [...] It’s possible that after all this reflection, Bob concludes that impact regularization is more valuable than robustness. [...] It’s probably not the case that progress in robustness is 50x more valuable than progress in impact regularization, and so Bob should go with [impact regularization].
In the example, Bob "wants to get into the field", so this seems like an example of how junior people shouldn't defer to experts when picking research projects.
(Speculative differences: Maybe you think there's a huge difference between Alice giving a recommendation about an area vs. a specific research project? Or maybe you think that working on impact regularization is the best Bob can do if he can't find a senior researcher to supervise him, but that if Alice could supervise his work on robustness, he should go with robustness? If so, maybe it's worth clarifying that in the FAQ.)
Edit: TBC, I interpret Toby Shevlane as saying ~you should probably work on whatever senior people find interesting; while Jan Kulveit says that "some young researchers actually have great ideas, should work on them, and avoid generally updating on research taste of most of the 'senior researchers'". The quoted FAQ example is consistent with going against Jan's strong claim, but I'm not sure it's consistent with agreeing with Toby's initial advice, and I interpret you as agreeing with that advice when writing e.g. "Defer to experts for ~3 years, then trust your intuitions".
That's beautiful! Thanks for creating the website and for this interesting writeup :)
Thanks for the response. I believe I understand your objection but it would be helpful to distinguish the following two propositions:
a. A catastrophic risk in the next few years is likely to be horrible for humanity over the next 500 years.
b. A catastrophic risk in the next few years is likely to leave humanity (and other sentient agents) worse off in the next 5,000,000 years, all things considered.
I have no disagreement at all with the first but am deeply skeptical of the second. And that's where the divergence comes from.
The example of a post-nuclear generation being animal-right sensitive is just one possibility that I advanced; one may consider other areas such as universal disarmament, open borders, end of racism/sexism. If the probability of a more tolerant humanity emerging from the ashes of a nuclear winter is even 0.00001, then from the perspective of someone looking back 100,000 years from now, it is not very obvious that the catastrophic risk was bad, all things considered.
For example, whatever the horrors of WW2 may have been, the sustained relative peace and prosperity of Europe since 1945 owes a great deal to the war. In addition, the widespread acknowledgement of norms and conventions around torture and human rights is partly a consequence of the brutality of the war. That, of course, is far from enough to conclude that the war was a net positive. However, 5,000 years into the future, are you sure that in the majority of scenarios, in retrospect, WW2 would still be a net negative event?
In any case, I have added this as well in the post:
If a longtermist were to state that the expected number of lives saved in T (say 100,000) years is N (say 1,000,000), that the probability of saving at least M (say 10,000) lives is 25%, and that the probability of causing more deaths (or harm engendered) is less than 1%, all things considered (i.e., counterfactuals with opportunity cost), then I’ll put all this aside and join the club!
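Put a bit more formally (this is just my restating of the criterion above, with the same placeholder numbers T = 100,000, N = 1,000,000, M = 10,000):

$$
\mathbb{E}[\text{lives saved within } T \text{ years}] \ge N, \qquad
\Pr(\text{at least } M \text{ lives saved}) \ge 0.25, \qquad
\Pr(\text{net harm, all things considered}) < 0.01 .
$$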
Thanks for linking to that article, which I hadn't seen. I updated the 'certificates of impact' entry with a brief summary of the proposal.
(As an aside, I read your FAQ and enjoyed it, so thanks for the share!)
I agree on the challenges of deploying results. I think the primary value in public health research is empowering individuals to make good decisions for themselves. For example, sites like WebMD and Healthline add a lot of value for individuals trying to improve their families' health. I don't think the answer is already out there on obesity and many other chronic diseases. If it is, I would appreciate someone directing me to it. :)
(Note: I'm not well-steeped in the longtermism literature, so don't look to me as some philosophical ambassador; I'm only commenting since I hadn't seen any other answers yet.)
I get lost with your argument when you say "the standard deviation as a measure of uncertainty [...] could be so large that the coefficient of variation is very small". What is the significance/meaning of that?
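For reference (this is just the textbook definition, so possibly not what you have in mind), the coefficient of variation is the standard deviation scaled by the mean:

$$
c_v = \frac{\sigma}{\mu},
$$

so a large $\sigma$ can coexist with a small $c_v$ only if the mean $\mu$ is much larger still.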
I read your Medium post and I think I otherwise understand the general argument (and even share similar concerns at times). However, my response to the argument you lay out there would mainly be as follows: yes, it technically is possible that a civilizational nuclear reset could lead to good outcomes in the long term, but it's also highly improbable. In the end, we have to weigh what is more plausible, and while there will be a lot of uncertainty, it isn't fair to characterize every situation as purely/symmetrically uncertain, and one of the major goals of longtermism is to seek out these cases where it seems that an intervention is more likely to help than to hurt in the long term.
One of the major examples I've heard longtermists talk about is reducing x-risk. You seem to take issue with this point, but I think the reasoning behind that objection is tenuous at best. More specifically, consider the example you give regarding "what if nuclear reset leads to a society that is so 'enlightened... that they no longer farm animals for food.'" Does it seem more plausible that nuclear reset will lead to an enlightened society or a worse society (and enormous suffering in the process)? As part of this, consider all the progress our current society has made in the past ~60 years regarding things like lab-grown meat and veganism—and how much progress in these and other fields would be lost in such a scenario. In this case, it seems far more plausible that preventing a nuclear holocaust will be better for the long-term future.
What about this to reduce the probably often overwhelming stigma attached to showcasing one's own donations?!
Re comment 1: Yes, sorry, this was just meant to point at a potential parallel, not to work out the parallel in detail. I think it'd be valuable to work out the potential parallel between the DM agent's predicate predictor module (Fig. 12, p. 14) and my factored-noxiousness-object-detector idea. I just took a brief look at the paper to refresh my memory, but if I'm understanding it correctly, it seems to me that this module predicts which parts of the state prevent goal realization.
I guess what I don't understand is how the "predicate predictor" thing can make it so that the setup is less likely to yield models that support morally relevant valence (if you indeed think that). Suppose the environment is modified such that the observation that the agent gets in each time step includes the value of every predicate in the reward specification. That would make the "predicate predictor" useless (I think; just from a quick look at the paper). Would that new setup be more likely than the original to yield models that have morally relevant valence?
Hmm okay! Thanks so much for this. So I suppose the main uncertainties for me are
Really appreciate you helping clarify this for me!
Right! Thanks, I've fixed it!
Thanks for your answer! I agree it's strange that these kinds of formalities are still so much of a thing among otherwise egalitarian people.
Thanks! That's super helpful.
Great post! 🙂
One question: I was not able to understand the total donations calculation here:
"Difference: EUR 865.25/month 12x865.25 = EUR 10,383 13th salary = EUR 4,834.25 Total difference: EUR 15,217.25 Total donations: EUR 43’592.09"
Can you help me understand how you calculated total donations here? Is it an annual number?
This is super helpful. Thank you!
I think it'd be good for someone to read/skim the relevant 80k article and write some entry text based on that.
I think it'd also be good to list/discuss EA, EA-adjacent, or especially-EA-relevant think tanks, such as Rethink Priorities, CSET, and NTI.
I essentially always just use first name, including CEOs or professors. I actually find it quite strange how insistent some otherwise extremely egalitarian people are on the use of professional titles as a mark of social status.
For actual nobility I guess I might use titles.
Hi Peter,
Thanks for creating these entries. My sense is that Scheffler doesn't satisfy the criteria for inclusion. Thoughts?
This may be a good opportunity to mention that although I spent quite a bit of time thinking about these criteria, I'm still rather uncertain and am open to adopting a more inclusivist approach to entries for individual people. If you have any views on what the criteria should be, feel free to share them here.
I think the analogy to humans suggests otherwise. Suppose a human feels pain in their hand due to touching something hot. We can regard all the relevant mechanisms in their body outside the brain—those that cause the brain to receive the relevant signal—as mechanisms that have been "factored out from the brain". And yet those mechanisms are involved in morally relevant pain. In contrast, suppose a human touches a radioactive material until they realize it's dangerous. Here there are no relevant mechanisms that have been "factored out from the brain" (the brain needs to use ~general reasoning); and there is no morally relevant pain in this scenario.
Though if "factoring out stuff" means that smaller/less-capable neural networks are used, then maybe it can reduce morally relevant valence risks.