I just heard about this via a John Green video, and immediately came here to check whether it'd been discussed. Glad to see that it's been posted -- thanks for doing that! (Strong-upvoted, because this is the kind of thing I like to see on the EA forum.)
I don't have the know-how to evaluate the 100x claim, but it's huge if true. Hopefully, if it pops up on the forum now and then -- especially as more evidence comes in from the organization's work -- we'll eventually get the right people to evaluate it as an opportunity.
I think this is a good point; you may also be interested in Michelle's post about beneficiary groups, my comment about beneficiary subgroups, and Michelle's follow-up about finding more effective causes.
Thanks Tobias.
In a hard / unexpected takeoff scenario, it's more plausible that we need to get everything more or less exactly right to ensure alignment, and that we have only one shot at it. This might favor HRAD because a less principled approach makes it comparatively unlikely that we get all the fundamentals right when we build the first advanced AI system.
FWIW, I'm not ready to cede the "more principled" ground to HRAD at this stage; to me, it seems like the distinction is more about which aspects of an AI system's behavior we're specif...
Thanks for these thoughts. (Your second link is broken, FYI.)
On empirical feedback: my current suspicion is that there are some problems where empirical feedback is pretty hard to get, but I actually think we could get more empirical feedback on how well HRAD can be used to diagnose and solve problems in AI systems. For example, it seems like many AI systems implicitly do some amount of logical-uncertainty-type reasoning (e.g. AlphaGo, which is really all about logical uncertainty over the result of expensive game-tree computations) -- maybe HRAD could be ...
My guess is that the capability is extremely likely, and the main difficulties are motivation and reliability of learning (since in other learning tasks we might be satisfied with lower reliability that gets better over time, but in learning human preferences unreliable learning could result in a lot more harm).
I am very bullish on the Far Future EA Fund, and donate there myself. There's one other possible nonprofit that I'll publicize in the future if it gets to the stage where it can use donations (I don't want to hype this up as an uber-solution, just a nonprofit that I think could be promising).
I unfortunately don't spend a lot of time thinking about individual donation opportunities, and the things I think are most promising often get partly funded through Open Phil (e.g. CHAI and FHI). That said, I think diversifying the funding sources of orgs like CHAI and FHI is valuable, so I'd consider them as well.
I think there's something to this -- thanks.
To add to Jacob and Paul's comments: while HRAD is more mature in the sense that more work has gone into solving HRAD problems and critiquing possible solutions, the gap seems much smaller to me when it comes to the justification for thinking HRAD is promising vs. the justification for thinking Paul's approach is promising. In fact, I think the arguments for Paul's work being promising are more solid than those for HRAD, despite Paul being the only one making them -- I've had a much harder time understanding anything more nuanced than the basic case for HRAD I gave above, and a much easier time understanding why Paul thinks his approach is promising.
...My perspective on this is a combination of “basic theory is often necessary for knowing what the right formal tools to apply to a problem are, and for evaluating whether you're making progress toward a solution” and “the applicability of Bayes, Pearl, etc. to AI suggests that AI is the kind of problem that admits of basic theory.” An example of how this relates to HRAD is that I think that Bayesian justifications are useful in ML, and that a good formal model of rationality in the face of logical uncertainty is likely to be useful in analogous ways. When
Thanks Nate!
The end goal is to prevent global catastrophes, but if a safety-conscious AGI team asked how we’d expect their project to fail, the two likeliest scenarios we’d point to are "your team runs into a capabilities roadblock and can't achieve AGI" or "your team runs into an alignment roadblock and can easily tell that the system is currently misaligned, but can’t figure out how to achieve alignment in any reasonable amount of time."
This is particularly helpful to know.
...We worry about "unknown unknowns", but I’d pro
I'm going to try to answer these questions, but there's some danger that I could be taken as speaking for MIRI or Paul or something, which is not the case :) With that caveat:
I'm glad Rob sketched out his reasoning on why (1) and (2) don't play a role in MIRI's thinking. That fits with my understanding of their views.
...(1) You might think that "learning to reason from humans" doesn't accomplish (1) because a) logic and mathematics seem to be the only methods we have for stating things with extremely high certainty, and b) you probably can't rule
Thanks for linking to that conversation -- I hadn't read all of the comments on that post, and I'm glad I got linked back to it.
Thanks!
Conditional on MIRI's view that a hard or unexpected takeoff is likely, HRAD is more promising (though it's still unclear).
Do you mean more promising than other technical safety research (e.g. concrete problems, Paul's directions, MIRI's non-HRAD research)? If so, I'd be interested in hearing why you think hard / unexpected takeoff differentially favors HRAD.
Thanks Tara! I'd like to do more writing of this kind, and I'm thinking about how to prioritize it. It's useful to hear that you'd be excited about those topics in particular.
Welcome! :)
I think your argument totally makes sense, and you're obviously free to use your best judgement to figure out how to do as much good as possible. However, a couple of other considerations seem important, especially when it comes to claims about what a "true effective altruist" would do.
1) One factor in your impact is your ability to stick with your giving; this could give you a reason to adopt something less scary and demanding. By analogy, it might seem best for fitness to commit to intense workouts 5 days a week, strict diet changes, and no alcoho...
Re: donation: I'd personally feel best about donating to the Long-Term Future EA Fund (not yet ready, I think?) or the EA Giving Group, both managed by Nick Beckstead.
Thanks for recommending a concrete change in behavior here!
I also appreciate the discussion of your emotional engagement / other EAs' possible emotional engagement with cause prioritization -- my EA emotional life is complicated, I'm guessing others have a different set of feelings and struggles, and this kind of post seems like a good direction for understanding and supporting one another.
ETA: personally, it feels correct when the opportunity arises to emotionally remind myself of the gravity of the ER-triage-like decisions that humans have to make when a...
I agree that if engagement with the critique doesn't follow those words, they're not helpful :) Editing my post to clarify that.
The pledge is really important to me as a part of my EA life and (I think) as a part of our community infrastructure, and I find your critiques worrying. I'm not sure what to do, but I appreciate you taking the critic's risk to help the community. Thank you!
This is a great point -- thanks, Jacob!
I think I tend to expect more from people when they are critical -- i.e. I'm fine with a compliment/agreement that someone spent 2 minutes on, but expect critics to "do their homework", and if a complimenter and a critic were equally underinformed/unthoughtful, I'd judge the critic more harshly. This seems bad!
One response is "poorly thought-through criticism can spread through networks; even if it's responded to in one place, people cache and repeat it other places where it's not responded to, and that...
Thanks!
I think parts of academia do this well (although other parts do it poorly, and I think it's been getting worse over time). In particular, if you present ideas at a seminar, essentially arbitrarily harsh criticism is fair game. Of course, this is different from the public internet, but it's still a group of people, many of whom do not know each other personally, where pretty strong criticism is the norm.
One guess is that ritualization in academia helps with this -- if you say something in a talk or paper, you ritually invite criticism, whereas I'...
Prediction-making in my Open Phil work does feel like progress to me, because I find making predictions and writing them down difficult and scary, indicating that I wasn't doing that mental work as seriously before :) I'm quite excited to see what comes of it.
I have very mixed feelings about Sarah's post; the title seems inaccurate to me, and I'm not sure about how the quotes were interpreted, but it's raised some interesting and useful-seeming discussion. Two brief points:
I'm really glad you posted this! I've found it helpful food for thought, and I think it's a great conversation for the community to be having.
For many Americans, income taxes might go down; probably worth thinking about what to do with that "extra" money.
Thanks for mentioning this -- I totally see what you're pointing at here, and I think you make valid points re: there always being more excuses later.
I just meant to emphasize that "giving now feels good" wasn't something I was prepared to justify in terms of its actual impact on the world; if I found out that this good feeling was justified in terms of impact, that'd be great, but if it turned out that I could give up that good feeling in order to have a better impact, I'd try my best to do so.
Thanks Milan!
I haven't thought a lot about that, and might be making the wrong call. Off the top of my head:
I guess I could put my 10% toward debt reduction instead -- if you or anyone else has ...
I was glad to see this article -- I think it's a very interesting issue, and generally want to encourage people to bring up this kind of thing so that we can continue to look for more effective causes and beneficiary groups. Nice work!
I didn't find the presentation unpleasant, personally, but I have a high tolerance for being opinionated, and it's been helpful to see others' reactions in the comments.
Since the groups above seem to exhaust the space of beneficiaries (if what we care about is well-being), we can’t expect to get more effectiveness improvements in this way. In future, such improvements will have to come from finding new interventions, or intervention types.
Though I think the conclusion may well be correct, this argument doesn't seem valid to me. Thinking about it more produced some ideas I found interesting.
Imagine that we instead had only one group of beneficiaries: all conscious beings. We could run the same argument -- this group exh...
Some of the most significant insights of effective altruism in terms of finding more effective ways to help others have come from highlighting different beneficiary groups.
This makes me want to split off "people in extreme poverty" into a distinct group of beneficiaries -- I suspect that for many the "aha!" moment in their EA journey was realizing that these people exist and can be helped. Also, it seems to me that the interventions available for helping people in extreme poverty are quite different from interventions that help riche...
Has anyone here seen any good analyses of helping Syrian refugees as a cause area, or the most effective ways to do it? I've seen some commentary on opening borders and some general tips on disaster relief from GiveWell, but not much beyond that. Thanks!
Thanks! :) After our conversation Owen jumped right into the write-up, and I pitched in with the javascript -- it was fun to just charge ahead and execute a small idea like this.
It's true that this calculator doesn't take into account the field-steering or paradigm-defining effects of early research, nor the distinction between inherently serial and parallelizable work. These might be interesting to incorporate into a future model, at some risk of over-complicating what will always be a pretty rough estimate.
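For what it's worth, the seriality point could probably be handled with an Amdahl's-law-style assumption. Here's a minimal sketch of what that might look like, with made-up numbers -- to be clear, this is my own toy model, not how the current calculator works:

```typescript
// Toy model (assumptions entirely mine, not what the calculator does):
// split the remaining work into an inherently serial part and a
// parallelizable part; extra researchers only speed up the latter.
function yearsToFinish(
  serialYears: number,   // work that can't be parallelized
  parallelYears: number, // work that divides cleanly across people
  researchers: number
): number {
  return serialYears + parallelYears / researchers;
}

console.log(yearsToFinish(5, 20, 1));  // 25 years with one researcher
console.log(yearsToFinish(5, 20, 10)); // 7 years with ten -- not 2.5
```

The point of the sketch is just that once some of the work is inherently serial, adding researchers stops buying proportional speedups, which a simple effort-in/value-out calculator will overstate.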
I am not Nate, but my view (and my interpretation of some median FHI view) is that we should keep options open about those strategies and as-yet unknown other strategies instead of fixating on one at the moment. There's a lot of uncertainty, and all of the strategies look really hard to achieve. In short, no strongly favored strategy.
FWIW, I also think that most current work in this area, including MIRI's, promotes the first three of those goals pretty well.
Thanks! This reply makes sense to me, and the refutation of the marginal-contribution strategy is interesting. I can see why you've chosen to group tightly complementary contributions.
Thanks for posting these updates -- I'm quite excited about the project!
Have you considered incentive problems stemming from the fact that you require fractions of impact to be allocated among participants so that they add up to 1? My understanding is that this way of allocating credit doesn't produce the desired results in cases where the project wouldn't have happened without all participants (see e.g. 5 mistakes of moral reasoning).
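To make the worry concrete, here's a minimal sketch with made-up numbers and names (three participants, each strictly necessary): each participant's counterfactual impact is the full project value, so the counterfactual impacts sum to three times the project's value, and no set of fractions adding up to 1 can match all of them at once.

```typescript
// Made-up illustration: a project worth 10 units that would not have
// happened without any one of its three participants.
type Participant = { name: string; necessary: boolean };

const projectValue = 10;
const participants: Participant[] = [
  { name: "A", necessary: true },
  { name: "B", necessary: true },
  { name: "C", necessary: true },
];

// Counterfactual impact: value of the world with the participant minus
// value without them. A necessary participant's absence kills the project.
function counterfactualImpact(p: Participant): number {
  return projectValue - (p.necessary ? 0 : projectValue);
}

const impacts = participants.map(counterfactualImpact);     // [10, 10, 10]
const sumOfImpacts = impacts.reduce((a, b) => a + b, 0);    // 30

// Fractions summing to 1 can distribute at most projectValue in credit,
// so a symmetric split credits each participant projectValue / 3 -- a
// third of what counterfactual reasoning says their participation added.
const symmetricCredit = projectValue / participants.length; // ~3.33

console.log({ sumOfImpacts, symmetricCredit });
```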
If you've already answered this, I'd appreciate a link -- I know you've thought about this quite a bit.
I've often found the EAs around me to be
(i) very supportive of taking on things that are ex ante good ideas, but carry significant risk of failing altogether, and
(ii) good at praising these decisions after they have turned out to fail.
It doesn't totally remove the sting to have those around you say "Great job taking that risk, it was the right decision and the EV was good!" and really mean it, but I do find that it helps, and I'm trying to build the habit of praising these kinds of decisions after the fact as much as I praise big successes.
Of cours...
I was going to "heart" this, but that seemed ambiguous. So I'm just commenting to say, I hear you.