You can send me a message anonymously here:


I replied on LW:

Thanks for the response and for the concern. To be clear, the purpose of this post was to explore how much a typical, small AI project would affect AI timelines and AI risk in expectation. It was not intended as a response to the ML engineer, and as such I did not send it or any of its contents to him, nor comment on the quoted thread. I understand how inappropriate it would be to reply to the engineer's polite acknowledgment of the concerns with my long analysis of how many additional people will die in expectation due to the project accelerating AI timelines.

I also refrained from linking to the quoted thread specifically because again this post is not a contribution to that discussion. The thread merely inspired me to take a quantitative look at what the expected impacts of a typical ML project actually are. I included the details of the project for context in case others wanted to take them into account when forecasting the impact.

I also included Jim and Raymond's comments because this post takes their claims as givens. While I understand the ML engineer may have been skeptical of their claims, and so elaborating on why the project is expected to accelerate AI timelines (and therefore increase AI risk) would be necessary to persuade them that their project is bad for the world, again that aim is outside of the scope of this post.

I've edited the heading after "The trigger for this post" from "My response" to "My thoughts on whether small ML projects significantly affect AI timelines" to make clear that the contents are not intended as a response to the ML engineer, but rather are just my thoughts about the claim made by the ML engineer. I assume that heading is what led you to interpret this post as a response to the ML engineer, but if there's anything else that led you to interpret it that way, I'd appreciate you letting me know so I can improve it for others who might read it. Thanks again for reading and offering your thoughts.

I only play-tested it once (in person, with three people sharing one laptop and one phone to edit the spreadsheet), and the most annoying aspect of my implementation was having to record one's forecasts in a spreadsheet from a phone. If everyone had a laptop or their own device it'd be easier. But I made the spreadsheet to handle games (or teams?) of up to 8 people, so I think it could work well for that.

I don't operate with this mindset frequently, but thinking back to some of the highest impact things I've done I'm realizing now that I did those things because I had this attitude. So I'm inclined to think it's good advice.

I love Wits & Wagers! You might be interested in Wits & Calibration, a variant I made during the pandemic in which players forecast the probability that each numeric range is 'correct' (closest to the true answer without being greater than it) rather than bet on the range that is most probable (as in the Party Edition) or highest EV given payout-ratios (regular Wits & Wagers). The spreadsheet I made auto-calculates all scores, so players need only enter their forecasts and check a box next to the correct answer.

I created the variant because I think it makes the game higher skill. For example, rather than just bet on a range that you know is the most likely to be correct, you can be rewarded for knowing whether it's 60% likely or 80% likely to be correct, unlike in classic Wits & Wagers where everyone would bet on the range simply by knowing it's >50% likely to be correct and get an equal reward.
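To illustrate the 60% vs. 80% point, here is a minimal sketch, assuming a logarithmic scoring rule. The scoring rule the spreadsheet actually uses isn't stated here, so `log_score`, `correct_range_index`, and the example numbers are all hypothetical.

```python
import math


def correct_range_index(guesses, true_answer):
    """Return the index of the guess closest to the true answer without
    exceeding it, or None if every guess is greater than the true answer.

    Assumption: each range is identified by the guess that starts it.
    """
    not_over = [i for i, g in enumerate(guesses) if g <= true_answer]
    if not not_over:
        return None
    return max(not_over, key=lambda i: guesses[i])


def log_score(forecast, correct_index):
    """Logarithmic score (an assumed rule, not necessarily the spreadsheet's):
    the log of the probability the player assigned to the correct range.
    Scores are <= 0, and higher is better.
    """
    p = forecast[correct_index]
    return math.log(p) if p > 0 else float("-inf")


# Hypothetical round: three guesses on the board, true answer 300.
guesses = [100, 250, 400]
winner = correct_range_index(guesses, 300)

# Both players think the middle range is most likely, with different confidence.
player_a = [0.1, 0.6, 0.3]
player_b = [0.1, 0.8, 0.1]

score_a = log_score(player_a, winner)
score_b = log_score(player_b, winner)
```

In this sketch both players would have bet on the same range in classic Wits & Wagers and earned the same reward, but here the player who assigned 80% scores higher than the one who assigned 60% when that range turns out to be correct (and would score lower had it turned out to be wrong).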

I second this.

FWIW I read from the beginning through What actually is "value-alignment"? then decided it wasn't worth reading further and just skimmed a few more points and the conclusion section. I then read some comments.

IMO the parts of the post I did read weren't worth reading for me, and I doubt they're worth reading for most other Forum users as well. (I strong-downvoted the post to reflect this, though I'm late to the party, so my vote probably won't have the same effect on readership as it would have if I had voted on it 13 days ago).

Hi Devon, FWIW I agree with John Halstead and Michael PJ re John's point 1.

If you're open to considering this question further, you may be interested in knowing my reasoning (note that I arrived at this opinion independently of John and Michael), which I share below.

Last November I commented on Tyler Cowen's post to explain why I disagreed with his point:

I don't find Tyler's point very persuasive: Despite the fact that the common sense interpretation of the phrase "existential risk" makes it applicable to the sudden downfall of FTX, in actuality I think forecasting existential risks (e.g. the probability of AI takeover this century) is a very different kind of forecasting question than forecasting whether FTX would suddenly collapse, so performance at one doesn't necessarily tell us much about performance on the other.

Additionally, and more importantly, the failure to anticipate the collapse of FTX seems to not so much be an example of making a bad forecast, but an example of failure to even consider the hypothesis. If an EA researcher had made it their job to try to forecast the probability that FTX collapses and assigned a very low probability to it after much effort, that probably would have been a bad forecast. But that's not what happened; in reality EAs just failed to even consider that forecasting question. EAs *have* very seriously considered forecasting questions on x-risk though.

So the better critique of EAs in the spirit of Tyler's would not be to criticize EA's existential risk forecasts, but rather to suggest that there may be an existential risk that destroys humanity's potential that isn't even on our radar (similar to how the sudden end of FTX wasn't on our radar). Others have certainly talked about this possibility before though, so that wouldn't be a new critique. E.g. Toby Ord in The Precipice put "Unforeseen anthropogenic risks" in the next century at ~1 in 30. (Source: Does Tyler think ~1 in 30 this century is too low? Or that people haven't spent enough effort thinking about these unknown existential risks?

You made a further point, Devon, that I want to respond to as well:

There is a certain hubris in claiming you are going to "build a flourishing future" and "support ambitious projects to improve humanity's long-term prospects" (as the FFF did on its website) only to not exist 6 months later and for reasons of fraud to boot.

I agree with you here. However, I think the hubris was SBF's hubris, not EAs' or longtermists-in-general's hubris.

I'd even go further to say that it wasn't the Future Fund team's hubris.

As John commented below, "EAs did a bad job on the governance and management of risks involved in working with SBF and FTX, which is very obvious and everyone already agrees."

But that's a critique of the Future Fund's (and others') ability to think of all the right top priorities for their small team in their first 6 months (or however long it was), not a sign that the Future Fund had hubris.

Note, however, that I don't even consider the Future Fund team's failure to think of this to be a very big critique of them. Why? Because anyone (in the EA community or otherwise) could have entered in The Future Fund's Project Ideas Competition and suggested the project of investigating the integrity of SBF and his businesses, and the risk that they may suddenly collapse, to ensure the stability of the funding source for the benefit of future Future Fund projects, and to protect EA's and longtermists' reputation from risks arising from associating with SBF should SBF become involved in a scandal. (Even Tyler Cowen could have done so and won some easy money.) But no one did (as far as I'm aware). So given that, I conclude that it was a hard risk to spot so early on, and consequently I don't fault the Future Fund team all that much for failing to spot this in their first 6 months.

There is a lesson to be learned from people's failure to spot the risk, but that lesson is not that longtermists lack the ability to forecast existential risks well, or even that they lack the ability to build a flourishing future.

Great list! 5 of the 6 "Other Items" are YouTube videos.

Forewarning: I have not read your post (yet).

I argue that moral offsetting is not inherently immoral

(I'm probably just responding to a literal interpretation of what you wrote rather than the intended meaning, but just in case and to provide clarity:) I'm not aware of anyone who argues that offsetting itself is immoral (though EAs have pointed out Ethical offsetting is antithetical to EA).

Rather, the claim that I've seen some people make is that (some subset of) the actions that would normally be impermissible (like buying factory farmed animal products or hiring an assassin) can be permissible if the person doing the action engages in the right kind of offsetting behavior, such as donating money to prevent factory farmed animal suffering or preventing an assassination.

I bring up the assassination example because we'd pretty much all agree that hiring an assassin is impermissible regardless of what offsetting behavior one does to try to right this wrong. For people who agree that hiring an assassin is wrong regardless of any offsetting behavior, but think there are some other kinds of generally impermissible actions (e.g. buying animal products) that become permissible when one engages in a certain offsetting behavior, I'd be interested in hearing what you think the difference is that makes it apply to the one behavior but not to the hiring of the assassin. (If this is what the OP blog post does, let me know and I'll give it a read.)

I'm also curious if there are less controversial examples than buying animal products where most people agree that offsetting behavior is sufficient to make a generally impermissible action permissible.
