Milan Griffes on EA blindspots

Smart things are not dangerous because they have access to human-built legacy nukes.  Smart things are dangerous because they are smarter than you. 

I expect that the most efficient way to kill everyone is via the biotech->nanotech->tiny diamondoid bacteria hopping the jetstream and replicating using CHON and sunlight->everybody falling over dead 3 days after it gets smart.  I don't expect it would use nukes if they were there.

Smart AIs are not dangerous because somebody built guns for them, smart AIs are not dangerous because cars are connected to the Internet, smart AIs are not dangerous because they can steal existing legacy weapons infrastructure, smart AIs are dangerous because they are smarter than you and can think of better stuff to do.

FTX EA Fellowships

If I ended up spending some time in the Bahamas this year, do you have a guess as to when would be the optimal time for that?

List of EA funding opportunities

Can you put something on here to the effect of: "Eliezer Yudkowsky continues to claim that anybody who comes to him with a really good AGI alignment idea can and will be funded."

Towards a Weaker Longtermism

It strikes me as a fine internal bargain for some nonhuman but human-adjacent species; I would not expect the internal parts of a human to be able to abide well by that bargain.

Towards a Weaker Longtermism

There’s nothing convoluted about it! We just observe that historical experience shows that the supposed benefits never actually appear, leaving just the atrocity! That’s it! That’s the actual reason you know the real result would be net bad and therefore you need to find a reason to argue against it! If historically it worked great and exactly as promised every time, you would have different heuristics about it now!

Towards a Weaker Longtermism

The final conclusion here strikes me as just the sort of conclusion that you might arrive at as your real bottom line, if in fact you had arrived at an inner equilibrium between some inner parts of you that enjoy doing something other than longtermism, and your longtermist parts.  This inner equilibrium, in my opinion, is fine; and in fact, it is so fine that we ought not to need to search desperately for a utilitarian defense of it.  It is wildly unlikely that our utilitarian parts ought to arrive at the conclusion that the present weighs about 50% as much as our long-term future, or 25% or 75%; it is, on the other hand, entirely reasonable that the balance of what our inner parts vote on will end up that way.  I am broadly fine with people devoting 50%, 25% or 75% of themselves to longtermism, in that case, as opposed to tearing themselves apart with guilt and ending up doing nothing much, which seems to be the main alternative.  But you're just not going to end up with a utilitarian defense of that bottom line; if the future can matter at all, to the parts of us that care abstractly and according to numbers, it's going to end up mattering much more than the present; equivalently, any rationalization like exponential discounting that can imply averting this, is going to imply that it is better to eat an ice cream today and destroy a galaxy of happy sapient beings in ten million years.  This is crazy, and I think it makes a lot more sense to just admit that part of you cares about galaxies and part of you cares about ice cream and say that neither of these parts is going to be suppressed and beaten down inside you.

I think this is what actually yields an appeal of "regular longtermism", and since that's what actually produces the bottom line, I think that what produces this bottom line should just be directly called the justification for it - there's no point in reaching for a different argument for justification than for conclusion-production.
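The exponential-discounting point above can be checked with rough arithmetic. This is only a sketch under assumed numbers: the 1% annual rate, the 1e20 "value of a galaxy", and the unit value for an ice cream are all hypothetical placeholders, not figures from the comment. The calculation is done in log10 because the discount weight itself underflows ordinary floating point.

```python
import math


def log10_discount_weight(annual_rate: float, years: float) -> float:
    """log10 of exp(-annual_rate * years), the continuous exponential discount weight."""
    return -annual_rate * years / math.log(10)


# All numbers below are illustrative assumptions.
ANNUAL_RATE = 0.01        # hypothetical 1% per year discount rate
YEARS = 10_000_000        # "ten million years"
GALAXY_VALUE = 1e20       # hypothetical value of a galaxy of happy beings
ICE_CREAM_VALUE = 1.0     # hypothetical value of one ice cream today

# Present value of the galaxy, in log10 units.
log10_galaxy_present_value = math.log10(GALAXY_VALUE) + log10_discount_weight(
    ANNUAL_RATE, YEARS
)

# True when the discounted galaxy is worth less than the ice cream today.
galaxy_loses = log10_galaxy_present_value < math.log10(ICE_CREAM_VALUE)
```

Under these assumptions the discounted galaxy comes out tens of thousands of orders of magnitude below the ice cream, and no plausible choice of `GALAXY_VALUE` closes that gap.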

Towards a Weaker Longtermism

The reason we have a deontological taboo against “let’s commit atrocities for a brighter tomorrow” is not that people have repeatedly done this, it worked exactly like they said it would, and millions of people received better lives in exchange for thousands of people dying unpleasant deaths exactly as promised.

The reason we have this deontological taboo is that the atrocities almost never work to produce the promised benefits. Period. That’s it. That’s why we normatively should have a taboo like that.

(And as always in a case like that, we have historical exceptions that people don’t like to talk about because they worked, eg, Knut Haukelid, or the American Revolution. And these examples are distinguished among other factors by a found mood (the opposite of a missing mood) which doesn’t happily jump on the controversial wagon for controversy points, nor gain power and benefit from the atrocity; but quietly and regretfully kills the innocent night watchman who helped you, to prevent the much much larger issue of Nazis getting nuclear weapons.)

This logic applies without any obvious changes to “let’s commit atrocities in pursuit of a brighter tomorrow a million years away” just like it applies to “let’s commit atrocities in pursuit of a brighter tomorrow in 2 years”. Literally any nice thing somebody says you could get would “justify atrocities”, in exactly the same way, if you forgot this rule. If you admit the existence of thousands of American schoolchildren getting suboptimally nutritious lunches, it could, oh no, justify abducting and torturing businessmen into using their ATM cards so you could get more money for the schoolchildren. Obviously then those children must not exist, or maybe they don’t have qualia so their suffering won’t be important, because if they existed and mattered that could justify atrocities, couldn’t it?

There is nothing special about longtermism compared to any other big desideratum in this regard. It is 100% unjustified special attention because people don’t like the desideratum itself. The same way that people ask “How can we spend money on AI safety when children are starving now?” but their mind doesn’t make the same leap about “How can we spend money on fighting global warming when children are starving now?” or say “Hey maybe we should critique total spending on lipstick advertising before we critique spending on rockets.”

As always, transhumanism done correctly is just humanism.

Taboo "Outside View"

I worriedly predict that anyone who followed your advice here would just switch to describing whatever they're doing as "reference class forecasting" since this captures the key dynamic that makes describing what they're doing as "outside viewing" appealing: namely, they get to pick a choice of "reference class" whose samples yield the answer they want, claim that their point is in the reference class, and then claim that what they're doing is what superforecasters do and what Philip Tetlock told them to do and super epistemically virtuous and anyone who argues with them gets all the burden of proof and is probably a bad person but we get to virtuously listen to them and then reject them for having used the "inside view".

My own take:  Rule One of invoking "the outside view" or "reference class forecasting" is that if a point is more dissimilar to examples in your choice of "reference class" than the examples in the "reference class" are dissimilar to each other, what you're doing is "analogy", not "outside viewing".

All those experimental results on people doing well by using the outside view are results on people drawing a new sample from the same bag as previous samples.  Not "arguably the same bag" or "well it's the same bag if you look at it this way", really actually the same bag: how late you'll be getting Christmas presents this year, based on how late you were in previous years.  Superforecasters doing well by extrapolating are extrapolating a time-series over 20 years, which was a straight line over those 20 years, to another 5 years out along the same line with the same error bars, and then using that as the baseline for further adjustments with due epistemic humility about how sometimes straight lines just get interrupted some year.  Not by them picking a class of 5 "relevant" historical events that all had the same outcome, and arguing that some 6th historical event goes in the same class and will have that same outcome.
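Rule One above can be turned into a crude mechanical check, assuming the cases can be described as numeric feature vectors at all. That assumption, the Euclidean metric, the use of mean distances, and the function name `is_analogy` are all choices made for this sketch; the comment itself specifies none of them.

```python
import math
from itertools import combinations
from statistics import mean


def is_analogy(reference_class, new_point):
    """Return True if new_point is, on average, farther from the reference
    class members than those members are from each other.

    reference_class: list of at least two equal-length numeric tuples.
    new_point: numeric tuple of the same length.
    """
    # Average dissimilarity between members of the reference class.
    within = mean(math.dist(a, b) for a, b in combinations(reference_class, 2))
    # Average dissimilarity between the new point and the members.
    to_new = mean(math.dist(new_point, a) for a in reference_class)
    return to_new > within


# Hypothetical feature vectors, for illustration only.
past_cases = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
same_bag = is_analogy(past_cases, (0.4, 0.4))       # close to the past cases
different_bag = is_analogy(past_cases, (10.0, 10.0))  # far from all of them
```

The hard part in practice is choosing the features honestly, which this sketch does not help with; it only makes the comparison in the rule explicit.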

Two Strange Things About AI Safety Policy

The idea of running an event in particular seems misguided. Conventions come after conversations. Real progress toward understanding, or conveying understanding, does not happen through speakers going On Stage at big events. If speakers On Stage ever say anything sensible, it's because an edifice of knowledge was built in the background out of people having real, engaged, and constructive arguments with each other, in private where constructive conversations can actually happen, and the speaker On Stage is quoting from that edifice.

(This is also true of journal publications about anything strategic-ish - most journal publications about AI alignment come from the void and are shouting into the void, neither aware of past work nor feeling obliged to engage with any criticism. Lesser (or greater) versions of this phenomenon occur in many fields; part of where the great replication crisis comes from is that people can go on citing refuted studies and nothing embarrassing happens to them, because god forbid there be a real comments section or an email reply that goes out to the whole mailing list.)

If there's something to be gained from having national-security higher-ups understanding the AGI alignment strategic landscape, or from having alignment people understand the national security landscape, then put Nate Soares in a room with somebody in national security who has a computer science background, and let them have a real conversation. Until that real progress has already been made in in-person conversations happening in the background where people are actually trying to say sensible things and justify their reasoning to one another, having a Big Event with people On Stage is just a giant opportunity for a bunch of people new to the problem to spout out whatever errors they thought up in the first five seconds of thinking, neither aware of past work nor expecting to engage with detailed criticism, words coming from the void and falling into the void. This seems net counterproductive.

The history of the term 'effective altruism'

There's only so many things you can call it, and accidental namespace collisions / phrase reinventions aren't surprising. I was surprised when I looked back myself and noticed the phrase was there, so it would be more surprising if Toby Ord remembered than if he didn't. I'm proud to have used the term "effective altruist" once in 2007, but to say that this means I coined the term, especially when it was re-output by the more careful process described above, might be giving me too much credit - but it's still nice to have this not-quite-coincidental mention be remembered, so thank you for that!