I really appreciate the donation to GovAI!
According to staff I've talked to, MIRI is not heavily funding constrained, though they believe they could use more money. I suspect GovAI is in a similar place, but I have not inquired.
For reference, for anyone thinking of donating to GovAI: I would currently describe us as “funding constrained” — I do currently expect financial constraints to prevent us from making program improvements/expansions and hires we’d like to make over the next couple years. (We actually haven’t yet locked down enough funding to main...
Thanks for the thoughtful comment!
So it's not enough to be "no less democratic than other charity orgs". I believe we should strive to be much more democratic than that average - which seems to me like a minority view here.
I do think that this position - "EA foundations aren't unusually undemocratic, but they should still be a lot more democratic than they are" - is totally worthy of discussion. I think you're also right to note that other people in the community tend to be skeptical of this position; I'm actually skeptical of it, myself, but I would b...
Thanks!
To be clear, though, I also don't think people should feel like they need to write out comments explaining their strong downvotes. I think the time cost is too high for it to be a default expectation, particularly since it can lead to getting involved in a fraught back-and-forth and take additional time and energy that way. I don't use strong downvotes all that often, but, when I do use them, it's rare that I'll also write up an explanatory comment.
(Insofar as I disagree with forum voting norms, my main disagreement is that I'd like to see people ha...
I do think it's reasonable to feel frustrated by your experience commenting on this post. I think you should have been engaged more respectfully, with more of an assumption of good faith, and that a number of your comments shouldn't have been so heavily downvoted. I do also agree with some of the concerns you've raised in your comments and think it was useful for you to raise them.[1]
At the same time, I do think this comment isn't conducive to good conversation, and the content mostly strikes me as off-base.
The EA community doesn't have its roots in man
I generally think it'd be good to have a higher evidential bar for making these kinds of accusations on the forum. Partly, I think the downside of making an off-base sock-puppeting accusation (unfair reputation damage, distraction from object-level discussion, additional feeling of adversarialism) just tends to be larger than the upside of making a correct one.
Fwiw, in this case, I do trust that A.C. Skraeling isn't Zoe. One point on this: Since she has a track record of being willing to go on record with comparatively blunter criticisms, using her own name, I think it would be a confusing choice to create a new pseudonym to post that initial comment.
I think to some degree this level of accusations is problematic and to some degree derails an important conversation. Given the role a report like this may play in EA in the future, ad hominem and false attacks on critiques seem somewhat problematic
I strongly agree - if someone has a question or concern about someone else's identity, I think they should either handle it privately or speak to the Forum team about their concerns.
I really appreciate the time people have taken to engage with this post (and actually hope the attention cost hasn’t been too significant). I decided to write some post-discussion reflections on what I think this post got right and wrong.
The reflections became unreasonably long - and almost certainly should be edited down - but I’m posting them here in a hopefully skim-friendly format. They cover what I see as some mistakes with the post, first, and then cover some views I stand by.
Things I would do differently in a second version of the post:
1. I would ei...
Thanks for writing this update. I think my number one takeaway here is something like: when writing a piece with the aim of changing community dynamics, it's important to be very clear about motivations and context. E.g. I think a version of the piece which said "I think people are overreacting to Death with Dignity, here are my specific models of where Yudkowsky tends to be overconfident, here are the reasons why I think people aren't taking those into account as much as they should" would have been much more useful and much less controversial than the current piece, which (as I interpret it) essentially pushes a general "take Yudkowsky less seriously" meme (and is thereby intrinsically political/statusy).
I appreciate this update!
Then the post gives some evidence that, at each stage of his career, Yudkowsky has made a dramatic, seemingly overconfident prediction about technological timelines and risks - and at least hasn’t obviously internalised lessons from these apparent mistakes.
I am confused about you bringing in the claim of "at each stage of his career", given that the only two examples you cited that seemed to provide much evidence here were from the same (and very early) stage of his career. Of course, you might have other points of evidence t...
I noted some places I agree with your comment here, Ben. (Along with my overall take on the OP.)
Some additional thoughts:
Notably, since that post didn’t really have substantial arguments in it (although the later one did), I think the fact it had an impact is seemingly a testament to the power of deference
The “death with dignity” post came in the wake of Eliezer writing hundreds of thousands of words about why he thinks alignment is hard in the Late 2021 MIRI Conversations (in addition to the many specific views and arguments about alignment difficulty he’...
I'm a bit confused about a specific small part:
tendency toward expressing dramatic views
I imagine that for many people, including me (including you?), once we work on [what we believe to be] preventing the world from ending, we would only move to another job if it was also preventing the world from ending, probably in an even more important way.
In other words, I think "working at a 2nd x-risk job and believing it is very important" is mainly predicted by "working at a 1st x-risk job and believing it is very important", much more than by personality t...
I really appreciated this update. Mostly it checks out to me, but I wanted to push back on this:
...Here’s a dumb thought experiment: Suppose that Yudkowsky wrote all of the same things, but never published them. But suppose, also, that a freak magnetic storm ended up implanting all of the same ideas in his would-be-readers’ brains. Would this absence of a causal effect count against deferring to Yudkowsky? I don’t think so. The only thing that ultimately matters, I think, is his track record of beliefs - and the evidence we currently have about how accurate o
A general reflection: I wonder if one at least minor contributing factor to disagreement, around whether this post is worthwhile, is different understandings about who the relevant audience is.
I mostly have in mind people who have read and engaged a little bit with AI risk debates, but not yet in a very deep way, and would overall be disinclined to form strong independent views on the basis of (e.g.) simply reading Yudkowsky's and Christiano's most recent posts. I think the info I've included in this post could be pretty relevant to these people, since in ...
I think that insofar as people are deferring on matters of AGI risk etc., Yudkowsky is in the top 10 people in the world to defer to based on his track record, and arguably top 1. Nobody who has been talking about these topics for 20+ years has a similarly good track record. If you restrict attention to the last 10 years, then Bostrom does and Carl Shulman and maybe some other people too (Gwern?), and if you restrict attention to the last 5 years then arguably about a dozen people have a somewhat better track record than him.
(To my knowledge. I think...
...The part of this post which seems most wild to me is the leap from "mixed track record" to
In particular, I think, they shouldn’t defer to him more than they would defer to anyone else who seems smart and has spent a reasonable amount of time thinking about AI risk.
For any reasonable interpretation of this sentence, it's transparently false. Yudkowsky has proven to be one of the best few thinkers in the world on a very difficult topic. Insofar as there are others who you couldn't write a similar "mixed track record" post about, it's almost entirely bec
I phrased my reply strongly (e.g. telling people to read the other post instead of this one) because deference epistemology is intrinsically closely linked to status interactions, and you need to be pretty careful in order to make this kind of post not end up being, in effect, a one-dimensional "downweight this person". I don't think this post was anywhere near careful enough to avoid that effect. That seems particularly bad because I think most EAs should significantly upweight Yudkowsky's views if they're doing any kind of reasonable, careful deference, ...
If someone visibly learns from forecasting mistakes they make, that should clearly update us positively on them not repeating the same mistakes.
I suppose one of my main questions is whether he has visibly learned from the mistakes, in this case.
For example, I wasn't able to find a post or comment to the effect of "When I was younger, I spent years of my life motivated by the belief that near-term extinction from nanotech was looming. I turned out to be wrong. Here's what I learned from that experience and how I've applied it to my forecasts of near-t...
Eliezer writes a bit about his early AI timeline and nanotechnology opinions here, though it sure is a somewhat obscure reference that takes a bunch of context to parse:
...Luke Muehlhauser reading a previous draft of this (only sounding much more serious than this, because Luke Muehlhauser): You know, there was this certain teenaged futurist who made some of his own predictions about AI timelines -
Eliezer: I'd really rather not argue from that as a case in point. I dislike people who screw up something themselves, and then argue like
While he's not single-handedly responsible, he led the movement to take AI risk seriously at a time when approximately no one was talking about it, which has now attracted the interest of top academics. This isn't a complete track record, but it's still a very important data-point.
I definitely do agree with that!
It's possible I should have emphasized the significance of it more in the post, rather than moving on after just a quick mention at the top.
If it's of interest: I say a little more about how I think about this, in response to Gwern's comment ...
What?
I interpreted Gwern as mostly highlighting that people have updated towards Yudkowsky's views - and using this as evidence in favor of the view that we should defer a decent amount to Yudkowsky. I think that was a reasonable move.
There is also a causal question here ('Has Yudkowsky on-net increased levels of concern about AI risk relative to where they would otherwise be?'), but I didn't take the causal question to be central to the point Gwern was making. Although now I'm less sure.
I don't personally have strong views on the causal question - I haven't thought through the counterfactual.
On 1 (the nanotech case):
I want to remind any reader that this is an opinion from 1999, when Eliezer was barely 20 years old.
I think your comment might give the misimpression that I don't discuss this fact in the post or explain why I include the case. What I write is:
...I should, once again, emphasize that Yudkowsky was around twenty when he did the final updates on this essay. In that sense, it might be unfair to bring this very old example up.
Nonetheless, I do think this case can be treated as informative, since: the belief was so analogous to his cu
One quick response, since it was easy (might respond more later):
Overall, then, I do think it's fair to consider a fast-takeoff to be a core premise of the classic arguments. It wasn't incidental or a secondary consideration.
I do think takeoff speeds between 1 week and 10 years are a core premise of the classic arguments. I do think the situation looks very different if we spend 5+ years in the human domain, but I don't think there are many who believe that that is going to happen.
I don't think the distinction between 1 week and 1 year is that ...
No, it's just as I said, and your Karnofsky retrospective strongly supports what I said.
I also agree that Karnofsky's retrospective supports Gwern's analysis, rather than doing the opposite.
(I just disagree about how strongly it counts in favor of deference to Yudkowsky. For example, I don't think this case implies we should currently defer more to Yudkowsky's risk estimates than we do to Karnofsky's.)
Thanks for the comment! A lot of this is useful.
calling LOGI and related articles 'wrong' because that's not how DL looks right now is itself wrong. Yudkowsky has never said that DL or evolutionary approaches couldn't work, or that all future AI work would look like the Bayesian program and logical approach he favored;
I mainly have the impression that LOGI and related articles were probably "wrong" because, so far as I've seen, nothing significant has been built on top of them in the intervening decade and a half (even though LOGI's successor was seeming...
I do not want an epistemic culture that finds it acceptable to challenge an individual's overall credibility in lieu of directly engaging with their arguments.
I think I roughly agree with you on this point, although I would guess I have at least a somewhat weaker version of your view. If discourse about people's track records or reliability starts taking up (e.g.) more than a fifth of the space that object-level argument does, within the most engaged core of people, then I do think that will tend to suggest an unhealthy or at least not-very-intellectuall...
I prefer to just analyse and refute his concrete arguments on the object level.
I agree that work analyzing specific arguments is, overall, more useful than work analyzing individual people's track records. Personally, partly for that reason, I've actually done a decent amount of public argument analysis (e.g. here, here, and most recently here) but never written a post like this before.
Still, I think, people do in practice tend to engage in epistemic deference. (I think that even people who don't consciously practice epistemic deference tend to be influ...
(I hadn't seen this reply when I made my other reply).
What do you think of legitimising behaviour that calls out the credibility of other community members in the future?
I am worried about displacing the concrete object level arguments as the sole domain of engagement. A culture in which arguments cannot be allowed to stand by themselves. In which people have to be concerned about prior credibility, track record and legitimacy when formulating their arguments...
It feels like a worse epistemic culture.
However, if there's no correlation between the payoff of an arm and our ability to know it, then we should eventually find an arm that pays off 100% of the time with high probability, pull that arm, and stop worrying about the unknowable one. So I'm not sure your story explains why we end up fixating on the uncertain interventions (AIS research).
The story does require there to be only a very limited number of arms that we initially think have a non-negligible chance of paying. If there are unlimited arms, then one of them should be both paying and easil...
A follow-on:
The above post focused on the idea that certain traits -- reflectiveness and self-skepticism -- are more valuable in the context of non-profits (especially ones with long-term missions) than they are in the context of startups.
I also think that certain traits -- drivenness, risk-tolerance, and eccentricity -- are less valuable in the context of non-profits than they are in the context of startups.
Hiring advice from the startup world often suggests that you should be looking for extraordinarily driven, risk-tolerant people with highly idiosyncratic pe...
The bandit problem is definitely related, although I'm not sure it's the best way to formulate the situation here. The main issue is that the bandit formulation, here, treats learning about the magnitude of a risk and working to address the risk as the same action - when, in practice, they often come apart.
Here's a toy model/analogy that feels a bit more like it fits the case, in my mind.
Let's say there are two types of slot machines: one that has a 0% chance of paying and one that has a 100% chance of paying. Your prior gives you a 90% credence that each ...
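To make the numbers concrete, here's a minimal sketch of the update step in this setup. One assumption on my part, since the setup above only partly spells this out: I'm treating the 90% prior as the credence that a given machine is the always-paying type. Because the two types pay with probability 0 and probability 1, a single pull is fully informative.

```python
# Minimal sketch of the two-type slot machine setup described above.
# Assumption (mine, for illustration): `prior_pays` is the credence that a
# given machine is the always-paying type.

def update_after_pull(prior_pays: float, paid_out: bool) -> float:
    """Posterior credence that the machine is the always-paying type.

    Because the two types pay with probability 0 and 1, the observation is
    fully informative, so the posterior doesn't depend on the prior.
    """
    return 1.0 if paid_out else 0.0  # a dud can never pay; a payer always pays

prior_pays = 0.9  # illustrative prior
print(update_after_pull(prior_pays, paid_out=True))   # -> 1.0
print(update_after_pull(prior_pays, paid_out=False))  # -> 0.0
# The expected payout of one exploratory pull is just the prior (0.9 here),
# and that single pull also resolves all uncertainty about the machine's type.
```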
Couldn't the exact same arguments be made to argue that there would not be successful internet companies, because the fundamental tech is hard to patent, and any website is easy to duplicate?
Definitely!
(I say above that the dynamic applies to "most software," but should have said something broader to make it clear that it also applies to any company whose product - basically - is information that it's close to costless to reproduce/generate. The book Information Rules is really good on this.)
Sometimes the above conditions hold well enough for people to ...
It’ll be interesting to see how well companies will be able to monetise large, multi-purpose language and image-generation models.
Companies and investors are spending increasingly huge amounts of money on ML research talent and compute, typically with the hope that investments in this area lead to extremely profitable products. But - even if the resulting products are very useful and transformative - it still seems like a bit of an open question how profitable they’ll be.
Some analysis:[1]
1.
Although huge state-of-the-art models are increasingly c...
This is a helpful comment - I'll see if I can reframe some points to make them clearer.
Human psychology is flawed in such a way that we consistently estimate the probability of existential risk from each cause to be ~10% by default.
I'm actually not assuming human psychology is flawed. The post is meant to be talking about how a rational person (or, at least, a boundedly rational person) should update their views.
On the probabilities: I suppose I'm implicitly invoking both a subjective notion of probability ("What's a reasonable credence to assign to X h...
A point about hiring and grantmaking, that may already be conventional wisdom:
If you're hiring for highly autonomous roles at a non-profit, or looking for non-profit founders to fund, then advice derived from the startup world is often going to overweight the importance of entrepreneurialism relative to self-skepticism and reflectiveness.[1]
Non-profits, particularly non-profits with longtermist missions, are typically trying to maximize something that is way more illegible than time-discounted future profits. To give a specific example: I think it's way ha...
I think most people would probably regard the objection as a nitpick (e.g. "OK, maybe the Indifference Principle isn't actually sufficient to support a tight formal argument, and you need to add in some other assumption, but the informal version of the argument is just pretty clearly right"), feel the objection has been successfully answered (e.g. find the response in the Simulation Argument FAQ more compelling than I do), or just haven't completely noticed the potential issue.
I think it's still totally reasonable for the paper to have passed peer review. ...
To be clear, I'm not saying the conclusion is wrong - just that the explicit assumptions the paper makes (mainly the Indifference Principle) aren't sufficient to imply its conclusion.
The version that you've just presented isn't identical to the one in Bostrom's paper -- it's (at least implicitly) making use of assumptions beyond the Indifference Principle. And I think it's surprisingly non-trivial to work out exactly how to formalize the needed assumptions, and make the argument totally tight, although I'd still guess that this is ultimately possible.[1]
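(For reference, the quantitative core of Bostrom's argument, in roughly his notation: if $f_P$ is the fraction of human-level civilizations that go on to reach a posthuman stage and $\bar{N}$ is the average number of ancestor-simulations run by such a civilization, then the fraction of observers with human-type experiences who live in simulations is

$$f_{\mathrm{sim}} = \frac{f_P \, \bar{N}}{f_P \, \bar{N} + 1}.$$

The Indifference Principle is then meant to carry you from "$f_{\mathrm{sim}}$ is close to 1" to "your credence that you are simulated should be close to 1" - and the point above is just that making this last step fully rigorous seems to require assumptions beyond the ones the paper states explicitly.)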
I'm trying to understand the simulation argument. I think Bostrom uses the Indifference Principle (IP) in a weird way. If we become a posthuman civilization that runs many many simulations of our ancestors (meaning us), then how does the IP apply? It only applies when one has no other information to go on. But in this case, we do have some extra information -- crucial information! I.e., we know that we are not in any of the simulations that we have produced. Therefore, we do not have any statistical reason to believe that we are simulated.
I agree that t...
...The actual worry with inner misalignment style concerns is that the selection you do during training does not fully constrain the goals of the AI system you get out; if there are multiple goals consistent with the selection you applied during training there's no particular reason to expect any particular one of them. Importantly, when you are using natural selection or gradient descent, the constraints are not "you must optimize X goal", the constraints are "in Y situations you must behave in Z ways", which doesn't constrain how you behave in totally diff
(Disclaimer: The argument I make in this short-form feels a little sophistic to me. I’m not sure I endorse it.)
Discussions of AI risk, particularly risks from “inner misalignment,” sometimes heavily emphasize the following observation:
...Humans don’t just care about their genes: Genes determine, to a large extent, how people behave. Some genes are preserved from generation-to-generation and some are pushed out of the gene-pool. Genes that cause certain human behaviours (e.g. not setting yourself on fire) are more likely to be preserved. But people don’t care
The actual worry with inner misalignment style concerns is that the selection you do during training does not fully constrain the goals of the AI system you get out; if there are multiple goals consistent with the selection you applied during training there's no particular reason to expect any particular one of them. Importantly, when you are using natural selection or gradient descent, the constraints are not "you must optimize X goal", the constraints are "in Y situations you must behave in Z ways", which doesn't constrain how you behave in totally diffe...
The existential risk community’s relative level of concern about different existential risks is correlated with how hard-to-analyze these risks are. For example, here is The Precipice’s ranking of the top five most concerning existential risks:
This isn’t surprising.
For a number of risks, when you first hear about them, it’s reasonable to have the reaction “Oh, hm, maybe that could be a ...
Let’s call the hypothesis that the base rate of major wars hasn’t changed the constant risk hypothesis. The best presentation of this view is in Only the Dead, a book by an IR professor with the glorious name of Bear Braumoeller. He argues that there is no clear trend in the average incidence of several measures of conflict—including uses of force, militarized disputes, all interstate wars, and wars between “politically-relevant dyads”—between 1800 and today.
A quick note on Braumoeller's analysis:
He's relying on the Correlates of War (COW) dataset, whic...
...I'm not familiar with Zoe's work, and would love to hear from anyone who has worked with them in the past. After seeing the red flags mentioned above, and being stuck with only Zoe's word for their claims, anything from a named community member along the lines of "this person has done good research/has been intellectually honest" would be a big update for me…. [The post] strikes me as being motivated not by a desire to increase community understanding of an important issue, but rather to generate sympathy for the authors and support for their positi
FWIW, I haven't had this impression.
Single data point: In the most recent survey on community opinion on AI risk, I was in at least the 75th percentile for pessimism (for roughly the same reasons Lukas suggests below). But I'm also seemingly unusually optimistic about alignment risk.
I haven't found that this is a really unusual combo: I think I know at least a few other people who are unusually pessimistic about 'AI going well,' but also at least moderately optimistic about alignment.
(Caveat that my apparently higher level of pessimism could also be explai...
Thanks for the clarification! I still feel a bit fuzzy on this line of thought, but hopefully understand a bit better now.
At least on my read, the post seems to discuss a couple different forms of wildness: let’s call them “temporal wildness” (we currently live at an unusually notable time) and “structural wildness” (the world is intuitively wild; the human trajectory is intuitively wild).[1]
I think I still don’t see the relevance of “structural wildness,” for evaluating fishiness arguments. As a silly example: Quantum mechanics is pretty intuitively wild,...
Ben, that sounds right to me. I also agree with what Paul said. And my intent was to talk about what you call temporal wildness, not what you call structural wildness.
I agree with both you and Arden that there is a certain sense in which the "conservative" view seems significantly less "wild" than my view, and that a reasonable person could find the "conservative" view significantly more attractive for this reason. But I still want to highlight that it's an extremely "wild" view in the scheme of things, and I think we shouldn't impose an inordinate burden of proof on updating from that view to mine.
To say a bit more here, on the epistemic relevance of wildness:
I take it that one of the main purposes of this post is to push back against “fishiness arguments,” like the argument that Will makes in “Are We Living at the Hinge of History?”
The basic idea, of course, is that it’s a priori very unlikely that any given person would find themselves living at the hinge of history (and correctly recognise this). Due to the fallibility of human reasoning and due to various possible sources of bias, however, it’s not as unlikely that a given person would mistakenl...
We were previously comparing two hypotheses:
Now we're comparing three:
"Wild time" is almost as unlikely as HoH. Holden is trying to suggest it's comparably intuitively wild, and it has pretty similar anthropic / "base rate" force.
So if your arguments look solid, "All futures are wild" makes hypothesis 2 look kind of lame/improbable---it has to posit a flaw in an argument, and also that you are living at a wildly improb...
Some possible futures do feel relatively more "wild” to me, too, even if all of them are wild to a significant degree. If we suppose that wildness is actually pretty epistemically relevant (I’m not sure it is), then it could still matter a lot if some future is 10x wilder than another.
For example, take a prediction like this:
...Humanity will build self-replicating robots and shoot them out into space at close to the speed of light; as they expand outward, they will construct giant spherical structures around all of the galaxy’s stars to extract tremendous v
I suspect you are more broadly underestimating the extent to which people used "insect-level intelligence" as a generic stand-in for "pretty dumb," though I haven't looked at the discussion in Mind Children and Moravec may be making a stronger claim.
I think that's good push-back and a fair suggestion: I'm not sure how seriously the statement in Nick's paper was meant to be taken. I hadn't considered that it might be almost entirely a quip. (I may ask him about this.)
Moravec's discussion in Mind Children is similarly brief: He presents a graph of the co...
I do think my main impression of insect <-> simulated robot parity comes from very fuzzy evaluations of insect motor control vs simulated robot motor control (rather than from any careful analysis, of which I'm a bit more skeptical though I do think it's a relevant indicator that we are at least trying to actually figure out the answer here in a way that wasn't true historically). And I do have only a passing knowledge of insect behavior, from watching youtube videos and reading some book chapters about insect learning. So I don't think it's unfair to put it in the same reference class as Rodney Brooks' evaluations to the extent that his was intended as a serious evaluation.
As a last thought here (no need to respond), I thought it might be useful to give one example of a concrete case where: (a) Tetlock’s work seems relevant, and I find the terms “inside view” and “outside view” natural to use, even though the case is relatively different from the ones Tetlock has studied; and (b) I think many people in the community have tended to underweight an “outside view.”
A few years ago, I pretty frequently encountered the claim that recently developed AI systems exhibited roughly “insect-level intelligence.” This claim was typically used...
The Nick Bostrom quote (from here) is:
In retrospect we know that the AI project couldn't possibly have succeeded at that stage. The hardware was simply not powerful enough. It seems that at least about 100 Tops is required for human-like performance, and possibly as much as 10^17 ops is needed. The computers in the seventies had a computing power comparable to that of insects. They also achieved approximately insect-level intelligence.
I would have guessed this is just a funny quip, in the sense that (i) it sure sounds like it's just a throw-away quip, no e...
Thank you (and sorry for my delayed response)!
I shudder at the prospect of having a discussion about "Outside view vs inside view: which is better? Which is overrated and which is underrated?" (and I've worried that this thread may be tending in that direction) but I would really look forward to having a discussion about "let's look at Daniel's list of techniques and talk about which ones are overrated and underrated and in what circumstances each is appropriate."
I also shudder a bit at that prospect.
I am sometimes happy making pretty broad and sloppy ...
I'm not sure if you think this is an interesting point to notice that's useful for building a world-model, and/or a reason to be skeptical of technical alignment work. I'd agree with the former but disagree with the latter.
Mostly the former!
I think the point may have implications for how much we should prioritize alignment research, relative to other kinds of work, but this depends on what the previous version of someone's world model was.
For example, if someone has assumed that solving the 'alignment problem' is close to sufficient to ensure that human...
It’s definitely entirely plausible that I’ve misunderstood your views.
My interpretation of the post was something like this:
...There is a bag of things that people in the EA community tend to describe as “outside views.” Many of the things in this bag are over-rated or mis-used by members of the EA community, leading to bad beliefs.
One reason for this over-use or mis-use is that the term “outside view” has developed an extremely positive connotation within the community. People are applauded for saying that they’re relying on “outside views” — “outside
On the contrary; tabooing the term is more helpful, I think. I've tried to explain why in the post. I'm not against the things "outside view" has come to mean; I'm just against them being conflated with / associated with each other, which is what the term does. If my point was simply that the first Big List was overrated and the second Big List was underrated, I would have written a very different post!
My initial comment was focused on your point about conflation, because I think this point bears on the linguistic question more strongly than the other p...
I agree that people sometimes put too much weight on particular outside views -- or do a poor job of integrating outside views with more inside-view-style reasoning. For example, in the quote/paraphrase you present at the top of your post, something has clearly gone wrong.[1]
But I think the best intervention, in this case, is probably just to push the ideas "outside views are often given too much weight" or "heavy reliance on outside views shouldn't be seen as praiseworthy" or "the correct way to integrate outside views with more inside-view reasoning is...
When people use “outside view” or “inside view” without clarifying which of the things on the above lists they mean, I am left ignorant of what exactly they are doing and how well-justified it is. People say “On the outside view, X seems unlikely to me.” I then ask them what they mean, and sometimes it turns out they are using some reference class, complete with a dataset. (Example: Tom Davidson’s four reference classes for TAI). Other times it turns out they are just using the anti-weirdness heuristic. Good thing I asked for elaboration!
FWIW, as a...
Fortunately, if I remember correctly, something like the distinction between the true criterion of rightness and the best practical decision procedure actually is a major theme in the Kagan book. (Although I think the distinction probably often is underemphasized.)
It is therefore kind of misleading to think of consequentialism vs. deontology vs. virtue ethics as alternative theories, which however is the way normative ethics is typically presented in the analytic tradition.
I agree there is something to this concern. But I still wouldn't go so far as to...
A slightly boring answer: I think most people should at least partly read something that overviews common theories and frameworks in normative ethics (and the arguments for and against them) and something that overviews core concepts and principles in economics (e.g. the idea of expected utility, the idea of an externality, supply/demand, the basics of economic growth, the basics of public choice).
In my view, normative ethics and economics together make up a really large portion of the intellectual foundation that EA is built on.
One good book that overview...
I remember that reading up on normative ethics was one of the first things I focused on after I had encountered EA. I'm sure it was useful in many ways. For some reason, however, I feel surprisingly lukewarm about recommending that people read about normative ethics.
It could be because my view these days is roughly: "Once you realize that consequentialism is great as a 'criterion of rightness' but doesn't work as 'decision procedure' for boundedly rational agents, a lot of the themes from deontology, virtue ethics, moral particularism, and moral plur...
That's a good example.
I do agree that quasi-random variation in culture can be really important. And I agree that this variation is sometimes pretty sticky (e.g. Europe being predominantly Christian and the Middle East being predominantly Muslim for more than a thousand years). I wouldn't say that this kind of variation is a "rounding error."
Over sufficiently long timespans, though, I think that technological/economic change has been more significant.
As an attempt to operationalize this claim: The average human society in 1000AD was obviously very differen...
FWIW, I wouldn't say I agree with the main thesis of that post.
...However, while I expect machines that outcompete humans for jobs, I don’t see how that greatly increases the problem of value drift. Human cultural plasticity already ensures that humans are capable of expressing a very wide range of values. I see no obvious limits there. Genetic engineering will allow more changes to humans. Ems inherit human plasticity, and may add even more via direct brain modifications.
In principle, non-em-based artificial intelligence is capable of expressing the enti
Do you have the intuition that absent further technological development, human values would drift arbitrarily far?
Certainly not arbitrarily far. I also think that technological development (esp. the emergence of agriculture and modern industry) has played a much larger role in changing the world over time than random value drift has.
[E]ven non-extinction AI is enabling a new set of possibilities that modern-day humans would endorse much less than the decisions of future humans otherwise.
I definitely think that's true. But I also think that was true ...
To help with the talent pipeline, GovAI currently runs twice-a-year three-month fellowships. We've also started offering one-year Research Scholar positions and are now experimenting with a new policy program. Supporting the AI governance talent pipeline is one of our key priorities as an organization.
That being said, we're very very far from filling the community's needs in this regard. We're currently getting far more strong applications than we have open slots. (I believe our acceptance rate for the Summer Fellowship is something like 5% and will pr...