All of bgarfinkel's Comments + Replies

To help with the talent pipeline, GovAI currently runs three-month fellowships twice a year. We've also started offering one-year Research Scholar positions and are now experimenting with a new policy program. Supporting the AI governance talent pipeline is one of our key priorities as an organization.

That being said, we're very very far from filling the community's needs in this regard. We're currently getting far more strong applications than we have open slots. (I believe our acceptance rate for the Summer Fellowship is something like 5% and will pr... (read more)

1
MattThinks
8mo
I was very glad to see the research scholar pathway open up; it seems exactly right for someone like me (advanced early career, is that a stable segment?). I’m also glad to hear of the interest, although it’s too bad that the acceptance rate is lower than ideal. Then again, to many folks coming from academic grant funding ecosystems, 5% is fairly typical, at least for major funding in my fields.
2
Harrison Durland
11mo
I would also strongly recommend having a version of the fellowship that aligns with US university schedules, unlike the current Summer fellowship!

I really appreciate the donation to GovAI!

According to staff I've talked to, MIRI is not heavily funding constrained, but they believe they could use more money. I suspect GovAI is in a similar place, but I have not inquired.

For reference, for anyone thinking of donating to GovAI: I would currently describe us as “funding constrained” — I do currently expect financial constraints to prevent us from making program improvements/expansions and hires we’d like to make over the next couple years. (We actually haven’t yet locked down enough funding to main... (read more)

Thanks for the thoughtful comment!

So it's not enough to be "no less democratic than other charity orgs". I believe we should strive to be much more democratic than that average - which seems to me like a minority view here.

I do think that this position - "EA foundations aren't unusually undemocratic, but they should still be a lot more democratic than they are" - is totally worthy of discussion. I think you're also right to note that other people in the community tend to be skeptical of this position; I'm actually skeptical of it, myself, but I would b... (read more)

3
Guy Raveh
2y
I don't disagree, but I think the discussion is not as simple. When it comes to "legitimate" EA money, I think it would be much better to have some mechanism that includes as many of the potential beneficiaries as possible, rather than one national government. I just view tax money as "not legitimate EA money" (Edit: and I see people who do want to avoid taxes as wanting to subvert the democratic system they're in, in favor of their own decisionmaking).
2
Guy Raveh
2y
I live in Israel. A short Google search didn't turn up much in terms of English language information about this, other than this government document outlining the relevant laws and rules. The relevant part of it is the chapter about the institutions of an Amuta (= Israeli non-profit), starting page 9.

In practice, since members have to be admitted by already existing bodies of the non-profit, the general assembly can be just the executive board and the auditor(s), and thus be meaningless. I'm sure this happens often (maybe most of the time). In particular, EA Israel (the org) has very few members. But I've been a member of a non-profit with a much larger (~100 people) general assembly in the past.

You can draw some parallels between the general assembly and a board of directors (Edit: trustees? I don't know what the right word is). On the other hand, you can also draw parallels between the executive board and a board of directors - since in many (most?) cases, including EA Israel, the actual day-to-day management of the non-profit is done by a paid CEO and other employees. So the executive board makes strategy decisions and oversees the activity, and doesn't implement it itself. Meaning it's kind of a board of directors, which still answers to a possibly much larger general assembly.

Thanks!

To be clear, though, I also don't think people should feel like they need to write out comments explaining their strong downvotes. I think the time cost is too high for it to be a default expectation, particularly since it can lead to getting involved in a fraught back-and-forth and take additional time and energy that way. I don't use strong downvotes all that often, but, when I do use them, it's rare that I'll also write up an explanatory comment.

(Insofar as I disagree with forum voting norms, my main disagreement is that I'd like to see people ha... (read more)

1
Phil Tanny
2y
Ok, no problem, thanks for sharing that. For me, without explanations the entire up-and-down voting system generates entirely worthless information. With explanations, there is an opportunity to evaluate the quality of the votes.

To be fair, I've been using forums regularly since they first appeared on the net, and this is probably the most intelligent forum I've ever discovered, which I am indeed quite grateful for. Perhaps the reason I've complained about the voting system is that, in my mind, it contaminates what is otherwise a pretty close to perfect site. The contrast between near perfection and high-school-level popularity contest gimmickry offends my delicate aesthetic sensibility. :-)

I do think it's reasonable to feel frustrated by your experience commenting on this post. I think you should have been engaged more respectfully, with more of an assumption of good faith, and that a number of your comments shouldn't have been so heavily downvoted. I do also agree with some of the concerns you've raised in your comments and think it was useful for you to raise them.[1]

At the same time, I do think this comment isn't conducive to good conversation, and the content mostly strikes me as off-base.

  • The EA community doesn't have its roots in man

... (read more)
5
[anonymous]
2y
This is pretty far afield from what the post is about, but to me the most natural reason why someone might say EA rejects democracy are neither of the two interpretations you mentioned, but rather that EAs are technocrats suspicious of democracy, to quote Rob Reich:
3
Guy Raveh
2y
1. Every time the issue of taxes comes up, it's a very popular opinion that people should avoid as much taxes as possible to redirect the money to what they personally deem effective. This is usually accompanied by insinuations that democratically elected governments are useless or harmful.
2. While it is true that aid and charity in general tend to be far from democratic, it is also widely accepted that they often cause harm or just fail to have an effect - indeed, this is the basis for our very movement. There are also many known cases where bad effects were the result of lack of participation by the recipients of aid. So it's not enough to be "no less democratic than other charity orgs". I believe we should strive to be much more democratic than that average - which seems to me like a minority view here.
3. I'm assuming you're right about the amount of democracy in other non-profits, but the situation in my country is actually different. All non-profits have members who can call an assembly and have final say on any decision or policy of the non-profit.
-2
Phil Tanny
2y
Thank you for providing an excellent example of how one should downvote, if that is what you're doing. Not meaning to put words in your mouth, just applauding a reasoned challenge.

I generally think it'd be good to have a higher evidential bar for making these kinds of accusations on the forum. Partly, I think the downside of making an off-base sock-puppeting accusation (unfair reputation damage, distraction from object-level discussion, additional feeling of adversarialism) just tends to be larger than the upside of making a correct one.

Fwiw, in this case, I do trust that A.C. Skraeling isn't Zoe. One point on this: Since she has a track record of being willing to go on record with comparatively blunter criticisms, using her own name, I think it would be a confusing choice to create a new pseudonym to post that initial comment.

44
[anonymous]
2y
0
0

I think this is fair. I shouldn't have done it and am sorry for doing so.

I think that, to some degree, this level of accusation is problematic and derails an important conversation. Given the role a report like this may play in EA in the future, ad hominem and false attacks on critics seem somewhat harmful.

I strongly agree - if someone has a question or concern about someone else's identity, I think they should either handle it privately or speak to the Forum team about their concerns.

I really appreciate the time people have taken to engage with this post (and actually hope the attention cost hasn’t been too significant). I decided to write some post-discussion reflections on what I think this post got right and wrong.

The reflections became unreasonably long - and almost certainly should be edited down - but I’m posting them here in a hopefully skim-friendly format. They cover what I see as some mistakes with the post, first, and then cover some views I stand by.

Things I would do differently in a second version of the post:

1. I would ei... (read more)

1
[anonymous]
2y
Edit: I think this came off more negatively than I intended it to, particularly about Yudkowsky's understanding of physics. The main point I was trying to make is that Yudkowsky was overconfident, not that his underlying position was wrong. See the replies for more clarification.

I think there's another relevant (and negative) data point when discussing Yudkowsky's track record: his argument and belief that the Many-Worlds Interpretation of quantum mechanics is the only viable interpretation of quantum mechanics, and anyone who doesn't agree is essentially a moron. Here's one 2008 link from the Sequences where he expresses this position[1]; there are probably many other places where he's said similar things. (To be clear, I don’t know if he still holds this belief, and if he doesn’t anymore, when and why he updated away from it.)

Many Worlds is definitely a viable and even leading interpretation, and may well be correct. But Yudkowsky's confidence in Many Worlds, as well as his conviction that people who disagree with him are making elementary mistakes, is more than a little disproportionate, and may come partly from a lack of knowledge and expertise.

The above is a paraphrase of Scott Aaronson, a credible authority on quantum mechanics who is sympathetic to both Yudkowsky and Many Worlds (bold added):

While this isn't directly related to AI risk, I think it's relevant to Yudkowsky's track record as a public intellectual.

1. ^ He expresses this in the last six paragraphs of the post. I'm excerpting some of it (bold added, italics were present in the original):

Thanks for writing this update. I think my number one takeaway here is something like: when writing a piece with the aim of changing community dynamics, it's important to be very clear about motivations and context. E.g. I think a version of the piece which said "I think people are overreacting to Death with Dignity, here are my specific models of where Yudkowsky tends to be overconfident, here are the reasons why I think people aren't taking those into account as much as they should" would have been much more useful and much less controversial than the current piece, which (as I interpret it) essentially pushes a general "take Yudkowsky less seriously" meme (and is thereby intrinsically political/statusy).

I appreciate this update! 

Then the post gives some evidence that, at each stage of his career, Yudkowsky has made a dramatic, seemingly overconfident prediction about technological timelines and risks - and at least hasn’t obviously internalised lessons from these apparent mistakes.

I am confused about you bringing in the claim of "at each stage of his career", given that the only two examples you cited that seemed to provide much evidence here were from the same (and very early) stage of his career. Of course, you might have other points of evidence t... (read more)

I noted some places I agree with your comment here, Ben. (Along with my overall take on the OP.)

Some additional thoughts:

Notably, since that post didn’t really have substantial arguments in it (although the later one did), I think the fact it had an impact is seemingly a testament to the power of deference

The “death with dignity” post came in the wake of Eliezer writing hundreds of thousands of words about why he thinks alignment is hard in the Late 2021 MIRI Conversations (in addition to the many specific views and arguments about alignment difficulty he’... (read more)

I'm a bit confused about a specific small part:

tendency toward expressing dramatic views

I imagine that for many people, including me (including you?), once we work on [what we believe to be] preventing the world from ending, we would only move to another job if it was also preventing the world from ending, probably in an even more important way.

 

In other words, I think "working at a 2nd x-risk job and believing it is very important" is mainly predicted by "working at a 1st x-risk job and believing it is very important", much more than by personality t... (read more)

5
David Mathers
2y
'Here’s one data point I can offer from my own life: Through a mixture of college classes and other reading, I’m pretty confident I had already encountered the heuristics and biases literature, Bayes’ theorem, Bayesian epistemology, the ethos of working to overcome bias, arguments for the many worlds interpretation, the expected utility framework, population ethics, and a number of other ‘rationalist-associated’ ideas before I engaged with the effective altruism or rationalist communities.'

I think some of this is just a result of being a community founded partly by analytic philosophers (though as a philosopher I would say that!). I think it's normal to encounter some of these ideas in undergrad philosophy programs. At my undergrad back in 2005-09 there was a whole upper-level undergraduate course in decision theory. I don't think that's true everywhere all the time, but I'd be surprised if it was wildly unusual. I can't remember if we covered population ethics in any class, but I do remember discovering Parfit on the Repugnant Conclusion in 2nd year of undergrad because one of my ethics lecturers said Reasons and Persons was a super-important book.

In terms of the Oxford phil scene where the term "effective altruism" was born, the main titled professorship in ethics at that time was held by John Broome, a utilitarianism-sympathetic former economist, who had written famous stuff on expected utility theory. I can't remember if he was the PhD supervisor of anyone important to the founding of EA, but I'd be astounded if some of the phil. people involved in that had not been reading his stuff and talking to him about it.

Most of the phil. physics people at Oxford were gung-ho for many worlds; it's not a fringe view in philosophy of physics as far as I know. (Though I think Oxford was kind of a centre for it and there was more dissent elsewhere.) As far as I can tell, Bayesian epistemology in at least some senses of that term is a fairly well-known approach in phi
9
[anonymous]
2y
For what it's worth, I found this post and the ensuing comments very illuminating. As a person relatively new to both EA and the arguments about AI risk, I was a little bit confused as to why there was not much pushback on the very high confidence beliefs about AI doom within the next 10 years. My assumption had been that there was a lot of deference to EY because of reverence and fealty stemming from his role in getting the AI alignment field started, not to mention the other ways he has shaped people's thinking. I also assumed that his track record on predictions was just ambiguous enough for people not to question his accuracy. Given that I don't give much credence to the idea that prophets/oracles exist, I thought it unlikely that the high confidence in his predictions was warranted, on the count that there doesn't seem to be much evidence supporting the accuracy of long-range forecasts. I did not think that there were such glaring mispredictions made by EY in the past, so thank you for highlighting them.
8
Verden
2y
I feel like people are missing one fairly important consideration when discussing how much to defer to Yudkowsky, etc. Namely, I've heard multiple times that Nate Soares, the executive director of MIRI, has models of AI risk that are very similar to Yudkowsky's, and their p(doom) are also roughly the same. My limited impression is that Soares is no less smart or otherwise capable than Yudkowsky. So, when having this kind of discussion, focusing on Yudkowsky's track record or whatever, I think it's good to remember that there's another very smart person, who entered AI safety much later than Yudkowsky, and who holds very similar inside views on AI risk.

I really appreciated this update. Mostly it checks out to me, but I wanted to push back on this:

Here’s a dumb thought experiment: Suppose that Yudkowsky wrote all of the same things, but never published them. But suppose, also, that a freak magnetic storm ended up implanting all of the same ideas in his would-be-readers’ brains. Would this absence of a causal effect count against deferring to Yudkowsky? I don’t think so. The only thing that ultimately matters, I think, is his track record of beliefs - and the evidence we currently have about how accurate o

... (read more)

A general reflection: I wonder if at least one minor contributing factor to disagreement, around whether this post is worthwhile, is different understandings about who the relevant audience is.

I mostly have in mind people who have read and engaged a little bit with AI risk debates, but not yet in a very deep way, and would overall be disinclined to form strong independent views on the basis of (e.g.) simply reading Yudkowsky's and Christiano's most recent posts. I think the info I've included in this post could be pretty relevant to these people, since in ... (read more)

I think that insofar as people are deferring on matters of AGI risk etc., Yudkowsky is in the top 10 people in the world to defer to based on his track record, and arguably top 1. Nobody who has been talking about these topics for 20+ years has a similarly good track record. If you restrict attention to the last 10 years, then Bostrom does and Carl Shulman and maybe some other people too (Gwern?), and if you restrict attention to the last 5 years then arguably about a dozen people have a somewhat better track record than him. 

(To my knowledge. I think... (read more)

The part of this post which seems most wild to me is the leap from "mixed track record" to

In particular, I think, they shouldn’t defer to him more than they would defer to anyone else who seems smart and has spent a reasonable amount of time thinking about AI risk.

For any reasonable interpretation of this sentence, it's transparently false. Yudkowsky has proven to be one of the best few thinkers in the world on a very difficult topic. Insofar as there are others who you couldn't write a similar "mixed track record" post about, it's almost entirely bec

... (read more)

I phrased my reply strongly (e.g. telling people to read the other post instead of this one) because deference epistemology is intrinsically closely linked to status interactions, and you need to be pretty careful in order to make this kind of post not end up being, in effect, a one-dimensional "downweight this person". I don't think this post was anywhere near careful enough to avoid that effect. That seems particularly bad because I think most EAs should significantly upweight Yudkowsky's views if they're doing any kind of reasonable, careful deference, ... (read more)

If someone visibly learns from forecasting mistakes they make, that should clearly update us positively on them not repeating the same mistakes.

I suppose one of my main questions is whether he has visibly learned from the mistakes, in this case.

For example, I wasn't able to find a post or comment to the effect of "When I was younger, I spent years of my life motivated by the belief that near-term extinction from nanotech was looming. I turned out to be wrong. Here's what I learned from that experience and how I've applied it to my forecasts of near-t... (read more)

Eliezer writes a bit about his early AI timeline and nanotechnology opinions here, though it sure is a somewhat obscure reference that takes a bunch of context to parse:  

Luke Muehlhauser reading a previous draft of this (only sounding much more serious than this, because Luke Muehlhauser):  You know, there was this certain teenaged futurist who made some of his own predictions about AI timelines -

Eliezer:  I'd really rather not argue from that as a case in point.  I dislike people who screw up something themselves, and then argue like

... (read more)

While he's not single-handedly responsible, he led the movement to take AI risk seriously at a time when approximately no one was talking about it, which has now attracted the interest of top academics. This isn't a complete track record, but it's still a very important data point.

I definitely do agree with that!

It's possible I should have emphasized the significance of it more in the post, rather than moving on after just a quick mention at the top.

If it's of interest: I say a little more about how I think about this, in response to Gwern's comment ... (read more)

What?

I interpreted Gwern as mostly highlighting that people have updated towards Yudkowsky's views - and using this as evidence in favor of the view that we should defer a decent amount to Yudkowsky. I think that was a reasonable move.

There is also a causal question here ('Has Yudkowsky on-net increased levels of concern about AI risk relative to where they would otherwise be?'), but I didn't take the causal question to be central to the point Gwern was making. Although now I'm less sure.

I don't personally have strong views on the causal question - I haven't thought through the counterfactual.

On 1 (the nanotech case):

I want to remind any reader that this is an opinion from 1999, when Eliezer was barely 20 years old.

I think your comment might give the misimpression that I don't discuss this fact in the post or explain why I include the case. What I write is:

I should, once again, emphasize that Yudkowsky was around twenty when he did the final updates on this essay. In that sense, it might be unfair to bring this very old example up.

Nonetheless, I do think this case can be treated as informative, since: the belief was so analogous to his cu

... (read more)
1
TAG
8mo
"Orthogonality thesis: Intelligence can be directed toward any compact goal…. Instrumental convergence: An AI doesn’t need to specifically hate you to hurt you; a paperclip maximizer doesn’t hate you but you’re made out of atoms that it can use to make paperclips, so leaving you alive represents an opportunity cost and a number of foregone paperclips…. Rapid capability gain and large capability differences: Under scenarios seeming more plausible than not, there’s the possibility of AIs gaining in capability very rapidly, achieving large absolute differences of capability, or some mixture of the two…. 1-3 in combination imply that Unfriendly AI is a critical problem-to-be-solved, because AGI is not automatically nice, by default does things we regard as harmful, and will have avenues leading up to great intelligence and power.”"   1-3 in combination don't imply anything with high probability.

One quick response, since it was easy (might respond more later): 

Overall, then, I do think it's fair to consider a fast-takeoff to be a core premise of the classic arguments. It wasn't incidental or a secondary consideration.

I do think takeoff speeds between 1 week and 10 years are a core premise of the classic arguments. I do think the situation looks very different if we spend 5+ years in the human domain, but I don't think there are many who believe that that is going to happen. 

I don't think the distinction between 1 week and 1 year is that ... (read more)

No, it's just as I said, and your Karnofsky retrospective strongly supports what I said.

I also agree that Karnofsky's retrospective supports Gwern's analysis, rather than doing the opposite.

(I just disagree about how strongly it counts in favor of deference to Yudkowsky. For example, I don't think this case implies we should currently defer more to Yudkowsky's risk estimates than we do to Karnofsky's.)

-1
Charles He
2y
Ugh. Y'all just made me get into "EA rhetoric" mode:

What? No. Not only is this not true, but this is indulging in a trivial rhetorical maneuver.

My comment said that the counterfactual would be better without the involvement of the person mentioned in the OP. I used the retrospective as evidence. The retrospective includes at least two points for why the author changed their mind:

1. The book Superintelligence, which they explicitly said was the biggest event.
2. The author moved to SF and learned about DL, and was informed by speaking to non-rationalist AI researchers, and then decided that LessWrong and MIRI were right.

In response to this, Gwern states point #2, and asserts that this is causal evidence in favor of the person mentioned in the OP being useful. Why? How?

Notice that #2 above doesn't at all rule out that the founders or culture was repellent. In fact, it seems like a lavish and unlikely amount of involvement.

Thanks for the comment! A lot of this is useful.

calling LOGI and related articles 'wrong' because that's not how DL looks right now is itself wrong. Yudkowsky has never said that DL or evolutionary approaches couldn't work, or that all future AI work would look like the Bayesian program and logical approach he favored;

I mainly have the impression that LOGI and related articles were probably "wrong" because, so far as I've seen, nothing significant has been built on top of them in the intervening decade and a half (even though LOGI's successor was seeming... (read more)

I do not want an epistemic culture that finds it acceptable to challenge an individual's overall credibility in lieu of directly engaging with their arguments.

I think I roughly agree with you on this point, although I would guess I have at least a somewhat weaker version of your view. If discourse about people's track records or reliability starts taking up (e.g.) more than a fifth of the space that object-level argument does, within the most engaged core of people, then I do think that will tend to suggest an unhealthy or at least not-very-intellectuall... (read more)

I prefer to just analyse and refute his concrete arguments on the object level.

I agree that work analyzing specific arguments is, overall, more useful than work analyzing individual people's track records. Personally, partly for that reason, I've actually done a decent amount of public argument analysis (e.g. here, here, and most recently here) but never written a post like this before.

Still, I think, people do in practice tend to engage in epistemic deference. (I think that even people who don't consciously practice epistemic deference tend to be influ... (read more)

(I hadn't seen this reply when I made my other reply).

What do you think of legitimising behaviour that calls out the credibility of other community members in the future?

I am worried about displacing the concrete object level arguments as the sole domain of engagement. A culture in which arguments cannot be allowed to stand by themselves. In which people have to be concerned about prior credibility, track record and legitimacy when formulating their arguments...

It feels like a worse epistemic culture.

However, if there's no correlation between the payoff of an arm and our ability to know it, then we should eventually find an arm that pays off 100% of the time with high probability, pull that arm, and stop worrying about the unknowable one. So I'm not sure your story explains why we end up fixating on the uncertain interventions (AIS research).

The story does require there to be only a very limited number of arms that we initially think have a non-negligible chance of paying. If there are unlimited arms, then one of them should be both paying and easil... (read more)

A follow-on:

The above post focused on the idea that certain traits -- reflectiveness and self-skepticism -- are more valuable in the context of non-profits (especially ones with long-term missions) than they are in the context of startups.

I also think that certain traits -- drivenness, risk-tolerance, and eccentricity -- are less valuable in the context of non-profits than they are in the context of startups.

Hiring advice from the startup world often suggests that you should be looking for extraordinarily driven, risk-tolerant people with highly idiosyncratic pe... (read more)

The bandit problem is definitely related, although I'm not sure it's the best way to formulate the situation here. The main issue is that the bandit formulation, here, treats learning about the magnitude of a risk and working to address the risk as the same action - when, in practice, they often come apart.

Here's a toy model/analogy that feels a bit more like it fits the case, in my mind.

Let's say there are two types of slot machines: one that has a 0% chance of paying and one that has a 100% chance of paying. Your prior gives you a 90% credence that each ... (read more)
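To make the toy model concrete, here is a minimal simulation sketch. The comment above is truncated, so the specific numbers used below -- the 10% prior, ten machines, five of them investigable -- are illustrative assumptions on my part rather than figures from the post:

```python
import random

# Hypothetical illustration of the toy setup sketched above. The exact prior,
# the number of machines, and the investigable/non-investigable split are
# illustrative assumptions, not figures from the original comment.

random.seed(0)

N_MACHINES = 10
PRIOR_PAYS = 0.10      # assumed prior credence that any given machine pays
N_INVESTIGABLE = 5     # assumed number of machines we can cheaply test

# Each machine either never pays (False) or always pays (True).
machines = [random.random() < PRIOR_PAYS for _ in range(N_MACHINES)]
credences = [PRIOR_PAYS] * N_MACHINES

# Investigating a machine resolves our credence about it to 0 or 1.
for i in range(N_INVESTIGABLE):
    credences[i] = 1.0 if machines[i] else 0.0

# After investigating what we can, the machines we still assign an
# intermediate credence to are exactly the ones we couldn't investigate --
# even though they were never any more likely to pay than the others.
uncertain = [i for i, c in enumerate(credences) if 0.0 < c < 1.0]
print("Machines still assigned intermediate credence:", uncertain)
```

The point of the sketch is only that residual uncertainty ends up concentrated on whatever we couldn't investigate, which mirrors the claim that concern concentrates on hard-to-analyze risks.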

2
RyanCarey
2y
Interesting, that makes perfect sense. However, if there's no correlation between the payoff of an arm and our ability to know it, then we should eventually find an arm that pays off 100% of the time with high probability, pull that arm, and stop worrying about the unknowable one. So I'm not sure your story explains why we end up fixating on the uncertain interventions (AIS research).  Another way to explain why the uncertain risks look big would be that we are unable to stop society pulling the AI progress lever until we have proven it to be dangerous. Definitely risky activities just get stopped! Maybe that's implicitly how your model gets the desired result.

Couldn't the exact same arguments be made to argue that there would not be successful internet companies, because the fundamental tech is hard to patent, and any website is easy to duplicate?

Definitely!

(I say above that the dynamic applies to "most software," but should have said something broader to make it clear that it also applies to any company whose product - basically - is information that it's close to costless to reproduce/generate. The book Information Rules is really good on this.)

Sometimes the above conditions hold well enough for people to ... (read more)

4
RyanCarey
2y
1. Ah, you do say that. Serves me right for skimming!
2. To start, you could have a company for each domain area that an AI needs to be fine-tuned, marketed, and adapted to meet any regulatory requirements. Writing advertising copy, editing, insurance evaluations, etc.
3. As for the foundation models themselves, I think training models is too expensive to go back to academia as you suggest. And I think that there are some barriers to getting priced down. Firstly, when you say you need "patents or very-hard-to-learn-or-rediscover trade secrets", does the cost of training the model not count? It is a huge barrier. There are also difficulties in acquiring AI talent. And future patents seem likely. We're already seeing a huge shift with AI researchers leaving big tech for startups, to try to capture more of the value of their work, and this shift could go a lot further.

It’ll be interesting to see how well companies will be able to monetise large, multi-purpose language and image-generation models.

Companies and investors are spending increasingly huge amounts of money on ML research talent and compute, typically with the hope that investments in this area lead to extremely profitable products. But - even if the resulting products are very useful and transformative - it still seems like a bit of an open question how profitable they’ll be.

Some analysis:[1]

1.

Although huge state-of-the-art models are increasingly c... (read more)

6
Tamay
2y
This is insightful. Some quick responses:

* My guess would be that the ability to commercialize these models would strongly hinge on the ability for firms to wrap these up with complementary products, which would contribute to an ecosystem with network effects, dependencies, evangelism, etc.
* I wouldn't draw too strong conclusions from the fact that the few early attempts to commercialize models like these, notably by OpenAI, haven't succeeded in creating the preconditions for generating a permanent stream of profits. I'd guess that their business models look less-than-promising on this dimension because (and this is just my impression) they've been trying to find product-market fit, and have gone lightly on exploiting particular fits they found by building platforms to service these.
* Instead, better examples of what commercialization looks like are GPT-3-powered companies, like copysmith, which seem a lot more like traditional software businesses with the usual tactics for locking users in, and creating network effects and single-homing behaviour.
* I expect that companies will have ways to create switching costs for these models that traditional software products don't have. I'm particularly interested in fine-tuning as a way to lock in users by enabling models to strongly adapt to context about the users' workloads. More intense versions of this might also exist, such as learning directly from individual customers' feedback through something like RL. Note that this is actually quite similar to how non-software services create loyalty.

I agree that it seems hard to commercialize these models out of the box with something like paid API access, but I expect, given the points above, that this will be superseded by better strategies.
4
RyanCarey
2y
Couldn't the exact same arguments be made to argue that there would not be successful internet companies, because the fundamental tech is hard to patent, and any website is easy to duplicate? But this just means that instead of monetising the bottom layer of tech (TCP/IP, or whatever), they make their billions from layering needed stuff on top - search, social network, logistics.
2
Charles He
2y
This was excellent!

This is a helpful comment - I'll see if I can reframe some points to make them clearer.

Human psychology is flawed in such a way that we consistently estimate the probability of existential risk from each cause to be ~10% by default.

I'm actually not assuming human psychology is flawed. The post is meant to be talking about how a rational person (or, at least, a boundedly rational person) should update their views.

On the probabilities: I suppose I'm implicitly evoking both a subjective notion of probability ("What's a reasonable credence to assign to X h... (read more)

Good points - those all seem right to me!

A point about hiring and grantmaking that may already be conventional wisdom:

If you're hiring for highly autonomous roles at a non-profit, or looking for non-profit founders to fund, then advice derived from the startup world is often going to overweight the importance of entrepreneurialism relative to self-skepticism and reflectiveness.[1]

Non-profits, particularly non-profits with longtermist missions, are typically trying to maximize something that is way more illegible than time-discounted future profits. To give a specific example: I think it's way ha... (read more)

3
bgarfinkel
2y
A follow-on: The above post focused on the idea that certain traits -- reflectiveness and self-skepticism -- are more valuable in the context of non-profits (especially ones with long-term missions) than they are in the context of startups. I also think that certain traits -- drivenness, risk-tolerance, and eccentricity -- are less valuable in the context of non-profits than they are in the context of startups.

Hiring advice from the startup world often suggests that you should be looking for extraordinarily driven, risk-tolerant people with highly idiosyncratic perspectives on the world.[1] And, in the context of for-profit startups, it makes sense that these traits would be crucial. A startup's success will often depend on its ability to outcompete large, entrenched firms in some industry (e.g. taxi companies, hotels, tech giants). To do that, an extremely high level of drivenness may be necessary to compensate for lower resource levels, lower levels of expertise, and weaker connections to gatekeepers. Or you may need to be willing to take certain risks (e.g. regulatory/PR/enemy-making risks) that would slow down existing companies in pursuing certain opportunities. Or you may need to simply see an opportunity that virtually no one else would (despite huge incentives to see it), because you have an idiosyncratic way of seeing the world. Having all three of these traits (extreme drivenness, risk tolerance, idiosyncrasy) may be necessary for you to have any plausible chance of success.

I think that all of these traits are still valuable in the non-profit world, but I also think they're comparatively less valuable (especially if you're lucky enough to have secure funding). There's simply less direct competition in the non-profit world. Large, entrenched non-profits also have much weaker incentives to find and exploit impact opportunities. Furthermore, the non-profit world isn't even that big to begin with. So there's no reason to assume all the low-hanging fruit have
5
ac
2y
I think that’s mostly right, with a couple of caveats:

* You only mentioned non-profits, but I think most of this applies to other longtermist organizations with pretty illegible missions. Maybe Anthropic is an example.
* Some organizations with longtermist missions should not aim to maximise something particularly illegible. In these cases, entrepreneurialism will often be very important, including in highly autonomous roles. For example, some biosecurity organization could be trying to design and produce, at very large scales, “Super PPE”, such as masks, engineered with extreme events in mind.
  * Like SpaceX, which initially aimed to significantly reduce the cost, and improve the supply, of routine space flight, the Super PPE project would need to improve upon existing PPE designed for use in extreme events, which is “bulky, highly restrictive, and insufficiently abundant”. (Alvea might be another example, but I don’t know enough about them.)
* This suggests a division of labour where project missions are defined by individuals outside the organization, as with Super PPE, before being executed by others, who are high on entrepreneurialism. Note that, in hiring for leadership roles in the organization, this will mean placing more weight on entrepreneurialism than on self-skepticism and reflectiveness. While Musk did a poor job defining SpaceX's mission, he did an excellent job executing it.

This seems true. It also suggests that if you can be extremely high on both traits, you’ll bring significant counterfactual value.

I think most people would probably regard the objection as a nitpick (e.g. "OK, maybe the Indifference Principle isn't actually sufficient to support a tight formal argument, and you need to add in some other assumption, but the informal version of the argument is just pretty clearly right"), feel the objection has been successfully answered (e.g. find the response in the Simulation Argument FAQ more compelling than I do), or just haven't noticed the potential issue.

I think it's still totally reasonable for the paper to have passed peer review. ... (read more)

To be clear, I'm not saying the conclusion is wrong - just that the explicit assumptions the paper makes (mainly the Indifference Principle) aren't sufficient to imply its conclusion.

The version that you've just presented isn't identical to the one in Bostrom's paper -- it's (at least implicitly) making use of assumptions beyond the Indifference Principle. And I think it's surprisingly non-trivial to work out exactly how to formalize the needed assumptions, and make the argument totally tight, although I'd still guess that this is ultimately possible.[1]


... (read more)
1
Leo
2y
My version tried to be an intuitive simplification of the core of Bostrom's paper. I actually don't identify these assumptions you mention. If you are right, I may have presupposed them while reading the paper, or my memory may be betraying me for the sake of making sense of it. Anyway, I really appreciate you took the time to comment.

I'm trying to understand the simulation argument. I think Bostrom uses the Indifference Principle (IP) in a weird way. If we become a posthuman civilization that runs many many simulations of our ancestors (meaning us), then how does the IP apply? It only applies when one has no other information to go on. But in this case, we do have some extra information -- crucial information! I.e., we know that we are not in any of the simulations that we have produced. Therefore, we do not have any statistical reason to believe that we are simulated.

I agree that t... (read more)
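For readers trying to follow the formal step at issue, here is a rough sketch of the fraction-of-simulated-observers calculation and the "bland indifference principle", as I recall them from Bostrom's paper. The notation and phrasing are my paraphrase from memory, not a quotation:

```latex
% Sketch (my paraphrase/notation, from memory of Bostrom's paper -- not a quotation).
%
%   f_p     : fraction of human-level civilizations that reach a posthuman stage
%   \bar{N} : average number of ancestor-simulations run by such civilizations
%   \bar{H} : average number of individuals who lived before a civilization
%             reaches a posthuman stage
\[
  f_{\mathrm{sim}}
  = \frac{f_p \, \bar{N} \, \bar{H}}{f_p \, \bar{N} \, \bar{H} + \bar{H}}
  = \frac{f_p \, \bar{N}}{f_p \, \bar{N} + 1},
  \qquad
  \Pr(\mathrm{SIM} \mid f_{\mathrm{sim}} = x) = x
  \quad \text{(the ``bland'' indifference principle).}
\]
```

The objection above is, in effect, about whether that last conditional-credence step still applies once we condition on further information, such as knowing we are not in any simulation we ourselves have produced.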

-15
Alex Williams
2y
-11
Alex Williams
2y
1
Leo
2y
I would like to understand how that is a valid objection, because I honestly don't see it. To simplify a bit, if you think that 1 ('humanity won't reach a posthuman stage') and 2 ('posthuman civilizations are extremely unlikely to run vast numbers of simulations') are false, it follows that humanity will probably both reach a posthuman stage and run a vast number of simulations. Now if you really think this will probably happen, I can see no reason to deny that it has already happened in the past. Why postulate that we will be the first simulators? There's no empirical evidence to support it, given that we are talking about extremely detailed, realistic simulations, and as it was already agreed that simulations are so many, it seems very, very unlikely that we are located at the first level. In other words, if one believes that intelligent life is part of a process which normally culminates with a massive ancestor-simulation program, the fact that there is intelligent life is not enough to find out in what part of the process it is located.

The actual worry with inner misalignment style concerns is that the selection you do during training does not fully constrain the goals of the AI system you get out; if there are multiple goals consistent with the selection you applied during training there's no particular reason to expect any particular one of them. Importantly, when you are using natural selection or gradient descent, the constraints are not "you must optimize X goal", the constraints are "in Y situations you must behave in Z ways", which doesn't constrain how you behave in totally diff

... (read more)
5
Rohin Shah
2y
I mostly go ¯\_(ツ)_/¯ , it doesn't feel like it's much evidence of anything, after you've updated off the abstract argument. The actual situation we face will be so different (primarily, we're actually trying to deal with the alignment problem, unlike evolution). I do agree that in saying " ¯\_(ツ)_/¯ " I am disagreeing with a bunch of claims that say "evolution example implies misalignment is probable". I am unclear to what extent people actually believe such a claim vs. use it as a communication strategy. (The author of the linked post states some uncertainty but presumably does believe something similar to that; I disagree with them if so.)

I like the general idea but the way I'd do it is by doing some black-box investigation of current language models and asking these questions there; I expect we understand the "ancestral environment" of a language model way, way better than we understand the ancestral environment for humans, making it a lot easier to draw conclusions; you could also finetune the language models in order to simulate an "ancestral environment" of your choice and see what happens then.

I agree with the murder example being a tiny bit reassuring for training non-murderous AIs; medium-reassuring is probably too much, unless we're expecting our AI systems to be put into the same sorts of situations / ancestral environments as humans were in. (Note that to be the "same sort of situation" it also needs to have the same sort of inputs as humans, e.g. vision + sound + some sort of controllable physical body seems important.)

(Disclaimer: The argument I make in this short-form feels a little sophistic to me. I’m not sure I endorse it.)

Discussions of AI risk, particularly risks from “inner misalignment,” sometimes heavily emphasize the following observation:

Humans don’t just care about their genes: Genes determine, to a large extent, how people behave. Some genes are preserved from generation-to-generation and some are pushed out of the gene-pool. Genes that cause certain human behaviours (e.g. not setting yourself on fire) are more likely to be preserved. But people don’t care

... (read more)

The actual worry with inner misalignment style concerns is that the selection you do during training does not fully constrain the goals of the AI system you get out; if there are multiple goals consistent with the selection you applied during training there's no particular reason to expect any particular one of them. Importantly, when you are using natural selection or gradient descent, the constraints are not "you must optimize X goal", the constraints are "in Y situations you must behave in Z ways", which doesn't constrain how you behave in totally diffe... (read more)

The existential risk community’s relative level of concern about different existential risks is correlated with how hard-to-analyze these risks are. For example, here is The Precipice’s ranking of the top five most concerning existential risks:

  1. Unaligned artificial intelligence[1]
  2. Unforeseen anthropogenic risks (tied)
  3. Engineered pandemics (tied)
  4. Other anthropogenic risks
  5. Nuclear war (tied)
  6. Climate change (tied)

This isn’t surprising.

For a number of risks, when you first hear about them, it’s reasonable to have the reaction “Oh, hm, maybe that could be a ... (read more)

3
Zach Stein-Perlman
2y
Related: (source)

Let’s call the hypothesis that the base rate of major wars hasn’t changed the constant risk hypothesis. The best presentation of this view is in Only the Dead, a book by an IR professor with the glorious name of Bear Braumoeller. He argues that there is no clear trend in the average incidence of several measures of conflict—including uses of force, militarized disputes, all interstate wars, and wars between “politically-relevant dyads”—between 1800 and today.

A quick note on Braumoeller's analysis:

He's relying on the Correlates of War (COW) dataset, whic... (read more)

I'm not familiar with Zoe's work, and would love to hear from anyone who has worked with them in the past. After seeing the red flags mentioned above,  and being stuck with only Zoe's word for their claims, anything from a named community member along the lines of "this person has done good research/has been intellectually honest" would be a big update for me…. [The post] strikes me as being motivated not by a desire to increase community understanding of an important issue, but rather to generate sympathy for the authors and support for their positi

... (read more)
9
RAB
2y
Thanks Ben! That's very helpful info. I'll edit the initial comment to reflect my lowered credence in exaggeration or malfeasance.

FWIW, I haven't had this impression.

Single data point: In the most recent survey on community opinion on AI risk, I was in at least the 75th percentile for pessimism (for roughly the same reasons Lukas suggests below). But I'm also seemingly unusually optimistic about alignment risk.

I haven't found that this is a really unusual combo: I think I know at least a few other people who are unusually pessimistic about 'AI going well,' but also at least moderately optimistic about alignment.

(Caveat that my apparently higher level of pessimism could also be explai... (read more)

Thanks for the clarification! I still feel a bit fuzzy on this line of thought, but hopefully understand a bit better now.

At least on my read, the post seems to discuss a couple different forms of wildness: let’s call them “temporal wildness” (we currently live at an unusually notable time) and “structural wildness” (the world is intuitively wild; the human trajectory is intuitively wild).[1]

I think I still don’t see the relevance of “structural wildness,” for evaluating fishiness arguments. As a silly example: Quantum mechanics is pretty intuitively wild,... (read more)

Ben, that sounds right to me. I also agree with what Paul said. And my intent was to talk about what you call temporal wildness, not what you call structural wildness.

I agree with both you and Arden that there is a certain sense in which the "conservative" view seems significantly less "wild" than my view, and that a reasonable person could find the "conservative" view significantly more attractive for this reason. But I still want to highlight that it's an extremely "wild" view in the scheme of things, and I think we shouldn't impose an inordinate burden of proof on updating from that view to mine.

To say a bit more here, on the epistemic relevance of wildness:

I take it that one of the main purposes of this post is to push back against “fishiness arguments,” like the argument that Will makes in “Are We Living at the Hinge of History?

The basic idea, of course, is that it’s a priori very unlikely that any given person would find themselves living at the hinge of history (and correctly recognise this). Due to the fallibility of human reasoning and due to various possible sources of bias, however, it’s not as unlikely that a given person would mistakenl... (read more)

We were previously comparing two hypotheses:

  1. HoH-argument is mistaken
  2. Living at HoH

Now we're comparing three:

  1. "Wild times"-argument is mistaken
  2. Living at a wild time, but HoH-argument is mistaken
  3. Living at HoH

"Wild time" is almost as unlikely as HoH. Holden is trying to suggest it's comparably intuitively wild, and it has pretty similar anthropic / "base rate" force.

So if your arguments look solid, "All futures are wild" makes hypothesis 2 look kind of lame/improbable - it has to posit a flaw in an argument, and also that you are living at a wildly improb... (read more)
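One rough way to make the structure of this comparison explicit (my own sketch, not part of the original comment): write M for "the HoH-argument is mistaken", W for "we live at a wild time", and H for "we live at the HoH".

```latex
% A rough formalization (my own sketch, not from the original comment).
% The prior odds of hypothesis 3 against hypothesis 2 are:
\[
  \frac{P(H)}{P(W \wedge M)}
  = \frac{P(H)}{P(W)\,P(M \mid W)} .
\]
% If a "wild time" is almost as a-priori unlikely as HoH, i.e. P(W) \approx P(H),
% then the base-rate penalty largely cancels, and the comparison mostly turns
% on P(M | W): how likely a flaw in the argument is, given that we already
% accept living at a wild time.
```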

7
Ofer
3y
I think the more decision-relevant probabilities involve "Someone believes they should act as if they live at the HoH" rather than "Someone believes they live at the HoH". Our actions may be much less important if 'this is all a dream/simulation' (for example). We should make our decisions in the way we wish everyone-similar-to-us-across-the-multiverse would make their decisions.

As an analogy, suppose Alice finds herself getting elected as the president of the US. Let's imagine there are 10^100 citizens in the US. So Alice reasons that it's way more likely that she is delusional than that she is actually the president of the US. Should she act as if she is the president of the US anyway, or rather spend her time trying to regain her grip on reality? The 10^100 citizens want everyone in her situation to choose the former. It is critical to have a functioning president. And it does not matter if there are many delusional citizens who act as if they are the president. Their "mistake" does not matter. What matters is how the real president acts.

Some possible futures do feel relatively more "wild” to me, too, even if all of them are wild to a significant degree. If we suppose that wildness is actually pretty epistemically relevant (I’m not sure it is), then it could still matter a lot if some future is 10x wilder than another.

For example, take a prediction like this:

Humanity will build self-replicating robots and shoot them out into space at close to the speed of light; as they expand outward, they will construct giant spherical structures around all of the galaxy’s stars to extract tremendous v

... (read more)


I suspect you are more broadly underestimating the extent to which people used "insect-level intelligence" as a generic stand-in for "pretty dumb," though I haven't looked at the discussion in Mind Children and Moravec may be making a stronger claim.

I think that's good push-back and a fair suggestion: I'm not sure how seriously the statement in Nick's paper was meant to be taken. I hadn't considered that it might be almost entirely a quip. (I may ask him about this.)

Moravec's discussion in Mind Children is similarly brief: He presents a graph of the co... (read more)

I do think my main impression of insect <-> simulated robot parity comes from very fuzzy evaluations of insect motor control vs simulated robot motor control (rather than from any careful analysis, of which I'm a bit more skeptical though I do think it's a relevant indicator that we are at least trying to actually figure out the answer here in a way that wasn't true historically). And I do have only a passing knowledge of insect behavior, from watching youtube videos and reading some book chapters about insect learning. So I don't think it's unfair to put it in the same reference class as Rodney Brooks' evaluations to the extent that his was intended as a serious evaluation.

As a last thought here (no need to respond), I thought it might be useful to give one example of a concrete case where: (a) Tetlock’s work seems relevant, and I find the terms “inside view” and “outside view” natural to use, even though the case is relatively different from the ones Tetlock has studied; and (b) I think many people in the community have tended to underweight an “outside view.”

A few years ago, I pretty frequently encountered the claim that recently developed AI systems exhibited roughly “insect-level intelligence.” This claim was typically used... (read more)

The Nick Bostrom quote (from here) is:

In retrospect we know that the AI project couldn't possibly have succeeded at that stage. The hardware was simply not powerful enough. It seems that at least about 100 Tops is required for human-like performance, and possibly as much as 10^17 ops is needed. The computers in the seventies had a computing power comparable to that of insects. They also achieved approximately insect-level intelligence.

I would have guessed this is just a funny quip, in the sense that (i) it sure sounds like it's just a throw-away quip, no e... (read more)

Thank you (and sorry for my delayed response)!

I shudder at the prospect of having a discussion about "Outside view vs inside view: which is better? Which is overrated and which is underrated?" (and I've worried that this thread may be tending in that direction) but I would really look forward to having a discussion about "let's look at Daniel's list of techniques and talk about which ones are overrated and underrated and in what circumstances each is appropriate."

I also shudder a bit at that prospect.

I am sometimes happy making pretty broad and sloppy ... (read more)

2
kokotajlod
3y
I guess we can just agree to disagree on that for now. The example statement you gave would feel fine to me if it used the original meaning of "outside view" but not the new meaning, and since many people don't know (or sometimes forget) the original meaning...

100% agreement here, including on the bolded bit.

Also agree here, but again I don't really care which one is overall more problematic because I think we have more precise concepts we can use and it's more helpful to use them instead of these big bags.

I think I agree with all this as well, noting that this causal/deductive reasoning definition of inside view isn't necessarily what other people mean by inside view, and also isn't necessarily what Tetlock meant. I encourage you to use the term "causal/deductive reasoning" instead of "inside view," as you did here, it was helpful (e.g. if you had instead used "inside view" I would not have agreed with the claim about baseline bias)

I'm not sure if you think this is an interesting point to notice that's useful for building a world-model, and/or a reason to be skeptical of technical alignment work. I'd agree with the former but disagree with the latter.

Mostly the former!

I think the point may have implications for how much we should prioritize alignment research, relative to other kinds of work, but this depends on what the previous version of someone's world model was.

For example, if someone has assumed that solving the 'alignment problem' is close to sufficient to ensure that human... (read more)

It’s definitely entirely plausible that I’ve misunderstood your views.

My interpretation of the post was something like this:

There is a bag of things that people in the EA community tend to describe as “outside views.” Many of the things in this bag are over-rated or mis-used by members of the EA community, leading to bad beliefs.

One reason for this over-use or mis-use is that the term “outside view” has developed an extremely positive connotation within the community. People are applauded for saying that they’re relying on “outside views” — “outside

... (read more)
8
kokotajlod
3y
Wow, that's an impressive amount of charitable reading + attempting-to-ITT you did just there, my hat goes off to you sir! I think that summary of my view is roughly correct.

I think it over-emphasizes the applause light aspect compared to other things I was complaining about; in particular, there was my second point in the "this expansion of meaning is bad" section, about how people seem to think that it is important to have an outside view and an inside view (but only an inside view if you feel like you are an expert), which is, IMO, totally not the lesson one should draw from Tetlock's studies etc., especially not with the modern, expanded definition of these terms. I also think that, while I am mostly complaining about what's happened to "outside view," similar things apply to "inside view," and thus I recommend tabooing it also.

In general, the taboo solution feels right to me; when I imagine re-doing various conversations I've had, except without that phrase, and people instead using more specific terms, I feel like things would just be better. I shudder at the prospect of having a discussion about "Outside view vs inside view: which is better? Which is overrated and which is underrated?" (and I've worried that this thread may be tending in that direction) but I would really look forward to having a discussion about "let's look at Daniel's list of techniques and talk about which ones are overrated and underrated and in what circumstances each is appropriate."

Now I'll try to say what I think your position is: How does that sound?

On the contrary; tabooing the term is more helpful, I think. I've tried to explain why in the post. I'm not against the things "outside view" has come to mean; I'm just against them being conflated with / associated with each other, which is what the term does. If my point was simply that the first Big List was overrated and the second Big List was underrated, I would have written a very different post!

My initial comment was focused on your point about conflation, because I think this point bears on the linguistic question more strongly than the other p... (read more)

2
kokotajlod
3y
I said in the post that I'm a fan of reference classes. I feel like you think I'm not? I am! I'm also a fan of analogies. And I love trend extrapolation. I admit I'm not a fan of the anti-weirdness heuristic, but even it has its uses. In general, most of what you are saying in this thread is stuff I agree with, which makes me wonder if we are talking past each other. (Example 1: your second small comment about reference class tennis. Example 2: your first small comment, if we interpret instances of "outside view" as meaning "reference classes" in the strict sense, though not if we use the broader definition you favor. Example 3: your points a, b, c, and e. (Point d, again, depends on what you mean by 'outside view,' and also what counts as often.))

My problem is with the term "outside view." (And "inside view" too!) I don't think you've done much to argue in favor of it in this thread. You have said that in your experience it doesn't seem harmful; fair enough, point taken. In mine it does. You've also given two rough definitions of the term, which seem quite different to me, and also quite fuzzy. (E.g. if by "reference class forecasting" you mean the stuff Tetlock's studies are about, then it really shouldn't include the anti-weirdness heuristic, but it seems like you are saying it does?) I found myself repeatedly thinking "but what does he mean by outside view? I agree or don't agree depending on what he means..." even though you had defined it earlier.

You've said that you think the practices you call "outside view" are underrated and deserve positive reinforcement; I totally agree that some of them are, but I maintain that some of them are overrated, and would like to discuss each of them on a case-by-case basis instead of lumping them all together under one name. Of course you are free to use whatever terms you like, but I intend to continue to ask people to be more precise when I hear "outside view" or "inside view." :)

I agree that people sometimes put too much weight on particular outside views -- or do a poor job of integrating outside views with more inside-view-style reasoning. For example, in the quote/paraphrase you present at the top of your post, something has clearly gone wrong.[1]

But I think the best intervention, in this case, is probably just to push the ideas "outside views are often given too much weight" or "heavy reliance on outside views shouldn't be seen as praiseworthy" or "the correct way to integrate outside views with more inside-view reasoning is... (read more)

9
kokotajlod
3y
On the contrary; tabooing the term is more helpful, I think. I've tried to explain why in the post. I'm not against the things "outside view" has come to mean; I'm just against them being conflated with / associated with each other, which is what the term does. If my point was simply that the first Big List was overrated and the second Big List was underrated, I would have written a very different post!

By what definition of "outside view?" There is some evidence that in some circumstances people don't take reference class forecasting seriously enough; that's what the original term "outside view" meant. What evidence is there that the things on the Big List O' Things People Describe as Outside View are systematically underrated by the average intellectual?

When people use “outside view” or “inside view” without clarifying which of the things on the above lists they mean, I am left ignorant of what exactly they are doing and how well-justified it is. People say “On the outside view, X seems unlikely to me.” I then ask them what they mean, and sometimes it turns out they are using some reference class, complete with a dataset. (Example: Tom Davidson’s four reference classes for TAI). Other times it turns out they are just using the anti-weirdness heuristic. Good thing I asked for elaboration!

FWIW, as a... (read more)

8
kokotajlod
3y
Thanks for this thoughtful pushback. I agree that YMMV; I'm reporting how these terms seem to be used in my experience, but my experience is limited.

I think opacity is only part of the problem; illicitly justifying sloppy reasoning is most of it. (My second and third points in the "this expansion of meaning is bad" section.) There is an aura of goodness surrounding the words "outside view" because of the various studies showing how it is superior to the inside view in various circumstances, and because of e.g. Tetlock's advice to start with the outside view and then adjust. (And a related idea that we should only use inside view stuff if we are experts... For more on the problems I'm complaining about, see the meme, or Eliezer's comment.) This is all well and good if we use those words to describe what was actually talked about by the studies, by Tetlock, etc., but if instead we have the much broader meaning of the term, we are motte-and-bailey-ing ourselves.

Fortunately, if I remember correctly, something like the distinction between the true criterion of rightness and the best practical decision procedure actually is a major theme in the Kagan book. (Although I think the distinction is probably often underemphasized.)

It is therefore kind of misleading to think of consequentialism vs. deontology vs. virtue ethics as alternative theories, which, however, is the way normative ethics is typically presented in the analytic tradition.

I agree there is something to this concern. But I still wouldn't go so far as to... (read more)

6
Max_Daniel
3y
Yeah, I think these are good points. I also suspect that many deontologists and virtue ethicists would be extremely annoyed at my claim that they aren't alternative theories to consequentialism.  (Though I also suspect that many are somewhat annoyed at the typical way the distinctions between these types of theories are described by philosophers in a broadly consequentialist tradition. My limited experience debating with committed Kantians suggests that disagreements seem much more fundamental than "I think the right action is the one with the best consequences, and you think there are additional determinants of rightness beyond axiology", or anything like that.)

A slightly boring answer: I think most people should read at least part of something that overviews common theories and frameworks in normative ethics (and the arguments for and against them) and at least part of something that overviews core concepts and principles in economics (e.g. the idea of expected utility, the idea of an externality, supply/demand, the basics of economic growth, the basics of public choice).
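
(In case it's helpful, here is a minimal sketch of the expected-utility idea, since it's the most formula-like item on that list. It's a toy illustration with made-up numbers and hypothetical action names, not something drawn from any particular text.)

```python
# Toy expected-utility calculation. All numbers are made up purely for
# illustration; the point is just the structure EU(a) = sum_s P(s) * U(a, s).

# Probabilities over states of the world (assumed the same for both actions).
state_probs = {"rain": 0.3, "no_rain": 0.7}

# Utilities of each (action, state) pair.
utilities = {
    ("take_umbrella", "rain"): 8,
    ("take_umbrella", "no_rain"): 9,
    ("leave_umbrella", "rain"): 2,
    ("leave_umbrella", "no_rain"): 10,
}

def expected_utility(action):
    """Probability-weighted average of the action's utility across states."""
    return sum(p * utilities[(action, state)] for state, p in state_probs.items())

for action in ("take_umbrella", "leave_umbrella"):
    print(action, expected_utility(action))

# Arithmetic: take_umbrella = 0.3*8 + 0.7*9 = 8.7; leave_umbrella = 0.3*2 + 0.7*10 = 7.6.
# The standard decision rule picks the action with the higher expected utility.
```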

In my view, normative ethics and economics together make up a really large portion of the intellectual foundation that EA is built on.

One good book that overview... (read more)

I remember that reading up on normative ethics was one of the first things I focused on after I had encountered EA. I'm sure it was useful in many ways. For some reason, however, I feel surprisingly lukewarm about recommending that people read about normative ethics. 

It could be because my view these days is roughly: "Once you realize that consequentialism is great as a 'criterion of rightness' but doesn't work as 'decision procedure' for boundedly rational agents, a lot of the themes from deontology, virtue ethics, moral particularism, and moral plur... (read more)

That's a good example.

I do agree that quasi-random variation in culture can be really important. And I agree that this variation is sometimes pretty sticky (e.g. Europe being predominantly Christian and the Middle East being predominantly Muslim for more than a thousand years). I wouldn't say that this kind of variation is a "rounding error."

Over sufficiently long timespans, though, I think that technological/economic change has been more significant.

As an attempt to operationalize this claim: The average human society in 1000AD was obviously very differen... (read more)

FWIW, I wouldn't say I agree with the main thesis of that post.

However, while I expect machines that outcompete humans for jobs, I don’t see how that greatly increases the problem of value drift. Human cultural plasticity already ensures that humans are capable of expressing a very wide range of values. I see no obvious limits there. Genetic engineering will allow more changes to humans. Ems inherit human plasticity, and may add even more via direct brain modifications.

In principle, non-em-based artificial intelligence is capable of expressing the enti

... (read more)
4
abergal
3y
Really appreciate the clarifications! I think I was interpreting "humanity loses control of the future" in a weirdly temporally narrow sense that makes it all about outcomes, i.e. where "humanity" refers to present-day humans, rather than humans at any given time period.  I totally agree that future humans may have less freedom to choose the outcome in a way that's not a consequence of alignment issues. I also agree value drift hasn't historically driven long-run social change, though I kind of do think it will going forward, as humanity has more power to shape its environment at will.

Do you have the intuition that absent further technological development, human values would drift arbitrarily far?

Certainly not arbitrarily far. I also think that technological development (esp. the emergence of agriculture and modern industry) has played a much larger role in changing the world over time than random value drift has.

[E]ven non-extinction AI is enabling a new set of possibilities that modern-day humans would endorse much less than the decisions of future humans otherwise.

I definitely think that's true. But I also think that was true ... (read more)
