RobBensinger

Digital person

I passed the above comment on to the LW team and added:

maybe relevant to LW norms on what we want wiki pages to look like. the main thing that feels weird to me about LW wiki pages is how they sort of encourage you to remove yourself from the picture -- I'd rather if there were an easy/normal way to say something like 'The present author ([insert name here]) believes that...'

Digital person

I also want to make a general complaint about how bad normal impersonal citation style is for clarity / epistemics / understanding.

If I cite a source in normal conversation, it's usually very clear why I'm citing it and what role the citation plays in my argument. In contrast, sticking a '(Smith 2007)' parenthetical at the end of a paragraph often leaves it unclear what role the citation is playing. E.g. (picking an example at random, not trying to find an especially bad one or anything like that):

The inside view and the outside view are two alternative approaches to forecasting. Whereas the inside view attempts to make predictions based on an understanding of the details of a problem, the outside view—also called reference class forecasting—instead looks at similar past situations and predicts based on those outcomes. For example, in trying to predict the time it will take a team to design an academic curriculum, a forecaster can either look at the characteristics of the curriculum to be designed and of the curriculum designers (inside view) or consider the time it has taken past teams to design similar curricula (outside view) (Kahneman & Lovallo 1993).

What should the reader infer about K&L 1993 here?

  • Is K&L 1993 a good introduction to inside vs. outside view? To reference class forecasting? Or is it a good introduction to neither -- are the authors instead citing it even though it's a bad introduction? Say, because of its historical importance (did K&L 1993 coin one of the terms in question, or initiate study of the phenomenon?), or because it provides unusually strong evidence for... something? (For what, exactly?)
  • Is K&L 1993 specifically about the academic curriculum design case (hence its being cited at the end of the paragraph, rather than in an earlier sentence)?
  • Was K&L 1993 chosen because it's a relatively important, useful, or well-known source? Or is it just the source the author(s) happened to be familiar with, or the first source they found on a quick Google Scholar check?

If I say 'K&L argue X' or 'K&L showed X', or better yet 'The most cited academic article on X is K&L' or 'K&L is an example of a study showing X (the first result this page's authors found on Google)', then the relationship of the source to the claims being made is much clearer.

The general point I want to gesture at is the one made in Rationality and the English Language:

[...] “Writers are told to avoid usage of the passive voice.” A rationalist whose background comes exclusively from science may fail to see the flaw in the previous sentence; but anyone who’s done a little writing should see it right away. I wrote the sentence in the passive voice, without telling you who tells authors to avoid passive voice. Passive voice removes the actor, leaving only the acted-upon. “Unreliable elements were subjected to an alternative justice process”—subjected by whom? What does an “alternative justice process” do? With enough static noun phrases, you can keep anything unpleasant from actually happening.

Journal articles are often written in passive voice. (Pardon me, some scientists write their journal articles in passive voice. It’s not as if the articles are being written by no one, with no one to blame.) It sounds more authoritative to say “The subjects were administered Progenitorivox” than “I gave each college student a bottle of 20 Progenitorivox, and told them to take one every night until they were gone.” If you remove the scientist from the description, that leaves only the all-important data. But in reality the scientist is there, and the subjects are college students, and the Progenitorivox wasn’t “administered” but handed over with instructions. Passive voice obscures reality. [...]

Journal articles try to sound authoritative and objective, at the cost of being informative and useful. I think a more conversational style is just better for thinking, because you can come closer to providing the actual chains of reasoning and evidence that led you to generate this article rather than a different one.

If twisting our wiki into the shape of an academic-style encyclopedia effectively epistemically self-handicaps us, then I think we should at least consider finding a way to rebrand/reorganize the wiki so we don't feel a need to put it in that style, and can instead make the style more closely resemble 'the ideal way for two people to share information' or 'the ideal way for an EA to think', or something like that.

Digital person

I think that quotes should not be used to reduce the risk of mischaracterization; the safeguard against this is provided by the citations.

I think citations are a vastly weaker safeguard against mischaracterization, compared to quoting the source (or at least giving a specific subsection or page number). This is especially true for citing a book, but reading a bunch of 30-page blog posts in order to dig up a source is a lot to ask too!

One of the ways in which the Wiki contributes value is by sparing the reader the need to consult the original source, and instead providing a succinct statement of the claims and arguments made in those sources.

I think there's similar value in finding a specific sentence or paragraph in a much larger work, and quoting it alongside other unusually-important content. If the author already put it about as clearly and concisely as you can, no need to reinvent the wheel.

(Especially since reinventing the wheel actively loses value, insofar as more possibility for error is introduced each time info is summarized. A quote passed on from source A to B to C retains its original content, whereas if wiki A summarizes wiki B's summary of source C, there's liable to be some degradation of information at each step along the way.

We can try to minimize that degradation, but including quotations helps make that task easier, while also allowing the reader to verify more things for themselves rather than having to choose between 'take our word for it' vs. 'spend the next 45 minutes trying to figure out why you think the source supports the claim you're making'.)

Digital person

I think "artificial" and "machine" are both sort of ambiguous -- ems are products of artifice/engineering/design in some respects but not in others. I think I've seen some people use "AI" to subsume ems, but I think this is less common, especially in EA-ish circles.

Also, I think the strategic significance of AI systems is wildly different from that of ems, so I think if we had one term "X" referring to AI, another "Y" referring to ems, and another "Z" referring to both, then we'd end up using the words X and Y a lot and the term Z rarely. I also don't know of a good word for X other than "AI", so we might need to invent a new one.

Moreover, defining "digital people" as "people that are digital" seems pretty unhelpful, since "people" is a notoriously contested term and Holden explicitly says that digital people may be very different from present-day people (and one of the meanings of "people" is precisely "moral patient").

Holden can correct me if I'm wrong, but I think his goal in introducing the term "digital people" was to have a colloquial term whose meaning will be obvious to a general, non-philosophical audience. In ordinary English, we use "person" / "people" to refer to humans (and I think it's used especially often to refer to adult humans).

Philosophers tend to want crisp definitions with necessary and sufficient conditions. I see Holden as deliberately avoiding that route and just gesturing at the thing laypeople ordinarily use "people" to mean and saying "You know that thing? Well, imagine that but run on a computer. Now, here are some important things I think happen once you have that..."

Building the "important things" into the definition (including e.g. moral patienthood) would undermine Holden's argument, because it might now seem (to the smart non-philosopher reader) that he's trying to sneak those properties in via nonstandard definitions of words. As I read it, Holden's posts are trying to avoid putting focus on the term 'digital people', in favor of trying to take two pre-theoretic, totally ordinary empirical thingies ('digital' and 'person') and show how their combination has crazy real-world consequences.

Digital person
  • WBE is usually defined in contrast to 'AI', rather than as a special case of 'AI'. I suggest "machine intelligence" as an umbrella term.
  • Holden doesn't define digital people as "moral patients"; he defines them as people ('human-ish things'?) that are digital, and he argues that they would (with high probability) have moral standing. (Presumably moral agency too, not just moral patienthood.)
  • I suggest including more quotations from source materials in the wiki, to reduce the risk that we'll mis-characterize claims like this. (Especially when the source material is itself an easy-to-understand introductory resource, rather than something dense and cryptic.)
Outline of Galef's "Scout Mindset"

If someone's specifically looking for a book about EA, I wouldn't give them Scout Mindset and say 'this is a great introduction to EA' -- it's not! Riffing on your analogy, it's more like a world where:

  • There's a book about statistics (or whatever) that happens to be especially useful as a prereq for social science resources -- e.g., it provides the core tools for evaluating social-science claims, even if it doesn't discuss social science on the object level.
  • Social science departments end up healthier when they filter on the kind of person who's interested in the stats book and reads that book, vs. filtering on a social science book.
  • Compared to the content of the stats book, the basics of social science are sufficiently 'in the water', or sufficiently easy to pick up via conversation and scattered blog posts, that there's less lost from soaking it up informally.
  • It's more important that a critical mass of social scientists have the stats book's concepts as cultural touchstones / common language / shared standards / etc., than that they have that for any given social science book's concepts.
  • People who almost go into social science (but decide to do something else instead) end up doing much more useful work if they read the stats book than if they read a social science book (assuming they only read one). (Note that this might make the stats book better consequentially even if it means that fewer people end up doing social science work -- maximizing EA's growth isn't identical to maximizing EA's impact-mediated-by-people-we-court.)

I could of course just be wrong about this. But that's the shape of my view.

Survey on AI existential risk scenarios

Fascinating results! I really appreciate the level of thought and precision you all put into the survey questions.

Were there any strong correlations between which of the five scenarios respondents considered more likely?

Predict responses to the "existential risk from AI" survey

Survey results for Q2, Q1 (hover for spoilers):

OpenAI: ~21%, ~13%

FHI: ~27%, ~19%

DeepMind: (no respondents declared this affiliation)

CHAI/Berkeley: 39%, 39%

MIRI: 80%, 70%

Open Philanthropy: ~35%, ~16%

"Existential risk from AI" survey results

Some reasons I can imagine for focusing on 90+% loss scenarios:

  • You might just have the empirical view that very few things would cause 'medium-sized' losses of a lot of the future's value. It could then be useful to define 'existential risk' to exclude medium-sized losses, so that when you talk about 'x-risks' people fully appreciate just how bad you think these outcomes would be.
  • 'Existential' suggests a threat to the 'existence' of humanity, i.e., an outcome about as bad as human extinction. (Certainly a lot of EAs -- myself included, when I first joined the community! -- misunderstand x-risk and think it's equivalent to extinction risk.)

After googling a bit, I now think Nick Bostrom's conception of existential risk (at least as of 2012) is similar to Toby's. In https://www.existential-risk.org/concept.html, Nick divides up x-risks into the categories "human extinction, permanent stagnation, flawed realization, and subsequent ruination", and says that in a "flawed realization", "humanity reaches technological maturity" but "the amount of value realized is but a small fraction of what could have been achieved". This only makes sense as a partition of x-risks if all x-risks reduce value to "a small fraction of what could have been achieved" (or reduce the future's value to zero).

I still think that the definition of x-risk I proposed is a bit more useful, and I think it's a more natural interpretation of phrasings like "drastically curtail [Earth-originating intelligent life's] potential" and "reduce its quality of life (compared to what would otherwise have been possible) permanently and drastically". Perhaps I should use a new term, like hyperastronomical catastrophe, when I want to refer to something like 'catastrophes that would reduce the total value of the future by 5% or more'.

"Existential risk from AI" survey results

Oh, your survey also frames the questions very differently, in a way that seems important to me. You give multiple-choice questions like:

Which of these is closest to your estimate of the probability that there will be an existential catastrophe due to AI (at any point in time)?

  • 0.0001%
  • 0.001%
  • 0.01%
  • 0.1%
  • 0.5%
  • 1%
  • 2%
  • 3%
  • 4%
  • 5%
  • 6%
  • 7%
  • 8%
  • 9%
  • 10%
  • 15%
  • 20%
  • 25%
  • 30%
  • 35%
  • 40%
  • 45%
  • 50%
  • 55%
  • 60%
  • 65%
  • 70%
  • 75%
  • 80%
  • 85%
  • 90%
  • 95%
  • 100%

... whereas I just asked for a probability.

Overall, you give fourteen options for probabilities below 10%, and two options above 90%. (One of which is the dreaded-by-rationalists "100%".)
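(For the curious, here's a quick sanity check of that count, just tallying the option values quoted above; a minimal sketch, nothing more:)

```python
# Response options from the survey question quoted above, in percent.
options = [0.0001, 0.001, 0.01, 0.1, 0.5, 1, 2, 3, 4, 5, 6, 7,
           8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65,
           70, 75, 80, 85, 90, 95, 100]

below_10 = [p for p in options if p < 10]   # fine gradations at the low end
above_90 = [p for p in options if p > 90]   # only 95% and 100% at the high end

print(len(below_10), len(above_90))  # -> 14 2
```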

By giving many fine gradations of 'AI x-risk is low probability' without giving as many gradations of 'AI x-risk is high probability', you're communicating that low-probability answers are more normal/natural/expected.

The low probabilities are also listed first, which is a natural choice but could still have a priming effect. (Anchoring to 0.0001% and adjusting from that point, versus anchoring to 95%.) On my screen's resolution, you have to scroll down three pages to even see numbers as high as 65% or 80%. I lean toward thinking 'low probabilities listed first' wasn't a big factor, though.
