matthew.vandermerwe

FHI - RA Nick Bostrom (previously, Toby Ord)

Comments

Some learnings I had from forecasting in 2020

With regard to the AGI timeline, it's important to note that Metaculus' resolution criteria are quite different from a 'standard' interpretation of what would constitute AGI[1] (or human-level AI[2], superintelligence[3], transformative AI, etc.). It's also unclear what proportion of forecasters have read this fine print (I'd be interested to hear others' views on this), which further complicates interpretation.

For these purposes we will thus define "an artificial general intelligence" as a single unified software system that can satisfy the following criteria, all easily completable by a typical college-educated human.

  • Able to reliably pass a Turing test of the type that would win the Loebner Silver Prize.
  • Able to score 90% or more on a robust version of the Winograd Schema Challenge, e.g. the "Winogrande" challenge or comparable data set for which human performance is at 90+%
  • Be able to score 75th percentile (as compared to the corresponding year's human students; this was a score of 600 in 2016) on all the full mathematics section of a circa-2015-2020 standard SAT exam, using just images of the exam pages and having less than ten SAT exams as part of the training data. (Training on other corpuses of math problems is fair game as long as they are arguably distinct from SAT exams.)
  • Be able to learn the classic Atari game "Montezuma's revenge" (based on just visual inputs and standard controls) and explore all 24 rooms based on the equivalent of less than 100 hours of real-time play (see closely-related question.)

By "unified" we mean that the system is integrated enough that it can, for example, explain its reasoning on an SAT problem or Winograd schema question, or verbally report its progress and identify objects during videogame play. (This is not really meant to be an additional capability of "introspection" so much as a provision that the system not simply be cobbled together as a set of sub-systems specialized to tasks like the above, but rather a single system applicable to many problems.)


  1. OpenAI Charter ↩︎

  2. expert survey ↩︎

  3. Bostrom ↩︎

Has anyone gone into the 'High-Impact PA' path?

I work at FHI, as RA and project manager for Toby Ord/The Precipice (2018–20), and more recently as RA to Nick Bostrom (2020–). Prior to this, I spent 2 years in finance, where my role was effectively that of an RA (researching cement companies, rather than existential risk). All of the below is in reference to my time working with Toby.

Let me know if a longer post on being an RA would be useful, as this might motivate me to write it.

Impact

I think a lot of the impact can be captured in terms of being a multiplier[1] on their time, as discussed by Caroline and Tanya. This can be sub-divided into two (fuzzy, non-exhaustive) pathways:

  • Decision-making — helping them make better decisions, ranging from small (e.g. should they appear on podcast X) to big (e.g. should they write a book)
  • Execution — helping them better execute their plans

When I joined Toby on The Precipice, a large proportion of his impact was ‘locked in’ insofar as he was definitely writing the book. There were some important decisions, but I expect more of my impact was via execution, which influenced (1) the quality of the book itself; (2) the likelihood of its being published on schedule; (3) the promotion of the book and its ideas; (4) the proportion of Toby’s productive time it took up, i.e. by freeing up time for him to do non-book things. Over the course of my role, I think I (very roughly) added 5–25% to the book’s impact, and freed up 10–33% of Toby's time.
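As a rough, purely illustrative way of combining these two estimates under the multiplier framing (where $V_{\text{book}}$ and $V_{\text{time}}$ are hypothetical stand-ins for the total value of the book and of Toby's freed-up time, neither of which I've tried to quantify here):

$$
\text{my contribution} \;\approx\; \underbrace{(0.05\text{–}0.25)\cdot V_{\text{book}}}_{\text{execution}} \;+\; \underbrace{(0.10\text{–}0.33)\cdot V_{\text{time}}}_{\text{freed-up time}}
$$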

Career decisions

Before joining Toby, I was planning to join the first cohort of FHI’s Research Scholars Program and pursue my own independent projects for 2 years. At the time, the most compelling reason for choosing the RA role instead was:

  • Toby’s book will have some large impact X, and I can expect to multiply this by ~10%, for an impact of ~0.1X
  • If I ‘did my own thing’, it would take me much longer than 2 years to find and execute a project with at least 0.1X impact (relative to the canonical book on existential risk…)

One thing I didn’t foresee is how valuable the role would be for my development as a researcher. While I’ve had less opportunity to choose my own research projects, publish papers, etc., I think this has been substantially outweighed by the learning benefits of working so closely with a top-tier researcher on important projects. Overall, I expect that working with Toby ‘sped up’ my development by a few years relative to doing independent research of some sort.

One noteworthy feature of being a ‘high-impact RA/PA/etc’ is that while these jobs are relatively highly regarded in EA circles, they can sound a bit baffling to anyone else. As such, I think I’ve built up some pretty EA-specific career capital.

Some other names

Here's an incomplete list of people who have done (or are doing) this line of work, other than Caroline and myself:

Nick Bostrom — Kyle Scott, Tanya Singh, Andrew Snyder-Beattie

Toby Ord — Andrew Snyder-Beattie, Joe Carlsmith

Will MacAskill — Pablo Stafforini, Laura Pomarius, Luisa Rodriguez, Frankie Andersen-Wood, Aron Vallinder


  1. some RA trivia — Richard Kahn, the economist normally credited with the idea of a (fiscal) multiplier, was a long-time RA to John Maynard Keynes, who wrote of him: “He is a marvelous critic and suggester and improver … There never was anyone in the history of the world to whom it was so helpful to submit one’s stuff.” ↩︎

Some thoughts on EA outreach to high schoolers

If there were more orgs doing this, there’d be the risk of abuse working with minors if in-person.

I think this deserves more than a brief mention. One of the two high school programs mentioned (ESPR) failed to safeguard students from someone later credibly accused of serious abuse, as detailed in CFAR's write-up:

Of the interactions CFAR had with Brent, we consider the decision to let him assist at ESPR—a program we helped run for high school students—to have been particularly unwise ... We do not believe any students were harmed. However, Brent did invite a student (a minor) to leave camp early to join him at Burning Man. Beforehand, Brent had persuaded a CFAR staff member to ask the camp director for permission for Brent to invite the student. Multiple other staff members stepped in to prevent this, by which time the student had decided against attending anyway.

This is a terrible track record for this sort of outreach effort. I think it provides a strong reason against pursuing it further without a high degree of assurance that the appropriate lessons have been learned — something which doesn't seem to have been addressed in the post or comments.

Max_Daniel's Shortform

Nice post. I’m reminded of this Bertrand Russell passage:

“all the labours of the ages, all the devotion, all the inspiration, all the noonday brightness of human genius, are destined to extinction in the vast death of the solar system, and that the whole temple of Man's achievement must inevitably be buried beneath the debris of a universe in ruins ... Only within the scaffolding of these truths, only on the firm foundation of unyielding despair, can the soul's habitation henceforth be safely built.” —A Free Man’s Worship, 1903

I take Russell as arguing that the inevitability (as he saw it) of extinction undermines the possibility of enduring achievement, and that we must therefore either ground life’s meaning in something else, or accept nihilism.

At a stretch, maybe you could run your argument together with Russell's — if we ground life’s meaning in achievement, then avoiding nihilism requires that humanity neither go extinct nor achieve total existential security.

The Importance of Unknown Existential Risks

Thanks — I agree with this, and should have made clearer that I didn't see my comment as undermining the thrust of Michael's argument, which I find quite convincing.

The Importance of Unknown Existential Risks

Great post!

But based on Rowe & Beard's survey (as well as Michael Aird's database of existential risk estimates), no other sources appear to have addressed the likelihood of unknown x-risks, which implies that most others do not give unknown risks serious consideration.

I don't think this is true. The Doomsday Argument literature (Carter, Leslie, Gott, etc.) mostly considers the probability of extinction independently of any specific risks, so these authors' estimates implicitly involve an assessment of unknown risks. Much of this writing predates the well-developed cases for specific risks. Indeed, the Doomsday literature seems to have inspired Leslie, and then Bostrom, to start seriously considering specific risks.

Leslie explicitly considers unknown risks (The End of the World, p. 146):

Finally, we may well run a severe risk from something-we-know-not-what: something of which we can say only that it would come as a nasty surprise like the Antarctic ozone hole and that, again like the ozone hole, it would be a consequence of technological advances.

As does Bostrom (2002):

We need a catch-all category. It would be foolish to be confident that we have already imagined and anticipated all significant risks. Future technological or scientific developments may very well reveal novel ways of destroying the world.

How Much Does New Research Inform Us About Existential Climate Risk?

Very useful comment — thanks.

Overall, I don't view this as especially good news ...

How do these tail values compare with your previous best guess?

Objections to Value-Alignment between Effective Altruists

[ii] Some queries to MacAskill’s Q&A show reverence here, (“I'm a longtime fan of all of your work, and of you personally. I just got your book and can't wait to read it.”, “You seem to have accomplished quite a lot for a young person (I think I read 28?). Were you always interested in doing the most good? At what age did you fully commit to that idea?”).

I share your concerns about fandom culture / guru worship in EA, and am glad to see it raised as a troubling feature of the community. I don’t think these examples are convincing, though. They strike me as normal, nice things to say in the context of an AMA, and indicative of admiration and warmth, but not reverence.

Should EA Buy Distribution Rights for Foundational Books?

Hayek's Road to Serfdom, and twentieth-century neoliberalism more broadly, owe a lot of their success to this sort of promotion. The book was published in 1944 and was initially quite successful, but print runs were limited by wartime paper rationing. In 1945, the US magazine Reader's Digest created a 20-page condensed version and sold 1 million copies very cheaply (5¢ per copy). Anthony Fisher, who founded the IEA, came across Hayek's ideas through this edition.

Source: https://press.uchicago.edu/Misc/Chicago/320553.html

Should EA Buy Distribution Rights for Foundational Books?

Great post — this is something EA should definitely be thinking more about as the canon of EA books grows and matures. Peter Singer has done it already, buying back the rights for TLYCS and distributing a free digital version for its 10th anniversary.

I wonder whether most of the value of buying back rights could be captured by just buying books for people on request. A streamlined process for doing this could have pretty low overheads — it only takes a couple of minutes to send someone a book via Amazon — and seems scalable. This should be easy enough for a donor or EA org to try.

I also imagine that for most publishers, profits are concentrated after release

I looked into this recently, using Goodreads data as a proxy for sales. My takeaway was that sales of these books have been surprisingly linear over time, rather than being concentrated early on: Superintelligence; Doing Good Better; TLYCS
