technicalities

https://www.gleech.org/

Background in philosophy, international development, statistics. Doing a technical AI PhD at Bristol.

Financial conflict of interest: technically the British government through the funding council.

technicalities's Comments

What are the best arguments that AGI is on the horizon?

It can seem strange that people act decisively about speculative things. So the first piece to understand is expected value: if something would be extremely important if it happened, then you can place quite low probability on it and still have warrant to act on it. (This is sometimes accused of being a decision-theory "mugging", but it isn't: we're talking about subjective probabilities in the range of 1% - 10%, not infinitesimals like those involved in Pascal's mugging.)

I think the most-defensible outside-view argument is: it could happen soon; it could be dangerous; aligning it could be very hard; and the product of these probabilities is not low enough to ignore.

1. When you survey general AI experts (not just safety or AGI people), they give a very wide distribution of predictions for when we will have human-level AI (HLAI), with a central tendency of "10% chance of human-level AI... in the 2020s or 2030s". (This is weak evidence: technology forecasting is very hard and these surveys are not random samples, but it is still some evidence.)


2. We don't know what the risk of HLAI being dangerous is, but we have a couple of analogous precedents:

* the human precedent for world domination through intelligence / combinatorial generalisation / cunning

* the human precedent for 'inner optimisers': evolution was heavily optimising for genetic fitness, but produced a system, us, which optimises for a very different objective ("fun", "status", "gratification", or some bundle of non-fitness things).

* goal space is much larger than the human-friendly part of goal space (suggesting that a random objective will not be human-friendly, which, combined with assumptions about goal maximisation and instrumental drives, implies that most goals could be dangerous).

* there's a common phenomenon of very stupid ML systems still developing "clever" unintended / hacky / dangerous behaviours


3. We don't know how hard alignment is, so we don't know how long it will take to solve. It may involve certain profound philosophical and mathematical questions, which have been worked on by some of the greatest thinkers for a long time. Here's a nice nontechnical statement of the potential difficulty. Some AI safety researchers are actually quite optimistic about our prospects for solving alignment, even without EA intervention, and work on it to cover things like the "value lock-in" case instead of the x-risk case.
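To make the product of probabilities concrete, here's a toy calculation. The numbers are hypothetical placeholders chosen only for illustration, not estimates drawn from the surveys or arguments above.

```python
# Toy sketch of the outside-view argument above.
# All probabilities are hypothetical placeholders, not estimates.
p_soon      = 0.10  # chance of human-level AI arriving relatively soon
p_dangerous = 0.30  # chance it is dangerous by default, given it arrives
p_unsolved  = 0.50  # chance alignment is still unsolved by then
p_bad = p_soon * p_dangerous * p_unsolved
print(f"{p_bad:.3f}")  # 0.015, i.e. ~1.5%: a small subjective probability, not a Pascalian infinitesimal
```

Even placeholder numbers like these land in the 1%-10% range discussed above, which is why the argument doesn't depend on any single step being likely.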

Growth and the case against randomista development

Great work. I'm very interested in this claim:

Of the top ten most prescribed medicines, many work on only a third of the patients

In which volume was this claim made?

In praise of unhistoric heroism

Some (likely insufficient) instrumental benefits of feeling bad about yourself:

  • When I play saxophone I often feel frustration at not sounding like Coltrane or Parker; but when I sing I feel joy at just being able to make noise. I'm not sure which mindset has led to better skill growth. Evaluations can compare up (to a superior reference class) or compare down; I try to do plenty of both, e.g. "Relative to the human average I've done a lot and know a lot." Comparing up is more natural to me, so I have an emotional-support Anki deck of achievements and baselines.
  • Impostor syndrome is always painful and occasionally useful. Most people can't or won't pay attention to what they're bad at; people with impostor syndrome sometimes do, and so at least have a chance to improve. If I had the chance to completely "cure" mine I might not take it, instead halving the intensity. (Soares' Replacing Guilt is an example of a productive mindset that dispenses with this emotional cost, though; it might be learnable, I don't know.)
  • It's really important for EAs to be modest, if only to balance out the arrogant-seeming claim in the word "Effective".
  • My adult life was tense and confusing until I blundered into two-level utilitarianism, which endorses doing most actions intuitively rather than scoring my private life. (I was always going to do most things intuitively, because it's impossible not to, but I managed to stop feeling bad about it.) Full explicit optimisation is so expensive and fraught that it only makes sense for large or rare decisions, e.g. career, consumption habits, ideology.
Against value drift

Sure, I agree that most people's actions have a streak of self-interest, and that posterity could serve as this even in cases of sacrificing your life. I took OP to be making a stronger claim, that it is simply wrong to say that "people have altruistic values" as well.

There's just something off about saying that these altruistic actions are caused by selfish/social incentives when the strongest such 'incentive' for doing them was ostracism or the death penalty.

Against value drift

How does this reduction account for the many historical examples of people who defied local social incentives, with little hope of gain and sometimes at the cost of their own destruction? (Off the top of my head: Ignaz Semmelweis, Irena Sendler, Sophie Scholl.)

We can always invent sufficiently strange post-hoc preferences to "explain" any behaviour: but what do you gain in exchange for denying the seemingly simpler hypothesis, "they had terminal values independent of their wellbeing"?

(Limiting this to atheists, since religious martyrs are explained well by incentives.)

What book(s) would you want a gifted teenager to come across?

Actually I think Feynman has the same risk. (Consider his motto: "disregard others"! All very well, if you're him.)

https://stepsandleaps.wordpress.com/2017/10/17/feynmans-breakthrough-disregard-others/

What book(s) would you want a gifted teenager to come across?

I think I would have benefitted from Hanson's 'Elephant in the Brain', since I was intensely frustrated by (what I saw as) pervasive, inexplicable, wilfully bad choices, and this frustration affected my politics and ethics.

But it's high-risk, since it's easy to misread as justifying adolescent superiority (having 'seen through' society).

Call for beta-testers for the EA Pen Pals Project!

I suggest randomising in two blocks: people who strongly prefer video calls vs people who strongly prefer text, with abstainers assigned to either. This should prevent one obvious failure mode: pairing people whose preferred media are incompatible.
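A minimal sketch of the stratified assignment I mean, assuming preferences get recorded as something like 'video' / 'text' / 'either' (those labels and the data format are my assumptions, not the project's):

```python
import random

def pair_within_strata(participants):
    """participants: list of (name, preference) tuples, with preference
    in {'video', 'text', 'either'} (hypothetical labels)."""
    video  = [name for name, pref in participants if pref == 'video']
    text   = [name for name, pref in participants if pref == 'text']
    either = [name for name, pref in participants if pref == 'either']
    # Abstainers can go to either block; send each to the currently smaller one.
    random.shuffle(either)
    for name in either:
        (video if len(video) <= len(text) else text).append(name)
    # Randomise pairings within each block, never across blocks.
    pairs = []
    for block in (video, text):
        random.shuffle(block)
        pairs.extend(zip(block[::2], block[1::2]))  # any odd person is left unpaired
    return pairs
```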

Who are the people that most publicly predicted we'd have AGI by now? Have they published any kind of retrospective, and updated their views?

I was sure that Kurzweil would be one, but actually he's still on track. ("Proper Turing test passed by 2029").

I wonder if the dismissive received view on him is because he states specific years (to make himself falsifiable), which people interpret as crankish overconfidence.
