Bio

I am a generalist quantitative researcher.

How others can help me

I am open to volunteering and paid work (I usually ask for 20 $/h). I welcome suggestions for posts. You can give me feedback here (anonymously or not).

How I can help others

I can help with career advice, prioritisation, and quantitative analyses.

Comments (2786)

Hi. I agree with the points you make in that comment. However, the question from Metaculus I mention in the last section of the post is about superintelligent AI, and the operationalisation of this does require a very high level of intelligence and generality.

"Superintelligent Artificial Intelligence" (SAI) is defined for the purposes of this question as an AI which can perform any task humans can perform in 2021, as well or superior to the best humans in their domain.  The SAI may be able to perform these tasks themselves, or be capable of designing sub-agents with these capabilities (for instance the SAI may design robots capable of beating professional football players which are not successful brain surgeons, and design top brain surgeons which are not football players).  Tasks include (but are not limited to): performing in top ranks among professional e-sports leagues, performing in top ranks among physical sports, preparing and serving food, providing emotional and psychotherapeutic support, discovering scientific insights which could win 2021 Nobel prizes, creating original art and entertainment, and having professional-level software design and AI design capabilities.

Hi Aleph. I believe i) individual welfare per fully-healthy-animal-year being proportional to "individual number of neurons"^"exponent of the individual number of neurons" for some animals is a reason to test extending i) to all animals. Likewise, I think ii) individual welfare per fully-healthy-animal-year being proportional to "BMR"^"exponent of the BMR" for some organisms (which I see as a possibility for the reasons I provide in the comment) is a reason to test extending ii) to all organisms.
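
In symbols, a minimal sketch of the two hypotheses (the notation here is mine, not from the original comment; w is the individual welfare per fully-healthy-animal-year, N the individual number of neurons, B the BMR, and alpha and beta the respective exponents):

```latex
% i) neuron-count scaling (for some animals; to be tested for all animals)
\[ w \propto N^{\alpha} \]
% ii) BMR scaling (a possibility for some organisms; to be tested for all organisms)
\[ w \propto B^{\beta} \]
```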

Agreed, David. The post Where’s my ten minute AGI? by Anson Ho discusses why METR's task time horizon does not translate into as much automation as one may naively expect.

[...] if AIs are actually able to perform most tasks on 1-hour task horizons, why don’t we see more real-world task automation? For example, most emails take less than an hour to write, but crafting emails remains an important part of the lives of billions of people every day.

Some of this could be due to people underusing AI systems, but in this post I want to focus on reasons that are more fundamental to the capabilities of AI systems. In particular, I think there are three such reasons that are the most important:

  1. Time-horizon estimates are very domain-specific
  2. Task reliability strongly influences task horizons
  3. Tasks are very bundled together and hard to separate out.

While it’s hard to be quantitative about just how much each of these reasons matter, they’re all strong enough to explain why many tasks with 1-hour or even 10-minute horizons remain unautomated.

If you thought Yudkowsky and Soares used overly confident language and would have taken the "QED" as further evidence of that, but this particular example turns out not to have been written by Yudkowsky and Soares, that's some evidence against your hypothesis.

I agree.

But instead of updating away a little, you seemed to dismiss that evidence and double down.

I updated away a little, but negligibly so.

I think you originally replied to the original comment approvingly or at least non-critically, but then deleted that comment after I replied to it, but I could be misremembering that.

I deleted a comment which said something like the following: "Thanks, Falk. I very much agree". I did not remember that the "QED" was Robin paraphrasing. However, I think the "QED" is still supposed to represent the authors' level of confidence (in the book) in their arguments for a high risk of human extinction.

I would've been surprised to see him use "QED" in that way, which is why I reacted to the original comment here with skepticism and checked whether "QED" actually appeared in the book (it didn't).

Interesting. I would not have found the use of "QED" surprising. To me it seems that Yudkowsky is often overly confident.


I remain open to bets of up to 10 k$ against short AI timelines, or against what they supposedly imply. Do you see any bet we could make that would be good for both of us, considering that we could simply invest our money instead, and that you could take out loans?

I should say though that based on my conversations at the time, it seemed unlikely that alt proteins will make a big difference.

Interesting. I think changes in diet, not global catastrophes, are the driver of reductions in the number of farmed animals. I guess alternative proteins will have a negligible effect over the next 10 years, but that their effect may well be the most important one in 100 years.

I completely agree with what you just stated (although I have not read the post you linked), but I do not understand why it would undermine the broader point I mentioned in my comment.

Hi Saulius.

I tried estimating years of impact using graphs like this:

[...]

The yellow line accounts for the possibility that commitments will stop being relevant due to things like x-risks, global catastrophic risks, societal collapse, cultured meat taking over, animals bred not to suffer, black swans, etc. [...]

[...]

Finally, we estimate the expected proportion of hens used by companies that will be cage-free each year as follows: (blue - red) ✕ yellow. And then we add up the result for all years to calculate years of impact.

I would estimate the number of layer-years improved in expectation in year Y as "expected population of layers in year Y"*("expected fraction of layers in cages in year Y without the intervention" - "expected fraction of layers in cages in year Y with the intervention") = P(Y)*(f_control(Y) - f_intervention(Y)), which is correct by definition. I would then calculate the total number of layer-years improved by adding up the effects from the year in which the intervention started onwards. I believe the annual effects should eventually go to 0, such that there is no need to add up the effects of all years until infinity. It is enough to consider the years accounting for the vast majority of the total number of layer-years improved.

P, f_control, and f_intervention relate to your yellow, red, and blue lines, but their meaning is more intuitive. In addition, the yellow line in your formula cannot strictly be interpreted as a probability if the formula is to work in all cases. A probability describes effects that would make the fraction of hens in cages the same with and without the intervention, which applies to, for example, human extinction. However, there are non-binary, gradual effects, like the rise of alternative proteins, which make the fraction of hens in cages with and without the intervention more similar in expectation, without all of the effect coming from the possibility of those fractions becoming exactly the same.
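
A minimal sketch in Python of the calculation I am proposing (all numbers below are made up for illustration; in practice, P, f_control, and f_intervention would come from the relevant forecasts):

```python
# Illustrative sketch of the layer-years-improved calculation described above.
# P[y]: expected population of layers in year y.
# f_control[y]: expected fraction of layers in cages in year y without the intervention.
# f_intervention[y]: expected fraction of layers in cages in year y with the intervention.
# All values are made up for illustration.

years = range(2025, 2035)
P = {y: 7.5e9 for y in years}                  # expected layer population
f_control = {y: 0.60 for y in years}           # fraction caged without the intervention
f_intervention = {y: max(0.60 - 0.05*(y - 2025), 0) for y in years}  # fraction caged with it

# Layer-years improved in expectation in year y: P(y)*(f_control(y) - f_intervention(y)).
improved_by_year = {y: P[y]*(f_control[y] - f_intervention[y]) for y in years}

# Total layer-years improved, adding up the effects from the year the intervention started,
# over the years accounting for the vast majority of the effect.
total_improved = sum(improved_by_year.values())
print(f"Total layer-years improved: {total_improved:.2e}")
```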

I think the post The Selfish Machine by Maarten Boudry is relevant to this discussion.

Consider dogs. Canine evolution under human domestication satisfies Lewontin’s three criteria: variation, heritability, and differential reproduction. But most dogs are bred to be meek and friendly, the very opposite of selfishness. Breeders ruthlessly select against aggression, and any dog attacking a human usually faces severe fitness consequences—it is put down, or at least not allowed to procreate. In the evolution of dogs, humans call the shots, not nature. Some breeds, like pit bulls or Rottweilers, are of course selected for aggression (to other animals, not to its guardian), but that just goes to show that domesticated evolution depends on breeders’ desires.

How can we extend this difference between blind evolution and domestication to the domain of AI? In biology, the defining criterion of domestication is control over reproduction. If humans control an animal’s reproduction, deciding who gets to mate with whom, then it’s domesticated. If animals escape and regain their autonomy, they’re feral. By that criterion, house cats are only partly domesticated, as most moggies roam about unsupervised and choose their own mates, outside of human control. If you apply this framework to AIs, it should be clear that AI systems are still very much in a state of domestication. Selection pressures come from human designers, programmers, consumers, and regulators, not from blind forces. It is true that some AI systems self-improve without direct human supervision, but humans still decide which AIs are developed and released. GPT-4 isn’t autonomously spawning GPT-5 after competing in the wild with different LLMs; humans control its evolution.

By and large, current selective pressures for AI are the opposite of selfishness. We want friendly, cooperative AIs that don’t harm users or produce offensive content. If chatbots engage in dangerous behavior, like encouraging suicide or enticing journalists to leave their spouse, companies will frantically try to update their models and stamp out the unwanted behavior. In fact, some language models have become so safe, avoiding any sensitive topics or giving anodyne answers, that consumers now complain they are boring. And Google became a laughing stock when its image generator proved to be so politically correct as to produce ethnically diverse Vikings or founding fathers.

Thanks for clarifying, Erich. I believe Falk was referring to Yudkowsky and Soares. I have not read their book; I have just listened to podcasts they have done, and skimmed some of their writings. However, I think the broader point stands that they often use language which implies much greater confidence in the possibility of human extinction, and much greater robustness of their arguments for it, than is warranted.

Thanks for the great point, Falk. I very much agree.
