
MichaelDickens

6112 karma
mdickens.me

Bio

I do independent research on EA topics. I write about whatever seems important, tractable, and interesting (to me).

I have a website: https://mdickens.me/. Much of the content on my website gets cross-posted to the EA Forum, but I also write about some non-EA stuff like [investing](https://mdickens.me/category/finance/) and [fitness](https://mdickens.me/category/fitness/).

My favorite things that I've written: https://mdickens.me/favorite-posts/

I used to work as a software developer at Affirm.

Sequences (1)

Quantitative Models for Cause Selection

Comments (859)

> Much has been written about why we should not concede to such.

I've seen much written that takes it as a premise that you shouldn't concede to a Pascal's mugging, but I've seen very little about why not.

(I can think of arguments for not conceding in the actual Pascal's mugging thought experiment: (1) ignoring threats as a game-theoretic strategy and (2) threats of unlikely outcomes constituting evidence against the outcome. Neither of these apply to caring about soil nematodes.)

Relatedly, I often find there is some concept I want to be able to reference, but it's scattered in pieces across four different articles/books, so I find myself writing an article whose only contribution is to put all those pieces together in one place.

Not OP but I would say that if we end up with an ASI that can misunderstand values in that kind of way, then it will almost certainly wipe out humanity anyway.

That is the same category of mistake as "please maximize the profit of this paperclip factory" getting interpreted as "convert all available matter into paperclip machines".

I don't cite LLMs for objective facts.

In casual situations I think it's basically okay to cite an LLM if you have a good sense of what sorts of facts LLMs are unlikely to hallucinate, namely, well-established facts that are easy to find online (because they appear a lot in the training data). But for those sorts of facts, you can turn on LLM web search and it will find a reliable source for you and then you can cite that source instead.

I think it's okay to cite LLMs for things along the lines of "I asked Claude for a list of fun things to do in Toronto and here's what it came up with".

I like this article. I have reservations about some AI-for-animal-welfare interventions, and this article explains my reservations well. Particularly I am glad that you highlighted the importance of (1) potentially short timelines and (2) that the post-TAI world will probably be very weird.

Yes, it could well be that an LLM isn't conscious on a single pass, but that it becomes conscious across multiple passes.

This is analogous to the Chinese room argument, but I don't take the Chinese room argument as a reductio ad absurdum—unless you're a substance dualist or a panpsychist, I think you have to believe that a conscious being is made up of parts that are not themselves conscious.

(And even under panpsychism I think you still have to believe that the composed being is conscious in a way that the individual parts aren't? Not sure.)

I don't find the Turing test evidence as convincing as you present it here.

Fair enough, I did not actually read the paper! I have talked to LLMs about consciousness and to me they seem pretty good at talking about it.

> I agree that if each token you read is generated by a single forward pass through a network of fixed weights, then it seems hard to imagine how there could be any 'inner life' behind the words. There is no introspection. But this is not how the new generation of reasoning models work. They create a 'chain of thought' before producing an answer, which looks a lot like introspection if you read it!

The chain of thought is still generated via feed-forward next token prediction, right?
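
(To illustrate what I mean, here's a toy sketch; `next_token` is a stub standing in for one forward pass through fixed weights, not any real model's API. The point is that the "reasoning" tokens are produced by the same one-pass-per-token loop as the answer, and they just get fed back in as context.)

```python
# Toy sketch of autoregressive decoding with a chain of thought.
# next_token is a placeholder for a single forward pass through fixed weights;
# a real model would sample from a learned distribution here.

def next_token(context: list[str]) -> str:
    return "<think>" if len(context) < 8 else "<answer>"

def generate(prompt: str, max_tokens: int = 32) -> list[str]:
    context = prompt.split()
    for _ in range(max_tokens):
        tok = next_token(context)  # one feed-forward pass per token
        context.append(tok)        # chain-of-thought tokens just become more context
        if tok == "<answer>":
            break
    return context

print(generate("Is the model introspecting?"))
```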

A commenter on my blog suggested that LLMs could still be doing enough internally that they are conscious even while generating only one token at a time, which sounds reasonable to me.

If you haven't already, you should consider posting your project ideas!

  1. You can get feedback on which ideas seem most promising, so maybe those are the ones that get done.
  2. Other people might pick up the ideas, or they could inspire related ones.

That paper is long and kind of confusing but from skimming for relevant passages, here is how I understood its arguments against computational functionalism:

  • Section 3.4: Human brains deviate from Turing machines in that brain states require energy to be maintained, whereas Turing machines are "immortal". [And I guess the implication is that this is evidence for substrate dependence? But I don't see why.]
  • Section 3.5: Brains might violate informational closure, which basically means that the computations a brain performs might depend on the substrate on which they are performed; that would be evidence that AIs wouldn't be conscious. [I found this section confusing, but it seems unlikely to me that brains violate informational closure, if I understood it correctly.]
  • Section 3.6: AI can only be conscious if computational functionalism is true. [That sounds false to me. It could be that some other version of functionalism is true, or panpsychism is true, or perhaps identity theory is true but that both brains and transistors can produce consciousness, or perhaps even dualism is true and AIs are endowed with dualistic consciousness somehow.]

I didn't understand these arguments very well, but I didn't find them compelling. I think the China brain argument is much stronger, although I don't find it persuasive either. If you're talking to a black box that contains either a human or a China brain, then there is no test you can perform to distinguish the two. If the human can say things to you that convince you it's conscious, then you should also be convinced that the China brain is conscious.

> “What probability do you put on future AI advances causing human extinction or similarly permanent and severe disempowerment of the human species?” – median: 5% – mean: 16.2%

> “What probability do you put on human inability to control future advanced AI systems causing human extinction or similarly permanent and severe disempowerment of the human species?” – median: 10% – mean: 19.4%

Am I missing something, or are these answers nonsensical? On my reading, the 2nd outcome is a strict subset of the 1st outcome, so the probability can't be higher. But the median given probability is twice as high.
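
(To spell out the step I'm relying on: probability is monotone under set inclusion, so if each respondent answered coherently, the median for the second question couldn't exceed the median for the first.)

$$B \subseteq A \implies P(B) \le P(A)$$

where $A$ is "future AI advances cause extinction or similarly permanent and severe disempowerment" and $B$ is "inability to control advanced AI causes that same outcome". If every respondent gives $P(B) \le P(A)$, then the median of the $B$ answers is at most the median of the $A$ answers.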
