cdkg

Again, thanks for this! I think this is an important issue which it might be worth addressing more directly in the paper. Two comments, and I'm interested in what you think about them.

Comment 1: I'm not sure that the analogy to the relationship between compute and accuracy is apt here. When we duplicate an automated AI researcher, we are not trying to improve our accuracy on a single task; we are working on multiple tasks in parallel.

Comment 2: I do think the analogy to cloning is apt. Consider some talented ML researcher at a top lab; call her Ava. We can ask: if we had a duplication machine that could make any number of copies of Ava, how would the total quality-weighted research effort contributed by n copies compare to the total quality-weighted research effort contributed by instead hiring n additional engineers? It is true that the copies of Ava will not come into the world with new ideas, training, or background. But I imagine this would not be a huge limitation for them, since engineers can and do retrain and come up with new ideas. On the other hand, hiring n additional engineers is sampling without replacement from a fixed pool of talent, so we should expect that as we hire more engineers, each new hire will on average fall further below Ava's level.
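
To make the comparison in Comment 2 concrete, here is a minimal sketch of the two options. Everything in it is an illustrative assumption rather than anything from the paper: the talent pool is a hypothetical lognormal sample, Ava is taken to be as good as the best person in that pool, hiring proceeds best-first, and the function names are made up. It only illustrates the sampling-without-replacement point; it does not model gamma or any parallelization penalty.

```python
import random


def clone_effort(ava_quality, n):
    """Total quality-weighted effort from n identical copies of Ava."""
    return ava_quality * n


def hire_effort(pool, n):
    """Total quality-weighted effort from hiring the n best people in a fixed pool."""
    ranked = sorted(pool, reverse=True)  # best candidates get hired first
    return sum(ranked[:n])


# Hypothetical talent pool: 1000 candidates with lognormally distributed quality.
rng = random.Random(0)
pool = [rng.lognormvariate(0.0, 0.5) for _ in range(1000)]

# Assumption: Ava is as good as the best candidate available for hire.
ava = max(pool)

for n in (1, 10, 100, 1000):
    clones = clone_effort(ava, n)
    hires = hire_effort(pool, n)
    # The ratio shows how far hiring falls behind cloning as n grows.
    print(n, round(clones, 2), round(hires, 2), round(hires / clones, 3))
```

Under these assumptions the two options tie at n = 1, and the ratio of hired effort to cloned effort can only stay flat or fall as n grows, because each additional hire is drawn from further down the ranking.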

So in sum, I suppose I agree with you that setting gamma to 1 is not strictly speaking a conceptual truth, but given the cognitive flexibility we are assuming when we talk about a human-level digital machine learning researcher, I feel confident that gamma is approximately 1.

Rebooting the Singularity

cdkg1y1

Really glad to hear from you, since I greatly appreciated your work on the AI 2027 material!

You're right that there are two errors here that need to be corrected. One is that it should be .82 rather than .082. The other is that I intended to use the numbers from the "naive estimate" column of Table 8 on page 21 of the Erdil paper, which are calculated using a simple process that (to my mind) is less likely to be subject to errors introduced by model choice. But the 1.58 I used is the 50% estimate from their more complex model; the naive estimate is 1.66. The Feller diffusion model is relevant to their more complex calculations, about which I am a little suspicious, but not to their naive calculations.

BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism?

cdkg3y2

I think this is really the position of the stochastic parrots people, yes.

I don't think it's plausible, but I think it partly explains their relentless opposition to work on AI safety.

BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism?

cdkg3y4

Ok, let me spell it out explicitly. In a section called "Large LMs: Hype and analysis," the linked paper says that claims that LLMs can "understand," "comprehend," and "know" are "gross overclaims." The paper supports this contention by pointing to evidence that "in fact, far from doing the “reasoning” ostensibly required to complete the tasks, [LLMs] were instead simply more effective at leveraging artifacts in the data than previous approaches."

Here is where the imagination comes in. Imagine that you think that all mental state attributions to artificial systems are confused in exactly this way. Imagine that you think that artificial neural nets can't reason at all. Now imagine that someone tells you that we should all be very concerned that misaligned superintelligent AI systems will destroy us.

Your response to that would be something like: it is deeply confused to think that superintelligent AI systems are something we need to worry about, and the people who are worried about them simply do not understand what is going on under the hood in machine learning models. Worries about existential risk from superintelligent AI stem from the same kind of confusion as attributing understanding to existing systems: the tendency of people who are not technically literate to anthropomorphize the systems they interact with.

BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism?

cdkg3y4

Well, you can dismiss them and their arguments if you want to. I personally don't find their arguments terribly convincing, and their social media presence is, as you point out, strident.

But one must be aware that to a surprising extent, they control the narrative about AI safety in academia and the mainstream media. So if one cares about making AI safety seem credible, it's worth engaging with them.

BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism?

cdkg3y4

Have a little imagination.

Suppose I am very worried that ghosts will steal things out of my closet. It seems like a perfectly object-level argument against my position to provide reasons for thinking that beliefs in paranormal activity are not scientifically respectable. This can be true even if the reasons provided do not mention ghosts.

People like Bender take themselves to be offering reasons for thinking that worries about AGI are not scientifically respectable. This can be true even if the reasons they provide do not mention AGI.

Note that I think Bender's arguments are bad. But I don't see what is so mysterious about them.

BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism?

cdkg3y2

The object-level argument, as I understand it, is that worries about human-level AI capabilities of the sort that could pose an existential threat are based on a misunderstanding of what is going on under the hood in neural networks. This is what Bender means when she talks about "AI Hype". See for example her paper with Koller "Climbing towards NLU" for criticisms of attributing some kinds of mental states to neural networks.

BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism?

cdkg3y5

I didn't endorse that idea and, as an academic, obviously wouldn't. Also as an academic, I think that paying people to explain themselves to you, when you haven't first shown that you have read their work (e.g. by explaining why you don't find the arguments they have already made in print convincing), is not a shining exemplar of intellectually honest exploration.

BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism?

cdkg3y10

Just FYI, many people in the AI ethics community find this kind of thing offensive. They have published their arguments in numerous scholarly venues and also in major newspapers and magazines and on places like Medium and Twitter. This kind of post is interpreted as "I'm too lazy to look at your work to find your arguments but I bet I can make you dance with small sums of money." Bad optics.

Language Agents Reduce the Risk of Existential Catastrophe

cdkg3y1

Hello,

If you're imagining a system which is an LLM trained to exhibit agentic behavior through RLHF and then left to its own devices to operate in the world, you're imagining something quite different from a language agent. Take a look at the architecture in the Park et al. paper, which is available on arXiv; this is the kind of thing we have in mind when we talk about language agents.

I'm also not quite sure how the point about how doing RLHF on an LLM could make a dangerous system is meant to engage with our arguments. We have identified a particular kind of system architecture and argued that it has improved safety properties. It's not a problem for our argument to show that there are alternative system architectures that lack those safety properties. Perhaps there are ways of setting up a language agent that wouldn't be any safer than using ordinary RL. That's ok, too — our point is that there are ways of setting up a language agent that are safer.

cdkg

Bio

Posts 5

Comments 12