Safety Researcher and Scalable Alignment Team lead at DeepMind. AGI will probably be wonderful; let's make that even more probable.





An update in favor of trying to make tens of billions of dollars

The rest of this comment is interesting, but opening with “Ummm, what?” seems bad, especially since it takes careful reading to know what you are specifically objecting to.

Edit: Thanks for fixing!

Why aren't you freaking out about OpenAI? At what point would you start?

It can’t be up to date, since they recently announced that Helen Toner joined the board, and she’s not listed.

Why aren't you freaking out about OpenAI? At what point would you start?

Unfortunately, a significant part of the situation is that people with internal experience and a negative impression feel both constrained and conflicted (in the conflict-of-interest sense) about making public statements. This applies to me: I left OpenAI for DeepMind in 2019 (hence the conflict).

Seeking social science students / collaborators interested in AI existential risks

I'm the author of the cited AI safety needs social scientists article (along with Amanda Askell), previously at OpenAI and now at DeepMind.  I currently work with social scientists in several different areas (governance, ethics, psychology, ...), and would be happy to answer questions (though expect delays in replies).

DeepMind is hiring Long-term Strategy & Governance researchers

I lead some of DeepMind's technical AGI safety work, and wanted to add two supporting notes:

  1. I'm super happy we're growing strategy and governance efforts!
  2. We view strategy and governance questions as coupled to technical safety, and are working to build very close links between research in the two areas so that governance mechanisms and alignment mechanisms can be co-designed.  (This also applies to technical safety and the Ethics Team, among other teams.)
It takes 5 layers and 1000 artificial neurons to simulate a single biological neuron [Link]

This paper has at least two significant flaws when used to estimate relative complexity for useful purposes. In the authors' defense, such an estimate wasn't the main motivation of the paper, but the Quanta article is all about estimation, and the paper doesn't mention the flaws.

Flaw one: no reversed control
Say we have two parameterized model classes A and B, and ask what n is necessary for an n-unit model from A to approximate a single unit of B, and vice versa. It is trivial to construct pairs of model classes for which n is large in both directions, just because A is a much better algorithm for approximating A than B is, and vice versa. I'm not sure how much this cuts off the 1000 estimate, but it could easily be 10x.

Brief Twitter thread about this:
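As a toy sketch of this asymmetry (my own example, not the paper's or anyone else's): take class A = piecewise-linear functions (ReLU-network-like) and class B = pure sinusoids, and use two standard approximation-theory rates. Linear interpolation of sin over a segment of length h has sup error at most h²·max|sin''|/8 = h²/8, and the partial Fourier sums of a single ReLU kink on [-π, π] have RMS error roughly 1/√(2n) because the sine coefficients decay only like 1/k. Both directions are expensive:

```python
import math

def segments_to_fit_sine(eps, width=20 * math.pi):
    """Linear segments needed so the sup error of interpolating sin <= eps.
    Per-segment error bound: h**2 / 8, so the longest allowed segment is sqrt(8*eps)."""
    h = math.sqrt(8 * eps)
    return math.ceil(width / h)

def fourier_terms_to_fit_relu(eps):
    """Fourier terms needed so the RMS error of approximating a ReLU kink
    on [-pi, pi] is <= eps; the tail of the 1/k sine coefficients gives ~1/(2*eps**2)."""
    return math.ceil(1 / (2 * eps ** 2))

eps = 0.01
print(segments_to_fit_sine(eps))       # hundreds of linear pieces to fit one sinusoid
print(fourier_terms_to_fit_relu(eps))  # thousands of sinusoids to fit one kink
```

Neither number tells you the classes are "complex"; it only tells you each class is a poor basis for the other, which is exactly why a one-directional measurement overstates relative complexity.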

Flaw two: no scaling w.r.t. multiple neurons
I don't see any reason to believe the factor of 1000 would remain constant as you add more neurons, i.e., when approximating many real neurons with many (more) artificial neurons. In particular, it's easy to construct model classes where the factor decays to 1 as you add more real neurons. I don't know how strong this effect is, but again the paper doesn't discuss or estimate it.
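One way the factor can decay (a hypothetical cost model I'm making up for illustration, not anything measured in the paper): suppose approximating m real neurons costs some fixed pool of shared artificial units (reusable feature machinery) plus a small number of private units per real neuron. Then the per-neuron factor starts at 1000 for m = 1 but falls toward the private cost as m grows:

```python
# Hypothetical cost model: `shared` artificial units of common machinery,
# amortized across all real neurons, plus `private` units per real neuron.
# shared=990, private=10 is chosen so that m=1 reproduces the paper's factor of 1000.
def units_per_real_neuron(m, shared=990, private=10):
    return (shared + private * m) / m

for m in [1, 10, 100, 10000]:
    print(m, units_per_real_neuron(m))
# The ratio is 1000.0 at m=1 but decays toward `private` (here 10) as m grows,
# so a single-neuron factor of 1000 need not hold for whole circuits.
```

This is only an existence argument: it shows the single-neuron measurement is compatible with a much smaller asymptotic factor, not that the factor actually is small.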

Can you control the past?

Ah, I see: you’re going to lean on the difference between “cause” and “control”. So to be clear: I am claiming that, as an empirical matter, we also can’t control the past, or even “control” the past.

To expand, I’m not using physics priors to argue that physics is causal, so we can’t control the past. I’m using physics and history priors to argue that we exist in the non-prediction case relative to the past, so CDT applies.

Can you control the past?

By “physics-based” I’m lumping together physics and history a bit, but it’s hard to disentangle them, especially when people start talking about multiverses. I generally mean “the combined information of the laws of physics and our knowledge of the past”. The reason I do want to cite physics too, even for the past case of (1), is that if you somehow disagreed about decision theorists in WW1, I’d go to the next part of the argument: under the technology of WW1 we can’t do the necessary predictive control (they couldn’t build deterministic twins back then).

However, it seems like we’re mostly in agreement, and you could consider editing the post to make that clearer. The opening line of your post is “I think that you can ‘control’ events you have no causal interaction with, including events in the past.” Now the claim is “everyone agrees about the relevant physics — and in particular, that you can’t causally influence the past”. These two sentences seem inconsistent, and since your piece is long and quite technical, opening with a wrong summary may confuse people.

I realize you can get out of the inconsistency by leaning on the quotes, but it still seems misleading.
