Effective Altruism Forum
EA Forum

All of Geoffrey Irving's Comments + Replies

‘Consequentialism’ is being used to mean several different things

Aha. Well, hopefully we can agree that those philosophers are adding confusion. :)

Leaving Google, Joining the Nucleic Acid Observatory

Those aspects are getting weaker, but the ability for ML to models humans is getting stronger, and there are other “computer acting as salesperson” channels which don’t go through Privacy Sandbox. But probably I’m just misusing the term “ad tech” here, and “convince someone to buy something” tech might be a better term.

‘Consequentialism’ is being used to mean several different things

Geoffrey Irving4y7

Saying that consequentialist theories are “often agent neutral” may only add confusion, as it’s not a part of the definition and indeed “consequentialism can be agent non-neutral” is part of what separates it from utilitarianism.

Theo Hawking

My understanding is that some philosophers do actually think 'consequentialism' should only refer to agent-neutral theories. I agree it's confusing - I couldn't think of a better way to phrase it.

How much funding is needed to eradicate Malaria?

Geoffrey Irving4y9

Is that any particular confidence interval? It seems implausible that it would be so tight.

Michael Huang

None mentioned in the report. It refers to the Methods section of an online appendix but the appendix doesn't appear to be on the website.

Leaving Google, Joining the Nucleic Acid Observatory

Geoffrey Irving4y11

Congratulations on the switch!

I enjoyed your ads blog post, by the way. Might be fun to discuss that sometime, both because (1) I’m funded by ads and (2) I’m curious how the picture will shift as ad tech gets stronger.

tamgent

Could you elaborate on what you mean by as ad tech gets stronger? Is that just because all tech gets stronger with time, or is it in response to the current shifts, like privacy sandbox?

Bad Omens in Current Community Building

Geoffrey Irving4y4

Nice to see you here, Ferenc! We’ve talked before when I was at OpenAI and you Twitter, and always happy to chat if you’re pondering safety things these days.

What are the coolest topics in AI safety, to a hopelessly pure mathematician?

Answer by Geoffrey IrvingMay 09, 20222

In outer alignment one can write down a correspondence between ML training schemes that learn from human feedback and complexity classes related to interactive proof schemes. If we model the human as a (choosable) polynomial time algorithm, then

1. Debate and amplification get to PSPACE, and more generally $n$ -step debate gets to $Σ_{n} P$ .
2. Cross-examination gets to NEXP.
3. If one allows opaque pointers, there are schemes that go further: market making gets to R.

Moreover, we informally have constraints on which schemes are practical based on prope... (read more)

How I failed to form views on AI safety

Geoffrey Irving4y2

That is also very reasonable! I think the important part is to not feel to bad about the possibility of never having a view (there is a vast sea of things I don't have a view on), not least because I think it actually increases the chance of getting to the right view if more effort is spent.

(I would offer to chat directly, as I'm very much part of the subset of safety close to more normal ML, but am sadly over capacity at the moment.)

How I failed to form views on AI safety

Geoffrey Irving4y11

Yep, that’s very fair. What I was trying to say was that if in response to the first suggestion someone said “Why aren’t you deferring to others?” you could use that as a joke backup, but agreed that it reads badly.

DaneelO

Makes a lot of sense :D I just didn't get the joke, which I in hindsight probably should have... :P

How I failed to form views on AI safety

Geoffrey Irving4y13

(I’m happy to die on the hill that that threshold exists, if you want a vicious argument. :))

How I failed to form views on AI safety

Geoffrey Irving4y8

I think the key here is that they’ve already spent quite a lot of time investigating the question. I would have a different reaction without that. And it seems like you agree my proposal is best both for the OP and the world, so perhaps the real sadness is about the empirical difficulty at getting people to consensus?

At a minimum I would claim that there should exist some level of effort past which you should not be sad not arguing, and then the remaining question is where the threshold is.

Geoffrey Irving4y13

(I’m happy to die on the hill that that threshold exists, if you want a vicious argument. :))

How I failed to form views on AI safety

Geoffrey Irving4y9

As somehow who works on AGI safety and cares a lot about it, my main conclusion from reading this is: it would be ideal for you to work on something other than AGI safety! There are plenty of other things to work on that are important, both within and without EA, and a satisfactory resolution to “Is AI risk real?” doesn’t seem essential to usefully pursue other options.

Nor do I think this is a block to comfortable behavior as an EA organizer or role model: it seems fine to say “I’ve thought about X a fair amount but haven’t reached a satisfactory conclusi... (read more)

colin4y10

it would be ideal for you to work on something other than AGI safety!

I disagree. Here is my reasoning:

Many people that have extensive ML knowledge are not working on safety because either they are not convinced of its importance or because they haven't fully wrestled with the issue
In this post, Ada-Maaria articulated the path to her current beliefs and how current AI safety communication has affected her.
She has done a much more rigorous job of evaluating the pervasiveness of these arguments than anyone else I've read
If she continues down this path she cou

... (read more)

Ada-Maaria Hyvärinen4y14

Thanks for giving me permission, I guess can use this if I need ever the opinion of "the EA community" ;)

However, I don't think I'm ready to give up on trying to figure out my stance on AI risk just yet, since I still estimate it is my best shot in forming a more detailed understanding on any x-risk, and understanding x-risks better would be useful for establishing better opinions on other cause priorization issues.

DaneelO

edit: I don't have a sense of humor "a senior AGI safety person has given me permission to not have a view and not feel embarrassed about it." For a lack of a better word, this sound cultish to me, why would one need permission "from someone senior" to think or feel anything? If someone said this to me it would be a red flag about the group/community. I think your first suggestion ("I’ve thought about X a fair amount but haven’t reached a satisfactory conclusion") sounds much more reasonable, if OP feels like that reflects their opinion. But I also think that something like "I don't personally feel convinced by the AGI risk arguments, but many others disagree, I think you should read up on it more and reach your own conclusions", is much more reasonable than your second suggestion. I think we should welcome different opinions, as long as someone agrees with the main EA principles they are an EA, it should not be about agreeing completely with cause A, B and C. Sorry if I am over-interpreting your suggestion as implying much more than you meant, I am just giving my personal reaction. Disclaimer: long time lurker, first time poster.

Linch

On the one hand I agree with this being very likely the most prudent action from OP to take from her perspective, and probably the best action for the world as well. On the other, I think I feel a bit sad to miss some element of...combativeness(?)... in my perhaps overly-nostalgic memories of the earlier EA culture, where people used to be much more aggressive about disagreements with cause and intervention prioritizations. It feels to me that people are less aggressive about disagreeing with established consensus or strong viewpoints that other EAs have, and are somewhat more "live and let live" about both uses of money and human capital. I sort of agree with this being the natural evolution of our movement's emphases (longtermism is harder to crisply argue about than global health, money is more liquid/fungible than human capital). But I think I feel some sadness re: the decrease in general combativeness and willingness to viciously argue about causes. This is related to an earlier post about the EA community becoming a "big tent," which at the time I didn't agree with but now I'm warning up to.

Deliberate Performance in People Management

Geoffrey Irving4y4

This is a great article, and I will make one of those spreadsheets!

Though I can't resist pointing out that, assuming you got 99.74% out of an Elo calculation, I believe the true probability of them beating you is way higher than 99.74%. :)

If Superpositions can Suffer

Geoffrey Irving4y1

Yes, Holevo as you say. By information I mean the standard definitions.

If Superpositions can Suffer

Geoffrey Irving4y1

The issue is not the complexity, but the information content. As mentioned, n qbits can’t store more than n bits of classical information, so the best way to think of them is “n bits of information with some quantum properties”. Therefore, it’s implausible that they correspond to exponential utility.

EricBlair

What do you mean by store information? The state space of a quantum state is(/can be thought of as) a vector of 2^n complex numbers, it's this that prohibits efficient classical simulation. Perhaps you're talking about storing and retrieving information, which does indeed have constraints (e.g. Holevo bound). Constraints that limit quantum computers as a kind of exponentially large memory stick where you store and retrieve information. But algorithms (like Shor's) use this large space then carefully encode their output (using the structure in the problem) in a way that can be transferred off the computer without breaking the Holevo bound. I guess I believe the state space that you can't necessarily access is the important element, not the information being brought in and out of the system.

If Superpositions can Suffer

Geoffrey Irving4y3

This is somewhat covered by existing comments, but to add my wording:

It's highly unlikely that utility is exponential in quantum state, for roughly the same reason that quantum information is not exponential in quantum state. That is, if you have n qbits, you can hold n bits of classical information, not 2^n. You can do more computation with n qbits, but only in special cases.

EricBlair

Thats a good point, why do you think that at least some part of utility generation doesn't allow a more efficient quantum algorithm?

Geoffrey Irving4y23

The rest of this comment is interesting, but opening with “Ummm, what?” seems bad, especially since it takes careful reading to know what you are specifically objecting to.

Edit: Thanks for fixing!

Why aren't you freaking out about OpenAI? At what point would you start?

Geoffrey Irving4y30

Unfortunately we may be unlikely to get a statement from a departed safety researcher beyond mine (https://forum.effectivealtruism.org/posts/fmDFytmxwX9qBgcaX/why-aren-t-you-freaking-out-about-openai-at-what-point-would?commentId=WrWycenCHFgs8cak4), at least currently.

Why aren't you freaking out about OpenAI? At what point would you start?

Geoffrey Irving4y32

It can’t be up to date, since they recently announced that Helen Toner joined the board, and she’s not listed.

Lukas Finnveden4y23

The website now lists Helen Toner, but do not list Holden, so it seems he is no longer on the board.

Why aren't you freaking out about OpenAI? At what point would you start?

Geoffrey Irving4y85

Unfortunately, a significant part of the situation is that people with internal experience and a negative impression feel both constrained and conflicted (in the conflict of interest sense) for public statements. This applies to me: I left OpenAI in 2019 for DeepMind (thus the conflicted).

Why aren't you freaking out about OpenAI? At what point would you start?

Geoffrey Irving4y11

Is Holden still on the board?

Jaime Sevilla4y10

He is listed in the website.

> OpenAI is governed by the board of OpenAI Nonprofit, which consists of OpenAI LP employees Greg Brockman (Chairman & CTO), Ilya Sutskever (Chief Scientist), and Sam Altman (CEO), and non-employees Adam D’Angelo, Holden Karnofsky, Reid Hoffman, Shivon Zilis, Tasha McCauley, and Will Hurd.

It might not be up to date though

Geoffrey Irving4y7

I'm the author of the cited AI safety needs social scientists article (along with Amanda Askell), previously at OpenAI and now at DeepMind. I currently work with social scientists in several different areas (governance, ethics, psychology, ...), and would be happy to answer questions (though expect delays in replies).

Vael Gates

Thanks so much; I'd be excited to talk! Emailed.

DeepMind is hiring Long-term Strategy & Governance researchers

Geoffrey Irving4y17

I lead some of DeepMind's technical AGI safety work, and wanted to add two supporting notes:

I'm super happy we're growing strategy and governance efforts!
We view strategy and governance questions as coupled to technical safety, and are working to build very close links between research in the two areas so that governance mechanisms and alignment mechanisms can be co-designed. (This also applies to technical safety and the Ethics Team, among other teams.)

It takes 5 layers and 1000 artificial neurons to simulate a single biological neuron [Link]

Geoffrey Irving4y47

This paper has at least two significant flaws when used to estimate relative complexity for useful purposes. In the authors' defense such an estimate wasn't the main motivation of the paper, but the Quanta article is all about estimation and the paper doesn't mention the flaws.

Flaw one: no reversed control
Say we have two parameterized model classes $A_{n}$ and $B_{n}$ , and ask what ns are necessary for $A_{n}$ to approximate $B_{1}$ and $B_{n}$ to approximate $A_{1}$ . It is trivial to construct model classes for which ... (read more)

Michael St Jules 🔸

Thanks, these are both excellent points. I did hint to the first one, and I specifically came back to this post to mention the second, but you beat me to it. ;) I've edited my post. EDIT: Also edited again to emphasize the weaknesses.

Can you control the past?

Geoffrey Irving4y3

To expand, I’m not using physics priors to argue that physics is causal, so we can’t control the past. I’m using physics and history priors to argue that we exist in the non-prediction case relative to the past, so CDT applies.

Joe_Carlsmith

Cool, this gives me a clearer picture of where you're coming from. I had meant the central question of the post to be whether it ever makes sense to do the EDT-ish try-to-control-the-past thing, even in pretty unrealistic cases -- partly because I think answering "yes" to this is weird and disorienting in itself, even if it doesn't end up making much of a practical difference day-to-day; and partly because a central objection to EDT is that the past, being already fixed, is never controllable in any practically-relevant sense, even in e.g. Newcomb's cases. It sounds like your main claim is that in our actual everyday circumstances, with respect to things like the WWI case, EDTish and CDT recommendations don't come apart -- a topic I don't spend much time on or have especially strong views about. "you’re going to lean on the difference between 'cause' and 'control'" -- indeed, and I had meant the "no causal interaction with" part of opening sentence to indicate this. It does seem like various readers object to/were confused by the use of the term "control" here, and I think there's room for more emphasis early on as to what specifically I have in mind; but at a high-level, I'm inclined to keep the term "control," rather than trying to rephrase things solely in terms of e.g. correlations, because I think it makes sense to think of yourself as, for practical purposes, "controlling" what your copy writes on his whiteboard, what Omega puts in the boxes, etc; that more broadly, EDT-ish decision-making is in fact weird in the way that trying to control the past is weird, and that this makes it all the more striking and worth highlighting that EDT-ish decision-making seems, sometimes, like the right way to go.

Can you control the past?

Geoffrey Irving4y3

By “physics-based” I’m lumping together physics and history a bit, but it’s hard to disentangle them especially when people start talking about multiverses. I generally mean “the combined information of the laws of physics and our knowledge of the past”. The reason I do want to cite physics too, even for the past case of (1), is that if you somehow disagreed about decision theorists in WW1 I’d go to the next part of the argument, which is that under the technology of WW1 we can’t do the necessary predictive control (they couldn’t build deterministic twin... (read more)

Geoffrey Irving

Ah, I see: you’re going to lean on the difference between “cause” and “control”. So to be clear: I am claiming that, as an empirical matter, we also can’t control the past, or even “control” the past. To expand, I’m not using physics priors to argue that physics is causal, so we can’t control the past. I’m using physics and history priors to argue that we exist in the non-prediction case relative to the past, so CDT applies.

What problems in society are both mathematically challenging, verifiable, and competitively rewarding?

Geoffrey Irving4y15

As a high-level comment, it seems bad to structure the world so that the smartest people compete against each other in zero-sum games. It's definitely the case that zero-sum games are the best way to ensure technical hardness, as the games will by construction be right at the threshold of playability. But if we do this we're throwing most of the value away in comparison to working on positive-sum games.

Is volunteer computing an easily accessible way of effective altruism?

Answer by Geoffrey IrvingAug 28, 20211

Unfortunately, this is unlikely to be an effective use of resources (speaking as someone who worked in high-performance computing for the past 18 years). The resources you can contribute will be dwarfed by the volume and efficiency of cloud services and supercomputers. Even then, due to network constraints the only possible tasks will be embarrassingly parallel computations that do not stress network or memory, and very few scientific computing tasks have this form.

Can you control the past?

Geoffrey Irving4y12

So certainly physics-based priors is a big component, and indeed in some sense is all of it. That is, I think physics-based priors should give you an immediate answer of "you can't influence the past with high probability", and moreover that once you think through the problems in detail the conclusion will be that you could influence the past if physics were different (including boundary conditions, even if laws remain the same), but still that boundary condition priors should still tell us you can't influence the past. I'm happy to elaborate.

F... (read more)

Joe_Carlsmith

Thanks for these comments. Re: “physics-based priors,” I don't think I have a full sense of what you have in mind, but at a high level, I don’t yet see how physics comes into the debate. That is, AFAICT everyone agrees about the relevant physics — and in particular, that you can’t causally influence the past, “change” the past, and so on. The question as I see it (and perhaps I should’ve emphasized this more in the post, and/or put things less provocatively) is more conceptual/normative: whether when making decisions we should think of the past the way CDT does — e.g., as a set of variables whose probabilities our decision-making can’t alter — or in the way that e.g. EDT does — e.g., as a set of variables whose probabilities our decision-making can alter (and thus, a set of variables that EDT-ish decision-making implicitly tries to “control” in a non-causal sense). Non-causal decision theories are weird; but they aren’t actually “I don’t believe in normal physics” weird. They’re more “I believe in managing the news about the already-fixed past” weird. Re: CDT’s domain of applicability, it sounds like your view is something like: “CDT generally works, but it fails in the type of cases that Joe treats as counter-examples to CDT.” I agree with this, and I think most people who reject CDT would agree, too (after all, most decision theories agree on what to do in most everyday cases; the traditional questions have been about what direction to go when their verdicts come apart). I’m inclined to think of this as CDT being wrong, because I’m inclined to think of decision theory as searching for the theory that will get the full range of cases right — but I’m not sure that much hinges on this. That said, I do think that even acknowledging that CDT fails sometimes involves rejecting some principles/arguments one might’ve thought would hold good in general (e.g. “c’mon, man, it’s no use trying to control the past,”the "what would your friend who can see what's in the boxes

Can you control the past?

Geoffrey Irving4y9

I'm not sure anyone else is going to be brave enough to state this directly, so I'll do it:

After reading some of this post (and talking to Paul a bunch and Scott a little), I remain unconfused about whether we can control the past.

Joe_Carlsmith

I have sympathy for responses like "look, it's just so clear that you can't control the past in any practically relevant sense that we should basically just assume the type of arguments in this post are wrong somehow." But I'm curious where you think the arguments actually go wrong, if you have a view about that? For example, do you think defecting in perfect deterministic twin prisoner's dilemmas with identical inputs is the way to go?

Should EA have a career-focused “Do the most good" pledge?

Geoffrey Irving5y2

I think we might just end up in the disaster scenario where you get a bunch of karma. :)

BrianTan

I've added below each comment that they should downvote the comment with negative karma if they upvoted using the poll. I think this is a sign that the EA Forum should have a poll feature.

Should EA have a career-focused “Do the most good" pledge?

Geoffrey Irving5y3

If we want to include a hits-based approach to careers, but also respect people not having EA goals as the exclusive life goal, I'd have a worry that signing this pledge is incompatible with staying in a career that the EA community subsequently decides is ineffective. This could be true even if under the information known at the time of career choice the career looked like terrific expected value.

The actual wording of the pledge seems okay under this metric, as it only promises to "seek out ways to increase the impact of my career", so maybe this is fine as long as the pledge doesn't rise to "switch career" in all cases.

Should EA have a career-focused “Do the most good" pledge?

Geoffrey Irving5y3

Won't this comment get hidden soon?

BrianTan

I assume people can click to unhide it and then downvote it?

High Impact Careers in Formal Verification: Artificial Intelligence

Geoffrey Irving5y11

As someone who's worked both in ML for formal verification with security motivations in mind, and (now) directly on AGI alignment, I think most EA-aligned folk who would be good at formal verification will be close enough to being good at direct AGI alignment that it will be higher impact to work directly on AGI alignment. It's possible this would change in the future if there are a lot more people working on theoretically-motivated prosaic AGI alignment, but I don't think we're there yet.

Concerns with ACE's Recent Behavior

Geoffrey Irving5y33

I think that isn't the right counterfactual since I got into EA circles despite having only minimal (and net negative) impressions of EA-related forums. So your claim is narrowly true, but if instead the counterfactual was if my first exposure to EA was the EA forum, then I think yes the prominence of this kind of post would have made me substantially less likely to engage.

But fundamentally if we're running either of these counterfactuals I think we're already leaving a bunch of value on the table, as expressed by EricHerboso's post about false dilemmas.

Concerns with ACE's Recent Behavior

Geoffrey Irving5y50

I bounce off posts like this. Not sure if you'd consider me net positive or not. :)

[anonymous]5y18

I do too, FWIW. I read this post and its comments because I'm considering donating to/through ACE, and I wanted to understand exactly what ACE did and what the context was. Reading through a sprawling, nearly 15k-word discussion mostly about social justice and discourse norms was not conducive to that goal.

Buck5y14

I am glad to have you around, of course.

My claim is just that I doubt you thought that if the rate of posts like this was 50% lower, you would have been substantially more likely to get involved with EA; I'd be very interested to hear I was wrong about that.

Technology Non-Profits I could volunteer for?

Answer by Geoffrey IrvingOct 21, 20204

Not a non-profit, but since you mention AI and X-risk it's worth mentioning DeepMind, since program managers are core to how research is organized and led here: https://deepmind.com/careers/jobs/2390893.

Quantum computing timelines

Geoffrey Irving5y7

5% probability by 2039 seems way too confident that it will take a long time: is this intended to be a calibrated estimate, or does the number have a different meaning?

Jess_Riedel

Jaime gave a great thorough explanation. My catch-phrase version: This is not a holistic Bayesian prediction. The confidence intervals come from bootstrapping (re-sampling) a fixed dataset, not summing over all possible future trajectories for reality.

Jaime Sevilla

It is not intended to be a calibrated estimated, though we were hoping that it could help others make calibrated estimations. The ways that a calibrated estimate would differ include: 1. The result is a confidence interval, not a credence interval (most places in the paper where it says probability is should say confidence, I apologize for the oversight), so your choice of prior can make a big difference to the associated credence interval. 2. The model is assuming that no discontinuous progress will happen, but we do not know whether this will hold. (Grace, 2020) estimates a yearly rate of discontinous breakthroughs on any given technology of 0.1%, so I'd naively expect a 1-(1-0.1%)^20 = 2% chance that there is such a discontinuous breakthrough for quantum computing in the next 20 years. 3. The model makes optimistic assumptions of progress - namely that a) the rate of exponential progress will hold for both the physical qubit count and the gate error rate, b) there is no correlation between the metrics in a system (which we show it is probably an optimistic assumption, since it is easier to optimize only one of the metrics than both) and c) we ignore the issue of qubit connectivity due to lack of data and modelling difficulty. If I was pressed to put a credence bound on it, I'd assign about 95% chance that EITHER the model is basically correct OR that the timelines are slower than expected (most likely if the exponential trend of progress on gate error rate does not hold in the next 20 years), for an upper bound on the probability that we will have RSA 2048 quantum attacks by 2040 of <5% + 95% 5% ~= 10%. Either case, I think that the model should make us puzzle over the expert timelines, and inquire whether they are taking into account any extra information or being too optimistic. EDIT: I made an artihmetic mistake, now corrected (thanks to Eric Martin for pointing it out)

DavidWeber

It appears to be the extrapolation using exponential growth from current capacity using maximum likelihood to fit the growth rate. Whether you believe the date comes down to how well you think their generalized logical Qubit measures what they're trying to capture. I think it's worth remembering that asking experts for timelines requiring more than 10 years often results in guessing 10 years, so I would tend to favor a data-based extrapolation over that.

Assessing the impact of quantum cryptanalysis

Geoffrey Irving6y4

Yep, that’s the right interpretation.

In terms of hardware, I don’t know how Chrome did it, but at least on fully capable hardware (mobile CPUs and above) you can often bitslice to make almost any circuit efficient if it has to be evaluated in parallel. So my prior is that quite general things don’t need new hardware if one is sufficiently motivated, and would want to see the detailed reasoning before believing you can’t do it with existing machines.

Assessing the impact of quantum cryptanalysis

Geoffrey Irving6y13

This is a great document! I agree with the conclusions, though there are a couple factors not mentioned which seem important:

On the positive side, Google has already deployed post-quantum schemes as a test, and I believe the test was successful (https://security.googleblog.com/2016/07/experimenting-with-post-quantum.html). This was explicitly just a test and not intended as a standardization proposal, but it's good to see that it's practical to layer a post-quantum scheme on top of an existing scheme in a deployed system. I do think if we need... (read more)

Jaime Sevilla

Thank you so much for your kind words and juicy feedback! I did not know about this, and this actually updates me on how much overhead will be needed for post quantum crypto (the NIST expert I interviewed gave me an impression that it was large and essentially would need specialized hardware to meet performance expectations, but this seems to speak to the contrary (?)) To make sure I understand your point, let me try to paraphase. You are pointing out that: 1) past communications that are recorded will be rendered insecure by quantum computing 2) there are some transition costs associated with post quantum crypto - which are related to for example the cost of rebuilding PGP certificate networks. If so, I agree that this is a relevant consideration but does not change the bottom line. Thank you again for reading my paper!

Intellectual Diversity in AI Safety

Geoffrey Irving6y4

In the other direction, I started to think about this stuff in detail at the same time I started working with various other people and definitely learned a ton from them, so there wasn’t a long period where I had developed views but hadn’t spent months talking to Paul.

Intellectual Diversity in AI Safety

Geoffrey Irving6y10

We should also mention Stuart Russell here, since he’s certainly very aware of Bostrom and MIRI but has different detail views and is very grounded in ML.

Intellectual Diversity in AI Safety

Geoffrey Irving6y7

I think mostly I arrived with a different set of tools and intuitions, in particular a better sense for numerical algorithms (Paul has that too, of course) and thus intuition about how things should work with finite errors and how to build toy models that capture the finite error setting.

I do think a lot of the intuitions built by Bostrom and Yudkowsky are easy to fix into a form that works in the finite error model (though not all of it), so I don’t agree with some of the recent negativity about these classical arguments. That is, some fixing is required to make me like those arguments, but it doesn’t feel like the fixing is particularly hard.

Intellectual Diversity in AI Safety

Geoffrey Irving6y7

Well, part of my job is making new people that qualify, so yes to some extent. This is true both in my current role and in past work at OpenAI (e.g., https://distill.pub/2019/safety-needs-social-scientists).

Intellectual Diversity in AI Safety

Geoffrey Irving6y9

I started working on AI safety prior to reading Superintelligence and despite knowing about MIRI et al. since I didn‘t like their approach. So I don’t think I agree with your initial premise that the field is as much a monoculture as you suggest.

My impression is that people like you are pretty rare, but all of this is based off subjective impressions and I could be very wrong. Have you met a lot of other people who came to AI safety from some background other than the Yudkowsky/Superintelligence cluster?

Buck6y10

I'm curious what your experience was like when you started talking to AI safety people after already coming to come of your own conclusions. Eg I'm curious if you think that you missed major points that the AI safety people had spotted which felt obvious in hindsight, or if you had topics on which you disagreed with the AI safety people and think you turned out right.

A list of good heuristics that the case for AI X-risk fails

Geoffrey Irving6y4

Yes, the mocking is what bothers me. In some sense the wording of the list means that people on both sides of the question could come away feeling justified without a desire for further communication: AGI safety folk since the arguments seem quite bad, and AGI safety skeptics since they will agree that some of these heuristics can be steel-manned into a good form.

A list of good heuristics that the case for AI X-risk fails

Geoffrey Irving6y17

As a meta-comment, I think it's quite unhelpful that some of these "good heuristics" are written as intentional strawmen where the author doesn't believe the assumptions hold. E.g., the author doesn't believe that there are no insiders talking about X-risk. If you're going to write a post about good heuristics, maybe try to make the good heuristic arguments actually good? This kind of post mostly just alienates me from wanting to engage in these discussions, which is a problem given that I'm one of the more senior AGI safety researchers.

Aaron Gertler 🔸

I wouldn't have used the word "Good" in the title if the author hadn't, and I share your annoyance with the way the word is being used. As for the heuristics themselves, I think that even the strawmen capture beliefs that many people hold, and I could see it being useful for people to have a "simple" version of a given heuristic to make this kind of list easier to hold in one's mind. However, I also think the author could have achieved simplicity without as much strawmanning.

vaniver6y11

Presumably there are two categories of heuristics, here: ones which relate to actual difficulties in discerning the ground truth, and ones which are irrelevant or stem from a misunderstanding. I think it seems bad that this list implicitly casts the heuristics as being in the latter category, and rather than linking to why each is irrelevant or a misunderstanding it does something closer to mocking the concern.

For example, I would decompose the "It's not empirically testable" heuristic into two different components. The first is something li... (read more)

Will protests lead to thousands of coronavirus deaths?

Geoffrey Irving6y1

“Quite possible” means I am making a qualitative point about game theory but haven’t done the estimates.

Though if one did want to do estimates, that ratio isn’t enough, as spread is superlinear as a function of the size of a group arrested and put in a single room.

Will protests lead to thousands of coronavirus deaths?

Geoffrey Irving6y4

Thanks, that’s all reasonable. Though to clarify, the game theory point isn’t about deterring police but about whether to let potential arrests and coronavirus consequences deter the protests themselves.

Will protests lead to thousands of coronavirus deaths?

Geoffrey Irving6y1

It's worth distinguishing between the protests causing spread and arresting protesters causing spread. It's quite possible more spread will be caused by the latter, and calling this spread "caused by the protests" is game theoretically similar to "Why are you hitting yourself?" My guess is that you're not intending to lump those into the same bucket, but it's worth separating them out explicitly given the title.

Larks

Thanks for your comment! I actually discussed the maths of this a bit in person, but omitted it from the article for simplicities sake, and because I don't think it affects the conclusion much; it is essentially another causal channel by which the protests could increase transmission. I am sceptical that total transmission-once-arrested cases will be anywhere close to transmission-on-streets. For the period they are arrested they'll be in close quarters, so it's definitely true that that is bad, though the total number of people they interact with will presumably go down, which will be a positive. But most importantly I expect only a very small fraction of protesters to be arrested. Indeed at some protests not a single person has been arrested! Furthermore, I expect that anyone reading this article (or anyone being influenced by someone who has read this article) is significantly less likely than average to be arrested, so it is at least less relevant from the point of view of their personal decision making. I sort of see your point about the game theory, but I am sceptical that "the police will have to treat me nicely because otherwise I will get coronavirus" will work in practice. Similarly, I don't recommend trying deterrence with the IRS, or the SEC, or many other US government agencies; they have quite credible pre-commitments to ignoring your threat.

Pablo

What do you mean by 'quite possible'? And what's your estimate of the minimum ratio of arrests to protesters needed for spread due to arrests to exceed spread due to protests?