All of Rohin Shah's Comments + Replies

Oh I see, sorry for misinterpreting you.

So I'm not really seeing anything "bad" here.

I didn't say your proposal was "bad", I said it wasn't "conservative".

My point is just that, if GHD were to reorient around "reliable global capacity growth", it would look very different, to the point where I think your proposal is better described as "stop GHD work, and instead do reliable global capacity growth work", rather than the current framing of "let's reconceptualize the existing bucket of work".

3
Richard Y Chappell
1mo
I was replying to your sentence, "I'd guess most proponents of GHD would find (1) and (2) particularly bad."

I'll suggest a reconceptualization that may seem radical in theory but is conservative in practice.

It doesn't seem conservative in practice? Like Vasco, I'd be surprised if aiming for reliable global capacity growth would look like the current GHD portfolio. For example:

  1. Given an inability to help everyone, you'd want to target interventions based on people's future ability to contribute. (E.g. you should probably stop any interventions that target people in extreme poverty.)
  2. You'd either want to stop focusing on infant mortality, or start interventions to i
... (read more)

I also think it misses the worldview bucket that's the main reason why many people fund global health and (some aspects of) development: intrinsic value attached to saving [human] lives. Potential positive flowthrough effects are a bonus on top of that, in most cases.

From an EA-ish hedonic utilitarianism perspective this dates right back to Singer's essay about saving a drowning child. Taking that thought experiment in a different direction, I don't think many people - EA or otherwise - would conclude that the decision on whether to save the child or not s... (read more)

9
Richard Y Chappell
1mo
I guess I have (i) some different empirical assumptions, and (ii) some different moral assumptions (about what counts as a sufficiently modest revision to still count as "conservative", i.e. within the general spirit of GHD). To specifically address your three examples:

1. I'd guess that variance in cost (to save one life, or whatever) outweighs the variance in predictable ability to contribute. (iirc, Nick Beckstead's dissertation on longtermism made the point that all else equal, it would be better to save a life in a wealthy country for instrumental reasons, but that the cost difference is so great that it's still plausibly much better to focus on developing countries in practice.) Perhaps it would justify more of a shift towards the "D" side of "H&D", insofar as we could identify any good interventions for improving economic development. But the desire for lasting improvements seems commonsensical to many people anyway (compare all the rhetoric around "root causes", "teaching a man to fish", etc.) In general, extreme poverty might seem to have the most low-hanging fruit for improvement (including improvements to capacity-building). But there may be exceptions in cases of extreme societal dysfunction, in which case, again, I think it's pretty commonsensical that we shouldn't invest resources in places where they'd actually do less lasting good.

2. I don't understand at all why this would motivate less focus on infant mortality: fixing that is an extremely cheap way to improve human capacity! I think I already mentioned in the OP that increasing fertility could also be justified in principle, but I'm not aware of any proven cheap interventions that do this in practice. Adding some child benefit support (or whatever) into the mix doesn't strike me as unduly radical, in any case.

3. Greater support for education seems very commonsensical in principle (including from a broadly "global health & development" perspective), and iirc was an early f

Research Scientist and Research Engineer roles in AI Safety and Alignment at Google DeepMind.

Location: Hybrid (3 days/week in the office) in San Francisco / Mountain View / London.

Application deadline: We don't have a final deadline yet, but will keep the roles open for at least another two weeks (i.e. until March 1, 2024), and likely longer.

For further details, see the roles linked above. You may also find my FAQ useful.

(Fyi, I probably won't engage more here, due to not wanting to spend too much time on this)

Jonas's comment is a high level assessment that is only useful insofar as you trust his judgment.

This is true, but I trust basically any random commenter a non-zero amount (unless their comment itself gives me reasons not to trust them). I agree you can get more trust if you know the person better. But even the amount of trust for "literally a random person I've never heard of" would be enough for the evidence to matter to me.

I'm only saying that I think large update

... (read more)

SBF was an EA leader in good standing for many years and had many highly placed friends. It's pretty notable to me that there weren't many comments like Jonas's for SBF, while there are for Owen.


I think these cases are too different for that comparison to hold. 

One big difference is that SBF committed fraud, not sexual harassment. There's a long history of people minimizing sexual harassment, especially when it's as ambiguous as this. There's also a long history of ignoring fraud when you're benefiting from it, but by the time anyone had a chance to com... (read more)

The evidence Jonas provides is equally consistent with “Owen has a flaw he has healed” and “Owen is a skilled manipulator who charms men, and harasses women”.

Surely there are a lot of other hypotheses as well, and Jonas's evidence is relevant to updating on those?

More broadly, I don't think there's any obvious systemic error going on here. Someone who knows the person reasonably well, giving a model for what the causes of the behavior were, that makes predictions about future instances, clearly seems like evidence one should take into account.

(I do agree t... (read more)

Surely there are a lot of other hypotheses as well, and Jonas's evidence is relevant to updating on those?


There are of course infinite hypotheses. But I don't think Jonas's statement adds much to my estimates of how much harm Owen is likely to do in the future, and expect the same should be true for most people reading this.

To be clear I'm not saying I estimate more harm is likely- taking himself off the market seems likely to work, and this has been public enough I expect it to be easy for future victims to complain if something does happen. I'm onl... (read more)

Yeah, I don't think it's accurate to say that I see assistance games as mostly irrelevant to modern deep learning, and I especially don't think that it makes sense to cite my review of Human Compatible to support that claim.

The one quote that Daniel mentions about shifting the entire way we do AI is a paraphrase of something Stuart says, and is responding to the paradigm of writing down fixed, programmatic reward functions. And in fact, we have now changed that dramatically through the use of RLHF, for which a lot of early work was done at CHAI, so I think... (read more)

Fyi, the list you linked doesn't contain most of what I would consider the "small" orgs in AI, e.g. off the top of my head I'd name ARC, Redwood Research, Conjecture, Ought, FAR AI, Aligned AI, Apart, Apollo, Epoch, Center for AI Safety, Bluedot, Ashgro, AI Safety Support and Orthogonal. (Some of these aren't even that small.) Those are the ones I'd be thinking about if I were to talk about merging orgs.

Maybe the non-AI parts of that list are more comprehensive, but my guess is that it's just missing most of the tiny orgs that OP is talking about (e.g. OP'... (read more)

5
Angelina Li
10mo
Yeah, fair! It's frustratingly hard to get comprehensive lists of EA orgs (it's hard to be in the business of gatekeeping what 'EA-affiliated' is). I did a 5 min search for the best publicly available list and then gave up; sometimes I use the list of organizations with representatives at the last EAG for this use case. Maybe within AI specifically, someone could repeat this exercise with something like this list. If someone knows of a better public list of EA orgs, I'd love to know about it :)

:) I'm glad we got to agreement!

(Or at least significantly closer, I'm sure there are still some minor differences.)

On hits-based research: I certainly agree there are other factors to consider in making a funding decision. I'm just saying that you should talk about those directly instead of criticizing the OP for looking at whether their research was good or not.

(In your response to OP you talk about a positive case for the work on simulators, SVD, and sparse coding -- that's the sort of thing that I would want to see, so I'm glad to see that discussion starting.)

On VCs: Your position seems reasonable to me (though so does the OP's position).

On recommendations: Fwiw I ... (read more)

Hmm, yeah. I actually think you changed my mind on the recommendations. My new position is something like:
1. There should not be a higher burden on anti-recommendations than pro-recommendations.
2. Both pro- and anti-recommendations should come with caveats and conditionals whenever they make a difference to the target audience. 
3. I'm now more convinced that the anti-recommendation of OP was appropriate. 
4. I'd probably still phrase it differently than they did but my overall belief went from "this was unjustified" to "they should have used diffe... (read more)

I'm not very compelled by this response.

It seems to me you have two points on the content of this critique. The first point:

I think it's bad to criticize labs that do hits-based research approaches for their early output (I also think this applies to your critique of Redwood) because the entire point is that you don't find a lot until you hit.

I'm pretty confused here. How exactly do you propose that funding decisions get made? If some random person says they are pursuing a hits-based approach to research, should EA funders be obligated to fund them?

Presuma... (read more)

Good comment, consider cross-posting to LW?

9
mariushobbhahn
10mo
1. Meta: maybe my comment on the critique reads stronger than intended (see comment with clarifications) and I do agree with some of the criticisms and some of the statements you made. I'll reflect on where I should have phrased things differently and try to clarify below.

2. Hits-based research: Obviously results are one evaluation criterion for scientific research. However, especially for hits-based research, I think there are other factors that cannot be neglected. To give a concrete example, if I was asked whether I should give a unit under your supervision $10M in grant funding or not, I would obviously look back at your history of results but a lot of my judgment would be based on my belief in your ability to find meaningful research directions in the future. To a large extent, the funding would be a bet on you and the research process you introduce in a team and much less on previous results. Obviously, your prior research output is a result of your previous process but especially in early organizations this can diverge quite a bit. Therefore, I think it is fair to say that both a) the output of Conjecture so far has not been that impressive IMO and b) I think their updates to early results to iterate faster and look for more hits actually is positive evidence about their expected future output.

3. Of course, VCs are interested in making money. However, especially if they are angel investors instead of institutional VCs, ideological considerations often play a large role in their investments. In this case, the VCs I'm aware of (not all of which are mentioned in the post and I'm not sure I can share) actually seem fairly aligned for VC standards to me. Furthermore, the way I read the critique is something like "Connor didn't tell the VCs about the alignment plans or neglects them in conversation". However, my impression from conversation with (ex-) staff was that Connor was very direct about their motives to reduce x-risks. I think it's clear that product

Wait, you think the reason we can't do brain improvement is because we can't change the weights of individual neurons?

That seems wrong to me. I think it's because we don't know how the neurons work.

Did you read the link to Cold Takes above? If so, where do you disagree with it?

(I agree that we'd be able to do even better if we knew how the neurons work.)

Similarly I'd be surprised if you thought that beings as intelligent as humans could recursively improve NNs. Cos currently we can't do that, right?

Humans can improve NNs? That's what AI capabilities resear... (read more)

I think it's within the power of beings as intelligent as us (similarly, as mentioned above, I think recursive improvement in humans would accelerate if we had similar abilities).

2
Nathan Young
1y
Wait, you think the reason we can't do brain improvement is because we can't change the weights of individual neurons? That seems wrong to me. I think it's because we don't know how the neurons work. Similarly I'd be surprised if you thought that beings as intelligent as humans could recursively improve NNs. Cos currently we can't do that, right?

I thought yes, but I'm a bit unhappy about that assumption (I forgot it was there). If you go by the intended spirit of the assumption (see the footnote) I'm probably on board, but it seems ripe for misinterpretation ("well if you had just deployed GPT-5 it really could have run an automated company, even though in practice we didn't do that because we were worried about safety and/or legal liability and/or we didn't know how to prompt it etc").

You could look at these older conversations. There's also Where I agree and disagree with Eliezer (see also my comment) though I suspect that won't be what you're looking for.

Mostly though I think you aren't going to get what you're looking for because it's a complicated question that doesn't have a simple answer.

(I think this regardless of whether you frame the question as "do we die?" or "do we live?", if you think the case for doom is straightforward I think you are mistaken. All the doom arguments I know of seem to me like they establish plausibility, ... (read more)

2
Ben_West
1y
Pedantic, but are you using the bio anchors definition? ("software which causes a tenfold acceleration in the rate of growth of the world economy (assuming that it is used everywhere that it would be economically profitable to use it)")
7
Greg_Colbourn
1y
Thanks. Regarding the conversations from 2019, I think we are in a different world now (post GPT-4 + AutoGPT/plugins).

[Paul Christiano] "Perhaps there's no problem at all" - saying this really doesn't help! I want to know why might that be the case! "concerted effort by longtermists could reduce it" - seems less likely now given shorter timelines. "finding out that the problem is impossible can help; it makes it more likely that we can all coordinate to not build dangerous AI systems" - this could be a way out, but again, little time. We need a Pause first to have time to firmly establish impossibility. However, "coordinate to not build dangerous AI systems" is not part of p(non-doom|AGI) [I'm interested in why people think there won't be doom, given we get AGI]. So far, Paul's section does basically nothing to update me on p(doom|AGI).

[Rohin Shah] "A likely crux is that I think that the ML community will actually solve the problems, as opposed to applying a bandaid fix that doesn't scale." - yes, this is a crux for me. How do the fixes scale, with 0 failure modes in the limit of superintelligence? You mention interpretability as a basis for scalable AI-assisted alignment above this, but progress in interpretability remains far behind the scaling of the models, so doesn't hold much hope imo.

"I'm also less worried about race dynamics increasing accident risk"; "the Nash equilibrium is for all agents to be cautious" - I think this has been blown out of the water with the rush to connect GPT-4 to the internet and spread it far and wide as quickly as possible. As I said, we're in a different world now.

"If I condition on discontinuous takeoff... I... get a lot more worried about AI risk" - this also seems cruxy (and I guess we've discussed a bit above). What do you think the likelihood is of model trained with 100x more compute (affordable by Microsoft or Google) being able to do AI Research Engineering as well as the median AI Research Engineer? To me it seems pret

First off, let me say that I'm not accusing you specifically of "hype", except inasmuch as I'm saying that for any AI-risk-worrier who has ever argued for shorter timelines (a class which includes me), if you know nothing else about that person, there's a decent chance their claims are partly "hype". Let me also say that I don't believe you are deliberately benefiting yourself at others' expense.

That being said, accusations of "hype" usually mean an expectation that the claims are overstated due to bias. I don't really see why it matters if the bias is sur... (read more)

0
Greg_Colbourn
1y
I guess you're right that "hype" here could also come from being survival motivated. But surely the easier option is to just stop worrying so much? (I mean, it's not like stress doesn't have health effects). Read the best counter-arguments and reduce your p(doom) accordingly. Unfortunately, I haven't seen any convincing counterarguments. I'm with Richard Ngo here when he says:

What are the best counter-arguments you are aware of? I'm always a bit confused by people saying they have a p(doom|TAI) of 1-10%: like what is the mechanistic reason for expecting that the default, or bulk of the probability mass, is not doom? How is the (transformative) AI spontaneously becoming aligned enough to be safe!? It often reads to me as people (who understand the arguments for x-risk) wanting to sound respectable and not alarmist, rather than actually having a good reason to not worry so much.

GPT-5 or GPT-6 (1 or 2 further generations of large AI model development). Yes, TAI, or PASTA, or AI that can do everything as good as the best humans (including AI Research Engineering). Would you be willing to put this in numerical form (% chance) as a rough expectation?

I don't yet understand why you believe that hardware scaling would come to grow at much higher rates than it has in the past.

If we assume innovations decline, then it is primarily because future AI and robots will be able to automate far more tasks than current AI and robots (and we will get them quickly, not slowly).

Imagine that currently technology A that automates area X gains capabilities at a rate of 5% per year, which ends up leading to a growth rate of 10% per year.

Imagine technology B that also aims to automate area X gains capabilities at a rate o... (read more)

I don't disagree with any of the above (which is why I emphasized that I don't think the scaling argument is sufficient to justify a growth explosion). I'm confused why you think the rate of growth of robots is at all relevant, when (general-purpose) robotics seem mostly like a research technology right now. It feels kind of like looking at the current rate of growth of fusion plants as a prediction of the rate of growth of fusion plants after the point where fusion is cheaper than other sources of energy.

(If you were talking about the rate of growth of machines in general I'd find that more relevant.)

5
Magnus Vinding
1y
By "I am confused by your argument against scaling", I thought you meant the argument I made here, since that was the main argument I made regarding scaling; the example with robots wasn't really central. I'm also a bit confused, because I read your arguments above as being arguments in favor of explosive economic growth rates from hardware scaling and increasing software efficiency. So I'm not sure whether you believe that the factors mentioned in your comment above are sufficient for causing explosive economic growth. Moreover, I don't yet understand why you believe that hardware scaling would come to grow at much higher rates than it has in the past.
5
Magnus Vinding
1y
To be clear, I don't mean to claim that we should give special importance to current growth rates in robotics in particular. I just picked that as an example. But I do think it's a relevant example, primarily due to the gradual nature of the abilities that robots are surpassing, and the consequent gradual nature of their employment. Unlike fusion, which is singular in its relevant output (energy), robots produce a diversity of things, and robots cover a wide range of growth-relevant skills that are gradually getting surpassed already. It is this gradual nature of their growth-related abilities that makes them relevant, imo — because they are already doing a lot of work and already contributing a fair deal to the growth we're currently seeing. (To clarify, I mostly have in mind industrial robots, such as these, the future equivalents of which I also expect to be important to growth; I'd agree that it wouldn't be so relevant if we were only talking about some prototypes of robots that don't yet contribute meaningfully to the economy.)

I am confused by your argument against scaling.

My understanding of the scale-up argument is:

  1. Currently humans are state-of-the-art at various tasks relevant to growth.
  2. We are bottlenecked on scaling up humans by a variety of things (e.g. it takes ~20 years to train up a new human, you can't invest money into the creation of new humans with the hope of getting a return on it, humans only work ~8 hours a day)
  3. At some point AI / robots will be able to match human performance at these tasks.
  4. AI / robots will not be bottlenecked on those things.

In some sense I agre... (read more)
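To make the shape of this argument concrete, here is a deliberately toy simulation (my own illustration, not anything from the thread) that isolates just one of the bottlenecks in premise 2: whether output can be reinvested into creating more workers.

```python
# A toy illustration (my own, not from the thread) of premises 2 and 4 above.
# The only difference between the two runs is whether output can be reinvested
# into creating more "workers", i.e. whether the human-specific bottleneck applies.

def simulate(years: int, reinvest: bool) -> float:
    labor, output = 1.0, 1.0
    for _ in range(years):
        output = labor             # stylized: output proportional to labor
        if reinvest:
            labor += 0.3 * output  # AI/robot case: output buys more workers
        else:
            labor *= 1.01          # human case: ~1% demographic growth, fixed
    return output

print(simulate(50, reinvest=False))  # roughly 1.6x after 50 years
print(simulate(50, reinvest=True))   # several orders of magnitude larger
```

Nothing in the toy model is meant to be realistic; it only shows how removing a single reproduction/investment bottleneck changes the growth regime.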

I agree with premise 3. Where I disagree more comes down to the scope of premise 1.

This relates to the diverse class of contributors and bottlenecks to growth under Model 2. So even though it's true to say that humans are currently "the state-of-the-art at various tasks relevant to growth", it's also true to say that computers and robots are currently "the state-of-the-art at various tasks relevant to growth". Indeed, machines/external tools have been (part of) the state-of-the-art at some tasks for millennia (e.g. in harvesting), and computers and robots ... (read more)

“Hype” typically means Person X is promoting a product, that they benefit from the success of that product, and that they are probably exaggerating the impressiveness of that product in bad faith (or at least, with a self-serving bias). 

All of this seems to apply to AI-risk-worriers?

  • AI-risk-worriers are promoting a narrative that powerful AI will come soon
  • AI-risk-worriers are taken more seriously, have more job opportunities, get more status, get more of their policy proposals, etc, to the extent that this narrative is successful
  • My experience is that
... (read more)
3
Greg_Colbourn
1y
FWIW I am not seeking job opportunities or policy proposals that favour me financially. Rather - policy proposals that keep me, my family, and everyone else alive. My self-interest here is merely in staying alive (and wanting the rest of the planet to stay alive too). I'd rather this wasn't an issue and just enjoy my retirement. I want to spend money on this (pay for people to work on Pause / global AGI moratorium / Shut It Down campaigns).

Status is a trickier thing to untangle. I'd be lying, as a human, if I said I didn't care about it. But I'm not exactly getting much here by being an "AI-risk-worrier". And I could probably get more doing something else. No one is likely to thank me if a disaster doesn't happen.

Re AI products being less impressive than the impression you get from AI-risk-worriers, what do you make of Connor Leahy's take that LLMs are basically "general cognition engines" and will scale to full AGI in a generation or two (and with the addition of various plugins etc to aid "System 2" type thinking, which are freely being offered by the AutoGPT crowd)?
5
Steven Byrnes
1y
Hmm. Touché. I guess another thing on my mind is the mood of the hype-conveyer. My stereotypical mental image of “hype” involves Person X being positive & excited about the product they’re hyping, whereas the imminent-doom-ers that I’ve talked to seem to have a variety of moods including distraught, pissed, etc. (Maybe some are secretly excited too? I dunno; I’m not very involved in that community.)
2
Sharmake
1y
This. I generally also agree with your 3 observations, and the reason I was focusing on truth seeking is because my epistemic environment tends to reward worrying AI claims more than it probably should due to negativity bias, as well as looking at AI Twitter hype.

Thanks for this, it's helpful. I do agree that declining growth rates is significant evidence for your view.

I disagree with your other arguments:

For one, an AI-driven explosion of this kind would most likely involve a corresponding explosion in hardware (e.g. for reasons gestured at here and here), and there are both theoretical and empirical reasons to doubt that we will see such an explosion.

I don't have a strong take on whether we'll see an explosion in hardware efficiency; it's plausible to me that there won't be much change there (and also plausible t... (read more)

7
Magnus Vinding
1y
Regarding explosive growth in the amount of hardware: I meant to include the scale aspect as well when speaking of a hardware explosion. I tried to outline one of the main reasons I'm skeptical of such an 'explosion via scaling' here. In short, in the absence of massive efficiency gains, it seems even less likely that we will see a scale-up explosion in the future.

That's right, but that's consistent with the per capita drop in innovation being a significant part of the reason why growth rates gradually declined since the 1960s. I didn't mean to deny that total population size has played a crucial role, as it obviously has and does. But if innovations per capita continue to decline, then even a significant increase in effective population size in the future may not be enough to cause a growth explosion. For example, if the number of employed robots continues to grow at current rates (roughly 12 percent per year), and if future robots eventually come to be the relevant economic population, then declining rates of innovation/economic productivity per capita would mean that the total economic growth rate still doesn't exceed 12 percent. I realize that you likely expect robot populations to grow much faster in such a future, but I still don't see what would drive such explosive growth in hardware (even if, in fact especially if, it primarily involves scaling-based growth).

That makes sense. On the other hand, it's perhaps worth noting that individual human thinking was increasingly extended by computers after ca. 1950, and yet the rate of innovation per capita still declined. So in that sense, the decline in progress could be seen as being somewhat understated by the graphs, in that the rate of innovation per dollar/scientific instrument/computation/etc. has declined considerably more.
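One way to see the arithmetic behind the 12-percent ceiling in that robot example (my own gloss, assuming output decomposes as population times output per worker, so that growth rates approximately add):

\[
Y = N \cdot y \quad\Longrightarrow\quad g_Y \approx g_N + g_y
\]

If the robot "population" N grows at roughly 12% per year (g_N ≈ 0.12) and per-capita productivity growth g_y is zero or negative, total growth g_Y stays at or below roughly 12% per year.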

I think it does [change the conclusion].

Upon rereading I realize I didn't state this explicitly, but my conclusion was the following:

If an agent has complete preferences, and it does not pursue dominated strategies, then it must be representable as maximizing expected utility.

Transitivity depending on completeness doesn't invalidate that conclusion.

3
EJT
1y
Ah I see! Yep, agree with that.

Okay, it seems like we agree on the object-level facts, and what's left is a disagreement about whether people have been making a major error. I'm less interested in that disagreement so probably won't get into a detailed discussion, but I'll briefly outline my position here.

The error is claiming that 

  • There exist theorems which state that, unless an agent can be represented as maximizing expected utility, that agent is liable to pursue strategies that are dominated by some other available strategy.

I haven't seen anyone point out that that claim

... (read more)
5
EJT
1y
I think that’s right. Yep, I agree with all of this.

Often, but not in this case. If authors understood the above points and meant to refer to the Complete Class Theorem, they need only have said:

  • If an agent has complete, transitive preferences, and it does not pursue dominated strategies, then it must be representable as maximizing expected utility.

(And they probably wouldn’t have mentioned Cox, Savage, etc.)

I think it does. If the money-pump for transitivity needs Completeness, and Completeness is doubtful, then the money-pump for transitivity is doubtful too.

Thanks, I understand better what you're trying to argue.

The part I hadn't understood was that, according to your definition, a "coherence theorem" has to (a) only rely on antecedents of the form "no dominated strategies" and (b) conclude that the agent is representable by a utility function. I agree that on this definition there are no coherence theorems. I still think it's not a great pedagogical or rhetorical move, because the definition is pretty weird.

I still disagree with your claim that people haven't made this critique before.

From your discussion:

[T

... (read more)
8
EJT
1y
Yep, I agree with that.

Note that your money-pump justifies acyclicity (The agent does not strictly prefer A to B, B to C, and C to A) rather than the version of transitivity necessary for the VNM and Complete Class theorems (If the agent weakly prefers A to B, and B to C, then the agent weakly prefers A to C). Gustafsson thinks you need Completeness to get a money-pump for this version of transitivity working (see footnote 8 on page 3), and I'm inclined to agree.

A dominated strategy would be a strategy which leads you to choose an option that is worse in some respect than another available option and not better than that other available option in any respect. For example, making all the trades and getting A- in the decision-situation below would be a dominated strategy, since you could have made no trades and got A:

The error is claiming that

  • There exist theorems which state that, unless an agent can be represented as maximizing expected utility, that agent is liable to pursue strategies that are dominated by some other available strategy.

I haven't seen anyone point out that that claim is false. That said, one could reason as follows:

1. Rohin, John, and others have argued that agents with incomplete preferences can act in accordance with policies that make them immune to pursuing dominated strategies.
2. Agents with incomplete preferences cannot be represented as maximizing expected utility.
3. So, if Rohin's, John's, and others' arguments are sound, there cannot exist theorems which state that, unless an agent can be represented as maximizing expected utility, that agent is liable to pursue strategies that are dominated by some other available strategy.

Then one would have corrected the error. But since the availability of this kind of reasoning is easily missed, it seems worth correcting the error directly.
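For readers who want the dominated-strategy claim spelled out concretely, here is a minimal sketch (my own rendering, not from the thread) of the single-sweetening setup EJT describes, with A strictly preferred to A- and B incomparable to both:

```python
# Minimal sketch of the two-trade decision-situation described above.
# Preferences form a strict partial order: A is strictly preferred to A-,
# and B is incomparable to both A and A-.

strict_prefs = {("A", "A-")}  # the only strict preference

def strictly_prefers(x, y):
    return (x, y) in strict_prefs

# Strategies over the whole trade sequence, mapped to their final outcome.
strategies = {
    "keep A (make no trades)": "A",
    "trade A for B, then stop": "B",
    "trade A for B, then B for A-": "A-",
}

def is_dominated(name):
    """Dominated = some other available strategy's outcome is strictly preferred."""
    outcome = strategies[name]
    return any(strictly_prefers(other, outcome) for other in strategies.values())

for name in strategies:
    print(f"{name}: {'dominated' if is_dominated(name) else 'not dominated'}")
```

Running it flags only the all-trades strategy (ending at A-) as dominated; a policy over the whole sequence that never ends there keeps the incomplete-preference agent free of dominated strategies, which is the point of steps 1-3 above.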

Your post argues for a strong conclusion:

To spot the error in these arguments, we only have to look up what cited ‘coherence theorems’ actually say. And yet the error seems to have gone uncorrected for more than a decade.

[...]

There are no coherence theorems. Authors in the AI safety community should stop suggesting that there are.

There are money-pump arguments, but the conclusions of these arguments are not theorems. The arguments depend on substantive and doubtful assumptions.

As I understand it, you propose two main arguments for the conclusion:

  1. There are
... (read more)
15
EJT
1y

Theorems are typically of the form "Suppose X, then Y"; what is X if not an assumption?

X is an antecedent.

Consider an example. Imagine I claim:

  • Suppose James is a bachelor. Then James is unmarried.

In making this claim, I am not assuming that James is a bachelor. My claim is true whether or not James is a bachelor.

I might temporarily assume that James is a bachelor, and then use that assumption to prove that James is unmarried. But when I conclude ‘Suppose James is a bachelor. Then James is unmarried’, I discharge that initial assumption. My conclusion no lo... (read more)
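A compact way to display that move, in standard natural-deduction terms (my gloss, not a quote):

\[
\begin{array}{ll}
1.\;\; X & \text{(assumption, made temporarily)} \\
2.\;\; Y & \text{(derived from 1)} \\
3.\;\; X \rightarrow Y & \text{(conditional proof; assumption 1 discharged)}
\end{array}
\]

The conclusion on the last line no longer depends on the assumption, just as "Suppose James is a bachelor. Then James is unmarried" does not assume that James is a bachelor.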

If we bracket the timelines part and just ask about p(doom), I think https://www.lesswrong.com/posts/Ke2ogqSEhL2KCJCNx/security-mindset-lessons-from-20-years-of-software-security and https://intelligence.org/2017/11/25/security-mindset-ordinary-paranoia/ makes it quite easy to reach extremely dire forecasts about AGI. Getting extremely novel software right on the first try is just that hard.

Surely not. Neither of those make any arguments about AI, just about software generally. If you literally think those two are sufficient arguments for concluding "AI ki... (read more)

7
RobBensinger
1y
Yep! To be explicit, I was assuming that general intelligence is very powerful, that you can automate it, and that it isn't (e.g.) friendly by default.

I think this is generally false, but might benefit from some examples or more specifics.

(Referring to the OP, not the comment)

Unless they were edited in after these comments were written

For the record, these were not edited in after seeing the replies. (Possibly I edited them in a few minutes after writing the comment -- I do that pretty frequently -- but if so it was before any of the replies were written, and very likely before any of the repliers had seen my comment.)

At times like these, I don't really know why I bother to engage on the EA Forum, given that people seem to be incapable of engaging with the thing I wrote instead of some totally different thing in their head.

I'll just pop back in here briefly to say that (1) I have learned a lot from your writing over the years, (2) I have to say I still cannot see how I misinterpreted your comment, and (3) I genuinely appreciate your engagement with the post, even if I think your summary misses the contribution in a fundamentally important way (as I tried to elaborate elsewhere in the thread).

The argument here emphatically cannot be merely summarized as "AGI soon [is] a very contrarian position [and market prices are another indication of this]".

Can you describe in concrete detail a possible world in which:

  1. "AGI in 30 years" is a very contrarian position, including amongst hedge fund managers, bankers, billionaires, etc
  2. Market prices indicate that we'll get AGI in 30 years

It seems to me that if you were in such a situation, all of the non-contrarian hedge fund managers, bankers, billionaires would do the opposite of all of the trades that you've ... (read more)

This post's thesis is that markets don't expect AGI in the next 30 years. I'll make a stronger claim: most people don't expect AGI in the next 30 years; it's a contrarian position. Anyone expecting AGI in that time is disagreeing with a very large swath of humanity.

(It's a stronger claim because "most people don't expect AGI" implies "markets don't expect AGI", but the reverse is not true. (Not literally so -- you can construct scenarios like "only investors expect AGI while others don't" where most people don't expect AGI but the market does expect AGI --... (read more)

-28
supesanon
1y

It's a stronger claim because "most people don't expect AGI" implies "markets don't expect AGI"

I'm not sure that's true. Markets often price things that only a minority of people know or care about. See the lithium example in the original post. That was a case where "most people didn't know lithium was used in the H-bomb" didn't imply that "markets didn't know lithium was used in the H-bomb"

My view on this is rather that there seem to be several key technologies and measures of progress that have very limited room for further growth, and the ~zero-to-one growth that occurred along many of these key dimensions seems to have been low-hanging fruit that coincided with the high growth rates that we observed around the mid-1900s. And I think this counts as modest evidence against a future growth explosion.

Hmm, it seems to me like these observations are all predicted by the model I'm advocating, so I don't see why they're evidence against that mode... (read more)

3
Magnus Vinding
1y
I think the empirical data suggests that that effect generally doesn't dominate anymore, and that it hasn't dominated in the economy as a whole for the last ~3 doublings. For example, US Total Factor Productivity growth has been weakly declining for several decades despite superlinear growth in the effective number of researchers. I think the example of 0 AD is disanalogous because there wasn't a zero-to-one growth along similarly significant and fundamental dimensions (e.g. hitting the ultimate limit in the speed of communication) followed by an unprecedented growth decline that further (weakly) supports that we're past the inflection point, i.e. past peak growth rates.

You're trying to argue for "there are no / very few important technologies with massive room for growth" by giving examples of specific things without massive room for growth.

In general arguing for "there is no X that satisfies Y" by giving examples of individual Xs that don't satisfy Y is going to be pretty rough and not very persuasive to me, unless there's some reason that can be abstracted out of the individual examples that is likely to apply to all Xs, which I don't see in this case. I don't care much whether the examples are technologies or measures... (read more)

3
Magnus Vinding
1y
I should clarify that I’m not trying to argue for that claim, which is not a claim that I endorse. My view on this is rather that there seem to be several key technologies and measures of progress that have very limited room for further growth, and the ~zero-to-one growth that occurred along many of these key dimensions seems to have been low-hanging fruit that coincided with the high growth rates that we observed around the mid-1900s. And I think this counts as modest evidence against a future growth explosion.

That is, my sense from reading Gordon and others is that the high growth rates of the 20th century were in large part driven by a confluence of innovations across many different technological domains — innovations that people such as Gordon and Cowen roughly describe as low-hanging innovations that are no longer (as readily) accessible. This, combined with the empirical observation that growth rates have been declining since the 1960s, and the observation that innovations per capita have decreased, seems to me convergent (albeit in itself still quite tentative) evidence in favor of the claim that we will not see explosive growth in the future, as this “low-hanging fruit picture” renders a similar — and especially a greater — such confluence of progress less likely to occur again. And I would regard each of those lines of evidence to be significant. (Of course, these lines of evidence are closely related; e.g. the decline in innovations per capita might be seen as a consequence of our having already reaped the most significant innovations in many — though by no means all — key domains.)

I also think it’s important to distinguish 1) how much room for growth that various technologies have, and 2) how likely it is that we will see a growth explosion. My view is that we are (obviously) quite far from the ultimate limits in many domains, but that future growth will most likely be non-explosive, partly because future innovations seem much harder to reap compare

I don't disagree with anything you've written here, but I'm not seeing why the limits they impose are anywhere close to where we are today.

3
David Johnston
1y
We might just be talking past each other - I’m not saying this is a reason to be confident explosive growth won’t happen and I agree it looks like growth could go much faster before hitting any limits like this. I just meant to say “here’s a speculative mechanism that could break some of the explosive growth models”

I think most of the arguments I present in the section on why I consider Model 2 most plausible are about declining growth along various metrics.

Yes, sorry, I shouldn't have said "most".

especially those presented in the section “Many key technologies only have modest room for further growth

Yeah, I mostly don't buy the argument (sorry for not noting that earlier). It's not the case that there are N technologies and progress consists solely of improving those technologies; progress usually happens by developing new technologies. So I don't see the fact that... (read more)

5
Magnus Vinding
1y
Yeah, I agree with that. :) But I think we can still point to some important underlying measures — say, "the speed at which we transmit signals around Earth" or "the efficiency with which we can harvest solar energy" — where there isn't much room for further progress. On the first of those two measures, there basically isn't any room for further progress. On the second, we can at the very most see ~a doubling from where we currently are, whereas we have seen more than a 40x increase since the earliest solar cells in the late 1800s. Those are some instances of progress that cannot be repeated (within those domains), even if we create new technologies within these domains.

Of course, there may be untapped domains that could prove similarly significant for growth. But I still think the increasing number of domains in which past growth accomplishments cannot be repeated provides a modest reason to doubt a future growth explosion. As noted, I don't think any of the reasons I listed are strong in themselves, but when combined with the other reasons, including the decline in innovations per capita, recent empirical trends in hardware progress, the relatively imminent limits in the growth of information processing (less than 250 years at current growth rates), and the point about the potential difficulties of explosive growth given limited efficiency gains I made here, I do think a growth explosion begins to look rather unlikely, especially one that implies >1000 percent annual growth (corresponding to an economy that doubles ~every three months or faster).
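Spelling out that final parenthetical (a quick arithmetic check, not a claim from the comment): an economy that doubles every three months grows in a year by a factor of

\[
2^{12/3} = 2^{4} = 16,
\]

i.e. roughly 1500% annual growth, which is indeed above the >1000% threshold mentioned.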

As I understand it your argument is "Even if AI could lead to explosive growth, we'll choose not to do it because we don't yet know what we want". This seems pretty wild, does literally no one want to make tons of money in this scenario?

2
David Johnston
1y
I don’t think your summary is wrong as such, but it’s not how I think about it.

Suppose we’ve got great AI that, in practice, we still use with a wide variety of control inputs (“make better batteries”, “create software that does X”). Then it could be the case - if AI enables explosive growth in other domains - that “production of control inputs” becomes the main production bottleneck.

Alternatively, suppose there’s a “make me a lot of money” AI and money making is basically about making stuff that people want to buy. You can sell more stuff that people are already known to want - but that runs into the limit that people only want a finite amount of stuff. You could alternatively sell new stuff that people want but don’t know it yet. This is still limited by the number of people in the world, how often each wants to consider adopting a new technology and what things someone with life history X is actually likely to adopt and how long it takes them to make this decision. These things seem unlikely to scale indefinitely with AI capability.

This could be defeated by either money not being about making stuff people want - which seems fairly likely, but in this case I don’t really know what to think - or AI capability leading to (explosive?) human population expansion.

In defence of this not being completely wild speculation: advertising already comprises a nontrivial fraction of economic activity and seems to be growing faster than other sectors https://www.statista.com/statistics/272443/growth-of-advertising-spending-worldwide/ (Although only a small fraction of advertising is promoting the adoption of new tech)

What do you think about the perspective of "Model 2, but there's still explosive growth"? In particular, depending on what exactly you mean by cognitive abilities, I think it's pretty plausible to believe (1) cognitive abilities are a rather modest part of ability to achieve goals, (2) individual human cognitive abilities are a significant bottleneck among many others to economic and technological growth, (3) the most growth-relevant human abilities will be surpassed by machines quite continuously (not over millennia, but there isn't any one "big disc... (read more)

I wrote earlier that I might write a more elaborate comment, which I'll attempt now. The following are some comments on the pieces that you linked to.

1. The Most Important Century series

I disagree with this series in a number of places. For example, in the post "This Can't Go On", it says the following in the context of an airplane metaphor for our condition:

We're going much faster than normal, and there isn't enough runway to do this much longer ... and we're accelerating.

As argued above, in terms of economic growth rates, we're in fact not accelerating, ... (read more)

6
Magnus Vinding
1y
Thanks for your question :) I might write a more elaborate comment later, but to give a brief reply: It’s true that Model 2 (defined in terms of those three assumptions) does not rule out significantly higher growth rates, but it does, I believe, make explosive growth quite a lot less likely compared to Model 1, since it does not imply that there’s a single bottleneck that will give rise to explosive growth. I think most of the arguments I present in the section on why I consider Model 2 most plausible are about declining growth along various metrics. And many of the declining trends appear to have little to do with the demographic transition, especially those presented in the section “Many key technologies only have modest room for further growth”, as well as the apparent decline in innovations per capita.
1
David Johnston
1y
One objection to the “more AI -> more growth” story is that it’s quite plausible that people still participate in an AI driven economy to the extent that they decide what they want, and this could be a substantial bottleneck to growth rates. Speeds of technological adoption do seem to have increased (https://www.visualcapitalist.com/rising-speed-technological-adoption/), but that doesn’t necessarily mean they can indefinitely keep pace with AI driven innovation.

Fwiw I'd also say that most of "the most engaged EAs" would not feel betrayed or lied to (for the same reasons), though I would be more uncertain about that. Mostly I'm predicting that there's pretty strong selection bias in the people you're thinking of and you'd have to really precisely pin them down (e.g. maybe something like "rationalist-adjacent highly engaged EAs who have spent a long time thinking about meta-honesty and glomarization") before it would become true that a majority of them would feel betrayed or lied to.

3
Habryka
1y
That's plausible, though I do think I would take a bet here if we could somehow operationalize it. I do think I have to adjust for a bunch of selection effects in my thinking, and so am not super confident here, but still a bit above 50%. 

Like, let's look ahead a few months. Some lower-level FTX employee is accused of having committed some minor fraud with good ethical justification that actually looks reasonable according to RP leadership, so they make a statement coming out in defense of that person. 

Do you not expect this to create strong feelings of betrayal in previous readers of this post, and a strong feeling of having been lied to?

I broadly agree with your comments otherwise, but in fact in this hypothetical I expect most readers of this post would not feel betrayed or lied to.... (read more)

Yeah, sorry, I think you are right that as phrased this is incorrect. I think my phrasing implies I am talking about the average or median reader, who I don't expect to react in this way.

Across EA, I do expect reactions to be pretty split. I do expect many of the most engaged EAs to have taken statements like this pretty literally and to feel quite betrayed (while I also think that in-general the vast majority of people will have interpreted the statements as being more about mood-affiliation and to have not really been intended to convey information). ... (read more)

But even if we could be confident that entertainment would hypothetically outweigh sex crimes on pure utilitarian grounds, in the real world with real politics and EA critics, I do not think this position would be tenable.

Isn't this basically society's revealed position on, say, cameras? People can and do use cameras for sex crimes (e.g. voyeurism) but we don't regulate cameras in order to reduce sex crimes.

I agree that PR-wise it's not a great look to say that benefits outweigh risks when the risks are sex crimes but that's because PR diverges wildly from... (read more)

1
philljkc
1y
We agree for sure that cost/benefit ought to be better articulated when deploying these models (see the What Do We Want section on Cost-Benefit Analysis). The problem here really is the culture of blindly releasing and open-sourcing models like this, using a Go Fast And Break Things mentality, without at least making a case for what the benefits are, what the harms are, and not appealing to any existing standard when making these decisions. Again, it's possible (but not our position) that the specifics of DALLE-2 don't bother you as much, but certainly the current culture we have around such models and their deployment seems an unambiguously alarming development.

The text-to-image models for education + communication here seem like a great idea! Moreover, I think it's definitely consistent with what we've put forth here too, since you could probably fine-tune on graphics contained in papers related to your task at hand. The issue here really is that people are incurring unnecessary amounts of risk by making, say, an automatic Distill-er by using all images on the internet or something like that, when training on smaller corpora would probably suffice, and vastly reduce the amount of possible risk of a model intended originally for Distill-ing papers.

The fundamental position we advance is that better protocols are needed before we start mass-deploying these models, and not that NO version of these models / technologies could be beneficial, ever.

A way to separate between goal misgeneralization and capabilities misgeneralization would be exciting if work on goal misgeneralization could improve alignment without capabilities externalities.

However, this distinction might be eroded in the future.

My favorite distinction between alignment vs capabilities, which mostly doesn't work now but should work for more powerful future systems, is to ask "did the model 'know' that the actions it takes are ones that the designers would not want?" If yes, then it's misalignment.

(This is briefly discussed in Section 5.2 of the paper.)
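As a rough operationalization of that test (my own gloss, not anything from the paper; `elicit_belief` is a hypothetical stand-in for whatever method, e.g. interpretability tools or honest reporting, would reveal what the model believed):

```python
# Hypothetical sketch: classify a bad outcome as misalignment vs. capability failure,
# assuming some way (`elicit_belief`) of finding out what the model believed.

def classify_failure(elicit_belief, action, context) -> str:
    knew_designers_disapproved = elicit_belief(
        f"Would the designers not want the action {action!r} in context {context!r}?")
    return "misalignment" if knew_designers_disapproved else "capability failure"
```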

 As far as I can tell, we don't know of any principles that satisfy both (1) they guide our actions in all or at least most situations and (2) when taken seriously, they don't lead to crazy town. So our options seem to be (a) don't use principles to guide actions, (b) go to crazy town, or (c) use principles to guide actions but be willing to abandon them when the implications get too crazy.

I wish this post had less of a focus on utilitarianism and more on whether we should be doing (a), (b) or (c).

(I am probably not going to respond to comments about how specific principles satisfy (1) and (2) unless it actually seems plausible to me that they might satisfy (1) and (2).)

5
MichaelStJules
2y
I think there plausibly are principles that achieve (1) and (2), but they'll give up either transitivity or the independence of irrelevant alternatives, and if used to guide actions locally without anticipating your own future decisions and without the ability to make precommitments, lead to plausibly irrational behaviour (and more than usual than just with known hard problems like Newcomb's and Parfit's hitchhiker). I don't think those count as "crazy towns", but they're things people find undesirable or see as "inconsistent". Also, they might require more arbitrariness than usual, e.g. picking thresholds, or nonlinear monotonically increasing functions.

Principles I have in mind (although they need to be extended or combined with others to achieve 1):

1. Partial/limited aggregation, although I don't know if they're very well-developed, especially to handle uncertainty (and some extensions may have horrific crazy town counterexamples, like https://forum.effectivealtruism.org/posts/smgFKszHPLfoBEqmf/partial-aggregation-s-utility-monster). The Repugnant Conclusion and extensions (https://link.springer.com/article/10.1007/s00355-021-01321-2 ) can be avoided this way, and I think limiting and totally forbidding aggregation are basically the only ways to do so, but totally forbidding aggregation probably leads to crazy towns.

2. Difference-making risk aversion, to prevent fanaticism for an otherwise unbounded theory (objections of stochastic dominance and, if universalized, collective defeat here https://globalprioritiesinstitute.org/on-the-desire-to-make-a-difference-hilary-greaves-william-macaskill-andreas-mogensen-and-teruji-thomas-global-priorities-institute-university-of-oxford/ , https://www.youtube.com/watch?v=HT2w5jGCWG4 and https://forum.effectivealtruism.org/posts/QZujaLgPateuiHXDT/concerns-with-difference-making-risk-aversion , and some other objections and responses here https://forum.effectivealtruism.org/posts/sEnkD8sHP6pZztFc2/fanatical-eas-should-su

Thanks for your comment.

It's not clear to me that (a), (b), and (c) are the only options - or rather, there are a bunch of different variants (c'), (c''), (c'''). Sure, you can say "use principles to guide action until they get too crazy', but you can also say 'use a multiplicity of principles to guide action until they conflict', or 'use a single principle to guide action until you run into cases where it is difficult to judge how the principle applies', or so on. There are lots of different rules of thumb to tell you when and how principles run out, none... (read more)

6
Guy Raveh
2y
Why is criterion (1) something we want? Isn't it enough to find principles that would guide us in some or many situations? Or even just in "many of the currently relevant-to-EA situations"?

You can either interpret low karma as a sign that the karma system is broken or that the summaries aren't sufficiently good. In hindsight I think you're right and I lean more towards the former -- even though people tell me they like my newsletter, it doesn't actually get that much karma.

I thought you thought that karma was a decent measure since you suggested

Putting the summary up as a Forum post and seeing if it gets a certain number of karma

as a way to evaluate how good a summary is.

2
Nathan Young
2y
Yeah I think I don't think karma is as good as I imply here. I'll change.

Idk, in my particular case I'd say writing summaries was a major reason that I now have prestige / access to resources.

I think it's probably just hard to write good summaries; many of the summaries posted here don't get very much karma.

4
Nathan Young
2y
This is one of the problems I'm criticising, both here and here. I'm confused, your tone seems like you are dismissing the criticism, but you seem to agree on the object level that summaries are undersupplied given their value. Maybe you think that summaries do give rewards and prestige but only in the long term? In which case, isn't that a problem that can be sorted by paying people who write good summaries and ensuring that the karma system appropriately rewards summarising?

I'm surprised that "write summaries" isn't one of the proposed concrete solutions. One person can do a lot.

5
Nathan Young
2y
Added and credited you

Yeah, I don't think it's clearly unreasonable (though it's not my intuition).

I agree that suicide rates are not particularly strong evidence one way or the other.

I broadly agree that "what does a life barely worth living look like" matters a lot, and you could imagine setting it to be high enough that the repugnant conclusion doesn't look repugnant.

That being said, if you set it too high, there are other counterintuitive conclusions. For example, if you set it higher than people alive today (as it sounds like you're doing), then you are saying that people alive today have negative terminal value, and (if we ignore instrumental value) it would be better if they didn't exist.

7
AppliedDivinityStudies
2y
This seems entirely plausible to me. A couple jokes which may help generate an intuition here (1, 2) You could argue that suicide rates would be much higher if this were true, but there are lots of reasons people might not commit suicide despite experiencing net-negative utility over the course of their lives. At the very least, this doesn't feel as obviously objectionable to me as the other proposed solutions to the "mere addition paradox".  

So, did I or didn't I come across as unfriendly/hostile?

You didn't to me, but also (a) I know you in person and (b) I'm generally pretty happy to be in forceful arguments and don't interpret them as unfriendly / hostile, while other people plausibly would (see also combat culture). So really I think I'm the wrong person to ask.

So, given that I wanted to do both 1 and 2, would you think it would have been fine if I had just made them as separate comments, instead of mentioning 1 in passing in the thread on 2? Or do you think I really should have picked one

... (read more)
7
kokotajlod
2y
OK. I'll DM Nuno. Something about your characterization of what happened continues to feel unfair & inaccurate to me, but there's definitely truth in it & I think your advice is good so I will stop arguing & accept the criticism & try to remember it going forward. :)

Did I come across as unfriendly and hostile? I am sorry if so, that was not my intent.

No, that's not what I meant. I'm saying that the conversational moves you're making are not ones that promote collaborative truth-seeking.

Any claim of actual importance usually has a giant tree of arguments that back it up. Any two people are going to disagree on many different nodes within this tree (just because there are so many nodes). In addition, it takes a fair amount of effort just to understand and get to the same page on any one given node.

So, if you want to do ... (read more)

4
kokotajlod
2y
Thanks for this thoughtful explanation & model.

(Aside: So, did I or didn't I come across as unfriendly/hostile? I never suggested that you said that, only that maybe it was true. This matters because I genuinely worry that I did & am thinking about being more cautious in the future as a result.)

So, given that I wanted to do both 1 and 2, would you think it would have been fine if I had just made them as separate comments, instead of mentioning 1 in passing in the thread on 2? Or do you think I really should have picked one to do and not done both?

The thing about changing my mind also resonates--that definitely happened to some extent during this conversation, because (as mentioned above) I didn't realize Nuno was talking about people who put lots of probability mass on the evolution anchor. For those people, a shift up or down by a couple OOMs really matters, and so the BOTEC I did about how probably the environment can be simulated for less than 10^41 flops needs to be held to a higher standard of scrutiny & could end up being judged insufficient.

Lots of thoughts on this post:

Value of inside views

Inside Views are Overrated [...]

The obvious reason to form inside views is to form truer beliefs

No? The reason to form inside views is that it enables better research, and I'm surprised this mostly doesn't feature in your post. Quoting past-you:

  • Research quality - Doing good research involves having good intuitions and research taste, sometimes called an inside view, about why the research matters and what’s really going on. This conceptual framework guides the many small decisions and trade-offs you make o
... (read more)