All of Rohin Shah's Comments + Replies

Peacefulness, nonviolence, and experientialist minimalism

For others who were confused, like I was:

Some people may worry that minimalist axiologies would imply an affirmative answer to the following questions:

  1. Would an empty world (i.e. a world without sentient beings) be axiologically perfect?
  2. For any hypothetical world, would the best outcome always be realized by pressing a button that leads to its instant cessation?

The author agrees that the answers to these questions are "yes". The author's main point is that perhaps you shouldn't be worried about that.

The author agrees that the answers to these questions are "yes".

Not quite. The author assumes a certain class of minimalist axiologies (experientialist ones), according to which the answers to those questions are:

  1. Yes (though a world with untroubled sentient beings would be equally perfect, and there are good reasons to focus more on that ideal of minimalism in practice).
  2. If the hypothetical world contains no disvalue, then pressing the button is not strictly better, but if the hypothetical world does contain disvalue, then it would be better to press a cess
... (read more)
Ben Garfinkel's Shortform

I suppose my point is more narrow, really just questioning whether the observation "humans care about things besides their genes" gives us any additional reason for concern.

I mostly go ¯\_(ツ)_/¯ ; it doesn't feel like it's much evidence of anything after you've updated off the abstract argument. The actual situation we face will be so different (primarily, we're actually trying to deal with the alignment problem, unlike evolution).

I do agree that in saying " ¯\_(ツ)_/¯  " I am disagreeing with a bunch of claims that say "evolution example implies misa... (read more)

Ben Garfinkel's Shortform

The actual worry with inner misalignment style concerns is that the selection you do during training does not fully constrain the goals of the AI system you get out; if there are multiple goals consistent with the selection you applied during training there's no particular reason to expect any particular one of them. Importantly, when you are using natural selection or gradient descent, the constraints are not "you must optimize X goal", the constraints are "in Y situations you must behave in Z ways", which doesn't constrain how you behave in totally diffe... (read more)

5Ben Garfinkel2d
I think that's well-put -- and I generally agree that this suggests genuine reason for concern. I suppose my point is more narrow, really just questioning whether the observation "humans care about things besides their genes" gives us any additional reason for concern. Some presentations seem to suggest it does. For example, this introduction [https://www.lesswrong.com/posts/AHhCrJ2KpTjsCSwbt/inner-alignment-explain-like-i-m-12-edition] to inner alignment concerns (based on the MIRI mesa-optimization paper) says:

And I want to say: "On net, if humans did only care about maximizing inclusive genetic fitness, that would probably be a reason to become more concerned (rather than less concerned) that ML systems will generalize in dangerous ways." While the abstract argument makes sense, I think this specific observation isn't evidence of risk.

Relatedly, something I'd be interested in reading (if it doesn't already exist?) would be a piece that takes a broader approach to drawing lessons from the evolution of human goals - rather than stopping at the fact that humans care about things besides genetic fitness. My guess is that the case of humans is overall a little reassuring (relative to how we might have expected generalization to work), while still leaving a lot of room for worry.

For example, in the case of violence: People who committed totally random acts of violence presumably often failed to pass on their genes (because they were often killed or ostracized in return). However, a large portion of our ancestors did have occasion for violence. On high-end estimates, our average ancestor may have killed about 0.25 people. This has resulted in most people having a pretty strong disinclination to commit murder; for most people, it's very hard to bring yourself to murder and you'll often be willing to pay a big cost to avoid committing murder. The three main reasons for concern, t
Are you really in a race? The Cautionary Tales of Szilárd and Ellsberg

I agree with your points on making sure you're in a race and being careful about secrecy, but I don't understand:

Scientists have a lot of power! Don’t give it up easily

From my perspective it seems like the scientists wielded their power very effectively rather than "giving it up". They just happened to wield the power in service of the wrong goal, due to mistaken beliefs about the state of reality.

Perhaps to frame it differently: what does it look like to not give up your power as a scientist?

6HaydnBelfield4d
Thanks Rohin. Yes I should perhaps have spelled this out more. I was thinking about two things - focussed on those two stages of advocacy and participation.

  1. Don't just get swept up in race rhetoric and join the advocacy: "oh there's nothing we can do to prevent this, we may as well just join and be loud advocates so we have some chance to shape it". Well no, whether a sprint occurs is not just in the hands of politicians and the military, but also to a large extent in the hands of scientists. Scientists have proven crucial to advocacy for, and participation in, sprints. Don't give up your power too easily.
  2. You don't have to stay if it turns out you're not actually in a race and you don't have any influence on the sprint program. There were several times in 1945 when it seems to me that scientists gave up their power too easily - over when and how the bomb was used, and what information was given to the US public. It's striking that Rotblat was the only one to resign - and he was leant on to keep his real reasons secret. One can also see this later in 1949 and the decision to go for the thermonuclear bomb. Oppenheimer, Conant, Fermi and Bethe all strongly opposed that second 'sprint' ("It is necessarily an evil thing considered in any light."). They were overruled, and yet continued to actively participate in the program. The only person to leave the program (Ellsberg thinks, p.291-296) was Ellsberg's own father, a factory designer - who also kept it secret. Exit or the threat of exit can be a powerful way to shape outcomes - I discuss this further in Activism by the AI Community [https://arxiv.org/abs/2001.06528]. Don't give up your power too easily.
Forecasting Newsletter: April 2022

Jan Kirchner (a) writes a popularization of "Infrabayesianism" (a), a theory of how to make pseudo-Bayesian updates in the presence of intelligent adversaries. I appreciated the effort, but I thought this could have been much better. If anyone writes a better introduction I'll give them a forecasting microgrant proportionate to my estimate of its quality.

I still like my summary, if you haven't seen that yet. (Tbc I'm not looking for a microgrant, just informing you of the existence of the summary.)

2NunoSempere13d
Cheers!
Compiling resources comparing AI misuse, misalignment, and incompetence risk and tractability

I think this mostly hasn't been done, but here's one survey that finds large disagreement.

1Peter444419d
Thanks. Yes, that was the survey I mentioned.
Paper summary: The case for strong longtermism (Hilary Greaves and William MacAskill)

Hmm well aren't we all individuals making individual choices? So ultimately what is relevant to me is if my actions are fanatical?

We're all particular brain cognitions that only exist for ephemeral moments before our brains change and become a new cognition that is similar but not the same. (See also "What counts as death?".) I coordinate both with the temporally-distant (i.e. future) brain cognitions that we typically call "me in the past/future" and with the spatially-distant brain cognitions that we typically call "other people". The temporally-distant ... (read more)

4Jack Malde20d
Certainly agree there is something weird there! Anyway I don't really think there was too much disagreement between us, but it was an interesting exchange nonetheless!
Paper summary: The case for strong longtermism (Hilary Greaves and William MacAskill)

Why consider only a single longtermist career in isolation, but consider multiple donations in aggregate?

Given that you seem to agree voting is fanatical, I'm guessing you want to consider the probability that an individual's actions are impactful, but why should the locus of agency be the individual? Seems pretty arbitrary.

If you agree that voting is fanatical, do you also agree that activism is fanatical? The addition of a single activist is very unlikely to change the end result of the activism.

2Jack Malde21d
A longtermist career spans decades, as would going vegan for life or donating regularly for decades. So it was mostly a temporal thing, trying to somewhat equalise the commitment associated with different altruistic choices. Hmm well aren't we all individuals making individual choices? So ultimately what is relevant to me is if my actions are fanatical? Pretty much yes. To clarify - I have never said I'm against acting fanatically. I think the arguments for acting fanatically, particularly the one in this paper [https://globalprioritiesinstitute.org/nick-beckstead-and-teruji-thomas-a-paradox-for-tiny-probabilities-and-enormous-values/] , are very strong. That said, something like a Pascal's mugging does seem a bit ridiculous to me (but I'm open to the possibility I should hand over the money!).
Paper summary: The case for strong longtermism (Hilary Greaves and William MacAskill)

I think the probability that my personal actions avert an existential catastrophe is higher than the probability that my personal vote in the next US presidential election would change its outcome.

I think I'd plausibly say the same thing for my other examples; I'd have to think a bit more about the actual probabilities involved.

2Jack Malde22d
That's fair enough, although when it comes to voting I mainly do it for personal pleasure / so that I don't have to lie to people about having voted! When it comes to something like donating to GiveWell charities on a regular basis / going vegan for life I think one can probably have greater than 50% belief they will genuinely save lives / avert suffering. Any single donation or choice to avoid meat will have far lower probability, but it seems fair to consider doing these things over a longer period of time as that is typically what people do (and what someone who chooses a longtermist career essentially does).
Paper summary: The case for strong longtermism (Hilary Greaves and William MacAskill)

By this logic it seems like all sorts of ordinary things are fanatical:

  1. Buying less chicken from the grocery store is fanatical (this only reduces the number of suffering chickens if your buying less chicken was the tipping point that caused the grocery store to order one less shipment of chicken, and that one fewer order was the tipping point that caused the factory farm to reduce the number of chickens it aimed to produce; this seems very low probability; see the expected-value sketch after this excerpt)
  2. Donating small amounts to AMF is fanatical (it's very unlikely that your $25 causes AMF to do another d
... (read more)
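An illustrative expected-value sketch of the tipping-point reasoning in example 1 above; the shipment threshold $N$ is a hypothetical parameter added for illustration, not a figure from the comment:

$$
\mathbb{E}[\text{impact}] \approx \underbrace{\frac{1}{N}}_{\Pr[\text{your purchase is the tipping point}]} \times \underbrace{N}_{\text{chickens affected if it is}} = 1 \text{ chicken}.
$$

The structure matches the x-risk case: a tiny probability of being decisive paired with a large effect when you are, so the expected value need not be small even though the probability is.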
2Jack Malde22d
Probabilities are on a continuum. It’s subjective at what point fanaticism starts. You can call those examples fanatical if you want to, but the probabilities of success in those examples are probably considerably higher than in the case of averting an existential catastrophe.
Paper summary: The case for strong longtermism (Hilary Greaves and William MacAskill)

many of them are willing to give to a long-term future fund over GiveWell charities

It really doesn't seem fanatical to me to try to reduce the chance of everyone dying, when you have a specific mechanism by which everyone might die that doesn't seem all that unlikely! That's the right action according to all sorts of belief systems, not just longtermism! (See also these posts.)

2Jack Malde22d
Hmm I do think it's fairly fanatical. To quote this summary: The probability that any one longtermist's actions will actually prevent a catastrophe is very small. So I do think longtermist EAs are acting fairly fanatically. Another way of thinking about it is that, whilst the probability of x-risk may be fairly high, the x-risk probability decrease any one person can achieve is very small. I raised this point on Neel's post. [https://forum.effectivealtruism.org/posts/rFpfW2ndHSX7ERWLH/simplify-ea-pitches-to-holy-shit-x-risk?commentId=n9hu3rGgGY6rCQttb]
Paper summary: The case for strong longtermism (Hilary Greaves and William MacAskill)

I think the EA longtermist movement is currently choosing the actions that most increase probability of infinite utility, by reducing existential risk.

This is not in conflict with my claim (b). My claim (b) is about the motivation or reasoning by which actions are chosen. That's all I rely on for the inferences in claims (c) and (d).

I think we're mostly in agreement here, except that perhaps I'm more confident that most longtermists are not (currently) motivated by "highest probability of infinite utility".

2Jack Malde22d
Yeah that's fair. As I said I'm not entirely sure on the motivation point. I think in practice EAs are quite fanatical, but only to a certain point. So they probably wouldn't give in to a Pascal's mugging but many of them are willing to give to a long-term future fund over GiveWell charities - which is quite a bit of fanaticism! So justifying fanaticism still seems useful to me, even if EAs put their fingers in their ears with regards to the most extreme conclusion...
rohinmshah's Shortform

I have data in the sense that when I read news articles and check how correct they are, they are usually not very correct. (You can have more nuance than this, e.g. facts about what mundane stuff happened in the world tend to be correct.)

I don't have data  in the sense that I don't have a convenient list of articles and ways they were wrong such that I could easily persuade someone else of this belief of mine. (Though here's one example of an article that you at least have to read closely if you want to not be misled.)

Also, I could justify ignoring th... (read more)

Paper summary: The case for strong longtermism (Hilary Greaves and William MacAskill)

I'm not sure whether you are disagreeing with me or not. My claims are (a) accepting fanaticism implies choosing actions that most increase probability of infinite utility, (b) we are not currently choosing actions based on how much they increase probability of infinite utility, (c) therefore we do not currently accept fanaticism (though we might in the future), (d) given we don't accept fanaticism we should not use "fanaticism is fine" as an argument to persuade people of longtermism.

Is there a specific claim there you disagree with? Or were you riffing off what I said to make other points?

2Jack Malde22d
Yes I disagree with b) although it's a nuanced disagreement. I think the EA longtermist movement is currently choosing the actions that most increase probability of infinite utility, by reducing existential risk. What I'm less sure of is that achieving infinite utility is the motivation for reducing existential risk. It might just be that achieving "incredibly high utility" is the motivation for reducing existential risk. I'm not too sure on this. My point about the long reflection was that when we reach this period it will be easier to tell the fanatics from the non-fanatics.
Paper summary: The case for strong longtermism (Hilary Greaves and William MacAskill)

  1. We figure out how to prevent the heat death of the universe indefinitely. (Technically this doesn't lead to infinite utility, since you could still destroy everything of value in the universe, but by driving the probability of that low enough you can get arbitrarily large amounts of utility, which leads to the same fanatical conclusions; a short sketch of the arithmetic follows this excerpt.)
  2. We figure out that a particular configuration of matter produces experiences so optimized for pleasure that it is infinite utility (i.e. we'd accept any finite amount of torture to create it even for one second).
  3. We discove
... (read more)
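A minimal sketch of the arithmetic behind example 1 above, using illustrative symbols that are not in the original comment (per-period utility $u$ and per-period probability $\varepsilon$ that everything of value is destroyed):

$$
\mathbb{E}[\text{total utility}] = \sum_{t=0}^{\infty} u\,(1-\varepsilon)^{t} = \frac{u}{\varepsilon},
$$

which is finite for every $\varepsilon > 0$ but grows without bound as $\varepsilon \to 0$, so driving the destruction probability ever lower supports the same fanatical conclusions as literally infinite utility.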
2Jack Malde23d
As you said in your previous comment we essentially are increasing the probability of these things happening by reducing x-risk. I'm not convinced we don't tend to reason fanatically in practice - after all Bostrom's astronomical waste [https://www.nickbostrom.com/astronomical/waste.html] argument motivates reducing x-risk by raising the possibility of achieving incredibly high levels of utility (in a footnote he says he is setting aside the possibility of infinitely many people). So reducing x-risk and trying to achieve existential security seems to me to be consistent with fanatical reasoning. It's interesting to consider what we would do if we actually achieved existential security and entered the long reflection. If we take fanaticism seriously at that point (and I think we will) we may well go for infinite value. It's worth noting though that certain approaches to going for infinite value will probably dominate other approaches by having a higher probability of success. So we'd probably decide on the most promising possibility and run with that. If I had to guess I'd say we'd look into creating infinitely many digital people with extremely high levels of utility.
Paper summary: The case for strong longtermism (Hilary Greaves and William MacAskill)

First, denying fanaticism has implausible consequences (see Beckstead and Thomas 2021, Wilkinson 2022) so perhaps we should be fanatical on balance.

I haven't read the paper, but if we accept fanaticism shouldn't we be chasing the highest probability of infinite utility? That seems pretty inconsistent with how longtermists seem to reason (though it probably still leads to similar actions like reducing x-risk, since we probably have to be around in order to affect the world and increase the probability of infinite utility).

1Michael_Wiebe24d
Can you give some examples of infinite utility?
Comparative advantage does not mean doing the thing you're best at

I agree with your post overall, but:

Comparative advantage is a useful concept, but it doesn’t mean "everyone does the thing which they are best at in the world." If this were true, there would be only one chef in the world (the person who is best at being a chef), only one baker, one software engineer, etc.

This is nonsensical for a different reason. There are billions of humans, but only ~thousands (hundreds?) of jobs at the granularity of "chef", "baker", "software engineer". Not everyone will be the best in the world at one of these jobs.

This would seem ... (read more)

2Linch24d
Yes, bwr makes a similar point here [https://forum.effectivealtruism.org/posts/T2jLCYHZxNq8iFTTg/comparative-advantage-does-not-mean-doing-the-thing-you-re?commentId=NbDJrcAmATuSFEYoy] .
How I failed to form views on AI safety

Huh. I'd put assistance games above all of those things (except inner optimization but that's again downstream of the paradigm difference; inner optimization is much less of a thing when you aren't getting intelligence through a giant search over programs). Probably not worth getting into this disagreement though.

How I failed to form views on AI safety

I don't think that my main disagreement with Stuart is about how we'll reach AGI, because critiques of his approach, like this page, don't actually require any assumption that we're in the ML paradigm.

I agree that single critique doesn't depend on the ML paradigm. If that's your main disagreement then I retract my claim that it's downstream of paradigm disagreements.

What's your probability that if we really tried to get the assistance paradigm to work then we'd ultimately conclude it was basically doomed because of this objection? I'm at like 50%, such tha... (read more)

How I failed to form views on AI safety

The issues seem much deeper than that (mostly of the "grain of truth" sort, and from the fact that in CIRL-like formulations, the actual update-rule for how to update your beliefs about the correct value function is where 99% of the problem lies, and the rest of the decomposition doesn't really seem to me to reduce the problem very much

Sounds right, and compatible with everything I said? (Not totally sure what counts as "reducing the problem", plausibly I'd disagree with you there.)

Like, if you were trying to go to the Moon, and you discovered the rocket e... (read more)

8Habryka1mo
Some abstractions that feel like they do real work on AI Alignment (compared to CIRL stuff):

  • Inner optimization
  • Intent alignment vs. impact alignment
  • Natural abstraction hypothesis
  • Coherent Extrapolated Volition
  • Instrumental convergence
  • Acausal trade

None of these are paradigms, but all of them feel like they do substantially reduce the problem, in a way that doesn't feel true for CIRL. It is possible I have a skewed perception of actual CIRL stuff, based on your last paragraph though, so it's plausible we are just talking about different things.
6richard_ngo1mo
I don't think that my main disagreement with Stuart is about how we'll reach AGI, because critiques of his approach, like this page [https://arbital.com/p/updated_deference/], don't actually require any assumption that we're in the ML paradigm. Whether AGI will be built in the ML paradigm or not, I think that CIRL does less than 5%, and probably less than 1%, of the conceptual work of solving alignment; whereas the rocket equation does significantly more than 5% of the conceptual work required to get to the moon. And then in both cases there's lots of engineering work required too. (If AGI will be built in a non-ML paradigm, then getting 5% of the way to solving alignment probably requires actually making claims about whatever the replacement-to-ML paradigm is, which I haven't seen from Stuart.) But Stuart's presentation of his ideas seems wildly inconsistent with both my position and your position above (e.g. in Human Compatible he seems way more confident in his proposal than would be justified by having gotten even 5% of the way to a solution).
How I failed to form views on AI safety

I wish you would summarize this disagreement with Russell as "I think neural networks / ML will lead to AGI whereas Russell expects it will be something else". Everything else seems downstream of that. (If I had similar beliefs about how we'd get to AGI as Russell, and I was forced to choose to work on some existing research agenda, it would be assistance games. Though really I would prefer to see if I could transfer the insights from neural network / ML alignment, which might then give rise to some new agenda.)

This seems particularly important to do when talking to someone who also thinks neural networks / ML will not lead to AGI.

FWIW, I don't think the problem with assistance games is that it assumes that ML is not going to get to AGI. The issues seem much deeper than that (mostly of the "grain of truth" sort, and from the fact that in CIRL-like formulations, the actual update-rule for how to update your beliefs about the correct value function is where 99% of the problem lies, and the rest of the decomposition doesn't really seem to me to reduce the problem very much, but instead just shunts it into a tiny box that then seems to get ignored, as far as I can tell).

rohinmshah's Shortform

Ah, I interpreted that claim as "it's not a huge priority to prevent bad posts from being upvoted, regardless of how that happens", rather than "it's fine for forum members to upvote posts whose conclusions they agree with even if they see that the reasons are bad".

rohinmshah's Shortform

I'd assume that forum members don't notice that the reasoning is bad.

As evidence in favor of this view, at least sometimes after I post such a comment, the post's karma starts to go down, suggesting that the comment informed voters about bad reasoning that they hadn't previously noticed. (Possibly this happened in most of the examples above, I wasn't carefully tracking this and don't know of any way to check now.)

6Stefan_Schubert1mo
Probably yeah, at least in part. Sometimes they may notice it a bit but put insufficient weight on it relative to the fact that they agree with the conclusion. But some may also miss it altogether. My comment was in response to the claim that "to some extent it's OK for bad posts to get upvoted".
rohinmshah's Shortform

Reasonably often (maybe once or twice a month?) I see fairly highly upvoted posts that I think are basically wrong in something like "how they are reasoning", which I'll call epistemics. In particular, I think these are cases where it is pretty clear that the argument is wrong, and that this determination can be made using only knowledge that the author probably had (so it is more about reasoning correctly given a base of knowledge).

Sometimes I write a comment explaining why. If I reliably did this on all of the posts then you could still rely on karma as ... (read more)

I know this is just a small detail and not what you wrote about, but: much of your comment on the recommender systems post hinged on news articles being uncorrelated with the truth. Do you have data to back that up?

I'm replying here because it's a strong claim that's relevant to many things beyond that specific post.

5Max_Daniel1mo
I'm wondering if it'd be good to have something special happen to posts where a comment has more karma than the OP. Like, decrease the font size of the OP and increase the font size of the comment, or display the comment first, or have a red warning light emoji next to the post's title or ... Or maybe the commenter gets a $1,000 prize whenever that happens. Good versions of "something special" would also incentivize the public service of pointing out significant flaws in posts by making comments that have a shot at exceeding the OP's karma score. Obviously "there exists a comment that has higher karma than the OP" is an imperfect proxy of what we're after here, but anecdotally it seems to me this proxy works surprisingly well (though maybe it would stop due to Goodhart issues if we did any of the above) and it has the upside that it can be evaluated automatically.

Agreed. To some extent it's OK for bad posts to get upvoted. But I think the fact that posting volume is so much higher now means we should be able to trade off some of that volume for greater post quality. This could be by having a review process for posts, or reinstating the minimum upvote requirement before a user is allowed to post. I also think there may be some achievable gains that don't require trading off volume, such as improving the upvote strength algorithm.

How much current animal suffering does longtermism let us ignore?

(Strong) longtermists will always ignore current suffering and focus on the future, provided it is vast in expectation

But at the time of the heat death of the universe, the future is not vast in expectation? Am I missing something basic here?

(I'm ignoring weird stuff which I assume the OP was  ignoring like acausal trade / multiverse cooperation, or infinitesimal probabilities of the universe suddenly turning infinite, or already being infinite such that there's never a true full heat death and there's always some pocket of low entropy somewhere, or b... (read more)

4Jack Malde1mo
No you're not missing anything that I can see. When OP says: I think they're really asking: Certainly the closer an impartial altruist is to heat death the less forward-looking the altruist needs to be.
How much current animal suffering does longtermism let us ignore?

Thanks. One response:

  • It read to me that you were upset and offended and you wrote a lot in response.

I wouldn't say I was offended. Even if the author is wrong about some facts about me, it's not like they should know those facts about me? Which seems like it would be needed for me to feel offended?

I was maybe a bit upset? I would have called it annoyance but "slightly upset" is reasonable as a descriptor. For A, B, D and E my reaction feels mostly like "I'm confused why this seems like a decent argument for your thesis", and for C it was more like being upset.

How much current animal suffering does longtermism let us ignore?

As an instrumental thing, I am worried that this sort of post could backfire. 

The original post or my comment?

In either case, why?

1Charles He1mo
I agree with your comment.

  • It read to me that you were upset and offended and you wrote a lot in response.
  • I didn't think the OP seemed good to me, either in content or rhetoric.

Below is a screenshot of a draft of a larger comment that I didn't share until now, raising my concerns. (It's a half-written draft; it just contains fragments of thoughts.) I wish people could see what is possible and what has been costly in animal welfare. I wish they knew how expensive it is to carry around certain beliefs and I wish they could see who is bearing the cost for that.
How much current animal suffering does longtermism let us ignore?

I feel sorely misunderstood by this post and I am annoyed at how highly upvoted it is. It feels like the sort of thing one writes / upvotes when one has heard of these fabled "longtermists" but has never actually met one in person.

That reaction is probably unfair, and in particular it would not surprise me to learn that some of these were relevant arguments that people newer to the community hadn't really thought about before, and so were important for them to engage with. (Whereas I mostly know people who have been in the community for longer.)

Nonetheless... (read more)

5Jack Malde1mo
I upvoted OP because I think comparison to humans is a useful intuition pump, although I agree with most of your criticism here. One thing that surprised me was: Surprised to hear you say this. It is plausible that the EA longtermist community is increasing the expected amount of suffering in the future, but accepts this as they expect this suffering to be swamped by increases in total welfare. Remember one of the founding texts of longtermism [https://www.nickbostrom.com/astronomical/waste.html] says we should be maximising the probability that space colonisation will occur. Space colonisation will probably increase total suffering over the future simply because there will be so many more beings in total. When OP says : My answer is "pretty much yes". (Strong) longtermists will always ignore current suffering and focus on the future, provided it is vast in expectation. Of course a (strong) longtermist can simply say "So what? I'm still maximising undiscounted utility over time" (see my comment here [https://forum.effectivealtruism.org/posts/fo6xBBJpbpeAyQJSj/how-much-current-animal-suffering-does-longtermism-let-us?commentId=8YtKKycSXa9wwC3hj] ).

There is an estimate of 24.9 million people in slavery, of which 4.8 million are sexually exploited! Very likely these estimates are exaggerated, the conditions are not as bad as one would think hearing those words, and even if they were, the conditions might not be as bad as battery cages. But my broader point is that the world really does seem to be very broken: there are problems of huge scale even just restricting to human welfare, and you still have to prioritize, which means ignoring some truly massive problems.

I agree, there is already a ... (read more)

1Nathan_Barnard1mo
I think there is something to the claim being made in the post, which is that longtermism as it currently is is mostly about increasing the number of people in the future living good lives. It seems genuinely true that most longtermists are prioritising creating happiness over reducing suffering. This is the key factor which pushes me towards longtermist s-risk.

Thanks for writing this comment.

2Charles He1mo
I agree with this sentiment. As an instrumental thing, I am worried that posts like the OP could backfire.
Are there any AI Safety labs that will hire self-taught ML engineers?

I would love to hear actual details of these cases (presumably less publicly); at this level of granularity I can't tell to what extent this should change my mind (I can imagine worlds consistent with both your statements and mine).

Are there any AI Safety labs that will hire self-taught ML engineers?

I continue to think that this is primarily a reflection of RSs having more experience than REs, and that a process with a single role and no RS / RE distinction would produce similar outcomes given the same people.

How to become an AI safety researcher

For example, in linear algebra you might have a feeling about some property of a matrix but then you actually have to show it with math.

I would distinguish between "I have an informal proof sketch, or idea for why a theorem should be true, and now I must convert it to a formal proof" and "I am looking at some piece of reality, and have to create mathematical definitions that capture that aspect of reality". These might be sufficiently similar that practicing the former helps the latter, but I suspect they aren't.

Or more relevantly, in Optimal Policies Tend

... (read more)
How to become an AI safety researcher

9 + 2 + 2 ≠ 12? Did someone have some kind of double PhD?

1Andy Jones1mo
nah i just accidentally a word. fixed!
How to become an AI safety researcher

This process of formalization is one of the skills that studying mathematics can help build. 

Huh, really? My experience of studying math is that you are given the formalizations and must derive conclusions from them, which doesn't seem like it would help much for the skill of coming up with good formalizations.

2peterbarnett1mo
That is definitely part of studying math. The thing I was trying to point to is the process of going from an idea or intuition to something that you can write in math. For example, in linear algebra you might have a feeling about some property of a matrix but then you actually have to show it with math. Or more relevantly, in Optimal Policies Tend to Seek Power [https://arxiv.org/abs/1912.01683] it seems like the definition of 'power' came from formalizing what properties we would want this thing called 'power' to have. But I'm curious to hear your thoughts on this, and if you think there are other useful ways to develop this 'formalization' skill.
Are AGI labs building up important intangibles?

I haven't looked into it much, but the PaLM paper has a list of contributions in Appendix A that would be a good starting point.

Are AGI labs building up important intangibles?

You can get an estimate based on how many authors there are on the papers (it's often quite a lot, e.g. 20-40). Though this will probably become less reliable in the future, as such organizations develop more infrastructure that's needed that no longer qualifies as "getting you on the paper", but is nonetheless important and not publicly available.

5Buck1mo
One problem with this estimate is that you don’t end up learning how long the authors spent on the project, or how important their contributions were. My sense is that contributors to industry publications often spent relatively little time on the project compared to academic contributors.
1Raven1mo
Interesting, thanks! Any thoughts on how we should think about the relative contributions and specialization level of these different authors? ie, a world of maximally important intangibles might be one where each author was responsible for tweaking a separate, important piece of the training process. My rough guess is that it's more like 2-5 subteams working on somewhat specialized things, with some teams being moderately more important and/or more specialized than others. Does that framing make sense, and if so, yeah, what do you think?
Against the "smarts fetish"

Huh, I would have taken nearly all of the qualities listed here as a reason to prioritize "smarts", because they seem so correlated with "smarts" to me (exceptions: being driven, interpersonal kindness and respect). Like, if I generate examples of people who are high on the skills listed here, they tend to be among the smartest people I know; and if I generate examples of smart people, each example seems to have many but not all of these qualities.

If I were listing useful-to-EA qualities that were reasons to think less about "smarts", I would include:

  • Willi
... (read more)
4Magnus Vinding1mo
Thanks for your comment and for listing those traits and skills; I strongly agree that those are all useful qualities. :) One might argue that willingness to do grunt work, taking initiative, and mental stamina all belong in a broader "drive/conscientiousness" category, but I think they are in any case important and meaningfully distinct traits worth highlighting in their own right. Likewise, one could perhaps argue that "ability to network well" falls under a broader category of "social skills", in which interpersonal kindness and respect might also be said to fall (as a somewhat distinct trait or ability, cf. the cognitive vs. affective empathy distinction; networking ability probably draws more strongly on cognitive empathy while [genuine] interpersonal kindness probably relies more on affective empathy). A related trait one could list in that category is skill in perspective-taking. Regarding the correlation point, I agree that IQ is likely correlated with many of the traits I listed, but I don't believe [https://forum.effectivealtruism.org/posts/NQ2MjsNWxKPdKgr4L/against-the-smarts-fetish#One_can_have_a_high_IQ_while_still_not_] that this is a strong reason to think that we are not overemphasizing IQ relative to these other traits. Moreover, as noted in another comment, a reason to focus more on these other traits relative to IQ at the level of what we seek to develop individually and incentivize collectively is that many of these other traits and skills probably are more elastic and improvable than is IQ. As for how many of these traits are correlated significantly with IQ, it's worth noting that — beyond "being driven" and "interpersonal kindness" — myside bias (also) appears to show [http://keithstanovich.com/Site/Research_on_Reasoning_files/Stanovich_CDPS_2013.pdf] “very little relation to intelligence”. And I likewise doubt that IQ has much of a correlation with a willingness to face unpleasant and inconvenient conclusions, or resistance to signaling
1FCCC1mo
I think you have to be smart to have all the OP’s listed traits, so sure, there’s going to be correlation. But what’s the phrase? “Science advances one funeral at a time.” If that’s true, then there are plenty of geniuses who can’t bring themselves to admit when someone else has a better theory. That would show that traits 2 and 3 are commonly lacking in smart people, which yes, makes those people dumber than they otherwise would be, but they’re still smart.
Are there any AI Safety labs that will hire self-taught ML engineers?

I don't know of some easy-to-describe bar, but as one anecdote, this post by Matthew Rahtz was easily enough to clear the "should interview" bar, and went most of the way to the "should hire" bar, when I was looking at applicants for the CHAI internship. It would also have been enough to clear the "should interview" bar at DeepMind.

I also like this 80K podcast on the topic, and in general I might recommend looking at my FAQ (though it doesn't cover this question particularly).

Are there any AI Safety labs that will hire self-taught ML engineers?

DeepMind doesn’t hire people without PhDs as research scientists

Basically true (though technically the requirement is "PhD in a technical field or equivalent practical experience")

places more restrictions on what research engineers can do than other places

Doesn't seem true to me. Within safety I can name two research engineers who are currently leading research projects.

DeepMind might be more explicit that in practice the people who lead research projects will tend to have PhDs. I think this pattern is just because usually people with PhDs are better at le... (read more)

4richard_ngo1mo
"DeepMind allows REs to lead research projects" is consistent with "DeepMind restricts REs more than other places". E.g. OpenAI doesn't even officially distinguish RE from RS positions, whereas DeepMind has different ladders with different expectations for each. And I think the default expectations for REs and RSs are pretty different (although I agree that it's possible for REs to end up doing most of the same things as RSs).
Are there any AI Safety labs that will hire self-taught ML engineers?

I'm currently at DeepMind and I'm not really sure where this reputation has come from. As far as I can tell DeepMind would be perfectly happy to hire self-taught ML engineers for the Research Engineer role (but probably not the Research Scientist role; my impression is that this is similar at other orgs). The interview process is focused on evaluating skills, not credentials.

DeepMind does get enough applicants that not everyone makes it to the interview stage, so it's possible that self-taught ML engineers are getting rejected before getting a chance to sh... (read more)

3ElizabethBarnes1mo
I think DM clearly restricts REs more than OpenAI (and I assume Anthropic). I know of REs at DM who have found it annoying/difficult to lead projects because of being REs; I know of someone without a PhD who left Brain (not DeepMind, but still Google, so prob more similar) partly because it was restrictive, and who now leads a team at OAI/Anthropic; and I know of people without an undergrad degree who have been hired by OAI/Anthropic. At OpenAI I'm not aware of it being more difficult for people to lead projects etc because of being 'officially an RE'. I had bad experiences at DM that were ostensibly related to not having a PhD (but could also have been explained by lack of research ability).
5Buck2mo
As I understand it, DeepMind doesn’t hire people without PhDs as research scientists, and places more restrictions on what research engineers can do than other places.
2Ansh R2mo
As someone interested in applying for Research Engineering roles in the near future, what would be your criterion for determining whether someone self-taught is “worth-interviewing”? (Also a question for others who are familiar with hiring practices at various AI safety organizations).
Is it valuable to the field of AI Safety to have a neuroscience background?

To make up a number, I'd expect an AI degree to be ~10x more valuable for technical AI alignment research all else equal (i.e. assuming your personal fit for both is the same). Primarily this is because there are lots of existing philosophy / neuro / cog sci people who want to contribute to AI safety, and my impression is that they have trouble finding useful things to do, and I personally see many more opportunities for people with AI expertise than for people with any of those other degrees.

(Note that for other areas like AI governance I would not make the same claim.)

Community Builder Writing Contest: $20,000 in prizes for reflections

Eh, I'm just pretty happy to claim that many contests can in fact push people to do more valuable things with their time than they would have done otherwise. It's not that hard to think of considerations that most EAs haven't thought about.

Also, by this logic, should we also not have any posts that give advice to EAs? After all, they might pull EAs towards following that advice, even when that's not altruistically / impartially best.

Maybe the idea is that once there's money involved, it is individually rational for EAs to pursue the money from contests ins... (read more)

Early-warning Forecasting Center: What it is, and why it'd be cool

I totally agree that this is a useful case to look into! I just wish that the post actually looked into it, rather than simply stating that it would be useful without justifying it.

(I also think that you could just change the title + thesis of the post to something else, and that would also be a good post.)

2Linch2mo
Thanks for the feedback! I agree the title was not great in that it poorly represented the contents, which is more like "argument via a single extended hypothetical example that was elaborated on." This was an issue that was brought up in the review stage, but I didn't think of a better title after some time and decided that publishing was better than waiting.
Early-warning Forecasting Center: What it is, and why it'd be cool

It sounds like your thesis is:

I argue that advances in short-range forecasting (particularly in quality of predictions, number of hours invested, and the quality and decision-relevance of questions) can be robustly and significantly useful for existential risk reduction, even without directly improving our ability to forecast long-range outcomes, and without large step-change improvements to our current approaches to forecasting itself (as opposed to our pipelines for and ways of organizing forecasting efforts).

But I don't actually see much evidence or arg... (read more)

4JP Addison2mo
Thanks for engaging critically! Upvoted. I agree there's a shift between the title and the first two paragraphs and the rest of the post. I think the post could be clearer about the fact that its entire work is being done by the single hypothetical use-case. However, I'm pretty happy to see a post dig into a single specific case. I agree with Linch's feeling that a lot of forecasting win conditions are based on turning short term into long term forecasting, and was excited to see this post. Indeed, your questions also want to dig into this example, so maybe you agree that this case is useful? OTOH, this post does spend a significant amount of time digging into issues that seem relatively unrelated to convincing me of the question "will something like this work?" and indeed I didn't read them.
Aligning Recommender Systems as Cause Area

These approaches could help! I don't have strong reason to believe that they will, nor do I have strong reason to believe that they won't, and I also don't have strong reason to believe that the existing system is particularly problematic. I am just generally very uncertain and am mostly saying that other people should also be uncertain (or should explain why they are more confident).

Re: deliberative retrospective judgments as a solution: I assume you are going to be predicting what the deliberative retrospective judgment is in most cases (otherwise it wou... (read more)

Community Builder Writing Contest: $20,000 in prizes for reflections

I expect most submissions will take <10 hours to write over the course of 1-3 days.

If memory serves, it took me ~40 hours to write each of these retrospectives (maybe over the course of a month); these were by far the most useful reflections for me to improve my community-building efforts.

7Akash2mo
Thanks for mentioning this, Rohin! I agree that longer write-ups and retrospectives can be valuable. And if someone determines that it's valuable for them to spend 40 hours on a write-up, I'd encourage them to do so. For this contest, I don't want the "norm" or "expectation" to be a 20+ hour write-up. I'm expecting many submissions that take the form "here's an idea that I was already thinking about, and now this contest nudged me to sit down and write it up" or "I sat down and spent a few hours reflecting on X, and here's what I learned." This is partially motivated by finm's comment here [https://forum.effectivealtruism.org/posts/Fm7kyQceewuHirBEd/we-should-run-more-ea-contests?commentId=QTRmEuGGAc2eGgjd4] : Most importantly, I think people entering this contest should ask themselves if spending marginal hours on their entries would be a good use of their time (relative to their counterfactual). My guess is that most entrants would benefit from reflecting for 1-10 hours, and a smaller subset would benefit from reflecting for 10-100 hours.
Is transformative AI the biggest existential risk? Why or why not?

By "all else equal" I meant to ignore questions of personal fit (including e.g. whether or not people have the relevant technical skills). I was not imagining that the likelihoods were similar.

I agree that in practice personal fit will be a huge factor in determining what any individual should do.

6weeatquince2mo
Ah, sorry, I misunderstood. Thank you for the explanation :-)