How much of this is lost by compressing to something like: virtue ethics is an effective consequentialist heuristic?
I've bought into that idea for a long time. As Shaq says, 'Excellence is not a singular act, but a habit. You are what you repeatedly do.'
We can also make analogies to martial arts, music, sports, and other practice/drills, and to aspects of reinforcement learning (artificial and natural).
Simple, clear, thought-provoking model. Thanks!
I also faintly recall hearing something in this vicinity: apparently some volunteering groups get zero (or even negative!?) value from many/most volunteers, but engaged volunteers dominate donations, so it's worthwhile bringing in volunteers and training them anyway! (citation very much needed)
Nitpick: are these 'externalities'? I'd have said, 'side effects'. An externality is a third-party impact from some interaction between two parties. The effects you're describing don't seem to be distinguished by being third-party per se (I can imagine glossing them as such but it's not central or necessary to the model).
Yeah. I also sometimes use 'extinction-level' if I expect my interlocutor not to already have a clear notion of 'existential'.
Point of information: at least half the funding comes from Schmidt Futures (not OpenAI), though OpenAI are publicising and administering it.
Another high(er?) priority for governments:
I think this is a good and useful post in many ways, in particular laying out a partial taxonomy of differing pause proposals and gesturing at their grounding and assumptions. What follows is a mildly heated response I had a few days ago, whose heatedness I don't necessarily endorse but whose content seems important to me.
Sadly this letter is full of thoughtless remarks about China and the US/West. Scott, you should know better. Words have power. I recently wrote an admonishment to CAIS for something similar.
...The biggest disadvantage of pausing for a long
I think that the best work on AI alignment happens at the AGI labs
Based on your other discussion, e.g. about public pressure on labs, it seems like this might be a (minor?) load-bearing belief?
I appreciate that you qualify this further in a footnote
This is a controversial view, but I’d guess it’s a majority opinion amongst AI alignment researchers.
I just wanted to call out that I weakly hold the opposite position, and also opposite best guess on majority opinion (based on safety researchers I know). Naturally there are sampling effects!
This is a margi...
This is an exemplary and welcome response: concise, full-throated, actioned. Respect, thank you Aidan.
Sincerely, I hope my feedback was all-things-considered good from your perspective. As I noted in this post, I felt my initial email was slightly unkind at one point, but I am overall glad I shared it - you appreciate my getting exercised about this, even over a few paragraphs!
...It’s important to discuss national AI policies which are often explicitly motivated by goals of competition without legitimizing or justifying zero-sum competitive mindsets which can unde
(Prefaced with the understanding that your comment is to some extent devil's advocating and this response may be too)
both the US and Chinese governments have the potential to step in when corporations in their country get too powerful
What is 'step in'? I think when people are describing things in aggregated national terms without nuance, they're implicitly imagining govts either already directing, or soon/inevitably appropriating and directing (perhaps to aggressive national interest plays). But govts could just as readily regulate and provide guidance...
Thanks Ben!
Please don't take these as endorsements that this thinking is correct, just that it's what I see when I inspect my instincts about this
Appreciated.
These psychological (and real) factors seem very plausible to me for explaining why mistakes in thinking and communication are made.
maybe we can think of the US companies as simultaneously closer friends and closer enemies with each other?
Mhm, this seems less lossy as a hypothetical model. Even if they were only 'closer friends', though, I don't think it's at all clearcut enough for it to be a...
Just in case we're out of sync, let's briefly refocus on some object details
China has made several efforts to preserve their chip access, including smuggling, buying chips that are just under the legal limit of performance, and investing in their domestic chip industry.
Are you aware of the following?
If...
Interesting. I'd love to know if you think the crux schema I outlined is indeed important? I mean this:
How quickly/totally/coherently could US gov/CCP capture AI talent/artefacts/compute within its jurisdiction and redirect them toward excludable destructive ends? Under what circumstances would they want/be able to do that?
Correct me at any point if I misinterpret: I read that, on the basis of answers to something a bit like these, you think an international competition/race is all but inevitable? Presumably that registers as terrifically dangerous for...
Thanks for this thoughtful response!
this tendency leads to analysis that assumes more coordination among governments, companies, and individuals in other countries than is warranted. When people talk about "the US" taking some action... more likely to be aware of the nuance this ignores... less likely to consider such nuances when people talk about "China" doing something
This seems exactly right and is what I'm frustrated by. Though, going further than you give credit (or un-credit) for, I frequently come across writing or talk about "US success in AI", "...
The stream cut out, but there are longer versions available e.g. https://www.youtube.com/watch?v=Dg-rKXi9XYg
Great read, and interesting analysis. I like encountering models for complex systems (like community dynamics)!
One factor I don't think was discussed (maybe the gesture at the possible inadequacy of the model encompasses this) is the duration of scandal effects. E.g. imagine a group claiming to be the Spanish Inquisition, the Mongol Horde, or the Illuminati tried to get stuff done. I think (assuming they were taken seriously) they'd encounter lingering reputational damage more than one year after the original scandals! Not sure how this models out; I'm not planning to d...
OpenAI as a whole, and individuals affiliated with or speaking for the org, appear to be largely behaving as if they are caught in an overdetermined race toward AGI.
What proportion of people at OpenAI believe this, and to what extent? What kind of observations, or actions or statements by others (and who?) would change their minds?
Great post. I basically agree, but in a spirit of devil's advocating, I will say: when I turn my mind to agent foundations thinking, I often find myself skirting queasily close to concepts which feel also capabilities-relevant (to the extent that I have avoided publicly airing several ideas for over a year).
I don't know if that's just me, but it does seem that some agent foundations content from the past has also had bearing on AI capabilities - especially if we include decision theory stuff, dynamic programming and RL, search, planning etc. which it's arg...
Thank you for sharing this! Especially the points about relevant maps and Meta/FAIR/LeCun.
I was recently approached by the UK FCDO as a technical expert in AI with perspective on x-risk. We had what I think were very productive conversations, with an interesting convergence of my framings and the ones you've shared here - that's encouraging! If I find time I'm hoping to write up some of my insights soon.
I've given a little thought to this hidden qualia hypothesis but it remains very confusing for me.
To what extent should we expect to be able to tractably and knowably affect such hidden qualia?
This is beautiful and important, Tyler - thank you for sharing.
I've seen a few people burn out (and come close myself), and I have made a habit of gently making and reinforcing this sort of point socially (far less eloquently), in various contexts.
I have a lot of thoughts about this subject.
One thing I always embrace is silliness and (often self-deprecating) humour, which are useful antidotes to stress for a lot of people. Incidentally, your tweet thread rendition of the Egyptian spell includes
...I am light heading for light. Even in the dark, a fi
Seconded/thirded on Human Compatible being near that frontier. I did find its ending 'overly optimistic' in the sense of framing things like 'but lo, there is a solution!', while other similar resources like Superintelligence and especially The Alignment Problem seem more nuanced, presenting uncertain proposals for paths forward not as oven-ready but as preliminary and speculative.
I think it's a staircase? Maybe like climbing upwards to more good stuff. Plus some cool circles to make it logo ish.
I'm intrigued by this thread. I don't have an informed opinion on the particular aesthetic or choice of quiz questions, but I note some superficial similarities to Coursera, Khan Academy, and TED-Ed, which are aimed mainly at professional-age adults, students of all ages, and youth/students (without excluding adults), respectively.
Fun/cute/cartoon aesthetics do seem to abound these days in all sorts of places, not just for kids.
My uninformed opinion is that I don't see why it should put off teenagers (talented or otherwise) in particular, but I weakly agree that if something is explicitly pitched at teenagers, that might be offputting!
It looks like I got at least one downvote on this comment. Should I be providing tips of this kind in a different way?
I've considered a possible pithy framing of the Life Despite Suffering question as a grim orthogonality thesis (though I'm not sure how useful it is):
We sometimes point to the substantial majority's revealed preference for staying alive as evidence of a 'life worth living'. But perhaps 'staying-aliveness' and 'moral patient value' can vary more independently than that claim assumes. This is the grim orthogonality thesis.
An existence proof for the 'high staying-aliveness x low moral patient value' quadrant is the complex of torturer+torturee, which quite cl...
I'm shocked and somewhat concerned that your empirical finding is that so few people have encountered or thought about this crucial consideration.
My experience is different: maybe 70% of the AI x-risk researchers I've discussed this with have been somewhat au fait with the notion that we might not know the sign of future value conditional on survival. But I agree that it seems people (myself included) have a tendency to slide off this consideration or hope to defer its resolution to future generations, and my sample size is quite small (a few dozen maybe) and quit...
My anecdata is also that most people have thought about it somewhat, and "maybe it's okay if everyone dies" is one of the more common initial responses I've heard to existential risk.
But I agree with OP that I more regularly hear "people are worried about negative outcomes just because they themselves are depressed" than "people assume positive outcomes just because they themselves are manic" (or some other cognitive bias).
Typo hint:
"10<sup>38</sup>" hasn't rendered how you hoped. You can use <dollar>10^{38}<dollar> which renders as
Got it, I think you're quite right on one reading. I should have been clearer about what I meant, which is something like
E.g. imagine a minor steelification (which loses the aesthetic and rhetorical strength) like "nobody's positive wellbeing (implicitly stemming from their freedom) can/should be cel...
It's possible the selection bias is high, but I don't have good evidence for this besides personal anecdata. I don't know how many people are relevantly similar to me, and I don't know how representative we are of the latest EA 'freshers', since dynamics will change and I'm reporting with several years' lag.
Here's my personal anecdata.
Since 2016, around when I completed undergrad, I've been an engaged (not sure what counts as 'highly engaged') longtermist. (Before that point I had not heard of EA per se but my motives were somewhat proto EA and I wanted to...
I just wanted to state agreement that it seems a large number of people largely misread Death with Dignity, at least according to what seems to me the most plausible intended message: mainly about the ethical injunctions (which are very important for a finitely-rational and prone-to-rationalisation being), as Yudkowsky has written about in the past.
The additional detail of 'and by the way this is a bad situation and we are doing badly' is basically modal Yudkowsky schtick and I'm somewhat surprised it updated anyone's beliefs (about Yudkowsky's beliefs, and th...
I wrote something similar (with more detail) about the Gato paper at the time.
I don't think this is any evidence at all against AI risk though? It is maybe weak evidence against 'scaling is all you need' or that sort of thing.
Thanks Rohin, I second almost all of this.
Interested to hear more about why long-term credit assignment isn't needed for powerful AI. I think it depends how you quantify those things and I'm pretty unsure about this myself.
Is it because there is already loads of human-generated data which implicitly embody or contain enough long-term credit assignment? Or is it that long-term credit assignment is irrelevant for long-term reasoning? Or maybe long-term reasoning isn't needed for 'powerful AI'?
OK, this is the terrible terrible failure mode which I think we are both agreeing on (emphasis mine)
the perceived standard of "you have to think about all of this critically and by your own, and you will probably arrive to similar conclusions than others in this field"
By 'a sceptical approach' I basically mean 'the thing where we don't do that'. Because there is not enough epistemic credit in the field, yet, to expect all (tentative, not-consensus-yet) conclusions to be definitely right.
In traditional/undergraduate mathematics, it's different - al...
I feel like while “superintelligent AI would be dangerous” makes sense if you believe superintelligence is possible, it would be good to look at other risk scenarios from current and future AI systems as well.
I agree, and I think there's a gap for thoughtful and creative folks with technical understanding to contribute to filling out the map here!
One person I think has made really interesting contributions here is Andrew Critch, for example on Multipolar Failure and Robust Agent-Agnostic Processes (I realise this is literally me sharing a link without m...
I’m fairly sure deep learning alone will not result in AGI
How sure? :)
What about some combination of deep learning (e.g. massive self-supervised) + within-context/episodic memory/state + procedurally-generated tasks + large-scale population-based training + self-play...? I'm just naming a few contemporary 'prosaic' practices which, to me, seem plausibly-enough sufficient to produce AGI that it warrants attention.
I was one of the facilitators in the most recent run of EA Cambridge's AGI Safety Fundamentals course, and I also have professional DS/ML experience.
In my case I very deliberately emphasised a sceptical approach to engaging with all the material, while providing clarifications and corrections where people's misconceptions are the source of scepticism. I believe this was well-received by my cohort, all of whom appeared to engage thoughtfully and honestly with the material.
I think this is the best way to engage, when time permits, because (in brief)
Hey, as someone who also has professional CS and DS experience, I found this a really welcome and interesting read. I have all sorts of thoughts, but one main question
...So I used the AGISF Slack to find people who already had a background in machine learning before getting into AI safety and asked them what had originally convinced them. Finally, I got answers from 3 people who fit my search criteria. They mentioned different sources of first hearing about AI safety (80,000 Hours and LessWrong), but all three mentioned one same source that had de
It's not EA but I have a soft spot for Good King Wenceslas (https://en.m.wikipedia.org/wiki/Good_King_Wenceslas)
It's a Christmas hymn about a rich prince who was busy striding around and giving to the poor, and it ends by saying all good Christians 'wealth or rank possessing' should do the same. It's a cracking tune and it means that at least once per year, most Anglican churchgoers will get reminded of those words.
The story is medieval but the particular text comes out of the Victorian charity movement which, at its best, was vaguely proto EA and proto progress studies in many ways.
Just seconding this. For context I work not in academia but as a software engineer and data scientist in London.
I usually have crazy sticky-up hair that sort of does different things each day especially as it grows. That's my main superficial weirdness (unless you count the unusually big nose) though I have plenty of other quirks which are harder to label and harder to spot from a distance.
In hindsight I think the hair has made me memorable and recognisable in my workplaces (e.g. people have expressed looking forward to seeing me and my hair in meetings......
Thank you, I found myself agreeing with most of this post and reflecting on how I might have optimised during my undergrad experience. On the other hand, I note that neither the post nor any of the comments yet contains what I consider an important caveat:
Taking extra classes is a great way to explore in the sense of dissolving known- and unknown-unknowns (what fits me? what problem-framings am I missing? what tools do other disciplines have? what concerns do people interested in X have? what even is there if I look further?)
Extra-curricular activities also enabl...
Yes yes, more strength to this where it's tractable and possible backfires are well understood and mitigated/avoided!
One adjacent category which I think is helpful to consider explicitly (I think you have it implicit here) is 'well-informedness', which I'd argue is distinct from 'intelligence' or 'wisdom'. One could be quite wise and intelligent but crippled or even misdirected if the information available/salient is limited or biased. Perhaps this is countered by an understanding of one's own intellectual and cognitive biases, leading to appropriate ('wise...
It depends what media type you're talking about (audio, video, display, ...) - $6m/100m is a $60 CPM ('cost per mille'), which is certainly over the odds for similar 'premium video' advertising, but only by maybe 2-5x. For other media like audio and display the CPMs can be quite a bit lower, and if you're just looking to reach 'someone, somewhere' you can get a bargain via programmatic advertising.
I happen to work for a major demand-side platform in real-time ad buying and I've been wondering if there might be a way to efficiently do good this way. The pricing can be quite nuanced. Haven't done any analysis at this point.
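For concreteness, a minimal sketch of the CPM arithmetic above (figures purely illustrative, not real campaign data):

```python
def cpm(total_cost_usd: float, impressions: int) -> float:
    """Cost per mille: cost per thousand impressions."""
    return total_cost_usd / impressions * 1_000

# $6m spent to reach 100m people works out to a $60 CPM
print(cpm(6_000_000, 100_000_000))  # 60.0
```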
Hey, let me know if you'd like another reviewer. I'm a medium-experienced senior software engineer whose professional work and side-projects use various proportions of open-source and proprietary software. And I enjoy reviewing/proof-reading :)
I appreciated your detailed analysis of the fire alarm situation along with evidence and introspection notes.
I'm not sure if it opens up any action-relevant new hypothesis space, but one feature of the fire alarm situation which I think you did not analyse is that commonly people are concerned also for the welfare of their fellows, especially those who are close by. This makes sense: if you find yourself in a group, even of strangers (and you've reached consensus that you're not fighting each other) it will usually pay off to look out for each other! So pe...
This was a great read, thank you - I especially valued the multiple series of illustrating/motivating examples, and the several sections laying out various hypotheses along with evidence/opinion on them.
I sometimes wonder how evolution ended up creating humans who are sometimes nonconformist, when it seems socially costly, but I think a story related to what you've written here makes sense: at least one kind of nonconformity can sometimes shift a group consensus from a fatal misinterpretation to an appropriate and survivable group response (and furthermore...
Thanks for these very helpful insights! I thought the mosaic charts were particularly creative and visually insightful.
I have one minor statistical nit and one related question.
In cases where 'only one significant difference was found' (at 95% confidence), it could be worth noting that you have around 20 categories... so on average one spurious significant difference is to be expected! (At least if the difference is small.) A minimal sketch of that arithmetic follows below.
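(Spelling it out, assuming ~20 roughly independent tests at the 5% level - the exact count is my assumption from the "around 20 categories" above:)

```python
n_tests = 20   # roughly one test per category (assumed)
alpha = 0.05   # 95% confidence => 5% false-positive rate per test

# Expected number of spurious 'significant' differences under the null
print(n_tests * alpha)  # 1.0

# Probability of at least one spurious hit across all tests
print(1 - (1 - alpha) ** n_tests)  # ~0.64
```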
Also a question about how the significance test was carried out: for calling a difference significant at 95% it mat...
To the extent that you are concerned about intrinsically-multipolar negative outcomes (that is, failure modes which are limited to multipolar scenarios), AI safety which helps only to narrowly align individual automated services with their owners could help to accelerate such dangers.
Critch recently outlined this sort of concern well.
A classic which I personally consider to be related is Meditations on Moloch
Sure, take it or leave it! I think for the field-building benefits it can look more obviously like an externality (though I-the-fundraiser would in fact be pleased and not indifferent, presumably!), but the epistemic benefits could easily accrue mainly to me-the-fundraiser (of course they could also benefit other parties).