Thanks for these reflections!
Could you maybe elaborate on what you mean by a 'bad actor'? There's some part of me that feels nervous about this as a framing, at least without further specification -- like maybe the concept could either be applied too widely (e.g. to anyone who expresses sympathy with "hard-core utilitarianism", which I don't think would be right), or be defined really strictly (like only people with dark tetrad traits) in a way that leaves out people who might be likely to (or: have the capacity to?) take really harmful actions.
Thank you for doing this work and for the easy-to-read visualisations!
Thanks Vaidehi -- agree! I think another key part of why it's been useful is that it's just really readable/interesting -- even for people who aren't already invested in the ideas.
Hey! Arden here, also from 80,000 Hours. I think I can add a few things here on top of what Bella said, speaking more to the web content side of the question:
(These are additional to the 'there's a headwind on engagement time' part of Bella's answer above – though I think they're less important than the points Bella already mentioned about a 'covid spike' in engagement time in 2020 and marketing not getting going strongly until the latter half of 2022.)
The career guide (https://80000hours.org/career-guide/) was very popular. In 2019 we deprioritised it in favour of a new 'key ideas series' that we thought would be more appealing to our target audience and more accurate in some ways, and stopped updating the career guide and put notes on its pages saying it was out of date. Engagement time on those pages fell dramatically.
This is relevant because, though engagement time fell due to this in 2019, a 'covid spike' in 2020 (especially due to this article, which went viral: https://80000hours.org/2020/04/good-news-about-covid-19/) masked the effect; when the spike ended we had less baseline core content engagement time.
(As a side note: we figured that engagement time would fall some amount when we decided to switch away from the career guide: ultimately our aim is to help solve pressing global problems, and engagement time is only a very rough proxy for that. We decided it'd be worth taking some hit in popularity to have content that we thought would be more impactful for the people who read it. That said, we're actually not sure it was the right call! In particular, our user survey suggests the career guide has been very useful for getting people started in their career journeys, so we now think we may have underrated it. We are actually thinking about bringing an updated version of the career guide back this year.)
2021 was also a low year for us in terms of releasing new written content. The most important reason for this was having fewer staff. In particular, Rob Wiblin (who wrote our two most popular pieces in 2020) moved to the podcast full time at the start of the year, we lost two other senior staff members, and we were setting up a new team with a first-time manager (me!). As a result, I spent most of the year on team building (hiring and building systems), and we didn't get as much new or updated content out as normal -- and definitely nothing with as much engagement time as some of the pieces from the previous year.
The website's total year-on-year engagement time has historically been greater than that of the other programmes, largely because it's the oldest programme. So it's harder to move its total engagement time in percentage terms.
Also, re the relationship of these figures with marketing, the amount of engagement time with the site due to marketing did go up dramatically over the past 2 years (I'm unsure of the exact figure, but it's many-fold), because it was very low before that. We didn't even have a marketing team before Jan 2022! Though we did do a small amount of advertising, almost all our site engagement time before then was from organic traffic, social media promotions of pieces, word of mouth, etc.
(Noticing I'm not helping with the 'length of answer to your question' issue here, but thought it might be helpful :) )
Thank you Max for all your hard work and all the good you've done in your role. Your colleagues' testimonials here are lovely to see. I think it's really cool you're taking care of yourself and thinking ahead in this way about handing off responsibility - even though I'm sure it's hard.
Good luck with the transition <3
Nice post. One thought on this - you wrote:
"I’d be especially excited for people to spread messages that help others understand - at a mechanistic level - how and why AI systems could end up with dangerous goals of their own, deceptive behavior, etc. I worry that by default, the concern sounds like lazy anthropomorphism (thinking of AIs just like humans)."
I agree that this seems good for avoiding anthropomorphism (in perception and in one's own thought!), but I think it'll be important, when doing this, to emphasise that these are conceivable mechanisms and possible examples rather than the whole risk case. Why? People might otherwise think they've solved the problem once they've ruled out or fixed a particular problematic mechanism, when really they haven't. Or if the more specific mechanistic descriptions end up wrong in some way (as they probably will), the whole case might be dismissed -- even though the argument for risk didn't ultimately depend on those particulars.
(This only applies if you are pretty unconfident about which particular mechanisms will turn out risky vs. safe.)
[written in my personal capacity]
[writing in my personal capacity, but asked an 80k colleague if it seemed fine for me to post this]
Thanks a lot for writing this - I agree with a lot (most?) of what's here.
One thing I'm a bit unsure of is the extent to which these worries have implications for the beliefs of those of us who are hovering more around 5% x-risk this century from AI, and who are one step removed from the bay area epistemic and social environment you write about. My guess is that they don't have much implication for most of us, because (though what you say is way better articulated) some of this is already naturally getting into people's estimates.
e.g. in my case, basically I think a lot of what you're writing about is sort of why for my all-things-considered beliefs I partly "defer at a discount" to people who know a ton about AI and have high x-risk estimates. Like I take their arguments, find them pretty persuasive, end up at some lower but still middlingly high probability, and then just sort of downgrade everything because of worries like the ones you cite, which I think is part of why I end up near 5%.
This kind of thing probably does have the problematic effect of incentivising the bay area folks to have more and more extreme probabilities - so that, to the extent that they care, quasi-normies like me will end up with a higher probability - closer to the truth, in their view - after deferring at a discount.
I fulfil my GWWC pledge by donating each month to the EA Funds Animal Welfare Fund, for the fund managers to distribute as they see fit. I trust them to make better decisions than I would about individual charities' effectiveness, since I don't have that much time/expertise to look into it.
I think the long-run future is incredibly important, and I spend my labour mostly on that. But my guess (though I'm pretty unsure) is that my donations do more good in animal welfare than in longtermism-focused things. Perhaps the new landscape should change that but I haven't made any updates yet.
I also admit to donating a bit extra to The Humane League because they are close to my heart and also seem really effective for animals.
I don't think about this that often, and there was part of me that didn't want to post this because it's not very rigorous! But also maybe others also feel that way and it feels honest to post. (If other people have takes on where I should donate instead though I'm open to hearing them!)
I'm trying out iteratively updating some 80,000 Hours pages that we don't have time to do big research projects on right now. To this end, I've just released an update to https://80000hours.org/problem-profiles/improving-institutional-decision-making/ — our problem profile on improving epistemics and institutional decision making.
This is sort of a tricky page because there is a lot of reasonable-seeming disagreement about what the most important interventions are to highlight in this area.
I think the previous version had some issues:

1. It was confusing, and it was common for readers to come away with very different impressions of the problem area. This seems to be partly because the term "improving institutional decision making" is very broad, and can include a lot of different things.
2. We didn't do a great job of making clear our views about which sub-areas were most promising. This is partly because those views are not that strongly developed! Basically, a lot of people who've thought about it disagree, and we're not confident about who's right. The previous version of the article, though, presented a confident-sounding picture that mostly highlighted forecasting, structured analytic techniques, and behavioural sciences.
3. It was out of date.
4. The opening felt a bit unrealistic.
In the update, I sought to address (1) and (2) by honestly writing that we aren't sure which focus(es) within the broad umbrella area are best, and going through a few of the options that seem most promising to us and to some people we asked for advice. I sought to address (3) and (4) by doing a low-hanging-fruit edit to update the information and writing, and by cutting the opening.
The update was much quicker than most updates we'd make to our problem profiles, so it will be far from perfect. I'd be very happy to get feedback — if you want to suggest changes you can do so here as comments, or leave a comment on this thread. However, I probably won't respond to most comments — as I said above, people have very different views in this area, so I'd be surprised if there weren't a decent amount of disagreement with the update. That said, I still want to hear views (especially if you think perhaps I haven't heard them before), and if there are smaller changes that seem positive I'd be very keen to hear them (e.g. "X is a bad example of the thing you're talking about.")