All of rosehadshar's Comments + Replies

Super cool, thanks for making this!

From Specification gaming examples in AI:

  • Roomba: "I hooked a neural network up to my Roomba. I wanted it to learn to navigate without bumping into things, so I set up a reward scheme to encourage speed and discourage hitting the bumper sensors. It learnt to drive backwards, because there are no bumpers on the back."
    • I guess this counts as real-world?
  • Bing - manipulation: The Microsoft Bing chatbot tried repeatedly to convince a user that December 16, 2022 was a date in the future and that Avatar: The Way of Water had not yet been released.
    • To be honest, I don
...

Glad it's relevant for you! For questions, I'd probably just stick them in the comments here, unless you think they won't be interesting to anyone but you, in which case DM me.

Thanks, this is really interesting.

One follow-up question: who are safety managers? How are they trained, what's their seniority in the org structure, and what sorts of resources do they have access to?

In the bio case it seems that in at least some jurisdictions and especially historically, the people put in charge of this stuff were relatively low-level administrators, and not really empowered to enforce difficult decisions or make big calls. From your post it sounds like safety managers in engineering have a pretty different role.

3
Denis
8mo
Indeed, a Safety Manager (in a small company) or a Safety Department (in a larger company) needs to be independent of the department whose safety they monitor, so that they are not conflicted between Safety and other objectives like, say, an urgent production deadline (of course, in reality they will know people and so on, it's never perfect). Typically, they will have reporting lines that meet higher up (e.g. CEO or Vice President), and this senior manager will be responsible for resolving any disagreements. If the Safety Manager says "it's not safe" and the production department says "we need to do this," we do not want it to become a battle of wills. Instead, the Safety Manager focuses exclusively on the risk, and the senior manager decides if the company will accept that risk. Typically, this would not be "OK, we accept a 10% risk of a big explosion" but rather finding a way to enable it to be done safely, even if it meant making it much more expensive and slower.

In a smaller company or a start-up, the Safety Manager will sometimes be a more experienced hire than most of the staff, and this too will give them a bit of authority.

I think what you're describing as the people "put in charge of this stuff" are probably not the analogous people to Safety Managers. In every factory and lab, there would be junior people doing important safety work. The difference is that in addition to these, there would be a Safety Manager, one person who would be empowered to influence decisions. This person would typically also oversee the safety work done by more junior people, but that isn't always the case.

Again, the difference is that people in engineering can point to historical incidents of oil rigs exploding with multiple casualties, of buildings collapsing, ... and so they recognise that getting Safety wrong is a big deal, with catastrophic consequences. If I compare this to, say, a chemistry lab, I see what you describe. Safety is still very much emphasised and sp

Thanks for the kind words!

Can you say more about how either of your two worries work for industrial chemical engineering? 

Also curious if you know anything about the legislative basis for such regulation in the US. My impression from the bio standards in the US is that it's pretty hard to get laws passed, so if there are laws for chemical engineering it would be interesting to understand why those were plausible whereas bio ones weren't.

Hi Rose,

To your second question first: I don't know if there are specific laws related to e.g. ASTM standards. But there are laws related to criminal negligence in every country. So if, say, you build a tank and it explodes, and it turns out that you didn't follow the appropriate regulations, you will be held criminally liable - you will pay fines and potentially end up in jail. You may believe that the approach you took was equally safe and/or that it was unrelated to the accident, but you're unlikely to succeed with this defence in court - it's like argu...

Good question.

There's a little bit on how to think about the XPT results in relation to other forecasts here (not much). Extrapolating from there to Samotsvety in particular:

  • Reasons to favour XPT (superforecaster) forecasts:
    • Larger sample size
    • The forecasts were incentivised (via reciprocal scoring, a bit more detail here)
    • The most accurate XPT forecasters in terms of reciprocal scoring also gave the lowest probabilities on AI risk (and reciprocal scoring accuracy may correlate with actual accuracy)
  • Speculative reasons to favour Samotsvety forecasts:
    • (Gue
...

Don't apologise, think it's a helpful point!

I agree that the training computation requirements distribution is more subjective and matters more to the eventual output.

I also want to note that while on your view of the compute reqs distribution, the hardware/spending/algorithmic progress inputs are a rounding error, this isn't true for other views of the compute reqs distribution. E.g. for anyone who does agree with Ajeya on the compute reqs distribution, the XPT hardware/spending/algorithmic progress inputs shift median timelines from ~2050 to ~2090, which...

See here for a mash up of XPT forecasts on catastrophic and extinction risk, with Shulman and Thornley's paper on how much governments should pay to prevent catastrophes.

The follow-up project was on AI specifically, so we don't currently have any data that would allow us to transfer directly to bio and nuclear, alas.

I wasn't around when the XPT questions were being set, but I'd guess that you're right that extinction/catastrophe were chosen because they are easier to operationalise.

On your question about what forecasts on existential risk would have been: I think this is a great question.

FRI actually ran a follow-up project after the XPT to dig into the AI results. One of the things we did in this follow-up project was elicit forecasts on a broader range of outcomes, including some approximations of existential risk. I don't think I can share the results yet, but we're aiming to publish them in August!

2
Ulrik Horn
8mo
  Please do! And if possible, one small request from me would be if any insight on extinction vs existential risk for AI can be transferred to bio and nuclear - e.g. might there be some general amount of population decline (e.g. 70%) that seems to be able to trigger long-term/permanent civilizational collapse.

Thanks for this; I wasn't tracking it and it does seem potentially relevant.

Thanks Sanjay!

I agree that both of your bullet points would be good. I also think that the second one is extremely non-trivial - more like something it would be good to have a research team working on than something I could write a section on in a blog post. 

There's a sense in which there are already research team equivalents working on it, insofar as lots of forecasting efforts relate to p(crunch time soon). But from my vantage point it doesn't seem like this community has clarity/consensus around what the best indicators of crunch time soon are, or that there are careful analyses of why we should expect those to be good indicators, and that makes me expect that more work is needed.

Thanks; I hadn't checked the Wikipedia current events page much previously, but I really like it.

Do you have any thoughts on how specifically the Wikipedia stuff is biased? I'm imagining that there isn't a general tendency, and it's more that specific entries are biased in specific ways that are hard to spot if you don't have background knowledge on the area.

9
purplefern
2y
As you said, I would imagine that specific entries probably contain some bias of their original author(s) that is somewhat difficult to spot without background knowledge. But, I am a multilingual person, and one interesting thing that I have noticed is that the same Wikipedia article can have pretty drastically different amounts of information depending on the author’s first language and the nature of the subject. Take for example, the Wikipedia entries for Edith Piaf (a famous French singer). The English Wikipedia entry for Piaf is something like 4000 or 5000 words, whereas the French entry is over 10000 words. The French entry also has more pictures!  Linguistically (and culturally?) speaking, French and English are pretty similar, so you might expect that this content would be easier to translate or to compare. The effect of language and culture on Wikipedia content is much stronger between languages and cultures that are more dissimilar. Take for example, the Wikipedia entries for Himeji Castle which is a famous historical site from 14th century Japan. The English Wikipedia for Himeji Castle is about 3000 words, whereas the Japanese Wikipedia for Himeji Castle is—so ridiculously long that I barely had the patience to come up with this character count—about 50000 characters. That would probably translate to something like 25000 words in English. (And again, the Japanese entry has way more pictures than the English one.) I think the broad implications of this might be that English speaking Wikipedia is biased by predominantly Western views and somewhat subject to international politics, in addition to whatever individual biases of the author in that highly specific context. I bet there is someone out there on the internet who has written a pretty interesting criticism about this, although I am not familiar with any content on it in particular.  My intuition is also that English speaking authors are probably more inclined to be left leaning in terms of their nat

Thanks; I forgot about the headline version. I've now removed it.

Thanks so much for this! If this is pedantry, I am very pro pedantry :)

I think this makes my 'Humans launch 5 objects into space' section sufficiently dubious that I've removed it, but pasting here in the context of your comment:

Humans launch 5 objects into space.

It’s only in the last 8 years that the number of objects launched into space each day has exceeded 1.

2
katriel
2y
Heads up that it's still in the headline version - though I think as an average it's fine and useful to include. 
4
Davidmanheim
2y
To resolve the lack of clarity here, there are lots of launches that have many things on board, so we have fewer launches but many objects launched. (Because averages aren't always typical - and the modal number is zero.)

there seems to be a large variance in how comfortable people are with numbers, but I think this is surmountable

Wanting to flag that my background is entirely qualitative, and I spent many years thinking this meant that I couldn't do things with numbers. I now think this is false, they aren't magic, and you don't need to have deep aptitude for maths/technical training/a background in stats to be able to fiddle around with basic numbers in a way that helps you think about things.

I've changed the wording to make it clearer that I mean deaths per human per minute. I don't want to change it to second; for me dying in the next minute is easier to imagine/take seriously than dying in the next second (though I imagine this varies between people).

1
Wayne_Chang
2y
New phrasing works well!

Yes, you are completely right. I've added 'farmed' now; thanks for picking this up.

Thanks for the link to Saulius' post; it's great and I recommend people check it out.

On the trillion wild birds: yeah you're right, it's too high - should be 100 billion instead. Thanks for the spot; have changed.

The number is on p. 89 in the supplementary materials - but importantly it's just an order of magnitude, rather than a specific estimate. So it's consistent with Tomasik's range.

Yes! Thanks for the spot; updated now.

Thanks for picking this up Wayne!

The mistake I made was the number of people: it should have read 115 other people, not one. I did mean minute, and the number of animals is scaled by 1/116 to get a number of animals per human, rather than by 1/60 to get a number of animals per second.

I've corrected the number now. (Thanks also to someone else who messaged me about the error.)

1
Wayne_Chang
2y
Got it. But I think the phrasing for the number of animals that die is confusing then. Since you say "100 other human [sic] would probably die with me in that minute," the reference is to how many animals would also die during that minute. I think what you want to say is for every human death, how many animals would die, but that's not the current phrasing (and by that logic, the number of humans that would die per human death would be 1, not 100). I'd suggest making everything consistent on a per-second basis as smaller numbers are more relatable. So 1 other human would die with you that second, along with 10 cows, etc.

Thanks Elias, I think you're right.

Isaac, I've tried to make this clearer in the table in the post.

[Also by happy chance this process made me notice that I'd lost all of my footnotes in the process of transferring from google docs, which I've now fixed. Thanks both for indirectly causing me to notice this.]

Yeah, I considered moving more slowly in the way that you suggest. The reasons I'm not doing that feel a bit complicated/hard to articulate, but some of my motivations:

  • Not wanting to be patronising towards people. Making a cal event is not hard; anyone can do it
  • Feeling like 'value this thing enough that someone in the group can make a cal event for it' is a reasonable bar below which it maybe just makes sense for a group to fail
  • Having more trust/faith in groups than I think some other people have. Like, I don't expect that by default everything will work s
...

I think this is interesting.

On whether the moral campaign was about morality:

  • There’s definitely a way of reducing it to economics. At the most zoomed out level, seems likely to me that without industrialisation you don’t get colonial presence and without colonial presence the anti-footbinding campaign doesn’t take off.
  • I don’t think the moral campaigners were interested purely in the empowerment of women, or thought about empowerment in the way we think about it. Seems like there was prejudice and misogyny and national interest, as well as concern for yo
...
5
Wei Dai
2y
Thanks for the response. I don't disagree with anything you say here, and to be clear, I have a lot of both empirical and moral uncertainty about this topic. This makes me think of another parallel: parents forcing kids to practice musical instruments, which a lot of kids also resist, and arguably causes real suffering among the kids who hate doing it. (I'm thinking of places like China where this phenomenon is much more widespread than in the US.) How likely is a "moral campaign" for stopping this to succeed, without some economic force behind it? Another parallel might be forcing kids to go to school and to do homework.

Some other related things I've pulled from my notes are arguments in Brown's review article against Shepherd's view:

  • "99 percent of women married regardless of how many had ever bound. So bound feet were clearly not needed to be able to marry. The BBG data show regional variation in whether being footbound at marriage age led to hypergamy, with significant correlations only for Sichuan (primarily from two counties) and not for North, Central, and Southwest China.17 Nevertheless, about 47 percent of women—even in Sichuan—married to households at the same w
...

Thanks for this point.

I'm actually a bit unsure how true it is that the status element of footbinding was important. Certainly that's an established narrative in the literature (e.g. Shepherd buys it).

Brown, Bossen and Hill have an article I've only skimmed called 'Marriage Mobility and Footbinding in Pre-1949 Rural China: A Reconsideration of Gender, Economics, and Meaning in Social Causation' (link here: https://www.cambridge.org/core/journals/journal-of-asian-studies/article/marriage-mobility-and-footbinding-in-pre1949-rural-china-a-reconsideration-of-g...

0
rosehadshar
2y
Some other related things I've pulled from my notes are arguments in Brown's review article against Shepherd's view: * "99 percent of women married regardless of how many had ever bound. So bound feet were clearly not needed to be able to marry. The BBG data show regional variation in whether being footbound at marriage age led to hypergamy, with significant correlations only for Sichuan (primarily from two counties) and not for North, Central, and Southwest China.17 Nevertheless, about 47 percent of women—even in Sichuan—married to households at the same wealth level as their natal households." * Shepherd's Taiwan data shows earlier marriage for footbound girls, but later marriage could also indicate more economic value to the parental household, so this isn't a clear signal * "most ever-bound women had released their feet before marriage"

Minor point on how you communicate the novelty point: I'm slightly worried about people misreading and thinking 'oh, I have to be super original', and then either neglecting important unoriginal things like reassessing existing work, or twisting themselves into knots to prove how original they are.

I agree with you that all else equal a new insight is more valuable than one others have already had, but as originality is often over-egged in academia, it might be worth paying attention to how you phrase the novelty criterion in particular.
 

I think a list like this might be useful for other purposes too:

  • raising the profile of critiques that matter amongst EAs, thus hopefully improving people's thinking
  • signalling that criticism genuinely is welcome and seen as useful

Personally I'm a bit wary of things like this. A few reactions:

  • You mention that the appropriate number might vary depending on extroversion. I think it should also vary by situation. A student in an average year might meet 20 new people who might be interested in hearing about EA stuff. Someone who works in a steady job in a small company and has young kids might not meet any. A nice thing about donate 10% is that it applies across the board and scales to people's incomes. This sort of thing doesn't.
  • I don't think you're doing a strong version of this, b
...

I think this is a really good summary of what historians might do, thanks Oscar.

One contextual point is that I think 1 and 2 are something like 'central examples of useful things historians might do', rather than something like 'the main things current historians actually do'.

In particular, my outdated impression from when I studied history is that a lot of historical work is very zoomed in source work that may not involve much integration or summarisation. Some of this work is necessary groundwork for 1 and 2; some of it I think comes from specialisation pressures within the field and doesn't produce much value.

I especially like your points on 2 Ramiro, and the distinction between studying history/what historians do. I'm interested in both of these things, and also agree that 'studying history' is vague and ambiguous.

I'm still confused about what contentful things I'm trying to think about, and so I'm using a kind of empty label, 'history', to point at the cloud of stuff I think might be relevant. My hope was that people would interpret 'history' differently, and I'd get a range of answers that might help me think about what I do and don't mean - and that I might...

7
Oscar Buck
2y
(Edit: I wrote this and then realized you are a historian. Leaving it because maybe other people want to know one way of relating history to other fields).

Some thoughts about distinguishing historical research from other research and why it might be valuable.

First, history is exceptionally method agnostic, compared to other fields. Non-specialists (journalists, bankers, English professors) have made major historical contributions. This isn't because the methods used are always basic or non-scientific, but because such a wide variety have proved useful for historians. Historians usually can't go back and gather more information from their subjects, so their methods have to be flexible and change based on the time and place, and trying to define the 'historical method' is a pretty nebulous task. It's something sort of like, 'what evidence exists about this subject & period & place, and what can I trust it to describe accurately?', and then you choose whatever tools from other disciplines make sense to answer the question. This is a pretty Bayesian-friendly mindset compared to other fields.

Second, other disciplines usually begin with an 'object' that they can then apply interpretive methods to, or an object which they can conduct tests on in order to test a theory. Historians, instead, mostly construct historical objects. Most historical questions start as "what was going on with X?" or "why do these other historians disagree about what happened during X?" "How do we periodize this series of events?" and the answers will be "England was developing a working class" or "military records and civilian correspondence tell very different stories about the Civil War" or "there seem to be six distinct stages of US party politics."

Sometimes two historical objects are so closely entangled that it's hard to study just one (how can you understand the Haitian Revolution without knowing what was going on in France?), and it's probably good that historians can tell other research

Thanks for this Michael, I think that's a good point. I've changed those labels to 'US radical right (see definition)' and 'US radical left (see definition)'. Not perfect, but less misleading.

Yes, very happy to respond to messages on this :)

Yeah, I think that's a good point.

I expect there are things other than ability to take risks that it's worth tracking too - like skill acquisition, demonstrable achievements...

Thanks, I enjoyed that post (and it's quite short, for people considering whether to read).

This seems like a useful point, thanks!

It makes me want to give a clarification: the reflections above are just the most important things I happened to learn - not a list of generally most important points to consider when testing fit for research. I think I'd need more research experience to write a good version of the latter thing (though I think my list probably overlaps with it somewhat).

I also want to respond to "you should definitely try [...] before you write off research in general". I think I agree with this, conditional on it being a sensible ide...

2
Gavin
2y
Agree with all of this

Thanks for the responses James, I found them thoughtful and helpful!

A few responses in return:

On your point regarding the methodology you would use to answer these questions, I would definitely be interested to hear more about that as I'll be finalising my research methodology over January.

Quick thoughts:

  • Partly it's just that I don't have a quant background, and so of necessity I would take a qualitative approach. I think for some questions, this would also be the most appropriate approach (to give a straw example, I'm sceptical about measuring broad socia
...

Another one: Alex Hill and Jaime Sevilla, Attempt at understanding the role of moral philosophy in moral progress (on women’s suffrage and animal rights)

Some more recent things:

Also fwiw, I have read the ACE case studies, and I think that the one on environmentalism is pretty high quality, more so than some of the other things listed here. I'd recommend people interested in working on this stuff to read the environmentalism one.


Thanks for this post James! I found it thought provoking.

Overall, I'm still not sure what I make of your claims. There are a few things contributing to this, including:

  • The post is long and I read it quickly.
  • Probably just inferential distance/different background models and ways of thinking. E.g. I've never been involved in a mass movement, I'm a historian by background and would go about these questions in a different way methodologically, etc.
  • More specifically, I'm used to thinking of EA as a social movement, and other social movements as potentially usef
...
3
Jamie_Harris
2y
I think I agree with every point Rose made here. I'll also emphasise though that I think that the post has lots of (1) cool ideas and possibilities worth digging into and (2) snippets of useful empirical evidence.

Thanks for the informal post! I really liked it, and probably wouldn't have read the whole paper.

I have a few thoughts that I'd be curious for your take on.

I'm a bit unsure how generally the papers you're looking at apply to the broad question of changing values. A few intuitions:

  • These papers look at measurable and relatively narrow features of the past, and how far they explain features of the present which are again measurable and relatively narrow. My intuition would be that most of what matters in terms of values is pretty hard to measure, and might no
...
4
Jaime Sevilla
2y
Thank you Rose! You make interesting points, let me try to reason through them: This is a point worth grappling with. And let's be fair - there are many obvious ways in which cultural transmission clearly has had an effect on modern society.  Case in point: Xmas is approaching! And the fact that we have this arbitrary ritual of meeting once a year to celebrate is a very clear and measurable example of a cultural value being passed down through generations. And yet, all these studies fail to find any strong effect in their analyses. What is going on? Here are two possibilities:  a) cultural transmission mostly affects abstract social values like religion or political tendencies but not concrete personal values like trust. b) abstract social values are easy to measure and quantify; concrete personal values elude a precise quantification.  I do certainly think that there is some truth to hypothesis b). It is easy to ask people about religion and note down their answer; it is harder to measure trust except in very superficial ways. The studies on trust rely on imperfect proxies like surveys. I still think that hypothesis a) has more explanatory power. It is consistent with the literature on parental transmission, where my rough impression of the consensus is that children tend to culturally inherit abstract beliefs from their parents but not behaviours, which are mostly dictated by the shared cultural context and genetics. Here is an excerpt from Bryan Caplan's Selfish reasons to have more kids (pp. 45): My impression is that the same goes for politics. I do not know if other belief / attitude clusters have been studied, but I think it would be a safe bet to think they would have found the same pattern. Of course, bear in mind that I am no sociologist. So take everything I am saying with a dose of healthy skepticism.   There is a large corpus of historical analysis studying social movements like the suffragettes or the slavery abolitionists. My bet is that

Some examples of policy stuff RSPers have done:

  • Advising governments directly (including the Czech and UK governments on their COVID-19 response, the Mexican government on its national AI strategy)
  • Networking with relevant government individuals, think tanks etc
  • Attending relevant conferences, like the Biological Weapons Convention
  • Writing papers, written submissions and reports on various policy issues

Some examples of community building related stuff RSPers have done:

...

I think it can be a good fit for either of those groups. Currently most people are more in the academic work category, but we have a few RSPers who are working on more policy engagement style work, and having a fair bit of success.


It's also worth pointing out that plenty of RSPers don't fall neatly into either camp:

  • Policy people sometimes do academic style things and vice versa
  • Lots of RSPers are exploring and haven't yet narrowed down to 'I am definitely going to optimise for policy engagement/some other style of work'
  • There are other buckets of activities that RSPers do: software development, teaching and mentorship, community building
1
rsturrock
3y
Could you describe some of the more policy engagement style / community building projects people have taken on? Would be interested in what people have pursued as that's closer to my area of interest vs more academic. 

Impressiveness: good question, but feels hard to express without going into lots of detail, so I'm going to pass.

Acceptance rate: 9/~150, then 10/~250. We're planning to take 8 in this round. The summer fellowship was 27/~300.

Some support options, briefly:

  • Talking with Owen, the programme director
  • Talking with me or other future project managers on the programme
  • Peer support
  • FHI provides opportunities for coaching and other external support
  • We have various structures that aim to help people with this, like 6-week project cycles, a major project
...
8
Owen Cotton-Barratt
4y
Note that those ratios are [number starting on programme]/[number of applications]. In fact a few people were made offers and declined, so I think on the natural way of understanding acceptance rate it's a little higher.

It could be good if someone wrote an overview of the growing number of fellowships and scholarships in EA (and maybe also other forms of professional EA work). It could include the kind of info given above, and maybe draw inspiration from Larks' overviews of the AI Alignment landscape. I don't think I have seen anything quite like that, but please correct me if I'm wrong.

7
Max_Daniel
4y
One more data point: last year's Summer Research Fellowship had an acceptance rate of 11/~90.
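
For anyone who wants the ratios in this thread as percentages, here is a minimal sketch of the arithmetic. The counts are the approximate ("~") figures quoted in the comments; the dictionary labels and the `rate` helper are made up for illustration and are not official programme names. As noted above, these are [number starting]/[number of applications], so the rate of offers made would be a little higher.

```python
# Approximate figures quoted in this thread; the application counts are rough ("~").
# Labels are illustrative only, not official programme names.
cohorts = {
    "first RSP figure quoted": (9, 150),
    "second RSP figure quoted": (10, 250),
    "summer fellowship": (27, 300),
    "last year's Summer Research Fellowship": (11, 90),
}


def rate(started: int, applied: int) -> float:
    """Share of applicants who started the programme (not the share given offers)."""
    return started / applied


rates = {name: rate(started, applied) for name, (started, applied) in cohorts.items()}

for name, r in rates.items():
    print(f"{name}: {r:.1%}")
```

On these rough inputs the rates come out at about 6%, 4%, 9% and 12% respectively.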

Thanks for this! I agree that a lot of the value of RSP won't become obvious until after the programme (and also want to flag that as our first cohort is only finishing this autumn, it's still quite uncertain how large this value will be).

At this stage, the best information we have on how things will shape up for scholars after the programme is what our first cohort have lined up to do immediately after the programme - see here.

Strong agree, thanks for pointing this out Ollie