Hi Mayowa, I agree that open-source safety is a big (and, I think, overlooked) AI safety problem. We can be fairly confident that hacker groups are currently using, or will soon start using, local LLMs for cyber-attacks.
Your idea is neat, but I'm worried that in practice it would be easy for actors (especially ones with decent technical capabilities) to circumvent these defences. You already mention the potential to simply alter the data stored in plaintext, but I think the same would be possible with other methods of tracking the safety state. E.g. with steganography, the attacker could occasionally use a different model to reword/rewrite outputs and thereby remove the hidden information about the safety state.
I'd say it's a medium-sized deal. Academics can often propose ideas and show that they work on smaller (e.g. 7B) models. However, it then requires someone with a larger compute budget to like the idea and results and to implement it at a larger scale.
There are some areas where access to compute is less important, like MechInterp, red-teaming, creating benchmarks, or more theoretical areas of AI research. Areas are more amenable to academic research if they don't require training frontier models: e.g. inference or small fine-tuning runs on frontier models are actually not super expensive and can be done by academic labs. Also, some areas of research can be done well on smaller models (MechInterp again being an example), so it's fine if your university doesn't have many GPUs.
But my experience (and also that of some others I know) was that I would regularly think of experiments or research ideas that I didn't end up running or pursuing because I didn't think I had the necessary compute.
Thanks for the pointer, Henry! It motivated me to look into culling more, and I just wanted to share some EU-specific facts I found:
A hen produces ~350 eggs over her lifetime, so consuming one egg corresponds to ~1/350th of one culled male chick. 28% of laying hens in Europe come from in-ovo-sexed eggs, with Germany at ~80%. The numbers are lower for organic eggs because, for some reason, in-ovo sexing was forbidden for organic eggs until this year (stupid much???).
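As a rough back-of-the-envelope (my own simplification: assuming the in-ovo-sexing share applies uniformly to the eggs you buy), the expected number of culled male chicks per egg consumed is

$$\frac{1 - \text{in-ovo share}}{350} \approx \frac{1 - 0.28}{350} \approx 0.002,$$

i.e. roughly one culled chick per ~490 eggs at the EU average, or per ~1,750 eggs in Germany.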
Overall, I find it difficult to weigh male-chick culling morally. Do the chicks have strong conscious experience at that age? How much suffering is involved in their deaths?
Thanks a lot! These seem like very significant issues, and they updated me to only eat 2 eggs/month instead of 2/week. I was surprised to read that chickens (even organic ones) can be kept inside for 5–6 months per year. Also, the welfare issues of chickens bred for egg laying seem pretty bad (e.g. weaker bone structure, immune systems, and social behavior).
Thanks for writing this, Kat! While I don't agree with everything, the core argument (cluelessness about nutritional science means ancestral diets are a strong prior) was convincing to me.
I wanted to note how I updated my diet based on this post and an additional ~3 hours of research:
- 100g/week of sardines (for the reasons here)
- 150g/week of mussels: I agree with the post that they are unlikely to be sentient
- 2 eggs/week: My guess is that EU welfare level 0 (organic) actually means chickens possibly have a net-positive life. Lmk if you know of welfare concerns with organic eggs in the EU!
- Cow liver once per month: In order to cover the "red-meat" food group, I'm adding some cow meat, as it seems to cause the lowest suffering per kg among commonly available animals. Why liver and not normal beef? Firstly, it has higher nutrient density, so you need less of it. Secondly, organs were regularly eaten by our ancestors, so the ancestral prior is strong. Thirdly, livers are a byproduct of normal meat production, and organ meat is often discarded due to low demand; buying livers thus likely doesn't increase demand for cows much.
I was already occasionally eating cheese beforehand; otherwise, yoghurt/kefir might also look good.
I'm happy to be convinced of changing this based on new evidence!
It's a great question. I see Safety Cases more as a meta-framework in which you can use different kinds of evidence. Other risk management techniques can be used as evidence within a Safety Case (e.g. this paper uses a Delphi method).
Also, I think Safety Cases are attractive to people in AI Safety because:
1) They offer flexibility in the kinds of evidence and reasoning that are allowed. From skimming, it seems to me that many of the other risk management practices you linked are stricter about the kinds of arguments or evidence that can be brought.
2) They strive to comprehensively prove that overall risk is low. I think most of the other techniques don't let you make claims such as "overall risk from a system is <x%" (which AI Safety people want).
3) I might be wrong here, but it seems to me that many other risk management techniques require you to understand the system and its environment decently well, whereas this is very difficult for AI Safety.
Overall, you might well be right that other risk management techniques have been overlooked and we shouldn't just focus on Safety Cases.