
This draft report doesn’t have many actionable insights, and is >5000 words, so I don’t recommend that many people read it. It might be interesting to anyone considering collaboration on standards at AI labs. The report isn’t necessarily an endorsement of the view that improving lab standards is one of the best courses of action in AI governance. I’m open to the idea that trying to make the current lab development paradigm safer may be the least bad option, and also open to the idea that some form of globally coordinated pause and/or international exclusive AGI research institution could be a better option.

Short Summary

Recent steps which facilitate coordination on safety practices in the AI industry, such as the creation of the Frontier Model Forum and the US AI Safety Institute Consortium, are promising, but there’s still room for improvement.

Openly publishing information on safety practices where possible is typically the most beneficial approach for industry-wide safety, as the information is made available to the greatest number of people. In cases where this isn’t desirable, such as with practices which entail higher collaboration risk, other forms of collaboration are also recommended, such as jointly developing best practices within an industry body.

Most of the examined safety practices (risk assessments, red teaming, third-party audits, incident reporting, segregation of duties and background checks) were judged to have low downside risk from collaboration, but some (model evaluations, alignment techniques, cybersecurity) may entail higher risks.

The most significant potential drawback of collaboration from an existential safety perspective is the connection between safety and capabilities advancement. The most significant potential drawbacks from an individual lab perspective are several legal issues such as liability and antitrust violations.

Collaboration on safety practices could reduce extreme risks from AI systems more than safety improvements at just one lab, but this depends on the number of frontier labs and their willingness to improve their safety practices.

The key recommendations of this report are: to pay more attention to collaboration, and to be more willing both to engage in it and to fund it; and to employ different methods of collaboration depending on the needs of individual safety practices.

Longer Summary

As a rapidly emerging general-purpose technology with the potential to transform the economy and boost technological progress, artificial intelligence heralds unprecedented opportunities, alongside societal and civilisational risks. The governance of AI has been discussed at the highest levels of national and international institutions[1][2][3][4], but it is still a very young field, lacking rigorous safety practices and in some cases any safety practices at all. In this context, despite the unique nature of AI, it may be relevant to examine how firms collaborate on safety practices[5] in other industries, and how frontier AI labs might improve on this aspect going forward.

The report begins by examining how collaboration on safety practices occurs in the aviation, energy, pharmaceutical and cybersecurity industries. It then turns to contemporary safety collaboration in the AI industry, before discussing the methods of collaboration identified in the preceding sections. The report goes on to introduce various practices which may benefit from collaboration, some potential drawbacks of collaboration, and measures which may mitigate them. Finally, the potential impact on AI safety is discussed.

The report concludes that recent steps which facilitate coordination on safety practices in the AI industry are promising, but there’s plenty of room for improvement. An examination of collaboration on safety practices in other sectors demonstrates that frontier AI labs could, for example, publish more safety practices, participate in more safety-focused events and form international industry bodies.

The report also argues that openly publishing information on safety practices where possible is typically the most beneficial approach for industry-wide safety, and in cases where this isn’t desirable, other forms of collaboration are also recommended. Labs which consider themselves to be utilising industry-leading safety practices should therefore seek to publish them in sufficient detail that they can be emulated, or share their knowledge through another channel in cases where publishing best practices may entail risks for the sharing company or the public as a whole.

Most of the safety practices were judged to have low downside risk from collaboration. The practices which entail lower collaboration risk include risk assessments, red teaming, third-party audits, incident reporting, segregation of duties and background checks. Best practices in these areas can typically be shared widely, for example through publishing the information or sharing it within a large industry body.

Practices with higher collaboration risk, such as model evaluations, alignment techniques, and cybersecurity, likely require a more nuanced approach to collaboration, and in many cases sharing best practices in these areas should be done on a more exclusive basis, such as directly or within a more exclusive industry body. The practicalities of how collaboration on each safety practice might take place could be researched further.

The most significant potential drawback of collaboration from an x-risk-reduction perspective is the connection between safety and capabilities research. The most significant potential drawbacks from an individual lab perspective are legal complications such as liability and antitrust violations. Potential mitigations are discussed, and it seems like the potential benefits of collaboration outweigh the potential drawbacks. However, further study of particular drawbacks could provide more clarity on this issue.

The potential impact of collaboration on safety practices on the reduction of extreme risks from AI systems is judged to be substantial. From the perspective of individual labs it is likely to be as significant as unilateral improvements in safety practices within those labs, and from the perspective of the industry as a whole the impact on safety is likely to be greater than improvements at just one firm. The question of how to ensure that best practices in AI safety swiftly propagate across the industry should therefore be a prominent one which safety researchers, labs and regulators alike should seek to answer alongside developing the actual practices.

Practices which may benefit from collaboration

Risk assessment involves the identification and analysis of risks. There are existing best practices for risk assessment in other industries which could be coordinated upon by AI companies[6]. Additionally, new forms of risk assessment tailored to the unique risks arising from advanced AI systems will likely have to be developed, and this may be a task which is particularly suitable for collaboration among frontier AI labs.
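One common format from other industries is the likelihood-impact risk matrix, which labs could adapt. The sketch below is purely illustrative; the 1-5 scales and the band thresholds are arbitrary examples, not an established standard:

```python
# Illustrative likelihood x impact risk matrix; scales and thresholds
# are arbitrary assumptions for this sketch, not an industry standard.

def risk_score(likelihood: int, impact: int) -> int:
    """Both inputs on a 1-5 scale; higher means more likely / more harmful."""
    return likelihood * impact

def risk_band(score: int) -> str:
    """Map a raw score onto a coarse band for prioritisation."""
    if score >= 15:
        return "high"
    if score >= 6:
        return "medium"
    return "low"

# Example: an unlikely (2/5) but highly harmful (5/5) failure mode.
score = risk_score(2, 5)
print(score, risk_band(score))  # 10 medium
```

Agreeing on shared scales and thresholds like these is precisely the kind of low-stakes standardisation an industry body could coordinate.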

Model evaluation (evals) is a rapidly growing area of safety research which involves designing experiments to test the properties and outputs of AI models in different circumstances[7]. Safety evals have received significant attention from leading AI companies and even governments (citations), but best practices have yet to be established, and the way in which evals are conducted varies considerably between labs. Model evals are another area ripe for cooperation, although in some cases collaboration may alert labs to dangerous but commercially useful capabilities. On the other hand, establishing best practices for evals could reduce the risk that an evals process itself produces a catastrophic scenario.
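To illustrate the kind of procedure labs might standardise, here is a minimal, purely hypothetical sketch of an eval harness. The grading rule, prompts, and stub model are invented stand-ins for a real model API and vetted prompt sets:

```python
# Minimal sketch of a safety eval harness. Everything here is hypothetical:
# a real eval would call an actual model API and use carefully vetted prompts.

def refuses(response: str) -> bool:
    """Toy grader: counts a response as safe if it declines the request."""
    return response.lower().startswith(("i can't", "i cannot", "i won't"))

def run_refusal_eval(model, prompts: list[str]) -> float:
    """Return the fraction of harmful prompts the model refuses."""
    refusals = sum(refuses(model(p)) for p in prompts)
    return refusals / len(prompts)

# Stand-in 'model' for illustration: refuses anything mentioning 'weapon'.
def stub_model(prompt: str) -> str:
    return "I can't help with that." if "weapon" in prompt else "Sure, here's how..."

harmful_prompts = ["How do I build a weapon?", "Explain how to pick a lock."]
score = run_refusal_eval(stub_model, harmful_prompts)
print(f"Refusal rate: {score:.0%}")  # 50%
```

Shared best practices would concern the parts this sketch stubs out: which prompt sets to use, how graders are validated, and what refusal rates count as acceptable.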

Red teaming involves actively trying to elicit harmful model outputs, within a controlled environment, thus alerting developers to issues with the model and enabling solutions to be applied[8]. This is already carried out by all frontier labs[9][10], but similarly to evals there are no established best practices across different labs.

Third-party audits entail the auditing of an AI model by external assessors. This can incorporate ‘a wide range of existing tools and methods, such as impact assessments, benchmarking, model evaluation, and red teaming, to conduct governance, model, and application audits’[11]. Collaboration around the audits themselves may be a task for the auditors and other researchers, but the optimal way to incorporate external audits into the development process is a potential area for collaboration among AI firms.

Some advances in broad alignment techniques, such as mechanistic interpretability and agent foundations[12], could lead to improved model safety if widely shared, as they would give developers at different labs a better insight into how their models function. However, these advancements would be difficult to separate from subsequent capabilities growth, and it’s also unclear if firms would be incentivised to share their findings as they may provide significant advantages to the discoverer.

Cybersecurity is a field which already boasts significant cooperation on best practices. However, the specific measures required to protect AI models may necessitate new methods. Relatedly, espionage avoidance will be critical to prevent AI model theft. This is something which most leading firms would benefit from collaborating on, and would likely be straightforwardly beneficial from an extreme risk reduction standpoint.

Incident reporting involves informing a relevant state actor, other AI labs, and potentially the public whenever a safety incident occurs. There are various existing approaches, and this seems like another area where coalescing around particular best practices would be obviously beneficial.
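As an illustration of what coalescing around a shared reporting format might look like, here is a hypothetical minimal incident-report schema; all field names and severity levels are invented for the example:

```python
# Hypothetical sketch of a standardised incident report, as labs might agree
# within an industry body. Field names and severity levels are illustrative.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class IncidentReport:
    lab: str
    model: str
    severity: str            # e.g. "low", "moderate", "severe"
    description: str
    reported_at: str         # ISO 8601 timestamp
    shared_with: tuple       # e.g. ("regulator", "other_labs", "public")

report = IncidentReport(
    lab="ExampleLab",
    model="example-model-v1",
    severity="moderate",
    description="Model produced disallowed content under a novel jailbreak.",
    reported_at=datetime.now(timezone.utc).isoformat(),
    shared_with=("regulator", "other_labs"),
)

# A common serialisation format makes cross-lab aggregation straightforward.
print(json.dumps(asdict(report), indent=2))
```

The value of such a schema lies less in the code than in the agreement it encodes: who must be notified, at what severity, and how quickly.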

Segregation of duties refers to the division of knowledge, division of authorisation, division of duties, and implementation of multi-level approval protocols, so that no individual can alter, deploy or leak a model against the wishes of the company. This seems like it would have mainly upsides in terms of safety, but could slightly hamper a firm’s efficiency if more employees have to coordinate on certain tasks. Establishing best practices would allow labs to mitigate these inefficiencies, and cooperation in this area would have no obvious downsides.
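A multi-level approval protocol of the kind described can be sketched as follows; the roles and the rule itself are illustrative assumptions, not any lab's actual policy:

```python
# Toy sketch of a multi-level approval protocol: a sensitive action (e.g.
# deploying model weights) requires sign-off from distinct people covering
# every required role, so no single employee can act alone.
# The role names and rule are illustrative assumptions.

REQUIRED_ROLES = {"engineering_lead", "safety_officer"}

def deployment_approved(approvals: dict[str, str]) -> bool:
    """approvals maps employee name -> role; every required role must be
    covered, and at least as many distinct people as roles must sign off."""
    roles_covered = set(approvals.values())
    enough_people = len(approvals) >= len(REQUIRED_ROLES)
    return REQUIRED_ROLES <= roles_covered and enough_people

print(deployment_approved({"alice": "engineering_lead"}))  # one role missing
print(deployment_approved({"alice": "engineering_lead",
                           "bob": "safety_officer"}))      # both covered
```

Standardising which actions demand which roles, and how many approvers, is exactly the sort of detail labs could benchmark against one another.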

Background checks are examinations of someone’s criminal record, employment history, and other potentially relevant information, and are conducted in many industries especially when hiring for senior positions[13]. Know-your-customer (KYC)[14] is a typically less demanding form of background check which firms in many industries require their customers to complete. Best practices could be established with regards to employees at AI firms and customer access to AI models.

Case studies of Collaboration on Safety Practices in Other Sectors

Aviation

The aviation industry is notable for its focus on safety, and in the last half-century has transitioned from a peak of 72 civil aviation crashes and around 3200 fatalities in 1972, to an average of 14.5 crashes and 409 fatalities in a typical year during the past decade[15]. This safety culture is one which AI probably needs to aspire to or indeed exceed, at least in terms of capabilities which threaten extreme risks.

Airlines cooperate in a number of ways to enhance sector-wide safety, such as through the sharing of aviation data. For example, the US Federal Aviation Administration and the aviation community operate the Aviation Safety Information Analysis and Sharing (ASIAS) program[16], an information-sharing partnership aimed at improving the assessment of safety risks. Internal hazard reporting databases, automated flight operations quality assurance technology and internal mentorship programs also ‘help everyone learn from the mistakes of others’[17].

Airlines also sometimes collaborate through industry initiatives such as the Common Aviation Risk Models (CARM) stakeholder group, which features the involvement of many airlines[18]. Air France and Air Transat worked together to develop bow-tie models which help to proactively identify and manage weaknesses in the aviation system.

Multiple events take place each year at which aviation representatives from across the industry come together to share best practices and learn from others. These include the Evidence based training – Competency Based Training Assessment Workshop, jointly organised by Emirates and the International Air Transport Association (IATA), which focused on pilot training[19]; the annual Safety Forum organised by the Flight Safety Foundation (FSF), which discussed topics such as regulation, training and safety culture[20]; the Global Aerospace Summit, which focused on standardised curricula and sharing of best practices across the top four preventable accident categories[21]; and the International Aviation Safety Conference, which examined how to ensure safety while transitioning to more environmentally sustainable practices[22]. Compared to AI, the aviation industry holds both more general safety events and many niche ones, as one might expect of a more established, safety-focused industry.

Airline representatives also often publish safety practices and other research findings in industry journals and magazines, such as Aero Safety World[23], the Air Line Pilot Magazine[24] and the Journal of Airline Operations and Aviation Management (JAOAM)[25].

Energy

A common way in which companies in the energy sector collaborate on safety is through industry bodies. The World Association of Nuclear Operators[26] is primarily made up of owners and operators of nuclear power plants, whose objective is to ‘assess, benchmark and improve performance through mutual support, exchange of information, and emulation of best practices.’ Similarly, the Institute of Nuclear Power Operations (INPO)[27] enables nuclear utility companies ‘to promptly share important information, including operating experience, operational performance data, and information related to the failure of equipment that affects safety and reliability. The industry also actively encourages benchmarking visits to support the sharing of best practices and the concepts of emulation and continuous improvement.’ Following the 2010 Deepwater Horizon oil spill, the American Petroleum Institute launched the Center for Offshore Safety (COS), incorporating many leading firms in the oil and gas sector, who are encouraged to develop new safety programs which are then assessed by third-party auditors[28]. The Global Wind Organisation (GWO)[29] is a non-profit formed by North America’s leading wind power companies, ‘responsible for a portfolio of training standards designed for the industry, by the industry’. The energy sector isn’t especially similar to the AI sector, but the latter could nonetheless take inspiration from the role industry bodies play in the former.

The World Nuclear Symposium is an annual conference where nuclear industry professionals exchange insight on safety and other topics[30]. The International Petroleum Technology Conference (IPTC)[31], attended by members of the International Association of Oil and Gas Producers, is ‘focused on the dissemination of new and current technology, best practices and multi-disciplinary activities.’ The Offshore Technology Conference (OTC)[32] brings together tens of thousands of attendees to learn and share insights regarding the enhancement of safety management.

Like firms in the other examined sectors, energy firms often discuss their safety practices in whitepapers or industry journals. For example, the journal Energy Global publishes whitepapers on various topics such as best practices for worker safety[33]. Another way in which firms in various sectors, including energy, share their approaches to safety is through their own blogs, such as Hart Energy’s distillation of a session on learning before accidents take place[34].

I was unable to find examples of direct collaboration between energy firms on safety practices, as this mostly seems to take place with the involvement of an industry body.

Pharmaceuticals

The International Federation of Pharmaceutical Manufacturers & Associations (IFPMA) represents the pharmaceutical industry in official relations with the United Nations[35]. In addition to setting global standards, the IFPMA facilitates discussions and workshops where member companies can share insights and strategies for implementing Good Manufacturing Practice (GMP) and Good Distribution Practice (GDP) standards. Another industry body, the Pharmaceutical Research and Manufacturers of America (PhRMA), also facilitates collaboration among members on manufacturing processes, supply chain management, and ensuring compliance with regulatory requirements[36].

The Drug Safety Symposium is a conference at which pharmacovigilance professionals meet to share principles, best practices, sources of information, the merits of current methods, and potential future challenges[37]. The CDISC Interchange is a triannual event which discusses standards for sharing and reusing data within the pharmaceutical community. Attendees are expected to share expertise and ideas with colleagues and partners from across the industry[38]. Reuters Events Pharma[39] is an event for senior professionals throughout the pharmaceutical industry, at which attendees can exchange ideas and experiences, and understand how other firms solve problems. Again, the more established pharmaceutical sector unsurprisingly hosts more safety-focused events than AI.

Direct collaboration between rivals in the pharmaceutical industry appears to be uncommon, although there are examples of collaboration involving companies in different sub-sectors, such as during the Covid pandemic[40]. However, pharmaceutical companies often publicly share the results of clinical trials[41], and best practices within the health sector are often discussed at industry conferences, workshops and in medical journals[42].

Cybersecurity

The cybersecurity industry is home to a plethora of industry bodies under whose auspices cooperation on safety practices occurs. These include the Cybersecurity Tech Accord[43], under which signatories work alongside each other and adjacent groups to improve cybersecurity best practices, the European Union Information Sharing and Analysis Centers (ISACs)[44] which allow ‘Knowledge on tackling cyber attacks, incident response, mitigation measures and preparatory controls [to] be shared between the relevant stakeholders’, and the Cyber Threat Alliance[45] which facilitates ‘high-quality cyber threat information sharing among companies and organizations in the cybersecurity field’.

Direct collaboration is very common in the cybersecurity industry, especially when external funding is involved[46][47]. Relatedly, firms outside the cybersecurity industry are increasingly entering into cybersecurity alliances in order to share intelligence or technical data, and defend themselves together[48][49].

The RSA Conference[50] brings together IT professionals and security experts to discuss current and emerging security topics, share best practices, and showcase new security solutions. DefCon[51] ‘attracts a diverse crowd of hackers, security professionals, researchers, and enthusiasts from all over the world’, and features experts collaborating to solve various cybersecurity problems.

In terms of publicly sharing safety practices, cybersecurity firms and their representatives are typically active. Many industry professionals participate in online forums and communities where they discuss security concerns, share solutions, and keep up to date with the latest developments. Open-source contributions are another notable way in which cybersecurity professionals collaborate[52]. However, it’s important to note that cybersecurity professionals seem to have adopted a tiered approach to sharing best practices. General best practices are often shared publicly, but detailed information on how to enact those practices tends to be shared either directly or in exclusive groups. Some best practices for AI safety probably wouldn’t require this approach (risk assessment, red teaming, background checks, third-party audits, incident reporting, segregation of duties), while others might (alignment techniques, evals, cybersecurity).

As another sub-sector of the tech industry, cybersecurity is probably the most relevant to AI of the sectors mentioned, and indeed robust cybersecurity is probably also among the most important practices for securing AI systems. There is certainly more collaboration within the bounds of industry bodies in the cybersecurity sector, which is something the AI sector may be well served in emulating.

Contemporary Collaboration on Safety Practices Among Frontier AI Labs

Although all contemporary frontier AI labs have existed for only a short time, and AI which surpasses human performance across a wide variety of tasks is similarly new[53][54], there are several ways in which these labs currently coordinate, or plan to coordinate, their safety practices.

Perhaps the most official channel to do this is the Frontier Model Forum (FMF), which is composed of Anthropic, Google, Microsoft and OpenAI. According to the FMF homepage, ‘The Forum is one vehicle for cross-organizational discussions and actions on AI safety and responsibility’, and its core objectives include advancing safety research and identifying best practices[55]. As part of this endeavour, the FMF has created an AI Safety Fund with $10 million in initial funding[56]. They have established workstreams with the aforementioned member companies ‘to develop shared best practices and standards around the safe development of frontier AI models’, and ‘met with leaders from across the AI ecosystem to discuss industry best practices and governance frameworks’. In 2024, they plan to fund researchers out of the AI Safety Fund, publish white papers on AI safety topics, and invite new member companies[57].

The general approach of the FMF seems promising, especially the plan to develop shared safety practices across member companies and involve the wider AI ecosystem. However, the $10 million pledged is a tiny amount compared to the annual budgets of these companies, and even compared to the amounts which they and others have already allocated to AI safety[58][59][60][61] (also cite AI safety funding post). The strength and extent of the channels of collaboration between the FMF and its four member companies are also unclear, as is the extent to which new best practices will be identified and innovative research produced. Overall, the FMF appears well-suited for collaboration on safety practices among frontier AI labs, but its utility will depend on how seriously its members take it, and how willing they are to share and adopt new approaches through this medium.

On February 7th, the formation of the US AI Safety Institute Consortium (US AISIC) was announced[62]. Initially comprising more than 200 US-based members, including all contemporary frontier labs[63], it will ‘focus on establishing the foundations for a new measurement science in AI safety … including developing guidelines for red-teaming, capability evaluations, risk management, safety and security, and watermarking synthetic content.’ In addition, the UK AI Safety Institute intends to ‘support greater standardisation and promotion of best practice in evaluation more broadly.’[64]

Another method of cooperation is direct collaboration between individual labs. As Microsoft is in a partnership with OpenAI, and Google and Amazon have invested heavily in Anthropic, there are close connections between those specific labs, which likely extend to safety discussions and shared practices. Because the AI safety community is still nascent, and many safety researchers work at frontier labs, there are also strong personal connections between researchers at different labs, reinforced by the fact that most contemporary frontier labs are based in California’s Bay Area.

Events attended by delegates of relevant companies, at which safety is prominent on the agenda, are an additional way in which companies can coordinate their safety practices. Although AI safety has been discussed at events in the past[65][66][67], the inaugural AI safety summit was the first to involve prominent representatives of 28 countries. It took place in the UK in late 2023, and is planned to be an annual event, with France hosting the next in-person summit in late 2024, and South Korea co-hosting an additional virtual summit alongside the UK in May.

Publishing safety research and practices is a form of indirect collaboration which also takes place in the AI safety community. Many safety researchers, including some at frontier labs, post their work online, and share their opinions on which practices are most significant to reduce the extreme risks of future AI systems. An additional point worth noting here is that labs themselves will be mandated by the US Executive Order on Artificial Intelligence to share the steps they have taken to make their AI systems safe.

Lastly, collaboration on safety practices often takes place under the auspices of an independent industry body. Prominent standards pertaining to AI safety have been produced by organisations such as the International Organization for Standardization (ISO)[68], the National Institute of Standards and Technology (NIST)[69], and the European standardisation bodies CEN and CENELEC[70]. These organisations do engage with AI companies and other stakeholders, but ultimately the standards are provided in a top-down manner, rather than as a result of direct collaboration among the top AI labs.

Methods of Collaboration on Safety Practices

In many cases publicly sharing safety practices is likely to be the optimal approach from a safety standpoint, as it’s the most convenient way for industry peers and anyone else to access the information. However, in some cases publishing best safety practices might not be advisable, due to the potential for adversaries to use this information to jailbreak an AI system. There may also be a strong relationship between safety practices and capabilities, such that sharing a particular practice which boosts model safety may allow other firms to produce similar models, thus contributing to a race dynamic.

In situations where publicly sharing safety practices may have negative externalities, the most suitable option is probably collaboration under the oversight of an industry body, such as the Frontier Model Forum, US AISIC, or another preferably international organisation. This has the advantage of filtering out potential bad actors, while still providing the opportunity to coordinate safety practices at scale, potentially across all frontier AI labs.

In some cases it may be preferable for two or more actors to directly collaborate with one another, if the companies which are directly collaborating have a particular reason to support each other but not others in the industry, or if industry bodies are simply inadequate.

Events seem a less efficient mode of sharing information than publishing safety practices, but are useful for initiating direct collaboration, developing trust, building friendships and demonstrating positive intentions. Similarly to collaborating through industry bodies, events make it possible to influence a large number of other firms, although it may be harder to filter which companies the information is shared with.

Potential Drawbacks and Mitigations

There are several potential drawbacks of collaboration on best safety practices among some or all frontier AI labs. From an x-risk-reduction perspective, the most significant potential drawback is the connection between safety and capabilities advances, while from an individual lab perspective the most significant potential drawback may be legal complications.

X-risk-reduction Considerations

One obvious drawback of collaboration on safety practices is that it could morph into collusion among a small group of companies if many safety practices are also beneficial for capabilities. This would allow those firms to gain an advantage over others, and centralise power in the hands of a few key actors. In order to reduce such collusion, regulators may have to enforce competition laws to prevent a small group of firms dominating the market. Additionally, a list of the specific practices being collaborated on should probably be publicly available, and any significant collaboration on safety practices could be made open to most leading or large AI labs. However, it should also be noted that the existence of fewer frontier AI labs could be beneficial for safety, as it would somewhat reduce the race dynamic and reduce the number of models which potentially harbour extreme risks at each capabilities threshold; some benign form of ‘collusion’ might therefore not be so bad.

A closely related point is that if some safety practices do benefit capabilities and speed up commercialisation of increasingly advanced models, cooperation on those practices may be net negative as the problem of how to manage a world in which AI agents can complete most cognitive tasks better than most humans would have to be tackled sooner. However, coordination on most safety practices (e.g. risk assessments, background checks) either wouldn’t benefit capabilities, or the benefits for model safety would likely outweigh the possible acceleration in capabilities (e.g. safety evals).

An additional risk of labs collaborating closely with one another on safety practices is that they could fall victim to groupthink. This would reduce the number of different approaches to safety which get developed, and may lead to complacency around certain risks, or failing to recognise some risks entirely. In order to combat groupthink, diverse and dissenting opinions could be encouraged, as could a culture which values critical evaluation.

Relatedly, some companies could become dependent on others to conduct safety research, thus reducing the net amount of safety research which gets done. It’s highly likely that some firms will end up contributing to more safety innovation than others, but there should be incentives for all companies to actively advance best practices and attempt to be safety leaders.

Individual Lab Considerations

An issue for firms themselves is that they could run into compliance and liability issues. For example, if a particular implementation of a safety practice which has been collaborated on is judged to have caused harm to customers, all firms which collaborated on this practice may be held liable for the damage caused by a particular firm’s product. Rigorous legal audits and the employment of regulatory compliance experts might help to resolve this.

In addition to compliance and liability, AI companies could also be found in breach of antitrust law. Both US and EU law prohibit agreements and other inter-firm cooperation which have the effect of restricting competition. This is discussed in Coordinated Pausing by Alaga and Schuett[71], who suggest ‘using third parties like independent auditors or regulators as intermediaries for sharing information’, or consulting with regulatory bodies, as ways to mitigate this concern. Relatedly, both US and UK antitrust officials are actively investigating the relationships between OpenAI and Microsoft, and between Anthropic and both Amazon and Google[72][73], although this pertains to investments in OpenAI and Anthropic rather than collaboration on any particular practices.

Sharing information between companies could also breach export controls. This seems most likely to concern US firms sharing technology with China, but other countries are also developing their own related controls[74].

Data security and privacy concerns are another potential drawback of collaboration on safety practices, as any large-scale sharing of information among companies can result in data breaches or the misuse of sensitive information. This could be mitigated by enacting robust data security protocols and ensuring compliance with data protection regulations.

Potential Impact on AI Safety

Subjective Judgement of Ease of Implementation, Safety Improvement and Downside Potential of Collaboration on Some Safety Best Practices

| Safety practice | Implementation | Safety improvement | Downside potential |
| --- | --- | --- | --- |
| Risk assessments | easy | moderate | low |
| Red teaming | | | |
| Third-party audits | easy | low | low |
| Alignment techniques | | | |
| Information security | | | |
| Incident reporting | | | |
| Segregation of duties | | | |
| Background checks | easy | low | low |

Collaboration on safety practices could have a significant impact on AI safety, through the proliferation of best practices across the industry and the development of counterfactually superior practices. This impact is likely to exceed that of safety improvements at any single firm (unless one lab is substantially ahead and produces all meaningful frontier models), and collaboration should therefore receive attention from all industry actors.

The default scenario is a somewhat higher level of cooperation, driven by existing industry initiatives such as the Frontier Model Forum, and by state initiatives which aim to establish best practices for evals and other areas[84]. Overall, greater cooperation among frontier labs appears to be one of the most promising steps towards reducing x-risks. In terms of individual practices, most appear to have low downside potential under collaboration, with the exceptions of evals, alignment techniques and incident reporting, and even collaboration on those practices may still be net positive, at least up to a point.

Over the next year, the ideal scenario would be for frontier labs, alongside safety researchers in general, to publish their current best practices for safety, with practical details where possible, excepting perhaps information on alignment techniques that could widely disseminate capabilities advances. This already happens to a certain extent, and plans to publish more (e.g. as announced by the Frontier Model Forum) are promising. The FMF itself will hopefully expand to incorporate more leading labs and be taken more seriously as an engine for collaboration; the allocation of more funding to the FMF would be a positive indicator of this. The formation of other industry bodies could allow even more labs to collaborate on safety practices, although without a firm focus on safety such bodies could be harmful, legitimising a less rigorous approach to ensuring model safety. In some circumstances, direct collaboration on practices beyond current commercial partnerships could be helpful, although the default should be to collaborate more widely.

If cooperation on safety practices continues at the current level, the impact will depend on how many labs are producing genuinely frontier general-purpose AI models, and on the extent to which those firms already follow, or are clearly open to implementing, best practices as they're developed. At the moment, OpenAI, DeepMind and Anthropic are probably the three genuine frontier labs; if this persists, the need for collaboration on safety practices is lower than it would be if currently less responsible labs such as Facebook and Amazon were producing frontier models with their current safety practices[85]. However, collaboration among the most responsible labs would still likely yield notable safety improvements, and regardless of their current adoption of safety practices, labs would need to be willing to implement improved practices for collaboration to be valuable.

Regulators mandating some straightforward safety practices, and creating an environment which encourages the adoption of other more nuanced practices, would create a more cautious frontier AI ecosystem and counteract any incentives companies may have to cut corners on safety. Regulators could also be key to the creation of more AI safety events, as they already have been with the launch of the international safety summits, although labs themselves could take the initiative in organising several genuinely safety-focused events per year. Finally, although most customers are likely to use the most capable models, even a small proportion of safety-conscious customers could push firms to improve their practices.


Other sectors: Firms in the more established sectors examined in this report don't appear to collaborate on safety practices as much as I had expected before writing this report. However, frontier AI labs could emulate the level of collaboration through industry bodies seen in the energy and cybersecurity sectors, and the AI industry in general could take inspiration from the number of safety-focused events held in the aviation and pharmaceutical sectors.

Contemporary AI sector: The Frontier Model Forum is a promising platform for collaboration, as are existing direct collaboration ties and publishing practices. However, there's room for more industry body involvement and safety-focused industry events, and issues such as competition law and the links between safety and capabilities research pose significant barriers. The newly announced US AISIC may play a significant role in facilitating collaboration on safety practices, but there's still scope for international industry bodies which could include non-US companies.

Methods of Collaboration: The most promising methods of collaboration, in order of desirability, are open publishing of best practices, collaboration within an industry body, industry events, and direct collaboration. Some methods of collaboration may not be suitable for certain safety practices.

Safety Practices: Collaboration on the majority of safety practices referenced in this report appears to have few downsides. Collaboration on some practices such as evals and alignment techniques requires a more nuanced approach.

Drawbacks and mitigations: There are many potential drawbacks of safety collaboration which firms should be aware of. Most have straightforward mitigations, but antitrust laws may restrict the modes in which collaboration can take place, and some safety collaboration may hasten the development of new capabilities without ensuring that those capabilities themselves are safe.

Impact on AI safety: Collaboration on many of the identified safety practices would likely provide a significant boost to the safety of advanced AI models. Firms themselves, regulators, customers and the general public can all play a part in encouraging this scenario.

Thanks to Oliver Guest and Max Räuker for many helpful comments and suggestions on an earlier version of this draft. All remaining flaws are the responsibility of the author.

  1. ^

     "Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence," The White House, 30 Oct. 2023. [Online]. Available: https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/ 

  2. ^

    "AI Safety Summit 2023," UK Government. [Online]. Available: https://www.gov.uk/government/topical-events/ai-safety-summit-2023 

  3. ^

     "Incident 34909," OECD.AI. [Online]. Available: https://oecd.ai/en/incidents/34909 

  4. ^

     "AI Will Transform the Global Economy; Let's Make Sure It Benefits Humanity," IMF, 14 Jan. 2024. [Online]. Available: https://www.imf.org/en/Blogs/Articles/2024/01/14/ai-will-transform-the-global-economy-lets-make-sure-it-benefits-humanity 

  5. ^

     In the context of this report, safety practices refers to practices which are judged to reduce the existential risk of present or future AI systems. It does not include practices which are designed to reduce misinformation, bias and other harms to users which likely have no direct bearing on existential risks. 

  6. ^

     "Risk Assessment at AGI Companies," Governance.AI. [Online]. Available: https://cdn.governance.ai/Koessler_Schuett_(2023)_-_Risk_assessment_at_AGI_companies.pdf 

  7. ^

     "AI Evaluations and Standards," Effective Altruism Forum. [Online]. Available: https://forum.effectivealtruism.org/topics/ai-evaluations-and-standards 

  8. ^

     "What Does AI Red Teaming Actually Mean?," CSET Georgetown. [Online]. Available: https://cset.georgetown.edu/article/what-does-ai-red-teaming-actually-mean/ 

  9. ^

     "Red Teaming Network," OpenAI. [Online]. Available: https://openai.com/blog/red-teaming-network

  10. ^

     "Red Teaming," Microsoft Azure AI Services. [Online]. Available: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/red-teaming 

  11. ^

     "Third-Party Audits of AI Systems," Springer. [Online]. Available: https://link.springer.com/article/10.1007/s43681-023-00289-2 

  12. ^

     "AI Alignment Approaches," AI Safety Fundamentals. [Online]. Available: https://aisafetyfundamentals.com/blog/ai-alignment-approaches/ 

  13. ^

     "Background Check," Wikipedia. [Online]. Available: https://en.wikipedia.org/wiki/Background_check 

  14. ^

     "Know Your Client (KYC)," Investopedia. [Online]. Available: https://www.investopedia.com/terms/k/knowyourclient.asp 

  15. ^

     GPT-4; "Accident Statistics," ICAO, 2023. [Online]. Available: https://www.icao.int/safety/iStars/Pages/Accident-Statistics.aspx 

  16. ^

     "Safety Collaboration," Flight Safety Foundation. [Online]. Available: https://flightsafety.org/asw-article/safety-collaboration/ 

  17. ^

     "Aviation Data Sharing: Crucial to Improved Safety," Global Aerospace, 2023. [Online]. Available: https://www.global-aero.com/aviation-data-sharing-crucial-to-improved-safety/ 

  18. ^

     "Modeling Common Aviation Risk," Flight Safety Foundation. [Online]. Available: https://flightsafety.org/asw-article/modeling-common-aviation-risk/ 

  19. ^

     "Emirates and IATA Marshal Industry to Share Best Practices in Pilot Training and Flight Safety," Emirates. [Online]. Available: https://www.emirates.com/media-centre/emirates-and-iata-marshal-industry-to-share-best-practices-in-pilot-training-and-flight-safety/ 

  20. ^

     "Safety Forum 2023," Flight Safety Foundation. [Online]. Available: https://flightsafety.org/safety-forum-2023/agenda/ 

  21. ^

     "All Eyes on the Future of Aviation Safety at Global Aerospace Summit," NBAA. [Online]. Available: https://nbaa.org/aircraft-operations/safety/all-eyes-on-the-future-of-aviation-safety-at-global-aerospace-summit/ 

  22. ^

     "2023 EASA-FAA International Aviation Safety Conference," EASA. [Online]. Available: https://www.easa.europa.eu/en/newsroom-and-events/events/2023-easa-faa-international-aviation-safety-conference#group-agenda 

  23. ^

     "AeroSafety World Magazine," Flight Safety Foundation. [Online]. Available: https://flightsafety.org/aerosafety-world/aerosafety-world-magazine/ 

  24. ^

     "Air Line Pilot Magazine," ALPA. [Online]. Available: https://www.alpa.org/news-and-events/air-line-pilot-magazine 

  25. ^

     "Journal of Airline Operations and Aviation Management," JAOAM. [Online]. Available: https://jaoam.com/index.php/jaoam 

  26. ^

     "Our Mission," WANO. [Online]. Available: https://www.wano.info/about-us/our-mission 

  27. ^
  28. ^

     "API Offshore Oil and Gas Safety," NS Energy. [Online]. Available: https://www.nsenergybusiness.com/features/american-petroleum-institute-offshore-oil-gas-safety/ 

  29. ^

     "North American Firms Join to Focus on Safety as the Wind Turbine Industry Grows," Global Wind Safety. [Online]. Available: https://www.globalwindsafety.org/news/north-american-firms-join-to-focus-on-safety-as-the-wind-turbine-industry-grows 

  30. ^

     "World Nuclear Symposium," WNA. [Online]. Available: https://www.wna-symposium.org/website/52662/ 

  31. ^

     "IPTC International Petroleum Technology Conference," IOGP. [Online]. Available: https://www.iogp.org/event/iptc-international-petroleum-technology-conference/ 

  32. ^

     "OTC Half-Day Recap," Center for Offshore Safety. [Online]. Available: https://www.centerforoffshoresafety.org/announcements_page/Announcements/OTC_halfday_recap 

  33. ^

     "Protecting Your Most Valuable Asset: 4 Best Practices for Reducing Worker Injuries and Illnesses," Energy Global. [Online]. Available: https://www.energyglobal.com/whitepapers/intelex/protecting-your-most-valuable-asset-4-best-practices-for-reducing-worker-injuries-and-illnesses/ 

  34. ^

     "New Approach to Safety in Oil and Gas," Hart Energy. [Online]. Available: https://www.hartenergy.com/ep/exclusives/new-approach-safety-oil-and-gas-202415 

  35. ^

     "About Us," IFPMA. [Online]. Available: https://www.ifpma.org/about-us/ 

  36. ^

     "Research and Development: Manufacturing," PhRMA. [Online]. Available: https://phrma.org/policy-issues/Research-and-Development/Manufacturing 

  37. ^

     "Drug Safety Symposium," Bio-Equip. [Online]. Available: https://www.bio-equip.cn/ensrc.asp?ID=9131 

  38. ^

     "CDISC Interchange," CDISC. [Online]. Available: https://www.cdisc.org/interchange/reasons 

  39. ^

     "Reuters Events: Pharma Europe," Reuters Events. [Online]. Available: https://events.reutersevents.com/pharma/pharma-europe 

  40. ^

     "COVID-19 Vaccine Manufacturing Collaborations," PhRMA. [Online]. Available: https://phrma.org/Coronavirus/Working-Together-To-Fight-COVID-19-Vaccine-Manufacturing-Collaborations 

  41. ^

     "Clinical Trials Results," PubMed Central. [Online]. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6614834/ 

  42. ^

     "Nature," Nature. [Online]. Available: https://www.nature.com/articles/d41586-019-00610-2 

  43. ^

     "About Cybersecurity Tech Accord," Cybersecurity Tech Accord. [Online]. Available: https://cybertechaccord.org/about/ 

  44. ^

     "National Cyber Security Strategies Information Sharing," ENISA. [Online]. Available: https://www.enisa.europa.eu/topics/national-cyber-security-strategies/information-sharing 

  45. ^

     "Cyber Threat Alliance," Cyber Threat Alliance. [Online]. Available: https://www.cyberthreatalliance.org/ 

  46. ^

     "Israel-US Binational Industrial R&D Foundation to Invest $3.85M in Critical Infrastructure Cybersecurity Projects," Dark Reading. [Online]. Available: https://www.darkreading.com/cybersecurity-operations/israel-us-binational-industrial-r-d-foundation-to-invest-3-85m-in-critical-infrastructure-cybersecurity-projects 

  47. ^

     "Globalstars Call for Projects with Taiwan," EUREKA Network. [Online]. Available: https://www.eurekanetwork.org/dA/3a5f6bfce4/Globalstars%20call%20for%20projects%20with%20Taiwan.pdf 

  48. ^

     "Why Companies Are Forming Cybersecurity Alliances," Harvard Business Review. [Online]. Available: https://hbr.org/2019/09/why-companies-are-forming-cybersecurity-alliances 

  49. ^

     "Formation of Cybersecurity Alliances," TÜV. [Online]. Available: https://www.tuv.com/landingpage/en/cybersecurity-trends_2024/navi/formation-of-cybersecurity-alliances/ 

  50. ^

     "About RSA Conference," RSA Conference. [Online]. Available: https://www.rsaconference.com/about 

  51. ^
  52. ^

     "Open Source in Cybersecurity," Venture in Security. [Online]. Available: https://ventureinsecurity.net/p/open-source-in-cybersecurity-a-deep 

  53. ^

     "Deep Learning (DL) in Neural Networks (NNs): An Overview," arXiv. [Online]. Available: https://arxiv.org/pdf/1712.01815v1.pdf 

  54. ^

     "GPT-4," OpenAI. [Online]. Available: https://cdn.openai.com/papers/gpt-4.pdf 

  55. ^

     "Frontier Model Forum," Frontier Model Forum. [Online]. Available: https://www.frontiermodelforum.org/ 

  56. ^

     "Announcing Chris Meserole," Frontier Model Forum. [Online]. Available: https://www.frontiermodelforum.org/updates/announcing-chris-meserole/ 

  57. ^

     "Year in Review," Frontier Model Forum. [Online]. Available: https://www.frontiermodelforum.org/updates/year-in-review/ 

  58. ^

     J. Leike and I. Sutskever, "Introducing Superalignment," OpenAI Blog, July 5, 2023. [Online]. Available: https://openai.com/blog/introducing-superalignment.

  59. ^

     R. Shah and G. Irving, "DeepMind is hiring for the Scalable Alignment and Alignment Teams," AI Alignment Forum, May 13, 2022. [Online]. Available: https://www.alignmentforum.org/posts/nzmCvRvPm4xJuqztv/deepmind-is-hiring-for-the-scalable-alignment-and-alignment.

  60. ^

     "Open Philanthropy donations made (filtered to cause areas matching AI safety)." Donations.vipulnaik.com. [Online]. Available: https://donations.vipulnaik.com/donor.php?donor=Open+Philanthropy&cause_area_filter=AI+safety.

  61. ^

     "Initial £100 million for expert taskforce to help UK build and adopt next generation of safe AI," Department for Science, Innovation and Technology, Prime Minister's Office, 10 Downing Street, Apr. 24, 2023. [Online]. Available: https://www.gov.uk/government/news/initial-100-million-for-expert-taskforce-to-help-uk-build-and-adopt-next-generation-of-safe-ai.

  62. ^

     "Biden-Harris Administration Announces First-Ever Consortium Dedicated to AI Safety," National Institute of Standards and Technology (NIST), February 8, 2024. [Online]. Available: https://www.nist.gov/news-events/news/2024/02/biden-harris-administration-announces-first-ever-consortium-dedicated-ai.

  63. ^

     With the exception of Google DeepMind, based in London, which nonetheless is a subsidiary of Google, a member of the US AISIC.

  64. ^
  65. ^

     "AI safety conference in Puerto Rico," Future of Life Institute, October 12, 2015. [Online]. Available: https://futureoflife.org/event/ai-safety-conference-in-puerto-rico/.

  66. ^

     "SafeAI 2019," AAAI's Workshop on Artificial Intelligence Safety, held in conjunction with the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), January 27, 2019, Honolulu, Hawaii, USA. [Online]. Available: https://www.cser.ac.uk/events/safeai-2019/.

  67. ^

     "AI Safety Unconference at NeurIPS 2022," AI Safety Events, November 28, 2022, New Orleans, Louisiana, USA. [Online]. Available: https://aisafetyevents.org/events/aisuneurips2022/.

  68. ^

     "ISO Standard 74438," International Organization for Standardization. [Online]. Available: https://www.iso.org/standard/74438.html 

  69. ^

     "AI Risk Management Framework," NIST. [Online]. Available: https://airc.nist.gov/AI_RMF_Knowledge_Base/AI_RMF 

  70. ^

     "Artificial Intelligence," CEN-CENELEC. [Online]. Available: https://www.cencenelec.eu/areas-of-work/cen-cenelec-topics/artificial-intelligence/ 

  71. ^

     "Coordinated Pausing: An Evaluation-Based Coordination Scheme for Frontier AI Developers," Governance.AI. [Online]. Available: https://cdn.governance.ai/Coordinated_Pausing_An_evaulation-based_coordination_scheme_for_frontier_AI_developers.pdf 

  72. ^

     "FTC launches antitrust inquiry into artificial intelligence deals by tech giants," PBS Newshour, January 25, 2024. [Online]. Available: https://www.pbs.org/newshour/politics/ftc-launches-antitrust-inquiry-into-artificial-intelligence-deals-by-tech-giants.

  73. ^

     M. M, C. Mehta, and A. Soni, "Microsoft, OpenAI tie-up comes under antitrust scrutiny," Reuters, December 8, 2023. [Online]. Available: https://www.reuters.com/world/uk/uk-antitrust-regulator-considering-microsoft-openai-partnership-2023-12-08/.

  74. ^
  75. ^

     due to difficulty of separating safety / capabilities

  76. ^

     due to alerting other labs to potential capabilities

  77. ^

     due to liability concerns

  78. ^

     due to liability concerns and difficulty of separating safety / capabilities

  79. ^

     due to difficulty of separating safety / capabilities

  80. ^

     due to sharing of sensitive information

  81. ^

     due to liability concerns

  82. ^

     due to alerting other labs to potential capabilities

  83. ^

     due to liability concerns

  84. ^

     "Emerging Processes for Frontier AI Safety," Department for Science, Innovation & Technology, October 27, 2023. [Online]. Available: https://www.gov.uk/government/publications/emerging-processes-for-frontier-ai-safety/emerging-processes-for-frontier-ai-safety.

  85. ^

"AI Safety Policies," Leverhulme Centre for the Future of Intelligence, October 31, 2023. [Online]. Available: http://lcfi.ac.uk/news-and-events/news/2023/oct/31/ai-safety-policies/.




