This is a special post for quick takes by Marcel D.
The main point I make is that NIST may not be well suited to creating measurements for complex, multi-dimensional characteristics of language models, and that some people may be overestimating NIST's capabilities. They don't recognize how incomparable the Facial Recognition Vendor Test (FRVT) is to this situation of subjective metrics for GenAI, and they don't realize that NIST arguably even botched its handwriting datasets (the widely used MNIST was actually produced by Yann LeCun by recompiling NIST's datasets). Moreover, government is slow, while AI is fast. Instead, I argue we should consider an alternative model such as federal funding for private/academic benchmark development (e.g., prize competitions).
I wasn't sure if this warranted a full post, especially since it feels a bit late; LMK if you think otherwise!
Sure! (I just realized the point about the MNIST dataset problems wasn't fully explained in my shared memo, but I've fixed that now)
Per the assessment section, some of the problems with assuming that FRVT demonstrates NIST's capabilities for evaluation of LLMs/etc. include:
Facial recognition is a relatively "objective" test—i.e., the answers can be linked to some form of "definitive" answer or correctness metric (e.g., name/identity labels). In contrast, many of the potential metrics of interest with language models (e.g., persuasiveness, knowledge about dangerous capabilities) may not have a "definitive" evaluation method, where following X procedure reliably evaluates a response (and does so in a way that onlookers would look silly to dispute).
The government arguably had some comparative advantage in specific types of facial image data, due to collecting millions of these images with labels. The government doesn't have a comparative advantage in, e.g., text data.
The government has not at all kept pace with private/academic benchmarks for most other ML capabilities, such as non-face image recognition (e.g., Common Objects in Context) and LLMs (e.g., SuperGLUE).
It's honestly not even clear to me whether FRVT's technical quality truly is the "gold standard" in comparison to the other public training/test datasets for facial recognition (e.g., MegaFace); it seems plausible that the value of FRVT is largely just that people can't easily cheat on it (unlike datasets where the test set is publicly available) because of how the government administers it.
For the MNIST case, I now have the following in my memo:
Even NIST’s efforts with handwriting recognition were of debatable quality: Yann LeCun's widely-used MNIST is a modification of NIST's datasets, in part because NIST's approach used census bureau employees’ handwriting for the training set and high school students’ handwriting for the test set.[1]
Some may argue this assumption was justified at the time because it required that models could “generalize” beyond the training set. However, popular usage appears to have favored MNIST’s approach. Additionally, it is extremely unclear that one could effectively generalize from the handwriting of a narrow and potentially unrepresentative segment of society (professional bureaucrats) to high schoolers’, and the assumption that this would be necessary (e.g., due to the inability to get more representative data) seems unrealistic.
There are some major differences with the type of standards that NIST usually produces. Perhaps the most obvious is that a good AI model can teach itself to pass any standardised test. A typical standard is very precisely defined in order to be reproducible by different testers. But if you make such a clear standard test for an LLM, it would, say, be a series of standard prompts or tasks, which would be the same no matter who typed them in. But in such a case, the model just trains itself on how to answer these prompts, or follows the Volkswagen model of learning how to recognize that it's being evaluated, and to behave accordingly, which won't be hard if the testing questions are standard.
So the test tells you literally nothing useful about the model.
I don't think NIST (or anyone outside the AI community) has experience with the kind of evals that are needed for models, which will need to be designed specifically to be unlearnable. The standards will have to include things like red-teaming in which the model cannot know what specific tests it will be subjected to. But it's very difficult to write a precise description of such an evaluation which could be applied consistently.
In my view this is a major challenge for model evaluation. As a chemical engineer, I know exactly what it means to say that a machine has passed a particular standard test. And if I'm designing the equipment, I know exactly what standards it has to meet. It's not at all obvious how this would work for an LLM.
TL;DR: Someone should probably write a grant to produce a spreadsheet/dataset of past instances where people claimed a new technology would lead to societal catastrophe, with variables such as “multiple people working on the tech believed it was dangerous.”
Slightly longer TL;DR: Some AI risk skeptics are mocking people who believe AI could threaten humanity’s existence, saying that many people in the past predicted doom from some new tech. There is seemingly no dataset which lists and evaluates such past instances of “tech doomers.” It seems somewhat ridiculous* to me that nobody has grant-funded a researcher to put together a dataset with variables such as “multiple people working on the technology thought it could be very bad for society.”
*Low confidence: could totally change my mind
———
I have asked multiple people in the AI safety space if they were aware of any kind of "dataset for past predictions of doom (from new technology)", but have not encountered such a project. There have been some articles and arguments floating around recently such as "Tech Panics, Generative AI, and the Need for Regulatory Caution", in which skeptics say we shouldn't worry about AI x-risk because there are many past cases where people in society made overblown claims that some new technology (e.g., bicycles, electricity) would be disastrous for society.
While I think it's right to consider the "outside view" on these kinds of things, I think that most of these claims 1) ignore examples of where there were legitimate reasons to fear the technology (e.g., nuclear weapons, maybe synthetic biology?), and 2) imply the current worries about AI are about as baseless as claims like "electricity will destroy society," whereas I would argue that the claim "AI x-risk is >1%" stands up quite well against most current scrutiny.
(These claims also ignore the anthropic argument/survivor bias—that if they ever were right about doom we wouldn't be around to observe it—but this is less important.)
I especially would like to see a dataset that tracks things like "were the people warning of the risks also the people who were building the technology?" More generally, some measurement of "analytical rigor" also seems really important, e.g., "could the claims have stood up to an ounce of contemporary scrutiny (i.e., without the benefit of hindsight)?"
Absolutely seems worth spending up to $20K to hire researchers to produce such a spreadsheet within the next two-ish months… this could be a critical time period, where people are more receptive to new arguments/responses…?
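To make the proposal a bit more concrete, here is a minimal sketch of what one row of such a spreadsheet might look like. The column names are my own illustrative guesses based on the variables mentioned above, not an existing schema, and the example row is a deliberately empty placeholder rather than real data.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class TechDoomClaim:
    """One past claim that a new technology would lead to catastrophe.

    All field names are hypothetical; None means "not yet coded".
    """
    technology: str
    claim_summary: str
    approx_year: Optional[int] = None
    # Were the people warning of the risks also the people building the technology?
    warned_by_builders: Optional[bool] = None
    # Did multiple people working on the technology believe it was dangerous?
    multiple_insiders_concerned: Optional[bool] = None
    # Coder's judgment: could the claim have stood up to contemporary scrutiny,
    # i.e., without the benefit of hindsight?
    withstood_contemporary_scrutiny: Optional[bool] = None
    source: str = ""
    notes: str = ""


def to_csv(rows):
    """Serialize rows to CSV text so the data can be shared as a spreadsheet."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(TechDoomClaim)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buf.getvalue()


# Placeholder row only; real entries would need sourcing by the researcher.
placeholder = TechDoomClaim(
    technology="<technology name>",
    claim_summary="<what was predicted>",
)
csv_text = to_csv([placeholder])
print(csv_text)
```

Leaving the judgment columns as nullable booleans is just one option; a graded scale with a free-text justification might capture "analytical rigor" better.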
Just saw this now, after following a link to another comment.
You have almost given me an idea for a research project. I would run the research honestly and report the facts, but my initial guess is that survivor bias is a massive factor, contrary to what you say here, and that in most cases the people who believed it could lead to catastrophe were probably right to be concerned. A lot of people have the Y2K bug mentality: they didn't see any disaster and so concluded that it was all a false alarm, when in reality a lot of people did great work to prevent it.
If I look at the different x-risk scenarios the public is most aware of:
Nuclear annihilation - this is very real. As is nuclear winter.
Climate change. This is almost the poster-child for deniers, but in fact there is as yet no reason to believe that the doom-saying predictions are wrong. Everything is going more or less as the scientists predicted, if anything, it's worse. We have just underestimated the human capacity to stick our heads in the ground and ignore reality*.
Pandemic. Some people see covid as proof that pandemics are not that bad. But we know that, for all the harm it wrought, covid was far from the worst case: a bioweapon or a more severe natural pandemic could be much worse.
AI - the risks are very real. We may be lucky with how it evolves, but if we're not, it will be the machines who are around to write about what happened (and they will write that it wasn't that bad ...)
Etc.
My unique (for this group) perspective on this is that I've worked for years on industrial safety, and I know that there are factories out there which have operated for years without a serious safety incident or accident - and someone working in one of those could reach the conclusion that the risks were exaggerated, while being unaware of cases where entire factories or oil-rigs or nuclear power plants have exploded and caused terrible damage and loss of life.
Before I seriously start working on this (in the event that I find time), could you let me know if you've since discovered such a database?
*We humans are naturally very good at this, because we all know we're going to die, and we live our lives trying not to think about this fact or desperately trying to convince ourselves of the existence of some kind of afterlife.
Everything is going more or less as the scientists predicted, if anything, it's worse.
I'm not that focused on climate science, but my understanding is that this is a bit misleading in your context: some scientists (in the 90s/2000s?) forecasted doom or at least major disaster within a few decades due to feedback loops or other dynamics which never materialized. More broadly, my understanding is that forecasting climate has proven very difficult, even if some broad conclusions (e.g., "the climate is changing," "humans contribute to climate change") have held up. Additionally, it seems that many engineers/scientists underestimated the pace of alternative energy technology (e.g., solar).
That aside, I would be excited to see someone work on this project, and I still have not discovered any such database.
I'm not sure. IMHO a major disaster is happening with the climate. Essentially, people have a false belief that there is some kind of set-point, and that after a while the temperature will return to that, but this isn't the case. Venus is an extreme example of an Earth-like planet with a very different climate. There is nothing in physics or chemistry that says Earth's temperature could not one day exceed 100 C.
It's always interesting to ask people how high they think sea-level might rise if all the ice melted. This is an uncontroversial calculation which involves no modelling - just looking at how much ice there is, and how much sea-surface area there is. People tend to think it would be maybe a couple of metres. It would actually be 60 m (200 feet). That will take time, but very little time on a cosmic scale, maybe a couple of thousand years.
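For anyone who wants to try the back-of-envelope version, here is a rough sketch. The ice volumes, ocean area, and densities are round-number assumptions of mine, not figures from this thread, and the method deliberately ignores everything except "volume of meltwater divided by ocean area."

```python
# Round-number inputs (assumptions for illustration only).
ANTARCTIC_ICE_KM3 = 26.5e6   # approximate Antarctic ice sheet volume
GREENLAND_ICE_KM3 = 2.9e6    # approximate Greenland ice sheet volume
OCEAN_AREA_KM2 = 361e6       # approximate present-day ocean surface area
ICE_DENSITY = 917.0          # kg/m^3
WATER_DENSITY = 1000.0       # kg/m^3 (fresh meltwater)


def naive_sea_level_rise_m(ice_volume_km3, ocean_area_km2=OCEAN_AREA_KM2):
    """Spread the meltwater evenly over today's ocean area; return metres."""
    meltwater_km3 = ice_volume_km3 * ICE_DENSITY / WATER_DENSITY
    return meltwater_km3 / ocean_area_km2 * 1000.0  # km -> m


def metres_to_feet(metres):
    return metres / 0.3048


total_rise_m = naive_sea_level_rise_m(ANTARCTIC_ICE_KM3 + GREENLAND_ICE_KM3)
print(f"naive rise: {total_rise_m:.0f} m ({metres_to_feet(total_rise_m):.0f} ft)")
print(f"60 m is about {metres_to_feet(60):.0f} ft")
```

With these inputs the naive figure comes out somewhat above 60 m. That is expected: the sketch ignores that some ice is grounded below sea level and already displaces seawater, and that the ocean's area grows as coastlines flood. Both effects pull the result down towards the commonly quoted range.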
Right now, if anything what we're seeing is worse than the average prediction. The glaciers and ice sheets are melting faster. The temperature is increasing faster. Etc. Feedback loops are starting to be powerful. There's a real chance that the Gulf Stream will stop or reverse, which would be a disaster for Europe, ironically freezing us as a result of global warming ...
Among serious climate scientists, the feeling of doom is palpable. I wouldn't say they are exaggerating. But we, as a global society, have decided that we'd rather have our oil and gas and steaks than prevent the climate disaster. The US seems likely to elect a president who makes it a point of honour to support climate-damaging technologies, just to piss off the scientists and liberals.
Venus is an extreme example of an Earth-like planet with a very different climate. There is nothing in physics or chemistry that says Earth's temperature could not one day exceed 100 C. [...] [Regarding ice melting -- ] That will take time, but very little time on a cosmic scale, maybe a couple of thousand years.
I'll be blunt: remarks like these undermine your credibility. But regardless, I just don't have any experience or contributions to make on climate change, other than re-emphasizing my general impression that, as a person who cares a lot about existential risk and has talked to various other people who also care a lot about existential risk, there seems to be very strong scientific evidence suggesting that extinction is unlikely.
But as a scientist, I feel it's valuable to speak the truth sometimes, to put my personal credibility on the line in service of the greater good. Venus is an Earth-sized planet which is 400C warmer than Earth, and only a tiny fraction of this is due to it being closer to the sun. The majority is about how much of the sun's heat it retains vs. radiates back to space. It is an extreme case of global warming. I'm not saying that Earth can be like Venus anytime soon, I'm saying that we have the illusion that Earth has a natural, "stable" temperature, and while it might vary, eventually we'll return to that temperature. But there is absolutely no scientific or empirical evidence for this.
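A rough way to check the claim that distance from the sun accounts for little of the gap is the standard radiative-equilibrium estimate, T = [S(1 - A) / (4σ)]^(1/4), which gives the temperature a planet would have with no heat-trapping atmosphere. The solar flux and albedo values below are rounded assumptions of mine, so treat the outputs as ballpark figures only.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def equilibrium_temp_k(solar_flux_w_m2, bond_albedo):
    """Temperature at which absorbed sunlight balances emitted thermal radiation.

    Ignores any greenhouse effect: the planet is treated as an airless sphere.
    """
    absorbed_per_m2 = solar_flux_w_m2 * (1.0 - bond_albedo) / 4.0
    return (absorbed_per_m2 / SIGMA) ** 0.25


# Rounded, assumed inputs: sunlight at each orbit and fraction reflected.
earth_k = equilibrium_temp_k(solar_flux_w_m2=1361.0, bond_albedo=0.30)
venus_k = equilibrium_temp_k(solar_flux_w_m2=2601.0, bond_albedo=0.76)

print(f"Earth without greenhouse effect: about {earth_k:.0f} K")
print(f"Venus without greenhouse effect: about {venus_k:.0f} K")
```

With these inputs, Venus's airless-body temperature comes out slightly below Earth's, because its clouds reflect most of the extra sunlight it receives. That suggests the hot surface is driven by how much outgoing heat the atmosphere retains, not by the planet's position.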
Earth's temperature is like a ball balanced in a shallow groove on the top of a steep hill. We've never experienced anything outside the narrow groove, so we imagine that it is impossible. But we've also never dramatically changed the atmosphere the way we're doing now. There is, like I said, no fundamental reason why global-warming could not go totally out of control, way beyond 1.5C or 3C or even 20C.
I have struggled to explain this concept, even to very educated, open-minded people who fundamentally agree with my concerns about climate change. So I don't expect many people to believe me. But intellectually, I want to be honest.
I think it is valuable to keep trying to explain this, even knowing the low probability of success, because right now, statements like "1.5C temperature increase" are just not having the impact of changing people's habits. And if we do cross a tipping point, it will be too late to start realising this.
I spent way too much time organizing my thoughts on AI loss-of-control ("x-risk") debates without any feedback today, so I'm publishing perhaps one of my favorite snippets/threads:
A lot of debates seem to boil down to under-acknowledged and poorly-framed disagreements about questions like “who bears the burden of proof.” For example, some skeptics say “extraordinary claims require extraordinary evidence” when dismissing claims that the risk is merely “above 1%”, whereas safetyists argue that having >99% confidence that things won’t go wrong is the “extraordinary claim that requires extraordinary evidence.”
I think that talking about “burdens” might be unproductive. Instead, it may be better to frame the question more like “what should we assume by default, in the absence of definitive ‘evidence’ or arguments, and why?” “Burden” language is super fuzzy (and seems a bit morally charged), whereas this framing at least forces people to acknowledge that some default assumptions are being made and consider why.
To address that framing, I think it’s better to ask/answer questions like “What reference class does ‘building AGI’ belong to, and what are the base rates of danger for that reference class?” This framing at least pushes people to make explicit claims about what reference class building AGI belongs to, which should make it clearer that it doesn’t belong in your “all technologies ever” reference class.
In my view, the "default" estimate should not be “roughly zero until proven otherwise,” especially given that there isn’t consensus among experts and the overarching narrative of “intelligence proved really powerful in humans, misalignment even among humans is quite common (and is already often observed in existing models), and we often don’t get technologies right on the first few tries.”
The following is an assignment I submitted for my Cyber Operations class at Georgetown, regarding the risk of large AI model theft and what the US Cybersecurity and Infrastructure Security Agency (CISA) should/could do about it. Further caveats and clarifications in footnote.[1] (Apologies for formatting issues)
-------------
Memorandum for the Cybersecurity and Infrastructure Security Agency (CISA)
SUBJECT: Supporting Security of Large AI Models Against Theft
Recent years have seen a rapid increase in the capabilities of artificial intelligence (AI) models such as GPT-4. However, as these large models become more capable and more expensive to train, they become increasingly attractive targets for theft and could pose greater security risks to critical infrastructure (CI), in part by enhancing malicious actors’ cyber capabilities. Rather than strictly focusing on the downstream effects of powerful AI models, CISA should also work to reduce the likelihood (or rapidity) of large AI model theft. This memo will explain some of the threats to and from powerful AI models, briefly describe relevant market failures, and conclude with recommendations for CISA to mitigate the risk of AI model theft.
There are Strong Incentives and Historical Precedent for China and Other Actors to Steal AI Models
There are multiple reasons to expect that hackers will attempt to exfiltrate large AI model files:
Current large models have high up-front development (“training”) costs/requirements but comparatively low operational costs/requirements after training.[1] This makes theft of AI models attractive even for non-state actors and distinct from many instances of source code theft.[2] Additionally, recent export controls on semiconductors to China could undermine China’s ability to develop future large models,[3] which would further increase Beijing’s incentive to steal trained models.
China and other actors have repeatedly stolen sensitive data and intellectual property (IP) in the past.[4]
Someone leaked Meta’s new large language model (LLaMA) within days of Meta providing model access to researchers.[5]
AI Model Theft/Proliferation May Threaten Critical Infrastructure—and Far More
Theft of powerful AI models—or the threat thereof—could have significant negative consequences beyond straightforward economic losses:
Many powerful AI models could be abused:
Content generation models could enhance disinformation and spear phishing campaigns.[6]
Image recognition models could empower semi-autonomous weapons or authoritarian surveillance.[7]
Simulation models could facilitate the design of novel pathogens.[8]
Agent models could soon (if not already) semi-autonomously conduct effective offensive cyber campaigns.[9]
The mere threat of theft/leaks could discourage improvements in model reliability and interpretability that require providing more access to powerful models.[10] This is especially relevant to CISA as consumers and industries (including CI) increasingly rely on AI.[11]
If China steals powerful models that enhance its ability to conduct AI research, this could increase catastrophic “structural” risks.[12] For example, future AI systems may disrupt strategic stability by undermining confidence in nuclear second-strike capabilities.[13] Furthermore, intensified racing dynamics could drive China and the US to deprioritize safety measures, increasing the risk of permanent human disempowerment or even extinction by creating powerful/self-improving but misaligned systems.[14] In reality, such powerful systems may be attainable over the next two decades.[15]
Ultimately, the capabilities of future systems are hard to predict, but the consequences of model proliferation could be severe.
Traditional Market Incentives Will Likely Fail to Minimize These Risks
Many companies will have some incentives to protect their models. However, there are market failures and other reasons to expect that their efforts will be suboptimal:
The risks described in the previous section are primarily externalities, and companies that do not appropriately guard against these risks may out-compete companies that do.
Unauthorized use of models may be limited to foreign jurisdictions where the companies did not expect to have substantial market access (e.g., China). Thus, IP theft may not have a significant impact on a company’s profitability.
Market dynamics could disincentivize some prosocial actions such as cybersecurity incident disclosures.[16]
Recommendations for CISA: Assess Risks While Developing Expertise, Partnerships, and Mitigation Measures
Thus far, there has been minimal public research on how policymakers should mitigate the risks of model theft. CISA has an important role to play at this early stage of the policy pipeline, especially to facilitate information flows while spurring and supporting other actors (e.g., Congress) who have more resources or authority to address the upstream problems. The following subsections provide four main categories of recommendations.
Conduct Focused Risk Assessments, Monitor Trends, and Disseminate Findings
As part of CISA’s objective to “continually identify nascent or emerging risks before they pose threats to our infrastructure,”[17] CISA should assess the risk of malicious actors using current or near-future large AI models to conduct cyberattacks against CI.[18]
CISA should work with law enforcement to populate datasets[19] and/or case study compilations of cybersecurity incidents regarding sabotage or theft of large models, as well as cyberattacks utilizing large models.[20]
CISA should disseminate the findings of these assessments among relevant policymakers and stakeholders (as appropriate), to inform policymaking and countermeasure development. This is especially relevant for policy regarding the National AI Research Resource (NAIRR).[21]
This analysis—even in the very preliminary stages—should also inform CISA’s implementation of the remaining recommendations.
Build CISA Subject Matter Expertise and Analytical Capacity
To improve its assessments and preparations, CISA could employ Protective Security Advisors (PSAs) with experience relevant to AI model security and/or the AI supply chain, including cloud providers, semiconductors, and other CI sectors.[22] Alternatively, CISA could create a dedicated working group or task force to deal with these issues.[23] Depending on the findings of CISA’s risk assessments, CISA could also seek additional funding from Congress. The following category of recommendations could also help improve CISA’s knowledge and analytical capacity.
Develop Partnerships with AI Labs and Facilitate Information Sharing
Given that partnerships are the “foundation and the lifeblood”[24] of CISA’s efforts, it should invite AI labs into cyber information sharing and security tool ecosystems.[25] Specifically, CISA should ensure that large AI labs are aware of relevant programs, and if they are not participants, determine why and whether CISA can/should modify its policies to allow participation.[26][EA Forum Note: this footnote contains a potentially interesting/impactful suggestion about designating some AI tools/labs as "IT critical infrastructure." I could not fully explore this recommendation in my memo due to space and time constraints, but it could be the most important takeaway/suggestion from this memo.]
Help Develop and Implement Security Standards and Methods
CISA should work with the National Institute of Standards and Technology (NIST) and other federal research and development agencies[27] to develop cybersecurity methods and standards (e.g., hardware-based data flow limiters) that CISA and other agencies could mandate for federal agencies that produce/house large AI models (including the potential NAIRR[28]).
References
Abdallat, A. J. 2022. “Can We Trust Critical Infrastructure to Artificial Intelligence?” Forbes. July 1, 2022. https://www.forbes.com/sites/forbestechcouncil/2022/07/01/can-we-trust-critical-infrastructure-to-artificial-intelligence/?sh=3e21942e1a7b.
“About ISACs.” n.d. National Council of ISACs. Accessed April 15, 2023. https://www.nationalisacs.org/about-isacs.
Adi, Erwin, Zubair Baig, and Sherali Zeadally. 2022. “Artificial Intelligence for Cybersecurity: Offensive Tactics, Mitigation Techniques and Future Directions.” Applied Cybersecurity & Internet Governance Journal. November 4, 2022. https://acigjournal.com/resources/html/article/details?id=232841&language=en.
Allen, Gregory, Emily Benson, and William Reinsch. 2022. “Improved Export Controls Enforcement Technology Needed for U.S. National Security.” Center for Strategic and International Studies. November 30, 2022. https://www.csis.org/analysis/improved-export-controls-enforcement-technology-needed-us-national-security.
Brooks, Chuck. 2023. “Cybersecurity Trends & Statistics for 2023: More Treachery and Risk Ahead as Attack Surface and Hacker Capabilities Grow.” Forbes. March 5, 2023. https://www.forbes.com/sites/chuckbrooks/2023/03/05/cybersecurity-trends--statistics-for-2023-more-treachery-and-risk-ahead-as-attack-surface-and-hacker-capabilities-grow/?sh=2c6fcebf19db.
Calma, Justine. 2022. “AI Suggested 40,000 New Possible Chemical Weapons in Just Six Hours.” The Verge. March 17, 2022. https://www.theverge.com/2022/3/17/22983197/ai-new-possible-chemical-weapons-generative-models-vx.
“CISA Strategic Plan 2023–2025.” 2022. CISA. September 2022. https://www.cisa.gov/sites/default/files/2023-01/StrategicPlan_20220912-V2_508c.pdf.
Cottier, Ben. 2022. “The Replication and Emulation of GPT-3.” Rethink Priorities. December 21, 2022. https://rethinkpriorities.org/publications/the-replication-and-emulation-of-gpt-3.
———. 2023. “Trends in the Dollar Training Cost of Machine Learning Systems.” Epoch. January 31, 2023. https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems.
Cox, Joseph. 2023. “How I Broke into a Bank Account with an AI-Generated Voice.” Vice. February 23, 2023. https://www.vice.com/en/article/dy7axa/how-i-broke-into-a-bank-account-with-an-ai-generated-voice.
Dickson, Ben. 2020. “The GPT-3 Economy.” TechTalks. September 21, 2020. https://bdtechtalks.com/2020/09/21/gpt-3-economy-business-model/.
Feldstein, Steven. 2019. “The Global Expansion of AI Surveillance.” Carnegie Endowment for International Peace. September 17, 2019. https://carnegieendowment.org/2019/09/17/global-expansion-of-ai-surveillance-pub-79847.
Geist, Edward. 2018. “By 2040, Artificial Intelligence Could Upend Nuclear Stability.” RAND Corporation. April 24, 2018. https://www.rand.org/news/press/2018/04/24.html.
Grace, Katja. 2023. “How Bad a Future Do ML Researchers Expect?” AI Impacts. March 8, 2023. https://aiimpacts.org/how-bad-a-future-do-ml-researchers-expect/.
“Guaranteeing AI Robustness against Deception (GARD).” n.d. DARPA. Accessed March 11, 2023. https://www.darpa.mil/program/guaranteeing-ai-robustness-against-deception.
Hill, Michael. 2023. “NATO Tests AI’s Ability to Protect Critical Infrastructure against Cyberattacks.” CSO Online. January 5, 2023. https://www.csoonline.com/article/3684730/nato-tests-ai-s-ability-to-protect-critical-infrastructure-against-cyberattacks.html.
Humphreys, Brian. 2021. “Critical Infrastructure Policy: Information Sharing and Disclosure Requirements after the Colonial Pipeline Attack.” Congressional Research Service. May 24, 2021. https://crsreports.congress.gov/product/pdf/IN/IN11683.
Kahn, Jeremy. 2023. “Silicon Valley Is Buzzing about ‘BabyAGI.’ Should We Be Worried?” Fortune. April 15, 2023. https://fortune.com/2023/04/15/babyagi-autogpt-openai-gpt-4-autonomous-assistant-agi/.
Laplante, Phil, and Ben Amaba. 2021. “CSDL | IEEE Computer Society.” Computer. October 2021. https://www.computer.org/csdl/magazine/co/2021/10/09548022/1x9TFbzhvTG.
Laplante, Phil, Dejan Milojicic, Sergey Serebryakov, and Daniel Bennett. 2020. “Artificial Intelligence and Critical Systems: From Hype to Reality.” Computer 53 (11): 45–52. https://doi.org/10.1109/mc.2020.3006177.
Lawfare. 2023. “Cybersecurity and AI.” YouTube. April 3, 2023. https://www.youtube.com/watch?v=vyyiSCJVAHs&t=964s.
Longpre, Shayne, Marcus Storm, and Rishi Shah. 2022. “Lethal Autonomous Weapons Systems & Artificial Intelligence: Trends, Challenges, and Policies.” Edited by Kevin McDermott. MIT Science Policy Review 3 (August): 47–56. https://doi.org/10.38105/spr.360apm5typ.
MITRE. n.d. “MITRE ATT&CK.” MITRE. Accessed April 15, 2023. https://attack.mitre.org/.
Murphy, Mike. 2022. “What Are Foundation Models?” IBM Research Blog. May 9, 2022. https://research.ibm.com/blog/what-are-foundation-models.
Nakashima, Ellen. 2015. “Chinese Breach Data of 4 Million Federal Workers.” The Washington Post, June 4, 2015. https://www.washingtonpost.com/world/national-security/chinese-hackers-breach-federal-governments-personnel-office/2015/06/04/889c0e52-0af7-11e5-95fd-d580f1c5d44e_story.html.
National Artificial Intelligence Research Resource Task Force. 2023. “Strengthening and Democratizing the U.S. Artificial Intelligence Innovation Ecosystem: An Implementation Plan for a National Artificial Intelligence Research Resource.” https://www.ai.gov/wp-content/uploads/2023/01/NAIRR-TF-Final-Report-2023.pdf.
“Not My Problem.” 2014. The Economist. July 10, 2014. https://www.economist.com/special-report/2014/07/10/not-my-problem.
“Partnerships and Collaboration.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/topics/partnerships-and-collaboration.
Rasser, Martijn, and Kevin Wolf. 2022. “The Right Time for Chip Export Controls.” Lawfare. December 13, 2022. https://www.lawfareblog.com/right-time-chip-export-controls.
Roser, Max. 2023. “AI Timelines: What Do Experts in Artificial Intelligence Expect for the Future?” Our World in Data. February 7, 2023. https://ourworldindata.org/ai-timelines.
Sganga, Nicole. 2022. “Chinese Hackers Took Trillions in Intellectual Property from about 30 Multinational Companies.” CBS News. May 4, 2022. https://www.cbsnews.com/news/chinese-hackers-took-trillions-in-intellectual-property-from-about-30-multinational-companies/.
Stein-Perlman, Zach, Benjamin Weinstein-Raun, and Katja Grace. 2022. “2022 Expert Survey on Progress in AI.” AI Impacts. August 3, 2022. https://aiimpacts.org/2022-expert-survey-on-progress-in-ai/.
“TrojAI: Trojans in Artificial Intelligence.” n.d. IARPA. Accessed March 11, 2023. https://www.iarpa.gov/research-programs/trojai.
Vincent, James. 2023. “Meta’s Powerful AI Language Model Has Leaked Online — What Happens Now?” The Verge. March 8, 2023. https://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misuse.
Zwetsloot, Remco, and Allan Dafoe. 2019. “Thinking about Risks from AI: Accidents, Misuse and Structure.” Lawfare. February 11, 2019. https://www.lawfareblog.com/thinking-about-risks-ai-accidents-misuse-and-structure.
[1] For example, a single successful training run of GPT-3 reportedly required dozens of terabytes of data and cost millions of dollars of GPU usage, but the trained model is a file smaller than a terabyte in size and actors can operate it on cloud services that cost under $40 per hour. Sources: Cottier, Ben. 2022. “The Replication and Emulation of GPT-3.” Rethink Priorities. December 21, 2022. https://rethinkpriorities.org/publications/the-replication-and-emulation-of-gpt-3; and Dickson, Ben. 2020. “The GPT-3 Economy.” TechTalks. September 21, 2020. https://bdtechtalks.com/2020/09/21/gpt-3-economy-business-model/.
Additionally, one report suggested that by 2030, state-of-the-art models may cost hundreds of millions or even more than $1B to train (although the report highlights that these estimates could significantly change). Source: Cottier, Ben. 2023. “Trends in the Dollar Training Cost of Machine Learning Systems.” Epoch. January 31, 2023. https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems.
[10] The example of Meta’s LLaMA, mentioned earlier, provides both some support and rebuttal for this concern: Meta has insisted it plans to continue sharing access despite the leaks, but there are good reasons to think this event will discourage other companies from implementing similar access rules. Source: Vincent, “Meta’s Powerful AI Language Model Has Leaked Online.”
[13] “Some observers have posited that autonomous systems like Sea Hunter may render the underwater domain transparent, thereby eroding the second-strike deterrence utility of stealthy SSBNs. [...] However, irrespective of the veracity of this emerging capability, the mere perception that nuclear capabilities face new strategic challenges would nonetheless elicit distrust between nuclear-armed adversaries—particularly where strategic force asymmetries exist.” Source: Johnson, James. 2020. “Artificial Intelligence: A Threat to Strategic Stability.” Strategic Studies Quarterly. https://www.airuniversity.af.edu/Portals/10/SSQ/documents/Volume-14_Issue-1/Johnson.pdf.
[14] Although this claim may be jarring for people who are not familiar with the progress in AI over the past decade or with the AI safety literature, the threat of extinction (or functionally equivalent outcomes) as a result of developing a very powerful system is non-trivial. Notably, in one 2022 survey of machine learning researchers, nearly half (48%) of the respondents believed there is at least a 10% chance that AI would lead to “extremely bad” outcomes (e.g., human extinction). Source: Grace, Katja. 2023. “How Bad a Future Do ML Researchers Expect?” AI Impacts. March 8, 2023. https://aiimpacts.org/how-bad-a-future-do-ml-researchers-expect/.
[15] Surveys of machine learning researchers provide a mixed range of forecasts, but in the aforementioned 2022 survey (notably prior to the public release of ChatGPT), >75% of respondents said there was at least a 10% chance that humanity would develop “human-level AI” (roughly defined as a system that is better than humans at all or nearly all meaningful cognitive tasks) in the next 20 years. Additionally, >35% of the respondents said there was at least a 50% chance of this outcome. Notably, however, some types of highly autonomous cyber systems may not even require “human-level AI.” For data and further discussion regarding these forecasts, see Roser, Max. 2023. “AI Timelines: What Do Experts in Artificial Intelligence Expect for the Future?” Our World in Data. February 7, 2023. https://ourworldindata.org/ai-timelines. For the original survey, see: Stein-Perlman, Zach, Benjamin Weinstein-Raun, and Katja Grace. 2022. “2022 Expert Survey on Progress in AI.” AI Impacts. August 3, 2022. https://aiimpacts.org/2022-expert-survey-on-progress-in-ai/.
[18] As part of this, CISA should work with other agencies such as the Office of Science and Technology Policy (OSTP), National Security Agency (NSA), and the broader Department of Homeland Security (DHS) to forecast future AI models’ capabilities and proliferation.
[19] This could potentially build on or be modeled after datasets such as MITRE’s ATT&CK. See: MITRE. n.d. “MITRE ATT&CK.” MITRE. Accessed April 15, 2023. https://attack.mitre.org/.
[20] This should apply to large models regardless of whether they were stolen, developed for malicious purposes, etc. The overall dataset or case study compilation should probably cover more than just critical infrastructure targets, but CISA could just be a primary contributor for incidents involving critical infrastructure.
[21] National Artificial Intelligence Research Resource Task Force. 2023. “Strengthening and Democratizing the U.S. Artificial Intelligence Innovation Ecosystem: An Implementation Plan for a National Artificial Intelligence Research Resource.” https://www.ai.gov/wp-content/uploads/2023/01/NAIRR-TF-Final-Report-2023.pdf. Note that some mock legislation briefly specifies CISA (not by acronym) on page J-12.
[25] CISA has stated in its strategy “We will use our full suite of convening authorities and relationship management capabilities to expand and mature partnerships with stakeholders and facilitate information sharing.” Source: “CISA Strategic Plan 2023–2025.” Some of the relevant CISA programs include Automated Indicator Sharing (AIS), Enhanced Cybersecurity Services (ECS), and possibly even the Joint Cyber Defense Collaborative (JCDC).
[26] Perhaps one of the more drastic possible options here is to categorize labs producing so-called “foundation models” (e.g., GPT-4) as part of the information technology critical infrastructure sector. It is unclear from an outsider perspective how legally feasible or politically desirable this categorization would be, but as GPT-4 and related models increasingly become the basis for other software applications this designation should become more logical and/or acceptable. For more information about foundation models, see: Murphy, Mike. 2022. “What Are Foundation Models?” IBM Research Blog. May 9, 2022. https://research.ibm.com/blog/what-are-foundation-models. For information about the IT critical infrastructure sector designation, see: “Information Technology Sector.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/topics/critical-infrastructure-security-and-resilience/critical-infrastructure-sectors/information-technology-sector.
[27] This particularly includes the Defense Advanced Research Projects Agency (DARPA) and the Intelligence Advanced Research Projects Activity (IARPA), both of which are already working on some projects related to the integrity and reliability of AI models, including GARD at DARPA and TrojAI at IARPA. Sources: “Guaranteeing AI Robustness against Deception (GARD).” n.d. DARPA. Accessed March 11, 2023. https://www.darpa.mil/program/guaranteeing-ai-robustness-against-deception; and “TrojAI: Trojans in Artificial Intelligence.” n.d. IARPA. Accessed March 11, 2023. https://www.iarpa.gov/research-programs/trojai.
[28] National Artificial Intelligence Research Resource Task Force, “Strengthening and Democratizing the U.S….”
Ultimately I was fairly rushed with this memo and realized less than halfway through that perhaps I shouldn't have chosen CISA as my client, but it was too late to change. I don't confidently endorse all of the claims and recommendations in this memo (especially given my lack of familiarity with the field, tight length constraints, and lack of time to do as much research as I wanted), but I'm sharing it to potentially help others who might be interested.
(Summary: A debate league's yearlong policy debate resolution is about AI; does this seem like a good outreach opportunity?)
"Resolved: The United States Federal Government should substantially reform the use of Artificial Intelligence technology."
IMO, it's not the best of wording, but that's the current team policy debate resolution in the Stoa debate league. For the next ~9 months, a few hundred high school students will be researching and debating over "the use of artificial intelligence technology." In the past, people have posted about competitive debating and its potential relationship with EA; does this at all seem like an opportunity for outreach? (To be fair, Stoa is smaller than traditional public school leagues, but the policy debate norms are way better/less toxic, making team policy one of the most popular events in Stoa)
The following is a midterm assignment I submitted for my Cyber Operations class at Georgetown, regarding the risk of large AI model theft. I figured I would just publish this since it's fairly relevant to recent discussions and events around AI model theft. (I also am posting this so I have a non-Google-Doc link to share with people)
Note: In this assignment I had a 500-word limit and was only tasked to describe a problem's relevance to my client/audience while briefly mentioning policy options. In an upcoming memo assignment I will need to actually go into more detail on the policy recommendations (and I'd be happy to receive suggestions for what CISA should do if you have any).
(I also acknowledge that the recommendations I lay out here are a bit milquetoast, but I genuinely just didn't know what else to say...)
-------------
Memorandum for the Cybersecurity and Infrastructure Security Agency (CISA)
SUBJECT: Securing Large AI Models Against Theft
Large artificial intelligence (AI) models such as ChatGPT have increasingly demonstrated AI’s potential. However, as proprietary models become more powerful it is increasingly important to protect them against theft. CISA should work to facilitate information sharing that supports public policy and private responses. The following four sections will discuss some of the threat motivations/trends, potential consequences, market failures, and policy recommendations for CISA.
Motivations and Relevant Trends Regarding AI Model Theft
There are many reasons to expect that hackers will attempt to exfiltrate proprietary AI models:
China and other actors have repeatedly stolen sensitive data and intellectual property (IP).[1]
Future models may prove to have such significant economic or military value that state actors are willing to expend substantial effort/assets to steal them.
Current large models have high up-front development (“training”) costs/requirements but comparatively low operational costs/requirements after training.[2] This makes theft of models attractive even for non-state actors. Additionally, recent export controls on semiconductors to China could undermine China’s ability to train future large models,[3] which would further increase Beijing’s incentive to support model theft.
Someone reportedly leaked Meta’s new large language model (LLaMA) within days of Meta providing model access to researchers.[4]
Potential Consequences of AI Model Theft
Theft of powerful AI models—or the threat thereof—could have significant negative consequences beyond straightforward economic losses:
Many powerful AI models could be abused:
Content generation models could enhance disinformation and spear phishing campaigns.[5]
Image recognition models could empower semi-autonomous weapons or authoritarian surveillance.[6]
Simulation models could facilitate the design of novel pathogens.[7]
The mere threat of theft/leaks may discourage efforts to improve AI safety and interpretability that involve providing more access to powerful models.[8]
Enhanced Chinese AI research could intensify AI racing dynamics that prove catastrophic if “very powerful systems”[9] are attainable over the next 15 years.[10]
Why Traditional Market Incentives May Fail to Mitigate These Risks
Many companies will have some incentives to protect their models, but there are some reasons to expect their efforts will be suboptimal relative to the risks:
The risks described in the previous section are largely externalities, and companies that do not appropriately guard against these risks may out-compete companies that do.
Unauthorized use of models may be limited to foreign jurisdictions where the companies did not expect to make substantial profits (e.g., an off-limits Chinese economy).
Market dynamics may disincentivize some prosocial actions such as cybersecurity incident disclosures.[11]
Suggestions for CISA
CISA should explore some options to inform and facilitate public policy and private responses to these threats:
Map relevant actors and stakeholders.
Evaluate and/or propose platforms and frameworks for information sharing.
Assess the presence and impact of market failures.
Collect research relevant to actions that other actors could take (e.g., programs at DARPA/IARPA,[12] mandatory incident disclosure legislation).
Begin drafting a report which incorporates the previous suggestions and elicits input from relevant actors.
-------------
References
Allen, Gregory, Emily Benson, and William Reinsch. 2022. “Improved Export Controls Enforcement Technology Needed for U.S. National Security.” Center for Strategic and International Studies. November 30, 2022. https://www.csis.org/analysis/improved-export-controls-enforcement-technology-needed-us-national-security.
Brooks, Chuck. 2023. “Cybersecurity Trends & Statistics for 2023: More Treachery and Risk Ahead as Attack Surface and Hacker Capabilities Grow.” Forbes. March 5, 2023. https://www.forbes.com/sites/chuckbrooks/2023/03/05/cybersecurity-trends--statistics-for-2023-more-treachery-and-risk-ahead-as-attack-surface-and-hacker-capabilities-grow/?sh=2c6fcebf19db.
Calma, Justine. 2022. “AI Suggested 40,000 New Possible Chemical Weapons in Just Six Hours.” The Verge. March 17, 2022. https://www.theverge.com/2022/3/17/22983197/ai-new-possible-chemical-weapons-generative-models-vx.
Cottier, Ben. 2022. “The Replication and Emulation of GPT-3.” Rethink Priorities. December 21, 2022. https://rethinkpriorities.org/publications/the-replication-and-emulation-of-gpt-3.
———. 2023. “Trends in the Dollar Training Cost of Machine Learning Systems.” Epoch. January 31, 2023. https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems.
Cox, Joseph. 2023. “How I Broke into a Bank Account with an AI-Generated Voice.” Vice. February 23, 2023. https://www.vice.com/en/article/dy7axa/how-i-broke-into-a-bank-account-with-an-ai-generated-voice.
Dickson, Ben. 2020. “The GPT-3 Economy.” TechTalks. September 21, 2020. https://bdtechtalks.com/2020/09/21/gpt-3-economy-business-model/.
Feldstein, Steven. 2019. “The Global Expansion of AI Surveillance.” Carnegie Endowment for International Peace. September 17, 2019. https://carnegieendowment.org/2019/09/17/global-expansion-of-ai-surveillance-pub-79847.
“Guaranteeing AI Robustness against Deception (GARD).” n.d. DARPA. Accessed March 11, 2023. https://www.darpa.mil/program/guaranteeing-ai-robustness-against-deception.
Humphreys, Brian. 2021. “Critical Infrastructure Policy: Information Sharing and Disclosure Requirements after the Colonial Pipeline Attack.” Congressional Research Service. May 24, 2021. https://crsreports.congress.gov/product/pdf/IN/IN11683.
Longpre, Shayne, Marcus Storm, and Rishi Shah. 2022. “Lethal Autonomous Weapons Systems & Artificial Intelligence: Trends, Challenges, and Policies.” Edited by Kevin McDermott. MIT Science Policy Review 3 (August): 47–56. https://doi.org/10.38105/spr.360apm5typ.
Nakashima, Ellen. 2015. “Chinese Breach Data of 4 Million Federal Workers.” The Washington Post, June 4, 2015. https://www.washingtonpost.com/world/national-security/chinese-hackers-breach-federal-governments-personnel-office/2015/06/04/889c0e52-0af7-11e5-95fd-d580f1c5d44e_story.html.
“Not My Problem.” 2014. The Economist. July 10, 2014. https://www.economist.com/special-report/2014/07/10/not-my-problem.
Rasser, Martijn, and Kevin Wolf. 2022. “The Right Time for Chip Export Controls.” Lawfare. December 13, 2022. https://www.lawfareblog.com/right-time-chip-export-controls.
Roser, Max. 2023. “AI Timelines: What Do Experts in Artificial Intelligence Expect for the Future?” Our World in Data. February 7, 2023. https://ourworldindata.org/ai-timelines.
Sganga, Nicole. 2022. “Chinese Hackers Took Trillions in Intellectual Property from about 30 Multinational Companies.” CBS News. May 4, 2022. https://www.cbsnews.com/news/chinese-hackers-took-trillions-in-intellectual-property-from-about-30-multinational-companies/.
“TrojAI: Trojans in Artificial Intelligence.” n.d. IARPA. Accessed March 11, 2023. https://www.iarpa.gov/research-programs/trojai.
Vincent, James. 2023. “Meta’s Powerful AI Language Model Has Leaked Online — What Happens Now?” The Verge. March 8, 2023. https://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misuse.
[2] For example, a single successful training run of GPT-3 reportedly required dozens of terabytes of data and cost millions of dollars of GPU usage, but the trained model is a file smaller than a terabyte in size and actors can operate it on cloud services that cost under $40 per hour. Sources: Cottier, Ben. 2022. “The Replication and Emulation of GPT-3.” Rethink Priorities. December 21, 2022. https://rethinkpriorities.org/publications/the-replication-and-emulation-of-gpt-3; and Dickson, Ben. 2020. “The GPT-3 Economy.” TechTalks. September 21, 2020. https://bdtechtalks.com/2020/09/21/gpt-3-economy-business-model/.
Additionally, one report suggested that by 2030, state-of-the-art models may cost hundreds of millions or even >$1B to train (although the report highlights that these estimates could significantly change). Source: Cottier, Ben. 2023. “Trends in the Dollar Training Cost of Machine Learning Systems.” Epoch. January 31, 2023. https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems.
[8] The example of Meta’s LLaMA, mentioned earlier, provides both some support and rebuttal for this concern: Meta has insisted it plans to continue sharing access despite the leaks, but there are good reasons to think this event will discourage other companies from implementing similar access rules. Source: Vincent, “Meta’s Powerful AI Language Model Has Leaked Online.”
[9] By this, I am referring to systems such as highly autonomous cyber systems (which could conceivably cause unintended havoc on a scale far greater than Stuxnet), AI systems in nuclear forces or strategic operations (e.g., early warning systems, command and control, and tracking foreign nuclear assets such as missile submarines), or outright “human-level” artificial general intelligence (AGI).
[10] Surveys of AI experts provide a mixed range of forecasts, but in a 2022 survey a non-trivial portion of such experts forecasted a 50% chance that “human-level AI” (roughly defined as a system that is better than humans at practically all meaningful tasks) will exist by 2035. Additionally, half of the surveyed experts forecasted a 50% chance of this outcome by 2061. Notably, however, some types of “very powerful systems” (e.g., highly autonomous cyber systems) may not even require “human-level AI.” For data and further discussion regarding these forecasts, see Roser, Max. 2023. “AI Timelines: What Do Experts in Artificial Intelligence Expect for the Future?” Our World in Data. February 7, 2023. https://ourworldindata.org/ai-timelines.
TL;DR: I’m curious why there is so little mention of Kialo as a potential tool for hashing out disagreements in the EA forum/community, whereas I think it would be at least worth experimenting with. I’m considering writing a post on this topic, but want to get initial thoughts (e.g., have people already considered it and decided it wouldn’t be effective, initial impressions/concerns, better alternatives to Kialo)
The forum and broader EA community have lots of competing ideas and even some direct disagreements. Will Bradshaw's recent comment about discussing cancel culture on the EA forum is just the latest example of this that I’ve seen. I’ve often felt that the use of a platform like Kialo would be a much more efficient way of recording these disagreements, since it helps to separate out individual points of contention and allows for deep back-and-forth, among many other reasons. However, when I search for “Kialo” in the search bar on the forum, I only find a few minor comments mentioning it (as opposed to posts) and they are all at least 2 years old. I think I once saw a LessWrong post downplaying the platform, but I was wondering if people here have developed similar impressions.
More to the point, I was curious to see if anyone had any initial thoughts on whether it would be worthwhile to write an article introducing Kialo and highlighting how it could be used to help hash out disagreements here/in the community? If so, do you have any initial objections/concerns that I should address? Do you know of any other alternatives that would be better options (keeping in mind that one of the major benefits of Kialo is its accessibility)?
Do you just mean this shortform or do you mean the full post once I finish it? Either way I’d say feel free to post it! I’d love to get feedback on the idea
I'm not sure this is worth a full post, especially since the original post didn't really receive much positive feedback (or almost any feedback, period). However, I was excited to discover recently that Kumu seems to handle the task of exporting from ORA fairly well, and I figured "why not make it accessible", rather than just relying on screenshots (as I did in the original article).
To rehash the original post/pitch, I think that a system like this could:
1a) reduce the time necessary to conduct literature reviews and similar tasks in AI policy research;
1b) improve research quality by reducing the likelihood that researchers will overlook important considerations prior to publishing or that they will choose a suboptimal research topic; and
2) serve as a highly-scalable/low-oversight task for entry-level researchers (e.g., interns/students) who want to get experience in AI policy but were unsuccessful in applying to other positions (e.g., SERI) that suffer from mentorship constraints—whereas I think that this work would require very little senior researcher oversight on a per-contributor basis (perhaps like a 1 to 30 ratio, if senior researchers are even necessary at all?).
The following example screenshots from Kumu will be ugly/disorienting (as it was with ORA), as I have put minimal effort into optimizing the view, and it really is something you need to zoom in for since you otherwise cannot read the text. Without further ado, however, here is a sample of what's on the Kumu project:
A few months ago I wrote a post on a decision-analysis framework (the stock issues framework) that I adapted from a framework which is very popular/prominent in competitive high school policy debate (which uses the same name). I was surprised to not receive any feedback/comments (I was at least expecting some criticism, confusion, etc.), but in retrospect I realized that it was probably a rather lengthy/inefficient post. I also realized that I probably should have written a shortform post to get a sense of interest, some preliminary thoughts on the validity and novelty/neglectedness of the concept, and how/where people might misinterpret or challenge the concept (or otherwise want to see more clarity/justification). So, I’ll try to offer a simplified summary here in hopes of getting some more insight on some of those things I mentioned (e.g., the potential value, novelty/neglectedness, validity, areas of confusion/skepticism).
The framework remarkably echoes the “importance, neglectedness, tractability” (INT) heuristic for cause area prioritization, except that the stock issues framework is specific to individual decisions and avoids some of the problems of the INT heuristic (e.g., the overgeneralized assumption of diminishing marginal returns). Basically, the stock issues framework holds that every advantage and disadvantage (“pro and con”) of a decision rests on four mutually exclusive and exhaustive concepts: inherency (which is reminiscent of “neglectedness,” but is more just “the descriptive state of affairs”), significance, feasibility, and solvency. (I explain them in more detail in my post.)
Over time, I have informally thought of and jotted down some of the potential justifications for promoting this framework (e.g., checking against confirmation and other biases, providing common language and concept awareness in discourse, constructing concept categories so as to improve learning and application of lessons from similar cases). However, before I write a post about such justifications, I figured I would write this shortform to get some preliminary feedback, as I mentioned: I’d love to hear where you are skeptical, confused, interested, etc.! (Also, if you think the original post I made should/could be improved--such as by reducing caveats/parentheticals/specificities, making some explanation more clear, etc.--feel free to let me know!)
I really appreciate your constructive attitude here :) I write below some recommendations and my take on why this wasn't successful. Some of it is a bit harsh, but that's because I honestly respect you and think you'll take it well 😊
I remember coming across your post, which is in an area that I'm very interested in, but seeing that I didn't remember any details and didn't upvote, I probably just skimmed it and didn't find it worth my time to read. I've read it now, and I have some thoughts about how you could have written a post on this topic which I would find interesting and more readable - after reading it now, I think that it has some useful content that I'd like to know.
A lot of the post (and actually even most of this shortform post) is about your own views and thinking process and meta-thoughts about the post itself and its context. This is a lot of overhead which is not needed and in fact damaging, both because it is distracting and because it makes it harder to find the gold within.
As you said, the post is too lengthy and inefficient. I'd guess that most readers of the forum go through posts by filtering in approximately this order: title -> skimming first paragraphs / looking for clear bullet points or tl;dr -> skimming the post, mostly looking at headers, images, first words of paragraphs, bolded parts, bullet points -> skimming sections of interest -> diving deeper into all of it or into what interests them.
I found myself confused from skimming the intro. I saw that you offer an alternative to ITN, but didn't understand what it is.
Skimming the rest of the post, I saw the four bulleted concepts and my next thought was that I get the general idea of the post, even if I'm confused about some things, but it's not worth my time to read through this text to understand it better.
It feels that the post is aiming at persuasion rather than description. I got the feeling that I was being sold some new shiny framework, and that most of the effort in the post goes there instead of just explaining what it's all about. I really do think that you overpromise here, and by doing that I could easily discard the whole idea as not worthwhile even if it has some merit.
Relatedly, I found the attitude in the post somewhat vain and dismissive towards existing ideas and the readers. As I write this, I have looked back and didn't find any clear examples of that, so perhaps I'm misjudging the post here. Perhaps it's because you make it seem like it's your idea.
Key ideas of the framework are not explained properly. I don't understand how exactly one uses this framework. Can you put a "number" or evaluation on inherency? How exactly do accounts of diminishing returns enter this framework? What do we do about some overlap between different parts? You write that you hope for people to comment and ask questions, but I think that this is too much to ask - it takes a while to clarify to myself what I don't understand, and it's a lot of overhead anyway.
What I'd really hope you will do is to write a short post (not a shortform) which only explains this framework and some of its features, without unneeded meta-discussion. I've tried skimming the Wikipedia page, but it's in a different enough context and language that it's difficult for me to understand without a lot of effort.
Thanks for the insight/feedback! I definitely see what you are saying on a lot of points. I’ll be working on an improved post soon that incorporates your feedback.
I’ve spent hours today trying to find answers for this, and I’m reaching the point where I think it’s worth throwing out this question just in case someone out there can just solve my problem fairly quickly.
Basically I'm trying to find a platform/software solution for a (~10 person?) AI research project idea I'm doing a writeup for, but I've searched for a while without success. I have a list of the features it would need to (or preferably) have, as well as some example platforms/software which have all the necessary features except for cornerstones like "capable of real-time collaboration (like on a Google Doc)."
Thus, I think it’s now worth asking whether there is any "help me find [or build] a software solution" service (preferably but not necessarily within EA)? Or should I just post a question somewhere (e.g., on the normal EA forum)?
I think your link doesn't work. It seems good to provide a description of your desired software (a few sentences/paragraphs) and some bullet points, early in your post?
Ah, does the link just not work, or are you saying it’s not helpful for finding a sufficient software? I realize now that I meant to say “a partial writeup of the project idea [including the screenshot of the software]”, not a writeup containing a list of the desired features.
It looks like you linked to a special draft mode or something; it says "edit post" in the URL:
Zooming out and being sort of blunt/a jerk about it: it's sort of unpromising (especially when you're seeking detailed advice on what presumably is complicated software) that you haven't noticed this. This seems low effort. You want to demonstrate you can attend to details like this.
Yes, you should definitely write up a description of the software in your comment because, again, a few sentences/paragraphs doesn't take much time and lets technical people skim and see if it makes sense to engage. You're going to bounce out all the people with high opportunity costs.
I see what you're saying now: initially I misread your comment as "It seems [good] to provide a description of your desired software..." which led me to think you were able to access the article ("it"), which confused me. I also didn't have problems when I tried the link, but of course it's now obvious that was because I am the editor. I just fixed that—I'm not even sure how I ended up copying the edit post link anyway.
FWIW, to address (2) and (1b), I was just trying to ask people if there is any "help me find [or build] a software solution" service (preferably but not necessarily within EA), or if I should just post a full message/question about it on the forum. I was not (yet) trying to ask people to find specific software for me. So, for efficiency's sake, this post was indeed somewhat low effort—hence the posting via shortform. But including the wrong link and overlooking one important word in your reply are definitely mistakes on my part.
For one of my grad school classes we've been discussing "strategy" and "grand strategy", and some of the readings even talk about theories of victory. I've been loosely tracking the "AI strategy" world, and now that I'm reading about strategy, I figured it might be helpful to share a framing that I've found helpful (but am still uncertain about):
It may feel tempting for some people to approach "strategy" (and somewhat relatedly, "theory" (e.g., IR theory)) as one might approach some problems in hard sciences: work hard to rigorously find the definitive right answer, and don't go around spreading un-caveated claims you think are slightly wrong (or which could be interpreted incorrectly). However, I personally found it helpful to frame "strategy" in terms of an optimization problem with constraints:
You have constraints with regards to:
Information collection
Analysis and processing of information
(Consider: chess and Go are games of perfect information, but you cannot analyze every possibility)
And communication.
For example, you can’t expect a policymaker (or even many fellow researchers) to read a dense 1,000-page document, and you may not have the time to spend writing a 1,000-page document.
Some goal(s) (and anti-goals!):
Goal: It’s not just discovering and conveying (accurate) information: telling people the sun is going to rise tomorrow isn’t helpful since they already know/assume that, and telling people that the 10th decimal of pi is 5 usually isn’t very valuable even though most people don’t know that. Rather, the key is to convey information or create a sense of understanding that the audience (or yourself!):
Doesn’t already know,
Will believe/understand, and
Benefits from knowing or believing. (or, where it is beneficial that the audience understands this information)
Goal: identifying key ideas and labeling concepts can make discussion easier and/or more efficient, if people have a shared lexicon of ideas.
Goal: especially with strategy, you may need to solve coordination problems: the idea/priority conveyed by some strategy might be crucial if other people also coordinate around it, yet not be optimal to focus on at the marginal, individual level unless they do (and thus it may not initially seem like a good idea/priority or become the natural default).
Anti-goal: On the flipside, you want to avoid misleading yourself or your audience, according to the same standard as above. That is, avoid conveying mistaken claims that the audience:
Doesn’t already (mistakenly) believe,
Will believe, and
Is made worse off by believing.
Anti-goal: there are also information hazards, like if you discover that the Nash equilibrium in a nuclear standoff is to try a pre-emptive strike, whereas the participants would have otherwise believed it was not advantageous.
Anti-goal: (the opposite of solving coordination problems: you cause people to shift from an optimally-diverse set of pursuits to one that is overfocused on specific problems)
Ultimately, the point of all this is to say that you have to have reasonable expectations with regard to the accuracy of "strategy" (and "theory"):
You probably shouldn’t expect (or even try) to find a theory that properly explains everything as you might expect in e.g., physics or mathematics. You can’t try to consider everything, which is why sometimes it might be best to just focus on a few important concepts.
You need to balance epistemic benefits and “epistemic damage” (e.g., spreading confusion or inaccuracy)
You should try to optimize within your constraints, rather than leaving slack
Different audiences may have different constraints:
For example, policymakers are probably less technically competent and have less time to listen to all of your caveats and nuances.
Additionally, you might have far less time to respond to a policymaker's request for analysis than if you are writing something that isn't pressing.
I'd be interested to hear people's thoughts! (It's still fairly raw from my notes)
Working title: Collaborative Discussion Spaces and "Epistemic Jam Sessions" for Community Building Claims/Ideas?
Tl;dr: I created an example discussion space on Kialo for claims/ideas about EA community building, with the idea being that community builders could collaborate via such structured discussions. Does this seem like something that could be valuable? Is it worth making this shortform into a full post?
I’m a big fan of structured discussions, and while reading this post early last month I wondered: would it be helpful if there were some kind of virtual space for sharing claims and ideas—and arguments for/against those claims and ideas—about community organizing?
Building on this, I also wondered if it might be good to designate some 1–3 day period each month as a focal/Schelling point for community organizer participation, perhaps also with some non-binding/optional goals laid out in advance (e.g., “we want to get a better sense of how to improve outreach/success at lower-prestige universities,” “we want to hear about your experiences/advice regarding outreach to STEM groups”). Perhaps you could call these “epistemic jam sessions” (I’m totally open to accepting better name ideas). Regardless, these discussions could be open to contributions at any time.
Ultimately, I’d love to hear your thoughts on:
Whether anything else like this already exists;
Whether something like this seems valuable;
Any recommendations for alternative characteristics/designs.
I'm considering doing another pilot "epistemic map", but I'm trying to decide what topic I should do it on, and thus soliciting suggestions.
For more on epistemic mapping, you can see here for a presentation I recently gave on the topic (just ignore the technical issues in the beginning)
Whereas my last pilot/test map focused on the relationship between poverty and terrorism (and the associated literature), I want to do this one on something EA-relevant. FWIW, I think that epistemic mapping is probably most valuable for topics that are important, dynamic (e.g., assumptions or technological capabilities may change over time), unsettled/divisive, and/or have a non-small literature base (among a few other considerations).
Some of my ideas thus far have been the controversial Democratising Risk paper (or something else X-risk related), the Worm Wars debate, biosecurity/pandemic risks, or maybe something about AI. But I'd love to hear any other suggestions (or feedback on those ideas I listed)!
[Summary: Most people would probably agree that science benefited greatly from the shift to structured, rigorous empirical analyses over the past century, but some fields still struggle to make progress. I’m curious whether people think that we could/should seek to introduce more structure/sophistication to the way researchers make and engage with theoretical analyses, such as something like "epistemic mapping"]
I just discovered this post, and I was struck by how it echoed some of my independent thoughts and impressions, especially the quote: "But it should temper our enthusiasm about how many insights we can glean by getting some data and doing something sciency to it."
(What follows is shortform-level caveating and overcomplicating, which is to say, less than I normally would provide, and more about conveying the overall idea/impression)
I've had some (perhaps hedgehoggy) "big ideas" about the potential value of what I call "epistemic mapping" for advancing scientific study/inquiry/debate in a variety of fields. One of them relates to the quote above: the "empirical-scientific revolution" of the past ~100-200 years (e.g., the shift to measuring medical treatment effectiveness through inpatient/outpatient data rather than professionals’ impressions) seems to have been crucial in the advancement of a variety of fields.
However, there are still many fields where such empirical/data-heavy methods appear insufficient and where it seems like progress languishes: my impression has been that this especially includes many of the social sciences (e.g., conflict studies, political science, sociology). There are no doubt many possible explanations, but over time I've increasingly wondered whether a major set of problems is, loosely, that the overall complexity of the systems (e.g., human decision-making processes vs. gravitational constants) + the difficulty of collecting sufficient data for empirical analyses + (a few other factors) leads to high information loss between researchers/studies and/or incentivizes people to oversimplify things (e.g., following the elsewhere-effective pattern of regression analyses and p<0.05 = paper). I do not know, but if the answer is yes, that leads to a major question:
How could/should we attempt to solve or mitigate this problem? One of the (hedgehoggy?) questions that keeps bugging me: We have made enormous advances in the past few hundred years when it comes to empirical analyses; in comparison, it seems that we have only fractionally improved the way we do our theoretical analysis... could/should we be doing better? [Very interested to get people's thoughts about that overall characterization, which even I'll admit I'm uncertain about]
So, I'm curious if people share a similar sentiment about our ability/need to improve our methods of theoretical analysis, including how people engage with the broader literature aside from the traditional (and, IMO, inefficient) paragraph-based literature reviews. If people do share a similar sentiment, what do you think about that concept of epistemic mapping as a potential way of advancing some sciences forward? Could it be the key to efficient future progress in some fields? My base rates for such a claim are really low, and I recognize that I'm biased, but I feel like it's worth posing the question if only to see if it advances the conversation.
(I might make this into an official post if people display enough interest)
Seeing the drama with the NIST AI Safety Institute and Paul Christiano's appointment and this article about the difficulty of rigorously/objectively measuring characteristics of generative AI, I figured I'd post my class memo from last October/November.
The main point I make is that NIST may not be well suited to creating measurements for complex, multi-dimensional characteristics of language models—and that some people may be overestimating the capabilities of NIST because they don't recognize how incomparable the Facial Recognition Vendor Test is to this situation of subjective metrics for GenAI and they don't realize NIST arguably even botched MNIST (which was actually produced by Yann LeCun by recompiling NIST's datasets). Moreover, government is slow, while AI is fast. Instead, I argue we should consider an alternative model such as federal funding for private/academic benchmark development (e.g., prize competitions).
I wasn't sure if this warranted a full post, especially since it feels a bit late; LMK if you think otherwise!
I would be quite interested to hear more about what you’re saying re MNIST and the facial recognition vendor test
Sure! (I just realized the point about the MNIST dataset problems wasn't fully explained in my shared memo, but I've fixed that now)
Per the assessment section, some of the problems with assuming that FRVT demonstrates NIST's capabilities for evaluation of LLMs/etc. include:
For the MNIST case, I now have the following in my memo:
Some may argue this assumption was justified at the time because it required that models could “generalize” beyond the training set. However, popular usage appears to have favored MNIST’s approach. Additionally, it is externally unclear that one could effectively generalize from the handwriting of a narrow and potentially unrepresentative segment of society—professional bureaucrats—to high schoolers’, and the assumption that this would be necessary (e.g., due to the inability to get more representative data) seems unrealistic.
There are some major differences with the type of standards that NIST usually produces. Perhaps the most obvious is that a good AI model can teach itself to pass any standardised test. A typical standard is very precisely defined in order to be reproducible by different testers. But if you make such a clear standard test for an LLM, it would, say, be a series of standard prompts or tasks, which would be the same no matter who typed them in. But in such a case, the model just trains itself on how to answer these prompts, or follows the Volkswagen model of learning how to recognize that it's being evaluated, and to behave accordingly, which won't be hard if the testing questions are standard.
So the test tells you literally nothing useful about the model.
I don't think NIST (or anyone outside the AI community) has experience with the kind of evals that are needed for models, which will need to be designed specifically to be unlearnable. The standards will have to include things like red-teaming in which the model cannot know what specific tests it will be subjected to. But it's very difficult to write a precise description of such an evaluation which could be applied consistently.
In my view this is a major challenge for model evaluation. As a chemical engineer, I know exactly what it means to say that a machine has passed a particular standard test. And if I'm designing the equipment, I know exactly what standards it has to meet. It's not at all obvious how this would work for an LLM.
TL;DR: Someone should probably write a grant to produce a spreadsheet/dataset of past instances where people claimed a new technology would lead to societal catastrophe, with variables such as “multiple people working on the tech believed it was dangerous.”
Slightly longer TL;DR: Some AI risk skeptics are mocking people who believe AI could threaten humanity’s existence, saying that many people in the past predicted doom from some new tech. There is seemingly no dataset which lists and evaluates such past instances of “tech doomers.” It seems somewhat ridiculous* to me that nobody has grant-funded a researcher to put together a dataset with variables such as “multiple people working on the technology thought it could be very bad for society.”
*Low confidence: could totally change my mind
———
I have asked multiple people in the AI safety space if they were aware of any kind of "dataset for past predictions of doom (from new technology)", but have not encountered such a project. There have been some articles and arguments floating around recently such as "Tech Panics, Generative AI, and the Need for Regulatory Caution", in which skeptics say we shouldn't worry about AI x-risk because there are many past cases where people in society made overblown claims that some new technology (e.g., bicycles, electricity) would be disastrous for society.
While I think it's right to consider the "outside view" on these kinds of things, I think that most of these claims 1) ignore examples of where there were legitimate reasons to fear the technology (e.g., nuclear weapons, maybe synthetic biology?), and 2) imply the current worries about AI are about as baseless as claims like "electricity will destroy society," whereas I would argue that the claim "AI x-risk is >1%" stands up quite well against most current scrutiny.
(These claims also ignore the anthropic argument/survivor bias—that if they ever were right about doom we wouldn't be around to observe it—but this is less important.)
I especially would like to see a dataset that tracks things like "were the people warning of the risks also the people who were building the technology?" More generally, some measurement of "analytical rigor" also seems really important, e.g., "could the claims have stood up to an ounce of contemporary scrutiny (i.e., without the benefit of hindsight)?"
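To make the idea a bit more concrete, here is a minimal sketch of what one row of such a spreadsheet could look like, written as a Python dataclass that can be dumped to CSV. Every field name here is hypothetical (my own guess at how the variables described above might be encoded), and the two example rows are made-up placeholders, not real historical data.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class TechDoomClaim:
    """One past instance of someone predicting catastrophe from a new technology.

    All field names are hypothetical; a real project would refine them.
    """
    technology: str
    year_of_claim: int
    claimed_harm: str
    # Were (some of) the people warning also the people building the technology?
    builders_among_warners: bool
    # Could the claim have withstood scrutiny at the time, without hindsight?
    survived_contemporary_scrutiny: bool
    harm_materialized: bool


def to_csv(rows):
    """Serialize rows to CSV text with one column per dataclass field."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(TechDoomClaim)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buf.getvalue()


def share_with_builder_warnings(rows):
    """Fraction of recorded claims where the builders themselves warned of danger."""
    return sum(r.builders_among_warners for r in rows) / len(rows)


# Placeholder rows purely to show the shape of the data (NOT real history).
rows = [
    TechDoomClaim("placeholder technology A", 1900, "placeholder harm", False, False, False),
    TechDoomClaim("placeholder technology B", 1950, "placeholder harm", True, True, False),
]

print(to_csv(rows))
print(share_with_builder_warnings(rows))
```

Keeping the "who was warning" and "did it survive contemporary scrutiny" columns as explicit fields is the point: it would let someone filter the outside-view reference class rather than lumping every tech panic together.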
Absolutely seems worth spending up to $20K to hire researchers to produce such a spreadsheet within the next two-ish months… this could be a critical time period, where people are more receptive to new arguments/responses…?
Just saw this now, after following a link to another comment.
You have almost given me an idea for a research project. I would run the research honestly and report the facts, but my in-going guess is that survivor bias is a massive factor, contrary to what you say here, and that in most cases the people who believed it could lead to catastrophe were probably right to be concerned. A lot of people have the Y2K bug mentality, in which they didn't see any disaster and so concluded that it was all a false alarm, rather than the reality, which is that a lot of people did great work to prevent it.
If I look at the different x-risk scenarios the public is most aware of:
My unique (for this group) perspective on this is that I've worked for years on industrial safety, and I know that there are factories out there which have operated for years without a serious safety incident or accident - and someone working in one of those could reach the conclusion that the risks were exaggerated, while being unaware of cases where entire factories or oil-rigs or nuclear power plants have exploded and caused terrible damage and loss of life.
Before I seriously start working on this (in the event that I find time), could you let me know if you've since discovered such a database?
*We humans are naturally very good at this, because we all know we're going to die, and we live our lives trying not to think about this fact or desperately trying to convince ourselves of the existence of some kind of afterlife.
I'm not that focused on climate science, but my understanding is that this is a bit misleading in your context—that there were some scientists in the (90s/2000s?) who forecasted doom or at least major disaster within a few decades due to feedback loops or other dynamics which never materialized. More broadly, my understanding is that forecasting climate has proven very difficult, even if some broad conclusions (e.g., "the climate is changing," "humans contribute to climate change") have held up. Additionally, it seems that many engineers/scientists underestimated the pace of alternative energy technology (e.g., solar).
That aside, I would be excited to see someone work on this project, and I still have not discovered any such database.
I'm not sure. IMHO a major disaster is happening with the climate. Essentially, people have a false belief that there is some kind of set-point, and that after a while the temperature will return to that, but this isn't the case. Venus is an extreme example of an Earth-like planet with a very different climate. There is nothing in physics or chemistry that says Earth's temperature could not one day exceed 100 C.
It's always interesting to ask people how high they think sea-level might rise if all the ice melted. This is an uncontroversial calculation which involves no modelling - just looking at how much ice there is, and how much sea-surface area there is. People tend to think it would be maybe a couple of metres. It would actually be 60 m (200 feet). That will take time, but very little time on a cosmic scale, maybe a couple of thousand years.
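For anyone who wants to check the order of magnitude, the back-of-the-envelope version of that calculation can be sketched as below. The ice volumes, densities, and ocean area are rounded values I am assuming for illustration, not figures from this thread. The naive result lands above 60 m because it ignores, among other things, that a lot of Antarctic ice sits below sea level or already floats; either way it is far more than "a couple of metres".

```python
# Rough, rounded inputs: all of these are assumptions for illustration.
ICE_VOLUME_KM3 = {
    "Antarctica": 26.5e6,  # approximate ice-sheet volume
    "Greenland": 2.9e6,    # approximate ice-sheet volume
}
ICE_DENSITY_KG_M3 = 917.0     # glacial ice
WATER_DENSITY_KG_M3 = 1000.0  # fresh meltwater
OCEAN_AREA_KM2 = 361e6        # approximate global ocean surface area


def naive_sea_level_rise_m(ice_volume_km3):
    """Spread the meltwater evenly over today's ocean area.

    Deliberately naive: ignores ice already displacing seawater,
    coastline changes, and thermal expansion.
    """
    water_volume_km3 = ice_volume_km3 * ICE_DENSITY_KG_M3 / WATER_DENSITY_KG_M3
    rise_km = water_volume_km3 / OCEAN_AREA_KM2
    return rise_km * 1000.0  # km -> m


total_rise_m = naive_sea_level_rise_m(sum(ICE_VOLUME_KM3.values()))
print(f"Naive upper-end estimate: about {total_rise_m:.0f} m")
```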
Right now, if anything what we're seeing is worse than the average prediction. The glaciers and ice sheets are melting faster. The temperature is increasing faster. Etc. Feedback loops are starting to be powerful. There's a real chance that the Gulf Stream will stop or reverse, which would be a disaster for Europe, ironically freezing us as a result of global warming ...
Among serious climate scientists, the feeling of doom is palpable. I wouldn't say they are exaggerating. But we, as a global society, have decided that we'd rather have our oil and gas and steaks than prevent the climate disaster. The US seems likely to elect a president who makes it a point of honour to support climate-damaging technologies, just to piss off the scientists and liberals.
I'll be blunt: remarks like these undermine your credibility. But regardless, I just don't have any experience or contributions to make on climate change, other than re-emphasizing my general impression that, as a person who cares a lot about existential risk and has talked to various other people who also care a lot about existential risk, there seems to be very strong scientific evidence suggesting that extinction is unlikely.
I know. :(
But as a scientist, I feel it's valuable to speak the truth sometimes, to put my personal credibility on the line in service of the greater good. Venus is an Earth-sized planet which is 400C warmer than Earth, and only a tiny fraction of this is due to it being closer to the sun. The majority is about the % of the sun's heat that it absorbs vs. reflects. It is an extreme case of global warming. I'm not saying that Earth can be like Venus anytime soon, I'm saying that we have the illusion that Earth has a natural, "stable" temperature, and while it might vary, eventually we'll return to that temperature. But there is absolutely no scientific or empirical evidence for this.
Earth's temperature is like a ball balanced in a shallow groove on the top of a steep hill. We've never experienced anything outside the narrow groove, so we imagine that it is impossible. But we've also never dramatically changed the atmosphere the way we're doing now. There is, like I said, no fundamental reason why global-warming could not go totally out of control, way beyond 1.5C or 3C or even 20C.
I have struggled to explain this concept, even to very educated, open-minded people who fundamentally agree with my concerns about climate change. So I don't expect many people to believe me. But intellectually, I want to be honest.
I think it is valuable to keep trying to explain this, even knowing the low probability of success, because right now, statements like "1.5C temperature increase" are just not having the impact of changing people's habits. And if we do cross a tipping point, it will be too late to start realising this.
I spent way too much time organizing my thoughts on AI loss-of-control ("x-risk") debates without any feedback today, so I'm publishing perhaps one of my favorite snippets/threads:
A lot of debates seem to boil down to under-acknowledged and poorly-framed disagreements about questions like “who bears the burden of proof.” For example, some skeptics say “extraordinary claims require extraordinary evidence” when dismissing claims that the risk is merely “above 1%”, whereas safetyists argue that having >99% confidence that things won’t go wrong is the “extraordinary claim that requires extraordinary evidence.”
I think that talking about “burdens” might be unproductive. Instead, it may be better to frame the question more like “what should we assume by default, in the absence of definitive ‘evidence’ or arguments, and why?” “Burden” language is super fuzzy (and seems a bit morally charged), whereas this framing at least forces people to acknowledge that some default assumptions are being made and consider why.
To address that framing, I think it’s better to ask/answer questions like “What reference class does ‘building AGI’ belong to, and what are the base rates of danger for that reference class?” This framing at least pushes people to make explicit claims about what reference class building AGI belongs to, which should make it clearer that it doesn’t belong in your “all technologies ever” reference class.
In my view, the "default" estimate should not be “roughly zero until proven otherwise,” especially given that there isn’t consensus among experts and the overarching narrative of “intelligence proved really powerful in humans, misalignment even among humans is quite common (and is already often observed in existing models), and we often don’t get technologies right on the first few tries.”
The following is an assignment I submitted for my Cyber Operations class at Georgetown, regarding the risk of large AI model theft and what the US Cybersecurity and Infrastructure Security Agency (CISA) should/could do about it. Further caveats and clarifications in footnote.[1] (Apologies for formatting issues)
-------------
Memorandum for the Cybersecurity and Infrastructure Security Agency (CISA)
SUBJECT: Supporting Security of Large AI Models Against Theft
Recent years have seen a rapid increase in the capabilities of artificial intelligence (AI) models such as GPT-4. However, as these large models become more capable and more expensive to train, they become increasingly attractive targets for theft and could pose greater security risks to critical infrastructure (CI), in part by enhancing malicious actors’ cyber capabilities. Rather than strictly focusing on the downstream effects of powerful AI models, CISA should also work to reduce the likelihood (or rapidity) of large AI model theft. This memo will explain some of the threats to and from powerful AI models, briefly describe relevant market failures, and conclude with recommendations for CISA to mitigate the risk of AI model theft.
There are Strong Incentives and Historical Precedent for China and Other Actors to Steal AI Models
There are multiple reasons to expect that hackers will attempt to exfiltrate large AI model files:
AI Model Theft/Proliferation May Threaten Critical Infrastructure—and Far More
Theft of powerful AI models—or the threat thereof—could have significant negative consequences beyond straightforward economic losses:
Ultimately, the capabilities of future systems are hard to predict, but the consequences of model proliferation could be severe.
Traditional Market Incentives Will Likely Fail to Minimize These Risks
Many companies will have some incentives to protect their models. However, there are market failures and other reasons to expect that their efforts will be suboptimal:
Recommendations for CISA: Assess Risks While Developing Expertise, Partnerships, and Mitigation Measures
Thus far, there has been minimal public research on how policymakers should mitigate the risks of model theft. CISA has an important role to play at this early stage of the policy pipeline, especially to facilitate information flows while spurring and supporting other actors (e.g., Congress) who have more resources or authority to address the upstream problems. The following subsections provide four main categories of recommendations.
Conduct Focused Risk Assessments, Monitor Trends, and Disseminate Findings
This analysis—even in the very preliminary stages—should also inform CISA’s implementation of the remaining recommendations.
Build CISA Subject Matter Expertise and Analytical Capacity
To improve its assessments and preparations, CISA could employ Protective Security Advisors (PSAs) with experience relevant to AI model security and/or the AI supply chain, including cloud providers, semiconductors, and other CI sectors.[22] Alternatively, CISA could create a dedicated working group or task force to deal with these issues.[23] Depending on the findings of CISA’s risk assessments, CISA could also seek additional funding from Congress. The following category of recommendations could also help improve CISA’s knowledge and analytical capacity.
Develop Partnerships with AI Labs and Facilitate Information Sharing
Given that partnerships are the “foundation and the lifeblood”[24] of CISA’s efforts, it should invite AI labs into cyber information sharing and security tool ecosystems.[25] Specifically, CISA should ensure that large AI labs are aware of relevant programs, and if they are not participants, determine why and whether CISA can/should modify its policies to allow participation.[26] [EA Forum Note: this footnote contains a potentially interesting/impactful suggestion about designating some AI tools/labs as "IT critical infrastructure." I could not fully explore this recommendation in my memo due to space and time constraints, but it could be the most important takeaway/suggestion from this memo.]
Help Develop and Implement Security Standards and Methods
CISA should work with the National Institute of Standards and Technology (NIST) and other federal research and development agencies[27] to develop cybersecurity methods and standards (e.g., hardware-based data flow limiters) that CISA and other agencies could mandate for federal agencies that produce/house large AI models (including the potential NAIRR[28]).
References
Abdallat, A. J. 2022. “Can We Trust Critical Infrastructure to Artificial Intelligence?” Forbes. July 1, 2022. https://www.forbes.com/sites/forbestechcouncil/2022/07/01/can-we-trust-critical-infrastructure-to-artificial-intelligence/?sh=3e21942e1a7b.
“About ISACs.” n.d. National Council of ISACs. Accessed April 15, 2023. https://www.nationalisacs.org/about-isacs.
Adi, Erwin, Zubair Baig, and Sherali Zeadally. 2022. “Artificial Intelligence for Cybersecurity: Offensive Tactics, Mitigation Techniques and Future Directions.” Applied Cybersecurity & Internet Governance Journal. November 4, 2022. https://acigjournal.com/resources/html/article/details?id=232841&language=en.
Allen, Gregory, Emily Benson, and William Reinsch. 2022. “Improved Export Controls Enforcement Technology Needed for U.S. National Security.” Center for Strategic and International Studies. November 30, 2022. https://www.csis.org/analysis/improved-export-controls-enforcement-technology-needed-us-national-security.
“Automated Indicator Sharing (AIS).” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/topics/cyber-threats-and-advisories/information-sharing/automated-indicator-sharing-ais.
Brooks, Chuck. 2023. “Cybersecurity Trends & Statistics for 2023: More Treachery and Risk Ahead as Attack Surface and Hacker Capabilities Grow.” Forbes. March 5, 2023. https://www.forbes.com/sites/chuckbrooks/2023/03/05/cybersecurity-trends--statistics-for-2023-more-treachery-and-risk-ahead-as-attack-surface-and-hacker-capabilities-grow/?sh=2c6fcebf19db.
Calma, Justine. 2022. “AI Suggested 40,000 New Possible Chemical Weapons in Just Six Hours.” The Verge. March 17, 2022. https://www.theverge.com/2022/3/17/22983197/ai-new-possible-chemical-weapons-generative-models-vx.
“CISA Strategic Plan 2023–2025.” 2022. CISA. September 2022. https://www.cisa.gov/sites/default/files/2023-01/StrategicPlan_20220912-V2_508c.pdf.
Cottier, Ben. 2022. “The Replication and Emulation of GPT-3.” Rethink Priorities. December 21, 2022. https://rethinkpriorities.org/publications/the-replication-and-emulation-of-gpt-3.
———. 2023. “Trends in the Dollar Training Cost of Machine Learning Systems.” Epoch. January 31, 2023. https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems.
Cox, Joseph. 2023. “How I Broke into a Bank Account with an AI-Generated Voice.” Vice. February 23, 2023. https://www.vice.com/en/article/dy7axa/how-i-broke-into-a-bank-account-with-an-ai-generated-voice.
Dickson, Ben. 2020. “The GPT-3 Economy.” TechTalks. September 21, 2020. https://bdtechtalks.com/2020/09/21/gpt-3-economy-business-model/.
“Enhanced Cybersecurity Services (ECS).” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/resources-tools/programs/enhanced-cybersecurity-services-ecs.
Feldstein, Steven. 2019. “The Global Expansion of AI Surveillance.” Carnegie Endowment for International Peace. September 17, 2019. https://carnegieendowment.org/2019/09/17/global-expansion-of-ai-surveillance-pub-79847.
Geist, Edward. 2018. “By 2040, Artificial Intelligence Could Upend Nuclear Stability.” RAND Corporation. April 24, 2018. https://www.rand.org/news/press/2018/04/24.html.
Grace, Katja. 2023. “How Bad a Future Do ML Researchers Expect?” AI Impacts. March 8, 2023. https://aiimpacts.org/how-bad-a-future-do-ml-researchers-expect/.
“Guaranteeing AI Robustness against Deception (GARD).” n.d. DARPA. Accessed March 11, 2023. https://www.darpa.mil/program/guaranteeing-ai-robustness-against-deception.
Hill, Michael. 2023. “NATO Tests AI’s Ability to Protect Critical Infrastructure against Cyberattacks.” CSO Online. January 5, 2023. https://www.csoonline.com/article/3684730/nato-tests-ai-s-ability-to-protect-critical-infrastructure-against-cyberattacks.html.
Humphreys, Brian. 2021. “Critical Infrastructure Policy: Information Sharing and Disclosure Requirements after the Colonial Pipeline Attack.” Congressional Research Service. May 24, 2021. https://crsreports.congress.gov/product/pdf/IN/IN11683.
“ICT Supply Chain Risk Management Task Force.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/resources-tools/groups/ict-supply-chain-risk-management-task-force.
“Information Technology Sector.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/topics/critical-infrastructure-security-and-resilience/critical-infrastructure-sectors/information-technology-sector.
Johnson, James. 2020. “Artificial Intelligence: A Threat to Strategic Stability.” Strategic Studies Quarterly. https://www.airuniversity.af.edu/Portals/10/SSQ/documents/Volume-14_Issue-1/Johnson.pdf.
“Joint Cyber Defense Collaborative.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/topics/partnerships-and-collaboration/joint-cyber-defense-collaborative.
Kahn, Jeremy. 2023. “Silicon Valley Is Buzzing about ‘BabyAGI.’ Should We Be Worried?” Fortune. April 15, 2023. https://fortune.com/2023/04/15/babyagi-autogpt-openai-gpt-4-autonomous-assistant-agi/.
Laplante, Phil, and Ben Amaba. 2021. “CSDL | IEEE Computer Society.” Computer. October 2021. https://www.computer.org/csdl/magazine/co/2021/10/09548022/1x9TFbzhvTG.
Laplante, Phil, Dejan Milojicic, Sergey Serebryakov, and Daniel Bennett. 2020. “Artificial Intelligence and Critical Systems: From Hype to Reality.” Computer 53 (11): 45–52. https://doi.org/10.1109/mc.2020.3006177.
Lawfare. 2023. “Cybersecurity and AI.” YouTube. April 3, 2023. https://www.youtube.com/watch?v=vyyiSCJVAHs&t=964s.
Longpre, Shayne, Marcus Storm, and Rishi Shah. 2022. “Lethal Autonomous Weapons Systems & Artificial Intelligence: Trends, Challenges, and Policies.” Edited by Kevin McDermott. MIT Science Policy Review 3 (August): 47–56. https://doi.org/10.38105/spr.360apm5typ.
MITRE. n.d. “MITRE ATT&CK.” MITRE. Accessed April 15, 2023. https://attack.mitre.org/.
Murphy, Mike. 2022. “What Are Foundation Models?” IBM Research Blog. May 9, 2022. https://research.ibm.com/blog/what-are-foundation-models.
Nakashima, Ellen. 2015. “Chinese Breach Data of 4 Million Federal Workers.” The Washington Post, June 4, 2015. https://www.washingtonpost.com/world/national-security/chinese-hackers-breach-federal-governments-personnel-office/2015/06/04/889c0e52-0af7-11e5-95fd-d580f1c5d44e_story.html.
National Artificial Intelligence Research Resource Task Force. 2023. “Strengthening and Democratizing the U.S. Artificial Intelligence Innovation Ecosystem: An Implementation Plan for a National Artificial Intelligence Research Resource.” https://www.ai.gov/wp-content/uploads/2023/01/NAIRR-TF-Final-Report-2023.pdf.
“Not My Problem.” 2014. The Economist. July 10, 2014. https://www.economist.com/special-report/2014/07/10/not-my-problem.
“Partnerships and Collaboration.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/topics/partnerships-and-collaboration.
“Protective Security Advisor (PSA) Program.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/resources-tools/programs/protective-security-advisor-psa-program.
Rasser, Martijn, and Kevin Wolf. 2022. “The Right Time for Chip Export Controls.” Lawfare. December 13, 2022. https://www.lawfareblog.com/right-time-chip-export-controls.
Roser, Max. 2023. “AI Timelines: What Do Experts in Artificial Intelligence Expect for the Future?” Our World in Data. February 7, 2023. https://ourworldindata.org/ai-timelines.
Sganga, Nicole. 2022. “Chinese Hackers Took Trillions in Intellectual Property from about 30 Multinational Companies.” CBS News. May 4, 2022. https://www.cbsnews.com/news/chinese-hackers-took-trillions-in-intellectual-property-from-about-30-multinational-companies/.
Stein-Perlman, Zach, Benjamin Weinstein-Raun, and Katja Grace. 2022. “2022 Expert Survey on Progress in AI.” AI Impacts. August 3, 2022. https://aiimpacts.org/2022-expert-survey-on-progress-in-ai/.
“TrojAI: Trojans in Artificial Intelligence.” n.d. IARPA. Accessed March 11, 2023. https://www.iarpa.gov/research-programs/trojai.
Vincent, James. 2023. “Meta’s Powerful AI Language Model Has Leaked Online — What Happens Now?” The Verge. March 8, 2023. https://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misuse.
Zwetsloot, Remco, and Allan Dafoe. 2019. “Thinking about Risks from AI: Accidents, Misuse and Structure.” Lawfare. February 11, 2019. https://www.lawfareblog.com/thinking-about-risks-ai-accidents-misuse-and-structure.
[1] For example, a single successful training run of GPT-3 reportedly required dozens of terabytes of data and cost millions of dollars in GPU usage, but the trained model is a file smaller than a terabyte and actors can operate it on cloud services that cost under $40 per hour. Sources:
Cottier, Ben. 2022. “The Replication and Emulation of GPT-3.” Rethink Priorities. December 21, 2022. https://rethinkpriorities.org/publications/the-replication-and-emulation-of-gpt-3; and
Dickson, Ben. 2020. “The GPT-3 Economy.” TechTalks. September 21, 2020. https://bdtechtalks.com/2020/09/21/gpt-3-economy-business-model/.
Additionally, one report suggested that by 2030, state-of-the-art models may cost hundreds of millions or even more than $1 billion to train (although the report highlights that these estimates could significantly change). Source: Cottier, Ben. 2023. “Trends in the Dollar Training Cost of Machine Learning Systems.” Epoch. January 31, 2023. https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems.
[2] The following podcast explains this point in more detail: Lawfare. 2023. “Cybersecurity and AI.” YouTube. April 3, 2023. https://www.youtube.com/watch?v=vyyiSCJVAHs&t=964s (starting mainly at 16:04).
[3] For discussion regarding this claim, see: Allen, Gregory, Emily Benson, and William Reinsch. 2022. “Improved Export Controls Enforcement Technology Needed for U.S. National Security.” Center for Strategic and International Studies. November 30, 2022. https://www.csis.org/analysis/improved-export-controls-enforcement-technology-needed-us-national-security; and
Rasser, Martijn, and Kevin Wolf. 2022. “The Right Time for Chip Export Controls.” Lawfare. December 13, 2022. https://www.lawfareblog.com/right-time-chip-export-controls.
[4] Nakashima, Ellen. 2015. “Chinese Breach Data of 4 Million Federal Workers.” The Washington Post, June 4, 2015. https://www.washingtonpost.com/world/national-security/chinese-hackers-breach-federal-governments-personnel-office/2015/06/04/889c0e52-0af7-11e5-95fd-d580f1c5d44e_story.html; and
Sganga, Nicole. 2022. “Chinese Hackers Took Trillions in Intellectual Property from about 30 Multinational Companies.” CBS News. May 4, 2022. https://www.cbsnews.com/news/chinese-hackers-took-trillions-in-intellectual-property-from-about-30-multinational-companies/.
[5] Vincent, James. 2023. “Meta’s Powerful AI Language Model Has Leaked Online — What Happens Now?” The Verge. March 8, 2023. https://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misuse.
[6] Brooks, Chuck. 2023. “Cybersecurity Trends & Statistics for 2023: More Treachery and Risk Ahead as Attack Surface and Hacker Capabilities Grow.” Forbes. March 5, 2023. https://www.forbes.com/sites/chuckbrooks/2023/03/05/cybersecurity-trends--statistics-for-2023-more-treachery-and-risk-ahead-as-attack-surface-and-hacker-capabilities-grow/?sh=2c6fcebf19db;
Cox, Joseph. 2023. “How I Broke into a Bank Account with an AI-Generated Voice.” Vice. February 23, 2023. https://www.vice.com/en/article/dy7axa/how-i-broke-into-a-bank-account-with-an-ai-generated-voice.
[7] Feldstein, Steven. 2019. “The Global Expansion of AI Surveillance.” Carnegie Endowment for International Peace. September 17, 2019. https://carnegieendowment.org/2019/09/17/global-expansion-of-ai-surveillance-pub-79847;
Longpre, Shayne, Marcus Storm, and Rishi Shah. 2022. “Lethal Autonomous Weapons Systems & Artificial Intelligence: Trends, Challenges, and Policies.” Edited by Kevin McDermott. MIT Science Policy Review 3 (August): 47–56. https://doi.org/10.38105/spr.360apm5typ (p. 49).
[8] Calma, Justine. 2022. “AI Suggested 40,000 New Possible Chemical Weapons in Just Six Hours.” The Verge. March 17, 2022. https://www.theverge.com/2022/3/17/22983197/ai-new-possible-chemical-weapons-generative-models-vx.
[9] For further discussion of this topic, see: Kahn, Jeremy. 2023. “Silicon Valley Is Buzzing about ‘BabyAGI.’ Should We Be Worried?” Fortune. April 15, 2023. https://fortune.com/2023/04/15/babyagi-autogpt-openai-gpt-4-autonomous-assistant-agi/;
Adi, Erwin, Zubair Baig, and Sherali Zeadally. 2022. “Artificial Intelligence for Cybersecurity: Offensive Tactics, Mitigation Techniques and Future Directions.” Applied Cybersecurity & Internet Governance Journal. November 4, 2022. https://acigjournal.com/resources/html/article/details?id=232841&language=en.
[10] The example of Meta’s LLaMA, mentioned earlier, provides both some support and rebuttal for this concern: Meta has insisted it plans to continue sharing access despite the leaks, but there are good reasons to think this event will discourage other companies from implementing similar access rules. Source: Vincent, “Meta’s Powerful AI Language Model Has Leaked Online.”
[11] Laplante, Phil, Dejan Milojicic, Sergey Serebryakov, and Daniel Bennett. 2020. “Artificial Intelligence and Critical Systems: From Hype to Reality.” Computer 53 (11): 45–52. https://doi.org/10.1109/mc.2020.3006177: “The use of artificial intelligence (AI) in critical infrastructure systems will increase significantly over the next five years” (p. 1);
Laplante, Phil, and Ben Amaba. 2021. “CSDL | IEEE Computer Society.” Computer. October 2021. https://www.computer.org/csdl/magazine/co/2021/10/09548022/1x9TFbzhvTG;
Hill, Michael. 2023. “NATO Tests AI’s Ability to Protect Critical Infrastructure against Cyberattacks.” CSO Online. January 5, 2023. https://www.csoonline.com/article/3684730/nato-tests-ai-s-ability-to-protect-critical-infrastructure-against-cyberattacks.html;
Abdallat, A. J. 2022. “Can We Trust Critical Infrastructure to Artificial Intelligence?” Forbes. July 1, 2022. https://www.forbes.com/sites/forbestechcouncil/2022/07/01/can-we-trust-critical-infrastructure-to-artificial-intelligence/?sh=3e21942e1a7b.
Additionally, autonomous vehicles could constitute critical infrastructure.
[12] Zwetsloot, Remco, and Allan Dafoe. 2019. “Thinking about Risks from AI: Accidents, Misuse and Structure.” Lawfare. February 11, 2019. https://www.lawfareblog.com/thinking-about-risks-ai-accidents-misuse-and-structure.
[13] “Some observers have posited that autonomous systems like Sea Hunter may render the underwater domain transparent, thereby eroding the second-strike deterrence utility of stealthy SSBNs. [...] However, irrespective of the veracity of this emerging capability, the mere perception that nuclear capabilities face new strategic challenges would nonetheless elicit distrust between nuclear-armed adversaries—particularly where strategic force asymmetries exist.” Source: Johnson, James. 2020. “Artificial Intelligence: A Threat to Strategic Stability.” Strategic Studies Quarterly. https://www.airuniversity.af.edu/Portals/10/SSQ/documents/Volume-14_Issue-1/Johnson.pdf.
See also: Geist, Edward. 2018. “By 2040, Artificial Intelligence Could Upend Nuclear Stability.” RAND Corporation. April 24, 2018. https://www.rand.org/news/press/2018/04/24.html.
[14] Although this claim may be jarring for people who are not familiar with the progress in AI over the past decade or with the AI safety literature, the threat of extinction (or functionally equivalent outcomes) as a result of developing a very powerful system is non-trivial. Notably, in one 2022 survey of machine learning researchers, nearly half (48%) of the respondents believed there is at least a 10% chance that AI would lead to “extremely bad” outcomes (e.g., human extinction). Source: Grace, Katja. 2023. “How Bad a Future Do ML Researchers Expect?” AI Impacts. March 8, 2023. https://aiimpacts.org/how-bad-a-future-do-ml-researchers-expect/.
[15] Surveys of machine learning researchers provide a mixed range of forecasts, but in the aforementioned 2022 survey (notably prior to the public release of ChatGPT), >75% of respondents said there was at least a 10% chance that humanity would develop “human-level AI” (roughly defined as a system that is better than humans at all or nearly all meaningful cognitive tasks) in the next 20 years. Additionally, >35% of the respondents said there was at least a 50% chance of this outcome. Notably, however, some types of highly autonomous cyber systems may not even require “human-level AI.” For data and further discussion regarding these forecasts, see Roser, Max. 2023. “AI Timelines: What Do Experts in Artificial Intelligence Expect for the Future?” Our World in Data. February 7, 2023. https://ourworldindata.org/ai-timelines.
For the original survey, see: Stein-Perlman, Zach, Benjamin Weinstein-Raun, and Katja Grace. 2022. “2022 Expert Survey on Progress in AI.” AI Impacts. August 3, 2022. https://aiimpacts.org/2022-expert-survey-on-progress-in-ai/.
[16] For sources on this claim, see: “Not My Problem.” 2014. The Economist. July 10, 2014. https://www.economist.com/special-report/2014/07/10/not-my-problem; and
Humphreys, Brian. 2021. “Critical Infrastructure Policy: Information Sharing and Disclosure Requirements after the Colonial Pipeline Attack.” Congressional Research Service. May 24, 2021. https://crsreports.congress.gov/product/pdf/IN/IN11683.
[17] “CISA Strategic Plan 2023–2025.” 2022. CISA. September 2022. https://www.cisa.gov/sites/default/files/2023-01/StrategicPlan_20220912-V2_508c.pdf.
[18] As part of this, CISA should work with other agencies such as the Office of Science and Technology Policy (OSTP), National Security Agency (NSA), and the broader Department of Homeland Security (DHS) to forecast future AI models’ capabilities and proliferation.
[19] This could potentially build on or be modeled after datasets such as MITRE’s ATT&CK. See: MITRE. n.d. “MITRE ATT&CK.” MITRE. Accessed April 15, 2023. https://attack.mitre.org/.
[20] This should apply to large models regardless of whether they were stolen, developed for malicious purposes, etc. The overall dataset or case study compilation should probably cover more than just critical infrastructure targets, but CISA could just be a primary contributor for incidents involving critical infrastructure.
[21] National Artificial Intelligence Research Resource Task Force. 2023. “Strengthening and Democratizing the U.S. Artificial Intelligence Innovation Ecosystem: An Implementation Plan for a National Artificial Intelligence Research Resource.” https://www.ai.gov/wp-content/uploads/2023/01/NAIRR-TF-Final-Report-2023.pdf. Note that some mock legislation briefly specifies CISA (not by acronym) on page J-12.
[22] For details about the PSA program, see: “Protective Security Advisor (PSA) Program.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/resources-tools/programs/protective-security-advisor-psa-program.
[23] See for example: “ICT Supply Chain Risk Management Task Force.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/resources-tools/groups/ict-supply-chain-risk-management-task-force.
[24] “Partnerships and Collaboration.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/topics/partnerships-and-collaboration.
[25] CISA has stated in its strategy: “We will use our full suite of convening authorities and relationship management capabilities to expand and mature partnerships with stakeholders and facilitate information sharing.” Source: “CISA Strategic Plan 2023–2025.”
Some of the relevant CISA programs include Automated Indicator Sharing (AIS), Enhanced Cybersecurity Services (ECS), and possibly even the Joint Cyber Defense Collaborative (JCDC).
“Automated Indicator Sharing (AIS).” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/topics/cyber-threats-and-advisories/information-sharing/automated-indicator-sharing-ais;
“Enhanced Cybersecurity Services (ECS).” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/resources-tools/programs/enhanced-cybersecurity-services-ecs;
“Joint Cyber Defense Collaborative.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/topics/partnerships-and-collaboration/joint-cyber-defense-collaborative.
This also includes external organizations such as sector-based Information Sharing and Analysis Centers (ISACs). For more details about ISACs, see: “About ISACs.” n.d. National Council of ISACs. Accessed April 15, 2023. https://www.nationalisacs.org/about-isacs. While CISA may not be able to control participation in such organizations, it should at least determine whether AI labs are participating in such collaborations, if only to inform the development of policies to address unmet needs (or, if necessary, to impose mandatory disclosure requirements).
[26] Perhaps one of the more drastic possible options here is to categorize labs producing so-called “foundation models” (e.g., GPT-4) as part of the information technology critical infrastructure sector. It is unclear from an outsider perspective how legally feasible or politically desirable this categorization would be, but as GPT-4 and related models increasingly become the basis for other software applications, this designation should become more logical and/or acceptable.
For more information about foundation models, see: Murphy, Mike. 2022. “What Are Foundation Models?” IBM Research Blog. May 9, 2022. https://research.ibm.com/blog/what-are-foundation-models.
For information about the IT critical infrastructure sector designation, see: “Information Technology Sector.” n.d. CISA. Accessed April 15, 2023. https://www.cisa.gov/topics/critical-infrastructure-security-and-resilience/critical-infrastructure-sectors/information-technology-sector.
[27] This particularly includes the Defense Advanced Research Projects Agency (DARPA) and the Intelligence Advanced Research Projects Activity (IARPA), both of which are already working on some projects related to the integrity and reliability of AI models, including GARD at DARPA and TrojAI at IARPA. Sources: “Guaranteeing AI Robustness against Deception (GARD).” n.d. DARPA. Accessed March 11, 2023. https://www.darpa.mil/program/guaranteeing-ai-robustness-against-deception; and
“TrojAI: Trojans in Artificial Intelligence.” n.d. IARPA. Accessed March 11, 2023. https://www.iarpa.gov/research-programs/trojai.
[28] National Artificial Intelligence Research Resource Task Force, “Strengthening and Democratizing the U.S….”
Ultimately I was fairly rushed with this memo and realized less than halfway through that perhaps I shouldn't have chosen CISA as my client, but it was too late to change. I don't confidently endorse all of the claims and recommendations in this memo (especially given my lack of familiarity with the field, tight length constraints, and lack of time to do as much research as I wanted), but I'm sharing it to potentially help others who might be interested.
(Summary: A debate league's yearlong policy debate resolution is about AI; does this seem like a good outreach opportunity?)
"Resolved: The United States Federal Government should substantially reform the use of Artificial Intelligence technology."
IMO, it's not the best wording, but that's the current team policy debate resolution in the Stoa debate league. For the next ~9 months, a few hundred high school students will be researching and debating "the use of artificial intelligence technology." In the past, people have posted about competitive debating and its potential relationship with EA; does this at all seem like an opportunity for outreach? (To be fair, Stoa is smaller than traditional public school leagues, but the policy debate norms are way better/less toxic, making team policy one of the most popular events in Stoa.)
Great spot. Presumably this means a lot of kids will be googling related terms and looking for pre-existing policy suggestions and pro/con lists.
This shortform has been superseded by the following new version of my memo: https://forum.effectivealtruism.org/posts/jPxnAawQ9edXLRLRF/harrison-d-s-shortform?commentId=zPBhKQL2q3cafWdc5.
The following is a midterm assignment I submitted for my Cyber Operations class at Georgetown, regarding the risk of large AI model theft. I figured I would just publish this since it's fairly relevant to recent discussions and events around AI model theft. (I also am posting this so I have a non-Google-Doc link to share with people.)
Note: In this assignment I had a 500-word limit and was only tasked to describe a problem's relevance to my client/audience while briefly mentioning policy options. In an upcoming memo assignment I will need to actually go into more detail on the policy recommendations (and I'd be happy to receive suggestions for what CISA should do if you have any).
(I also acknowledge that the recommendations I lay out here are a bit milquetoast, but I genuinely just didn't know what else to say...)
-------------
Memorandum for the Cybersecurity and Infrastructure Security Agency (CISA)
SUBJECT: Securing Large AI Models Against Theft
Large artificial intelligence (AI) models such as ChatGPT have increasingly demonstrated AI’s potential. However, as proprietary models become more powerful it is increasingly important to protect them against theft. CISA should work to facilitate information sharing that supports public policy and private responses. The following four sections will discuss some of the threat motivations/trends, potential consequences, market failures, and policy recommendations for CISA.
Motivations and Relevant Trends Regarding AI Model Theft
There are many reasons to expect that hackers will attempt to exfiltrate proprietary AI models:
China and other actors have repeatedly stolen sensitive data and intellectual property (IP).[1]
Future models may prove to have such significant economic or military value that state actors are willing to expend substantial effort/assets to steal them.
Current large models have high up-front development (“training”) costs/requirements but comparatively low operational costs/requirements after training.[2] This makes theft of models attractive even for non-state actors. Additionally, recent export controls on semiconductors to China could undermine China’s ability to train future large models,[3] which would further increase Beijing’s incentive to support model theft.
Someone reportedly leaked Meta’s new large language model (LLaMA) within days of Meta providing model access to researchers.[4]
Potential Consequences of AI Model Theft
Theft of powerful AI models, or the threat thereof, could have significant negative consequences beyond straightforward economic losses:
Many powerful AI models could be abused:
Content generation models could enhance disinformation and spear phishing campaigns.[5]
Image recognition models could empower semi-autonomous weapons or authoritarian surveillance.[6]
Simulation models could facilitate the design of novel pathogens.[7]
The mere threat of theft/leaks may discourage efforts to improve AI safety and interpretability that involve providing more access to powerful models.[8]
Enhanced Chinese AI research could intensify AI racing dynamics that prove catastrophic if “very powerful systems”[9] are attainable over the next 15 years.[10]
Why Traditional Market Incentives May Fail to Mitigate These Risks
Many companies will have some incentives to protect their models, but there are some reasons to expect their efforts will be suboptimal relative to the risks:
The risks described in the previous section are largely externalities, and companies that do not appropriately guard against these risks may out-compete companies that do.
Unauthorized use of models may be limited to foreign jurisdictions where the companies did not expect to make substantial profits (e.g., an off-limits Chinese economy).
Market dynamics may disincentivize some prosocial actions such as cybersecurity incident disclosures.[11]
Suggestions for CISA
CISA should explore some options to inform and facilitate public policy and private responses to these threats:
Map relevant actors and stakeholders.
Evaluate and/or propose platforms and frameworks for information sharing.
Assess the presence and impact of market failures.
Collect research relevant to actions that other actors could take (e.g., programs at DARPA/IARPA,[12] mandatory incident disclosure legislation).
Begin drafting a report which incorporates the previous suggestions and elicits input from relevant actors.
-------------
References
Allen, Gregory, Emily Benson, and William Reinsch. 2022. “Improved Export Controls Enforcement Technology Needed for U.S. National Security.” Center for Strategic and International Studies. November 30, 2022. https://www.csis.org/analysis/improved-export-controls-enforcement-technology-needed-us-national-security.
Brooks, Chuck. 2023. “Cybersecurity Trends & Statistics for 2023: More Treachery and Risk Ahead as Attack Surface and Hacker Capabilities Grow.” Forbes. March 5, 2023. https://www.forbes.com/sites/chuckbrooks/2023/03/05/cybersecurity-trends--statistics-for-2023-more-treachery-and-risk-ahead-as-attack-surface-and-hacker-capabilities-grow/?sh=2c6fcebf19db.
Calma, Justine. 2022. “AI Suggested 40,000 New Possible Chemical Weapons in Just Six Hours.” The Verge. March 17, 2022. https://www.theverge.com/2022/3/17/22983197/ai-new-possible-chemical-weapons-generative-models-vx.
Cottier, Ben. 2022. “The Replication and Emulation of GPT-3.” Rethink Priorities. December 21, 2022. https://rethinkpriorities.org/publications/the-replication-and-emulation-of-gpt-3.
Cottier, Ben. 2023. “Trends in the Dollar Training Cost of Machine Learning Systems.” Epoch. January 31, 2023. https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems.
Cox, Joseph. 2023. “How I Broke into a Bank Account with an AI-Generated Voice.” Vice. February 23, 2023. https://www.vice.com/en/article/dy7axa/how-i-broke-into-a-bank-account-with-an-ai-generated-voice.
Dickson, Ben. 2020. “The GPT-3 Economy.” TechTalks. September 21, 2020. https://bdtechtalks.com/2020/09/21/gpt-3-economy-business-model/.
Feldstein, Steven. 2019. “The Global Expansion of AI Surveillance.” Carnegie Endowment for International Peace. September 17, 2019. https://carnegieendowment.org/2019/09/17/global-expansion-of-ai-surveillance-pub-79847.
“Guaranteeing AI Robustness against Deception (GARD).” n.d. DARPA. Accessed March 11, 2023. https://www.darpa.mil/program/guaranteeing-ai-robustness-against-deception.
Humphreys, Brian. 2021. “Critical Infrastructure Policy: Information Sharing and Disclosure Requirements after the Colonial Pipeline Attack.” Congressional Research Service. May 24, 2021. https://crsreports.congress.gov/product/pdf/IN/IN11683.
Longpre, Shayne, Marcus Storm, and Rishi Shah. 2022. “Lethal Autonomous Weapons Systems & Artificial Intelligence: Trends, Challenges, and Policies.” Edited by Kevin McDermott. MIT Science Policy Review 3 (August): 47–56. https://doi.org/10.38105/spr.360apm5typ.
Nakashima, Ellen. 2015. “Chinese Breach Data of 4 Million Federal Workers.” The Washington Post, June 4, 2015. https://www.washingtonpost.com/world/national-security/chinese-hackers-breach-federal-governments-personnel-office/2015/06/04/889c0e52-0af7-11e5-95fd-d580f1c5d44e_story.html.
“Not My Problem.” 2014. The Economist. July 10, 2014. https://www.economist.com/special-report/2014/07/10/not-my-problem.
Rasser, Martijn, and Kevin Wolf. 2022. “The Right Time for Chip Export Controls.” Lawfare. December 13, 2022. https://www.lawfareblog.com/right-time-chip-export-controls.
Roser, Max. 2023. “AI Timelines: What Do Experts in Artificial Intelligence Expect for the Future?” Our World in Data. February 7, 2023. https://ourworldindata.org/ai-timelines.
Sganga, Nicole. 2022. “Chinese Hackers Took Trillions in Intellectual Property from about 30 Multinational Companies.” CBS News. May 4, 2022. https://www.cbsnews.com/news/chinese-hackers-took-trillions-in-intellectual-property-from-about-30-multinational-companies/.
“TrojAI: Trojans in Artificial Intelligence.” n.d. IARPA. Accessed March 11, 2023. https://www.iarpa.gov/research-programs/trojai.
Vincent, James. 2023. “Meta’s Powerful AI Language Model Has Leaked Online — What Happens Now?” The Verge. March 8, 2023. https://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misuse.
[1] Nakashima, Ellen. 2015. “Chinese Breach Data of 4 Million Federal Workers.” The Washington Post, June 4, 2015. https://www.washingtonpost.com/world/national-security/chinese-hackers-breach-federal-governments-personnel-office/2015/06/04/889c0e52-0af7-11e5-95fd-d580f1c5d44e_story.html; and
Sganga, Nicole. 2022. “Chinese Hackers Took Trillions in Intellectual Property from about 30 Multinational Companies.” CBS News. May 4, 2022. https://www.cbsnews.com/news/chinese-hackers-took-trillions-in-intellectual-property-from-about-30-multinational-companies/.
[2] For example, a single successful training run of GPT-3 reportedly required dozens of terabytes of data and cost millions of dollars of GPU usage, but the trained model is a file smaller than a terabyte in size and actors can operate it on cloud services that cost under $40 per hour. Sources:
Cottier, Ben. 2022. “The Replication and Emulation of GPT-3.” Rethink Priorities. December 21, 2022. https://rethinkpriorities.org/publications/the-replication-and-emulation-of-gpt-3; and
Dickson, Ben. 2020. “The GPT-3 Economy.” TechTalks. September 21, 2020. https://bdtechtalks.com/2020/09/21/gpt-3-economy-business-model/.
Additionally, one report suggested that by 2030, state-of-the-art models may cost hundreds of millions or even more than $1 billion to train (although the report highlights that these estimates could significantly change). Source: Cottier, Ben. 2023. “Trends in the Dollar Training Cost of Machine Learning Systems.” Epoch. January 31, 2023. https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems.
[3] For discussion regarding this claim, see: Allen, Gregory, Emily Benson, and William Reinsch. 2022. “Improved Export Controls Enforcement Technology Needed for U.S. National Security.” Center for Strategic and International Studies. November 30, 2022. https://www.csis.org/analysis/improved-export-controls-enforcement-technology-needed-us-national-security; and
Rasser, Martijn, and Kevin Wolf. 2022. “The Right Time for Chip Export Controls.” Lawfare. December 13, 2022. https://www.lawfareblog.com/right-time-chip-export-controls.
[4] Vincent, James. 2023. “Meta’s Powerful AI Language Model Has Leaked Online — What Happens Now?” The Verge. March 8, 2023. https://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misuse.
[5] Brooks, Chuck. 2023. “Cybersecurity Trends & Statistics for 2023: More Treachery and Risk Ahead as Attack Surface and Hacker Capabilities Grow.” Forbes. March 5, 2023. https://www.forbes.com/sites/chuckbrooks/2023/03/05/cybersecurity-trends--statistics-for-2023-more-treachery-and-risk-ahead-as-attack-surface-and-hacker-capabilities-grow/?sh=2c6fcebf19db;
Cox, Joseph. 2023. “How I Broke into a Bank Account with an AI-Generated Voice.” Vice. February 23, 2023. https://www.vice.com/en/article/dy7axa/how-i-broke-into-a-bank-account-with-an-ai-generated-voice.
[6] Feldstein, Steven. 2019. “The Global Expansion of AI Surveillance.” Carnegie Endowment for International Peace. September 17, 2019. https://carnegieendowment.org/2019/09/17/global-expansion-of-ai-surveillance-pub-79847;
Longpre, Shayne, Marcus Storm, and Rishi Shah. 2022. “Lethal Autonomous Weapons Systems & Artificial Intelligence: Trends, Challenges, and Policies.” Edited by Kevin McDermott. MIT Science Policy Review 3 (August): 47–56. https://doi.org/10.38105/spr.360apm5typ (p. 49).
[7] Calma, Justine. 2022. “AI Suggested 40,000 New Possible Chemical Weapons in Just Six Hours.” The Verge. March 17, 2022. https://www.theverge.com/2022/3/17/22983197/ai-new-possible-chemical-weapons-generative-models-vx.
[8] The example of Meta’s LLaMA, mentioned earlier, provides both some support and rebuttal for this concern: Meta has insisted it plans to continue sharing access despite the leaks, but there are good reasons to think this event will discourage other companies from implementing similar access rules. Source: Vincent, “Meta’s Powerful AI Language Model Has Leaked Online.”
[9] By this, I am referring to systems such as highly autonomous cyber systems (which could conceivably cause unintended havoc on a scale far greater than Stuxnet), AI systems in nuclear forces or strategic operations (e.g., early warning systems, command and control, and tracking foreign nuclear assets such as missile submarines), or outright “human-level” artificial general intelligence (AGI).
[10] Surveys of AI experts provide a mixed range of forecasts, but in a 2022 survey a non-trivial portion of such experts forecasted a 50% chance that “human-level AI” (roughly defined as a system that is better than humans at practically all meaningful tasks) will exist by 2035. Additionally, half of the surveyed experts forecasted a 50% chance of this outcome by 2061. Notably, however, some types of “very powerful systems” (e.g., highly autonomous cyber systems) may not even require “human-level AI.” For data and further discussion regarding these forecasts, see Roser, Max. 2023. “AI Timelines: What Do Experts in Artificial Intelligence Expect for the Future?” Our World in Data. February 7, 2023. https://ourworldindata.org/ai-timelines.
[11] For sources on this claim, see: “Not My Problem.” 2014. The Economist. July 10, 2014. https://www.economist.com/special-report/2014/07/10/not-my-problem; and
Humphreys, Brian. 2021. “Critical Infrastructure Policy: Information Sharing and Disclosure Requirements after the Colonial Pipeline Attack.” Congressional Research Service. May 24, 2021. https://crsreports.congress.gov/product/pdf/IN/IN11683.
[12] DARPA and IARPA are already working on some projects related to the security and reliability of AI models, including GARD at DARPA and TrojAI at IARPA. Sources: “Guaranteeing AI Robustness against Deception (GARD).” n.d. DARPA. Accessed March 11, 2023. https://www.darpa.mil/program/guaranteeing-ai-robustness-against-deception; and
“TrojAI: Trojans in Artificial Intelligence.” n.d. IARPA. Accessed March 11, 2023. https://www.iarpa.gov/research-programs/trojai.
EA (forum/community) and Kialo?
TL;DR: I’m curious why there is so little mention of Kialo as a potential tool for hashing out disagreements in the EA forum/community, since I think it would be at least worth experimenting with. I’m considering writing a post on this topic, but want to get initial thoughts first (e.g., have people already considered it and decided it wouldn’t be effective, initial impressions/concerns, better alternatives to Kialo).
The forum and broader EA community have lots of competing ideas and even some direct disagreements. Will Bradshaw's recent comment about discussing cancel culture on the EA forum is just the latest example of this that I’ve seen. I’ve often felt that a platform like Kialo would be a much more efficient way of recording these disagreements, since it helps to separate out individual points of contention and allows for deep back-and-forth, among many other reasons. However, when I search for “Kialo” in the search bar on the forum, I only find a few minor comments mentioning it (as opposed to posts), and they are all at least 2 years old. I think I once saw a LessWrong post downplaying the platform, and I was wondering whether people here have developed similar impressions.
More to the point, I was curious to see if anyone had any initial thoughts on whether it would be worthwhile to write an article introducing Kialo and highlighting how it could be used to help hash out disagreements here/in the community? If so, do you have any initial objections/concerns that I should address? Do you know of any other alternatives that would be better options (keeping in mind that one of the major benefits of Kialo is its accessibility)?
How would you feel about reposting this in EAs for Political Tolerance (https://www.facebook.com/groups/159388659401670)? I'd also be happy to repost it for you if you'd prefer.
Do you just mean this shortform or do you mean the full post once I finish it? Either way I’d say feel free to post it! I’d love to get feedback on the idea
Reposted: https://www.facebook.com/groups/159388659401670/permalink/165329602140909
I have created an interactive/explorable version of my incomplete research graph of AI policy considerations, which is accessible here: https://kumu.io/hmdurland/ai-policy-considerations-mapping-ora-to-kumu-export-test-1#untitled-map
I'm not sure this is worth a full post, especially since the original post didn't really receive much positive feedback (or almost any feedback, period). However, I was excited to discover recently that Kumu seems to handle the task of exporting from ORA fairly well, and I figured "why not make it accessible" rather than just relying on screenshots (as I did in the original article).
To rehash the original post/pitch, I think that a system like this could:
1a) reduce the time necessary to conduct literature reviews and similar tasks in AI policy research;
1b) improve research quality by reducing the likelihood that researchers will overlook important considerations prior to publishing or that they will choose a suboptimal research topic; and
2) serve as a highly-scalable/low-oversight task for entry-level researchers (e.g., interns/students) who want to get experience in AI policy but were unsuccessful in applying to other positions (e.g., SERI) that suffer from mentorship constraints—whereas I think that this work would require very little senior researcher oversight on a per-contributor basis (perhaps like a 1 to 30 ratio, if senior researchers are even necessary at all?).
The following example screenshots from Kumu will be ugly/disorienting (as it was with ORA), as I have put minimal effort into optimizing the view, and it really is something you need to zoom in for since you otherwise cannot read the text. Without further ado, however, here is a sample of what's on the Kumu project:
A few months ago I wrote a post on a decision-analysis framework (the stock issues framework) that I adapted from a framework of the same name which is very popular/prominent in competitive high school policy debate. I was surprised not to receive any feedback/comments (I was at least expecting some criticism, confusion, etc.), but in retrospect I realized that it was probably a rather lengthy/inefficient post. I also realized that I probably should have first written a shortform post to get a sense of interest, some preliminary thoughts on the validity and novelty/neglectedness of the concept, and how/where people might misinterpret or challenge the concept (or otherwise want to see more clarity/justification). So, I’ll try to offer a simplified summary here in the hope of getting more insight into some of those things I mentioned (e.g., the potential value, novelty/neglectedness, validity, areas of confusion/skepticism).
The framework remarkably echoes the “importance, neglectedness, tractability” (INT) heuristic for cause area prioritization, except that the stock issues framework is specific to individual decisions and avoids some of the problems of the INT heuristic (e.g., the overgeneralized assumption of diminishing marginal returns). Basically, the stock issues framework holds that every advantage and disadvantage (“pro and con”) of a decision rests on four mutually exclusive and exhaustive concepts: inherency (which is reminiscent of “neglectedness,” but is more just “the descriptive state of affairs”), significance, feasibility, and solvency. (I explain them in more detail in my post.)
Over time, I have informally thought of and jotted down some of the potential justifications for promoting this framework (e.g., checking against confirmation and other biases, providing common language and concept awareness in discourse, constructing concept categories so as to improve learning and application of lessons from similar cases). However, before I write a post about such justifications, I figured I would write this shortform to get some preliminary feedback, as I mentioned: I’d love to hear where you are skeptical, confused, interested, etc.! (Also, if you think the original post I made should/could be improved--such as by reducing caveats/parentheticals/specificities, making some explanation more clear, etc.--feel free to let me know!)
I really appreciate your constructive attitude here :) I write below some recommendations and my take on why this wasn't successful. Some of it is a bit harsh, but that's because I honestly respect you and think you'll take it well 😊
I remember coming across your post, which is in an area that I'm very interested in, but seeing that I didn't remember any details and didn't upvote, I probably just skimmed it and didn't find it worth my time to read. I've read it now, and I have some thoughts about how you could have written a post on this topic which I would find interesting and more readable - after reading it now, I think that it has some useful content that I'd like to know.
What I'd really hope you will do is to write a short post (not a shortform) which only explains this framework and some of its features, without unneeded meta-discussion. I've tried skimming the Wikipedia page, but it's in a different enough context and language that it's difficult for me to understand without a lot of effort.
Thanks for the insight/feedback! I definitely see what you are saying on a lot of points. I’ll be working on an improved post soon that incorporates your feedback.
I’ve spent hours today trying to find answers for this, and I’m reaching the point where I think it’s worth throwing out this question just in case someone out there can just solve my problem fairly quickly.
Basically I'm trying to find a platform/software solution for a (~10 person?) AI research project idea I'm doing a writeup for, but I've searched for a while without success. I have a list of the features it would need to (or preferably) have, as well as some example platforms/software which have all the necessary features except for cornerstones like "capable of real-time collaboration (like on a Google Doc)."
Basically, I have already done a partial writeup on this project idea, which includes screenshots of one software which seemingly has most of the bare-necessity features except real-time collaboration: https://forum.effectivealtruism.org/posts/9RCFq976d9YXBbZyq/research-reality-graphing-to-support-ai-policy-and-more
Thus, I think it’s now worth asking whether there is any "help me find [or build] a software solution" service (preferably but not necessarily within EA)? Or should I just post a question somewhere (e.g., on the normal EA forum)?
I think your link doesn't work. It seems good to provide a description of your desired software (a few sentences/paragraphs) and some bullet points, early in your post?
Ah, does the link just not work, or are you saying it’s not helpful for finding a sufficient software? I realize now that I meant to say “a partial writeup of the project idea [including the screenshot of the software]”, not a writeup containing a list of the desired features.
good] to provide a description of your desired software..." which led me to think you were able to access the article ("it"), which confused me. I also didn't have problems when I tried the link, but of course it's now obvious that was because I am the editor. I just fixed that; I'm not even sure how I ended up copying the edit post link anyway.
For one of my grad school classes we've been discussing "strategy" and "grand strategy", and some of the readings even talk about theories of victory. I've been loosely tracking the "AI strategy" world, and now that I'm reading about strategy, I figured it might be helpful to share a framing that I've found helpful (but am still uncertain about):
It may feel tempting for some people to approach "strategy" (and somewhat relatedly, "theory" (e.g., IR theory)) as one might approach some problems in hard sciences: work hard to rigorously find the definitive right answer, and don't go around spreading un-caveated claims you think are slightly wrong (or which could be interpreted incorrectly). However, I personally found it helpful to frame "strategy" in terms of an optimization problem with constraints:
I'd be interested to hear people's thoughts! (It's still fairly raw from my notes)
Working title: Collaborative Discussion Spaces and "Epistemic Jam Sessions" for Community Building Claims/Ideas?
Tl;dr: I created an example discussion space on Kialo for claims/ideas about EA community building, with the idea being that community builders could collaborate via such structured discussions. Does this seem like something that could be valuable? Is it worth making this shortform into a full post?
I’m a big fan of structured discussions, and while reading this post early last month I wondered: would it be helpful if there were some kind of virtual space for sharing claims and ideas—and arguments for/against those claims and ideas—about community organizing?
For example,
I went ahead and created a toy example of what such a structured discussion could look like on Kialo, with some example claims and arguments for/against those claims.
Building on this, I also wondered if it might be good to designate some 1–3 day period each month as a focal/Schelling point for community organizer participation, perhaps also with some non-binding/optional goals laid out in advance (e.g., “we want to get a better sense of how to improve outreach/success at lower-prestige universities,” “we want to hear about your experiences/advice regarding outreach to STEM groups”). Perhaps you could call these “epistemic jam sessions” (I’m totally open to accepting better name ideas). Regardless, these discussions could be open to contributions at any time.
Ultimately, I’d love to hear your thoughts on:
I'm considering doing another pilot "epistemic map", but I'm trying to decide what topic I should do it on, and thus soliciting suggestions.
For more on epistemic mapping, you can see here for a presentation I recently gave on the topic (just ignore the technical issues in the beginning).
Whereas my last pilot/test map focused on the relationship between poverty and terrorism (and the associated literature), I want to do this one on something EA-relevant. FWIW, I think that epistemic mapping is probably most valuable for topics that are important, dynamic (e.g., assumptions or technological capabilities may change over time), unsettled/divisive, and/or have a non-small literature base (among a few other considerations).
Some of my ideas thus far have been the controversial Democratising Risk paper (or something else X-risk related), the Worm Wars debate, biosecurity/pandemic risks, or maybe something about AI. But I'd love to hear any other suggestions (or feedback on those ideas I listed)!
[Summary: Most people would probably agree that science benefited greatly from the shift to structured, rigorous empirical analyses over the past century, but some fields still struggle to make progress. I’m curious whether people think that we could/should seek to introduce more structure/sophistication to the way researchers make and engage with theoretical analyses, such as something like "epistemic mapping"]
I just discovered this post, and I was struck by how it echoed some of my independent thoughts and impressions, especially the quote: "But it should temper our enthusiasm about how many insights we can glean by getting some data and doing something sciency to it."
(What follows is shortform-level caveating and overcomplicating, which is to say, less than I normally would provide, and more about conveying the overall idea/impression)
I've had some (perhaps hedgehoggy) "big ideas" about the potential value of what I call "epistemic mapping" for advancing scientific study/inquiry/debate in a variety of fields. One of them relates to the quote above: the "empirical-scientific revolution" of the past ~100-200 years (e.g., the shift to measuring medical treatment effectiveness through inpatient/outpatient data rather than professionals’ impressions) seems to have been crucial in the advancement of a variety of fields.
However, there are still many fields where such empirical/data-heavy methods appear insufficient and where progress seems to languish: my impression is that this especially includes many of the social sciences (e.g., conflict studies, political science, sociology). There are no doubt many possible explanations, but over time I've increasingly wondered whether a major set of problems is, loosely, that the overall complexity of the systems (e.g., human decision-making processes vs. gravitational constants) + the difficulty of collecting sufficient data for empirical analyses + (a few other factors) leads to high information loss between researchers/studies and/or incentivizes people to oversimplify things (e.g., following the elsewhere-effective pattern of regression analyses and p<0.05 = paper). I do not know, but if the answer is yes, that leads to a major question:
How could/should we attempt to solve or mitigate this problem? One of the (hedgehoggy?) questions that keeps bugging me: We have made enormous advances in the past few hundred years when it comes to empirical analyses; in comparison, it seems that we have only fractionally improved the way we do our theoretical analysis... could/should we be doing better? [Very interested to get people's thoughts about that overall characterization, which even I'll admit I'm uncertain about]
So, I'm curious whether people share a similar sentiment about our ability/need to improve our methods of theoretical analysis, including how people engage with the broader literature beyond the traditional (and, IMO, inefficient) paragraph-based literature reviews. If people do share that sentiment, what do you think about epistemic mapping as a potential way of advancing some sciences? Could it be the key to efficient future progress in some fields? My base rates for such a claim are really low, and I recognize that I'm biased, but I feel like it's worth posing the question if only to see whether it advances the conversation.
(I might make this into an official post if people display enough interest)