
In its simplest version, Hacker-AI (Type I) uses AI to hack arbitrary IT systems: it finds vulnerabilities automatically, e.g., to elevate privileges or user roles (sys-admin or system), steal encryption keys, hide attack tools from detection, or create code that cannot be removed from systems once they are occupied. It would also create exploits automatically. In an advanced version, Hacker-AI (Type II) would actively operate as malware on computer systems with sys-admin rights, trying to stay hidden from detection and making itself irremovable by any software deployed later.

In my post “Hacker-AI and Digital Ghosts – Pre-AGI”, I argued that Hacker-AI and Digital Ghosts are feasible and a serious threat to our civilization. In my follow-up post, “Improved Security to Prevent Hacker-AI and Digital Ghosts”, I proposed several solutions to eliminate this threat, e.g., via a separate layer of standardized security-related operations detached from regular tasks in which we ignore all security-related (regular) results.

In this post, I will discuss the methods by which hackers and Hacker-AI gain the upper hand in our current cyber-security ecosystem, despite the potential use of AI in cyber defense.

I don’t expect this post to surprise or educate hackers (criminal or governmental) or anyone knowledgeable enough to be a hacker; they already know about Stuxnet and about Pegasus from the NSO Group. The relative scarcity of zero-day vulnerabilities could tell us that hacking is still not done systematically, but I am not sure about the cyberwar capabilities of governments. Moreover, a hacker needs many (enhanced) tools to be effective. Some dumps of tools from hacker/government groups have been published, but too little is publicly known, so I don’t dare to have a (public) opinion about that. I still assume it is not easy for lone malicious actors or, more specifically, skilled system developers who want to damage our (cyber) world to quickly turn into professional, efficient hackers. But how broadly these (more enhanced) software features are available could change quickly.

As a side note, I have a question: what kind of world would we live in if the physics of building nuclear weapons were easy enough for students to build nukes in school experiments? That world would likely be very different (less free, more surveillance), or most of us would no longer be alive. In my “Hacker-AI and Digital Ghosts – Pre-AGI” post, I argued that Hacker-AI (Type II) is a first-strike weapon capable of decapitating governments or societies. Therefore, there is a large incentive to develop Hacker-AI. Hacker-AI (Type II) would not be deadly, but it would be as dangerous as nukes: it could be quietly deployed against people’s freedom, privacy, and security. So, what if developing Hacker-AI were achievable by many skilled individuals, potentially within a few weeks in the privacy of their homes? I assume it would end trust in anything technical. Could cyber defense fix all detected vulnerabilities? Doubtful. Instead, and even without competing Hacker-AI (Type II) systems, there would always be the nagging question: are all vulnerabilities in our systems fixed? Or is some malicious software (i.e., an external cause) responsible when someone seems to be an unlucky victim of circumstances? Most of us will soon have tens or hundreds of Internet-capable devices; any of them could be a spy, traitor, or saboteur. How do we know we are not becoming targeted victims in eCommerce or even online banking? Constant threats from Hacker-AI are not sustainable.

I am raising the issue of Hacker-AI now, hoping that we have one, two, or even five years before (open) tools and methods are broadly available to more hackers seeking to develop their private Hacker-AI version. The scary thought is that the code enabling AI is relatively simple. The hopeful thought is that the global scientific community came up with viable solutions against COVID-19 within months. Whether this can also be done against Hacker-AI is unknown and had better not be live-tested.


The underlying concern is that current CPUs and OS implementations are too complex to prevent attackers from manipulating the OS. Complexity is the enemy of security. Also, an attacker needs only one vulnerability to achieve his goal. Finally, the weakest point (i.e., a vulnerability) determines the strength of the security. Under these circumstances, we should conclude that software-based security is unlikely to provide security. Unfortunately, hardware-based security does not guarantee better results either: it depends on how the hardware is instantiated, used, and checked for anomalies.
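
The "only one vulnerability" argument can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that a system consists of n components and that each one independently contains an exploitable flaw with the same probability p. Neither the independence assumption nor the example numbers come from measurements; the function name is hypothetical.

```python
def prob_at_least_one_flaw(p: float, n: int) -> float:
    """Probability that at least one of n components has an exploitable flaw.

    Assumption (illustrative only): flaws occur independently and with
    the same probability p in every component.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # Complement of "no component has a flaw".
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Made-up flaw rate of 0.1% per component, for growing system sizes.
    for n in (10, 1_000, 10_000):
        print(n, round(prob_at_least_one_flaw(0.001, n), 4))
```

Under these assumptions, the probability approaches 1 as the number of components grows, even when each individual component is very unlikely to be flawed. That is one way to read "complexity is the enemy of security".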

Cyber-security protection depends on the OS and on the adversary’s assumed abilities. As a disclaimer: I don’t know any classified systems or methods, or what is being developed behind closed doors by hackers or legitimate cyber-security companies. However, I dare to make some educated guesses here. I readily acknowledge that I may be too optimistic or pessimistic in my assumptions. Still, the lack of open/public information should not be used as an excuse for ignoring this problem.

Starting with the adversary: criminal or governmental groups have the resources to hire top hackers, i.e., people with a strong computer science, systems, or kernel programming background. They understand OS components/layers and, likely, how compilers work. These expert hackers can likely make goal-oriented decisions about how to extract relevant information much faster than hackers with less knowledge.

So far, hacking is labor- and time-intensive. But it is conceivable that hackers have created (internal) libraries of proprietary automation tools that they extend and share among different hacker teams. As professionals, they have likely invested in tools to analyze extracted data (including context-related (meta-)data) so that humans can more easily infer their next steps toward their goal (e.g., stealing an encryption key or finding methods to elevate the user role/privilege, i.e., to become super-user or even system). Software developers see labor-intensive steps as something to optimize, either to make them faster or to support better decisions on how to go forward. In the end, hacking has a binary success criterion: a simple test will show whether they succeeded or failed. And understanding the reasons for failures motivates more innovation.

AI-based Hacking

An important tool for hackers is Reverse Code Engineering (RCE), i.e., the use of decompilers, disassemblers, and debuggers. Hackers’ biggest obstacle is understanding the meaning of variables and functions. In source code, meaningful names are essential for making sense of code snippets. But there are other ways to help humans reach a similar level of information or understanding. However, if hacking is (fully) automated, the meaning of variables or functions becomes secondary or even irrelevant.

Open-source versions exist for all required hacking tools. But in their basic form, these tools are relatively difficult and inconvenient to use. They are improved with additional automation tools that are often commercially available or shared among teams; otherwise, RCE or hacking is very labor-intensive. Extracting certain low-level data or transforming output into a more usable, enhanced form helps attackers decide what is relevant or irrelevant to their goal. I am writing this as a self-educated practitioner, not a mentored expert. Therefore, more experienced hackers may follow a more efficient approach.

Without saying that this is already done, AI could help to systematically test different hypotheses about variables and functions and to rank them by the probability that they are relevant to a binary goal. AI approaches such as the one described in DeepMind’s “Reward is Enough” could turn many of these tasks into rules for a game. Discarding low-probability candidates could reduce the search space significantly. High-probability variables/functions would be tested continuously, directly or in combination with other values/functions, to see whether the goal is accomplished.
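
The rank-and-prune idea in the previous paragraph can be sketched in the abstract. This is a toy illustration only: the candidate names, the scores, and the goal test are hypothetical placeholders, and nothing here models a real analysis tool.

```python
from typing import Callable, Dict, List, Optional


def rank_and_prune(scores: Dict[str, float], threshold: float) -> List[str]:
    """Drop candidates scoring below the threshold; return the rest, best first."""
    kept = [name for name, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda name: scores[name], reverse=True)


def first_success(candidates: List[str],
                  goal_reached: Callable[[str], bool]) -> Optional[str]:
    """Test candidates in order; return the first one that meets the binary goal."""
    for name in candidates:
        if goal_reached(name):
            return name
    return None


if __name__ == "__main__":
    # Hypothetical relevance estimates for abstract candidates.
    scores = {"a": 0.02, "b": 0.71, "c": 0.40, "d": 0.01}
    shortlist = rank_and_prune(scores, threshold=0.05)
    print(shortlist)
    # The lambda stands in for the "simple test" of a binary success criterion.
    print(first_success(shortlist, lambda name: name == "c"))
```

The point of the sketch is only that pruning shrinks the list that has to be tested, and that a binary success test makes the remaining search easy to automate.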

Humans often (unintentionally) write overly complicated code. This code could have undetected bugs, so if we tried to automate code simplification (i.e., code transformations), the result would likely preserve these bugs.

Code obfuscation and RASP (runtime application self-protection) transform code so that the software’s core features are preserved while additional code is inserted to distract attackers or slow down their understanding.

Reversing this obfuscation requires in-depth information; the software is run in a simulated environment (special sandboxes or virtual machines) to extract revealing information or the data held in certain variables. Overwriting variable values at the right time could prevent RASP-related stop subroutines from activating. The automated removal of RASP-related code is not yet being discussed publicly (for legal reasons?). Does that mean nobody works on it? RASP is (easily) detectable, and by overwriting variable values, as debuggers or sandboxes do, the consequences of RASP could be suppressed as part of a game played by an AI.

Moreover, AI could transform machine code into a more basic, standardized form of operations that reveals the algorithm’s structure and its use of the OS, independent of the compiler or (computer) language used. This form could be called “software normal form” (SNF), by analogy with the normal forms used in other IT or math contexts. From this SNF, automated tools could systematically study code features, assign probabilities, and remove RASP hurdles one by one by comparing the internal states of the original software with those of the modified one. I have not seen this type of SNF discussed publicly, but it would be a natural intermediate step toward making RCE (or hacking) more independent of OS, CPU, compiler, library, or language variations. (A brief side note on intermediate languages and VMs: they do not make applications safer, because they can be compromised or turned into a sandbox used within attacks.)

For attackers, the search space for finding vulnerabilities is huge. But the defenders’ task is harder: they must find all vulnerabilities. Vulnerabilities are detected all the time, but compared to the large amount of existing source code, they are (so far) relatively rare exceptions (I would assume there are many more). AI could be trained on already found vulnerabilities (or its success could be calibrated by how well it finds known vulnerabilities). More interesting is what would happen if we had a relentless AI persistently seeking vulnerabilities. Would we conclude that we can remove all vulnerabilities, or that we can’t? Would AI tell us that all vulnerabilities are removed? I doubt that we will have the tools to answer these questions. Therefore, I side with the assumption that we will always have vulnerabilities in our code despite the use of AI in cyber defense.

Most likely, Hacker-AI (Type I) is initially used within the safety of hacker-controlled servers. Hacker-AI pre-analyzes code and reduces attacks to simple exploits, i.e., instructions that would predictably deliver the expected results (with a low probability of detection). When deployed, software is usually bundled with features already available on systems because, during development, it is unknown what features systems already provide. Smarter malware could use existing (unchanged) OS tools for its task without creating suspicion (i.e., living off the land). It could even adapt to local systems via additional data (usage instructions) it requests from the outside after it has explored the local system. It could request data via piggybacking messages using temporarily modified apps or libraries, i.e., it could use the network to get help from systems with lower security. These are features that malware could already have without Hacker-AI. However, it is probably not done because undetectability is not a top design goal of malware developers, or malware developers are concerned about “air gaps” between targeted systems and the Internet.

Human defenders have a small advantage over attackers: attackers use only a relatively small number of similar methods to attack systems (that’s why we use blacklisting in cybersecurity). Although exploits number in the thousands, they do not yet come in millions or billions of variations. Additionally, many of these attacks are done poorly, i.e., they leave traces due to human mistakes. Also, malware gives its developers no feedback on which new tools are used to detect it. Finally, malware is not automatically trained in concealment, camouflage, or self-defense within a wargaming environment that could reveal avoidable detections and fix their causes proactively. However, these are features that I would expect from a Hacker-AI (Type I or II) and a digital ghost.

But maybe we see only poorly done malware, not the sophisticated advanced Hacker-AI. I am aware of the problem with this argument: you can never prove a negative. Still, for people responsible for security, this concern must be on their minds. I am not convinced that digital ghosts are permanently invisible; on the contrary, I assume we could develop reliable tools to detect and stop them consistently. The same applies to stopping unknown exploits. No unknown software should be allowed to be executed (automatically) on any IT device.
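
The rule that no unknown software should be executed is a default-deny (allowlist) policy, the opposite of the blacklisting mentioned earlier. A minimal sketch, assuming that "known" means "the SHA-256 hash of the binary is on a trusted list"; the function names and the in-memory set are hypothetical, and a real system would also have to protect the list and the check itself from manipulation.

```python
import hashlib
from typing import Set


def sha256_of(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def may_execute(binary: bytes, allowlist: Set[str]) -> bool:
    """Default deny: permit only binaries whose hash is on the allowlist."""
    return sha256_of(binary) in allowlist


if __name__ == "__main__":
    # Stand-in bytes for a vetted program (illustrative only).
    trusted = b"vetted program bytes"
    allowlist = {sha256_of(trusted)}

    print(may_execute(trusted, allowlist))               # known binary
    print(may_execute(trusted + b"\x00", allowlist))     # modified binary
    print(may_execute(b"never seen before", allowlist))  # unknown binary
```

A blocklist has to enumerate every bad variant in advance, whereas this check rejects anything that was not explicitly approved, including code that no defender has seen before.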

Limits to Hacker-AI

Computer systems are, by design, open systems. Even high-security computers have a von Neumann architecture, i.e., a unified address space in which instructions and data are stored together. All multi-user, multithreading systems share the same OS concepts with their less-safe cousins. Optimization and redundancy within some components make them different in many details, but all systems are designed in layers to be more easily extendable or modifiable. If Hacker-AI can read (or receive) compiled binaries, that is probably enough for it to check for vulnerabilities. (Side note: it is probably not enough to use the OS’s access control system to prevent read access, because these “simple” read features are too complex to be trusted.)

Limits to Hacker-AI can only be set by computer systems (CPU/OS), not by inherent limitations of the AI used in Hacker-AI. This means that if a system is vulnerable, some (future) iteration of Hacker-AI will find the vulnerability. Theoretically, Hacker-AI (even Type II) could be deceived via honeypots, tripwires, or secret code that it doesn’t know or is prevented from analyzing in detail. The problem with the secrecy approach is that we don’t know what an attacker really knows (secrets may already be known) or how far its attack went before it was detected.

Currently, an attack is thought of as actively and repeatedly probing for holes in the defense, and we might assume that Hacker-AI would proceed the same way. However, I doubt that Hacker-AI would merely try; it would carry out a successful attack with the fewest possible interactions with the targeted systems because it knows in advance that the attack will work. It would assess the situation and then attack only once; afterward, it would delete all traces as if nothing had happened. Attacks with massive numbers of attack events might be used to mask a single successful event. Detecting a successful Hacker-AI intrusion (even in hindsight) could be considered a failure of Hacker-AI, and that is likely only a temporary limitation of that Hacker-AI version.

Conceptually, Hacker-AI, if deployed as a Type-II attack tool, would need to be as undetectable as possible. The question is, are there limitations to undetectability? Operating systems of some computer systems are designed to detect malware or protect private data better and more easily/reliably than others. Without having first-hand knowledge, I assume that some military-grade systems are that way. But to analyze this problem, I ask a different question: Is it possible to have an OS kernel that can near-perfectly hide certain apps and hide that it is doing that?

If others are of a different opinion, they will certainly speak up, but I say yes: we can modify an OS so that it hides apps and hides that it has this feature. I am not claiming that these features could be hidden on the storage device from an independent audit. But if we were dealing with an advanced Hacker-AI (Type II) that had additionally stolen keys related to modifications of the CPU’s microcode, the boot guards of BIOS/UEFI, or TPM/TEE, then I am open to the hypothesis that this Hacker-AI (Type II) could find a way to hide from a comprehensive independent audit as well.


The main problem with Hacker-AI (Type II) is that it could operate in complex ecosystems that allow software to be modified covertly. It could use many methods to hide or camouflage its existence. Software and hardware have many layers into which attack code can be inserted. Software developers have no problems using these layers; the same applies to attackers misusing them.

Software and hardware are designed to provide some redundancy against failures without escalating them as big issues to users or manufacturers. Storage media and network connections deal with high error rates in their normal operation. Some of this error handling is delegated to hardware components that get their operating software from a potentially compromised operating system. Additionally, storage cells could be marked as damaged but still be used to hide information from security audits. Moreover, many file or data formats have comment fields or segments that could be misused to hide information. Software developers and manufacturers have introduced many features without being aware that these features could also be misused. Therefore, the problem is our ignorance or lack of vigilance about which software (features) could be used against us.
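
As one concrete instance of the comment fields mentioned above, the ZIP format allows free-form comments on the archive and on each entry. The sketch below shows the defender's side: an audit helper that reports the size of every non-empty comment. It uses Python's standard `zipfile` module; the helper names and the demo archive are hypothetical, and a real audit would need to cover many more formats and locations.

```python
import io
import zipfile
from typing import Dict


def comment_sizes(archive_bytes: bytes) -> Dict[str, int]:
    """Return the size in bytes of every non-empty comment in a ZIP archive."""
    sizes: Dict[str, int] = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        if zf.comment:
            sizes["<archive>"] = len(zf.comment)
        for info in zf.infolist():
            if info.comment:
                sizes[info.filename] = len(info.comment)
    return sizes


def make_demo_archive() -> bytes:
    """Build a small in-memory archive that carries an archive-level comment."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("readme.txt", "hello")
        zf.comment = b"free-form bytes that most tools never show"
    return buf.getvalue()


if __name__ == "__main__":
    print(comment_sizes(make_demo_archive()))
```

Extracting the demo archive yields only `readme.txt`; the comment bytes stay in the container and surface only when a tool looks for them explicitly.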

And finally, we allow every developer to provide software to us without establishing sufficient accountability. Computers and their software are essential for our civilization; we must protect ourselves against adversaries who try to hide in anonymity. We don’t take prescriptions or legal advice from anonymous sources, and we don’t let our money be managed by unnamed institutions or by organizations that don’t care whom they hire. Unfortunately, we accept anonymous software, often without even being informed about it, or we are deceived into accepting unknown code.


Many components required for Hacker-AI already exist. The new quality in Hacker-AI is the speed at which new malware could be produced for different platforms (CPU/OS). More importantly, there is no reason to assume that Hacker-AI (Type I or II) is not feasible. Does Hacker-AI already exist?

I consider it likely that governments have Hacker-AI (Type I) in their portfolio of cyberwar capabilities. There are likely significant differences in the quality of what different countries have developed and in how fast they are. Over time, they will end up on the same or a similar plateau. Whether criminal organizations or non-state hacker teams have already developed Hacker-AI (Type I) is unknown; if they had, we would probably see a decline in the prices paid for exploits of zero-day vulnerabilities. Also, if they attack crypto keys and detect additional user devices in 2-factor authentication, they may have Hacker-AI, and we will see a steep increase in eCommerce and online banking fraud.

Hacker-AI (Type I/II) is a serious cyberwar weapon because it could decapitate governments or societies and create permanent (global) supremacy for its operators/masters. Whether countries already have it is unknown. Defensively used AI will likely not protect us against any Hacker-AI, and certainly not against Type II. Furthermore, there is no reason to assume that the world community will agree to abandon the development of Hacker-AI (Type II). Using irremovable Hacker-AI (Type II) on systems on a massive scale would likely be considered an act of war. If the use of Type II is detected or merely suspected, and more than one nation has it, then there will be a race to be the first on as many systems as possible. Whether there is a way back (i.e., the removal of Type II from most systems) is unknown; it would be much better not to let it come to this point.

The consequences of having Hacker-AI (Type I and II) are unacceptable. Regulating or ostracizing Hacker-AI is unrealistic. There is only one option: dealing with Hacker-AI as a technical problem. My post, “Improved Security to Prevent Hacker-AI and Digital Ghosts”, provides several suggestions.






Autonomous AI hacking is a great theory, but it is difficult to see it happening in the current landscape, where artificial consciousness is not yet able to see itself as a separate entity; thus, there is no need for such systems to defend themselves or figure out an attack procedure. The embodiment problem is crucial for these machines to work autonomously.
