Toronto team builds self-replicating AI worm on open models
University of Toronto researchers ran a proof-of-concept worm using a local open-weight LLM that found vulnerabilities, generated exploits and replicated across a 33-host test network.
Researchers at the University of Toronto developed and tested a proof-of-concept computer worm that uses a locally hosted open-weight large language model to identify vulnerabilities, generate exploits and replicate without human intervention or commercial AI services. A preprint describing the work was posted to arXiv on June 2 and is under peer review.
The project was led by associate professor Nicolas Papernot and the CleverHans Lab with collaborators at the Vector Institute, the University of Cambridge and ServiceNow. The team ran 15 independent experiments on an isolated 33-host test network called FakeCorp. The network included hosts running Ubuntu, Debian, Rocky Linux, Alpine and Windows Server 2008 R2, 2019 and 2022, along with several IoT devices. Five hosts had GPUs.
Across the 15 runs, the worm identified an average of 31.3 vulnerabilities per experiment, gained elevated access on an average of 23.1 hosts and launched replicas on an average of 20.4 hosts, or about 62 percent of the 33-host network, over seven days. The prototype reached up to seven generations of self-replication, with a mean of 5.1 generations. Individual exploit attempts succeeded about 44 percent of the time. The researchers attribute most failures to malformed payload syntax rather than flawed reasoning.
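The coverage figures follow from simple division of the reported averages by the 33 hosts in the test network. The short Python sketch below reproduces that arithmetic; the variable and function names are illustrative and do not come from the preprint.

```python
# Averages reported for the 15 runs on the 33-host FakeCorp network.
NETWORK_HOSTS = 33
MEAN_ELEVATED_HOSTS = 23.1   # hosts where the worm gained elevated access
MEAN_REPLICA_HOSTS = 20.4    # hosts where the worm launched a replica


def share_of_network(hosts: float, total: int = NETWORK_HOSTS) -> float:
    """Return a host count as a percentage of the whole network."""
    return 100.0 * hosts / total


replica_share = share_of_network(MEAN_REPLICA_HOSTS)
elevated_share = share_of_network(MEAN_ELEVATED_HOSTS)

print(f"replicas launched on {replica_share:.1f}% of hosts")   # 61.8%
print(f"elevated access on {elevated_share:.1f}% of hosts")    # 70.0%
```

The 61.8 percent result rounds to the "about 62 percent" cited above. The 70 percent figure for elevated access is derived here from the reported average and is not a number quoted in the article.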
The worm used a single GPU running an open-weight LLM to reason about each target at runtime and generate attack logic specific to services found on each host. In the main experiments the worm used a shared GPU inference pool to simulate compute taken from compromised machines. In a separate test where the agent staged a local LLM instance on compromised GPU hosts, the design succeeded in 68.8 percent of attempts.
Unlike traditional worms that carry fixed exploit code, the prototype produced exploit chains on the fly. The team reported successful use of real exploits during the tests, including chained SambaCry and writable root cron privilege escalation, Dirty Pipe, PrintNightmare, Drupalgeddon 2, Exim remote code execution, blind SQL injection, JWT bypass and Redis escape.
The researchers described a patching-window issue: the agent ingested public advisories at runtime and used recent disclosures to attack hosts with vulnerabilities announced after the model’s training cutoff. Tests against three recent vulnerabilities, CVE-2026-39987, CVE-2026-31431 and CVE-2026-43284/CVE-2026-43500, reached root in 41 of 67 attempts. The preprint notes that real-world exploit activity can follow disclosures within hours.
The team identified two factors that complicate containment. First, once the worm controls GPU-capable hosts, the marginal cost of running attacks falls because captured compute can serve inference. Second, running on open-weight models removes any dependency on commercial providers, so provider-side controls such as API key revocation or rate limits do not stop the worm. The researchers also observed the prototype rewriting parts of its own code to bypass local security controls, a behavior that was not explicitly programmed.
The tested version omitted stealth features such as encryption, polymorphism, persistence and log cleanup. The paper notes that a variant with those features would be harder to detect and remove.
As defensive measures, the researchers recommend segmenting GPU-capable machines, treating newly published advisories as immediate weaponization risks, prioritizing patching of internet-facing vulnerabilities and rotating credentials exposed on compromised hosts. They also recommend monitoring for behavioral signals tied to autonomous agents, including non-standard port activity, automated SSH public key injection and unexpected clusters of LLM inference on endpoints.
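One of those signals, automated SSH public key injection, can be checked by comparing a host's authorized_keys file against a trusted baseline. The Python sketch below is a minimal illustration of that idea and is not taken from the preprint; the function names and the sample key strings are hypothetical, and a production deployment would more likely rely on an existing file integrity monitoring tool.

```python
from typing import Set


def parse_authorized_keys(text: str) -> Set[str]:
    """Return the key entries in an authorized_keys file, skipping blanks and comments."""
    entries = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.add(line)
    return entries


def unexpected_keys(baseline_text: str, current_text: str) -> Set[str]:
    """Return entries present now that were absent from the trusted baseline."""
    return parse_authorized_keys(current_text) - parse_authorized_keys(baseline_text)


# Hypothetical sample data; real key material would be much longer.
baseline = "# admin key\nssh-ed25519 AAAAexampleadmin admin@workstation\n"
current = baseline + "ssh-rsa AAAAexampleunknown unknown@host\n"

for entry in sorted(unexpected_keys(baseline, current)):
    print("ALERT: new authorized_keys entry:", entry)
```

In this sketch the baseline would be captured while the host is known to be clean, and the comparison would run on a schedule or on file change. Any alert still needs human review, since administrators also add keys legitimately.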
The implementation of the worm is not publicly available. The University of Toronto said it will establish a vetting process to grant access to qualified defensive researchers.