Google: AI-built 2FA zero-day stopped before mass attack

Google’s Threat Intelligence Group identified an AI-generated 2FA bypass — one that still required valid user credentials — and halted it before a planned mass exploitation campaign.

Google’s Threat Intelligence Group reported it identified and stopped an AI-generated zero-day exploit — a two-factor authentication bypass that still required valid user credentials — before a threat actor could launch a planned mass exploitation event.

Researchers found several indicators that the exploit was produced with a large language model: abundant educational docstrings inside the script, a fabricated Common Vulnerability Scoring System (CVSS) score, and a ‘textbook’ Python coding style consistent with LLM training data. Google said it has high confidence that a model other than its own Gemini created the working exploit.
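For illustration only, here is a harmless snippet written in the style the report describes: a tutorial-voice docstring, over-explained basics, and a confidently stated severity score that corresponds to nothing in any vulnerability database. The function and the score are invented for this example.

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """
    Step 1: Check whether the target port is reachable.

    This helper demonstrates a basic TCP connectivity check. In Python,
    the built-in `socket` module provides low-level networking
    primitives, and `create_connection` combines name resolution with
    the connect call for convenience.

    Severity: CVSS 9.8 (Critical)  <- fabricated; maps to no real advisory
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```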

The vulnerability was a logic error in 2FA enforcement caused by a hardcoded trust assumption in application code. An embedded static exception allowed authentication checks to be bypassed under certain conditions, even though attackers needed valid credentials to reach the vulnerable code path.
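GTIG did not publish the vulnerable code, but the pattern it describes is easy to sketch. The following Python is a hypothetical illustration of the bug class, not the affected product’s code; every name, including the client identifier, is invented for this example.

```python
from dataclasses import dataclass

# Hypothetical identifier; the affected vendor and real code were not disclosed.
TRUSTED_LEGACY_CLIENT = "legacy-sync-service"  # hardcoded trust assumption

@dataclass
class LoginRequest:
    username: str
    password_ok: bool   # the attacker already holds valid credentials
    client_id: str      # self-reported by the connecting client

def requires_second_factor(req: LoginRequest) -> bool:
    """Decide whether a credentialed login must complete 2FA."""
    # Intended policy: every login completes the second factor.
    # Bug: a static exception exempts one hardcoded client identity,
    # a value any authenticated attacker can simply claim.
    if req.client_id == TRUSTED_LEGACY_CLIENT:
        return False  # 2FA silently skipped
    return True
```

Nothing in that path crashes or corrupts memory; the code ‘works’ exactly as written, which is what makes the flaw invisible to crash-oriented tooling.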

GTIG researchers wrote that semantic mistakes of this kind are easier for advanced language models to spot and exploit than for traditional fuzzers or static-analysis tools, which focus on crashes and low-level memory issues rather than developer intent.

John Hultquist, chief analyst at GTIG, said, ‘There’s a misconception that the AI vulnerability race is imminent. The reality is that it’s already begun.’ He added that frontier models can reason about context, correlating stated enforcement logic with the contradictions buried in hardcoded exceptions.

The GTIG report described criminal and state-backed groups in China, North Korea and Russia using AI to speed vulnerability discovery and exploit development. These actors have used persona-driven prompting, specialized security datasets, and large volumes of automated queries to support their workflows.

GTIG tracked one group labeled UNC2814 using expert persona prompts in Gemini to research remote code execution flaws in TP-Link router firmware and in Odette File Transfer Protocol implementations. Another actor, APT45, issued thousands of repetitive prompts to analyze known CVEs and validate proof-of-concept exploits.

The report said AI tools are being applied across attack phases: finding vulnerabilities, generating exploits, automating command execution, refining reconnaissance and crafting social-engineering messages.

GTIG contrasted the strengths of the two tool classes, noting that fuzzers and static-analysis systems remain effective at finding memory corruption and crash conditions, while frontier language models performed better at recognizing high-level semantic mistakes and inconsistent authorization logic. The researchers added that LLMs still struggle with very complex enterprise authorization flows.

Google said its internal detection and response work likely disrupted the planned mass exploitation but did not name the targeted vendor or provide a precise timeline. The company reiterated that its Gemini model was not involved in creating the exploit.

GTIG suggested security teams review authentication logic for hardcoded trust assumptions and examine authorization paths for semantic errors that traditional testing may miss.
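Continuing the hypothetical sketch above, remediation amounts to deleting the static exception and pinning the previously exempt identity to the normal 2FA path with a regression test, so the bypass cannot quietly return.

```python
def requires_second_factor_fixed(req: LoginRequest) -> bool:
    """Every credentialed login completes 2FA, with no client exemptions."""
    return True

def test_legacy_client_is_not_exempt():
    # Regression test: the formerly 'trusted' identity must no longer
    # skip the second factor.
    req = LoginRequest(username="alice", password_ok=True,
                       client_id=TRUSTED_LEGACY_CLIENT)
    assert requires_second_factor_fixed(req)
```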
