Google Cloud Next: 8th‑gen TPUs, Gemini updates to scale AI agents

At Google Cloud Next in Las Vegas, Google introduced eighth‑generation TPU pods with 121 exaflops (FP4) and updated Gemini Enterprise and Vertex AI to support AI agents.

At Google Cloud Next in Las Vegas this week, Google announced eighth‑generation TPU pods and changes to its Gemini Enterprise and Vertex AI offerings aimed at supporting large AI agent deployments. The company said the updates address customer demand for higher training throughput and broader deployment options.

Google described a pod built from 9,600 of its eighth‑generation TPU chips that delivers 121 exaflops of FP4 performance. The company clarified that FP4 is a low‑precision numerical format used for high‑volume model training and inference, and is not comparable to the 64‑bit floating‑point measures used in traditional supercomputing rankings.
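The stated pod figures imply a per‑chip throughput that can be checked with simple arithmetic. The sketch below divides the aggregate pod number evenly across chips; this even split is an assumption, since Google did not publish per‑chip FP4 figures at the event:

```python
# Back-of-envelope check of the pod figures quoted at the event.
# Assumes the 121 exaflops is an aggregate figure split evenly
# across all 9,600 chips (hypothetical; not a published per-chip spec).
pod_exaflops_fp4 = 121    # stated FP4 performance per pod
chips_per_pod = 9_600     # stated chip count per pod

# 1 exaflop = 1,000 petaflops
per_chip_petaflops = pod_exaflops_fp4 * 1_000 / chips_per_pod
print(f"{per_chip_petaflops:.1f} PFLOPS (FP4) per chip")  # ≈ 12.6
```

Under that assumption, each chip would contribute roughly 12.6 petaflops of FP4 compute to the pod total.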

On the software side, Google expanded Gemini Enterprise and reorganized Vertex AI services. The company said the changes are intended to simplify building and operating agent-style systems by making it easier to route models to appropriate hardware, manage multiple models and connectors, and scale inference across distributed cloud and on‑premises environments.

Google said DeepMind teams were involved in the design and oversight of some new features. The company also announced a partnership with security firm Wiz to accelerate automation of security processes for AI workloads, aiming to detect and respond to risks in agent deployments.

One attendee at the conference observed, “Google Cloud keeps reiterating that it understands people are in a multi‑cloud world and no one wants to be locked into a single provider.” Another attendee highlighted the hardware numbers, saying, “In a pod of 9,600 8t TPUs they can deliver 121 exaflops of FP4 compute — FP4 is low‑resolution but suitable for training workloads.”

Speakers at the event described the combined hardware and software updates as offering more options for where and how enterprises run large AI systems, along with tools to manage multiple agents and scale production workloads. Conference presentations emphasized training capacity, inference efficiency and automated security controls as priorities for organizations deploying agent architectures.

Presentations and product briefs at Google Cloud Next framed the announcements as responses to demand for systems that combine multiple models, tools and connectors to complete tasks. The company provided technical specifications for the TPU pods and outlined the Vertex AI and Gemini Enterprise changes intended to support enterprise agent deployments.