Apple’s next Siri to use NVIDIA Blackwell via Google Cloud

Apple’s redesigned Siri, due September 2026, will route heavier requests to Google Cloud servers running NVIDIA Blackwell B200 chips. The news follows NVIDIA’s unveiling of Nemotron 3 Ultra.

Apple plans to route complex Siri queries to Google Cloud servers running on NVIDIA Blackwell B200 accelerators while keeping lighter tasks on-device. The redesigned assistant is scheduled to arrive in September 2026. Company testing found Mac-chip servers could not meet the model’s compute demands, prompting the use of external cloud infrastructure.

Under the reported setup, Google Cloud will host a licensed version of Gemini to run the heavier workloads and use NVIDIA Blackwell B200 data-center chips for inference. Apple has approved a form of confidential computing that encrypts data and AI models while they are processed on external chips, a step intended to preserve the company’s privacy controls when using third-party cloud servers.

NVIDIA introduced Nemotron 3 Ultra at Computex 2026. The model is open-source and is reported to contain roughly 500 billion to 550 billion parameters. NVIDIA positioned Nemotron 3 Ultra for agentic workflows, in which AI systems plan and execute multi-step tasks over long periods, and highlighted improvements in inference speed and cost for complex tasks.

NVIDIA wrote in a product announcement: “Nemotron 3 Ultra is built for that new workload. It’s a frontier smart model that delivers up to 5x faster inference and lowers the cost of complex agentic tasks by up to 30%.” The Nemotron 3 family also includes Nano and Super variants.

Adoption of the Nemotron 3 family has been strong, with more than 50 million downloads in the year through April 2026. Faster inference and lower cost per query matter for large-scale consumer services because they can reduce latency and operating expense when cloud servers handle many simultaneous requests.

The arrangement would connect Apple’s device and software layer with Google’s cloud services and NVIDIA’s data-center hardware. NVIDIA shares traded at about $216.18, up roughly 0.7% over the latest 24-hour period, while Apple shares traded near $310.04, up about 0.2%. WWDC 2026 begins June 8, and Apple is expected to use the event to provide more detail on how it will balance on-device processing with cloud-based models and on safeguards for data handled by external servers.
