Google shipped its seventh-generation Ironwood Tensor Processing Unit (TPU) in Q4 2025, while Anthropic signed a contract for 1 million Amazon Trainium2 chips. The moves reflect mounting pressure on cloud providers to reduce AI infrastructure costs as workloads scale.
Amazon unveiled Project Rainier, a data center architecture built around its Trainium2 chips, which are designed for AI training and inference. The company said the Anthropic deal will generate $2.1 billion in infrastructure revenue over three years. Amazon's AWS segment posted 22% year-over-year growth in Q3 2025, with AI workloads representing 31% of compute capacity.
Google's Ironwood TPU delivers 3.2x the performance per watt of the prior generation, targeting a 40% reduction in inference costs. Alphabet's Q3 earnings showed Cloud revenue up 28% to $11.4 billion, with TPU deployments expanding across 15 availability zones. CEO Sundar Pichai said custom chips now handle 63% of Google's internal AI training workloads.
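The gap between the 3.2x efficiency gain and the 40% cost target makes sense once power is treated as only one slice of total inference cost. A minimal back-of-envelope sketch in Python, assuming a 60/40 split between power-driven and fixed costs (the split is an illustrative assumption, not a figure Google has disclosed):

```python
# Back-of-envelope: how a 3.2x performance-per-watt gain maps to cost
# per inference. The 60/40 split between power-driven cost and fixed
# cost (amortized hardware, networking, staff) is an assumption for
# illustration only.

PERF_PER_WATT_GAIN = 3.2   # Ironwood vs. prior generation (reported)
POWER_COST_SHARE = 0.60    # assumed share of inference cost tied to energy/cooling
FIXED_COST_SHARE = 1 - POWER_COST_SHARE

# Energy cost per inference falls by the perf/watt multiple;
# the fixed share is unchanged in this simple model.
new_cost = POWER_COST_SHARE / PERF_PER_WATT_GAIN + FIXED_COST_SHARE
reduction = 1 - new_cost

print(f"Modeled cost per inference: {new_cost:.2f}x of baseline")
print(f"Implied reduction: {reduction:.0%}")  # ~41% with these assumptions
```

Under these assumptions the model lands near the stated 40% target; a different cost split would shift the result accordingly.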
The economics favor specialization. NVIDIA H100 GPUs cost $25,000-$30,000 per unit with lead times exceeding six months. Amazon's Trainium2 offers comparable performance at $15,000 per chip with guaranteed supply for committed customers. Google doesn't sell TPUs externally but uses its cost savings to price Cloud AI services 20-35% below competitors.
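At the quoted list prices, the acquisition-cost gap compounds quickly at cluster scale. A rough sketch, assuming one Trainium2 substitutes for one H100 per the "comparable performance" claim (the 10,000-chip cluster size is hypothetical):

```python
# Illustrative cluster acquisition cost at the per-unit prices cited
# above. Assumes one Trainium2 replaces one H100; real sizing depends
# on the workload and software stack.

CHIPS = 10_000                      # hypothetical cluster size
H100_UNIT = (25_000 + 30_000) / 2   # midpoint of the quoted $25k-$30k range
TRAINIUM2_UNIT = 15_000             # quoted price for committed customers

h100_cost = CHIPS * H100_UNIT
trn2_cost = CHIPS * TRAINIUM2_UNIT

print(f"H100 cluster:      ${h100_cost / 1e6:,.0f}M")
print(f"Trainium2 cluster: ${trn2_cost / 1e6:,.0f}M")
print(f"Savings:           {1 - trn2_cost / h100_cost:.0%}")  # ~45% at list price
```

Lead times and software ecosystem costs sit outside this arithmetic, which is why guaranteed-supply terms matter as much as the sticker price.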
Market analysts project custom AI accelerators will command 25-30% of the AI training chip market by 2027, up from 12% in 2025. Morgan Stanley estimates hyperscaler chip development programs will pull $18 billion in annual revenue from NVIDIA by 2028. The shift concentrates risk: companies building proprietary silicon face development costs of $500 million to $1 billion per generation.
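That development-cost risk shrinks at hyperscaler volume. A hedged break-even sketch, using the per-chip savings implied by the prices quoted earlier (all inputs are assumptions assembled from this article's figures):

```python
# Rough break-even for a proprietary silicon program: development cost
# divided by per-chip savings versus buying GPUs. All inputs are
# assumptions built from figures quoted in this article.

DEV_COST_LOW, DEV_COST_HIGH = 500e6, 1e9   # quoted per-generation range
PER_CHIP_SAVINGS = 27_500 - 15_000         # H100 midpoint minus Trainium2 price

low_break_even = DEV_COST_LOW / PER_CHIP_SAVINGS
high_break_even = DEV_COST_HIGH / PER_CHIP_SAVINGS

print(f"Break-even volume: {low_break_even:,.0f} to {high_break_even:,.0f} chips")
```

At these assumed numbers, break-even falls between roughly 40,000 and 80,000 chips per generation, well below the 1 million-chip scale of the Anthropic commitment.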
NVIDIA maintains dominance in third-party cloud AI with 76% market share. But Microsoft, which operates its own Maia chip program, reduced external GPU purchases by 23% in fiscal 2025. Inference workloads are migrating fastest: repetitive, high-volume tasks are where custom chips deliver immediate ROI.
The hypothesis that custom accelerators will gain ground against general-purpose GPUs appears supported by early hyperscaler deployments. Confirmation depends on 2026 data showing custom chip volumes, cost-per-inference metrics, and market share shifts in cloud AI workloads.

