Musk's Colossus 1 AI supercomputer's inefficient mixed-architecture design couldn't be used to train Grok, so Anthropic's using it for inference instead — Musk readies unified Blackwell-only Colossus 2 for frontier training and potential IPO

TL;DR

SpaceX’s Colossus 1 supercomputer, leased to Anthropic, suffers from low GPU utilization due to its heterogeneous architecture. This inefficiency explains Musk’s decision to lease the system to a rival AI firm. The development highlights challenges in large-scale AI infrastructure management.

SpaceX’s Colossus 1 supercomputer, recently leased to AI firm Anthropic, is experiencing significant efficiency issues due to its mixed GPU architecture, leading to underutilization of the system’s capacity.

Anthropic announced last week that it had leased the entire Colossus 1 data center, which contains over 220,000 GPUs and 30 megawatts of power capacity, to address its growing demand for AI inference capacity. The move aims to alleviate bottlenecks in the company’s Claude ecosystem, which has faced restrictions on usage and API requests due to limited compute resources.

Recent reports from Mirae Asset Securities indicate that Colossus 1’s architecture is heterogeneous, comprising roughly 150,000 H100 GPUs, 50,000 H200s, and 20,000 GB200s, all running under one system. This mixed configuration was assembled rapidly as supply chain constraints allowed, rather than through deliberate design, resulting in significant inefficiencies. The slower GPUs cause the faster ones to wait, creating a bottleneck known as the straggler effect, which reduces GPU utilization to approximately 11%, far below industry standards of 40% or higher.
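The straggler mechanism described above can be sketched with a toy model. The GPU counts are the Mirae Asset figures quoted in this article; the relative per-GPU speeds are hypothetical placeholders chosen for illustration, not benchmark results. The model assumes every GPU receives an equal share of work and that a step ends only when the slowest GPU finishes.

```python
# Toy model of the straggler effect in a synchronous, equal-work step.
# Counts come from the article; relative_speed values are HYPOTHETICAL
# placeholders, not measured benchmarks.
FLEET = {
    "H100": {"count": 150_000, "relative_speed": 1.0},
    "H200": {"count": 50_000, "relative_speed": 1.3},
    "GB200": {"count": 20_000, "relative_speed": 2.5},
}


def busy_fraction(fleet):
    """Share of each step that a GPU of each type spends working.

    Each GPU does one unit of work, so it is busy for 1/speed and the
    step lasts 1/slowest_speed; the ratio is slowest_speed / speed.
    """
    slowest = min(g["relative_speed"] for g in fleet.values())
    return {name: slowest / g["relative_speed"] for name, g in fleet.items()}


def synchronous_efficiency(fleet):
    """Delivered throughput as a fraction of the fleet's aggregate peak."""
    total_gpus = sum(g["count"] for g in fleet.values())
    slowest = min(g["relative_speed"] for g in fleet.values())
    peak = sum(g["count"] * g["relative_speed"] for g in fleet.values())
    # Every GPU is effectively throttled to the slowest GPU's pace.
    return total_gpus * slowest / peak


if __name__ == "__main__":
    for name, frac in busy_fraction(FLEET).items():
        print(f"{name}: busy {frac:.0%} of each step")
    print(f"fleet efficiency: {synchronous_efficiency(FLEET):.1%}")
```

Under these assumed speeds the fastest GPUs sit idle for most of each step, yet the fleet as a whole still delivers about 83% of its aggregate peak. The sketch shows only the mechanism; it does not reproduce the roughly 11% utilization figure reported by Mirae Asset Securities.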

Why It Matters

This inefficiency impacts the economic and operational viability of large AI data centers, as unused GPUs represent wasted investment and energy consumption. For Anthropic, this lease provides immediate compute capacity but underscores challenges in scaling AI infrastructure efficiently. For Musk and SpaceX, it reveals limitations in their first-generation supercomputing efforts, which may influence future AI hardware deployment strategies.
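To put the reported figures in concrete terms, the short calculation below converts utilization into "fully busy GPU equivalents". It uses only numbers quoted in this article (220,000 GPUs, roughly 11% reported utilization, and the 40% industry benchmark); the function name is a hypothetical helper, and treating utilization as a simple multiplier on GPU count is a simplifying assumption.

```python
TOTAL_GPUS = 220_000          # cluster size reported in the article
REPORTED_UTILIZATION = 0.11   # Mirae Asset Securities estimate
BENCHMARK_UTILIZATION = 0.40  # industry benchmark cited in the article


def effective_gpus(total, utilization):
    """GPU count that would do the same work if kept 100% busy."""
    return round(total * utilization)


if __name__ == "__main__":
    reported = effective_gpus(TOTAL_GPUS, REPORTED_UTILIZATION)
    benchmark = effective_gpus(TOTAL_GPUS, BENCHMARK_UTILIZATION)
    print(f"at 11%: {reported:,} fully busy GPU equivalents")
    print(f"at 40%: {benchmark:,} fully busy GPU equivalents")
    print(f"gap:    {benchmark - reported:,}")
```

On this simplified reading, the cluster does the work of about 24,200 fully busy GPUs, against about 88,000 at the 40% benchmark.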

Background

SpaceX’s Colossus 1 was assembled rapidly, featuring a mix of GPU generations from Nvidia, including H100, H200, and GB200 models. The cluster was initially seen as a sign of Musk’s ambitions to compete with major AI players like OpenAI and Google. However, the heterogeneous architecture was not optimized for AI training or inference, leading to low utilization. Anthropic’s demand for high compute capacity has grown rapidly, driven by its expanding user base and the need to lift usage restrictions, prompting the lease of the supercomputer.

“The heterogeneous GPU configuration results in significant inefficiency, with utilization around 11%.”

— Mirae Asset Securities

“The system was assembled quickly to meet immediate demand; optimization efforts are ongoing.”

— SpaceX/xAI spokesperson

What Remains Unclear

It remains unclear whether SpaceX plans to upgrade or reconfigure Colossus 1 to improve efficiency, or if future systems will adopt more homogeneous architectures. Details about the long-term use of the supercomputer and its impact on SpaceX’s AI ambitions are still emerging.

What’s Next

SpaceX may reconfigure or upgrade Colossus 1 to raise GPU utilization. Its plans for future systems, such as the Blackwell-only Colossus 2, should show how it intends to address these inefficiencies and scale its AI infrastructure.

Key Questions

Why does the mixed GPU architecture cause inefficiency?

Different GPU generations have varying processing speeds. When combined in one system, faster GPUs must wait for slower ones to complete tasks, leading to low overall utilization.

How does this inefficiency affect Anthropic’s AI services?

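Anthropic leased the cluster to relieve the compute shortage behind its usage and API request limits. Low GPU utilization means the system delivers less usable inference capacity than its raw GPU count suggests, which could slow how quickly those restrictions are lifted.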

Will SpaceX upgrade Colossus 1 to fix these issues?

It is not yet clear whether SpaceX plans hardware upgrades or reconfiguration. A SpaceX/xAI spokesperson said only that optimization efforts are ongoing.

What does this mean for Musk’s AI ambitions?

The inefficiencies highlight challenges in scaling large AI systems quickly and cost-effectively, which could impact future projects like Colossus 2 and Musk’s broader AI strategy.

Author

The Genius Factory Team
