A report from Cast AI finds that average GPU utilization in the tech industry is only 5%, pointing to widespread inefficiency in infrastructure usage. Despite heavy investment in AI resources, companies are buying roughly twenty times the GPU capacity their workloads actually use.
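The "twenty times" figure follows directly from the utilization number: capacity purchased per unit of capacity used is the reciprocal of utilization. A minimal sketch of that arithmetic is below. The function names and the $2.00 hourly price are illustrative assumptions, not figures from the Cast AI report.

```python
def capacity_multiple(utilization: float) -> float:
    """Capacity purchased per unit of capacity actually used (1 / utilization)."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    return 1.0 / utilization


def idle_cost_per_hour(hourly_price: float, utilization: float) -> float:
    """Share of the hourly price that pays for idle capacity."""
    if not 0 <= utilization <= 1:
        raise ValueError("utilization must be in the range [0, 1]")
    return hourly_price * (1.0 - utilization)


if __name__ == "__main__":
    gpu_utilization = 0.05          # 5% average GPU utilization, per the report
    assumed_hourly_price = 2.00     # hypothetical price in USD, for illustration only

    print(f"Capacity multiple: {capacity_multiple(gpu_utilization):.1f}x")
    print(f"Idle cost per hour: ${idle_cost_per_hour(assumed_hourly_price, gpu_utilization):.2f}")
```

At 5% utilization, 95% of every dollar spent on a GPU pays for idle time, whatever the actual hourly price is.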
The findings suggest that overprovisioning is worsening rather than improving, with CPU utilization declining from 10% to 8% and memory utilization dropping from 23% to 20% over the past year. Organizations are reserving nearly double the CPU resources and four times the memory that their workloads actually require.
According to Cast AI, CPU overprovisioning has surged from 40% to 69%, while memory overprovisioning stands at 79%. In other words, many companies are paying for infrastructure their applications never use, which drives up costs.
The report highlights that idle GPUs are far more expensive than idle CPUs: an idle GPU costs dollars per hour, while an idle CPU costs mere cents. In a notable shift, GPU prices rose by 15% in January 2026, the first such increase since the launch of EC2 in 2006, which the report attributes to supply and demand fluctuations.
“At 5% utilization, the math doesn’t work,” said Laurent Gil, co-founder and President of Cast AI. Gil emphasized that the trend of overprovisioning stems from a preference for perceived safety over resource efficiency.
Some organizations have achieved higher GPU utilization rates (one reported 49% utilization on H200s and 30% on H100s), but most are not using existing solutions such as automated rightsizing, GPU sharing, and Spot management, so overprovisioning continues.
Cast AI’s data indicates that many companies remain reluctant to change long-standing operational practices, even when those practices cost them more. Reducing the waste will require treating resource efficiency as a continuous, automated process.
