A new craze has gripped Silicon Valley of late. Techies looking to prove they are in the vanguard of artificial-intelligence adoption have taken to “tokenmaxxing”, competing with one another to burn through the most tokens (as the chunks of text processed by AI models are known). Yet as demand for AI soars, those tokens are in increasingly short supply.

In March Anthropic, an AI lab whose models are popular with businesses, began throttling access to its tools at busy times. It has since altered its subscription plans, seemingly in a bid to curb usage. In April its service experienced outages of around 30 minutes a day. In March OpenAI, a rival, abruptly shut Sora, its video-generation tool, to redirect scarce computing power. On April 20th GitHub, a coding-collaboration site that is owned by Microsoft, stopped accepting new subscriptions for its programming bot.
They have had little choice. Demand is rising faster than they can add capacity: between January and March the weekly tokens processed by OpenRouter, a model marketplace, quadrupled. Meanwhile, the industry is racing to build new infrastructure. On April 20th Anthropic announced a $100bn partnership with Amazon to secure up to five gigawatts of server capacity, with nearly a fifth to come online by the end of the year. On April 24th it said that Google would also invest $40bn to help the lab meet its computing needs. On April 27th OpenAI said it was reworking its partnership with Microsoft to allow it to distribute all its products through any cloud provider, giving it greater flexibility to tap into computing supply.
Five hyperscalers—Alphabet, Amazon, Meta, Microsoft and Oracle—are investing ever larger sums of money in data centres. Alphabet, Amazon and Oracle have already raised more than $100bn in debt between them this year. To free up cash, Meta recently said that it would lay off 10% of its workforce, while Microsoft said that it would offer voluntary redundancies to about 7% of its workers.
Adding more capacity, however, is only getting more challenging. In America and beyond, political opposition to the construction of data centres is growing. What is more, the companies making the hardware that fills them—from chips and networking gear to cooling equipment—have been investing far too little to keep pace with demand. The squeeze on capacity, then, looks set to worsen.
Start with the politics. In April legislators in Maine voted in favour of a bill to ban the construction of data centres above 20 megawatts until November next year. Although it was subsequently vetoed by the governor, lawmakers in more than ten other American states are weighing similar measures. According to one count, $156bn-worth of data-centre projects were blocked or delayed last year in America by local opposition and litigation. Other countries, from Ireland to Brazil, are experiencing a growing backlash. Concern over the impact of power-hungry data centres on electricity bills in particular has become widespread—and may intensify further as the war in the Gulf raises energy prices.
Even when data centres are approved for construction and can get hooked up to a power source—whether the grid or, increasingly, their own means of generation—those erecting them are finding it harder to get their hands on the computing equipment needed to operate them.
Ivan Chiam of SemiAnalysis, a research firm, points out that there are not enough chips to fill the data centres now being built. Consider the graphics-processing units (GPUs) designed by Nvidia, which provide more than two-thirds of the world’s AI computing power. The price to rent one of its H100 GPUs, launched in 2022, has soared by around 30% since November, as customers unable to get their hands on newer models have resorted to older generations. Competing AI processors are also getting more difficult to obtain. In April Andy Jassy, Amazon’s boss, said that his company had nearly sold out access to its Trainium2 AI chips. A significant chunk of the capacity of Trainium4, due next year, “has already been reserved”.
The squeeze also extends to memory chips, in particular the kind of high-bandwidth memory (HBM) that AI models rely on. All three big producers—SK Hynix, Samsung and Micron—say that most of their supply for 2026 is sold out. Some hope of relief came in March when Google unveiled TurboQuant, an algorithm meant to reduce the amount of memory AI needs, causing the share prices of the memory-makers to briefly swoon. Even so, demand for HBM is expected to outstrip supply for at least the next three years.
The shortages are now spreading to central-processing units (CPUs). “Agentic” AI tools, which plan, reason and carry out tasks, rely more heavily on these types of chips to co-ordinate their work. Morgan Stanley, an investment bank, estimates that agentic systems require one CPU for every GPU, compared with a ratio of one to 12 for chatbot-style systems. Indeed, demand for CPUs has been so robust that it has breathed new life into Intel, which not long ago seemed to be heading for collapse. The market capitalisation of the American chipmaker, one of the leading producers of CPUs, has more than doubled over the past six months (see chart 1).
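The arithmetic behind that shift is straightforward. The sketch below runs the two ratios cited above (one CPU per 12 GPUs for chatbot-style systems, one per one for agentic systems) against a purely hypothetical GPU fleet; the fleet size is an illustrative assumption, not a figure from this article.

```python
# Back-of-envelope: implied CPU demand under the two workload mixes
# described above. The ratios (1:12 and 1:1) come from the Morgan
# Stanley estimate cited in the text; the fleet size is made up.

GPUS_IN_FLEET = 1_000_000  # hypothetical GPU fleet, for illustration only

def cpus_needed(gpus: int, gpus_per_cpu: int) -> int:
    """CPUs required to co-ordinate a given GPU fleet at a fixed ratio."""
    # Round up: a partial group of GPUs still needs its own CPU.
    return -(-gpus // gpus_per_cpu)

chatbot_cpus = cpus_needed(GPUS_IN_FLEET, 12)  # chatbot-style ratio
agentic_cpus = cpus_needed(GPUS_IN_FLEET, 1)   # agentic ratio

print(f"Chatbot workloads: {chatbot_cpus:,} CPUs")   # 83,334
print(f"Agentic workloads: {agentic_cpus:,} CPUs")   # 1,000,000
```

On these assumptions, moving the same fleet from chatbot-style to agentic workloads multiplies CPU demand roughly twelvefold, which is why the squeeze is now reaching chipmakers such as Intel.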
The crux of the problem is that companies along the AI supply chain are investing far less than the hyperscalers. We examined the planned capital spending this year of the 50 or so largest manufacturers of chips, chipmaking tools, servers, networking gear and cooling equipment, and how it has changed since 2024. The five hyperscalers have tripled their combined capital spending, to more than $750bn, but the hardware suppliers have increased theirs by only half, and will invest less than a third as much as the cloud giants this year (see chart 2).
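The gap those figures imply can be made explicit with simple arithmetic. Only the $750bn total and the growth multiples ("tripled", "increased by half", "less than a third as much") come from the analysis above; the baselines derived below follow mechanically from them.

```python
# Rough consistency check of the capital-spending figures in the text.
# Inputs: hyperscalers' 2025 capex of "more than $750bn", which has
# "tripled" since 2024; suppliers grew "by only half" and will spend
# "less than a third as much" as the hyperscalers this year.

hyperscaler_2025 = 750.0               # $bn, lower bound from the text
hyperscaler_2024 = hyperscaler_2025 / 3  # implied 2024 baseline

supplier_2025_ceiling = hyperscaler_2025 / 3    # at most a third as much
supplier_2024_implied = supplier_2025_ceiling / 1.5  # grew 1.5x since 2024

print(f"Hyperscalers 2024 (implied):   ${hyperscaler_2024:.0f}bn")
print(f"Suppliers 2025 (upper bound):  ${supplier_2025_ceiling:.0f}bn")
print(f"Suppliers 2024 (implied):      ${supplier_2024_implied:.0f}bn")
```

In other words, the hyperscalers added roughly $500bn of annual spending while their entire hardware supply chain added at most around $80bn, which is the mismatch the rest of this article describes.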
Take TSMC, the world’s biggest contract chipmaker and the dominant supplier of cutting-edge GPUs and CPUs. Its most advanced fabs—those making chips that are five nanometres or smaller—are already running flat out. C.C. Wei, its boss, admits that supply is “very tight”, but that “there are no shortcuts”: building a new fab takes two to three years. The company plans to spend about $55bn in 2026, up by 34% from 2025; analysts expect the figure to rise to $65bn in 2027. But as a share of sales, its capital expenditure has fallen from around half in 2022 to a third this year.
TSMC’s caution has frustrated its customers. Sam Altman, OpenAI’s boss, has urged it to “just build more capacity”. Elon Musk, boss of Tesla and SpaceX, has said he will build a so-called “Terafab” with the modest ambition of churning out more processing power annually than the entire global semiconductor industry today.
The facility, which Mr Musk has enlisted Intel to help set up, is unlikely to start production until 2028 at the earliest, and even then at a fraction of the scale envisioned. What is more, Mr Musk may struggle to get his hands on enough of the advanced machines he will need to operate it, which are also in short supply. That illustrates the mismatch that now clouds the future of AI. Improving software takes months, whereas expanding supply chains takes years. Hardware-makers are wary of over-building and being stuck with idle capacity. The craze for “tokenmaxxing” may soon be cut short.
