Amid an A.I. Chip Shortage, the GPU Rental Market Is Booming

For many organizations aspiring to develop A.I. projects, GPU-as-a-service provides a cost-effective way to access high-performance chips.

By Aaron Mok

GPU rentals allow small companies to access high-performance A.I. chips for specific projects. *Igor Omilaev/Unsplash*

GPUs, or graphics processing units, have become increasingly difficult to acquire as tech giants like OpenAI and Meta purchase mountains of them to power A.I. models. Amid an ongoing chip shortage, a crop of startups is stepping up to widen access to the highly sought-after A.I. chips by renting them out.

The GPU rental market is part of a niche but established industry known as GPU-as-a-service, in which chip owners use an online marketplace to sell compute power to clients over fixed periods of time through the cloud. Typically, companies turn to major cloud providers like Amazon Web Services, Microsoft Azure and Google Cloud, which collectively hold about 63 percent of the global cloud computing market, to run A.I. workloads in those providers' data centers.

GPU-as-a-service, however, offers a more decentralized approach. Providers in that space partner with data centers and GPU owners around the world to rent out their clusters of chips to clients whenever the need arises. Renting compute power gives organizations with tight budgets, such as startups and academic institutions, access to high-performance GPUs for specific projects, said David Bader, director of the Institute for Data Science at the New Jersey Institute of Technology.

“GPU-as-a-service has significantly leveled the playing field in A.I. and high-performance computing,” Bader told Observer. “Instead of making substantial upfront investments in hardware that quickly depreciates and becomes obsolete, companies can now access GPU power on-demand.”

Even as supply chain constraints around GPUs start to ease, the rental market continues to grow. The GPU-as-a-service market, valued at $3.79 billion as of 2023, is expected to grow 21.5 percent annually to $12.26 billion by 2030 as the demand for advanced data analytics, like running machine learning algorithms, increases, according to data from Grand View Research.

Generative A.I. has spurred interest in GPU rentals

Some startups in the GPU rental space have seen demand surge since ChatGPT came out in November 2022, as companies seek out compute power to build A.I.

Jake Cannell, founder and CEO of Vast.ai, said his company’s customers were primarily cryptocurrency miners before the generative A.I. hype began. Today, more than half of the projects run on Vast.ai’s GPU rentals are A.I.-related. Clients include A.I. entrepreneurs, startups and academics building custom large language models on top of foundation models like OpenAI’s GPT, deploying those models, and running other A.I. workloads like the image generator Stable Diffusion, Cannell said.

The release of ChatGPT, combined with heavy demand on major cloud providers and the GPU shortage, pushed more customers to look for alternatives, which has in part accelerated demand for Vast.ai’s GPU rentals, according to Cannell. “That’s probably relaxed a bit now that production has caught up, but demand still seems really high and growing,” the CEO said.

Nvidia (NVDA) CEO Jensen Huang recently said demand for Nvidia’s new Blackwell chips has been “insane” and that the company, which owns about 90 percent of the GPU market, plans to ramp up Blackwell production this year through 2026.

Launched in 2017, Vast.ai operates an online marketplace that connects owners of GPU clusters from Nvidia and AMD with organizations looking to rent compute power. As of late October, the marketplace offers 109 clusters of GPUs, including Nvidia’s popular H100 chips, housed in data centers and, in some cases, the owners’ garages scattered across the U.S., Europe, Asia and Australia, according to Cannell.

By offering GPU clusters with different capacities, speeds and system requirements for various lengths of time, Vast.ai aims to give renters the freedom to pick the GPUs required for specific projects and to scale up and down depending on their needs. For instance, a client developing an A.I. chatbot may initially rent 100 GPUs to train its model. If the product takes off, the client could ramp up its compute capacity by renting thousands of GPUs. The flexibility to access different amounts of compute at different stages of product development, the company claims, is what makes renting GPUs more appealing than buying them.

“Buying would only make sense if you have a much more predictable, steady demand for GPUs over a very long period of time,” Cannell said. “Only the hyperscalers can formulate that,” he added, referring to industry giants like OpenAI.
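Cannell’s point can be made concrete with a rough break-even calculation. The sketch below uses illustrative prices that are assumptions for this article, not quotes from Vast.ai, Nvidia or any other vendor, and shows how long a GPU would have to stay busy before owning it beats renting it.

```python
# Rough rent-vs-buy break-even sketch. All prices are illustrative
# assumptions, not actual Vast.ai or Nvidia pricing.

PURCHASE_PRICE = 30_000.0   # assumed cost of one high-end GPU, in dollars
HOURLY_RENTAL = 2.50        # assumed marketplace rate, dollars per GPU-hour
OVERHEAD_PER_HOUR = 0.30    # assumed power/cooling/hosting cost for an owned GPU

def break_even_hours(purchase: float, rent: float, overhead: float) -> float:
    """Hours of use at which owning becomes cheaper than renting."""
    return purchase / (rent - overhead)

hours = break_even_hours(PURCHASE_PRICE, HOURLY_RENTAL, OVERHEAD_PER_HOUR)
print(f"Break-even after ~{hours:,.0f} GPU-hours "
      f"(~{hours / 24 / 365:.1f} years of continuous use)")
```

Under these assumed numbers, a GPU would have to run more or less flat-out for over a year before ownership pays off, which is why intermittent or project-based workloads tend to favor the rental model.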

While startups like Vast.ai that launched before ChatGPT are seeing an uptick in interest, new startups have emerged following the chatbot’s release to tap into the growing GPU rental market.

Foundry, a GPU marketplace built specifically for A.I. workloads, claims it has attracted “dozens” of customers since launching its cloud platform in August and can significantly reduce compute costs by tapping into the excess capacity of existing chips, according to CEO Jared Quincy Davis.

The startup, which raised $80 million from investors like Sequoia and Lightspeed Ventures as of March, rents out GPUs through a mix of compute clusters the company owns and “underutilized clusters” sourced from partnerships with data centers.

Foundry’s customers include companies in the technology, telecommunications, media and health care industries. Foundations and academic labs also use Foundry’s services. Common use cases include fine-tuning models like Meta’s Llama to exhibit desirable properties, building neural networks from scratch and performing sentiment analysis, a natural language processing task that determines the emotional tone of a piece of text. Foundry even has clients renting GPUs to predict protein sequences for drug discovery, train models to translate rare languages and build A.I. agents that can control websites without human intervention.
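To give a sense of what one of those workloads looks like in practice, a basic sentiment-analysis job can be as simple as running an off-the-shelf classifier on a rented GPU. The sketch below uses the open-source Hugging Face transformers library and its default sentiment model; the library choice and the presence of a CUDA-capable GPU are assumptions for illustration, not details of Foundry’s platform.

```python
# Minimal sentiment-analysis sketch using the Hugging Face `transformers`
# library; assumes `pip install transformers torch` and a CUDA-capable GPU.
# This illustrates the kind of workload described above, not Foundry's stack.
from transformers import pipeline

# device=0 places the model on the first GPU; use device=-1 for CPU-only runs.
classifier = pipeline("sentiment-analysis", device=0)

reviews = [
    "The rented cluster trained our model twice as fast as expected.",
    "Data-transfer fees ate most of our savings.",
]
for review, result in zip(reviews, classifier(reviews)):
    print(f"{result['label']:>8}  {result['score']:.2f}  {review}")
```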

“Much of the cutting-edge development that could previously only be conducted by labs like OpenAI and DeepMind will now be obtainable by others as Foundry makes GPU compute more accessible and affordable,” Davis, who previously worked at Google DeepMind as an engineer, told Observer.

Some organizations are seeing the benefits of GPU rentals materialize. Bader, the professor at New Jersey Institute of Technology, said he’s seen his university use the GPU rental approach to “free up resources” for “critical activities” like research and development. The GPU rental model, he claims, is ideal for projects with “temporary” or “seasonal compute needs” and “eliminates the burden” of costly hardware management and maintenance. Bader said he has also seen small businesses the university collaborates with access the same GPU power as larger businesses.

“I’ve witnessed countless startups benefit from this,” Bader said. “They no longer need millions in upfront investment for specialized hardware. Instead, they can prototype, test and iterate their algorithms using rented GPUs, ensuring that funds are directed towards development rather than infrastructure.”

Renting GPUs may not save that much money long-term

Still, Bader noted that renting GPUs rather than purchasing them comes with trade-offs.

Performance on shared infrastructure can be inconsistent, which could slow down tasks like A.I. model training if there are service disruptions. GPU rentals could also get expensive down the line despite upfront cost savings. The cost of transferring data between the cloud and a company’s own systems could “add up quickly,” and for workloads that require real-time processing, clients that continuously run into latency issues might end up paying more than if they owned the GPUs, according to Bader. The lack of control over the infrastructure could also be “problematic” for companies with strict security and compliance protocols.
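Bader’s warning about transfer costs is easy to illustrate with back-of-the-envelope arithmetic. The per-gigabyte rate and data volumes below are hypothetical placeholders, since actual fees vary by provider and pricing tier, and the sketch applies a single flat rate to all data movement for simplicity.

```python
# Back-of-the-envelope data-transfer cost estimate. The per-GB rate and data
# volumes are hypothetical assumptions; real fees vary by provider and tier.

TRANSFER_PER_GB = 0.09        # assumed dollars per GB moved in or out of the cloud
DATASET_GB = 5_000            # assumed size of the training dataset
CHECKPOINT_GB = 200           # assumed size of model checkpoints pulled back
RUNS_PER_MONTH = 4            # assumed number of training/evaluation cycles

monthly_cost = RUNS_PER_MONTH * (DATASET_GB + CHECKPOINT_GB) * TRANSFER_PER_GB
print(f"Estimated monthly transfer cost: ${monthly_cost:,.2f}")
```

Even at these modest assumed volumes, transfer fees run to four figures a month, which is the kind of recurring cost that can erode the upfront savings of renting.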

The future of the GPU rental market could also depend on how the chip industry evolves. After all, major cloud providers like Amazon Web Services are expected to continue expanding their product lines and are likely to absorb smaller companies, which could lower prices in the short term and limit consumer choice in the long run, according to Bader. Plus, supply chain delays could make it harder for cloud giants to get their hands on GPUs.

Despite these concerns, the startups that spoke to Observer remain confident there will still be a need for their services in the coming years as A.I. continues to grow. Vast.ai continues to improve its GPU matchmaking service and is getting more directly involved in use cases like LLM inference, especially for A.I. agents. Foundry plans to release additional features to make its platform more accessible and more useful for A.I. developers building advanced models.

“Nvidia is still a leader, and I don’t see that changing overnight, but there is increasingly more competition,” Vast.ai CEO Cannell said.

https://observer.com/2024/10/ai-gpu-rental-startup/

David A. Bader
Distinguished Professor and Director of the Institute for Data Science

David A. Bader is a Distinguished Professor in the Department of Computer Science at New Jersey Institute of Technology.