
Need a Trillion-Parameter LLM? Google Cloud Is for You.

Big, Bigger, Biggest: Google Cloud sets a new high for cloud-based AI Large Language Models.


At KubeCon+CloudNativeCon North America earlier this month, Google Cloud announced it had upgraded its Google Kubernetes Engine (GKE) to support clusters of up to 65,000 nodes. That’s a big leap from its previous limit of 15,000 nodes. This enhancement is specifically designed to meet the growing demands of training and running trillion-parameter AI Large Language Models (LLMs).

How big is that? The biggest current LLM is OpenAI’s GPT-4, with an estimated 1.7 trillion parameters. Next is Google Gemini with 1.56 trillion, and then Meta’s Llama 3.1 with 405 billion. According to Google, “We believe GKE offers more than 10X larger scale than the other two largest public cloud providers.” So, unless your business plan is to go toe-to-toe with the top LLMs, a trillion parameters should be more than enough.

Even a single GKE cluster can now manage AI models spread across 250,000 tensor processing units (TPUs), Google’s specialized AI processors. This is a fivefold increase from GKE’s previous benchmark, which supported 50,000 TPU chips in a single cluster.
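To make that concrete, here is a minimal sketch, using the official Kubernetes Python client, of how a GKE workload asks for TPU chips. The “google.com/tpu” resource name and the cloud.google.com/gke-tpu-* node-selector labels come from Google’s GKE TPU documentation; the accelerator type, topology, container image and chip count below are illustrative assumptions, not details from the announcement.

from kubernetes import client, config

# Assumes kubectl is already pointed at a GKE cluster with a TPU node pool.
config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="tpu-worker"),
    spec=client.V1PodSpec(
        node_selector={
            # Illustrative values; the real ones depend on your TPU node pool.
            "cloud.google.com/gke-tpu-accelerator": "tpu-v5-lite-podslice",
            "cloud.google.com/gke-tpu-topology": "2x4",
        },
        containers=[
            client.V1Container(
                name="trainer",
                image="us-docker.pkg.dev/my-project/trainer:latest",  # hypothetical image
                resources=client.V1ResourceRequirements(
                    # TPU chips are requested the same way as GPUs: as an
                    # extended resource in the container's limits.
                    limits={"google.com/tpu": "8"}
                ),
            )
        ],
        restart_policy="Never",
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)

Multiply that by tens of thousands of hosts and you get a sense of what a 250,000-TPU cluster is scheduling.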

LLMs Keep Getting Larger

Drew Bradstock, Google Cloud’s senior product director for Kubernetes and serverless, explained the rationale behind this massive upgrade: “Fundamentally, these large-scale LLMs keep getting bigger from companies around the world and require very large clusters to operate efficiently.” The move comes in response to the rapid growth of AI across the industry, with Google reporting a 900% increase in the use of TPUs and GPUs on GKE over the past year.

In short, maybe there aren’t that many customers for a trillion parameters, but Google should find lots of users who want to throw a hundred thousand or so TPUs at an LLM.

To achieve this unprecedented scale, Google Cloud has made significant changes to GKE’s underlying infrastructure. The company is transitioning from etcd, the open source distributed key-value store, to Spanner, Google’s globally distributed database. PostgreSQL users will be pleased to find they can use their favorite DBMS’s interface as its front end. Spanner, Google states, allows for virtually unlimited scale and improved latency. Additionally, Google has overhauled the GKE infrastructure to scale significantly faster, so clusters can keep up with spikes in demand.
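For the curious, here is a hedged sketch of what that PostgreSQL front end looks like in practice: Google ships a proxy called PGAdapter that lets standard PostgreSQL drivers, such as Python’s psycopg2, talk to a Spanner database. The host, port and database name below are assumptions for a locally running proxy, not details from the article.

import psycopg2

# PGAdapter typically runs as a local sidecar or proxy in front of Spanner,
# speaking the PostgreSQL wire protocol on the standard port.
conn = psycopg2.connect(host="localhost", port=5432, database="my-spanner-db")
with conn.cursor() as cur:
    cur.execute("SELECT 1")
    print(cur.fetchone())
conn.close()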

This is hot stuff. In a statement, James Bradbury, head of compute at Anthropic, a leading AI model maker, praised the upgrade: “GKE’s new support for larger clusters provides the scale we need to accelerate our pace of AI innovation.”

The impact of this advancement extends beyond just supporting larger models. It also improves the reliability and efficiency of running large-scale AI workloads on Kubernetes: users can ride out system failures with far less downtime, which translates into faster job completion times. And when you’re talking about models this big, the last thing you want is more time lost to slow compute.
