Is Gen AI the killer app, or the final boss?
Big implications for the future of the Cloud game
Back in the day
In the 1990s and early 2000s, the only way to deploy an individual server to run business workloads was to buy a machine and ship it to a colocation or “colo” facility, which typically filled extra space in a datacenter not yet at full capacity. When virtualization became more viable and popular in the mid-2000s with the rise of multi-core CPUs, businesses could instead rent Virtual Private Servers, or VPSs, which filled extra space on individual servers not yet at full capacity.
When the first Cloud providers appeared in the late 2000s, they offered VPSs at a scale not seen before, now referred to simply as VMs, or Virtual Machines, along with ancillary hosted services such as queuing systems (e.g. AWS SQS) and managed application hosting services (e.g. Google App Engine). The relative ease of spinning up a VM, compared to buying a server, installing it into some datacenter rack somewhere, and constantly replacing failed hard drives, was cloud’s original “killer app”.
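To make “relative ease” concrete, here is a minimal sketch of what spinning up a VM looks like today with AWS’s boto3 SDK. The image ID is a placeholder and the region and instance type are arbitrary choices of mine; real use assumes AWS credentials configured locally. The point is that it’s a few lines of Python versus a purchase order and a trip to a colo.

```python
# A minimal sketch of the original "killer app": launching a VM with a few
# lines of Python via AWS's boto3 SDK. The AMI ID is a placeholder and the
# region and instance type are arbitrary; real use needs AWS credentials
# configured locally.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical machine image ID
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
)

instance_id = response["Instances"][0]["InstanceId"]
print(f"Launched {instance_id}")  # billed by the second from here on out
```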
The three “hyperscalers”, as they humbly refer to themselves — Microsoft Azure, Google Cloud, and Amazon Web Services — have been playing this same basic game for the past 15 years. Virtual Machines have continued as the core offering from all three cloud providers, around which an ever-growing repertoire of additional services is offered. Some of these services, such as managed databases and load balancers, can replace workloads that formerly ran on VMs; others, such as machine learning and global messaging systems (e.g. SQS, PubSub), are entirely new kinds of services that scale beyond the capacity of any single VM, or even of large groups of VMs.
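As a rough illustration of that second category, here is what pushing a message onto a managed queue looks like with AWS SQS via boto3. The queue URL and account number below are made up, and credentials are assumed to be configured; what matters is what is absent: there is no server to size, patch, or scale.

```python
# Illustrative sketch: publishing to a managed queue (AWS SQS) with boto3.
# The queue URL and account number are placeholders; credentials are assumed
# to be configured in the environment.
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")

sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/example-queue",
    MessageBody='{"event": "order_created", "order_id": 42}',
)
# No VM anywhere in sight -- the queue's capacity is the provider's problem.
```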
For each cloud provider, something like three-quarters of revenue still comes from selling VMs, which remain substantially similar in form to the VPS offerings of two decades earlier. VMs, however, are among the lowest-margin services cloud providers offer. It’s difficult to differentiate one provider’s VMs from another’s, especially since the underlying machines all rely on the same Intel, AMD, and Nvidia chips and related components, so much of the competition between providers occurs in the realm of ancillary services. None of the providers are buying billboards boasting that their VMs are 1.8% faster than somebody else’s, because nobody cares. The “Intel Inside” days, when one computer was 40% faster than somebody else’s, are over for good.
What’s next?
What cloud customers do care about is getting access to new information technology they don’t currently have in order to solve problems they can’t currently solve, and just about all new tech is made available through cloud computing platforms. But how much new tech can possibly be built? The providers have invested stupendous resources building everything from custom databases (e.g. Google BigQuery), to managed open-source data pipelines (e.g. AWS Managed Flink), to low-code business process orchestrators (e.g. Azure Logic Apps). But over the past few years, it has become clear that the three providers are homogenizing along every dimension.
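BigQuery is a good example of how far these ancillary services have drifted from “rent a VM”. The sketch below uses the google-cloud-bigquery client against one of Google’s public datasets; credentials and a default project are assumed to be configured. There is nothing to provision, just a query sent to someone else’s warehouse.

```python
# Illustrative sketch: querying a managed data warehouse (BigQuery) without
# provisioning anything. Assumes Google Cloud credentials and a default
# project are configured in the environment.
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    GROUP BY name
    ORDER BY total DESC
    LIMIT 5
"""
for row in client.query(query).result():
    print(row.name, row.total)
```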
If you’re a cloud provider, you want to differentiate, but you don’t want to be so different that nobody understands what you’re doing. Providers are increasingly building and releasing services that already exist on other cloud platforms precisely for that reason: if my competitor has one, I’d better have it too, and I should be compatible with their service so that I can steal their customers away more easily. No cloud provider wants to be the Betamax of the cloud business — they’ll be better off selling VCRs than trying to invent a new paradigm before the current one has run its course. This isn’t to say that the Cloud market is hurting for new areas of growth; the cloud computing model is still in high demand among IT customers, and it takes time for corporate datacenter leases to run out, so there’s still a lot of migration yet to happen. But the space to offer totally novel services is running out. At the end of the day, everything in IT is either an application, a database, or a conduit to move data between the two. There are only so many things you need to make to support and enable these basic building blocks.
One of the reasons cloud providers love building ancillary services is that they offer higher margins than the core VM offering, and they often drag in other related workloads with them. If a business is going to move a terabyte of data into BigQuery or AWS Redshift, it doesn’t make sense for the workloads that utilize that data to remain in a corporate datacenter a thousand miles away. It’s like taking a new job in a new city but staying in your current house and commuting four hours a day to work. Maybe you suck it up for a while, but eventually your house is going to end up closer to your job, one way or the other. Likewise, the cloud providers know that if they get your data, other workloads will follow. This is known in the industry as “data gravity”.
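A back-of-the-envelope sketch shows why the gravity is asymmetric. The prices below are assumptions for illustration only, roughly shaped like published list prices (ingress free, standard object storage around $0.023/GB-month, internet egress around $0.09/GB); any real contract will differ.

```python
# Back-of-the-envelope arithmetic behind "data gravity". Prices are
# assumptions for illustration only; real contracts vary.
DATA_TB = 50                   # hypothetical warehouse size, in terabytes
STORAGE_PER_GB_MONTH = 0.023   # assumed $/GB-month for object storage
EGRESS_PER_GB = 0.09           # assumed $/GB to move data back out

gb = DATA_TB * 1024
print(f"Storing {DATA_TB} TB:  ~${gb * STORAGE_PER_GB_MONTH:,.0f} per month")
print(f"Moving it back out: ~${gb * EGRESS_PER_GB:,.0f}, one time")
# Getting the data in was free; getting it back out costs real money, on top
# of re-engineering everything that now sits next to it.
```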
But what’s next? Serious practitioners of machine learning used to insist that it was not correct to refer to ML as AI, or artificial intelligence. AI was a term used only by uninitiated dilettantes. Nonetheless, all three providers now proclaim their supremacy in the field of Generative AI, not Generative ML.
Machine learning workloads have long been the holy grail for cloud providers because of the immense amount of proximate data required to train them, and because of the large compute cost of training, inference, and re-training, all of which is passed on to customers with a little extra margin thrown in. Not only does storing data cost money, but beyond a certain critical mass it becomes very difficult and risky to move that data somewhere else later. But cloud providers have hitherto struggled to get customers to use their AI services for much beyond occasional experimentation and the stuff most people still do in Excel: ad targeting, sales forecasting, customer churn prediction, and so forth. Now, they say, Gen AI has come to the rescue.
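For a sense of proportion, the “stuff most people still do in Excel” looks roughly like this in code: a self-contained churn-prediction sketch on synthetic data using scikit-learn, with made-up features and numbers, and not tied to any cloud provider’s ML service.

```python
# A sketch of garden-variety churn prediction on synthetic data. Feature
# names and distributions are invented purely for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1_000
X = np.column_stack([
    rng.integers(1, 60, n),    # months as a customer
    rng.poisson(3, n),         # support tickets filed
    rng.uniform(10, 200, n),   # monthly spend
])
# Synthetic label: short-tenure, high-ticket customers churn more often.
churned = (X[:, 1] > 4) & (X[:, 0] < 12) | (rng.random(n) < 0.1)

X_train, X_test, y_train, y_test = train_test_split(X, churned, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Holdout accuracy: {model.score(X_test, y_test):.2f}")
```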
Technical debt
A large organization cannot wake up one morning and suddenly decide to be innovative. A company’s IT position is like its brand, its distribution network, or any other capital asset — it represents the sum total of all technology investment over the life of the company. In the same way that postponing maintenance on critical machinery in a factory merely delays the day of reckoning (and can juice your stated financial results in the meantime), under-investment in a company’s IT systems accrues “technical debt”. And like factory machinery, delaying the problem only makes it worse when the bill finally comes due. This is part of the basic strategy of private equity, which is notorious for delaying capital expenditures so that results look great on paper while the future costs of maintenance and asset replacement stay hidden. And indeed, many firms that exit PE ownership find themselves with immense levels of technical debt, which only Herculean effort and expense can undo.
Unfortunately for technically indebted companies, effectively utilizing cloud ML services requires not only sophisticated technical staff but also a well-situated, well-designed, modernized IT infrastructure that many companies do not possess. It doesn’t matter how innovative you feel; you cannot Leroy Jenkins your way straight from twenty-year-old tech into the full benefits of Gen AI. And there is a lot of old tech still sitting at the heart of corporate IT. This applies especially to companies that say things like “we are a technology company that just happens to sell insurance” — these are the same companies that have a mission-critical Oracle database that crashes every Tuesday and nobody can quite figure out why.
Dragging in other workloads related to new Gen AI investment is likely the next big driver of growth for the cloud providers, more so than direct spend on the Gen AI services themselves. Microsoft tip-toed around this idea in their Q1 2024 earnings call, saying that 6% of revenue growth was tied to Gen AI-related workloads; notably, they did not cite direct spend on the AI services. There is an argument to be made that if Gen AI can’t convince a company to go all in on the cloud, nothing will.
The hype is real
In any case, companies are now scrambling to take advantage of the supposedly revelatory power of generative AI. Google botched their initial foray into the space by pioneering it and then pretending that it didn’t exist. Microsoft is “investing” billions of dollars into OpenAI, an investment that mainly consists of Azure credits. Amazon is doing what they do best: waiting for customers to pound the table and then building whatever products they ask for.
For the cloud providers, Gen AI may be an opportunity to build up more data gravity and finally tap into workloads that are still sitting in corporate data centers, or at second-tier providers such as IBM and Oracle Cloud that haven’t yet caught up to the hyperscalers in this area. But it’s not really a new thing, in the sense that the three big providers have had machine learning services for years. Gen AI appears to be a new service within these existing suites, rather than a totally new area of expansion.
It remains unclear whether Gen AI will deliver on the benefits ascribed to it by its proponents, who seem to be just about everyone these days. It would make a lot of sense if Gen AI were actually sentient, because that would explain how every media organization in the world is working in perfect harmony to advertise its new-paradigm powers. If it does deliver, I wonder what the next thing will be for the cloud providers. Maybe Gen AI really is the beginning of a new wave of further consolidation of workloads onto central computing platforms. I think it more likely that Gen AI represents the final boss — not the killer app that spawns a new platform, but the final piece that completes the puzzle.