OpenAI, Midjourney and Microsoft have set the bar for chargeable generative AI services with ChatGPT (GPT-4) and Midjourney costing $20 per month and Microsoft charging $30 per month for Copilot. The $20-per-month benchmark set by these early movers is also being used by generative AI start-ups to raise money at ludicrous valuations from investors hit by the current AI FOMO craze. But I suspect the reality is that it will end up being more like $20 a year.
To be fair, if one can charge $20 per month, have 6 million or more users, and run inference on NVIDIA’s latest hardware, then a lot of money can be made. If one then moves inference from the cloud to the end device, even more is possible as the cost of compute for inference will be transferred to the user. Furthermore, this is a better solution for data security and privacy as the user’s data in the form of requests and prompt priming will remain on the device and not be transferred to the public cloud. This is why it can be concluded that for services that run at scale and for the enterprise, almost all generative AI inference will be run on the user’s hardware, be it a smartphone, PC or a private cloud.
Consequently, assuming that there is no price erosion and endless demand, the business cases being touted to raise money certainly hold water. While the demand is likely to be very strong, I am more concerned with price erosion. This is because outside of money to rent compute, there are not many barriers to entry and Meta Platforms has already removed the only real obstacle to everyone piling in.
The starting point for a generative AI service is a foundation model which is then tweaked and trained by humans to create the service desired. However, foundation models are difficult to design and expensive to train because of the compute power required. Up until March this year, there were no trained foundation models widely available, but that changed when Meta Platforms’ family of LLaMA models “leaked” online. Now it has become the gold standard for any hobbyist, tinkerer or start-up looking for a cheap way to get going.
Foundation models are difficult to switch out, which means that Meta Platforms now controls an AI standard in its own right, similar to the way OpenAI controls ChatGPT. However, the fact that it is freely available online has meant that any number of AI services for generating text or images are now freely available without any of the constraints or costs that apply to the larger models.
Furthermore, some of the other better-known start-ups such as Anthropic are making their best services available online for free. Claude 2 is arguably better than OpenAI’s paid ChatGPT service and so it is quite possible that many people will notice and start to switch.
Another problem with generative AI services is that outside of foundation models, there are almost no switching costs to move from one service to another. The net result of this is that freely available models from the open-source community combined with start-ups, which need to get volume for their newly launched services, are going to start eroding the price of the services. This is likely to be followed by a race to the bottom, meaning that the real price ends up being more like $20 per year rather than $20 per month. It is at this point that the FOMO is likely to come unstuck as start-ups and generative AI companies will start missing their targets, leading to down rounds, falling valuations, and so on.
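To put rough numbers on that gap, here is a back-of-the-envelope sketch. It assumes the 6 million users mentioned earlier and ignores churn, discounts and the cost of compute; the function and variable names are purely illustrative.

```python
# Illustrative figures taken from the article: 6 million users, a $20 price point.
USERS = 6_000_000
PRICE_USD = 20


def annual_revenue(users: int, price: float, payments_per_year: int) -> float:
    """Gross annual subscription revenue, ignoring churn, discounts and compute costs."""
    return users * price * payments_per_year


monthly_plan = annual_revenue(USERS, PRICE_USD, 12)  # $20 charged every month
yearly_plan = annual_revenue(USERS, PRICE_USD, 1)    # $20 charged once a year

print(f"$20/month: ${monthly_plan:,.0f} per year")
print(f"$20/year:  ${yearly_plan:,.0f} per year")
print(f"Revenue falls by a factor of {monthly_plan / yearly_plan:.0f}")
```

On these assumptions, gross revenue drops from roughly $1.44 billion to $120 million a year. That twelvefold fall is the kind of shortfall that would leave business plans built on the monthly price well short of their targets.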
There are plenty of real-world use cases for generative AI, meaning that it is not the fundamentals that are likely to crack but merely the hype and excitement that surrounds them. This is precisely what has happened to the Metaverse where very little has changed in terms of developments or progress over the last 12 months, but now no one seems to care about it.
(This guest post was written by Richard Windsor, our Research Director at Large. This first appeared on Radio Free Mobile. All views expressed are Richard’s own.)