Paul Kelly
London
In the pursuit of value, competitive edge, excellence or efficiencies, artificial intelligence (AI) is only expected to grow in priority and focus for CEOs, CFOs, COOs, and CIOs alike.
For CEOs, AI will increasingly be part of efforts to improve decision-making in the business, a route to value creation, and a means of delivering competitive edge through technology and innovation.
CFOs will increasingly seek to understand the value of AI investments accurately, drive budgeting and planning, and predict future financial performance to ensure investments are sound.
COOs, meanwhile, will need to focus on how AI delivers operational excellence and customer experience, streamlines supply and demand within the workforce, and facilitates automation, among other functions.
While executives broadly see this value and potential in AI, its adoption costs are often overlooked, misunderstood, or considered only as an afterthought. For example, using cloud infrastructure for AI activity is just one aspect of AI investment but it involves a variety of dimensions and an intimidating list of cost considerations.
For executives to confidently adopt and invest in AI there’s a set of core decision points, each carrying significant cost implications:
a. Is it to improve customer experience? This could take the form of an AI chatbot, an example of a lower investment, high-value commodity type of AI use case.
b. Is it to drive operational efficiency, e.g., supply chain optimisation for a manufacturing firm? While AI models might be available for this purpose, your enterprise data may not be ready to feed into an existing model.
c. How big is the data set required and how expensive is the analysis needed? A use case like specialised software in connected cars requires both a large data set and costly analysis.
d. How clear is the use case? That clarity is one of the biggest factors affecting ultimate AI cost levels.
Opting for on-premises data centres, a cloud-based operation, or a hybrid approach will have major cost implications:
a. Using on-premises data centres to operate AI involves long lead times for specialised new AI hardware and high capex costs. These create high barriers to entry and clear disadvantages for certain AI use cases: If you are experimenting with AI, with no clear roadmap for enterprise AI adoption, what happens to all the specialised AI hardware you just invested in if your innovation experiment fails?
b. AI in the cloud has a comparatively lower barrier to entry. It is hard to disagree with the view that AI was born in the cloud and for the cloud. If you are looking to use the cloud for your AI needs, are you planning to use the cloud as an AI infrastructure layer (IaaS) only, and build your AI apps on top of that? Or are you looking to use fully cloud-managed AI services (that are offered as Platform as a Service or Software as a Service, rather than Infrastructure as a Service)?
Various dimensions drive the cost of fully managed AI services in the cloud. Modelling these costs can become extremely complex and intimidating, with businesses often left paying too much. Examples of the cost attributes include:
The table below summarises the fully managed AI services from AWS (correct as of Jan 2025):
As you can see, different services have different pricing mechanisms and can be quite different from typical IaaS and PaaS pricing models that we are used to.
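One of the most common of these pricing mechanisms is per-token billing for managed large language model services, which differs sharply from the per-instance-hour model of IaaS. The sketch below shows how such a bill accumulates; the rates, request volumes, and token counts are illustrative assumptions, not any vendor's actual prices.

```python
# Hedged sketch of token-based pricing, a common mechanism for fully
# managed LLM services. All rates and volumes below are hypothetical.

def monthly_llm_bill(requests_per_day: int,
                     input_tokens_per_request: int,
                     output_tokens_per_request: int,
                     price_per_1k_input: float,
                     price_per_1k_output: float,
                     days: int = 30) -> float:
    """Monthly cost where input and output tokens are priced separately."""
    total_in = requests_per_day * input_tokens_per_request * days
    total_out = requests_per_day * output_tokens_per_request * days
    return (total_in / 1000) * price_per_1k_input \
         + (total_out / 1000) * price_per_1k_output

# A customer-service chatbot handling 10,000 conversations a day:
bill = monthly_llm_bill(requests_per_day=10_000,
                        input_tokens_per_request=500,
                        output_tokens_per_request=250,
                        price_per_1k_input=0.003,
                        price_per_1k_output=0.015)
print(f"${bill:,.2f}")  # $1,575.00
```

Note that cost here is driven by usage volume and prompt length rather than by provisioned capacity, which is why modelling these services alongside conventional IaaS line items is so difficult.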
Where you run your AI is one of the biggest cost drivers and creates major barriers to entry around budget, timelines, and ease of company adoption.
There are two key routes to take to answer this question: either rely on internal capability or leverage external specialists.
a. Internal capabilities: If you have a large, strong engineering and technology workforce with most of the skills required to run AI (data scientists, AI engineers, cloud architects, and engineers), this can be the cheaper option, with the flexibility to carve out costs and budgets from people's time on internal projects. But it must be time-bound, otherwise you can burn through money quickly and for an extended period.
b. External specialists: If your organisation has been traditionally more business-focused, and you don’t have the necessary technology capability or engineering strength in this area, you might choose to partner with a specialist organisation in your AI journey. This can be a more expensive route but possibly better value for money as engagements are typically outcome-based for a finite duration.
Based on your business operating model and delivery model, this has significant cost implications.
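To make the trade-off concrete, the two delivery routes can be compared with simple cost arithmetic. The sketch below is a hedged illustration only: the headcount, loaded monthly cost, allocation percentage, and fixed engagement price are all hypothetical placeholders, not benchmarks.

```python
# Hypothetical illustration: internal time-based delivery cost vs a
# fixed-price, outcome-based external engagement. All figures are
# placeholders for the sake of the comparison.

def internal_cost(num_engineers: int, loaded_monthly_cost: float,
                  months: int, allocation: float) -> float:
    """Cost of internal staff time allocated to the AI programme."""
    return num_engineers * loaded_monthly_cost * months * allocation

# 12 engineers at a loaded cost of $15,000/month, spending half their
# time on the AI programme for nine months:
internal = internal_cost(num_engineers=12, loaded_monthly_cost=15_000,
                         months=9, allocation=0.5)

# Outcome-based external engagement, fixed price for a finite duration:
external = 950_000

print(f"Internal: ${internal:,.0f}")  # Internal: $810,000
print(f"External: ${external:,.0f}")  # External: $950,000
```

The key difference the sketch highlights is not the headline number but the risk profile: the internal figure scales linearly with every month the timeline slips, while the external figure is capped by contract.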
Processing units, such as graphic processing units (GPUs), are central to AI hardware and there is a wide range of compute choices with staggering price differentials.
Take for example the list prices for some of the GPUs on Google Cloud (source: GCP website, correct Jan 2025):
Imagine having to run a few thousand GPUs of a certain type for three, six, nine, or 12 months. As you can see, choosing the right hardware type has significant cost implications. Once the correct processing unit is selected for the AI use case, costs can still vary dramatically based on how many central processing units (CPUs) and GPUs you need, and for how long.
It is also possible to reserve AI compute capacity in advance, as this cloud space is in high demand and short supply. With reservations, you pay a substantially discounted, locked-in rate for a fixed term for your AI compute instances and get the predictability you need. This flexibility in purchasing options is particularly useful for experimentation and innovation projects.
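The scale of these numbers is easy to underestimate, so it is worth sketching the arithmetic. The hourly rate and reservation discount below are hypothetical placeholders, not actual Google Cloud prices.

```python
# Hedged illustration of fleet-level GPU costs, on-demand vs reserved.
# The hourly rate and discount are assumptions, not real cloud prices.

def gpu_fleet_cost(num_gpus: int, hourly_rate: float, months: int,
                   reservation_discount: float = 0.0) -> float:
    """Total cost of running a GPU fleet around the clock."""
    hours = months * 30 * 24  # approximate hours per month
    effective_rate = hourly_rate * (1 - reservation_discount)
    return num_gpus * effective_rate * hours

# 2,000 GPUs at a placeholder rate of $2.50/GPU-hour for six months:
on_demand = gpu_fleet_cost(num_gpus=2000, hourly_rate=2.50, months=6)
reserved = gpu_fleet_cost(num_gpus=2000, hourly_rate=2.50, months=6,
                          reservation_discount=0.40)  # assumed 40% discount

print(f"On-demand: ${on_demand:,.0f}")  # On-demand: $21,600,000
print(f"Reserved:  ${reserved:,.0f}")   # Reserved:  $12,960,000
```

Even with invented rates, the shape of the result holds: at fleet scale, small per-hour differences between GPU types, and the choice between on-demand and reserved pricing, compound into eight-figure swings.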
There is a whole set of different cost considerations for on-premises AI hardware. To start, the hardware should ideally have a dual use – for non-AI workloads too, which allows flexibility and de-risks your investment. Backwards compatibility of modern, cutting-edge AI hardware matters.
On-premises AI hardware also has unique characteristics, very different from conventional data centre hardware. It requires rack-scale solutions designed for energy-efficient, dense compute and large-scale AI adoption:
With hundreds of AI models to choose from and seemingly unclear pricing models, this area can be daunting to navigate. Key AI software questions include:
There are some interesting data regarding the cost of training AI models. A recent Statista article reported that GPT-3 cost only around $2-$4 million to train in 2020, but the technical cost of creating its successor, GPT-4, is reported to have been between $41-$78 million.
The cost of training Gemini – a large language model trained on text, voice commands, and images – reportedly stood between $30 million and $191 million (excluding staff salaries). The compute cost to train Gemini's precursor, PaLM, in 2022 was between $3 million and $12 million.
Data is a primary factor in determining the cost of AI models: a complex data set means more training and more cost. The type of data also matters: training on text is cheaper than training on graphics and video.
Other factors include the number of prompts a model is optimised for and the maintenance costs to run, train, and operate the model for business as usual.
Data shows that salaries for AI-focused roles and skills tend to be higher than non-AI IT roles. Add to this the supply-demand gap, which is driving these costs even higher.
Governance of AI and ML operations and lifecycle management can also be very expensive. It is widely accepted that AI engineers spend 90% of their time preparing data and 10% working with the models.
We’ve looked – in outline – at some of the core business decision points around AI investment and the significant and sometimes highly complex costs that these carry. Beyond these strategic and technical decision points there are also broader operational, budgetary, and legal questions that executives must tackle. For example:
With the growing focus on AI at the executive level, it will increasingly pay to be focused on the key questions and core decision points that ultimately determine the cost-performance and viability of AI activity.