Charles Cipione
Dallas
Developments in artificial intelligence (AI) over the past few years, catalyzed by the public release of ChatGPT in 2022, have captured the public’s attention and brought AI to the forefront for decision-makers. Simultaneously, we are seeing diametrically opposed predictions about the impacts of AI, which might be loosely characterized as “AI as our savior” or “AI as our overlord”.
While much of the hyperbole concerns the potential longer-term impacts of AI, regulators and lawmakers are focused on more immediate concerns, such as the risk of AI systems exhibiting bias or discriminatory behavior, or generative AI (GenAI) hallucinating (i.e., inventing falsehoods), or being used for nefarious means, such as producing realistic audio and video content (“deepfakes”) that can facilitate fraud or widespread disinformation.
Within this context, it is intriguing to consider how one might investigate an AI-decisioning system that is suspected of violating laws or regulations, or otherwise contributing to unfair outcomes. A key component of this question is the ability of human beings to comprehensively understand the mechanical process underpinning how AI systems generate results, versus how a more conventional decision-making algorithm does so. This question has come to the fore with the advent of GenAI, which is often misunderstood as a “truth engine” when, in reality, it is better understood as a “plausibility engine”.
Broadly speaking, we use the term “algorithm” to describe the logic that directs a computer to perform a task, typically by processing inputs according to pre-defined rules to produce a desired output. By “decision-making” algorithms, we mean those that play a critical role in our society, such as determining outcomes for individuals (e.g., your credit score) or detecting and preventing breaches of law or regulation (e.g., detecting money laundering).
Before we delve into AI algorithms in this article series, we will first investigate what we term “conventional algorithms” (i.e., non-AI algorithms whose logic and functionality are entirely designed and dictated by humans). In abstract terms, conventional algorithms can be thought of as a series of “if-this-then-that” rules that were deliberately programmed by humans, according to a pre-defined design or desired outcome.
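The “if-this-then-that” character of a conventional algorithm can be illustrated with a deliberately simplified sketch. The rules, thresholds, and field names here are invented for illustration and do not reflect any real lender’s criteria:

```python
def loan_decision(income: float, debt: float, credit_score: int) -> str:
    """Toy rule-based loan decision: every rule is explicit and human-authored."""
    if credit_score < 600:
        return "decline"           # hard rule: score below an illustrative cut-off
    if debt / income > 0.45:
        return "decline"           # hard rule: debt-to-income ratio too high
    if credit_score >= 720:
        return "approve"
    return "refer to underwriter"  # borderline cases are routed to a human

print(loan_decision(income=50_000, debt=10_000, credit_score=740))  # → approve
```

The key point is that every branch was written, and can be read, by a person: the full decision logic is visible in the source code itself.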
By contrast, AI algorithms are those in which humans have established a technical environment in which a computer can wholly, or in part, “discover” the appropriate logic to be applied. A potentially useful analogy here is the way in which people develop and understand language. A non-native speaker learning a new language might learn the words and the grammatical rules – this is analogous to a conventional algorithm, using codified rules. By contrast, a native speaker will have simply learned the language by being exposed to it daily. They may not know the grammatical rules explicitly, but they know what is correct based on listening to others or having been corrected.
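The contrast can also be sketched in code. In this minimal, invented example, no human writes the decision rule; instead, the program searches labelled examples for the cut-off that best fits them. This is a toy stand-in for machine learning generally, not a depiction of any real AI system:

```python
# Invented training data: (credit_score, loan_repaid?)
examples = [
    (540, False), (580, False), (610, False), (640, True),
    (660, True), (700, True), (730, True), (760, True),
]

def learn_threshold(data):
    """Try each observed score as a cut-off; keep the one with fewest errors."""
    best_t, best_errors = None, len(data) + 1
    for t, _ in data:
        errors = sum((score >= t) != repaid for score, repaid in data)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

threshold = learn_threshold(examples)
print(threshold)  # the rule was "discovered" from the data, not written by hand
```

Here the human designed the learning environment (the data, the search procedure, the objective), but the resulting rule emerged from the data, much as a native speaker absorbs grammar without ever being taught the rules.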
Conventional algorithms provide a useful starting point as they have been in use for many years and so we have a developed framework for investigating them, which we can contrast to the framework we might need for investigating AI algorithms.
Conventional algorithms have been a mainstay of businesses and governments for decades, affecting the day-to-day life of anyone reading this article. They can be found in applications as varied as determining whether your bank gives you a loan, scheduling medical appointments, processing insurance claims, routing emails to your spam folder, or screening job applications. Behind the scenes, algorithms are used to screen financial transactions for potential fraud, money laundering, terrorist financing and economic sanctions violations.
Unfortunately, even with these apparently straightforward algorithms, things can and do go wrong for a variety of reasons. For example, the programmer writing the algorithm may make a mistake in the code (a “bug”), the algorithm may be fed faulty input data, or the algorithm may be mismatched to its use case. These failures – whether combined with human failures or not – can lead to serious, material consequences. Examples include the prosecution of sub-postmasters due to failings in the UK Post Office’s Horizon software, or the widespread CrowdStrike Windows outages that occurred in July 2024.
To break down the ways in which conventional decision-making algorithms can fail, it is necessary to zoom out and consider the broader environment in which the algorithm operates. This environment includes the physical hardware the algorithm runs on, its input and reference data, the governance processes overseeing it, as well as how the algorithm’s output is used. Any one of these components of the decision-making system, which includes the algorithm itself, is a potential point of failure. For example:
From a detection perspective, it could be argued in some cases that a complete failure of an algorithm is preferable to a partial failure: the former can be detected and remedied immediately, whereas a partial failure can persist undetected for many years, resulting in a significant issue to investigate and remedy. For example, contrast a burst water pipe – which is immediately detectable, and can be fixed and the water cleaned up – with a slow drip from a faulty pipe connection that goes undetected for years, allowing damp to permeate the fabric of the building and cause lasting damage.
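The burst-pipe versus slow-drip contrast has a direct software analogue. A screening function that crashes on every input fails completely and is noticed at once; one with a subtle matching bug fails only for certain inputs and can run unnoticed for years. The watch-list names and the bug below are invented for illustration:

```python
# Hypothetical sanctions watch list, stored in lower case.
WATCH_LIST = {"acme trading ltd", "globex corp"}

def is_flagged_buggy(customer_name: str) -> bool:
    # Bug: the input is never normalised to lower case, so "Globex Corp"
    # silently slips through. The function still returns plausible answers,
    # so this partial failure can persist undetected -- the slow drip.
    return customer_name in WATCH_LIST

def is_flagged_fixed(customer_name: str) -> bool:
    # Normalising the input before comparison closes the gap.
    return customer_name.strip().lower() in WATCH_LIST

print(is_flagged_buggy("Globex Corp"))  # → False (missed match)
print(is_flagged_fixed("Globex Corp"))  # → True
```

Note that the buggy version works correctly whenever a name happens to arrive already in lower case, which is exactly what makes this class of defect hard to detect in routine operation.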
Having examined what can go wrong with conventional decision-making algorithms, the rest of this series will explore how to investigate these algorithms and go on to discuss how we can apply this framework to AI algorithms.