Dead on Arrival: Why Explainable AI is Fundamentally Wrong and Why Investment in Explainable Systems Should be Avoided

By Journal of Business & Intellectual Property Law at WFU on April 26, 2022

By: Ben Suslavich

As Artificial Intelligence systems (“AI”) become increasingly prevalent in today’s world, some have demanded that AI algorithms be “explainable.” Private companies, lawmakers, and the government have all expressed interest in the ability to audit how AI makes decisions. This discussion has centered around the use and implementation of eXplainable AI (“XAI”), which, in theory, would allow a casual user of an AI system to understand the reasoning behind an AI decision. Moreover, such a system would allow AI to be audited to ensure that it is operating as it should. However, despite these admirable goals, the XAI movement represents a looming disaster that is meeting little resistance. This blog post explains why resources should not be invested in XAI.

First, it is necessary to explain how AI works at a fundamental level. AI uses algorithms to make decisions based on data inputs. In traditional algorithmic problem solving, numeric data is entered into a mathematical equation that was itself derived from data. For instance, say you wanted to make a program that would guess a person’s weight based on their height. First, you would need some data on both the height and weight of individuals. Once you plot enough data points, you could create a trendline. This trendline could be created through statistical regression models, or you could simply print out a scatterplot and draw a line with a ruler. This trendline would have an associated equation. Using this equation, you could enter a person’s height, and it would generate an estimated weight.
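The trendline described above can be sketched in a few lines of Python. This is only an illustration: the five height and weight values are invented for the example, and the function name predict_weight is hypothetical. The sketch uses NumPy's least-squares polynomial fit in place of the ruler.

```python
import numpy as np

# Hypothetical sample data, invented purely for illustration.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0, 190.0])
weights_kg = np.array([52.0, 58.0, 69.0, 77.0, 88.0])

# Fit a straight line (a degree-1 polynomial) by least squares.
# np.polyfit returns the coefficients from highest degree to lowest.
slope, intercept = np.polyfit(heights_cm, weights_kg, 1)

def predict_weight(height_cm):
    """Return the trendline's estimate of weight (kg) for a height (cm)."""
    return slope * height_cm + intercept

print(f"weight = {slope:.2f} * height + ({intercept:.2f})")
print(f"estimate for 175 cm: {predict_weight(175.0):.1f} kg")
```

The two fitted numbers, slope and intercept, are the entire "model" here, which is why this kind of algorithm is easy to inspect by eye.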

Of course, during the testing of your model, you would likely find that the predicted weights were not exact and were only approximations. To make the model more exact, you could collect more height and weight data, or you could tweak the model so that it better represents the data that you already have. This process of data collection, regression modeling, and adjusting the model is how many algorithms are developed. However, such algorithms have their limitations.

This height and weight model only had a couple of variables to account for; we could make this model more accurate by utilizing additional variables, such as the person’s sex, occupation, body measurements, recent meals, time of day, etc. For this specific application, it might be pointless, but consider, instead, if we had to construct an algorithm to identify traffic signals for a self-driving car. Such an application would need to account for hundreds of different variables (high dimensionality), similar to what our own minds must account for while driving. Our logic-based algorithms would not be well suited to handle data with high dimensionality and large amounts of variability within that data.

In these applications where ambiguity is rampant, the strengths of AI are apparent. While it is hard to define what AI is since it is primarily a marketing term, this blog will define AI as a system capable of learning how to solve a problem on its own. For AI to “learn,” it typically incorporates neural networks. Neural networks are a system of digital knobs called “the model.” The machine learning algorithm adjusts the model. These systems are called neural networks because the model mimics the electrical connections of neurons in biological brains. The model is adjusted until it can accurately and repeatedly solve a problem that you give it. This process is called “training.” In a typical AI, millions of parameters are automatically tuned by the machine learning algorithm during the training of the model.

While it may sound complicated, this process is fundamentally the same as the linear regression in the height and weight example where a line was drawn on a scatterplot. However, the difference is that hundreds, if not thousands, of variables are used instead of only two variables. Consider a 32×32 pixel image. This image has a total of 1024 pixels, so a simple neural network could utilize 1024 variables, each accounting for a single pixel. Creating mathematical formulas to model image data like this would be far too complex to do practically, but AI does not require the user to create formulas. Instead, AI “finds” relationships in the data on its own, provided we give it training data for the algorithm to learn from.
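To make the comparison with the trendline concrete, here is a minimal sketch of a model with 1024 "knobs," one per pixel, tuned automatically by gradient descent. It is a single-layer model (logistic regression), far simpler than a real neural network, and everything in it is an assumption made for illustration: the synthetic images, the labeling rule (label 1 means the left half is brighter), the learning rate, and the number of training steps.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_images(n):
    """Create n noisy 32x32 images; label 1 means the left half is brighter."""
    labels = rng.integers(0, 2, size=n)
    images = rng.normal(0.0, 1.0, size=(n, 32, 32))
    for img, label in zip(images, labels):
        if label == 1:
            img[:, :16] += 1.0   # brighten the left half
        else:
            img[:, 16:] += 1.0   # brighten the right half
    # Flatten each image so that every pixel becomes one input variable.
    return images.reshape(n, 1024), labels

X_train, y_train = make_images(200)
X_test, y_test = make_images(100)

weights = np.zeros(1024)  # the "knobs," one per pixel
bias = 0.0

def predict_proba(X):
    """Probability that each image belongs to class 1."""
    return 1.0 / (1.0 + np.exp(-(X @ weights + bias)))

# "Training": repeatedly nudge every knob in the direction that reduces error.
learning_rate = 0.01
for step in range(200):
    error = predict_proba(X_train) - y_train
    weights -= learning_rate * (X_train.T @ error) / len(y_train)
    bias -= learning_rate * error.mean()

accuracy = np.mean((predict_proba(X_test) > 0.5) == y_test)
print(f"accuracy on unseen images: {accuracy:.2f}")
```

Nobody wrote a formula saying "compare the left half to the right half." The training loop found that relationship from the examples alone, which is the point the paragraph above makes.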

As the number of variables and the complexity of the data which the AI must analyze increases, we introduce a form of unavoidable uncertainty. This is analogous to our height and weight model, where there may be many people with the same height but very different weights. AI is well equipped to handle these complex tasks; however, while AI can outperform our human minds, it is not perfect. The real world is full of ambiguity and requires us to make a judgment call when confronted with such data. Thus, there is unavoidable uncertainty in any AI model.

Many attempts at creating XAI have been centered around creating explainable neural networks. For instance, some have tried to map which data inputs have the most influence on the outcome. In our 32×32 pixel image example, this method would determine which pixels contribute most to the outcome. Another method of creating explainable neural networks is similar to Google’s Deep Dream, where the network is run in reverse to find an input pattern that would induce the maximum stimulation for a particular neural pathway.
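The first method, mapping which pixels have the most influence, can be sketched as follows. This is not how any particular XAI tool is implemented; production tools typically compute gradients through the network itself, whereas this sketch swaps in a simpler technique and nudges one pixel at a time (finite differences). The function toy_model is a hypothetical stand-in for a trained network and is deliberately built to respond only to the central 4×4 patch of the image.

```python
import numpy as np

def toy_model(image):
    """Hypothetical stand-in for a trained network: it responds only to
    the central 4x4 patch of a 32x32 image."""
    return np.tanh(image[14:18, 14:18].sum())

def saliency_map(model, image, eps=1e-4):
    """Estimate each pixel's influence on the output by nudging one pixel
    at a time up and down (central finite differences)."""
    saliency = np.zeros_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            bumped_up = image.copy()
            bumped_down = image.copy()
            bumped_up[r, c] += eps
            bumped_down[r, c] -= eps
            change = model(bumped_up) - model(bumped_down)
            saliency[r, c] = abs(change) / (2 * eps)
    return saliency

image = np.full((32, 32), 0.01)
sal = saliency_map(toy_model, image)

# Report the bounding box of the pixels that influence the output.
rows, cols = np.nonzero(sal > 1e-6)
print("influential rows:", rows.min(), "to", rows.max())
print("influential cols:", cols.min(), "to", cols.max())
```

The resulting map tells us where the model is "looking," but, as discussed next, it does not tell us which knob to turn to correct a wrong answer.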

To the general public (and even some “experts”), it may seem logical that finding which pathways produced an incorrect result would enable AI developers to fix the problem. This belief is likely founded on methods used in traditional logic-based computing, where programmers hunt for the “bug” within the code. In software development, it is essential to make code scrutable to ensure bugs can be found and corrected. However, this traditional approach does not work for neural networks.

Suppose a neural network makes a mistake, such as misclassifying an image. In that case, an XAI could hypothetically trace the entire neural pathway and reconstruct precisely why the AI made a certain decision. However, even if developers knew where in the neural network the mistake was made, there would be no way to directly change the AI to ensure the same mistake was not made again. If the AI were changed by hand to ensure correctness for a specific case, the generalizations that made the AI algorithm correct in the first place would be completely destroyed, because changing even one “knob” would have a cascading effect. Unlike in traditional programming, where altering an algorithm creates a determinable result (that is, we can predict what will happen once the change is made), altering an AI’s neural network would create a non-determinable result that would affect the entire model in unknown ways.

However, all is not lost. There is a way to fix issues, and the method has been around since long before there were demands for XAI. If there is a documented instance of misclassification, the example can be fed back into the AI so that the machine learning algorithm readjusts the weights slightly. One might recall that this is precisely how training works in the first place. While there is no guarantee that similar cases would then be appropriately classified, there was never that guarantee in the first place with AI because, unlike traditional algorithms, AI systems are not deterministic. However, this is not a bad thing. It is precisely this flexibility that makes AI ideal for handling complex, real-world problems.
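A minimal sketch of this corrective step, assuming a tiny three-weight logistic model rather than a real neural network, might look like the following. The starting weights, the misclassified input, the learning rate, and the number of steps are all hypothetical values chosen so that the example is easy to follow.

```python
import numpy as np

def predict_proba(weights, x):
    """Probability that input x belongs to class 1."""
    return 1.0 / (1.0 + np.exp(-(weights @ x)))

# Hypothetical weights left over from earlier training.
weights = np.array([0.8, -0.5, 0.3])

# A documented misclassification: the true label is 1, but the model
# currently gives this input a probability just under 0.5.
x_missed = np.array([1.0, 2.0, 0.5])
y_true = 1.0

original_weights = weights.copy()
before = predict_proba(weights, x_missed)

# Feed the example back in: a few small gradient steps on this one case.
learning_rate = 0.1
for _ in range(5):
    error = predict_proba(weights, x_missed) - y_true
    weights = weights - learning_rate * error * x_missed

after = predict_proba(weights, x_missed)
print(f"probability of the correct label: {before:.3f} -> {after:.3f}")
print("largest change to any weight:", np.abs(weights - original_weights).max())
```

Note that the update is made by the same learning rule used in training, not by a person reaching in and setting a weight by hand. In practice the example would usually be mixed in with other training data so that the model does not drift on the cases it already handles well.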

Furthermore, one does not place blind trust in AI by not adopting XAI. AI is already explainable and verifiable because it can be tested. There are a variety of techniques for testing a neural network, such as boundary value analysis and cause-effect graphing. At a basic level, examples can be given to AI, and if the AI can perform its task at a sufficient level, then the AI can be deemed to be effective. Auditing an AI is as simple as giving the AI test data. The call for an AI’s neural network to be explainable is about as reasonable as having all elementary schools run fMRI scans on their students’ brains to ensure that their neurons are activating correctly instead of simply giving them pencil-and-paper tests.
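Testing of this kind treats the model as a sealed box: inputs go in, outputs come out, and the outputs are compared with known answers. A rough sketch follows. The names audit, toy_model, and required_accuracy are hypothetical, and the toy classifier merely stands in for a trained AI; the test cases are clustered around the 180 cm decision boundary in the spirit of boundary value analysis.

```python
def audit(model, test_cases, required_accuracy):
    """Run the model on labeled examples and report whether it meets
    the accuracy the user requires. The model's internals are never read."""
    correct = sum(1 for inputs, expected in test_cases
                  if model(inputs) == expected)
    accuracy = correct / len(test_cases)
    return accuracy, accuracy >= required_accuracy

def toy_model(height_cm):
    """Hypothetical stand-in for a trained classifier."""
    return "tall" if height_cm >= 180 else "not tall"

# Test inputs chosen at and around the decision boundary, plus extremes.
test_cases = [
    (150, "not tall"),
    (179, "not tall"),
    (180, "tall"),
    (181, "tall"),
    (200, "tall"),
]

accuracy, passed = audit(toy_model, test_cases, required_accuracy=0.9)
print(f"accuracy: {accuracy:.0%}, passed: {passed}")
```

The required_accuracy threshold is where the user's tolerance for risk enters: a higher-stakes application would demand a higher bar and a larger, more varied set of test cases.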

Not only is XAI pointless, but it might also be highly damaging. If engineers, executives, lawyers, and lawmakers start engaging in post hoc analysis of AI methods, there is a serious risk that the very features that make AI a revolutionary tool may be lost. AI can detect relationships and features of data that the human mind is not equipped to recognize. However, simply because our own minds do not fully understand or accept how AI makes a decision does not mean that the AI is flawed. Consider again our example of the student taking a test. While most teachers (and adults) use traditional methods such as multiplication tables or row and column accounting to solve an arithmetic problem without a calculator, should those teachers mark a student incorrect for using the esoteric Trachtenberg system? The purpose of AI is to promote innovative solutions, and we must promote progress, not stifle it.

The only way to determine the effectiveness of an AI is to test it. It is true that AI systems are being used to make critical decisions in our lives, such as decisions about our healthcare, whether we are hired, and even which university we attend. The focus of audits on these systems should be on the data used to create the model, not the model itself. The “black box” of the neural network is a red herring; its mystique is perpetuated by a mistaken understanding of how AI systems operate.

Ultimately, the call for the development of XAI is misguided since we already have systems to audit AI. Furthermore, if XAI is implemented, there is a severe risk that this technology will hurt rather than help technological progress. Finally, even if XAI can help users understand how an AI system makes a decision, such knowledge is utterly useless for fixing the system. While it is important to audit and verify that AI is performing as desired, XAI is not the solution. Instead, users should independently verify that AI works at a level commensurate with the amount of risk the user is willing to tolerate.

Benjamin Suslavich is a second-year law student at Wake Forest University School of Law. He is a certified Chemical Engineer EIT and holds a Master of Science in Metallurgical and Process Engineering as well as a Bachelor of Science in Materials Engineering from Montana Technological University.