Explainability for Sequential Decision-Making
Machine learning has been used to aid decision-making in several domains, from healthcare to finance. Understanding the decision process of ML models is paramount in high-stakes decisions that impact people’s lives; otherwise, loss of control and lack of trust may arise. Often, these decisions have a sequential nature. For instance, the transaction history of a credit card must be considered when predicting the risk of fraud of the most recent transaction. Although recurrent neural networks (RNNs) are state-of-the-art models for many sequential decision-making tasks, they are perceived as black boxes, creating a tension between accuracy and interpretability. While there has been considerable research effort towards developing explanation methods for ML, recurrent models have received much less attention. Recently, Lundberg and Lee unified several methods under a single family of additive feature attribution explainers. From this family, KernelSHAP has seen wide adoption throughout the literature; however, this explainer is unfit to explain models in a sequential setting, as it only accounts for the current input and not the whole sequence. In this work, we present TimeSHAP, a model-agnostic recurrent explainer that builds upon KernelSHAP and extends it to sequences. TimeSHAP explains recurrent models by computing feature-, timestep-, and cell-level attributions, producing explanations along both the feature and time axes. As sequences may be arbitrarily long, we further propose two pruning methods that are shown to dramatically decrease TimeSHAP’s computational cost and increase its reliability. We validate TimeSHAP by using it to explain predictions of two RNN models in two real-world fraud detection tasks, obtaining relevant insights into these models and their predictions.
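
To make the idea of timestep-level attribution concrete, the following is a minimal sketch and not the TimeSHAP implementation. It computes exact Shapley values over the timesteps of a short sequence by enumerating every coalition, whereas KernelSHAP-style explainers approximate these values by sampling coalitions. The model `toy_sequence_model`, the function `timestep_shapley`, and the all-zeros baseline are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial

import numpy as np


def toy_sequence_model(seq):
    """Hypothetical stand-in for a recurrent model: a fixed linear recurrence."""
    h = 0.0
    for x in seq:
        # The hidden state decays by half, then absorbs the current input.
        h = 0.5 * h + float(np.sum(x))
    return h


def timestep_shapley(model, seq, baseline):
    """Exact Shapley value of each timestep (assumption: short sequences only).

    A timestep that is absent from a coalition is replaced by the
    corresponding row of `baseline`.
    """
    n = len(seq)
    phi = np.zeros(n)

    def value(present):
        # Build the perturbed sequence for this coalition and score it.
        masked = np.array(
            [seq[t] if t in present else baseline[t] for t in range(n)]
        )
        return model(masked)

    for t in range(n):
        others = [u for u in range(n) if u != t]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude timestep t.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                present = set(subset)
                phi[t] += weight * (value(present | {t}) - value(present))
    return phi


# Three timesteps with two features each; the baseline is an assumption.
seq = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])
baseline = np.zeros_like(seq)
phi = timestep_shapley(toy_sequence_model, seq, baseline)
```

The attributions in `phi` sum to the difference between the model output on the sequence and on the baseline, which is the additive property shared by this family of explainers. Exact enumeration evaluates the model on every subset of timesteps, so its cost grows exponentially with sequence length; this is why sampling-based approximation and pruning of the sequence matter in practice.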