Towards a Rigorous Evaluation of XAI Methods on Time Series
Explainable Artificial Intelligence (XAI) methods are typically deployed to
explain and debug black-box machine learning models. However, most proposed XAI
methods are black-boxes themselves and designed for images. Thus, they rely on
visual interpretability to evaluate and prove explanations. In this work, we
apply XAI methods previously used in the image and text domains to time series.
We present a methodology to test and evaluate various XAI methods on time
series by introducing new verification techniques to incorporate the temporal
dimension. We further conduct preliminary experiments to assess the quality of explanations from selected XAI methods using various verification methods on a range of datasets, and we inspect the resulting quality metrics. Our initial experiments demonstrate that SHAP works robustly across all models, while others, such as DeepLIFT, LRP, and Saliency Maps, perform better with specific architectures.
Comment: 5 pages, 1 figure, 1 table, 1 page of references. 2019 ICCV Workshop on Interpreting and Explaining Visual Artificial Intelligence Models.
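To make the paper's notion of perturbation-based verification concrete, here is a rough sketch of one such check. The scikit-learn-style classifier interface (`model.predict_proba`), the fill values, and the number of perturbed points are placeholder assumptions for illustration, not the authors' setup.

```python
import numpy as np

def perturbation_check(model, x, attribution, k=10, seed=0):
    """Verify an attribution for a univariate time series x (shape [T]).

    Replaces the k most-attributed time points with (a) zeros and
    (b) values resampled from x itself, then measures how much the
    model's predicted probability for the original class drops.
    A faithful attribution should cause a larger drop than perturbing
    k random time points.
    """
    rng = np.random.default_rng(seed)
    base_prob = model.predict_proba(x[None, :])[0]
    cls = int(np.argmax(base_prob))

    top_idx = np.argsort(attribution)[-k:]            # most relevant points
    rand_idx = rng.choice(len(x), size=k, replace=False)

    def drop(idx, fill):
        x_pert = x.copy()
        x_pert[idx] = fill
        return base_prob[cls] - model.predict_proba(x_pert[None, :])[0][cls]

    return {
        "zero_top": drop(top_idx, 0.0),
        "resample_top": drop(top_idx, rng.choice(x, size=k)),
        "zero_random": drop(rand_idx, 0.0),
    }
```

Comparing the drop for the top-attributed points against the drop for random points is one simple way to incorporate the temporal dimension into the verification, since the perturbed positions are specific time steps rather than independent features.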
Explaining and upsampling anomalies in time-series sensor data.
The aim of this research is to improve anomaly detection methods in multi-sensor data by extending current re-sampling and explanation methods to work in a time-series setting. While there is a plethora of literature on XAI for tabular data, the same cannot be said for multivariate time-series settings. It is also known that selecting an optimal baseline for attribution methods such as integrated gradients remains an open research question. Accordingly, the author is interested in exploring the role of Case-Based Reasoning (CBR) in three ways: 1) to represent time-series data from multiple sensors to enable effective anomaly detection; 2) to create explanation experiences (explanation-baseline pairs) that can support the identification of suitable baselines and improve attribution discovery with integrated gradients in multivariate time-series settings; and 3) to represent the disagreements between past explanations in a case-base to better inform strategies for resolving disagreements between explainers in the future. A common theme across my research is the need to explore how inherent relationships between sensors (causal or other ad-hoc inter-dependencies) can be captured and represented to improve anomaly detection and the follow-on explanation phases.
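As context for the baseline question raised above, the following is a minimal NumPy sketch of integrated gradients for a multivariate time series with an explicit baseline argument; `grad_fn` and the example baseline choices are illustrative assumptions rather than the author's method.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Integrated gradients for a multivariate time series.

    x, baseline : arrays of shape [T, n_sensors]
    grad_fn     : returns d(model output)/d(input) for a given input,
                  same shape as x (placeholder for a real autodiff call).
    The attribution for each (time step, sensor) pair is the average
    gradient along the straight line from `baseline` to `x`, scaled by
    (x - baseline), so the choice of baseline directly shapes the result.
    """
    alphas = np.linspace(0.0, 1.0, steps)
    avg_grad = np.zeros_like(x)
    for a in alphas:
        avg_grad += grad_fn(baseline + a * (x - baseline))
    avg_grad /= steps
    return (x - baseline) * avg_grad

# Example baseline candidates a case-base might propose (assumptions):
# an all-zeros signal, the per-sensor mean, or a retrieved "normal" case.
# baseline = np.broadcast_to(x.mean(axis=0), x.shape).copy()
```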
Explainable NILM Networks
There has been an explosion in the literature recently on Nonintrusive load monitoring (NILM) approaches based on neural networks and other advanced machine learning methods. However, although these methods provide competitive accuracy, the inner workings of these models are less clear. Understanding the outputs of the networks helps in improving the designs, highlights the relevant features and aspects of the data used for making the decision, provides a better picture of the accuracy of the models (since a single accuracy number is often insufficient), and also inherently provides a level of trust in the value of the consumption feedback given to the NILM end-user. Explainable Artificial Intelligence (XAI) aims to address this issue by explaining these "black boxes". XAI methods, developed for image and text-based models, can in many cases interpret well the outputs of complex models, making them transparent. However, explaining time-series data inference remains a challenge. In this paper, we show how some XAI-based approaches can be used to explain the inner workings of deep learning-based NILM autoencoders, and examine why the network performs or does not perform well in certain cases.
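As one hedged illustration of the kind of analysis described here, the sketch below applies simple occlusion to a NILM model's appliance estimate; the `nilm_model` interface, patch length, and fill value are assumed placeholders, not the paper's actual networks.

```python
import numpy as np

def occlusion_importance(nilm_model, mains_window, patch=20, fill=0.0):
    """Occlusion analysis for a NILM model.

    mains_window : aggregate power readings, shape [T]
    nilm_model   : callable returning the predicted appliance energy
                   for a window (placeholder interface).
    Slides a patch of `patch` samples over the window, masks it with
    `fill`, and records how much the appliance estimate changes; large
    changes indicate time steps the network relies on.
    """
    base = nilm_model(mains_window)
    importance = np.zeros(len(mains_window))
    for start in range(0, len(mains_window) - patch + 1):
        occluded = mains_window.copy()
        occluded[start:start + patch] = fill
        importance[start:start + patch] += abs(base - nilm_model(occluded))
    return importance
```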
On the Consistency and Robustness of Saliency Explanations for Time Series Classification
Interpretable machine learning and explainable artificial intelligence have become essential in many applications. The trade-off between interpretability and model performance is a major obstacle to developing intrinsic and model-agnostic interpretation methods. Although model explanation approaches have achieved significant success in vision and natural language domains, explaining time series remains challenging. The complex patterns in the feature domain, coupled with the additional temporal dimension, hinder efficient interpretation. Saliency maps have been applied to interpret time series windows as images. However, they are not naturally designed for sequential data and thus suffer from various issues.
This paper extensively analyzes the consistency and robustness of saliency maps for time series features and temporal attribution. Specifically, we examine saliency explanations from both perturbation-based and gradient-based explanation models in a time series classification task. Our experimental results on five real-world datasets show that they all lack consistent and robust performance to some extent. By drawing attention to these flawed saliency explanation models, we motivate the development of consistent and robust explanations for time series classification.
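The robustness question studied in this paper can be illustrated with a simple probe such as the following sketch, which perturbs the input slightly and compares saliency rankings; `explain_fn`, `predict_fn`, and the noise level are assumptions for illustration, not the paper's protocol.

```python
import numpy as np

def saliency_robustness(explain_fn, predict_fn, x, noise=0.01, trials=10, seed=0):
    """A simple robustness probe for a saliency explainer.

    x : univariate time series, shape [T].
    Adds small Gaussian noise to x, keeps only trials where the
    prediction is unchanged, and reports the mean correlation between
    the rank orderings of the original and perturbed saliency maps.
    Low correlation indicates an unstable (non-robust) explanation.
    """
    rng = np.random.default_rng(seed)
    ref_cls = predict_fn(x)
    ref_rank = np.argsort(np.argsort(explain_fn(x)))
    corrs = []
    for _ in range(trials):
        x_noisy = x + rng.normal(0.0, noise * x.std(), size=x.shape)
        if predict_fn(x_noisy) != ref_cls:
            continue  # only compare explanations for the same decision
        rank = np.argsort(np.argsort(explain_fn(x_noisy)))
        corrs.append(np.corrcoef(ref_rank, rank)[0, 1])
    return float(np.mean(corrs)) if corrs else float("nan")
```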
XTSC-Bench: Quantitative Benchmarking for Explainers on Time Series Classification
Despite the growing body of work on explainable machine learning in time series classification (TSC), it remains unclear how to evaluate different explainability methods. Resorting to qualitative assessment and user studies to evaluate explainers for TSC is difficult, since humans have difficulty understanding the underlying information contained in time series data. Therefore, a systematic review and quantitative comparison of explanation methods to confirm their correctness becomes crucial. While steps towards standardized evaluation have been taken for tabular, image, and textual data, benchmarking explainability methods on time series is challenging because a) traditional metrics are not directly applicable, b) implementations and adaptations of traditional metrics for time series vary across the literature, and c) baseline implementations vary. This paper proposes XTSC-Bench, a benchmarking tool providing standardized datasets, models, and metrics for evaluating explanation methods on TSC. We analyze 3 perturbation-based, 6 gradient-based, and 2 example-based explanation methods applied to TSC, showing that improvements in the explainers' robustness and reliability are necessary, especially for multivariate data.
Comment: Accepted at ICMLA 202
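For orientation only, a benchmarking tool of this kind typically reduces to a loop over standardized datasets, explainers, and metrics. The sketch below shows such a loop with placeholder interfaces; it is not the XTSC-Bench API.

```python
import numpy as np

def run_benchmark(explainers, datasets, models, metrics):
    """Illustrative benchmarking loop (not the XTSC-Bench API).

    explainers : dict name -> explain_fn(model, x) returning an attribution
    datasets   : dict name -> iterable of (x, y) time-series samples
    models     : dict dataset name -> trained classifier
    metrics    : dict name -> metric_fn(model, x, attribution) -> float
    Returns results[dataset][explainer][metric] = mean score, so that
    explainers can be compared on the same standardized data and models.
    """
    results = {}
    for ds_name, samples in datasets.items():
        model = models[ds_name]
        results[ds_name] = {}
        for ex_name, explain_fn in explainers.items():
            scores = {m: [] for m in metrics}
            for x, _ in samples:
                attribution = explain_fn(model, x)
                for m_name, metric_fn in metrics.items():
                    scores[m_name].append(metric_fn(model, x, attribution))
            results[ds_name][ex_name] = {
                m: float(np.mean(v)) for m, v in scores.items()
            }
    return results
```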
Transparent AI: explainability of deep learning based load disaggregation
The paper focuses on explaining the outputs of deep learning-based non-intrusive load monitoring (NILM). Explainability of NILM networks is needed by a range of stakeholders: (i) technology developers, to understand why a model is under- or over-predicting energy usage, missing appliances, or producing false positives; (ii) businesses offering energy advice based on NILM as part of a broader home energy management recommender system; and (iii) end-users, who need to understand the outcomes of the NILM inference.
On the Soundness of XAI in Prognostics and Health Management (PHM)
The aim of Predictive Maintenance (PM), within the field of Prognostics and Health Management (PHM), is to identify and anticipate potential issues in equipment before they become critical. The main challenge to be addressed is to assess the amount of time a piece of equipment will function effectively before it fails, which is known as its Remaining Useful Life (RUL). Deep Learning (DL) models, such as Deep Convolutional Neural Networks (DCNN) and Long Short-Term Memory (LSTM) networks, have been widely adopted to address this task, with great success. However, it is well known that these kinds of black-box models are opaque decision systems, and it may be hard to explain their outputs to stakeholders (experts in the industrial equipment). Due to the large number of parameters that determine the behavior of these complex models, understanding the reasoning behind the predictions is challenging. This work presents a critical and comparative review of a number of XAI methods applied to a time series regression model for PM. The aim is to explore XAI methods within time series regression, which have been less studied than those for time series classification. The model used during the experimentation is a DCNN trained to predict the RUL of an aircraft engine. The methods are reviewed and compared using a set of metrics that quantify a number of desirable properties that any XAI method should fulfill. The results show that Grad-CAM is the most robust method, and that the best layer is not the bottom one, as is commonly assumed within the context of image processing.
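As a rough sketch of how Grad-CAM carries over to a 1D convolutional RUL regressor, the function below weights the feature maps of a chosen convolutional layer by their average gradients; the activation and gradient tensors, and the layer choice itself, are assumptions here, and the paper's finding is precisely that the layer choice matters.

```python
import numpy as np

def grad_cam_1d(activations, gradients, target_len=None):
    """Grad-CAM for a 1D convolutional layer of an RUL regressor (sketch).

    activations : feature maps of the chosen conv layer, shape [T', C]
    gradients   : d(RUL prediction)/d(activations), same shape, obtained
                  from the framework's autodiff (placeholder here).
    Channels are weighted by their average gradient, summed, passed
    through ReLU, and optionally stretched back to the input length to
    give a per-time-step relevance curve.
    """
    weights = gradients.mean(axis=0)                  # one weight per channel
    cam = np.maximum((activations * weights).sum(axis=1), 0.0)  # ReLU
    if cam.max() > 0:
        cam = cam / cam.max()                         # normalize to [0, 1]
    if target_len is not None:                        # naive upsampling to input length
        cam = np.interp(np.linspace(0, len(cam) - 1, target_len),
                        np.arange(len(cam)), cam)
    return cam
```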
Instance-based Counterfactual Explanations for Time Series Classification
In recent years, there has been a rapidly expanding focus on explaining the
predictions made by black-box AI systems that handle image and tabular data.
However, considerably less attention has been paid to explaining the
predictions of opaque AI systems handling time series data. In this paper, we
advance a novel model-agnostic, case-based technique -- Native Guide -- that
generates counterfactual explanations for time series classifiers. Given a
query time series for which a black-box classification system predicts a particular class, a counterfactual time series explanation shows how the query could change such that the system predicts an alternative class. The proposed instance-based technique adapts existing counterfactual instances in the case-base by highlighting and modifying discriminative areas of the time series that underlie the classification. Quantitative and qualitative results from two comparative experiments indicate that Native Guide generates plausible, proximal, sparse, and diverse explanations that are better than those produced by key benchmark counterfactual methods.
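To illustrate the general nearest-unlike-neighbour idea behind instance-based counterfactuals, here is a much-simplified sketch with assumed inputs; it is not the Native Guide implementation, and the "discriminative region" is approximated by the peak of a supplied attribution vector.

```python
import numpy as np

def nun_counterfactual(predict_fn, x, case_base, case_labels, query_class,
                       attribution, window=20):
    """Simplified instance-based counterfactual (inspired by, but not, Native Guide).

    x          : query time series, shape [T]
    case_base  : stored training series, shape [N, T]; case_labels : shape [N]
    Finds the nearest case from another class (nearest unlike neighbour),
    then copies an increasingly long segment of it, centred on the most
    discriminative region indicated by `attribution`, into the query until
    the classifier changes its prediction.
    """
    other = case_labels != query_class
    dists = np.linalg.norm(case_base[other] - x, axis=1)
    nun = case_base[other][np.argmin(dists)]          # nearest unlike neighbour

    centre = int(np.argmax(attribution))
    for w in range(window, len(x) + 1, window):
        lo, hi = max(0, centre - w // 2), min(len(x), centre + w // 2)
        cf = x.copy()
        cf[lo:hi] = nun[lo:hi]
        if predict_fn(cf) != query_class:
            return cf                                 # counterfactual found
    return nun                                        # fall back to the neighbour itself
```

Because the modified region stays anchored to the discriminative area and grows only as far as needed to flip the prediction, the resulting counterfactual tends to stay sparse and close to the original query.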