Interpretable Categorization of Heterogeneous Time Series Data
Understanding heterogeneous multivariate time series data is important in
many applications ranging from smart homes to aviation. Learning models of
heterogeneous multivariate time series that are also human-interpretable is
challenging and not adequately addressed by the existing literature. We propose
grammar-based decision trees (GBDTs) and an algorithm for learning them. GBDTs
extend decision trees with a grammar framework. Logical expressions derived
from a context-free grammar are used for branching in place of simple
thresholds on attributes. The added expressivity enables support for a wide
range of data types while retaining the interpretability of decision trees. In
particular, when a grammar based on temporal logic is used, we show that GBDTs
can be used for the interpretable classification of high-dimensional and
heterogeneous time series data. Furthermore, we show how GBDTs can also be used
for categorization, which is a combination of clustering and generating
interpretable explanations for each cluster. We apply GBDTs to analyze the
classic Australian Sign Language dataset as well as data on near mid-air
collisions (NMACs). The NMAC data comes from aircraft simulations used in the
development of the next-generation Airborne Collision Avoidance System (ACAS
X).
Comment: 9 pages, 5 figures, 2 tables, SIAM International Conference on Data
Mining (SDM) 201
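As a rough illustration of the branching idea described in this abstract, the sketch below picks a split by enumerating logical expressions from a very small grammar instead of testing a single threshold on one attribute. It is a minimal sketch under stated assumptions, not the authors' algorithm: the toy grammar, the exhaustive enumeration, the Gini criterion, the names (enumerate_exprs, evaluate, best_split, render) and the data are all hypothetical.

```python
from itertools import product

# Hypothetical toy grammar (an assumption, not the grammar used in the paper):
#   expr ::= op "(" feature cmp constant ")"
#   op   ::= "always" | "eventually"
#   cmp  ::= "<" | ">"
OPS = {"always": all, "eventually": any}
CMPS = {"<": lambda a, b: a < b, ">": lambda a, b: a > b}


def enumerate_exprs(features, constants):
    """List every expression derivable from the toy grammar, as tuples."""
    return list(product(OPS, features, CMPS, constants))


def evaluate(expr, series):
    """Evaluate an expression on one multivariate series (feature -> list of values)."""
    op, feat, cmp, const = expr
    return OPS[op](CMPS[cmp](v, const) for v in series[feat])


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(data, labels, exprs):
    """Return the expression whose true/false partition has the lowest weighted Gini."""
    def score(expr):
        t = [y for x, y in zip(data, labels) if evaluate(expr, x)]
        f = [y for x, y in zip(data, labels) if not evaluate(expr, x)]
        return (len(t) * gini(t) + len(f) * gini(f)) / len(labels)
    return min(exprs, key=score)


def render(expr):
    """Print an expression in a human-readable form."""
    op, feat, cmp, const = expr
    return f"{op}({feat} {cmp} {const})"


# Made-up toy series, for illustration only.
data = [
    {"a": [5, 6, 7], "b": [1, 1, 1]},
    {"a": [5, 2, 7], "b": [1, 1, 1]},
    {"a": [8, 9, 9], "b": [2, 2, 2]},
    {"a": [4, 1, 3], "b": [2, 2, 2]},
]
labels = ["ok", "event", "ok", "event"]

best = best_split(data, labels, enumerate_exprs(["a", "b"], [3]))
print(render(best))
```

The selected expression reads as a sentence about the whole series, which is the interpretability argument the abstract makes. A practical learner would have to search a much larger expression space than this exhaustive loop can cover.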
A Process to Implement an Artificial Neural Network and Association Rules Techniques to Improve Asset Performance and Energy Efficiency
In this paper, we address the problem of asset performance monitoring, with the intention
of both detecting any potential reliability problem and predicting any loss of energy consumption
efficiency. This is an important concern for many industries and utilities with very intensive
capitalization in very long-lasting assets. To overcome this problem, in this paper we propose an
approach to combine an Artificial Neural Network (ANN) with Data Mining (DM) tools, specifically
with Association Rule (AR) Mining. The combination of these two techniques can now be done
using software which can handle large volumes of data (big data), but the process still needs to
ensure that the required amount of data will be available during the assets’ life cycle and that its
quality is acceptable. The combination of these two techniques in the proposed sequence differs
from previous works found in the literature, giving researchers new options to face the problem.
Practical implementation of the proposed approach may lead to novel predictive maintenance models
(emerging predictive analytics) that may detect with unprecedented precision any asset’s lack of
performance and help manage assets’ O&M accordingly. The approach is illustrated using specific
examples where asset performance monitoring is rather complex under normal operational conditions.
Ministerio de Economía y Competitividad DPI2015-70842-
QCBA: Postoptimization of Quantitative Attributes in Classifiers based on Association Rules
The need to prediscretize numeric attributes before they can be used in
association rule learning is a source of inefficiencies in the resulting
classifier. This paper describes several new rule tuning steps aiming to
recover information lost in the discretization of numeric (quantitative)
attributes, and a new rule pruning strategy, which further reduces the size of
the classification models. We demonstrate the effectiveness of the proposed
methods on postoptimization of models generated by three state-of-the-art
association rule classification algorithms: Classification based on
Associations (Liu, 1998), Interpretable Decision Sets (Lakkaraju et al., 2016),
and Scalable Bayesian Rule Lists (Yang, 2017). Benchmarks on 22 datasets from
the UCI repository show that the postoptimized models are consistently smaller
-- typically by about 50% -- and have better classification performance on most
datasets.
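To make the information loss from prediscretization concrete, the sketch below shows a rule whose numeric condition is a coarse bin, and then narrows that bin to the values the rule actually classifies correctly. This is only one conceivable tuning step, written under assumptions: it is not taken from the paper, and the rule representation, the function names (covers, confidence, tighten) and the data are hypothetical.

```python
def covers(rule, row):
    """True if the row's attribute value falls in the rule's closed interval."""
    lo, hi = rule["interval"]
    return lo <= row[rule["attr"]] <= hi


def confidence(rule, rows):
    """Fraction of covered rows whose label matches the rule's predicted label."""
    covered = [r for r in rows if covers(rule, r)]
    if not covered:
        return 0.0
    return sum(r["label"] == rule["label"] for r in covered) / len(covered)


def tighten(rule, rows):
    """Shrink the interval to the span of correctly classified covered values.

    Hypothetical tuning step for illustration; not necessarily one of the
    steps proposed in the paper.
    """
    vals = [r[rule["attr"]] for r in rows
            if covers(rule, r) and r["label"] == rule["label"]]
    if not vals:
        return rule
    return {**rule, "interval": (min(vals), max(vals))}


# Made-up rows, for illustration only.
rows = [
    {"x": 21, "label": "neg"},
    {"x": 24, "label": "pos"},
    {"x": 27, "label": "pos"},
    {"x": 29, "label": "neg"},
]

# Rule learned on a prediscretized attribute: the bin [20, 30] is fixed in advance.
rule = {"attr": "x", "interval": (20, 30), "label": "pos"}
tuned = tighten(rule, rows)

print(rule["interval"], confidence(rule, rows))
print(tuned["interval"], confidence(tuned, rows))
```

On this toy data the coarse bin mixes both classes, while the narrowed interval covers only matching rows. Real data would need a check that no mismatching rows fall inside the new bounds.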