25,286 research outputs found
QCBA: Postoptimization of Quantitative Attributes in Classifiers based on Association Rules
The need to prediscretize numeric attributes before they can be used in
association rule learning is a source of inefficiencies in the resulting
classifier. This paper describes several new rule tuning steps aiming to
recover information lost in the discretization of numeric (quantitative)
attributes, and a new rule pruning strategy, which further reduces the size of
the classification models. We demonstrate the effectiveness of the proposed
methods on postoptimization of models generated by three state-of-the-art
association rule classification algorithms: Classification based on
Associations (Liu, 1998), Interpretable Decision Sets (Lakkaraju et al, 2016),
and Scalable Bayesian Rule Lists (Yang, 2017). Benchmarks on 22 datasets from
the UCI repository show that the postoptimized models are consistently smaller
-- typically by about 50% -- and have better classification performance on most
datasets
A Process to Implement an Artificial Neural Network and Association Rules Techniques to Improve Asset Performance and Energy Efficiency
In this paper, we address the problem of asset performance monitoring, with the intention
of both detecting any potential reliability problem and predicting any loss of energy consumption
e ciency. This is an important concern for many industries and utilities with very intensive
capitalization in very long-lasting assets. To overcome this problem, in this paper we propose an
approach to combine an Artificial Neural Network (ANN) with Data Mining (DM) tools, specifically
with Association Rule (AR) Mining. The combination of these two techniques can now be done
using software which can handle large volumes of data (big data), but the process still needs to
ensure that the required amount of data will be available during the assets’ life cycle and that its
quality is acceptable. The combination of these two techniques in the proposed sequence di ers
from previous works found in the literature, giving researchers new options to face the problem.
Practical implementation of the proposed approach may lead to novel predictive maintenance models
(emerging predictive analytics) that may detect with unprecedented precision any asset’s lack of
performance and help manage assets’ O&M accordingly. The approach is illustrated using specific
examples where asset performance monitoring is rather complex under normal operational conditions.Ministerio de EconomÃa y Competitividad DPI2015-70842-
Interpretable multiclass classification by MDL-based rule lists
Interpretable classifiers have recently witnessed an increase in attention
from the data mining community because they are inherently easier to understand
and explain than their more complex counterparts. Examples of interpretable
classification models include decision trees, rule sets, and rule lists.
Learning such models often involves optimizing hyperparameters, which typically
requires substantial amounts of data and may result in relatively large models.
In this paper, we consider the problem of learning compact yet accurate
probabilistic rule lists for multiclass classification. Specifically, we
propose a novel formalization based on probabilistic rule lists and the minimum
description length (MDL) principle. This results in virtually parameter-free
model selection that naturally allows to trade-off model complexity with
goodness of fit, by which overfitting and the need for hyperparameter tuning
are effectively avoided. Finally, we introduce the Classy algorithm, which
greedily finds rule lists according to the proposed criterion. We empirically
demonstrate that Classy selects small probabilistic rule lists that outperform
state-of-the-art classifiers when it comes to the combination of predictive
performance and interpretability. We show that Classy is insensitive to its
only parameter, i.e., the candidate set, and that compression on the training
set correlates with classification performance, validating our MDL-based
selection criterion
Electricity clustering framework for automatic classification of customer loads
Clustering in energy markets is a top topic with high significance on expert and intelligent systems. The main impact of is paper is the proposal of a new clustering framework for the automatic classification of electricity customers’ loads. An automatic selection of the clustering classification algorithm is also highlighted. Finally, new customers can be assigned to a predefined set of clusters in the classificationphase. The computation time of the proposed framework is less than that of previous classification tech- niques, which enables the processing of a complete electric company sample in a matter of minutes on a personal computer. The high accuracy of the predicted classification results verifies the performance of the clustering technique. This classification phase is of significant assistance in interpreting the results, and the simplicity of the clustering phase is sufficient to demonstrate the quality of the complete mining framework.Ministerio de EconomÃa y Competitividad TEC2013-40767-RMinisterio de EconomÃa y Competitividad IDI- 2015004
- …