151,292 research outputs found
Efficient classification using parallel and scalable compressed model and Its application on intrusion detection
In order to achieve high efficiency of classification in intrusion detection,
a compressed model is proposed in this paper which combines horizontal
compression with vertical compression. OneR is utilized as horizontal
com-pression for attribute reduction, and affinity propagation is employed as
vertical compression to select small representative exemplars from large
training data. As to be able to computationally compress the larger volume of
training data with scalability, MapReduce based parallelization approach is
then implemented and evaluated for each step of the model compression process
abovementioned, on which common but efficient classification methods can be
directly used. Experimental application study on two publicly available
datasets of intrusion detection, KDD99 and CMDC2012, demonstrates that the
classification using the compressed model proposed can effectively speed up the
detection procedure at up to 184 times, most importantly at the cost of a
minimal accuracy difference with less than 1% on average
Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare
For the last years, time-series mining has become a challenging issue for
researchers. An important application lies in most monitoring purposes, which
require analyzing large sets of time-series for learning usual patterns. Any
deviation from this learned profile is then considered as an unexpected
situation. Moreover, complex applications may involve the temporal study of
several heterogeneous parameters. In that paper, we propose a method for mining
heterogeneous multivariate time-series for learning meaningful patterns. The
proposed approach allows for mixed time-series -- containing both pattern and
non-pattern data -- such as for imprecise matches, outliers, stretching and
global translating of patterns instances in time. We present the early results
of our approach in the context of monitoring the health status of a person at
home. The purpose is to build a behavioral profile of a person by analyzing the
time variations of several quantitative or qualitative parameters recorded
through a provision of sensors installed in the home
FAME: Face Association through Model Evolution
We attack the problem of learning face models for public faces from
weakly-labelled images collected from web through querying a name. The data is
very noisy even after face detection, with several irrelevant faces
corresponding to other people. We propose a novel method, Face Association
through Model Evolution (FAME), that is able to prune the data in an iterative
way, for the face models associated to a name to evolve. The idea is based on
capturing discriminativeness and representativeness of each instance and
eliminating the outliers. The final models are used to classify faces on novel
datasets with possibly different characteristics. On benchmark datasets, our
results are comparable to or better than state-of-the-art studies for the task
of face identification.Comment: Draft version of the stud
Interpretable Aircraft Engine Diagnostic via Expert Indicator Aggregation
Detecting early signs of failures (anomalies) in complex systems is one of
the main goal of preventive maintenance. It allows in particular to avoid
actual failures by (re)scheduling maintenance operations in a way that
optimizes maintenance costs. Aircraft engine health monitoring is one
representative example of a field in which anomaly detection is crucial.
Manufacturers collect large amount of engine related data during flights which
are used, among other applications, to detect anomalies. This article
introduces and studies a generic methodology that allows one to build automatic
early signs of anomaly detection in a way that builds upon human expertise and
that remains understandable by human operators who make the final maintenance
decision. The main idea of the method is to generate a very large number of
binary indicators based on parametric anomaly scores designed by experts,
complemented by simple aggregations of those scores. A feature selection method
is used to keep only the most discriminant indicators which are used as inputs
of a Naive Bayes classifier. This give an interpretable classifier based on
interpretable anomaly detectors whose parameters have been optimized indirectly
by the selection process. The proposed methodology is evaluated on simulated
data designed to reproduce some of the anomaly types observed in real world
engines.Comment: arXiv admin note: substantial text overlap with arXiv:1408.6214,
arXiv:1409.4747, arXiv:1407.088
Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground
We provide a comprehensive evaluation of salient object detection (SOD)
models. Our analysis identifies a serious design bias of existing SOD datasets
which assumes that each image contains at least one clearly outstanding salient
object in low clutter. The design bias has led to a saturated high performance
for state-of-the-art SOD models when evaluated on existing datasets. The
models, however, still perform far from being satisfactory when applied to
real-world daily scenes. Based on our analyses, we first identify 7 crucial
aspects that a comprehensive and balanced dataset should fulfill. Then, we
propose a new high quality dataset and update the previous saliency benchmark.
Specifically, our SOC (Salient Objects in Clutter) dataset, includes images
with salient and non-salient objects from daily object categories. Beyond
object category annotations, each salient image is accompanied by attributes
that reflect common challenges in real-world scenes. Finally, we report
attribute-based performance assessment on our dataset.Comment: ECCV 201
- …