Search CORE

151,292 research outputs found

Efficient classification using parallel and scalable compressed model and Its application on intrusion detection

Author: Chen Tieming
Jin Shichao
Kim Okhee
Zhang Xu
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014
Field of study

In order to achieve high efficiency of classification in intrusion detection, a compressed model is proposed in this paper which combines horizontal compression with vertical compression. OneR is utilized as horizontal com-pression for attribute reduction, and affinity propagation is employed as vertical compression to select small representative exemplars from large training data. As to be able to computationally compress the larger volume of training data with scalability, MapReduce based parallelization approach is then implemented and evaluated for each step of the model compression process abovementioned, on which common but efficient classification methods can be directly used. Experimental application study on two publicly available datasets of intrusion detection, KDD99 and CMDC2012, demonstrates that the classification using the compressed model proposed can effectively speed up the detection procedure at up to 184 times, most importantly at the cost of a minimal accuracy difference with less than 1% on average

arXiv.org e-Print Archive

Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare

Author: Duchene Florence
Garbay Catherine
Rialle Vincent
Publication venue
Publication date: 25/11/2004
Field of study

For the last years, time-series mining has become a challenging issue for researchers. An important application lies in most monitoring purposes, which require analyzing large sets of time-series for learning usual patterns. Any deviation from this learned profile is then considered as an unexpected situation. Moreover, complex applications may involve the temporal study of several heterogeneous parameters. In that paper, we propose a method for mining heterogeneous multivariate time-series for learning meaningful patterns. The proposed approach allows for mixed time-series -- containing both pattern and non-pattern data -- such as for imprecise matches, outliers, stretching and global translating of patterns instances in time. We present the early results of our approach in the context of monitoring the health status of a person at home. The purpose is to build a behavioral profile of a person by analyzing the time variations of several quantitative or qualitative parameters recorded through a provision of sensors installed in the home

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

FAME: Face Association through Model Evolution

Author: Duygulu Pinar
Golge Eren
Publication venue
Publication date: 10/07/2014
Field of study

We attack the problem of learning face models for public faces from weakly-labelled images collected from web through querying a name. The data is very noisy even after face detection, with several irrelevant faces corresponding to other people. We propose a novel method, Face Association through Model Evolution (FAME), that is able to prune the data in an iterative way, for the face models associated to a name to evolve. The idea is based on capturing discriminativeness and representativeness of each instance and eliminating the outliers. The final models are used to classify faces on novel datasets with possibly different characteristics. On benchmark datasets, our results are comparable to or better than state-of-the-art studies for the task of face identification.Comment: Draft version of the stud

arXiv.org e-Print Archive

Crossref

Interpretable Aircraft Engine Diagnostic via Expert Indicator Aggregation

Author: Cottrell Marie
Lacaille Jérôme
Rabenoro Tsirizo
Rossi Fabrice
Publication venue
Publication date: 01/10/2014
Field of study

Detecting early signs of failures (anomalies) in complex systems is one of the main goal of preventive maintenance. It allows in particular to avoid actual failures by (re)scheduling maintenance operations in a way that optimizes maintenance costs. Aircraft engine health monitoring is one representative example of a field in which anomaly detection is crucial. Manufacturers collect large amount of engine related data during flights which are used, among other applications, to detect anomalies. This article introduces and studies a generic methodology that allows one to build automatic early signs of anomaly detection in a way that builds upon human expertise and that remains understandable by human operators who make the final maintenance decision. The main idea of the method is to generate a very large number of binary indicators based on parametric anomaly scores designed by experts, complemented by simple aggregations of those scores. A feature selection method is used to keep only the most discriminant indicators which are used as inputs of a Naive Bayes classifier. This give an interpretable classifier based on interpretable anomaly detectors whose parameters have been optimized indirectly by the selection process. The proposed methodology is evaluated on simulated data designed to reproduce some of the anomaly types observed in real world engines.Comment: arXiv admin note: substantial text overlap with arXiv:1408.6214, arXiv:1409.4747, arXiv:1407.088

arXiv.org e-Print Archive

HAL-Paris1

Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground

Author: A Borji
A Borji
A Borji
GX Zhang
J Wang
L Wang
MM Cheng
MM Cheng
MM Cheng
Olga Russakovsky
S Lazebnik
T Chen
T Chen
Tsung-Yi Lin
X Li
Y Wei
Publication venue
Publication date: 01/01/2018
Field of study

We provide a comprehensive evaluation of salient object detection (SOD) models. Our analysis identifies a serious design bias of existing SOD datasets which assumes that each image contains at least one clearly outstanding salient object in low clutter. The design bias has led to a saturated high performance for state-of-the-art SOD models when evaluated on existing datasets. The models, however, still perform far from being satisfactory when applied to real-world daily scenes. Based on our analyses, we first identify 7 crucial aspects that a comprehensive and balanced dataset should fulfill. Then, we propose a new high quality dataset and update the previous saliency benchmark. Specifically, our SOC (Salient Objects in Clutter) dataset, includes images with salient and non-salient objects from daily object categories. Beyond object category annotations, each salient image is accompanied by attributes that reflect common challenges in real-world scenes. Finally, we report attribute-based performance assessment on our dataset.Comment: ECCV 201

arXiv.org e-Print Archive

Crossref

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)