Search CORE

4,017 research outputs found

Spatial support vector regression to detect silent errors in the exascale era

Author: Balaprakash Prasanna
Bautista Gomez Leonardo
Cappello Franck
Cristal Kestelman Adrián
Di Sheng
Labarta Mancho Jesús José
Subasi Omer
Unsal Osman Sabri
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs) or silent errors are one of the major sources that corrupt the executionresults of HPC applications without being detected. In this work, we explore a low-memory-overhead SDC detector, by leveraging epsilon-insensitive support vector machine regression, to detect SDCs that occur in HPC applications that can be characterized by an impact error bound. The key contributions are three fold. (1) Our design takes spatialfeatures (i.e., neighbouring data values for each data point in a snapshot) into training data, such that little memory overhead (less than 1%) is introduced. (2) We provide an in-depth study on the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show thatour detector can achieve the detection sensitivity (i.e., recall) up to 99% yet suffer a less than 1% of false positive rate for most cases. Our detector incurs low performance overhead, 5% on average, for all benchmarks studied in the paper. Compared with other state-of-the-art techniques, our detector exhibits the best tradeoff considering the detection ability and overheads.This work was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research Program, under Contract DE-AC02-06CH11357, by FI-DGR 2013 scholarship, by HiPEAC PhD Collaboration Grant, the European Community’s Seventh Framework Programme [FP7/2007-2013] under the Mont-blanc 2 Project (www.montblanc-project.eu), grant agreement no. 610402, and TIN2015-65316-P.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC

Automatic Environmental Sound Recognition: Performance versus Computational Cost

Author: Krstulovic Sacha
Plumbley Mark D.
Sigtia Siddharth
Stark Adam M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/07/2016
Field of study

In the context of the Internet of Things (IoT), sound sensing applications are required to run on embedded platforms where notions of product pricing and form factor impose hard constraints on the available computing power. Whereas Automatic Environmental Sound Recognition (AESR) algorithms are most often developed with limited consideration for computational cost, this article seeks which AESR algorithm can make the most of a limited amount of computing power by comparing the sound classification performance em as a function of its computational cost. Results suggest that Deep Neural Networks yield the best ratio of sound classification accuracy across a range of computational costs, while Gaussian Mixture Models offer a reasonable accuracy at a consistently small cost, and Support Vector Machines stand between both in terms of compromise between accuracy and computational cost

arXiv.org e-Print Archive

Surrey Research Insight

Hyperparameter Importance Across Datasets

Author: Bergstra J.
Bonilla E. V.
Brazdil P.
Demvsar J.
Feurer M.
Jamieson K.
Joachims T.
Klein A.
Li L.
Loshchilov I.
Sobol I. M.
van Rijn J. N.
van Rijn J. N.
Wistuba M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/05/2018
Field of study

With the advent of automated machine learning, automated hyperparameter optimization methods are by now routinely used in data mining. However, this progress is not yet matched by equal progress on automatic analyses that yield information beyond performance-optimizing hyperparameter settings. In this work, we aim to answer the following two questions: Given an algorithm, what are generally its most important hyperparameters, and what are typically good values for these? We present methodology and a framework to answer these questions based on meta-learning across many datasets. We apply this methodology using the experimental meta-data available on OpenML to determine the most important hyperparameters of support vector machines, random forests and Adaboost, and to infer priors for all their hyperparameters. The results, obtained fully automatically, provide a quantitative basis to focus efforts in both manual algorithm design and in automated hyperparameter optimization. The conducted experiments confirm that the hyperparameters selected by the proposed method are indeed the most important ones and that the obtained priors also lead to statistically significant improvements in hyperparameter optimization.Comment: \c{opyright} 2018. Copyright is held by the owner/author(s). Publication rights licensed to ACM. This is the author's version of the work. It is posted here for your personal use, not for redistribution. The definitive Version of Record was published in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Minin

arXiv.org e-Print Archive

Crossref

Abusive Language Detection in Online Conversations by Combining Content-and Graph-based Features

Author: Cecillon Noé
Dufour Richard
Labatut Vincent
Linarès Georges
Publication venue: 'Frontiers Media SA'
Publication date: 20/05/2019
Field of study

In recent years, online social networks have allowed worldwide users to meet and discuss. As guarantors of these communities, the administrators of these platforms must prevent users from adopting inappropriate behaviors. This verification task, mainly done by humans, is more and more difficult due to the ever growing amount of messages to check. Methods have been proposed to automatize this moderation process, mainly by providing approaches based on the textual content of the exchanged messages. Recent work has also shown that characteristics derived from the structure of conversations, in the form of conversational graphs, can help detecting these abusive messages. In this paper, we propose to take advantage of both sources of information by proposing fusion methods integrating content-and graph-based features. Our experiments on raw chat logs show that the content of the messages, but also of their dynamics within a conversation contain partially complementary information, allowing performance improvements on an abusive message classification task with a final F-measure of 93.26%

arXiv.org e-Print Archive

Hal-Diderot

Training Support Vector Machines Using Frank-Wolfe Optimization Methods

Author: Frandi Emanuele
Gasparo Maria Grazia
Lodi Stefano
Nanculef Ricardo
Sartori Claudio
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 04/12/2012
Field of study

Training a Support Vector Machine (SVM) requires the solution of a quadratic programming problem (QP) whose computational complexity becomes prohibitively expensive for large scale datasets. Traditional optimization methods cannot be directly applied in these cases, mainly due to memory restrictions. By adopting a slightly different objective function and under mild conditions on the kernel used within the model, efficient algorithms to train SVMs have been devised under the name of Core Vector Machines (CVMs). This framework exploits the equivalence of the resulting learning problem with the task of building a Minimal Enclosing Ball (MEB) problem in a feature space, where data is implicitly embedded by a kernel function. In this paper, we improve on the CVM approach by proposing two novel methods to build SVMs based on the Frank-Wolfe algorithm, recently revisited as a fast method to approximate the solution of a MEB problem. In contrast to CVMs, our algorithms do not require to compute the solutions of a sequence of increasingly complex QPs and are defined by using only analytic optimization steps. Experiments on a large collection of datasets show that our methods scale better than CVMs in most cases, sometimes at the price of a slightly lower accuracy. As CVMs, the proposed methods can be easily extended to machine learning problems other than binary classification. However, effective classifiers are also obtained using kernels which do not satisfy the condition required by CVMs and can thus be used for a wider set of problems

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Use of Machine Learning for Partial Discharge Discrimination

Author: Hao L
Lewin P L
Swingler S G
Publication venue
Publication date: 26/05/2009
Field of study

Partial discharge (PD) measurements are an important tool for assessing the condition of power equipment. Different sources of PD have different effects on the insulation performance of power apparatus. Therefore, discrimination between PD sources is of great interest to both system utilities and equipment manufacturers. This paper investigates the use of a wide bandwidth PD on-line measurement system to facilitate automatic PD source identification. Three artificial PD models were used to simulate typical PD sources which may exist within power systems. Wavelet analysis was applied to pre-process the obtained measurement data. This data was then processed using correlation analysis to cluster the discharges into different groups. A machine learning technique, namely the support vector machine (SVM) was then used to identify between the different PD sources. The SVM is trained to differentiate between the inherent features of each discharge source signal. Laboratory experiments indicate that this approach is applicable for use with field measurement data

Southampton (e-Prints Soton)

Predictive modelling of survival and length of stay in critically ill patients using sequential organ failure scores

Author: Colpaert Kirsten
Couckuyt Ivo
De Turck Filip
Decruyenaere Johan
Dhaene Tom
Gadeyne Bram
Houthooft Rein
Ongenae Femke
Ruyssinck Joeri
Stijven Sean
van der Herten Joachim
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015
Field of study

Ghent University Academic Bibliography