2,291 research outputs found

Two Procedures for Robust Monitoring of Probability Distributions of Economic Data Streams induced by Depth Functions

Author: Kosiorowski Daniel
Publication date: 17/01/2015

Data streams (streaming data) consist of transiently observed, multidimensional data sequences that evolve in time and challenge our computational and/or inferential capabilities. In this paper we propose user-friendly approaches, based on depth functions, for robust monitoring of selected properties of the unconditional and conditional distributions of a stream. Our proposals are robust to a small fraction of outliers and/or inliers, yet remain sensitive to a regime change of the stream. Their implementations are available in our free R package DepthProc. Comment: Operations Research and Decisions, vol. 25, No. 1, 201

arXiv.org e-Print Archive

Directory of Open Access Journals

Spatial support vector regression to detect silent errors in the exascale era

Author: Balaprakash Prasanna
Bautista Gomez Leonardo
Cappello Franck
Cristal Kestelman Adrián
Di Sheng
Labarta Mancho Jesús José
Subasi Omer
Unsal Osman Sabri
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2016

As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs), or silent errors, are among the major sources of corrupted execution results in HPC applications, and they occur without being detected. In this work, we explore a low-memory-overhead SDC detector that leverages epsilon-insensitive support vector machine regression to detect SDCs in HPC applications that can be characterized by an impact error bound. The key contributions are threefold. (1) Our design uses spatial features (i.e., neighbouring data values for each data point in a snapshot) as training data, so that little memory overhead (less than 1%) is introduced. (2) We provide an in-depth study of the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that our detector can achieve a detection sensitivity (i.e., recall) of up to 99% with a false positive rate below 1% in most cases. Our detector incurs low performance overhead, 5% on average, for all benchmarks studied in the paper. Compared with other state-of-the-art techniques, our detector exhibits the best tradeoff between detection ability and overheads. This work was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research Program, under Contract DE-AC02-06CH11357, by FI-DGR 2013 scholarship, by HiPEAC PhD Collaboration Grant, the European Community’s Seventh Framework Programme [FP7/2007-2013] under the Mont-blanc 2 Project (www.montblanc-project.eu), grant agreement no. 610402, and TIN2015-65316-P. Peer Reviewed. Postprint (author's final draft)
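
The idea of checking each data point against its spatial neighbours can be illustrated with a minimal sketch. This is not the authors' detector: the abstract describes a trained epsilon-insensitive support vector regression model, whereas the sketch below swaps in a trivial predictor (the mean of the two adjacent values in a one-dimensional snapshot). The function name `flag_suspects` and the parameter `error_bound` are hypothetical names chosen for illustration.

```python
def flag_suspects(snapshot, error_bound):
    """Return indices of interior points that deviate from a
    neighbour-based prediction by more than error_bound.

    Assumption: the predictor is the mean of the two adjacent values,
    a stand-in for the trained regression model in the abstract.
    """
    suspects = []
    for i in range(1, len(snapshot) - 1):
        predicted = 0.5 * (snapshot[i - 1] + snapshot[i + 1])
        if abs(snapshot[i] - predicted) > error_bound:
            suspects.append(i)
    return suspects


# A smooth snapshot with one injected corruption at index 3.
clean = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
corrupted = list(clean)
corrupted[3] += 8.0

print(flag_suspects(clean, error_bound=5.0))
print(flag_suspects(corrupted, error_bound=5.0))
```

With this stand-in predictor, the neighbours of a corrupted point see only half of the injected deviation, so the bound has to sit between that half and the full deviation for the corrupted index alone to be flagged. This sensitivity to the threshold is one reason a detection range would need tuning.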

Mathematical programming for piecewise linear regression analysis

Author: Liu S
Papageorgiou LG
Tsoka S
Yang L
Publication venue: PERGAMON-ELSEVIER SCIENCE LTD
Publication date: 01/02/2016

In data mining, regression analysis is a computational tool that predicts continuous output variables from a number of independent input variables by approximating their complex inner relationship. A large number of methods have been successfully proposed, based on various methodologies, including linear regression, support vector regression, neural networks, piece-wise regression, etc. In terms of piece-wise regression, the existing methods in the literature are usually restricted to problems of very small scale, due to their inherent non-linear nature. In this work, a more efficient piece-wise linear regression method is introduced, based on a novel integer linear programming formulation. The proposed method partitions one input variable into multiple mutually exclusive segments, and fits one multivariate linear regression function per segment to minimise the total absolute error. Assuming both the single partition feature and the number of regions are known, a mixed integer linear model is proposed to simultaneously determine the locations of multiple break-points and the regression coefficients for each segment. Furthermore, an efficient heuristic procedure is presented to identify the key partition feature and the final number of break-points. Seven real-world problems covering several application domains have been used to demonstrate the efficiency of our proposed method. It is shown that our proposed piece-wise regression method can be solved to global optimality for datasets of thousands of samples, and that it consistently achieves higher prediction accuracy than a number of state-of-the-art regression methods. Another advantage of the proposed method is that the learned model can be conveniently expressed as a small number of if-then rules that are easily interpretable. Overall, this work proposes an efficient rule-based multivariate regression method based on piece-wise functions and achieves better prediction performance than state-of-the-art approaches.
This novel method can benefit expert systems in various applications by automatically acquiring knowledge from databases to improve the quality of knowledge base
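
A minimal sketch of the segment-and-fit idea, under simplifying assumptions: one input variable, exactly one break-point, and exhaustive enumeration of candidate splits instead of the mixed integer linear model described in the abstract. Each segment is fitted by ordinary least squares and candidates are scored by total absolute error, so this is not the authors' formulation. All function names are hypothetical.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y  # all x equal: fall back to a constant
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def abs_error(xs, ys, a, b):
    """Total absolute error of the line y = a*x + b on the points."""
    return sum(abs(y - (a * x + b)) for x, y in zip(xs, ys))


def best_single_breakpoint(xs, ys, min_size=2):
    """Enumerate every split of the x-sorted data into two segments and
    return (total_abs_error, breakpoint, left_line, right_line)."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = None
    for i in range(min_size, len(sx) - min_size + 1):
        if sx[i - 1] == sx[i]:
            continue  # no break-point can separate equal x values
        left = fit_line(sx[:i], sy[:i])
        right = fit_line(sx[i:], sy[i:])
        err = abs_error(sx[:i], sy[:i], *left) + abs_error(sx[i:], sy[i:], *right)
        if best is None or err < best[0]:
            best = (err, (sx[i - 1] + sx[i]) / 2, left, right)
    return best


xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 1, 2, 3, 4, 10, 8, 6, 4, 2]  # y = x, then y = 20 - 2x
err, bp, left, right = best_single_breakpoint(xs, ys)
print(f"if x <= {bp}: y = {left[0]}*x + {left[1]}")
print(f"else:        y = {right[0]}*x + {right[1]}")
```

The two printed lines show how a fitted model of this kind reads as a pair of if-then rules. Enumeration scales poorly with more break-points or partition features, which is where an optimisation formulation would be expected to help.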

King's Research Portal

Change-point Problem and Regression: An Annotated Bibliography

Author: Asgharian Masoud
Khodadadi Ahmad
Publication venue: Collection of Biostatistics Research Archive
Publication date: 12/11/2008

The problems of identifying changes at unknown times and of estimating the location of changes in stochastic processes are referred to as the change-point problem or, in the Eastern literature, as disorder. The change-point problem, first introduced in the quality control context, has since developed into a fundamental problem in the areas of statistical control theory, stationarity of a stochastic process, estimation of the current position of a time series, testing and estimation of change in the patterns of a regression model, and most recently in the comparison and matching of DNA sequences in microarray data analysis. Numerous methodological approaches have been implemented in examining change-point models. Maximum-likelihood estimation, Bayesian estimation, isotonic regression, piecewise regression, quasi-likelihood and non-parametric regression are among the methods which have been applied to resolving challenges in change-point problems. Grid-searching approaches have also been used to examine the change-point problem. Statistical analysis of change-point problems depends on the method of data collection. If the data collection is ongoing until some random time, then the appropriate statistical procedure is called sequential. If, however, a large finite set of data is collected with the purpose of determining if at least one change-point occurred, then this may be referred to as non-sequential. Not surprisingly, both the former and the latter have a rich literature, with much of the earlier work focusing on sequential methods inspired by applications in quality control for industrial processes. In the regression literature, the change-point model is also referred to as two- or multiple-phase regression, switching regression, segmented regression, two-stage least squares (Shaban, 1980), or broken-line regression. The area of the change-point problem has been the subject of intensive research in the past half-century.
The subject has evolved considerably and found applications in many different areas. It seems all but impossible to summarize all of the research carried out over the past 50 years on the change-point problem. We have therefore confined ourselves to those articles on change-point problems which pertain to regression. The important branch of sequential procedures in change-point problems has been left out entirely. We refer readers to the seminal review papers by Lai (1995, 2001). The so-called structural change models, which occupy a considerable portion of the research in the area of change-point, particularly among econometricians, have not been fully considered. We refer the reader to Perron (2005) for an updated review in this area. Articles on change-point in time series are considered only if the methodologies presented in the paper pertain to regression analysis

Collection Of Biostatistics Research Archive

Certifying Bimanual RRT Motion Plans in a Second

Author: Amice Alexandre
Tedrake Russ
Werner Peter
Publication date: 25/10/2023

We present an efficient method for certifying non-collision for piecewise-polynomial motion plans in algebraic reparametrizations of configuration space. Such motion plans include those generated by popular randomized methods including RRTs and PRMs, as well as those generated by many methods in trajectory optimization. Based on Sums-of-Squares optimization, our method provides exact, rigorous certificates of non-collision; it can never falsely claim that a motion plan containing collisions is collision-free. We demonstrate that our formulation is practical for real-world deployment, certifying the safety of a twelve-degree-of-freedom motion plan in just over a second. Moreover, the method is capable of distinguishing a safe motion plan from an unsafe one when the two differ by only millimeters. Comment: 7 pages, 5 figures, 1 table

arXiv.org e-Print Archive

A Parametric Non-Convex Decomposition Algorithm for Real-Time and Distributed NMPC

Author: Hours Jean-Hubert
Jones Colin N.
Publication date: 29/08/2014

A novel decomposition scheme to solve parametric non-convex programs as they arise in Nonlinear Model Predictive Control (NMPC) is presented. It consists of a fixed number of alternating proximal gradient steps and a dual update per time step. Hence, the proposed approach is attractive in a real-time distributed context. Assuming that the Nonlinear Program (NLP) is semi-algebraic and that its critical points are strongly regular, contraction of the sequence of primal-dual iterates is proven, implying stability of the sub-optimality error, under some mild assumptions. Moreover, it is shown that the performance of the optimality-tracking scheme can be enhanced via a continuation technique. The efficacy of the proposed decomposition method is demonstrated by solving a centralised NMPC problem to control a DC motor and a distributed NMPC program for collaborative tracking of unicycles, both within a real-time framework. Furthermore, an analysis of the sub-optimality error as a function of the sampling period is proposed given a fixed computational power. Comment: 16 pages, 9 figures

arXiv.org e-Print Archive

CiteSeerX