A Better Alternative to Piecewise Linear Time Series Segmentation
Time series are difficult to monitor, summarize and predict. Segmentation
organizes time series into few intervals having uniform characteristics
(flatness, linearity, modality, monotonicity and so on). For scalability, we
require fast linear time algorithms. The popular piecewise linear model can
determine where the data goes up or down and at what rate. Unfortunately, when
the data does not follow a linear model, the computation of the local slope
creates overfitting. We propose an adaptive time series model where the
polynomial degree of each interval varies (constant, linear and so on). Given a
number of regressors, the cost of each interval is its polynomial degree:
constant intervals cost 1 regressor, linear intervals cost 2 regressors, and so
on. Our goal is to minimize the Euclidean (l_2) error for a given model
complexity. Experimentally, we investigate the model where intervals can be
either constant or linear. Over synthetic random walks, historical stock market
prices, and electrocardiograms, the adaptive model provides a more accurate
segmentation than the piecewise linear model without increasing the
cross-validation error or the running time, while providing a richer vocabulary
to applications. Implementation issues, such as numerical stability and
real-world performance, are discussed. Comment: to appear in SIAM Data Mining 200
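The adaptive model above admits a compact dynamic-programming formulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: each interval may be constant (1 regressor) or linear (2 regressors), and we minimize total squared (l_2) error subject to a regressor budget. For clarity this brute-force search over split points is quadratic in the series length, whereas the paper requires linear-time algorithms; all function names are illustrative.

```python
import numpy as np

def interval_cost(y, degree):
    """Squared (l2) error of a least-squares polynomial fit of given degree."""
    x = np.arange(len(y))
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    return float(resid @ resid)

def adaptive_segmentation(y, budget):
    """DP over split points: each interval is constant (costs 1 regressor)
    or linear (costs 2); minimize total l2 error within the budget."""
    n = len(y)
    INF = float("inf")
    # best[i][k] = minimum error covering y[:i] using exactly k regressors
    best = [[INF] * (budget + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    choice = {}
    for i in range(1, n + 1):
        for j in range(i):
            seg = y[j:i]
            for deg, cost in ((0, 1), (1, 2)):
                if len(seg) <= deg:
                    continue  # need deg+1 points to fit degree deg
                err = interval_cost(seg, deg)
                for k in range(cost, budget + 1):
                    cand = best[j][k - cost] + err
                    if cand < best[i][k]:
                        best[i][k] = cand
                        choice[(i, k)] = (j, deg)
    # pick the cheapest regressor count, then walk back through the choices
    k = min(range(budget + 1), key=lambda kk: best[n][kk])
    total_err, segments, i = best[n][k], [], n
    while i > 0:
        j, deg = choice[(i, k)]
        segments.append((j, i, "constant" if deg == 0 else "linear"))
        k -= deg + 1
        i = j
    return segments[::-1], total_err
```

On a series that is flat and then rises linearly, a budget of three regressors suffices for one constant and one linear interval, illustrating the mixed vocabulary the abstract describes.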
Augmented Sparse Reconstruction of Protein Signaling Networks
The problem of reconstructing and identifying intracellular protein signaling
and biochemical networks is of critical importance in biology today. We sought
to develop a mathematical approach to this problem using, as a test case, one
of the most well-studied and clinically important signaling networks in biology
today, the epidermal growth factor receptor (EGFR) driven signaling cascade.
More specifically, we suggest a method, augmented sparse reconstruction, for
the identification of links among nodes of ordinary differential equation (ODE)
networks from a small set of trajectories with different initial conditions.
Our method builds a system of representation by using a collection of integrals
of all given trajectories and by attenuating blocks of terms in the
representation itself. The system of representation is then augmented with
random vectors, and minimization of the 1-norm is used to find sparse
representations for the dynamical interactions of each node. Augmentation by
random vectors is crucial, since sparsity alone is not able to handle the large
error-in-variables in the representation. Augmented sparse reconstruction
allows us to consider potentially very large spaces of models and is able to
detect with high accuracy the few relevant links among nodes, even when
moderate noise is added to the measured trajectories. After showing the
performance of our method on a model of the EGFR protein network, we sketch
briefly the potential future therapeutic applications of this approach. Comment: 24 pages, 6 figures
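The integral-plus-l1 idea can be illustrated in miniature. The sketch below is an assumption-laden toy, not the authors' code: each node's increment x_i(t) - x_i(0) is represented over trapezoid-rule integrals of candidate terms, the dictionary is augmented with random columns, and an l1-regularized fit (here plain iterative soft-thresholding, ISTA) selects the sparse links. Function names, the candidate-term interface, and the ISTA solver are all illustrative choices.

```python
import numpy as np

def ista(M, b, lam, iters=2000):
    """Minimize 0.5*||M w - b||^2 + lam*||w||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(M, 2) ** 2                 # Lipschitz constant of the gradient
    w = np.zeros(M.shape[1])
    for _ in range(iters):
        w = w - M.T @ (M @ w - b) / L             # gradient step
        w = np.sign(w) * np.maximum(np.abs(w) - lam / L, 0.0)  # shrink toward zero
    return w

def recover_links(t, X, terms, n_random=10, lam=0.01, seed=0):
    """For each node i, express x_i(t) - x_i(0) as a sparse combination of
    integrated candidate terms; the dictionary is augmented with random
    columns before the l1 fit, in the spirit of augmented sparse reconstruction."""
    def cumtrapz(g):  # cumulative trapezoid-rule integral along the trajectory
        return np.concatenate(([0.0], np.cumsum(np.diff(t) * 0.5 * (g[1:] + g[:-1]))))
    A = np.column_stack([cumtrapz(f(X)) for f in terms])
    rng = np.random.default_rng(seed)
    M = np.column_stack([A, rng.standard_normal((len(t), n_random))])
    M = M / np.linalg.norm(M, axis=0)             # unit-norm columns
    W = np.column_stack([ista(M, X[:, i] - X[0, i], lam) for i in range(X.shape[1])])
    return W[: len(terms)]                        # coefficients on the candidate terms
```

On the two-node linear system dx1/dt = -x1, dx2/dt = x1, the fit assigns a large coefficient to the true x1 -> x2 link and a negligible one to the absent x2 -> x2 link.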
Implicitly Constrained Semi-Supervised Least Squares Classification
We introduce a novel semi-supervised version of the least squares classifier.
This implicitly constrained least squares (ICLS) classifier minimizes the
squared loss on the labeled data among the set of parameters implied by all
possible labelings of the unlabeled data. Unlike other discriminative
semi-supervised methods, our approach does not introduce explicit additional
assumptions into the objective function, but leverages implicit assumptions
already present in the choice of the supervised least squares classifier. We
show this approach can be formulated as a quadratic programming problem and its
solution can be found using a simple gradient descent procedure. We prove that,
in a certain sense, our method never leads to performance worse than the
supervised classifier. Experimental results corroborate this theoretical result
in the multidimensional case on benchmark datasets, also in terms of the error
rate. Comment: 12 pages, 2 figures, 1 table. The Fourteenth International Symposium
on Intelligent Data Analysis (2015), Saint-Etienne, France
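The constrained search the abstract describes can be sketched as projected gradient descent over soft labels for the unlabeled points: each labeling implies a supervised least-squares solution on all data, and we pick the labeling whose implied classifier has the smallest squared loss on the labeled data. This is a minimal sketch of that idea, assuming 0/1 targets and a pseudoinverse solve; names and step sizes are illustrative, not the authors' code.

```python
import numpy as np

def icls_fit(Xl, yl, Xu, steps=500, lr=0.05):
    """Implicitly constrained least squares (sketch): search over soft labels
    u in [0,1]^m for the unlabeled points, where each u implies the supervised
    least-squares solution on all data, and minimize the labeled squared loss."""
    X = np.vstack([Xl, Xu])
    P = np.linalg.pinv(X)                    # maps stacked targets [yl; u] to weights
    Pl, Pu = P[:, : len(yl)], P[:, len(yl):]
    base = Pl @ yl                           # contribution of the fixed labels
    u = np.full(len(Xu), 0.5)                # start from uninformative soft labels
    for _ in range(steps):
        w = base + Pu @ u                    # weights implied by current labeling
        r = Xl @ w - yl                      # residual on the labeled data only
        grad = Pu.T @ (Xl.T @ r)             # d/du of 0.5 * ||Xl w(u) - yl||^2
        u = np.clip(u - lr * grad, 0.0, 1.0) # project back onto [0,1]^m
    return base + Pu @ u
```

With 0/1 targets, predictions are thresholded at 0.5; on a small separable example the implied classifier keeps the labeled points correctly classified, consistent with the abstract's safety guarantee.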
Robust optimization in simulation: Taguchi and response surface methodology
Optimization of simulated systems is tackled by many methods, but most methods assume known environments. This article, however, develops a 'robust' methodology for uncertain environments. This methodology uses Taguchi's view of the uncertain world, but replaces his statistical techniques with Response Surface Methodology (RSM). George Box originated RSM, and Douglas Montgomery recently extended RSM to robust optimization of real (non-simulated) systems. We combine Taguchi's view with RSM for simulated systems. We illustrate the resulting methodology through classic Economic Order Quantity (EOQ) inventory models, which demonstrate that robust optimization may require order quantities that differ from the classic EOQ.
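The EOQ illustration can be sketched numerically. Assuming the textbook cost model C(Q) = aD/Q + hQ/2 (setup cost a, demand rate D, holding cost h), the classic EOQ is Q* = sqrt(2aD/h). A Taguchi-style robust choice instead scores each Q by mean cost plus a multiple of the cost's standard deviation over demand scenarios; because the variance term falls with Q, the robust order quantity exceeds the classic EOQ at mean demand. The scenario set, the mean-plus-std criterion, and the grid search below are illustrative simplifications of the article's RSM machinery.

```python
import math

def eoq(a, D, h):
    """Classic Economic Order Quantity: setup cost a, demand rate D, holding cost h."""
    return math.sqrt(2.0 * a * D / h)

def total_cost(Q, a, D, h):
    """Annual setup plus holding cost for order quantity Q."""
    return a * D / Q + h * Q / 2.0

def robust_q(a, D_scenarios, h, weight=1.0):
    """Taguchi-style robust choice: minimize mean cost plus `weight` times the
    standard deviation of cost over the demand scenarios (simple grid search)."""
    lo = eoq(a, min(D_scenarios), h)
    hi = 2.0 * eoq(a, max(D_scenarios), h)
    grid = [lo + i * (hi - lo) / 400 for i in range(401)]
    def score(Q):
        costs = [total_cost(Q, a, D, h) for D in D_scenarios]
        m = sum(costs) / len(costs)
        s = math.sqrt(sum((c - m) ** 2 for c in costs) / len(costs))
        return m + weight * s
    return min(grid, key=score)
```

With a = 100, h = 2 and demand scenarios centered on 1000, the robust order quantity comes out noticeably larger than the classic EOQ of about 316, mirroring the abstract's conclusion that robust optimization may require different order quantities.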