8,253 research outputs found
Out-of-sample generalizations for supervised manifold learning for classification
Supervised manifold learning methods for data classification map data samples
residing in a high-dimensional ambient space to a lower-dimensional domain in a
structure-preserving way, while enhancing the separation between different
classes in the learned embedding. Most nonlinear supervised manifold learning
methods compute the embedding of the manifolds only at the initially available
training points, while the generalization of the embedding to novel points,
known as the out-of-sample extension problem in manifold learning, becomes
especially important in classification applications. In this work, we propose a
semi-supervised method for building an interpolation function that provides an
out-of-sample extension for general supervised manifold learning algorithms
studied in the context of classification. The proposed algorithm computes a
radial basis function (RBF) interpolator that minimizes an objective function
consisting of the total embedding error of unlabeled test samples, defined as
their distance to the embeddings of the manifolds of their own class, as well
as a regularization term that controls the smoothness of the interpolation
function in a direction-dependent way. The class labels of test data and the
interpolation function parameters are estimated jointly with a progressive
procedure. Experimental results on face and object images demonstrate the
potential of the proposed out-of-sample extension algorithm for the
classification of manifold-modeled data sets
Spatial support vector regression to detect silent errors in the exascale era
As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs) or silent errors are one of the major sources that corrupt the executionresults of HPC applications without being detected. In this work, we explore a low-memory-overhead SDC detector, by leveraging epsilon-insensitive support vector machine regression, to detect SDCs that occur in HPC applications that can be characterized by an impact error bound. The key contributions are three fold. (1) Our design takes spatialfeatures (i.e., neighbouring data values for each data point in a snapshot) into training data, such that little memory overhead (less than 1%) is introduced. (2) We provide an in-depth study on the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show thatour detector can achieve the detection sensitivity (i.e., recall) up to 99% yet suffer a less than 1% of false positive rate for most cases. Our detector incurs low performance overhead, 5% on average, for all benchmarks studied in the paper. Compared with other state-of-the-art techniques, our detector exhibits the best tradeoff considering the detection ability and overheads.This work was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing
Research Program, under Contract DE-AC02-06CH11357, by FI-DGR 2013 scholarship, by HiPEAC PhD Collaboration
Grant, the European Community’s Seventh Framework Programme [FP7/2007-2013] under the Mont-blanc 2 Project (www.montblanc-project.eu), grant agreement no. 610402, and TIN2015-65316-P.Peer ReviewedPostprint (author's final draft
Separable Dual Space Gaussian Pseudo-potentials
We present pseudo-potential coefficients for the first two rows of the
periodic table. The pseudo potential is of a novel analytic form, that gives
optimal efficiency in numerical calculations using plane waves as basis set. At
most 7 coefficients are necessary to specify its analytic form. It is separable
and has optimal decay properties in both real and Fourier space. Because of
this property, the application of the nonlocal part of the pseudo-potential to
a wave-function can be done in an efficient way on a grid in real space. Real
space integration is much faster for large systems than ordinary multiplication
in Fourier space since it shows only quadratic scaling with respect to the size
of the system. We systematically verify the high accuracy of these
pseudo-potentials by extensive atomic and molecular test calculations.Comment: 16 pages, 4 postscript figure
- …