110,486 research outputs found

Positive semi-definite embedding for dimensionality reduction and out-of-sample extensions

Author: Aspeel Antoine
Delvenne Jean-Charles
Fanuel Michaël
Suykens Johan A. K.
Publication date: 06/10/2020

In machine learning or statistics, it is often desirable to reduce the dimensionality of a sample of data points in a high dimensional space $\mathbb{R}^d$. This paper introduces a dimensionality reduction method where the embedding coordinates are the eigenvectors of a positive semi-definite kernel obtained as the solution of an infinite dimensional analogue of a semi-definite program. This embedding is adaptive and non-linear. A main feature of our approach is the existence of a non-linear out-of-sample extension formula of the embedding coordinates, called a projected Nyström approximation. This extrapolation formula yields an extension of the kernel matrix to a data-dependent Mercer kernel function. Our empirical results indicate that this embedding method is more robust with respect to the influence of outliers, compared with a spectral embedding method. Comment: 16 pages, 5 figures. Improved presentatio
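For orientation, the idea of an out-of-sample extension can be illustrated with the classical Nyström formula for a kernel spectral embedding. This is a minimal sketch and not the projected Nyström approximation of the paper: the RBF kernel, the `gamma` parameter, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared distances, clipped at zero against round-off.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_embedding(X, n_components=2, gamma=1.0):
    # Embedding coordinates = leading eigenvectors of the kernel matrix.
    K = rbf_kernel(X, X, gamma)
    eigvals, eigvecs = np.linalg.eigh(K)          # ascending order
    idx = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[idx], eigvecs[:, idx]

def nystrom_extend(X_train, X_new, eigvals, eigvecs, gamma=1.0):
    # Classical Nystrom: phi_i(x) = (1 / lambda_i) * sum_j k(x, x_j) u_ij
    K_new = rbf_kernel(X_new, X_train, gamma)
    return K_new @ eigvecs / eigvals

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
lam, U = fit_embedding(X)
# Sanity check: extending the training points reproduces their coordinates.
U_ext = nystrom_extend(X, X, lam, U)
```

Evaluated on the training points, the formula returns the eigenvectors themselves, because K u_i = lambda_i u_i; for unseen points it extrapolates the same coordinates through the kernel.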

arXiv.org e-Print Archive

HAL Descartes

DIAL UCLouvain

Hal-Diderot

DROP: Dimensionality Reduction Optimization for Time Series

Author: Bailis Peter
Suri Sahaana
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019

Dimensionality reduction is a critical step in scaling machine learning pipelines. Principal component analysis (PCA) is a standard tool for dimensionality reduction, but performing PCA over a full dataset can be prohibitively expensive. As a result, theoretical work has studied the effectiveness of iterative, stochastic PCA methods that operate over data samples. However, termination conditions for stochastic PCA either execute for a predetermined number of iterations, or until convergence of the solution, frequently sampling too many or too few datapoints for end-to-end runtime improvements. We show how accounting for downstream analytics operations during DR via PCA allows stochastic methods to efficiently terminate after operating over small (e.g., 1%) subsamples of input data, reducing whole workload runtime. Leveraging this, we propose DROP, a DR optimizer that enables speedups of up to 5x over Singular-Value-Decomposition-based PCA techniques, and exceeds conventional approaches like FFT and PAA by up to 16x in end-to-end workloads

arXiv.org e-Print Archive

Crossref

Visualizing dimensionality reduction of systems biology data

Author: A Hyvaerinen
A Inselberg
A Saeed
Albert Pritzkau
Andreas Lehrmann
Aydin C. Polatkan
DH Jeong
DJ Lockhart
F Battke
GH Golub
H Abdi
H Hotelling
HF Kaiser
J Shendure
JB Tenenbaum
K Pearson
Kay Nieselt
KQ Weinberger
LK Saul
M Fontes
M Harrower
M Schena
Michael Huber
P Mannfolk
R Karbauskaite
R Tarjan
S Roweis
Z Zhang
Ö Altug-Teber
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/06/2012

One of the challenges in analyzing high-dimensional expression data is the detection of important biological signals. A common approach is to apply a dimension reduction method, such as principal component analysis. Typically, after application of such a method the data is projected and visualized in the new coordinate system, using scatter plots or profile plots. These methods provide good results if the data have certain properties which become visible in the new coordinate system and which were hard to detect in the original coordinate system. Often however, the application of only one method does not suffice to capture all important signals. Therefore several methods addressing different aspects of the data need to be applied. We have developed a framework for linear and non-linear dimension reduction methods within our visual analytics pipeline SpRay. This includes measures that assist the interpretation of the factorization result. Different visualizations of these measures can be combined with functional annotations that support the interpretation of the results. We show an application to high-resolution time series microarray data in the antibiotic-producing organism Streptomyces coelicolor as well as to microarray data measuring expression of cells with normal karyotype and cells with trisomies of human chromosomes 13 and 21

arXiv.org e-Print Archive

Crossref

Publikationsserver der Universität Tübingen

Reduced-order Description of Transient Instabilities and Computation of Finite-Time Lyapunov Exponents

Author: Arnold V. I.
George Haller
Hessam Babaee
Mohamad Farazmand
Themistoklis P. Sapsis
Publication venue: 'AIP Publishing'
Publication date: 20/04/2017

High-dimensional chaotic dynamical systems can exhibit strongly transient features. These are often associated with instabilities that have finite-time duration. Because of the finite-time character of these transient events, their detection through infinite-time methods, e.g., long-term averages, Lyapunov exponents or information about the statistical steady-state, is not possible. Here we utilize a recently developed framework, the Optimally Time-Dependent (OTD) modes, to extract a time-dependent subspace that spans the modes associated with transient features arising from finite-time instabilities. As the main result, we prove that the OTD modes, under appropriate conditions, converge exponentially fast to the eigendirections of the Cauchy-Green tensor associated with the most intense finite-time instabilities. Based on this observation, we develop a reduced-order method for the computation of finite-time Lyapunov exponents (FTLE) and vectors. In high-dimensional systems, the computational cost of the reduced-order method is orders of magnitude lower than the full FTLE computation. We demonstrate the validity of the theoretical findings on two numerical examples.
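As a reference point, the standard (full, not reduced-order) definition of the largest FTLE uses the Cauchy-Green tensor C = (∇F)^T ∇F of the flow map F over a time window T, with FTLE = ln(λ_max(C)) / (2|T|). The sketch below evaluates it for a linear saddle whose flow-map gradient is known in closed form; the function name and the test system are assumptions for illustration, and the OTD-based reduction of the paper is not reproduced here.

```python
import numpy as np

def ftle_from_flow_gradient(grad_F, T):
    # Cauchy-Green tensor of the flow map over the window of length T.
    C = grad_F.T @ grad_F
    # eigvalsh returns eigenvalues in ascending order; take the largest.
    lam_max = np.linalg.eigvalsh(C)[-1]
    return np.log(lam_max) / (2.0 * abs(T))

# Linear saddle dx/dt = x, dy/dt = -y: flow-map gradient is diag(e^T, e^-T).
T = 2.0
grad_F = np.diag([np.exp(T), np.exp(-T)])
ftle = ftle_from_flow_gradient(grad_F, T)
```

For this system the largest eigenvalue of C is e^(2T), so the FTLE equals the unstable growth rate 1 regardless of the window length.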

arXiv.org e-Print Archive

DSpace@MIT

Crossref

Higher derivative theories with constraints : Exorcising Ostrogradski's Ghost

Author: A. De Felice
Andrew J Tolley
C. Deffayet
E.A. Lim
Eugene A Lim
F. de Urries
G. Calcagni
H.A. Buchdahl
J. Govaerts
J. Maldacena
M. Henneaux
M. Ostrogradski
Matteo Fasiello
N. Barnaby
P.A.M. Dirac
R. Durrer
R. Kallosh
T. Biswas
T. Chiba
Tai-jun Chen
Publication venue: 'IOP Publishing'
Publication date: 26/02/2013

We prove that the linear instability in a non-degenerate higher derivative theory, the Ostrogradski instability, can only be removed by the addition of constraints if the original theory's phase space is reduced. Comment: 17 pages, no figures, version published in JCA

arXiv.org e-Print Archive

Crossref

Portsmouth University Research Portal (Pure)

A Multiscale Approach for Statistical Characterization of Functional Images

Author: Antoniadis Anestis
Bigot Jérémie
Von Sachs Rainer
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2009

Increasingly, scientific studies yield functional image data, in which the observed data consist of sets of curves recorded on the pixels of the image. Examples include temporal brain response intensities measured by fMRI and NMR frequency spectra measured at each pixel. This article presents a new methodology for improving the characterization of pixels in functional imaging, formulated as a spatial curve clustering problem. Our method operates on curves as a unit. It is nonparametric and involves multiple stages: (i) wavelet thresholding, aggregation, and Neyman truncation to effectively reduce dimensionality; (ii) clustering based on an extended EM algorithm; and (iii) multiscale penalized dyadic partitioning to create a spatial segmentation. We motivate the different stages with theoretical considerations and arguments, and illustrate the overall procedure on simulated and real datasets. Our method appears to offer substantial improvements over monoscale pixel-wise methods. An appendix giving theoretical justifications of the methodology, together with computer code, documentation, and the dataset, is available in the online supplements.
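The first ingredient of stage (i), wavelet thresholding of a single curve, can be sketched with an orthonormal Haar transform and hard thresholding. This is only an illustration of how thresholding reduces the number of coefficients describing a curve: the Haar basis, the threshold value, and the function names are assumptions, and the aggregation, Neyman truncation, clustering, and partitioning stages of the article are not reproduced.

```python
import numpy as np

def haar_forward(x):
    # Orthonormal Haar transform; len(x) must be a power of two.
    x = np.asarray(x, dtype=float)
    coeffs = []
    while x.size > 1:
        approx = (x[0::2] + x[1::2]) / np.sqrt(2)
        detail = (x[0::2] - x[1::2]) / np.sqrt(2)
        coeffs.append(detail)        # details, finest scale first
        x = approx
    coeffs.append(x)                 # final approximation coefficient
    return coeffs

def haar_inverse(coeffs):
    x = coeffs[-1]
    for detail in reversed(coeffs[:-1]):
        out = np.empty(2 * x.size)
        out[0::2] = (x + detail) / np.sqrt(2)
        out[1::2] = (x - detail) / np.sqrt(2)
        x = out
    return x

def hard_threshold(coeffs, t):
    # Zero out detail coefficients below t; keep the approximation.
    details = [np.where(np.abs(d) >= t, d, 0.0) for d in coeffs[:-1]]
    return details + [coeffs[-1]]

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 64)
curve = np.sin(2 * np.pi * grid) + 0.05 * rng.normal(size=64)

coeffs = haar_forward(curve)
sparse = hard_threshold(coeffs, t=0.2)
n_kept = sum(int(np.count_nonzero(c)) for c in sparse)
denoised = haar_inverse(sparse)
```

After thresholding, the curve is described by `n_kept` nonzero coefficients instead of 64 samples, which is the sense in which this step reduces dimensionality before any clustering is attempted.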

Scientific Publications of the University of Toulouse II Le Mirail

Hal - Université Grenoble Alpes

Open Archive Toulouse Archive Ouverte

HAL-INSA Toulouse

DIAL UCLouvain

Hal-Diderot