1,597 research outputs found

Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

Author: Cichocki A.
Phan A-H.
Zhao Q.
Lee N.
Oseledets I. V.
Sugiyama M.
Mandic D.
Publication date: 01/01/2017

Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and in the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions. Comment: 232 pages

arXiv.org e-Print Archive

Crossref

FigShare

Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

Author: Cichocki A.
Lee N.
Mandic D.
Oseledets I. V.
Phan A-H.
Sugiyama M.
Zhao Q.
Publication venue: 'Now Publishers'
Publication date: 01/01/2017

arXiv.org e-Print Archive

Crossref

CERN Document Server

Zero-bias autoencoders and the benefits of co-adapting features

Author: Konda Kishore
Krueger David
Memisevic Roland
Publication date: 08/04/2015

Regularized training of an autoencoder typically results in hidden unit biases that take on large negative values. We show that negative biases are a natural result of using a hidden layer whose responsibility is to both represent the input data and act as a selection mechanism that ensures sparsity of the representation. We then show that negative biases impede the learning of data distributions whose intrinsic dimensionality is high. We also propose a new activation function that decouples the two roles of the hidden layer and that allows us to learn representations on data with very high intrinsic dimensionality, where standard autoencoders typically fail. Since the decoupled activation function acts like an implicit regularizer, the model can be trained by minimizing the reconstruction error of training data, without requiring any additional regularization.

arXiv.org e-Print Archive

CiteSeerX

Consistency of functional learning methods based on derivatives

Author: Fabrice Rossi
Nathalie Villa-Vialaneix
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

In some real world applications, such as spectrometry, functional models achieve better predictive performances if they work on the derivatives of order m of their inputs rather than on the original functions. As a consequence, the use of derivatives is a common practice in Functional Data Analysis, despite a lack of theoretical guarantees on the asymptotically achievable performances of a derivative based model. In this paper, we show that a smoothing spline approach can be used to preprocess multivariate observations obtained by sampling functions on a discrete and finite sampling grid in a way that leads to a consistent scheme on the original infinite dimensional functional problem. This work extends (Mas and Pumo, 2009) to nonparametric approaches and incomplete knowledge. To be more precise, the paper tackles two difficulties in a nonparametric framework: the information loss due to the use of the derivatives instead of the original functions, and the information loss due to the fact that the functions are observed through a discrete sampling and are thus also imperfectly known. The use of a smoothing spline based approach solves these two problems. Finally, the proposed approach is tested on two real world datasets and is experimentally shown to be a good solution in the case of noisy functional predictors.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

HAL-INSA Toulouse