
797 research outputs found

Finding Exogenous Variables in Data with Many More Variables than Observations

Author: A. Hyvärinen
A. Londei
A.V. Ivshina
C. Lorén
D. di Bernardo
E. Lehmann
J. Pearl
N. Delfosse
P. Comon
P. Spirtes
S. Shimizu
Y. Benjamini
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Many statistical methods have been proposed to estimate causal models in classical situations with fewer variables than observations (p<n, p: the number of variables and n: the number of observations). However, modern datasets including gene expression data need high-dimensional causal modeling in challenging situations with orders of magnitude more variables than observations (p>>n). In this paper, we propose a method to find exogenous variables in a linear non-Gaussian causal model, which requires much smaller sample sizes than conventional methods and works even when p>>n. The key idea is to identify which variables are exogenous based on non-Gaussianity instead of estimating the entire structure of the model. Exogenous variables work as triggers that activate a causal chain in the model, and their identification leads to more efficient experimental designs and better understanding of the causal mechanism. We present experiments with artificial data and real-world gene expression data to evaluate the method. Comment: A revised version of this was published in Proc. ICANN201

arXiv.org e-Print Archive

CiteSeerX

Crossref

An Introduction to Independent Component Analysis: InfoMax and FastICA algorithms

Author: Amari S.
Bell A. J.
Cardoso J.-f.
Comon P.
Hérault J.
Haykin S.
Hyvärinen A.
Hyvärinen A. and Karhunen, J. and Oja, E.
Publication venue: 'The Quantitative Methods for Psychology'

Crossref

Independent component analysis for domain independent watermarking

Author: A. Hyvärinen
B. Chen
F. A. P. Petitcolas
I. Cox
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2002

A new principled domain independent watermarking framework is presented. The new approach is based on embedding the message in statistically independent sources of the covertext to minimise covertext distortion, maximise the information embedding rate and improve the method's robustness against various attacks. Experiments comparing the performance of the new approach on several standard attacks show the proposed approach to be competitive with other state-of-the-art domain-specific methods.

Crossref

Aston Publications Explorer

Smoothed Analysis of Tensor Decompositions

Author: Anandkumar A.
Arora S.
Belkin M.
Dasgupta S.
Goyal N.
Harshman R.
Horn R.
Hyvärinen A.
Lindsay B.
Stegeman A.
Wedin P.
Publication date: 01/01/2014

Low rank tensor decompositions are a powerful tool for learning generative models, and uniqueness results give them a significant advantage over matrix decomposition methods. However, tensors pose significant algorithmic challenges and tensor analogs of much of the matrix algebra toolkit are unlikely to exist because of hardness results. Efficient decomposition in the overcomplete case (where rank exceeds dimension) is particularly challenging. We introduce a smoothed analysis model for studying these questions and develop an efficient algorithm for tensor decomposition in the highly overcomplete case (rank polynomial in the dimension). In this setting, we show that our algorithm is robust to inverse polynomial error -- a crucial property for applications in learning since we are only allowed a polynomial number of samples. While algorithms are known for exact tensor decomposition in some overcomplete settings, our main contribution is in analyzing their stability in the framework of smoothed analysis. Our main technical contribution is to show that tensor products of perturbed vectors are linearly independent in a robust sense (i.e. the associated matrix has singular values that are at least an inverse polynomial). This key result paves the way for applying tensor methods to learning problems in the smoothed setting. In particular, we use it to obtain results for learning multi-view models and mixtures of axis-aligned Gaussians where there are many more "components" than dimensions. The assumption here is that the model is not adversarially chosen, formalized by a perturbation of model parameters. We believe this is an appealing way to analyze realistic instances of learning problems, since this framework allows us to overcome many of the usual limitations of using tensor methods. Comment: 32 pages (including appendix)

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository

Crossref

Fourier PCA and Robust Tensor Decomposition

Author: Anandkumar A.
Anderson J.
Arora S.
Belkin M.
Cardoso J.
Chaudhuri K.
Comon P.
Dasgupta S.
Hyvärinen A.
Kannan R.
Publication date: 27/06/2014

Fourier PCA is Principal Component Analysis of a matrix obtained from higher order derivatives of the logarithm of the Fourier transform of a distribution. We make this method algorithmic by developing a tensor decomposition method for a pair of tensors sharing the same vectors in rank-$1$ decompositions. Our main application is the first provably polynomial-time algorithm for underdetermined ICA, i.e., learning an $n \times m$ matrix $A$ from observations $y=Ax$ where $x$ is drawn from an unknown product distribution with arbitrary non-Gaussian components. The number of component distributions $m$ can be arbitrarily higher than the dimension $n$ and the columns of $A$ only need to satisfy a natural and efficiently verifiable nondegeneracy condition. As a second application, we give an alternative algorithm for learning mixtures of spherical Gaussians with linearly independent means. These results also hold in the presence of Gaussian noise. Comment: Extensively revised; details added; minor errors corrected; exposition improved

arXiv.org e-Print Archive

CiteSeerX

Crossref

Least Dependent Component Analysis Based on Mutual Information

Author: A. Cichocki
A. Hyvärinen
A. K. Jain
A. Ziehe
Alexander Kraskov
E. Ott
F. R. Bach
H. Kantz
Harald Stögbauer
J. Chen
J.-F. Cardoso
L. F. Kozachenko
O. Vasicek
P. Grassberger
Peter Grassberger
R. L. Somorjai
S. Amari
S. E. Stein
Sergey A. Astakhov
T. M. Cover
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2004

We propose to use precise estimators of mutual information (MI) to find least dependent components in a linearly mixed signal. On the one hand this seems to lead to better blind source separation than with any other presently available algorithm. On the other hand it has the advantage, compared to other implementations of 'independent' component analysis (ICA) some of which are based on crude approximations for MI, that the numerical values of the MI can be used for: (i) estimating residual dependencies between the output components; (ii) estimating the reliability of the output, by comparing the pairwise MIs with those of re-mixed components; (iii) clustering the output according to the residual interdependencies. For the MI estimator we use a recently proposed k-nearest neighbor based algorithm. For time sequences we combine this with delay embedding, in order to take into account non-trivial time correlations. After several tests with artificial data, we apply the resulting MILCA (Mutual Information based Least dependent Component Analysis) algorithm to a real-world dataset, the ECG of a pregnant woman. The software implementation of the MILCA algorithm is freely available at http://www.fz-juelich.de/nic/cs/software. Comment: 18 pages, 20 figures, Phys. Rev. E (in press)

arXiv.org e-Print Archive

CiteSeerX

Crossref

Juelich Shared Electronic Resources

CERN Document Server

Non-Redundant Spectral Dimensionality Reduction

Author: A Brun
A Hyvärinen
A Singer
B Schölkopf
C Jutten
CC Chang
DL Donoho
EA Nadaraya
G Guo
GS Watson
JB Tenenbaum
L van der Maaten
M Belkin
M Rubinstein
MS Bartlett
N Halko
P Isola
RR Coifman
ST Roweis
X Geng
X He
Y Goldberg
Y LeCun
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/04/2017

Spectral dimensionality reduction algorithms are widely used in numerous domains, including for recognition, segmentation, tracking and visualization. However, despite their popularity, these algorithms suffer from a major limitation known as the "repeated Eigen-directions" phenomenon. That is, many of the embedding coordinates they produce typically capture the same direction along the data manifold. This leads to redundant and inefficient representations that do not reveal the true intrinsic dimensionality of the data. In this paper, we propose a general method for avoiding redundancy in spectral algorithms. Our approach relies on replacing the orthogonality constraints underlying those methods by unpredictability constraints. Specifically, we require that each embedding coordinate be unpredictable (in the statistical sense) from all previous ones. We prove that these constraints necessarily prevent redundancy, and provide a simple technique to incorporate them into existing methods. As we illustrate on challenging high-dimensional scenarios, our approach produces significantly more informative and compact representations, which improve visualization and classification tasks.

arXiv.org e-Print Archive

Crossref

Seashore disturbance and management of the clonal Arctophila fulva: Modelling patch dynamics

Author: A.-L. Laine
J. Aspi
J. Siira
M. Hyvärinen
P. Rautiainen
S. Aikio
Publication venue: 'International Association for Vegetation Science'
Publication date: 01/01/2006

Crossref

Use of Bimodal Coherence to Resolve Spectral Indeterminacy in Convolutive BSS

Author: A. Hyvärinen
B. Rivet
C. Jutten
D. Sodoyer
J. Thomas
J.F. Cardoso
P. Comon
Publication date: 01/01/2010

Recent studies show that visual information contained in visual speech can be helpful for the performance enhancement of audio-only blind source separation (BSS) algorithms. Such information is exploited through the statistical characterisation of the coherence between the audio and visual speech using, e.g. a Gaussian mixture model (GMM). In this paper, we present two new contributions. An adapted expectation maximization (AEM) algorithm is proposed in the training process to model the audio-visual coherence upon the extracted features. The coherence is exploited to solve the permutation problem in the frequency domain using a new sorting scheme. We test our algorithm on the XM2VTS multimodal database. The experimental results show that our proposed algorithm outperforms traditional audio-only BSS.

CiteSeerX

Crossref

University of Surrey

Surrey Research Insight

Microstructure and tribological properties of solid lubricant-doped CMT-WAAMed Stellite deposits

Author: Hyvärinen Leo
Peura P
Sabr A
Tuominen J
Publication date: 22/04/2024

A large share of the world’s total energy consumption is used to overcome friction. Therefore, low friction wear-resistant materials are needed. Solid lubricants are solid-phase materials that can reduce friction at different temperatures between two surfaces sliding against each other without the need for a grease or liquid oil medium. In this study, Cold Metal Transfer Wire Arc Additive Manufacturing (CMT-WAAM) was used to deposit solid lubricant (WS2, MoS2, CaF2) doped hypoeutectic Stellite alloy. Fabricated deposits possessed crack- and pore-free microstructures consisting of γ-Co and M7C3 carbide eutectics embedded with chromium sulfides and microhardness values of ~ 530 HV1. They were also tested in self-mated unidirectional sliding wear tests in dry conditions at room temperature (RT) and at 300 °C in an air atmosphere. The results showed that the dynamic coefficient of friction (COF) decreased ~ 27% at RT and ~ 21% at 300 °C without losing the wear properties. During sliding wear tests severe strain hardening occurred and γ-Co was found to transform to ε-Co. The developed deposits can be used as hard facings or 3D printed components in applications that require good sliding wear properties at different temperatures such as metal forming tools, power transmission components, valves, and internal parts of combustion engines. Peer reviewed

Trepo - Institutional Repository of Tampere University