Non-Gaussian Component Analysis using Entropy Methods
Non-Gaussian component analysis (NGCA) is a problem in multidimensional data
analysis which, since its formulation in 2006, has attracted considerable
attention in statistics and machine learning. In this problem, we have a random
variable X in d-dimensional Euclidean space. There is an unknown subspace E
of the d-dimensional Euclidean space such that the orthogonal
projection of X onto E is standard multidimensional Gaussian and the
orthogonal projection of X onto E^⊥, the orthogonal complement
of E, is non-Gaussian, in the sense that all its one-dimensional
marginals are different from the Gaussian in a certain metric defined in terms
of moments. The NGCA problem is to approximate the non-Gaussian subspace E^⊥
given samples of X.
Vectors in E^⊥ correspond to `interesting' directions, whereas
vectors in E correspond to directions where the data are very noisy. The
most interesting applications of the NGCA model are those where the
magnitude of the noise is comparable to that of the true signal, a setting in
which traditional noise-reduction techniques such as PCA do not apply directly.
NGCA is also related to dimension reduction and to other data analysis problems
such as ICA. NGCA-like problems have been studied in statistics for a long time
using techniques such as projection pursuit.
We give an algorithm that takes polynomial time in the dimension d and has
an inverse polynomial dependence on the error parameter measuring the angle
distance between the non-Gaussian subspace and the subspace output by the
algorithm. Our algorithm uses relative entropy as the contrast function
and fits within the projection pursuit framework. The techniques we develop for
analyzing our algorithm may be of use for other related problems.
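The projection-pursuit idea in this abstract can be illustrated with a crude sketch: scan unit directions and keep the one whose one-dimensional projection looks least Gaussian. Here |excess kurtosis| serves as a simple moment-based stand-in for the paper's relative-entropy contrast; the random-direction scan and the toy data are our illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def nongaussian_direction(X, n_candidates=2000, seed=0):
    """Crude projection-pursuit sketch: scan random unit directions and
    keep the one whose 1-D projection is least Gaussian, scored here by
    |excess kurtosis| (a moment-based stand-in for an entropy contrast)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    best_u, best_score = None, -np.inf
    for _ in range(n_candidates):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)
        p = X @ u
        p = (p - p.mean()) / p.std()
        score = abs(np.mean(p ** 4) - 3.0)  # |excess kurtosis| of projection
        if score > best_score:
            best_u, best_score = u, score
    return best_u

# Toy data: first coordinate Laplace (non-Gaussian), second standard Gaussian.
rng = np.random.default_rng(1)
X = np.column_stack([rng.laplace(size=20000), rng.normal(size=20000)])
u = nongaussian_direction(X)
print(abs(u[0]))  # large: the Laplace axis should win the scan
```

In higher dimensions a random scan is hopeless; the point of algorithms like the one in the abstract is to replace it with an efficient search with provable guarantees.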
Polarimetric Incoherent Target Decomposition by Means of Independent Component Analysis
This paper presents an alternative approach to polarimetric incoherent target decomposition (ICTD) dedicated to the analysis of very-high-resolution POLSAR images. Given the non-Gaussian nature of the heterogeneous POLSAR clutter due to the increase in spatial resolution, conventional methods based on the eigenvector target decomposition can at most ensure uncorrelatedness of the derived backscattering components. By introducing Independent Component Analysis (ICA) in lieu of the eigenvector decomposition, our method instead derives statistically independent components. The adopted algorithm, FastICA, uses the non-Gaussianity of the components as the criterion for their independence. Considering the eigenvector decomposition as analogous to Principal Component Analysis (PCA), we propose generalizing ICTD methods to the level of Blind Source Separation (BSS) techniques (comprising both PCA and ICA). The proposed method preserves the invariance properties of the conventional ones, proving robust both with respect to rotation around the line of sight and to a change of polarization basis. The efficiency of the method is demonstrated comparatively, using POLSAR Ramses X-band and ALOS L-band data sets. The main differences with respect to the conventional methods are mostly found in the behaviour of the second most dominant component, which is not necessarily orthogonal to the first one. The potential for retrieving non-orthogonal mechanisms is moreover demonstrated using synthetic data. At the expense of a negligible entropy increase, the proposed method is capable of retrieving the edge diffraction of an elementary trihedral by recognizing a dipole as the second component.
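FastICA's use of non-Gaussianity as an independence criterion can be sketched with a minimal one-unit fixed-point iteration on whitened data. The tanh contrast, the toy sources, and the mixing matrix below are illustrative assumptions; an actual POLSAR pipeline would operate on scattering-vector data.

```python
import numpy as np

def fastica_one_unit(X, n_iter=200, seed=0):
    """One-unit FastICA sketch: fixed-point iteration maximizing the
    non-Gaussianity of w^T x with the tanh contrast (X must be whitened)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        wx = w @ X
        g, gp = np.tanh(wx), 1.0 - np.tanh(wx) ** 2
        w_new = (X * g).mean(axis=1) - gp.mean() * w  # E[x g(w.x)] - E[g'(w.x)] w
        w_new /= np.linalg.norm(w_new)
        converged = abs(abs(w_new @ w) - 1.0) < 1e-9
        w = w_new
        if converged:
            break
    return w

# Toy mixture of two non-Gaussian sources.
rng = np.random.default_rng(2)
S = np.vstack([np.sign(rng.normal(size=20000)),   # binary (sub-Gaussian) source
               rng.laplace(size=20000)])          # Laplace (super-Gaussian) source
A = np.array([[1.0, 0.6], [0.4, 1.0]])
X = A @ S
# Whiten via eigendecomposition of the channel covariance.
vals, vecs = np.linalg.eigh(np.cov(X))
Xw = vecs @ np.diag(vals ** -0.5) @ vecs.T @ X
w = fastica_one_unit(Xw)
y = w @ Xw
corr = max(abs(np.corrcoef(y, S[0])[0, 1]), abs(np.corrcoef(y, S[1])[0, 1]))
print(round(corr, 2))  # near 1: one source is recovered up to sign/scale
```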
Effect of component separation on the temperature distribution of the CMB
We present a study of the effect of component separation on the recovered
cosmic microwave background (CMB) temperature distribution, considering
Gaussian and non-Gaussian input CMB maps. First, we extract the CMB component
from simulated Planck data (in small patches of the sky) using the maximum
entropy method (MEM), Wiener filter (WF) and a method based on the subtraction
of foreground templates plus a linear combination of frequency channels (LCFC).
We then apply a wavelet-based method to study the Gaussianity of the recovered
CMB and compare it with the same analysis for the input map. When the original
CMB map is Gaussian (and assuming that point sources have been removed), we
find that none of the methods introduce non-Gaussianity (NG) in the CMB
reconstruction. On the contrary, if the input CMB map is non-Gaussian, all the
studied methods produce a reconstructed CMB with lower detections of NG than
the original map. This effect is mainly due to the presence of instrumental
noise. In this case, MEM tends to produce slightly higher non-Gaussian
detections in the reconstructed map than WF whereas the detections are lower
for the LCFC. We have also studied the effect of point sources in the MEM
reconstruction. If no attempt to remove point sources is performed, they
clearly contaminate the CMB reconstruction, introducing spurious NG. When the
brightest point sources are removed from the data using the Mexican Hat
Wavelet, the Gaussian character of the CMB is preserved. However, when
analysing larger regions of the sky, the variance of our estimators will be
appreciably reduced and, in this case, we expect the point source residuals to
introduce spurious NG in the CMB. Thus, a careful subtraction (or masking) of
point source emission is crucial when studying the Gaussianity of the CMB.
Comment: 23 pages, 19 figures. Some new results added, including a new section
about the role of foregrounds and instrumental noise. Accepted for
publication in MNRAS.
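As a minimal illustration of the "linear combination of frequency channels" idea, here is a standard internal-linear-combination (ILC) weighting sketch: the weights minimize the combined-map variance subject to summing to one, which preserves the frequency-independent CMB while suppressing foregrounds and noise. The three-channel toy model is our assumption, not the paper's LCFC pipeline.

```python
import numpy as np

def ilc_weights(maps):
    """ILC sketch: weights w minimizing Var(sum_i w_i * map_i) subject to
    sum(w) = 1, so a frequency-independent CMB signal is preserved while
    foreground and noise power is suppressed."""
    C = np.cov(maps)                        # channel-channel covariance
    Cinv = np.linalg.inv(C)
    ones = np.ones(C.shape[0])
    return Cinv @ ones / (ones @ Cinv @ ones)

# Toy: 3 channels = common CMB + channel-scaled foreground + noise.
rng = np.random.default_rng(3)
npix = 50000
cmb = rng.normal(size=npix)
fg = rng.normal(size=npix)
scales = np.array([2.0, 1.0, 0.5])          # foreground amplitude per channel
maps = np.vstack([cmb + s * fg + 0.2 * rng.normal(size=npix) for s in scales])
w = ilc_weights(maps)
rec = w @ maps
corr = np.corrcoef(rec, cmb)[0, 1]
print(round(corr, 3))  # close to 1: the CMB is recovered
```

The abstract's point survives this toy: whatever the weights do to the foregrounds, residual noise in `rec` is exactly the kind of contamination that lowers non-Gaussianity detections in the reconstruction.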
Non-linear Causal Inference using Gaussianity Measures
We provide theoretical and empirical evidence for a type of asymmetry between
causes and effects that is present when these are related via linear models
contaminated with additive non-Gaussian noise. Assuming that the causes and the
effects have the same distribution, we show that the distribution of the
residuals of a linear fit in the anti-causal direction is closer to a Gaussian
than the distribution of the residuals in the causal direction. This
Gaussianization effect is characterized by reduction of the magnitude of the
high-order cumulants and by an increment of the differential entropy of the
residuals. The problem of non-linear causal inference is addressed by
performing an embedding in an expanded feature space, in which the relation
between causes and effects can be assumed to be linear. The effectiveness of a
method to discriminate between causes and effects based on this type of
asymmetry is illustrated in a variety of experiments using different measures
of Gaussianity. The proposed method is shown to be competitive with
state-of-the-art techniques for causal inference.
Comment: 35 pages, 9 figures.
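The Gaussianization asymmetry can be sketched directly: fit least-squares lines in both directions and compare a non-Gaussianity measure of the residuals. Here |excess kurtosis| is used as the measure (the paper considers several), and the uniform toy model is our assumption.

```python
import numpy as np

def residual_nongaussianity(x, y):
    """|excess kurtosis| of the residuals of the least-squares fit y ~ x."""
    b = np.cov(x, y)[0, 1] / np.var(x)
    r = y - b * x
    r = (r - r.mean()) / r.std()
    return abs(np.mean(r ** 4) - 3.0)

def infer_direction(x, y):
    """Declare x->y if the causal-fit residuals are LESS Gaussian than the
    anti-causal ones: regressing effect on cause leaves the non-Gaussian
    noise intact, while the reverse fit Gaussianizes the residuals."""
    if residual_nongaussianity(x, y) > residual_nongaussianity(y, x):
        return "x->y"
    return "y->x"

# Toy linear model with additive non-Gaussian (uniform) noise.
rng = np.random.default_rng(4)
cause = rng.uniform(-1, 1, size=100000)
effect = 0.8 * cause + rng.uniform(-1, 1, size=100000)
direction = infer_direction(cause, effect)
print(direction)  # "x->y"
```

(A quick check of the claim: the causal residual is the uniform noise itself, with excess kurtosis -1.2, while the anti-causal residual is a weighted sum of two uniforms, whose excess kurtosis is closer to 0.)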
Sunyaev-Zel'dovich clusters reconstruction in multiband bolometer camera surveys
We present a new method for the reconstruction of Sunyaev-Zel'dovich (SZ)
galaxy clusters in future SZ-survey experiments using multiband bolometer
cameras such as Olimpo, APEX, or Planck. Our goal is to optimise SZ-Cluster
extraction from our observed noisy maps. We wish to emphasize that none of the
algorithms used in the detection chain is tuned on prior knowledge on the SZ
-Cluster signal, or other astrophysical sources (Optical Spectrum, Noise
Covariance Matrix, or covariance of SZ Cluster wavelet coefficients). First, a
blind separation of the different astrophysical components which contribute to
the observations is conducted using an Independent Component Analysis (ICA)
method. Then, a recent non-linear filtering technique in the wavelet domain,
based on multiscale entropy and the False Discovery Rate (FDR) method, is used
to detect and reconstruct the galaxy clusters. Finally, we use the Source
Extractor software to identify the detected clusters. The proposed method was
applied to realistic simulations of observations. In terms of global detection
efficiency, this new method is impressive, providing results comparable to the
Pierpaoli et al. method while being a fully blind algorithm. A preprint with
full-resolution figures is available at the URL:
w10-dapnia.saclay.cea.fr/Phocea/Vie_des_labos/Ast/ast_visu.php?id_ast=728
Comment: Submitted to A&A. 32 pages, text only.
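The FDR step can be illustrated with a Benjamini-Hochberg selection over coefficients under a Gaussian pure-noise null; the toy coefficients below stand in for actual wavelet transforms of SZ maps, and the specific sizes and thresholds are our assumptions.

```python
import math
import numpy as np

def fdr_keep(coeffs, sigma, q=0.1):
    """Benjamini-Hochberg FDR sketch: compute a two-sided Gaussian p-value
    for each coefficient under a pure-noise null, then keep every
    coefficient up to the largest rank k with p_(k) <= q*k/n."""
    p = np.array([math.erfc(abs(c) / (sigma * math.sqrt(2.0))) for c in coeffs])
    n = len(p)
    order = np.argsort(p)
    thresh = q * np.arange(1, n + 1) / n
    passing = np.nonzero(p[order] <= thresh)[0]
    keep = np.zeros(n, dtype=bool)
    if passing.size:
        keep[order[: passing[-1] + 1]] = True
    return keep

# Toy: 1000 coefficients, the first 20 carry a strong "cluster" signal.
rng = np.random.default_rng(5)
coeffs = rng.normal(0.0, 1.0, size=1000)
coeffs[:20] += 8.0
keep = fdr_keep(coeffs, sigma=1.0)
print(keep[:20].sum(), keep[20:].sum())  # all 20 signals kept, few false alarms
```

Unlike a fixed k-sigma cut, the BH threshold adapts to the data while controlling the expected fraction of false detections, which is what makes it attractive for blind cluster detection.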
The Missing Link between Morphemic Assemblies and Behavioral Responses: a Bayesian Information-Theoretical model of lexical processing
We present the Bayesian Information-Theoretical (BIT) model of lexical processing: a mathematical model illustrating a novel approach to the modelling of language processes. The model shows how a neurophysiological theory of lexical processing relying on Hebbian association and neural assemblies can directly account for a variety of effects previously observed in behavioural experiments. We develop two information-theoretical measures of the distribution of usages of a morpheme or word, and use them to predict responses in three visual lexical decision datasets investigating inflectional morphology and polysemy. Our model offers a neurophysiological basis for the effects of
morpho-semantic neighbourhoods. These results demonstrate how distributed patterns of activation naturally result in the emergence of symbolic structures. We conclude by arguing that the modelling framework exemplified here is
a powerful tool for integrating behavioural and neurophysiological results.
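Measures of the kind described, defined on a distribution of usages, can be sketched with Shannon entropy and relative entropy over a word's sense distribution. The three-sense toy distributions are illustrative; the paper's actual measures may be defined differently.

```python
import math

def entropy(probs):
    """Shannon entropy (bits) of a usage distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def kl_divergence(p, q):
    """Relative entropy D(p || q) in bits between two usage distributions."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy usage distributions over three senses of a polysemous word.
balanced = [1 / 3, 1 / 3, 1 / 3]
skewed = [0.90, 0.05, 0.05]
print(round(entropy(balanced), 3))            # log2(3) ~ 1.585: maximal uncertainty
print(round(entropy(skewed), 3))              # lower: one dominant sense
print(round(kl_divergence(skewed, balanced), 3))
```

Intuitively, a word whose usage entropy is high forces more disambiguation work at recognition time, which is the sort of quantity one would regress against lexical decision latencies.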
Wavelet Domain Image Separation
In this paper, we consider the problem of blind signal and image separation
using a sparse representation of the images in the wavelet domain. We consider
the problem in a Bayesian estimation framework using the fact that the
distribution of the wavelet coefficients of real world images can naturally be
modeled by an exponential power probability density function. The Bayesian
approach which has been used with success in blind source separation gives also
the possibility of including any prior information we may have on the mixing
matrix elements as well as on the hyperparameters (parameters of the prior laws
of the noise and the sources). We consider two cases: first the case where the
wavelet coefficients are assumed to be i.i.d. and second the case where we
model the correlation between the coefficients of two adjacent scales by a
first-order Markov chain. This paper reports only on the first case; results
for the second case will be reported in the near future. The estimation
computations are done via a Markov chain Monte Carlo (MCMC) procedure. Some
simulations show the performance of the proposed method. Keywords: blind
source separation, wavelets, Bayesian estimation, MCMC, Metropolis-Hastings
algorithm.
Comment: Presented at MaxEnt2002, the 22nd International Workshop on Bayesian
and Maximum Entropy Methods (Aug. 3-9, 2002, Moscow, Idaho, USA). To appear
in Proceedings of the American Institute of Physics.
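Two ingredients of this approach, the exponential power model for wavelet coefficients and MCMC sampling, can be sketched together with a random-walk Metropolis sampler targeting an exponential power density. This is a one-dimensional toy under our own parameter choices; the paper's MCMC runs over sources, mixing-matrix elements, and hyperparameters jointly.

```python
import math
import random

def ep_logpdf(x, beta=1.0, alpha=1.0):
    """Unnormalized log-density of the exponential power family:
    log p(x) = -(|x|/alpha)^beta + const. beta=2 is Gaussian, beta=1 is
    Laplace; sparse wavelet coefficients are typically fit with beta < 2."""
    return -((abs(x) / alpha) ** beta)

def metropolis(logpdf, n=20000, step=1.0, seed=6):
    """Random-walk Metropolis sketch: draw samples from exp(logpdf)."""
    random.seed(seed)
    x, out = 0.0, []
    for _ in range(n):
        prop = x + random.gauss(0.0, step)
        if math.log(random.random()) < logpdf(prop) - logpdf(x):
            x = prop                     # accept the proposal
        out.append(x)
    return out

samples = metropolis(lambda x: ep_logpdf(x, beta=1.0))  # Laplace target
mean = sum(samples) / len(samples)
print(abs(mean) < 0.3)  # symmetric target: sample mean near zero
```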