BMICA-independent component analysis based on B-spline mutual information estimator
The information-theoretic concept of mutual information provides a general framework for evaluating dependencies between variables. However, its estimation using B-splines has not previously been used as the basis of an approach to Independent Component Analysis. In this paper we present a B-spline estimator of mutual information for finding the independent components in mixed signals. Tested on electroencephalography (EEG) signals, the resulting BMICA (B-Spline Mutual Information Independent Component Analysis)
exhibits better performance than the standard Independent Component Analysis algorithms FastICA, JADE, SOBI and EFICA in similar simulations. BMICA was also found to be more reliable than the renowned FastICA.
Classification with Asymmetric Label Noise: Consistency and Maximal Denoising
In many real-world classification problems, the labels of training examples
are randomly corrupted. Most previous theoretical work on classification with
label noise assumes that the two classes are separable, that the label noise is
independent of the true class label, or that the noise proportions for each
class are known. In this work, we give conditions that are necessary and
sufficient for the true class-conditional distributions to be identifiable.
These conditions are weaker than those analyzed previously, and allow for the
classes to be nonseparable and the noise levels to be asymmetric and unknown.
The conditions essentially state that a majority of the observed labels are
correct and that the true class-conditional distributions are "mutually
irreducible," a concept we introduce that limits the similarity of the two
distributions. For any label noise problem, there is a unique pair of true
class-conditional distributions satisfying the proposed conditions, and we
argue that this pair corresponds in a certain sense to maximal denoising of the
observed distributions.
Our results are facilitated by a connection to "mixture proportion
estimation," which is the problem of estimating the maximal proportion of one
distribution that is present in another. We establish a novel rate of
convergence result for mixture proportion estimation, and apply this to obtain
consistency of a discrimination rule based on surrogate loss minimization.
Experimental results on benchmark data and a nuclear particle classification
problem demonstrate the efficacy of our approach.
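The "mixture proportion estimation" subproblem asks for the maximal proportion of one distribution H present in another distribution F, which equals the essential infimum of the density ratio f/h. A crude histogram plug-in illustrates the quantity being estimated; this is only a sketch for one-dimensional data with illustrative parameters, not the estimator analysed in the paper.

```python
import numpy as np

def mixture_proportion(sample_f, sample_h, n_bins=15, min_count=50):
    """Plug-in estimate of kappa = max {a : F = a*H + (1-a)*G for some G},
    i.e. the essential infimum of f/h, via a minimum of bin-wise density ratios."""
    lo = min(sample_f.min(), sample_h.min())
    hi = max(sample_f.max(), sample_h.max())
    edges = np.linspace(lo, hi, n_bins + 1)
    cf, _ = np.histogram(sample_f, edges)
    ch, _ = np.histogram(sample_h, edges)
    mask = ch >= min_count            # use only bins where h is well estimated
    pf = cf[mask] / cf.sum()
    ph = ch[mask] / ch.sum()
    return float(np.min(pf / ph))
```

The minimum over noisy bins biases this simple version downward, which is exactly why convergence rates for more careful estimators, as established in the paper, matter.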
The non-Gaussianity of the cosmic shear likelihood - or: How odd is the Chandra Deep Field South?
(abridged) We study the validity of the approximation of a Gaussian cosmic
shear likelihood. We estimate the true likelihood for a fiducial cosmological
model from a large set of ray-tracing simulations and investigate the impact of
non-Gaussianity on cosmological parameter estimation. We investigate how odd
the recently reported very low value of really is as derived from
the Chandra Deep Field South (CDFS) using cosmic shear by taking the
non-Gaussianity of the likelihood into account as well as the possibility of
biases coming from the way the CDFS was selected.
We find that the cosmic shear likelihood is significantly non-Gaussian. This
leads to both a shift of the maximum of the posterior distribution and a
significantly smaller credible region compared to the Gaussian case. We
re-analyse the CDFS cosmic shear data using the non-Gaussian likelihood.
Assuming that the CDFS is a random pointing, we find
for fixed . In a
WMAP5-like cosmology, a value equal to or lower than this would be expected in
of the times. Taking biases into account arising from the way the
CDFS was selected, which we model as being dependent on the number of haloes in
the CDFS, we obtain . Combining the CDFS data
with the parameter constraints from WMAP5 yields and for a flat
universe.
Comment: 18 pages, 16 figures, accepted for publication in A&A; new Bayesian treatment of field selection bias
Submillimeter Number Counts From Statistical Analysis of BLAST Maps
We describe the application of a statistical method to estimate submillimeter
galaxy number counts from confusion limited observations by the Balloon-borne
Large Aperture Submillimeter Telescope (BLAST). Our method is based on a
maximum likelihood fit to the pixel histogram, sometimes called 'P(D)', an
approach which has been used before to probe faint counts, the difference being
that here we advocate its use even for sources with relatively high
signal-to-noise ratios. This method has an advantage over standard techniques
of source extraction in providing an unbiased estimate of the counts from the
bright end down to flux densities well below the confusion limit. We
specifically analyse BLAST observations of a roughly 10 sq. deg. map centered
on the Great Observatories Origins Deep Survey South (GOODS-S) field. We
provide estimates of number counts at the three BLAST wavelengths, 250, 350,
and 500 microns; instead of counting sources in flux bins we estimate the
counts at several flux density nodes connected with power-laws. We observe a
generally very steep slope for the counts of about -3.7 at 250 microns and -4.5
at 350 and 500 microns, over the range ~0.02-0.5 Jy, breaking to a shallower
slope below about 0.015 Jy at all three wavelengths. We also describe how to
estimate the uncertainties and correlations in this method so that the results
can be used for model-fitting. This method should be well-suited for analysis
of data from the Herschel satellite.
Comment: Accepted for publication in the Astrophysical Journal; see associated data and other papers at http://blastexperiment.info
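The P(D) idea of fitting the pixel histogram directly, rather than extracting individual sources, can be illustrated with a Monte Carlo forward model: draw a Poisson number of sources per pixel from power-law differential counts, sum their fluxes, and add Gaussian instrument noise. Histogramming the result gives a model P(D) to compare with the data. All parameter values and function names below are illustrative assumptions; the BLAST analysis also convolves with the beam and uses a full maximum-likelihood fit with power-law nodes.

```python
import numpy as np

def sample_powerlaw(rng, n, alpha, s_min, s_max):
    """Inverse-CDF draws of flux S from dN/dS proportional to S**alpha (alpha != -1)."""
    a1 = alpha + 1.0
    u = rng.random(n)
    return (s_min**a1 + u * (s_max**a1 - s_min**a1)) ** (1.0 / a1)

def simulate_pd(rng, n_pix, amp, alpha, s_min, s_max, omega_pix, sigma_noise):
    """Monte Carlo P(D): each pixel of solid angle omega_pix sums a Poisson
    number of sources drawn from dN/dS = amp * S**alpha, plus Gaussian noise."""
    a1 = alpha + 1.0
    mean_sources = omega_pix * amp * (s_max**a1 - s_min**a1) / a1  # per pixel
    counts = rng.poisson(mean_sources, n_pix)
    fluxes = sample_powerlaw(rng, counts.sum(), alpha, s_min, s_max)
    pix = np.zeros(n_pix)
    np.add.at(pix, np.repeat(np.arange(n_pix), counts), fluxes)    # sum per pixel
    return pix + rng.normal(0.0, sigma_noise, n_pix)
```

Steepening alpha or raising amp reshapes the simulated histogram, so maximizing the likelihood of the observed histogram constrains the counts well below the confusion limit.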
Cosmological baryonic and matter densities from 600,000 SDSS Luminous Red Galaxies with photometric redshifts
We analyze MegaZ-LRG, a photometric-redshift catalogue of Luminous Red
Galaxies (LRGs) based on the imaging data of the Sloan Digital Sky Survey
(SDSS) 4th Data Release. MegaZ-LRG, presented in a companion paper, contains
10^6 photometric redshifts derived with ANNz, an Artificial Neural Network
method, constrained by a spectroscopic sub-sample of 13,000 galaxies obtained
by the 2dF-SDSS LRG and Quasar (2SLAQ) survey. The catalogue spans the redshift
range 0.4 < z < 0.7 with an r.m.s. redshift error ~ 0.03(1+z), covering 5,914
deg^2 to map out a total cosmic volume 2.5 h^-3 Gpc^3. In this study we use the
most reliable 600,000 photometric redshifts to present the first cosmological
parameter fits to galaxy angular power spectra from a photometric redshift
survey. Combining the redshift slices with appropriate covariances, we
determine best-fitting values for the matter and baryon densities of Omega_m h
= 0.195 +/- 0.023 and Omega_b/Omega_m = 0.16 +/- 0.036 (with the Hubble
parameter h = 0.75 and scalar index of primordial fluctuations n = 1 held
fixed). These results are in agreement with and independent of the latest
studies of the Cosmic Microwave Background radiation, and their precision is
comparable to analyses of contemporary spectroscopic-redshift surveys. We
perform an extensive series of tests which conclude that our power spectrum
measurements are robust against potential systematic photometric errors in the
catalogue. We conclude that photometric-redshift surveys are competitive with
spectroscopic surveys for measuring cosmological parameters in the simplest
vanilla models. Future deep imaging surveys have great potential for further
improvement, provided that systematic errors can be controlled.
Comment: 24 pages, 23 figures, MNRAS accepted
Self-consistent method for density estimation
The estimation of a density profile from experimental data points is a
challenging problem, usually tackled by plotting a histogram. Prior assumptions
on the nature of the density, from its smoothness to the specification of its
form, allow the design of more accurate estimation procedures, such as Maximum
Likelihood. Our aim is to construct a procedure that makes no explicit
assumptions yet still provides an accurate estimate of the density. We
introduce the self-consistent estimate: the power spectrum of a candidate
density is given, and an estimation procedure is constructed on the assumption,
to be released \emph{a posteriori}, that the candidate is correct. The
self-consistent estimate is defined as a prior candidate density that precisely
reproduces itself. Our main result is to derive the exact expression of the
self-consistent estimate for any given dataset, and to study its properties.
Applications of the method require neither priors on the form of the density
nor the subjective choice of parameters. A cutoff frequency, akin to a bin size
or a kernel bandwidth, emerges naturally from the derivation. We apply the
self-consistent estimate to artificial data generated from various
distributions and show that it reaches the theoretical limit for the scaling of
the square error with the dataset size.
Comment: 21 pages, 5 figures
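The self-consistent estimate has a closed form in Fourier space: the fixed point relates the estimated transform to the empirical characteristic function, and the frequencies where the relation has no real solution are dropped, which is the cutoff frequency that emerges from the derivation. The following is a sketch of that fixed point for one-dimensional data on a uniform grid, with simplifications (for instance, it keeps every admissible frequency rather than a single interval around zero), so treat it as an illustration rather than the authors' exact procedure.

```python
import numpy as np

def self_consistent_density(data, grid):
    """Sketch of the self-consistent density estimate in Fourier space.
    Fixed point: f_hat(t) = N*ecf(t)/(2(N-1)) * (1 + sqrt(1 - 4(N-1)/(N^2 |ecf(t)|^2))),
    kept only where the square root is real; grid must be uniform."""
    data = np.asarray(data, float)
    n, m = len(data), len(grid)
    dx = grid[1] - grid[0]
    t = 2.0 * np.pi * np.fft.fftfreq(m, d=dx)
    ecf = np.exp(1j * np.outer(t, data)).mean(axis=1)  # empirical characteristic fn
    thresh = 4.0 * (n - 1) / n**2
    mag2 = np.abs(ecf) ** 2
    keep = mag2 >= thresh                               # the emergent frequency cutoff
    fhat = np.zeros_like(ecf)
    fhat[keep] = (n * ecf[keep] / (2.0 * (n - 1))) * (1.0 + np.sqrt(1.0 - thresh / mag2[keep]))
    # inverse transform: f(x_j) = (1/(m*dx)) * sum_k fhat(t_k) * exp(-i t_k x_j)
    g = fhat * np.exp(-1j * t * grid[0])
    return np.real(np.fft.fft(g)) / (m * dx)
```

No bin size or bandwidth is chosen by hand: the admissible-frequency set plays that role, as the abstract describes.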