On asymptotics of ICA estimators and their performance indices
Independent component analysis (ICA) has become a popular multivariate
analysis and signal processing technique with diverse applications. This paper
is targeted at discussing theoretical large sample properties of ICA unmixing
matrix functionals. We provide a formal definition of unmixing matrix
functional and consider two popular estimators in detail: the family based on
two scatter matrices with the independence property (e.g., FOBI estimator) and
the family of deflation-based fastICA estimators. The limiting behavior of the
corresponding estimates is discussed and the asymptotic normality of the
deflation-based fastICA estimate is proven under general assumptions.
Furthermore, properties of several performance indices commonly used for
comparison of different unmixing matrix estimates are discussed and a new
performance index is proposed. The proposed index fulfills three desirable
features which promote its use in practice and distinguish it from others.
Namely, the index possesses an easy interpretation, is fast to compute and its
asymptotic properties can be inferred from asymptotics of the unmixing matrix
estimate. We illustrate the derived asymptotic results and the use of the
proposed index with a small simulation study.
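As a rough illustration of the two-scatter-matrix family mentioned in this abstract, the following is a minimal sketch of a FOBI-type estimator. It assumes the usual construction: whiten the data with the covariance matrix, then diagonalise a fourth-moment scatter matrix of the whitened data. The function name `fobi` and the use of NumPy are choices made for this sketch and are not taken from the paper.

```python
import numpy as np


def fobi(x):
    """Sketch of a FOBI-type unmixing matrix estimate.

    x : array of shape (n, p), one observation per row.
    Returns W of shape (p, p); the rows of (x - mean) @ W.T are the
    estimated components, up to order and sign.
    """
    x = np.asarray(x, dtype=float)
    xc = x - x.mean(axis=0)                     # centre the data
    n = xc.shape[0]

    # First scatter matrix: the covariance matrix and its inverse square root.
    cov = xc.T @ xc / n
    evals, evecs = np.linalg.eigh(cov)
    cov_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T

    # Whitened observations (cov_inv_sqrt is symmetric).
    z = xc @ cov_inv_sqrt

    # Second scatter matrix: fourth moments, E[||z||^2 z z^T].
    r2 = np.sum(z ** 2, axis=1)
    b = (z * r2[:, None]).T @ z / n

    # Rotate with the eigenvectors of the fourth-moment scatter matrix.
    _, u = np.linalg.eigh(b)
    return u.T @ cov_inv_sqrt
```

The sketch only separates components whose kurtosis values differ, since otherwise the eigenvalues of the second scatter matrix coincide and the rotation is not identified.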
Space Warps II. New Gravitational Lens Candidates from the CFHTLS Discovered through Citizen Science
We report the discovery of 29 promising (and 59 total) new lens candidates
from the CFHT Legacy Survey (CFHTLS) based on about 11 million classifications
performed by citizen scientists as part of the first Space Warps lens search.
The goal of the blind lens search was to identify lens candidates missed by
robots (the RingFinder on galaxy scales and ArcFinder on group/cluster scales)
which had been previously used to mine the CFHTLS for lenses. We compare some
properties of the samples detected by these algorithms to the Space Warps
sample and find them to be broadly similar. The image separation distribution
calculated from the Space Warps sample shows that previous constraints on the
average density profile of lens galaxies are robust. Space Warps recovers about
65% of known lenses, while the new candidates show a richer variety compared to
those found by the two robots. This detection rate could be increased to 80% by
only using classifications performed by expert volunteers (albeit at the cost
of a lower purity), indicating that the training and performance calibration of
the citizen scientists are very important for the success of Space Warps. In
this work we present the SIMCT pipeline, used for generating in situ a sample
of realistic simulated lensed images. This training sample, along with the
false positives identified during the search, has a legacy value for testing
future lens finding algorithms. We make the pipeline and the training set
publicly available.
Comment: 23 pages, 12 figures, MNRAS accepted, minor to moderate changes in
this versio
Fourth Moments and Independent Component Analysis
In independent component analysis it is assumed that the components of the
observed random vector are linear combinations of latent independent random
variables, and the aim is then to find an estimate for a transformation matrix
back to these independent components. In the engineering literature, there are
several traditional estimation procedures based on the use of fourth moments,
such as FOBI (fourth order blind identification), JADE (joint approximate
diagonalization of eigenmatrices), and FastICA, but the statistical properties
of these estimates are not well known. In this paper various independent
component functionals based on the fourth moments are discussed in detail,
starting with the corresponding optimization problems, deriving the estimating
equations and estimation algorithms, and finding asymptotic statistical
properties of the estimates. Comparisons of the asymptotic variances of the
estimates in wide independent component models show that in most cases JADE and
the symmetric version of FastICA perform better than their competitors.
Comment: Published at http://dx.doi.org/10.1214/15-STS520 in the Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Robust Machine Learning Applied to Astronomical Datasets I: Star-Galaxy Classification of the SDSS DR3 Using Decision Trees
We provide classifications for all 143 million non-repeat photometric objects
in the Third Data Release of the Sloan Digital Sky Survey (SDSS) using decision
trees trained on 477,068 objects with SDSS spectroscopic data. We demonstrate
that these star/galaxy classifications are expected to be reliable for
approximately 22 million objects with r < ~20. The general machine learning
environment Data-to-Knowledge and supercomputing resources enabled extensive
investigation of the decision tree parameter space. This work presents the
first public release of objects classified in this way for an entire SDSS data
release. The objects are classified as either galaxy, star or nsng (neither
star nor galaxy), with an associated probability for each class. To demonstrate
how to effectively make use of these classifications, we perform several
important tests. First, we detail selection criteria within the probability
space defined by the three classes to extract samples of stars and galaxies to
a given completeness and efficiency. Second, we investigate the efficacy of the
classifications and the effect of extrapolating from the spectroscopic regime
by performing blind tests on objects in the SDSS, 2dF Galaxy Redshift and 2dF
QSO Redshift (2QZ) surveys. Given the photometric limits of our spectroscopic
training data, we effectively begin to extrapolate past our star-galaxy
training set at r ~ 18. By comparing the number counts of our training sample
with the classified sources, however, we find that our efficiencies appear to
remain robust to r ~ 20. As a result, we expect our classifications to be
accurate for 900,000 galaxies and 6.7 million stars, and remain robust via
extrapolation for a total of 8.0 million galaxies and 13.9 million stars.
[Abridged]
Comment: 27 pages, 12 figures, to be published in ApJ, uses emulateapj.cl
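The idea of selecting a sample "to a given completeness and efficiency" from class probabilities can be sketched as below. This assumes the common definitions: completeness is the fraction of true class members that are selected, and efficiency is the fraction of selected objects that truly belong to the class. The function name, the single-threshold cut and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def completeness_efficiency(p_class, is_class, threshold):
    """Completeness and efficiency of a probability cut (illustrative sketch).

    p_class   : predicted probability of belonging to the class, per object
    is_class  : True where the object truly belongs to the class
    threshold : objects with p_class >= threshold are selected
    """
    p_class = np.asarray(p_class, dtype=float)
    is_class = np.asarray(is_class, dtype=bool)

    selected = p_class >= threshold
    true_pos = np.count_nonzero(selected & is_class)
    n_true = np.count_nonzero(is_class)
    n_selected = np.count_nonzero(selected)

    # Guard against empty denominators.
    completeness = true_pos / n_true if n_true else float("nan")
    efficiency = true_pos / n_selected if n_selected else float("nan")
    return completeness, efficiency


# Toy example: five objects, three of which are true galaxies.
p_galaxy = [0.9, 0.8, 0.4, 0.7, 0.2]
truth = [True, True, True, False, False]
comp, eff = completeness_efficiency(p_galaxy, truth, threshold=0.5)
```

Raising the threshold generally trades completeness for efficiency, which is why a cut has to be chosen for each target sample.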