A reliable cluster detection technique using photometric redshifts: introducing the 2TecX algorithm
We present a new cluster detection algorithm designed for finding
high-redshift clusters using optical/infrared imaging data. The algorithm has
two main characteristics. First, it utilises each galaxy's full redshift
probability function, instead of an estimate of the photometric redshift based
on the peak of the probability function and an associated Gaussian error.
Second, it identifies cluster candidates through cross-checking the results of
two substantially different selection techniques (the name 2TecX representing
the cross-check of the two techniques). These are adaptations of the Voronoi
Tessellations and Friends-Of-Friends methods. Monte-Carlo simulations of mock
catalogues show that cross-checking the cluster candidates found by the two
techniques significantly reduces the detection of spurious sources.
Furthermore, we examine the selection effects and relative strengths and
weaknesses of each method. The simulations also allow us to fine-tune the
algorithm's parameters, and define completeness and mass limit as a function of
redshift. We demonstrate that the algorithm isolates high-redshift clusters
with high efficiency and low contamination.
Comment: 13 Pages, 17 figures, accepted for publication in MNRA
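The cross-check idea described in this abstract can be sketched in a few lines. This is a minimal illustration only: it uses a plain two-dimensional Friends-Of-Friends grouping on sky positions and ignores the redshift probability functions that the abstract says the real algorithm uses. The function names, the `linking_length` parameter, and the `min_shared` overlap rule are assumptions made for the example, not taken from the paper.

```python
from math import hypot


def friends_of_friends(points, linking_length):
    """Group 2D points into connected components.

    Two points are "friends" if they lie within linking_length of each
    other; a group is a set of points linked by a chain of friends.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per group

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if hypot(dx, dy) <= linking_length:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=lambda g: g[0])


def cross_check(candidates_a, candidates_b, min_shared=2):
    """Keep candidates from A that share at least min_shared members
    with some candidate from B (hypothetical matching rule)."""
    return [a for a in candidates_a
            if any(len(set(a) & set(b)) >= min_shared for b in candidates_b)]


points = [(0, 0), (0.5, 0), (1.0, 0), (5, 5), (5.4, 5), (10, 10)]
fof_groups = friends_of_friends(points, linking_length=0.6)
# Stand-in for the output of a second, independent detection technique.
other_groups = [[0, 1], [4, 7]]
confirmed = cross_check(fof_groups, other_groups)
```

Here only the first group survives the cross-check, because it is the only one that shares two or more members with a candidate from the second list.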
A Bayesian approach to discrete object detection in astronomical datasets
A Bayesian approach is presented for detecting and characterising the signal
from discrete objects embedded in a diffuse background. The approach centres
around the evaluation of the posterior distribution for the parameters of the
discrete objects, given the observed data, and defines the
theoretically-optimal procedure for parametrised object detection. Two
alternative strategies are investigated: the simultaneous detection of all the
discrete objects in the dataset, and the iterative detection of objects. In
both cases, the parameter space characterising the object(s) is explored using
Markov-Chain Monte-Carlo sampling. For the iterative detection of objects,
another approach is to locate the global maximum of the posterior at each
iteration using a simulated annealing downhill simplex algorithm. The
techniques are applied to a two-dimensional toy problem consisting of Gaussian
objects embedded in uncorrelated pixel noise. A cosmological illustration of
the iterative approach is also presented, in which the thermal and kinetic
Sunyaev-Zel'dovich effects from clusters of galaxies are detected in microwave
maps dominated by emission from primordial cosmic microwave background
anisotropies.
Comment: 20 pages, 12 figures, accepted by MNRAS; contains some additional
material in response to referee's comment
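As a rough illustration of exploring an object's posterior with Markov-Chain Monte-Carlo sampling, the sketch below runs a Metropolis sampler on a one-dimensional toy map containing a single Gaussian object in uncorrelated noise. The abstract's toy problem is two-dimensional; the one-dimensional setup, the flat priors, the fixed object width, and all names and step sizes here are assumptions chosen to keep the example short.

```python
import math
import random


def gaussian_object(x, x0, amplitude, width=2.0):
    """Signal of one Gaussian-shaped object at pixel x (width is fixed)."""
    return amplitude * math.exp(-0.5 * ((x - x0) / width) ** 2)


def log_posterior(params, data, noise_sigma):
    """Log posterior (up to a constant) for position and amplitude."""
    x0, amplitude = params
    # Assumed flat priors: position inside the map, amplitude in [0, 20].
    if not (0.0 <= x0 < len(data) and 0.0 <= amplitude <= 20.0):
        return -math.inf
    chi2 = sum((d - gaussian_object(x, x0, amplitude)) ** 2
               for x, d in enumerate(data))
    return -0.5 * chi2 / noise_sigma ** 2


def metropolis(data, noise_sigma, start, n_steps, rng, step=(0.5, 0.3)):
    """Random-walk Metropolis sampler over (position, amplitude)."""
    current = start
    current_lp = log_posterior(current, data, noise_sigma)
    chain = []
    for _ in range(n_steps):
        proposal = (current[0] + rng.gauss(0, step[0]),
                    current[1] + rng.gauss(0, step[1]))
        proposal_lp = log_posterior(proposal, data, noise_sigma)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


rng = random.Random(0)
noise_sigma = 0.5
# Simulated map: one object at pixel 20 with amplitude 5, plus noise.
data = [gaussian_object(x, 20.0, 5.0) + rng.gauss(0, noise_sigma)
        for x in range(50)]
chain = metropolis(data, noise_sigma, start=(25.0, 1.0), n_steps=6000, rng=rng)
samples = chain[1000:]  # discard burn-in
x0_mean = sum(s[0] for s in samples) / len(samples)
amp_mean = sum(s[1] for s in samples) / len(samples)
```

The posterior means `x0_mean` and `amp_mean` should land close to the simulated position and amplitude. Detecting several objects, whether simultaneously or iteratively, would need a larger parameter space or repeated runs on residual maps, which this sketch does not attempt.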
Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization
We study the problem of detecting a structured, low-rank signal matrix
corrupted with additive Gaussian noise. This includes clustering in a Gaussian
mixture model, sparse PCA, and submatrix localization. Each of these problems
is conjectured to exhibit a sharp information-theoretic threshold, below which
the signal is too weak for any algorithm to detect. We derive upper and lower
bounds on these thresholds by applying the first and second moment methods to
the likelihood ratio between these "planted models" and null models where the
signal matrix is zero. Our bounds differ by at most a factor of root two when
the rank is large (in the clustering and submatrix localization problems, when
the number of clusters or blocks is large) or the signal matrix is very sparse.
Moreover, our upper bounds show that for each of these problems there is a
significant regime where reliable detection is information-theoretically
possible but where known algorithms such as PCA fail completely, since the
spectrum of the observed matrix is uninformative. This regime is analogous to
the conjectured 'hard but detectable' regime for community detection in sparse
graphs.
Comment: For sparse PCA and submatrix localization, we determine the
information-theoretic threshold exactly in the limit where the number of
blocks is large or the signal matrix is very sparse based on a conditional
second moment method, closing the factor of root two gap in the first versio
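For readers unfamiliar with the second moment method mentioned in this abstract, the standard argument can be written in one line. The notation below is assumed for illustration and is not taken from the paper: P_n is the planted model, Q_n the null model, L_n their likelihood ratio, and A_n any event, such as "the test rejects the null".

```latex
% Notation assumed here: P_n planted model, Q_n null model, A_n any event.
L_n(Y) = \frac{dP_n}{dQ_n}(Y), \qquad
P_n(A_n) = \mathbb{E}_{Q_n}\!\left[ L_n \mathbf{1}_{A_n} \right]
\le \sqrt{ \mathbb{E}_{Q_n}\!\left[ L_n^2 \right] \, Q_n(A_n) } .
```

The inequality is Cauchy-Schwarz. If the second moment of L_n under the null stays bounded as n grows, then Q_n(A_n) → 0 forces P_n(A_n) → 0, so no test can have both error probabilities vanish. This is how a bounded second moment yields a lower bound on the detection threshold.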
Sunyaev-Zel'dovich clusters reconstruction in multiband bolometer camera surveys
We present a new method for the reconstruction of Sunyaev-Zel'dovich (SZ)
galaxy clusters in future SZ-survey experiments using multiband bolometer
cameras such as Olimpo, APEX, or Planck. Our goal is to optimise SZ-Cluster
extraction from our observed noisy maps. We wish to emphasize that none of the
algorithms used in the detection chain is tuned using prior knowledge of the
SZ-Cluster signal or of other astrophysical sources (Optical Spectrum, Noise
Covariance Matrix, or covariance of SZ Cluster wavelet coefficients). First, a
blind separation of the different astrophysical components which contribute to
the observations is conducted using an Independent Component Analysis (ICA)
method. Then, a recent nonlinear filtering technique in the wavelet domain,
based on multiscale entropy and the False Discovery Rate (FDR) method, is used
to detect and reconstruct the galaxy clusters. Finally, we use the Source
Extractor software to identify the detected clusters. The proposed method was
applied to realistic simulations of observations. In terms of global detection
efficiency, the new method gives results comparable to the method of
Pierpaoli et al., despite being a blind algorithm. A preprint with
full-resolution figures is available at the URL:
w10-dapnia.saclay.cea.fr/Phocea/Vie_des_labos/Ast/ast_visu.php?id_ast=728
Comment: Submitted to A&A. 32 Pages, text onl
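The False Discovery Rate idea mentioned in this abstract can be illustrated with the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR procedure is used, nor how it is combined with multiscale entropy in the wavelet domain, so the sketch below is an assumption: it shows only the generic thresholding of a list of p-values, for example one per wavelet coefficient. The function name and `alpha` default are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    Step-up rule: sort the m p-values, find the largest rank k with
    p_(k) <= alpha * k / m, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank  # keep the largest rank that passes
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.06, 0.074, 0.205, 0.212, 0.216]
detections = benjamini_hochberg(p_values, alpha=0.05)
```

With these ten p-values only the two smallest pass, because the third (0.039) already exceeds its threshold of 0.015 and no larger rank passes either.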
A GMBCG Galaxy Cluster Catalog of 55,424 Rich Clusters from SDSS DR7
We present a large catalog of optically selected galaxy clusters from the
application of a new Gaussian Mixture Brightest Cluster Galaxy (GMBCG)
algorithm to SDSS Data Release 7 data. The algorithm detects clusters by
identifying the red sequence plus Brightest Cluster Galaxy (BCG) feature, which
is unique for galaxy clusters and does not exist among field galaxies. Red
sequence clustering in color space is detected using an Error Corrected
Gaussian Mixture Model. We run GMBCG on 8240 square degrees of photometric data
from SDSS DR7 to assemble the largest ever optical galaxy cluster catalog,
consisting of over 55,000 rich clusters across the redshift range 0.1 < z
< 0.55. We present Monte Carlo tests of completeness and purity and perform
cross-matching with X-ray clusters and with the maxBCG sample at low redshift.
These tests indicate high completeness and purity across the full redshift
range for clusters with 15 or more members.
Comment: Updated to match the published version. The catalog can be accessed
from: http://home.fnal.gov/~jghao/gmbcg_sdss_catalog.htm
Bayesian Cluster Enumeration Criterion for Unsupervised Learning
We derive a new Bayesian Information Criterion (BIC) by formulating the
problem of estimating the number of clusters in an observed data set as
maximization of the posterior probability of the candidate models. Given that
some mild assumptions are satisfied, we provide a general BIC expression for a
broad class of data distributions. This serves as a starting point when
deriving the BIC for specific distributions. Along this line, we provide a
closed-form BIC expression for multivariate Gaussian distributed variables. We
show that incorporating the data structure of the clustering problem into the
derivation of the BIC results in an expression whose penalty term is different
from that of the original BIC. We propose a two-step cluster enumeration
algorithm. First, a model-based unsupervised learning algorithm partitions the
data according to a given set of candidate models. Subsequently, the number of
clusters is determined as the one associated with the model for which the
proposed BIC is maximal. The performance of the proposed two-step algorithm is
tested using synthetic and real data sets.
Comment: 14 pages, 7 figure
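For reference, the original (Schwarz) BIC that this abstract contrasts with can be written in the "maximize" convention the abstract uses. This is the classical expression, not the criterion derived in the paper, whose penalty term the abstract says is different. The symbols are assumed for illustration: M_l is a candidate model, q_l its number of free parameters, N the number of data points, and X the data.

```latex
% Classical BIC in the "larger is better" convention; symbols are assumed.
\mathrm{BIC}(M_l) = \ln \mathcal{L}\!\left(\hat{\theta}_l \mid X\right)
                    - \frac{q_l}{2} \ln N ,
\qquad
\hat{l} = \arg\max_{l} \, \mathrm{BIC}(M_l) .
```

In the two-step scheme described above, each candidate model M_l would first be fitted by a clustering algorithm. The estimated number of clusters is then the one belonging to the model with the largest criterion value.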