Ensemble Generation Methods and Cluster Ensemble Selection with Constraints
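Cluster ensemble first generates a large library of different clustering solutions and then combines them into a more accurate consensus clustering. It is commonly accepted that for cluster ensemble to work well the member partitions should be different from each other, and meanwhile the quality of each partition should remain at an acceptable level. Many different strategies have been used to generate different base partitions. As with classifier ensembles, many studies focus on how diverse members are generated, for example by clustering different data subsets (random sampling) or different feature subsets (random projection). However, few studies have compared these two sampling approaches in terms of quality and diversity. In this paper, we propose a new member generation method based on random sampling that fills in the class information of the samples left out of each draw by finding their nearest-neighbor samples (RS-NN for short). We compare it with the traditional K-means-based ...
Degree: Master of Engineering. Department and major: School of Information Science and Technology, Pattern Recognition and Intelligent Systems. Student ID: 2322011115323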
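The abstract above describes generating ensemble members by random sampling and then filling in labels for the left-out samples from their nearest sampled neighbor. The following is a minimal sketch of that idea, not the thesis's implementation: the function names (`kmeans`, `rs_nn_member`, `generate_ensemble`), the plain Lloyd's K-means used as the base clusterer, the squared Euclidean distance, and the parameters `sample_frac` and `n_members` are all assumptions made for illustration.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two points given as tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, iters=20):
    # Plain Lloyd's k-means (assumed base clusterer); returns one label per point.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster empties out
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels

def rs_nn_member(data, k, sample_frac, rng):
    # One ensemble member: cluster a random subsample, then give every
    # left-out point the label of its nearest sampled neighbour.
    n = len(data)
    sampled = sorted(rng.sample(range(n), max(k, int(sample_frac * n))))
    sub_labels = kmeans([data[i] for i in sampled], k, rng)
    labels = [None] * n
    for i, lab in zip(sampled, sub_labels):
        labels[i] = lab
    for i in range(n):
        if labels[i] is None:
            nearest = min(sampled, key=lambda j: sq_dist(data[i], data[j]))
            labels[i] = labels[nearest]
    return labels

def generate_ensemble(data, k, n_members=5, sample_frac=0.6, seed=0):
    # Library of partitions; diversity comes from the different random draws.
    rng = random.Random(seed)
    return [rs_nn_member(data, k, sample_frac, rng) for _ in range(n_members)]

# Hypothetical toy data: two loose groups of 2-D points.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4),
        (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.3, 5.2), (5.1, 4.8)]
ensemble = generate_ensemble(data, k=2)
```

Each member is a full labelling of the data set, so the resulting library could be passed to any consensus function; the consensus step itself is not sketched here.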
Statistical Thermodynamics of Clustered Populations
We present a thermodynamic theory for a generic population of individuals
distributed into groups (clusters). We construct the ensemble of all
distributions with fixed and , introduce a selection functional that
embodies the physics that governs the population, and obtain the distribution
that emerges in the scaling limit as the most probable among all distributions
consistent with the given physics. We develop the thermodynamics of the
ensemble and establish a rigorous mapping to thermodynamics. We treat the
emergence of a so-called "giant component" as a formal phase transition and
show that the criteria for its emergence are entirely analogous to the
equilibrium conditions in molecular systems. We demonstrate the theory by an
analytic model and confirm the predictions by Monte Carlo simulation.
Comment: Minor edits to tex
A Parameterized Galaxy Catalog Simulator for Testing Cluster Finding, Mass Estimation, and Photometric Redshift Estimation in Optical and Near-infrared Surveys
We present a galaxy catalog simulator that converts N-body simulations with halo and subhalo catalogs into mock, multiband photometric catalogs. The simulator assigns galaxy properties to each subhalo in a way that reproduces the observed cluster galaxy halo occupation distribution, the radial and mass-dependent variation in fractions of blue galaxies, the luminosity functions in the cluster and the field, and the color-magnitude relation in clusters. Moreover, the evolution of these parameters is tuned to match existing observational constraints. Parameterizing an ensemble of cluster galaxy properties enables us to create mock catalogs with variations in those properties, which in turn allows us to quantify the sensitivity of cluster finding to current observational uncertainties in these properties. Field galaxies are sampled from existing multiband photometric surveys of similar depth. We present an application of the catalog simulator to characterize the selection function and contamination of a galaxy cluster finder that utilizes the cluster red sequence together with galaxy clustering on the sky. We estimate systematic uncertainties in the selection to be at the ≤15% level with current observational constraints on cluster galaxy populations and their evolution. We find the contamination in this cluster finder to be ~35% to redshift z ~ 0.6. In addition, we use the mock galaxy catalogs to test the optical mass indicator B_gc and a red-sequence redshift estimator. We measure the intrinsic scatter of the B_gc-mass relation to be approximately log normal with $\sigma_{\log_{10} M} \sim 0.25$, and we demonstrate photometric redshift accuracies for massive clusters at the ~3% level out to z ~ 0.7.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/98548/1/0004-637X_747_1_58.pd
Analysis of Sunyaev-Zel'dovich Effect Mass-Observable Relations using South Pole Telescope Observations of an X-ray Selected Sample of Low Mass Galaxy Clusters and Groups
(Abridged) We use 95, 150, and 220GHz observations from the SPT to examine
the SZE signatures of a sample of 46 X-ray selected groups and clusters drawn
from ~6 deg^2 of the XMM-BCS. These systems extend to redshift z=1.02, have
characteristic masses ~3x lower than clusters detected directly in the SPT data
and probe the SZE signal to the lowest X-ray luminosities (>10^42 erg s^-1)
yet.
We develop an analysis tool that combines the SZE information for the full
ensemble of X-ray-selected clusters. Using X-ray luminosity as a mass proxy, we
extract selection-bias corrected constraints on the SZE significance- and
Y_500-mass relations. The SZE significance- mass relation is in good agreement
with an extrapolation of the relation obtained from high mass clusters.
However, the fit to the Y_500-mass relation at low masses, while in good
agreement with the extrapolation from high mass SPT clusters, is in tension at
2.8 sigma with the constraints from the Planck sample. We examine the tension
with the Planck relation, discussing sample differences and biases that could
contribute.
We also present an analysis of the radio galaxy point source population in
this ensemble of X-ray selected systems. We find 18 of our systems have 843 MHz
SUMSS sources within 2 arcmin of the X-ray centre, and three of these are also
detected at significance >4 by SPT. Of these three, two are associated with the
group brightest cluster galaxies, and the third is likely an unassociated
quasar candidate. We examine the impact of these point sources on our SZE
scaling relation analyses and find no evidence of biases. We also examine the
impact of dusty galaxies using constraints from the 220 GHz data. The stacked
sample provides 2.8 sigma significant evidence of dusty galaxy flux, which
would correspond to an average underestimate of the SPT Y_500 signal that is
(17 ± 9) per cent in this sample of low mass systems.
Comment: 15 pages, 7 figure
The Ensemble of Conformations of Antifreeze Glycoproteins (AFGP8): A Study Using Nuclear Magnetic Resonance Spectroscopy.
The primary sequence of antifreeze glycoproteins (AFGPs) is highly degenerate, consisting of multiple repeats of the same tripeptide, Ala-Ala-Thr*, in which Thr* is a glycosylated threonine with the disaccharide beta-d-galactosyl-(1,3)-alpha-N-acetyl-d-galactosamine. AFGPs seem to function as intrinsically disordered proteins, presenting challenges in determining their native structure. In this work, a different approach was used to elucidate the three-dimensional structure of AFGP8 from the Arctic cod Boreogadus saida and the Antarctic notothenioid Trematomus borchgrevinki. Dimethyl sulfoxide (DMSO), a non-native solvent, was used to make AFGP8 less dynamic in solution. Interestingly, DMSO induced a non-native structure, which could be determined via nuclear magnetic resonance (NMR) spectroscopy. The overall three-dimensional structures of the two AFGP8s from two different natural sources were different from a random coil ensemble, but their "compactness" was very similar, as deduced from NMR measurements. In addition to their similar compactness, the conserved motifs, Ala-Thr*-Pro-Ala and Ala-Thr*-Ala-Ala, present in both AFGP8s, seemed to have very similar three-dimensional structures, leading to a refined definition of local structural motifs. These local structural motifs allowed AFGPs to be considered functioning as effectors, making a transition from disordered to ordered upon binding to the ice surface. In addition, AFGPs could act as dynamic linkers, whereby a short segment folds into a structural motif, while the rest of the AFGPs could still be disordered, thus simultaneously interacting with bulk water molecules and the ice surface, preventing ice crystal growth
Implications of multiple high-redshift galaxy clusters
To date, 14 high-redshift (z>1.0) galaxy clusters with mass measurements have
been observed, spectroscopically confirmed and are reported in the literature.
These objects should be exceedingly rare in the standard LCDM model. We
conservatively approximate the selection functions of these clusters' parent
surveys, and quantify the tension between the abundances of massive clusters as
predicted by the standard LCDM model and the observed ones. We alleviate the
tension considering non-Gaussian primordial perturbations of the local type,
characterized by the parameter fnl and derive constraints on fnl arising from
the mere existence of these clusters. At the 95% confidence level, fnl>467 with
cosmological parameters fixed to their most likely WMAP5 values, or fnl > 123
(at 95% confidence) if we marginalize over WMAP5 parameters priors. In
combination with fnl constraints from Cosmic Microwave Background and halo
bias, this determination implies a scale-dependence of fnl at approx. 3 sigma.
Given the assumptions made in the analysis, we expect any future improvements
to the modeling of the non-Gaussian mass function, survey volumes, or selection
functions to increase the significance of fnl>0 found here. In order to
reconcile these massive, high-z clusters with an fnl=0, their masses would need
to be systematically lowered by 1.5 sigma or the sigma8 parameter should be
approx. 3 sigma higher than CMB (and large-scale structure) constraints. The
existence of these objects is a puzzle: it either represents a challenge to the
LCDM paradigm or it is an indication that the mass estimates of clusters are
dramatically more uncertain than we think.
Comment: 11 pages, 7 figures, modified to match published versio
CLASH: Weak-Lensing Shear-and-Magnification Analysis of 20 Galaxy Clusters
We present a joint shear-and-magnification weak-lensing analysis of a sample
of 16 X-ray-regular and 4 high-magnification galaxy clusters at 0.19<z<0.69
selected from the Cluster Lensing And Supernova survey with Hubble (CLASH). Our
analysis uses wide-field multi-color imaging, taken primarily with Suprime-Cam
on the Subaru Telescope. From a stacked shear-only analysis of the
X-ray-selected subsample, we detect the ensemble-averaged lensing signal with a
total signal-to-noise ratio of ~25 in the radial range of 200 to 3500kpc/h. The
stacked tangential-shear signal is well described by a family of standard
density profiles predicted for dark-matter-dominated halos in gravitational
equilibrium, namely the Navarro-Frenk-White (NFW), truncated variants of NFW,
and Einasto models. For the NFW model, we measure a mean concentration of
at . We show this is in excellent agreement with Lambda
cold-dark-matter (LCDM) predictions when the CLASH X-ray selection function and
projection effects are taken into account. The best-fit Einasto shape parameter
is , which is consistent with the
NFW-equivalent Einasto parameter of . We reconstruct projected mass
density profiles of all CLASH clusters from a joint likelihood analysis of
shear-and-magnification data, and measure cluster masses at several
characteristic radii. We also derive an ensemble-averaged total projected mass
profile of the X-ray-selected subsample by stacking their individual mass
profiles. The stacked total mass profile, constrained by the
shear+magnification data, is shown to be consistent with our shear-based
halo-model predictions including the effects of surrounding large-scale
structure as a two-halo term, establishing further consistency in the context
of the LCDM model.
Comment: Accepted by ApJ on 11 August 2014. Textual changes to improve clarity
(e.g., Sec.3.2.2 "Number-count Depletion", Sec.4.3 "Shape Measurement",
Sec.4.4 "Background Galaxy Selection"). Results and conclusions remain
unchanged. For the public release of Subaru data, see
http://archive.stsci.edu/prepds/clash
EC3: Combining Clustering and Classification for Ensemble Learning
Classification and clustering algorithms have been proved to be successful
individually in different contexts. Both of them have their own advantages and
limitations. For instance, although classification algorithms are more powerful
than clustering methods in predicting class labels of objects, they do not
perform well when there is a lack of sufficient manually labeled reliable data.
On the other hand, although clustering algorithms do not produce label
information for objects, they provide supplementary constraints (e.g., if two
objects are clustered together, it is more likely that the same label is
assigned to both of them) that one can leverage for label prediction of a set
of unknown objects. Therefore, systematic utilization of both these types of
algorithms together can lead to better prediction performance. In this paper,
we propose a novel algorithm, called EC3, that merges classification and
clustering together in order to support both binary and multi-class
classification. EC3 is based on a principled combination of multiple
classification and multiple clustering methods using an optimization function.
We theoretically show the convexity and optimality of the problem and solve it
by block coordinate descent method. We additionally propose iEC3, a variant of
EC3 that handles imbalanced training data. We perform an extensive experimental
analysis by comparing EC3 and iEC3 with 14 baseline methods (7 well-known
standalone classifiers, 5 ensemble classifiers, and 2 existing methods that
merge classification and clustering) on 13 standard benchmark datasets. We show
that our methods outperform other baselines for every single dataset, achieving
up to 10% higher AUC. Moreover, our methods are faster (1.21 times faster than
the best baseline) and more resilient to noise and class imbalance than the best
baseline method.
Comment: 14 pages, 7 figures, 11 table
Looking for bimodal distributions in multi-fragmentation reactions
The presence of a phase transition in a finite system can be deduced,
together with its order, from the shape of the distribution of the order
parameter. This issue has been extensively studied in multifragmentation
experiments, with results that do not appear fully consistent. In this paper we
discuss the effect of the statistical ensemble or sorting conditions on the
shape of fragment distributions, and propose a new method, which can be easily
implemented experimentally, to discriminate between different fragmentation
scenarios. This method, based on a reweighting of the measured distribution to
account for the experimental constraints linked to the energy deposit, is
tested on different simple models, and appears to provide a powerful
discrimination.
Comment: 11 pages, 7 figure
Probing dark energy with cluster counts and cosmic shear power spectra: including the full covariance
(Abridged) Combining cosmic shear power spectra and cluster counts is a
powerful way to improve cosmological parameter constraints and/or test inherent
systematics. However, if the two are drawn from the same survey region, they
probe the same cosmic mass density field, and therefore the combination may be
less powerful than first thought. We investigate the cross-covariance between
the cosmic shear power spectra and the cluster counts based on the halo model
approach, where the cross-covariance arises from the three-point correlations
of the underlying mass density field. Fully taking into account the
cross-covariance as well as non-Gaussian errors on the lensing power spectrum
covariance, we find a significant cross-correlation between the lensing power
spectrum signals at multipoles l~10^3 and the cluster counts containing halos
with masses M>10^{14}Msun. Including the cross-covariance for the combined
measurement degrades, and in some cases improves, the total signal-to-noise
ratios by up to ±20% relative to when the two are independent. For
cosmological parameter determination, the cross-covariance has a smaller effect
as a result of working in a multi-dimensional parameter space, implying that
the two observables can be considered independent to a good approximation. We
also discuss that cluster count experiments using lensing-selected mass peaks
could be more complementary to cosmic shear tomography than mass-selected
cluster counts of the corresponding mass threshold. Using lensing selected
clusters with a realistic usable detection threshold (S/N~6 for a ground-based
survey), the uncertainty on each dark energy parameter may be roughly halved by
the combined experiments, relative to using the power spectra alone.
Comment: 32 pages, 15 figures. Revised version, invited original contribution
to gravitational lensing focus issue, New Journal of Physic
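The claim that a cross-covariance can either degrade or improve the combined signal-to-noise can be illustrated with a toy two-observable case. This sketch is not the paper's halo-model calculation: it assumes just two measurements with individual signal-to-noise ratios `snr_a` and `snr_b` and an error correlation coefficient `r` (all hypothetical names), and evaluates (S/N)^2 = d^T C^{-1} d for the 2x2 covariance in closed form.

```python
import math

def combined_snr(snr_a, snr_b, r):
    # Total S/N of two observables whose errors have correlation coefficient r.
    # Closed form of d^T C^{-1} d for a 2x2 covariance matrix, written in
    # terms of the individual S/N values (signal divided by its own error).
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    sn2 = (snr_a ** 2 - 2.0 * r * snr_a * snr_b + snr_b ** 2) / (1.0 - r ** 2)
    return math.sqrt(sn2)

# Independent case reduces to adding the two S/N values in quadrature.
independent = combined_snr(5.0, 1.0, 0.0)
# The same pair with positively correlated errors.
correlated = combined_snr(5.0, 1.0, 0.5)
```

In this toy setting a positive correlation lowers the total when the two observables have comparable signal-to-noise and raises it when one dominates (specifically when r exceeds 2ab/(a^2 + b^2) for individual ratios a and b), which is one simple way both signs of the effect can arise.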