Consistency, breakdown robustness, and algorithms for robust improper maximum likelihood clustering
The robust improper maximum likelihood estimator (RIMLE) is a new method for
robust multivariate clustering finding approximately Gaussian clusters. It
maximizes a pseudo-likelihood defined by adding a component with improper
constant density for accommodating outliers to a Gaussian mixture. A special
case of the RIMLE is MLE for multivariate finite Gaussian mixture models. In
this paper we treat existence, consistency, and breakdown theory for the RIMLE
comprehensively. RIMLE's existence is proved under non-smooth covariance matrix
constraints. It is shown that these can be implemented via a computationally
feasible Expectation-Conditional Maximization algorithm.
Comment: The title of this paper was originally: "A consistent and breakdown
robust model-based clustering method
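The pseudo-likelihood described in the abstract above (a Gaussian mixture plus a component with improper constant density) can be illustrated with a minimal sketch. The names `pi0` (noise proportion), `delta` (value of the constant density), and the function names are assumptions made for this illustration and are not taken from the paper; the covariance matrix constraints and the ECM fitting algorithm are not shown.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log-density of N(mean, cov) evaluated at each row of x."""
    d = x.shape[1]
    diff = x - mean
    chol = np.linalg.cholesky(cov)
    # Solve L z = diff^T so that sum(z**2) is the squared Mahalanobis distance.
    z = np.linalg.solve(chol, diff.T)
    maha = np.sum(z ** 2, axis=0)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)

def pseudo_loglik(x, pi0, delta, weights, means, covs):
    """Sum over points of log(pi0 * delta + Gaussian mixture density).

    pi0     -- proportion of the improper noise component (assumed name)
    delta   -- value of the improper constant density (assumed name)
    weights -- Gaussian proportions; pi0 + sum(weights) is expected to be 1
    """
    dens = np.full(x.shape[0], pi0 * delta)
    for w, m, c in zip(weights, means, covs):
        dens += w * np.exp(gaussian_logpdf(x, m, c))
    return np.sum(np.log(dens))
```

With `pi0 = 0` the noise term vanishes and the function returns the ordinary Gaussian mixture log-likelihood, which matches the abstract's remark that Gaussian mixture MLE is a special case. With `pi0 > 0`, a far-away point contributes at least `log(pi0 * delta)` rather than an arbitrarily negative value.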
Fast multi-image matching via density-based clustering
We consider the problem of finding consistent matches
across multiple images. Previous state-of-the-art solutions
use constraints on cycles of matches together with convex
optimization, leading to computationally intensive iterative
algorithms. In this paper, we propose a clustering-based
formulation. We first rigorously show its equivalence with
the previous one, and then propose QuickMatch, a novel
algorithm that identifies multi-image matches from a density
function in feature space. We use the density to order the
points in a tree, and then extract the matches by breaking this
tree using feature distances and measures of distinctiveness.
Our algorithm outperforms previous state-of-the-art methods
(such as MatchALS) in accuracy, and it is significantly faster
(up to 62 times faster on some benchmarks), and can scale to
large datasets (with more than twenty thousand features).
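The idea of ordering points in a tree by density and then breaking the tree, as described in the abstract above, can be sketched roughly as follows. This is a simplified illustration and not the QuickMatch algorithm itself: the measures of distinctiveness are omitted and replaced by a fixed distance threshold, and the function name and the parameters `bandwidth` and `max_edge` are hypothetical.

```python
import numpy as np

def density_tree_clusters(features, bandwidth, max_edge):
    """Link each point to its nearest higher-density point; cut long edges."""
    n = features.shape[0]
    # Pairwise Euclidean distances between feature vectors.
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    # Unnormalized Gaussian kernel density estimate at each point.
    density = np.exp(-0.5 * (dist / bandwidth) ** 2).sum(axis=1)
    # Rank points from highest to lowest density; ties are broken by index.
    order = np.lexsort((np.arange(n), -density))
    rank = np.empty(n, dtype=int)
    rank[order] = np.arange(n)
    labels = np.full(n, -1)
    next_label = 0
    for i in order:
        higher = np.flatnonzero(rank < rank[i])
        if higher.size > 0:
            # Candidate parent: the closest point with a higher density rank.
            parent = higher[np.argmin(dist[i, higher])]
            if dist[i, parent] <= max_edge:
                labels[i] = labels[parent]  # keep the tree edge
                continue
        # No parent, or the edge is too long: start a new cluster.
        labels[i] = next_label
        next_label += 1
    return labels
```

Because points are visited in decreasing order of density, each parent already has a label when its children are processed, so a single pass is enough. The dense pairwise distance matrix keeps the sketch short but would not scale to the dataset sizes mentioned in the abstract.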
The impact of baryonic processes on the two-point correlation functions of galaxies, subhaloes and matter
The observed clustering of galaxies and the cross-correlation of galaxies and
mass provide important constraints on both cosmology and models of galaxy
formation. Even though the dissipation and feedback processes associated with
galaxy formation are thought to affect the distribution of matter, essentially
all models used to predict clustering data are based on collisionless
simulations. Here, we use large hydrodynamical simulations to investigate how
galaxy formation affects the autocorrelation functions of galaxies and
subhaloes, as well as their cross-correlation with matter. We show that the
changes due to the inclusion of baryons are not limited to small scales and are
even present in samples selected by subhalo mass. Samples selected by subhalo
mass cluster ~10% more strongly in a baryonic run on scales r > 1Mpc/h, and
this difference increases for smaller separations. While the inclusion of
baryons boosts the clustering at fixed subhalo mass on all scales, the sign of
the effect on the cross-correlation of subhaloes with matter can vary with
radius. We show that the large-scale effects are due to the change in subhalo
mass caused by the strong feedback associated with galaxy formation and may
therefore not affect samples selected by number density. However, on scales r <
r_vir significant differences remain after accounting for the change in subhalo
mass. We conclude that predictions for galaxy-galaxy and galaxy-mass clustering
from models based on collisionless simulations will have errors greater than
10% on sub-Mpc scales, unless the simulation results are modified to correctly
account for the effects of baryons on the distributions of mass and satellites.
Comment: 15 pages, 9 figures. Replaced to match the version accepted by MNRA
Robust improper maximum likelihood: tuning, computation, and a comparison with other methods for robust Gaussian clustering
The two main topics of this paper are the introduction of the "optimally
tuned improper maximum likelihood estimator" (OTRIMLE) for robust clustering
based on the multivariate Gaussian model for clusters, and a comprehensive
simulation study comparing the OTRIMLE to Maximum Likelihood in Gaussian
mixtures with and without noise component, mixtures of t-distributions, and the
TCLUST approach for trimmed clustering. The OTRIMLE uses an improper constant
density for modelling outliers and noise. This can be chosen optimally so that
the non-noise part of the data looks as close to a Gaussian mixture as
possible. Some deviation from Gaussianity can be traded in for lowering the
estimated noise proportion. Covariance matrix constraints and computation of
the OTRIMLE are also treated. In the simulation study, all methods are
confronted with setups in which their model assumptions are not exactly
fulfilled, and in order to evaluate the experiments in a standardized way by
misclassification rates, a new model-based definition of "true clusters" is
introduced that deviates from the usual identification of mixture components
with clusters. In the study, every method turns out to be superior for one or
more setups, but the OTRIMLE achieves the most satisfactory overall
performance. The methods are also applied to two real datasets, one without and
one with known "true" clusters.
Cosmological Analysis of Three-Dimensional BOSS Galaxy Clustering and Planck CMB Lensing Cross Correlations via Lagrangian Perturbation Theory
We present a formalism for jointly fitting pre- and post-reconstruction
redshift-space clustering (RSD) and baryon acoustic oscillations (BAO) plus
gravitational lensing (of the CMB) that works directly with the observed
2-point statistics. The formalism is based upon (effective) Lagrangian
perturbation theory and a Lagrangian bias expansion, which models RSD, BAO and
galaxy-lensing cross correlations within a consistent dynamical framework. As
an example we present an analysis of clustering measured by the Baryon
Oscillation Spectroscopic Survey in combination with CMB lensing measured by
Planck. The post-reconstruction BAO strongly constrains the distance-redshift
relation, the full-shape redshift-space clustering constrains the matter
density and growth rate, and CMB lensing constrains the clustering amplitude.
Using only the redshift space data we obtain , and . The addition of
lensing information, even when restricted to the Northern Galactic Cap,
improves constraints to ,
and , in tension with CMB and cosmic shear
constraints. The combination of and are consistent with
Planck, though their constraints derive mostly from redshift-space clustering.
The low value are driven by cross correlations with CMB lensing in
the low redshift bin () and at large angular scales, which show a
deficit compared to expectations from galaxy clustering alone. We
conduct several systematics tests on the data and find none that could fully
explain these tensions.
Comment: 46 pages, 15 figures, updated to match version accepted by JCA
Modeling the reconstructed BAO in Fourier space
The density field reconstruction technique, which was developed to partially
reverse the nonlinear degradation of the Baryon Acoustic Oscillation (BAO)
feature in galaxy redshift surveys, has been successful in substantially
improving the cosmology constraints from recent galaxy surveys such as the Baryon
Oscillation Spectroscopic Survey (BOSS). We estimate the efficiency of the
reconstruction method as a function of various reconstruction details. To
directly quantify the BAO information in nonlinear density fields before and
after reconstruction, we calculate the cross-correlations (i.e., propagators)
of the pre(post)-reconstructed density field with the initial linear field
using a mock galaxy sample that is designed to mimic the clustering of the BOSS
CMASS galaxies. The results directly provide the BAO damping as a function of
wavenumber that can be implemented into the Fisher matrix analysis. We focus on
investigating the dependence of the propagator on the choice of smoothing filter
and on the two major conventions for redshift-space density field
reconstruction that have been used in the literature. By estimating the BAO
signal-to-noise for each case, we predict constraints on the angular diameter
distance and Hubble parameter using the Fisher matrix analysis. We thus
determine an optimal Gaussian smoothing filter scale for the signal-to-noise
level of the BOSS CMASS. We also present appropriate BAO fitting models for
different reconstruction methods based on the first and second order Lagrangian
perturbation theory in Fourier space. Using the mock data, we show that the
modified BAO fitting model can substantially improve the accuracy of the BAO
position in the best fits as well as the goodness of the fits.
Comment: 21 pages, 7 figures, 1 table. Minor revisions. Matches version
accepted by MNRA
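The propagator mentioned in the abstract above, the cross-correlation of an evolved or reconstructed density field with the initial linear field, could be measured along the following lines. This is a sketch under stated assumptions: the normalization by the linear auto-power is a common convention that the abstract does not spell out, the grid is assumed to be cubic and periodic, and the function name and arguments (`box_size`, `n_bins`) are hypothetical.

```python
import numpy as np

def propagator(delta_final, delta_linear, box_size, n_bins):
    """Binned C(k) = <delta_final * conj(delta_linear)> / <|delta_linear|^2>."""
    n = delta_linear.shape[0]  # assumes an n x n x n grid
    fk_final = np.fft.fftn(delta_final)
    fk_lin = np.fft.fftn(delta_linear)
    # Wavenumber magnitude for every cell of the FFT grid.
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    kmag = np.sqrt(kx ** 2 + ky ** 2 + kz ** 2).ravel()
    cross = (fk_final * np.conj(fk_lin)).real.ravel()
    auto = (np.abs(fk_lin) ** 2).ravel()
    # Accumulate cross- and auto-power in linearly spaced bins of |k|.
    edges = np.linspace(0.0, kmag.max(), n_bins + 1)
    idx = np.clip(np.digitize(kmag, edges) - 1, 0, n_bins - 1)
    cross_sum = np.bincount(idx, weights=cross, minlength=n_bins)
    auto_sum = np.bincount(idx, weights=auto, minlength=n_bins)
    k_centers = 0.5 * (edges[:-1] + edges[1:])
    with np.errstate(invalid="ignore", divide="ignore"):
        c_k = cross_sum / auto_sum  # NaN where a bin holds no power
    return k_centers, c_k
```

A field that is an exact rescaling of the linear field gives a constant propagator equal to the scaling factor, while nonlinear evolution makes C(k) fall off with wavenumber. That fall-off is the BAO damping as a function of wavenumber that the abstract feeds into the Fisher matrix analysis.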