    Consistency, breakdown robustness, and algorithms for robust improper maximum likelihood clustering

    The robust improper maximum likelihood estimator (RIMLE) is a new method for robust multivariate clustering that finds approximately Gaussian clusters. It maximizes a pseudo-likelihood defined by adding a component with improper constant density to a Gaussian mixture in order to accommodate outliers. A special case of the RIMLE is the MLE for multivariate finite Gaussian mixture models. In this paper we treat existence, consistency, and breakdown theory for the RIMLE comprehensively. RIMLE's existence is proved under non-smooth covariance matrix constraints. It is shown that these can be implemented via a computationally feasible Expectation-Conditional Maximization algorithm. Comment: The title of this paper was originally: "A consistent and breakdown robust model-based clustering method
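As a rough illustration of the pseudo-likelihood described in this abstract, the sketch below averages the log of a Gaussian mixture density plus a constant improper density `delta` weighted by a noise proportion `pi0`. The function and parameter names are hypothetical and chosen for illustration only. The sketch evaluates the objective but does not implement the paper's covariance constraints or its Expectation-Conditional Maximization algorithm.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate Gaussian density evaluated at each row of x."""
    d = x.shape[1]
    diff = x - mean
    # Squared Mahalanobis distance of each row, without forming the inverse.
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(-0.5 * (maha + logdet + d * np.log(2.0 * np.pi)))

def pseudo_loglik(x, delta, pi0, weights, means, covs):
    """Average log pseudo-density: pi0 * delta + sum_j weights[j] * N(x; mean_j, cov_j).

    Assumption: pi0 + sum(weights) == 1 and delta >= 0 is a fixed constant.
    """
    density = np.full(x.shape[0], pi0 * delta)
    for w, m, c in zip(weights, means, covs):
        density += w * gaussian_pdf(x, np.asarray(m), np.asarray(c))
    return np.mean(np.log(density))
```

Setting `pi0 = 0` removes the improper component, so the function reduces to the ordinary Gaussian mixture log-likelihood (averaged over observations), which matches the special case mentioned in the abstract.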

    Fast multi-image matching via density-based clustering

    We consider the problem of finding consistent matches across multiple images. Previous state-of-the-art solutions use constraints on cycles of matches together with convex optimization, leading to computationally intensive iterative algorithms. In this paper, we propose a clustering-based formulation. We first rigorously show its equivalence with the previous one, and then propose QuickMatch, a novel algorithm that identifies multi-image matches from a density function in feature space. We use the density to order the points in a tree, and then extract the matches by breaking this tree using feature distances and measures of distinctiveness. Our algorithm outperforms previous state-of-the-art methods (such as MatchALS) in accuracy, is significantly faster (up to 62 times faster on some benchmarks), and can scale to large datasets (with more than twenty thousand features).

    The impact of baryonic processes on the two-point correlation functions of galaxies, subhaloes and matter

    The observed clustering of galaxies and the cross-correlation of galaxies and mass provide important constraints on both cosmology and models of galaxy formation. Even though the dissipation and feedback processes associated with galaxy formation are thought to affect the distribution of matter, essentially all models used to predict clustering data are based on collisionless simulations. Here, we use large hydrodynamical simulations to investigate how galaxy formation affects the autocorrelation functions of galaxies and subhaloes, as well as their cross-correlation with matter. We show that the changes due to the inclusion of baryons are not limited to small scales and are even present in samples selected by subhalo mass. Samples selected by subhalo mass cluster ~10% more strongly in a baryonic run on scales r > 1 Mpc/h, and this difference increases for smaller separations. While the inclusion of baryons boosts the clustering at fixed subhalo mass on all scales, the sign of the effect on the cross-correlation of subhaloes with matter can vary with radius. We show that the large-scale effects are due to the change in subhalo mass caused by the strong feedback associated with galaxy formation and may therefore not affect samples selected by number density. However, on scales r < r_vir significant differences remain after accounting for the change in subhalo mass. We conclude that predictions for galaxy-galaxy and galaxy-mass clustering from models based on collisionless simulations will have errors greater than 10% on sub-Mpc scales, unless the simulation results are modified to correctly account for the effects of baryons on the distributions of mass and satellites. Comment: 15 pages, 9 figures. Replaced to match the version accepted by MNRA

    Robust improper maximum likelihood: tuning, computation, and a comparison with other methods for robust Gaussian clustering

    The two main topics of this paper are the introduction of the "optimally tuned improper maximum likelihood estimator" (OTRIMLE) for robust clustering based on the multivariate Gaussian model for clusters, and a comprehensive simulation study comparing the OTRIMLE to Maximum Likelihood in Gaussian mixtures with and without noise component, mixtures of t-distributions, and the TCLUST approach for trimmed clustering. The OTRIMLE uses an improper constant density for modelling outliers and noise. This can be chosen optimally so that the non-noise part of the data looks as close to a Gaussian mixture as possible. Some deviation from Gaussianity can be traded in for lowering the estimated noise proportion. Covariance matrix constraints and computation of the OTRIMLE are also treated. In the simulation study, all methods are confronted with setups in which their model assumptions are not exactly fulfilled. In order to evaluate the experiments in a standardized way by misclassification rates, a new model-based definition of "true clusters" is introduced that deviates from the usual identification of mixture components with clusters. In the study, every method turns out to be superior for one or more setups, but the OTRIMLE achieves the most satisfactory overall performance. The methods are also applied to two real datasets, one without and one with known "true" clusters.

    Cosmological Analysis of Three-Dimensional BOSS Galaxy Clustering and Planck CMB Lensing Cross Correlations via Lagrangian Perturbation Theory

    We present a formalism for jointly fitting pre- and post-reconstruction redshift-space clustering (RSD) and baryon acoustic oscillations (BAO) plus gravitational lensing (of the CMB) that works directly with the observed 2-point statistics. The formalism is based upon (effective) Lagrangian perturbation theory and a Lagrangian bias expansion, which models RSD, BAO and galaxy-lensing cross correlations within a consistent dynamical framework. As an example we present an analysis of clustering measured by the Baryon Oscillation Spectroscopic Survey in combination with CMB lensing measured by Planck. The post-reconstruction BAO strongly constrains the distance-redshift relation, the full-shape redshift-space clustering constrains the matter density and growth rate, and CMB lensing constrains the clustering amplitude. Using only the redshift space data we obtain $\Omega_m = 0.303 \pm 0.008$, $H_0 = 69.21 \pm 0.78$ and $\sigma_8 = 0.743 \pm 0.043$. The addition of lensing information, even when restricted to the Northern Galactic Cap, improves constraints to $\Omega_m = 0.300 \pm 0.008$, $H_0 = 69.21 \pm 0.77$ and $\sigma_8 = 0.707 \pm 0.035$, in tension with CMB and cosmic shear constraints. The combination of $\Omega_m$ and $H_0$ is consistent with Planck, though their constraints derive mostly from redshift-space clustering. The low $\sigma_8$ value is driven by cross correlations with CMB lensing in the low redshift bin ($z \simeq 0.38$) and at large angular scales, which show a $20\%$ deficit compared to expectations from galaxy clustering alone. We conduct several systematics tests on the data and find none that could fully explain these tensions. Comment: 46 pages, 15 figures, updated to match version accepted by JCA

    Modeling the reconstructed BAO in Fourier space

    The density field reconstruction technique, which was developed to partially reverse the nonlinear degradation of the Baryon Acoustic Oscillation (BAO) feature in galaxy redshift surveys, has been successful in substantially improving the cosmology constraints from recent galaxy surveys such as the Baryon Oscillation Spectroscopic Survey (BOSS). We estimate the efficiency of the reconstruction method as a function of various reconstruction details. To directly quantify the BAO information in nonlinear density fields before and after reconstruction, we calculate the cross-correlations (i.e., propagators) of the pre(post)-reconstructed density field with the initial linear field using a mock galaxy sample that is designed to mimic the clustering of the BOSS CMASS galaxies. The results directly provide the BAO damping as a function of wavenumber that can be implemented into the Fisher matrix analysis. We focus on investigating the dependence of the propagator on the choice of smoothing filter and on the two major conventions of redshift-space density field reconstruction that have been used in the literature. By estimating the BAO signal-to-noise for each case, we predict constraints on the angular diameter distance and Hubble parameter using the Fisher matrix analysis. We thus determine an optimal Gaussian smoothing filter scale for the signal-to-noise level of the BOSS CMASS. We also present appropriate BAO fitting models for different reconstruction methods based on the first and second order Lagrangian perturbation theory in Fourier space. Using the mock data, we show that the modified BAO fitting model can substantially improve the accuracy of the BAO position in the best fits as well as the goodness of the fits. Comment: 21 pages, 7 figures, 1 table. Minor revisions. Matches version accepted by MNRA
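The propagator mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common convention G(k) = <delta_final delta_init*> / <delta_init delta_init*>, averaged in spherical shells of wavenumber; the paper's exact normalization (for example, any bias factor) may differ. The function name `propagator`, the cubic-grid inputs and the linear binning are all illustrative assumptions.

```python
import numpy as np

def propagator(delta_final, delta_init, box_size, n_bins=8):
    """Shell-averaged cross-correlation of two density grids, normalized
    by the auto-power of the initial field (an assumed convention)."""
    n = delta_init.shape[0]  # assumes a cubic n x n x n grid
    fk_final = np.fft.fftn(delta_final)
    fk_init = np.fft.fftn(delta_init)

    # Wavenumber magnitude of every Fourier mode on the grid.
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2 + kz**2).ravel()

    cross = (fk_final * np.conj(fk_init)).real.ravel()
    auto = (np.abs(fk_init) ** 2).ravel()

    # Drop the k = 0 mode, then bin the remaining modes linearly in |k|.
    keep = kmag > 0
    edges = np.linspace(0.0, kmag.max(), n_bins + 1)
    idx = np.digitize(kmag[keep], edges[1:-1])
    cross_sum = np.bincount(idx, weights=cross[keep], minlength=n_bins)
    auto_sum = np.bincount(idx, weights=auto[keep], minlength=n_bins)

    # Bins with no power in the initial field are left as NaN.
    g = np.full(n_bins, np.nan)
    ok = auto_sum > 0
    g[ok] = cross_sum[ok] / auto_sum[ok]
    k_centers = 0.5 * (edges[:-1] + edges[1:])
    return k_centers, g
```

With this convention, a final field that is an exact copy of the initial field gives G(k) = 1 in every populated bin, and damping of the BAO feature shows up as G(k) falling below 1 at higher wavenumber.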