
    On Randomly Projected Hierarchical Clustering with Guarantees

    Hierarchical clustering (HC) algorithms are generally limited to small data instances due to their runtime costs. Here we mitigate this shortcoming and explore fast HC algorithms based on random projections for single (SLC) and average (ALC) linkage clustering as well as for the minimum spanning tree problem (MST). We present a thorough adaptive analysis of our algorithms that improve prior work from O(N^2) by up to a factor of N/(log N)^2 for a dataset of N points in Euclidean space. The algorithms maintain, with arbitrarily high probability, the outcome of hierarchical clustering as well as the worst-case running-time guarantees. We also present parameter-free instances of our algorithms. Comment: This version contains the conference paper "On Randomly Projected Hierarchical Clustering with Guarantees", SIAM International Conference on Data Mining (SDM), 2014 and, additionally, proofs omitted in the conference version.
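
To make the quoted improvement concrete, the following minimal sketch evaluates the factor N/(log N)^2 and the cost that results from dividing N^2 by it, namely N (log N)^2. The function names and the choice of base-2 logarithm are assumptions made for illustration; asymptotic notation does not fix the base, and the abstract does not give constants.

```python
import math


def speedup_factor(n: int) -> float:
    """Improvement factor N / (log N)^2 quoted in the abstract.

    Assumption: base-2 logarithm, chosen only for illustration.
    """
    return n / math.log2(n) ** 2


def implied_cost(n: int) -> float:
    """N^2 divided by the speedup factor, which equals N * (log N)^2."""
    return n ** 2 / speedup_factor(n)


if __name__ == "__main__":
    # Compare the quadratic baseline with the implied cost for a few sizes.
    for n in (2 ** 10, 2 ** 15, 2 ** 20):
        print(f"N={n}: N^2={n ** 2}, factor={speedup_factor(n):.2f}, "
              f"implied cost={implied_cost(n):.0f}")
```

The factor grows with N, so the gap to the quadratic baseline widens on larger datasets; the abstract states this as an upper bound ("up to"), not as a guaranteed speedup on every input.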

    Recurrent Pixel Embedding for Instance Grouping

    We introduce a differentiable, end-to-end trainable framework for solving pixel-level grouping problems such as instance segmentation consisting of two novel components. First, we regress pixels into a hyper-spherical embedding space so that pixels from the same group have high cosine similarity while those from different groups have similarity below a specified margin. We analyze the choice of embedding dimension and margin, relating them to theoretical results on the problem of distributing points uniformly on the sphere. Second, to group instances, we utilize a variant of mean-shift clustering, implemented as a recurrent neural network parameterized by kernel bandwidth. This recurrent grouping module is differentiable and enjoys convergent dynamics and probabilistic interpretability. Backpropagating the group-weighted loss through this module allows learning to focus on only correcting embedding errors that won't be resolved during subsequent clustering. Our framework, while conceptually simple and theoretically abundant, is also practically effective and computationally efficient. We demonstrate substantial improvements over state-of-the-art instance segmentation for object proposal generation, and we demonstrate the benefits of grouping loss for classification tasks such as boundary detection and semantic segmentation.
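
The grouping step described above can be illustrated with a minimal NumPy sketch of mean shift on the unit sphere, unrolled for a fixed number of iterations. This is not the authors' implementation: the exponential cosine-similarity kernel, the `bandwidth` and `steps` parameters, and the function name are assumptions chosen for illustration, and the sketch omits the learning and backpropagation parts entirely.

```python
import numpy as np


def spherical_mean_shift(x, bandwidth=10.0, steps=10):
    """Unrolled mean shift on the unit sphere (illustrative sketch).

    x: (n, d) array of embeddings. Returns an (n, d) array of unit vectors.
    Assumption: kernel weights are exp(bandwidth * cosine similarity).
    """
    # Project the inputs onto the unit sphere.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    for _ in range(steps):
        sim = x @ x.T                    # pairwise cosine similarities
        weights = np.exp(bandwidth * sim)
        x = weights @ x                  # kernel-weighted sum of neighbours
        # Re-normalise so every point stays on the sphere.
        x = x / np.linalg.norm(x, axis=1, keepdims=True)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two hypothetical groups scattered around orthogonal directions.
    a = np.array([1.0, 0.0, 0.0]) + 0.05 * rng.standard_normal((5, 3))
    b = np.array([0.0, 1.0, 0.0]) + 0.05 * rng.standard_normal((5, 3))
    shifted = spherical_mean_shift(np.vstack([a, b]))
    print(np.round(shifted @ shifted.T, 3))
```

Each iteration applies the same update, which is why the procedure can be written as a recurrent module; a larger `bandwidth` makes each point attend more narrowly to its nearest neighbours on the sphere.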

    Measuring the escape velocity and mass profiles of galaxy clusters beyond their virial radius

    The caustic technique uses galaxy redshifts alone to measure the escape velocity and mass profiles of galaxy clusters to clustrocentric distances well beyond the virial radius, where dynamical equilibrium does not necessarily hold. We provide a detailed description of this technique and analyse its possible systematic errors. We apply the caustic technique to clusters with mass M_200 >= 10^{14} h^{-1} M_sun extracted from a cosmological hydrodynamic simulation of a LambdaCDM universe. With a few tens of redshifts per squared comoving megaparsec within the cluster, the caustic technique, on average, recovers the profile of the escape velocity from the cluster with better than 10 percent accuracy up to r~4 r_200. The caustic technique also recovers the mass profile with better than 10 percent accuracy in the range (0.6-4) r_200, but it overestimates the mass by up to 70 percent at smaller radii. This overestimate is a consequence of neglecting the radial dependence of the filling function F_beta(r). The 1-sigma uncertainty on individual escape velocity profiles increases from ~20 to ~50 percent when the radius increases from r~0.1 r_200 to ~4 r_200. Individual mass profiles have 1-sigma uncertainty between 40 and 80 percent within the radial range (0.6-4) r_200. We show that the amplitude of these uncertainties is completely due to the assumption of spherical symmetry, which is difficult to drop. Alternatively, we can apply the technique to synthetic clusters obtained by stacking individual clusters: in this case, the 1-sigma uncertainty on the escape velocity profile is smaller than 20 percent out to 4 r_200. The caustic technique thus provides reliable average profiles which extend to regions difficult or impossible to probe with other techniques. Comment: MNRAS accepted, 20 pages