234,870 research outputs found

    Robust hierarchical k-center clustering

    Get PDF
    One of the most popular and widely used methods for data clustering is hierarchical clustering. This clustering technique has proved useful to reveal interesting structure in the data in several applications ranging from computational biology to computer vision. Robustness is an important feature of a clustering technique if we require the clustering to be stable against small perturbations in the input data. In most applications, getting a clustering output that is robust against adversarial outliers or stochastic noise is a necessary condition for the applicability and effectiveness of the clustering technique. This is even more critical in hierarchical clustering where a small change at the bottom of the hierarchy may propagate all the way through to the top. Despite all the previous work [2, 3, 6, 8], our theoretical understanding of robust hierarchical clustering is still limited and several hierarchical clustering algorithms are not known to satisfy such robustness properties. In this paper, we study the limits of robust hierarchical k-center clustering by introducing the concept of universal hierarchical clustering and provide (almost) tight lower and upper bounds for the robust hierarchical k-center clustering problem with outliers and variants of the stochastic clustering problem. Most importantly we present a constant-factor approximation for optimal hierarchical k-center with at most z outliers using a universal set of at most O(z2) set of outliers and show that this result is tight. Moreover we show the necessity of using a universal set of outliers in order to compute an approximately optimal hierarchical k-center with a diffierent set of outliers for each k

    Control and Filtering for Discrete Linear Repetitive Processes with H infty and ell 2--ell infty Performance

    No full text
    Repetitive processes are characterized by a series of sweeps, termed passes, through a set of dynamics defined over a finite duration known as the pass length. On each pass an output, termed the pass profile, is produced which acts as a forcing function on, and hence contributes to, the dynamics of the next pass profile. This can lead to oscillations which increase in amplitude in the pass to pass direction and cannot be controlled by standard control laws. Here we give new results on the design of physically based control laws for the sub-class of so-called discrete linear repetitive processes which arise in applications areas such as iterative learning control. The main contribution is to show how control law design can be undertaken within the framework of a general robust filtering problem with guaranteed levels of performance. In particular, we develop algorithms for the design of an H? and 2\ell_{2}–\ell_{\infty} dynamic output feedback controller and filter which guarantees that the resulting controlled (filtering error) process, respectively, is stable along the pass and has prescribed disturbance attenuation performance as measured by HH_{\infty} and 2\ell_{2}\ell_{\infty} norms

    Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches

    Get PDF
    Imaging spectrometers measure electromagnetic energy scattered in their instantaneous field view in hundreds or thousands of spectral channels with higher spectral resolution than multispectral cameras. Imaging spectrometers are therefore often referred to as hyperspectral cameras (HSCs). Higher spectral resolution enables material identification via spectroscopic analysis, which facilitates countless applications that require identifying materials in scenarios unsuitable for classical spectroscopic analysis. Due to low spatial resolution of HSCs, microscopic material mixing, and multiple scattering, spectra measured by HSCs are mixtures of spectra of materials in a scene. Thus, accurate estimation requires unmixing. Pixels are assumed to be mixtures of a few materials, called endmembers. Unmixing involves estimating all or some of: the number of endmembers, their spectral signatures, and their abundances at each pixel. Unmixing is a challenging, ill-posed inverse problem because of model inaccuracies, observation noise, environmental conditions, endmember variability, and data set size. Researchers have devised and investigated many models searching for robust, stable, tractable, and accurate unmixing algorithms. This paper presents an overview of unmixing methods from the time of Keshava and Mustard's unmixing tutorial [1] to the present. Mixing models are first discussed. Signal-subspace, geometrical, statistical, sparsity-based, and spatial-contextual unmixing algorithms are described. Mathematical problems and potential solutions are described. Algorithm characteristics are illustrated experimentally.Comment: This work has been accepted for publication in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensin

    Accelerating Cosmic Microwave Background map-making procedure through preconditioning

    Get PDF
    Estimation of the sky signal from sequences of time ordered data is one of the key steps in Cosmic Microwave Background (CMB) data analysis, commonly referred to as the map-making problem. Some of the most popular and general methods proposed for this problem involve solving generalised least squares (GLS) equations with non-diagonal noise weights given by a block-diagonal matrix with Toeplitz blocks. In this work we study new map-making solvers potentially suitable for applications to the largest anticipated data sets. They are based on iterative conjugate gradient (CG) approaches enhanced with novel, parallel, two-level preconditioners. We apply the proposed solvers to examples of simulated non-polarised and polarised CMB observations, and a set of idealised scanning strategies with sky coverage ranging from nearly a full sky down to small sky patches. We discuss in detail their implementation for massively parallel computational platforms and their performance for a broad range of parameters characterising the simulated data sets. We find that our best new solver can outperform carefully-optimised standard solvers used today by a factor of as much as 5 in terms of the convergence rate and a factor of up to 44 in terms of the time to solution, and to do so without significantly increasing the memory consumption and the volume of inter-processor communication. The performance of the new algorithms is also found to be more stable and robust, and less dependent on specific characteristics of the analysed data set. We therefore conclude that the proposed approaches are well suited to address successfully challenges posed by new and forthcoming CMB data sets.Comment: 19 pages // Final version submitted to A&

    Accelerating Cosmic Microwave Background map-making procedure through preconditioning

    Get PDF
    Estimation of the sky signal from sequences of time ordered data is one of the key steps in Cosmic Microwave Background (CMB) data analysis, commonly referred to as the map-making problem. Some of the most popular and general methods proposed for this problem involve solving generalised least squares (GLS) equations with non-diagonal noise weights given by a block-diagonal matrix with Toeplitz blocks. In this work we study new map-making solvers potentially suitable for applications to the largest anticipated data sets. They are based on iterative conjugate gradient (CG) approaches enhanced with novel, parallel, two-level preconditioners. We apply the proposed solvers to examples of simulated non-polarised and polarised CMB observations, and a set of idealised scanning strategies with sky coverage ranging from nearly a full sky down to small sky patches. We discuss in detail their implementation for massively parallel computational platforms and their performance for a broad range of parameters characterising the simulated data sets. We find that our best new solver can outperform carefully-optimised standard solvers used today by a factor of as much as 5 in terms of the convergence rate and a factor of up to 44 in terms of the time to solution, and to do so without significantly increasing the memory consumption and the volume of inter-processor communication. The performance of the new algorithms is also found to be more stable and robust, and less dependent on specific characteristics of the analysed data set. We therefore conclude that the proposed approaches are well suited to address successfully challenges posed by new and forthcoming CMB data sets.Comment: 19 pages // Final version submitted to A&
    corecore