
    Hypergraph Modelling for Geometric Model Fitting

    In this paper, we propose a novel hypergraph-based method (called HF) to fit and segment multi-structural data. HF formulates geometric model fitting as a hypergraph partition problem based on a novel hypergraph model, in which vertices represent data points and hyperedges denote model hypotheses. The hypergraph, with large and "data-determined" degrees of hyperedges, can express the complex relationships between model hypotheses and data points. In addition, we develop a robust hypergraph partition algorithm to detect sub-hypergraphs for model fitting. HF can effectively and efficiently estimate both the number and the parameters of model instances simultaneously in multi-structural data heavily corrupted with outliers. Experimental results show the advantages of the proposed method over previous methods on both synthetic data and real images. Comment: Pattern Recognition, 201
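To make the vertex/hyperedge mapping concrete, here is a minimal sketch for 2D line fitting. It is not the HF algorithm itself: the helper names `fit_line` and `build_hypergraph` and the fixed `inlier_threshold` are assumptions introduced for illustration. Each hypothesis becomes one hyperedge containing every point it explains, so the degree of a hyperedge is determined by the data rather than fixed in advance.

```python
def fit_line(p, q):
    """Line a*x + b*y + c = 0 through p and q, scaled so a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0:
        raise ValueError("the two points must be distinct")
    a, b = a / norm, b / norm
    c = -(a * x1 + b * y1)
    return a, b, c


def build_hypergraph(points, hypotheses, inlier_threshold):
    """Return one hyperedge (a frozenset of point indices) per hypothesis.

    Vertices are the point indices. A point joins a hyperedge when its
    distance to that hypothesis' line is within inlier_threshold
    (an assumed, fixed threshold chosen only for this sketch).
    """
    hyperedges = []
    for a, b, c in hypotheses:
        edge = frozenset(
            i for i, (x, y) in enumerate(points)
            if abs(a * x + b * y + c) <= inlier_threshold
        )
        hyperedges.append(edge)
    return hyperedges
```

With points drawn from two lines, each hypothesis collects only the points of its own structure, and a partition step would then separate the hyperedges into one group per model instance.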

    A survey of outlier detection methodologies

    Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of Computer Science and Statistics. In this paper, we present a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.

    To Index or Not to Index: Optimizing Exact Maximum Inner Product Search

    Exact Maximum Inner Product Search (MIPS) is an important task that is widely pertinent to recommender systems and high-dimensional similarity search. The brute-force approach to solving exact MIPS is computationally expensive, which has spurred the recent development of novel indexes and pruning techniques for this task. In this paper, we show that a hardware-efficient brute-force approach, blocked matrix multiply (BMM), can outperform the state-of-the-art MIPS solvers by over an order of magnitude for some, but not all, inputs. We also present a novel MIPS solution, MAXIMUS, that takes advantage of hardware efficiency and pruning of the search space. Like BMM, MAXIMUS is faster than other solvers by up to an order of magnitude, but again only for some inputs. Since no single solution offers the best runtime performance for all inputs, we introduce a new data-dependent optimizer, OPTIMUS, that selects online, with minimal overhead, the best MIPS solver for a given input. Together, OPTIMUS and MAXIMUS outperform state-of-the-art MIPS solvers by 3.2× on average, and up to 10.9×, on widely studied MIPS datasets. Comment: 12 pages, 8 figures, 2 tables
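The brute-force baseline described above can be sketched as follows. This is an illustration of the blocking structure only, not the authors' BMM implementation: the function name `blocked_mips` and the `block_size` parameter are assumptions, and the pure-Python inner loop stands in for the dense matrix product that a hardware-efficient version would hand to an optimized matrix-multiply routine.

```python
import heapq


def blocked_mips(users, items, k, block_size=2):
    """Exact top-k inner product search by brute force, block by block.

    users and items are lists of equal-length vectors. For each user the
    result holds up to k (score, item_index) pairs in descending order.
    """
    results = []
    for start in range(0, len(users), block_size):
        block = users[start:start + block_size]
        # Dense block product: scores[u][i] = <block[u], items[i]>.
        scores = [
            [sum(a * b for a, b in zip(u, v)) for v in items]
            for u in block
        ]
        # Keep only the k largest scores of each row.
        for row in scores:
            top = heapq.nlargest(k, ((s, i) for i, s in enumerate(row)))
            results.append(top)
    return results
```

Because every user is scored against every item, the result is exact; the only tuning knob in this sketch is how many users are processed per block.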

    Consistent procedures for cluster tree estimation and pruning

    For a density $f$ on $\mathbb{R}^d$, a high-density cluster is any connected component of $\{x : f(x) \geq \lambda\}$, for some $\lambda > 0$. The set of all high-density clusters forms a hierarchy called the cluster tree of $f$. We present two procedures for estimating the cluster tree given samples from $f$. The first is a robust variant of the single linkage algorithm for hierarchical clustering. The second is based on the $k$-nearest neighbor graph of the samples. We give finite-sample convergence rates for these algorithms which also imply consistency, and we derive lower bounds on the sample complexity of cluster tree estimation. Finally, we study a tree pruning procedure that guarantees, under milder conditions than usual, to remove clusters that are spurious while recovering those that are salient.
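One way to picture an empirical level of the cluster tree is sketched below. The construction is an assumption in the spirit of a robust single linkage variant, not a transcription of the paper's procedures: the function names and the parameters `k` and `alpha` are introduced here for illustration. At scale `r`, only samples whose `k`-th nearest neighbor lies within `r` are kept (a proxy for high density), and kept samples closer than `alpha * r` are linked; the connected components are the clusters at that scale.

```python
import math


def knn_radius(points, k):
    """Distance from each point to its k-th nearest other point."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii


def clusters_at_scale(points, r, k=2, alpha=1.0):
    """Connected components among dense points at scale r (assumed sketch)."""
    radii = knn_radius(points, k)
    active = [i for i, rk in enumerate(radii) if rk <= r]
    parent = {i: i for i in active}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in active:
        for b in active:
            if a < b and math.dist(points[a], points[b]) <= alpha * r:
                parent[find(a)] = find(b)

    groups = {}
    for i in active:
        groups.setdefault(find(i), set()).add(i)
    return sorted(groups.values(), key=min)
```

Sweeping `r` from small to large yields nested groupings: small scales keep only the densest samples in separate clusters, and larger scales merge them, which is the hierarchy the cluster tree describes.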

    An indoor variance-based localization technique utilizing the UWB estimation of geometrical propagation parameters

    A novel localization framework is presented based on ultra-wideband (UWB) channel sounding, employing a triangulation method that uses the geometrical properties of propagation paths, such as time delay of arrival, angle of departure, angle of arrival, and their estimated variances. To extract these parameters from the UWB sounding data, an extension to the high-resolution RiMAX algorithm was developed, facilitating the analysis of these frequency-dependent multipath parameters. The framework was then tested by performing indoor measurements with a vector network analyzer and virtual antenna arrays. The estimated means and variances of these geometrical parameters were used to generate multiple sample sets of input values for the localization framework. In addition, we consider the existence of multiple possible target locations, which were subsequently clustered using a Kim-Parks algorithm, resulting in a more robust estimation of each target node. Measurements reveal that the proposed technique achieves an average accuracy of 0.26, 0.28, and 0.90 m in line-of-sight (LoS), obstructed-LoS, and non-LoS scenarios, respectively, using only a single beacon node. Moreover, using the estimated variances of the multipath parameters significantly improved the location estimate compared to using only their estimated mean values.