Hypergraph Modelling for Geometric Model Fitting
In this paper, we propose a novel hypergraph based method (called HF) to fit
and segment multi-structural data. The proposed HF formulates the geometric
model fitting problem as a hypergraph partition problem based on a novel
hypergraph model. In the hypergraph model, vertices represent data points and
hyperedges denote model hypotheses. The hypergraph, with large and
"data-determined" degrees of hyperedges, can express the complex relationships
between model hypotheses and data points. In addition, we develop a robust
hypergraph partition algorithm to detect sub-hypergraphs for model fitting. HF
can effectively and efficiently estimate both the number and the parameters of
model instances in multi-structural data heavily corrupted by outliers.
Experimental results show the advantages of the proposed method
over previous methods on both synthetic data and real images. Comment: Pattern Recognition, 201
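The hypergraph model in the abstract above (vertices as data points, hyperedges as model hypotheses) can be illustrated with a minimal sketch. It assumes 2D points and line hypotheses sampled from random point pairs. The function names, the inlier `threshold`, and the sampling scheme are hypothetical choices for illustration, and the partition step of HF is not shown.

```python
import random

def line_through(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def build_hypergraph(points, num_hypotheses, threshold, seed=0):
    """Each hyperedge is the set of point indices close to one line hypothesis.

    Assumption: a hypothesis is a line through two randomly sampled points,
    and a point belongs to the hyperedge if its distance to the line is at
    most `threshold`. The hyperedge degree is therefore set by the data.
    """
    rng = random.Random(seed)
    hyperedges = []
    for _ in range(num_hypotheses):
        i, j = rng.sample(range(len(points)), 2)
        if points[i] == points[j]:
            continue  # a degenerate pair does not define a line
        a, b, c = line_through(points[i], points[j])
        members = frozenset(
            k for k, (x, y) in enumerate(points)
            if abs(a * x + b * y + c) <= threshold
        )
        hyperedges.append(members)
    return hyperedges
```

In this sketch, a hypothesis that happens to lie on a true structure collects many inliers and so yields a high-degree hyperedge, while hypotheses through outliers stay small.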
A survey of outlier detection methodologies
Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques, drawn from the full gamut of Computer Science and Statistics, are now used. In this paper, we present a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.
To Index or Not to Index: Optimizing Exact Maximum Inner Product Search
Exact Maximum Inner Product Search (MIPS) is an important task that is widely
pertinent to recommender systems and high-dimensional similarity search. The
brute-force approach to solving exact MIPS is computationally expensive, thus
spurring recent development of novel indexes and pruning techniques for this
task. In this paper, we show that a hardware-efficient brute-force approach,
blocked matrix multiply (BMM), can outperform the state-of-the-art MIPS solvers
by over an order of magnitude, for some -- but not all -- inputs.
In this paper, we also present a novel MIPS solution, MAXIMUS, that takes
advantage of hardware efficiency and pruning of the search space. Like BMM,
MAXIMUS is faster than other solvers by up to an order of magnitude, but again
only for some inputs. Since no single solution offers the best runtime
performance for all inputs, we introduce a new data-dependent optimizer,
OPTIMUS, that selects online with minimal overhead the best MIPS solver for a
given input. Together, OPTIMUS and MAXIMUS outperform state-of-the-art MIPS
solvers by a factor of 3.2 on average, and up to 10.9, on widely studied
MIPS datasets. Comment: 12 pages, 8 figures, 2 tables
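The brute-force baseline mentioned in the abstract above, blocked matrix multiply, can be sketched as follows. This is a minimal NumPy illustration, not the authors' BMM implementation: the function name `mips_blocked`, the `block_size` default, and the top-k selection via `argpartition` are assumptions made for the example.

```python
import numpy as np

def mips_blocked(users, items, k, block_size=1024):
    """Exact top-k inner products for every user row, one block at a time.

    users: (n_users, d) array, items: (n_items, d) array, k <= n_items.
    Returns an (n_users, k) array of item indices, best score first.
    """
    n_users = users.shape[0]
    top_idx = np.empty((n_users, k), dtype=np.int64)
    for start in range(0, n_users, block_size):
        block = users[start:start + block_size]
        # One dense matrix multiply scores the whole block against all items.
        scores = block @ items.T
        # Select the k largest scores per row without a full sort.
        part = np.argpartition(-scores, k - 1, axis=1)[:, :k]
        part_scores = np.take_along_axis(scores, part, axis=1)
        # Order just those k candidates from highest to lowest score.
        order = np.argsort(-part_scores, axis=1)
        top_idx[start:start + block_size] = np.take_along_axis(part, order, axis=1)
    return top_idx
```

Processing users in blocks keeps the intermediate score matrix bounded in memory while still letting the underlying matrix-multiply routine do the heavy lifting, which is the hardware-efficiency argument the abstract makes.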
Consistent procedures for cluster tree estimation and pruning
For a density $f$ on $\mathbb{R}^d$, a {\it high-density cluster} is any
connected component of $\{x : f(x) \geq \lambda\}$, for some $\lambda > 0$. The
set of all high-density clusters forms a hierarchy called the {\it cluster
tree} of $f$. We present two procedures for estimating the cluster tree given
samples from $f$. The first is a robust variant of the single linkage algorithm
for hierarchical clustering. The second is based on the $k$-nearest neighbor
graph of the samples. We give finite-sample convergence rates for these
algorithms which also imply consistency, and we derive lower bounds on the
sample complexity of cluster tree estimation. Finally, we study a tree pruning
procedure that is guaranteed, under milder conditions than usual, to remove
spurious clusters while recovering salient ones.
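One way to picture a robust variant of single linkage is sketched below. The specific rule used here (keep only points whose k-th nearest-neighbour distance is at most `r`, then link kept points closer than `alpha * r`) is an assumption about the form such a procedure can take; the abstract does not give these details, and the names `knn_radius`, `clusters_at_scale`, `k`, `r`, and `alpha` are hypothetical.

```python
import math

def knn_radius(points, k):
    """Distance from each point to its k-th nearest neighbour (itself excluded)."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii

def clusters_at_scale(points, k, r, alpha=1.0):
    """Connected components among points that look dense at scale r."""
    radii = knn_radius(points, k)
    # Discard points in sparse regions; this is what makes the linkage robust.
    kept = [i for i, rk in enumerate(radii) if rk <= r]
    parent = {i: i for i in kept}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link every pair of kept points that lie within alpha * r of each other.
    for pos, a in enumerate(kept):
        for b in kept[pos + 1:]:
            if math.dist(points[a], points[b]) <= alpha * r:
                parent[find(a)] = find(b)

    groups = {}
    for i in kept:
        groups.setdefault(find(i), set()).add(i)
    return sorted(groups.values(), key=min)
```

Sweeping `r` from small to large and recording how the components merge gives a hierarchy, which is the sense in which such a procedure estimates a tree rather than a single clustering.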
An indoor variance-based localization technique utilizing the UWB estimation of geometrical propagation parameters
A novel localization framework is presented based on ultra-wideband (UWB) channel sounding, employing a triangulation method using the geometrical properties of propagation paths, such as time delay of arrival, angle of departure, angle of arrival, and their estimated variances. In order to extract these parameters from the UWB sounding data, an extension to the high-resolution RiMAX algorithm was developed, facilitating the analysis of these frequency-dependent multipath parameters. This framework was then tested by performing indoor measurements with a vector network analyzer and virtual antenna arrays. The estimated means and variances of these geometrical parameters were utilized to generate multiple sample sets of input values for our localization framework. In addition, we consider the existence of multiple possible target locations, which were subsequently clustered using a Kim-Parks algorithm, resulting in a more robust estimation of each target node. Measurements reveal that our newly proposed technique achieves an average accuracy of 0.26, 0.28, and 0.90 m in line-of-sight (LoS), obstructed-LoS, and non-LoS scenarios, respectively, using only a single beacon node. Moreover, utilizing the estimated variances of the multipath parameters proved to enhance the location estimation significantly compared to only utilizing their estimated mean values.