Incremental non-dominated sorting with O(N) insertion for the two-dimensional case.
We propose a new algorithm for incremental non-dominated sorting of two-dimensional points. The data structure that stores the non-dominating layers is based on a tree of Cartesian trees. If there are N points in M layers, the running time of an insertion is O(M(1 + log(N/M)) + log M log(N/log M)), which is O(N) in the worst case. This algorithm can be a basic building block for efficient implementations of steady-state multiobjective algorithms such as NSGA-II.
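To make the notion of non-dominating layers concrete, here is a minimal batch sketch in Python. It is not the incremental tree-of-Cartesian-trees structure from the abstract; it is a plain sort-and-binary-search pass that assumes both coordinates are minimized and that identical points do not dominate each other. The name `nondominated_layers` is made up for this illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def nondominated_layers(points: List[Point]) -> List[List[Point]]:
    """Batch non-dominated sorting of 2-D points (assumption: minimize both)."""
    layers: List[List[Point]] = []
    # Lexicographic order guarantees every earlier point has x <= p.x.
    for p in sorted(points):
        # A layer dominates p exactly when its most recently added point
        # (the one with the smallest y in that layer) dominates p.
        # This predicate is monotone over layers, so binary search applies.
        lo, hi = 0, len(layers)
        while lo < hi:
            mid = (lo + hi) // 2
            q = layers[mid][-1]
            if q[1] <= p[1] and q != p:  # q dominates p
                lo = mid + 1
            else:
                hi = mid
        if lo == len(layers):
            layers.append([p])
        else:
            layers[lo].append(p)
    return layers


if __name__ == "__main__":
    pts = [(1, 5), (2, 3), (3, 4), (4, 1), (2, 6), (5, 5)]
    for rank, layer in enumerate(nondominated_layers(pts)):
        print(rank, layer)
```

Re-running this batch pass after every insertion would cost O(N log N) per update, which is the kind of overhead an incremental structure is meant to avoid.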
Fast Algorithms for the Computation of Ranklets
Ranklets are orientation-selective rank features with applications to tracking, face detection, texture and medical imaging. We introduce efficient algorithms that reduce their computational complexity from O(N log N) to O(√N + k), where N is the area of the filter. Timing tests show a speedup of one order of magnitude for typical usage, which should make Ranklets attractive for real-time applications.
StreamLearner: Distributed Incremental Machine Learning on Event Streams: Grand Challenge
Today, massive amounts of streaming data from smart devices need to be
analyzed automatically to realize the Internet of Things. The Complex Event
Processing (CEP) paradigm promises low-latency pattern detection on event
streams. However, CEP systems need to be extended with Machine Learning (ML)
capabilities such as online training and inference in order to be able to
detect fuzzy patterns (e.g., outliers) and to improve pattern recognition
accuracy during runtime using incremental model training. In this paper, we
propose a distributed CEP system denoted as StreamLearner for ML-enabled
complex event detection. The proposed programming model and data-parallel
system architecture enable a wide range of real-world applications and allow
for dynamically scaling up and out system resources for low-latency,
high-throughput event processing. We show that the DEBS Grand Challenge 2017
case study (i.e., anomaly detection in smart factories) integrates seamlessly
into the StreamLearner API. Our experiments verify scalability and high event
throughput of StreamLearner.
Comment: Christian Mayer, Ruben Mayer, and Majd Abdo. 2017. StreamLearner:
Distributed Incremental Machine Learning on Event Streams: Grand Challenge.
In Proceedings of the 11th ACM International Conference on Distributed and
Event-based Systems (DEBS '17), 298-30
From Proximity to Utility: A Voronoi Partition of Pareto Optima
We present an extension of Voronoi diagrams where when considering which site
a client is going to use, in addition to the site distances, other site
attributes are also considered (for example, prices or weights). A cell in this
diagram is then the locus of all clients that consider the same set of sites to
be relevant. In particular, the precise site a client might use from this
candidate set depends on parameters that might change between usages, and the
candidate set lists all of the relevant sites. The resulting diagram is
significantly more expressive than Voronoi diagrams, but naturally has the
drawback that its complexity, even in the plane, might be quite high.
Nevertheless, we show that if the attributes of the sites are drawn from the
same distribution (note that the locations are fixed), then the expected
complexity of the candidate diagram is near linear.
To this end, we derive several new technical results, which are of
independent interest. In particular, we provide a high-probability,
asymptotically optimal bound on the number of Pareto optima points in a point
set uniformly sampled from the d-dimensional hypercube. To do so we revisit
the classical backward analysis technique, both simplifying and improving
relevant results in order to achieve the high-probability bounds.
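The quantity bounded in this abstract, the number of Pareto optima in a uniform sample from the hypercube, can be explored empirically. The sketch below is a naive O(n^2) filter plus a small simulation, written under the assumption that larger coordinates are better. The function names and the fixed seed are illustrative choices, not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as large as b in every coordinate and a != b."""
    return all(x >= y for x, y in zip(a, b)) and tuple(a) != tuple(b)


def pareto_optima(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the points that no other point dominates (naive quadratic scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def average_optima(n: int, d: int, trials: int, seed: int = 0) -> float:
    """Average number of Pareto optima over uniform samples from [0, 1]^d."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        sample = [tuple(rng.random() for _ in range(d)) for _ in range(n)]
        total += len(pareto_optima(sample))
    return total / trials


if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, average_optima(n=200, d=d, trials=20))
```

The averages printed here only estimate the expectation. The abstract's contribution is a high-probability bound, which a simulation like this can illustrate but not establish.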
Answering Spatial Multiple-Set Intersection Queries Using 2-3 Cuckoo Hash-Filters
We show how to answer spatial multiple-set intersection queries in O(n(log
w)/w + kt) expected time, where n is the total size of the t sets involved in
the query, w is the number of bits in a memory word, k is the output size, and
c is any fixed constant. This improves the asymptotic performance over previous
solutions and is based on an interesting data structure, known as 2-3 cuckoo
hash-filters. Our results apply in the word-RAM model (or practical RAM model),
which allows for constant-time bit-parallel operations, such as bitwise AND,
OR, NOT, and MSB (most-significant 1-bit), as exist in modern CPUs and GPUs.
Our solutions apply to any multiple-set intersection queries in spatial data
sets that can be reduced to one-dimensional range queries, such as spatial join
queries for one-dimensional points or sets of points stored along space-filling
curves, which are used in GIS applications.
Comment: Full version of paper from 2017 ACM SIGSPATIAL International
Conference on Advances in Geographic Information System
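The bit-parallel operations named in this abstract (bitwise AND and MSB) can be illustrated with plain bitmaps. The sketch below is not the 2-3 cuckoo hash-filter structure. It assumes set elements are small non-negative integers and uses Python's arbitrary-precision integers as a stand-in for arrays of w-bit machine words. The helper names are hypothetical.

```python
from typing import Iterable, List


def to_bitset(items: Iterable[int]) -> int:
    """Pack non-negative integers into one integer, one bit per element."""
    bits = 0
    for i in items:
        bits |= 1 << i
    return bits


def intersect(sets: List[Iterable[int]]) -> List[int]:
    """Intersect several sets with bitwise AND, then read elements off via MSB."""
    if not sets:
        return []
    acc = to_bitset(sets[0])
    for s in sets[1:]:
        acc &= to_bitset(s)  # one AND handles many elements at once
    out: List[int] = []
    while acc:
        msb = acc.bit_length() - 1  # index of the most-significant 1 bit
        out.append(msb)
        acc ^= 1 << msb  # clear that bit and continue
    return out[::-1]  # ascending order


if __name__ == "__main__":
    print(intersect([[1, 3, 5, 7], [3, 4, 5], [5, 3, 9]]))
```

With dense bitmaps the cost scales with the size of the universe rather than with n, which is one reason a more compact filter structure is attractive for sparse sets.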
Efficient Maxima-Finding Algorithms for Random Planar Samples
this paper a simple classification of several known algorithms for finding the maxima, together with several new algorithms; among these are two efficient algorithms: one with expected complexity n + O(√n log n) when the point samples are issued from some planar regions, and another more efficient than existing one