Efficient Computation of Multiple Density-Based Clustering Hierarchies
HDBSCAN*, a state-of-the-art density-based hierarchical clustering method,
produces a hierarchical organization of clusters in a dataset w.r.t. a
parameter mpts. While the performance of HDBSCAN* is robust w.r.t. mpts in the
sense that a small change in mpts typically leads to only a small or no change
in the clustering structure, choosing a "good" mpts value can be challenging:
depending on the data distribution, a high or low value for mpts may be more
appropriate, and certain data clusters may reveal themselves at different
values of mpts. To explore results for a range of mpts values, however, one has
to run HDBSCAN* for each value in the range independently, which is
computationally inefficient. In this paper, we propose an efficient approach to
compute all HDBSCAN* hierarchies for a range of mpts values by replacing the
graph used by HDBSCAN* with a much smaller graph that is guaranteed to contain
the required information. An extensive experimental evaluation shows that with
our approach one can obtain over one hundred hierarchies for the computational
cost equivalent to running HDBSCAN* about 2 times.
Comment: A short version of this paper appears at IEEE ICDM 2017. Corrected
typos. Revised abstract.
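To make the cost of a sweep over mpts concrete, here is a minimal sketch of the quantities the standard HDBSCAN* formulation builds on: core distances and mutual reachability distances, computed for every mpts in a range from one sorted pass over each point's distances. This is not the reduced-graph method the abstract proposes; it only illustrates the dense baseline that such a method avoids recomputing. The function names are hypothetical, and the sketch assumes the usual convention that a point counts as its own first neighbor.

```python
import math


def pairwise_distances(points):
    """Dense Euclidean distance matrix (fine for a toy example only)."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def core_distances(dist, max_mpts):
    """Return {mpts: [core distance of each point]} for mpts = 1..max_mpts.

    Assumption: the core distance of a point is the distance to its
    mpts-th nearest neighbor, counting the point itself as the first.
    """
    core = {k: [] for k in range(1, max_mpts + 1)}
    for row in dist:
        ordered = sorted(row)  # ordered[0] is 0.0, the point itself
        for k in range(1, max_mpts + 1):
            core[k].append(ordered[k - 1])
    return core


def mutual_reachability(dist, core_k):
    """Mutual reachability: max(core(p), core(q), d(p, q)) for p != q."""
    n = len(dist)
    return [
        [0.0 if i == j else max(core_k[i], core_k[j], dist[i][j]) for j in range(n)]
        for i in range(n)
    ]


points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
dist = pairwise_distances(points)
core = core_distances(dist, max_mpts=3)
graphs = {k: mutual_reachability(dist, core[k]) for k in core}
```

Each entry of `graphs` is a complete weighted graph on the data. Running the clustering once per mpts value means building a hierarchy from every one of them, which is the repeated work the abstract describes as inefficient.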
Point-wise mutual information-based video segmentation with high temporal consistency
In this paper, we tackle the problem of temporally consistent boundary
detection and hierarchical segmentation in videos. While finding the best
high-level reasoning of region assignments in videos is the focus of much
recent research, temporal consistency in boundary detection has so far only
rarely been tackled. We argue that temporally consistent boundaries are a key
component to temporally consistent region assignment. The proposed method is
based on the point-wise mutual information (PMI) of spatio-temporal voxels.
Temporal consistency is established by an evaluation of PMI-based point
affinities in the spectral domain over space and time. Thus, the proposed
method is independent of any optical flow computation or previously learned
motion models. The proposed low-level video segmentation method outperforms the
learning-based state of the art in terms of standard region metrics.
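For readers unfamiliar with point-wise mutual information, the textbook definition is PMI(a, b) = log(P(a, b) / (P(a) P(b))): positive when two values co-occur more often than independence would predict, zero when they are independent. The sketch below estimates it from raw co-occurrence counts of discrete labels. It is a toy illustration of the definition only, not the paper's estimator over spatio-temporal voxels; the function name and the input format are assumptions.

```python
import math
from collections import Counter


def pmi_table(pairs):
    """Estimate PMI for each observed (a, b) pair from co-occurrence counts.

    pairs: list of (a, b) tuples, e.g. quantized features of two
    neighboring samples (hypothetical input format).
    """
    joint = Counter(pairs)
    total = sum(joint.values())
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    return {
        (a, b): math.log((count / total) / ((left[a] / total) * (right[b] / total)))
        for (a, b), count in joint.items()
    }


# Values that always co-occur get positive PMI.
coupled = pmi_table([("x", "x"), ("x", "x"), ("y", "y"), ("y", "y")])

# Values that co-occur exactly as often as chance predicts get PMI 0.
independent = pmi_table([("x", "x"), ("x", "y"), ("y", "x"), ("y", "y")])
```

In an affinity-based segmentation, a high PMI between two samples suggests they belong to the same region, and a low PMI suggests a boundary between them.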
Object Proposals for Text Extraction in the Wild
Object Proposals is a recent computer vision technique receiving increasing
interest from the research community. Its main objective is to generate a
relatively small set of bounding box proposals that are most likely to contain
objects of interest. The use of Object Proposals techniques in the scene text
understanding field is innovative. Motivated by the success of powerful but
expensive techniques to recognize words in a holistic way, Object Proposals
techniques emerge as an alternative to the traditional text detectors.
In this paper we study to what extent the existing generic Object Proposals
methods may be useful for scene text understanding. Also, we propose a new
Object Proposals algorithm that is specifically designed for text and compare
it with other generic methods in the state of the art. Experiments show that
our proposal is superior in its ability to produce good-quality word
proposals efficiently. The source code of our method is made publicly
available.
Comment: 13th International Conference on Document Analysis and Recognition
(ICDAR 2015)
Non-parametric Bayesian modeling of complex networks
Modeling structure in complex networks using Bayesian non-parametrics makes
it possible to specify flexible model structures and infer the adequate model
complexity from the observed data. This paper provides a gentle introduction to
non-parametric Bayesian modeling of complex networks: Using an infinite mixture
model as a running example, we go through the steps of deriving the model as an
infinite limit of a finite parametric model, inferring the model parameters by
Markov chain Monte Carlo, and checking the model's fit and predictive
performance. We explain how advanced non-parametric models for complex networks
can be derived and point out relevant literature.
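One standard way to picture the infinite limit of a finite mixture is the Chinese restaurant process (CRP), the prior over partitions that results when the number of mixture components goes to infinity. The sketch below samples a partition of n items from a CRP; it is offered as a generic illustration, since the abstract does not say which construction the paper uses. The function name, the concentration parameter `alpha` (assumed positive), and the seed handling are all assumptions.

```python
import random


def sample_crp(n, alpha, seed=0):
    """Sample cluster labels for n items from a Chinese restaurant process.

    Item i joins existing cluster k with probability proportional to the
    cluster's current size, or opens a new cluster with probability
    proportional to alpha (assumed > 0).
    """
    rng = random.Random(seed)
    labels = []
    sizes = []
    for _ in range(n):
        weights = sizes + [alpha]  # last slot stands for "new cluster"
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)
        else:
            sizes[k] += 1
        labels.append(k)
    return labels, sizes


labels, sizes = sample_crp(n=20, alpha=1.0, seed=42)
```

The number of clusters is not fixed in advance: it grows with the data, which is the sense in which such models let the observed data determine model complexity.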
Perspects in astrophysical databases
Astrophysics has become a domain extremely rich in scientific data. Data
mining tools are needed for information extraction from such large datasets.
This calls for an approach to data management emphasizing the efficiency and
simplicity of data access; efficiency is obtained using multidimensional access
methods and simplicity is achieved by properly handling metadata. Moreover,
clustering and classification techniques on large datasets pose additional
requirements in terms of computation and memory scalability and
interpretability of results. In this study we review some possible solutions.
The Iray Light Transport Simulation and Rendering System
While ray tracing has become increasingly common and path tracing is well
understood by now, a major challenge lies in crafting an easy-to-use and
efficient system implementing these technologies. Following a purely
physically-based paradigm while still allowing for artistic workflows, the Iray
light transport simulation and rendering system allows for rendering complex
scenes at the push of a button and thus makes accurate light transport
simulation widely available. In this document we discuss the challenges and
implementation choices that follow from our primary design decisions,
demonstrating that such a rendering system can be made a practical, scalable,
and efficient real-world application that has been adopted by various companies
across many fields and is in use by many industry professionals today.