Efficient Computation of Multiple Density-Based Clustering Hierarchies
HDBSCAN*, a state-of-the-art density-based hierarchical clustering method,
produces a hierarchical organization of clusters in a dataset w.r.t. a
parameter mpts. While the performance of HDBSCAN* is robust w.r.t. mpts in the
sense that a small change in mpts typically leads to only a small or no change
in the clustering structure, choosing a "good" mpts value can be challenging:
depending on the data distribution, a high or low value for mpts may be more
appropriate, and certain data clusters may reveal themselves at different
values of mpts. To explore results for a range of mpts values, however, one has
to run HDBSCAN* for each value in the range independently, which is
computationally inefficient. In this paper, we propose an efficient approach to
compute all HDBSCAN* hierarchies for a range of mpts values by replacing the
graph used by HDBSCAN* with a much smaller graph that is guaranteed to contain
the required information. An extensive experimental evaluation shows that with
our approach one can obtain over one hundred hierarchies for the computational
cost equivalent to running HDBSCAN* about 2 times.
Comment: A short version of this paper appears at IEEE ICDM 2017. Corrected
typos. Revised abstract
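As a rough illustration of why a range of mpts values can share work, the sketch below computes core distances for several mpts values from a single sorted distance list per point, then derives the mutual reachability distance that HDBSCAN* builds its graph on. This is not the smaller-graph method the paper proposes. The function names and the convention that a point counts as its own first neighbor are assumptions made for the example.

```python
import math

def core_distances(points, mpts_values):
    """Return {mpts: [core distance of each point]}.

    Assumed convention: a point is its own first neighbor, so the core
    distance for mpts is the distance to the mpts-th closest point
    (the point itself included).
    """
    result = {m: [] for m in mpts_values}
    for p in points:
        # One sort per point serves every mpts value in the range.
        dists = sorted(math.dist(p, q) for q in points)
        for m in mpts_values:
            result[m].append(dists[m - 1])
    return result

def mutual_reachability(points, core, i, j):
    """Mutual reachability distance between points i and j."""
    return max(core[i], core[j], math.dist(points[i], points[j]))

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
cores = core_distances(points, [2, 3])
```

Each hierarchy would still need its own minimum spanning tree over these distances; the sketch only shows the part of the computation that is trivially shared across mpts values.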
Extension of One-Dimensional Proximity Regions to Higher Dimensions
Proximity maps and regions are defined based on the relative allocation of
points from two or more classes in an area of interest and are used to
construct random graphs called proximity catch digraphs (PCDs) which have
applications in various fields. The simplest of such maps is the spherical
proximity map which maps a point from the class of interest to a disk centered
at the same point with radius being the distance to the closest point from the
other class in the region. The spherical proximity map gave rise to the class cover
catch digraph (CCCD), which was applied to pattern classification. Furthermore,
for uniform data on the real line, the exact and asymptotic distributions of the
domination number of CCCDs were analytically available. In this article, we
determine some appealing properties of the spherical proximity map in compact
intervals on the real line and use these properties as a guideline for defining
new proximity maps in higher dimensions. Delaunay triangulation is used to
partition the region of interest in higher dimensions. Furthermore, we
introduce the auxiliary tools used for the construction of the new proximity
maps, as well as some related concepts that will be used in the investigation
and comparison of them and the resulting graphs. We characterize the geometry
invariance of PCDs for uniform data. We also provide some newly defined
proximity maps in higher dimensions as illustrative examples.
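The spherical proximity map described above can be sketched in a few lines: each point of the class of interest gets a ball whose radius is the distance to the closest point of the other class, and the digraph has an arc from one point to another when the second lies inside the first point's ball. The function names, the use of open balls, and the sample data are assumptions made for the example, not taken from the paper.

```python
import math

def spherical_radii(targets, others):
    """Radius for each target point: distance to the nearest point of the other class."""
    return [min(math.dist(x, y) for y in others) for x in targets]

def cccd_arcs(targets, others):
    """Arcs (i, j) such that target j lies in the open ball centered at target i."""
    radii = spherical_radii(targets, others)
    return {
        (i, j)
        for i, x in enumerate(targets)
        for j, z in enumerate(targets)
        if i != j and math.dist(x, z) < radii[i]
    }

# Points on the real line, written as 1-tuples so the same code works in any dimension.
targets = [(0.0,), (0.2,), (0.9,)]
others = [(0.5,), (1.0,)]
radii = spherical_radii(targets, others)
arcs = cccd_arcs(targets, others)
```

In this sample the first two target points catch each other, while the third, sitting close to a point of the other class, has a ball too small to catch anything.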
TS2PACK: A Two-Level Tabu Search for the Three-dimensional Bin Packing Problem
Three-dimensional orthogonal bin packing is a problem that is NP-hard in the strong sense, in which a set of boxes must be orthogonally packed into the minimum number of three-dimensional bins. We present a two-level tabu search for this problem. The first level aims to reduce the number of bins. The second optimizes the packing of the bins. This latter procedure is based on the interval graph representation of the packing, proposed by Fekete and Schepers, which reduces the size of the search space. We also introduce a general method to increase the size of the associated neighborhoods, and thus the quality of the search, without increasing the overall complexity of the algorithm. Extensive computational results on benchmark problem instances show the effectiveness of the proposed approach, which obtains better results than the existing ones.
Generalizations of the Kolmogorov-Barzdin embedding estimates
We consider several ways to measure the 'geometric complexity' of an
embedding from a simplicial complex into Euclidean space. One of these is a
version of 'thickness', based on a paper of Kolmogorov and Barzdin. We prove
inequalities relating the thickness and the number of simplices in the
simplicial complex, generalizing an estimate that Kolmogorov and Barzdin proved
for graphs. We also consider the distortion of knots. We give an alternate
proof of a theorem of Pardon that there are isotopy classes of knots requiring
arbitrarily large distortion. This proof is based on the expander-like
properties of arithmetic hyperbolic manifolds.
Comment: 45 pages