Covering Points by Disjoint Boxes with Outliers
For a set of n points in the plane, we consider the axis-aligned (p,k)-Box
Covering problem: Find p axis-aligned, pairwise-disjoint boxes that together
contain n-k points. In this paper, we consider the boxes to be either squares
or rectangles, and we want to minimize the area of the largest box. For general
p we show that the problem is NP-hard for both squares and rectangles. For a
small, fixed number p, we give algorithms that find the solution in the
following running times:
For squares we have O(n+k log k) time for p=1, and O(n log n+k^p log^p k) time
for p = 2,3. For rectangles we get O(n + k^3) for p = 1 and O(n log n+k^{2+p}
log^{p-1} k) time for p = 2,3.
In all cases, our algorithms use O(n) space.
Comment: updated version: - changed problem from 'cover exactly n-k points' to
'cover at least n-k points' to avoid having non-feasible solutions. Results
are unchanged. - added Proof to Lemma 11, clarified some sections - corrected
typos and small errors - updated affiliations of two authors
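To make the problem statement concrete, here is a minimal sketch of a feasibility check for a candidate solution. It is not one of the paper's algorithms. The function names and the box encoding `(x1, y1, x2, y2)` are hypothetical, and the sketch assumes closed boxes that count as disjoint only when they share no point at all.

```python
from itertools import combinations


def boxes_disjoint(a, b):
    """True if two closed boxes (x1, y1, x2, y2) share no point (assumed convention)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax2 < bx1 or bx2 < ax1 or ay2 < by1 or by2 < ay1


def contains(box, pt):
    """True if the closed box contains the point."""
    x1, y1, x2, y2 = box
    x, y = pt
    return x1 <= x <= x2 and y1 <= y <= y2


def evaluate_cover(points, boxes, k):
    """Return the area of the largest box if `boxes` is feasible, else None.

    Feasible means: the boxes are pairwise disjoint and together contain
    at least n - k of the n points.
    """
    if any(not boxes_disjoint(a, b) for a, b in combinations(boxes, 2)):
        return None
    covered = sum(1 for pt in points if any(contains(b, pt) for b in boxes))
    if covered < len(points) - k:
        return None
    # Objective from the abstract: the area of the largest box.
    return max((x2 - x1) * (y2 - y1) for x1, y1, x2, y2 in boxes)
```

With five points and k = 1, two boxes that cover four of the points are feasible, and the returned value is the objective the paper minimizes.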
Covering many points with a small-area box
Let be a set of points in the plane. We show how to find, for a given
integer , the smallest-area axis-parallel rectangle that covers points
of in time. We also consider the problem of,
given a value , covering as many points of as possible with an
axis-parallel rectangle of area at most . For this problem we give a
probabilistic -approximation that works in near-linear time:
In time we find an
axis-parallel rectangle of area at most that, with high probability,
covers at least points, where
is the maximum possible number of points that could be
covered
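As a point of reference for the first problem, the following brute-force sketch finds a smallest-area axis-parallel rectangle by trying every pair of x-coordinates and y-coordinates taken from the input. It is only a baseline for small inputs, not the paper's algorithm. The function name is hypothetical, and the sketch assumes "covers" means "contains at least k points, boundary included".

```python
def smallest_rectangle_covering(points, k):
    """Return (area, (x1, y1, x2, y2)) for a smallest rectangle with >= k points.

    Brute force: an optimal rectangle can be shrunk until each side touches
    a point, so it suffices to try sides at input coordinates.
    Returns None if fewer than k points exist.
    """
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    best = None
    for i, x1 in enumerate(xs):
        for x2 in xs[i:]:
            for j, y1 in enumerate(ys):
                for y2 in ys[j:]:
                    count = sum(
                        1 for x, y in points
                        if x1 <= x <= x2 and y1 <= y <= y2
                    )
                    if count >= k:
                        area = (x2 - x1) * (y2 - y1)
                        if best is None or area < best[0]:
                            best = (area, (x1, y1, x2, y2))
    return best
```

The nested loops make this polynomial but slow, which is the kind of cost that faster algorithms for this problem aim to avoid.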
A Constant Approximation for Colorful k-Center
In this paper, we consider the colorful k-center problem, which is a generalization of the well-known k-center problem. Here, we are given red and blue points in a metric space, and a coverage requirement for each color. The goal is to find the smallest radius rho, such that with k balls of radius rho, the desired number of points of each color can be covered. We obtain a constant approximation for this problem in the Euclidean plane. We obtain this result by combining a "pseudo-approximation" algorithm that works in any metric space, and an approximation algorithm that works for a special class of instances in the plane. The latter algorithm uses a novel connection to a certain matching problem in graphs
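The coverage condition in the problem statement can be sketched as a simple check: given candidate centers and a radius rho, count the red and blue points that fall inside at least one ball. This only tests feasibility of a given solution in the Euclidean plane; it is not the approximation algorithm. The function and parameter names are hypothetical.

```python
import math


def meets_color_requirements(red, blue, centers, rho, need_red, need_blue):
    """True if balls of radius rho around `centers` cover enough points of each color."""

    def covered(points):
        # A point is covered if it lies within distance rho of some center.
        return sum(
            1 for p in points
            if any(math.dist(p, c) <= rho for c in centers)
        )

    return covered(red) >= need_red and covered(blue) >= need_blue
```

The optimization problem then asks for the smallest rho for which some choice of k centers passes this check.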
New Embedded Representations and Evaluation Protocols for Inferring Transitive Relations
Beyond word embeddings, continuous representations of knowledge graph (KG)
components, such as entities, types and relations, are widely used for entity
mention disambiguation, relation inference and deep question answering. Great
strides have been made in modeling general, asymmetric or antisymmetric KG
relations using Gaussian, holographic, and complex embeddings. None of these
directly enforce transitivity inherent in the is-instance-of and is-subtype-of
relations. A recent proposal, called order embedding (OE), demands that the
vector representing a subtype elementwise dominates the vector representing a
supertype. However, the manner in which such constraints are asserted and
evaluated has some limitations. In this short research note, we make three
contributions specific to representing and inferring transitive relations.
First, we propose and justify a significant improvement to the OE loss
objective. Second, we propose a new representation of types as
hyper-rectangular regions that generalizes and improves on OE. Third, we show
that some current protocols to evaluate transitive relation inference can be
misleading, and offer a sound alternative. Rather than use black-box deep
learning modules off-the-shelf, we develop our training networks using
elementary geometric considerations.
Comment: Accepted at SIGIR 201
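The two geometric ideas in this abstract can be illustrated with a short sketch. The first function measures how far a pair of vectors is from the OE condition as stated above (the subtype vector dominates the supertype vector elementwise), using a squared hinge penalty chosen here for illustration; it is not the improved loss the paper proposes. The second function assumes, as a guess at the hyper-rectangle representation, that a subtype's region lies inside its supertype's region. All names are hypothetical.

```python
def oe_violation(subtype_vec, supertype_vec):
    """Squared hinge penalty; zero exactly when subtype >= supertype elementwise."""
    return sum(
        max(0.0, sup - sub) ** 2
        for sub, sup in zip(subtype_vec, supertype_vec)
    )


def box_contains(outer, inner):
    """True if hyper-rectangle `inner` lies inside `outer`.

    Each box is a pair (lower_corner, upper_corner) of equal-length sequences.
    """
    outer_lo, outer_hi = outer
    inner_lo, inner_hi = inner
    return all(
        ol <= il and ih <= oh
        for ol, oh, il, ih in zip(outer_lo, outer_hi, inner_lo, inner_hi)
    )
```

Both elementwise dominance and box containment are transitive by construction, which is the property the abstract says general-purpose embeddings do not directly enforce.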
Massively-Parallel Heat Map Sorting and Applications To Explainable Clustering
Given a set of points labeled with labels, we introduce the heat map
sorting problem as reordering and merging the points and dimensions while
preserving the clusters (labels). A cluster is preserved if it remains
connected, i.e., if it is not split into several clusters and no two clusters
are merged.
We prove the problem is NP-hard and we give a fixed-parameter algorithm with
a constant number of rounds in the massively parallel computation model, where
each machine has a sublinear memory and the total memory of the machines is
linear. We give an approximation algorithm for an NP-hard special case of the
problem. We empirically compare our algorithm with k-means and density-based
clustering (DBSCAN) using a dimensionality reduction via locality-sensitive
hashing on several directed and undirected graphs of email and computer
networks