Maximum Inner-Product Search using Tree Data-structures
The problem of efficiently finding the best match for a query in a
given set with respect to the Euclidean distance or the cosine similarity has
been extensively studied in the literature. However, a closely related problem of
efficiently finding the best match with respect to the inner product has never
been explored in the general setting to the best of our knowledge. In this
paper we consider this general problem and contrast it with the existing
best-match algorithms. First, we propose a general branch-and-bound algorithm
using a tree data structure. Subsequently, we present a dual-tree algorithm for
the case where there are multiple queries. Finally we present a new data
structure for increasing the efficiency of the dual-tree algorithm. These
branch-and-bound algorithms involve novel bounds suited for the purpose of
best-matching with inner products. We evaluate our proposed algorithms on a
variety of data sets from various applications, and exhibit up to five orders
of magnitude improvement in query time over the naive search technique.
Comment: Under submission in KDD 201
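The branch-and-bound idea in this abstract can be illustrated with a minimal sketch. It assumes a simple ball tree and the Cauchy-Schwarz bound `<q, p> <= <q, c> + R * ||q||` for any point `p` in a ball with center `c` and radius `R`; the names `Ball`, `upper_bound`, and `search` are hypothetical, and this is not claimed to be the data structure or the bounds used in the paper.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


class Ball:
    """Tree node covering `points` with a bounding ball (hypothetical helper)."""

    def __init__(self, points, leaf_size=8):
        dim = len(points[0])
        self.center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
        self.radius = max(norm([a - b for a, b in zip(p, self.center)]) for p in points)
        self.points, self.children = points, []
        if len(points) > leaf_size:
            # Split at the median of the coordinate with the largest spread.
            axis = max(range(dim),
                       key=lambda i: max(p[i] for p in points) - min(p[i] for p in points))
            ordered = sorted(points, key=lambda p: p[axis])
            mid = len(ordered) // 2
            self.children = [Ball(ordered[:mid], leaf_size), Ball(ordered[mid:], leaf_size)]

    def upper_bound(self, q):
        # <q, p> = <q, c> + <q, p - c> <= <q, c> + R * ||q|| for every p in the ball.
        return dot(q, self.center) + self.radius * norm(q)


def search(node, q, best=(-math.inf, None)):
    """Return (best inner product, best point) for query q, pruning hopeless nodes."""
    if node.upper_bound(q) <= best[0]:
        return best  # no point in this node can beat the current best
    if not node.children:
        for p in node.points:
            s = dot(q, p)
            if s > best[0]:
                best = (s, p)
        return best
    # Visit the more promising child first so that pruning kicks in earlier.
    for child in sorted(node.children, key=lambda c: -c.upper_bound(q)):
        best = search(child, q, best)
    return best


rng = random.Random(0)
data = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(100)]
tree = Ball(data)
query = [rng.gauss(0, 1) for _ in range(4)]
score, match = search(tree, query)
```

Unlike nearest-neighbour search under Euclidean distance, the bound depends on the norm of the query and on where the ball sits relative to the origin, which is why a dedicated bound is needed for inner products.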
Few smooth d-polytopes with n lattice points
We prove that, for fixed n there exist only finitely many embeddings of
Q-factorial toric varieties X into P^n that are induced by a complete linear
system. The proof is based on a combinatorial result that for fixed nonnegative
integers d and n, there are only finitely many smooth d-polytopes with n
lattice points. We also enumerate all smooth 3-polytopes with at most 12
lattice points. In fact, it is sufficient to bound the singularities and the
number of lattice points on edges to prove finiteness.
Comment: 20+2 pages; major revision: new author, new structure, new result
Nonlinear Integer Programming
Research efforts of the past fifty years have led to a development of linear
integer programming as a mature discipline of mathematical optimization. Such a
level of maturity has not been reached when one considers nonlinear systems
subject to integrality requirements for the variables. This chapter is
dedicated to this topic.
The primary goal is a study of a simple version of general nonlinear integer
problems, where all constraints are still linear. Our focus is on the
computational complexity of the problem, which varies significantly with the
type of nonlinear objective function in combination with the underlying
combinatorial structure. Numerous boundary cases of complexity emerge, which
sometimes surprisingly lead even to polynomial time algorithms.
We also cover recent successful approaches for more general classes of
problems. Though no positive theoretical efficiency results are available, nor
are they likely to ever be available, these seem to be the currently most
successful and interesting approaches for solving practical problems.
It is our belief that the study of algorithms motivated by theoretical
considerations and those motivated by our desire to solve practical instances
should and do inform one another. So it is with this viewpoint that we present
the subject, and it is in this direction that we hope to spark further
research.
Comment: 57 pages. To appear in: M. Jünger, T. Liebling, D. Naddef, G.
Nemhauser, W. Pulleyblank, G. Reinelt, G. Rinaldi, and L. Wolsey (eds.), 50
Years of Integer Programming 1958–2008: The Early Years and State-of-the-Art
Surveys, Springer-Verlag, 2009, ISBN 354068274
Approximate Hypergraph Coloring under Low-discrepancy and Related Promises
A hypergraph is said to be $\chi$-colorable if its vertices can be colored
with $\chi$ colors so that no hyperedge is monochromatic. $2$-colorability is a
fundamental property (called Property B) of hypergraphs and is extensively
studied in combinatorics. Algorithmically, however, given a $2$-colorable
$k$-uniform hypergraph, it is NP-hard to find a $2$-coloring miscoloring fewer
than a fraction $2^{-k+1}$ of hyperedges (which is achieved by a random
$2$-coloring), and the best algorithms to color the hypergraph properly require
$\approx n^{1-1/k}$ colors, approaching the trivial bound of $n$ as $k$
increases.
In this work, we study the complexity of approximate hypergraph coloring, for
both the maximization (finding a $2$-coloring with fewest miscolored edges) and
minimization (finding a proper coloring using the fewest colors)
versions, when the input hypergraph is promised to have the following stronger
properties than $2$-colorability:
(A) Low-discrepancy: If the hypergraph has discrepancy $\ell \ll \sqrt{k}$,
we give an algorithm to color it with $\approx n^{O(\ell^2/k)}$ colors.
However, for the maximization version, we prove NP-hardness of finding a
$2$-coloring miscoloring a smaller than $2^{-O(k)}$ (resp. $k^{-O(k)}$)
fraction of the hyperedges when $\ell = O(\log k)$ (resp. $\ell = 2$). Assuming
the UGC, we improve the latter hardness factor to $2^{-O(k)}$ for almost
discrepancy-$1$ hypergraphs.
(B) Rainbow colorability: If the hypergraph has a $(k-\ell)$-coloring such
that each hyperedge is polychromatic with all these colors, we give a
$2$-coloring algorithm that miscolors at most $k^{-\Omega(k)}$ of the
hyperedges when $\ell \ll \sqrt{k}$, and complement this with a matching UG
hardness result showing that when $\ell = \sqrt{k}$, it is hard to even beat the
$2^{-k+1}$ bound achieved by a random coloring.
Comment: Approx 201
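The random-coloring baseline mentioned in this abstract can be checked directly. The sketch below enumerates every 2-coloring of a single hyperedge with k vertices and counts the monochromatic ones; there are exactly two (all one color or all the other), so the fraction is 2 / 2^k. By linearity of expectation, this is also the expected fraction of miscolored hyperedges in a k-uniform hypergraph under a uniformly random 2-coloring. The function name `monochromatic_fraction` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def monochromatic_fraction(k):
    """Exact fraction of 2-colorings of one k-vertex hyperedge that are monochromatic."""
    colorings = list(product((0, 1), repeat=k))
    mono = sum(1 for c in colorings if len(set(c)) == 1)
    return Fraction(mono, len(colorings))


# Compare the enumeration with the closed form 2 / 2**k for a few edge sizes.
table = {k: (monochromatic_fraction(k), Fraction(2, 2 ** k)) for k in range(2, 9)}
```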
A Novel Approach for Ellipsoidal Outer-Approximation of the Intersection Region of Ellipses in the Plane
In this paper, a novel technique for tight outer-approximation of the
intersection region of a finite number of ellipses in 2-dimensional (2D) space
is proposed. First, the vertices of a tight polygon that contains the convex
intersection of the ellipses are found in an efficient manner. To do so, the
intersection points of the ellipses that fall on the boundary of the
intersection region are determined, and a set of points is generated on the
elliptic arcs connecting every two neighbouring intersection points. By finding
the tangent lines to the ellipses at the extended set of points, a set of
half-planes is obtained, whose intersection forms a polygon. To find the
polygon more efficiently, the points are given an order and the intersection of
the half-planes corresponding to every two neighbouring points is calculated.
If the polygon is convex and bounded, these calculated points together with the
initially obtained intersection points will form its vertices. If the polygon
is non-convex or unbounded, we can detect this situation and then generate
additional discrete points only on the elliptical arc segment causing the
issue, and restart the algorithm to obtain a bounded and convex polygon.
Finally, the smallest area ellipse that contains the vertices of the polygon is
obtained by solving a convex optimization problem. Through numerical
experiments, it is illustrated that the proposed technique returns a tighter
outer-approximation of the intersection of multiple ellipses, compared to
conventional techniques, with only slightly higher computational cost.
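The tangent-line step described above can be sketched for the simplest case. The example assumes a single axis-aligned ellipse x^2/a^2 + y^2/b^2 = 1 centered at the origin, samples points on it in angular order, takes the tangent half-plane at each point, and intersects the tangent lines of neighbouring points to obtain the vertices of an enclosing polygon. The function names are hypothetical, and the sketch covers neither the intersection of several ellipses nor the final minimum-area ellipse fit.

```python
import math


def tangent_halfplane(a, b, t):
    """Normal n of the half-plane n . x <= 1 tangent to the ellipse at parameter t."""
    # At the point (a cos t, b sin t) the tangent line is x cos(t)/a + y sin(t)/b = 1.
    return (math.cos(t) / a, math.sin(t) / b)


def intersect(n1, n2):
    """Intersection of the lines n1 . x = 1 and n2 . x = 1 (Cramer's rule)."""
    det = n1[0] * n2[1] - n1[1] * n2[0]
    if abs(det) < 1e-12:
        raise ValueError("tangent lines are parallel")
    return ((n2[1] - n1[1]) / det, (n1[0] - n2[0]) / det)


def outer_polygon(a, b, m):
    """Enclosing polygon from m >= 3 evenly spaced tangent lines, in angular order."""
    if m < 3:
        raise ValueError("need at least three tangent lines for a bounded polygon")
    normals = [tangent_halfplane(a, b, 2 * math.pi * i / m) for i in range(m)]
    vertices = [intersect(normals[i], normals[(i + 1) % m]) for i in range(m)]
    return normals, vertices


normals, vertices = outer_polygon(3.0, 1.0, 12)
```

Increasing the number of sampled points tightens the polygon around the ellipse, at the cost of more vertices to pass to any later fitting step.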
Randomized Sketches of Convex Programs with Sharp Guarantees
Random projection (RP) is a classical technique for reducing storage and
computational costs. We analyze RP-based approximations of convex programs, in
which the original optimization problem is approximated by the solution of a
lower-dimensional problem. Such dimensionality reduction is essential in
computation-limited settings, since the complexity of general convex
programming can be quite high (e.g., cubic for quadratic programs, and
substantially higher for semidefinite programs). In addition to computational
savings, random projection is also useful for reducing memory usage, and has
useful properties for privacy-sensitive optimization. We prove that the
approximation ratio of this procedure can be bounded in terms of the geometry
of the constraint set. For a broad class of random projections, including those
based on various sub-Gaussian distributions as well as randomized Hadamard and
Fourier transforms, the data matrix defining the cost function can be projected
down to the statistical dimension of the tangent cone of the constraints at the
original solution, which is often substantially smaller than the original
dimension. We illustrate consequences of our theory for various cases,
including unconstrained and $\ell_1$-constrained least squares, support vector
machines, and low-rank matrix estimation, and discuss implications for
privacy-sensitive optimization and some connections with de-noising and
compressed sensing.
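For the unconstrained least-squares case named in this abstract, the sketching idea can be illustrated as follows: replace the problem of minimising ||Ax - b||^2 by minimising ||S(Ax - b)||^2, where S is a random m x n matrix with m much smaller than n. The sketch below assumes a Gaussian S (one instance of a sub-Gaussian sketch) and solves both problems through the normal equations in pure Python; all function names are hypothetical, and nothing here reproduces the paper's guarantees.

```python
import random


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def solve(M, y):
    """Solve M x = y by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [y[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][j] * x[j] for j in range(r + 1, n))) / aug[r][r]
    return x


def least_squares(A, b):
    """Minimise ||A x - b||^2 via the normal equations (small, well-conditioned A only)."""
    At = [list(col) for col in zip(*A)]
    Atb = [sum(a * bi for a, bi in zip(col, b)) for col in At]
    return solve(matmul(At, A), Atb)


def cost(A, b, x):
    return sum((sum(a * xi for a, xi in zip(row, x)) - bi) ** 2 for row, bi in zip(A, b))


def sketched_least_squares(A, b, m, rng):
    """Solve the m-row sketched problem min ||S A x - S b||^2 with Gaussian S."""
    n = len(A)
    S = [[rng.gauss(0, 1) / m ** 0.5 for _ in range(n)] for _ in range(m)]
    Sb = [sum(s * bi for s, bi in zip(row, b)) for row in S]
    return least_squares(matmul(S, A), Sb)


rng = random.Random(0)
A = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
b = [row[0] - 2 * row[1] + 0.5 * row[2] + rng.gauss(0, 0.1) for row in A]
x_full = least_squares(A, b)
x_sketch = sketched_least_squares(A, b, 40, rng)
ratio = cost(A, b, x_sketch) / cost(A, b, x_full)
```

The ratio of the two costs is the approximation ratio the abstract refers to; it is never below 1, because the exact solution minimises the original cost.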
Exploring Human Vision Driven Features for Pedestrian Detection
Motivated by the center-surround mechanism in the human visual attention
system, we propose to use average contrast maps for the challenge of pedestrian
detection in street scenes due to the observation that pedestrians indeed
exhibit discriminative contrast texture. Our main contributions are first to
design a local, statistical multi-channel descriptor in order to incorporate
both color and gradient information. Second, we introduce a multi-direction and
multi-scale contrast scheme based on grid-cells in order to integrate
expressive local variations. To address the selection of the most
discriminative features for assessment and classification, we perform extensive
comparisons w.r.t. statistical descriptors, contrast measurements, and scale
structures. This way, we obtain reasonable results under various
configurations. Empirical findings from applying our optimized detector on the
INRIA and Caltech pedestrian datasets show that our features yield
state-of-the-art performance in pedestrian detection.
Comment: Accepted for publication in IEEE Transactions on Circuits and Systems
for Video Technology (TCSVT)