On Optimizing Distributed Tucker Decomposition for Dense Tensors
The Tucker decomposition expresses a given tensor as the product of a small
core tensor and a set of factor matrices. Apart from providing data
compression, the construction is useful in performing analysis such as
principal component analysis (PCA) and finds applications in diverse domains
such as signal processing, computer vision and text analytics. Our objective is
to develop an efficient distributed implementation for the case of dense
tensors. The implementation is based on the HOOI (Higher Order Orthogonal
Iteration) procedure, wherein the tensor-times-matrix product forms the core
routine. Prior work has proposed heuristics for reducing the computational
load and communication volume incurred by the routine. We study the two metrics
in a formal and systematic manner, and design strategies that are optimal under
the two fundamental metrics. Our experimental evaluation on a large benchmark
of tensors shows that the optimal strategies provide significant reduction in
load and volume compared to prior heuristics, and provide up to 7x speed-up in
the overall running time.
Comment: Preliminary version of the paper appears in the proceedings of
IPDPS'1
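The tensor-times-matrix (TTM) product named in the abstract above can be illustrated with a minimal single-node sketch. It assumes NumPy and a dense tensor that fits in memory; the function name `ttm` and the toy data are hypothetical, and nothing here reflects the paper's distributed strategies.

```python
import numpy as np


def ttm(tensor, matrix, mode):
    """Mode-`mode` tensor-times-matrix product.

    Contracts axis `mode` of `tensor` (length I) with the second axis of
    `matrix` (shape J x I), so that axis ends up with length J.
    """
    # tensordot places the new axis of length J first ...
    contracted = np.tensordot(matrix, tensor, axes=(1, mode))
    # ... so move it back to the position of the contracted mode.
    return np.moveaxis(contracted, 0, mode)


# Hypothetical toy data: a 2 x 3 x 4 tensor multiplied along mode 1
# by a 5 x 3 matrix.
X = np.arange(24, dtype=float).reshape(2, 3, 4)
M = np.ones((5, 3))
Y = ttm(X, M, mode=1)
print(Y.shape)  # (2, 5, 4)
```

In a Tucker setting the matrix would typically be the transpose of a factor matrix with fewer rows than the mode length, so each TTM shrinks the tensor along one mode.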
Optimization bounds from the branching dual
We present a general method for obtaining strong bounds for discrete optimization problems that is based on a concept of branching duality. It can be applied when no useful integer programming model is available, and we illustrate this with the minimum bandwidth problem. The method strengthens a known bound for a given problem by formulating a dual problem whose feasible solutions are partial branching trees. It solves the dual problem with a “worst-bound” local search heuristic that explores neighboring partial trees. After proving some optimality properties of the heuristic, we show that it substantially improves known combinatorial bounds for the minimum bandwidth problem with a modest amount of computation. It also obtains significantly tighter bounds than depth-first and breadth-first branching, demonstrating that the dual perspective can lead to better branching strategies when the object is to find valid bounds.
Accepted manuscrip
A System for Induction of Oblique Decision Trees
This article describes a new system for induction of oblique decision trees.
This system, OC1, combines deterministic hill-climbing with two forms of
randomization to find a good oblique split (in the form of a hyperplane) at
each node of a decision tree. Oblique decision tree methods are tuned
especially for domains in which the attributes are numeric, although they can
be adapted to symbolic or mixed symbolic/numeric attributes. We present
extensive empirical studies, using both real and artificial data, that analyze
OC1's ability to construct oblique trees that are smaller and more accurate
than their axis-parallel counterparts. We also examine the benefits of
randomization for the construction of oblique decision trees.
Comment: See http://www.jair.org/ for an online appendix and other files
accompanying this articl
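To make the contrast between oblique and axis-parallel splits concrete, the following sketch scores a hyperplane split by weighted Gini impurity. It is not OC1: the hill-climbing and randomization steps are omitted, and the function names, the impurity measure, and the toy data are all assumptions chosen for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def split_impurity(points, labels, weights, bias):
    """Weighted Gini impurity of the split w . x + bias <= 0 versus > 0."""
    left, right = [], []
    for x, y in zip(points, labels):
        side = sum(w * xi for w, xi in zip(weights, x)) + bias
        (left if side <= 0 else right).append(y)
    n = len(labels)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


# Hypothetical 2-D data whose classes are separated by the line x + y = 1.
points = [(0.1, 0.2), (0.7, 0.1), (0.1, 0.7), (0.9, 0.8), (0.4, 0.9), (0.9, 0.3)]
labels = [0, 0, 0, 1, 1, 1]

oblique = split_impurity(points, labels, weights=(1.0, 1.0), bias=-1.0)
axis_parallel = split_impurity(points, labels, weights=(1.0, 0.0), bias=-0.5)
print(oblique, axis_parallel)
```

An axis-parallel split is the special case in which exactly one weight is nonzero. On this toy data the oblique split separates the classes with a single test, while the axis-parallel split at x = 0.5 does not.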
Query Learning with Exponential Query Costs
In query learning, the goal is to identify an unknown object while minimizing
the number of "yes" or "no" questions (queries) posed about that object. A
well-studied algorithm for query learning is known as generalized binary search
(GBS). We show that GBS is a greedy algorithm to optimize the expected number
of queries needed to identify the unknown object. We also generalize GBS in two
ways. First, we consider the case where the cost of querying grows
exponentially in the number of queries and the goal is to minimize the expected
exponential cost. Then, we consider the case where the objects are partitioned
into groups, and the objective is to identify only the group to which the
object belongs. We derive algorithms to address these issues in a common,
information-theoretic framework. In particular, we present an exact formula for
the objective function in each case involving Shannon or Renyi entropy, and
develop a greedy algorithm for minimizing it. Our algorithms are demonstrated
on two applications of query learning, active learning and emergency response.
Comment: 15 page
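The greedy rule behind generalized binary search can be sketched as follows: at each step, ask the query that splits the remaining probability mass most evenly. This is a minimal illustration of the standard GBS idea only. The function name, the representation of queries as Boolean functions, and the toy instance are assumptions, and the exponential-cost and group-identification variants from the abstract are not implemented.

```python
def gbs_identify(hypotheses, prior, queries, oracle):
    """Greedy generalized binary search.

    hypotheses: list of candidate objects
    prior:      dict mapping each hypothesis to a positive probability
    queries:    list of functions hypothesis -> bool
    oracle:     function query -> bool, the true object's answer
    Returns the surviving hypotheses and the number of queries asked.
    """
    alive = list(hypotheses)
    asked = 0
    while len(alive) > 1:
        total = sum(prior[h] for h in alive)

        def imbalance(q):
            # Distance of the "yes" mass from half of the remaining mass.
            yes = sum(prior[h] for h in alive if q(h))
            return abs(2 * yes - total)

        best = min(queries, key=imbalance)
        yes_count = sum(1 for h in alive if best(h))
        if yes_count == 0 or yes_count == len(alive):
            break  # no available query separates the remaining candidates
        answer = oracle(best)
        alive = [h for h in alive if best(h) == answer]
        asked += 1
    return alive, asked


# Hypothetical instance: eight objects, uniform prior, one query per bit.
objects = list(range(8))
uniform = {h: 1 / 8 for h in objects}
bit_queries = [lambda h, b=b: bool((h >> b) & 1) for b in range(3)]
target = 5
found, num_queries = gbs_identify(objects, uniform, bit_queries, lambda q: q(target))
print(found, num_queries)  # [5] 3
```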
Progressive Simplification of Polygonal Curves
Simplifying polygonal curves at different levels of detail is an important
problem with many applications. Existing geometric optimization algorithms are
only capable of minimizing the complexity of a simplified curve for a single
level of detail. We present an -time algorithm that takes a polygonal
curve of n vertices and produces a set of consistent simplifications for m
scales while minimizing the cumulative simplification complexity. This
algorithm is compatible with distance measures such as the Hausdorff, the
Fréchet and area-based distances, and enables simplification for continuous
scaling in time. To speed up this algorithm in practice, we present
new techniques for constructing and representing so-called shortcut graphs.
Experimental evaluation of these techniques on trajectory data reveals a
significant improvement of using shortcut graphs for progressive and
non-progressive curve simplification, both in terms of running time and memory
usage.
Comment: 20 pages, 20 figure
A Computational Comparison of Optimization Methods for the Golomb Ruler Problem
The Golomb ruler problem is defined as follows: Given a positive integer n,
locate n marks on a ruler such that the distances between all distinct pairs
of marks differ from one another and the total length of the ruler is
minimized. The Golomb ruler problem has applications in information theory,
astronomy and communications, and it can be seen as a challenge for
combinatorial optimization algorithms. Although constructing high quality
rulers is well-studied, proving optimality is a far more challenging task. In
this paper, we provide a computational comparison of different optimization
paradigms, each using a different model (linear integer, constraint programming
and quadratic integer) to certify that a given Golomb ruler is optimal. We
propose several enhancements to improve the computational performance of each
method by exploring bound tightening, valid inequalities, cutting planes and
branching strategies. We conclude that a certain quadratic integer programming
model solved through a Benders decomposition and strengthened by two types of
valid inequalities performs the best in terms of solution time for small-sized
Golomb ruler problem instances. On the other hand, a constraint programming
model improved by range reduction and a particular branching strategy could
have more potential to solve larger size instances due to its promising
parallelization features.
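The definition of a Golomb ruler given above translates directly into a feasibility check, and for very small n an exhaustive search over increasing lengths also certifies optimality. The sketch below is a brute-force illustration only, with hypothetical function names; it does not implement any of the integer programming or constraint programming models compared in the paper and is not practical beyond a handful of marks.

```python
from itertools import combinations, count


def is_golomb(marks):
    """True if all pairwise distances between the marks are distinct."""
    distances = [b - a for a, b in combinations(sorted(marks), 2)]
    return len(distances) == len(set(distances))


def optimal_golomb(n):
    """Shortest Golomb ruler with n >= 2 marks, found by exhaustive search.

    Lengths are tried in increasing order, so the first feasible ruler
    is optimal. The first mark is fixed at 0 and the last at `length`.
    """
    for length in count(n - 1):
        for middle in combinations(range(1, length), n - 2):
            marks = (0, *middle, length)
            if is_golomb(marks):
                return marks


print(is_golomb([0, 1, 4, 6]))  # True
print(is_golomb([0, 1, 2, 4]))  # False: distance 1 occurs twice
print(optimal_golomb(4))        # (0, 1, 4, 6)
```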