Cakewalk Sampling
We study the task of finding good local optima in combinatorial optimization
problems. Although combinatorial optimization is NP-hard in general, locally
optimal solutions are frequently used in practice. Local search methods, however,
typically converge to a limited set of optima that depend on their
initialization. Sampling methods, on the other hand, can access any valid
solution, and thus can be used either directly or alongside methods of the
former type as a way of finding good local optima. Since the effectiveness of
this strategy depends on the sampling distribution, we derive a robust learning
algorithm that adapts sampling distributions towards good local optima of
arbitrary objective functions. As a first use case, we empirically study the
efficiency with which sampling methods can recover locally maximal cliques in
undirected graphs. Not only do we show how our adaptive sampler outperforms
related methods, we also show how it can even approach the performance of
established clique algorithms. As a second use case, we consider how greedy
algorithms can be combined with our adaptive sampler, and we demonstrate how
this leads to superior performance in k-medoid clustering. Together, these
findings suggest that our adaptive sampler can provide an effective strategy for
combinatorial optimization problems that arise in practice. Comment: Accepted as a conference paper by AAAI-2020 (oral presentation)
Improving the smoothed complexity of FLIP for max cut problems
Finding locally optimal solutions for max-cut and max-$k$-cut are well-known
PLS-complete problems. An instinctive approach to finding such a locally
optimal solution is the FLIP method. Even though FLIP requires exponential time
in worst-case instances, it tends to terminate quickly in practical instances.
To explain this discrepancy, the run-time of FLIP has been studied in the
smoothed complexity framework. Etscheid and Röglin showed that the smoothed
complexity of FLIP for max-cut in arbitrary graphs is quasi-polynomial. Angel,
Bubeck, Peres, and Wei showed that the smoothed complexity of FLIP for max-cut
in complete graphs is $O(\phi^5 n^{15.1})$, where $\phi$ is an upper bound on
the random edge-weight density and $n$ is the number of vertices in the input
graph.
While Angel et al.'s result showed the first polynomial smoothed complexity,
they also conjectured that their run-time bound is far from optimal. In this
work, we make substantial progress towards improving the run-time bound. We
prove that the smoothed complexity of FLIP in complete graphs is
$O(\phi n^{7.83})$. Our results are based on a carefully chosen matrix whose rank
captures the run-time of the method along with improved rank bounds for this
matrix and an improved union bound based on this matrix. In addition, our
techniques provide a general framework for analyzing FLIP in the smoothed
framework. We illustrate this general framework by showing that the smoothed
complexity of FLIP for max-3-cut in complete graphs is polynomial and for
max-$k$-cut in arbitrary graphs is quasi-polynomial. We believe that our
techniques should also be of interest towards addressing the smoothed
complexity of FLIP for max-$k$-cut in complete graphs for larger constants $k$. Comment: 36 pages
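The FLIP method discussed in this abstract can be illustrated with a minimal Python sketch for ordinary max-cut. This is an illustration under stated assumptions, not the paper's analysis or implementation: the names `cut_weight`, `flip_gain`, and `flip` are hypothetical, the graph is a dense symmetric weight matrix, and the pivot rule (flip the first improving vertex) is an arbitrary choice made here.

```python
import random


def cut_weight(weights, side):
    """Total weight of edges whose endpoints lie on different sides."""
    n = len(side)
    return sum(weights[u][v]
               for u in range(n) for v in range(u + 1, n)
               if side[u] != side[v])


def flip_gain(weights, side, v):
    """Change in cut weight if vertex v moves to the other side."""
    gain = 0
    for u in range(len(side)):
        if u == v:
            continue
        # An uncut edge at v becomes cut (+w); a cut edge becomes uncut (-w).
        gain += weights[v][u] if side[u] == side[v] else -weights[v][u]
    return gain


def flip(weights, side):
    """Flip single vertices while some flip strictly improves the cut."""
    side = list(side)
    steps = 0
    while True:
        improving = [v for v in range(len(side))
                     if flip_gain(weights, side, v) > 0]
        if not improving:
            return side, steps  # no single flip helps: a local optimum
        v = improving[0]  # assumed pivot rule: first improving vertex
        side[v] = 1 - side[v]
        steps += 1


# Small complete graph with random integer weights (illustrative data only).
rng = random.Random(0)
n = 6
weights = [[0] * n for _ in range(n)]
for u in range(n):
    for v in range(u + 1, n):
        weights[u][v] = weights[v][u] = rng.randint(-5, 5)
start = [rng.randint(0, 1) for _ in range(n)]
local_opt, steps = flip(weights, start)
```

Each flip strictly increases the cut weight, so the loop must terminate; the smoothed-complexity results above concern how many such steps are needed when the edge weights are randomly perturbed.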
Globally Optimal Crowdsourcing Quality Management
We study crowdsourcing quality management, that is, given worker responses to
a set of tasks, our goal is to jointly estimate the true answers for the tasks,
as well as the quality of the workers. Prior work on this problem relies
primarily on applying Expectation-Maximization (EM) on the underlying maximum
likelihood problem to estimate true answers as well as worker quality.
Unfortunately, EM only provides a locally optimal solution rather than a
globally optimal one. Other solutions to the problem (that do not leverage EM)
fail to provide global optimality guarantees as well. In this paper, we focus
on filtering, where tasks require the evaluation of a yes/no predicate, and
rating, where tasks elicit integer scores from a finite domain. We design
algorithms for finding the globally optimal estimates of correct task answers and
worker quality for the underlying maximum likelihood problem, and characterize
the complexity of these algorithms. Our algorithms conceptually consider all
mappings from tasks to true answers (typically a very large number), leveraging
two key ideas to reduce, by several orders of magnitude, the number of mappings
under consideration, while preserving optimality. We also demonstrate that
these algorithms often find more accurate estimates than EM-based algorithms.
This paper makes an important contribution towards understanding the inherent
complexity of globally optimal crowdsourcing quality management.
Spectral Sequence Motif Discovery
Sequence discovery tools play a central role in several fields of
computational biology. In the framework of Transcription Factor binding
studies, motif finding algorithms of increasingly high performance are required
to process the big datasets produced by new high-throughput sequencing
technologies. Most existing algorithms are computationally demanding and often
cannot support the large size of new experimental data. We present a new motif
discovery algorithm that is built on a recent machine learning technique,
referred to as Method of Moments. Based on spectral decompositions, this method
is robust under model misspecification and is not prone to locally optimal
solutions. We obtain an algorithm that is extremely fast and designed for the
analysis of big sequencing data. In a few minutes, we can process datasets of
hundreds of thousands of sequences and extract motif profiles that match those
computed by various state-of-the-art algorithms. Comment: 20 pages, 3 figures, 1 table
Largest small polygons: A sequential convex optimization approach
A small polygon is a polygon of unit diameter. The maximal area of a small
polygon with $n = 2m$ vertices is not known when $m \ge 7$. Finding the largest
small $n$-gon for a given number $n \ge 3$ can be formulated as a nonconvex
quadratically constrained quadratic optimization problem. We propose to solve
this problem with a sequential convex optimization approach, which is an ascent
algorithm guaranteeing convergence to a locally optimal solution. Numerical
experiments on polygons with up to $n = 128$ sides suggest that the optimal
solutions obtained are near-global. Indeed, for even $6 \le n \le 12$, the algorithm
proposed in this work converges to known global optimal solutions found in the
literature.
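To make the definition of a small polygon concrete, the following Python sketch builds a regular polygon, rescales it to unit diameter, and computes its area with the shoelace formula. It only illustrates the objective (area) and the constraint (unit diameter); it is not the sequential convex optimization approach of the paper, and the function names are hypothetical.

```python
import math
from itertools import combinations


def diameter(points):
    """Largest distance between any two vertices."""
    return max(math.dist(p, q) for p, q in combinations(points, 2))


def area(points):
    """Shoelace formula for a simple polygon given in vertex order."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def regular_small_polygon(n):
    """Regular n-gon rescaled so that its diameter equals 1."""
    pts = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
           for k in range(n)]
    d = diameter(pts)
    return [(x / d, y / d) for x, y in pts]


hexagon = regular_small_polygon(6)
hexagon_area = area(hexagon)  # equals 3*sqrt(3)/8 for the regular hexagon
```

A regular small polygon is a natural feasible starting point, but for even n of 6 or more it is known not to be the area maximizer, which is why an ascent method over the vertex coordinates is of interest.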
Finding High-Dimensional D-Optimal Designs for Logistic Models via Differential Evolution
D-optimal designs are frequently used in controlled experiments to obtain the most accurate estimate of model parameters at minimal cost. Finding them can be a challenging task, especially when there are many factors in a nonlinear model. As the number of factors becomes large and the factors interact with one another, there are many more variables to optimize and the D-optimal design problem becomes high-dimensional and non-separable. Consequently, premature convergence issues arise. Candidate solutions get trapped in local optima and the classical gradient-based optimization approaches to search for the D-optimal designs rarely succeed. We propose a specially designed version of differential evolution (DE), which is a representative gradient-free optimization approach, to solve such high-dimensional optimization problems. The proposed specially designed DE uses a new novelty-based mutation strategy to explore the various regions in the search space. The exploration of the regions will be carried out differently from the previously explored regions and the diversity of the population can be preserved. The proposed novelty-based mutation strategy is combined with two common DE mutation strategies to balance exploration and exploitation at the early or medium stage of the evolution. Additionally, we adapt the control parameters of DE as the evolution proceeds. Using logistic models with several factors on various design spaces as examples, our simulation results show our algorithm can find D-optimal designs efficiently and the algorithm outperforms its competitors. As an application, we apply our algorithm and re-design a 10-factor car refueling experiment with discrete and continuous factors and selected pairwise interactions. Our proposed algorithm was able to consistently outperform the other algorithms and find a more efficient D-optimal design for the problem.
Detecting hierarchical and overlapping network communities using locally optimal modularity changes
Agglomerative clustering is a well-established strategy for identifying
communities in networks. Communities are successively merged into larger
communities, coarsening a network of actors into a more manageable network of
communities. The order in which merges should occur is not in general clear,
necessitating heuristics for selecting pairs of communities to merge. We
describe a hierarchical clustering algorithm based on a local optimality
property. For each edge in the network, we associate the modularity change for
merging the communities it links. For each community vertex, we call the
preferred edge that edge for which the modularity change is maximal. When an
edge is preferred by both vertices that it links, it appears to be the optimal
choice from the local viewpoint. We use the locally optimal edges to define the
algorithm: simultaneously merge all pairs of communities that are connected by
locally optimal edges that would increase the modularity, redetermining the
locally optimal edges after each step and continuing so long as the modularity
can be further increased. We apply the algorithm to model and empirical
networks, demonstrating that it can efficiently produce high-quality community
solutions. We relate the performance and implementation details to the
structure of the resulting community hierarchies. We additionally consider a
complementary local clustering algorithm, describing how to identify
overlapping communities based on the local optimality condition. Comment: 10 pages; 4 tables, 3 figures
A polynomial training algorithm for calculating perceptrons of optimal stability
Recomi (REpeated COrrelation Matrix Inversion) is a polynomially fast
algorithm for searching optimally stable solutions of the perceptron learning
problem. For random unbiased and biased patterns it is shown that the algorithm
is able to find optimal solutions, if any exist, in at worst O(N^4) floating
point operations. Even beyond the critical storage capacity alpha_c the
algorithm is able to find locally stable solutions (with negative stability) at
the same speed. There are no divergent time scales in the learning process. A
full proof of convergence cannot yet be given, only major constituents of a
proof are shown. Comment: 11 pages, Latex, 4 EPS figures