On Local Regret
Online learning aims to perform nearly as well as the best hypothesis in
hindsight. For some hypothesis classes, though, even finding the best
hypothesis offline is challenging. In such offline cases, local search
techniques are often employed and only local optimality guaranteed. For online
decision-making with such hypothesis classes, we introduce local regret, a
generalization of regret that aims to perform nearly as well as only nearby
hypotheses. We then present a general algorithm to minimize local regret with
arbitrary locality graphs. We also show how the graph structure can be
exploited to drastically speed learning. These algorithms are then demonstrated
on a diverse set of online problems: online disjunct learning, online Max-SAT,
and online decision tree learning.
Comment: This is the longer version of the same-titled paper appearing in the
Proceedings of the Twenty-Ninth International Conference on Machine Learning
(ICML), 201
A Duality Based 2-Approximation Algorithm for Maximum Agreement Forest
We give a 2-approximation algorithm for the Maximum Agreement Forest problem
on two rooted binary trees. This NP-hard problem has been studied extensively
in the past two decades, since it can be used to compute the rooted Subtree
Prune-and-Regraft (rSPR) distance between two phylogenetic trees. Our algorithm
is combinatorial and its running time is quadratic in the input size. To prove
the approximation guarantee, we construct a feasible dual solution for a novel
linear programming formulation. In addition, we show this linear program is
stronger than previously known formulations, and we give a compact formulation,
showing that it can be solved in polynomial time.
Prefix Discrepancy, Smoothed Analysis, and Combinatorial Vector Balancing
A well-known result of Banaszczyk in discrepancy theory concerns the prefix
discrepancy problem (also known as the signed series problem): given a sequence
of $T$ unit vectors in $\mathbb{R}^d$, find $\pm$ signs for each of them such
that the signed sum vector along any prefix has a small $\ell_\infty$-norm.
This problem is central to proving upper bounds for the Steinitz problem, and
the popular Koml\'os problem is a special case where one is only concerned with
the final signed sum vector instead of all prefixes. Banaszczyk gave an
$O(\sqrt{\log d + \log T})$ bound for the prefix discrepancy problem. We
investigate the tightness of Banaszczyk's bound and consider natural
generalizations of prefix discrepancy:
We first consider a smoothed analysis setting, where a small amount of
additive noise perturbs the input vectors. We show an exponential improvement
in $T$ compared to Banaszczyk's bound. Using a primal-dual approach and a
careful chaining argument, we show that one can achieve a bound of
$O(\sqrt{\log d + \log\log T})$ with high probability in the smoothed setting.
Moreover, this smoothed analysis bound is the best possible without further
improvement on Banaszczyk's bound in the worst case.
We also introduce a generalization of the prefix discrepancy problem where
the discrepancy constraints correspond to paths on a DAG on $T$ vertices. We
show that an analog of Banaszczyk's $O(\sqrt{\log d + \log T})$ bound continues
to hold in this setting for adversarially given unit vectors and that the
$\sqrt{\log T}$ factor is unavoidable for DAGs. We also show that the
dependence on $T$ cannot be improved significantly in the smoothed case for
DAGs.
We conclude by exploring a more general notion of vector balancing, which we
call combinatorial vector balancing. We obtain near-optimal bounds in this
setting, up to poly-logarithmic factors.
Comment: 22 pages. Appear in ITCS 202
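To make the objective in this abstract concrete, here is a minimal Python sketch that evaluates the prefix discrepancy of a given signing, i.e. the largest norm of any signed prefix sum. It assumes the norm in question is the max (infinity) norm. The names `prefix_discrepancy` and `best_signing_bruteforce` and the toy input are illustrative assumptions, not taken from the paper, and the exhaustive search is only a baseline for tiny inputs, not the paper's method.

```python
from itertools import product


def prefix_discrepancy(vectors, signs):
    """Largest max-norm over all signed prefix sums of `vectors`."""
    running = [0.0] * len(vectors[0])
    worst = 0.0
    for vec, sign in zip(vectors, signs):
        # Extend the prefix by one signed vector.
        running = [r + sign * x for r, x in zip(running, vec)]
        worst = max(worst, max(abs(r) for r in running))
    return worst


def best_signing_bruteforce(vectors):
    """Try all 2^T signings; feasible only for very small T."""
    return min(
        (prefix_discrepancy(vectors, signs), signs)
        for signs in product((1, -1), repeat=len(vectors))
    )


# Toy input: four unit vectors in the plane (hypothetical example data).
vectors = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
all_plus = prefix_discrepancy(vectors, (1, 1, 1, 1))
best_value, best_signs = best_signing_bruteforce(vectors)
print(all_plus, best_value, best_signs)
```

On this input the all-plus signing lets the prefix sums drift, while alternating the sign of repeated directions keeps every prefix within norm 1.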
Phase transition in the sample complexity of likelihood-based phylogeny inference
Reconstructing evolutionary trees from molecular sequence data is a
fundamental problem in computational biology. Stochastic models of sequence
evolution are closely related to spin systems that have been extensively
studied in statistical physics and that connection has led to important
insights on the theoretical properties of phylogenetic reconstruction
algorithms as well as the development of new inference methods. Here, we study
maximum likelihood, a classical statistical technique which is perhaps the most
widely used in phylogenetic practice because of its superior empirical
accuracy.
At the theoretical level, except for its consistency, that is, the guarantee
of eventual correct reconstruction as the size of the input data grows, much
remains to be understood about the statistical properties of maximum likelihood
in this context. In particular, the best bounds on the sample complexity or
sequence-length requirement of maximum likelihood, that is, the amount of data
required for correct reconstruction, are exponential in the number, $n$, of
tips---far from known lower bounds based on information-theoretic arguments.
Here we close the gap by proving a new upper bound on the sequence-length
requirement of maximum likelihood that matches up to constants the known lower
bound for some standard models of evolution.
More specifically, for the $r$-state symmetric model of sequence evolution on
a binary phylogeny with bounded edge lengths, we show that the sequence-length
requirement behaves logarithmically in $n$ when the expected amount of mutation
per edge is below what is known as the Kesten-Stigum threshold. In general, the
sequence-length requirement is polynomial in $n$. Our results imply moreover
that the maximum likelihood estimator can be computed efficiently on randomly
generated data provided sequences are as above.
Comment: To appear in Probability Theory and Related Field
Computing Optimal Steiner Trees in Polynomial Space
Given an $n$-node edge-weighted graph and a subset of $k$ terminal nodes, the NP-hard (weighted) Steiner tree problem is to compute a minimum-weight tree which spans the terminals. All the known algorithms for this problem which improve on trivial $O(1.62^n)$-time enumeration are based on dynamic programming, and require exponential space. Motivated by the fact that exponential-space algorithms are typically impractical, in this paper we address the problem of designing faster polynomial-space algorithms. Our first contribution is a simple $O((27/4)^k n^{O(\log k)})$-time polynomial-space algorithm for the problem. This algorithm is based on a variant of the classical tree-separator theorem: every Steiner tree has a node whose removal partitions the tree into two forests, containing at most $2k/3$ terminals each. Exploiting separators of logarithmic size which evenly partition the terminals, we are able to reduce the running time to $O(4^k n^{O(\log k)})$. This improves on trivial enumeration for roughly $k < n/3$, which covers most of the cases of practical interest. Combining the latter algorithm (for small $k$) with trivial enumeration (for large $k$) we obtain an $O(1.59^n)$-time polynomial-space algorithm for the weighted Steiner tree problem. As a second contribution of this paper, we present an $O(1.55^n)$-time polynomial-space algorithm for the cardinality version of the problem, where all edge weights are one. This result is based on an improved branching strategy. The refined branching is based on a charging mechanism which shows that, for large values of $k$, convenient local configurations of terminals and non-terminals exist. The analysis of the algorithm relies on the Measure & Conquer approach: the non-standard measure used here is a linear combination of the number of nodes and number of non-terminals. Using a recent result in Nederlof (International colloquium on automata, languages and programming (ICALP), pp.713-725, 2009), the running time can be reduced to $O(1.36^n)$. The previous best algorithm for the cardinality case runs in $O(1.42^n)$ time and exponential space.
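The tree-separator property quoted in this abstract can be checked on an explicitly given tree. The sketch below, under stated assumptions, finds a node whose removal leaves components with at most k/2 terminals each (a terminal-weighted centroid) and then groups those components into two forests with at most 2k/3 terminals each. The function name `terminal_separator`, the adjacency-dict representation, and the example path are hypothetical. This only illustrates the separator property; it is not the paper's Steiner tree algorithm.

```python
def terminal_separator(adj, terminals):
    """Return (node, counts_a, counts_b) for a tree given as an adjacency dict.

    counts_a and counts_b list the terminal counts of the components of
    tree - node, grouped into two forests with at most 2k/3 terminals each.
    """
    k = len(terminals)
    root = next(iter(adj))
    parent, order, stack = {root: None}, [], [root]
    while stack:  # iterative DFS: every parent precedes its descendants
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if w != parent[v]:
                parent[w] = v
                stack.append(w)
    # count[v] = number of terminals in the subtree rooted at v
    count = {v: int(v in terminals) for v in adj}
    for v in reversed(order):
        if parent[v] is not None:
            count[parent[v]] += count[v]
    for v in order:
        parts = [count[w] for w in adj[v] if w != parent[v]]
        if parent[v] is not None:
            parts.append(k - count[v])  # component on the parent side
        if all(2 * p <= k for p in parts):  # weighted centroid found
            break
    # Greedy grouping: fill forest A until it holds at least k/3 terminals.
    parts.sort(reverse=True)
    group_a = []
    while parts and 3 * sum(group_a) < k:
        group_a.append(parts.pop(0))
    return v, group_a, parts


# Hypothetical example: the path 0-1-2-3-4-5 with four terminals.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
terminals = {0, 2, 3, 5}
node, forest_a, forest_b = terminal_separator(adj, terminals)
print(node, forest_a, forest_b)
```

The grouping step works because every component has at most k/2 terminals: either the largest component alone already holds at least k/3 of them, or all components are smaller than k/3 and the greedy fill stops below 2k/3.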
Constant-time dynamic (∆+1)-coloring
We give a fully dynamic (Las-Vegas style) algorithm with constant expected amortized time per update that maintains a proper (∆ + 1)-vertex coloring of a graph with maximum degree at most ∆. This improves upon the previous O(log ∆)-time algorithm by Bhattacharya et al. (SODA 2018). We show that our result not only has optimal running time, but is also optimal in the sense that already deciding whether a ∆-coloring exists in a dynamically changing graph with maximum degree at most ∆ takes Ω(log n) time per operation.
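For orientation, the following is a naive baseline for the problem this abstract describes, not the paper's constant-time algorithm. It maintains a proper coloring with ∆ + 1 colors under edge insertions and deletions by rescanning one endpoint's neighborhood on a conflict, which costs O(∆) per update. The class name `NaiveDynamicColoring` and its methods are hypothetical, and the sketch assumes the caller never lets a vertex exceed degree ∆.

```python
class NaiveDynamicColoring:
    """Keeps a proper (max_degree + 1)-coloring; O(max_degree) time per update."""

    def __init__(self, max_degree):
        self.palette = range(max_degree + 1)
        self.adj = {}
        self.color = {}

    def _add_vertex(self, v):
        if v not in self.adj:
            self.adj[v] = set()
            self.color[v] = 0

    def insert_edge(self, u, v):
        # Assumption: u != v and both degrees stay at most max_degree.
        self._add_vertex(u)
        self._add_vertex(v)
        self.adj[u].add(v)
        self.adj[v].add(u)
        if self.color[u] == self.color[v]:
            # u has at most max_degree neighbors, so some color is free.
            used = {self.color[w] for w in self.adj[u]}
            self.color[u] = next(c for c in self.palette if c not in used)

    def delete_edge(self, u, v):
        # Deleting an edge never breaks a proper coloring.
        self.adj[u].discard(v)
        self.adj[v].discard(u)

    def is_proper(self):
        return all(
            self.color[u] != self.color[w]
            for u in self.adj
            for w in self.adj[u]
        )


# Hypothetical usage: build a triangle with maximum degree 2 (three colors).
coloring = NaiveDynamicColoring(max_degree=2)
for edge in [("a", "b"), ("b", "c"), ("a", "c")]:
    coloring.insert_edge(*edge)
print(coloring.color, coloring.is_proper())
```

The free color always exists because a vertex of degree at most ∆ sees at most ∆ distinct colors among its neighbors, leaving at least one of the ∆ + 1 palette colors unused.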