172 research outputs found

    Tight Lower Bounds for Multiplicative Weights Algorithmic Families

    We study the fundamental problem of prediction with expert advice and develop regret lower bounds for a large family of algorithms for this problem. We develop simple adversarial primitives that lend themselves to various combinations, leading to sharp lower bounds for many algorithmic families. We use these primitives to show that the classic Multiplicative Weights Algorithm (MWA) has a regret of $\sqrt{\frac{T \ln k}{2}}$, thereby completely closing the gap between upper and lower bounds. We further show a regret lower bound of $\frac{2}{3}\sqrt{\frac{T \ln k}{2}}$ for a much more general family of algorithms than MWA, where the learning rate can be arbitrarily varied over time, or even picked from arbitrary distributions over time. We also use our primitives to construct adversaries in the geometric horizon setting for MWA to precisely characterize the regret at $\frac{0.391}{\sqrt{\delta}}$ for the case of $2$ experts and a lower bound of $\frac{1}{2}\sqrt{\frac{\ln k}{2\delta}}$ for the case of an arbitrary number of experts $k$
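
For orientation, the quantity $\sqrt{\frac{T \ln k}{2}}$ in the abstract above is the standard regret upper bound for the exponentially weighted forecaster with a fixed learning rate. The following is a minimal sketch of that forecaster, not the paper's adversarial constructions. It assumes losses in $[0, 1]$, $T$ rounds, $k$ experts, and the textbook tuning $\eta = \sqrt{8 \ln k / T}$; the function name `hedge_regret` and the random loss sequence are illustrative choices.

```python
import math
import random


def hedge_regret(losses, eta):
    """Run multiplicative weights with learning rate eta.

    losses: list of T rounds, each a list of k expert losses in [0, 1].
    Returns the expected loss of the algorithm minus the loss of the
    best single expert in hindsight.
    """
    k = len(losses[0])
    log_w = [0.0] * k   # log-weights, kept in log space for stability
    cum = [0.0] * k     # cumulative loss of each expert
    alg_loss = 0.0
    for round_losses in losses:
        top = max(log_w)
        w = [math.exp(x - top) for x in log_w]
        total = sum(w)
        p = [x / total for x in w]  # distribution over experts
        alg_loss += sum(pi * li for pi, li in zip(p, round_losses))
        for i, li in enumerate(round_losses):
            log_w[i] -= eta * li    # multiplicative update exp(-eta * loss)
            cum[i] += li
    return alg_loss - min(cum)


# Illustrative run on random binary losses (an assumption, not an adversary).
random.seed(0)
T, k = 1000, 4
losses = [[float(random.random() < 0.5) for _ in range(k)] for _ in range(T)]
eta = math.sqrt(8 * math.log(k) / T)
bound = math.sqrt(T * math.log(k) / 2)
print(hedge_regret(losses, eta) <= bound)
```

On benign sequences like this one the regret typically falls well below the bound; the abstract's claim concerns adversarial sequences on which it is attained.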

    The Limitations of Optimization from Samples

    In this paper we consider the following question: can we optimize objective functions from the training data we use to learn them? We formalize this question through a novel framework we call optimization from samples (OPS). In OPS, we are given sampled values of a function drawn from some distribution and the objective is to optimize the function under some constraint. While there are interesting classes of functions that can be optimized from samples, our main result is an impossibility. We show that there are classes of functions which are statistically learnable and optimizable, but for which no reasonable approximation for optimization from samples is achievable. In particular, our main result shows that there is no constant factor approximation for maximizing coverage functions under a cardinality constraint using polynomially-many samples drawn from any distribution. We also show tight approximation guarantees for maximization under a cardinality constraint of several interesting classes of functions including unit-demand, additive, and general monotone submodular functions, as well as a constant factor approximation for monotone submodular functions with bounded curvature

    Sketch-based Randomized Algorithms for Dynamic Graph Regression

    A well-known problem in data science and machine learning is linear regression, which has recently been extended to dynamic graphs. Existing exact algorithms for updating the solution of the dynamic graph regression problem require at least linear time (in terms of $n$, the size of the graph). However, this time complexity might be intractable in practice. In the current paper, we utilize the subsampled randomized Hadamard transform and CountSketch to propose the first randomized algorithms. Suppose that we are given an $n \times m$ matrix embedding $M$ of the graph, where $m \ll n$. Let $r$ be the number of samples required for a guaranteed approximation error, which is a sublinear function of $n$. Our first algorithm reduces the time complexity of pre-processing to $O(n(m + 1) + 2n(m + 1) \log_2(r + 1) + rm^2)$. Then after an edge insertion or an edge deletion, it updates the approximate solution in $O(rm)$ time. Our second algorithm reduces the time complexity of pre-processing to $O\left(\mathrm{nnz}(M) + m^3 \epsilon^{-2} \log^7(m/\epsilon)\right)$, where $\mathrm{nnz}(M)$ is the number of nonzero elements of $M$. Then after an edge insertion, an edge deletion, a node insertion, or a node deletion, it updates the approximate solution in $O(qm)$ time, with $q = O\left(\frac{m^2}{\epsilon^2} \log^6(m/\epsilon)\right)$. Finally, we show that under some assumptions, if $\ln n < \epsilon^{-1}$ our first algorithm outperforms our second algorithm, and if $\ln n \geq \epsilon^{-1}$ our second algorithm outperforms our first algorithm
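
The $\mathrm{nnz}(M)$ term in the abstract above reflects how a CountSketch is applied: each row of $M$ is added, with a random sign, to a single randomly chosen row of the sketch. The following is a minimal sketch of that step only, under assumptions: the function name `count_sketch`, the list-of-rows representation, and the seeded generator are illustrative, and the paper's update procedure and regression solve are not reproduced here.

```python
import random


def count_sketch(rows, r, seed=0):
    """Compress a matrix (list of n rows, each of length m) to r rows.

    Every input row gets one bucket and one sign; it is added, with that
    sign, to the bucket's row of the output. Only nonzero entries cause
    an update.
    """
    rng = random.Random(seed)
    m = len(rows[0])
    sketch = [[0.0] * m for _ in range(r)]
    for row in rows:
        bucket = rng.randrange(r)          # hash of the row index
        sign = rng.choice((-1.0, 1.0))     # random sign of the row index
        for j, value in enumerate(row):
            if value != 0.0:
                sketch[bucket][j] += sign * value
    return sketch


# Illustrative 5 x 2 matrix compressed to 3 rows.
M = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
SM = count_sketch(M, r=3)
print(len(SM), len(SM[0]))
```

Because each row of the input contributes to exactly one row of the output, changing a single input row changes a single row of the sketch, which is the property that makes this kind of sketch attractive for dynamic updates.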

    Online Learning with an Almost Perfect Expert

    We study the multiclass online learning problem where a forecaster makes a sequence of predictions using the advice of $n$ experts. Our main contribution is to analyze the regime where the best expert makes at most $b$ mistakes and to show that when $b = o(\log_4 n)$, the expected number of mistakes made by the optimal forecaster is at most $\log_4 n + o(\log_4 n)$. We also describe an adversary strategy showing that this bound is tight and that the worst case is attained for binary prediction