16,848 research outputs found

    Distributed Learning with Infinitely Many Hypotheses

    Full text link
    We consider a distributed learning setup where a network of agents sequentially access realizations of a set of random variables with unknown distributions. The network objective is to find a parametrized distribution that best describes their joint observations in the sense of the Kullback-Leibler divergence. Apart from recent efforts in the literature, we analyze the case of countably many hypotheses and the case of a continuum of hypotheses. We provide non-asymptotic bounds for the concentration rate of the agents' beliefs around the correct hypothesis in terms of the number of agents, the network parameters, and the learning abilities of the agents. Additionally, we provide a novel motivation for a general set of distributed Non-Bayesian update rules as instances of the distributed stochastic mirror descent algorithm.Comment: Submitted to CDC201

    Discrete MDL Predicts in Total Variation

    Get PDF
    The Minimum Description Length (MDL) principle selects the model that has the shortest code for data plus model. We show that for a countable class of models, MDL predictions are close to the true distribution in a strong sense. The result is completely general. No independence, ergodicity, stationarity, identifiability, or other assumption on the model class need to be made. More formally, we show that for any countable class of models, the distributions selected by MDL (or MAP) asymptotically predict (merge with) the true measure in the class in total variation distance. Implications for non-i.i.d. domains like time-series forecasting, discriminative learning, and reinforcement learning are discussed.Comment: 15 LaTeX page

    Hypothesis Testing in Feedforward Networks with Broadcast Failures

    Full text link
    Consider a countably infinite set of nodes, which sequentially make decisions between two given hypotheses. Each node takes a measurement of the underlying truth, observes the decisions from some immediate predecessors, and makes a decision between the given hypotheses. We consider two classes of broadcast failures: 1) each node broadcasts a decision to the other nodes, subject to random erasure in the form of a binary erasure channel; 2) each node broadcasts a randomly flipped decision to the other nodes in the form of a binary symmetric channel. We are interested in whether there exists a decision strategy consisting of a sequence of likelihood ratio tests such that the node decisions converge in probability to the underlying truth. In both cases, we show that if each node only learns from a bounded number of immediate predecessors, then there does not exist a decision strategy such that the decisions converge in probability to the underlying truth. However, in case 1, we show that if each node learns from an unboundedly growing number of predecessors, then the decisions converge in probability to the underlying truth, even when the erasure probabilities converge to 1. We also derive the convergence rate of the error probability. In case 2, we show that if each node learns from all of its previous predecessors, then the decisions converge in probability to the underlying truth when the flipping probabilities of the binary symmetric channels are bounded away from 1/2. In the case where the flipping probabilities converge to 1/2, we derive a necessary condition on the convergence rate of the flipping probabilities such that the decisions still converge to the underlying truth. We also explicitly characterize the relationship between the convergence rate of the error probability and the convergence rate of the flipping probabilities

    Concentration for norms of infinitely divisible vectors with independent components

    Full text link
    We obtain dimension-free concentration inequalities for ℓp\ell^p-norms, p≥2p\geq2, of infinitely divisible random vectors with independent coordinates and finite exponential moments. Besides such norms, the methods and results extend to some other classes of Lipschitz functions.Comment: Published in at http://dx.doi.org/10.3150/08-BEJ131 the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm

    The Laplace-Jaynes approach to induction

    Get PDF
    An approach to induction is presented, based on the idea of analysing the context of a given problem into `circumstances'. This approach, fully Bayesian in form and meaning, provides a complement or in some cases an alternative to that based on de Finetti's representation theorem and on the notion of infinite exchangeability. In particular, it gives an alternative interpretation of those formulae that apparently involve `unknown probabilities' or `propensities'. Various advantages and applications of the presented approach are discussed, especially in comparison to that based on exchangeability. Generalisations are also discussed.Comment: 38 pages, 1 figure. V2: altered discussion on some points, corrected typos, added reference

    Distributed stochastic optimization via matrix exponential learning

    Get PDF
    In this paper, we investigate a distributed learning scheme for a broad class of stochastic optimization problems and games that arise in signal processing and wireless communications. The proposed algorithm relies on the method of matrix exponential learning (MXL) and only requires locally computable gradient observations that are possibly imperfect and/or obsolete. To analyze it, we introduce the notion of a stable Nash equilibrium and we show that the algorithm is globally convergent to such equilibria - or locally convergent when an equilibrium is only locally stable. We also derive an explicit linear bound for the algorithm's convergence speed, which remains valid under measurement errors and uncertainty of arbitrarily high variance. To validate our theoretical analysis, we test the algorithm in realistic multi-carrier/multiple-antenna wireless scenarios where several users seek to maximize their energy efficiency. Our results show that learning allows users to attain a net increase between 100% and 500% in energy efficiency, even under very high uncertainty.Comment: 31 pages, 3 figure
    • …
    corecore