457 research outputs found

Cooperative Online Learning: Keeping your Neighbors Updated

Author: Cesa-Bianchi Nicolò
Cesari Tommaso R.
Monteleoni Claire
Publication date: 01/01/2020

We study an asynchronous online learning setting with a network of agents. At each time step, some of the agents are activated, requested to make a prediction, and pay the corresponding loss. The loss function is then revealed to these agents and also to their neighbors in the network. Our results characterize how much knowing the network structure affects the regret as a function of the model of agent activations. When activations are stochastic, the optimal regret (up to constant factors) is shown to be of order $\sqrt{\alpha T}$, where $T$ is the horizon and $\alpha$ is the independence number of the network. We prove that the upper bound is achieved even when agents have no information about the network structure. When activations are adversarial the situation changes dramatically: if agents ignore the network structure, an $\Omega(T)$ lower bound on the regret can be proven, showing that learning is impossible. However, when agents can choose to ignore some of their neighbors based on the knowledge of the network structure, we prove an $O(\sqrt{\overline{\chi} T})$ sublinear regret bound, where $\overline{\chi} \ge \alpha$ is the clique-covering number of the network.
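
The two graph quantities in this abstract can be made concrete on a toy network. The sketch below is only an illustration, not anything taken from the paper: it brute-forces the independence number and the clique-covering number of a 5-cycle, a graph chosen here as an assumption, and the function names are hypothetical. The exhaustive search is only practical for very small graphs.

```python
from itertools import combinations, product

def independence_number(n, edges):
    """Size of the largest set of pairwise non-adjacent vertices (brute force)."""
    adj = {frozenset(e) for e in edges}
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(frozenset(p) not in adj for p in combinations(subset, 2)):
                return size
    return 0

def clique_cover_number(n, edges):
    """Smallest number of cliques that partition the vertices (brute force)."""
    adj = {frozenset(e) for e in edges}
    for k in range(1, n + 1):
        for labels in product(range(k), repeat=n):
            # Two vertices may share a label only if they are adjacent.
            if all(labels[u] != labels[v] or frozenset((u, v)) in adj
                   for u, v in combinations(range(n), 2)):
                return k
    return n

# Toy network (an assumption for illustration): a cycle on 5 agents.
cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
alpha = independence_number(5, cycle5)    # 2
chi_bar = clique_cover_number(5, cycle5)  # 3
print(alpha, chi_bar)
```

On this graph the clique-covering number strictly exceeds the independence number, so the $\sqrt{\overline{\chi} T}$ rate is looser than the $\sqrt{\alpha T}$ rate by a constant factor.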

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

Gambling in a rigged casino: The adversarial multi-armed bandit problem

Author: N. Cesa-Bianchi
P. Auer
R. Schapire
Y. Freund

Research Papers in Economics

Regret Bounds for Reinforcement Learning with Policy Advice

Author: C. Tekin
M.L. Puterman
N. Cesa-Bianchi
R. Ortner
R.S. Sutton
T. Jaksch
Publication date: 01/01/2013

In some reinforcement learning problems an agent may be provided with a set of input policies, perhaps learned from prior experience or provided by advisors. We present a reinforcement learning with policy advice (RLPA) algorithm which leverages this input set and learns to use the best policy in the set for the reinforcement learning task at hand. We prove that RLPA has a sub-linear regret of $\tilde{O}(\sqrt{T})$ relative to the best input policy, and that both this regret and its computational complexity are independent of the size of the state and action space. Our empirical simulations support our theoretical analysis. This suggests RLPA may offer significant advantages in large domains where some prior good policies are provided.

arXiv.org e-Print Archive

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

Laplace's rule of succession in information geometry

Author: E Takimoto
L Tierney
N Cesa-Bianchi
PD Grünwald
R Krichevsky
S-I Amari
Publication date: 14/03/2015

Laplace's "add-one" rule of succession modifies the observed frequencies in a sequence of heads and tails by adding one to the observed counts. This improves prediction by avoiding zero probabilities and corresponds to a uniform Bayesian prior on the parameter. The canonical Jeffreys prior corresponds to the "add-one-half" rule. We prove that, for exponential families of distributions, such Bayesian predictors can be approximated by taking the average of the maximum likelihood predictor and the sequential normalized maximum likelihood predictor from information theory. Thus in this case it is possible to approximate Bayesian predictors without the cost of integrating or sampling in parameter space.
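
The "add-one" and "add-one-half" rules mentioned in this abstract are easy to state in code. The following is a minimal sketch for the coin-flip case only; the function name and the example counts are assumptions made for illustration, and the sketch does not cover the paper's approximation result for exponential families.

```python
from fractions import Fraction

def rule_of_succession(heads, tails, pseudo=Fraction(1)):
    """Predict P(next flip is heads) after adding `pseudo` to each observed count."""
    return (heads + pseudo) / (heads + tails + 2 * pseudo)

# Hypothetical data: three heads, no tails observed so far.
laplace = rule_of_succession(3, 0)                   # add-one: 4/5
jeffreys = rule_of_succession(3, 0, Fraction(1, 2))  # add-one-half: 7/8
print(laplace, jeffreys)
```

With the same data, the raw observed frequency would predict heads with probability 1 and so assign zero probability to tails, which is the failure both rules avoid.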

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

Byzantine Stochastic Gradient Descent

Author: Alistarh Dan-Adrian
Allen-Zhu Zeyuan
Bengio S.
Cesa-Bianchi N.
Garnett R.
Grauman K.
Larochelle H.
Li Jerry
Wallach H.
Publication date: 01/01/2018

This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the $m$ machines which allegedly compute stochastic gradients every iteration, an $\alpha$-fraction are Byzantine, and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds $\varepsilon$-approximate minimizers of convex functions in $T = \tilde{O}\big( \frac{1}{\varepsilon^2 m} + \frac{\alpha^2}{\varepsilon^2} \big)$ iterations. In contrast, traditional mini-batch SGD needs $T = O\big( \frac{1}{\varepsilon^2 m} \big)$ iterations, but cannot tolerate Byzantine failures. Further, we provide a lower bound showing that, up to logarithmic factors, our algorithm is information-theoretically optimal both in terms of sampling complexity and time complexity.
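
To get a feel for the two iteration counts quoted in this abstract, the sketch below evaluates both expressions with all constants and logarithmic factors dropped. This is purely an illustrative simplification: the numbers are scales, not real iteration counts, and the function names and sample values are assumptions.

```python
def minibatch_sgd_scale(eps, m):
    """1 / (eps^2 * m): the mini-batch SGD bound, constants dropped."""
    return 1.0 / (eps**2 * m)

def byzantine_sgd_scale(eps, m, alpha):
    """1 / (eps^2 * m) + alpha^2 / eps^2: the Byzantine-tolerant bound, constants and logs dropped."""
    return 1.0 / (eps**2 * m) + alpha**2 / eps**2

# Hypothetical setting: accuracy 0.1, 100 machines, 10% of them Byzantine.
eps, m, alpha = 0.1, 100, 0.1
print(minibatch_sgd_scale(eps, m), byzantine_sgd_scale(eps, m, alpha))
```

The extra term $\alpha^2 / \varepsilon^2$ does not shrink as $m$ grows, so with many machines it becomes the dominant cost of tolerating Byzantine workers.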

arXiv.org e-Print Archive

IST Austria: PubRep (Institute of Science and Technology)

Study on Alkane Patterns of Grassland Species from the Patagonian Steppe

Author: Bakker M. L.
Cepeda R. E.
Cesa A.
Marinelli C. B.
Publication venue: UKnowledge
Publication date: 21/03/2021

University of Kentucky

Statistical Mechanics of Linear and Nonlinear Time-Domain Ensemble Learning

Author: Cesa-Bianchi N.
Freund Y.
Hara K.
Inoue J. I.
Krogh A.
Miyoshi S.
Nishimori H.
Saad D.
Urbanczik R.
Publication venue: 'Japan Society of Applied Physics'
Publication date: 22/09/2006

Conventional ensemble learning combines students in the space domain. In this paper, however, we combine students in the time domain and call it time-domain ensemble learning. We analyze, compare, and discuss the generalization performance of time-domain ensemble learning for both a linear model and a nonlinear model. Analyzing in the framework of online learning using a statistical mechanical method, we show qualitatively different behaviors between the two models. In a linear model, the dynamical behaviors of the generalization error are monotonic. We analytically show that time-domain ensemble learning is twice as effective as conventional ensemble learning. Furthermore, the generalization error of a nonlinear model features nonmonotonic dynamical behaviors when the learning rate is small. We numerically show that the generalization performance can be improved remarkably by using this phenomenon and the divergence of students in the time domain. Comment: 11 pages, 7 figures

arXiv.org e-Print Archive

Crossref

Eph receptors are involved in the activity-dependent synaptic wiring in the mouse cerebellar cortex

Author: Cesa R
Ethell Im
Pasquale Eb
Premoselli Federica
Renna A
Strata Pier Giorgio
Publication date: 01/01/2011

Eph receptor tyrosine kinases are involved in many cellular processes. In the developing brain, they act as migratory and cell adhesive cues while in the adult brain they regulate dendritic spine plasticity. Here we show a new role for Eph receptor signalling in the cerebellar cortex. Cerebellar Purkinje cells are innervated by two different excitatory inputs. The climbing fibres contact the proximal dendritic domain of Purkinje cells, where synapse and spine density is low; the parallel fibres contact the distal dendritic domain, where synapse and spine density is high. Interestingly, Purkinje cells have the intrinsic ability to generate a high number of spines over their entire dendritic arborisations, which can be innervated by the parallel fibres. However, the climbing fibre input continuously exerts an activity-dependent repression on parallel fibre synapses, thus confining them to the distal Purkinje cell dendritic domain. Such repression persists after Eph receptor activation, but is overridden by Eph receptor inhibition with EphA4/Fc in neonatal cultured cerebellar slices as well as mature acute cerebellar slices, following in vivo infusion of the EphA4/Fc inhibitor and in EphB receptor-deficient mice. When electrical activity is blocked in vivo by tetrodotoxin leading to a high spine density in Purkinje cell proximal dendrites, stimulation of Eph receptor activation recapitulates the spine repressive effects of climbing fibres. These results suggest that Eph receptor signalling mediates the repression of spine proliferation induced by climbing fibre activity in Purkinje cell proximal dendrites. Such repression is necessary to maintain the correct architecture of the cerebellar cortex

Directory of Open Access Journals

PubMed Central

Institutional Research Information System University of Turin

Prediction with Expert Advice under Discounted Loss

Author: A. Chernov
B. Schölkopf
D. Haussler
D.A. Harville
E.F. Beckenbach
E.S. Gardner
J.F. Muth
M. Herbster
N. Cesa-Bianchi
R. Sutton
V. Vovk
Y. Kalnishkan
Publication date: 01/01/2010

We study prediction with expert advice in the setting where the losses are accumulated with some discounting, so that the impact of old losses may gradually vanish. We generalize the Aggregating Algorithm and the Aggregating Algorithm for Regression to this case, propose a suitable new variant of the exponential weights algorithm, and prove the respective loss bounds. Comment: 26 pages; expanded (2 remarks -> theorems), some misprints corrected
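
The idea of accumulating expert losses with discounting can be sketched with a generic exponentially weighted forecaster. This is not the Aggregating Algorithm or the variant proposed in the paper: the update rule, the parameter names `eta` and `discount`, and the sample losses are all assumptions chosen for illustration.

```python
import math

def discounted_exp_weights(expert_losses, eta=1.0, discount=0.9):
    """Return the weight vector used in each round.

    expert_losses[t][i] is the loss of expert i in round t. Each expert's
    cumulative loss is multiplied by `discount` before the new loss is added,
    so older losses count for less.
    """
    n = len(expert_losses[0])
    cumulative = [0.0] * n
    history = []
    for losses in expert_losses:
        lowest = min(cumulative)  # subtract the minimum for numerical stability
        raw = [math.exp(-eta * (c - lowest)) for c in cumulative]
        total = sum(raw)
        history.append([r / total for r in raw])
        cumulative = [discount * c + l for c, l in zip(cumulative, losses)]
    return history

# Hypothetical losses: expert 0 is wrong once, then both experts are perfect.
weights = discounted_exp_weights([[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
print(weights)
```

In this run the weights start uniform, shift toward expert 1 after expert 0's mistake, and then drift back toward uniform as that old loss is discounted.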

arXiv.org e-Print Archive

Crossref

University of Brighton Research Portal

University of Bedfordshire Repository

Annotation of the modular polyketide synthase and nonribosomal peptide synthetase gene clusters in the genome of Streptomyces tsukubaensis NRRL18488

Author: A. Sun
A.K. McCallum
D. Koller
F. Rosenblatt
F.R. Kschischang
J. Shawe-Taylor
K.S. Azoury
L. Devroye
M.E. Ruiz
N. Cesa-Bianchi
R. Rifkin
R.A. Horn
S.T. Dumais
V. Vovk
W. Hoeffding
Publication venue: American Society for Microbiology
Publication date: 01/01/2004

The high G+C content and large genome size make the sequencing and assembly of Streptomyces genomes more difficult than for other bacteria. Many pharmaceutically important natural products are synthesized by modular polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs). The analysis of such gene clusters is difficult if the genome sequence is not of the highest quality, because clusters can be distributed over several contigs, and sequencing errors can introduce apparent frameshifts into the large PKS and NRPS proteins. An additional problem is that the modular nature of the clusters results in the presence of imperfect repeats, which may cause assembly errors. The genome sequence of Streptomyces tsukubaensis NRRL18488 was scanned for potential PKS and NRPS modular clusters. A phylogenetic approach was used to identify multiple contigs belonging to the same cluster. Four PKS clusters and six NRPS clusters were identified. Contigs containing cluster sequences were analyzed in detail by using the ClustScan program, which suggested the order and orientation of the contigs. The sequencing of the appropriate PCR products confirmed the ordering and allowed the correction of apparent frameshifts resulting from sequencing errors. The product chemistry of such correctly assembled clusters could also be predicted. The analysis of one PKS cluster showed that it should produce a bafilomycin-like compound, and reverse transcription (RT)-PCR was used to show that the cluster was transcribed. © 2012, American Society for Microbiology.

We thank the Government of Slovenia, Ministry of Higher Education, Science and Technology (Slovenian Research Agency [ARRS]), for the award of grant no. J4-9331 and L4-2188 to H.P. We also thank the Ministry of the Economy, the JAPTI Agency, and the European Social Fund (contract no. 102/2008) for the funds awarded for the employment of G.K. This work was also funded by a cooperation grant of the German Academic Exchange Service (DAAD) and the Ministry of Science, Education, and Sports, Republic of Croatia (to J.C. and D.H.), and by grant 09/5 (to D.H.) from the Croatian Science Foundation. Peer reviewed.

Crossref

Digital.CSIC