
    On-line PCA with Optimal Regrets

    We carefully investigate the on-line version of PCA, where in each trial a learning algorithm plays a k-dimensional subspace and suffers the compression loss of the next instance when it is projected onto the chosen subspace. In this setting, we analyze two popular on-line algorithms, Gradient Descent (GD) and Exponentiated Gradient (EG). We show that both algorithms are essentially optimal in the worst case. This comes as a surprise, since EG is known to perform sub-optimally when the instances are sparse. This different behavior of EG for PCA is mainly related to the non-negativity of the loss in this case, which makes the PCA setting qualitatively different from other settings studied in the literature. Furthermore, we show that when considering regret bounds as a function of a loss budget, EG remains optimal and strictly outperforms GD. Next, we study an extension of the PCA setting in which Nature is allowed to play dense instances, that is, positive matrices with bounded largest eigenvalue. Again, we show that EG is optimal and strictly better than GD in this setting.
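    As a concrete illustration of the two update rules, the following sketch maintains a capped density-matrix parameter W (eigenvalues in [0, 1], trace k) whose expected compression loss on an instance x is x^T(I - W)x. It is a minimal sketch, not the paper's exact algorithms: the `cap_eigenvalues` projection is a rough iterative stand-in for the true projection onto the feasible set, and the step sizes are arbitrary.

```python
import numpy as np

def cap_eigenvalues(W, k):
    """Approximately project symmetric W onto {0 <= W <= I, tr W = k}
    by repeatedly shifting and clipping its eigenvalues (a rough
    stand-in for the exact projection, not the paper's routine)."""
    vals, vecs = np.linalg.eigh(W)
    for _ in range(100):
        vals = np.clip(vals + (k - vals.sum()) / len(vals), 0.0, 1.0)
        if abs(vals.sum() - k) < 1e-9:
            break
    return (vecs * vals) @ vecs.T

def online_pca_gd(xs, k, eta=0.05):
    """GD flavor: additive update with the loss gradient.  The expected
    compression loss of W on instance x is x^T (I - W) x."""
    n = xs.shape[1]
    W = np.eye(n) * (k / n)                      # uniform feasible start
    loss = 0.0
    for x in xs:
        loss += x @ (np.eye(n) - W) @ x
        W = cap_eigenvalues(W + eta * np.outer(x, x), k)
    return W, loss

def online_pca_eg(xs, k, eta=0.05):
    """EG flavor: multiplicative (matrix-exponentiated) update on the
    spectrum of W, followed by the same capping step."""
    n = xs.shape[1]
    W = np.eye(n) * (k / n)
    loss = 0.0
    for x in xs:
        loss += x @ (np.eye(n) - W) @ x
        vals, vecs = np.linalg.eigh(W)
        log_W = (vecs * np.log(np.maximum(vals, 1e-12))) @ vecs.T
        vals2, vecs2 = np.linalg.eigh(log_W + eta * np.outer(x, x))
        W = cap_eigenvalues((vecs2 * np.exp(vals2)) @ vecs2.T, k)
    return W, loss
```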

    Byzantine Stochastic Gradient Descent

    This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the $m$ machines which allegedly compute stochastic gradients at every iteration, an $\alpha$-fraction are Byzantine and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds $\varepsilon$-approximate minimizers of convex functions in $T = \tilde{O}\big( \frac{1}{\varepsilon^2 m} + \frac{\alpha^2}{\varepsilon^2} \big)$ iterations. In contrast, traditional mini-batch SGD needs $T = O\big( \frac{1}{\varepsilon^2 m} \big)$ iterations but cannot tolerate Byzantine failures. Further, we provide a lower bound showing that, up to logarithmic factors, our algorithm is information-theoretically optimal both in terms of sample complexity and time complexity.
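    The sketch below only illustrates the setting, not the paper's algorithm: it runs distributed SGD where each machine reports a stochastic gradient, a subset of machines behave adversarially, and the server aggregates with a coordinate-wise median as a simple stand-in for the paper's more refined aggregation rule.

```python
import numpy as np

def byzantine_robust_sgd(grad_fns, x0, steps, lr, byzantine_ids=frozenset()):
    """Distributed SGD sketch: every machine reports a stochastic
    gradient; machines in byzantine_ids report adversarial junk; the
    server aggregates with a coordinate-wise median (a simple stand-in
    robust aggregator, not the paper's rule)."""
    x = np.asarray(x0, dtype=float)
    adv = np.random.default_rng(0)
    for _ in range(steps):
        reports = [adv.normal(scale=100.0, size=x.shape) if i in byzantine_ids
                   else g(x) for i, g in enumerate(grad_fns)]
        x = x - lr * np.median(np.stack(reports), axis=0)
    return x

# Toy usage: minimize f(x) = ||x||^2 / 2 with 10 machines, 2 Byzantine.
def make_grad(seed):
    local = np.random.default_rng(seed)
    return lambda x: x + local.normal(scale=0.1, size=x.shape)

x_final = byzantine_robust_sgd([make_grad(s) for s in range(10)],
                               x0=np.ones(5), steps=500, lr=0.05,
                               byzantine_ids={0, 1})
print(np.linalg.norm(x_final))   # close to 0 despite the Byzantine machines
```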

    Functional Brain Imaging with Multi-Objective Multi-Modal Evolutionary Optimization

    Functional brain imaging is a source of spatio-temporal data mining problems. A new framework hybridizing multi-objective and multi-modal optimization is proposed to formalize these data mining problems, which are then addressed through Evolutionary Computation (EC). The merits of EC for spatio-temporal data mining are demonstrated, as the approach facilitates the modelling of the experts' requirements and flexibly accommodates their changing goals.
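    For readers unfamiliar with the multi-objective side of such frameworks, the snippet below shows the basic Pareto-dominance filter on which multi-objective evolutionary methods are built; it is a generic building block, not the paper's hybrid framework, and the function names are ours.

```python
import numpy as np

def dominates(a, b):
    """Pareto dominance for minimization: a dominates b if it is no
    worse on every objective and strictly better on at least one."""
    return np.all(a <= b) and np.any(a < b)

def pareto_front(population, objectives):
    """Return the non-dominated individuals under a vector-valued
    objective function -- the selection primitive that multi-objective
    evolutionary algorithms iterate on."""
    scores = [np.asarray(objectives(ind)) for ind in population]
    return [population[i] for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]
```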

    Statistical Mechanics of Linear and Nonlinear Time-Domain Ensemble Learning

    Conventional ensemble learning combines students in the space domain. In this paper, we instead combine students in the time domain and call this time-domain ensemble learning. We analyze, compare, and discuss the generalization performance of time-domain ensemble learning for both a linear model and a nonlinear model. Analyzing them within the framework of on-line learning using a statistical-mechanical method, we show qualitatively different behaviors between the two models. In the linear model, the dynamical behavior of the generalization error is monotonic. We analytically show that time-domain ensemble learning is twice as effective as conventional ensemble learning. Furthermore, the generalization error of the nonlinear model exhibits nonmonotonic dynamical behavior when the learning rate is small. We numerically show that the generalization performance can be improved remarkably by exploiting this phenomenon and the divergence of students in the time domain.
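    The flavor of time-domain ensembling can be reproduced with a toy linear student trained on-line on noisy teacher outputs: averaging the student's weight vectors collected at different times reduces the fluctuation term of the generalization error. This is a minimal sketch with arbitrary parameters, not the paper's statistical-mechanical setup.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, eta, noise = 100, 5000, 0.5, 0.5

B = rng.normal(size=N) / np.sqrt(N)       # teacher weight vector
J = np.zeros(N)                           # on-line linear student
snapshots = []

for t in range(T):
    x = rng.normal(size=N)
    y = B @ x + noise * rng.normal()      # noisy teacher output
    J += (eta / N) * (y - J @ x) * x      # on-line LMS update
    if t >= T // 2:                       # after a burn-in phase,
        snapshots.append(J.copy())        # collect students over time

def gen_error(w):
    """E_x[((w - B) . x)^2] / 2 for x ~ N(0, I)."""
    return 0.5 * np.sum((w - B) ** 2)

print("single final student:", gen_error(J))
print("time-domain ensemble:", gen_error(np.mean(snapshots, axis=0)))
```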

    Time series prediction via aggregation: an oracle bound including numerical cost

    We address the problem of forecasting a time series satisfying a Causal Bernoulli Shift model, using a parametric set of predictors. The aggregation technique provides a predictor with well-established and quite satisfying theoretical properties, expressed by an oracle inequality for the prediction risk. The numerical computation of the aggregated predictor usually relies on a Markov chain Monte Carlo method whose convergence must be evaluated. In particular, it is crucial to bound the number of simulations needed to achieve a numerical precision of the same order as the prediction risk. In this direction, we present a fairly general result which can be seen as an oracle inequality including the numerical cost of computing the predictor: the numerical cost appears by letting the oracle inequality depend on the number of simulations required in the Monte Carlo approximation. Some numerical experiments are then carried out to support our findings.
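    For intuition, the sketch below implements exponentially weighted (Gibbs) aggregation over a finite set of predictors; the paper's setting has a continuous parameter set, where these weights are approximated by MCMC and the number of simulations enters the bound. Predictor and parameter choices here are illustrative.

```python
import numpy as np

def aggregate_forecast(predictors, series, lam=1.0):
    """Exponentially weighted (Gibbs) aggregation over a finite set of
    predictors; weights are proportional to exp(-lam * cumulative
    squared loss).  The paper's continuous-parameter version
    approximates these weights by MCMC."""
    cum_loss = np.zeros(len(predictors))
    forecasts = []
    for t in range(1, len(series)):
        preds = np.array([p(series[:t]) for p in predictors])
        w = np.exp(-lam * (cum_loss - cum_loss.min()))   # stabilized weights
        forecasts.append(w @ preds / w.sum())            # aggregated forecast
        cum_loss += (preds - series[t]) ** 2             # update losses
    return np.array(forecasts)

# Toy usage: aggregate AR(1) predictors with different coefficients.
rng = np.random.default_rng(0)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.7 * x[t - 1] + 0.1 * rng.normal()

predictors = [lambda past, a=a: a * past[-1] for a in np.linspace(-0.9, 0.9, 19)]
f = aggregate_forecast(predictors, x)
print("aggregated MSE:", np.mean((f - x[1:]) ** 2))
```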

    Gene Function Classification Using Bayesian Models with Hierarchy-Based Priors

    We investigate the application of hierarchical classification schemes to the annotation of gene function based on several characteristics of protein sequences, including phylogenic descriptors, sequence-based attributes, and predicted secondary structure. We discuss three Bayesian models and compare their performance in terms of predictive accuracy. These models are the ordinary multinomial logit (MNL) model, a hierarchical model based on a set of nested MNL models, and an MNL model with a prior that introduces correlations between the parameters of classes that are nearby in the hierarchy. We also provide a new scheme for combining different sources of information. We use these models to predict the functional class of Open Reading Frames (ORFs) from the E. coli genome. The results from all three models show substantial improvement over previous methods, which were based on the C5 algorithm. The MNL model using a prior based on the hierarchy outperforms both the non-hierarchical MNL model and the nested MNL model. In contrast to previous attempts at combining these sources of information, our approach results in a higher accuracy rate than models that use each data source alone. Together, these results show that gene function can be predicted with higher accuracy than previously achieved, using Bayesian models that incorporate suitable prior information.
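    To make the hierarchy-based prior concrete, the sketch below fits a MAP estimate of an MNL model whose class coefficients are sums of Gaussian node effects along each class's root-to-leaf path in the hierarchy, so that nearby classes share parameters. It is a simplified stand-in (gradient-based MAP rather than the paper's full Bayesian inference), and all names are ours.

```python
import numpy as np

def map_hierarchical_mnl(X, y, paths, n_nodes, sigma2=1.0, lr=0.1, steps=500):
    """MAP fit of an MNL model whose coefficient vector for class c is
    the sum of Gaussian node effects along c's root-to-leaf path, so
    classes sharing ancestors get correlated parameters.  paths[c] is
    the list of node indices on that path."""
    n, d = X.shape
    phi = np.zeros((n_nodes, d))                 # one effect per tree node
    for _ in range(steps):
        beta = np.array([phi[p].sum(axis=0) for p in paths])   # class coefs
        logits = X @ beta.T
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        resid = -probs
        resid[np.arange(n), y] += 1.0            # one-hot(y) - probs
        grad = np.zeros_like(phi)
        for c, p in enumerate(paths):            # push class gradients down
            for node in p:                       # to the node effects
                grad[node] += resid[:, c] @ X
        phi += lr * (grad / n - phi / sigma2)    # likelihood + prior shrinkage
    return phi
```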

    GluRδ2 Expression in the Mature Cerebellum of Hotfoot Mice Promotes Parallel Fiber Synaptogenesis and Axonal Competition

    Glutamate receptor delta 2 (GluRdelta2) is selectively expressed in the cerebellum, exclusively in the spines of the Purkinje cells (PCs) that are in contact with parallel fibers (PFs). Although its structure is similar to that of ionotropic glutamate receptors, it has no channel function and its ligand is unknown. GluRdelta2-null mice, such as knockout and hotfoot mice, have profoundly altered cerebellar circuitry, which causes ataxia and impaired motor learning. Notably, GluRdelta2 in PC-PF synapses regulates their maturation and strengthening and induces long-term depression (LTD). In addition, GluRdelta2 participates in the highly territorial competition between the two excitatory inputs to the PC: the climbing fiber (CF), which innervates the proximal dendritic compartment, and the PF, which is connected to spiny distal branchlets. Recently, studies have suggested that GluRdelta2 acts as an adhesion molecule in PF synaptogenesis. Here, we provide in vivo and in vitro evidence that supports this hypothesis. Through lentiviral rescue in hotfoot mice, we noted a recovery of PC-PF contacts in the distal dendritic domain. In the proximal domain, we observed the formation of new spines that were innervated by PFs and a reduction in contact with the CF; i.e., the pattern of innervation in the PC shifted to favor the PF input. Moreover, ectopic expression of GluRdelta2 in HEK293 cells that were cocultured with granule cells, or in cerebellar Golgi cells in the mature brain, induced the formation of new PF contacts. Collectively, our observations show that GluRdelta2 is an adhesion molecule that induces the formation of PF contacts independently of its cellular localization and promotes heterosynaptic competition in the PC proximal dendritic domain.