301 research outputs found
On-line PCA with Optimal Regrets
We carefully investigate the on-line version of PCA, where in each trial a
learning algorithm plays a k-dimensional subspace, and suffers the compression
loss on the next instance when projected into the chosen subspace. In this
setting, we analyze two popular on-line algorithms, Gradient Descent (GD) and
Exponentiated Gradient (EG). We show that both algorithms are essentially
optimal in the worst-case. This comes as a surprise, since EG is known to
perform sub-optimally when the instances are sparse. This different behavior of
EG for PCA is mainly related to the non-negativity of the loss in this case,
which makes the PCA setting qualitatively different from other settings studied
in the literature. Furthermore, we show that when considering regret bounds as
a function of a loss budget, EG remains optimal and strictly outperforms GD.
Next, we study an extension of the PCA setting in which Nature is allowed
to play dense instances, which are positive matrices with bounded largest
eigenvalue. Again we can show that EG is optimal and strictly better than GD in
this setting.
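The compression loss mentioned in this abstract can be illustrated with a small sketch. It assumes the chosen subspace is given by an orthonormal basis and that the loss is the squared distance between an instance and its projection; the function name `compression_loss` and the example vectors are hypothetical and are not taken from the paper.

```python
def compression_loss(x, basis):
    """Squared distance from x to the span of an orthonormal basis.

    Assumes the vectors in `basis` are orthonormal, so that
    ||x - P x||^2 = ||x||^2 - sum_i (u_i . x)^2.
    """
    norm_sq = sum(v * v for v in x)
    captured = sum(
        sum(u_j * x_j for u_j, x_j in zip(u, x)) ** 2 for u in basis
    )
    return norm_sq - captured


# Hypothetical example: a k = 1 subspace spanned by the first axis of R^3.
basis = [[1.0, 0.0, 0.0]]
loss = compression_loss([3.0, 4.0, 0.0], basis)
```

An instance lying entirely inside the subspace incurs zero loss, which is why the loss is non-negative and bounded by the squared norm of the instance.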
Byzantine Stochastic Gradient Descent
This paper studies the problem of distributed stochastic optimization in an
adversarial setting where, out of the $m$ machines which allegedly compute
stochastic gradients every iteration, an $\alpha$-fraction are Byzantine, and
can behave arbitrarily and adversarially. Our main result is a variant of
stochastic gradient descent (SGD) which finds $\varepsilon$-approximate
minimizers of convex functions in
$T = \tilde{O}\big(\frac{1}{\varepsilon^2 m} + \frac{\alpha^2}{\varepsilon^2}\big)$
iterations. In contrast, traditional
mini-batch SGD needs $T = O\big(\frac{1}{\varepsilon^2 m}\big)$ iterations,
but cannot tolerate Byzantine failures. Further, we provide a lower bound
showing that, up to logarithmic factors, our algorithm is
information-theoretically optimal both in terms of sampling complexity and time
complexity.
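To make the failure mode concrete, the sketch below shows why plain averaging of reported gradients breaks under a single Byzantine machine and how a robust aggregate avoids it. The coordinate-wise median used here is a simpler stand-in chosen for illustration; it is not the algorithm proposed in the paper, and the names `coordinate_median` and `robust_sgd_step` are hypothetical.

```python
from statistics import median


def coordinate_median(gradients):
    """Aggregate reported gradients by taking the median of each coordinate."""
    return [median(coord) for coord in zip(*gradients)]


def robust_sgd_step(w, gradients, lr):
    """One SGD step using the coordinate-wise median instead of the mean."""
    g = coordinate_median(gradients)
    return [w_i - lr * g_i for w_i, g_i in zip(w, g)]


# Three honest machines report similar gradients; one Byzantine machine
# reports an arbitrarily large vector.
honest = [[1.0, -2.0], [1.1, -1.9], [0.9, -2.1]]
byzantine = [[1e6, 1e6]]
agg = coordinate_median(honest + byzantine)
mean = [sum(c) / len(c) for c in zip(*(honest + byzantine))]
```

The mean is dominated by the adversarial report, while the median stays close to the honest gradients as long as honest machines form a majority in each coordinate.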
Functional Brain Imaging with Multi-Objective Multi-Modal Evolutionary Optimization
Functional brain imaging is a source of spatio-temporal data mining problems.
A new framework hybridizing multi-objective and multi-modal optimization is
proposed to formalize these data mining problems, and addressed through
Evolutionary Computation (EC). The merits of EC for spatio-temporal data mining
are demonstrated as the approach facilitates the modelling of the experts'
requirements, and flexibly accommodates their changing goals.
Statistical Mechanics of Linear and Nonlinear Time-Domain Ensemble Learning
Conventional ensemble learning combines students in the space domain. In this
paper, however, we combine students in the time domain and call it time-domain
ensemble learning. We analyze, compare, and discuss the generalization
performances regarding time-domain ensemble learning of both a linear model and
a nonlinear model. Analyzing in the framework of online learning using a
statistical mechanical method, we show the qualitatively different behaviors
between the two models. In a linear model, the dynamical behaviors of the
generalization error are monotonic. We analytically show that time-domain
ensemble learning is twice as effective as conventional ensemble learning.
Furthermore, the generalization error of a nonlinear model features
nonmonotonic dynamical behaviors when the learning rate is small. We
numerically show that the generalization performance can be improved remarkably
by using this phenomenon and the divergence of students in the time domain.
Comment: 11 pages, 7 figures
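The idea of combining one student at several points in time, rather than several students at one time, can be sketched as follows. This is a minimal illustration under assumed details: a noiseless linear teacher, a least-mean-squares update, and evenly spaced snapshots. None of these choices, nor the names `train_snapshots` and `time_domain_ensemble`, are taken from the paper.

```python
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def train_snapshots(teacher, steps, snapshot_every, lr, rng):
    """Train one linear student online and keep copies of it over time."""
    dim = len(teacher)
    student = [0.0] * dim
    snapshots = []
    for t in range(1, steps + 1):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        error = dot(teacher, x) - dot(student, x)
        # Least-mean-squares update toward the teacher's output.
        student = [w + lr * error * xi / dim for w, xi in zip(student, x)]
        if t % snapshot_every == 0:
            snapshots.append(list(student))
    return snapshots


def time_domain_ensemble(snapshots, x):
    """Average the outputs of the same student taken at different times."""
    return sum(dot(w, x) for w in snapshots) / len(snapshots)


rng = random.Random(0)
teacher = [1.0, -0.5, 0.25]
snapshots = train_snapshots(teacher, steps=300, snapshot_every=100, lr=0.5, rng=rng)
prediction = time_domain_ensemble(snapshots, [1.0, 1.0, 1.0])
```

For a linear student, averaging the snapshot outputs is the same as evaluating the averaged weight vector; that identity does not hold for a nonlinear student.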
Time series prediction via aggregation: an oracle bound including numerical cost
We address the problem of forecasting a time series meeting the Causal
Bernoulli Shift model, using a parametric set of predictors. The aggregation
technique provides a predictor with well established and quite satisfying
theoretical properties expressed by an oracle inequality for the prediction
risk. The numerical computation of the aggregated predictor usually relies on a
Markov chain Monte Carlo method whose convergence should be evaluated. In
particular, it is crucial to bound the number of simulations needed to achieve
a numerical precision of the same order as the prediction risk. In this
direction we present a fairly general result which can be seen as an oracle
inequality including the numerical cost of the predictor computation. The
numerical cost appears by letting the oracle inequality depend on the number of
simulations required in the Monte Carlo approximation. Some numerical
experiments are then carried out to support our findings.
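As a rough illustration of aggregation, the sketch below combines a finite set of predictors with exponential (Gibbs) weights computed from their past risks. It assumes a small finite family and a hand-picked temperature; the abstract's setting is a parametric family whose aggregate is approximated by Markov chain Monte Carlo, which this sketch does not attempt. The names `gibbs_weights` and `aggregate` and all numbers are hypothetical.

```python
import math


def gibbs_weights(risks, temperature):
    """Weights proportional to exp(-risk / temperature), normalized to sum to 1."""
    r_min = min(risks)  # subtracting the minimum avoids underflow
    raw = [math.exp(-(r - r_min) / temperature) for r in risks]
    total = sum(raw)
    return [v / total for v in raw]


def aggregate(predictions, weights):
    """Weighted average of the individual forecasts."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical empirical risks and next-step forecasts of three predictors.
risks = [0.10, 0.50, 2.00]
weights = gibbs_weights(risks, temperature=0.25)
forecast = aggregate([1.0, 1.4, 3.0], weights)
```

Predictors with lower past risk receive exponentially more weight, so the aggregate forecast stays close to the best predictors without committing to a single one.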
Gene Function Classification Using Bayesian Models with Hierarchy-Based Priors
We investigate the application of hierarchical classification schemes to the
annotation of gene function based on several characteristics of protein
sequences including phylogenic descriptors, sequence based attributes, and
predicted secondary structure. We discuss three Bayesian models and compare
their performance in terms of predictive accuracy. These models are the
ordinary multinomial logit (MNL) model, a hierarchical model based on a set of
nested MNL models, and an MNL model with a prior that introduces correlations
between the parameters for classes that are nearby in the hierarchy. We also
provide a new scheme for combining different sources of information. We use
these models to predict the functional class of Open Reading Frames (ORFs) from
the E. coli genome. The results from all three models show substantial
improvement over previous methods, which were based on the C5 algorithm. The
MNL model using a prior based on the hierarchy outperforms both the
non-hierarchical MNL model and the nested MNL model. In contrast to previous
attempts at combining these sources of information, our approach results in a
higher accuracy rate when compared to models that use each data source alone.
Together, these results show that gene function can be predicted with higher
accuracy than previously achieved, using Bayesian models that incorporate
suitable prior information.
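A small sketch may help show how parameters of nearby classes can be tied together in an MNL model. It assumes one particular construction, in which each class's coefficient vector is the sum of vectors attached to the branches on its path through the hierarchy, so classes sharing a branch share part of their coefficients. This construction, the toy hierarchy, and the names below are illustrative assumptions, not a description of the paper's exact prior.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_coefficients(paths, branch_params):
    """Sum the branch vectors along each class's path from root to leaf."""
    coeffs = []
    for path in paths:
        beta = [0.0] * len(branch_params[path[0]])
        for branch in path:
            beta = [b + p for b, p in zip(beta, branch_params[branch])]
        coeffs.append(beta)
    return coeffs


def mnl_probabilities(x, coeffs):
    """Multinomial logit class probabilities for feature vector x."""
    scores = [sum(b * v for b, v in zip(beta, x)) for beta in coeffs]
    return softmax(scores)


# Hypothetical hierarchy: classes 0 and 1 share branch "A"; class 2 sits on "B".
branch_params = {
    "A": [1.0, 0.0],
    "A0": [0.2, 0.1],
    "A1": [-0.2, 0.1],
    "B": [-1.0, 0.5],
}
paths = [["A", "A0"], ["A", "A1"], ["B"]]
coeffs = class_coefficients(paths, branch_params)
probs = mnl_probabilities([1.0, 2.0], coeffs)
```

Because classes 0 and 1 both include the vector for branch "A", placing independent priors on the branch vectors makes their class coefficients correlated, while class 2 remains independent of both.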
GluRδ2 Expression in the Mature Cerebellum of Hotfoot Mice Promotes Parallel Fiber Synaptogenesis and Axonal Competition
Glutamate receptor delta 2 (GluRdelta2) is selectively expressed in the cerebellum, exclusively in the spines of the Purkinje cells (PCs) that are in contact with parallel fibers (PFs). Although its structure is similar to that of ionotropic glutamate receptors, it has no channel function and its ligand is unknown. GluRdelta2-null mice, such as knockout and hotfoot, have profoundly altered cerebellar circuitry, which causes ataxia and impaired motor learning. Notably, GluRdelta2 in PC-PF synapses regulates their maturation and strengthening and induces long-term depression (LTD). In addition, GluRdelta2 participates in the highly territorial competition between the two excitatory inputs to the PC: the climbing fiber (CF), which innervates the proximal dendritic compartment, and the PF, which is connected to spiny distal branchlets. Recent studies have suggested that GluRdelta2 acts as an adhesion molecule in PF synaptogenesis. Here, we provide in vivo and in vitro evidence that supports this hypothesis. Through lentiviral rescue in hotfoot mice, we noted a recovery of PC-PF contacts in the distal dendritic domain. In the proximal domain, we observed the formation of new spines that were innervated by PFs and a reduction in contact with the CF; i.e., the pattern of innervation in the PC shifted to favor the PF input. Moreover, ectopic expression of GluRdelta2 in HEK293 cells that were cocultured with granule cells, or in cerebellar Golgi cells in the mature brain, induced the formation of new PF contacts. Collectively, our observations show that GluRdelta2 is an adhesion molecule that induces the formation of PF contacts independently of its cellular localization and promotes heterosynaptic competition in the PC proximal dendritic domain.