
    On-line PCA with Optimal Regrets

    We carefully investigate the on-line version of PCA, where in each trial a learning algorithm plays a k-dimensional subspace and suffers the compression loss of the next instance when it is projected onto the chosen subspace. In this setting, we analyze two popular on-line algorithms, Gradient Descent (GD) and Exponentiated Gradient (EG). We show that both algorithms are essentially optimal in the worst case. This comes as a surprise, since EG is known to perform sub-optimally when the instances are sparse. This different behavior of EG for PCA is mainly related to the non-negativity of the loss in this case, which makes the PCA setting qualitatively different from other settings studied in the literature. Furthermore, we show that when considering regret bounds as a function of a loss budget, EG remains optimal and strictly outperforms GD. Next, we study an extension of the PCA setting in which Nature is allowed to play dense instances, that is, positive matrices with bounded largest eigenvalue. Again, we show that EG is optimal and strictly better than GD in this setting.
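    As a concrete illustration of the two update rules, the following sketch maintains a capped density-matrix parameter W (eigenvalues in [0, 1], trace k) whose expected compression loss on an instance x is x^T(I - W)x. It is a minimal sketch, not the paper's exact algorithms: the `cap_eigenvalues` projection is a rough iterative stand-in for the true projection onto the feasible set, and the step sizes are arbitrary.

```python
import numpy as np

def cap_eigenvalues(W, k):
    """Approximately project symmetric W onto {0 <= W <= I, tr W = k}
    by repeatedly shifting and clipping its eigenvalues (a rough
    stand-in for the exact projection, not the paper's routine)."""
    vals, vecs = np.linalg.eigh(W)
    for _ in range(100):
        vals = np.clip(vals + (k - vals.sum()) / len(vals), 0.0, 1.0)
        if abs(vals.sum() - k) < 1e-9:
            break
    return (vecs * vals) @ vecs.T

def online_pca_gd(xs, k, eta=0.05):
    """GD flavor: additive update with the loss gradient.  The expected
    compression loss of W on instance x is x^T (I - W) x."""
    n = xs.shape[1]
    W = np.eye(n) * (k / n)                      # uniform feasible start
    loss = 0.0
    for x in xs:
        loss += x @ (np.eye(n) - W) @ x
        W = cap_eigenvalues(W + eta * np.outer(x, x), k)
    return W, loss

def online_pca_eg(xs, k, eta=0.05):
    """EG flavor: multiplicative (matrix-exponentiated) update on the
    spectrum of W, followed by the same capping step."""
    n = xs.shape[1]
    W = np.eye(n) * (k / n)
    loss = 0.0
    for x in xs:
        loss += x @ (np.eye(n) - W) @ x
        vals, vecs = np.linalg.eigh(W)
        log_W = (vecs * np.log(np.maximum(vals, 1e-12))) @ vecs.T
        vals2, vecs2 = np.linalg.eigh(log_W + eta * np.outer(x, x))
        W = cap_eigenvalues((vecs2 * np.exp(vals2)) @ vecs2.T, k)
    return W, loss
```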

    Byzantine Stochastic Gradient Descent

    This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the $m$ machines which allegedly compute stochastic gradients at every iteration, an $\alpha$-fraction are Byzantine and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds $\varepsilon$-approximate minimizers of convex functions in $T = \tilde{O}\big( \frac{1}{\varepsilon^2 m} + \frac{\alpha^2}{\varepsilon^2} \big)$ iterations. In contrast, traditional mini-batch SGD needs $T = O\big( \frac{1}{\varepsilon^2 m} \big)$ iterations but cannot tolerate Byzantine failures. Further, we provide a lower bound showing that, up to logarithmic factors, our algorithm is information-theoretically optimal both in terms of sample complexity and time complexity.
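    The sketch below only illustrates the setting, not the paper's algorithm: it runs distributed SGD where each machine reports a stochastic gradient, a subset of machines behave adversarially, and the server aggregates with a coordinate-wise median as a simple stand-in for the paper's more refined aggregation rule.

```python
import numpy as np

def byzantine_robust_sgd(grad_fns, x0, steps, lr, byzantine_ids=frozenset()):
    """Distributed SGD sketch: every machine reports a stochastic
    gradient; machines in byzantine_ids report adversarial junk; the
    server aggregates with a coordinate-wise median (a simple stand-in
    robust aggregator, not the paper's rule)."""
    x = np.asarray(x0, dtype=float)
    adv = np.random.default_rng(0)
    for _ in range(steps):
        reports = [adv.normal(scale=100.0, size=x.shape) if i in byzantine_ids
                   else g(x) for i, g in enumerate(grad_fns)]
        x = x - lr * np.median(np.stack(reports), axis=0)
    return x

# Toy usage: minimize f(x) = ||x||^2 / 2 with 10 machines, 2 Byzantine.
def make_grad(seed):
    local = np.random.default_rng(seed)
    return lambda x: x + local.normal(scale=0.1, size=x.shape)

x_final = byzantine_robust_sgd([make_grad(s) for s in range(10)],
                               x0=np.ones(5), steps=500, lr=0.05,
                               byzantine_ids={0, 1})
print(np.linalg.norm(x_final))   # close to 0 despite the Byzantine machines
```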

    Functional Brain Imaging with Multi-Objective Multi-Modal Evolutionary Optimization

    Functional brain imaging is a source of spatio-temporal data mining problems. A new framework hybridizing multi-objective and multi-modal optimization is proposed to formalize these data mining problems, which are then addressed through Evolutionary Computation (EC). The merits of EC for spatio-temporal data mining are demonstrated, as the approach facilitates the modelling of the experts' requirements and flexibly accommodates their changing goals.
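    For readers unfamiliar with the multi-objective side of such frameworks, the snippet below shows the basic Pareto-dominance filter on which multi-objective evolutionary methods are built; it is a generic building block, not the paper's hybrid framework, and the function names are ours.

```python
import numpy as np

def dominates(a, b):
    """Pareto dominance for minimization: a dominates b if it is no
    worse on every objective and strictly better on at least one."""
    return np.all(a <= b) and np.any(a < b)

def pareto_front(population, objectives):
    """Return the non-dominated individuals under a vector-valued
    objective function -- the selection primitive that multi-objective
    evolutionary algorithms iterate on."""
    scores = [np.asarray(objectives(ind)) for ind in population]
    return [population[i] for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]
```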

    Statistical Mechanics of Linear and Nonlinear Time-Domain Ensemble Learning

    Conventional ensemble learning combines students in the space domain. In this paper, we instead combine students in the time domain and call this time-domain ensemble learning. We analyze, compare, and discuss the generalization performance of time-domain ensemble learning for both a linear model and a nonlinear model. Analyzing them within the framework of on-line learning using a statistical-mechanical method, we show qualitatively different behaviors between the two models. In the linear model, the dynamical behavior of the generalization error is monotonic. We analytically show that time-domain ensemble learning is twice as effective as conventional ensemble learning. Furthermore, the generalization error of the nonlinear model exhibits nonmonotonic dynamical behavior when the learning rate is small. We numerically show that the generalization performance can be improved remarkably by exploiting this phenomenon and the divergence of students in the time domain.
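    The flavor of time-domain ensembling can be reproduced with a toy linear student trained on-line on noisy teacher outputs: averaging the student's weight vectors collected at different times reduces the fluctuation term of the generalization error. This is a minimal sketch with arbitrary parameters, not the paper's statistical-mechanical setup.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, eta, noise = 100, 5000, 0.5, 0.5

B = rng.normal(size=N) / np.sqrt(N)       # teacher weight vector
J = np.zeros(N)                           # on-line linear student
snapshots = []

for t in range(T):
    x = rng.normal(size=N)
    y = B @ x + noise * rng.normal()      # noisy teacher output
    J += (eta / N) * (y - J @ x) * x      # on-line LMS update
    if t >= T // 2:                       # after a burn-in phase,
        snapshots.append(J.copy())        # collect students over time

def gen_error(w):
    """E_x[((w - B) . x)^2] / 2 for x ~ N(0, I)."""
    return 0.5 * np.sum((w - B) ** 2)

print("single final student:", gen_error(J))
print("time-domain ensemble:", gen_error(np.mean(snapshots, axis=0)))
```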

    Time series prediction via aggregation: an oracle bound including numerical cost

    We address the problem of forecasting a time series satisfying a Causal Bernoulli Shift model, using a parametric set of predictors. The aggregation technique provides a predictor with well-established and quite satisfying theoretical properties, expressed by an oracle inequality for the prediction risk. The numerical computation of the aggregated predictor usually relies on a Markov chain Monte Carlo method whose convergence must be evaluated. In particular, it is crucial to bound the number of simulations needed to achieve a numerical precision of the same order as the prediction risk. In this direction, we present a fairly general result which can be seen as an oracle inequality including the numerical cost of computing the predictor: the numerical cost appears by letting the oracle inequality depend on the number of simulations required in the Monte Carlo approximation. Some numerical experiments are then carried out to support our findings.
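    For intuition, the sketch below implements exponentially weighted (Gibbs) aggregation over a finite set of predictors; the paper's setting has a continuous parameter set, where these weights are approximated by MCMC and the number of simulations enters the bound. Predictor and parameter choices here are illustrative.

```python
import numpy as np

def aggregate_forecast(predictors, series, lam=1.0):
    """Exponentially weighted (Gibbs) aggregation over a finite set of
    predictors; weights are proportional to exp(-lam * cumulative
    squared loss).  The paper's continuous-parameter version
    approximates these weights by MCMC."""
    cum_loss = np.zeros(len(predictors))
    forecasts = []
    for t in range(1, len(series)):
        preds = np.array([p(series[:t]) for p in predictors])
        w = np.exp(-lam * (cum_loss - cum_loss.min()))   # stabilized weights
        forecasts.append(w @ preds / w.sum())            # aggregated forecast
        cum_loss += (preds - series[t]) ** 2             # update losses
    return np.array(forecasts)

# Toy usage: aggregate AR(1) predictors with different coefficients.
rng = np.random.default_rng(0)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.7 * x[t - 1] + 0.1 * rng.normal()

predictors = [lambda past, a=a: a * past[-1] for a in np.linspace(-0.9, 0.9, 19)]
f = aggregate_forecast(predictors, x)
print("aggregated MSE:", np.mean((f - x[1:]) ** 2))
```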

    Gene Function Classification Using Bayesian Models with Hierarchy-Based Priors

    We investigate the application of hierarchical classification schemes to the annotation of gene function based on several characteristics of protein sequences, including phylogenic descriptors, sequence-based attributes, and predicted secondary structure. We discuss three Bayesian models and compare their performance in terms of predictive accuracy. These models are the ordinary multinomial logit (MNL) model, a hierarchical model based on a set of nested MNL models, and an MNL model with a prior that introduces correlations between the parameters of classes that are nearby in the hierarchy. We also provide a new scheme for combining different sources of information. We use these models to predict the functional class of Open Reading Frames (ORFs) from the E. coli genome. The results from all three models show substantial improvement over previous methods, which were based on the C5 algorithm. The MNL model using a prior based on the hierarchy outperforms both the non-hierarchical MNL model and the nested MNL model. In contrast to previous attempts at combining these sources of information, our approach results in a higher accuracy rate than models that use each data source alone. Together, these results show that gene function can be predicted with higher accuracy than previously achieved, using Bayesian models that incorporate suitable prior information.
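    To make the hierarchy-based prior concrete, the sketch below fits a MAP estimate of an MNL model whose class coefficients are sums of Gaussian node effects along each class's root-to-leaf path in the hierarchy, so that nearby classes share parameters. It is a simplified stand-in (gradient-based MAP rather than the paper's full Bayesian inference), and all names are ours.

```python
import numpy as np

def map_hierarchical_mnl(X, y, paths, n_nodes, sigma2=1.0, lr=0.1, steps=500):
    """MAP fit of an MNL model whose coefficient vector for class c is
    the sum of Gaussian node effects along c's root-to-leaf path, so
    classes sharing ancestors get correlated parameters.  paths[c] is
    the list of node indices on that path."""
    n, d = X.shape
    phi = np.zeros((n_nodes, d))                 # one effect per tree node
    for _ in range(steps):
        beta = np.array([phi[p].sum(axis=0) for p in paths])   # class coefs
        logits = X @ beta.T
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        resid = -probs
        resid[np.arange(n), y] += 1.0            # one-hot(y) - probs
        grad = np.zeros_like(phi)
        for c, p in enumerate(paths):            # push class gradients down
            for node in p:                       # to the node effects
                grad[node] += resid[:, c] @ X
        phi += lr * (grad / n - phi / sigma2)    # likelihood + prior shrinkage
    return phi
```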

    GluRδ2 Expression in the Mature Cerebellum of Hotfoot Mice Promotes Parallel Fiber Synaptogenesis and Axonal Competition

    Glutamate receptor delta 2 (GluRdelta2) is selectively expressed in the cerebellum, exclusively in the spines of the Purkinje cells (PCs) that are in contact with parallel fibers (PFs). Although its structure is similar to that of ionotropic glutamate receptors, it has no channel function and its ligand is unknown. GluRdelta2-null mice, such as knockout and hotfoot mice, have profoundly altered cerebellar circuitry, which causes ataxia and impaired motor learning. Notably, GluRdelta2 in PC-PF synapses regulates their maturation and strengthening and induces long-term depression (LTD). In addition, GluRdelta2 participates in the highly territorial competition between the two excitatory inputs to the PC: the climbing fiber (CF), which innervates the proximal dendritic compartment, and the PF, which is connected to spiny distal branchlets. Recently, studies have suggested that GluRdelta2 acts as an adhesion molecule in PF synaptogenesis. Here, we provide in vivo and in vitro evidence that supports this hypothesis. Through lentiviral rescue in hotfoot mice, we noted a recovery of PC-PF contacts in the distal dendritic domain. In the proximal domain, we observed the formation of new spines that were innervated by PFs and a reduction in contact with the CF; i.e., the pattern of innervation in the PC shifted to favor the PF input. Moreover, ectopic expression of GluRdelta2 in HEK293 cells that were cocultured with granule cells, or in cerebellar Golgi cells in the mature brain, induced the formation of new PF contacts. Collectively, our observations show that GluRdelta2 is an adhesion molecule that induces the formation of PF contacts independently of its cellular localization and promotes heterosynaptic competition in the PC proximal dendritic domain.