
344 research outputs found

Distributed Deep Learning for Question Answering

Author: Bottou L.
Chilimbi T.
Dean J.
Sutskever I.
Zhang S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 04/08/2016

This paper is an empirical study of distributed deep learning for two question answering subtasks: answer selection and question classification. It compares the SGD, MSGD, ADADELTA, ADAGRAD, ADAM/ADAMAX, RMSPROP, DOWNPOUR and EASGD/EAMSGD algorithms. Experimental results show that a distributed framework based on the message passing interface can accelerate convergence at a sublinear scale. This paper demonstrates the importance of distributed training. For example, with 48 workers, a 24x speedup is achievable for the answer selection task, and running time decreases from 138.2 hours to 5.81 hours, which significantly increases productivity.

Comment: This paper will appear in the Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM 2016), Indianapolis, US
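The speedup quoted in the abstract above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the 138.2-hour figure is the single-worker baseline, which the abstract implies but does not state outright, and the variable names are illustrative.

```python
# Figures quoted in the abstract; the baseline is ASSUMED to be one worker.
workers = 48
baseline_hours = 138.2
distributed_hours = 5.81

# Speedup is the ratio of the two running times.
speedup = baseline_hours / distributed_hours

# Parallel efficiency compares the speedup with ideal linear scaling.
efficiency = speedup / workers

print(f"speedup: {speedup:.1f}x")
print(f"parallel efficiency: {efficiency:.1%}")
```

Under that assumption the ratio comes to a little under 24x, which is about half of what linear scaling with 48 workers would give and is consistent with the "sublinear" description.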

arXiv.org e-Print Archive

Crossref

Collaborative Filtering via Group-Structured Dictionary Learning

Author: F. Ricci
G. Takács
K. Goldberg
L. Bottou
R. Jenatton
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Structured sparse coding and the related structured dictionary learning problems are novel research areas in machine learning. In this paper we present a new application of structured dictionary learning for collaborative filtering based recommender systems. Our extensive numerical experiments demonstrate that the presented technique outperforms its state-of-the-art competitors and has several advantages over approaches that do not put structured constraints on the dictionary elements.

Comment: A compressed version of the paper has been accepted for publication at the 10th International Conference on Latent Variable Analysis and Source Separation (LVA/ICA 2012)

arXiv.org e-Print Archive

CiteSeerX

Crossref

UCL Discovery

ELTE Digital Institutional Repository (EDIT)

Lazy training of radial basis neural networks

Author: C.G. Atkenson
D. Wettschereck
D.W. Aha
J. Park
J.E. Moody
J.M. Valls
J.M. Zaldívar
L. Bottou
L. Yingwei
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Proceeding of: 16th International Conference on Artificial Neural Networks, ICANN 2006. Athens, Greece, September 10-14, 2006

Training data are usually not evenly distributed in the input space, which makes non-local methods such as neural networks inaccurate in those cases. Local methods, on the other hand, face the problem of deciding which examples are best for each test pattern. In this work, we present a trade-off between local and non-local methods: a Radial Basis Neural Network is used as the learning algorithm, and a selection of the training patterns is made for each query. Moreover, the RBNN initialization algorithm has been modified in a deterministic way to eliminate any influence of the initial conditions. Finally, the new method has been validated in two time series domains, an artificial one and a real-world one.

This article has been financed by the Spanish-funded MEC research project OPLINK::UC3M, Ref: TIN2005-08818-C04-0

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

e-Archivo (Univ. Carlos III de Madrid e-Archivo)

Consistency of probabilistic classifier trees

Author: A Beygelzimer
A Kumar
F Hutter
J Duchi
J Fox
L Bottou
MD Reid
PL Bartlett
T Cover
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Crossref

Ghent University Academic Bibliography

How Algorithmic Confounding in Recommendation Systems Increases Homogeneity and Decreases Utility

Author: Anderson C.
Bennett J.
Bottou L.
Chander A
Dan-Dan Z.
Jolliffe I.
Lee D. D.
Salakhutdinov R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/09/2018

Recommendation systems are ubiquitous and impact many domains; they have the potential to influence product consumption, individuals' perceptions of the world, and life-altering decisions. These systems are often evaluated or trained with data from users already exposed to algorithmic recommendations; this creates a pernicious feedback loop. Using simulations, we demonstrate how using data confounded in this way homogenizes user behavior without increasing utility.

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Dynamic sampling schemes for optimal noise learning under multiple nonsmooth constraints

Author: B Polyak
E Haber
H Robbins
JC De Los Reyes
L Ambrosio
L Bottou
MD Mantle
RH Byrd
Y Nesterov
Publication date: 23/06/2014

We consider the bilevel optimisation approach proposed by De Los Reyes and Schönlieb (2013) for learning the optimal parameters in a Total Variation (TV) denoising model featuring multiple noise distributions. In applications, the use of databases (dictionaries) allows an accurate estimation of the parameters, but results in high computational costs due to the size of the databases and to the nonsmooth nature of the PDE constraints. To overcome this computational barrier we propose an optimisation algorithm that, by sampling dynamically from the set of constraints and using a quasi-Newton method, solves the problem accurately and efficiently.

arXiv.org e-Print Archive

Crossref

A Cost-based Optimizer for Gradient Descent Optimization

Author: Abadi M.
Agrawal D.
Ben-David S.
Bottou L.
Bousquet O.
Johnson R.
Kraska T.
Liu J.
Recht B.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/03/2017

As the use of machine learning (ML) permeates diverse application domains, there is an urgent need to support a declarative framework for ML. Ideally, a user will specify an ML task in a high-level and easy-to-use language and the framework will invoke the appropriate algorithms and system configurations to execute it. An important observation towards designing such a framework is that many ML tasks can be expressed as mathematical optimization problems, which take a specific form. Furthermore, these optimization problems can be efficiently solved using variations of the gradient descent (GD) algorithm. Thus, to decouple a user specification of an ML task from its execution, a key component is a GD optimizer. We propose a cost-based GD optimizer that selects the best GD plan for a given ML task. To build our optimizer, we introduce a set of abstract operators for expressing GD algorithms and propose a novel approach to estimate the number of iterations a GD algorithm requires to converge. Extensive experiments on real and synthetic datasets show that our optimizer not only chooses the best GD plan but also allows for optimizations that achieve orders of magnitude performance speed-up.

Comment: Accepted at SIGMOD 201

arXiv.org e-Print Archive

Crossref

A Neural Networks Committee for the Contextual Bandit Problem

Author: D.E. Rumelhart
E. Kaufmann
G. Tesauro
K. Hornik
L. Bottou
L. Kocsis
P. Auer
R. Feraud
S.M. Kakade
T.L. Lai
W. Thompson
Publication date: 01/01/2014

This paper presents a new contextual bandit algorithm, NeuralBandit, which does not require a stationarity assumption on contexts and rewards. Several neural networks are trained to model the value of rewards given the context. Two variants, based on a multi-expert approach, are proposed to choose the parameters of the multi-layer perceptrons online. The proposed algorithms are successfully tested on a large dataset with and without stationarity of rewards.

Comment: 21st International Conference on Neural Information Processin

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

Take a Ramble into Solution Spaces for Classification Problems in Neural Networks

Author: C Baldassi
J Li
L Bottou
Y Chen
Y Nesterov
Y Tarabalka
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Crossref

HAL Descartes

Institutional Research Information System University of Turin

Sparse Randomized Kaczmarz for Support Recovery of Jointly Sparse Corrupted Multiple Measurement Vectors

Author: C Hamaker
C Studer
EJ Candès
F Larusson
G Herman
G Tang
JN Laska
L Bottou
L Fodor
R Gordon
R Heckel
S Kaczmarz
S Li
T Strohmer
Y Li
Z Yang
Publication date: 14/06/2018

While single measurement vector (SMV) models have been widely studied in signal processing, there is a surging interest in addressing the multiple measurement vectors (MMV) problem. In the MMV setting, more than one measurement vector is available and the multiple signals to be recovered share some commonalities such as a common support. Applications in which MMV is a naturally occurring phenomenon include online streaming, medical imaging, and video recovery. This work presents a stochastic iterative algorithm for the support recovery of jointly sparse corrupted MMV. We present a variant of the Sparse Randomized Kaczmarz algorithm for corrupted MMV and compare our proposed method with an existing Kaczmarz-type algorithm for MMV problems. We also showcase the usefulness of our approach in the online (streaming) setting and provide empirical evidence that suggests the robustness of the proposed method to the distribution of the corruption and the number of corruptions occurring.

Comment: 13 pages, 6 figure
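For readers unfamiliar with the Kaczmarz family mentioned in the abstract above, the following is a minimal sketch of the basic randomized Kaczmarz iteration for a consistent linear system Ax = b. It is not the paper's sparse, corruption-robust MMV variant; the function name, the default iteration count, and the small test system are assumptions made for illustration.

```python
import random


def randomized_kaczmarz(A, b, iterations=500, seed=0):
    """Basic randomized Kaczmarz for a consistent system A x = b.

    Illustrative sketch only: rows are sampled with probability
    proportional to their squared Euclidean norm, and the iterate is
    projected onto the hyperplane defined by the sampled row.
    """
    rng = random.Random(seed)
    row_norms_sq = [sum(a * a for a in row) for row in A]
    x = [0.0] * len(A[0])
    for _ in range(iterations):
        # Pick one row index, weighted by squared row norm.
        i = rng.choices(range(len(A)), weights=row_norms_sq, k=1)[0]
        # Residual of the chosen equation at the current iterate.
        residual = b[i] - sum(a * xj for a, xj in zip(A[i], x))
        # Project x onto the hyperplane {z : A[i] . z = b[i]}.
        step = residual / row_norms_sq[i]
        x = [xj + step * a for xj, a in zip(x, A[i])]
    return x


# Hypothetical 3x2 consistent system whose exact solution is (2, -1).
A_example = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b_example = [2.0, -1.0, 1.0]
x_example = randomized_kaczmarz(A_example, b_example)
print(x_example)
```

Each step touches a single row of the system, which is why methods of this type suit the streaming setting the abstract describes. Variants that handle sparsity or corrupted measurements change how rows and entries are selected or weighted, and that part is not shown here.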

arXiv.org e-Print Archive

Crossref