344 research outputs found
Distributed Deep Learning for Question Answering
This paper is an empirical study of the distributed deep learning for
question answering subtasks: answer selection and question classification.
Comparison studies of SGD, MSGD, ADADELTA, ADAGRAD, ADAM/ADAMAX, RMSPROP,
DOWNPOUR and EASGD/EAMSGD algorithms have been presented. Experimental results
show that the distributed framework based on the message passing interface can
accelerate the convergence speed at a sublinear scale. This paper demonstrates
the importance of distributed training. For example, with 48 workers, a 24x
speedup is achievable for the answer selection task and running time is
decreased from 138.2 hours to 5.81 hours, which will increase the productivity
significantly.Comment: This paper will appear in the Proceeding of The 25th ACM
International Conference on Information and Knowledge Management (CIKM 2016),
Indianapolis, US
Collaborative Filtering via Group-Structured Dictionary Learning
Structured sparse coding and the related structured dictionary learning
problems are novel research areas in machine learning. In this paper we present
a new application of structured dictionary learning for collaborative filtering
based recommender systems. Our extensive numerical experiments demonstrate that
the presented technique outperforms its state-of-the-art competitors and has
several advantages over approaches that do not put structured constraints on
the dictionary elements.Comment: A compressed version of the paper has been accepted for publication
at the 10th International Conference on Latent Variable Analysis and Source
Separation (LVA/ICA 2012
Lazy training of radial basis neural networks
Proceeding of: 16th International Conference on Artificial Neural Networks, ICANN 2006. Athens, Greece, September 10-14, 2006Usually, training data are not evenly distributed in the input space. This makes non-local methods, like Neural Networks, not very accurate in those cases. On the other hand, local methods have the problem of how to know which are the best examples for each test pattern. In this work, we present a way of performing a trade off between local and non-local methods. On one hand a Radial Basis Neural Network is used like learning algorithm, on the other hand a selection of the training patterns is used for each query. Moreover, the RBNN initialization algorithm has been modified in a deterministic way to eliminate any initial condition influence. Finally, the new method has been validated in two time series domains, an artificial and a real world one.This article has been financed by the Spanish founded research MEC project OPLINK::UC3M, Ref: TIN2005-08818-C04-0
How Algorithmic Confounding in Recommendation Systems Increases Homogeneity and Decreases Utility
Recommendation systems are ubiquitous and impact many domains; they have the
potential to influence product consumption, individuals' perceptions of the
world, and life-altering decisions. These systems are often evaluated or
trained with data from users already exposed to algorithmic recommendations;
this creates a pernicious feedback loop. Using simulations, we demonstrate how
using data confounded in this way homogenizes user behavior without increasing
utility
Dynamic sampling schemes for optimal noise learning under multiple nonsmooth constraints
We consider the bilevel optimisation approach proposed by De Los Reyes,
Sch\"onlieb (2013) for learning the optimal parameters in a Total Variation
(TV) denoising model featuring for multiple noise distributions. In
applications, the use of databases (dictionaries) allows an accurate estimation
of the parameters, but reflects in high computational costs due to the size of
the databases and to the nonsmooth nature of the PDE constraints. To overcome
this computational barrier we propose an optimisation algorithm that by
sampling dynamically from the set of constraints and using a quasi-Newton
method, solves the problem accurately and in an efficient way
A Cost-based Optimizer for Gradient Descent Optimization
As the use of machine learning (ML) permeates into diverse application
domains, there is an urgent need to support a declarative framework for ML.
Ideally, a user will specify an ML task in a high-level and easy-to-use
language and the framework will invoke the appropriate algorithms and system
configurations to execute it. An important observation towards designing such a
framework is that many ML tasks can be expressed as mathematical optimization
problems, which take a specific form. Furthermore, these optimization problems
can be efficiently solved using variations of the gradient descent (GD)
algorithm. Thus, to decouple a user specification of an ML task from its
execution, a key component is a GD optimizer. We propose a cost-based GD
optimizer that selects the best GD plan for a given ML task. To build our
optimizer, we introduce a set of abstract operators for expressing GD
algorithms and propose a novel approach to estimate the number of iterations a
GD algorithm requires to converge. Extensive experiments on real and synthetic
datasets show that our optimizer not only chooses the best GD plan but also
allows for optimizations that achieve orders of magnitude performance speed-up.Comment: Accepted at SIGMOD 201
A Neural Networks Committee for the Contextual Bandit Problem
This paper presents a new contextual bandit algorithm, NeuralBandit, which
does not need hypothesis on stationarity of contexts and rewards. Several
neural networks are trained to modelize the value of rewards knowing the
context. Two variants, based on multi-experts approach, are proposed to choose
online the parameters of multi-layer perceptrons. The proposed algorithms are
successfully tested on a large dataset with and without stationarity of
rewards.Comment: 21st International Conference on Neural Information Processin
Take a Ramble into Solution Spaces for Classification Problems in Neural Networks
International audienc
Sparse Randomized Kaczmarz for Support Recovery of Jointly Sparse Corrupted Multiple Measurement Vectors
While single measurement vector (SMV) models have been widely studied in
signal processing, there is a surging interest in addressing the multiple
measurement vectors (MMV) problem. In the MMV setting, more than one
measurement vector is available and the multiple signals to be recovered share
some commonalities such as a common support. Applications in which MMV is a
naturally occurring phenomenon include online streaming, medical imaging, and
video recovery. This work presents a stochastic iterative algorithm for the
support recovery of jointly sparse corrupted MMV. We present a variant of the
Sparse Randomized Kaczmarz algorithm for corrupted MMV and compare our proposed
method with an existing Kaczmarz type algorithm for MMV problems. We also
showcase the usefulness of our approach in the online (streaming) setting and
provide empirical evidence that suggests the robustness of the proposed method
to the distribution of the corruption and the number of corruptions occurring.Comment: 13 pages, 6 figure
- …