
    Label-Dependencies Aware Recurrent Neural Networks

    In the last few years, Recurrent Neural Networks (RNNs) have proved effective on several NLP tasks. Despite such great success, their ability to model \emph{sequence labeling} is still limited. This has led research toward solutions where RNNs are combined with models that have already proved effective in this domain, such as CRFs. In this work we propose a far simpler but very effective solution: an evolution of the simple Jordan RNN, where labels are re-injected as input into the network and converted into embeddings, in the same way as words. We compare this RNN variant to the other RNN models (Elman and Jordan RNNs, LSTM and GRU) on two well-known Spoken Language Understanding (SLU) tasks. Thanks to label embeddings and their combination at the hidden layer, the proposed variant, which uses more parameters than Elman and Jordan RNNs but far fewer than LSTM and GRU, is not only more effective than the other RNNs but also outperforms sophisticated CRF models. Comment: 22 pages, 3 figures. Accepted at the CICLing 2017 conference. Best Verifiability, Reproducibility, and Working Description award
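
    As a rough illustration of the mechanism described above (not the authors' code), the following sketch shows a Jordan-style recurrent step in which the previously predicted label is embedded like a word and combined with the word embedding at the hidden layer; all names, dimensions, and the PyTorch framing are illustrative assumptions.

        import torch
        import torch.nn as nn

        class LabelEmbeddingJordanRNN(nn.Module):
            """Sketch: the previous label is re-injected as an input embedding."""
            def __init__(self, vocab_size, n_labels, word_dim=100, label_dim=50, hidden_dim=128):
                super().__init__()
                self.word_emb = nn.Embedding(vocab_size, word_dim)
                self.label_emb = nn.Embedding(n_labels, label_dim)  # labels embedded like words
                self.hidden = nn.Linear(word_dim + label_dim, hidden_dim)
                self.out = nn.Linear(hidden_dim, n_labels)

            def forward(self, words):  # words: (seq_len,) tensor of word indices
                prev_label = torch.zeros(1, dtype=torch.long)  # assumed "null" start label
                logits_seq = []
                for w in words:
                    x = torch.cat([self.word_emb(w.view(1)), self.label_emb(prev_label)], dim=-1)
                    h = torch.tanh(self.hidden(x))  # word and label information combined at the hidden layer
                    logits = self.out(h)
                    prev_label = logits.argmax(dim=-1)  # greedy; training could feed the gold label instead
                    logits_seq.append(logits)
                return torch.cat(logits_seq, dim=0)  # (seq_len, n_labels)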

    Audio Event Detection using Weakly Labeled Data

    Acoustic event detection is essential for content analysis and description of multimedia recordings. The majority of the current literature on the topic learns detectors through fully-supervised techniques employing strongly labeled data. However, the labels available for the majority of multimedia data are generally weak and do not provide sufficient detail for such methods to be employed. In this paper we propose a framework for learning acoustic event detectors using only weakly labeled data. We first show that audio event detection using weak labels can be formulated as a Multiple Instance Learning (MIL) problem. We then suggest two frameworks for solving multiple instance learning, one based on support vector machines and the other on neural networks. The proposed methods can help remove the time-consuming and expensive process of manually annotating data to facilitate fully supervised learning. Moreover, they can not only detect events in a recording but also provide the temporal locations of events within it. This yields a complete description of the recording and is notable because temporal information is never available in the first place in weakly labeled data. Comment: ACM Multimedia 201
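
    The MIL formulation above can be made concrete with a small sketch (an illustration of the general technique, not the paper's implementation): a recording is treated as a bag of segments, instance scores are max-pooled into a recording-level score, training uses only the weak label, and the segment-level scores then give temporal locations. All names and sizes are assumptions.

        import torch
        import torch.nn as nn

        class MILEventDetector(nn.Module):
            def __init__(self, feat_dim=64, n_events=10):
                super().__init__()
                self.instance_net = nn.Sequential(
                    nn.Linear(feat_dim, 128), nn.ReLU(),
                    nn.Linear(128, n_events), nn.Sigmoid())

            def forward(self, segments):  # segments: (n_segments, feat_dim)
                instance_probs = self.instance_net(segments)  # per-segment event probabilities
                bag_probs, _ = instance_probs.max(dim=0)      # MIL max pooling -> recording level
                return bag_probs, instance_probs

        model = MILEventDetector()
        segments = torch.randn(20, 64)                       # illustrative segment features
        weak_labels = torch.zeros(10); weak_labels[3] = 1.0  # recording-level label only
        bag_probs, instance_probs = model(segments)
        loss = nn.functional.binary_cross_entropy(bag_probs, weak_labels)
        # at test time, instance_probs[:, 3] localizes the event within the recording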

    A stochastic approximation algorithm with multiplicative step size modification

    An algorithm for finding a zero of an unknown function $\varphi: \mathbb{R} \to \mathbb{R}$ is considered: $x_t = x_{t-1} - \gamma_{t-1} y_t$, $t = 1, 2, \ldots$, where $y_t = \varphi(x_{t-1}) + \xi_t$ is the value of $\varphi$ measured at $x_{t-1}$ and $\xi_t$ is the measurement error. The step sizes $\gamma_t > 0$ are modified in the course of the algorithm according to the rule: $\gamma_t = \min\{u\,\gamma_{t-1},\, \bar\gamma\}$ if $y_{t-1} y_t > 0$, and $\gamma_t = d\,\gamma_{t-1}$ otherwise, where $0 < d < 1 < u$. That is, at each iteration $\gamma_t$ is multiplied either by $u$ or by $d$, provided that the resulting value does not exceed the predetermined value $\bar\gamma$. The function $\varphi$ may have one or several zeros; the random values $\xi_t$ are independent and identically distributed, with zero mean and finite variance. Under some additional assumptions on $\varphi$, $\xi_t$, and $\bar\gamma$, the conditions on $u$ and $d$ guaranteeing a.s. convergence of the sequence $\{x_t\}$, as well as a.s. divergence, are determined. In particular, if $\mathbb{P}(\xi_1 > 0) = \mathbb{P}(\xi_1 < 0) = 1/2$ and $\mathbb{P}(\xi_1 = x) = 0$ for any $x \in \mathbb{R}$, one has convergence for $ud \le 1$. Due to the multiplicative updating rule for $\gamma_t$, the sequence $\{x_t\}$ converges rapidly, like a geometric progression (if convergence takes place), but the limit value may not coincide with a zero of $\varphi$; instead, it approximates one. By adjusting the parameters $u$ and $d$, one can reach arbitrarily high precision of the approximation; higher precision is obtained at the expense of a lower convergence rate.
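
    A direct transcription of the iteration (with illustrative parameter values; the convergence conditions above still apply) might look like:

        import numpy as np

        def multiplicative_step_search(phi, x0, gamma0, u=1.2, d=0.8, gamma_max=1.0,
                                       noise_std=0.1, n_iter=1000, rng=None):
            """x_t = x_{t-1} - gamma_{t-1} * y_t with multiplicative step adaptation."""
            rng = rng or np.random.default_rng()
            x, gamma = x0, gamma0
            y_prev = None
            for _ in range(n_iter):
                y = phi(x) + noise_std * rng.standard_normal()  # noisy measurement of phi(x)
                x = x - gamma * y
                if y_prev is not None:
                    if y_prev * y > 0:
                        gamma = min(u * gamma, gamma_max)  # same sign: grow the step, capped
                    else:
                        gamma = d * gamma                  # sign change: shrink the step
                y_prev = y
            return x

        # Example: approximate the zero of phi(x) = x - 2; note u*d = 0.96 <= 1
        x_hat = multiplicative_step_search(lambda x: x - 2.0, x0=0.0, gamma0=0.5)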

    Geometric deep learning

    The goal of these course notes is to describe the main mathematical ideas behind geometric deep learning and to provide implementation details for several applications in shape analysis and synthesis, computer vision, and computer graphics. The text in the course materials is primarily based on previously published work. With these notes we gather and provide a clear picture of the key concepts and techniques that fall under the umbrella of geometric deep learning, and illustrate the applications they enable. We also aim to provide practical implementation details for the methods presented in these works, as well as to suggest further reading and extensions of these ideas.

    The time dimension of neural network models

    This review attempts to provide an insightful perspective on the role of time in neural network models and on the use of neural networks for problems involving time. The most commonly used neural network models are defined and explained, touching on important technical issues while avoiding excessive detail. The relationship between recurrent and feedforward networks is emphasised, along with the distinctions in their practical and theoretical abilities. Some practical examples are discussed to illustrate the major issues concerning the application of neural networks to data with various types of temporal structure, and finally some highlights of current research on the more difficult types of problems are presented.
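
    The central distinction the review draws can be seen in miniature below: a feedforward network sees time only through a fixed window of past samples, while a recurrent network carries an internal state across steps. This is a generic sketch with illustrative sizes, not tied to any particular model in the review.

        import torch
        import torch.nn as nn

        # 1) Feedforward with a tapped delay line: time enters only via a fixed window.
        ff = nn.Sequential(nn.Linear(5, 16), nn.Tanh(), nn.Linear(16, 1))
        window = torch.randn(1, 5)            # the last 5 samples of a series
        y_ff = ff(window)

        # 2) Recurrent: a hidden state carries information across arbitrarily long spans.
        rnn = nn.RNN(input_size=1, hidden_size=16, batch_first=True)
        readout = nn.Linear(16, 1)
        series = torch.randn(1, 100, 1)       # 100 time steps, no fixed window
        outputs, h_n = rnn(series)
        y_rnn = readout(outputs[:, -1])       # prediction from the final state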

    Scalable Massively Parallel Artificial Neural Networks

    There is renewed interest in computational intelligence, due to advances in algorithms, neuroscience, and computer hardware. In addition, there is enormous interest in autonomous vehicles (air, ground, and sea) and robotics, which need significant onboard intelligence. Work in this area could lead not only to a better understanding of the human brain but also to very useful engineering applications. The functioning of the human brain is not well understood, but enormous progress has been made in understanding it and, in particular, the neocortex. There are many reasons to develop models of the brain. Artificial Neural Networks (ANNs), one type of model, can be very effective for pattern recognition, function approximation, scientific classification, control, and the analysis of time series data. ANNs often use the back-propagation algorithm for training, which can require long training times, especially for large networks, though there are many other types of ANNs. Once a network is trained for a particular problem, however, it can produce results in a very short time. Parallelization of ANNs could drastically reduce the training time. An object-oriented, massively parallel ANN software package, SPANN (Scalable Parallel Artificial Neural Network), has been developed and is described here. MPI was used.
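
    The abstract does not detail SPANN's decomposition beyond its use of MPI, but a common way to parallelize ANN training with MPI is data-parallel gradient averaging, sketched here in Python with mpi4py rather than the package's own code, and with a linear model standing in for the network for brevity.

        # Run with e.g.: mpirun -n 4 python train.py
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        rng = np.random.default_rng(seed=rank)       # each rank draws its own data shard
        X = rng.standard_normal((256, 10))
        y = X @ np.arange(10.0) + 0.1 * rng.standard_normal(256)

        w = np.zeros(10)                             # identical initial weights on every rank
        for step in range(100):
            grad = 2 * X.T @ (X @ w - y) / len(y)    # local gradient of the squared error
            avg = np.empty_like(grad)
            comm.Allreduce(grad, avg, op=MPI.SUM)    # sum local gradients across ranks
            w -= 0.01 * (avg / size)                 # every rank applies the same averaged step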