Accelerating recurrent neural network training using sequence bucketing and multi-GPU data parallelization
An efficient algorithm for recurrent neural network training is presented.
The approach increases the training speed for tasks where the length of the
input sequence may vary significantly. The proposed approach is based on
optimal batch bucketing by input sequence length and data parallelization on
multiple graphics processing units. The baseline training performance without
sequence bucketing is compared with the proposed solution for different
numbers of buckets. An example is given for the online handwriting recognition task using
an LSTM recurrent neural network. The evaluation is performed in terms of the
wall clock time, number of epochs, and validation loss value.
Comment: 4 pages, 5 figures. 2016 IEEE First International Conference on Data Stream Mining & Processing (DSMP), Lviv, 201
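The bucketing step described above can be sketched as grouping sequences by length so that each batch pads only to its own bucket's width rather than the global maximum (a minimal illustration with hypothetical bucket boundaries, not the paper's implementation):

```python
from collections import defaultdict

def bucket_by_length(sequences, boundaries):
    """Assign each sequence to the smallest bucket whose boundary is
    >= its length; sequences longer than every boundary share a final
    overflow bucket keyed by infinity."""
    buckets = defaultdict(list)
    for seq in sequences:
        for b in boundaries:
            if len(seq) <= b:
                buckets[b].append(seq)
                break
        else:
            buckets[float("inf")].append(seq)
    return buckets

def pad_batch(batch, pad=0):
    """Pad all sequences in one bucket's batch to that batch's max length."""
    width = max(len(s) for s in batch)
    return [list(s) + [pad] * (width - len(s)) for s in batch]
```

Because padding is per bucket, short sequences are no longer padded to the length of the longest sequence in the whole dataset, which is where the wall-clock savings come from.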
Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory
The effectiveness of recurrent neural networks can be largely influenced by
their ability to store into their dynamical memory information extracted from
input sequences at different frequencies and timescales. Such a feature can be
introduced into a neural architecture by an appropriate modularization of the
dynamic memory. In this paper we propose a novel incrementally trained
recurrent architecture targeting explicitly multi-scale learning. First, we
show how to extend the architecture of a simple RNN by separating its hidden
state into different modules, each subsampling the network hidden activations
at different frequencies. Then, we discuss a training algorithm where new
modules are iteratively added to the model to learn progressively longer
dependencies. Each new module works at a slower frequency than the previous
ones and it is initialized to encode the subsampled sequence of hidden
activations. Experimental results on synthetic and real-world datasets on
speech recognition and handwritten characters show that the modular
architecture and the incremental training algorithm improve the ability of
recurrent neural networks to capture long-term dependencies.
Comment: accepted @ ECML 2020. arXiv admin note: substantial text overlap with arXiv:2001.1177
A hypothesize-and-verify framework for Text Recognition using Deep Recurrent Neural Networks
Deep LSTM is an ideal candidate for text recognition. However, text
recognition involves initial image-processing steps, such as segmentation of
lines and words, which can introduce errors into the recognition system. Without
segmentation, learning very long range context is difficult and becomes
computationally intractable. Therefore, alternative soft decisions are needed
at the pre-processing level. This paper proposes a hybrid text recognizer using
a deep recurrent neural network with multiple layers of abstraction and long
range context along with a language model to verify the performance of the deep
neural network. In this paper we construct a multi-hypotheses tree architecture
with candidate segments of line sequences from different segmentation
algorithms at its different branches. The deep neural network is trained on
perfectly segmented data and tests each of the candidate segments, generating
unicode sequences. In the verification step, these unicode sequences are
validated using a sub-string match with the language model, and best-first
search is used to find the best possible combination of alternative hypotheses
from the tree structure. Thus the verification framework using language models
eliminates wrong segmentation outputs and filters recognition errors.
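The verification step can be sketched as a best-first (uniform-cost) search over the hypothesis tree: each level holds the alternative candidate segments for one line segment, and a non-negative verification penalty stands in for the paper's language-model sub-string match. The level structure and cost function here are hypothetical illustrations, not the paper's actual system:

```python
import heapq

def best_first_combine(levels, cost):
    """Best-first search for the lowest-cost combination of candidate
    segments. levels[i] lists the alternatives for segment i; cost(text)
    >= 0 is a verification penalty (e.g. 0 when the language model
    validates the text). Because per-segment costs are non-negative,
    the first complete path popped from the heap is optimal."""
    heap = [(0.0, [])]                     # (cost so far, segments chosen)
    while heap:
        c, path = heapq.heappop(heap)
        if len(path) == len(levels):       # full combination reached
            return path, c
        for cand in levels[len(path)]:     # expand next level's alternatives
            heapq.heappush(heap, (c + cost(cand), path + [cand]))
    return None, float("inf")
```

Hypotheses that the language model rejects accumulate penalty and sink in the queue, which is how wrong segmentations get filtered out before a full path is committed.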
Implementation of Long Short-Term Memory (LSTM) for Rainfall Intensity Prediction (Case Study: Malang Regency)
Rainfall is a natural phenomenon regarded as one of the most important factors for people seeking to improve productivity across many business sectors. It strongly influences optimal decision-making in various aspects of life, one example being human activity in the agricultural sector. Rainfall is difficult to predict because weather conditions are erratic: areas that look clear can shortly afterwards experience rain or even storms. Malang Regency has a tropical climate and abundant natural resources in agriculture and plantations, a sector in which several factors influence productivity, one of them being rainfall. Predicting rainfall therefore aims to improve the productivity and mobility of human activity. This study addresses rainfall prediction in Malang Regency using the Long Short-Term Memory (LSTM) method. The results show that the LSTM model achieves its best performance with the chosen parameters, with the smallest error values used in this study, measured by RMSE and MAE, being 0.98162 and 0.68847, respectively. This indicates that the smaller the error value, the more accurately the model predicts.
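The RMSE and MAE measures reported above are standard error metrics; for reference, they can be computed as follows (generic formulas, not the study's code):

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors quadratically."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error: average magnitude of the errors."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n
```

RMSE weights large deviations more heavily than MAE, so reporting both, as the study does, distinguishes occasional large misses from uniformly small errors.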
DeepCare: A Deep Dynamic Memory Model for Predictive Medicine
Personalized predictive medicine necessitates the modeling of patient illness
and care processes, which inherently have long-term temporal dependencies.
Healthcare observations, recorded in electronic medical records, are episodic
and irregular in time. We introduce DeepCare, an end-to-end deep dynamic neural
network that reads medical records, stores previous illness history, infers
current illness states and predicts future medical outcomes. At the data level,
DeepCare represents care episodes as vectors in space and models patient health
state trajectories through explicit memory of historical records. Built on Long
Short-Term Memory (LSTM), DeepCare introduces time parameterizations to handle
irregularly timed events by moderating the forgetting and consolidation of memory
cells. DeepCare also incorporates medical interventions that change the course
of illness and shape future medical risk. Moving up to the health state level,
historical and present health states are then aggregated through multiscale
temporal pooling, before passing through a neural network that estimates future
outcomes. We demonstrate the efficacy of DeepCare for disease progression
modeling, intervention recommendation, and future risk prediction. On two
important cohorts with heavy social and economic burden -- diabetes and mental
health -- the results show improved modeling and risk prediction accuracy.
Comment: Accepted at JBI under the new name: "Predicting healthcare trajectories from medical records: A deep learning approach"
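The time-parameterization idea described above, moderating how much memory is forgotten based on the irregular gap between recorded events, can be sketched by scaling an LSTM-style forget gate with the elapsed time. This is a toy illustration assuming a simple exponential decay; DeepCare's actual parameterization is more elaborate:

```python
import math

def decayed_forget(forget_gate, delta_t, scale=1.0):
    """Moderate a forget gate by the elapsed time delta_t between events:
    the longer the gap, the more of the old cell state is forgotten.
    The exponential form here is an assumed stand-in, not DeepCare's."""
    decay = math.exp(-delta_t / scale)   # in (0, 1], shrinks with the gap
    return forget_gate * decay

def update_cell(c_prev, candidate, forget_gate, input_gate, delta_t):
    """One irregularly timed cell update: c = f' * c_prev + i * candidate,
    where f' is the time-moderated forget gate."""
    f = decayed_forget(forget_gate, delta_t)
    return f * c_prev + input_gate * candidate
```

With delta_t = 0 this reduces to the usual LSTM cell update, so the mechanism only changes behavior when observations are separated by long gaps, which matches the episodic, irregular nature of medical records.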