Exascale Deep Learning to Accelerate Cancer Research
Deep learning, through the use of neural networks, has demonstrated
remarkable ability to automate many routine tasks when presented with
sufficient data for training. The neural network architecture (e.g. number of
layers, types of layers, connections between layers, etc.) plays a critical
role in determining what, if anything, the neural network is able to learn from
the training data. The trend for neural network architectures, especially those
trained on ImageNet, has been to grow ever deeper and more complex. The result
has been ever-increasing accuracy on benchmark datasets at the cost of
increased computational demands. In this paper, we demonstrate that neural
network architectures can be automatically generated, tailored for a specific
application, with dual objectives: accuracy of prediction and speed of
prediction. Using MENNDL, an HPC-enabled software stack for neural
architecture search, we generate a neural network with comparable accuracy to
state-of-the-art networks on a cancer pathology dataset that is also
faster at inference. The speedup in inference is necessary because of the
volume and velocity of cancer pathology data; specifically, the previous
state-of-the-art networks are too slow for individual researchers without
access to HPC systems to keep pace with the rate of data generation. Our new
model enables researchers with modest computational resources to analyze newly
generated data faster than it is collected.
Comment: Submitted to IEEE Big Data
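The dual-objective idea lends itself to a compact illustration. Below is a minimal Python sketch of a fitness function that trades prediction accuracy against inference latency over a toy population of candidates; the `evaluate_candidate` scoring, latency budget, and weighting are illustrative assumptions, since MENNDL's actual search and scoring are not specified in the abstract.

```python
# Hypothetical dual-objective fitness for neural architecture search.
# A real search would train and time each candidate network; here the
# (accuracy, latency) pairs are random placeholders.
import random

def evaluate_candidate(accuracy: float, latency_ms: float,
                       latency_budget_ms: float = 50.0,
                       speed_weight: float = 0.3) -> float:
    """Combine prediction accuracy and inference speed into one score.

    Candidates over the latency budget get no speed credit, so the
    search favors networks that are both accurate and fast at inference.
    """
    speed_score = max(0.0, 1.0 - latency_ms / latency_budget_ms)
    return (1.0 - speed_weight) * accuracy + speed_weight * speed_score

# Toy population standing in for trained candidate architectures.
population = [(random.uniform(0.85, 0.95), random.uniform(10, 120))
              for _ in range(20)]
best = max(population, key=lambda c: evaluate_candidate(*c))
print(f"best candidate: accuracy={best[0]:.3f}, latency={best[1]:.1f} ms")
```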
Automatic Detection and Categorization of Election-Related Tweets
With the rise in popularity of public social media and micro-blogging
services, most notably Twitter, people have found a venue to hear and be
heard by their peers without an intermediary. As a consequence, and aided by
the public nature of Twitter, political scientists now potentially have the
means to analyse and understand the narratives that organically form, spread
and decline among the public in a political campaign. However, the volume and
diversity of the conversation on Twitter, combined with its noisy and
idiosyncratic nature, make this a hard task. Thus, advanced data mining and
language processing techniques are required to process and analyse the data. In
this paper, we present and evaluate a technical framework, based on recent
advances in deep neural networks, for identifying and analysing
election-related conversation on Twitter on a continuous, longitudinal basis.
Our models can detect election-related tweets with an F-score of 0.92 and can
categorize these tweets into 22 topics with an F-score of 0.90.
Comment: ICWSM'16, May 17-20, 2016. In Proceedings of the 10th AAAI
Conference on Weblogs and Social Media (ICWSM 2016). Cologne, Germany.
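For a sense of what such a classifier can look like, here is a minimal PyTorch sketch of a convolutional text classifier over tweet tokens. The 22-topic output comes from the abstract; the vocabulary size, embedding dimension, and filter widths are illustrative assumptions rather than the authors' configuration.

```python
# Minimal CNN text classifier sketch (not the paper's architecture).
import torch
import torch.nn as nn

class TweetCNN(nn.Module):
    def __init__(self, vocab_size=50_000, embed_dim=100, n_classes=22):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Parallel convolutions over 3-, 4-, and 5-gram windows.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, 100, kernel_size=k) for k in (3, 4, 5))
        self.fc = nn.Linear(300, n_classes)

    def forward(self, token_ids):                    # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)    # (batch, dim, seq)
        # Max-pool each feature map over time, then concatenate.
        feats = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(feats, dim=1))      # (batch, 22) logits

logits = TweetCNN()(torch.randint(0, 50_000, (8, 30)))  # 8 tweets, 30 tokens
print(logits.shape)  # torch.Size([8, 22])
```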
How far generated data can impact Neural Networks performance?
The success of deep learning models depends on the size and quality of the
dataset used to solve a given task. Here, we explore how far generated data
can aid real data in improving the performance of neural networks. We
consider facial expression recognition (FER) because it requires challenging
data generation at the level of local facial regions such as the mouth and
eyebrows, rather than simple augmentation. Generative Adversarial Networks
(GANs) provide an alternative method for generating such local deformations,
but they need further validation. To answer our question, we use simple
Convolutional Neural Network (CNN) classifiers for recognizing Ekman
emotions. For the data
generation process, we consider generating facial expressions (FEs) by relying
on two GANs. The first generates a random identity while the second imposes
facial deformations on top of it. We train the CNN classifier using FEs
from real faces, from GAN-generated faces, and from a combination of the
two. We determine an upper bound on the quantity of generated data that,
when mixed with the real data, contributes most to enhancing FER accuracy.
In our experiments, we find that adding five times more synthetic data to
the real FE dataset increases accuracy by 16%.
Comment: Conference publication in Proceedings of the 18th International
Joint Conference on Computer Vision, Imaging and Computer Graphics Theory
and Applications - Volume 5: VISAPP, 10 pages.
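The core of the experiment is the ratio sweep. The sketch below builds training sets with increasing ratios of GAN-generated to real samples; the `train_and_evaluate` stub and the sample placeholders are hypothetical stand-ins for training the FER CNN at each ratio.

```python
# Sketch of the real/synthetic mixing experiment from the abstract.
def train_and_evaluate(real, synthetic):
    """Train the FER CNN on real + synthetic faces, return test accuracy.

    Placeholder only; plug in an actual training loop here.
    """
    raise NotImplementedError

real_faces = [f"real_{i}" for i in range(1_000)]   # placeholder samples
gan_faces = [f"gan_{i}" for i in range(10_000)]    # placeholder samples

for ratio in (0, 1, 2, 5, 10):                     # synthetic : real
    synthetic = gan_faces[: ratio * len(real_faces)]
    train_set = real_faces + synthetic
    print(f"{ratio}x synthetic -> {len(train_set)} training samples")
    # accuracy = train_and_evaluate(real_faces, synthetic)
```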
Hardware-efficient on-line learning through pipelined truncated-error backpropagation in binary-state networks
Artificial neural networks (ANNs) trained using backpropagation are powerful
learning architectures that have achieved state-of-the-art performance in
various benchmarks. Significant effort has been devoted to developing custom
silicon devices to accelerate inference in ANNs. Accelerating the training
phase, however, has attracted relatively little attention. In this paper, we
describe a hardware-efficient on-line learning technique for feedforward
multi-layer ANNs that is based on pipelined backpropagation. Learning is
performed in parallel with inference in the forward pass, removing the need for
an explicit backward pass and requiring no extra weight lookup. By using binary
state variables in the feedforward network and ternary errors in
truncated-error backpropagation, the need for any multiplications in the
forward and backward passes is removed, and memory requirements for the
pipelining are drastically reduced. A further reduction in addition
operations, owing to the sparsity of the forward neural and backpropagating
error signal paths, contributes to a highly efficient hardware
implementation. As a proof-of-concept validation, we demonstrate on-line
learning of MNIST handwritten digit classification on a Spartan-6 FPGA
interfacing with an external 1 Gb DDR2 DRAM, showing only small degradation
in test error compared to an equivalently sized binary ANN trained off-line
using standard backpropagation and exact errors. Our results highlight an
attractive synergy
between pipelined backpropagation and binary-state networks in substantially
reducing computation and memory requirements, making pipelined on-line learning
practical in deep networks.
Comment: Now also considers 0/1 binary activations. Memory access statistics
reported.
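The multiplication-free update is easy to see in a toy example. The NumPy sketch below binarizes activations to +/-1 and truncates errors to the ternary set {-1, 0, +1}, so the weight-gradient outer product contains only signed units and the update reduces to additions; the layer sizes and truncation threshold are illustrative assumptions, not the paper's values.

```python
# Toy illustration of binary states + ternary truncated errors.
import numpy as np

rng = np.random.default_rng(0)

def binarize(x):
    return np.where(x >= 0, 1, -1)            # +/-1 binary activations

def ternarize(err, threshold=0.5):
    """Truncate small errors to 0, keep only the sign of large ones."""
    return np.sign(err) * (np.abs(err) > threshold)

W = rng.normal(scale=0.1, size=(16, 32))       # one hidden layer's weights
x = binarize(rng.normal(size=32))              # binary forward state
delta = ternarize(rng.normal(size=16))         # ternary backprop error

# Outer product of ternary error and binary activation lies in {-1, 0, +1},
# so the update needs no multiplies beyond scaling by the learning rate.
lr = 0.01
W -= lr * np.outer(delta, x)
print(np.unique(np.outer(delta, x)))           # subset of [-1, 0, 1]
```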
Incremental construction of LSTM recurrent neural network
Long Short-Term Memory (LSTM) is a recurrent neural network that uses
structures called memory blocks to allow the network to remember
significant events distant in the past input sequence, in order to
solve long time lag tasks where other RNN approaches fail.
In this work we have performed experiments using LSTM
networks extended with growing abilities, which we call GLSTM.
Four methods of training growing LSTMs have been compared. These
methods include cascade and fully connected hidden layers, as well
as two different levels of freezing previous weights in the
cascade case. GLSTM has been applied to a forecasting problem in a
biomedical domain, where the input/output behavior of five
controllers of the Central Nervous System has to be modelled. We
have compared growing LSTM results against other neural network
approaches and against our previous work applying conventional
LSTM to the task at hand.
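A hedged PyTorch sketch of the cascade-growing idea follows: a new LSTM block is stacked on top of a trained one, and earlier weights can be frozen. The `GLSTM` class, sizes, and single freezing switch are illustrative; the paper compares cascade and fully connected growth and two distinct freezing levels.

```python
# Illustrative incremental (growing) LSTM, not the authors' implementation.
import torch
import torch.nn as nn

class GLSTM(nn.Module):
    def __init__(self, input_size=4, hidden_size=8, output_size=5):
        super().__init__()
        self.blocks = nn.ModuleList([nn.LSTM(input_size, hidden_size,
                                             batch_first=True)])
        self.head = nn.Linear(hidden_size, output_size)
        self.hidden_size = hidden_size

    def grow(self, freeze_previous=True):
        """Cascade a new LSTM block on top; optionally freeze old ones."""
        if freeze_previous:
            for p in self.blocks.parameters():
                p.requires_grad = False
        self.blocks.append(nn.LSTM(self.hidden_size, self.hidden_size,
                                   batch_first=True))
        # A fresh output head is attached for the new top block.
        self.head = nn.Linear(self.hidden_size, self.head.out_features)

    def forward(self, x):
        for lstm in self.blocks:
            x, _ = lstm(x)
        return self.head(x[:, -1])              # predict from the last step

model = GLSTM()
model.grow()                                    # add a second, trainable block
out = model(torch.randn(2, 20, 4))              # (batch, time, features)
print(out.shape)                                # torch.Size([2, 5])
```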
An Improved Stock Price Prediction using Hybrid Market Indicators
In this paper, the effect of hybrid market indicators is examined for improved stock price prediction. The hybrid market indicators consist of technical, fundamental, and expert opinion variables used as input to an artificial neural network model. The empirical results,
obtained with published stock data of Dell and Nokia from the New York Stock Exchange, show that the proposed model can effectively improve the accuracy of stock price prediction.
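The hybrid-input idea can be sketched compactly: concatenate the three indicator groups into one feature vector and feed a small feedforward network. The specific feature names, layer sizes, and training data below are illustrative assumptions, not the paper's setup.

```python
# Minimal NumPy sketch: hybrid indicators as input to a small ANN
# trained with plain gradient descent on squared error.
import numpy as np

rng = np.random.default_rng(1)

technical = rng.normal(size=(100, 3))    # e.g. moving average, RSI, volume
fundamental = rng.normal(size=(100, 2))  # e.g. P/E ratio, book value
expert = rng.normal(size=(100, 1))       # e.g. analyst sentiment score
X = np.hstack([technical, fundamental, expert])  # (100, 6) hybrid input
y = rng.normal(size=(100, 1))                    # next-day price target

W1, b1 = rng.normal(scale=0.1, size=(6, 10)), np.zeros(10)
W2, b2 = rng.normal(scale=0.1, size=(10, 1)), np.zeros(1)
for _ in range(500):
    h = np.tanh(X @ W1 + b1)                 # hidden layer
    pred = h @ W2 + b2                       # predicted price
    err = pred - y
    dW2 = h.T @ err / len(X); db2 = err.mean(0)
    dh = err @ W2.T * (1 - h**2)             # backprop through tanh
    dW1 = X.T @ dh / len(X); db1 = dh.mean(0)
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= 0.1 * g                         # gradient-descent update
print("final MSE:", float((err**2).mean()))
```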