Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition
Recently, the connectionist temporal classification (CTC) model, coupled with
recurrent (RNN) or convolutional neural networks (CNN), has made it easier to
train speech recognition systems in an end-to-end fashion. However, in real-valued
models, time frame components such as mel-filter-bank energies and the cepstral
coefficients obtained from them, together with their first and second order
derivatives, are processed as individual elements, while a natural alternative
is to process such components as composed entities. We propose to group such
elements in the form of quaternions and to process these quaternions using the
established quaternion algebra. Quaternion numbers and quaternion neural
networks have proven efficient at processing multidimensional inputs as
entities, encoding internal dependencies, and solving many tasks with fewer
learning parameters than real-valued models. This paper proposes to integrate
multiple feature views in a quaternion-valued convolutional neural network
(QCNN), to be used for sequence-to-sequence mapping with the CTC model.
Promising results are reported using simple QCNNs in phoneme recognition
experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme
error rate (PER) with fewer learning parameters than a competing model based on
real-valued CNNs.
Comment: Accepted at INTERSPEECH 201
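The quaternion algebra mentioned in the abstract rests on the Hamilton product. The following is a minimal sketch of that product, not the authors' implementation. The function name `hamilton_product`, the tuple layout `(r, i, j, k)`, and the grouping of one filter-bank energy with its first and second derivatives into the three imaginary parts are assumptions made for illustration.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# Hypothetical feature grouping (an assumption, not taken from the abstract):
# real part 0, then an energy value, its first and its second derivative.
feature = (0.0, 0.8, 0.1, -0.05)
weight = (0.5, 0.5, 0.5, 0.5)  # a single quaternion-valued weight
mixed = hamilton_product(weight, feature)
```

Because one quaternion weight mixes all four components of the input, a quaternion layer shares its four real numbers across a 4-to-4 mapping that a real-valued layer would represent with sixteen independent weights.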
A Particle Swarm Optimization-based Flexible Convolutional Auto-Encoder for Image Classification
Over the past several years, convolutional auto-encoders have shown remarkable
performance when stacked into deep convolutional neural networks for
classifying image data. However, they cannot be used to construct
state-of-the-art convolutional neural networks because of their intrinsic
architectures. In this regard, we propose a flexible convolutional auto-encoder
by eliminating the constraints on the numbers of convolutional layers and
pooling layers from the traditional convolutional auto-encoder. We also design
an architecture discovery method by using particle swarm optimization, which is
capable of automatically searching for the optimal architectures of the
proposed flexible convolutional auto-encoder with far fewer computational
resources and without any manual intervention. We use the designed architecture
optimization algorithm to test the proposed flexible convolutional auto-encoder
using a single graphics processing unit card on four extensively used
image classification datasets. Experimental results show that our work
significantly outperforms its peer competitors, including the
state-of-the-art algorithm.
Comment: Accepted by IEEE Transactions on Neural Networks and Learning
Systems, 201
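For readers unfamiliar with particle swarm optimization, the following is a generic sketch of the standard velocity and position update on a toy continuous objective. It is not the architecture encoding or search procedure of the paper, which the abstract does not describe. The function name `pso_minimize`, the coefficient values, and the sphere objective are all assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # per-particle best positions
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term plus pulls toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


solution, value = pso_minimize(sphere, dim=2)
```

In an architecture search, each particle would instead encode a candidate network (for example, layer counts and filter sizes), and the objective would be a validation error obtained by training that candidate.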
Localized Dimension Growth in Random Network Coding: A Convolutional Approach
We propose an efficient Adaptive Random Convolutional Network Coding (ARCNC)
algorithm to address the issue of field size in random network coding. ARCNC
operates as a convolutional code, with the coefficients of local encoding
kernels chosen randomly over a small finite field. The lengths of local
encoding kernels increase with time until the global encoding kernel matrices
at related sink nodes all have full rank. Instead of estimating the necessary
field size a priori, ARCNC operates in a small finite field. It adapts to
unknown network topologies without prior knowledge, by locally incrementing the
dimensionality of the convolutional code. Because convolutional codes of
different constraint lengths can coexist in different portions of the network,
reductions in decoding delay and memory overheads can be achieved with ARCNC.
We show through analysis that this method performs no worse than random linear
network codes in general networks, and can provide significant gains in terms
of average decoding delay in combination networks.
Comment: 7 pages, 1 figure, submitted to IEEE ISIT 201
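The stopping condition described above, full rank of the global encoding kernel matrix at each sink, can be illustrated with a rank test over a small field. The sketch below handles only the scalar GF(2) case, with each matrix row packed into an integer bit vector. ARCNC itself works with convolutional (polynomial) kernels, so this is a simplified stand-in, and the name `gf2_rank` is hypothetical.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are ints used as bit vectors."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue  # an all-zero row adds nothing to the rank
        rank += 1
        low_bit = pivot & -pivot  # lowest set bit serves as the pivot column
        # Eliminate the pivot column from every remaining row (XOR = GF(2) add).
        rows = [r ^ pivot if r & low_bit else r for r in rows]
    return rank


# A sink can decode two source symbols once its 2 x 2 matrix has rank 2.
decodable = gf2_rank([0b10, 0b01]) == 2
```

When the test fails at some sink, the scheme in the abstract lengthens the local encoding kernels rather than moving to a larger field.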