Machine Learning for Wireless Communications in the Internet of Things: A Comprehensive Survey
The Internet of Things (IoT) is expected to require more effective and
efficient wireless communications than ever before. For this reason, techniques
such as spectrum sharing, dynamic spectrum access, extraction of signal
intelligence and optimized routing will soon become essential components of the
IoT wireless communication paradigm. Given that the majority of the IoT will be
composed of tiny, mobile, and energy-constrained devices, traditional
techniques based on a priori network optimization may not be suitable, since
(i) an accurate model of the environment may not be readily available in
practical scenarios; (ii) the computational requirements of traditional
optimization techniques may prove prohibitive for IoT devices. To address the
above challenges, much research has been devoted to exploring the use of
machine learning to address problems in the IoT wireless communications domain.
This work provides a comprehensive survey of the state of the art in the
application of machine learning techniques to address key problems in IoT
wireless communications with an emphasis on its ad hoc networking aspect.
First, we present extensive background notions of machine learning techniques.
Then, by adopting a bottom-up approach, we examine existing work on machine
learning for the IoT at the physical, data-link and network layer of the
protocol stack. Thereafter, we discuss directions taken by the community
towards hardware implementation to ensure the feasibility of these techniques.
Additionally, before concluding, we also provide a brief discussion of the
application of machine learning in IoT beyond wireless communication. Finally,
each of these discussions is accompanied by a detailed analysis of the related
open problems and challenges. Comment: Ad Hoc Networks Journal
Deep Knowledge Tracing
Knowledge tracing---where a machine models the knowledge of a student as they
interact with coursework---is a well established problem in computer supported
education. Though effectively modeling student knowledge would have high
educational impact, the task has many inherent challenges. In this paper we
explore the utility of using Recurrent Neural Networks (RNNs) to model student
learning. The RNN family of models has important advantages over previous
methods in that they do not require the explicit encoding of human domain
knowledge, and can capture more complex representations of student knowledge.
Using neural networks results in substantial improvements in prediction
performance on a range of knowledge tracing datasets. Moreover, the learned
model can be used for intelligent curriculum design and allows straightforward
interpretation and discovery of structure in student tasks. These results
suggest a promising new line of research for knowledge tracing and an exemplary
application task for RNNs.
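As an illustration of the approach described in this abstract, the sketch below shows a DKT-style model: an LSTM consumes embedded (exercise, correctness) interactions and outputs, at each step, the predicted probability of answering each exercise correctly. The class name, the embedding scheme, and the layer sizes are illustrative assumptions, not the authors' exact architecture.

```python
# Hypothetical DKT-style sketch; names and sizes are illustrative.
import torch
import torch.nn as nn

class DeepKnowledgeTracer(nn.Module):
    def __init__(self, num_exercises, hidden_size=200):
        super().__init__()
        # One interaction = (exercise id, correct/incorrect), encoded as a
        # single index in [0, 2 * num_exercises).
        self.embed = nn.Embedding(2 * num_exercises, hidden_size)
        self.rnn = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        # Readout: predicted probability of answering each exercise correctly.
        self.readout = nn.Linear(hidden_size, num_exercises)

    def forward(self, interactions):
        # interactions: (batch, time) integer tensor of past interaction indices
        h, _ = self.rnn(self.embed(interactions))
        return torch.sigmoid(self.readout(h))  # (batch, time, num_exercises)

# Training would minimize binary cross-entropy between the prediction for the
# next exercise attempted and the observed correctness.
```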
Kinematic Resolutions of Redundant Robot Manipulators using Integration-Enhanced RNNs
Recently, a time-varying quadratic programming (QP) framework that describes
the tracking operations of redundant robot manipulators has been introduced to
handle the kinematic resolution of many robot control tasks. Based on a
generalization of this time-varying QP framework, two schemes, i.e., the
Repetitive Motion Scheme and the Hybrid Torque Scheme, are proposed. However,
measurement noise is unavoidable when a redundant robot manipulator is
executing a tracking task. To solve this problem, a novel integration-enhanced
recurrent neural network (IE-RNN) is proposed in this paper. In combination
with the two aforementioned schemes, the tracking task can be completed
accurately by the IE-RNN. Both theoretical analyses and simulation results
prove that the residual errors of the IE-RNN converge to zero under different
kinds of measurement noise. Moreover, practical experiments are conducted to
verify the convergence and robustness of the proposed IE-RNN.
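The following is a minimal, illustrative sketch of an integration-enhanced recurrent update for a time-varying linear system A(t)y(t) = b(t) (for example, the KKT system of the tracking QP): the integral of the residual is fed back alongside the instantaneous residual, which rejects constant measurement noise. The gains, step size, and discrete-time form are assumptions, not the paper's exact formulation.

```python
# Illustrative integration-enhanced recurrent step (not the paper's exact model)
# for a time-varying linear system A(t) y(t) = b(t).
import numpy as np

def ie_rnn_step(y, e_int, A, dA, b, db, gamma=10.0, lam=5.0, dt=1e-3):
    """One step; gamma (proportional) and lam (integral) are illustrative gains."""
    e = A @ y - b                    # instantaneous residual (may be noisy)
    e_int = e_int + e * dt           # accumulated (integrated) residual
    # Proportional + integral feedback drives the residual to zero even under
    # constant measurement noise; dA, db are the time derivatives of A and b.
    rhs = -dA @ y + db - gamma * e - lam * e_int
    y_dot = np.linalg.solve(A, rhs)
    return y + y_dot * dt, e_int
```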
Preconditioned Stochastic Gradient Descent
Stochastic gradient descent (SGD) is still the workhorse for many practical
problems. However, it converges slowly and can be difficult to tune. It is
possible to precondition SGD to accelerate its convergence remarkably. But many
attempts in this direction either aim at solving specialized problems, or
result in significantly more complicated methods than SGD. This paper proposes
a new method to estimate a preconditioner such that the amplitudes of
perturbations of the preconditioned stochastic gradient match those of the
perturbations of the parameters being optimized, in a way comparable to the
Newton method for deterministic optimization. Unlike preconditioners based on
secant equation fitting as done in deterministic quasi-Newton methods, which
assume positive definite Hessian and approximate its inverse, the new
preconditioner works equally well for both convex and non-convex optimizations
with exact or noisy gradients. When stochastic gradient is used, it can
naturally damp the gradient noise to stabilize SGD. Efficient preconditioner
estimation methods are developed, and with reasonable simplifications, they are
applicable to large-scale problems. Experimental results demonstrate that
equipped with the new preconditioner, without any tuning effort, preconditioned
SGD can efficiently solve many challenging problems like the training of a deep
neural network or a recurrent neural network requiring extremely long term
memories. Comment: 13 pages, 9 figures. To appear in IEEE Transactions on Neural
Networks and Learning Systems. Supplemental materials on
https://sites.google.com/site/lixilinx/home/psg
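A minimal sketch of the core idea, restricted to a diagonal preconditioner: the preconditioner is estimated so that the amplitude of an induced gradient perturbation matches that of the corresponding parameter perturbation, and the preconditioned gradient step follows. The probe size, learning rate, and the diagonal restriction are illustrative assumptions; the paper develops more general and more efficient estimators.

```python
# Sketch of preconditioned SGD with a *diagonal* preconditioner (illustrative).
import numpy as np

def diagonal_preconditioner(delta_theta, delta_grad, eps=1e-12):
    # Minimizing E[dg^T P dg + dtheta^T P^{-1} dtheta] over diagonal P gives
    # p_i = |dtheta_i| / |dg_i| elementwise, i.e. matched perturbation amplitudes.
    return np.abs(delta_theta) / (np.abs(delta_grad) + eps)

def psgd_step(theta, grad_fn, lr=0.1, probe=1e-4, rng=np.random):
    g = grad_fn(theta)
    d_theta = probe * rng.standard_normal(theta.shape)   # small random probe
    d_grad = grad_fn(theta + d_theta) - g                 # induced gradient change
    P = diagonal_preconditioner(d_theta, d_grad)
    return theta - lr * P * g                             # preconditioned update
```

In practice the preconditioner estimate would be smoothed across iterations (e.g., a running average) rather than recomputed from a single probe.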
Provably Correct Learning Algorithms in the Presence of Time-Varying Features Using a Variational Perspective
Features in machine learning problems are often time-varying and may be
related to outputs in an algebraic or dynamical manner. The dynamic nature of
these machine learning problems renders current higher order accelerated
gradient descent methods unstable or weakens their convergence guarantees.
Inspired by methods employed in adaptive control, this paper proposes new
algorithms for the case when time-varying features are present, and
demonstrates provable performance guarantees. In particular, we develop a
unified variational perspective within a continuous time algorithm. This
variational perspective includes higher order learning concepts and
normalization, both of which stem from adaptive control, and allows stability
to be established for dynamical machine learning problems where time-varying
features are present. These higher order algorithms are also examined for
provably correct learning in adaptive control and identification. Simulations
are provided to verify the theoretical results. Comment: 25 pages, additional simulation details, paper rewritten
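For context, the sketch below shows a classical normalized gradient law from adaptive control for a time-varying linear regression y(t) = theta*^T phi(t); the normalization keeps the update bounded when the time-varying features grow large, which is one ingredient of the stability arguments. This is only the baseline idea the paper builds on, not the authors' higher-order variational algorithms; the gain and step size are assumptions.

```python
# Illustrative normalized gradient law (baseline from adaptive control).
import numpy as np

def normalized_gradient_step(theta, phi, y, gamma=1.0, dt=1e-3):
    e = theta @ phi - y                       # prediction error at time t
    # Division by (1 + |phi|^2) bounds the update even when the time-varying
    # features phi(t) become large, aiding stability.
    theta_dot = -gamma * phi * e / (1.0 + phi @ phi)
    return theta + theta_dot * dt
```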
Nonlinear Model Predictive Control of A Gasoline HCCI Engine Using Extreme Learning Machines
Homogeneous charge compression ignition (HCCI) is a futuristic combustion
technology that operates with a high fuel efficiency and reduced emissions.
HCCI combustion is characterized by complex nonlinear dynamics which
necessitates a model based control approach for automotive application. HCCI
engine control is a nonlinear, multi-input multi-output problem with state and
actuator constraints which makes controller design a challenging task. Typical
HCCI controllers make use of a first principles based model which involves a
long development time and cost associated with expert labor and calibration. In
this paper, an alternative approach based on machine learning is presented
using extreme learning machines (ELM) and nonlinear model predictive control
(MPC). A recurrent ELM is used to learn the nonlinear dynamics of the HCCI engine
using experimental data and is shown to accurately predict the engine behavior
several steps ahead in time, suitable for predictive control. Using the ELM
engine models, an MPC based control algorithm with a simplified quadratic
program update is derived for real-time implementation. The working and
effectiveness of the MPC approach have been analyzed on a nonlinear HCCI engine
model for tracking multiple reference quantities along with constraints defined
by HCCI states, actuators and operational limits. Comment: This paper was
written as an extract from my PhD thesis (July 2013), and so references may not
be up to date as of this submission (Jan 2015). The article is in review and
contains 10 figures and 35 references
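A condensed sketch of the modeling half of this approach, under illustrative assumptions (hidden-layer size, tanh activation, ridge regularization): a one-step ELM predictor with fixed random hidden weights and a least-squares readout, which an MPC layer could then use to optimize actuator inputs over a horizon. This is not the paper's implementation.

```python
# Illustrative ELM one-step engine model: x_{k+1} ~ W_out * tanh(W_in [x_k; u_k] + b).
import numpy as np

def fit_elm(X, U, X_next, hidden=200, ridge=1e-3, rng=np.random.default_rng(0)):
    Z = np.hstack([X, U])                              # states and actuator inputs
    W_in = rng.standard_normal((hidden, Z.shape[1]))   # fixed random hidden weights
    b = rng.standard_normal(hidden)
    H = np.tanh(Z @ W_in.T + b)                        # hidden features
    # Ridge-regularized least squares for the readout (the only trained part).
    W_out = np.linalg.solve(H.T @ H + ridge * np.eye(hidden), H.T @ X_next).T
    predict = lambda x, u: W_out @ np.tanh(W_in @ np.concatenate([x, u]) + b)
    return predict

# An MPC layer would roll this predictor forward over a horizon and solve a
# constrained quadratic program for the actuator sequence at each time step.
```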
Jointly optimal denoising, dereverberation, and source separation
This paper proposes methods that can optimize a Convolutional BeamFormer
(CBF) for jointly performing denoising, dereverberation, and source separation
(DN+DR+SS) in a computationally efficient way. Conventionally, a cascade
configuration composed of a Weighted Prediction Error minimization (WPE)
dereverberation filter followed by a Minimum Variance Distortionless Response
(MVDR) beamformer has been used as the state-of-the-art frontend of far-field
speech recognition; however, the overall optimality of this approach is not
guaranteed. In
the blind signal processing area, an approach for jointly optimizing
dereverberation and source separation (DR+SS) has been proposed; however, this
approach requires a huge computing cost and has not been extended for
application to DN+DR+SS. To overcome the above limitations, this paper develops
new approaches for jointly optimizing DN+DR+SS in a computationally much more
efficient way. To this end, we first present an objective function to optimize
a CBF for performing DN+DR+SS based on maximum likelihood estimation, under the
assumption that the steering vectors of the target signals are given or can be
estimated, e.g., using a neural network. This paper refers to a CBF optimized
by this objective function as a weighted Minimum-Power Distortionless Response
(wMPDR) CBF. Then, we derive two algorithms for optimizing a wMPDR CBF based on
two different ways of factorizing a CBF into WPE filters and beamformers.
Experiments using noisy reverberant sound mixtures show that the proposed
optimization approaches greatly improve the performance of speech
enhancement in comparison with the conventional cascade configuration in terms
of the signal distortion measures and ASR performance. It is also shown that
the proposed approaches can greatly reduce the computing cost with improved
estimation accuracy in comparison with the conventional joint optimization
approach. Comment: Submitted to IEEE/ACM Trans. Audio, Speech, and Language Processing
on 12 Feb 2020, Accepted to IEEE/ACM Trans. Audio, Speech, and Language
Processing on 14 July 202
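As a rough illustration of the weighted power-minimization criterion sketched in this abstract, the snippet below computes a per-frequency wMPDR-style beamformer weight: the spatial covariance is weighted by the inverse of the estimated target power, and the output is constrained to be distortionless toward the steering vector. The convolutional (multi-frame) structure and the paper's factorized optimization algorithms are omitted; the function name and regularization are assumptions.

```python
# Rough per-frequency-bin sketch of a weighted MPDR weight (core step only).
import numpy as np

def wmpdr_weights(X, v, lam, diag_load=1e-6):
    # X: (channels, frames) STFT observations at one frequency bin
    # v: (channels,) target steering vector; lam: (frames,) target power estimates
    R = (X / lam) @ X.conj().T / X.shape[1]       # power-weighted spatial covariance
    R += diag_load * np.eye(R.shape[0])           # small regularization
    Rinv_v = np.linalg.solve(R, v)
    return Rinv_v / (v.conj() @ Rinv_v)           # distortionless: w^H v = 1
```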
Building DNN Acoustic Models for Large Vocabulary Speech Recognition
Deep neural networks (DNNs) are now a central component of nearly all
state-of-the-art speech recognition systems. Building neural network acoustic
models requires several design decisions including network architecture, size,
and training loss function. This paper offers an empirical investigation on
which aspects of DNN acoustic model design are most important for speech
recognition system performance. We report DNN classifier performance and final
speech recognizer word error rates, and compare DNNs using several metrics to
quantify factors influencing differences in task performance. Our first set of
experiments uses the standard Switchboard benchmark corpus, which contains
approximately 300 hours of conversational telephone speech. We compare standard
DNNs to convolutional networks, and present the first experiments using
locally-connected, untied neural networks for acoustic modeling. We
additionally build systems on a corpus of 2,100 hours of training data by
combining the Switchboard and Fisher corpora. This larger corpus allows us to
more thoroughly examine performance of large DNN models -- with up to ten times
more parameters than those typically used in speech recognition systems. Our
results suggest that a relatively simple DNN architecture and optimization
technique produces strong results. These findings, along with previous work,
help establish a set of best practices for building DNN hybrid speech
recognition systems with maximum likelihood training. Our experiments in DNN
optimization additionally serve as a case study for training DNNs with
discriminative loss functions for speech tasks, as well as DNN classifiers more
generally.
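For concreteness, a minimal sketch of a hybrid DNN acoustic model of the kind compared in this paper: a feed-forward network maps a window of stacked acoustic frames to senone (tied HMM state) posteriors and is trained with cross-entropy against frame-level alignments. Layer counts and sizes are illustrative assumptions, not the specific configurations reported.

```python
# Illustrative hybrid DNN acoustic model (sizes are placeholders).
import torch.nn as nn

def make_dnn_acoustic_model(input_dim=40 * 11, num_senones=9000,
                            hidden=2048, layers=5):
    mods = []
    dim = input_dim
    for _ in range(layers):
        mods += [nn.Linear(dim, hidden), nn.ReLU()]
        dim = hidden
    mods.append(nn.Linear(dim, num_senones))   # senone logits
    return nn.Sequential(*mods)

# Training: cross-entropy against frame-level senone alignments; at test time
# the posteriors are converted to scaled likelihoods for HMM decoding.
```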
Recent Advances in Physical Reservoir Computing: A Review
Reservoir computing is a computational framework suited for
temporal/sequential data processing. It is derived from several recurrent
neural network models, including echo state networks and liquid state machines.
A reservoir computing system consists of a reservoir for mapping inputs into a
high-dimensional space and a readout for pattern analysis from the
high-dimensional states in the reservoir. The reservoir is fixed and only the
readout is trained with a simple method such as linear regression or
classification. Thus, the major advantage of reservoir computing compared to
other recurrent neural networks is fast learning, resulting in low training
cost. Another advantage is that the reservoir without adaptive updating is
amenable to hardware implementation using a variety of physical systems,
substrates, and devices. In fact, such physical reservoir computing has
attracted increasing attention in diverse fields of research. The purpose of
this review is to provide an overview of recent advances in physical reservoir
computing by classifying them according to the type of the reservoir. We
discuss the current issues and perspectives related to physical reservoir
computing, in order to further expand its practical applications and develop
next-generation machine learning systems. Comment: 62 pages, 13 figures
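The sketch below illustrates the reservoir computing recipe summarized above with an echo state network: a fixed random recurrent reservoir maps inputs into a high-dimensional state space, and only a linear readout is trained, here by ridge regression. Reservoir size, spectral radius, and input scaling are illustrative assumptions.

```python
# Minimal echo state network sketch: fixed random reservoir + trained linear readout.
import numpy as np

def run_reservoir(inputs, n_res=500, spectral_radius=0.9, input_scale=1.0,
                  rng=np.random.default_rng(0)):
    n_in = inputs.shape[1]
    W_in = input_scale * rng.standard_normal((n_res, n_in))
    W = rng.standard_normal((n_res, n_res))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))  # echo state scaling
    x = np.zeros(n_res)
    states = []
    for u in inputs:                      # map inputs into high-dimensional states
        x = np.tanh(W @ x + W_in @ u)
        states.append(x.copy())
    return np.array(states)

def train_readout(states, targets, ridge=1e-6):
    # Only this linear readout is trained (ridge regression).
    return np.linalg.solve(states.T @ states + ridge * np.eye(states.shape[1]),
                           states.T @ targets)
```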
Deep convolutional recurrent autoencoders for learning low-dimensional feature dynamics of fluid systems
Model reduction of high-dimensional dynamical systems alleviates
computational burdens faced in various tasks from design optimization to model
predictive control. One popular model reduction approach is based on projecting
the governing equations onto a subspace spanned by basis functions obtained
from the compression of a dataset of solution snapshots. However, this method
is intrusive since the projection requires access to the system operators.
Further, some systems may require special treatment of nonlinearities to ensure
computational efficiency or additional modeling to preserve stability. In this
work we propose a deep learning-based strategy for nonlinear model reduction
that is inspired by projection-based model reduction where the idea is to
identify some optimal low-dimensional representation and evolve it in time. Our
approach constructs a modular model consisting of a deep convolutional
autoencoder and a modified LSTM network. The deep convolutional autoencoder
returns a low-dimensional representation in terms of coordinates on some
expressive nonlinear data-supporting manifold. The dynamics on this manifold
are then modeled by the modified LSTM network in a computationally efficient
manner. An offline unsupervised training strategy that exploits the model
modularity is also developed. We demonstrate our model on three illustrative
examples each highlighting the model's performance in prediction tasks for
fluid systems with large parameter variations and its stability in long-term
prediction.
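As a condensed sketch of the modular architecture described in this abstract: a convolutional autoencoder compresses each flow snapshot (assumed 64x64 here) into a low-dimensional latent code, and an LSTM models the dynamics of that code. Channel counts, the latent size, and the use of a plain (rather than modified) LSTM are assumptions, not the paper's exact design.

```python
# Illustrative convolutional-autoencoder + latent-LSTM reduced-order model.
import torch.nn as nn

class ConvAE(nn.Module):
    def __init__(self, latent_dim=64):
        super().__init__()
        # Encoder: 1x64x64 snapshot -> low-dimensional latent code.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(), nn.LazyLinear(latent_dim))
        # Decoder: latent code -> reconstructed snapshot.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1))

    def forward(self, x):
        z = self.encoder(x)          # low-dimensional representation
        return self.decoder(z), z

class LatentLSTM(nn.Module):
    def __init__(self, latent_dim=64, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(latent_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, latent_dim)

    def forward(self, z_seq):
        h, _ = self.lstm(z_seq)      # evolve the latent dynamics in time
        return self.out(h)           # predicted next latent states
```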