10,733 research outputs found

    Machine Learning for Wireless Communications in the Internet of Things: A Comprehensive Survey

    Full text link
    The Internet of Things (IoT) is expected to require more effective and efficient wireless communications than ever before. For this reason, techniques such as spectrum sharing, dynamic spectrum access, extraction of signal intelligence and optimized routing will soon become essential components of the IoT wireless communication paradigm. Given that the majority of the IoT will be composed of tiny, mobile, and energy-constrained devices, traditional techniques based on a priori network optimization may not be suitable, since (i) an accurate model of the environment may not be readily available in practical scenarios; and (ii) the computational requirements of traditional optimization techniques may prove prohibitive for IoT devices. To address these challenges, much research has been devoted to exploring the use of machine learning for problems in the IoT wireless communications domain. This work provides a comprehensive survey of the state of the art in the application of machine learning techniques to key problems in IoT wireless communications, with an emphasis on its ad hoc networking aspect. First, we present extensive background notions of machine learning techniques. Then, adopting a bottom-up approach, we examine existing work on machine learning for the IoT at the physical, data-link and network layers of the protocol stack. Thereafter, we discuss directions taken by the community towards hardware implementation to ensure the feasibility of these techniques. Additionally, before concluding, we provide a brief discussion of the application of machine learning in the IoT beyond wireless communication. Finally, each of these discussions is accompanied by a detailed analysis of the related open problems and challenges. Comment: Ad Hoc Networks Journal

    Deep Knowledge Tracing

    Full text link
    Knowledge tracing---where a machine models the knowledge of a student as they interact with coursework---is a well-established problem in computer-supported education. Though effectively modeling student knowledge would have high educational impact, the task has many inherent challenges. In this paper we explore the utility of using Recurrent Neural Networks (RNNs) to model student learning. Models in the RNN family have important advantages over previous methods in that they do not require the explicit encoding of human domain knowledge, and can capture more complex representations of student knowledge. Using neural networks results in substantial improvements in prediction performance on a range of knowledge tracing datasets. Moreover, the learned model can be used for intelligent curriculum design and allows straightforward interpretation and discovery of structure in student tasks. These results suggest a promising new line of research for knowledge tracing and an exemplary application task for RNNs.
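    A minimal sketch of the kind of RNN knowledge-tracing model described above, assuming a PyTorch implementation: each interaction is a one-hot encoding of an (exercise, correctness) pair, and the network outputs per-exercise probabilities of a correct answer at the next step. The encoding, layer sizes, and toy usage are illustrative assumptions, not the authors' exact setup.

```python
# Sketch of an RNN knowledge-tracing model (PyTorch assumed).
# Input per step: one-hot of (exercise id, correct/incorrect) -> 2*E dimensions.
# Output per step: predicted probability of answering each exercise correctly next.
import torch
import torch.nn as nn

class DeepKnowledgeTracer(nn.Module):
    def __init__(self, n_exercises: int, hidden_size: int = 200):
        super().__init__()
        self.rnn = nn.LSTM(input_size=2 * n_exercises,
                           hidden_size=hidden_size, batch_first=True)
        self.readout = nn.Linear(hidden_size, n_exercises)

    def forward(self, interactions):                    # (batch, time, 2*E)
        hidden_states, _ = self.rnn(interactions)
        return torch.sigmoid(self.readout(hidden_states))  # (batch, time, E)

# Toy usage: 3 students, 5 interactions each, 10 distinct exercises.
model = DeepKnowledgeTracer(n_exercises=10)
x = torch.zeros(3, 5, 20)                               # one-hot interaction encodings
p_correct = model(x)                                    # per-exercise mastery over time
loss = nn.functional.binary_cross_entropy(p_correct[:, -1, 3], torch.ones(3))
```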

    Kinematic Resolutions of Redundant Robot Manipulators using Integration-Enhanced RNNs

    Full text link
    Recently, a time-varying quadratic programming (QP) framework that describes the tracking operations of redundant robot manipulators has been introduced to handle the kinematic resolutions of many robot control tasks. Based on a generalization of this time-varying QP framework, two schemes, i.e., the Repetitive Motion Scheme and the Hybrid Torque Scheme, are proposed. However, measurement noises are unavoidable when a redundant robot manipulator is executing a tracking task. To solve this problem, a novel integration-enhanced recurrent neural network (IE-RNN) is proposed in this paper. In combination with the aforementioned two schemes, the tracking task can be accurately completed by the IE-RNN. Both theoretical analyses and simulation results prove that the residual errors of the IE-RNN converge to zero under different kinds of measurement noise. Moreover, practical experiments are conducted to verify the excellent convergence and strong robustness of the proposed IE-RNN.
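    As a rough illustration of the kind of time-varying QP formulation referred to above, a generic velocity-level kinematic resolution can be written as below, where θ(t) denotes the joint angles, J(θ) the manipulator Jacobian, and ṙ_d(t) the desired end-effector velocity. The weight matrix W, the linear term c(t), and the bounds are scheme-specific; the exact objectives of the Repetitive Motion and Hybrid Torque schemes are not reproduced here.

```latex
% Generic time-varying QP for velocity-level kinematic resolution (illustrative only)
\begin{aligned}
\min_{\dot{\theta}(t)} \quad & \tfrac{1}{2}\,\dot{\theta}(t)^{\top} W\,\dot{\theta}(t) \;+\; c(t)^{\top}\dot{\theta}(t) \\
\text{s.t.} \quad & J\bigl(\theta(t)\bigr)\,\dot{\theta}(t) \;=\; \dot{r}_{d}(t), \\
& \theta^{-} \le \theta(t) \le \theta^{+}, \qquad \dot{\theta}^{-} \le \dot{\theta}(t) \le \dot{\theta}^{+}.
\end{aligned}
```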

    Preconditioned Stochastic Gradient Descent

    Full text link
    Stochastic gradient descent (SGD) is still the workhorse for many practical problems. However, it converges slowly and can be difficult to tune. It is possible to precondition SGD to accelerate its convergence remarkably. But many attempts in this direction either aim at solving specialized problems, or result in significantly more complicated methods than SGD. This paper proposes a new method to estimate a preconditioner such that the amplitudes of perturbations of the preconditioned stochastic gradient match those of the perturbations of the parameters to be optimized, in a way comparable to Newton's method for deterministic optimization. Unlike preconditioners based on secant equation fitting as done in deterministic quasi-Newton methods, which assume a positive definite Hessian and approximate its inverse, the new preconditioner works equally well for both convex and non-convex optimization with exact or noisy gradients. When stochastic gradients are used, it can naturally damp the gradient noise to stabilize SGD. Efficient preconditioner estimation methods are developed, and with reasonable simplifications, they are applicable to large-scale problems. Experimental results demonstrate that, equipped with the new preconditioner and without any tuning effort, preconditioned SGD can efficiently solve many challenging problems like the training of a deep neural network or a recurrent neural network requiring extremely long-term memories. Comment: 13 pages, 9 figures. To appear in IEEE Transactions on Neural Networks and Learning Systems. Supplemental materials on https://sites.google.com/site/lixilinx/home/psg
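    A rough NumPy sketch of the idea with a purely diagonal preconditioner: the preconditioner is fitted so that the elementwise magnitudes of preconditioned gradient perturbations match those of the parameter perturbations that caused them. The paper's estimators cover richer preconditioner structures and noisy gradients; the function and hyperparameters below are illustrative assumptions, not the paper's algorithm.

```python
# Sketch: SGD with a diagonal preconditioner fitted so that preconditioned-gradient
# perturbations match parameter perturbations elementwise (diagonal case only).
import numpy as np

def psgd_diag(grad_fn, theta, lr=0.1, steps=200, probe=1e-4, beta=0.9):
    # initialize the diagonal preconditioner from one probe perturbation
    d0 = probe * np.random.randn(*theta.shape)
    p = np.abs(d0) / (np.abs(grad_fn(theta + d0) - grad_fn(theta)) + 1e-12)
    for _ in range(steps):
        dtheta = probe * np.random.randn(*theta.shape)     # parameter perturbation
        dg = grad_fn(theta + dtheta) - grad_fn(theta)      # induced gradient perturbation
        # keep |p * dg| close to |dtheta| elementwise, smoothed across iterations
        p = beta * p + (1.0 - beta) * np.abs(dtheta) / (np.abs(dg) + 1e-12)
        theta = theta - lr * p * grad_fn(theta)            # preconditioned SGD step
    return theta

# Toy usage: a badly scaled quadratic with grad(theta) = [100*theta_0, theta_1].
grad = lambda th: np.array([100.0 * th[0], th[1]])
theta_opt = psgd_diag(grad, np.array([1.0, 1.0]))          # should approach [0, 0]
```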

    Provably Correct Learning Algorithms in the Presence of Time-Varying Features Using a Variational Perspective

    Full text link
    Features in machine learning problems are often time-varying and may be related to outputs in an algebraic or dynamical manner. The dynamic nature of these machine learning problems renders current higher-order accelerated gradient descent methods unstable or weakens their convergence guarantees. Inspired by methods employed in adaptive control, this paper proposes new algorithms for the case when time-varying features are present, and demonstrates provable performance guarantees. In particular, we develop a unified variational perspective within a continuous-time algorithm. This variational perspective includes higher-order learning concepts and normalization, both of which stem from adaptive control, and allows stability to be established for dynamical machine learning problems where time-varying features are present. These higher-order algorithms are also examined for provably correct learning in adaptive control and identification. Simulations are provided to verify the theoretical results. Comment: 25 pages, additional simulation detail, paper rewritten
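    For context, the normalization concept borrowed from adaptive control can be illustrated by the classical normalized gradient law for a linear regression error with time-varying features φ(t); this is only a textbook example of that idea, not the higher-order variational algorithm developed in the paper.

```latex
% Classical normalized gradient law from adaptive control (illustrative only):
% the prediction error is e(t) = \phi(t)^{\top}\theta(t) - y(t), and the factor
% 1 + \phi(t)^{\top}\phi(t) normalizes the update against large feature excursions.
\dot{\theta}(t) \;=\; -\,\gamma\,\frac{\phi(t)\,\bigl(\phi(t)^{\top}\theta(t) - y(t)\bigr)}{1 + \phi(t)^{\top}\phi(t)}, \qquad \gamma > 0.
```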

    Nonlinear Model Predictive Control of A Gasoline HCCI Engine Using Extreme Learning Machines

    Full text link
    Homogeneous charge compression ignition (HCCI) is a futuristic combustion technology that operates with high fuel efficiency and reduced emissions. HCCI combustion is characterized by complex nonlinear dynamics, which necessitates a model-based control approach for automotive application. HCCI engine control is a nonlinear, multi-input multi-output problem with state and actuator constraints, which makes controller design a challenging task. Typical HCCI controllers make use of a first-principles model, which involves a long development time and the cost associated with expert labor and calibration. In this paper, an alternative approach based on machine learning is presented using extreme learning machines (ELM) and nonlinear model predictive control (MPC). A recurrent ELM is used to learn the nonlinear dynamics of the HCCI engine from experimental data and is shown to accurately predict the engine behavior several steps ahead in time, suitable for predictive control. Using the ELM engine models, an MPC-based control algorithm with a simplified quadratic program update is derived for real-time implementation. The working and effectiveness of the MPC approach have been analyzed on a nonlinear HCCI engine model for tracking multiple reference quantities along with constraints defined by HCCI states, actuators and operational limits. Comment: This paper was written as an extract from my PhD thesis (July 2013), so references may not be up to date as of this submission (Jan 2015). The article is in review and contains 10 figures, 35 references.
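    A minimal sketch of the core ELM training step in NumPy: hidden-layer weights are drawn at random and kept fixed, and only the linear readout is fitted with a single least-squares solve. The paper uses a recurrent ELM to learn HCCI dynamics; the class, sizes, and toy data below are illustrative assumptions.

```python
# Basic (non-recurrent) extreme learning machine: random fixed hidden layer,
# least-squares readout. Illustrates the training idea, not the paper's model.
import numpy as np

class ELM:
    def __init__(self, n_inputs, n_hidden=100, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((n_inputs, n_hidden))    # fixed random weights
        self.b = rng.standard_normal(n_hidden)
        self.beta = None                                      # trained readout

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, Y):
        H = self._hidden(X)
        self.beta = np.linalg.lstsq(H, Y, rcond=None)[0]      # single least-squares solve
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Toy usage: learn a static nonlinear map from two inputs to one output.
X = np.random.randn(500, 2)
Y = np.sin(X[:, :1]) + 0.5 * X[:, 1:]
model = ELM(n_inputs=2).fit(X, Y)
Y_hat = model.predict(X)
```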

    Jointly optimal denoising, dereverberation, and source separation

    Full text link
    This paper proposes methods that can optimize a Convolutional BeamFormer (CBF) for jointly performing denoising, dereverberation, and source separation (DN+DR+SS) in a computationally efficient way. Conventionally, a cascade configuration composed of a Weighted Prediction Error minimization (WPE) dereverberation filter followed by a Minimum Variance Distortionless Response beamformer has been used as the state-of-the-art frontend of far-field speech recognition; however, the overall optimality of this approach is not guaranteed. In the blind signal processing area, an approach for jointly optimizing dereverberation and source separation (DR+SS) has been proposed; however, this approach requires a huge computing cost and has not been extended for application to DN+DR+SS. To overcome the above limitations, this paper develops new approaches for jointly optimizing DN+DR+SS in a computationally much more efficient way. To this end, we first present an objective function to optimize a CBF for performing DN+DR+SS based on maximum likelihood estimation, under the assumption that the steering vectors of the target signals are given or can be estimated, e.g., using a neural network. This paper refers to a CBF optimized by this objective function as a weighted Minimum-Power Distortionless Response (wMPDR) CBF. Then, we derive two algorithms for optimizing a wMPDR CBF based on two different ways of factorizing a CBF into WPE filters and beamformers. Experiments using noisy reverberant sound mixtures show that the proposed optimization approaches greatly improve speech enhancement performance in comparison with the conventional cascade configuration in terms of signal distortion measures and ASR performance. It is also shown that the proposed approaches can greatly reduce the computing cost, with improved estimation accuracy, in comparison with the conventional joint optimization approach. Comment: Submitted to IEEE/ACM Trans. Audio, Speech, and Language Processing on 12 Feb 2020; accepted to IEEE/ACM Trans. Audio, Speech, and Language Processing on 14 July 2020
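    For reference, a common per-frequency-bin form of the weighted MPDR beamformer that gives the wMPDR CBF its name is sketched below, where x_t is the observed multichannel STFT vector, v the target steering vector, and λ_t the time-varying power of the target source. This is only the instantaneous beamformer part; the paper's full convolutional beamformer additionally contains WPE-style prediction filters, which are omitted here.

```latex
% Weighted MPDR beamformer at one frequency bin (convolutional/WPE part omitted)
w \;=\; \frac{R^{-1} v}{v^{\mathsf{H}} R^{-1} v},
\qquad
R \;=\; \frac{1}{T}\sum_{t=1}^{T} \frac{x_t x_t^{\mathsf{H}}}{\lambda_t}.
```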

    Building DNN Acoustic Models for Large Vocabulary Speech Recognition

    Full text link
    Deep neural networks (DNNs) are now a central component of nearly all state-of-the-art speech recognition systems. Building neural network acoustic models requires several design decisions including network architecture, size, and training loss function. This paper offers an empirical investigation of which aspects of DNN acoustic model design are most important for speech recognition system performance. We report DNN classifier performance and final speech recognizer word error rates, and compare DNNs using several metrics to quantify factors influencing differences in task performance. Our first set of experiments uses the standard Switchboard benchmark corpus, which contains approximately 300 hours of conversational telephone speech. We compare standard DNNs to convolutional networks, and present the first experiments using locally-connected, untied neural networks for acoustic modeling. We additionally build systems on a corpus of 2,100 hours of training data by combining the Switchboard and Fisher corpora. This larger corpus allows us to more thoroughly examine the performance of large DNN models -- with up to ten times more parameters than those typically used in speech recognition systems. Our results suggest that a relatively simple DNN architecture and optimization technique produces strong results. These findings, along with previous work, help establish a set of best practices for building DNN hybrid speech recognition systems with maximum likelihood training. Our experiments in DNN optimization additionally serve as a case study for training DNNs with discriminative loss functions for speech tasks, as well as for DNN classifiers more generally.
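    A minimal PyTorch sketch of a hybrid DNN acoustic model of the kind studied above: spliced filterbank frames in, senone (tied HMM state) posteriors out, trained with frame-level cross-entropy against forced-alignment labels. The context window, layer sizes, and senone count are illustrative assumptions, not the paper's exact configurations.

```python
# Hybrid DNN acoustic model sketch: spliced acoustic frames -> senone posteriors.
import torch
import torch.nn as nn

n_senones, n_filters, context = 3000, 40, 11        # assumed illustrative values
dnn = nn.Sequential(
    nn.Linear(n_filters * context, 2048), nn.ReLU(),
    nn.Linear(2048, 2048), nn.ReLU(),
    nn.Linear(2048, 2048), nn.ReLU(),
    nn.Linear(2048, n_senones),                      # logits; softmax applied by the loss
)
frames = torch.randn(32, n_filters * context)        # a minibatch of spliced frames
targets = torch.randint(0, n_senones, (32,))         # forced-alignment senone labels
loss = nn.CrossEntropyLoss()(dnn(frames), targets)   # frame-level cross-entropy training
```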

    Recent Advances in Physical Reservoir Computing: A Review

    Full text link
    Reservoir computing is a computational framework suited for temporal/sequential data processing. It is derived from several recurrent neural network models, including echo state networks and liquid state machines. A reservoir computing system consists of a reservoir for mapping inputs into a high-dimensional space and a readout for pattern analysis from the high-dimensional states in the reservoir. The reservoir is fixed and only the readout is trained with a simple method such as linear regression or classification. Thus, the major advantage of reservoir computing compared to other recurrent neural networks is fast learning, resulting in low training cost. Another advantage is that the reservoir, which requires no adaptive updating, is amenable to hardware implementation using a variety of physical systems, substrates, and devices. In fact, such physical reservoir computing has attracted increasing attention in diverse fields of research. The purpose of this review is to provide an overview of recent advances in physical reservoir computing by classifying them according to the type of reservoir. We discuss the current issues and perspectives related to physical reservoir computing, in order to further expand its practical applications and develop next-generation machine learning systems. Comment: 62 pages, 13 figures
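    A minimal NumPy sketch of the software counterpart of what the review describes, an echo state network: a fixed random reservoir driven by the input, with only a linear ridge-regression readout trained. The reservoir size, spectral radius, and one-step prediction task are illustrative assumptions.

```python
# Echo state network sketch: fixed random reservoir + trained linear readout only.
import numpy as np

rng = np.random.default_rng(0)
n_res, rho, ridge = 300, 0.9, 1e-6
W_in = rng.uniform(-0.5, 0.5, (n_res, 1))            # fixed input weights
W = rng.standard_normal((n_res, n_res))
W *= rho / np.max(np.abs(np.linalg.eigvals(W)))      # scale to spectral radius rho

def run_reservoir(u):                                # u: (T,) scalar input sequence
    x, states = np.zeros(n_res), []
    for u_t in u:
        x = np.tanh(W_in[:, 0] * u_t + W @ x)        # fixed, untrained reservoir update
        states.append(x.copy())
    return np.array(states)                          # (T, n_res) reservoir states

# Train the readout by ridge regression to predict the next input sample
# (washout of the initial transient omitted for brevity).
u = np.sin(0.2 * np.arange(2000))
X, y = run_reservoir(u[:-1]), u[1:]
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
prediction = X @ W_out
```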

    Deep convolutional recurrent autoencoders for learning low-dimensional feature dynamics of fluid systems

    Full text link
    Model reduction of high-dimensional dynamical systems alleviates computational burdens faced in various tasks, from design optimization to model predictive control. One popular model reduction approach is based on projecting the governing equations onto a subspace spanned by basis functions obtained from the compression of a dataset of solution snapshots. However, this method is intrusive, since the projection requires access to the system operators. Further, some systems may require special treatment of nonlinearities to ensure computational efficiency or additional modeling to preserve stability. In this work we propose a deep learning-based strategy for nonlinear model reduction that is inspired by projection-based model reduction, where the idea is to identify some optimal low-dimensional representation and evolve it in time. Our approach constructs a modular model consisting of a deep convolutional autoencoder and a modified LSTM network. The deep convolutional autoencoder returns a low-dimensional representation in terms of coordinates on some expressive nonlinear data-supporting manifold. The dynamics on this manifold are then modeled by the modified LSTM network in a computationally efficient manner. An offline unsupervised training strategy that exploits the model modularity is also developed. We demonstrate our model on three illustrative examples, each highlighting the model's performance in prediction tasks for fluid systems with large parameter variations and its stability in long-term prediction.
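    A compact PyTorch sketch of the modular architecture described above: a convolutional autoencoder compresses flow snapshots to a low-dimensional latent code, and an LSTM advances that code in time before decoding. Snapshot resolution, latent dimension, and layer sizes are illustrative assumptions, not the paper's configuration.

```python
# Convolutional autoencoder + LSTM reduced-order model sketch.
import torch
import torch.nn as nn

latent_dim = 8
encoder = nn.Sequential(                      # 1x64x64 snapshot -> latent code
    nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(32 * 16 * 16, latent_dim),
)
decoder = nn.Sequential(                      # latent code -> reconstructed snapshot
    nn.Linear(latent_dim, 32 * 16 * 16), nn.Unflatten(1, (32, 16, 16)),
    nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
)
dynamics = nn.LSTM(latent_dim, 64, batch_first=True)
to_latent = nn.Linear(64, latent_dim)         # map LSTM outputs back to latent coordinates

snapshots = torch.randn(4, 10, 1, 64, 64)     # (batch, time, channels, H, W)
z = encoder(snapshots.flatten(0, 1)).view(4, 10, latent_dim)   # encode each snapshot
z_next = to_latent(dynamics(z)[0])            # evolve the latent trajectory in time
recon = decoder(z_next.flatten(0, 1))         # decode predicted snapshots
```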