Discovering gated recurrent neural network architectures
Reinforcement learning agent networks with memory are a key component in solving POMDP tasks. Gated recurrent networks, such as those composed of Long Short-Term Memory (LSTM) nodes, have recently been used to improve the state of the art in many supervised sequential processing tasks such as speech recognition and machine translation. However, scaling them to deep memory tasks in the reinforcement learning domain is challenging because of sparse and deceptive reward functions. To address this challenge, a new secondary optimization objective is first introduced that maximizes the information (Info-max) stored in the LSTM network. Results indicate that when combined with neuroevolution, Info-max can discover powerful LSTM-based memory solutions that outperform traditional RNNs. Next, for supervised learning tasks, neuroevolution techniques are employed to design new LSTM architectures. Such architectural variations include discovering new pathways between the recurrent layers as well as designing new gated recurrent nodes. This dissertation proposes evolving a tree-based encoding of the gated memory nodes, and shows that this makes it possible to explore new variations more effectively than other methods. The method discovers nodes with multiple recurrent paths and multiple memory cells, which lead to significant improvements on the standard language modeling benchmark task. The dissertation also shows how the search process can be sped up by training an LSTM network to estimate the performance of candidate structures, and by encouraging exploration of novel solutions. Thus, evolutionary design of complex neural network structures promises to improve the performance of deep learning architectures beyond human ability to do so.
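The tree-based encoding of gated memory nodes can be pictured with a minimal sketch. All names and operators below are hypothetical illustrations, not the dissertation's actual encoding: each candidate node is an expression tree over the input x, previous hidden state h, and memory cell c, and mutation rewrites a random subtree.

```python
import random

# Toy operator set for the expression trees (illustrative, not the real one).
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def make_leaf(name):
    return {"type": "leaf", "name": name}

def make_node(op, left, right):
    return {"type": "op", "op": op, "left": left, "right": right}

def evaluate(tree, env):
    """Compute the node's output for concrete x, h, c values."""
    if tree["type"] == "leaf":
        return env[tree["name"]]
    return OPS[tree["op"]](evaluate(tree["left"], env),
                           evaluate(tree["right"], env))

def mutate(tree):
    """Point mutation: replace a random subtree with a fresh leaf."""
    if tree["type"] == "leaf" or random.random() < 0.3:
        return make_leaf(random.choice(["x", "h", "c"]))
    side = random.choice(["left", "right"])
    new = dict(tree)
    new[side] = mutate(tree[side])
    return new

# A candidate gated update: c_new = c * h + x
tree = make_node("add",
                 make_node("mul", make_leaf("c"), make_leaf("h")),
                 make_leaf("x"))
print(evaluate(tree, {"x": 1.0, "h": 0.5, "c": 2.0}))  # 2.0*0.5 + 1.0 = 2.0
```

Because candidates are trees rather than flat parameter vectors, crossover and mutation can introduce structurally new pathways and extra memory cells, which is the kind of variation the abstract describes.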
Long Term Predictions of Coal Fired Power Plant Data Using Evolved Recurrent Neural Networks
This work presents an investigation into the ability of recurrent neural networks (RNNs) to provide long-term predictions of time series data generated by coal-fired power plants. While there are numerous studies which have used artificial neural networks (ANNs) to predict coal plant parameters, to the authors’ knowledge these have almost entirely been restricted to predicting values at the next time step, and not farther into the future. Using a novel neuro-evolution strategy called Evolutionary eXploration of Augmenting Memory Models (EXAMM), we evolved RNNs with advanced memory cells to predict per-minute plant parameters and per-hour boiler parameters up to 8 hours into the future. These data sets were challenging prediction tasks, as they involve spiking behavior in the parameters being predicted. While the evolved RNNs were able to successfully predict the spikes in the hourly data, they did not perform very well in accurately predicting their severity. The per-minute data proved even more challenging, as medium-range predictions miscalculated the beginning and ending of spikes, and longer-range predictions reverted to long-term trends and ignored the spikes entirely. We hope this initial study will motivate further study into this highly challenging prediction problem. The use of fuel properties data generated by a new Coal Tracker Optimization (CTO) program was also investigated, and this work shows that their use improved the predictive ability of the evolved RNNs.
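Long-horizon forecasting of the kind described above is commonly done by rolling a one-step predictor forward, feeding each prediction back in as input. A minimal, hypothetical sketch (not EXAMM itself, and with a toy predictor in place of an evolved RNN):

```python
def multi_step_forecast(step_fn, history, horizon):
    """Roll a one-step predictor forward `horizon` steps by feeding
    its own predictions back into the input window.
    step_fn: maps the current window to the next value (hypothetical)."""
    window = list(history)
    preds = []
    for _ in range(horizon):
        nxt = step_fn(window)
        preds.append(nxt)
        window = window[1:] + [nxt]  # slide the window forward
    return preds

# Toy predictor: linear extrapolation from the last two points.
step = lambda w: 2 * w[-1] - w[-2]
print(multi_step_forecast(step, [1.0, 2.0, 3.0], 4))  # [4.0, 5.0, 6.0, 7.0]
```

Recursive rollout like this compounds small errors and tends to settle onto smooth trends, which is consistent with the longer-range predictions in the study reverting to long-term trends and missing spikes.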
A Survey on Influence Maximization: From an ML-Based Combinatorial Optimization
Influence Maximization (IM) is a classical combinatorial optimization problem with wide applications in mobile networks, social computing, and recommendation systems. It aims to select a small number of users so as to maximize the influence spread across an online social network. Because of its commercial and academic value, many researchers have studied the IM problem from different perspectives. The main challenge comes from the NP-hardness of the IM problem and the #P-hardness of estimating the influence spread; traditional algorithms for tackling them fall into two classes: heuristic algorithms and approximation algorithms. However, heuristic algorithms carry no theoretical guarantee, and the theoretical design of approximation algorithms is close to its limit, so it is almost impossible to further optimize and improve their performance. With the rapid development of artificial intelligence, techniques based on Machine Learning (ML) have achieved remarkable results in many fields. In view of this, a number of new methods have emerged in recent years that solve combinatorial optimization problems with ML-based techniques. These methods offer fast solving speed and strong generalization to unseen graphs, providing a brand-new direction for solving combinatorial optimization problems. Therefore, we set aside the traditional algorithms based on iterative search and review the recent development of ML-based methods, especially Deep Reinforcement Learning, for solving the IM problem and its variants in social networks. We focus on summarizing the relevant background knowledge, basic principles, common methods, and applied research. Finally, the challenges that urgently need to be addressed in future IM research are pointed out.
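For context, the classical approximation baseline that such surveys contrast ML-based methods with is greedy hill-climbing with Monte Carlo spread estimation under the Independent Cascade model. A compact sketch (the graph representation, propagation probability, and run counts are illustrative choices):

```python
import random

def simulate_ic(graph, seeds, p=0.1, rng=random):
    """One Monte Carlo run of the Independent Cascade model:
    each newly active node activates each inactive neighbor with prob. p."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)  # realized spread of this run

def greedy_im(graph, k, p=0.1, runs=200):
    """Greedy seed selection: repeatedly add the node with the largest
    estimated marginal gain in expected spread (averaged over `runs`)."""
    seeds = set()
    for _ in range(k):
        best, best_gain = None, -1.0
        for v in graph:
            if v in seeds:
                continue
            gain = sum(simulate_ic(graph, seeds | {v}, p)
                       for _ in range(runs)) / runs
            if gain > best_gain:
                best, best_gain = v, gain
        seeds.add(best)
    return seeds
```

The repeated Monte Carlo estimation inside the inner loop is exactly the cost that makes this approach slow on large graphs, motivating the ML-based alternatives the survey reviews.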
Bayesian Neural Architecture Search using A Training-Free Performance Metric
Recurrent neural networks (RNNs) are a powerful approach for time series
prediction. However, their performance is strongly affected by their
architecture and hyperparameter settings. The architecture optimization of RNNs
is a time-consuming task, where the search space is typically a mixture of
real, integer and categorical values. To allow for shrinking and expanding the
size of the network, the representation of architectures often has a variable
length. In this paper, we propose to tackle the architecture optimization
problem with a variant of the Bayesian Optimization (BO) algorithm. To reduce
the evaluation time of candidate architectures the Mean Absolute Error Random
Sampling (MRS), a training-free method to estimate the network performance, is
adopted as the objective function for BO. Also, we propose three fixed-length
encoding schemes to cope with the variable-length architecture representation.
The result is a new perspective on the accurate and efficient design of RNNs, which
we validate on three problems. Our findings show that 1) the BO algorithm can
explore different network architectures using the proposed encoding schemes and
successfully designs well-performing architectures, and 2) the optimization
time is significantly reduced by using MRS, without compromising the
performance as compared to the architectures obtained from the actual training
procedure.
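The core MRS idea, scoring an architecture from randomly initialized, untrained instances, can be sketched as follows. This is a simplification: the published metric analyzes the distribution of the sampled errors, whereas averaging them here is illustrative, and the single random linear layer stands in for a real candidate RNN.

```python
import numpy as np

def mrs_score(build_model, X, y, samples=30, seed=0):
    """Training-free performance estimate in the spirit of MRS:
    instantiate the architecture with random weights several times
    and record the Mean Absolute Error of each untrained instance."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(samples):
        model = build_model(rng)  # fresh random initialization, no training
        errors.append(np.mean(np.abs(model(X) - y)))
    return float(np.mean(errors))  # simplified summary of the error samples

# Toy "architecture": a single random linear layer (hypothetical stand-in).
def build_linear(rng, n_in=3):
    w = rng.normal(size=n_in)
    return lambda X: X @ w

X = np.ones((5, 3))
y = np.zeros(5)
score = mrs_score(lambda rng: build_linear(rng), X, y)
```

Because no gradient training is involved, evaluating one candidate costs only a handful of forward passes, which is what lets the Bayesian Optimization loop above explore many architectures cheaply.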