Neural Networks
We present an overview of current research on artificial neural networks, emphasizing a statistical perspective. We view neural networks as parameterized graphs that make probabilistic assumptions about data, and view learning algorithms as methods for finding parameter values that look probable in the light of the data. We discuss basic issues in representation and learning, and treat some of the practical issues that arise in fitting networks to data. We also discuss links between neural networks and the general formalism of graphical models.
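The statistical view sketched above — learning as finding parameters that make the data probable — can be illustrated with a minimal, self-contained sketch (all names and values are hypothetical, not from the overview): under Gaussian observation noise, maximizing the likelihood of a one-hidden-layer network reduces to minimizing squared error, which plain gradient descent can do.

```python
import numpy as np

# Toy model: y ~ N(f(x; W), sigma^2). Maximizing the data likelihood is
# then equivalent to minimizing squared error via gradient descent.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (200, 1))
y = np.sin(3 * X) + 0.1 * rng.standard_normal((200, 1))

W1 = 0.5 * rng.standard_normal((1, 16)); b1 = np.zeros(16)
W2 = 0.5 * rng.standard_normal((16, 1)); b2 = np.zeros(1)

def forward(X):
    H = np.tanh(X @ W1 + b1)          # hidden layer
    return H, H @ W2 + b2             # network output f(x; W)

lr = 0.05
for _ in range(3000):
    H, pred = forward(X)
    err = pred - y                    # gradient of the NLL w.r.t. pred (up to a constant)
    gW2 = H.T @ err / len(X); gb2 = err.mean(0)
    dH = (err @ W2.T) * (1 - H**2)    # backprop through tanh
    gW1 = X.T @ dH / len(X); gb1 = dH.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

_, pred = forward(X)
mse = float(np.mean((pred - y) ** 2))
```

Predicting the constant mean would leave an MSE of roughly the variance of sin(3x), so a much smaller final MSE indicates the likelihood has genuinely been increased.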
On Monte Carlo tree search and reinforcement learning
Fuelled by successes in Computer Go, Monte Carlo tree search (MCTS) has achieved widespread
adoption within the games community. Its links to traditional reinforcement learning (RL)
methods have been outlined in the past; however, the use of RL techniques within tree search has
not been thoroughly studied yet. In this paper we re-examine in depth this close relation between
the two fields; our goal is to improve the cross-awareness between the two communities. We show
that a straightforward adaptation of RL semantics within tree search can lead to a wealth of new
algorithms, of which traditional MCTS is only one variant. We confirm that planning
methods inspired by RL, in conjunction with online search, demonstrate encouraging results on
several classic board games and in arcade video game competitions, where our algorithm recently
ranked first. Our study promotes a unified view of learning, planning, and search.
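The link between MCTS backups and Monte Carlo RL estimates can be sketched as follows (an illustrative UCT fragment, not the paper's algorithm): each node keeps a running average of rollout returns — exactly the incremental every-visit Monte Carlo update from classical RL — and selection adds an exploration bonus on top of that value estimate.

```python
import math

# Minimal UCT sketch. Node values are running means of rollout returns,
# i.e. the same Monte Carlo value estimate used in classical RL.
class Node:
    def __init__(self):
        self.visits = 0
        self.value = 0.0              # mean return, updated incrementally

def uct_score(parent, child, c=1.4):
    if child.visits == 0:
        return float("inf")          # expand unvisited children first
    explore = c * math.sqrt(math.log(parent.visits) / child.visits)
    return child.value + explore

def backup(path, ret):
    # Monte Carlo backup: shift each estimate toward the observed return.
    for node in path:
        node.visits += 1
        node.value += (ret - node.value) / node.visits
```

Swapping this backup for, say, a TD-style bootstrapped update is precisely the kind of "RL semantics within tree search" variation the abstract alludes to.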
Forecasting Automobile Demand Via Artificial Neural Networks & Neuro-Fuzzy Systems
The objective of this research is to obtain an accurate forecasting model for the demand for automobiles in Iran's domestic market. The model is constructed using production data for vehicles manufactured by Iranian car makers from 2006 to 2016. The increasing demand for transportation and automobiles in Iran necessitates an accurate forecasting model for car manufacturing companies in Iran so that future demand is met. Demand is deduced as a function of the historical data; the monthly gold, rubber, and iron ore prices, along with the monthly commodity metals price index and the stock index of Iran, serve as inputs. Artificial neural networks (ANN) and artificial neuro-fuzzy systems (ANFIS) have been utilized in many fields, such as energy consumption and load forecasting. The performances of the methodologies are investigated towards obtaining the most accurate forecasting model in terms of the forecast Mean Absolute Percentage Error (MAPE). It was concluded that the feedforward multi-layer perceptron network with back-propagation and the Levenberg-Marquardt learning algorithm provides forecasts with the lowest MAPE (5.85%) among the tested models. Further development of the ANN based on more data is recommended to enhance the model and obtain more accurate networks and subsequently improved forecasts.
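For reference, the MAPE criterion used to compare the models is simply the mean absolute error expressed as a percentage of the actual values; a minimal sketch with made-up numbers (not the thesis's data):

```python
# MAPE: mean of |actual - forecast| / |actual|, expressed as a percentage.
def mape(actual, forecast):
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

demand   = [1000, 1200, 900]   # hypothetical monthly demand
forecast = [950, 1260, 880]    # hypothetical model output
# per-month errors: 5%, 5%, ~2.22%  ->  MAPE ~= 4.07%
```

A MAPE of 5.85%, as reported for the best model, therefore means the forecasts deviate from actual demand by under six percent on average.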
Nature-Inspired Topology Optimization of Recurrent Neural Networks
Hand-crafting effective and efficient structures for recurrent neural networks (RNNs) is a difficult, expensive, and time-consuming process. To address this challenge, this work presents three nature-inspired (NI) algorithms for neural architecture search (NAS), introducing the subfield of nature-inspired neural architecture search (NI-NAS). These algorithms, based on ant colony optimization (ACO), progress from memory cell structure optimization, to bounded discrete-space architecture optimization, and finally to unbounded continuous-space architecture optimization. These methods were applied to real-world data sets representing challenging engineering problems, such as data from a coal-fired power plant, wind-turbine power generators, and aircraft flight data recorders (FDRs).
Initial work utilized ACO to select optimal connections inside recurrent long short-term memory (LSTM) cell structures. Viewing each LSTM cell as a graph, ants would choose potential input and output connections based on the pheromones previously laid down over those connections as done in a standard ACO search. However, this approach did not optimize the overall network of the RNN, particularly its synaptic parameters.
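The pheromone-guided connection choice described above can be sketched as follows (connection names and constants are hypothetical, not the thesis's implementation): an ant draws a connection with probability proportional to its pheromone, after which pheromone is evaporated everywhere and reinforced on choices that performed well — the standard ACO dynamics.

```python
import random

# Pheromone-proportional selection of a connection inside an LSTM cell graph.
def choose(connections, pheromone):
    weights = [pheromone[c] for c in connections]
    return random.choices(connections, weights=weights, k=1)[0]

# Standard ACO update: global evaporation, then reinforcement of the choice.
def update(pheromone, chosen, reward, evaporation=0.1):
    for c in pheromone:
        pheromone[c] *= (1.0 - evaporation)
    pheromone[chosen] += reward

# Hypothetical candidate input connections for one gate of an LSTM cell.
pher = {"x->input_gate": 1.0, "h->input_gate": 1.0, "c->input_gate": 1.0}
picked = choose(list(pher), pher)
update(pher, picked, reward=0.5)
```

Over many iterations, connections that repeatedly appear in well-performing cells accumulate pheromone and are selected ever more often.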
I addressed this issue by introducing the Ant-based Neural Topology Search (ANTS) algorithm to directly optimize the entire RNN topology. ANTS utilizes a discrete-space superstructure representing a completely connected RNN where each node is connected to every other node, forming an extremely dense mesh of edges and recurrent edges.
ANTS can select from a library of modern RNN memory cells.
ACO agents (ants), in this thesis, build RNNs from the superstructure guided by pheromones laid down on the superstructure's connections. Backpropagation is then used to train the generated RNNs in an asynchronous parallel computing design to accelerate the optimization process. The pheromone update depends on the evaluation of the tested RNN against a population of best-performing RNNs. Several variations of the core algorithm were investigated to test several designed heuristics for ANTS and evaluate their efficacy in the formation of sparser synaptic connectivity patterns. This was done primarily by formulating different functions that drive the underlying pheromone simulation process, as well as by introducing ant agents with three specialized roles (inspired by real-world ants) to construct the RNN structure. This characterization of the agents enables ants to focus on specific structure-building roles.
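The population-based pheromone update can be sketched like this (an illustrative fragment with hypothetical names, not the thesis's code): a candidate RNN deposits pheromone on the superstructure edges it used only if its fitness earns it a place in a fixed-size population of the best networks seen so far.

```python
import heapq

# Fixed-size "hall of fame" of the best candidate fitnesses. A candidate's
# edges are reinforced only if it enters the population.
class BestPopulation:
    def __init__(self, size):
        self.size = size
        self.heap = []                    # min-heap; heap[0] is the worst kept fitness

    def offer(self, fitness, edges):
        """Return the edges to reinforce, or None if the candidate is rejected."""
        if len(self.heap) < self.size:
            heapq.heappush(self.heap, fitness)
            return edges
        if fitness > self.heap[0]:        # beats the worst network kept so far
            heapq.heapreplace(self.heap, fitness)
            return edges
        return None
```

Gating the deposit this way keeps pheromone trails focused on structures that are competitive with the current best, rather than rewarding every evaluated candidate.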
"Communal intelligence" was also incorporated, where the best set of weights found across locally-trained RNN candidates was shared for weight initialization, reducing the number of backpropagation epochs required to train each candidate RNN and speeding up the overall search process. However, the size of the superstructure grew by an order of magnitude as more inputs and deeper structures were utilized, proving to be one limitation of the proposed procedure.
This limitation of ANTS motivated the development of the continuous ANTS algorithm (CANTS), which works with a continuous search space rather than a fixed network topology. In this process, ants move within a (temporally-arranged) set of continuous, real-valued planes based on the proximity and density of pheromone placements.
The motion of the ants over these continuous planes, in a sense, more closely mimics how actual ants move in the real world. Ants traverse a 3-dimensional space from the inputs to the outputs and across time lags. This continuous search space frees the ant agents from the limitations imposed by ANTS' discrete, massively connected superstructure, making the structural options unbounded when mapping the movements of ants through the 3D continuous space to a neural architecture graph. In addition, CANTS has fewer hyperparameters to tune than ANTS, which had five potential heuristic components, each with its own unique set of hyperparameters, and required the user to define the maximum recurrent depth, the number of layers, and the nodes within each layer. CANTS only requires specifying the number of ants and their pheromone sensing radius.
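The continuous pheromone sensing can be sketched as follows (the radius and jitter values are hypothetical, and the 2D plane stands in for one of CANTS' temporally-arranged planes): an ant moves toward the density-weighted centroid of the pheromone deposits within its sensing radius, with a small random jitter, or explores randomly when nothing is in range.

```python
import math
import random

# One movement step for an ant on a continuous plane.
# deposits: list of (x, y, weight) pheromone placements.
def step(pos, deposits, radius=0.3, jitter=0.05):
    x, y = pos
    near = [(px, py, w) for px, py, w in deposits
            if math.hypot(px - x, py - y) <= radius]
    if not near:
        # nothing sensed: pure random exploration
        return (x + random.uniform(-jitter, jitter),
                y + random.uniform(-jitter, jitter))
    total = sum(w for _, _, w in near)
    cx = sum(px * w for px, _, w in near) / total   # density-weighted centroid
    cy = sum(py * w for _, py, w in near) / total
    return (cx + random.uniform(-jitter, jitter),
            cy + random.uniform(-jitter, jitter))
```

Because the only knobs are the number of ants and this sensing radius, the hyperparameter surface is far smaller than that of the discrete superstructure search.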
The three applied strategies yielded three important successes. Applying ACO to optimizing LSTMs yielded a 1.34% performance enhancement and more than 55% sparser structures (which is useful for speeding up inference). ANTS outperformed the NAS benchmark, NEAT, and the state-of-the-art NAS algorithm, EXAMM. CANTS showed results competitive with EXAMM and with ANTS while offering sparser structures, providing a promising path forward for optimizing (temporal) neural models with nature-inspired metaheuristics based on the metaphor of ants.
Adversarially Tuned Scene Generation
Generalization performance of trained computer vision systems that use computer graphics (CG) generated data is not yet effective due to 'domain shift' between virtual and real data. Although simulated data augmented with a few real-world samples has been shown to mitigate domain shift and improve the transferability of trained models, guiding or bootstrapping the virtual data generation with distributions learnt from the target real-world domain is desired, especially in fields where annotating even a few real images is laborious (such as semantic labeling and intrinsic images). To address this problem in an unsupervised manner, our work combines recent advances in CG (which aims to generate stochastic scene layouts coupled with large collections of 3D object models) and generative adversarial training (which aims to train generative models by measuring the discrepancy between generated and real data in terms of their separability in the space of a deep, discriminatively-trained classifier). Our method uses iterative estimation of the posterior density of the prior distributions for a generative graphical model, within a rejection sampling framework. Initially, we assume uniform distributions as priors on the parameters of a scene described by the generative graphical model. As iterations proceed, the prior distributions are updated towards distributions that are closer to the (unknown) distributions of the target data. We demonstrate the utility of adversarially tuned scene generation on two real-world benchmark datasets (CityScapes and CamVid) for traffic scene semantic labeling with a deep convolutional net (DeepLab). We realized performance improvements of 2.28 and 3.14 points (using the IoU metric) between the DeepLab models trained on simulated sets prepared from the scene generation models before and after tuning to CityScapes and CamVid respectively. Comment: 9 pages, accepted at CVPR 201
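The iterative prior-tuning loop can be sketched as follows (a toy one-parameter version: `render` and `discriminator_score` are stand-ins for the CG pipeline and the trained deep classifier, and all distributions and constants are made up, not the paper's): sample scene parameters from the current prior, keep the samples the discriminator accepts as realistic, and refit the prior to the accepted set.

```python
import math
import random
import statistics

random.seed(0)

def render(theta):
    return theta                               # stand-in for the CG renderer

def discriminator_score(image, target=2.0):
    # Toy discriminator: score near 1 when the rendered sample resembles
    # "real" data concentrated around `target`, near 0 otherwise.
    return math.exp(-(image - target) ** 2)

mu, sigma = 0.0, 3.0                           # broad initial prior over the scene parameter
for _ in range(20):
    samples = [random.gauss(mu, sigma) for _ in range(500)]
    accepted = [s for s in samples             # rejection step driven by the discriminator
                if random.random() < discriminator_score(render(s))]
    if len(accepted) >= 2:
        mu = statistics.mean(accepted)         # refit the prior to accepted samples
        sigma = max(statistics.stdev(accepted), 0.2)
```

Each pass tightens the prior around parameter values whose renderings the discriminator cannot distinguish from real data, mirroring the paper's posterior-density estimation by rejection sampling.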
Computer simulations, machine learning and the Laplacean demon: Opacity in the case of high energy physics
In this paper, we pursue three general aims: (I) We will define a notion of fundamental opacity and ask whether it can be found in High Energy Physics (HEP), given the involvement of machine learning (ML) and computer simulations (CS) therein. (II) We identify two kinds of non-fundamental, contingent opacity associated with CS and ML in HEP respectively, and ask whether, and if so how, they may be overcome. (III) We raise the question of whether any kind of opacity, contingent or fundamental, is unique to ML or CS, or whether they stand in continuity to kinds of opacity associated with other scientific research
Deep learning via message passing algorithms based on belief propagation
Message-passing algorithms based on the belief propagation (BP) equations constitute a
well-known distributed computational scheme. They yield exact marginals on tree-like graphical
models and have also proven to be effective in many problems defined on loopy graphs, from
inference to optimization, from signal processing to clustering. The BP-based schemes are
fundamentally different from stochastic gradient descent (SGD), on which the current success of
deep networks is based. In this paper, we present a family of BP-based message-passing algorithms with a reinforcement term that biases distributions towards locally entropic solutions, and adapt them to mini-batch training on GPUs. These algorithms are capable of training multi-layer neural networks with performance comparable to SGD heuristics in a diverse set of experiments on natural datasets, including multi-class image classification and continual learning, while yielding improved performance on sparse networks. Furthermore, they allow for approximate Bayesian predictions that have higher accuracy than point-wise ones.
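The exactness of BP marginals on trees, which the abstract refers to, can be checked on a minimal example (a 3-node binary chain with made-up pairwise potentials, unrelated to the paper's deep-learning setting): the leaf-to-root messages yield a belief at the middle node that matches the brute-force marginal exactly.

```python
import itertools

# Chain x1 - x2 - x3 of binary variables with the same pairwise potential
# psi on both edges (indexed as psi[neighbor][center]).
psi = [[1.0, 0.5], [0.3, 2.0]]

# BP on a tree: leaves x1 and x3 send messages to x2, then the belief at x2
# is the normalized product of incoming messages.
m12 = [sum(psi[a][b] for a in range(2)) for b in range(2)]
m32 = [sum(psi[c][b] for c in range(2)) for b in range(2)]
belief = [m12[b] * m32[b] for b in range(2)]
Z = sum(belief)
bp_marginal = [v / Z for v in belief]

# Brute force over all 8 joint states for comparison.
joint = {0: 0.0, 1: 0.0}
for a, b, c in itertools.product(range(2), repeat=3):
    joint[b] += psi[a][b] * psi[c][b]
Zj = sum(joint.values())
exact = [joint[b] / Zj for b in range(2)]
```

On loopy graphs the same message equations no longer guarantee exact marginals, which is why the BP-based training schemes in the paper are approximate — yet, as the experiments show, still effective.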
Trajectory prediction of moving objects by means of neural networks
Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 1997. Includes bibliographical references (leaves: 103-105). Text in English; abstract in Turkish and English. viii, 105 leaves.
Estimating the three-dimensional motion of an object from a sequence of object positions and orientations is of significant importance in a variety of applications in control and robotics. For instance, autonomous navigation, manipulation, servoing, tracking, planning, and surveillance need prediction of motion parameters. Although "motion estimation" is an old problem (the formulations date back to the beginning of the century), only recently have scientists been provided with the tools from nonlinear system estimation theory to solve this problem. Neural networks are among the methods that have recently been used in many nonlinear dynamic system parameter estimation contexts. The approximating ability of the neural network is used to identify the relation between system variables and parameters of a dynamic system. The position, velocity, and acceleration of the object are estimated by several neural networks using the 11 most recent measurements of the object coordinates as input to the system. Several neural network topologies with different configurations are introduced and utilized in the solution of the problem. Training schemes for each configuration are given in detail. Simulation results for prediction of motion having different characteristics via different architectures with alternative configurations are presented comparatively.
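The windowing scheme described — feeding the most recent position measurements to a predictor of the next position — can be sketched as follows (the window length, trajectory, and the linear least-squares stand-in for the thesis's neural networks are all illustrative). For uniformly sampled constant-acceleration motion the next position is an exact linear function of the previous ones, so even the linear stand-in predicts it essentially perfectly.

```python
import numpy as np

# Hypothetical 1-D constant-acceleration trajectory, uniformly sampled.
k = 4                                     # window of most recent measurements
t = np.arange(0.0, 10.0, 0.1)
pos = 2.0 + 3.0 * t + 0.5 * t**2

# Sliding windows: k past positions in, the next position out.
X = np.array([pos[i:i + k] for i in range(len(pos) - k)])
y = pos[k:]

# Fit the predictor (a stand-in for the thesis's neural networks).
A = np.c_[X, np.ones(len(X))]
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef
max_err = float(np.max(np.abs(pred - y)))
```

Replacing the least-squares fit with a trained network recovers the thesis's setup, where the same window of coordinates drives estimates of position, velocity, and acceleration.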