Neural Networks
We present an overview of current research on artificial neural networks, emphasizing a statistical perspective. We view neural networks as parameterized graphs that make probabilistic assumptions about data, and view learning algorithms as methods for finding parameter values that look probable in the light of the data. We discuss basic issues in representation and learning, and treat some of the practical issues that arise in fitting networks to data. We also discuss links between neural networks and the general formalism of graphical models.
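The statistical view sketched above — learning as finding parameters that make the data probable — can be illustrated with a minimal, self-contained sketch (all names and values are hypothetical, not from the overview): under Gaussian observation noise, maximizing the likelihood of a one-hidden-layer network reduces to minimizing squared error, which plain gradient descent can do.

```python
import numpy as np

# Toy model: y ~ N(f(x; W), sigma^2). Maximizing the data likelihood is
# then equivalent to minimizing squared error via gradient descent.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (200, 1))
y = np.sin(3 * X) + 0.1 * rng.standard_normal((200, 1))

W1 = 0.5 * rng.standard_normal((1, 16)); b1 = np.zeros(16)
W2 = 0.5 * rng.standard_normal((16, 1)); b2 = np.zeros(1)

def forward(X):
    H = np.tanh(X @ W1 + b1)          # hidden layer
    return H, H @ W2 + b2             # network output f(x; W)

lr = 0.05
for _ in range(3000):
    H, pred = forward(X)
    err = pred - y                    # gradient of the NLL w.r.t. pred (up to a constant)
    gW2 = H.T @ err / len(X); gb2 = err.mean(0)
    dH = (err @ W2.T) * (1 - H**2)    # backprop through tanh
    gW1 = X.T @ dH / len(X); gb1 = dH.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

_, pred = forward(X)
mse = float(np.mean((pred - y) ** 2))
```

Predicting the constant mean would leave an MSE of roughly the variance of sin(3x), so a much smaller final MSE indicates the likelihood has genuinely been increased.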
On Monte Carlo tree search and reinforcement learning
Fuelled by successes in Computer Go, Monte Carlo tree search (MCTS) has achieved widespread
adoption within the games community. Its links to traditional reinforcement learning (RL)
methods have been outlined in the past; however, the use of RL techniques within tree search has
not been thoroughly studied yet. In this paper we re-examine in depth this close relation between
the two fields; our goal is to improve the cross-awareness between the two communities. We show
that a straightforward adaptation of RL semantics within tree search can lead to a wealth of new
algorithms, of which traditional MCTS is only one variant. We confirm that planning
methods inspired by RL, in conjunction with online search, demonstrate encouraging results on
several classic board games and in arcade video game competitions, where our algorithm recently
ranked first. Our study promotes a unified view of learning, planning, and search.
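The link between MCTS backups and Monte Carlo RL estimates can be sketched as follows (an illustrative UCT fragment, not the paper's algorithm): each node keeps a running average of rollout returns — exactly the incremental every-visit Monte Carlo update from classical RL — and selection adds an exploration bonus on top of that value estimate.

```python
import math

# Minimal UCT sketch. Node values are running means of rollout returns,
# i.e. the same Monte Carlo value estimate used in classical RL.
class Node:
    def __init__(self):
        self.visits = 0
        self.value = 0.0              # mean return, updated incrementally

def uct_score(parent, child, c=1.4):
    if child.visits == 0:
        return float("inf")          # expand unvisited children first
    explore = c * math.sqrt(math.log(parent.visits) / child.visits)
    return child.value + explore

def backup(path, ret):
    # Monte Carlo backup: shift each estimate toward the observed return.
    for node in path:
        node.visits += 1
        node.value += (ret - node.value) / node.visits
```

Swapping this backup for, say, a TD-style bootstrapped update is precisely the kind of "RL semantics within tree search" variation the abstract alludes to.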
Forecasting Automobile Demand Via Artificial Neural Networks & Neuro-Fuzzy Systems
The objective of this research is to obtain an accurate forecasting model for the demand for automobiles in Iran's domestic market. The model is constructed using production data for vehicles manufactured by Iranian car makers from 2006 to 2016. The increasing demand for transportation and automobiles in Iran necessitates an accurate forecasting model for car manufacturing companies in Iran so that future demand is met. Demand is deduced as a function of the historical data; the monthly gold, rubber, and iron ore prices, along with the monthly commodity metals price index and the stock index of Iran, serve as inputs. Artificial neural networks (ANN) and artificial neuro-fuzzy systems (ANFIS) have been utilized in many fields, such as energy consumption and load forecasting. The performances of the methodologies are investigated towards obtaining the most accurate forecasting model in terms of the forecast Mean Absolute Percentage Error (MAPE). It was concluded that the feedforward multi-layer perceptron network with back-propagation and the Levenberg-Marquardt learning algorithm provides forecasts with the lowest MAPE (5.85%) among the tested models. Further development of the ANN based on more data is recommended to enhance the model and obtain more accurate networks and subsequently improved forecasts.
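For reference, the MAPE criterion used to compare the models is simply the mean absolute error expressed as a percentage of the actual values; a minimal sketch with made-up numbers (not the thesis's data):

```python
# MAPE: mean of |actual - forecast| / |actual|, expressed as a percentage.
def mape(actual, forecast):
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

demand   = [1000, 1200, 900]   # hypothetical monthly demand
forecast = [950, 1260, 880]    # hypothetical model output
# per-month errors: 5%, 5%, ~2.22%  ->  MAPE ~= 4.07%
```

A MAPE of 5.85%, as reported for the best model, therefore means the forecasts deviate from actual demand by under six percent on average.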
Nature-Inspired Topology Optimization of Recurrent Neural Networks
Hand-crafting effective and efficient structures for recurrent neural networks (RNNs) is a difficult, expensive, and time-consuming process. To address this challenge, this work presents three nature-inspired (NI) algorithms for neural architecture search (NAS), introducing the subfield of nature-inspired neural architecture search (NI-NAS). These algorithms, based on ant colony optimization (ACO), progress from memory cell structure optimization, to bounded discrete-space architecture optimization, and finally to unbounded continuous-space architecture optimization. These methods were applied to real-world data sets representing challenging engineering problems, such as data from a coal-fired power plant, wind-turbine power generators, and aircraft flight data recorders (FDRs).
Initial work utilized ACO to select optimal connections inside recurrent long short-term memory (LSTM) cell structures. Viewing each LSTM cell as a graph, ants would choose potential input and output connections based on the pheromones previously laid down over those connections as done in a standard ACO search. However, this approach did not optimize the overall network of the RNN, particularly its synaptic parameters.
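The pheromone-guided connection choice described above can be sketched as follows (connection names and constants are hypothetical, not the thesis's implementation): an ant draws a connection with probability proportional to its pheromone, after which pheromone is evaporated everywhere and reinforced on choices that performed well — the standard ACO dynamics.

```python
import random

# Pheromone-proportional selection of a connection inside an LSTM cell graph.
def choose(connections, pheromone):
    weights = [pheromone[c] for c in connections]
    return random.choices(connections, weights=weights, k=1)[0]

# Standard ACO update: global evaporation, then reinforcement of the choice.
def update(pheromone, chosen, reward, evaporation=0.1):
    for c in pheromone:
        pheromone[c] *= (1.0 - evaporation)
    pheromone[chosen] += reward

# Hypothetical candidate input connections for one gate of an LSTM cell.
pher = {"x->input_gate": 1.0, "h->input_gate": 1.0, "c->input_gate": 1.0}
picked = choose(list(pher), pher)
update(pher, picked, reward=0.5)
```

Over many iterations, connections that repeatedly appear in well-performing cells accumulate pheromone and are selected ever more often.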
I addressed this issue by introducing the Ant-based Neural Topology Search (ANTS) algorithm to directly optimize the entire RNN topology. ANTS utilizes a discrete-space superstructure representing a completely connected RNN where each node is connected to every other node, forming an extremely dense mesh of edges and recurrent edges.
ANTS can select from a library of modern RNN memory cells.
ACO agents (ants), in this thesis, build RNNs from the superstructure guided by pheromones laid down on the superstructure's connections. Backpropagation is then used to train the generated RNNs in an asynchronous parallel computing design to accelerate the optimization process. The pheromone update depends on the evaluation of the tested RNN against a population of best-performing RNNs. Several variations of the core algorithm were investigated to test several designed heuristics for ANTS and evaluate their efficacy in the formation of sparser synaptic connectivity patterns. This was done primarily by formulating different functions that drive the underlying pheromone simulation process, as well as by introducing ant agents with three specialized roles (inspired by real-world ants) to construct the RNN structure. This characterization of the agents enables ants to focus on specific structure-building roles.
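The population-based pheromone update can be sketched like this (an illustrative fragment with hypothetical names, not the thesis's code): a candidate RNN deposits pheromone on the superstructure edges it used only if its fitness earns it a place in a fixed-size population of the best networks seen so far.

```python
import heapq

# Fixed-size "hall of fame" of the best candidate fitnesses. A candidate's
# edges are reinforced only if it enters the population.
class BestPopulation:
    def __init__(self, size):
        self.size = size
        self.heap = []                    # min-heap; heap[0] is the worst kept fitness

    def offer(self, fitness, edges):
        """Return the edges to reinforce, or None if the candidate is rejected."""
        if len(self.heap) < self.size:
            heapq.heappush(self.heap, fitness)
            return edges
        if fitness > self.heap[0]:        # beats the worst network kept so far
            heapq.heapreplace(self.heap, fitness)
            return edges
        return None
```

Gating the deposit this way keeps pheromone trails focused on structures that are competitive with the current best, rather than rewarding every evaluated candidate.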
"Communal intelligence" was also incorporated, where the best set of weights found across locally-trained RNN candidates was shared for weight initialization, reducing the number of backpropagation epochs required to train each candidate RNN and speeding up the overall search process. However, the size of the superstructure grew by an order of magnitude as more inputs and deeper structures were utilized, proving to be one limitation of the proposed procedure.
This limitation of ANTS motivated the development of the continuous ANTS algorithm (CANTS), which works with a continuous search space rather than a fixed network topology. In this process, ants move within a (temporally-arranged) set of continuous, real-valued planes based on the proximity and density of pheromone placements.
The motion of the ants over these continuous planes, in a sense, more closely mimics how actual ants move in the real world. Ants traverse a 3-dimensional space from the inputs to the outputs and across time lags. This continuous search space frees the ant agents from the limitations imposed by ANTS' discrete, massively connected superstructure, making the structural options unbounded when mapping the movements of ants through the 3D continuous space to a neural architecture graph. In addition, CANTS has fewer hyperparameters to tune than ANTS, which had five potential heuristic components, each with its own unique set of hyperparameters, and required the user to define the maximum recurrent depth, the number of layers, and the nodes within each layer. CANTS only requires specifying the number of ants and their pheromone sensing radius.
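The continuous pheromone sensing can be sketched as follows (the radius and jitter values are hypothetical, and the 2D plane stands in for one of CANTS' temporally-arranged planes): an ant moves toward the density-weighted centroid of the pheromone deposits within its sensing radius, with a small random jitter, or explores randomly when nothing is in range.

```python
import math
import random

# One movement step for an ant on a continuous plane.
# deposits: list of (x, y, weight) pheromone placements.
def step(pos, deposits, radius=0.3, jitter=0.05):
    x, y = pos
    near = [(px, py, w) for px, py, w in deposits
            if math.hypot(px - x, py - y) <= radius]
    if not near:
        # nothing sensed: pure random exploration
        return (x + random.uniform(-jitter, jitter),
                y + random.uniform(-jitter, jitter))
    total = sum(w for _, _, w in near)
    cx = sum(px * w for px, _, w in near) / total   # density-weighted centroid
    cy = sum(py * w for _, py, w in near) / total
    return (cx + random.uniform(-jitter, jitter),
            cy + random.uniform(-jitter, jitter))
```

Because the only knobs are the number of ants and this sensing radius, the hyperparameter surface is far smaller than that of the discrete superstructure search.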
The three applied strategies yielded three important successes. Applying ACO to optimizing LSTMs yielded a 1.34% performance enhancement and more than 55% sparser structures (which is useful for speeding up inference). ANTS outperformed the NAS benchmark, NEAT, and the state-of-the-art NAS algorithm, EXAMM. CANTS showed results competitive with EXAMM and with ANTS while offering sparser structures, providing a promising path forward for optimizing (temporal) neural models with nature-inspired metaheuristics based on the metaphor of ants.
Adversarially Tuned Scene Generation
Generalization performance of trained computer vision systems that use computer graphics (CG) generated data is not yet effective due to 'domain shift' between virtual and real data. Although simulated data augmented with a few real-world samples has been shown to mitigate domain shift and improve the transferability of trained models, guiding or bootstrapping the virtual data generation with distributions learnt from the target real-world domain is desired, especially in fields where annotating even a few real images is laborious (such as semantic labeling and intrinsic images). To address this problem in an unsupervised manner, our work combines recent advances in CG (which aims to generate stochastic scene layouts coupled with large collections of 3D object models) and generative adversarial training (which aims to train generative models by measuring the discrepancy between generated and real data in terms of their separability in the space of a deep, discriminatively-trained classifier). Our method uses iterative estimation of the posterior density of the prior distributions for a generative graphical model, within a rejection sampling framework. Initially, we assume uniform distributions as priors on the parameters of a scene described by the generative graphical model. As iterations proceed, the prior distributions are updated towards distributions that are closer to the (unknown) distributions of the target data. We demonstrate the utility of adversarially tuned scene generation on two real-world benchmark datasets (CityScapes and CamVid) for traffic scene semantic labeling with a deep convolutional net (DeepLab). We realized performance improvements of 2.28 and 3.14 points (using the IoU metric) between the DeepLab models trained on simulated sets prepared from the scene generation models before and after tuning to CityScapes and CamVid respectively. Comment: 9 pages, accepted at CVPR 201
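The iterative prior-tuning loop can be sketched as follows (a toy one-parameter version: `render` and `discriminator_score` are stand-ins for the CG pipeline and the trained deep classifier, and all distributions and constants are made up, not the paper's): sample scene parameters from the current prior, keep the samples the discriminator accepts as realistic, and refit the prior to the accepted set.

```python
import math
import random
import statistics

random.seed(0)

def render(theta):
    return theta                               # stand-in for the CG renderer

def discriminator_score(image, target=2.0):
    # Toy discriminator: score near 1 when the rendered sample resembles
    # "real" data concentrated around `target`, near 0 otherwise.
    return math.exp(-(image - target) ** 2)

mu, sigma = 0.0, 3.0                           # broad initial prior over the scene parameter
for _ in range(20):
    samples = [random.gauss(mu, sigma) for _ in range(500)]
    accepted = [s for s in samples             # rejection step driven by the discriminator
                if random.random() < discriminator_score(render(s))]
    if len(accepted) >= 2:
        mu = statistics.mean(accepted)         # refit the prior to accepted samples
        sigma = max(statistics.stdev(accepted), 0.2)
```

Each pass tightens the prior around parameter values whose renderings the discriminator cannot distinguish from real data, mirroring the paper's posterior-density estimation by rejection sampling.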
Computer simulations, machine learning and the Laplacean demon: Opacity in the case of high energy physics
In this paper, we pursue three general aims: (I) We will define a notion of fundamental opacity and ask whether it can be found in High Energy Physics (HEP), given the involvement of machine learning (ML) and computer simulations (CS) therein. (II) We identify two kinds of non-fundamental, contingent opacity associated with CS and ML in HEP respectively, and ask whether, and if so how, they may be overcome. (III) We raise the question of whether any kind of opacity, contingent or fundamental, is unique to ML or CS, or whether they stand in continuity to kinds of opacity associated with other scientific research
Deep learning via message passing algorithms based on belief propagation
Message-passing algorithms based on the belief propagation (BP) equations constitute a
well-known distributed computational scheme. They yield exact marginals on tree-like graphical
models and have also proven to be effective in many problems defined on loopy graphs, from
inference to optimization, from signal processing to clustering. The BP-based schemes are
fundamentally different from stochastic gradient descent (SGD), on which the current success of
deep networks is based. In this paper, we present a family of BP-based message-passing algorithms with a reinforcement term that biases distributions towards locally entropic solutions, and adapt them to mini-batch training on GPUs. These algorithms are capable of training multi-layer neural networks with performance comparable to SGD heuristics in a diverse set of experiments on natural datasets, including multi-class image classification and continual learning, while yielding improved performance on sparse networks. Furthermore, they allow for approximate Bayesian predictions that have higher accuracy than point-wise ones.
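The exactness of BP marginals on trees, which the abstract refers to, can be checked on a minimal example (a 3-node binary chain with made-up pairwise potentials, unrelated to the paper's deep-learning setting): the leaf-to-root messages yield a belief at the middle node that matches the brute-force marginal exactly.

```python
import itertools

# Chain x1 - x2 - x3 of binary variables with the same pairwise potential
# psi on both edges (indexed as psi[neighbor][center]).
psi = [[1.0, 0.5], [0.3, 2.0]]

# BP on a tree: leaves x1 and x3 send messages to x2, then the belief at x2
# is the normalized product of incoming messages.
m12 = [sum(psi[a][b] for a in range(2)) for b in range(2)]
m32 = [sum(psi[c][b] for c in range(2)) for b in range(2)]
belief = [m12[b] * m32[b] for b in range(2)]
Z = sum(belief)
bp_marginal = [v / Z for v in belief]

# Brute force over all 8 joint states for comparison.
joint = {0: 0.0, 1: 0.0}
for a, b, c in itertools.product(range(2), repeat=3):
    joint[b] += psi[a][b] * psi[c][b]
Zj = sum(joint.values())
exact = [joint[b] / Zj for b in range(2)]
```

On loopy graphs the same message equations no longer guarantee exact marginals, which is why the BP-based training schemes in the paper are approximate — yet, as the experiments show, still effective.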
Trajectory prediction of moving objects by means of neural networks
Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 1997. Includes bibliographical references (leaves: 103-105). Text in English; abstract in Turkish and English. viii, 105 leaves.
Estimating the three-dimensional motion of an object from a sequence of object positions and orientations is of significant importance in a variety of applications in control and robotics. For instance, autonomous navigation, manipulation, servoing, tracking, planning, and surveillance need prediction of motion parameters. Although "motion estimation" is an old problem (the formulations date back to the beginning of the century), only recently have scientists been provided with the tools from nonlinear system estimation theory to solve this problem. Neural networks are among the methods that have recently been used in many nonlinear dynamic system parameter estimation contexts. The approximating ability of the neural network is used to identify the relation between system variables and parameters of a dynamic system. The position, velocity, and acceleration of the object are estimated by several neural networks using the 11 most recent measurements of the object coordinates as input to the system. Several neural network topologies with different configurations are introduced and utilized in the solution of the problem. Training schemes for each configuration are given in detail. Simulation results for prediction of motion having different characteristics via different architectures with alternative configurations are presented comparatively.
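The windowing scheme described — feeding the most recent position measurements to a predictor of the next position — can be sketched as follows (the window length, trajectory, and the linear least-squares stand-in for the thesis's neural networks are all illustrative). For uniformly sampled constant-acceleration motion the next position is an exact linear function of the previous ones, so even the linear stand-in predicts it essentially perfectly.

```python
import numpy as np

# Hypothetical 1-D constant-acceleration trajectory, uniformly sampled.
k = 4                                     # window of most recent measurements
t = np.arange(0.0, 10.0, 0.1)
pos = 2.0 + 3.0 * t + 0.5 * t**2

# Sliding windows: k past positions in, the next position out.
X = np.array([pos[i:i + k] for i in range(len(pos) - k)])
y = pos[k:]

# Fit the predictor (a stand-in for the thesis's neural networks).
A = np.c_[X, np.ones(len(X))]
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef
max_err = float(np.max(np.abs(pred - y)))
```

Replacing the least-squares fit with a trained network recovers the thesis's setup, where the same window of coordinates drives estimates of position, velocity, and acceleration.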