Learning Causal State Representations of Partially Observable Environments
Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose mechanisms to approximate causal states, which optimally compress the joint history of actions and observations in partially observable Markov decision processes. Our proposed algorithm extracts causal state representations from RNNs that are trained to predict subsequent observations given the history. We demonstrate that these learned task-agnostic state abstractions can be used to efficiently learn policies for reinforcement learning problems with rich observation spaces. We evaluate agents using multiple partially observable navigation tasks with both discrete (GridWorld) and continuous (VizDoom, ALE) observation processes that cannot be solved by traditional memory-limited methods. Our experiments demonstrate systematic improvement of the DQN and tabular models using approximate causal state representations with respect to recurrent-DQN baselines trained with raw inputs.
Learning Queuing Networks by Recurrent Neural Networks
It is well known that building analytical performance models in practice is
difficult because it requires a considerable degree of proficiency in the
underlying mathematics. In this paper, we propose a machine-learning approach
to derive performance models from data. We focus on queuing networks, and
crucially exploit a deterministic approximation of their average dynamics in
terms of a compact system of ordinary differential equations. We encode these
equations into a recurrent neural network whose weights can be directly related
to model parameters. This allows for an interpretable structure of the neural
network, which can be trained from system measurements to yield a white-box
parameterized model that can be used for prediction purposes such as what-if
analyses and capacity planning. Using synthetic models as well as a real case
study of a load-balancing system, we show the effectiveness of our technique in
yielding models with high predictive power.
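The abstract above describes approximating the average dynamics of a queuing network with a system of ordinary differential equations and unrolling them as a recurrent update. The following is a minimal sketch of that general idea, not the paper's own model: it assumes a standard fluid approximation in which station i holds x_i jobs, serves at rate rates[i] * min(x_i, servers[i]), and forwards served jobs according to a routing matrix. The names `fluid_step`, `simulate`, `arrivals`, `rates`, `servers`, and `routing` are all hypothetical.

```python
def fluid_step(x, arrivals, rates, servers, routing, dt):
    """One forward-Euler step of an assumed fluid queuing-network ODE.

    x[i]          -- average number of jobs at station i
    arrivals[i]   -- external arrival rate into station i
    rates[i]      -- service rate of a single server at station i
    servers[i]    -- number of servers at station i
    routing[j][i] -- fraction of jobs leaving station j that go to station i
    """
    n = len(x)
    # Throughput of each station: per-server rate times the number of busy servers.
    served = [rates[i] * min(x[i], servers[i]) for i in range(n)]
    new_x = []
    for i in range(n):
        # Inflow is external arrivals plus jobs routed from every station.
        inflow = arrivals[i] + sum(routing[j][i] * served[j] for j in range(n))
        # Clamp at zero so discretization error cannot produce negative queues.
        new_x.append(max(0.0, x[i] + dt * (inflow - served[i])))
    return new_x


def simulate(x0, arrivals, rates, servers, routing, dt=0.01, steps=1000):
    """Unroll the Euler step like a recurrent cell and return the trajectory."""
    x = list(x0)
    trajectory = [x]
    for _ in range(steps):
        x = fluid_step(x, arrivals, rates, servers, routing, dt)
        trajectory.append(x)
    return trajectory
```

In this form the state update plays the role of a recurrent cell whose parameters are the model quantities themselves (`rates`, `routing`), which is what would make a fitted model readable as a queuing network. Fitting those parameters to measurements is left out of the sketch.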
Robust Controller for Delays and Packet Dropout Avoidance in Solar-Power Wireless Network
Solar Wireless Networked Control Systems (SWNCS) are a class of distributed control systems in which sensors, actuators, and controllers are interconnected via a wireless communication network. This setup offers low cost, flexibility, low weight, no wiring, and simple system diagnosis and maintenance. However, it also unavoidably introduces wireless network time delays and packet dropouts into the design procedure. A solar lighting system is environmentally clean and can therefore operate for a long period. SWNCS also offers a multi-service infrastructure solution for both developed and undeveloped countries. The system provides wirelessly controlled lighting, a wireless communications network (Wi-Fi/WiMAX), CCTV surveillance, and wireless sensors for weather measurement, all powered by solar energy.
Composing Deep Learning and Bayesian Nonparametric Methods
Recent progress in Bayesian methods largely focuses on non-conjugate models that make extensive use of black-box functions: continuous functions implemented with neural networks. Using deep neural networks, Bayesian models can reasonably fit big data while at the same time capturing model uncertainty. This thesis targets a more challenging problem: how do we model general random objects, including discrete ones, using random functions? Our conclusion is that many (discrete) random objects are by nature a composition of Poisson processes and random functions. Thus, all discreteness is handled through the Poisson process, while random functions capture the remaining complexity of the object. Hence the title: composing deep learning and Bayesian nonparametric methods.
This conclusion is not a conjecture. In special cases such as latent feature models, we can prove this claim by working on infinite-dimensional spaces, and that is where Bayesian nonparametrics comes in. Moreover, we impose some regularity assumptions on random objects, such as exchangeability. The representations then follow from representation theorems. We will see this twice in this thesis.
One may ask: when a random object is too simple, such as a non-negative random vector in the case of latent feature models, how can we exploit exchangeability? The answer is to aggregate infinitely many random objects, map them together onto an infinite-dimensional space, and then assume exchangeability on that space. We demonstrate two examples of latent feature models by (1) concatenating them as an infinite sequence (Sections 2 and 3) and (2) stacking them as a 2D array (Section 4).
Besides, we will see that Bayesian nonparametric methods are useful to model discrete patterns in time series data. We will showcase two examples: (1) using variance Gamma processes to model change points (Section 5), and (2) using Chinese restaurant processes to model speech with switching speakers (Section 6).
We are also aware that the inference problem can be non-trivial in popular Bayesian nonparametric models. In Section 7, we present a novel solution for online inference in the popular HDP-HMM model.
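The abstract above mentions Chinese restaurant processes as a way to model discrete patterns such as switching speakers. As a minimal illustration of the standard Chinese restaurant process prior only, and not of the thesis's speech model, the sketch below draws sequential table assignments: customer n + 1 joins an existing table k with probability counts[k] / (n + alpha) and opens a new table with probability alpha / (n + alpha). The function name `sample_crp` and its parameters are hypothetical.

```python
import random


def sample_crp(n_customers, alpha, seed=None):
    """Draw table assignments from a Chinese restaurant process.

    alpha must be positive; larger values tend to produce more tables.
    Returns (assignments, counts), where assignments[n] is the table index
    of customer n and counts[k] is the number of customers at table k.
    """
    rng = random.Random(seed)
    assignments = []
    counts = []
    for n in range(n_customers):
        # Total unnormalized mass is n (seated customers) + alpha (new table).
        r = rng.random() * (n + alpha)
        cumulative = 0.0
        table = len(counts)  # default: open a new table
        for k, c in enumerate(counts):
            cumulative += c
            if r < cumulative:
                table = k
                break
        if table == len(counts):
            counts.append(1)
        else:
            counts[table] += 1
        assignments.append(table)
    return assignments, counts
```

Read as a prior over segment labels, each table would correspond to one recurring pattern (for example, one speaker), with the number of patterns left unbounded rather than fixed in advance.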
Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks
Autonomous robots need to interact with unknown, unstructured and changing
environments, constantly facing novel challenges. Therefore, continuous online
adaptation for lifelong learning and sample-efficient mechanisms to
adapt to changes in the environment, the constraints, the tasks, or the robot
itself are crucial. In this work, we propose a novel framework for
probabilistic online motion planning with online adaptation based on a
bio-inspired stochastic recurrent neural network. By using learning signals
which mimic the intrinsic motivation signal cognitive dissonance, in addition
to a mental replay strategy to intensify experiences, the stochastic
recurrent network can learn from few physical interactions and adapt to novel
environments in seconds. We evaluate our online planning and adaptation
framework on an anthropomorphic KUKA LWR arm. The rapid online adaptation is
shown by learning unknown workspace constraints sample-efficiently from few
physical interactions while following given way points.
Comment: accepted in Neural Network
A Learning-based Stochastic MPC Design for Cooperative Adaptive Cruise Control to Handle Interfering Vehicles
Vehicle-to-Vehicle (V2V) communication has great potential to improve
reaction accuracy of different driver assistance systems in critical driving
situations. Cooperative Adaptive Cruise Control (CACC), which is an automated
application, provides drivers with extra benefits such as traffic throughput
maximization and collision avoidance. CACC systems must be designed so
that they are sufficiently robust against all special maneuvers, such as interfering
vehicles cutting into the CACC platoon or hard braking by leading cars. To
address this problem, a Neural Network (NN)-based cut-in detection and
trajectory prediction scheme is proposed in the first part of this paper. Next,
a probabilistic framework is developed in which the cut-in probability is
calculated based on the output of the mentioned cut-in prediction block.
Finally, a specific Stochastic Model Predictive Controller (SMPC) is designed
which incorporates this cut-in probability to enhance its reaction against the
detected dangerous cut-in maneuver. The overall system is implemented and its
performance is evaluated using realistic driving scenarios from Safety Pilot
Model Deployment (SPMD).
Comment: 10 pages, Submitted as a journal paper at T-I