Scalable Approach to Uncertainty Quantification and Robust Design of Interconnected Dynamical Systems
Development of robust dynamical systems and networks, such as autonomous
aircraft systems capable of accomplishing complex missions, faces challenges
due to dynamically evolving uncertainties arising from model error, the need
to operate in hostile, cluttered urban environments, and the distributed and
dynamic nature of communication and computation resources. Model-based robust
design is difficult because of the complexity of the hybrid dynamic models,
which combine continuous vehicle dynamics with discrete models of computation
and communication, and because of the sheer size of the problem. We overview
recent advances in methodology and tools for modeling, analyzing, and
designing robust autonomous aerospace systems operating in uncertain
environments, with emphasis on efficient uncertainty quantification and robust
design, using case studies of missions involving model-based target tracking
and search, and trajectory planning in an uncertain urban environment. To show
that the methodology applies to uncertain dynamical systems in general, we
also present applications of the new methods to efficient uncertainty
quantification of energy usage in buildings and to stability assessment of
interconnected power networks.
Influence-Optimistic Local Values for Multiagent Planning --- Extended Version
Recent years have seen the development of methods for multiagent planning
under uncertainty that scale to tens or even hundreds of agents. However, most
of these methods either make restrictive assumptions on the problem domain, or
provide approximate solutions without any guarantees on quality. Methods in the
former category typically build on heuristic search using upper bounds on the
value function. Unfortunately, no techniques exist to compute such upper bounds
for problems with non-factored value functions. To allow for meaningful
benchmarking through measurable quality guarantees on a very general class of
problems, this paper introduces a family of influence-optimistic upper bounds
for factored decentralized partially observable Markov decision processes
(Dec-POMDPs) that do not have factored value functions. Intuitively, we derive
bounds on very large multiagent planning problems by subdividing them in
sub-problems, and at each of these sub-problems making optimistic assumptions
with respect to the influence that will be exerted by the rest of the system.
We numerically compare the different upper bounds and demonstrate how we can
achieve a non-trivial guarantee that a heuristic solution for problems with
hundreds of agents is close to optimal. Furthermore, we provide evidence that
the upper bounds may improve the effectiveness of heuristic influence search,
and discuss further potential applications to multiagent planning.

Comment: Long version of IJCAI 2015 paper (and extended abstract at AAMAS 2015).
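The optimistic-assumption idea above can be illustrated with a minimal sketch (hypothetical names, not the paper's algorithm; it assumes local values combine additively): each sub-problem is evaluated under the most favorable influence the rest of the system could exert on it, and combining these local optima yields an upper bound on the joint value.

```python
def influence_optimistic_bound(local_values):
    """Upper bound on the joint value, assuming local values add up:
    each sub-problem optimistically assumes the most favorable influence
    the rest of the system could exert on it."""
    return sum(max(values_by_influence.values())
               for values_by_influence in local_values)

# two hypothetical sub-problems, each mapping candidate influences to values
subproblems = [{"infl_a": 1.0, "infl_b": 2.5},
               {"infl_a": 4.0, "infl_b": 3.0}]
bound = influence_optimistic_bound(subproblems)  # 2.5 + 4.0 = 6.5
```

Because each sub-problem takes a maximum over influences it can never actually receive simultaneously, the result overestimates the true joint value, which is exactly what makes it a valid upper bound for benchmarking heuristic solutions.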
Influence-Based Abstraction for Multiagent Systems
This paper presents a theoretical advance by which factored POSGs can be decomposed into local models. We formalize the interface between such local models as the influence agents can exert on one another, and we prove that this interface is sufficient for decoupling them. The resulting influence-based abstraction substantially generalizes previous work on exploiting weakly coupled agent interaction structures. Therein lie several important contributions. First, our general formulation sheds new light on the theoretical relationships among previous approaches and promotes future empirical comparisons that could come from extending them beyond the more specific problem contexts for which they were developed. More importantly, the influence-based approaches that we generalize have shown promising improvements in the scalability of planning for more restrictive models. Thus, our theoretical result here serves as the foundation for practical algorithms that we anticipate will bring similar improvements to more general planning contexts, and also to other domains such as approximate planning, decision-making in adversarial domains, and online learning.

United States. Air Force Office of Scientific Research. Multidisciplinary University Research Initiative (Project FA9550-09-1-0538).
Reinforcement Learning in Deep Structured Teams: Initial Results with Finite and Infinite Valued Features
In this paper, we consider Markov chain and linear quadratic models for deep
structured teams with discounted and time-average cost functions under two
non-classical information structures, namely, deep state sharing and no
sharing. In deep structured teams, agents are coupled in dynamics and cost
functions through deep state, where deep state refers to a set of orthogonal
linear regressions of the states. In this article, we consider a homogeneous
linear regression for Markov chain models (i.e., empirical distribution of
states) and a few orthonormal linear regressions for linear quadratic models
(i.e., weighted average of states). Some planning algorithms are developed for
the case when the model is known, and some reinforcement learning algorithms
are proposed for the case when the model is not known completely. The
convergence of two model-free (reinforcement learning) algorithms, one for
Markov chain models and one for linear quadratic models, is established. The
results are then applied to a smart grid.

Comment: This version corrects some typographical errors.
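For the Markov chain model, the abstract identifies the deep state with the empirical distribution of agent states. A minimal sketch of computing that coupling term (hypothetical function name; NumPy assumed):

```python
import numpy as np

def deep_state(agent_states, num_values):
    """Empirical distribution of states across agents: the 'deep state'
    through which agents are coupled in the Markov chain model."""
    counts = np.bincount(agent_states, minlength=num_values)
    return counts / len(agent_states)

# five agents whose local states take values in {0, 1, 2}
states = np.array([0, 2, 1, 2, 2])
print(deep_state(states, num_values=3))  # [0.2 0.2 0.6]
```

Because this quantity depends only on the fraction of agents in each state, not on agent identities, its dimension is fixed as the number of agents grows, which is what makes learning over the deep state tractable for large teams.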
Universal Reinforcement Learning Algorithms: Survey and Experiments
Many state-of-the-art reinforcement learning (RL) algorithms typically assume
that the environment is an ergodic Markov Decision Process (MDP). In contrast,
the field of universal reinforcement learning (URL) is concerned with
algorithms that make as few assumptions as possible about the environment. The
universal Bayesian agent AIXI and a family of related URL algorithms have been
developed in this setting. While numerous theoretical optimality results have
been proven for these agents, there has been no empirical investigation of
their behavior to date. We present a short and accessible survey of these URL
algorithms under a unified notation and framework, along with results of some
experiments that qualitatively illustrate some properties of the resulting
policies, and their relative performance on partially-observable gridworld
environments. We also present an open-source reference implementation of the
algorithms which we hope will facilitate further understanding of, and
experimentation with, these ideas.Comment: 8 pages, 6 figures, Twenty-sixth International Joint Conference on
Artificial Intelligence (IJCAI-17
Resource Allocation Among Agents with MDP-Induced Preferences
Allocating scarce resources among agents to maximize global utility is, in
general, computationally challenging. We focus on problems where resources
enable agents to execute actions in stochastic environments, modeled as Markov
decision processes (MDPs), such that the value of a resource bundle is defined
as the expected value of the optimal MDP policy realizable given these
resources. We present an algorithm that simultaneously solves the
resource-allocation and the policy-optimization problems. This allows us to
avoid explicitly representing utilities over exponentially many resource
bundles, leading to drastic (often exponential) reductions in computational
complexity. We then use this algorithm in the context of self-interested agents
to design a combinatorial auction for allocating resources. We empirically
demonstrate the effectiveness of our approach by showing that it can, in
minutes, optimally solve problems for which a straightforward combinatorial
resource-allocation technique would require the agents to enumerate up to 2^100
resource bundles and the auctioneer to solve an NP-complete problem with an
input of that size.
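The bundle-value definition above (a bundle is worth the value of the optimal MDP policy realizable with its resources) can be sketched on a toy problem. The brute-force enumeration shown here is precisely the exponential approach the paper's joint algorithm avoids; the tiny MDP and all names are illustrative, not from the paper:

```python
import numpy as np

def bundle_value(allowed_actions, P, R, gamma=0.9, iters=200):
    """Value of start state 0 under the optimal policy restricted to the
    actions a resource bundle enables (plain value iteration).
    P[a, s, s'] are transition probabilities, R[a, s] are rewards."""
    V = np.zeros(P.shape[1])
    for _ in range(iters):
        Q = R + gamma * (P @ V)              # Q[a, s]
        V = Q[list(allowed_actions)].max(axis=0)
    return V[0]

# toy problem: 2 states, 2 actions; action a requires resource a
P = np.array([[[0.0, 1.0], [0.0, 1.0]],      # action 0: always go to state 1
              [[1.0, 0.0], [1.0, 0.0]]])     # action 1: always go to state 0
R = np.array([[1.0, 0.0],                    # action 0 pays 1 in state 0
              [0.0, 2.0]])                   # action 1 pays 2 in state 1
for bundle in [(0,), (1,), (0, 1)]:          # enumerate resource bundles
    print(bundle, round(bundle_value(list(bundle), P, R), 2))
```

Resources are complementary here: each action alone yields little, but together they support alternating between the two rewarding actions, so the bundle {0, 1} is worth far more than the sum of the singletons. Such complementarities are why explicit enumeration over all 2^n bundles is needed in the naive approach.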