327 research outputs found
Gaussian Process Planning with Lipschitz Continuous Reward Functions: Towards Unifying Bayesian Optimization, Active Learning, and Beyond
This paper presents a novel nonmyopic adaptive Gaussian process planning
(GPP) framework endowed with a general class of Lipschitz continuous reward
functions that can unify some active learning/sensing and Bayesian optimization
criteria and offer practitioners some flexibility to specify their desired
choices for defining new tasks/problems. In particular, it utilizes a
principled Bayesian sequential decision problem framework for jointly and
naturally optimizing the exploration-exploitation trade-off. In general, the
resulting induced GPP policy cannot be derived exactly due to an uncountable
set of candidate observations. A key contribution of our work here thus lies in
exploiting the Lipschitz continuity of the reward functions to solve for a
nonmyopic adaptive epsilon-optimal GPP (epsilon-GPP) policy. To plan in real
time, we further propose an asymptotically optimal, branch-and-bound anytime
variant of epsilon-GPP with performance guarantee. We empirically demonstrate
the effectiveness of our epsilon-GPP policy and its anytime variant in Bayesian
optimization and an energy harvesting task.Comment: 30th AAAI Conference on Artificial Intelligence (AAAI 2016), Extended
version with proofs, 17 page
Adaptive Non-myopic Quantizer Design for Target Tracking in Wireless Sensor Networks
In this paper, we investigate the problem of nonmyopic (multi-step ahead)
quantizer design for target tracking using a wireless sensor network. Adopting
the alternative conditional posterior Cramer-Rao lower bound (A-CPCRLB) as the
optimization metric, we theoretically show that this problem can be temporally
decomposed over a certain time window. Based on sequential Monte-Carlo methods
for tracking, i.e., particle filters, we design the local quantizer adaptively
by solving a particlebased non-linear optimization problem which is well suited
for the use of interior-point algorithm and easily embedded in the filtering
process. Simulation results are provided to illustrate the effectiveness of our
proposed approach.Comment: Submitted to 2013 Asilomar Conference on Signals, Systems, and
Computer
Maximizing Expected Value of Information in Decision Problems by Querying on a Wish-to-Know Basis.
An agent acting under uncertainty regarding how it should complete the task assigned to it by its human user can learn more about how it should behave by posing queries to its human user. Asking too many queries, however, may risk requiring undue attentional demand of the user, and so the agent should prioritize asking the most valuable queries. For decision-making agents, Expected Value of Information (EVOI) measures the value of a query, and so given a set of queries the agent can ask, the agent should ask the query that is expected to maximally improve its performance by selecting the query with highest EVOI in its set. Unfortunately, to compute the EVOI of a query, the agent must consider how each possible response would influence its future behavior, which makes query selection particularly challenging in settings where planning the agent's behavior would be expensive even without the added complication of considering queries to ask, especially when there are many potential queries the agent should consider. The focus of this dissertation is on developing query selection algorithms that can be feasibly applied to such settings.
The main novel approach studied, Wishful Query Projection (WQP), is based on the intuition that the agent should consider which query to ask on the basis of obtaining knowledge that would help it resolve a particular dilemma that it wishes could be resolved, as opposed to blindly searching its entire query set in hopes of finding one that would give it valuable knowledge. In implementing WQP, this dissertation contributes algorithms that are founded upon the following novel result: for myopic settings, when the agent can ask any query as long as the query has no more than some set number of possible responses, the best query takes the form of asking the user to choose from a specified subset of ways for the agent to behave. The work presented shows that WQP selects queries with near-optimal EVOI when the agent's query set is (1) balanced in the range of queries it contains; and (2) rich in terms of the highest contained query EVOI.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/120772/1/rwcohn_1.pd
- …