327 research outputs found

Gaussian Process Planning with Lipschitz Continuous Reward Functions: Towards Unifying Bayesian Optimization, Active Learning, and Beyond

Authors: Jaillet Patrick; Ling Chun Kai; Low Kian Hsiang
Publication date: 21/11/2015

This paper presents a novel nonmyopic adaptive Gaussian process planning (GPP) framework endowed with a general class of Lipschitz continuous reward functions that can unify some active learning/sensing and Bayesian optimization criteria and offer practitioners some flexibility to specify their desired choices for defining new tasks/problems. In particular, it utilizes a principled Bayesian sequential decision problem framework for jointly and naturally optimizing the exploration-exploitation trade-off. In general, the resulting induced GPP policy cannot be derived exactly due to an uncountable set of candidate observations. A key contribution of our work here thus lies in exploiting the Lipschitz continuity of the reward functions to solve for a nonmyopic adaptive epsilon-optimal GPP (epsilon-GPP) policy. To plan in real time, we further propose an asymptotically optimal, branch-and-bound anytime variant of epsilon-GPP with performance guarantee. We empirically demonstrate the effectiveness of our epsilon-GPP policy and its anytime variant in Bayesian optimization and an energy harvesting task.

Comment: 30th AAAI Conference on Artificial Intelligence (AAAI 2016), Extended version with proofs, 17 page

arXiv.org e-Print Archive

CiteSeerX

Association for the Advancement of Artificial Intelligence: AAAI Publications

Adaptive Non-myopic Quantizer Design for Target Tracking in Wireless Sensor Networks

Authors: Liu Sijia; Masazade Engin; Shen Xiaojing; Varshney Pramod K.
Publication date: 06/05/2013

In this paper, we investigate the problem of nonmyopic (multi-step ahead) quantizer design for target tracking using a wireless sensor network. Adopting the alternative conditional posterior Cramér-Rao lower bound (A-CPCRLB) as the optimization metric, we theoretically show that this problem can be temporally decomposed over a certain time window. Based on sequential Monte Carlo methods for tracking, i.e., particle filters, we design the local quantizer adaptively by solving a particle-based non-linear optimization problem that is well suited to an interior-point algorithm and easily embedded in the filtering process. Simulation results are provided to illustrate the effectiveness of our proposed approach.

Comment: Submitted to 2013 Asilomar Conference on Signals, Systems, and Computer

arXiv.org e-Print Archive

Crossref

Maximizing Expected Value of Information in Decision Problems by Querying on a Wish-to-Know Basis.

Author: Cohn Robert W.
Publication date: 01/01/2016

An agent that is uncertain about how to complete the task assigned by its human user can learn more about how it should behave by posing queries to that user. Asking too many queries, however, risks placing undue attentional demands on the user, so the agent should prioritize the most valuable queries. For decision-making agents, Expected Value of Information (EVOI) measures the value of a query, so given a set of queries it can ask, the agent should ask the one with the highest EVOI, that is, the query expected to improve its performance the most. Unfortunately, to compute the EVOI of a query, the agent must consider how each possible response would influence its future behavior. This makes query selection particularly challenging in settings where planning the agent's behavior would be expensive even without the added complication of considering queries to ask, especially when there are many potential queries to consider. The focus of this dissertation is on developing query selection algorithms that can be feasibly applied to such settings. The main novel approach studied, Wishful Query Projection (WQP), is based on the intuition that the agent should choose its query by seeking knowledge that would help it resolve a particular dilemma it wishes could be resolved, as opposed to blindly searching its entire query set in hopes of finding a query that would give it valuable knowledge. In implementing WQP, this dissertation contributes algorithms founded upon the following novel result: for myopic settings, when the agent can ask any query that has no more than some set number of possible responses, the best query takes the form of asking the user to choose from a specified subset of ways for the agent to behave.
The work presented shows that WQP selects queries with near-optimal EVOI when the agent's query set is (1) balanced in the range of queries it contains; and (2) rich in terms of the highest contained query EVOI.

PhD
Computer Science and Engineering
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/120772/1/rwcohn_1.pd
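The abstract above describes selecting the query with the highest EVOI by considering how each possible response would change the agent's behavior. As a rough illustration only, the following Python sketch computes myopic EVOI for a small discrete problem and picks the best query by exhaustive search. It is not the dissertation's WQP algorithm; the function names (`best_expected_utility`, `evoi`, `select_query`) and the data layout (a prior over states, a utility table indexed by action and state, and per-query response likelihoods) are assumptions made for this example.

```python
from typing import Dict, List

Belief = List[float]             # probability of each state
Utility = List[List[float]]      # utility[action][state] (hypothetical layout)
Likelihood = List[List[float]]   # likelihood[response][state] = P(response | state)


def best_expected_utility(belief: Belief, utility: Utility) -> float:
    """Expected utility of the best action under the given belief."""
    return max(sum(p * u for p, u in zip(belief, row)) for row in utility)


def evoi(prior: Belief, utility: Utility, likelihood: Likelihood) -> float:
    """Myopic EVOI: expected best utility after the response, minus before."""
    baseline = best_expected_utility(prior, utility)
    expected_after = 0.0
    for response_lik in likelihood:
        # Joint probability of this response with each state.
        joint = [l * p for l, p in zip(response_lik, prior)]
        p_response = sum(joint)
        if p_response == 0.0:
            continue  # a response that never occurs contributes nothing
        posterior = [j / p_response for j in joint]  # Bayes update
        expected_after += p_response * best_expected_utility(posterior, utility)
    return expected_after - baseline


def select_query(prior: Belief, utility: Utility,
                 queries: Dict[str, Likelihood]) -> str:
    """Exhaustively pick the query name with the highest EVOI."""
    return max(queries, key=lambda name: evoi(prior, utility, queries[name]))


if __name__ == "__main__":
    prior = [0.5, 0.5]
    utility = [[1.0, 0.0], [0.0, 1.0]]  # each action is right in one state
    queries = {
        "uninformative": [[0.5, 0.5], [0.5, 0.5]],
        "noisy": [[0.8, 0.2], [0.2, 0.8]],
        "perfect": [[1.0, 0.0], [0.0, 1.0]],
    }
    for name, lik in queries.items():
        print(name, evoi(prior, utility, lik))
    print("selected:", select_query(prior, utility, queries))
```

The exhaustive loop in `select_query` is the kind of blind search over the whole query set that the abstract says becomes expensive when planning is costly or the query set is large.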

Deep Blue Documents at the University of Michigan