
Optimal Sequential Exploration: Bandits, Clairvoyants, and Wildcats. Submitted; accessible at jes9/bio/OptimalSequentialExplorationBCW.pdf

By David B. Brown and James E. Smith


This paper was motivated by the problem of developing an optimal strategy for exploring a large oil and gas field in the North Sea. Where should we drill first? Where should we drill next? The problem resembles a classical multiarmed bandit problem, but probabilistic dependence plays a key role: outcomes at drilled sites reveal information about neighboring targets, and good exploration strategies will take advantage of this information as it is revealed. We develop heuristic policies for sequential exploration problems and complement these heuristics with upper bounds on the performance of an optimal policy. We begin by grouping the targets into clusters of manageable size. The heuristics are derived from a model that treats these clusters as independent. The upper bounds are given by assuming each cluster has perfect information about the results from all other clusters. The analysis relies heavily on results for bandit superprocesses, a generalization of the classical multiarmed bandit problem. We evaluate the heuristics and bounds using Monte Carlo simulation and find that, in our problem, the heuristic policies are nearly optimal.
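The heuristic-versus-bound structure described above can be illustrated on a toy instance. The sketch below is a hypothetical example, not the paper's model: sites are grouped into two clusters, all sites in a cluster share a hidden quality level (so drilling one site reveals information about its neighbors), and all costs, payoffs, and probabilities are made-up illustrative numbers. A cluster-by-cluster Bayesian stopping heuristic is compared, via Monte Carlo simulation, against a clairvoyant (perfect-information) upper bound.

```python
import random

# Hypothetical toy instance: two clusters of three sites each. Drilling a
# site costs COST and pays PAYOFF on success. Each cluster is "good" or
# "bad"; sites in a cluster succeed independently with a probability that
# depends on the cluster's hidden quality, so outcomes are correlated
# within a cluster.
COST, PAYOFF = 1.0, 3.0
P_GOOD_PRIOR = 0.5               # prior probability a cluster is good
P_HIT = {True: 0.8, False: 0.2}  # success prob. given cluster quality

def clairvoyant_value(outcomes):
    # Perfect-information relaxation: a clairvoyant sees all outcomes in
    # advance and drills exactly the sites that succeed. Its expected
    # value upper-bounds every feasible policy.
    return sum(PAYOFF - COST for cluster in outcomes for hit in cluster if hit)

def heuristic_value(outcomes):
    # Cluster-independent heuristic: within each cluster, keep drilling
    # while the posterior success probability beats the break-even point
    # COST / PAYOFF, updating the belief by Bayes' rule after each outcome.
    value = 0.0
    for cluster in outcomes:
        p_good = P_GOOD_PRIOR
        for hit in cluster:
            p_hit = p_good * P_HIT[True] + (1 - p_good) * P_HIT[False]
            if p_hit <= COST / PAYOFF:
                break  # expected payoff no longer covers the drilling cost
            value += (PAYOFF if hit else 0.0) - COST
            like_good = P_HIT[True] if hit else 1 - P_HIT[True]
            like_bad = P_HIT[False] if hit else 1 - P_HIT[False]
            p_good = (p_good * like_good
                      / (p_good * like_good + (1 - p_good) * like_bad))
    return value

def simulate(policy, n_runs=5000, seed=0):
    # Monte Carlo estimate of a policy's expected value.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        quality = [rng.random() < P_GOOD_PRIOR for _ in range(2)]
        outcomes = [[rng.random() < P_HIT[q] for _ in range(3)] for q in quality]
        total += policy(outcomes)
    return total / n_runs

if __name__ == "__main__":
    heur = simulate(heuristic_value)
    bound = simulate(clairvoyant_value)
    print(f"heuristic ~ {heur:.2f} <= clairvoyant upper bound ~ {bound:.2f}")
```

The gap between the simulated heuristic value and the clairvoyant bound limits how far the heuristic can be from optimal, which is the logic the paper uses (with information relaxations and bandit superprocess results in place of this crude clairvoyant) to certify near-optimality.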

Topics: Dynamic Programming, Multiarmed Bandits, Bandit Superprocesses, Information Relaxations
Year: 2012
Provided by: CiteSeerX
