Optimal Bounds for the Change-Making Problem

Author: Kozen Dexter
Zaks Shmuel
Publication venue: 'Aarhus University Library'
Publication date: 01/11/1991

The change-making problem is the problem of representing a given value with the fewest coins possible. We investigate the problem of determining whether the greedy algorithm produces an optimal representation of all amounts for a given set of coin denominations 1 = c_1 < c_2 < ... < c_m. Chang and Gill show that if the greedy algorithm is not always optimal, then there exists a counterexample x in the range c_3 <= x < c_m(c_m c_{m-1} + c_m - 3c_{m-1}) / (c_m - c_{m-1}). To test for the existence of such a counterexample, Chang and Gill propose computing and comparing the greedy and optimal representations of all x in this range. In this paper we show that if a counterexample exists, then the smallest one lies in the range c_3 + 1 < x < c_m + c_{m-1}, and these bounds are tight. Moreover, we give a simple test for the existence of a counterexample that does not require the calculation of optimal representations. In addition, we give a complete characterization of three-coin systems and an efficient algorithm for all systems with a fixed number of coins. Finally, we show that a related problem is coNP-complete.
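To make the bounds in this abstract concrete, here is a minimal Python sketch that searches for the smallest counterexample only within the range c_3 + 1 < x < c_m + c_{m-1}. It is not the paper's test: it falls back on the brute-force comparison of greedy and optimal coin counts (the approach attributed to Chang and Gill), with the optimal counts computed by a standard dynamic program. The function names `greedy_count`, `optimal_counts`, and `smallest_counterexample` are hypothetical and chosen for illustration.

```python
def greedy_count(coins, x):
    """Number of coins the greedy algorithm uses to represent amount x."""
    count = 0
    for c in sorted(coins, reverse=True):
        count += x // c   # take as many of the largest remaining coin as fit
        x %= c
    return count


def optimal_counts(coins, limit):
    """Fewest coins for every amount 0..limit, by dynamic programming.

    Assumes the denomination 1 is present, so every amount is representable.
    """
    best = [0] * (limit + 1)
    for x in range(1, limit + 1):
        best[x] = 1 + min(best[x - c] for c in coins if c <= x)
    return best


def smallest_counterexample(coins):
    """Smallest x where greedy is suboptimal, or None if there is none.

    Only the range c_3 + 1 < x < c_m + c_{m-1} from the abstract is scanned.
    """
    coins = sorted(coins)
    assert coins[0] == 1, "the abstract assumes c_1 = 1"
    if len(coins) < 3:
        return None  # with only {1} or {1, c_2}, greedy is always optimal
    lo = coins[2] + 2               # smallest x with x > c_3 + 1
    hi = coins[-1] + coins[-2] - 1  # largest x with x < c_m + c_{m-1}
    best = optimal_counts(coins, hi)
    for x in range(lo, hi + 1):
        if greedy_count(coins, x) > best[x]:
            return x
    return None
```

For the system 1, 3, 4 the only candidate is x = 6, where greedy uses 4 + 1 + 1 but 3 + 3 is optimal, so the function returns 6. For 1, 10, 25 it returns 30, and for 1, 5, 10, 25 it returns None.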

Tidsskrift.dk (Det Kongelige Bibliotek)

Statistical Complexity and Optimal Algorithms for Non-linear Ridge Bandits

Author: Han Yanjun
Jiao Jiantao
Rajaraman Nived
Ramchandran Kannan
Publication date: 14/03/2023

We consider the sequential decision-making problem where the mean outcome is a non-linear function of the chosen action. Compared with the linear model, two curious phenomena arise in non-linear models: first, in addition to the "learning phase" with a standard parametric rate for estimation or regret, there is a "burn-in period" with a fixed cost determined by the non-linear function; second, achieving the smallest burn-in cost requires new exploration algorithms. For a special family of non-linear functions named ridge functions in the literature, we derive upper and lower bounds on the optimal burn-in cost, and in addition, on the entire learning trajectory during the burn-in period via differential equations. In particular, a two-stage algorithm that first finds a good initial action and then treats the problem as locally linear is statistically optimal. In contrast, several classical algorithms, such as UCB and algorithms relying on regression oracles, are provably suboptimal. Comment: Title change; add a new lower bound for linear bandits in Theorem 1.

arXiv.org e-Print Archive

Metareasoning for Planning Under Uncertainty

Author: Horvitz Eric
Kamar Ece
Kolobov Andrey
Lin Christopher H.
Publication date: 03/05/2015

The conventional model for online planning under uncertainty assumes that an agent can stop and plan without incurring costs for the time spent planning. However, planning time is not free in most real-world settings. For example, an autonomous drone is subject to nature's forces, like gravity, even while it thinks, and must either pay a price for counteracting these forces to stay in place, or grapple with the state change caused by acquiescing to them. Policy optimization in these settings requires metareasoning, a process that trades off the cost of planning and the potential policy improvement that can be achieved. We formalize and analyze the metareasoning problem for Markov Decision Processes (MDPs). Our work subsumes previously studied special cases of metareasoning and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking. For reasons we discuss, optimal general metareasoning turns out to be impractical, motivating approximations. We present approximate metareasoning procedures which rely on special properties of the BRTDP planning algorithm and explore the effectiveness of our methods on a variety of problems. Comment: Extended version of IJCAI 2015 paper.

arXiv.org e-Print Archive

CiteSeerX

Influence-Optimistic Local Values for Multiagent Planning --- Extended Version

Author: Oliehoek Frans A.
Spaan Matthijs T. J.
Witwicki Stefan
Publication date: 20/07/2015

Recent years have seen the development of methods for multiagent planning under uncertainty that scale to tens or even hundreds of agents. However, most of these methods either make restrictive assumptions on the problem domain, or provide approximate solutions without any guarantees on quality. Methods in the former category typically build on heuristic search using upper bounds on the value function. Unfortunately, no techniques exist to compute such upper bounds for problems with non-factored value functions. To allow for meaningful benchmarking through measurable quality guarantees on a very general class of problems, this paper introduces a family of influence-optimistic upper bounds for factored decentralized partially observable Markov decision processes (Dec-POMDPs) that do not have factored value functions. Intuitively, we derive bounds on very large multiagent planning problems by subdividing them into sub-problems, and at each of these sub-problems making optimistic assumptions with respect to the influence that will be exerted by the rest of the system. We numerically compare the different upper bounds and demonstrate how we can achieve a non-trivial guarantee that a heuristic solution for problems with hundreds of agents is close to optimal. Furthermore, we provide evidence that the upper bounds may improve the effectiveness of heuristic influence search, and discuss further potential applications to multiagent planning. Comment: Long version of IJCAI 2015 paper (and extended abstract at AAMAS 2015).

arXiv.org e-Print Archive

University of Liverpool Repository

CiteSeerX