Online decision problems with large strategy sets

Author: Kleinberg Robert David
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2005

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mathematics, 2005. Includes bibliographical references (p. 165-171). In an online decision problem, an algorithm performs a sequence of trials, each of which involves selecting one element from a fixed set of alternatives (the "strategy set") whose costs vary over time. After T trials, the combined cost of the algorithm's choices is compared with that of the single strategy whose combined cost is minimum. Their difference is called regret, and one seeks algorithms which are efficient in that their regret is sublinear in T and polynomial in the problem size. We study an important class of online decision problems called generalized multi-armed bandit problems. In the past such problems have found applications in areas as diverse as statistics, computer science, economic theory, and medical decision-making. Most existing algorithms were efficient only in the case of a small (i.e. polynomial-sized) strategy set. We extend the theory by supplying non-trivial algorithms and lower bounds for cases in which the strategy set is much larger (exponential or infinite) and the cost function class is structured, e.g. by constraining the cost functions to be linear or convex. As applications, we consider adaptive routing in networks, adaptive pricing in electronic markets, and collaborative decision-making by untrusting peers in a dynamic environment. By Robert David Kleinberg. Ph.D.
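
The regret definition in this abstract can be made concrete with a small sketch. It is an illustration only, not code from the thesis: the function name `regret`, the cost table layout (`costs[t][i]` is the cost of strategy `i` at trial `t`), and the sample numbers are all assumptions chosen for the example, and it assumes a finite strategy set whose full cost table is available for evaluation after the fact.

```python
from typing import Sequence


def regret(costs: Sequence[Sequence[float]], choices: Sequence[int]) -> float:
    """Regret after T trials for a finite strategy set (hypothetical helper).

    costs[t][i]  -- cost of strategy i at trial t (assumed layout)
    choices[t]   -- index of the strategy the algorithm picked at trial t
    """
    # Combined cost of the algorithm's own choices.
    algorithm_cost = sum(costs[t][choices[t]] for t in range(len(choices)))

    # Combined cost of the single fixed strategy that is cheapest in hindsight.
    n_strategies = len(costs[0])
    best_fixed_cost = min(
        sum(row[i] for row in costs) for i in range(n_strategies)
    )

    return algorithm_cost - best_fixed_cost


# Made-up example: two strategies, three trials.
costs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
choices = [0, 1, 1]
print(regret(costs, choices))
```

Here the algorithm pays a combined cost of 2.0, while always playing strategy 1 would have cost 1.0, so the regret is 1.0. In the bandit setting the abstract studies, the algorithm itself would only observe the cost of the strategy it chose at each trial; the full table is used here purely to evaluate regret in hindsight.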

CiteSeerX

DSpace@MIT

Regret-Minimization Algorithms for Multi-Agent Cooperative Learning Systems

Author: Yi Jialin
Publication date: 30/10/2023

A Multi-Agent Cooperative Learning (MACL) system is an artificial intelligence (AI) system where multiple learning agents work together to complete a common task. Recent empirical success of MACL systems in various domains (e.g. traffic control, cloud computing, robotics) has sparked active research into the design and analysis of MACL systems for sequential decision making problems. One important metric of the learning algorithm for decision making problems is its regret, i.e. the difference between the highest achievable reward and the actual reward that the algorithm gains. The design and development of a MACL system with low-regret learning algorithms can create substantial economic value. In this thesis, I analyze MACL systems for different sequential decision making problems. Concretely, Chapters 3 and 4 investigate the cooperative multi-agent multi-armed bandit problems, with full-information or bandit feedback, in which multiple learning agents can exchange their information through a communication network and the agents can only observe the rewards of the actions they choose. Chapter 5 considers the communication-regret trade-off for online convex optimization in the distributed setting. Chapter 6 discusses how to form highly productive teams for agents based on their unknown but fixed types using adaptive incremental matchings. For the above problems, I present regret lower bounds for feasible learning algorithms and provide efficient algorithms that achieve these bounds. The regret bounds I present in Chapters 3, 4 and 5 quantify how the regret depends on the connectivity of the communication network and the communication delay, thus giving useful guidance on the design of the communication protocol in MACL systems. Comment: Thesis submitted to the London School of Economics and Political Science for a PhD in Statistics

arXiv.org e-Print Archive

Abstractions in Reasoning for Long-Term Autonomy

Author: Wray Kyle Hollins
Publication venue: ScholarWorks@UMass Amherst
Publication date: 02/07/2019

The path to building adaptive, robust, intelligent agents has led researchers to develop a suite of powerful models and algorithms for agents with a single objective. However, in recent years, attempts to use this monolithic approach to solve an ever-expanding set of complex real-world problems, which increasingly include long-term autonomous deployments, have illuminated challenges in its ability to scale. Consequently, a fragmented collection of hierarchical and multi-objective models was developed. This trend continues into the algorithms as well, as each approximates an optimal solution in a different manner for scalability. These models and algorithms represent an attempt to solve pieces of an overarching problem: how can an agent explicitly model and integrate the necessary aspects of reasoning required to achieve long-term autonomy? This thesis presents a general hierarchical and multi-objective model called a policy network that unifies prior fragmented solutions into a single graphical decision-making structure. Policy networks are broadly useful to solve numerous real-world problems. This thesis focuses on autonomous vehicle (AV) problems: (1) route-planning with multiple objectives; (2) semi-autonomy with proactive transfer of control; and (3) intersection decision-making for reasoning online about any number of other vehicles and pedestrians. Formal models are presented for each of the distinct problems. Solutions are evaluated using real-world map data in simulation and demonstrated on a fully operational AV prototype driving on real public roads. Policy networks serve as a shared underlying framework for all three, enabling their seamless integration as parts of an overall solution for rich, real-world, scalable decision-making in agents with long-term autonomy.

ScholarWorks@UMass Amherst

Meta-RaPS Hybridization with Machine Learning Algorithms

Author: Al-Duoli Fatemah
Publication venue: ODU Digital Commons
Publication date: 01/07/2015

This dissertation focuses on advancing the Metaheuristic for Randomized Priority Search algorithm, known as Meta-RaPS, by integrating it with machine learning algorithms. Introducing a new metaheuristic algorithm starts with demonstrating its performance. This is accomplished by using the new algorithm to solve various combinatorial optimization problems in their basic form. The next stage focuses on advancing the new algorithm by strengthening its relatively weaker characteristics. In the third traditional stage, the algorithms are exercised in solving more complex optimization problems. In the case of effective algorithms, the second and third stages can occur in parallel as researchers are eager to employ good algorithms to solve complex problems. The third stage can inadvertently strengthen the original algorithm. The simplicity and effectiveness Meta-RaPS enjoys place it in both the second and third research stages concurrently. This dissertation explores strengthening Meta-RaPS by incorporating memory and learning features. The major conceptual frameworks that guided this work are the Adaptive Memory Programming framework (or AMP) and the metaheuristic hybridization taxonomy. The concepts from both frameworks are followed when identifying useful information that Meta-RaPS can collect during execution. Hybridizing Meta-RaPS with machine learning algorithms helped in transforming the collected information into knowledge. The learning concepts selected are supervised and unsupervised learning. The algorithms selected to achieve both types of learning are the Inductive Decision Tree (supervised learning) and Association Rules (unsupervised learning). The objective behind hybridizing Meta-RaPS with an Inductive Decision Tree algorithm is to perform online control for Meta-RaPS' parameters. This Inductive Decision Tree algorithm is used to find favorable parameter values using knowledge gained from previous Meta-RaPS iterations. The values selected are used in future Meta-RaPS iterations. The objective behind hybridizing Meta-RaPS with an Association Rules algorithm is to identify patterns associated with good solutions. These patterns are considered knowledge and are inherited as starting points for future Meta-RaPS iterations. The performance of the hybrid Meta-RaPS algorithms is demonstrated by solving the capacitated Vehicle Routing Problem with and without time windows.

Old Dominion University

Next challenges for adaptive learning systems

Author: Bifet A.
Gaber M.
Gabrys B.
Gama J.
Minku L.
Musial K.
Zliobaite I.
Publication date: 01/01/2012

University of Birmingham Research Portal

Portsmouth University Research Portal (Pure)

Multiobjective synchronization of coupled systems

Author: Blekhman I. I.
Deb K.
Goldberg D. E.
Jian-an Fang
Jürgen Kurths
W. K. Wong
Yang Tang
Zidong Wang
Publication venue: 'AIP Publishing'
Publication date: 01/06/2011

Copyright © 2011 American Institute of Physics. Synchronization of coupled chaotic systems has been a subject of great interest and importance, both in theory and in various fields of application, such as secure communication and neuroscience. Recently, based on stability theory, synchronization of coupled chaotic systems by designing appropriate coupling has been widely investigated. However, almost all the available results have focused on ensuring the synchronization of coupled chaotic systems with as small coupling strengths as possible. In this contribution, we study multiobjective synchronization of coupled chaotic systems by considering two objectives in parallel, i.e., minimizing the coupling strength while optimizing the convergence speed. The coupling form and coupling strength are optimized by an improved multiobjective evolutionary approach. The constraints on the coupling form are also investigated by formulating the problem as a multiobjective constrained problem. We find that the proposed evolutionary method can outperform the conventional adaptive strategy in several respects. The results presented in this paper can be extended to nonlinear time-series analysis and synchronization of complex networks, and have various applications.
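
Treating two objectives "in parallel" usually means looking for trade-off solutions rather than a single optimum. The sketch below shows a generic Pareto-dominance filter for two quantities that are both minimized. It is not the paper's improved evolutionary approach; the function names, the pairing of each candidate as (coupling strength, convergence time), and the sample values are assumptions made purely for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumed: (coupling strength, convergence time)


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in one.

    Both objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up candidate designs, not data from the paper.
candidates = [(0.5, 3.0), (1.0, 1.0), (2.0, 0.5), (2.0, 2.0)]
front = pareto_front(candidates)
print(front)
```

In this example `(2.0, 2.0)` is dropped because `(1.0, 1.0)` is better in both objectives, while the remaining three candidates each represent a different trade-off between weak coupling and fast convergence.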

The Hong Kong Polytechnic University Pao Yue-kong Library

Crossref

PolyU Institutional Repository

Brunel University Research Archive