Adaptive Contract Design for Crowdsourcing Markets: Bandit Algorithms for Repeated Principal-Agent Problems
Crowdsourcing markets have emerged as a popular platform for matching
available workers with tasks to complete. The payment for a particular task is
typically set by the task's requester, and may be adjusted based on the quality
of the completed work, for example, through the use of "bonus" payments. In
this paper, we study the requester's problem of dynamically adjusting
quality-contingent payments for tasks. We consider a multi-round version of the
well-known principal-agent model, whereby in each round a worker makes a
strategic choice of the effort level which is not directly observable by the
requester. In particular, our formulation significantly generalizes the
budget-free online task pricing problems studied in prior work.
We treat this problem as a multi-armed bandit problem, with each "arm"
representing a potential contract. To cope with the large (and in fact,
infinite) number of arms, we propose a new algorithm, AgnosticZooming, which
discretizes the contract space into a finite number of regions, effectively
treating each region as a single arm. This discretization is adaptively
refined, so that more promising regions of the contract space are eventually
discretized more finely. We analyze this algorithm, showing that it achieves
regret sublinear in the time horizon and substantially improves over
non-adaptive discretization (which is the only competing approach in the
literature).
Our results advance the state of art on several different topics: the theory
of crowdsourcing markets, principal-agent problems, multi-armed bandits, and
dynamic pricing.
Comment: This is the full version of a paper in the ACM Conference on
Economics and Computation (ACM-EC), 201
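The adaptive-discretization idea described in this abstract can be sketched in a few lines. The following is a toy illustration, not the paper's AgnosticZooming algorithm: the 1-D contract space, the peaked requester utility, and the fixed split-after-k-pulls rule are all assumptions made for the example.

```python
import math
import random

class AdaptiveDiscretization:
    """Toy adaptive-discretization bandit over a 1-D contract space [0, 1].

    Each region of the contract space is played as a single arm, and a
    region that accumulates enough pulls is split in half, so promising
    parts of the space end up on a finer grid.
    """

    def __init__(self, split_after=20):
        self.split_after = split_after
        # Each region is [lo, hi, pulls, reward_sum].
        self.regions = [[0.0, 1.0, 0, 0.0]]

    def select(self):
        total = sum(r[2] for r in self.regions) + 1

        def ucb(r):
            if r[2] == 0:
                return float("inf")  # try unexplored regions first
            return r[3] / r[2] + math.sqrt(2 * math.log(total) / r[2])

        return max(self.regions, key=ucb)

    def update(self, region, reward):
        region[2] += 1
        region[3] += reward
        if region[2] >= self.split_after:  # refine a well-sampled region
            lo, hi = region[0], region[1]
            mid = (lo + hi) / 2.0
            self.regions.remove(region)
            self.regions += [[lo, mid, 0, 0.0], [mid, hi, 0, 0.0]]

def run(horizon=500, seed=0):
    rng = random.Random(seed)
    algo = AdaptiveDiscretization()
    for _ in range(horizon):
        region = algo.select()
        contract = rng.uniform(region[0], region[1])
        # Hypothetical requester utility, peaked at contract = 0.7.
        reward = max(0.0, 1.0 - abs(contract - 0.7)) + rng.gauss(0.0, 0.1)
        algo.update(region, reward)
    return algo
```

After a few hundred rounds the regions near the utility peak have been split repeatedly, which is the qualitative behavior the abstract describes.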
Revisiting MAB based approaches to recursive delegation
In this paper we examine the effectiveness of several multi-arm bandit
algorithms when used as a trust system to select agents to delegate tasks to.
In contrast to existing work, we allow for recursive delegation to occur. That
is, a task delegated to one agent can be delegated onwards by that agent, with
further delegation possible until some agent finally executes the task. We show
that modifications to the standard multi-arm bandit algorithms can provide
improvements in performance in such recursive delegation settings.
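The recursive-delegation setting can be sketched as follows. This is a hypothetical illustration under assumed agent names and skill levels, using plain UCB1 at every agent rather than the modified algorithms the abstract refers to.

```python
import math
import random

def ucb_pick(stats, t):
    """UCB1 index over a dict {option: [pulls, reward_sum]}."""
    def index(item):
        pulls, total = item[1]
        if pulls == 0:
            return float("inf")
        return total / pulls + math.sqrt(2 * math.log(t + 1) / pulls)
    return max(stats.items(), key=index)[0]

def delegate(agent, agents, t, rng, depth=0, max_depth=5):
    """One task: each agent runs its own UCB1 over executing the task
    itself ("self") vs. the other agents it can delegate to, and
    delegation recurses until some agent executes."""
    stats = agents[agent]["stats"]
    choice = "self" if depth >= max_depth else ucb_pick(stats, t)
    if choice == "self":
        reward = 1.0 if rng.random() < agents[agent]["skill"] else 0.0
    else:
        reward = delegate(choice, agents, t, rng, depth + 1, max_depth)
    stats[choice][0] += 1
    stats[choice][1] += reward
    return reward

def simulate(rounds=2000, seed=0):
    rng = random.Random(seed)
    skills = {"a": 0.2, "b": 0.5, "c": 0.9}  # hypothetical success rates
    agents = {
        name: {
            "skill": skill,
            "stats": {opt: [0, 0.0]
                      for opt in ["self"] + [m for m in skills if m != name]},
        }
        for name, skill in skills.items()
    }
    total = sum(delegate("a", agents, t, rng) for t in range(rounds))
    return total / rounds
```

Note that each agent only observes the eventual task outcome, not who finally executed it, which is what makes the recursive setting harder than flat delegation.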
Learning User Preferences to Incentivize Exploration in the Sharing Economy
We study platforms in the sharing economy and discuss the need for
incentivizing users to explore options that otherwise would not be chosen. For
instance, rental platforms such as Airbnb typically rely on customer reviews to
provide users with relevant information about different options. Yet, often a
large fraction of options does not have any reviews available. Such options are
frequently neglected as viable choices, and in turn are unlikely to be
evaluated, creating a vicious cycle. Platforms can engage users to deviate from
their preferred choice by offering monetary incentives for choosing a different
option instead. To efficiently learn the optimal incentives to offer, we
consider structural information in user preferences and introduce a novel
algorithm - Coordinated Online Learning (CoOL) - for learning with structural
information modeled as convex constraints. We provide formal guarantees on the
performance of our algorithm and test the viability of our approach in a user
study with data of apartments on Airbnb. Our findings suggest that our approach
is well-suited to learn appropriate incentives and increase exploration on the
investigated platform.
Comment: Longer version of AAAI'18 paper. arXiv admin note: text overlap with
arXiv:1702.0284
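The notion of learning with structural information modeled as convex constraints can be illustrated with projected online gradient descent. This is a generic sketch, not the CoOL algorithm itself; the box constraint and quadratic loss are assumptions standing in for the richer constraint sets and losses in the paper.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Componentwise projection onto the box [lo, hi] -- a stand-in for
    the convex structural constraints on incentives."""
    return [min(hi, max(lo, xi)) for xi in x]

def learn_incentives(targets, rounds=100, eta=0.1):
    """Projected online gradient descent on a toy quadratic loss
    f_t(x) = sum_i (x_i - target_i)^2: take a gradient step, then
    project back onto the constraint set."""
    x = [0.0] * len(targets)
    for _ in range(rounds):
        grad = [2.0 * (xi - ti) for xi, ti in zip(x, targets)]
        x = project_box([xi - eta * gi for xi, gi in zip(x, grad)])
    return x
```

With step size 0.1 the iterates contract geometrically toward the (feasible) targets, so a hundred rounds suffice for convergence in this toy setting.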
Incentive mechanism design for citizen reporting application using Stackelberg game
The growing use of smartphones equipped with various sensors to collect and analyze information around us highlights a paradigm called mobile crowdsensing. To motivate citizens to participate in crowdsensing and compensate them for their resources, it is necessary to incentivize participants for their sensing service. Several studies have used the Stackelberg game to model the incentive mechanism; however, those studies did not include a budget constraint for the limited-budget case. Another challenge is to optimize the crowdsourcer’s (government’s) profit in conducting crowdsensing under a limited budget and then allocate the budget to several regional working units that are responsible for specific city problems. We propose an incentive mechanism for mobile crowdsensing based on several identified incentive parameters, using the Stackelberg game model, and apply a MOOP (multi-objective optimization problem) formulation to the incentive model in which participant reputation is taken into account. The proposed incentive model is evaluated through simulations, which indicate that the results correspond appropriately to the theoretical properties of the model.
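The leader-follower structure of a budget-constrained Stackelberg incentive game can be sketched by backward induction. The utility functions, the quadratic effort cost, and the grid search below are all hypothetical choices for illustration, not the paper's mechanism.

```python
def follower_effort(reward_rate, cost):
    """Follower best response: maximize reward_rate*e - cost*e^2 over
    e >= 0, which has the closed form e* = reward_rate / (2*cost)."""
    return reward_rate / (2.0 * cost)

def leader_best_rate(cost, value_per_effort, budget, grid=1000):
    """Leader (crowdsourcer) anticipates the follower's best response
    and searches a grid of per-effort payment rates, discarding rates
    whose induced total payment exceeds the budget."""
    best_rate, best_profit = 0.0, float("-inf")
    for k in range(grid + 1):
        r = k / grid * value_per_effort   # candidate payment rate
        e = follower_effort(r, cost)
        payment = r * e
        if payment > budget:
            continue                       # budget constraint binds
        profit = value_per_effort * e - payment
        if profit > best_profit:
            best_rate, best_profit = r, profit
    return best_rate, best_profit
```

With cost 1 and value 2 per unit effort, the leader's profit r(2 - r)/2 is maximized at rate 1, yielding profit 0.5 while spending only 0.5 of the budget.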
Competing Bandits: The Perils of Exploration Under Competition
Most online platforms strive to learn from interactions with users, and many
engage in exploration: making potentially suboptimal choices for the sake of
acquiring new information. We study the interplay between exploration and
competition: how such platforms balance the exploration for learning and the
competition for users. Here users play three distinct roles: they are customers
that generate revenue, they are sources of data for learning, and they are
self-interested agents which choose among the competing platforms.
We consider a stylized duopoly model in which two firms face the same
multi-armed bandit problem. Users arrive one by one and choose between the two
firms, so that each firm makes progress on its bandit problem only if it is
chosen. Through a mix of theoretical results and numerical simulations, we
study whether and to what extent competition incentivizes the adoption of
better bandit algorithms, and whether it leads to welfare increases for users.
We find that stark competition induces firms to commit to a "greedy" bandit
algorithm that leads to low welfare. However, weakening competition by
providing firms with some "free" users incentivizes better exploration
strategies and increases welfare. We investigate two channels for weakening the
competition: relaxing the rationality of users and giving one firm a
first-mover advantage. Our findings are closely related to the "competition vs.
innovation" relationship, and elucidate the first-mover advantage in the
digital economy.
Comment: merged and extended version of arXiv:1702.08533 and arXiv:1902.0559
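The duopoly setup in the abstract can be simulated in miniature: users arrive one by one, choose the firm with the higher average reward so far, and only the chosen firm advances its bandit problem. The two-arm instance, the arm means, and the reputation rule below are assumptions for this toy sketch, not the paper's model specification.

```python
import math
import random

class Bandit:
    """A firm's two-arm bandit; policy is 'greedy' or 'ucb'."""
    def __init__(self, policy):
        self.policy = policy
        self.n = [0, 0]          # pulls per arm
        self.s = [0.0, 0.0]      # reward sums per arm

    def pick(self):
        for a in (0, 1):
            if self.n[a] == 0:
                return a         # pull each arm once first
        means = [self.s[a] / self.n[a] for a in (0, 1)]
        if self.policy == "greedy":
            return max((0, 1), key=lambda a: means[a])
        t = sum(self.n)
        return max((0, 1),
                   key=lambda a: means[a] + math.sqrt(2 * math.log(t) / self.n[a]))

    def update(self, a, r):
        self.n[a] += 1
        self.s[a] += r

    def reputation(self):
        pulls = sum(self.n)
        return (self.s[0] + self.s[1]) / pulls if pulls else 0.5

def duopoly(rounds=3000, means=(0.4, 0.6), seed=0):
    """Users pick the firm with the higher empirical average reward;
    the chosen firm plays its bandit and collects the reward."""
    rng = random.Random(seed)
    firms = {"greedy": Bandit("greedy"), "ucb": Bandit("ucb")}
    welfare = {"greedy": 0.0, "ucb": 0.0}
    for _ in range(rounds):
        name = max(firms, key=lambda f: firms[f].reputation())
        arm = firms[name].pick()
        r = 1.0 if rng.random() < means[arm] else 0.0
        firms[name].update(arm, r)
        welfare[name] += r
    return welfare
```

Runs of this kind make the abstract's tension visible: a firm that explores can lose its early reputation lead and, with it, the entire stream of future users.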
Dynamic project selection
We study a normative model of an internal capital market that a company uses to choose between its two divisions’ projects. Each project’s value is initially unknown to all, but can be dynamically learned by the corresponding division. Learning can be suspended or resumed at any time and is costly. We characterize an internal capital market that maximizes the company’s expected cash flow.
Ballooning Multi-Armed Bandits
In this paper, we introduce Ballooning Multi-Armed Bandits (BL-MAB), a novel
extension of the classical stochastic MAB model. In the BL-MAB model, the set
of available arms grows (or balloons) over time. In contrast to the classical
MAB setting where the regret is computed with respect to the best arm overall,
the regret in a BL-MAB setting is computed with respect to the best available
arm at each time. We first observe that the existing stochastic MAB algorithms
result in linear regret for the BL-MAB model. We prove that, if the best arm is
equally likely to arrive at any time instant, a sub-linear regret cannot be
achieved. Next, we show that if the best arm is more likely to arrive in the
early rounds, one can achieve sub-linear regret. Our proposed algorithm
determines (1) the fraction of the time horizon for which the newly arriving
arms should be explored and (2) the sequence of arm pulls in the exploitation
phase from among the explored arms. Making reasonable assumptions on the
arrival distribution of the best arm in terms of the thinness of the
distribution's tail, we prove that the proposed algorithm achieves sub-linear
instance-independent regret. We further quantify explicit dependence of regret
on the arrival distribution parameters. We reinforce our theoretical findings
with extensive simulation results. We conclude by showing that our algorithm
would achieve sub-linear regret even if (a) the distributional parameters are
not exactly known, but are obtained using a reasonable learning mechanism or
(b) the best arm is not more likely to arrive early, but a large fraction of
arms is likely to arrive relatively early.
Comment: A full version of this paper is accepted in the Journal of Artificial
Intelligence (AIJ) of Elsevier. A preliminary version is published as an
extended abstract in AAMAS 2020. Proceedings of the 19th International
Conference on Autonomous Agents and MultiAgent Systems. 202
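The explore-then-exploit structure the abstract describes for a growing arm set can be sketched as follows. This is a simplified illustration, not the paper's algorithm: the one-arm-per-round arrival process, the early-biased arm means, and the fixed exploration fraction are all assumptions.

```python
import random

def ballooning_mab(horizon=5000, explore_frac=0.3, seed=0):
    """Toy ballooning-bandit run: a new arm arrives every round, newly
    arriving arms are explored only during the first explore_frac of
    the horizon, and afterwards the learner exploits the empirically
    best explored arm. Early arrivals have higher means on average
    (a hypothetical early-biased arrival model)."""
    rng = random.Random(seed)
    arms = []          # each arm is [true_mean, pulls, reward_sum]
    total = 0.0
    for t in range(horizon):
        # New arm arrives; its mean shrinks as t grows.
        arms.append([rng.random() * (1.0 - t / horizon), 0, 0.0])
        if t < explore_frac * horizon:
            a = arms[-1]                              # explore the newest arm
        else:
            explored = [x for x in arms if x[1] > 0]
            a = max(explored, key=lambda x: x[2] / x[1])  # exploit
        r = 1.0 if rng.random() < a[0] else 0.0
        a[1] += 1
        a[2] += r
        total += r
    return total / horizon
```

Because good arms tend to arrive early under this arrival model, ignoring late arrivals after the exploration phase costs little, which mirrors the sub-linear-regret intuition in the abstract.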