7 research outputs found

    An Incentive Compatible Multi-Armed-Bandit Crowdsourcing Mechanism with Quality Assurance

    Full text link
    Consider a requester who wishes to crowdsource a series of identical binary labeling tasks to a pool of workers so as to achieve an assured accuracy for each task in a cost-optimal way. The workers are heterogeneous, with unknown but fixed qualities, and their costs are private. The problem is to select, for each task, an optimal subset of workers so that the outcome obtained from the selected workers guarantees a target accuracy level. The problem is challenging even in a non-strategic setting, since the accuracy of the aggregated label depends on the unknown qualities. We develop a novel multi-armed bandit (MAB) mechanism for solving this problem. First, we propose a framework, Assured Accuracy Bandit (AAB), which leads to an MAB algorithm, Constrained Confidence Bound for a Non-Strategic setting (CCB-NS). We derive an upper bound, depending on the target accuracy level and the true qualities, on the number of time steps in which the algorithm chooses a sub-optimal set. A more challenging situation arises when the requester not only has to learn the qualities of the workers but also elicit their true costs. We modify the CCB-NS algorithm to obtain an adaptive, exploration-separated algorithm which we call Constrained Confidence Bound for a Strategic setting (CCB-S). The CCB-S algorithm produces an ex-post monotone allocation rule and can thus be transformed into an ex-post incentive compatible and ex-post individually rational mechanism that learns the qualities of the workers and guarantees a given target accuracy level in a cost-optimal way. We provide a lower bound on the number of times any algorithm must select a sub-optimal set, and show that this lower bound matches our upper bound up to a constant factor. We provide insights into the practical implementation of this framework through an illustrative example, and we show the efficacy of our algorithms through simulations.
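
    As a concrete illustration of the AAB framework, here is a minimal Python sketch of a CCB-NS-style allocation step. It is not the paper's pseudocode: the confidence radius, the exhaustive subset search, and the majority-vote aggregation model are simplifying assumptions, and it presumes each worker has already been sampled at least once and has quality at least 1/2.

```python
import itertools
import math

def majority_accuracy(qualities):
    """Exact probability that the majority vote of independent workers with
    the given qualities labels a binary task correctly (ties split 50/50)."""
    n = len(qualities)
    acc = 0.0
    for outcome in itertools.product((0, 1), repeat=n):   # 1 = worker correct
        p = 1.0
        for q, o in zip(qualities, outcome):
            p *= q if o else 1.0 - q
        k = sum(outcome)
        acc += p if 2 * k > n else (0.5 * p if 2 * k == n else 0.0)
    return acc

def select_workers(counts, successes, costs, target, t, delta=0.1):
    """One CCB-NS-style allocation step: exploit the cheapest subset whose
    pessimistic (LCB) aggregated accuracy already meets the target; otherwise
    explore the cheapest subset still plausibly feasible under the UCBs."""
    n = len(costs)
    rad = [math.sqrt(math.log(4 * n * (t + 1) ** 2 / delta) / (2 * c))
           for c in counts]                     # requires counts[i] >= 1
    lcb = [max(0.5, successes[i] / counts[i] - rad[i]) for i in range(n)]
    ucb = [min(1.0, successes[i] / counts[i] + rad[i]) for i in range(n)]

    def cheapest(bounds):
        best, best_cost = None, float("inf")
        for r in range(1, n + 1):               # brute force; fine for small n
            for S in itertools.combinations(range(n), r):
                if majority_accuracy([bounds[i] for i in S]) >= target:
                    cost = sum(costs[i] for i in S)
                    if cost < best_cost:
                        best, best_cost = S, cost
        return best

    return cheapest(lcb) or cheapest(ucb)

# Early rounds: bounds are loose, so this explores the cheapest plausible set.
print(select_workers([5, 5, 5], [4, 3, 5], [1.0, 0.5, 2.0], target=0.9, t=15))
```

    The exploit-or-explore split mirrors the constrained structure of AAB: feasibility of a set is judged pessimistically before it is trusted, and optimistically before it is discarded.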

    Estimating and Incentivizing Imperfect-Knowledge Agents with Hidden Rewards

    Full text link
    In practice, incentive providers (i.e., principals) often cannot observe the reward realizations of incentivized agents, in contrast to many previously studied principal-agent models. This information asymmetry challenges the principal to consistently estimate the agent's unknown rewards by watching only the agent's decisions, which becomes even harder when the agent itself has to learn its own rewards. This setting arises in various real-life scenarios, ranging from renewable energy storage contracts to personalized healthcare incentives; hence, it offers not only interesting theoretical questions but also wide practical relevance. This paper explores a repeated adverse selection game between a self-interested learning agent and a learning principal. The agent tackles a multi-armed bandit (MAB) problem to maximize its expected reward plus incentive. On top of the agent's learning, the principal trains a parallel algorithm and faces a trade-off between consistently estimating the agent's unknown rewards and maximizing its own utility by offering adaptive incentives to steer the agent. For a non-parametric model, we introduce an estimator whose only input is the history of the principal's incentives and the agent's choices. We combine this estimator with a proposed data-driven incentive policy within an MAB framework. Without restricting the type of the agent's algorithm, we prove finite-sample consistency of the estimator and a rigorous regret bound for the principal by accounting for the sequential externality imposed by the agent. Finally, our theoretical results are reinforced by simulations justifying the applicability of our framework to green energy aggregator contracts.
    Comment: 72 pages, 6 figures. arXiv admin note: text overlap with arXiv:2304.0740
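
    The following toy simulation, a sketch rather than the paper's estimator, illustrates the information structure: the agent's reward draws are hidden, so the principal's only data is the incentive it offered and the arm the agent pulled. The two-arm setup, the epsilon-greedy agent, the probe schedule, and the threshold rule are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def simulate(mu=(0.8, 0.5), horizon=50000, eps=0.05, seed=1):
    """Two-armed toy of the hidden-reward game: the agent runs epsilon-greedy
    on (estimated reward + incentive); the principal logs only the bonus it
    offered on arm 1 and the arm the agent pulled."""
    rng = random.Random(seed)
    counts, means = [0, 0], [0.0, 0.0]
    log = []                                    # the principal's entire dataset
    bonuses = (0.0, 0.1, 0.2, 0.3, 0.4, 0.5)    # hypothetical probe schedule
    for t in range(horizon):
        b = bonuses[t % len(bonuses)]
        if t < 2:                               # initialize: pull each arm once
            a = t
        elif rng.random() < eps:                # exploration by the agent
            a = rng.randrange(2)
        else:                                   # greedy on reward estimate + bonus
            a = 1 if means[1] + b > means[0] else 0
        r = mu[a] + rng.gauss(0.0, 0.1)         # reward draw, unseen by principal
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]
        log.append((b, a))
    return log

def estimate_gap(log):
    """Crude stand-in for the paper's estimator: the smallest probed bonus at
    which the agent mostly switches to arm 1 brackets the gap mu[0] - mu[1]."""
    picks = defaultdict(list)
    for b, a in log:
        picks[b].append(a)
    for b in sorted(picks):
        if sum(picks[b]) / len(picks[b]) > 0.5:
            return b
    return None

print(estimate_gap(simulate()))  # 0.3 or 0.4: brackets the true gap of 0.3
```

    The point of the sketch is the asymmetry: the reward draw `r` never reaches the principal, yet the switching behavior under varied incentives still reveals the agent's preference gap.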

    An optimal bidimensional multi-armed bandit auction for multi-unit procurement

    No full text
    We study the problem of a buyer who gains stochastic rewards by procuring, through an auction, multiple units of a service or item from a pool of heterogeneous agents who are strategic along two dimensions, namely cost and capacity. The reward obtained for a single unit from an allocated agent depends on the agent's inherent quality, which is fixed but unknown. Each agent can supply only a limited number of units (the agent's capacity). The per-unit cost and the capacity are private information of each agent. The auctioneer must elicit from the agents their costs as well as their capacities (making the mechanism design bidimensional) and, further, learn the agents' qualities, with a view to maximizing her utility. Motivated by this, we design a bidimensional multi-armed bandit procurement auction that seeks to maximize the expected utility of the auctioneer subject to incentive compatibility and individual rationality, while simultaneously learning the unknown qualities of the agents. We first work under the assumption that the qualities are known, and propose an optimal, truthful mechanism, 2D-OPT, for the auctioneer to elicit costs and capacities. Next, in order to learn the qualities of the agents as well, we provide sufficient conditions for a learning algorithm to be Bayesian incentive compatible and individually rational. We finally design a novel learning mechanism, 2D-UCB, that is stochastic Bayesian incentive compatible and individually rational.
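
    A rough Python sketch of the two halves described above, under heavy simplification: a greedy, capacity-respecting allocation over bidimensional bids stands in for 2D-OPT (the actual mechanism uses Myerson-style virtual costs, and the payments that make it truthful are omitted), and an optimistic quality index stands in for the learning side of 2D-UCB. All names and numbers are illustrative.

```python
import math

def allocate(bids, qual_index, demand):
    """Greedy allocation sketch: buy units in increasing order of reported
    per-unit cost divided by the (optimistic) quality index, honoring each
    agent's reported capacity; bids[i] = (cost_per_unit_i, capacity_i)."""
    order = sorted(range(len(bids)), key=lambda i: bids[i][0] / qual_index[i])
    alloc, left = [0] * len(bids), demand
    for i in order:
        alloc[i] = min(bids[i][1], left)
        left -= alloc[i]
        if left == 0:
            break
    return alloc

def quality_ucb(successes, pulls, t):
    """Optimistic per-agent quality estimates, the learning half of a
    2D-UCB-style mechanism (an unexplored agent gets the maximal index 1.0)."""
    return [min(1.0, s / n + math.sqrt(2.0 * math.log(t) / n)) if n else 1.0
            for s, n in zip(successes, pulls)]

# Three agents bidding (cost per unit, capacity); a demand of 5 units.
bids = [(2.0, 3), (1.5, 2), (3.0, 4)]
idx = quality_ucb(successes=[800, 240, 900], pulls=[1000, 400, 1000], t=3000)
print(allocate(bids, idx, demand=5))  # -> [3, 2, 0]
```

    The bidimensional bid shows up in the allocation rule itself: capacity caps how many units an agent can win, while cost (normalized by estimated quality) decides the order in which agents are tapped.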

    CIMODE 2016: 3º Congresso Internacional de Moda e Design: proceedings

    Get PDF
    CIMODE 2016 is the third Congresso Internacional de Moda e Design (International Fashion and Design Congress), held from 9 to 12 May 2016 in Buenos Aires under the theme EM-TRAMAS. This edition is organized by the Faculdade de Arquitetura, Desenho e Urbanismo of the Universidade de Buenos Aires, together with the Departamento de Engenharia Têxtil of the Universidade do Minho and ABEPEM – Associação Brasileira de Estudos e Pesquisa em Moda.