381 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume


    Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics

    We study a game between autobidding algorithms that compete in an online advertising platform. Each autobidder is tasked with maximizing its advertiser's total value over multiple rounds of a repeated auction, subject to budget and/or return-on-investment constraints. We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret. Our algorithm uses only bandit feedback and can be used with the first- or second-price auction, as well as with any "intermediate" auction format. Our main result is that when these autobidders play against each other, the resulting expected liquid welfare over all rounds is at least half of the expected optimal liquid welfare achieved by any allocation. This holds whether or not the bidding dynamics converge to an equilibrium and regardless of the correlation structure between advertiser valuations.

    LIPIcs, Volume 261, ICALP 2023, Complete Volume


    Contextual Bandits with Budgeted Information Reveal

    Contextual bandit algorithms are commonly used in digital health to recommend personalized treatments. However, to ensure the effectiveness of the treatments, patients are often requested to take actions that have no immediate benefit to them, which we refer to as pro-treatment actions. In practice, clinicians have a limited budget to encourage patients to take these actions and collect additional information. We introduce a novel optimization and learning algorithm to address this problem. The algorithm combines two approaches: 1) an online primal-dual algorithm for deciding the optimal timing to reach out to patients, and 2) a contextual bandit learning algorithm to deliver personalized treatment to the patient. We prove that this algorithm admits a sub-linear regret bound. We illustrate the usefulness of this algorithm on both synthetic and real-world data.

    Learning to Bid in Repeated First-Price Auctions with Budgets

    Budget management strategies in repeated auctions have received growing attention in online advertising markets. However, previous work on budget management in online bidding mainly focused on second-price auctions. The rapid shift from second-price auctions to first-price auctions for online ads in recent years has motivated the challenging question of how to bid in repeated first-price auctions while controlling budgets. In this work, we study the problem of learning in repeated first-price auctions with budgets. We design a dual-based algorithm that can achieve a near-optimal $\widetilde{O}(\sqrt{T})$ regret with full information feedback where the maximum competing bid is always revealed after each auction. We further consider the setting with one-sided information feedback where only the winning bid is revealed after each auction. We show that our modified algorithm can still achieve an $\widetilde{O}(\sqrt{T})$ regret with mild assumptions on the bidder's value distribution. Finally, we complement the theoretical results with numerical experiments to confirm the effectiveness of our budget management policy.

    A Tight Competitive Ratio for Online Submodular Welfare Maximization

    In this paper we consider the online Submodular Welfare (SW) problem. In this problem we are given $n$ bidders each equipped with a general (not necessarily monotone) submodular utility and $m$ items that arrive online. The goal is to assign each item, once it arrives, to a bidder or discard it, while maximizing the sum of utilities. When an adversary determines the items' arrival order, we present a simple randomized algorithm that achieves a tight competitive ratio of $1/4$. The algorithm is a specialization of an algorithm due to [Harshaw-Kazemi-Feldman-Karbasi MOR'22], who presented the previously best known competitive ratio of $3-2\sqrt{2}\approx 0.171573$ to the problem. When the items' arrival order is uniformly random, we present a competitive ratio of $\approx 0.27493$, improving the previously known $1/4$ guarantee. Our approach for the latter result is based on a better analysis of the (offline) Residual Random Greedy (RRG) algorithm of [Buchbinder-Feldman-Naor-Schwartz SODA'14], which we believe might be of independent interest.

    Trading-off price for data quality to achieve fair online allocation

    We consider the problem of online allocation subject to a long-term fairness penalty. Contrary to existing works, however, we do not assume that the decision-maker observes the protected attributes, which is often unrealistic in practice. Instead, they can purchase data that help estimate them from sources of different quality, and hence reduce the fairness penalty at some cost. We model this problem as a multi-armed bandit problem where each arm corresponds to the choice of a data source, coupled with the online allocation problem. We propose an algorithm that jointly solves both problems and show that it has a regret bounded by $\mathcal{O}(\sqrt{T})$. A key difficulty is that the rewards received by selecting a source are correlated by the fairness penalty, which leads to a need for randomization (despite a stochastic setting). Our algorithm takes into account contextual information available before the source selection, and can adapt to many different fairness notions. We also show that in some instances, the estimates used can be learned on the fly.

    Privacy in resource allocation problems

    Collaborative decision-making processes help parties optimize their operations, remain competitive in their markets, and improve their performance on environmental issues. However, those parties also want to keep their data private to meet their obligations regarding various regulations and not to disclose their strategic information to the competitors. In this thesis, we study collaborative capacity allocation among multiple parties and show that (near) optimal allocations can be realized while considering the parties' privacy concerns. We first attempt to solve the multi-party resource sharing problem by constructing a single model that is available to all parties. We propose an equivalent data-private model that meets the parties' data privacy requirements while ensuring optimal solutions for each party. We show that when the proposed model is solved, each party can only get its own optimal decisions and cannot observe others' solutions. We support our findings with a simulation study. The third and fourth chapters of this thesis focus on the problem from a different perspective in which we use a reformulation that can be used to distribute the problem among the involved parties. This decomposition lets us eliminate almost all the information-sharing requirements. In Chapter 3, together with the reformulated model, we benefit from a secure multi-party computation protocol that allows parties to disguise their shared information while attaining optimal allocation decisions. We conduct a simulation study on a planning problem and demonstrate our proposed algorithm in practice. We use the decomposition approach in Chapter 4 with a different privacy notion. We employ differential privacy as our privacy definition and design a differentially private algorithm for solving the multi-party resource sharing problem. Differential privacy brings in formal data privacy guarantees at the cost of deviating slightly from optimality. We provide bounds on this deviation and discuss the consequences of these theoretical results. We demonstrate the proposed algorithm on a planning problem and present insights about its efficiency.

    Contextual Bandits with Packing and Covering Constraints: A Modular Lagrangian Approach via Regression

    We consider contextual bandits with linear constraints (CBwLC), a variant of contextual bandits in which the algorithm consumes multiple resources subject to linear constraints on total consumption. This problem generalizes contextual bandits with knapsacks (CBwK), allowing for packing and covering constraints, as well as positive and negative resource consumption. We provide the first algorithm for CBwLC (or CBwK) that is based on regression oracles. The algorithm is simple, computationally efficient, and admits vanishing regret. It is statistically optimal for the variant of CBwK in which the algorithm must stop once some constraint is violated. Further, we provide the first vanishing-regret guarantees for CBwLC (or CBwK) that extend beyond the stochastic environment. We side-step strong impossibility results from prior work by identifying a weaker (and, arguably, fairer) benchmark to compare against. Our algorithm builds on LagrangeBwK (Immorlica et al., FOCS 2019), a Lagrangian-based technique for CBwK, and SquareCB (Foster and Rakhlin, ICML 2020), a regression-based technique for contextual bandits. Our analysis leverages the inherent modularity of both techniques.