49 research outputs found

    Learning to Bid in Repeated First-Price Auctions with Budgets

    Budget management strategies in repeated auctions have received growing attention in online advertising markets. However, previous work on budget management in online bidding mainly focused on second-price auctions. The rapid shift from second-price auctions to first-price auctions for online ads in recent years has motivated the challenging question of how to bid in repeated first-price auctions while controlling budgets. In this work, we study the problem of learning in repeated first-price auctions with budgets. We design a dual-based algorithm that can achieve a near-optimal $\widetilde{O}(\sqrt{T})$ regret with full information feedback where the maximum competing bid is always revealed after each auction. We further consider the setting with one-sided information feedback where only the winning bid is revealed after each auction. We show that our modified algorithm can still achieve an $\widetilde{O}(\sqrt{T})$ regret with mild assumptions on the bidder's value distribution. Finally, we complement the theoretical results with numerical experiments to confirm the effectiveness of our budget management policy.

    On the complexity of computing Markov perfect equilibrium in general-sum stochastic games

    Similar to the role of Markov decision processes in reinforcement learning, Markov games (also called stochastic games) lay down the foundation for the study of multi-agent reinforcement learning and sequential agent interactions. We introduce approximate Markov perfect equilibrium as a solution to the computational problem of finite-state stochastic games repeated in the infinite horizon and prove its PPAD-completeness. This solution concept preserves the Markov perfect property and opens up the possibility for the success of multi-agent reinforcement learning algorithms on static two-player games to be extended to multi-agent dynamic games, expanding the reign of the PPAD-complete class.

    On the Re-Solving Heuristic for (Binary) Contextual Bandits with Knapsacks

    In the problem of (binary) contextual bandits with knapsacks (CBwK), the agent receives an i.i.d. context in each of the $T$ rounds and chooses an action, resulting in a random reward and a random consumption of resources that are related to an i.i.d. external factor. The agent's goal is to maximize the accumulated reward under the initial resource constraints. In this work, we combine the re-solving heuristic, which proved successful in revenue management, with distribution estimation techniques to solve this problem. We consider two different information feedback models, with full and partial information, which vary in the difficulty of getting a sample of the external factor. Under both information feedback settings, we achieve two-way results: (1) For general problems, we show that our algorithm gets an $\widetilde O(T^{\alpha_u} + T^{\alpha_v} + T^{1/2})$ regret against the fluid benchmark. Here, $\alpha_u$ and $\alpha_v$ reflect the complexity of the context and external factor distributions, respectively. This result is comparable to existing results. (2) When the fluid problem is linear programming with a unique and non-degenerate optimal solution, our algorithm leads to an $\widetilde O(1)$ regret. To the best of our knowledge, this is the first $\widetilde O(1)$ regret result in the CBwK problem regardless of information feedback models. We further use numerical experiments to verify our results. Comment: 43 pages, 2 figures, 1 table

    Coordinated Dynamic Bidding in Repeated Second-Price Auctions with Budgets

    In online ad markets, a rising number of advertisers are employing bidding agencies to participate in ad auctions. These agencies specialize in designing online algorithms and bidding on behalf of their clients. Typically, an agency has information on multiple advertisers, so she can potentially coordinate bids to help her clients achieve higher utilities than those under independent bidding. In this paper, we study coordinated online bidding algorithms in repeated second-price auctions with budgets. We propose algorithms that guarantee every client a higher utility than the best she can get under independent bidding. We show that these algorithms achieve maximal coalition welfare and discuss bidders' incentives to misreport their budgets in symmetric cases. Our proofs combine the techniques of online learning and equilibrium analysis, overcoming the difficulty of competing with a multi-dimensional benchmark. The performance of our algorithms is further evaluated by experiments on both synthetic and real data. To the best of our knowledge, we are the first to consider bidder coordination in online repeated auctions with constraints. Comment: 43 pages, 12 figures

    Is Nash Equilibrium Approximator Learnable?

    In this paper, we investigate the learnability of the function approximator that approximates Nash equilibrium (NE) for games generated from a distribution. First, we offer a generalization bound using the Probably Approximately Correct (PAC) learning model. The bound describes the gap between the expected loss and empirical loss of the NE approximator. Afterward, we prove the agnostic PAC learnability of the Nash approximator. In addition to theoretical analysis, we demonstrate an application of NE approximator in experiments. The trained NE approximator can be used to warm-start and accelerate classical NE solvers. Together, our results show the practicability of approximating NE through function approximation. Comment: Accepted by AAMAS 202

    Formal Analysis and Systematic Construction of Two-factor Authentication Scheme

    One of the most commonly used two-factor authentication mechanisms is based on smart card and user's password. Throughout the years, there have been many schemes proposed, but most of them have already been found flawed due to the lack of formal security analysis. On the cryptanalysis of this type of schemes, in this paper, we further review two recently proposed schemes and show that their security claims are invalid. To address the current issue, we propose a new and simplified property set and a formal adversarial model for analyzing the security of this type of schemes. We believe that the property set and the adversarial model themselves are of independent interest. We then propose a new scheme and a generic construction framework. In particular, we show that a secure password based key exchange protocol can be transformed efficiently to a smartcard and password based two-factor authentication scheme provided that there exist pseudorandom functions and collision-resistant hash functions.

    Dynamic Budget Throttling in Repeated Second-Price Auctions

    Throttling is one of the most popular budget control methods in today's online advertising markets. When a budget-constrained advertiser employs throttling, she can choose whether or not to participate in an auction after the advertising platform recommends a bid. This paper focuses on the dynamic budget throttling process in repeated second-price auctions from a theoretical view. An essential feature of the underlying problem is that the advertiser does not know the distribution of the highest competing bid upon entering the market. To model the difficulty of eliminating such uncertainty, we consider two different information structures. The advertiser could obtain the highest competing bid in each round with full-information feedback. Meanwhile, with partial information feedback, the advertiser could only have access to the highest competing bid in the auctions she participates in. We propose the OGD-CB algorithm, which involves simultaneous distribution learning and revenue optimization. In both settings, we demonstrate that this algorithm guarantees an $O(\sqrt{T\log T})$ regret with probability $1 - O(1/T)$ relative to the fluid adaptive throttling benchmark. By proving a lower bound of $\Omega(\sqrt{T})$ on the minimal regret for even the hindsight optimum, we establish the near optimality of our algorithm. Finally, we compare the fluid optimum of throttling to that of pacing, another widely adopted budget control method. The numerical relationship of these benchmarks sheds new light on the understanding of different online algorithms for revenue maximization under budget constraints. Comment: 29 pages, 1 table

    Budget-Constrained Auctions with Unassured Priors

    In today's online advertising markets, it is common for an advertiser to set a long-period budget. Correspondingly, advertising platforms adopt budget control methods to ensure that any advertiser's payment is within her budget. Most budget control methods rely on value distributions of advertisers. However, due to the complex environment advertisers stand in and privacy issues, the platform hardly learns their true priors. Therefore, it is essential to understand how budget control auction mechanisms perform under unassured priors. This paper gives a two-fold answer. First, we propose a bid-discount method barely studied in the literature. We show that such a method exhibits desirable properties in revenue-maximizing and computation when fitting into first-price auction. Second, we compare this mechanism with another four in the prior manipulation model, where an advertiser can arbitrarily report a value distribution to the platform. These four mechanisms include the optimal mechanism satisfying budget-constrained IC, first-price/second-price mechanisms with the widely-studied pacing method, and an application of bid-discount in second-price mechanism. We consider three settings under the model, depending on whether the reported priors are fixed and advertisers are symmetric or not. We show that under all three cases, the bid-discount first-price auction we introduce dominates the other four mechanisms concerning the platform's revenue. For the advertisers' side, we show a surprising strategic-equivalence result between this mechanism and the optimal auction. Extensive revenue dominance and strategic relationships among these mechanisms are also revealed. Based on these findings, we provide a thorough understanding of prior dependency in repeated auctions with budgets. The bid-discount first-price auction itself may also be of further independent research interest. Comment: 47 pages, 2 figures, 1 table. In review

    Synthesis of graphene and graphene nanostructures by ion implantation and pulsed laser annealing

    In this paper, we report a systematic study that shows how the numerous processing parameters associated with ion implantation (II) and pulsed laser annealing (PLA) can be manipulated to control the quantity and quality of graphene (G), few-layer graphene (FLG), and other carbon nanostructures selectively synthesized in crystalline SiC (c-SiC). Controlled implantations of Si⁻ plus C⁻ and Au⁺ ions in c-SiC showed that both the thickness of the amorphous layer formed by ion damage and the doping effect of the implanted Au enhance the formation of G and FLG during PLA. The relative contributions of the amorphous and doping effects were studied separately, and thermal simulation calculations were used to estimate surface temperatures and to help understand the phase changes occurring during PLA. In addition to the amorphous layer thickness and catalytic doping effects, other enhancement effects were found to depend on other ion species, the annealing environment, PLA fluence and number of pulses, and even laser frequency. Optimum II and PLA conditions are identified and possible mechanisms for selective synthesis of G, FLG, and carbon nanostructures are discussed.