49 research outputs found
Learning to Bid in Repeated First-Price Auctions with Budgets
Budget management strategies in repeated auctions have received growing
attention in online advertising markets. However, previous work on budget
management in online bidding mainly focused on second-price auctions. The rapid
shift from second-price auctions to first-price auctions for online ads in
recent years has motivated the challenging question of how to bid in repeated
first-price auctions while controlling budgets.
In this work, we study the problem of learning in repeated first-price
auctions with budgets. We design a dual-based algorithm that can achieve a
near-optimal regret with full information feedback
where the maximum competing bid is always revealed after each auction. We
further consider the setting with one-sided information feedback where only the
winning bid is revealed after each auction. We show that our modified algorithm
can still achieve an regret with mild assumptions on
the bidder's value distribution. Finally, we complement the theoretical results
with numerical experiments to confirm the effectiveness of our budget
management policy.
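
The dual-based idea in the abstract above can be illustrated with a minimal sketch. This is a generic dual pacing loop under full-information feedback, not the authors' algorithm: the function name `dual_pacing_bids`, the bid grid, the step size, and the assumption that values and competing bids lie in [0, 1] are all hypothetical choices made for illustration.

```python
import bisect
import random


def dual_pacing_bids(values, competing_bids, budget, step=0.05, grid_size=51):
    """Generic dual-based pacing sketch; returns (total_utility, total_spend).

    Assumes values and competing bids lie in [0, 1] and that the maximum
    competing bid is revealed after every round (full-information feedback).
    """
    horizon = len(values)
    target = budget / horizon      # per-round spend target
    lam = 0.0                      # dual multiplier on the budget constraint
    remaining = budget
    history = []                   # sorted competing bids observed so far
    utility = 0.0
    grid = [i / (grid_size - 1) for i in range(grid_size)]

    for v, d in zip(values, competing_bids):
        # Pick the grid bid maximizing the estimated Lagrangian payoff
        # (v - (1 + lam) * b) * P(competing bid <= b).
        best_bid, best_score = 0.0, 0.0
        for b in grid:
            if b > remaining:      # never bid more than the remaining budget
                break
            if history:
                win_prob = bisect.bisect_right(history, b) / len(history)
            else:
                win_prob = 1.0     # no data yet: optimistic placeholder
            score = (v - (1 + lam) * b) * win_prob
            if score > best_score:
                best_bid, best_score = b, score

        won = best_bid >= d        # ties are resolved in the bidder's favor
        spend = best_bid if won else 0.0   # first price: pay your own bid
        if won:
            utility += v - spend
        remaining -= spend

        # Projected subgradient step: raise lam after overspending the target.
        lam = max(0.0, lam - step * (target - spend))
        bisect.insort(history, d)  # full information: d is always observed

    return utility, budget - remaining


if __name__ == "__main__":
    random.seed(0)
    vals = [random.random() for _ in range(1000)]
    bids = [random.random() for _ in range(1000)]
    print(dual_pacing_bids(vals, bids, budget=100.0))
```

The multiplier acts as a price on spending: a bid is only considered when the value exceeds the bid inflated by the factor (1 + lam), so overspending in earlier rounds makes later bids more conservative.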
On the complexity of computing Markov perfect equilibrium in general-sum stochastic games
Similar to the role of Markov decision processes in reinforcement learning, Markov games (also called stochastic games) lay down the foundation for the study of multi-agent reinforcement learning and sequential agent interactions. We introduce approximate Markov perfect equilibrium as a solution to the computational problem of finite-state stochastic games repeated in the infinite horizon and prove its PPAD-completeness. This solution concept preserves the Markov perfect property and opens up the possibility for the success of multi-agent reinforcement learning algorithms on static two-player games to be extended to multi-agent dynamic games, expanding the reign of the PPAD-complete class.
On the Re-Solving Heuristic for (Binary) Contextual Bandits with Knapsacks
In the problem of (binary) contextual bandits with knapsacks (CBwK), the
agent receives an i.i.d. context in each of the rounds and chooses an
action, resulting in a random reward and a random consumption of resources that
are related to an i.i.d. external factor. The agent's goal is to maximize the
accumulated reward under the initial resource constraints. In this work, we
combine the re-solving heuristic, which proved successful in revenue
management, with distribution estimation techniques to solve this problem. We
consider two different information feedback models, with full and partial
information, which vary in the difficulty of getting a sample of the external
factor. Under both information feedback settings, we achieve two-way results:
(1) For general problems, we show that our algorithm gets an regret against the fluid benchmark.
Here, and reflect the complexity of the context and
external factor distributions, respectively. This result is comparable to
existing results. (2) When the fluid problem is a linear program with a
unique and non-degenerate optimal solution, our algorithm leads to an
regret. To the best of our knowledge, this is the first
regret result in the CBwK problem regardless of information
feedback models. We further use numerical experiments to verify our results.
Comment: 43 pages, 2 figures, 1 table
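
The re-solving heuristic named in the abstract above can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes a single resource, a known finite context distribution (so no distribution estimation), deterministic reward and cost per context, and binary accept/skip actions. The names `fluid_accept_probs` and `run_resolving` are hypothetical.

```python
import random


def fluid_accept_probs(contexts, rate):
    """Solve the one-resource fluid LP greedily (fractional knapsack).

    contexts: list of (probability, reward, cost) tuples with cost > 0
              and reward >= 0.
    rate:     average amount of resource available per remaining round.
    Returns one acceptance probability per context.
    """
    order = sorted(
        range(len(contexts)),
        key=lambda i: contexts[i][1] / contexts[i][2],
        reverse=True,
    )
    probs = [0.0] * len(contexts)
    left = rate
    for i in order:
        p, _, cost = contexts[i]
        need = p * cost            # expected per-round use if always accepted
        if need <= left:
            probs[i] = 1.0
            left -= need
        else:
            probs[i] = left / need  # accept only a fraction of the time
            break
    return probs


def run_resolving(contexts, budget, horizon, rng):
    """Re-solve the fluid LP every round with the remaining budget and time."""
    remaining, reward = budget, 0.0
    weights = [c[0] for c in contexts]
    for t in range(horizon):
        rate = remaining / (horizon - t)   # updated per-round budget rate
        probs = fluid_accept_probs(contexts, rate)
        i = rng.choices(range(len(contexts)), weights=weights)[0]
        _, r, cost = contexts[i]
        if cost <= remaining and rng.random() < probs[i]:
            remaining -= cost
            reward += r
    return reward, budget - remaining


if __name__ == "__main__":
    demo_contexts = [(0.5, 1.0, 1.0), (0.5, 0.4, 1.0)]
    print(run_resolving(demo_contexts, budget=75.0, horizon=100,
                        rng=random.Random(0)))
```

Re-solving with the remaining budget divided by the remaining rounds lets the policy correct for earlier random deviations in consumption, rather than committing to the plan computed at the start.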
Coordinated Dynamic Bidding in Repeated Second-Price Auctions with Budgets
In online ad markets, a rising number of advertisers are employing bidding
agencies to participate in ad auctions. These agencies are specialized in
designing online algorithms and bidding on behalf of their clients. Typically,
an agency has information on multiple advertisers, so she can
potentially coordinate bids to help her clients achieve higher utilities than
those under independent bidding.
In this paper, we study coordinated online bidding algorithms in repeated
second-price auctions with budgets. We propose algorithms that guarantee every
client a higher utility than the best she can get under independent bidding. We
show that these algorithms achieve maximal coalition welfare and discuss
bidders' incentives to misreport their budgets, in symmetric cases. Our proofs
combine the techniques of online learning and equilibrium analysis, overcoming
the difficulty of competing with a multi-dimensional benchmark. The performance
of our algorithms is further evaluated by experiments on both synthetic and
real data. To the best of our knowledge, we are the first to consider bidder
coordination in online repeated auctions with constraints.
Comment: 43 pages, 12 figures
Is Nash Equilibrium Approximator Learnable?
In this paper, we investigate the learnability of the function approximator
that approximates Nash equilibrium (NE) for games generated from a
distribution. First, we offer a generalization bound using the Probably
Approximately Correct (PAC) learning model. The bound describes the gap between
the expected loss and empirical loss of the NE approximator. Afterward, we
prove the agnostic PAC learnability of the Nash approximator. In addition to
theoretical analysis, we demonstrate an application of NE approximator in
experiments. The trained NE approximator can be used to warm-start and
accelerate classical NE solvers. Together, our results show the practicability
of approximating NE through function approximation.
Comment: Accepted by AAMAS 202
Formal Analysis and Systematic Construction of Two-factor Authentication Scheme
One of the most commonly used two-factor authentication mechanisms is
based on smart card and user's password. Throughout the years, there
have been many schemes proposed, but most of them have already been
found flawed due to the lack of formal security analysis. On the
cryptanalysis of this type of schemes, in this paper, we further
review two recently proposed schemes and show that their security
claims are invalid. To address the current issue, we propose a new
and simplified property set and a formal adversarial model for
analyzing the security of this type of schemes. We believe that the
property set and the adversarial model themselves are of independent
interest.
We then propose a new scheme and a generic construction framework. In
particular, we show that a secure password based key exchange
protocol can be transformed efficiently to a smartcard and password
based two-factor authentication scheme provided that there exist
pseudorandom functions and collision-resistant hash functions.
Dynamic Budget Throttling in Repeated Second-Price Auctions
Throttling is one of the most popular budget control methods in today's
online advertising markets. When a budget-constrained advertiser employs
throttling, she can choose whether or not to participate in an auction after
the advertising platform recommends a bid. This paper focuses on the dynamic
budget throttling process in repeated second-price auctions from a theoretical
view. An essential feature of the underlying problem is that the advertiser
does not know the distribution of the highest competing bid upon entering the
market. To model the difficulty of eliminating such uncertainty, we consider
two different information structures. The advertiser could obtain the highest
competing bid in each round with full-information feedback. Meanwhile, with
partial information feedback, the advertiser could only have access to the
highest competing bid in the auctions she participates in. We propose the
OGD-CB algorithm, which involves simultaneous distribution learning and revenue
optimization. In both settings, we demonstrate that this algorithm guarantees
an regret with probability relative to the
fluid adaptive throttling benchmark. By proving a lower bound of
on the minimal regret for even the hindsight optimum, we
establish the near optimality of our algorithm. Finally, we compare the fluid
optimum of throttling to that of pacing, another widely adopted budget control
method. The numerical relationship of these benchmarks sheds new light on the
understanding of different online algorithms for revenue maximization under
budget constraints.
Comment: 29 pages, 1 table
Budget-Constrained Auctions with Unassured Priors
In today's online advertising markets, it is common for an advertiser to set
a long-period budget. Correspondingly, advertising platforms adopt budget
control methods to ensure that any advertiser's payment is within her budget.
Most budget control methods rely on value distributions of advertisers.
However, due to the complex environment advertisers stand in and privacy
issues, the platform hardly learns their true priors. Therefore, it is
essential to understand how budget control auction mechanisms perform under
unassured priors.
This paper gives a two-fold answer. First, we propose a bid-discount method
barely studied in the literature. We show that such a method exhibits desirable
properties in revenue maximization and computation when applied to first-price
auctions. Second, we compare this mechanism with four others in the prior
manipulation model, where an advertiser can arbitrarily report a value
distribution to the platform. These four mechanisms include the optimal
mechanism satisfying budget-constrained IC, first-price/second-price mechanisms
with the widely-studied pacing method, and an application of bid-discount in
second-price mechanism. We consider three settings under the model, depending
on whether the reported priors are fixed and advertisers are symmetric or not.
We show that under all three cases, the bid-discount first-price auction we
introduce dominates the other four mechanisms concerning the platform's
revenue. For the advertisers' side, we show a surprising strategic-equivalence
result between this mechanism and the optimal auction. Extensive revenue
dominance and strategic relationships among these mechanisms are also revealed.
Based on these findings, we provide a thorough understanding of prior
dependency in repeated auctions with budgets. The bid-discount first-price
auction itself may also be of further independent research interest.
Comment: 47 pages, 2 figures, 1 table. In review
Synthesis of graphene and graphene nanostructures by ion implantation and pulsed laser annealing
In this paper, we report a systematic study that shows how the numerous processing parameters associated with ion implantation (II) and pulsed laser annealing (PLA) can be manipulated to control the quantity and quality of graphene (G), few-layer graphene (FLG), and other carbon nanostructures selectively synthesized in crystalline SiC (c-SiC). Controlled implantations of Si− plus C− and Au+ ions in c-SiC showed that both the thickness of the amorphous layer formed by ion damage and the doping effect of the implanted Au enhance the formation of G and FLG during PLA. The relative contributions of the amorphous and doping effects were studied separately, and thermal simulation calculations were used to estimate surface temperatures and to help understand the phase changes occurring during PLA. In addition to the amorphous layer thickness and catalytic doping effects, other enhancement effects were found to depend on other ion species, the annealing environment, PLA fluence and number of pulses, and even laser frequency. Optimum II and PLA conditions are identified and possible mechanisms for selective synthesis of G, FLG, and carbon nanostructures are discussed.