On the Inducibility of Stackelberg Equilibrium for Security Games
Strong Stackelberg equilibrium (SSE) is the standard solution concept of
Stackelberg security games. As opposed to the weak Stackelberg equilibrium
(WSE), the SSE assumes that the follower breaks ties in favor of the leader and
this is widely acknowledged and justified by the assertion that the defender
can often induce the attacker to choose a preferred action by making an
infinitesimal adjustment to her strategy. Unfortunately, in security games with
resource assignment constraints, the assertion might not be valid; it is
possible that the defender cannot induce the desired outcome. As a result, many
results claimed in the literature may be overly optimistic. To remedy this, we first
formally define the utility guarantee of a defender strategy and provide
examples to show that the utility of SSE can be higher than its utility
guarantee. Second, inspired by the analysis of leader's payoff by Von Stengel
and Zamir (2004), we provide the solution concept called the inducible
Stackelberg equilibrium (ISE), which has the highest utility guarantee and
always exists. Third, we give conditions under which the ISE coincides with the
SSE and show that, in the general case, the SSE can be far worse with respect to
the utility guarantee. Moreover, introducing the ISE does not invalidate existing
algorithmic results as the problem of computing an ISE polynomially reduces to
that of computing an SSE. We also provide an algorithmic implementation for
computing ISE, with which our experiments unveil the empirical advantage of the
ISE over the SSE. Comment: The Thirty-Third AAAI Conference on Artificial Intelligence
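The tie-breaking assumption that separates the SSE from the WSE can be illustrated with a minimal sketch. It evaluates one fixed defender coverage vector in a hypothetical two-target game (all payoffs below are made up for illustration) and reports the defender's utility when the attacker breaks ties in her favor and when the attacker breaks ties against her. It is not the paper's ISE algorithm, and it does not model resource assignment constraints or the paper's formal utility guarantee.

```python
from fractions import Fraction


def attacker_best_responses(coverage, att_cov, att_unc):
    """Return every target that maximizes the attacker's expected utility."""
    utils = [c * a_c + (1 - c) * a_u
             for c, a_c, a_u in zip(coverage, att_cov, att_unc)]
    best = max(utils)
    return [t for t, u in enumerate(utils) if u == best]


def defender_utility(coverage, def_cov, def_unc, target):
    """Defender's expected utility when the given target is attacked."""
    c = coverage[target]
    return c * def_cov[target] + (1 - c) * def_unc[target]


def strong_and_weak_values(coverage, att_cov, att_unc, def_cov, def_unc):
    """Defender utility under favorable and adversarial tie-breaking."""
    ties = attacker_best_responses(coverage, att_cov, att_unc)
    values = [defender_utility(coverage, def_cov, def_unc, t) for t in ties]
    return max(values), min(values)


# Hypothetical game: exact fractions keep the attacker's tie exact.
coverage = [Fraction(1, 2), Fraction(1, 2)]
att_cov, att_unc = [0, 0], [1, 1]      # attacker payoffs (covered, uncovered)
def_cov, def_unc = [0, 0], [-1, -2]    # defender payoffs (covered, uncovered)

strong, weak = strong_and_weak_values(coverage, att_cov, att_unc,
                                      def_cov, def_unc)
print(strong, weak)  # -1/2 -1
```

Here the attacker is indifferent between the two targets, so the defender's utility is -1/2 under favorable tie-breaking but only -1 if the attacker chooses the other target. The abstract's point is that, under resource assignment constraints, a small adjustment of the coverage may not be enough to break such a tie in the defender's favor.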
Elixir: Train a Large Language Model on a Small GPU Cluster
In recent years, the number of parameters in a single deep learning (DL) model
has grown much faster than GPU memory capacity. Practitioners without access to
a large number of GPUs resort to heterogeneous training systems that store
model parameters in CPU memory. Existing heterogeneous systems are
based on parallelization plans in the scope of the whole model. They apply a
single parallel training method to all the operators in the computation.
Therefore, engineers must expend considerable effort to incorporate a new type
of model parallelism and make it compatible with other forms of parallelism. For
example, Mixture-of-Experts (MoE) is still incompatible with ZeRO-3 in
DeepSpeed. Also, current systems face efficiency problems at small scale, since
they are designed and tuned for large-scale training. In this paper, we propose
Elixir, a new parallel heterogeneous training system, which is designed for
efficiency and flexibility. Elixir utilizes memory resources and computing
resources of both GPU and CPU. For flexibility, Elixir generates
parallelization plans at the granularity of operators. Any new type of model
parallelism can be incorporated by assigning a parallel pattern to the
operator. For efficiency, Elixir implements a hierarchical distributed memory
management scheme to accelerate inter-GPU communications and CPU-GPU data
transmissions. As a result, Elixir can train a 30B OPT model on an A100 with
40GB CUDA memory, while reaching 84% of the efficiency of PyTorch GPU training.
With its super-linear scalability, the training efficiency matches that of
PyTorch GPU training on multiple GPUs. Also, large MoE models can be trained
5.3x faster than dense models of the same size. Elixir is now integrated into
ColossalAI and is available on its main branch.
Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction
User response prediction, which models the user preference w.r.t. the
presented items, plays a key role in online services. After two decades of
rapid development, the accumulated user behavior sequences on mature Internet
service platforms have become extremely long, dating back to each user's first
registration. Each user not only has intrinsic tastes, but also keeps changing
her personal interests over her lifetime. Hence, it is challenging to handle such
lifelong sequential modeling for each individual user. Existing methodologies
for sequential modeling are only capable of dealing with relatively recent user
behaviors, which leaves ample room for modeling long-term, and especially
lifelong, sequential patterns to facilitate user modeling. Moreover, a user's
behavior may be accounted for by various previous behaviors within her whole
online activity history, i.e., long-term dependency with multi-scale sequential
patterns. To tackle these challenges, in this paper we propose a
Hierarchical Periodic Memory Network for lifelong sequential modeling with
personalized memorization of sequential patterns for each user. The model also
adopts a hierarchical and periodic updating mechanism to capture multi-scale
sequential patterns of user interests while supporting the evolving user
behavior logs. The experimental results over three large-scale real-world
datasets have demonstrated the advantages of our proposed model with
significant improvement in user response prediction performance over
state-of-the-art methods. Comment: SIGIR 2019. Reproducible code and datasets:
https://github.com/alimamarankgroup/HPM
Synthesis of a Novel Ce-bpdc for the Effective Removal of Fluoride from Aqueous Solution
Ce-1,1′-biphenyl-4,4′-dicarboxylic acid (Ce-bpdc), a novel type of metal organic framework, was synthesized and applied to remove excessive fluoride from water. The structure and morphology of Ce-bpdc were characterized by X-ray diffraction, scanning electron microscopy, Fourier transform infrared spectroscopy, and X-ray photoelectron spectroscopy. The saturated adsorption capacity and the effects of HCO3- and pH were investigated. The optimal pH for fluoride adsorption was in the range from 5 to 6. Coexisting bicarbonate anions had little influence on fluoride removal. Fluoride adsorption over the Ce-bpdc adsorbent reached equilibrium in about 20 min. The Ce-bpdc coordination complex exhibited a high binding capacity for fluoride ions. The maximum adsorption capacity calculated from the Langmuir model was as high as 45.5 mg/g at 298 K (pH = 7.0), and the removal efficiency was greater than 80%. To investigate the mechanism of fluoride removal, adsorption isotherm models such as Langmuir and Freundlich were fitted to the experimental data, and the Langmuir isotherm gave the more satisfactory fit. Finally, tests on ground water samples from three places, Yuefang, Jiangji, and Sanyi, showed high removal efficiency and demonstrate the potential utility of Ce-bpdc as an effective adsorbent.
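For readers unfamiliar with the Langmuir model mentioned above, the following is a minimal sketch of how a maximum adsorption capacity is typically estimated from equilibrium data. It uses the standard Langmuir form q_e = q_max K_L C_e / (1 + K_L C_e) and its linearization C_e/q_e = C_e/q_max + 1/(K_L q_max). The data points are synthetic, generated from the model itself. Only the value 45.5 mg/g comes from the abstract. The constant K_L = 0.2 L/mg and the concentration grid are assumptions chosen for illustration, and this is not the authors' fitting procedure.

```python
def langmuir(c_e, q_max, k_l):
    """Langmuir uptake q_e (mg/g) at equilibrium concentration c_e (mg/L)."""
    return q_max * k_l * c_e / (1.0 + k_l * c_e)


def fit_langmuir_linearized(c_values, q_values):
    """Least-squares fit of C_e/q_e against C_e; returns (q_max, k_l)."""
    xs = list(c_values)
    ys = [c / q for c, q in zip(c_values, q_values)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals 1 / q_max
    intercept = mean_y - slope * mean_x  # equals 1 / (k_l * q_max)
    return 1.0 / slope, slope / intercept


Q_MAX = 45.5  # mg/g, the capacity reported in the abstract
K_L = 0.2     # L/mg, hypothetical value for this sketch only

# Synthetic, noise-free equilibrium data generated from the model.
concentrations = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
uptakes = [langmuir(c, Q_MAX, K_L) for c in concentrations]

q_max_fit, k_l_fit = fit_langmuir_linearized(concentrations, uptakes)
print(f"q_max = {q_max_fit:.2f} mg/g, K_L = {k_l_fit:.3f} L/mg")
```

With real measurements the points would scatter around the line, and the quality of this fit compared with a Freundlich fit (q_e = K_F C_e^(1/n)) is what indicates which isotherm describes the data better.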