Keyword Targeting Optimization in Sponsored Search Advertising: Combining Selection and Matching
In sponsored search advertising (SSA), advertisers need to select keywords
and determine matching types for selected keywords simultaneously, i.e.,
keyword targeting. An optimal keyword targeting strategy guarantees reaching
the right population effectively. This paper aims to address the keyword
targeting problem, which is a challenging task because of the incomplete
information of historical advertising performance indices and the high
uncertainty in SSA environments. First, we construct a data distribution
estimation model and apply a Markov Chain Monte Carlo method to make inference
about unobserved indices (i.e., impression and click-through rate) over three
keyword matching types (i.e., broad, phrase and exact). Second, we formulate a
stochastic keyword targeting model (BB-KSM) combining operations of keyword
selection and keyword matching to maximize the expected profit under the chance
constraint of the budget, and develop a branch-and-bound algorithm
incorporating a stochastic simulation process for our keyword targeting model.
Finally, based on a real-world dataset collected from field reports and logs of
past SSA campaigns, computational experiments are conducted to evaluate the
performance of our keyword targeting strategy. Experimental results show that:
(a) BB-KSM outperforms seven baselines in terms of profit; (b) BB-KSM shows its
superiority as the budget increases, especially in situations with more
keywords and keyword combinations; (c) the proposed data distribution
estimation approach effectively addresses the problem of incomplete
performance indices over the three matching types and in turn significantly
improves keyword targeting decisions. This research makes important
contributions to the SSA literature, and the results offer critical insights
into keyword management for SSA advertisers.
Comment: 38 pages, 4 figures, 5 tables
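The core of the model above is maximizing expected profit subject to a chance constraint on the budget, evaluated by stochastic simulation. The paper's exact distributions and algorithm are not given here, so the sketch below is a hypothetical simplification: it Monte Carlo-samples impressions per keyword (Gaussian, by assumption), rejects an assignment whose probability of exceeding the budget is above alpha, and otherwise returns its estimated expected profit — the quantity a branch-and-bound search would compare across candidate keyword/match-type assignments.

```python
import random

def feasible_expected_profit(candidates, budget, alpha=0.05, n_sim=5000, seed=0):
    """Monte Carlo evaluation of one keyword/match-type assignment.

    Returns the estimated expected profit, or None when the budget
    chance constraint P(cost > budget) <= alpha is violated.
    `candidates` is a list of dicts with the (assumed) estimated
    performance indices for the chosen matching type of each keyword.
    """
    rng = random.Random(seed)
    profits, violations = [], 0
    for _ in range(n_sim):
        profit = cost = 0.0
        for kw in candidates:
            # Draw impressions from the estimated distribution, then
            # derive clicks via the click-through rate.
            imps = rng.gauss(kw["mean_imps"], kw["sd_imps"])
            clicks = max(imps, 0.0) * kw["ctr"]
            cost += clicks * kw["cpc"]
            profit += clicks * kw["value_per_click"] - clicks * kw["cpc"]
        if cost > budget:
            violations += 1
        profits.append(profit)
    if violations / n_sim > alpha:
        return None  # chance constraint on the budget is violated
    return sum(profits) / n_sim
```

A branch-and-bound solver would call such an evaluator at each node, pruning assignments that are infeasible or whose profit bound falls below the incumbent.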
DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation
In recent years, audio-driven 3D facial animation has gained significant
attention, particularly in applications such as virtual reality, gaming, and
video conferencing. However, accurately modeling the intricate and subtle
dynamics of facial expressions remains a challenge. Most existing studies
approach the facial animation task as a single regression problem, which often
fails to capture the intrinsic inter-modal relationship between speech signals
and 3D facial animation and overlooks their inherent consistency. Moreover, due
to the limited availability of 3D audio-visual datasets, approaches trained on
small samples generalize poorly, which degrades performance. To address these
issues, in this study, we propose a cross-modal
dual-learning framework, termed DualTalker, aiming at improving data usage
efficiency as well as relating cross-modal dependencies. The framework is
trained jointly with the primary task (audio-driven facial animation) and its
dual task (lip reading) and shares common audio/motion encoder components. Our
joint training framework facilitates more efficient data usage by leveraging
information from both tasks and explicitly capitalizing on the complementary
relationship between facial motion and audio to improve performance.
Furthermore, we introduce an auxiliary cross-modal consistency loss to mitigate
the potential over-smoothing underlying the cross-modal complementary
representations, enhancing the mapping of subtle facial expression dynamics.
Through extensive experiments and a perceptual user study conducted on the VOCA
and BIWI datasets, we demonstrate that our approach outperforms current
state-of-the-art methods both qualitatively and quantitatively. We have made
our code and video demonstrations available at
https://github.com/sabrina-su/iadf.git
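The joint objective described above combines the primary task, its dual task, and a cross-modal consistency term. The weights and the exact form of each term are not specified in the abstract, so the following is only an illustrative sketch of how such a combined loss could be assembled, with the embeddings represented as plain lists:

```python
def dual_learning_loss(audio_emb, motion_emb, anim_loss, lip_loss,
                       w_dual=0.5, w_cons=0.1):
    """Illustrative dual-learning objective: primary animation loss,
    dual lip-reading loss, and a cross-modal consistency penalty
    between the shared audio/motion embeddings.

    The weights w_dual and w_cons are assumptions, not values from
    the paper.
    """
    # Mean squared distance between the two modality embeddings
    # stands in for the cross-modal consistency term.
    cons = sum((a - m) ** 2 for a, m in zip(audio_emb, motion_emb)) / len(audio_emb)
    return anim_loss + w_dual * lip_loss + w_cons * cons
```

In training, both tasks would backpropagate through the shared encoders, so gradients from lip reading also shape the representation used for animation.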
MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices
The deployment of large-scale text-to-image diffusion models on mobile
devices is impeded by their substantial model size and slow inference speed. In
this paper, we propose \textbf{MobileDiffusion}, a highly efficient
text-to-image diffusion model obtained through extensive optimizations in both
architecture and sampling techniques. We conduct a comprehensive examination of
model architecture design to reduce redundancy, enhance computational
efficiency, and minimize the model's parameter count, while preserving image
generation quality. Additionally, we employ distillation and diffusion-GAN
finetuning techniques on MobileDiffusion to achieve 8-step and 1-step inference
respectively. Empirical studies, conducted both quantitatively and
qualitatively, demonstrate the effectiveness of our proposed techniques.
MobileDiffusion achieves a remarkable \textbf{sub-second} inference speed for
generating an image on mobile devices, establishing a new state
of the art.
A Chebyshev Confidence Guided Source-Free Domain Adaptation Framework for Medical Image Segmentation
Source-free domain adaptation (SFDA) aims to adapt models trained on a
labeled source domain to an unlabeled target domain without the access to
source data. In medical imaging scenarios, the practical significance of SFDA
methods has been emphasized due to privacy concerns. Recent state-of-the-art
SFDA methods primarily rely on self-training based on pseudo-labels (PLs).
Unfortunately, PLs suffer from accuracy deterioration caused by domain shift,
and thus limit the effectiveness of the adaptation process. To address this
issue, we propose a Chebyshev confidence guided SFDA framework to accurately
assess the reliability of PLs and generate self-improving PLs for
self-training. The Chebyshev confidence is estimated by calculating a
probability lower bound of the PL confidence, given the prediction and the
corresponding uncertainty. Leveraging the Chebyshev confidence, we introduce two
confidence-guided denoising methods: direct denoising and prototypical
denoising. Additionally, we propose a novel teacher-student joint training
scheme (TJTS) that incorporates a confidence weighting module to improve PLs
iteratively. The TJTS, in collaboration with the denoising methods, effectively
prevents the propagation of noise and enhances the accuracy of PLs. Extensive
experiments in diverse domain scenarios validate the effectiveness of our
proposed framework and establish its superiority over state-of-the-art SFDA
methods. Our paper contributes to the field of SFDA by providing a novel
approach for precisely estimating the reliability of pseudo-labels and a
framework for obtaining high-quality PLs, resulting in improved adaptation
performance.
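The key quantity above is a probability lower bound derived from a prediction and its uncertainty. The paper's exact formulation is not reproduced in the abstract, so the sketch below uses the generic Chebyshev inequality P(|X - mu| >= k) <= sigma^2 / k^2 as an assumed stand-in: the confidence that a pixel's true probability stays on the predicted side of the decision threshold is bounded from below by one minus the variance over the squared margin.

```python
def chebyshev_confidence(pred_prob, uncertainty, threshold=0.5):
    """Lower bound, via Chebyshev's inequality, on the probability
    that the prediction stays on its side of `threshold`.

    `pred_prob` is the predicted foreground probability and
    `uncertainty` its estimated standard deviation; this generic
    formulation is an assumption, not the paper's exact one.
    """
    margin = abs(pred_prob - threshold)
    if margin == 0:
        return 0.0  # on the boundary, no confidence can be claimed
    return max(0.0, 1.0 - uncertainty ** 2 / margin ** 2)
```

Pseudo-labels with low Chebyshev confidence would then be down-weighted or denoised before self-training, which matches the role the confidence plays in the framework above.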
Estimating Brain Age with Global and Local Dependencies
Brain age has been shown to be a phenotype relevant to cognitive
performance and brain disease. Achieving accurate brain age prediction is an
essential prerequisite for optimizing the predicted brain-age difference as a
biomarker. As a comprehensive biological characteristic, brain age is hard
to estimate accurately with models that use feature engineering and local
processing, such as local convolution and recurrent operations that process one
local neighborhood at a time. Instead, Vision Transformers learn global
attentive interaction of patch tokens, introducing less inductive bias and
modeling long-range dependencies. Motivated by this, we propose a novel network
for brain age estimation with global and local dependencies, where
the corresponding representations are captured by Successive Permuted
Transformer (SPT) and convolution blocks. The SPT brings computational
efficiency and captures 3D spatial information indirectly by successively
encoding 2D
slices from different views. Finally, we collect a large cohort of 22645
subjects with ages ranging from 14 to 97, and our network performs best
among a series of deep learning methods, yielding a mean absolute error (MAE)
of 2.855 on the validation set and 2.911 on an independent test set.
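The SPT's central idea above is to encode 2D slices of a 3D volume taken from different views. The SPT architecture itself is not detailed in the abstract, so the sketch below shows only the (assumed) slicing step: decomposing a D x H x W volume, stored as nested lists, into slice sequences along the three anatomical views.

```python
def view_slices(volume):
    """Split a 3D volume (nested lists, shape D x H x W) into 2D
    slice sequences along three views, as a Successive Permuted
    Transformer could encode them.

    This is a simplified sketch of the slicing step only; the real
    model operates on tensors and interleaves transformer blocks.
    """
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    # Axial view: D slices of shape H x W (the native layout).
    axial = [volume[d] for d in range(D)]
    # Coronal view: H slices of shape D x W.
    coronal = [[volume[d][h] for d in range(D)] for h in range(H)]
    # Sagittal view: W slices of shape D x H.
    sagittal = [[[volume[d][h][w] for h in range(H)] for d in range(D)]
                for w in range(W)]
    return axial, coronal, sagittal
```

Encoding all three slice sequences lets a 2D model recover 3D spatial context without 3D convolutions, which is where the claimed efficiency comes from.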
Probabilistic Latent Factor Model for Collaborative Filtering with Bayesian Inference
The Latent Factor Model (LFM) is one of the most successful methods for
collaborative filtering (CF) in recommender systems, in which both users
and items are projected into a joint latent factor space. Based on matrix
factorization, commonly applied in pattern recognition, LFM models user-item
interactions as inner products of the user and item factor vectors in that space
and can be efficiently solved by least square methods with optimal estimation.
However, such optimal estimation methods are prone to overfitting due to the
extreme sparsity of user-item interactions. In this paper, we propose a
Bayesian treatment for LFM, named Bayesian Latent Factor Model (BLFM). Based on
observed user-item interactions, we build a probabilistic factor model in which
the regularization is introduced via placing prior constraint on latent
factors, and the likelihood function is established over observations and
parameters. Then we draw samples of latent factors from the posterior
approximated with Variational Inference (VI) to predict expected values. We
further make an extension to BLFM, called BLFMBias, incorporating
user-dependent and item-dependent biases into the model for enhancing
performance. Extensive experiments on the movie rating dataset show the
effectiveness of our proposed models compared with several strong baselines.
Comment: 8 pages, 5 figures, ICPR 2020 conference
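The BLFMBias extension above adds user- and item-dependent biases to the factor model's inner-product prediction. The paper's exact parameterization is not given in the abstract, so the sketch below shows the standard biased-factor prediction rule such a model would use once posterior means (or samples) of the parameters are available:

```python
def predict_rating(global_mean, user_bias, item_bias, user_vec, item_vec):
    """Biased latent-factor prediction: global mean plus user and
    item biases plus the inner product of the latent factor vectors.

    The specific parameterization is an assumption based on standard
    biased matrix factorization, not taken verbatim from the paper.
    """
    return global_mean + user_bias + item_bias + sum(
        u * v for u, v in zip(user_vec, item_vec))
```

Under the Bayesian treatment, each argument would be drawn from (or averaged over) its posterior, so the final prediction is an expectation over such evaluations rather than a single point estimate.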