134 research outputs found

    Keyword Targeting Optimization in Sponsored Search Advertising: Combining Selection and Matching

    Full text link
    In sponsored search advertising (SSA), advertisers need to select keywords and determine matching types for selected keywords simultaneously, i.e., keyword targeting. An optimal keyword targeting strategy guarantees reaching the right population effectively. This paper aims to address the keyword targeting problem, which is a challenging task because of the incomplete information of historical advertising performance indices and the high uncertainty in SSA environments. First, we construct a data distribution estimation model and apply a Markov Chain Monte Carlo method to make inference about unobserved indices (i.e., impression and click-through rate) over three keyword matching types (i.e., broad, phrase and exact). Second, we formulate a stochastic keyword targeting model (BB-KSM) combining operations of keyword selection and keyword matching to maximize the expected profit under the chance constraint of the budget, and develop a branch-and-bound algorithm incorporating a stochastic simulation process for our keyword targeting model. Finally, based on a realworld dataset collected from field reports and logs of past SSA campaigns, computational experiments are conducted to evaluate the performance of our keyword targeting strategy. Experimental results show that, (a) BB-KSM outperforms seven baselines in terms of profit; (b) BB-KSM shows its superiority as the budget increases, especially in situations with more keywords and keyword combinations; (c) the proposed data distribution estimation approach can effectively address the problem of incomplete performance indices over the three matching types and in turn significantly promotes the performance of keyword targeting decisions. This research makes important contributions to the SSA literature and the results offer critical insights into keyword management for SSA advertisers.Comment: 38 pages, 4 figures, 5 table

    DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation

    Full text link
    In recent years, audio-driven 3D facial animation has gained significant attention, particularly in applications such as virtual reality, gaming, and video conferencing. However, accurately modeling the intricate and subtle dynamics of facial expressions remains a challenge. Most existing studies approach the facial animation task as a single regression problem, which often fail to capture the intrinsic inter-modal relationship between speech signals and 3D facial animation and overlook their inherent consistency. Moreover, due to the limited availability of 3D-audio-visual datasets, approaches learning with small-size samples have poor generalizability that decreases the performance. To address these issues, in this study, we propose a cross-modal dual-learning framework, termed DualTalker, aiming at improving data usage efficiency as well as relating cross-modal dependencies. The framework is trained jointly with the primary task (audio-driven facial animation) and its dual task (lip reading) and shares common audio/motion encoder components. Our joint training framework facilitates more efficient data usage by leveraging information from both tasks and explicitly capitalizing on the complementary relationship between facial motion and audio to improve performance. Furthermore, we introduce an auxiliary cross-modal consistency loss to mitigate the potential over-smoothing underlying the cross-modal complementary representations, enhancing the mapping of subtle facial expression dynamics. Through extensive experiments and a perceptual user study conducted on the VOCA and BIWI datasets, we demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. We have made our code and video demonstrations available at https://github.com/sabrina-su/iadf.git

    MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices

    Full text link
    The deployment of large-scale text-to-image diffusion models on mobile devices is impeded by their substantial model size and slow inference speed. In this paper, we propose \textbf{MobileDiffusion}, a highly efficient text-to-image diffusion model obtained through extensive optimizations in both architecture and sampling techniques. We conduct a comprehensive examination of model architecture design to reduce redundancy, enhance computational efficiency, and minimize model's parameter count, while preserving image generation quality. Additionally, we employ distillation and diffusion-GAN finetuning techniques on MobileDiffusion to achieve 8-step and 1-step inference respectively. Empirical studies, conducted both quantitatively and qualitatively, demonstrate the effectiveness of our proposed techniques. MobileDiffusion achieves a remarkable \textbf{sub-second} inference speed for generating a 512×512512\times512 image on mobile devices, establishing a new state of the art

    A Chebyshev Confidence Guided Source-Free Domain Adaptation Framework for Medical Image Segmentation

    Full text link
    Source-free domain adaptation (SFDA) aims to adapt models trained on a labeled source domain to an unlabeled target domain without the access to source data. In medical imaging scenarios, the practical significance of SFDA methods has been emphasized due to privacy concerns. Recent State-of-the-art SFDA methods primarily rely on self-training based on pseudo-labels (PLs). Unfortunately, PLs suffer from accuracy deterioration caused by domain shift, and thus limit the effectiveness of the adaptation process. To address this issue, we propose a Chebyshev confidence guided SFDA framework to accurately assess the reliability of PLs and generate self-improving PLs for self-training. The Chebyshev confidence is estimated by calculating probability lower bound of the PL confidence, given the prediction and the corresponding uncertainty. Leveraging the Chebyshev confidence, we introduce two confidence-guided denoising methods: direct denoising and prototypical denoising. Additionally, we propose a novel teacher-student joint training scheme (TJTS) that incorporates a confidence weighting module to improve PLs iteratively. The TJTS, in collaboration with the denoising methods, effectively prevents the propagation of noise and enhances the accuracy of PLs. Extensive experiments in diverse domain scenarios validate the effectiveness of our proposed framework and establish its superiority over state-of-the-art SFDA methods. Our paper contributes to the field of SFDA by providing a novel approach for precisely estimating the reliability of pseudo-labels and a framework for obtaining high-quality PLs, resulting in improved adaptation performance

    Estimating Brain Age with Global and Local Dependencies

    Full text link
    The brain age has been proven to be a phenotype of relevance to cognitive performance and brain disease. Achieving accurate brain age prediction is an essential prerequisite for optimizing the predicted brain-age difference as a biomarker. As a comprehensive biological characteristic, the brain age is hard to be exploited accurately with models using feature engineering and local processing such as local convolution and recurrent operations that process one local neighborhood at a time. Instead, Vision Transformers learn global attentive interaction of patch tokens, introducing less inductive bias and modeling long-range dependencies. In terms of this, we proposed a novel network for learning brain age interpreting with global and local dependencies, where the corresponding representations are captured by Successive Permuted Transformer (SPT) and convolution blocks. The SPT brings computation efficiency and locates the 3D spatial information indirectly via continuously encoding 2D slices from different views. Finally, we collect a large cohort of 22645 subjects with ages ranging from 14 to 97 and our network performed the best among a series of deep learning methods, yielding a mean absolute error (MAE) of 2.855 in validation set, and 2.911 in an independent test set

    Probabilistic Latent Factor Model for Collaborative Filtering with Bayesian Inference

    Full text link
    Latent Factor Model (LFM) is one of the most successful methods for Collaborative filtering (CF) in the recommendation system, in which both users and items are projected into a joint latent factor space. Base on matrix factorization applied usually in pattern recognition, LFM models user-item interactions as inner products of factor vectors of user and item in that space and can be efficiently solved by least square methods with optimal estimation. However, such optimal estimation methods are prone to overfitting due to the extreme sparsity of user-item interactions. In this paper, we propose a Bayesian treatment for LFM, named Bayesian Latent Factor Model (BLFM). Based on observed user-item interactions, we build a probabilistic factor model in which the regularization is introduced via placing prior constraint on latent factors, and the likelihood function is established over observations and parameters. Then we draw samples of latent factors from the posterior distribution with Variational Inference (VI) to predict expected value. We further make an extension to BLFM, called BLFMBias, incorporating user-dependent and item-dependent biases into the model for enhancing performance. Extensive experiments on the movie rating dataset show the effectiveness of our proposed models by compared with several strong baselines.Comment: 8 pages, 5 figures, ICPR2020 conferenc
    • …
    corecore