48 research outputs found

    FINT: Field-aware INTeraction Neural Network For CTR Prediction

    Full text link
    As a critical component for online advertising and marking, click-through rate (CTR) prediction has draw lots of attentions from both industry and academia field. Recently, the deep learning has become the mainstream methodological choice for CTR. Despite of sustainable efforts have been made, existing approaches still pose several challenges. On the one hand, high-order interaction between the features is under-explored. On the other hand, high-order interactions may neglect the semantic information from the low-order fields. In this paper, we proposed a novel prediction method, named FINT, that employs the Field-aware INTeraction layer which captures high-order feature interactions while retaining the low-order field information. To empirically investigate the effectiveness and robustness of the FINT, we perform extensive experiments on the three realistic databases: KDD2012, Criteo and Avazu. The obtained results demonstrate that the FINT can significantly improve the performance compared to the existing methods, without increasing the amount of computation required. Moreover, the proposed method brought about 2.72\% increase to the advertising revenue of a big online video app through A/B testing. To better promote the research in CTR field, we will release our code as well as reference implementation of those baseline models in the final version.Comment: 5 pages, Submitted to CIKM 202

    Wound Segmentation with Dynamic Illumination Correction and Dual-view Semantic Fusion

    Full text link
    Wound image segmentation is a critical component for the clinical diagnosis and in-time treatment of wounds. Recently, deep learning has become the mainstream methodology for wound image segmentation. However, the pre-processing of the wound image, such as the illumination correction, is required before the training phase as the performance can be greatly improved. The correction procedure and the training of deep models are independent of each other, which leads to sub-optimal segmentation performance as the fixed illumination correction may not be suitable for all images. To address aforementioned issues, an end-to-end dual-view segmentation approach was proposed in this paper, by incorporating a learn-able illumination correction module into the deep segmentation models. The parameters of the module can be learned and updated during the training stage automatically, while the dual-view fusion can fully employ the features from both the raw images and the enhanced ones. To demonstrate the effectiveness and robustness of the proposed framework, the extensive experiments are conducted on the benchmark datasets. The encouraging results suggest that our framework can significantly improve the segmentation performance, compared to the state-of-the-art methods

    Trusted Multi-Scale Classification Framework for Whole Slide Image

    Full text link
    Despite remarkable efforts been made, the classification of gigapixels whole-slide image (WSI) is severely restrained from either the constrained computing resources for the whole slides, or limited utilizing of the knowledge from different scales. Moreover, most of the previous attempts lacked of the ability of uncertainty estimation. Generally, the pathologists often jointly analyze WSI from the different magnifications. If the pathologists are uncertain by using single magnification, then they will change the magnification repeatedly to discover various features of the tissues. Motivated by the diagnose process of the pathologists, in this paper, we propose a trusted multi-scale classification framework for the WSI. Leveraging the Vision Transformer as the backbone for multi branches, our framework can jointly classification modeling, estimating the uncertainty of each magnification of a microscope and integrate the evidence from different magnification. Moreover, to exploit discriminative patches from WSIs and reduce the requirement for computation resources, we propose a novel patch selection schema using attention rollout and non-maximum suppression. To empirically investigate the effectiveness of our approach, empirical experiments are conducted on our WSI classification tasks, using two benchmark databases. The obtained results suggest that the trusted framework can significantly improve the WSI classification performance compared with the state-of-the-art methods

    Dynamic Memory-based Curiosity: A Bootstrap Approach for Exploration

    Full text link
    The sparsity of extrinsic rewards poses a serious challenge for reinforcement learning (RL). Currently, many efforts have been made on curiosity which can provide a representative intrinsic reward for effective exploration. However, the challenge is still far from being solved. In this paper, we present a novel curiosity for RL, named DyMeCu, which stands for Dynamic Memory-based Curiosity. Inspired by human curiosity and information theory, DyMeCu consists of a dynamic memory and dual online learners. The curiosity arouses if memorized information can not deal with the current state, and the information gap between dual learners can be formulated as the intrinsic reward for agents, and then such state information can be consolidated into the dynamic memory. Compared with previous curiosity methods, DyMeCu can better mimic human curiosity with dynamic memory, and the memory module can be dynamically grown based on a bootstrap paradigm with dual learners. On multiple benchmarks including DeepMind Control Suite and Atari Suite, large-scale empirical experiments are conducted and the results demonstrate that DyMeCu outperforms competitive curiosity-based methods with or without extrinsic rewards. We will release the code to enhance reproducibility

    Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles

    Full text link
    Reinforcement learning from human feedback (RLHF) emerges as a promising paradigm for aligning large language models (LLMs). However, a notable challenge in RLHF is overoptimization, where beyond a certain threshold, the pursuit of higher rewards leads to a decline in human preferences. In this paper, we observe the weakness of KL regularization which is commonly employed in existing RLHF methods to address overoptimization. To mitigate this limitation, we scrutinize the RLHF objective in the offline dataset and propose uncertainty-penalized RLHF (UP-RLHF), which incorporates uncertainty regularization during RL-finetuning. To enhance the uncertainty quantification abilities for reward models, we first propose a diverse low-rank adaptation (LoRA) ensemble by maximizing the nuclear norm of LoRA matrix concatenations. Then we optimize policy models utilizing penalized rewards, determined by both rewards and uncertainties provided by the diverse reward LoRA ensembles. Our experimental results, based on two real human preference datasets, showcase the effectiveness of diverse reward LoRA ensembles in quantifying reward uncertainty. Additionally, uncertainty regularization in UP-RLHF proves to be pivotal in mitigating overoptimization, thereby contributing to the overall performance.Comment: 10 pages, 5 figures

    Optimistic Model Rollouts for Pessimistic Offline Policy Optimization

    Full text link
    Model-based offline reinforcement learning (RL) has made remarkable progress, offering a promising avenue for improving generalization with synthetic model rollouts. Existing works primarily focus on incorporating pessimism for policy optimization, usually via constructing a Pessimistic Markov Decision Process (P-MDP). However, the P-MDP discourages the policies from learning in out-of-distribution (OOD) regions beyond the support of offline datasets, which can under-utilize the generalization ability of dynamics models. In contrast, we propose constructing an Optimistic MDP (O-MDP). We initially observed the potential benefits of optimism brought by encouraging more OOD rollouts. Motivated by this observation, we present ORPO, a simple yet effective model-based offline RL framework. ORPO generates Optimistic model Rollouts for Pessimistic offline policy Optimization. Specifically, we train an optimistic rollout policy in the O-MDP to sample more OOD model rollouts. Then we relabel the sampled state-action pairs with penalized rewards and optimize the output policy in the P-MDP. Theoretically, we demonstrate that the performance of policies trained with ORPO can be lower-bounded in linear MDPs. Experimental results show that our framework significantly outperforms P-MDP baselines by a margin of 30%, achieving state-of-the-art performance on the widely-used benchmark. Moreover, ORPO exhibits notable advantages in problems that require generalization

    Trends in transport injuries burden and risk factors among children under 14 years old in China: 1990–2019

    Get PDF
    BackgroundTransport injuries (TI) remains one of leading causes of death in children in China. This study aimed to analyze the temporal trend of disease burden and associated risk factors of TI among children aged 0–14 years in China, utilizing data from 1990 to 2019.MethodsWe retrieved data of disease burden and risk factors of TI among children aged 0–14 year in China from 1990 to 2019 from the Global Burden of Disease (GBD) dataset. We estimated incidence rate, death rate, and disability adjusted life years (DALYs) rate with a 95% uncertainty interval (95% UI), stratified by age, sex, and all type-road users. Trends in disease burden with annual percentage changes (APC) and average annual percent change (AAPC) were performed by Joinpoint regression model.ResultsThe incidence rate (AAPC = 1.18%, P < 0.001) of TI among children aged 0–14 years showed an increasing trend, whereas mortality rate (AAPC = -3.87%, P < 0.001) and DALYs rate (AAPC = -3.83%, P < 0.001) decreased annually. Notably, boys experienced a higher increase in incidence (1.30%) compared to girls (1.06%), but a faster decrease in mortality and DALYs rate (-3.90% vs. -3.82%, -3.88% vs. -3.79%, respectively) (Pall < 0.001). Declines in death rates and DALYs rates were observed across all age groups (Pall < 0.001), while remained the highest among children aged 0–4 in 2019. Among different road-type users, cyclist road injuries were identified as the primary cause of TI (182.3 cases per 100,000) while pedestrians were the group with the highest mortality (2.9 cases per 100,000) and DALYs rate (243 cases per 100,000) in 2019. Besides, alcohol use was a significant risk factors for TI, while low temperature appeared to be a protective factor.ConclusionFuture efforts must prioritize raising awareness among children and their guardians to mitigate the disease burden of TI in children. It’s critical to enhance preventive interventions for boys, children aged 0–4 and vulnerable road users such as pedestrians and cyclists in future
    corecore