48 research outputs found
FINT: Field-aware INTeraction Neural Network For CTR Prediction
As a critical component of online advertising and marketing, click-through
rate (CTR) prediction has drawn much attention from both industry and
academia. Recently, deep learning has become the mainstream
methodological choice for CTR prediction. Although sustained efforts have been made,
existing approaches still face several challenges. On the one hand, high-order
interactions between features are under-explored. On the other hand,
high-order interactions may neglect the semantic information from the low-order
fields. In this paper, we propose a novel prediction method, named FINT, that
employs the Field-aware INTeraction layer which captures high-order feature
interactions while retaining the low-order field information. To empirically
investigate the effectiveness and robustness of FINT, we perform extensive
experiments on three real-world datasets: KDD2012, Criteo and Avazu. The
results demonstrate that FINT can significantly improve
performance compared to existing methods, without increasing the amount of
computation required. Moreover, the proposed method brought about a 2.72%
increase in the advertising revenue of a large online video app in A/B
testing. To better promote research in the CTR field, we will release our code
as well as reference implementations of the baseline models in the final
version.
Comment: 5 pages, Submitted to CIKM 202
Wound Segmentation with Dynamic Illumination Correction and Dual-view Semantic Fusion
Wound image segmentation is a critical component of the clinical diagnosis
and timely treatment of wounds. Recently, deep learning has become the
mainstream methodology for wound image segmentation. However,
pre-processing of the wound image, such as illumination correction, is
required before the training phase because it can greatly improve performance.
The correction procedure and the training of deep models are independent of
each other, which leads to sub-optimal segmentation performance as the fixed
illumination correction may not be suitable for all images. To address
these issues, this paper proposes an end-to-end dual-view segmentation approach
that incorporates a learnable illumination correction
module into deep segmentation models. The parameters of the module are
learned and updated automatically during training, while the
dual-view fusion fully exploits the features from both the raw images and the
enhanced ones. To demonstrate the effectiveness and robustness of the proposed
framework, extensive experiments are conducted on benchmark datasets.
The encouraging results suggest that our framework can significantly improve
segmentation performance compared with state-of-the-art methods.
Trusted Multi-Scale Classification Framework for Whole Slide Image
Despite remarkable efforts, the classification of gigapixel
whole-slide images (WSIs) is severely constrained either by the limited
computing resources available for whole slides or by limited use of the knowledge
from different scales. Moreover, most previous attempts lacked the
ability to estimate uncertainty. Pathologists often jointly
analyze a WSI at different magnifications. If they are
uncertain at a single magnification, they change the
magnification repeatedly to discover various features of the tissue. Motivated
by the diagnostic process of pathologists, in this paper we propose a
trusted multi-scale classification framework for WSIs. Leveraging the Vision
Transformer as the backbone for multiple branches, our framework can jointly
model classification, estimate the uncertainty at each microscope
magnification, and integrate the evidence from different magnifications. Moreover,
to exploit discriminative patches from WSIs and reduce the requirement for
computational resources, we propose a novel patch selection scheme using
attention rollout and non-maximum suppression. To empirically investigate the
effectiveness of our approach, experiments are conducted on WSI
classification tasks using two benchmark databases. The results
suggest that the trusted framework can significantly improve WSI
classification performance compared with state-of-the-art methods.
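The patch selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each patch already carries an attention-derived score (for example, one obtained from attention rollout) and a position on the patch grid, and it greedily keeps high-scoring patches while suppressing close neighbours. The function name `select_patches`, the parameters `min_dist` and `top_k`, and the distance-based suppression rule are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

Patch = Tuple[int, int, float]  # (row, col, score) on the patch grid


def select_patches(patches: List[Patch], min_dist: int = 2, top_k: int = 3) -> List[Patch]:
    """Greedy non-maximum suppression over scored patches.

    Visits patches from highest to lowest score and keeps a patch only if
    its Chebyshev distance to every already kept patch is at least
    `min_dist`. Stops once `top_k` patches are kept or candidates run out.
    """
    candidates = sorted(patches, key=lambda p: p[2], reverse=True)
    kept: List[Patch] = []
    for row, col, score in candidates:
        if len(kept) == top_k:
            break
        # Suppress patches that sit too close to an already selected one.
        too_close = any(
            max(abs(row - kr), abs(col - kc)) < min_dist for kr, kc, _ in kept
        )
        if not too_close:
            kept.append((row, col, score))
    return kept
```

With hypothetical scores, a patch adjacent to the top-scoring one is dropped in favour of a more distant patch, which is the intended effect of the suppression: the selected patches cover distinct tissue regions instead of clustering around one hotspot.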
Dynamic Memory-based Curiosity: A Bootstrap Approach for Exploration
The sparsity of extrinsic rewards poses a serious challenge for reinforcement
learning (RL). Much effort has been devoted to curiosity, which can
provide a representative intrinsic reward for effective exploration. However,
the challenge is still far from being solved. In this paper, we present a novel
curiosity method for RL, named DyMeCu, which stands for Dynamic Memory-based
Curiosity. Inspired by human curiosity and information theory, DyMeCu consists
of a dynamic memory and dual online learners. Curiosity is aroused when
memorized information cannot deal with the current state; the information
gap between the dual learners is formulated as the intrinsic reward for agents,
and the state information is then consolidated into the dynamic memory.
Compared with previous curiosity methods, DyMeCu can better mimic human
curiosity with dynamic memory, and the memory module can be dynamically grown
based on a bootstrap paradigm with dual learners. Large-scale empirical
experiments on multiple benchmarks, including DeepMind Control Suite and Atari Suite,
demonstrate that DyMeCu outperforms
competitive curiosity-based methods with or without extrinsic rewards. We will
release the code to enhance reproducibility.
Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles
Reinforcement learning from human feedback (RLHF) emerges as a promising
paradigm for aligning large language models (LLMs). However, a notable
challenge in RLHF is overoptimization, where beyond a certain threshold, the
pursuit of higher rewards leads to a decline in human preferences. In this
paper, we observe the weakness of KL regularization which is commonly employed
in existing RLHF methods to address overoptimization. To mitigate this
limitation, we scrutinize the RLHF objective in the offline dataset and propose
uncertainty-penalized RLHF (UP-RLHF), which incorporates uncertainty
regularization during RL-finetuning. To enhance the uncertainty quantification
abilities for reward models, we first propose a diverse low-rank adaptation
(LoRA) ensemble by maximizing the nuclear norm of LoRA matrix concatenations.
Then we optimize policy models utilizing penalized rewards, determined by both
rewards and uncertainties provided by the diverse reward LoRA ensembles. Our
experimental results, based on two real human preference datasets, showcase the
effectiveness of diverse reward LoRA ensembles in quantifying reward
uncertainty. Additionally, uncertainty regularization in UP-RLHF proves to be
pivotal in mitigating overoptimization, thereby contributing to the overall
performance.
Comment: 10 pages, 5 figures
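The idea of a penalized reward built from an ensemble can be sketched as follows. The abstract does not state how uncertainty is measured or combined with the reward, so this sketch assumes the uncertainty is the population standard deviation of the ensemble members' scores and that it is subtracted from the ensemble mean with a hypothetical coefficient `penalty_coef`. The function name and both choices are illustrative assumptions, not the formulation used in UP-RLHF.

```python
from statistics import mean, pstdev
from typing import Sequence


def penalized_reward(ensemble_rewards: Sequence[float], penalty_coef: float = 1.0) -> float:
    """Ensemble mean reward minus a penalty proportional to ensemble disagreement.

    `ensemble_rewards` holds one scalar reward per ensemble member for the
    same (prompt, response) pair. Disagreement is measured here as the
    population standard deviation, which is an assumption of this sketch.
    """
    if not ensemble_rewards:
        raise ValueError("ensemble_rewards must contain at least one value")
    avg = mean(ensemble_rewards)
    uncertainty = pstdev(ensemble_rewards)
    return avg - penalty_coef * uncertainty
```

When all ensemble members agree, the penalty vanishes and the policy sees the plain mean reward; the more the members disagree, the more the reward is discounted, which discourages the policy from exploiting responses that the reward models score inconsistently.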
Optimistic Model Rollouts for Pessimistic Offline Policy Optimization
Model-based offline reinforcement learning (RL) has made remarkable progress,
offering a promising avenue for improving generalization with synthetic model
rollouts. Existing works primarily focus on incorporating pessimism for policy
optimization, usually via constructing a Pessimistic Markov Decision Process
(P-MDP). However, the P-MDP discourages the policies from learning in
out-of-distribution (OOD) regions beyond the support of offline datasets, which
can under-utilize the generalization ability of dynamics models. In contrast,
we propose constructing an Optimistic MDP (O-MDP). We initially observed the
potential benefits of optimism brought by encouraging more OOD rollouts.
Motivated by this observation, we present ORPO, a simple yet effective
model-based offline RL framework. ORPO generates Optimistic model Rollouts for
Pessimistic offline policy Optimization. Specifically, we train an optimistic
rollout policy in the O-MDP to sample more OOD model rollouts. Then we relabel
the sampled state-action pairs with penalized rewards and optimize the output
policy in the P-MDP. Theoretically, we demonstrate that the performance of
policies trained with ORPO can be lower-bounded in linear MDPs. Experimental
results show that our framework significantly outperforms P-MDP baselines by a
margin of 30%, achieving state-of-the-art performance on the widely-used
benchmark. Moreover, ORPO exhibits notable advantages in problems that require
generalization.
Trends in transport injuries burden and risk factors among children under 14 years old in China: 1990–2019
Background: Transport injuries (TI) remain one of the leading causes of death among children in China. This study aimed to analyze the temporal trends in the disease burden and associated risk factors of TI among children aged 0–14 years in China, using data from 1990 to 2019. Methods: We retrieved data on the disease burden and risk factors of TI among children aged 0–14 years in China from 1990 to 2019 from the Global Burden of Disease (GBD) dataset. We estimated the incidence rate, death rate, and disability-adjusted life years (DALYs) rate with 95% uncertainty intervals (95% UI), stratified by age, sex, and road-user type. Trends in disease burden were assessed with annual percentage changes (APC) and average annual percent changes (AAPC) using the Joinpoint regression model. Results: The incidence rate (AAPC = 1.18%, P < 0.001) of TI among children aged 0–14 years showed an increasing trend, whereas the mortality rate (AAPC = -3.87%, P < 0.001) and DALYs rate (AAPC = -3.83%, P < 0.001) decreased annually. Notably, boys experienced a larger increase in incidence (1.30%) than girls (1.06%), but a faster decrease in mortality and DALYs rates (-3.90% vs. -3.82% and -3.88% vs. -3.79%, respectively) (all P < 0.001). Declines in death rates and DALYs rates were observed across all age groups (all P < 0.001), although both remained highest among children aged 0–4 in 2019. Among road-user types, cyclist road injuries were identified as the primary cause of TI (182.3 cases per 100,000), while pedestrians had the highest mortality (2.9 cases per 100,000) and DALYs rate (243 cases per 100,000) in 2019. In addition, alcohol use was a significant risk factor for TI, while low temperature appeared to be a protective factor. Conclusion: Future efforts must prioritize raising awareness among children and their guardians to mitigate the disease burden of TI in children. It is critical to enhance preventive interventions for boys, children aged 0–4, and vulnerable road users such as pedestrians and cyclists in the future.
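The APC and AAPC measures used in this abstract can be made concrete with a short sketch. It follows the standard Joinpoint convention: each segment is a log-linear fit with slope b, the APC of a segment is (exp(b) - 1) × 100, and the AAPC is the same transformation applied to the average of the segment slopes weighted by segment length in years. The function names are hypothetical and the segment values in the usage note are made up; none of them come from the study.

```python
import math
from typing import Sequence, Tuple


def apc_from_slope(slope: float) -> float:
    """Annual percentage change for one log-linear segment with the given slope."""
    return (math.exp(slope) - 1.0) * 100.0


def aapc(segments: Sequence[Tuple[float, float]]) -> float:
    """Average annual percent change over several joinpoint segments.

    Each segment is (slope, length_in_years). The slopes are averaged with
    weights equal to the segment lengths, then converted to a percentage.
    """
    total_years = sum(years for _, years in segments)
    if total_years <= 0:
        raise ValueError("segments must cover a positive number of years")
    weighted_slope = sum(slope * years for slope, years in segments) / total_years
    return (math.exp(weighted_slope) - 1.0) * 100.0
```

For example, `aapc([(0.02, 5), (-0.02, 5)])` is zero because the rise and fall cancel over equal periods, and with a single segment the AAPC equals that segment's APC. This shows why a positive AAPC for incidence can coexist with individual segments in which the rate fell.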