Towards Real World HDRTV Reconstruction: A Data Synthesis-based Approach
Existing deep learning based HDRTV reconstruction methods assume a single type
of tone mapping operator (TMO) as the degradation procedure to synthesize
SDRTV-HDRTV pairs for supervised training. In this paper, we argue that,
although traditional TMOs exploit efficient dynamic range compression priors,
they have several drawbacks in modeling realistic degradation: information
over-preservation, color bias, and possible artifacts, which make the trained
reconstruction networks hard to generalize well to real-world cases. To solve
this problem, we propose a learning-based data synthesis approach to learn the
properties of real-world SDRTVs by integrating several tone mapping priors into
both network structures and loss functions. Specifically, we design a
conditioned two-stream network that uses prior tone mapping results as guidance
to synthesize SDRTVs through both global and local transformations. To train the data
synthesis network, we design a novel self-supervised content loss that
constrains different aspects of the synthesized SDRTVs in regions with
different brightness distributions, together with an adversarial loss that
encourages more realistic details. To validate the effectiveness of our
approach, we synthesize
SDRTV-HDRTV pairs with our method and use them to train several HDRTV
reconstruction networks. Then we collect two inference datasets, containing
labeled and unlabeled real-world SDRTVs, respectively. Experimental results
demonstrate that the networks trained with our synthesized data generalize
significantly better to these two real-world datasets than existing solutions.
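As a concrete illustration of the conditioned two-stream design, below is a minimal PyTorch sketch of how such an SDRTV synthesis network could be wired: a global stream predicts a per-channel tone curve from pooled features, a local stream predicts a spatially varying residual, and both are conditioned on a prior tone-mapped guidance image. The layer sizes, the polynomial curve parameterization, and the fusion rule are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class TwoStreamSDRSynth(nn.Module):
    """Sketch: synthesize an SDR image from an HDR input plus a prior
    tone-mapped guidance image, via a global curve and a local residual."""
    def __init__(self, feat=32):
        super().__init__()
        # Global stream: predicts coefficients of a per-channel polynomial
        # tone curve from pooled features of (HDR, guidance).
        self.global_net = nn.Sequential(
            nn.Conv2d(6, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(feat, 3 * 4),  # 4 polynomial coefficients per RGB channel
        )
        # Local stream: predicts a spatially varying residual correction.
        self.local_net = nn.Sequential(
            nn.Conv2d(6, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, 3, 3, padding=1),
        )

    def forward(self, hdr, guidance):
        x = torch.cat([hdr, guidance], dim=1)          # condition on the TMO prior
        coeffs = self.global_net(x).view(-1, 3, 4, 1, 1)
        # Apply the global polynomial curve per channel: sum_k c_k * hdr^k.
        powers = torch.stack([hdr ** k for k in range(4)], dim=2)
        globally_mapped = (coeffs * powers).sum(dim=2)
        sdr = globally_mapped + self.local_net(x)      # local refinement
        return sdr.clamp(0.0, 1.0)
```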
Mitigating Pooling Bias in E-commerce Search via False Negative Estimation
Efficient and accurate product relevance assessment is critical for user
experiences and business success. Training a proficient relevance assessment
model requires high-quality query-product pairs, often obtained through
negative sampling strategies. Unfortunately, current methods introduce pooling
bias by mistakenly sampling false negatives, diminishing performance and
business impact. To address this, we present Bias-mitigating Hard Negative
Sampling (BHNS), a novel negative sampling strategy tailored to identify and
adjust for false negatives, building upon our original False Negative
Estimation algorithm. Our experiments in the Instacart search setting confirm
BHNS as effective for practical e-commerce use. Furthermore, comparative
analyses on public datasets showcase its domain-agnostic potential for diverse
applications.
Comment: Submitted to WWW'24 Industry Track
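The paper's False Negative Estimation algorithm is not reproduced in the abstract, but a minimal sketch of the general idea behind bias-mitigating hard negative sampling might look as follows: prefer hard (high-similarity) candidates while skipping candidates whose similarity to the query rivals that of the labeled positive, since those are likely false negatives. The margin rule and softmax sampling here are illustrative assumptions.

```python
import numpy as np

def sample_hard_negatives(query_vec, pos_vec, cand_vecs, k=5, margin=0.05, rng=None):
    """Sketch of bias-mitigating hard negative sampling. Candidates whose
    similarity to the query exceeds that of the labeled positive minus a
    margin are treated as probable false negatives and excluded. The scoring
    rule is an illustrative stand-in for the paper's algorithm."""
    rng = rng or np.random.default_rng()
    sims = cand_vecs @ query_vec              # cosine sims (unit vectors assumed)
    pos_sim = float(pos_vec @ query_vec)
    # Candidates nearly as similar as the positive are likely false negatives.
    pool = np.flatnonzero(sims < pos_sim - margin)
    if len(pool) == 0:
        return pool
    # Sample hard negatives in proportion to similarity (harder = likelier).
    probs = np.exp(sims[pool])
    probs /= probs.sum()
    return rng.choice(pool, size=min(k, len(pool)), replace=False, p=probs)
```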
Towards Personalized Federated Learning via Heterogeneous Model Reassembly
This paper focuses on addressing the practical yet challenging problem of
model heterogeneity in federated learning, where clients possess models with
different network structures. To tackle this problem, we propose a novel
framework called pFedHR, which leverages heterogeneous model reassembly to
achieve personalized federated learning. In particular, we approach the problem
of heterogeneous model personalization as a model-matching optimization task on
the server side. Moreover, pFedHR automatically and dynamically generates
informative and diverse personalized candidates with minimal human
intervention. Furthermore, our proposed heterogeneous model reassembly
technique mitigates, to a certain extent, the adverse impact of using public
data whose distribution differs from that of the client data. Experimental
results demonstrate that pFedHR outperforms baselines on three datasets under
both IID and Non-IID settings. Additionally, pFedHR effectively reduces the
adverse impact of using different public data and dynamically generates diverse
personalized models in an automated manner.
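A rough sketch of the reassembly idea follows, under strong simplifying assumptions: client models are nn.Sequential stacks, and compatibility is checked only via matching interface widths. pFedHR's actual server-side model-matching optimization is more involved; this only illustrates how cross-client candidates could be stitched.

```python
import itertools
import torch.nn as nn

def reassemble_candidates(client_models, max_candidates=10):
    """Sketch of heterogeneous model reassembly: split each uploaded client
    model into (body, head) blocks, then stitch cross-client combinations
    whose interface dimensions agree."""
    blocks = []
    for m in client_models:
        layers = list(m.children())            # assumes >= 2 child modules
        blocks.append((nn.Sequential(*layers[:-1]), layers[-1]))

    candidates = []
    for (body, _), (_, head) in itertools.product(blocks, repeat=2):
        # Only stitch blocks whose interface widths match.
        out_dim = getattr(list(body.children())[-1], "out_features", None)
        in_dim = getattr(head, "in_features", None)
        if out_dim is not None and out_dim == in_dim:
            candidates.append(nn.Sequential(body, head))
        if len(candidates) >= max_candidates:
            break
    return candidates
```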
MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation
Health risk prediction is one of the fundamental tasks under predictive
modeling in the medical domain, which aims to forecast the potential health
risks that patients may face in the future using their historical Electronic
Health Records (EHR). Researchers have developed several risk prediction models
to handle the unique challenges of EHR data, such as its sequential nature,
high dimensionality, and inherent noise. These models have yielded impressive
results. Nonetheless, a key issue undermining their effectiveness is data
insufficiency. A variety of data generation and augmentation methods have been
introduced to mitigate this issue by expanding the size of the training data
set through the learning of underlying data distributions. However, the
performance of these methods is often limited due to their task-unrelated
design. To address these shortcomings, this paper introduces a novel,
end-to-end diffusion-based risk prediction model, named MedDiffusion. It
enhances risk prediction performance by creating synthetic patient data during
training to enlarge the sample space. Furthermore, MedDiffusion discerns hidden
relationships between patient visits using a step-wise attention mechanism,
enabling the model to automatically retain the most vital information for
generating high-quality data. Experimental evaluation on four real-world
medical datasets demonstrates that MedDiffusion outperforms 14 cutting-edge
baselines in terms of PR-AUC, F1, and Cohen's Kappa. We also conduct ablation
studies and benchmark our model against GAN-based alternatives to further
validate the rationality and adaptability of our model design. Additionally, we
analyze the generated data to offer fresh insights into the model's
interpretability.
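As a sketch of the core mechanics described above, the following PyTorch fragment pairs a standard denoising-diffusion training objective over sequences of visit embeddings with attention across visit steps. The attention block is an illustrative stand-in for the paper's step-wise attention mechanism; shapes, dimensions, and the noise schedule are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisitDenoiser(nn.Module):
    """Sketch: denoiser over a patient's sequence of visit embeddings,
    attending across visit steps (stand-in for step-wise attention)."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x_t, t_emb):
        h = x_t + t_emb                       # add timestep embedding
        h, _ = self.attn(h, h, h)             # attention across visits
        return self.ff(h)                     # predicted noise

def diffusion_loss(model, x0, t_emb, alpha_bar_t):
    """Standard denoising objective: noise x0 to x_t and regress the noise.
    x0: (batch, visits, dim); alpha_bar_t: (batch, 1, 1)."""
    noise = torch.randn_like(x0)
    x_t = alpha_bar_t.sqrt() * x0 + (1 - alpha_bar_t).sqrt() * noise
    return F.mse_loss(model(x_t, t_emb), noise)
```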
Weak Supervision for Fake News Detection via Reinforcement Learning
Today, social media has become the primary source of news. Via social media
platforms, fake news travels at unprecedented speed, reaches global audiences,
and puts users and communities at great risk. Therefore, it is extremely important
to detect fake news as early as possible. Recently, deep learning based
approaches have shown improved performance in fake news detection. However, the
training of such models requires a large amount of labeled data, but manual
annotation is time-consuming and expensive. Moreover, due to the dynamic nature
of news, annotated samples may become outdated quickly and cannot represent the
news articles on newly emerged events. Therefore, how to obtain fresh and
high-quality labeled samples is the major challenge in employing deep learning
models for fake news detection. In order to tackle this challenge, we propose a
reinforced weakly-supervised fake news detection framework, i.e., WeFEND, which
can leverage users' reports as weak supervision to enlarge the amount of
training data for fake news detection. The proposed framework consists of three
main components: the annotator, the reinforced selector and the fake news
detector. The annotator can automatically assign weak labels for unlabeled news
based on users' reports. The reinforced selector uses reinforcement learning
techniques to choose high-quality samples from the weakly labeled data and
filters out those low-quality ones that may degrade the detector's prediction
performance. The fake news detector aims to identify fake news based on the
news content. We tested the proposed framework on a large collection of news
articles published via WeChat official accounts and associated user reports.
Extensive experiments on this dataset show that the proposed WeFEND model
achieves the best performance compared with state-of-the-art methods.
Comment: AAAI 2020
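The reinforced selector can be sketched as a tiny REINFORCE-style policy over weakly labeled samples: keep/drop actions are sampled per example, and the policy is updated with the detector's validation improvement as the reward. Network sizes and the exact reward definition are assumptions for illustration, not the paper's specification.

```python
import torch
import torch.nn as nn

class Selector(nn.Module):
    """Sketch of a reinforced sample selector: a policy that keeps or drops
    each weakly labeled example, trained with REINFORCE where the reward is
    the detector's validation improvement (computed outside this snippet)."""
    def __init__(self, feat_dim):
        super().__init__()
        self.policy = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                    nn.Linear(64, 1))

    def forward(self, feats):
        keep_prob = torch.sigmoid(self.policy(feats)).squeeze(-1)
        actions = torch.bernoulli(keep_prob)           # 1 = keep the sample
        log_prob = (actions * keep_prob.clamp_min(1e-8).log()
                    + (1 - actions) * (1 - keep_prob).clamp_min(1e-8).log())
        return actions.bool(), log_prob

def reinforce_step(log_prob, reward, optimizer):
    """Policy-gradient update: maximize the expected reward."""
    loss = -(reward * log_prob).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```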
Rethinking GNN-based Entity Alignment on Heterogeneous Knowledge Graphs: New Datasets and A New Method
The development of knowledge graph (KG) applications has led to a rising need
for entity alignment (EA) between heterogeneous KGs that are extracted from
various sources. Recently, graph neural networks (GNNs) have been widely
adopted in EA tasks due to GNNs' impressive ability to capture structure
information. However, we have observed that the oversimplified settings of the
existing common EA datasets are distant from real-world scenarios, which
obstructs a full understanding of the advancements achieved by recent methods.
This phenomenon makes us ponder: Do existing GNN-based EA methods really make
great progress?
In this paper, to study the performance of EA methods in realistic settings,
we focus on the alignment of highly heterogeneous KGs (HHKGs) (e.g., event KGs
and general KGs), which differ in scale and structure and share fewer
overlapping entities. First, we sweep away the unreasonable
settings, and propose two new HHKG datasets that closely mimic real-world EA
scenarios. Then, based on the proposed datasets, we conduct extensive
experiments to evaluate previous representative EA methods, and reveal
interesting findings about the progress of GNN-based EA methods. We find that
structural information becomes more difficult to exploit, yet remains valuable,
when aligning HHKGs. This phenomenon leads to inferior performance of existing EA
methods, especially GNN-based methods. Our findings shed light on the potential
problems that arise from indiscriminately applying GNN-based methods as a
panacea for all EA datasets. Finally, we introduce a simple but effective
method: Simple-HHEA, which comprehensively utilizes entity name, structure, and
temporal information. Experiment results show Simple-HHEA outperforms previous
models on the HHKG datasets.
Comment: 11 pages, 6 figures
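In the spirit of Simple-HHEA's use of name, structure, and temporal information, here is a minimal NumPy sketch that fuses three per-entity embedding matrices and aligns entities across two KGs by nearest neighbor. The weighted-concatenation fusion and the weights are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def align_entities(name1, struct1, time1, name2, struct2, time2,
                   w=(1.0, 0.5, 0.5)):
    """Sketch: fuse name, structure, and temporal embeddings per entity,
    then align across two KGs by nearest-neighbor cosine similarity."""
    def fuse(n, s, t):
        v = np.concatenate([w[0] * n, w[1] * s, w[2] * t], axis=1)
        return v / np.linalg.norm(v, axis=1, keepdims=True)

    src = fuse(name1, struct1, time1)
    tgt = fuse(name2, struct2, time2)
    sims = src @ tgt.T                  # cosine similarity matrix
    return sims.argmax(axis=1)          # predicted counterpart per source entity
```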
A Benchmark Dataset for Understandable Medical Language Translation
In this paper, we introduce MedLane -- a new human-annotated Medical Language
translation dataset, to align professional medical sentences with
layperson-understandable expressions. The dataset contains 12,801 training
samples, 1,015 validation samples, and 1,016 testing samples. We then evaluate
one naive and six deep learning-based approaches on the MedLane dataset,
including direct copying, the statistical machine translation approach Moses,
four neural machine translation approaches (i.e., the proposed PMBERT-MT model,
Seq2Seq, and its two variants), and a modified text summarization model,
PointerNet. To compare the results, we utilize eleven metrics, including three
new measures specifically designed for this task. Finally, we discuss the
limitations of MedLane and baselines, and point out possible research
directions for this task.
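The abstract lists eleven evaluation metrics without defining them; as one example of the standard MT-style measures such a benchmark would typically report, the sketch below scores system outputs against layperson references with smoothed corpus BLEU via NLTK. The paper's three task-specific measures are not reproduced here.

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def evaluate_simplifications(references, hypotheses):
    """Sketch: score system outputs against layperson-readable references
    with corpus BLEU, assuming one reference sentence per sample."""
    refs = [[r.split()] for r in references]
    hyps = [h.split() for h in hypotheses]
    smooth = SmoothingFunction().method1   # smoothing for short sentences
    return corpus_bleu(refs, hyps, smoothing_function=smooth)
```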