Behavior patterns of online users and the effect on information filtering
Understanding the structure and evolution of web-based user-object bipartite
networks is an important task since they play a fundamental role in online
information filtering. In this paper, we investigate the patterns of
online users' behavior and their effect on the recommendation process. Empirical
analysis of e-commerce systems shows that users have significant taste
diversity and that their interests in niche items overlap heavily. Additionally,
recommendation processes are investigated on both the real networks and
reshuffled networks in which real users' behavior patterns are gradually
destroyed. Our results show that the performance of personalized
recommendation methods is strongly related to the real network structure.
A detailed study of each item shows that recommendation accuracy for hot items is
almost maximal and quite robust to the reshuffling process. However, niche
items cannot be accurately recommended after users' behavior patterns are removed.
Our work is also meaningful in a practical sense, since it reveals an effective
direction for improving the accuracy and robustness of existing
recommender systems.
Comment: 8 pages, 6 figures
Similarity from multi-dimensional scaling: solving the accuracy and diversity dilemma in information filtering
Recommender systems are designed to help individual users navigate the rapidly growing amount of information. One of the most successful recommendation techniques is collaborative filtering, which has been extensively investigated and has already found wide application in e-commerce. One of the challenges in this algorithm is how to accurately quantify the similarities of user pairs and item pairs. In this paper, we employ the multidimensional scaling (MDS) method to measure the similarities between nodes in user-item bipartite networks. The MDS method can extract the essential similarity information from the networks by smoothing out noise, and it also provides a graphical display of the structure of the networks. With the similarity measured from MDS, we find that the item-based collaborative filtering algorithm can outperform the diffusion-based recommendation algorithms. Moreover, we show that this method tends to recommend unpopular items and increases the global diversification of the networks in the long term.
The reinforcing influence of recommendations on global diversification
Recommender systems are promising ways to filter the overabundant information
in modern society. Their algorithms help individuals to explore decent items,
but it is unclear how they allocate popularity among items. In this paper, we
simulate successive recommendations and measure their influence on the
dispersion of item popularity with the Gini coefficient. Our results indicate that
local diffusion and collaborative filtering reinforce the popularity of hot
items, widening the popularity dispersion. On the other hand, the heat
conduction algorithm increases the popularity of the niche items and generates
smaller dispersion of item popularity. Simulations are compared to mean-field
predictions. Our results suggest that recommender systems have a reinforcing
influence on global diversification.
Comment: 6 pages, 6 figures
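The Gini coefficient used above to measure popularity dispersion can be illustrated with a minimal sketch. It uses the standard rank-weighted formula on a list of item popularity counts; the function name and the toy inputs are illustrative assumptions and are not taken from the paper.

```python
def gini(popularity):
    """Gini coefficient of non-negative item popularity counts.

    0 means every item is equally popular; values near 1 mean that
    popularity is concentrated on very few items.
    """
    values = sorted(popularity)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with x sorted in ascending order and ranks i starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    print(gini([1, 1, 1, 1]))   # evenly spread popularity
    print(gini([0, 0, 0, 4]))   # all popularity on one item
```

A recommender that widens popularity dispersion would raise this value over successive recommendation rounds, while one that favors niche items would lower it.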
Contrastive Domain Adaptation for Early Misinformation Detection: A Case Study on COVID-19
Despite recent progress in improving the performance of misinformation
detection systems, classifying misinformation in an unseen domain remains an
elusive challenge. To address this issue, a common approach is to introduce a
domain critic and encourage domain-invariant input features. However, early
misinformation often demonstrates both conditional and label shifts against
existing misinformation data (e.g., class imbalance in COVID-19 datasets),
rendering such methods less effective for detecting early misinformation. In
this paper, we propose a contrastive adaptation network for early misinformation
detection (CANMD). Specifically, we leverage pseudo labeling to generate
high-confidence target examples for joint training with source data. We
additionally design a label correction component to estimate and correct the
label shifts (i.e., class priors) between the source and target domains.
Moreover, a contrastive adaptation loss is integrated in the objective function
to reduce the intra-class discrepancy and enlarge the inter-class discrepancy.
As such, the adapted model learns corrected class priors and an invariant
conditional distribution across both domains for improved estimation of the
target data distribution. To demonstrate the effectiveness of the proposed
CANMD, we study the case of COVID-19 early misinformation detection and perform
extensive experiments using multiple real-world datasets. The results suggest
that CANMD can effectively adapt misinformation detection systems to the unseen
COVID-19 target domain with significant improvements compared to the
state-of-the-art baselines.
Comment: Accepted to CIKM 202
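The pseudo-labeling step mentioned in the abstract can be sketched roughly as follows. This is only a generic confidence-threshold filter: the function name, the threshold value, and the toy probabilities are assumptions for illustration, and the abstract does not state how CANMD actually selects high-confidence target examples.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Return (example_index, predicted_label) pairs for confident predictions.

    `probabilities` holds one list of class probabilities per unlabeled
    target example. The threshold of 0.9 is a hypothetical choice.
    """
    selected = []
    for index, probs in enumerate(probabilities):
        # Predicted class is the one with the highest probability.
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((index, label))
    return selected


if __name__ == "__main__":
    target_probs = [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92]]
    print(select_pseudo_labels(target_probs))
```

The selected pairs would then be added to the labeled source data for joint training, while low-confidence examples are left out.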
Defending Substitution-Based Profile Pollution Attacks on Sequential Recommenders
While sequential recommender systems achieve significant improvements in
capturing user dynamics, we argue that sequential recommenders are vulnerable
to substitution-based profile pollution attacks. To demonstrate our
hypothesis, we propose a substitution-based adversarial attack algorithm, which
modifies the input sequence by selecting certain vulnerable elements and
substituting them with adversarial items. In both untargeted and targeted
attack scenarios, we observe significant performance deterioration using the
proposed profile pollution algorithm. Motivated by such observations, we design
an efficient adversarial defense method called Dirichlet neighborhood sampling.
Specifically, we sample item embeddings from a convex hull constructed by
multi-hop neighbors to replace the original items in input sequences. During
sampling, a Dirichlet distribution is used to approximate the probability
distribution in the neighborhood such that the recommender learns to combat
local perturbations. Additionally, we design an adversarial training method
tailored for sequential recommender systems. In particular, we represent
selected items with one-hot encodings and perform gradient ascent on the
encodings to search for the worst-case linear combination of item embeddings in
training. As such, the embedding function learns robust item representations
and the trained recommender is resistant to test-time adversarial examples.
Extensive experiments show the effectiveness of both our attack and defense
methods, which consistently outperform baselines by a significant margin across
model architectures and datasets.
Comment: Accepted to RecSys 202
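The idea of sampling an embedding from the convex hull of neighboring items with Dirichlet-distributed weights can be sketched as below. The neighbor set, the concentration parameters, and the function names are hypothetical; the abstract does not specify how the multi-hop neighbors are found or how the distribution is parameterized.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_from_hull(embeddings, alphas, rng):
    """Return a random convex combination of the given embedding vectors.

    Because the weights are non-negative and sum to 1, the result lies
    inside the convex hull of `embeddings`.
    """
    weights = sample_dirichlet(alphas, rng)
    dim = len(embeddings[0])
    return [sum(w * e[k] for w, e in zip(weights, embeddings)) for k in range(dim)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy embeddings: the original item first, then two assumed neighbors.
    neighborhood = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    # A larger alpha on the original item keeps samples closer to it.
    print(sample_from_hull(neighborhood, [2.0, 1.0, 1.0], rng))
```

Replacing an input item with such a sample during training exposes the model to small perturbations around the original embedding.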
Domain Adaptation for Question Answering via Question Classification
Question answering (QA) has demonstrated impressive progress in answering
questions from customized domains. Nevertheless, domain adaptation remains one
of the most elusive challenges for QA systems, especially when QA systems are
trained in a source domain but deployed in a different target domain. In this
work, we investigate the potential benefits of question classification for QA
domain adaptation. We propose a novel framework: Question Classification for
Question Answering (QC4QA). Specifically, a question classifier is adopted to
assign question classes to both the source and target data. Then, we perform
joint training in a self-supervised fashion via pseudo-labeling. For
optimization, inter-domain discrepancy between the source and target domains is
reduced via maximum mean discrepancy (MMD) distance. We additionally minimize
intra-class discrepancy among QA samples of the same question class for
fine-grained adaptation performance. To the best of our knowledge, this is the
first work in QA domain adaptation to leverage question classification with
self-supervised adaptation. We demonstrate the effectiveness of the proposed
QC4QA with consistent improvements against the state-of-the-art baselines on
multiple datasets.
Comment: Accepted to COLING 202
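As a rough illustration of the maximum mean discrepancy (MMD) mentioned in the abstract, the sketch below computes the biased empirical estimate of squared MMD between two small sets of feature vectors. The RBF kernel, the `gamma` bandwidth, and the toy vectors are assumptions; the abstract does not say which kernel or estimator QC4QA uses.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of squared MMD between two samples.

    MMD^2 = mean k(s, s') + mean k(t, t') - 2 * mean k(s, t)
    """
    def mean_kernel(a, b):
        total = sum(rbf_kernel(x, y, gamma) for x in a for y in b)
        return total / (len(a) * len(b))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


if __name__ == "__main__":
    source_features = [[0.0, 0.0], [1.0, 0.0]]
    target_features = [[5.0, 5.0], [6.0, 5.0]]
    print(mmd_squared(source_features, source_features))  # identical samples
    print(mmd_squared(source_features, target_features))  # well-separated samples
```

Minimizing this quantity over the encoder's parameters pulls the source and target feature distributions toward each other.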
Unsupervised Domain Adaptation for COVID-19 Information Service with Contrastive Adversarial Domain Mixup
In the real-world application of COVID-19 misinformation detection, a
fundamental challenge is the lack of labeled COVID data to enable
supervised end-to-end training of the models, especially at the early stage of
the pandemic. To address this challenge, we propose an unsupervised domain
adaptation framework using contrastive learning and adversarial domain mixup to
transfer the knowledge from an existing source data domain to the target
COVID-19 data domain. In particular, to bridge the gap between the source
domain and the target domain, our method reduces a radial basis function (RBF)
based discrepancy between these two domains. Moreover, we leverage the power of
domain adversarial examples to establish an intermediate domain mixup, where
the latent representations of the input text from both domains could be mixed
during the training process. Extensive experiments on multiple real-world
datasets suggest that our method can effectively adapt misinformation detection
systems to the unseen COVID-19 target domain with significant improvements
compared to the state-of-the-art baselines.
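Mixing latent representations from the two domains can be sketched as a plain linear interpolation. This shows only the generic mixup operation; the adversarial construction of the intermediate domain described in the abstract is not reproduced, and the function name, the mixing coefficient, and the toy vectors are assumptions.

```python
def mix_latents(source_vec, target_vec, lam):
    """Interpolate two latent vectors: lam * source + (1 - lam) * target."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(source_vec) != len(target_vec):
        raise ValueError("latent vectors must have the same dimension")
    return [lam * s + (1.0 - lam) * t for s, t in zip(source_vec, target_vec)]


if __name__ == "__main__":
    source_latent = [1.0, 0.0]
    target_latent = [0.0, 1.0]
    # lam close to 1 stays near the source domain; lam close to 0 near the target.
    print(mix_latents(source_latent, target_latent, 0.25))
```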