    Behavior patterns of online users and the effect on information filtering

    Understanding the structure and evolution of web-based user-object bipartite networks is an important task since they play a fundamental role in online information filtering. In this paper, we investigate the patterns of online users' behavior and their effect on the recommendation process. Empirical analysis of e-commerce systems shows that users have significant taste diversity and that their interests in niche items highly overlap. Additionally, the recommendation process is investigated on both the real networks and reshuffled networks in which real users' behavior patterns are gradually destroyed. Our results show that the performance of personalized recommendation methods is strongly related to the real network structure. A detailed study of each item shows that recommendation accuracy for hot items is almost maximal and quite robust to the reshuffling process. However, niche items cannot be accurately recommended after users' behavior patterns are removed. Our work is also meaningful in a practical sense since it reveals an effective direction for improving the accuracy and robustness of existing recommender systems. Comment: 8 pages, 6 figure

    Similarity from multi-dimensional scaling: solving the accuracy and diversity dilemma in information filtering

    Recommender systems are designed to help individual users navigate the rapidly growing amount of information. One of the most successful recommendation techniques is collaborative filtering, which has been extensively investigated and has already found wide application in e-commerce. One of the challenges in this algorithm is how to accurately quantify the similarities of user pairs and item pairs. In this paper, we employ the multidimensional scaling (MDS) method to measure the similarities between nodes in user-item bipartite networks. The MDS method can extract the essential similarity information from the networks by smoothing out noise, and it provides a graphical display of the structure of the networks. With the similarity measured from MDS, we find that the item-based collaborative filtering algorithm can outperform the diffusion-based recommendation algorithms. Moreover, we show that this method tends to recommend unpopular items and increases the global diversification of the networks in the long term.

    The reinforcing influence of recommendations on global diversification

    Recommender systems are promising ways to filter the overabundant information in modern society. Their algorithms help individuals explore decent items, but it is unclear how they allocate popularity among items. In this paper, we simulate successive recommendations and measure their influence on the dispersion of item popularity using the Gini coefficient. Our results indicate that local diffusion and collaborative filtering reinforce the popularity of hot items, widening the popularity dispersion. On the other hand, the heat conduction algorithm increases the popularity of niche items and generates a smaller dispersion of item popularity. Simulations are compared to mean-field predictions. Our results suggest that recommender systems have a reinforcing influence on global diversification. Comment: 6 pages, 6 figure
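
The abstract above measures the dispersion of item popularity with the Gini coefficient. As a minimal sketch of that measure (not the authors' simulation code), the function below computes the Gini coefficient of a list of popularity counts. The name `gini` and the toy inputs are assumptions made for illustration.

```python
def gini(popularity):
    """Gini coefficient of non-negative values.

    0 means every item is equally popular; values close to 1 mean that
    popularity is concentrated on a few items.
    """
    values = sorted(popularity)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = sum_i (2i - n - 1) * x_i / (n * sum(x)),
    # with x sorted in ascending order and i running from 1 to n.
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(values, start=1))
    return weighted / (n * total)


# Hypothetical popularity counts (e.g. number of users who collected each item).
print(gini([5, 5, 5, 5]))   # evenly spread popularity
print(gini([0, 0, 0, 20]))  # popularity concentrated on one item
```

Under this reading, an algorithm that reinforces hot items would push the coefficient up over successive recommendation rounds, while one that favors niche items would push it down.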

    Contrastive Domain Adaptation for Early Misinformation Detection: A Case Study on COVID-19

    Despite recent progress in improving the performance of misinformation detection systems, classifying misinformation in an unseen domain remains an elusive challenge. To address this issue, a common approach is to introduce a domain critic and encourage domain-invariant input features. However, early misinformation often exhibits both conditional and label shifts relative to existing misinformation data (e.g., class imbalance in COVID-19 datasets), rendering such methods less effective for detecting early misinformation. In this paper, we propose a contrastive adaptation network for early misinformation detection (CANMD). Specifically, we leverage pseudo labeling to generate high-confidence target examples for joint training with source data. We additionally design a label correction component to estimate and correct the label shifts (i.e., class priors) between the source and target domains. Moreover, a contrastive adaptation loss is integrated into the objective function to reduce the intra-class discrepancy and enlarge the inter-class discrepancy. As such, the adapted model learns corrected class priors and an invariant conditional distribution across both domains for improved estimation of the target data distribution. To demonstrate the effectiveness of the proposed CANMD, we study the case of COVID-19 early misinformation detection and perform extensive experiments using multiple real-world datasets. The results suggest that CANMD can effectively adapt misinformation detection systems to the unseen COVID-19 target domain with significant improvements compared to the state-of-the-art baselines. Comment: Accepted to CIKM 202

    Defending Substitution-Based Profile Pollution Attacks on Sequential Recommenders

    While sequential recommender systems achieve significant improvements in capturing user dynamics, we argue that sequential recommenders are vulnerable to substitution-based profile pollution attacks. To demonstrate our hypothesis, we propose a substitution-based adversarial attack algorithm, which modifies the input sequence by selecting certain vulnerable elements and substituting them with adversarial items. In both untargeted and targeted attack scenarios, we observe significant performance deterioration using the proposed profile pollution algorithm. Motivated by these observations, we design an efficient adversarial defense method called Dirichlet neighborhood sampling. Specifically, we sample item embeddings from a convex hull constructed from multi-hop neighbors to replace the original items in input sequences. During sampling, a Dirichlet distribution is used to approximate the probability distribution in the neighborhood such that the recommender learns to combat local perturbations. Additionally, we design an adversarial training method tailored for sequential recommender systems. In particular, we represent selected items with one-hot encodings and perform gradient ascent on the encodings to search for the worst-case linear combination of item embeddings in training. As such, the embedding function learns robust item representations and the trained recommender is resistant to test-time adversarial examples. Extensive experiments show the effectiveness of both our attack and defense methods, which consistently outperform baselines by a significant margin across model architectures and datasets. Comment: Accepted to RecSys 202
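
The sampling step described above (drawing an embedding from the convex hull of neighboring items with Dirichlet-distributed weights) can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the helper names, the choice to include the original item as a hull vertex, and the concentration parameters `alpha_self` and `alpha_neighbor` are all hypothetical.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_in_hull(item_vec, neighbor_vecs, rng, alpha_self=1.0, alpha_neighbor=0.5):
    """Return a random convex combination of an item and its neighbors.

    Assumption: the original item is one vertex of the hull and gets its own
    concentration parameter; the source abstract does not specify this.
    """
    points = [item_vec] + list(neighbor_vecs)
    alphas = [alpha_self] + [alpha_neighbor] * len(neighbor_vecs)
    weights = sample_dirichlet(alphas, rng)  # non-negative, sums to 1
    dim = len(item_vec)
    return [sum(w * p[k] for w, p in zip(weights, points)) for k in range(dim)]


# Toy 2-D embeddings; real item embeddings would come from the recommender.
rng = random.Random(0)
item = [0.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
print(sample_in_hull(item, neighbors, rng))
```

Because the weights are non-negative and sum to one, every sample stays inside the hull spanned by the item and its neighbors, which is what keeps the perturbation local.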

    Domain Adaptation for Question Answering via Question Classification

    Question answering (QA) has demonstrated impressive progress in answering questions from customized domains. Nevertheless, domain adaptation remains one of the most elusive challenges for QA systems, especially when QA systems are trained in a source domain but deployed in a different target domain. In this work, we investigate the potential benefits of question classification for QA domain adaptation. We propose a novel framework: Question Classification for Question Answering (QC4QA). Specifically, a question classifier is adopted to assign question classes to both the source and target data. Then, we perform joint training in a self-supervised fashion via pseudo-labeling. For optimization, the inter-domain discrepancy between the source and target domains is reduced via the maximum mean discrepancy (MMD) distance. We additionally minimize the intra-class discrepancy among QA samples of the same question class for fine-grained adaptation performance. To the best of our knowledge, this is the first work in QA domain adaptation to leverage question classification with self-supervised adaptation. We demonstrate the effectiveness of the proposed QC4QA with consistent improvements over the state-of-the-art baselines on multiple datasets. Comment: Accepted to COLING 202
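
The maximum mean discrepancy mentioned above compares two samples through kernel averages. The sketch below shows a standard biased estimate of the squared MMD with an RBF kernel. The kernel choice, the bandwidth parameter `gamma`, and the toy feature vectors are assumptions for illustration, and the abstract does not state which kernel or estimator the authors use.

```python
import math


def rbf(x, y, gamma):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd2(source, target, gamma=1.0):
    """Biased estimate of squared MMD between two samples of feature vectors.

    MMD^2 = mean k(s, s') + mean k(t, t') - 2 * mean k(s, t)
    """
    def mean_kernel(xs, ys):
        return sum(rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2 * mean_kernel(source, target))


# Hypothetical 1-D features standing in for encoded source and target questions.
source = [[0.0], [1.0]]
target = [[5.0], [6.0]]
print(mmd2(source, source))  # identical samples
print(mmd2(source, target))  # well-separated samples
```

In a training loop, a quantity like this would be added to the task loss so that the encoder is pushed to make source and target features harder to tell apart.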

    Unsupervised Domain Adaptation for COVID-19 Information Service with Contrastive Adversarial Domain Mixup

    In the real-world application of COVID-19 misinformation detection, a fundamental challenge is the lack of labeled COVID data to enable supervised end-to-end training of the models, especially at the early stage of the pandemic. To address this challenge, we propose an unsupervised domain adaptation framework using contrastive learning and adversarial domain mixup to transfer knowledge from an existing source data domain to the target COVID-19 data domain. In particular, to bridge the gap between the source domain and the target domain, our method reduces a radial basis function (RBF) based discrepancy between these two domains. Moreover, we leverage the power of domain adversarial examples to establish an intermediate domain mixup, where the latent representations of the input text from both domains can be mixed during the training process. Extensive experiments on multiple real-world datasets suggest that our method can effectively adapt misinformation detection systems to the unseen COVID-19 target domain with significant improvements compared to the state-of-the-art baselines.