
    Gravity Effects on Information Filtering and Network Evolving

    In this paper, based on the gravity principle of classical physics, we propose a tunable gravity-based model, which considers tag usage patterns to weigh both the mass and distance of network nodes. We then apply this model to the problems of information filtering and network evolution. Experimental results on two real-world data sets, \emph{Del.icio.us} and \emph{MovieLens}, show that it not only enhances algorithmic performance, but also better characterizes the properties of real networks. This work may shed some light on an in-depth understanding of the effect of the gravity model.
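The abstract's gravity analogy can be made concrete with a small sketch. The function below is hypothetical (the names `mass_u`, `mass_v`, and the tunable exponent `alpha` are illustrative, not taken from the paper): a link score is the product of two node "masses" (e.g., degrees or tag-usage counts) divided by a tunable power of their distance, by analogy with Newton's F = m1*m2 / r^2.

```python
def gravity_score(mass_u, mass_v, distance, alpha=2.0):
    """Hypothetical gravity-style link score: the product of two node
    'masses' divided by their distance raised to a tunable exponent.
    Larger alpha penalizes distant node pairs more strongly."""
    return (mass_u * mass_v) / (distance ** alpha)
```

For recommendation, candidate items would be ranked for a user by this score, with the exponent tuned on held-out data.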

    Inducing Causal Structure for Abstractive Text Summarization

    The mainstream of data-driven abstractive summarization models tends to explore correlations rather than causal relationships. Among such correlations, there can be spurious ones which suffer from the language prior learned from the training corpus and therefore undermine the overall effectiveness of the learned model. To tackle this issue, we introduce a Structural Causal Model (SCM) to induce the underlying causal structure of the summarization data. We assume several latent causal factors and non-causal factors, representing the content and style of the document and summary. Theoretically, we prove that the latent factors in our SCM can be identified by fitting the observed training data under certain conditions. On this basis, we propose a Causality Inspired Sequence-to-Sequence model (CI-Seq2Seq) to learn causal representations that can mimic the causal factors, guiding us to pursue causal information for summary generation. The key idea is to reformulate the Variational Auto-encoder (VAE) to fit the joint distribution of the document and summary variables from the training corpus. Experimental results on two widely used text summarization datasets demonstrate the advantages of our approach.

    On the Robustness of Generative Retrieval Models: An Out-of-Distribution Perspective

    Recently, generative retrieval has been gaining increasing attention in the information retrieval (IR) field; it retrieves documents by directly generating their identifiers. So far, much effort has been devoted to developing effective generative retrieval models, while less attention has been paid to robustness. When a new retrieval paradigm enters real-world applications, it is also critical to measure its out-of-distribution (OOD) generalization, i.e., how generative retrieval models generalize to new distributions. To answer this question, we first define OOD robustness from three perspectives in retrieval problems: 1) query variations; 2) unforeseen query types; and 3) unforeseen tasks. Based on this taxonomy, we conduct empirical studies to analyze the OOD robustness of several representative generative retrieval models against dense retrieval models. The empirical results indicate that the OOD robustness of generative retrieval models requires enhancement. We hope studying the OOD robustness of generative retrieval models will be advantageous to the IR community. Comment: 4 pages, submitted to GenIR2

    Continual Learning for Generative Retrieval over Dynamic Corpora

    Generative retrieval (GR) directly predicts the identifiers of relevant documents (i.e., docids) based on a parametric model. It has achieved solid performance on many ad-hoc retrieval tasks. So far, these tasks have assumed a static document collection. In many practical scenarios, however, document collections are dynamic, where new documents are continuously added to the corpus. The ability to incrementally index new documents while preserving the ability to answer queries with both previously and newly indexed relevant documents is vital to applying GR models. In this paper, we address this practical continual learning problem for GR. We put forward a novel Continual-LEarner for generatiVE Retrieval (CLEVER) model and make two major contributions to continual learning for GR: (i) To encode new documents into docids with low computational cost, we present Incremental Product Quantization, which updates a partial quantization codebook according to two adaptive thresholds; and (ii) To memorize new documents for querying without forgetting previous knowledge, we propose a memory-augmented learning mechanism to form meaningful connections between old and new documents. Empirical results demonstrate the effectiveness and efficiency of the proposed model. Comment: Accepted by CIKM 202
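The idea of a partial codebook update can be sketched as follows. This is a simplified illustration, not the paper's actual Incremental Product Quantization (which uses two adaptive thresholds over full PQ subspaces): new vectors are assigned to their nearest centroids, and a centroid is moved only when the induced shift exceeds a single drift threshold, leaving stable centroids, and hence the codes of old documents, untouched.

```python
import numpy as np

def update_codebook(codebook, new_vectors, drift_thresh=0.1):
    """Simplified partial codebook update: assign each new vector to its
    nearest centroid, then move a centroid toward the mean of its assigned
    vectors only if the shift exceeds `drift_thresh`."""
    codebook = codebook.copy()
    # pairwise distances between new vectors and centroids, via broadcasting
    dists = np.linalg.norm(new_vectors[:, None, :] - codebook[None, :, :], axis=-1)
    assign = dists.argmin(axis=1)
    for k in range(len(codebook)):
        members = new_vectors[assign == k]
        if len(members) == 0:
            continue
        shift = members.mean(axis=0) - codebook[k]
        if np.linalg.norm(shift) > drift_thresh:
            codebook[k] = codebook[k] + shift  # move centroid to the new mean
    return codebook
```

Because only drifting centroids move, most old docids remain valid after an update, which is the point of making the update partial.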

    Learning to Truncate Ranked Lists for Information Retrieval

    Ranked list truncation is of critical importance in a variety of professional information retrieval applications such as patent search or legal search. The goal is to dynamically determine the number of returned documents according to some user-defined objectives, in order to reach a balance between the overall utility of the results and user effort. Existing methods formulate this task as a sequential decision problem and take some pre-defined loss as a proxy objective, which suffers from the limitations of local decisions and indirect optimization. In this work, we propose a global decision based truncation model named AttnCut, which directly optimizes user-defined objectives for ranked list truncation. Specifically, we adopt the successful transformer architecture to capture the global dependency within the ranked list for the truncation decision, and employ reward augmented maximum likelihood (RAML) for direct optimization. We consider two types of user-defined objectives of practical use. One is a widely adopted balanced metric such as F1, and the other is the best F1 under some minimal recall constraint, which represents a typical objective in professional search. Empirical results over the Robust04 and MQ2007 datasets demonstrate the effectiveness of our approach as compared with state-of-the-art baselines.
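The F1 truncation objective is easy to state precisely. The sketch below is an oracle that enumerates every cut position of a ranked list and returns the one maximizing F1; a learned model such as the one described would predict this cut without seeing the relevance labels, so this function only illustrates the target being optimized.

```python
def best_cutoff_f1(relevance, total_relevant):
    """Given binary relevance labels for a ranked list (top first) and the
    total number of relevant documents, return (k, f1): the cut position
    that maximizes F1 and the F1 value achieved there."""
    best_k, best_f1 = 0, 0.0
    hits = 0
    for k, rel in enumerate(relevance, start=1):
        hits += rel
        precision = hits / k
        recall = hits / total_relevant if total_relevant else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        if f1 > best_f1:
            best_k, best_f1 = k, f1
    return best_k, best_f1
```

For the list [relevant, relevant, non-relevant, non-relevant, relevant] with three relevant documents in total, cutting after position 2 gives precision 1.0 and recall 2/3, which maximizes F1.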

    L^2R: Lifelong Learning for First-stage Retrieval with Backward-Compatible Representations

    First-stage retrieval is a critical task that aims to retrieve relevant document candidates from a large-scale collection. While existing retrieval models have achieved impressive performance, they are mostly studied on static data sets, ignoring that in the real world, data on the Web is continuously growing with potential distribution drift. Consequently, retrievers trained on static old data may not suit new-coming data well and inevitably produce sub-optimal results. In this work, we study lifelong learning for first-stage retrieval, especially focusing on the setting where the emerging documents are unlabeled, since relevance annotation is expensive and may not keep up with data emergence. Under this setting, we aim to develop model updating with two goals: (1) to effectively adapt to the evolving distribution with the unlabeled new-coming data, and (2) to avoid re-inferring all embeddings of old documents, to efficiently update the index each time the model is updated. We first formalize the task and then propose a novel Lifelong Learning method for first-stage Retrieval, namely L^2R. L^2R adopts the typical memory mechanism for lifelong learning, and incorporates two crucial components: (1) selecting diverse support negatives for model training and memory updating for effective model adaptation, and (2) a ranking alignment objective to ensure the backward-compatibility of representations, to save the cost of index rebuilding without hurting model performance. For evaluation, we construct two new benchmarks from the LoTTE and Multi-CPR datasets to simulate document distribution drift in realistic retrieval scenarios. Extensive experiments show that L^2R significantly outperforms competitive lifelong learning baselines. Comment: accepted by CIKM202
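Backward-compatibility via ranking alignment can be sketched with a pairwise hinge loss. This is an illustrative stand-in, not the paper's exact objective: for every document pair the old model scores in some order, the new model is penalized whenever it flips that order, so old document embeddings can keep being compared against new query embeddings without rebuilding the index.

```python
def ranking_alignment_loss(new_scores, old_scores, margin=0.0):
    """Pairwise hinge sketch of a ranking alignment objective: for each
    pair (i, j) the old model ranked i above j, penalize the new model
    when new_scores does not preserve that order by at least `margin`."""
    loss = 0.0
    n = len(new_scores)
    for i in range(n):
        for j in range(n):
            if old_scores[i] > old_scores[j]:
                # zero when the new model keeps the old pairwise order
                loss += max(0.0, margin - (new_scores[i] - new_scores[j]))
    return loss
```

When the new model preserves every old pairwise order, the loss is zero; each flipped pair contributes in proportion to how badly it is flipped.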

    PRADA: Practical Black-Box Adversarial Attacks against Neural Ranking Models

    Neural ranking models (NRMs) have shown remarkable success in recent years, especially with pre-trained language models. However, deep neural models are notorious for their vulnerability to adversarial examples. Adversarial attacks may become a new type of web spamming technique given our increased reliance on neural information retrieval models. Therefore, it is important to study potential adversarial attacks to identify vulnerabilities of NRMs before they are deployed. In this paper, we introduce the Adversarial Document Ranking Attack (ADRA) task against NRMs, which aims to promote a target document in rankings by adding adversarial perturbations to its text. We focus on the decision-based black-box attack setting, where the attackers have no access to the model parameters and gradients, but can only acquire the rank positions of the partially retrieved list by querying the target model. This attack setting is realistic for real-world search engines. We propose a novel Pseudo Relevance-based ADversarial ranking Attack method (PRADA) that learns a surrogate model based on Pseudo Relevance Feedback (PRF) to generate gradients for finding the adversarial perturbations. Experiments on two web search benchmark datasets show that PRADA can outperform existing attack strategies and successfully fool the NRM with small, indiscernible perturbations of text.
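The outer loop of such an attack can be illustrated with a greedy substitution sketch. This is a generic simplification, not PRADA itself: here `score` stands in for the surrogate model's relevance score (which in the paper's setting is learned from pseudo relevance feedback), `candidates` is a hypothetical map from token positions to substitution options, and at each step the single substitution that most increases the score is kept, up to a small edit budget.

```python
def greedy_perturb(doc_tokens, candidates, score, max_edits=3):
    """Greedy black-box perturbation sketch: try each candidate token
    substitution, keep the one with the largest score gain, and repeat
    until no substitution helps or `max_edits` edits have been made."""
    tokens = list(doc_tokens)
    for _ in range(max_edits):
        base = score(tokens)
        best_gain, best_edit = 0.0, None
        for i, alts in candidates.items():
            for alt in alts:
                trial = tokens[:i] + [alt] + tokens[i + 1:]
                gain = score(trial) - base
                if gain > best_gain:
                    best_gain, best_edit = gain, (i, alt)
        if best_edit is None:
            break  # no substitution improves the score any further
        i, alt = best_edit
        tokens[i] = alt
    return tokens
```

Keeping the edit budget small is what keeps the perturbation inconspicuous; the real attack additionally constrains substitutions to near-synonyms.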

    FCS-HGNN: Flexible Multi-type Community Search in Heterogeneous Information Networks

    Community Search (CS), a crucial task in network science, has attracted considerable interest owing to its ability to unveil personalized communities, and has found applications across diverse domains. Existing research primarily focuses on traditional homogeneous networks and cannot be directly applied to heterogeneous information networks (HINs). Existing research also has further limitations: either it focuses solely on single-type or solely on multi-type community search, which severely limits flexibility, or it requires users to specify meta-paths or predefined community structures, which poses significant challenges for users unfamiliar with community search and HINs. In this paper, we propose an innovative method, FCS-HGNN, that can flexibly identify either single-type or multi-type communities in HINs based on user preferences. We propose a heterogeneous information transformer to handle node heterogeneity, and an edge-semantic attention mechanism to address edge heterogeneity. This not only accounts for the varying contributions of edges when identifying different communities, but also circumvents the challenges presented by meta-paths, thereby unifying the single-type and multi-type community search problems. Moreover, to enhance applicability to large-scale graphs, we propose neighbor sampling and depth-based heuristic search strategies, resulting in LS-FCS-HGNN. This algorithm significantly improves training and query efficiency while maintaining outstanding community effectiveness. We conducted extensive experiments on five real-world large-scale HINs, and the results demonstrate the effectiveness and efficiency of our proposed method, which significantly outperforms state-of-the-art methods. Comment: 13 pages
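The idea of weighting neighbor messages by edge type can be sketched in a few lines. This is a toy illustration, not the paper's trained mechanism: `type_score` plays the role of learned per-edge-type attention logits, and each neighbor's feature vector is weighted by a softmax over the scores of the edge types connecting it.

```python
import numpy as np

def edge_semantic_aggregate(neighbor_feats, edge_types, type_score):
    """Toy edge-semantic aggregation: softmax the per-edge-type scores of
    the incident edges, then return the weighted sum of neighbor features,
    so different relation types contribute differently to the message."""
    scores = np.array([type_score[t] for t in edge_types], dtype=float)
    w = np.exp(scores - scores.max())  # stable softmax over incident edges
    w = w / w.sum()
    return (w[:, None] * neighbor_feats).sum(axis=0)
```

With equal scores this reduces to plain mean aggregation; raising one edge type's score shifts the aggregate toward neighbors connected by that relation, which is how different community semantics can be expressed without hand-written meta-paths.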

    Synthesis and Antioxidant Activity of Cationic 1,2,3-Triazole Functionalized Starch Derivatives

    In this study, starch was chemically modified to improve its antioxidant activity. Five novel cationic 1,2,3-triazole functionalized starch derivatives were synthesized using the "click" reaction and N-alkylation. A convenient method for the pre-azidation of starch was developed. The structures of the derivatives were analyzed using FTIR and ¹H NMR. The radical scavenging abilities of the derivatives against hydroxyl radicals, DPPH radicals, and superoxide radicals were tested in vitro in order to evaluate their antioxidant activity. Results revealed that all the cationic starch derivatives (2a-2e), as well as the precursor starch derivatives (1a-1e), had significantly improved antioxidant activity compared to native starch. In particular, the scavenging ability of the derivatives against superoxide radicals was extremely strong. The improved antioxidant activity benefited from the enhanced solubility and the added positive charges. The biocompatibility of the cationic derivatives was confirmed by the low hemolytic rate (<2%). The derivatives obtained in this study have great potential as antioxidant materials applicable in the fields of food and biomedicine.