
    MARIO: Model Agnostic Recipe for Improving OOD Generalization of Graph Contrastive Learning

    In this work, we investigate the problem of out-of-distribution (OOD) generalization for unsupervised learning methods on graph data. This scenario is particularly challenging because graph neural networks (GNNs) have been shown to be sensitive to distributional shifts, even when labels are available. To address this challenge, we propose a Model-Agnostic Recipe for Improving OOD generalizability of unsupervised graph contrastive learning methods, which we refer to as MARIO. MARIO introduces two principles aimed at developing distributional-shift-robust graph contrastive methods to overcome the limitations of existing frameworks: (i) an Information Bottleneck (IB) principle for achieving generalizable representations and (ii) an Invariant principle that incorporates adversarial data augmentation to obtain invariant representations. To the best of our knowledge, this is the first work that investigates the OOD generalization problem of graph contrastive learning, with a specific focus on node-level tasks. Through extensive experiments, we demonstrate that our method achieves state-of-the-art performance on the OOD test set, while maintaining comparable performance on the in-distribution test set when compared to existing approaches. The source code for our method can be found at: https://github.com/ZhuYun97/MARIO
    Comment: 20 pages, 15 figures
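    The two principles can be illustrated with a small contrastive-learning sketch. This is only a schematic reading of the abstract, not the released MARIO code: the toy encoder, the InfoNCE loss, and the FGSM-style feature perturbation below are assumptions standing in for the paper's IB-regularized objective and adversarial augmentation.

```python
import torch
import torch.nn.functional as F

class TinyEncoder(torch.nn.Module):
    """Placeholder encoder: a plain linear map; a real setup would use a GNN."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = torch.nn.Linear(in_dim, out_dim)

    def forward(self, x, edge_index):
        return self.lin(x)  # edge_index ignored in this toy stand-in

def info_nce(z1, z2, tau=0.5):
    """InfoNCE contrastive loss between two views of the same nodes."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau                 # pairwise similarities
    labels = torch.arange(z1.size(0))          # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

def adversarial_view(encoder, x, edge_index, eps=0.01):
    """FGSM-style feature perturbation that maximizes the contrastive loss,
    mimicking adversarial augmentation for the invariant principle."""
    delta = torch.zeros_like(x, requires_grad=True)
    loss = info_nce(encoder(x, edge_index), encoder(x + delta, edge_index))
    loss.backward()
    return (x + eps * delta.grad.sign()).detach()  # worst case within an eps-ball

def train_step(encoder, optimizer, x, edge_index):
    """One step: build the adversarial view, then minimize the contrastive loss."""
    x_adv = adversarial_view(encoder, x, edge_index)
    optimizer.zero_grad()
    loss = info_nce(encoder(x, edge_index), encoder(x_adv, edge_index))
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage on random node features and an empty edge list.
x = torch.randn(32, 16)
edge_index = torch.empty(2, 0, dtype=torch.long)
enc = TinyEncoder(16, 8)
opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
print(train_step(enc, opt, x, edge_index))
```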

    Cross-relation Cross-bag Attention for Distantly-supervised Relation Extraction

    Distant supervision leverages knowledge bases to automatically label instances, thus allowing us to train a relation extractor without human annotations. However, the generated training data typically contain massive noise and may result in poor performance with vanilla supervised learning. In this paper, we propose to conduct multi-instance learning with a novel Cross-relation Cross-bag Selective Attention (C²SA), which leads to noise-robust training of a distantly supervised relation extractor. Specifically, we employ sentence-level selective attention to reduce the effect of noisy or mismatched sentences, while the correlation among relations is captured to improve the quality of the attention weights. Moreover, instead of treating all entity pairs equally, we pay more attention to entity pairs of higher quality, again through a selective attention mechanism. Experiments with two types of relation extractors demonstrate the superiority of the proposed approach over the state of the art, while further ablation studies verify our intuitions and demonstrate the effectiveness of the two proposed techniques.
    Comment: AAAI 201
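    For intuition, the two levels of selective attention can be sketched as follows. This is not the authors' implementation: the relation query vector, the bag construction, and the dot-product scoring are assumed ingredients, and the relation-correlation component of C²SA is omitted.

```python
import torch
import torch.nn.functional as F

def sentence_attention(sent_emb, rel_query):
    """Sentence-level selective attention: down-weight noisy or mismatched
    sentences inside one bag (all sentences mentioning one entity pair).
    sent_emb: (num_sentences, dim); rel_query: (dim,) target-relation query."""
    alpha = F.softmax(sent_emb @ rel_query, dim=0)   # (num_sentences,)
    return alpha @ sent_emb                          # bag representation, (dim,)

def cross_bag_attention(bag_reprs, rel_query):
    """Cross-bag selective attention: weight the bags (entity pairs) labeled with
    the same relation so that higher-quality bags contribute more.
    bag_reprs: (num_bags, dim)."""
    beta = F.softmax(bag_reprs @ rel_query, dim=0)   # (num_bags,)
    return beta @ bag_reprs                          # group representation, (dim,)

# Toy usage: three bags of sentence embeddings sharing the same relation label.
rel_query = torch.randn(8)
bags = [torch.randn(n, 8) for n in (4, 2, 5)]
bag_reprs = torch.stack([sentence_attention(b, rel_query) for b in bags])
group_repr = cross_bag_attention(bag_reprs, rel_query)
print(group_repr.shape)  # torch.Size([8])
```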

    A New Method on Software Reliability Prediction

    Relevant data collected during the software life cycle can be used to analyze and predict software reliability. First, the major disadvantages of current software reliability models are discussed. Then, based on an analysis of the classic PSO-SVM model and the characteristics of software reliability prediction, several improvements to the PSO-SVM model are proposed and the improved model is established. Finally, simulation results show that, compared with classic models, the improved model offers better prediction precision, better generalization ability, and lower dependence on the number of samples, making it more suitable for software reliability prediction.
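    As a rough illustration of the classic PSO-SVM idea the paper builds on, particle swarm optimization can be used to tune the hyper-parameters of a support vector regressor fitted to failure data. The sketch below uses scikit-learn's SVR with a plain PSO loop; the search ranges, swarm settings, and toy failure data are arbitrary placeholders rather than the paper's improved scheme.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

def fitness(params, X, y):
    """Negative cross-validated MSE of an SVR with the candidate (C, gamma)."""
    C, gamma = np.exp(params)  # search in log-space to keep both positive
    model = SVR(C=C, gamma=gamma)
    return cross_val_score(model, X, y, cv=3, scoring="neg_mean_squared_error").mean()

def pso_svm(X, y, n_particles=10, n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain particle swarm search over (log C, log gamma)."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-3, 3, size=(n_particles, 2))   # log-scale positions
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([fitness(p, X, y) for p in pos])
    gbest = pbest[pbest_val.argmax()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -6, 6)
        vals = np.array([fitness(p, X, y) for p in pos])
        improved = vals > pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmax()].copy()
    C, gamma = np.exp(gbest)
    return SVR(C=C, gamma=gamma).fit(X, y)

# Toy reliability-growth data: cumulative failures vs. testing time.
t = np.arange(1, 31, dtype=float).reshape(-1, 1)
failures = 50 * (1 - np.exp(-0.1 * t)).ravel() + np.random.default_rng(1).normal(0, 1, 30)
model = pso_svm(t, failures)
print(model.predict([[31.0]]))  # one-step-ahead prediction
```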

    An improved stochastic EM algorithm for large-scale full-information item factor analysis

    In this paper, we explore the use of the stochastic EM algorithm (Celeux & Diebolt, 1985) for large-scale full-information item factor analysis. Innovations have been made in its implementation, including (1) an adaptive-rejection-based Gibbs sampler for the stochastic E step, (2) a proximal gradient descent algorithm for the optimization in the M step, and (3) diagnostic procedures for determining the burn-in size and the stopping of the algorithm. These developments are based on the theoretical results of Nielsen (2000), as well as advanced sampling and optimization techniques. The proposed algorithm is computationally efficient and virtually tuning-free, making it scalable to large-scale data with many latent traits (e.g. more than five) and easy to use for practitioners. Standard errors of the parameter estimates are also obtained based on the missing information identity (Louis, 1982). The performance of the algorithm is evaluated through simulation studies and an application to the analysis of the IPIP-NEO personality inventory. Extensions of the proposed algorithm to other latent variable models are discussed.
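    The structure of a stochastic EM iteration can be sketched on a toy single-trait two-parameter logistic (2PL) item response model. This is only an illustration of the stochastic E step / M step alternation: a random-walk Metropolis sampler and a plain gradient update stand in for the paper's adaptive-rejection-based Gibbs sampler and proximal gradient step, and no burn-in or stopping diagnostics are included.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Toy data: N persons answering J items under a single-trait 2PL model.
N, J = 500, 10
true_a, true_b = rng.uniform(0.5, 2.0, J), rng.normal(0, 1, J)
theta_true = rng.normal(0, 1, N)
Y = (rng.random((N, J)) < sigmoid(np.outer(theta_true, true_a) + true_b)).astype(float)

def stochastic_em(Y, n_iter=200, lr=0.05, step=1.0):
    N, J = Y.shape
    a, b, theta = np.ones(J), np.zeros(J), np.zeros(N)
    for _ in range(n_iter):
        # Stochastic E step: one random-walk Metropolis move per person,
        # targeting p(theta_i | y_i, a, b) under a standard normal prior.
        def log_post(th):
            eta = np.outer(th, a) + b
            return (Y * eta - np.logaddexp(0.0, eta)).sum(axis=1) - 0.5 * th**2
        prop = theta + step * rng.normal(size=N)
        accept = np.log(rng.random(N)) < log_post(prop) - log_post(theta)
        theta = np.where(accept, prop, theta)
        # M step: one gradient ascent step on the complete-data log-likelihood
        # (the paper uses a proximal gradient update; a plain gradient step here).
        resid = Y - sigmoid(np.outer(theta, a) + b)      # (N, J)
        a += lr * (resid * theta[:, None]).mean(axis=0)
        b += lr * resid.mean(axis=0)
    return a, b, theta

a_hat, b_hat, _ = stochastic_em(Y)
print(np.corrcoef(a_hat, true_a)[0, 1], np.corrcoef(b_hat, true_b)[0, 1])
```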