3 research outputs found

    Models and Benchmarks for Representation Learning of Partially Observed Subgraphs

    Subgraphs are rich substructures in graphs, and their nodes and edges can be partially observed in real-world tasks. Under partial observation, existing node- or subgraph-level message passing produces suboptimal representations. In this paper, we formulate a novel task of learning representations of partially observed subgraphs. To solve this problem, we propose the Partial Subgraph InfoMax (PSI) framework and generalize existing InfoMax models, including DGI, InfoGraph, MVGRL, and GraphCL, into our framework. These models maximize the mutual information between the partial subgraph's summary and various substructures, from nodes to full subgraphs. In addition, we suggest a novel two-stage model with k-hop PSI, which reconstructs the representation of the full subgraph and improves its expressiveness from different local-global structures. Under training and evaluation protocols designed for this problem, we conduct experiments on three real-world datasets and demonstrate that PSI models outperform baselines.
    Comment: CIKM 2022 Short Paper (Camera-ready + Appendix)

    KOLD: Korean Offensive Language Dataset

    Although considerable attention has been paid to the detection of hate speech, most work has been done in English and does not carry over to other languages. To fill this gap, we present the Korean Offensive Language Dataset (KOLD), 40k comments labeled with offensiveness, target, and targeted group information. We also collect two types of spans, offensive spans and target spans, that justify the categorization decision within the text. Comparing the distribution of targeted groups with that of an existing English dataset, we point out the necessity of a hate speech dataset fitted to the language that best reflects the culture. We report the baseline performance of models built on top of large pretrained language models and trained with our dataset. We also show that title information serves as context and helps to discern the target of hatred, especially when the target is omitted in the comment.
    Comment: 8 pages, 1 figure