
    LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification

    Extreme multi-label text classification (XMC) is the task of finding the most relevant labels for a text from a very large label set. Deep learning-based methods have shown significant success in XMC, but existing methods (e.g., AttentionXML and X-Transformer) still suffer from 1) combining several models to train and predict for one dataset, and 2) sampling negative labels statically while training the label-ranking model, which reduces both the efficiency and the accuracy of the model. To address these problems, we propose LightXML, which adopts end-to-end training and dynamic negative label sampling. In LightXML, we use generative cooperative networks to recall and rank labels: the label-recalling part generates negative and positive labels, and the label-ranking part distinguishes the positive labels from them. Because both parts are fed the same text representation, negative labels are sampled dynamically while the label-ranking part trains. Extensive experiments show that LightXML outperforms state-of-the-art methods on five extreme multi-label datasets with a much smaller model size and lower computational complexity. In particular, on the Amazon dataset with 670K labels, LightXML reduces the model size by up to 72% compared to AttentionXML.
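    Below is a minimal sketch of the dynamic negative sampling idea described above: a recall head scores all labels from a shared text representation, and its top-k candidates (the true positives plus hard negatives that change every training step) are re-scored by a ranking head. The class name, sizes, and loss form are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecallRankHead(nn.Module):
    """Recall head generates label candidates; rank head re-scores them."""
    def __init__(self, hidden, num_labels, k=64):
        super().__init__()
        self.recall = nn.Linear(hidden, num_labels)       # label-recalling part
        self.rank_emb = nn.Embedding(num_labels, hidden)  # label-ranking part
        self.k = k

    def forward(self, text_repr, positives):
        # text_repr: (B, H) shared encoder output; positives: (B, L) multi-hot
        recall_logits = self.recall(text_repr)                     # (B, L)
        # Dynamic sampling: the candidate set changes every step as the
        # recall head improves, so the rank head always sees hard negatives.
        cand = recall_logits.topk(self.k, dim=-1).indices          # (B, k)
        rank_logits = torch.einsum(
            "bkh,bh->bk", self.rank_emb(cand), text_repr)          # (B, k)
        cand_targets = positives.gather(1, cand)                   # (B, k)
        return (F.binary_cross_entropy_with_logits(recall_logits, positives)
                + F.binary_cross_entropy_with_logits(rank_logits, cand_targets))

head = RecallRankHead(hidden=768, num_labels=10_000, k=64)
x = torch.randn(2, 768)                  # e.g. a [CLS]-style text representation
y = torch.zeros(2, 10_000)
y[0, 5] = y[1, 123] = 1.0
print(head(x, y))                        # scalar training loss
```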

    What is the Way Allah's Word Manifests Itself in Yemeni Arabic?

    In this paper, the author shows how ‘Allah’ is used in daily Yemeni Arabic conversation. The term Allah has a variety of meanings in Yemeni Arabic, as it does across the Arab world, reflecting the belief that Allah alone is in charge of all affairs and grants blessings, and the term is invoked to encourage or criticize someone. As a result, Allah appears in a large number of fixed expressions. Some of these expressions carry more than one meaning: Allah alaik, for example, has two distinct literal senses. The word also occurs in other expressions with entirely different functions, such as expressing distress or seeking guidance. The study examines how the term occurs in everyday social contact and reactions, and the cultural influences on native Yemenis. The rest of the paper explores other common expressions used in Yemeni society, showing that the word's use is heavily shaped by religion and culture.

    Frequency Enhanced Hybrid Attention Network for Sequential Recommendation

    The self-attention mechanism, with its strong capability for modeling long-range dependencies, is one of the most extensively used techniques in sequential recommendation. However, many recent studies show that current self-attention-based models act as low-pass filters and are inadequate for capturing high-frequency information. Furthermore, since the items in a user's behavior sequence are intertwined with each other, these models cannot fully distinguish the inherent periodicity obscured in the time domain. In this work, we shift the perspective to the frequency domain and propose a novel Frequency Enhanced Hybrid Attention Network for Sequential Recommendation, namely FEARec. In this model, we first improve the original time-domain self-attention in the frequency domain with a ramp structure, so that both low-frequency and high-frequency information can be explicitly learned by our approach. Moreover, we design a similar attention mechanism via auto-correlation in the frequency domain to capture periodic characteristics, and fuse the time-level and frequency-level attention in a unified model. Finally, both contrastive learning and frequency regularization are utilized to ensure that multiple views are aligned in both the time and frequency domains. Extensive experiments conducted on four widely used benchmark datasets demonstrate that the proposed model performs significantly better than state-of-the-art approaches. Comment: 11 pages, 7 figures, The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval.
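    A minimal sketch of the frequency-domain side of this idea follows, under the assumption that the ramp structure assigns each layer a contiguous frequency band: the sequence representation is moved to the frequency domain with an FFT, one band is kept, and the result is transformed back. Shapes and band boundaries are illustrative, not FEARec's exact design.

```python
import torch

def band_filter(x: torch.Tensor, lo: float, hi: float) -> torch.Tensor:
    # x: (B, T, H) item-sequence representations
    T = x.size(1)
    freq = torch.fft.rfft(x, dim=1)                  # (B, T//2 + 1, H)
    n = freq.size(1)
    lo_i, hi_i = int(lo * n), max(int(hi * n), int(lo * n) + 1)
    mask = torch.zeros(n, device=x.device)
    mask[lo_i:hi_i] = 1.0                            # keep one frequency band
    freq = freq * mask.view(1, -1, 1)
    return torch.fft.irfft(freq, n=T, dim=1)         # back to the time domain

# "Ramp": layer l keeps band [l/L, (l+1)/L], sweeping from low to high
# frequencies across layers, so no band is left unmodeled.
x = torch.randn(4, 50, 64)
L = 4
outs = [band_filter(x, l / L, (l + 1) / L) for l in range(L)]
print(outs[0].shape)  # torch.Size([4, 50, 64])
```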

    Intent Contrastive Learning with Cross Subsequences for Sequential Recommendation

    Users' purchase behaviors are mainly influenced by their intentions (e.g., buying clothes for decoration, buying brushes for painting, etc.). Modeling a user's latent intention can significantly improve the performance of recommendation. Previous works model users' intentions either by relying on predefined labels in auxiliary information or by introducing stochastic data augmentation to learn intentions in the latent space. However, auxiliary information is sparse and not always available to recommender systems, and stochastic data augmentation may introduce noise and thus change the intentions hidden in the sequence. Leveraging user intentions for sequential recommendation (SR) is therefore challenging, because intentions frequently vary and are unobserved. In this paper, Intent contrastive learning with Cross Subsequences for sequential Recommendation (ICSRec) is proposed to model users' latent intentions. Specifically, ICSRec first segments a user's sequential behaviors into multiple subsequences with a dynamic sliding operation and feeds these subsequences into the encoder to generate representations of the user's intentions. To tackle the absence of explicit intention labels, ICSRec assumes that different subsequences with the same target item may represent the same intention, and proposes coarse-grained intent contrastive learning to push these subsequences closer together. Fine-grained intent contrastive learning is then introduced to capture the fine-grained intentions within sequential behaviors. Extensive experiments conducted on four real-world datasets demonstrate the superior performance of the proposed ICSRec model compared with baseline methods. Comment: 10 pages, 5 figures, WSDM 2024. arXiv admin note: text overlap with arXiv:2304.0776
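    The subsequence construction is the load-bearing step here, so a minimal sketch follows: a sliding operation splits each sequence into (prefix, target) pairs, and prefixes that share a target item are grouped as positive pairs for the coarse-grained contrastive loss. The window policy and the grouping are assumptions for illustration, not ICSRec's exact procedure.

```python
from collections import defaultdict

def dynamic_slide(seq, min_len=2):
    # Every prefix of length >= min_len yields a (subsequence, target) pair.
    return [(seq[:i], seq[i]) for i in range(min_len, len(seq))]

user_seqs = [[3, 7, 2, 9, 5], [1, 7, 2, 9], [4, 9, 5]]  # toy item-id sequences
by_target = defaultdict(list)
for seq in user_seqs:
    for sub, target in dynamic_slide(seq):
        by_target[target].append(sub)

# Subsequences sharing a target item are assumed to share an intent: they
# become the positives pulled together by the contrastive objective.
for target, subs in by_target.items():
    if len(subs) > 1:
        print(f"target {target}: positive pair candidates {subs}")
```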

    RHCO: A Relation-aware Heterogeneous Graph Neural Network with Contrastive Learning for Large-scale Graphs

    Heterogeneous graph neural networks (HGNNs) have been widely applied to heterogeneous information network tasks, yet most HGNNs suffer from poor scalability or weak representations when applied to large-scale heterogeneous graphs. To address these problems, we propose a novel Relation-aware Heterogeneous Graph Neural Network with Contrastive Learning (RHCO) for large-scale heterogeneous graph representation learning. Unlike traditional heterogeneous graph neural networks, we adopt a contrastive learning mechanism to deal with the complex heterogeneity of large-scale heterogeneous graphs. We first learn relation-aware node embeddings under the network-schema view. Then we propose a novel positive sample selection strategy to choose meaningful positive samples. After learning node embeddings under the positive-sample-graph view, we perform cross-view contrastive learning to obtain the final node representations. Moreover, we adopt the label smoothing technique to boost RHCO's performance. Extensive experiments on three large-scale academic heterogeneous graph datasets show that RHCO achieves the best performance among state-of-the-art models.
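    As a rough illustration of the cross-view contrastive step, the sketch below treats a node's embedding under the network-schema view and its embedding under the positive-sample-graph view as a positive pair and applies an InfoNCE-style loss. The encoders are stubbed with random tensors, and the temperature and loss form are common-practice assumptions rather than RHCO's exact objective.

```python
import torch
import torch.nn.functional as F

def cross_view_infonce(z1, z2, tau=0.5):
    # z1, z2: (N, H) node embeddings from the two views
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau           # (N, N) cross-view similarities
    targets = torch.arange(z1.size(0))   # diagonal pairs are the positives
    return F.cross_entropy(logits, targets)

z_schema = torch.randn(128, 64)   # network-schema-view embeddings (stub)
z_pos = torch.randn(128, 64)      # positive-sample-graph-view embeddings (stub)
print(cross_view_infonce(z_schema, z_pos))   # scalar contrastive loss
```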

    Modeling Heterogeneous Relations across Multiple Modes for Potential Crowd Flow Prediction

    Predicting the potential crowd flow of newly planned transportation sites is a fundamental task for urban planners and administrators. Intuitively, the potential crowd flow of a new site can be inferred from nearby sites. However, the transportation modes of nearby sites (e.g., bus stations, bicycle stations) may differ from that of the target site (e.g., a subway station), which leads to severe data scarcity issues. To this end, we propose a data-driven approach, named MOHER, to predict the potential crowd flow in a given mode for a newly planned site. Specifically, we first identify the neighbor regions of the target site by examining geographical proximity as well as urban function similarity. Then, to aggregate these heterogeneous relations, we devise a cross-mode relational GCN, a novel relation-specific transformation model that can learn not only the correlations but also the differences between transportation modes. Afterward, we design an aggregator for inductive potential flow representation. Finally, an LSTM module is used for sequential flow prediction. Extensive experiments on real-world datasets demonstrate the superiority of the MOHER framework over state-of-the-art algorithms. Comment: Accepted by the 35th AAAI Conference on Artificial Intelligence (AAAI 2021).
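    A minimal sketch of a relation-specific transformation in the spirit of the cross-mode relational GCN: neighbor sites are grouped by transportation mode, each mode has its own learnable projection, and the target site aggregates the projected features. The mode ids, dimensions, and mean aggregator are illustrative assumptions, not MOHER's exact layer.

```python
import torch
import torch.nn as nn

class CrossModeLayer(nn.Module):
    """Aggregates neighbor sites with one projection per transportation mode."""
    def __init__(self, in_dim, out_dim, num_modes):
        super().__init__()
        self.self_proj = nn.Linear(in_dim, out_dim)
        # One learnable transform per mode (e.g. bus, bicycle, subway).
        self.mode_proj = nn.ModuleList(
            nn.Linear(in_dim, out_dim) for _ in range(num_modes))

    def forward(self, target, neighbors):
        # target: (in_dim,); neighbors: {mode_id: (n_neighbors, in_dim)}
        msgs = [self.mode_proj[m](feats).mean(dim=0)   # mode-specific message
                for m, feats in neighbors.items()]
        return torch.relu(self.self_proj(target) + torch.stack(msgs).sum(dim=0))

layer = CrossModeLayer(in_dim=16, out_dim=16, num_modes=3)
site = torch.randn(16)                                  # new site's features
nbrs = {0: torch.randn(5, 16), 1: torch.randn(3, 16)}   # bus / bicycle neighbors
print(layer(site, nbrs).shape)                          # torch.Size([16])
```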

    Graph Learning and Its Applications: A Holistic Survey

    Graph learning is a prevalent domain that endeavors to learn the intricate relationships among nodes and the topological structure of graphs. These relationships make graphs unique compared to conventional tabular data, as nodes lie in a non-Euclidean space and carry rich information to exploit. Over the years, graph learning has expanded from graph theory to graph data mining. With the advent of representation learning, it has attained remarkable performance in diverse scenarios, including text, images, chemistry, and biology. Owing to these extensive application prospects, graph learning attracts considerable attention from the academic community. Despite the many works proposed to tackle different problems in graph learning, there is a need to survey this prior work. While some researchers have recognized this need and produced impressive surveys on graph learning, they fail to connect related objectives, methods, and applications coherently, and, given the rapid expansion of the field, they do not cover current scenarios and challenging problems. Unlike previous surveys on graph learning, we provide a holistic review that analyzes current work from the perspective of graph structure and discusses the latest applications, trends, and challenges in graph learning. Specifically, we begin by proposing a taxonomy based on the composition of graph data and then summarize the methods employed in graph learning. We then provide a detailed elucidation of mainstream applications. Finally, based on current technical trends, we propose future directions. Comment: 20 pages, 7 figures, 3 tables.