108 research outputs found

    Towards Zero-shot Relation Extraction in Web Mining: A Multimodal Approach with Relative XML Path

    Full text link
    The rapid growth of web pages and the increasing complexity of their structure poses a challenge for web mining models. Web mining models are required to understand the semi-structured web pages, particularly when little is known about the subject or template of a new page. Current methods migrate language models to the web mining by embedding the XML source code into the transformer or encoding the rendered layout with graph neural networks. However, these approaches do not take into account the relationships between text nodes within and across pages. In this paper, we propose a new approach, ReXMiner, for zero-shot relation extraction in web mining. ReXMiner encodes the shortest relative paths in the Document Object Model (DOM) tree which is a more accurate and efficient signal for key-value pair extraction within a web page. It also incorporates the popularity of each text node by counting the occurrence of the same text node across different web pages. We use the contrastive learning to address the issue of sparsity in relation extraction. Extensive experiments on public benchmarks show that our method, ReXMiner, outperforms the state-of-the-art baselines in the task of zero-shot relation extraction in web mining

    DPPred: an effective prediction framework with concise discriminative patterns and its biomedical applications

    Get PDF
    In the literature, two series of models have been proposed to address prediction problems including classification and regression. Simple models, such as generalized linear models, have ordinary performance but strong interpretability on a set of simple features. The other series, including tree-based models, organize numerical, categorical and high dimensional features into a comprehensive structure with rich interpretable information in the data. In this thesis, we propose a novel discriminative pattern-based prediction framework (DPPred) to accomplish the prediction tasks by taking their advantages of both effectiveness and interpretability. Specifically, DPPred adopts the concise discriminative patterns that are on the prefix paths from the root to leaf nodes in the tree-based models. Moreover, DPPred selects a limited number of the useful discriminative patterns by searching for the most effective pattern combination to fit generalized linear models. To validate the effectiveness of DPPred, we conduct experiments on both classification and regression tasks. Experimental results demonstrate that DPPred provides competitive accuracy with the state-of-the-art as well as the valuable interpretability for developers and experts. In particular, when studying health status for cardiopulmonary patients, DPPred shows the acceptable predicting accuracy (more than 95%) and reveals the importance of demographic features; when studying the amyotrophic lateral sclerosis (ALS) disease, DPPred not only outperforms the baselines by using only 40 concise discriminative patterns out of a potentially exponentially large set of patterns, but also discover novel markers

    Label Noise in Adversarial Training: A Novel Perspective to Study Robust Overfitting

    Full text link
    We show that label noise exists in adversarial training. Such label noise is due to the mismatch between the true label distribution of adversarial examples and the label inherited from clean examples - the true label distribution is distorted by the adversarial perturbation, but is neglected by the common practice that inherits labels from clean examples. Recognizing label noise sheds insights on the prevalence of robust overfitting in adversarial training, and explains its intriguing dependence on perturbation radius and data quality. Also, our label noise perspective aligns well with our observations of the epoch-wise double descent in adversarial training. Guided by our analyses, we proposed a method to automatically calibrate the label to address the label noise and robust overfitting. Our method achieves consistent performance improvements across various models and datasets without introducing new hyper-parameters or additional tuning.Comment: Neurips 2022 (Oral); A previous version of this paper (v1) used the title `Double Descent in Adversarial Training: An Implicit Label Noise Perspective

    An Attention-based Collaboration Framework for Multi-View Network Representation Learning

    Full text link
    Learning distributed node representations in networks has been attracting increasing attention recently due to its effectiveness in a variety of applications. Existing approaches usually study networks with a single type of proximity between nodes, which defines a single view of a network. However, in reality there usually exists multiple types of proximities between nodes, yielding networks with multiple views. This paper studies learning node representations for networks with multiple views, which aims to infer robust node representations across different views. We propose a multi-view representation learning approach, which promotes the collaboration of different views and lets them vote for the robust representations. During the voting process, an attention mechanism is introduced, which enables each node to focus on the most informative views. Experimental results on real-world networks show that the proposed approach outperforms existing state-of-the-art approaches for network representation learning with a single view and other competitive approaches with multiple views.Comment: CIKM 201

    Generating Efficient Training Data via LLM-based Attribute Manipulation

    Full text link
    In this paper, we propose a novel method, Chain-of-Thoughts Attribute Manipulation (CoTAM), to guide few-shot learning by carefully crafted data from Large Language Models (LLMs). The main idea is to create data with changes only in the attribute targeted by the task. Inspired by facial attribute manipulation, our approach generates label-switched data by leveraging LLMs to manipulate task-specific attributes and reconstruct new sentences in a controlled manner. Instead of conventional latent representation controlling, we implement chain-of-thoughts decomposition and reconstruction to adapt the procedure to LLMs. Extensive results on text classification and other tasks verify the advantage of CoTAM over other LLM-based text generation methods with the same number of training examples. Analysis visualizes the attribute manipulation effectiveness of CoTAM and presents the potential of LLM-guided learning with even less supervision

    Open-world Semi-supervised Generalized Relation Discovery Aligned in a Real-world Setting

    Full text link
    Open-world Relation Extraction (OpenRE) has recently garnered significant attention. However, existing approaches tend to oversimplify the problem by assuming that all unlabeled texts belong to novel classes, thereby limiting the practicality of these methods. We argue that the OpenRE setting should be more aligned with the characteristics of real-world data. Specifically, we propose two key improvements: (a) unlabeled data should encompass known and novel classes, including hard-negative instances; and (b) the set of novel classes should represent long-tail relation types. Furthermore, we observe that popular relations such as titles and locations can often be implicitly inferred through specific patterns, while long-tail relations tend to be explicitly expressed in sentences. Motivated by these insights, we present a novel method called KNoRD (Known and Novel Relation Discovery), which effectively classifies explicitly and implicitly expressed relations from known and novel classes within unlabeled data. Experimental evaluations on several Open-world RE benchmarks demonstrate that KNoRD consistently outperforms other existing methods, achieving significant performance gains.Comment: 10 pages, 6 figure