
    Relation Extraction with Self-determined Graph Convolutional Network

    Relation Extraction is a way of obtaining the semantic relationship between entities in text. State-of-the-art methods use linguistic tools to build a graph for the text in which the entities appear, and a Graph Convolutional Network (GCN) is then employed to encode the pre-built graph. Although their performance is promising, the reliance on linguistic tools results in a non-end-to-end process. In this work, we propose a novel model, the Self-determined Graph Convolutional Network (SGCN), which determines a weighted graph using a self-attention mechanism rather than any linguistic tool. The self-determined graph is then encoded using a GCN. We test our model on the TACRED dataset and achieve state-of-the-art results. Our experiments show that SGCN outperforms the traditional GCN, which uses dependency parsing tools to build the graph. Comment: CIKM-202
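    The SGCN pipeline described above -- self-attention scores forming a weighted adjacency matrix that a GCN layer then encodes -- can be sketched roughly as follows. This is an illustrative sketch, not the paper's actual architecture; the weight matrices, dimensions, and single-layer setup are assumptions:

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def self_determined_graph(H, Wq, Wk):
        # Self-attention scores become the edge weights of a fully connected,
        # weighted graph over the n tokens -- no dependency parser needed.
        Q, K = H @ Wq, H @ Wk
        scores = Q @ K.T / np.sqrt(K.shape[1])
        return softmax(scores, axis=-1)        # (n, n) weighted adjacency

    def gcn_layer(A, H, W):
        # One graph-convolution step over the self-determined adjacency A.
        return np.maximum(A @ H @ W, 0.0)      # ReLU(A H W)

    rng = np.random.default_rng(0)
    n, d = 5, 8                                # 5 tokens, 8-dim embeddings
    H = rng.normal(size=(n, d))
    A = self_determined_graph(H, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
    out = gcn_layer(A, H, rng.normal(size=(d, d)))
    print(A.shape, out.shape)                  # (5, 5) (5, 8)
    ```

    Because the adjacency comes from attention over the tokens themselves, the whole pipeline is trainable end to end, which is the contrast with parser-built graphs drawn in the abstract.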

    Machine Learning for Information Extraction

    The dissertation presents a number of novel machine learning techniques and applies them to information extraction. The study addresses several information extraction subtasks: part-of-speech tagging, entity extraction, coreference resolution, and relation extraction. Each task is formalized as a learning problem, and appropriate learning algorithms are developed and applied to it. The dissertation studies part-of-speech tagging as a multi-class classification problem and applies the SNOW (Sparse Network of Winnows) learning system to learn a part-of-speech classifier. A comprehensive experimental evaluation of the system confirms that it is appropriate for NLP applications. The dissertation addresses the problem of entity extraction in conjunction with coreference resolution. A classification approach is presented for entity extraction, and coreference resolution is treated from the decoding perspective. The dissertation describes novel decoding algorithms that, given local coreference decisions, produce a globally coherent interpretation of document entities. The dissertation studies the problem of relation extraction as a classification problem and applies kernel methods to learn the relation classifiers. Novel kernels are defined in terms of shallow parses, and efficient algorithms are given for computing them. The study evaluates the kernel approach experimentally, with positive results. The dissertation combines the constituent solutions into a single coherent information extraction system and concludes that machine learning is a viable methodology for designing natural language processing applications.
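    As a toy illustration of a kernel defined over shallow parses -- not the dissertation's actual kernels, which are more sophisticated -- one can take the inner product of sparse feature vectors built from chunk-label bigrams; the feature map here is invented for illustration:

    ```python
    from collections import Counter

    def chunk_features(chunks):
        # Bag of chunk-label bigrams from a shallow parse (hypothetical feature map).
        return Counter(zip(chunks, chunks[1:]))

    def parse_kernel(c1, c2):
        # Kernel value = inner product of the two sparse feature vectors,
        # computed without materializing the full feature space.
        f1, f2 = chunk_features(c1), chunk_features(c2)
        return sum(v * f2[k] for k, v in f1.items())

    a = ["NP", "VP", "NP", "PP", "NP"]   # chunk labels of one shallow parse
    b = ["NP", "VP", "NP"]               # chunk labels of another
    print(parse_kernel(a, b))            # shared bigrams: (NP,VP), (VP,NP) -> 2
    ```

    Any kernel-based learner (e.g. an SVM or kernel perceptron) can then use `parse_kernel` as its similarity function, which is the sense in which relation extraction is treated as a classification problem.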

    Part of Speech Tagging Using a Network of Linear Separators

    We present an architecture and an on-line learning algorithm and apply them to the problem of part-of-speech tagging. The architecture presented, SNOW, is a network of linear separators in the feature space that uses the Winnow update algorithm. Multiplicative weight-update algorithms such as Winnow have been shown to behave exceptionally well on very high dimensional problems, especially when the target concepts depend on only a small subset of the features in the feature space. In this paper we describe an architecture that utilizes this mistake-driven algorithm for multi-class prediction -- selecting the part of speech of a word. The experimental analysis presented here provides further evidence that these algorithms are suitable for natural language problems. The algorithm used is an on-line algorithm: every example is used by the algorithm only once and is then discarded. This has significance in terms of efficiency, as well as quick adaptation to new conte..
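    A minimal sketch of the mistake-driven Winnow update the abstract refers to -- multiplicative promotion on false negatives, demotion on false positives -- might look like this. The Boolean data, target concept, and hyperparameters are invented for illustration and are not from the paper:

    ```python
    def winnow_predict(w, x, theta):
        # Predict 1 iff the total weight of active features reaches the threshold.
        return 1 if sum(wi for wi, xi in zip(w, x) if xi) >= theta else 0

    def winnow_train(examples, n, alpha=2.0, epochs=3):
        # Mistake-driven multiplicative updates: promote active weights on a
        # false negative, demote them on a false positive. Weights of
        # irrelevant features are never promoted, so sparse targets are
        # learned with few mistakes.
        w, theta = [1.0] * n, n / 2
        for _ in range(epochs):
            for x, y in examples:
                if winnow_predict(w, x, theta) != y:
                    factor = alpha if y == 1 else 1.0 / alpha
                    w = [wi * factor if xi else wi for wi, xi in zip(w, x)]
        return w, theta

    # Toy target concept: x[0] OR x[1] over 6 Boolean features.
    data = [([1,0,0,1,0,0],1), ([0,1,1,0,0,0],1), ([0,0,1,1,0,0],0),
            ([0,0,0,0,1,1],0), ([1,1,0,0,0,0],1), ([0,0,0,1,1,0],0)]
    w, theta = winnow_train(data, 6)
    print(all(winnow_predict(w, x, theta) == y for x, y in data))  # True
    ```

    SNOW extends this binary scheme to multi-class prediction by training one such separator per class and selecting the class whose separator is most active.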

    Toward a Theory of Learning Coherent Concepts

    We develop a theory for learning scenarios in which multiple learners co-exist but there are mutual compatibility constraints on their outcomes. This is natural in cognitive learning situations, where "natural" compatibility constraints are imposed on the outcomes of classifiers so that a valid sentence, image, or any other domain representation is produced. We suggest that work in this direction may help to resolve the contrast between the hardness of learning as predicted by current theoretical models and the apparent ease with which cognitive systems seem to learn. A model of concept learning is studied in which the target concept is required to cohere with other concepts of interest. The coherency is expressed via a (Boolean) constraint that the concepts have to satisfy. Under this model, learning a concept is shown to be easier (in terms of sample complexity and mistake bounds), and the concepts learned are shown to be more robust to noise in their input (attribute n..
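    To give a feel for why a coherency constraint can ease learning, here is a hypothetical counting example (the specific constraint is our invention, not the paper's): requiring one Boolean concept to imply another shrinks the space of admissible hypothesis pairs that a learner must distinguish between.

    ```python
    from itertools import product

    # Hypotheses: all Boolean functions over two binary attributes,
    # encoded as the 4-tuple of outputs on inputs 00, 01, 10, 11.
    hypotheses = list(product([0, 1], repeat=4))

    # Illustrative coherency constraint: concept f must imply concept g
    # on every input, i.e. f(x) <= g(x) for all x.
    coherent = [(f, g) for f in hypotheses for g in hypotheses
                if all(fi <= gi for fi, gi in zip(f, g))]

    print(len(coherent), len(hypotheses) ** 2)   # 81 256
    ```

    The constrained space (81 pairs) is far smaller than the unconstrained one (256 pairs), which is the intuition behind the improved sample complexity and mistake bounds the abstract mentions.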