20,988 research outputs found

Deep Structured Neural Network for Event Temporal Relation Extraction

Author: Galstyan Aram
Han Rujun
Hsu I-Hung
Peng Nanyun
Weischedel Ralph
Yang Mu
Publication date: 24/09/2019

We propose a novel deep structured learning framework for event temporal relation extraction. The model consists of 1) a recurrent neural network (RNN) to learn scoring functions for pair-wise relations, and 2) a structured support vector machine (SSVM) to make joint predictions. The neural network automatically learns representations that account for long-term contexts to provide robust features for the structured model, while the SSVM incorporates domain knowledge such as transitive closure of temporal relations as constraints to make better globally consistent decisions. By jointly training the two components, our model combines the benefits of both data-driven learning and knowledge exploitation. Experimental results on three high-quality event temporal relation datasets (TCR, MATRES, and TB-Dense) demonstrate that, when combined with pre-trained contextualized embeddings, the proposed model performs significantly better than the state-of-the-art methods on all three datasets. We also provide thorough ablation studies to investigate our model.

Comment: This paper will be published in CoNLL 201
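
The transitivity constraint mentioned in the abstract can be illustrated with a small consistency check. This is only a minimal sketch, not the paper's SSVM inference: the label names (`BEFORE`, `AFTER`), the event identifiers, and the function `transitivity_violations` are hypothetical and chosen for illustration.

```python
from itertools import permutations


def transitivity_violations(relations):
    """Find triples that break the rule: a BEFORE b and b BEFORE c implies a BEFORE c.

    `relations` maps an ordered event pair (a, c) to a predicted label.
    Pairs with no prediction are treated as unknown and never count as violations.
    """
    events = sorted({event for pair in relations for event in pair})
    violations = []
    for a, b, c in permutations(events, 3):
        if (relations.get((a, b)) == "BEFORE"
                and relations.get((b, c)) == "BEFORE"
                and relations.get((a, c), "BEFORE") != "BEFORE"):
            violations.append((a, b, c))
    return violations


# Hypothetical pairwise predictions made independently for each pair.
predicted = {
    ("e1", "e2"): "BEFORE",
    ("e2", "e3"): "BEFORE",
    ("e1", "e3"): "AFTER",  # contradicts the two predictions above
}

print(transitivity_violations(predicted))
```

A joint model would rule out such inconsistent label combinations during inference instead of detecting them afterwards.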

arXiv.org e-Print Archive

Domain Knowledge Empowered Structured Neural Net for End-to-End Event Temporal Relation Extraction

Author: Han Rujun
Peng Nanyun
Zhou Yichao
Publication date: 06/10/2020

Extracting event temporal relations is a critical task for information extraction and plays an important role in natural language understanding. Prior systems leverage deep learning and pre-trained language models to improve the performance of the task. However, these systems often suffer from two shortcomings: 1) when performing maximum a posteriori (MAP) inference based on neural models, previous systems only used structured knowledge that is assumed to be absolutely correct, i.e., hard constraints; 2) biased predictions on dominant temporal relations when training with a limited amount of data. To address these issues, we propose a framework that enhances deep neural networks with distributional constraints constructed from probabilistic domain knowledge. We solve the constrained inference problem via Lagrangian Relaxation and apply it to end-to-end event temporal relation extraction tasks. Experimental results show our framework is able to improve the baseline neural network models with strong statistical significance on two widely used datasets in news and clinical domains.

Comment: Appear in EMNLP'2
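
The general idea of Lagrangian relaxation for constrained decoding can be sketched as follows. This is a heavily simplified illustration, not the paper's formulation: it replaces the distributional constraints with a single hypothetical cap on how often one label may be predicted, and the function name, scores, step size, and stopping rule are all assumptions.

```python
def constrained_decode(scores, label, max_count, step=0.2, iters=50):
    """Decode labels while discouraging `label` from exceeding `max_count` predictions.

    `scores` is a list of dicts mapping label -> model score, one dict per instance.
    A single multiplier `lam` penalizes the constrained label; it grows by a
    subgradient step while the constraint is violated. The loop stops at the first
    feasible assignment, which is a simplification and not guaranteed optimal.
    """
    lam = 0.0
    preds = []
    for _ in range(iters):
        preds = [
            max(s, key=lambda k: s[k] - (lam if k == label else 0.0))
            for s in scores
        ]
        excess = preds.count(label) - max_count
        if excess <= 0:
            break
        lam += step * excess
    return preds, lam


# Hypothetical scores for four event pairs; the model favors BEFORE.
scores = [
    {"BEFORE": 2.0, "AFTER": 1.0},
    {"BEFORE": 1.5, "AFTER": 1.2},
    {"BEFORE": 1.1, "AFTER": 1.0},
    {"BEFORE": 0.5, "AFTER": 2.0},
]

preds, lam = constrained_decode(scores, label="BEFORE", max_count=2)
print(preds, lam)
```

The penalty flips only the least confident BEFORE prediction, which is the intended effect of softening a bias toward a dominant relation.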

arXiv.org e-Print Archive

Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis

Author: Bihorac Azra
Rashidi Parisa
Shickel Benjamin
Tighe Patrick
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/02/2018

The past decade has seen an explosion in the amount of digital information stored in electronic health records (EHR). While primarily designed for archiving patient clinical information and administrative healthcare tasks, many researchers have found secondary use of these records for various clinical informatics tasks. Over the same period, the machine learning community has seen widespread advances in deep learning techniques, which also have been successfully applied to the vast amount of EHR data. In this paper, we review these deep EHR systems, examining architectures, technical aspects, and clinical applications. We also identify shortcomings of current techniques and discuss avenues of future research for EHR-based deep learning.

Comment: Accepted for publication with Journal of Biomedical and Health Informatics: http://ieeexplore.ieee.org/abstract/document/8086133

arXiv.org e-Print Archive

Brundlefly at SemEval-2016 Task 12: Recurrent Neural Networks vs. Joint Inference for Clinical Temporal Information Extraction

Author: Fries Jason Alan
Publication date: 04/06/2016

We submitted two systems to the SemEval-2016 Task 12: Clinical TempEval challenge, participating in Phase 1, where we identified text spans of time and event expressions in clinical notes, and Phase 2, where we predicted a relation between an event and its parent document creation time. For temporal entity extraction, we find that a joint inference-based approach using structured prediction outperforms a vanilla recurrent neural network that incorporates word embeddings trained on a variety of large clinical document sets. For document creation time relations, we find that a combination of date canonicalization and distant supervision rules for predicting relations on both events and time expressions improves classification, though gains are limited, likely due to the small scale of training data.

Comment: NAACL HLT 2016, SemEval-2016 Task 12 submissio

arXiv.org e-Print Archive

A Study of Recent Contributions on Information Extraction

Author: Azizi Shahrzad
Dashti HosseinAli Rahmani
Golshan Parisa Naderi
Safari Leila
Publication date: 15/03/2018

This paper reports on modern approaches in Information Extraction (IE) and its two main sub-tasks of Named Entity Recognition (NER) and Relation Extraction (RE). Basic concepts and the most recent approaches in this area are reviewed, which mainly include Machine Learning (ML) based approaches and the more recent trend toward Deep Learning (DL) based methods.

arXiv.org e-Print Archive

Word-Level Loss Extensions for Neural Temporal Relation Classification

Author: Leeuwenberg Artuur
Moens Marie-Francine
Publication date: 07/08/2018

Unsupervised pre-trained word embeddings are used effectively for many tasks in natural language processing to leverage unlabeled textual data. Often these embeddings are either used as initializations or as fixed word representations for task-specific classification models. In this work, we extend our classification model's task loss with an unsupervised auxiliary loss on the word-embedding level of the model. This is to ensure that the learned word representations contain both task-specific features, learned from the supervised loss component, and more general features learned from the unsupervised loss component. We evaluate our approach on the task of temporal relation extraction, in particular, narrative containment relation extraction from clinical records, and show that continued training of the embeddings on the unsupervised objective together with the task objective gives better task-specific embeddings, and results in an improvement over the state of the art on the THYME dataset, using only a general-domain part-of-speech tagger as linguistic resource.

Comment: Accepted at the 27th International Conference on Computational Linguistics (COLING 2018

arXiv.org e-Print Archive

Deep Learning applied to NLP

Author: Kalita Jugal
Lopez Marc Moreno
Publication date: 08/03/2017

Convolutional Neural Networks (CNNs) are typically associated with Computer Vision. CNNs are responsible for major breakthroughs in Image Classification and are the core of most Computer Vision systems today. More recently, CNNs have been applied to problems in Natural Language Processing and have produced some interesting results. In this paper, we will try to explain the basics of CNNs, their different variations, and how they have been applied to NLP.

arXiv.org e-Print Archive

Clinical Information Extraction via Convolutional Neural Network

Author: Huang Heng
Li Peng
Publication date: 30/03/2016

We report an implementation of a clinical information extraction tool that leverages a deep neural network to annotate event spans and their attributes from raw clinical notes and pathology reports. Our approach uses context words, their part-of-speech tags, and shape information as features. We then employ a temporal (1D) convolutional neural network to learn hidden feature representations. Finally, we use a Multilayer Perceptron (MLP) to predict event spans. The empirical evaluation demonstrates that our approach significantly outperforms baselines.

Comment: arXiv admin note: text overlap with arXiv:1408.5882 by other author
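
A temporal (1D) convolution over a token sequence can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-dimensional token features, the single hand-written filter, and the max-over-time pooling step are all hypothetical.

```python
def conv1d(sequence, kernel):
    """Slide a filter over a sequence of per-token feature vectors.

    `sequence` is a list of feature vectors, one per token.
    `kernel` is a list of weight vectors, one per position in the window.
    Returns one activation per window position (no padding, stride 1).
    """
    width = len(kernel)
    outputs = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        activation = sum(
            weight * value
            for weights, features in zip(kernel, window)
            for weight, value in zip(weights, features)
        )
        outputs.append(activation)
    return outputs


# Hypothetical features for four tokens, two features per token.
tokens = [[1, 0], [0, 1], [1, 1], [0, 0]]
# One filter spanning a window of two tokens.
kernel = [[1, 0], [0, 1]]

feature_map = conv1d(tokens, kernel)
pooled = max(feature_map)  # max-over-time pooling, a common follow-up step
print(feature_map, pooled)
```

In a full model, many such filters would be learned and their pooled outputs passed to a classifier such as the MLP mentioned above.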

arXiv.org e-Print Archive

Exploring Contextualized Neural Language Models for Temporal Dependency Parsing

Author: Cai Jonathon
Min Bonan
Ross Hayley
Publication date: 02/10/2020

Extracting temporal relations between events and time expressions has many applications such as constructing event timelines and time-related question answering. It is a challenging problem which requires syntactic and semantic information at sentence or discourse levels, which may be captured by deep contextualized language models (LMs) such as BERT (Devlin et al., 2019). In this paper, we develop several variants of a BERT-based temporal dependency parser, and show that BERT significantly improves temporal dependency parsing (Zhang and Xue, 2018a). We also present a detailed analysis of why deep contextualized neural LMs help and where they may fall short. Source code and resources are made available at https://github.com/bnmin/tdp_ranking

arXiv.org e-Print Archive

Learning Actor Relation Graphs for Group Activity Recognition

Author: Guo Jie
Wang Li
Wang Limin
Wu Gangshan
Wu Jianchao
Publication date: 22/04/2019

Modeling relations between actors is important for recognizing group activity in a multi-person scene. This paper aims at learning discriminative relations between actors efficiently using deep models. To this end, we propose to build a flexible and efficient Actor Relation Graph (ARG) to simultaneously capture the appearance and position relations between actors. Thanks to the Graph Convolutional Network, the connections in ARG can be automatically learned from group activity videos in an end-to-end manner, and the inference on ARG can be efficiently performed with standard matrix operations. Furthermore, in practice, we propose two variants to sparsify ARG for more effective modeling in videos: spatially localized ARG and temporal randomized ARG. We perform extensive experiments on two standard group activity recognition datasets: the Volleyball dataset and the Collective Activity dataset, where state-of-the-art performance is achieved on both datasets. We also visualize the learned actor graphs and relation features, which demonstrate that the proposed ARG is able to capture the discriminative relation information for group activity recognition.

Comment: Accepted by CVPR 201
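
How a relation graph over actors can be built and used with plain matrix-style operations is sketched below. This is an illustration under stated assumptions, not the paper's model: the dot-product appearance score, the distance threshold used to localize the graph spatially, the softmax normalization, and all names and values are hypothetical, and the learned weights and nonlinearity of a graph convolution layer are omitted.

```python
import math


def relation_graph(features, positions, max_dist):
    """Build a row-normalized relation matrix between actors.

    Appearance relation: dot product of feature vectors (an assumption).
    Position relation: actors farther apart than `max_dist` get zero weight.
    Each row is normalized with a softmax over the remaining actors.
    """
    graph = []
    for i in range(len(features)):
        logits = []
        for j in range(len(features)):
            if math.dist(positions[i], positions[j]) <= max_dist:
                logits.append(sum(a * b for a, b in zip(features[i], features[j])))
            else:
                logits.append(None)  # masked out as too far away
        peak = max(v for v in logits if v is not None)
        exps = [0.0 if v is None else math.exp(v - peak) for v in logits]
        total = sum(exps)
        graph.append([e / total for e in exps])
    return graph


def propagate(graph, features):
    """Aggregate actor features along graph edges (a plain matrix product)."""
    dims = range(len(features[0]))
    return [[sum(w * f[d] for w, f in zip(row, features)) for d in dims] for row in graph]


# Hypothetical actors: two nearby with similar appearance, one far away.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
positions = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]

graph = relation_graph(features, positions, max_dist=2.0)
updated = propagate(graph, features)
print(graph)
print(updated)
```

The distant actor contributes nothing to the other two rows, which mirrors the intent of sparsifying the graph by spatial locality.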

arXiv.org e-Print Archive