Search CORE

66 research outputs found

Boosting implicit discourse relation recognition with connective-based word embeddings

Author: Changxing Wu
Jinsong Su
Xiaodong Shi
Yidong Chen
Publication venue: 'Elsevier BV'
Publication date: 20/09/2019
Field of study

Abstract(#br)Implicit discourse relation recognition is the performance bottleneck of discourse structure analysis. To alleviate the shortage of training data, previous methods usually use explicit discourse data, which are naturally labeled by connectives, as additional training data. However, it is often difficult for them to integrate large amounts of explicit discourse data because of the noise problem. In this paper, we propose a simple and effective method to leverage massive explicit discourse data. Specifically, we learn connective-based word embeddings ( CBWE ) by performing connective classification on explicit discourse data. The learned CBWE is capable of capturing discourse relationships between words, and can be used as pre-trained word embeddings for implicit discourse relation recognition. On both the English PDTB and Chinese CDTB data sets, using CBWE achieves significant improvements over baselines with general word embeddings, and better performance than baselines integrating explicit discourse data. By combining CBWE with a strong baseline, we achieve the state-of-the-art performance

Xiamen University Institutional Repository

Improving Implicit Discourse Relation Classification by Modeling Inter-dependencies of Discourse Units in a Paragraph

Author: Dai Zeyu
Huang Ruihong
Publication venue
Publication date: 30/09/1985
Field of study

We argue that semantic meanings of a sentence or clause can not be interpreted independently from the rest of a paragraph, or independently from all discourse relations and the overall paragraph-level discourse structure. With the goal of improving implicit discourse relation classification, we introduce a paragraph-level neural networks that model inter-dependencies between discourse units as well as discourse relation continuity and patterns, and predict a sequence of discourse relations in a paragraph. Experimental results show that our model outperforms the previous state-of-the-art systems on the benchmark corpus of PDTB.Comment: Accepted by NAACL 201

arXiv.org e-Print Archive

Revistas y Boletines - Banco de la República

Argument Labeling of Discourse Relations using LSTM Neural Networks

Author: Hooda Sohail
Publication venue
Publication date: 24/01/2019
Field of study

A discourse relation can be described as a linguistic unit that is composed of sub-units that, when combined, present more information than the sum of its parts. A discourse relation is usually comprised of two arguments that relate to each other in a given form. A discourse relation may have another optional sub-unit called the discourse connective that connects the two arguments and describes the relationship between the two more explicitly. This is called Explicit Discourse relation. Extracting or labeling arguments present in an explicit discourse relations is a challenging task. In recent years, due to the CoNLL competitions, feature engineering has been applied to allow various machine learning models to achieve an F-measure value of about 55%. However, feature engineering is brittle and hand-crafted, requiring advanced knowledge of linguistics as well as the dataset in question. In this thesis, we propose an approach for segmenting (or identifying the boundaries of) Arg1 and Arg2 without feature engineering. We introduce a Bidirectional Long Short-Term Memory (LSTM) based model for argument labeling. We experimented with multiple configurations of our model. Using the Penn Discourse Treebank (PDTB) dataset, our best model achieved an F1 measure of 23.05% without any feature engineering. This is significantly higher than the 20.52% achieved by the state of the art Recurrent Neural Network (RNN) approach, but significantly lower than the feature based state of the art systems. On the other hand, because our approach learns only from the raw dataset, it is more widely applicable to multiple textual genres and languages

Concordia University Research Repository

Extracting Temporal and Causal Relations between Events

Author: Mirza Paramita
Publication venue
Publication date: 12/04/2016
Field of study

Structured information resulting from temporal information processing is crucial for a variety of natural language processing tasks, for instance to generate timeline summarization of events from news documents, or to answer temporal/causal-related questions about some events. In this thesis we present a framework for an integrated temporal and causal relation extraction system. We first develop a robust extraction component for each type of relations, i.e. temporal order and causality. We then combine the two extraction components into an integrated relation extraction system, CATENA---CAusal and Temporal relation Extraction from NAtural language texts---, by utilizing the presumption about event precedence in causality, that causing events must happened BEFORE resulting events. Several resources and techniques to improve our relation extraction systems are also discussed, including word embeddings and training data expansion. Finally, we report our adaptation efforts of temporal information processing for languages other than English, namely Italian and Indonesian.Comment: PhD Thesi

arXiv.org e-Print Archive

Unitn-eprints PhD

Recommended from our members

Advances in statistical script learning

Author: Pichotta Karl
Publication venue
Publication date: 05/02/2018
Field of study

When humans encode information into natural language, they do so with the clear assumption that the reader will be able to seamlessly make inferences based on world knowledge. For example, given the sentence ``Mrs. Dalloway said she would buy the flowers herself,'' one can make a number of probable inferences based on event co-occurrences: she bought flowers, she went to a store, she took the flowers home, and so on. Observing this, it is clear that many different useful natural language end-tasks could benefit from models of events as they typically co-occur (so-called script models). Robust question-answering systems must be able to infer highly-probable implicit events from what is explicitly stated in a text, as must robust information-extraction systems that map from unstructured text to formal assertions about relations expressed in the text. Coreference resolution systems, semantic role labeling, and even syntactic parsing systems could, in principle, benefit from event co-occurrence models. To this end, we present a number of contributions related to statistical event co-occurrence models. First, we investigate a method of incorporating multiple entities into events in a count-based co-occurrence model. We find that modeling multiple entities interacting across events allows for improved empirical performance on the task of modeling sequences of events in documents. Second, we give a method of applying Recurrent Neural Network sequence models to the task of predicting held-out predicate-argument structures from documents. This model allows us to easily incorporate entity noun information, and can allow for more complex, higher-arity events than a count-based co-occurrence model. We find the neural model improves performance considerably over the count-based co-occurrence model. Third, we investigate the performance of a sequence-to-sequence encoder-decoder neural model on the task of predicting held-out predicate-argument events from text. This model does not explicitly model any external syntactic information, and does not require a parser. We find the text-level model to be competitive in predictive performance with an event level model directly mediated by an external syntactic analysis. Finally, motivated by this result, we investigate incorporating features derived from these models into a baseline noun coreference resolution system. We find that, while our additional features do not appreciably improve top-level performance, we can nonetheless provide empirical improvement on a number of restricted classes of difficult coreference decisions.Computer Science

Texas ScholarWorks

Recommended from our members

Content Selection for Effective Counter-Argument Generation

Author: Hidey Christopher
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2020
Field of study

The information ecosystem of social media has resulted in an abundance of opinions on political topics and current events. In order to encourage better discussions, it is important to promote high-quality responses and relegate low-quality ones. We thus focus on automatically analyzing and generating counter-arguments in response to posts on social media with the goal of providing effective responses. This thesis is composed of three parts. In the first part, we conduct an analysis of arguments. Specifically, we first annotate discussions from Reddit for aspects of arguments and then analyze them for their persuasive impact. Then we present approaches to identify the argumentative structure of these discussions and predict the persuasiveness of an argument. We evaluate each component independently using automatic or manual evaluations and show significant improvement in each. In the second part, we leverage our discoveries from our analysis in the process of generating counter-arguments. We develop two approaches in the retrieve-and-edit framework, where we obtain content using methods created during our analysis of arguments, among others, and then modify the content using techniques from natural language generation. In the first approach, we develop an approach to retrieve counter-arguments by annotating a dataset for stance and building models for stance prediction. Then we use our approaches from our analysis of arguments to extract persuasive argumentative content before modifying non-content phrases for coherence. In contrast, in the second approach we create a dataset and models for modifying content -- making semantic edits to a claim to have a contrasting stance. We evaluate our approaches using intrinsic automatic evaluation of our predictive models and an overall human evaluation of our generated output. Finally, in the third part, we discuss the semantic challenges of argumentation that we need to solve in order to make progress in the understanding of arguments. To clarify, we develop new methods for identifying two types of semantic relations -- causality and veracity. For causality, we build a distant-labeled dataset of causal relations using lexical indicators and then we leverage features from those indicators to build predictive models. For veracity, we build new models to retrieve evidence given a claim and predict whether the claim is supported by that evidence. We also develop a new dataset for veracity to illuminate the areas that need progress. We evaluate these approaches using automated and manual techniques and obtain significant improvement over strong baselines. Finally, we apply these techniques to claims in the domain of household electricity consumption, mining claims using our methods for causal relations and then verifying their truthfulness

Columbia University Academic Commons

Extreme multi-label deep neural classification of Spanish health records according to the International Classification of Diseases

Author: Blanco Garcés Alberto
Publication venue
Publication date: 20/09/2022
Field of study

111 p.Este trabajo trata sobre la minería de textos clínicos, un campo del Procesamiento del Lenguaje Natural aplicado al dominio biomédico. El objetivo es automatizar la tarea de codificación médica. Los registros electrónicos de salud (EHR) son documentos que contienen información clínica sobre la salud de unpaciente. Los diagnósticos y procedimientos médicos plasmados en la Historia Clínica Electrónica están codificados con respecto a la Clasificación Internacional de Enfermedades (CIE). De hecho, la CIE es la base para identificar estadísticas de salud internacionales y el estándar para informar enfermedades y condiciones de salud. Desde la perspectiva del aprendizaje automático, el objetivo es resolver un problema extremo de clasificación de texto de múltiples etiquetas, ya que a cada registro de salud se le asignan múltiples códigos ICD de un conjunto de más de 70 000 términos de diagnóstico. Una cantidad importante de recursos se dedican a la codificación médica, una laboriosa tarea que actualmente se realiza de forma manual. Los EHR son narraciones extensas, y los codificadores médicos revisan los registros escritos por los médicos y asignan los códigos ICD correspondientes. Los textos son técnicos ya que los médicos emplean una jerga médica especializada, aunque rica en abreviaturas, acrónimos y errores ortográficos, ya que los médicos documentan los registros mientras realizan la práctica clínica real. Paraabordar la clasificación automática de registros de salud, investigamos y desarrollamos un conjunto de técnicas de clasificación de texto de aprendizaje profundo

Archivo Digital para la Docencia y la Investigación