Enriching entity grids and graphs with discourse relations: the impact in local coherence evaluation
This paper describes how discursive knowledge, provided by the discourse theories RST (Rhetorical Structure Theory) and CST (Cross-document Structure Theory), may improve the automatic evaluation of local coherence in multi-document summaries. Two of the main coherence models from the literature were augmented with discursive information and achieved 91.3% accuracy, a gain of 53% over the original results.
A discursive grid approach to model local coherence in multi-document summaries
Multi-document summarization is a very important area of Natural Language Processing (NLP) nowadays because of the huge amount of data on the web. People want more and more information, and this information must be coherently organized and summarized. The main focus of this paper is the coherence of multi-document summaries. To that end, a model that uses discursive information to automatically evaluate local coherence in multi-document summaries has been developed. This model achieves 92.69% accuracy in distinguishing coherent from incoherent summaries, outperforming the state of the art in the area.
Coherence in Machine Translation
Coherence ensures that individual sentences work together to form a meaningful document. When properly translated, a coherent document in one language should result in a coherent document in the other. In Machine Translation, however, for reasons of modeling and computational complexity, sentences are pieced together from words or phrases based on short context windows and with no access to extra-sentential context.
In this thesis I propose ways to automatically assess the coherence of machine translation output. The work is structured around three dimensions: entity-based coherence, coherence as evidenced via syntactic patterns, and coherence as evidenced via discourse relations.
For the first time, I evaluate existing monolingual coherence models on this new task, identifying issues and challenges that are specific to the machine translation setting. To address these issues, I adapt a state-of-the-art syntax model, which also improves performance on the monolingual task. The results clearly indicate how much more difficult the new task is than the task of detecting shuffled texts. I propose a new coherence model, exploring the cross-lingual transfer of discourse relations in machine translation. This model is novel in that it measures the correctness of a discourse relation by comparison to the source text rather than to a reference translation. I identify patterns of incoherence common across different language pairs, and create a corpus of machine-translated output annotated with coherence errors for evaluation purposes. I then examine lexical coherence in a multilingual context, as a preliminary study for cross-lingual transfer. Finally, I determine how the new and adapted models correlate with human judgements of translation quality, and suggest that general evaluation within machine translation would benefit from a coherence component that evaluates the translation output with respect to the source text.
Probabilistic approaches for modeling text structure and their application to text-to-text generation
Since the early days of generation research, it has been acknowledged that modeling the global structure of a document is crucial for producing coherent, readable output. However, traditional knowledge-intensive approaches have been of limited utility in addressing this problem since they cannot be effectively scaled to operate in domain-independent, large-scale applications. Due to this difficulty, existing text-to-text generation systems rarely rely on such structural information when producing an output text. Consequently, texts generated by these methods do not match the quality of those written by humans – they are often fraught with severe coherence violations and disfluencies.
In this chapter, I will present probabilistic models of document structure that can be effectively learned from raw document collections. This feature distinguishes these new models from the traditional knowledge-intensive approaches used in symbolic concept-to-text generation. Our results demonstrate that these probabilistic models can be directly applied to content organization, and suggest that these models can prove useful in an even broader range of text-to-text applications than we have considered here. National Science Foundation (U.S.) (CAREER grant IIS-0448168); Microsoft Research New Faculty Fellowship
Modeling Local Coherence: An Entity-Based Approach
This article proposes a novel framework for representing and measuring local coherence. Central to this approach is the entity-grid representation of discourse, which captures patterns of entity distribution in a text. The algorithm introduced in the article automatically abstracts a text into a set of entity transition sequences and records distributional, syntactic, and referential information about discourse entities. We re-conceptualize coherence assessment as a learning task and show that our entity-based representation is well-suited for ranking-based generation and text classification tasks. Using the proposed representation, we achieve good performance on text ordering, summary coherence evaluation, and readability assessment.
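The entity-grid idea can be sketched in a few lines. The toy below is illustrative, not the authors' implementation: the grid (entities, with grammatical roles S, O, X, or "-" for absent, per sentence) is assumed to be pre-extracted, and coherence features are simply the frequencies of entity role transitions.

```python
# Minimal sketch of an entity grid: each entity's grammatical role is
# recorded per sentence, and local coherence features are the relative
# frequencies of role transitions down each entity's column.
from collections import Counter

# Toy grid: keys are entities, values are per-sentence roles (assumed parsed).
grid = {
    "Microsoft": ["S", "O", "S"],   # subject, object, subject
    "market":    ["-", "X", "-"],   # absent, other, absent
}

def transition_probs(grid, n=2):
    """Probability of each length-n role transition across all entity columns."""
    counts = Counter()
    for roles in grid.values():
        for i in range(len(roles) - n + 1):
            counts[tuple(roles[i:i + n])] += 1
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

probs = transition_probs(grid)
print(probs[("S", "O")])  # 1 of 4 bigram transitions -> 0.25
```

In the actual model these transition probabilities form a feature vector that a ranking classifier is trained on; the point here is only the grid-to-features abstraction.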
Graph-based Patterns for Local Coherence Modeling
Coherence is an essential property of well-written texts. It distinguishes a multi-sentence text from a sequence of randomly strung-together sentences. The task of local coherence modeling concerns the way sentences in a text link up with one another. Solving this task is beneficial for assessing the quality of texts. Moreover, a coherence model can be integrated into text generation systems, such as text summarizers, to produce coherent texts.
In this dissertation, we present a graph-based approach to local coherence modeling that accounts for the connectivity structure among sentences in a text. Graphs give our model the capability to take into account relations between non-adjacent sentences as well as those between adjacent sentences. Moreover, the connectivity style among nodes in a graph reflects the relationships among the sentences in the corresponding text.
We first employ the entity graph approach, proposed by Guinaudeau and Strube (2013), to represent a text via a graph. In the entity graph representation of a text, nodes encode sentences and edges depict the existence of a pair of coreferent mentions in sentences. We then devise graph-based features to capture the connectivity structure of nodes in a graph, and accordingly the connectivity structure of sentences in the corresponding text. We extract all subgraphs of entity graphs as features which encode the connectivity structure of graphs. Frequencies of subgraphs correlate with the perceived coherence of their corresponding texts. Therefore, we refer to these subgraphs as coherence patterns.
In order to complete our approach to coherence modeling, we propose a new graph representation of texts, rather than the entity graph. Our approach employs lexico-semantic relations among words in sentences, instead of only entity coreference relations, to model relationships between sentences via a graph. This new lexical graph representation of text, together with our method for mining coherence patterns, constitutes our coherence model.
We evaluate our approach on the readability assessment task because a primary factor of readability is coherence. Coherent texts are easy to read and consequently demand less effort from their readers. Our extensive experiments on two separate readability assessment datasets show that frequencies of coherence patterns in texts correlate with the readability ratings assigned by human judges. By training a machine learning method on our coherence patterns, our model outperforms its counterparts on ranking texts with respect to their readability. As one of the ultimate goals of coherence models is to be used in text generation systems, we show how our coherence patterns can be integrated into a graph-based text summarizer to produce informative and coherent summaries. Our coherence patterns improve the performance of the summarization system based on both standard summarization metrics and human evaluations. An implementation of the approaches discussed in this dissertation is publicly available.
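The entity-graph representation described above can be illustrated with a rough sketch. This is a simplified stand-in, not the dissertation's implementation: the entity sets per sentence are invented and assumed to be pre-resolved, and the graph here is the simple one-mode view in which two sentences are linked when they share an entity.

```python
# Hedged sketch of an entity graph: sentences are nodes, and an edge
# links two sentences that mention a common entity. A crude connectivity
# score is the average node degree.
from itertools import combinations

# Toy input: each sentence reduced to its (assumed, pre-resolved) entity set.
sentences = [
    {"court", "ruling"},
    {"ruling", "appeal"},
    {"weather"},
]

def entity_graph_edges(sents):
    """Undirected edges between sentence pairs sharing at least one entity."""
    return [(i, j) for i, j in combinations(range(len(sents)), 2)
            if sents[i] & sents[j]]

edges = entity_graph_edges(sentences)
print(edges)  # only sentences 0 and 1 share an entity -> [(0, 1)]
avg_degree = 2 * len(edges) / len(sentences)  # simple connectivity score
```

The coherence patterns of the dissertation would then be frequent subgraphs mined from such graphs, rather than this single average-degree number.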
Neural approaches to discourse coherence: modeling, evaluation and application
Discourse coherence is an important aspect of text quality that refers to the way different textual units relate to each other. In this thesis, I investigate neural approaches to modeling discourse coherence. I present a multi-task neural network where the main task is to predict a document-level coherence score and the secondary task is to learn word-level syntactic features. Additionally, I examine the effect of using contextualised word representations in single-task and multi-task setups. I evaluate my models on a synthetic dataset where incoherent documents are created by shuffling the sentence order in coherent original documents. The results show the efficacy of my multi-task learning approach, particularly when enhanced with contextualised embeddings, achieving new state-of-the-art results in ranking the coherent documents higher than the incoherent ones (96.9%). Furthermore, I apply my approach to the realistic domain of people’s everyday writing, such as emails and online posts, and further demonstrate its ability to capture various degrees of coherence. In order to further investigate the linguistic properties captured by coherence models, I create two datasets that exhibit syntactic and semantic alterations. Evaluating different models on these datasets reveals their ability to capture syntactic perturbations but their inadequacy to detect semantic changes. I find that semantic alterations are instead captured by models that first build sentence representations from averaged word embeddings, then apply a set of linear transformations over input sentence pairs. Finally, I present an application for coherence models in the pedagogical domain. I first demonstrate that state-of-the-art neural approaches to automated essay scoring (AES) are not robust to adversarially created, grammatical, but incoherent sequences of sentences.
Accordingly, I propose a framework for integrating and jointly training a coherence model with a state-of-the-art neural AES system in order to enhance its ability to detect such adversarial input. I show that this joint framework maintains a performance comparable to the state-of-the-art AES system in predicting a holistic essay score while significantly outperforming it in adversarial detection.
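The synthetic shuffle evaluation used in the thesis above is straightforward to reproduce in outline. Here is a hedged sketch: the scorer is an invented word-overlap proxy, not the thesis's neural model, and the documents are toy sentence lists.

```python
# Sketch of the standard synthetic coherence evaluation: pair each
# document with a sentence-shuffled copy, and measure how often a
# scorer ranks the original above the shuffle.
import random

def make_pair(sents, seed=0):
    """Return (original, shuffled) versions of a sentence list."""
    rng = random.Random(seed)
    shuffled = sents[:]
    while shuffled == sents and len(sents) > 1:
        rng.shuffle(shuffled)
    return sents, shuffled

def ranking_accuracy(pairs, score):
    """Fraction of pairs where the original outscores its shuffled copy."""
    pairs = list(pairs)
    return sum(score(o) > score(s) for o, s in pairs) / len(pairs)

# Crude stand-in scorer: word overlap between adjacent sentences.
def overlap_score(sents):
    return sum(len(set(a.split()) & set(b.split()))
               for a, b in zip(sents, sents[1:]))

pair = (["a b c", "b c d", "c d e"], ["c d e", "a b c", "b c d"])
print(ranking_accuracy([pair], overlap_score))  # 1.0
```

The 96.9% figure quoted in the abstract is exactly this ranking accuracy, computed with a trained neural scorer in place of the overlap proxy.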
Mix Multiple Features to Evaluate the Content and the Linguistic Quality of Text Summaries
In this article, we propose a method for evaluating the content and linguistic quality of text summaries based on a machine learning approach. The method operates by combining multiple features to build predictive models that evaluate the content and the linguistic quality of new (unseen) summaries constructed from the same source documents as the summaries used in training and validating the models. To obtain the best model, many single and ensemble learning classifiers are tested. Using the constructed models, we achieve good performance in predicting the content and linguistic quality scores. To evaluate summarization systems, we calculate the system score as the average of the scores of the summaries built by the same system. We then evaluate the correlation of this system score with the manual system score. The obtained correlation indicates that the system score outperforms the baseline scores.
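The two-level scoring scheme described above (per-summary score, then system score as the average) can be illustrated with a toy linear combiner. The feature names and weights below are invented; the article trains single and ensemble classifiers rather than fixing weights by hand.

```python
# Toy illustration of combining multiple summary features into one
# quality score, then averaging summary scores into a system score.

def quality_score(features, weights):
    """Weighted combination of per-summary features into one score."""
    return sum(f * w for f, w in zip(features, weights))

def system_score(summary_scores):
    """System-level score: the average over that system's summaries."""
    return sum(summary_scores) / len(summary_scores)

# Invented features per summary: [content overlap, grammaticality, non-redundancy]
weights = [0.5, 0.3, 0.2]
summaries = [[0.8, 0.9, 1.0], [0.6, 0.7, 0.8]]
scores = [quality_score(f, weights) for f in summaries]
print(round(system_score(scores), 2))  # 0.77
```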
Learning to tell tales: automatic story generation from Corpora
Automatic story generation has a long-standing tradition in the field of Artificial Intelligence. The ability to create stories on demand holds great potential for entertainment and education. For example, modern computer games are becoming more immersive, containing multiple story lines and hundreds of characters. This has substantially increased the amount of work required to produce each game. However, by allowing the game to write its own story line, it can remain engaging to the player whilst shifting the burden of writing away from the game’s developers. In education, intelligent tutoring systems can potentially provide students with instant feedback and suggestions on how to write their own stories. Although several approaches have been introduced in the past (e.g., story grammars, story schemas and autonomous agents), they all rely heavily on handwritten resources, which places severe limitations on their scalability and usage.
In this thesis we will motivate a new approach to story generation which takes its inspiration from recent research in Natural Language Generation, the result of which is an interactive data-driven system for the generation of children’s stories. One of the key features of this system is that it is end-to-end, realising the various components of the generation pipeline stochastically. Knowledge relating to the generation and planning of stories is leveraged automatically from corpora and reformulated into new stories to be presented to the user.
We will also show that story generation can be viewed as a search task, operating over a large number of stories that can be generated from knowledge inherent in a corpus. Using trainable scoring functions, our system can search the story space using different document-level criteria. In this thesis we focus on two of these, namely coherence and interest. We will also present two major paradigms for generation through search: (a) generate and rank, and (b) genetic algorithms. We show the effects on perceived story interest, fluency and coherence that result from these approaches. In addition, we show how the explicit use of plots induced from the corpus can be used to guide the generation process, providing a heuristically motivated starting point for story search.
We motivate extensions to the system and show that additional modules can be used to improve the quality of the generated stories and overall scalability. Finally, we highlight the current strengths and limitations of our approach and discuss possible future approaches to this field of research.