Thread Reconstruction in Conversational Data using Neural Coherence Models
Discussion forums are an important source of information. They are often used
to answer specific questions a user might have and to discover more about a
topic of interest. Discussions in these forums may evolve in intricate ways,
making it difficult for users to follow the flow of ideas. We propose a novel
approach for automatically identifying the underlying thread structure of a
forum discussion. Our approach is based on a neural model that computes
coherence scores of possible reconstructions and then selects the highest
scoring, i.e., the most coherent one. Preliminary experiments demonstrate
promising results, outperforming a number of strong baseline methods.
Comment: Neu-IR: Workshop on Neural Information Retrieval 201
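The selection step described above (score every candidate reconstruction, keep the most coherent one) can be sketched as follows. The neural coherence model is replaced here by a hypothetical word-overlap score, and candidates are restricted to linear orderings for brevity; both are assumptions, not the paper's method:

```python
from itertools import permutations

def coherence_score(thread):
    """Toy stand-in for the paper's neural coherence model:
    reward word overlap (Jaccard) between consecutive posts."""
    score = 0.0
    for parent, child in zip(thread, thread[1:]):
        a, b = set(parent.lower().split()), set(child.lower().split())
        score += len(a & b) / max(len(a | b), 1)
    return score

def reconstruct_thread(posts):
    """Score every candidate reconstruction and keep the most coherent.
    Candidates here are linear orderings; the paper considers full
    thread (reply-to) structures."""
    return list(max(permutations(posts), key=coherence_score))
```

Because the maximum is taken over all orderings, the selected reconstruction always scores at least as high as the input order.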
Coherence Aspects Within EFL Essays
The purpose of this research is to discover Indonesian undergraduate students' understanding of how to compose a coherent English essay. The subjects of the research are essay texts written by thirty randomly chosen undergraduate students on the theme "Students' attitude toward Indonesian language". The analysis reveals that the coherence aspects have not been used properly; moreover, one aspect is not used at all. The coherence aspects consist of repetition of key nouns, consistent pronoun use, transition signals, and logical order. This research is designed as library research using the content analysis method. Data are analyzed to investigate the coherence of the students' essays.
Key words: essay, coherence
Entropy and Graph Based Modelling of Document Coherence using Discourse Entities: An Application
We present two novel models of document coherence and their application to
information retrieval (IR). Both models approximate document coherence using
discourse entities, e.g. the subject or object of a sentence. Our first model
views text as a Markov process generating sequences of discourse entities
(entity n-grams); we use the entropy of these entity n-grams to approximate the
rate at which new information appears in text, reasoning that as more new words
appear, the topic increasingly drifts and text coherence decreases. Our second
model extends the work of Guinaudeau & Strube [28] that represents text as a
graph of discourse entities, linked by different relations, such as their
distance or adjacency in text. We use several graph topology metrics to
approximate different aspects of the discourse flow that can indicate
coherence, such as the average clustering or betweenness of discourse entities
in text. Experiments with several instantiations of these models show that: (i)
our models perform on a par with two other well-known models of text coherence
even without any parameter tuning, and (ii) reranking retrieval results
according to their coherence scores gives notable performance gains, confirming
a relation between document coherence and relevance. This work contributes two
novel models of document coherence, the application of which to IR complements
recent work in the integration of document cohesiveness or comprehensibility to
ranking [5, 56].
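The first model's entropy computation over entity n-grams can be illustrated with a minimal sketch. Only the entropy-over-n-grams idea comes from the abstract; the plug-in Shannon estimator and treating pre-extracted entity tokens as input are assumptions:

```python
import math
from collections import Counter

def entity_ngram_entropy(entities, n=2):
    """Shannon entropy of entity n-grams. Under the abstract's reasoning,
    higher entropy approximates faster topic drift, hence lower coherence."""
    grams = [tuple(entities[i:i + n]) for i in range(len(entities) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    # Plug-in estimate: -sum p(g) log2 p(g) over observed n-grams.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A text that keeps repeating the same entity yields zero entropy, while a text that never revisits an entity yields the maximum for its length, matching the intuition that repetition signals coherence.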
Enriching entity grids and graphs with discourse relations: the impact in local coherence evaluation
This paper describes how discursive knowledge, provided by the discourse theories RST (Rhetorical Structure Theory) and CST (Cross-document Structure Theory), may improve the automatic evaluation of local coherence in multi-document summaries. Two of the main coherence models from the literature were augmented with discursive information and obtained 91.3% accuracy, a gain of 53% over the original results.
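For context, a bare-bones entity grid of the kind these coherence models build (before the RST/CST enrichment the paper adds) might look like the sketch below. Real grids record syntactic roles (S/O/X) per mention rather than mere occurrence; taking pre-extracted entity tokens per sentence as input is an assumption:

```python
def build_entity_grid(sentences):
    """Minimal entity-grid sketch: rows = sentences, columns = entities,
    'X' if the entity occurs in the sentence, '-' otherwise.
    Each sentence is given as a list of its entity tokens."""
    entities = sorted({e for sent in sentences for e in sent})
    grid = [['X' if e in sent else '-' for e in entities] for sent in sentences]
    return entities, grid
```

Coherence models then read transition patterns down each column, e.g. how often an entity persists across adjacent sentences.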
Topological Sort for Sentence Ordering
Sentence ordering is the task of arranging the sentences of a given text in
the correct order. Recent work using deep neural networks for this task has
framed it as a sequence prediction problem. In this paper, we propose a new
framing of this task as a constraint solving problem and introduce a new
technique to solve it. Additionally, we propose a human evaluation for this
task. The results on both automatic and human metrics across four different
datasets show that this new technique is better at capturing coherence in
documents.
Comment: Will be published at the Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL) 202
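The constraint-solving framing can be illustrated with Kahn's topological sort over pairwise "sentence i precedes sentence j" constraints. In the paper such constraints would come from a learned pairwise model; here they are simply given, which is an assumption made for the sketch:

```python
from collections import defaultdict, deque

def topological_order(n, constraints):
    """Order n sentences given pairwise (i, j) constraints meaning
    'sentence i comes before sentence j'. Kahn's algorithm; returns
    None if the constraints are cyclic (i.e., unsatisfiable)."""
    adj = defaultdict(list)
    indegree = [0] * n
    for i, j in constraints:
        adj[i].append(j)
        indegree[j] += 1
    queue = deque(i for i in range(n) if indegree[i] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return order if len(order) == n else None
```

Framing ordering this way decouples the learned pairwise decisions from the global arrangement, which the sort then recovers.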
Automated assessment of non-native learner essays: Investigating the role of linguistic features
Automatic essay scoring (AES) refers to the process of scoring free text
responses to given prompts, considering human grader scores as the gold
standard. Writing such essays is an essential component of many language and
aptitude exams. Hence, AES became an active and established area of research,
and there are many proprietary systems used in real life applications today.
However, not much is known about which specific linguistic features are useful
for prediction and how much of this is consistent across datasets. This article
addresses that by exploring the role of various linguistic features in
automatic essay scoring using two publicly available datasets of non-native
English essays written in test taking scenarios. The linguistic properties are
modeled by encoding lexical, syntactic, discourse and error types of learner
language in the feature set. Predictive models are then developed using these
features on both datasets and the most predictive features are compared. While
the results show that the feature set used results in good predictive models
with both datasets, the question "what are the most predictive features?" has a
different answer for each dataset.
Comment: Article accepted for publication at: International Journal of Artificial Intelligence in Education (IJAIED). To appear in early 2017 (journal url: http://www.springer.com/computer/ai/journal/40593)
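A toy illustration of the kind of lexical features such systems encode follows; the article's actual feature set, spanning lexical, syntactic, discourse, and error types, is far richer, and these particular proxies are assumptions of the sketch:

```python
def essay_features(text):
    """Toy lexical proxies for essay scoring: length, type-token ratio,
    mean word length, mean sentence length. Sentence splitting here is
    naive punctuation-based splitting."""
    words = text.split()
    sents = [s for s in text.replace('!', '.').replace('?', '.').split('.')
             if s.strip()]
    n = len(words)
    return {
        "num_words": n,
        "type_token_ratio": len({w.lower() for w in words}) / max(n, 1),
        "mean_word_len": sum(map(len, words)) / max(n, 1),
        "mean_sent_len": n / max(len(sents), 1),
    }
```

A regression model trained on such feature vectors against human scores would then expose which features carry predictive weight on a given dataset.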
Dialogue Coherence Assessment Without Explicit Dialogue Act Labels
Recent dialogue coherence models use the coherence features designed for
monologue texts, e.g. nominal entities, to represent utterances and then
explicitly augment them with dialogue-relevant features, e.g., dialogue act
labels. This approach has two drawbacks: (a) the semantics of utterances is limited to
entity mentions, and (b) the performance of coherence models strongly relies on
the quality of the input dialogue act labels. We address these issues by
introducing a novel approach to dialogue coherence assessment. We use dialogue
act prediction as an auxiliary task in a multi-task learning scenario to obtain
informative utterance representations for coherence assessment. Our approach
alleviates the need for explicit dialogue act labels during evaluation. The
results of our experiments show that our model substantially (more than 20
accuracy points) outperforms its strong competitors on the DailyDialogue
corpus, and performs on par with them on the SwitchBoard corpus for ranking
dialogues concerning their coherence.
Comment: Accepted at ACL 202
Optimization of Window Size for Calculating Semantic Coherence Within an Essay
Over the last fifty years, as the field of automated essay evaluation has progressed, several approaches have been proposed. Automated essay evaluation primarily focuses on three aspects: style, content, and semantics. The style and content attributes have received the most attention, while the semantics attribute has received less. To measure semantics, a smaller fraction of the essay (a window) is chosen, and the essay is broken into smaller portions using this window. The goal of this work is to determine a suitable window size for measuring semantic coherence between different parts of the essay with greater precision.
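The windowing idea can be sketched as follows. The bag-of-words cosine similarity between adjacent windows is a hypothetical stand-in for whatever semantic similarity measure the paper actually uses; only the split-into-windows step comes from the abstract:

```python
import math
from collections import Counter

def window_coherence(words, window_size):
    """Split a token list into fixed-size windows and average the cosine
    similarity of adjacent windows' bag-of-words vectors. The window_size
    parameter is the quantity the paper seeks to optimize."""
    windows = [words[i:i + window_size]
               for i in range(0, len(words), window_size)]

    def cos(a, b):
        ca, cb = Counter(a), Counter(b)
        dot = sum(ca[w] * cb[w] for w in ca)
        na = math.sqrt(sum(v * v for v in ca.values()))
        nb = math.sqrt(sum(v * v for v in cb.values()))
        return dot / (na * nb) if na and nb else 0.0

    sims = [cos(w1, w2) for w1, w2 in zip(windows, windows[1:])]
    return sum(sims) / len(sims) if sims else 0.0
```

Sweeping window_size over a scored essay corpus and checking which size best correlates with human judgments would be one way to operationalize the optimization the paper describes.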