Identifying Signs of Syntactic Complexity for Rule-Based Sentence Simplification
This article presents a new method to automatically simplify English sentences. The approach is designed to reduce the number of compound clauses and nominally bound relative clauses in input sentences. The article provides an overview of a corpus annotated with information about various explicit signs of syntactic complexity and describes the two major components of a sentence simplification method that works by exploiting information on the signs occurring in the sentences of a text. The first component is a sign tagger which automatically classifies signs in accordance with the annotation scheme used to annotate the corpus. The second component is an iterative rule-based sentence transformation tool. Exploiting the sign tagger in conjunction with other NLP components, the sentence transformation tool automatically rewrites long sentences containing compound clauses and nominally bound relative clauses as sequences of shorter single-clause sentences. Evaluation of the different components reveals acceptable performance in rewriting sentences containing compound clauses but less accuracy when rewriting sentences containing nominally bound relative clauses. A detailed error analysis revealed that the major sources of error include inaccurate sign tagging, the relatively limited coverage of the rules used to rewrite sentences, and an inability to discriminate between various subtypes of clause coordination. Despite this, the system performed well in comparison with two baselines. This finding was reinforced by automatic estimations of the readability of system output and by surveys of readers’ opinions about the accuracy, accessibility, and meaning of this output.
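The tag-then-rewrite loop described above can be sketched in a few lines. The sign tagger and the single rewrite rule below are toy stand-ins (a lexical lookup and a split at the first coordinator), not the article's trained tagger or rule set:

```python
# Minimal sketch of an iterative sign-based simplification loop.
# tag_signs() and rewrite_once() are hypothetical simplifications of the
# components described in the abstract, not the authors' implementation.

def tag_signs(sentence):
    """Toy sign tagger: mark coordinators that may link clauses."""
    tokens = sentence.rstrip(".").split()
    return [(i, t) for i, t in enumerate(tokens) if t.lower() in {"and", "but"}]

def rewrite_once(sentence):
    """Split the sentence at the first clause-linking sign, if any."""
    signs = tag_signs(sentence)
    if not signs:
        return None  # nothing left to simplify
    i, _ = signs[0]
    tokens = sentence.rstrip(".").split()
    left, right = tokens[:i], tokens[i + 1:]
    return [" ".join(left) + ".", " ".join(right).capitalize() + "."]

def simplify(sentence):
    """Apply the rewrite scheme repeatedly until no sign remains."""
    queue, result = [sentence], []
    while queue:
        s = queue.pop(0)
        parts = rewrite_once(s)
        if parts is None:
            result.append(s)
        else:
            queue = parts + queue
    return result

print(simplify("The tagger labels signs and the rules rewrite clauses."))
```

Because the loop re-queues every rewritten part, a sentence with several coordinated clauses is reduced step by step, mirroring the iterative character of the method.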
Sentence Simplification for Text Processing
A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy.

Propositional density and syntactic complexity are two features of sentences which affect the ability of humans and machines to process them effectively. In this thesis, I present a new approach to automatic sentence simplification which processes sentences containing compound clauses and complex noun phrases (NPs) and converts them into sequences of simple sentences which contain fewer of these constituents and have reduced per-sentence propositional density and syntactic complexity.

My overall approach is iterative and relies on both machine learning and handcrafted rules. It implements a small set of sentence transformation schemes, each of which takes one sentence containing compound clauses or complex NPs and converts it into one or two simplified sentences containing fewer of these constituents (Chapter 5). The iterative algorithm applies the schemes repeatedly and is able to simplify sentences which contain arbitrary numbers of compound clauses and complex NPs. The transformation schemes rely on automatic detection of these constituents, which may take a variety of forms in input sentences. In the thesis, I present two new shallow syntactic analysis methods which facilitate the detection process.

The first of these identifies various explicit signs of syntactic complexity in input sentences and classifies them according to their specific syntactic linking and bounding functions. I present the annotated resources used to train and evaluate this sign tagger (Chapter 2) and the machine learning method used to implement it (Chapter 3). The second syntactic analysis method exploits the sign tagger and identifies the spans of compound clauses and complex NPs in input sentences. In Chapter 4 of the thesis, I describe the development and evaluation of a machine learning approach performing this task. This chapter also presents a new annotated dataset supporting this activity.
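The distinction the sign tagger draws can be illustrated with a toy classifier. The two labels and the verb-based contextual cue below are invented simplifications of the thesis's tagging scheme, used only to show the shape of the task:

```python
# Toy illustration of sign classification: decide whether a coordinator
# links clauses or noun phrases from shallow context. The labels, the
# verb lexicon, and the cue are hypothetical stand-ins for the thesis's
# trained tagger.

VERBS = {"rewrites", "labels", "runs", "identifies"}  # toy verb lexicon

def classify_sign(tokens, i):
    """Label the sign at position i by its syntactic linking function."""
    right = tokens[i + 1:]
    # If a verb appears after the sign, treat it as linking two clauses;
    # otherwise assume it coordinates noun phrases.
    return "clause-linking" if any(t in VERBS for t in right) else "NP-linking"

tokens = "the tagger labels signs and the tool rewrites sentences".split()
print(classify_sign(tokens, 4))   # the "and" here joins two clauses

tokens2 = "compound clauses and complex noun phrases".split()
print(classify_sign(tokens2, 2))  # the "and" here joins two NPs
```

A real tagger would of course learn such cues from the annotated corpus rather than rely on a fixed lexicon; the example only shows why the same surface sign needs class-specific treatment.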
In the thesis, I present two implementations of my approach to sentence simplification. One of these exploits handcrafted rule activation patterns to detect the different parts of input sentences which are relevant to the simplification process. The other implementation uses my machine learning method to identify compound clauses and complex NPs for this purpose.

Intrinsic evaluation of the two implementations is presented in Chapter 6, together with a comparison of their performance with several baseline systems. The evaluation includes comparisons of system output with human-produced simplifications, automated estimations of the readability of system output, and surveys of human opinions on the grammaticality, accessibility, and meaning of automatically produced simplifications.

Chapter 7 presents an extrinsic evaluation of the sentence simplification method exploiting handcrafted rule activation patterns. The extrinsic evaluation involves three NLP tasks: multi-document summarisation, semantic role labelling, and information extraction. Finally, in Chapter 8, conclusions are drawn and directions for future research are considered.
An evaluation of syntactic simplification rules for people with autism
Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR) at the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2014). Syntactically complex sentences constitute an obstacle for some people with Autistic Spectrum Disorders. This paper evaluates a set of simplification rules specifically designed for tackling complex and compound sentences. In total, 127 different rules were developed for the rewriting of complex sentences and 56 for the rewriting of compound sentences. The evaluation assessed the accuracy of these rules individually and revealed that fully automatic conversion of these sentences into a more accessible form is not very reliable.
Type-driven semantic interpretation and feature dependencies in R-LFG
Once one has enriched LFG's formal machinery with the linear logic mechanisms needed for semantic interpretation as proposed by Dalrymple et al., it is natural to ask whether these make any existing components of LFG redundant. As Dalrymple and her colleagues note, LFG's f-structure completeness and coherence constraints fall out as a by-product of the linear logic machinery they propose for semantic interpretation, thus making those f-structure mechanisms redundant. Given that linear logic machinery or something like it is independently needed for semantic interpretation, it seems reasonable to explore the extent to which it is capable of handling feature structure constraints as well.

R-LFG represents the extreme position that all linguistically required feature structure dependencies can be captured by the resource-accounting machinery of a linear or similar logic independently needed for semantic interpretation, making LFG's unification machinery redundant. The goal is to show that LFG linguistic analyses can be expressed as clearly and perspicuously using the smaller set of mechanisms of R-LFG as they can using the much larger set of unification-based mechanisms in LFG: if this is the case then we will have shown that positing these extra f-structure mechanisms is not linguistically warranted.

Comment: 30 pages, to appear in the ``Glue Language'' volume edited by Dalrymple, uses tree-dvips, ipa, epic, eepic, fullnam
Cognitive constraints and island effects
Competence-based theories of island effects play a central role in generative grammar, yet the graded nature of many syntactic islands has never been properly accounted for. Categorical syntactic accounts of island effects have persisted in spite of a wealth of data suggesting that island effects are not categorical in nature and that nonstructural manipulations that leave island structures intact can radically alter judgments of island violations. We argue here, building on work by Paul Deane, Robert Kluender, and others, that processing factors have the potential to account for this otherwise unexplained variation in acceptability judgments.
We report the results of self-paced reading experiments and controlled acceptability studies that explore the relationship between processing costs and judgments of acceptability. In each of the three self-paced reading studies, the data indicate that the processing cost of different types of island violations can be significantly reduced to a degree comparable to that of nonisland filler-gap constructions by manipulating a single nonstructural factor. Moreover, this reduction in processing cost is accompanied by significant improvements in acceptability. This evidence favors the hypothesis that island-violating constructions involve numerous processing pressures that aggregate to drive processing difficulty above a threshold, resulting in unacceptability. We examine the implications of these findings for the grammar of filler-gap dependencies.
Placing pauses in read spoken Spanish: a model and an algorithm
The purpose of this work is to describe the appearance and location of typographically unmarked pauses in any Spanish text to be read. An experiment is designed to derive pause location from natural speech: results show that Intonation Group length constraints guide the appearance of pauses, which are placed depending on syntactic information. A rule-based algorithm is then developed to place pauses automatically, and its performance is tested by means of qualitative tests. The evaluation shows that the system adequately places pauses in read texts, since it predicts 81% of orthographically unmarked pauses; when pauses associated with punctuation signs are included, the percentage of correct prediction increases to 92%.
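The two ingredients the abstract names, a length constraint on Intonation Groups and syntactically informed placement, can be combined in a short sketch. The length threshold and the boundary-word list below are invented illustrative values, not the paper's rules:

```python
# Hedged sketch of rule-based pause placement: insert a pause marker at
# a syntactic boundary once the current intonation group grows too long.
# MAX_GROUP_LEN and BOUNDARY_WORDS are assumptions for illustration only.

MAX_GROUP_LEN = 5  # assumed maximum intonation-group length, in words

BOUNDARY_WORDS = {"que", "porque", "cuando", "y"}  # toy syntactic cues

def place_pauses(words):
    """Insert a pause marker '||' before a boundary word whenever the
    current group has reached the length limit."""
    out, group_len = [], 0
    for w in words:
        if group_len >= MAX_GROUP_LEN and w in BOUNDARY_WORDS:
            out.append("||")
            group_len = 0
        out.append(w)
        group_len += 1
    return out

words = "el sistema predice las pausas cuando el texto no las marca".split()
print(" ".join(place_pauses(words)))
```

The design choice mirrors the abstract's finding: length pressure determines *that* a pause appears, while syntax determines *where* it lands.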
Proceedings of QG2010: The Third Workshop on Question Generation
These are the peer-reviewed proceedings of "QG2010, The Third Workshop on Question Generation". The workshop included a special track for "QGSTEC2010: The First Question Generation Shared Task and Evaluation Challenge".
QG2010 was held as part of The Tenth International Conference on Intelligent Tutoring Systems (ITS2010).
Sentence simplification for semantic role labelling and information extraction
In this paper, we report on the extrinsic evaluation of an automatic sentence simplification method with respect to two NLP tasks: semantic role labelling (SRL) and information extraction (IE). The paper begins with our observation of challenges in the intrinsic evaluation of sentence simplification systems, which motivates the use of extrinsic evaluation of these systems with respect to other NLP tasks. We describe the two NLP systems and the test data used in the extrinsic evaluation, and present arguments and evidence motivating the integration of a sentence simplification step as a means of improving the accuracy of these systems. Our evaluation reveals that their performance is improved by the simplification step: the SRL system is better able to assign semantic roles to the majority of the arguments of verbs and the IE system is better able to identify fillers for all IE template slots.
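The integration the abstract argues for, simplification as a preprocessing stage ahead of a downstream extractor, can be sketched minimally. Both stages below are toy stand-ins, not the systems evaluated in the paper:

```python
# Illustrative pipeline: simplify first, then run a downstream extractor
# on each resulting single-clause sentence. Both functions are
# hypothetical toy stages for illustration.

def split_compound(text):
    """Toy simplifier: split a compound sentence on ' and '."""
    return [s.strip().rstrip(".") + "." for s in text.split(" and ")]

def extract_subject_verb(sentence):
    """Toy IE stage: read the first two words as subject and verb."""
    words = sentence.rstrip(".").split()
    return (words[0], words[1])

text = "Alice sings and Bob dances."
records = [extract_subject_verb(s) for s in split_compound(text)]
print(records)
```

Run on the raw compound sentence, the toy extractor would capture only the first clause's subject-verb pair; after simplification it yields one record per clause, which is the effect the extrinsic evaluation measures on real SRL and IE systems.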
Intelligent text processing to help readers with autism
© 2018, Springer International Publishing AG. Autistic Spectrum Disorder (ASD) is a neurodevelopmental disorder which has a life-long impact on the lives of people diagnosed with the condition. In many cases, people with ASD are unable to derive the gist or meaning of written documents due to their inability to process complex sentences, understand non-literal text, and understand uncommon and technical terms. This paper presents FIRST, an innovative project which developed language technology (LT) to make documents more accessible to people with ASD. The project has produced a powerful editor which enables carers of people with ASD to prepare texts suitable for this population. Assessment of the texts generated using the editor showed that they are no less readable than those generated more slowly through onerous unaided conversion, and were significantly more readable than the originals. Evaluation of the tool shows that it can have a positive impact on the lives of people with ASD.