Identifying Signs of Syntactic Complexity for Rule-Based Sentence Simplification
This article presents a new method to automatically simplify English sentences. The approach is designed to reduce the number of compound clauses and nominally bound relative clauses in input sentences. The article provides an overview of a corpus annotated with information about various explicit signs of syntactic complexity and describes the two major components of a sentence simplification method that works by exploiting information on the signs occurring in the sentences of a text. The first component is a sign tagger which automatically classifies signs in accordance with the annotation scheme used to annotate the corpus. The second component is an iterative rule-based sentence transformation tool. Exploiting the sign tagger in conjunction with other NLP components, the sentence transformation tool automatically rewrites long sentences containing compound clauses and nominally bound relative clauses as sequences of shorter single-clause sentences. Evaluation of the different components reveals acceptable performance in rewriting sentences containing compound clauses but less accuracy when rewriting sentences containing nominally bound relative clauses. A detailed error analysis revealed that the major sources of error include inaccurate sign tagging, the relatively limited coverage of the rules used to rewrite sentences, and an inability to discriminate between various subtypes of clause coordination. Despite this, the system performed well in comparison with two baselines. This finding was reinforced by automatic estimations of the readability of system output and by surveys of readers’ opinions about the accuracy, accessibility, and meaning of this output.
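The tag-then-rewrite loop described above can be sketched in a few lines. The sign tagger and the single rewrite rule below are toy stand-ins (a lexical lookup and a split at the first coordinator), not the article's trained tagger or rule set:

```python
# Minimal sketch of an iterative sign-based simplification loop.
# tag_signs() and rewrite_once() are hypothetical simplifications of the
# components described in the abstract, not the authors' implementation.

def tag_signs(sentence):
    """Toy sign tagger: mark coordinators that may link clauses."""
    tokens = sentence.rstrip(".").split()
    return [(i, t) for i, t in enumerate(tokens) if t.lower() in {"and", "but"}]

def rewrite_once(sentence):
    """Split the sentence at the first clause-linking sign, if any."""
    signs = tag_signs(sentence)
    if not signs:
        return None  # nothing left to simplify
    i, _ = signs[0]
    tokens = sentence.rstrip(".").split()
    left, right = tokens[:i], tokens[i + 1:]
    return [" ".join(left) + ".", " ".join(right).capitalize() + "."]

def simplify(sentence):
    """Apply the rewrite scheme repeatedly until no sign remains."""
    queue, result = [sentence], []
    while queue:
        s = queue.pop(0)
        parts = rewrite_once(s)
        if parts is None:
            result.append(s)
        else:
            queue = parts + queue
    return result

print(simplify("The tagger labels signs and the rules rewrite clauses."))
```

Because the loop re-queues every rewritten part, a sentence with several coordinated clauses is reduced step by step, mirroring the iterative character of the method.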
Sentence Simplification for Text Processing
A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy.

Propositional density and syntactic complexity are two features of sentences which affect the ability of humans and machines to process them effectively. In this thesis, I present a new approach to automatic sentence simplification which processes sentences containing compound clauses and complex noun phrases (NPs) and converts them into sequences of simple sentences which contain fewer of these constituents and have reduced per-sentence propositional density and syntactic complexity.

My overall approach is iterative and relies on both machine learning and handcrafted rules. It implements a small set of sentence transformation schemes, each of which takes one sentence containing compound clauses or complex NPs and converts it into one or two simplified sentences containing fewer of these constituents (Chapter 5). The iterative algorithm applies the schemes repeatedly and is able to simplify sentences which contain arbitrary numbers of compound clauses and complex NPs. The transformation schemes rely on automatic detection of these constituents, which may take a variety of forms in input sentences. In the thesis, I present two new shallow syntactic analysis methods which facilitate the detection process.

The first of these identifies various explicit signs of syntactic complexity in input sentences and classifies them according to their specific syntactic linking and bounding functions. I present the annotated resources used to train and evaluate this sign tagger (Chapter 2) and the machine learning method used to implement it (Chapter 3). The second syntactic analysis method exploits the sign tagger and identifies the spans of compound clauses and complex NPs in input sentences. In Chapter 4 of the thesis, I describe the development and evaluation of a machine learning approach performing this task. This chapter also presents a new annotated dataset supporting this activity.
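The distinction the sign tagger draws can be illustrated with a toy classifier. The two labels and the verb-based contextual cue below are invented simplifications of the thesis's tagging scheme, used only to show the shape of the task:

```python
# Toy illustration of sign classification: decide whether a coordinator
# links clauses or noun phrases from shallow context. The labels, the
# verb lexicon, and the cue are hypothetical stand-ins for the thesis's
# trained tagger.

VERBS = {"rewrites", "labels", "runs", "identifies"}  # toy verb lexicon

def classify_sign(tokens, i):
    """Label the sign at position i by its syntactic linking function."""
    right = tokens[i + 1:]
    # If a verb appears after the sign, treat it as linking two clauses;
    # otherwise assume it coordinates noun phrases.
    return "clause-linking" if any(t in VERBS for t in right) else "NP-linking"

tokens = "the tagger labels signs and the tool rewrites sentences".split()
print(classify_sign(tokens, 4))   # the "and" here joins two clauses

tokens2 = "compound clauses and complex noun phrases".split()
print(classify_sign(tokens2, 2))  # the "and" here joins two NPs
```

A real tagger would of course learn such cues from the annotated corpus rather than rely on a fixed lexicon; the example only shows why the same surface sign needs class-specific treatment.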
In the thesis, I present two implementations of my approach to sentence simplification. One of these exploits handcrafted rule activation patterns to detect the different parts of input sentences which are relevant to the simplification process. The other implementation uses my machine learning method to identify compound clauses and complex NPs for this purpose.

Intrinsic evaluation of the two implementations is presented in Chapter 6, together with a comparison of their performance with several baseline systems. The evaluation includes comparisons of system output with human-produced simplifications, automated estimations of the readability of system output, and surveys of human opinions on the grammaticality, accessibility, and meaning of automatically produced simplifications.

Chapter 7 presents an extrinsic evaluation of the sentence simplification method exploiting handcrafted rule activation patterns. The extrinsic evaluation involves three NLP tasks: multi-document summarisation, semantic role labelling, and information extraction. Finally, in Chapter 8, conclusions are drawn and directions for future research are considered.
An evaluation of syntactic simplification rules for people with autism
Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR) at the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2014). Syntactically complex sentences constitute an obstacle for some people with Autistic Spectrum Disorders. This paper evaluates a set of simplification rules specifically designed for tackling complex and compound sentences. In total, 127 different rules were developed for the rewriting of complex sentences and 56 for the rewriting of compound sentences. The evaluation assessed the accuracy of these rules individually and revealed that fully automatic conversion of these sentences into a more accessible form is not very reliable.
Type-driven semantic interpretation and feature dependencies in R-LFG
Once one has enriched LFG's formal machinery with the linear logic mechanisms needed for semantic interpretation as proposed by Dalrymple et al., it is natural to ask whether these make any existing components of LFG redundant. As Dalrymple and her colleagues note, LFG's f-structure completeness and coherence constraints fall out as a by-product of the linear logic machinery they propose for semantic interpretation, thus making those f-structure mechanisms redundant. Given that linear logic machinery or something like it is independently needed for semantic interpretation, it seems reasonable to explore the extent to which it is capable of handling feature structure constraints as well.

R-LFG represents the extreme position that all linguistically required feature structure dependencies can be captured by the resource-accounting machinery of a linear or similar logic independently needed for semantic interpretation, making LFG's unification machinery redundant. The goal is to show that LFG linguistic analyses can be expressed as clearly and perspicuously using the smaller set of mechanisms of R-LFG as they can using the much larger set of unification-based mechanisms in LFG: if this is the case then we will have shown that positing these extra f-structure mechanisms is not linguistically warranted.

Comment: 30 pages, to appear in the ``Glue Language'' volume edited by Dalrymple, uses tree-dvips, ipa, epic, eepic, fullnam
Cognitive constraints and island effects
Competence-based theories of island effects play a central role in generative grammar, yet the graded nature of many syntactic islands has never been properly accounted for. Categorical syntactic accounts of island effects have persisted in spite of a wealth of data suggesting that island effects are not categorical in nature and that nonstructural manipulations that leave island structures intact can radically alter judgments of island violations. We argue here, building on work by Paul Deane, Robert Kluender, and others, that processing factors have the potential to account for this otherwise unexplained variation in acceptability judgments.
We report the results of self-paced reading experiments and controlled acceptability studies that explore the relationship between processing costs and judgments of acceptability. In each of the three self-paced reading studies, the data indicate that the processing cost of different types of island violations can be significantly reduced to a degree comparable to that of nonisland filler-gap constructions by manipulating a single nonstructural factor. Moreover, this reduction in processing cost is accompanied by significant improvements in acceptability. This evidence favors the hypothesis that island-violating constructions involve numerous processing pressures that aggregate to drive processing difficulty above a threshold, resulting in unacceptability. We examine the implications of these findings for the grammar of filler-gap dependencies.
Placing pauses in read spoken Spanish: a model and an algorithm
The purpose of this work is to describe the appearance and location of typographically unmarked pauses in any Spanish text to be read. An experiment is designed to derive pause location from natural speech: results show that Intonation Group length constraints guide the appearance of pauses, which are placed depending on syntactic information. A rule-based algorithm is then developed to place pauses automatically, and its performance is tested by means of qualitative tests. The evaluation shows that the system adequately places pauses in read texts, since it predicts 81% of orthographically unmarked pauses; when pauses associated with punctuation signs are included, the percentage of correct prediction increases to 92%.
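The two ingredients the abstract names, a length constraint on Intonation Groups and syntactically informed placement, can be combined in a short sketch. The length threshold and the boundary-word list below are invented illustrative values, not the paper's rules:

```python
# Hedged sketch of rule-based pause placement: insert a pause marker at
# a syntactic boundary once the current intonation group grows too long.
# MAX_GROUP_LEN and BOUNDARY_WORDS are assumptions for illustration only.

MAX_GROUP_LEN = 5  # assumed maximum intonation-group length, in words

BOUNDARY_WORDS = {"que", "porque", "cuando", "y"}  # toy syntactic cues

def place_pauses(words):
    """Insert a pause marker '||' before a boundary word whenever the
    current group has reached the length limit."""
    out, group_len = [], 0
    for w in words:
        if group_len >= MAX_GROUP_LEN and w in BOUNDARY_WORDS:
            out.append("||")
            group_len = 0
        out.append(w)
        group_len += 1
    return out

words = "el sistema predice las pausas cuando el texto no las marca".split()
print(" ".join(place_pauses(words)))
```

The design choice mirrors the abstract's finding: length pressure determines *that* a pause appears, while syntax determines *where* it lands.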
Proceedings of QG2010: The Third Workshop on Question Generation
These are the peer-reviewed proceedings of "QG2010, The Third Workshop on Question Generation". The workshop included a special track for "QGSTEC2010: The First Question Generation Shared Task and Evaluation Challenge".
QG2010 was held as part of The Tenth International Conference on Intelligent Tutoring Systems (ITS2010).
Sentence simplification for semantic role labelling and information extraction
In this paper, we report on the extrinsic evaluation of an automatic sentence simplification method with respect to two NLP tasks: semantic role labelling (SRL) and information extraction (IE). The paper begins with our observation of challenges in the intrinsic evaluation of sentence simplification systems, which motivates the use of extrinsic evaluation of these systems with respect to other NLP tasks. We describe the two NLP systems and the test data used in the extrinsic evaluation, and present arguments and evidence motivating the integration of a sentence simplification step as a means of improving the accuracy of these systems. Our evaluation reveals that their performance is improved by the simplification step: the SRL system is better able to assign semantic roles to the majority of the arguments of verbs and the IE system is better able to identify fillers for all IE template slots.
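The integration the abstract argues for, simplification as a preprocessing stage ahead of a downstream extractor, can be sketched minimally. Both stages below are toy stand-ins, not the systems evaluated in the paper:

```python
# Illustrative pipeline: simplify first, then run a downstream extractor
# on each resulting single-clause sentence. Both functions are
# hypothetical toy stages for illustration.

def split_compound(text):
    """Toy simplifier: split a compound sentence on ' and '."""
    return [s.strip().rstrip(".") + "." for s in text.split(" and ")]

def extract_subject_verb(sentence):
    """Toy IE stage: read the first two words as subject and verb."""
    words = sentence.rstrip(".").split()
    return (words[0], words[1])

text = "Alice sings and Bob dances."
records = [extract_subject_verb(s) for s in split_compound(text)]
print(records)
```

Run on the raw compound sentence, the toy extractor would capture only the first clause's subject-verb pair; after simplification it yields one record per clause, which is the effect the extrinsic evaluation measures on real SRL and IE systems.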
Intelligent text processing to help readers with autism
© 2018, Springer International Publishing AG. Autistic Spectrum Disorder (ASD) is a neurodevelopmental disorder which has a life-long impact on the lives of people diagnosed with the condition. In many cases, people with ASD are unable to derive the gist or meaning of written documents due to their inability to process complex sentences, understand non-literal text, and understand uncommon and technical terms. This paper presents FIRST, an innovative project which developed language technology (LT) to make documents more accessible to people with ASD. The project has produced a powerful editor which enables carers of people with ASD to prepare texts suitable for this population. Assessment of the texts generated using the editor showed that they are no less readable than those generated more slowly through onerous unaided conversion, and were significantly more readable than the originals. Evaluation of the tool shows that it can have a positive impact on the lives of people with ASD.