15,864 research outputs found
Native and non-native speakers of English in summarising expository texts
This study examines how native and non-native English speakers summarise expository texts. It investigates whether there is any difference in quality between the summaries produced by two groups of students: native speakers of English, who acquired the language in early childhood and received their education (from kindergarten/grade 1 to high school) in English, and non-native speakers, who acquired the language in an ESL/EFL context. The sample consisted of seventy undergraduates from a private Malaysian university,
comprising thirty-five native and thirty-five non-native speakers of English. Data for the study include the students' summaries, responses to teacher and student questionnaires, and interviews with both teachers and students. The results of the study revealed a significant difference in the quality of the summaries of expository texts produced by native and non-native English speakers.
Extracting Formal Models from Normative Texts
We are concerned with the analysis of normative texts - documents based on
the deontic notions of obligation, permission, and prohibition. Our goal is to
make queries about these notions and verify that a text satisfies certain
properties concerning causality of actions and timing constraints. This
requires taking the original text and building a representation (model) of it
in a formal language, in our case the C-O Diagram formalism. We present an
experimental, semi-automatic aid that helps to bridge the gap between a
normative text in natural language and its C-O Diagram representation. Our
approach consists of using dependency structures obtained from the
state-of-the-art Stanford Parser, and applying our own rules and heuristics in
order to extract the relevant components. The result is a tabular data
structure where each sentence is split into suitable fields, which can then be
converted into a C-O Diagram. The process is not fully automatic however, and
some post-editing is generally required of the user. We apply our tool and
perform experiments on documents from different domains, and report an initial
evaluation of the accuracy and feasibility of our approach.
Comment: Extended version of conference paper at the 21st International Conference on Applications of Natural Language to Information Systems (NLDB 2016). arXiv admin note: substantial text overlap with arXiv:1607.0148
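The extraction step this abstract describes (splitting each normative sentence into fields keyed on its deontic modality) can be illustrated with a toy sketch. This is not the paper's Stanford-Parser-plus-heuristics pipeline; it is a hypothetical stand-in that pivots on the modal verb alone, handling only "shall", "shall not", and "may".

```python
def deontic_fields(sentence: str) -> dict:
    """Toy stand-in for parser-based extraction: split a normative
    sentence into agent / modality / action fields, using the modal
    verb as the pivot. Only 'shall (not)' and 'may' are handled."""
    words = sentence.rstrip(".").split()
    # Order matters: check 'shall not' before 'shall'.
    for modality, marker in [("prohibition", ["shall", "not"]),
                             ("obligation", ["shall"]),
                             ("permission", ["may"])]:
        for i in range(len(words) - len(marker) + 1):
            if [w.lower() for w in words[i:i + len(marker)]] == marker:
                return {"agent": " ".join(words[:i]),
                        "modality": modality,
                        "action": " ".join(words[i + len(marker):])}
    raise ValueError("no deontic marker found")

print(deontic_fields("The tenant shall not sublet the premises."))
# {'agent': 'The tenant', 'modality': 'prohibition', 'action': 'sublet the premises'}
```

A real pipeline, as the abstract notes, works over dependency structures rather than surface word order, which is why post-editing of the resulting table is still needed.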
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.
Comment: Technical Report, Natural Language Processing Group, Department of Informatics, Athens University of Economics and Business, Greece, 201
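The abstract's observation that paraphrasing can be seen as bidirectional textual entailment is easy to make concrete. In the sketch below, `entails` is a hypothetical placeholder (a crude word-overlap test, not any method from the survey); the point is only the two-directional composition.

```python
def entails(premise: str, hypothesis: str) -> bool:
    """Hypothetical entailment recognizer: True if a reader of `premise`
    would most likely infer `hypothesis`. A real system would use
    lexical, syntactic, or learned features; this word-overlap test is
    a stand-in for illustration only."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(h & p) / len(h) >= 0.8  # most hypothesis words covered

def is_paraphrase(a: str, b: str) -> bool:
    """Paraphrase = textual entailment in both directions."""
    return entails(a, b) and entails(b, a)

print(is_paraphrase("the cat sat on the mat", "on the mat the cat sat"))  # True
print(is_paraphrase("the cat sat on the mat", "the cat sat"))  # False: entails one way only
```

The second example shows why both directions matter: the longer sentence entails the shorter one, but not vice versa, so the pair is entailment, not paraphrase.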
Evaluating prose style transfer with the Bible
In the prose style transfer task a system, provided with text input and a
target prose style, produces output which preserves the meaning of the input
text but alters the style. These systems require parallel data for evaluation
of results and usually make use of parallel data for training. Currently, there
are few publicly available corpora for this task. In this work, we identify a
high-quality source of aligned, stylistically distinct text in different
versions of the Bible. We provide a standardized split, into training,
development and testing data, of the public domain versions in our corpus. This
corpus is highly parallel since many Bible versions are included. Sentences are
aligned due to the presence of chapter and verse numbers within all versions of
the text. In addition to the corpus, we present the results, as measured by the
BLEU and PINC metrics, of several models trained on our data which can serve as
baselines for future research. While we present these data as a style transfer
corpus, we believe that it is of unmatched quality and may be useful for other
natural language tasks as well.
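Of the two metrics the abstract reports, PINC is simple enough to sketch in full: it rewards lexical dissimilarity from the source by averaging, over n = 1..4, the fraction of candidate n-grams not found in the source. The implementation below is a minimal pure-Python version of that definition, not the paper's evaluation code.

```python
def ngrams(tokens, n):
    """Set of n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def pinc(source: str, candidate: str, max_n: int = 4) -> float:
    """PINC score: average fraction of candidate n-grams NOT present in
    the source, for n = 1..max_n. Higher means more lexical change;
    it is used alongside BLEU, which measures closeness to a reference."""
    src, cand = source.split(), candidate.split()
    scores = []
    for n in range(1, max_n + 1):
        c = ngrams(cand, n)
        if not c:  # candidate shorter than n tokens
            break
        s = ngrams(src, n)
        scores.append(1 - len(c & s) / len(c))
    return sum(scores) / len(scores) if scores else 0.0
```

Copying the source verbatim scores 0.0; a candidate sharing no n-grams with the source scores 1.0, which is why PINC is paired with a meaning-preservation metric like BLEU.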
Comparison and Adaptation of Automatic Evaluation Metrics for Quality Assessment of Re-Speaking
Re-speaking is a mechanism for obtaining high quality subtitles for use in
live broadcast and other public events. Because it relies on humans performing
the actual re-speaking, the task of estimating the quality of the results is
non-trivial. Most organisations rely on humans to perform the actual quality
assessment, but purely automatic methods have been developed for other similar
problems, like Machine Translation. This paper compares several of these
methods: BLEU, EBLEU, NIST, METEOR, METEOR-PL, TER and RIBES. These are then
matched against the human-derived NER metric, commonly used in re-speaking.
Comment: arXiv admin note: text overlap with arXiv:1509.0908
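For context, the NER metric the abstract refers to is, as commonly described for live-subtitling assessment, a word-accuracy formula in which human assessors count and weight two error types. The sketch below assumes that standard formulation; the error counts themselves come from manual assessment, which is what the automatic metrics above try to approximate.

```python
def ner_score(n_words: int, edition_errors: float,
              recognition_errors: float) -> float:
    """NER accuracy as commonly described for re-speaking assessment:
    NER = (N - E - R) / N * 100, where N is the number of words in the
    subtitles, E the (typically severity-weighted) edition errors
    introduced by the re-speaker, and R the recognition errors made by
    the speech recognizer."""
    if n_words <= 0:
        raise ValueError("N must be positive")
    return (n_words - edition_errors - recognition_errors) / n_words * 100

print(ner_score(200, 2, 2))  # 98.0
```

Because E and R require human judgement of error severity, NER is labour-intensive, which motivates the comparison with fully automatic MT metrics.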
Plagiarism in philosophy: prevention better than cure
[Introduction]
'Plagiarism more common than thought in student essays' would
make a good headline. Recent research suggests that students
admit to much more plagiarism and other forms of cheating than
teachers generally suspect, and it is widely believed that the problem is
increasing as a result of the internet. The solution is to use a range of
techniques to get the thought back into student essay writing, and to take
more active steps to spot when this has not happened.
Researching the Use of Dictionary by Students of English Literature Department at Jenderal Soedirman University
Dictionaries are recommended as a useful tool when learning EFL because they give
information about many aspects of the language, such as phonology, morphology, syntax,
and semantics. Nevertheless, EFL practitioners rarely pay attention to the dictionaries
used by students. This article investigates the types of dictionaries used, the frequency
of dictionary use, and the lexical information examined. Respondents were students of the
English Literature Department, Jenderal Soedirman University. Data were taken from
questionnaires. The results showed that students had not received any special instruction
on how to make full use of their dictionaries. The respondents favoured bilingual
dictionaries over monolingual dictionaries, and considered pronunciation, usage, and
examples to be of secondary importance.
Specifying Logic Programs in Controlled Natural Language
Writing specifications for computer programs is not easy since one has to
take into account the disparate conceptual worlds of the application domain and
of software development. To bridge this conceptual gap we propose controlled
natural language as a declarative and application-specific specification
language. Controlled natural language is a subset of natural language that can
be accurately and efficiently processed by a computer, but is expressive enough
to allow natural usage by non-specialists. Specifications in controlled natural
language are automatically translated into Prolog clauses, hence become formal
and executable. The translation uses a definite clause grammar (DCG) enhanced
by feature structures. Inter-text references of the specification, e.g.
anaphora, are resolved with the help of discourse representation theory (DRT).
The generated Prolog clauses are added to a knowledge base. We have implemented
a prototypical specification system that successfully processes the
specification of a simple automated teller machine.Comment: 16 pages, compressed, uuencoded Postscript, published in Proceedings
CLNLP 95, COMPULOGNET/ELSNET/EAGLES Workshop on Computational Logic for
Natural Language Processing, Edinburgh, April 3-5, 199
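The translation this abstract describes (controlled English to executable Prolog clauses) can be illustrated with a toy sketch. This is not the paper's DCG-with-feature-structures grammar; it is a hypothetical regex stand-in covering a single sentence pattern, just to show what "specifications become formal and executable" means concretely.

```python
import re

def to_clause(sentence: str) -> str:
    """Translate one controlled-English pattern into a Prolog clause
    string. Toy stand-in for a DCG-based translation: only the pattern
    'Every X is a/an Y.' is handled; anything else is rejected, which
    mirrors how controlled languages restrict input up front."""
    m = re.fullmatch(r"Every (\w+) is an? (\w+)\.", sentence)
    if not m:
        raise ValueError("sentence outside the controlled subset")
    x, y = m.group(1).lower(), m.group(2).lower()
    return f"{y}(X) :- {x}(X)."

print(to_clause("Every customer is a person."))  # person(X) :- customer(X).
```

In the real system the grammar covers far more constructions and resolves anaphora via DRT before emitting clauses into a knowledge base; the rejection of out-of-subset sentences, however, is exactly what makes a controlled natural language machine-processable.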