99,238 research outputs found
A Multivariate Study of T/V Forms in European Languages Based on a Parallel Corpus of Film Subtitles
The present study investigates the cross-linguistic differences in the use of so-called T/V forms (e.g. French tu and vous, German du and Sie, Russian ty and vy) in ten European languages from different language families and genera. These constraints represent an elusive object of investigation because they depend on a large number of subtle contextual features and social distinctions, which should be cross-linguistically matched. Film subtitles in different languages offer a convenient solution because the situations of communication between film characters can serve as comparative concepts. I selected more than two hundred contexts that contain the pronouns you and yourself in the original English versions, which are then coded for fifteen contextual variables that describe the Speaker and the Hearer, their relationships and different situational properties. The creators of subtitles in the other languages have to choose between T and V when translating from English, where the T/V distinction is not expressed grammatically. On the basis of these situations translated in ten languages, I perform multivariate analyses using the method of conditional inference trees in order to identify the most relevant contextual variables that constrain the T/V variation in each language
On the tree-transformation power of XSLT
XSLT is a standard rule-based programming language for expressing
transformations of XML data. The language is currently in transition from
version 1.0 to 2.0. In order to understand the computational consequences of
this transition, we restrict XSLT to its pure tree-transformation capabilities.
Under this focus, we observe that XSLT~1.0 was not yet a computationally
complete tree-transformation language: every 1.0 program can be implemented in
exponential time. A crucial new feature of version~2.0, however, which allows
nodesets over temporary trees, yields completeness. We provide a formal
operational semantics for XSLT programs, and establish confluence for this
semantics
Renaming Global Variables in C Mechanically Proved Correct
Most integrated development environments are shipped with refactoring tools.
However, their refactoring operations are often known to be unreliable. As a
consequence, developers have to test their code after applying an automatic
refactoring. In this article, we consider a refactoring operation (renaming of
global variables in C), and we prove that its core implementation preserves the
set of possible behaviors of transformed programs. That proof of correctness
relies on the operational semantics of C provided by CompCert C in Coq.Comment: In Proceedings VPT 2016, arXiv:1607.0183
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.Comment: Technical Report, Natural Language Processing Group, Department of
Informatics, Athens University of Economics and Business, Greece, 201
Generating varied narrative probability exercises
This paper presents Genpex, a system for automatic generation of narrative probability exercises. Generation of exercises in Genpex is done in two steps. First, the system creates a specification of a solvable probability problem, based on input from the user (a researcher or test developer) who selects a specific question type and a narrative context for the problem. Then, a text expressing the probability problem is generated. The user can tune the generated text by setting the values of some linguistic variation parameters. By varying the mathematical content of the exercise, its narrative context and the linguistic parameter settings, many different exercises can be produced. Here we focus on the natural language generation part of Genpex. After describing how the system works, we briefly present our first evaluation results, and discuss some aspects requiring further investigation
- …