5 research outputs found
Handling non-compositionality in multilingual CNLs
In this paper, we describe methods for handling multilingual
non-compositional constructions in the framework of GF. We specifically look at
methods to detect and extract non-compositional phrases from parallel texts and
propose methods to handle such constructions in GF grammars. We expect that the
methods to handle non-compositional constructions will enrich CNLs by providing
more flexibility in the design of controlled languages. We look at two specific
use cases of non-compositional constructions: a general-purpose method to
detect and extract multilingual multiword expressions and a procedure to
identify nominal compounds in German. We evaluate our procedure for multiword
expressions by performing a qualitative analysis of the results. For the
experiments on nominal compounds, we incorporate the detected compounds in a
full SMT pipeline and evaluate the impact of our method in machine translation
process.Comment: CNL workshop in COLING 201
Recommended from our members
Inferring Inferences: Relational Propositions for Argument Mining
Inferential reasoning is an essential feature of argumentation. Therefore, a method for mining discourse for inferential structures would be of value for argument analysis and assessment. The logic of relational propositions is a procedure for rendering texts as expressions in propositional logic directly from their rhetorical structures. From rhetorical structures, relational propositions are defined, and from these propositions, logical expressions are then generated. There are, however, unsettled issues associated with Rhetorical Structure Theory (RST), some of which are problematic for inference mining. This paper takes a deep dive into some of these issues, with the aim of elucidating the problems and providing guidance for how they may be resolved
Passage retrieval in legal texts
[EN] Legal texts usually comprise many kinds of texts, such as contracts, patents and treaties. These texts usually include a huge quantity of unstructured information written in natural language. Thanks to automatic analysis and Information Retrieval (IR) techniques, it is possible to filter out information that is not relevant and, therefore, to reduce the amount of documents that users need to browse to find the information they are looking for. In this paper we adapted the JIRS passage retrieval system to work with three kinds of legal texts: treaties, patents and contracts, studying the issues related with the processing of this kind of information. In particular, we studied how a passage retrieval system might be linked up to automated analysis based on logic and algebraic programming for the detection of conflicts in contracts. In our set-up, a contract is translated into formal clauses, which are analysed by means of a model checking tool; then, the passage retrieval system is used to extract conflicting sentences from the original contract text. © 2011 Elsevier Inc. All rights reserved.We thank the MICINN (Plan I+D+i) TEXT-ENTERPRISE 2.0: (TIN2009-13391-C04-03) research project. The work of the
second author has been possible thanks to a scholarship funded by Maat Gknowledge in the framework of the project with
the Universidad PolitĂ©cnica de Valencia MĂłdulo de servicios semánticos de la plataforma GRosso, P.; Correa GarcĂa, S.; Buscaldi, D. (2011). Passage retrieval in legal texts. Journal of Logic and Algebraic Programming. 80(3-5):139-153. doi:10.1016/j.jlap.2011.02.001S139153803-
Text complexity and text simplification in the crisis management domain
Due to the fact that emergency situations can lead to substantial losses, both financial and in terms of human lives, it is essential that texts used in a crisis situation be clearly understandable. This thesis is concerned with the study of the complexity of the crisis management sub-language and with methods to produce new, clear texts and to rewrite pre-existing crisis management documents which are too complex to be understood. By doing this, this interdisciplinary study makes several contributions to the crisis management field. First, it contributes to the knowledge of the complexity of the texts used in the domain, by analysing the presence of a set of written language complexity issues derived from the psycholinguistic literature in a novel corpus of crisis management documents. Second, since the text complexity analysis shows that crisis management documents indeed exhibit high numbers of text complexity issues, the thesis adapts to the English language controlled language writing guidelines which, when applied to the crisis management language, reduce its complexity and ambiguity, leading to clear text documents. Third, since low quality of communication can have fatal consequences in emergency situations, the proposed controlled language guidelines and a set of texts which were re-written according to them are evaluated from multiple points of view. In order to achieve that, the thesis both applies existing evaluation approaches and develops new methods which are more appropriate for the task. These are used in two evaluation experiments – evaluation on extrinsic tasks and evaluation of users’ acceptability. The evaluations on extrinsic tasks (evaluating the impact of the controlled language on text complexity, reading comprehension under stress, manual translation, and machine translation tasks) Text Complexity and Text Simplification in the Crisis Management domain 4 show a positive impact of the controlled language on simplified documents and thus ensure the quality of the resource. The evaluation of users’ acceptability contributes additional findings about manual simplification and helps to determine directions for future implementation. The thesis also gives insight into reading comprehension, machine translation, and cross-language adaptability, and provides original contributions to machine translation, controlled languages, and natural language generation evaluation techniques, which make it valuable for several scientific fields, including Linguistics, Psycholinguistics, and a number of different sub-fields of NLP.EThOS - Electronic Theses Online ServiceGBUnited Kingdo