224 research outputs found

    Models to represent linguistic linked data

    As the interest of the Semantic Web and computational linguistics communities in linguistic linked data (LLD) keeps increasing and the number of contributions that dwell on LLD rapidly grows, scholars (and linguists in particular) interested in the development of LLD resources sometimes find it difficult to determine which mechanism is suitable for their needs and which challenges have already been addressed. This review seeks to present the state of the art on the models, ontologies and their extensions to represent language resources as LLD by focusing on the nature of the linguistic content they aim to encode. Four basic groups of models are distinguished in this work: models to represent the main elements of lexical resources (group 1), vocabularies developed as extensions to models in group 1 and ontologies that provide more granularity on specific levels of linguistic analysis (group 2), catalogues of linguistic data categories (group 3) and other models such as corpora models or service-oriented ones (group 4). Contributions encompassed in these four groups are described, highlighting their reuse by the community and the modelling challenges that are still to be faced.

    Modelling OLIF frame with EAGLES/ISLE specifications: an interlingual approach

    FunGramKB is a lexico-conceptual knowledge base for NLP systems. The FunGramKB lexical model is basically derived from OLIF and enhanced with EAGLES/ISLE recommendations with the purpose of designing robust computational lexica. However, the FunGramKB interlingual approach takes a more cognitive view of lexical frames than the OLIF and EAGLES/ISLE proposals. The aim of this paper is to describe how this approach influences the way of conceiving lexical frames. Periñán Pascual, JC.; Arcas Túnez, F. (2008). Modelling OLIF frame with EAGLES/ISLE specifications: an interlingual approach. Procesamiento del Lenguaje Natural. (40):9-16. http://hdl.handle.net/10251/52126

    Creating multilingual data for various corpus-based approaches in the field of translation and interpreting

    Corpora are playing an increasingly important role in our multilingual society. High-quality parallel corpora are a preferred resource in the language engineering and linguistics communities. Nevertheless, the lack of sufficient and up-to-date parallel corpora, especially for narrow domains and poorly-resourced languages, is currently one of the major obstacles to further advancement across areas such as translation, language learning, and automatic and assisted translation. An alternative is the use of comparable corpora, which are easier and faster to compile. Corpora in general are extremely important for tasks such as translation, extraction, inter-linguistic comparisons and discoveries, and even for lexicographical resources. Their objectivity, reusability, multiplicity and applicability of uses, easy handling and quick access to large volumes of data are just some of their advantages over more limited resources such as thesauri or dictionaries. By way of example, new terms are coined on a daily basis and dictionaries cannot keep up with the rate at which they emerge. Accordingly, this research work aims at exploiting and developing new technologies and methods to better ascertain not only translators’ and interpreters’ needs, but also those of professionals and ordinary people in their daily tasks, such as corpora and terminology compilation and management. The main topics covered by this work relate to Computational Linguistics (CL), Natural Language Processing (NLP), Machine Translation (MT), Comparable Corpora, Distributional Similarity Measures (DSM), Terminology Extraction Tools (TET) and Terminology Management Tools (TMT). In particular, this work examines three main questions: 1) Is it possible to create a simpler and user-friendly comparable corpora compilation tool? 2) How to identify the most suitable TMT and TET for a given translation or interpreting task? 3) How to automatically assess and measure the internal degree of relatedness in comparable corpora? This work is composed of thirteen peer-reviewed scientific publications, which are included in Appendix A, while the methodology used and the results obtained in these studies are summarised in the main body of this document. Doctoral thesis defence date: 22 November 2019.

    Applying language technology to ontology-based querying : the OntoQuery project

    This paper addresses the issue of how language technology resources and components can be applied in ontology-based querying. In particular, it presents the approach to text and query analysis adopted in the Danish research project OntoQuery, where shallow syntactic analysis and ontology-based parsing are combined in order to identify nominal phrases (NPs) and assign them a semantic description. Semantic descriptions are used by the search engine to match queries against texts in a database, and a ranking of the texts retrieved is produced based on a domain ontology. This is intended to be a general methodology applicable to texts from different domains, including those relevant to cultural heritage, although OntoQuery has chosen nutrition as its first target domain. The paper focuses on the language technology aspects of the methodology, the ontology-based lexicon inherited from the SIMPLE project, and the development of the domain-specific ontology. The methodology is partly implemented in a prototype. Peer-reviewed.

    Interactive Visual Analysis of Translations

    This thesis is the result of a collaboration with the College of Arts and Humanities at Swansea University. The goal of this collaboration is to design novel visualization techniques to enable digital humanities scholars to explore and analyze parallel translations. To this end, Chapter 2 introduces the first survey of surveys on text visualization, which reviews all of the surveys and state-of-the-art reports on text visualization techniques, classifies them, provides recommendations, and discusses reported challenges. Following this, we present three visual interactive designs that support the typical digital humanities scholar’s workflow. In Chapter 4, we present VNLP, a visual, interactive design that enables users to explicitly observe the NLP pipeline processes and update the parameters at each processing stage. Chapter 5 presents AlignVis, a visual tool that provides a semi-automatic alignment framework to build a correspondence between multiple translations. It presents the results of using text similarity measurements and enables the user to create, verify, and edit alignments using a novel visual interface. Chapter 6 introduces TransVis, a novel visual design that supports comparison of multiple parallel translations. It incorporates customized mechanisms for rapid and interactive filtering and selection of a large number of German translations of Shakespeare’s Othello. All of the visual designs are evaluated using examples, detailed observations, case studies, and/or domain expert feedback from a specialist in modern and contemporary German literature and culture. Chapter 7 reports our collaborative experience and proposes a methodological workflow to guide such interdisciplinary research projects. This chapter also includes a summary of outcomes and lessons learned from our collaboration with the domain expert. Finally, Chapter 8 presents a summary of the thesis and future work directions.

    A Computational Lexicon and Representational Model for Arabic Multiword Expressions

    The phenomenon of multiword expressions (MWEs) is increasingly recognised as a serious and challenging issue that has attracted the attention of researchers in various language-related disciplines. Research in these many areas has emphasised the primary role of MWEs in the process of analysing and understanding language, particularly in the computational treatment of natural languages. Ignoring MWE knowledge in any NLP system reduces the possibility of achieving high precision outputs. However, despite the enormous wealth of MWE research and language resources available for English and some other languages, research on Arabic MWEs (AMWEs) still faces multiple challenges, particularly in key computational tasks such as extraction, identification, evaluation, language resource building, and lexical representations. This research aims to remedy this deficiency by extending knowledge of AMWEs and making noteworthy contributions to the existing literature in three related research areas on the way towards building a computational lexicon of AMWEs. First, this study develops a general understanding of AMWEs by establishing a detailed conceptual framework that includes a description of an adopted AMWE concept and its distinctive properties at multiple linguistic levels. Second, in the use of AMWE extraction and discovery tasks, the study employs a hybrid approach that combines knowledge-based and data-driven computational methods for discovering multiple types of AMWEs. Third, this thesis presents a representative system for AMWEs which consists of multilayer encoding of extensive linguistic descriptions. This project also paves the way for further in-depth AMWE-aware studies in NLP and linguistics to gain new insights into this complicated phenomenon in standard Arabic. 
The implications of this research are related to the vital role of the AMWE lexicon, as a new lexical resource, in the improvement of various ANLP tasks and the potential opportunities this lexicon provides for linguists to analyse and explore AMWE phenomena.

    The interaction of parsing rules and argument-predicate constructions: implications for the structure of the grammaticon in FunGramKB

    The Functional Grammar Knowledge Base (FunGramKB; Periñán-Pascual and Arcas-Túnez 2010) is a multipurpose lexico-conceptual knowledge base designed to be used in different Natural Language Processing (NLP) tasks. It is complemented with the ARTEMIS (Automatically Representing Text Meaning via an Interlingua-based System) application, a parsing device linguistically grounded on Role and Reference Grammar (RRG) that transduces natural language fragments into their corresponding grammatical and semantic structures. This paper unveils the different phases involved in its parsing routine, paying special attention to the treatment of argumental constructions. As an illustrative case, we will follow all the steps necessary to effectively parse a For-Benefactive structure within ARTEMIS. This methodology will reveal the necessity to distinguish between Kernel constructs and L1-constructions, since the latter involve a modification of the lexical template of the verb. Our definition of L1-constructions leads to the reorganization of the catalogue of FunGramKB L1-constructions, formerly based on Levin’s (1993) alternations. Accordingly, a rearrangement of the internal configuration of the L1-Constructicon within the Grammaticon is proposed.
    This work was carried out within the research project “Development of a virtual laboratory for the computational processing of language from a functional paradigm” (UNED), FF2014-53788-C3-1-P.
    Fumero Pérez, MDC.; Díaz Galán, A. (2017). The interaction of parsing rules and argument-predicate constructions: implications for the structure of the grammaticon in FunGramKB. Revista de Lingüística y Lenguas Aplicadas. 12:33-44. https://doi.org/10.4995/rlyla.2017.5406
    References:
    Boas, H. and Sag, I. (2012). Sign-Based Construction Grammar. Stanford, Cal.: CSLI Publications.
    Ferrari, G. (2004). State of the art in Computational Linguistics. Linguistics Today – Facing a Greater Challenge, 163. doi:10.1075/z.126.09fer
    Goldberg, A. (1995). Constructions: A construction grammar approach to argument structure. Chicago: University of Chicago Press.
    Goldberg, A. (2006). Constructions at Work: the Nature of Generalization in Language. Oxford: Oxford University Press.
    Levin, B. (1993). English Verb Classes and Alternations. A Preliminary Investigation. Chicago / London: University of Chicago Press.
    Luzondo-Oyón, A., & Ruiz de Mendoza-Ibáñez, F. J. (2015). Argument structure constructions in a Natural Language Processing environment. Language Sciences, 48, 70-89. doi:10.1016/j.langsci.2015.01.001
    Mairal Usón, R., & Ruiz de Mendoza Ibáñez, F. J. (2009). Levels of description and explanation in meaning construction. Deconstructing Constructions, 153-198. doi:10.1075/slcs.107.08lev
    Periñán-Pascual, C. (2012). "En defensa del Procesamiento del Lenguaje Natural Fundamentado en la Lingüística Teórica". ONOMÁZEIN, 26/2: 13-48.
    Periñán-Pascual, C. (2013). Towards a model of constructional meaning for natural language understanding. Linking Constructions into Functional Linguistics, 205-230. doi:10.1075/slcs.145.08per
    Periñán-Pascual, C. and Arcas-Túnez, F. (2010). The Architecture of FunGramKB. In Proceedings of the 7th International Conference on Language Resources and Evaluation, 2667-2674. Malta: European Language Resources Association.
    Periñán-Pascual, C. and Arcas-Túnez, F. (2014). The implementation of the CLS constructor in ARTEMIS. In Nolan, B. and Periñán-Pascual, C. (eds.) Language Processing and Grammars: the role of functionally oriented computational models. Amsterdam / Philadelphia: John Benjamins, 164-196.
    Periñán-Pascual, C. and Mairal Usón, R. (2010). "La gramática de COREL: un lenguaje de representación conceptual / The COREL grammar: a conceptual representation language". ONOMÁZEIN, 21/1: 11-45.
    Ruiz de Mendoza Ibáñez, F. J. (2013). Meaning construction, meaning interpretation and formal expression in the Lexical Constructional Model. Linking Constructions into Functional Linguistics, 231-270. doi:10.1075/slcs.145.09ib225
    Ruiz de Mendoza Ibáñez, F. J., & Usón, R. M. (2008). Levels of description and constraining factors in meaning construction: an introduction to the Lexical Constructional Model. Folia Linguistica, 42(3-4). doi:10.1515/flin.2008.355
    Sag, I., Wasow, T. and Bender, E. (2003). Syntactic Theory: Formal Introduction. Stanford: CSLI Publications.
    Steels, L. (2015). "Introducing Fluid Construction Grammar". In L. Steels (ed.) Design Patterns in Fluid Construction Grammar. Amsterdam: John Benjamins, pp. 3-30.
    Van Valin, R. D. J. (2005). Exploring the Syntax–Semantics Interface. doi:10.1017/cbo9780511610578
    Van Valin, R. D., & LaPolla, R. J. (1997). Syntax. doi:10.1017/cbo9781139166799
    Wintner, S. (2009). What Science Underlies Natural Language Engineering? Computational Linguistics, 35(4), 641-644. doi:10.1162/coli.2009.35.4.3540

    Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

    On behalf of the Program Committee, a very warm welcome to the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020). This edition of the conference is held in Bologna and organised by the University of Bologna. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after six years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges.

    The anatomy of the lexicon within the framework of an NLP knowledge base

    The aim of this paper is to present the format of the lexical level within the framework of FunGramKB (www.fungramkb.com), a lexical conceptual knowledge base that is part of the Lexical Constructional Model (www.lexicom.es). In doing so, we discuss the different features that define the Spanish and the English lexica. Mairal Usón, R.; Periñán Pascual, JC. (2009). The anatomy of the lexicon within the framework of an NLP knowledge base. RESLA. Revista Española de Lingüística Aplicada. 22:217-244. http://hdl.handle.net/10251/53342