3 research outputs found

    (How) is formulaic language universal? Insights from Korean, German and English

    The existence of common expressions, also referred to as formulaic language or phraseological units, has been evidenced in a very large number of languages. However, the extent to which languages feature such formulaic material, how formulaicity may be understood across typologically different languages and whether indeed there is a concept of formulaic language that applies across languages, are questions that have been less commonly discussed. Using a novel data set consisting of topically matched corpora in three typologically different languages (Korean, German and English), this study proposes an empirically founded universal concept for formulaic language and discusses what the shape of this concept suggests for the theoretical understanding of formulaic language going forward. In particular, it is argued that the nexus of the concept of formulaic language cannot be fixed at any particular structural level (such as the phrase or the level of polylexicality) and incorporates elements specified at varying levels of abstraction (or schematicity). This means that a cross-linguistic concept of formulaic language fits in well with a constructionist view of linguistic structure.

    A Computational Lexicon and Representational Model for Arabic Multiword Expressions

    The phenomenon of multiword expressions (MWEs) is increasingly recognised as a serious and challenging issue that has attracted the attention of researchers in various language-related disciplines. Research in these many areas has emphasised the primary role of MWEs in the process of analysing and understanding language, particularly in the computational treatment of natural languages. Ignoring MWE knowledge in any NLP system reduces the possibility of achieving high precision outputs. However, despite the enormous wealth of MWE research and language resources available for English and some other languages, research on Arabic MWEs (AMWEs) still faces multiple challenges, particularly in key computational tasks such as extraction, identification, evaluation, language resource building, and lexical representations. This research aims to remedy this deficiency by extending knowledge of AMWEs and making noteworthy contributions to the existing literature in three related research areas on the way towards building a computational lexicon of AMWEs. First, this study develops a general understanding of AMWEs by establishing a detailed conceptual framework that includes a description of an adopted AMWE concept and its distinctive properties at multiple linguistic levels. Second, in the use of AMWE extraction and discovery tasks, the study employs a hybrid approach that combines knowledge-based and data-driven computational methods for discovering multiple types of AMWEs. Third, this thesis presents a representative system for AMWEs which consists of multilayer encoding of extensive linguistic descriptions. This project also paves the way for further in-depth AMWE-aware studies in NLP and linguistics to gain new insights into this complicated phenomenon in standard Arabic. 
    The implications of this research are related to the vital role of the AMWE lexicon, as a new lexical resource, in the improvement of various ANLP tasks and the potential opportunities this lexicon provides for linguists to analyse and explore AMWE phenomena.