
    Modeling information structure in a cross-linguistic perspective

    This study makes substantial contributions to both the theoretical and computational treatment of information structure, with a specific focus on creating natural language processing applications such as multilingual machine translation systems. The present study first provides cross-linguistic findings regarding information structure meanings and markings. Building upon such findings, the current model represents information structure within the HPSG/MRS framework using Individual Constraints. The primary goal of the present study is to create a multilingual grammar model of information structure for the LinGO Grammar Matrix system. The present study explores the construction of a grammar library for creating customized grammars incorporating information structure and illustrates how the information structure-based model improves the performance of transfer-based machine translation.

    Building the Arabic Learner Corpus and a System for Arabic Error Annotation

    Recent developments in learner corpora have highlighted the growing role they play in some linguistic and computational research areas such as language teaching and natural language processing. However, there is a lack of a well-designed Arabic learner corpus that can be used for studies in the aforementioned research areas. This thesis aims to introduce a detailed and original methodology for developing a new learner corpus. This methodology, which represents the major contribution of the thesis, includes a combination of resources, proposed standards and tools developed for the Arabic Learner Corpus project. The resources include the Arabic Learner Corpus, which is the largest learner corpus for Arabic based on systematic design criteria. The resources also include the Error Tagset of Arabic, which was designed for annotating errors in Arabic and covers 29 types of errors under five broad categories. The Guide on Design Criteria for Learner Corpus is an example of the proposed standards; it was created based on a review of previous work and focuses on 11 aspects of corpus design criteria. The tools include the Computer-aided Error Annotation Tool for Arabic, which provides functions that facilitate error annotation, such as the smart-selection function and the auto-tagging function. Additionally, the tools include the ALC Search Tool, which was developed to enable searching the ALC and downloading the source files based on a number of determinants. The project successfully recruited 992 people, including language learners, data collectors, evaluators, annotators and collaborators, from more than 30 educational institutions in Saudi Arabia and the UK. The data of the Arabic Learner Corpus was used in a number of projects for different purposes, including error detection and correction, native language identification, evaluation of Arabic analysers, applied linguistics studies and data-driven Arabic learning. This use of the ALC highlights the importance of developing the project.

    Aspect and Meaning in the Russian Future Tense: Corpus and Experimental Investigations

    This dissertation is a study of the Russian future tense within the framework of cognitive linguistics. In this dissertation I focus on the distribution of the perfective and imperfective future forms, their future and non-future meanings, and the use of the future tense verb forms by both native and non-native speakers. In the Russian tense-aspect system, it is reasonable to operate with markedness on a local level of tense, rather than the level of the verb. Via local markedness it is possible to see that the perfective future is the unmarked member of the opposition, and the imperfective future is the marked one. The perfective future tense forms are approximately fourteen times more frequent than imperfective future tense forms in the Russian National Corpus. Both perfective and imperfective future tense forms express not only future meanings but also gnomic, directive, and other meanings. The (non-)future meanings form a radial category with the future meaning as a prototype and other meanings as extensions. Native speakers operate with frequency when they use future tense forms. Non-native speakers are not sensitive to frequency, and instruction in the use of the future tense forms in Russian could be improved.

    South America


    Words in Space and Time

    With forty-two extensively annotated maps, this atlas offers novel insights into the history and mechanics of how Central Europe’s languages have been made, unmade, and deployed for political action. The innovative combination of linguistics, history, and cartography makes a wealth of hard-to-reach knowledge readily available to both specialist and general readers. It combines information on languages, dialects, alphabets, religions, mass violence, or migrations over an extended period of time. The story first focuses on Central Europe’s dialect continua, the emergence of states, and the spread of writing technology from the tenth century onward. Most maps concentrate on the last two centuries. The main storyline opens with the emergence of the Western European concept of the nation, in accord with which the ethnolinguistic nation-states of Italy and Germany were founded. In the Central European view, a “proper” nation is none other than the speech community of a single language. The Atlas aspires to help users make the intellectual leap of perceiving languages as products of human history and part of culture. Like states, nations, universities, towns, associations, art, beauty, religions, injustice, or atheism, languages are artefacts invented and shaped by individuals and their groups.

    The Global Encyclopaedia of Informality, Volume 1

    Alena Ledeneva invites you on a voyage of discovery, to explore society’s open secrets, unwritten rules and know-how practices. Broadly defined as ‘ways of getting things done’, these invisible yet powerful informal practices tend to escape articulation in official discourse. They include emotion-driven exchanges of gifts or favours and tributes for services, interest-driven know-how (from informal welfare to informal employment and entrepreneurship), identity-driven practices of solidarity, and power-driven forms of co-optation and control. The paradox, or not, of the invisibility of these informal practices is their ubiquity. Expertly practised by insiders but often hidden from outsiders, informal practices are, as this book shows, deeply rooted all over the world, yet underestimated in policy. Entries from the five continents presented in this volume are samples of the truly global and ever-growing collection, made possible by a remarkable collaboration of over 200 scholars across disciplines and area studies. By mapping the grey zones, blurred boundaries, types of ambivalence and contexts of complexity, this book creates the first Global Map of Informality. The accompanying database is searchable by region, keyword or type of practice, so do explore what works, how, where and why.