1,260 research outputs found

    Cross-Lingual and Cross-Chronological Information Access to Multilingual Historical Documents

    In this chapter, we present our work on realizing information access across different languages and periods. Digital collections of historical documents now have to handle materials written in many different languages and time periods. Even within a single language, grammar, vocabulary and script differ significantly over time. Our goal is to develop a method for accessing digital collections that span a wide range of periods, from ancient to modern. We introduce an information extraction method for digitized ancient Mongolian historical manuscripts to reduce labour-intensive analysis. The proposed method performs computerized analysis on Mongolian historical documents. Named entities such as personal names and place names are extracted using a support vector machine. The extracted named entities are used to create a digital edition that reflects an ancient Mongolian historical manuscript written in traditional Mongolian script. The Text Encoding Initiative guidelines are adopted to encode the named entities, transcriptions and interpretations of ancient words. A web-based prototype system is developed for using digital editions of ancient Mongolian historical manuscripts as scholarly tools. The prototype can display and search traditional Mongolian text and its transliteration in Latin letters, along with the highlighted named entities and the scanned images of the source manuscript.

    Extraction and Visualization of Toponyms in Diachronic Text Corpora

    This paper focuses on the extraction of German and Austrian place names in historical texts. Our text basis is Die Fackel (The Torch), published by Karl Kraus. The database we develop follows from a combination of approaches: gazetteers are curated in a supervised way to account for historical differences, and current geographical information is used as a fallback. Our maps highlight the linguistic and cultural ties of Kraus and his contemporaries: "Die Fackel" is (at least) a European phenomenon, and Kraus' vision of Europe is more inclined towards cultural centers.

    GlamMap: visualizing library metadata

    Multiethnic Societies of Central Asia and Siberia Represented in Indigenous Oral and Written Literature

    Central Asia and Siberia are characterized by multiethnic societies formed by a patchwork of often small ethnic groups. At the same time, large parts of them have been dominated by state languages, especially Russian and Chinese. On a local level, the languages of the autochthonous people often play a role parallel to the central national language. The contributions to these conference proceedings follow up on topics such as: What was or is collected, and how can it be used under changed conditions in the research landscape? How does it help local ethnic communities to understand and preserve their own culture and language? Do the spatially dispersed but often networked collections support research on the ground? What contribution do these collections make to the local languages and cultures against the backdrop of dwindling attention to endangered groups? These and other questions are discussed against the background of the important role libraries and private collections play for multiethnic societies in often remote regions that are difficult to reach.

    Central Asian Sources and Central Asian Research

    In October 2014, about thirty scholars from Asia and Europe came together for a conference to discuss different kinds of sources for research on Central Asia. The topics ranged widely, from museum collections and ancient manuscripts to modern newspapers, pulp fiction and the wind horses flying against the blue sky of Mongolia. Modern data processing and data management, and the problems of handling five different languages and scripts for a dictionary project, led us into the modern digital age. The dominating theme of the whole conference was the importance of collections of source material found in libraries and archives, and their preservation and expansion for future generations of scholars. Some of the finest presentations were selected for this volume and are now published for a wider audience.

    A Survey on Rendering Traditional Mongolian Script
