313 research outputs found

    The TXM Portal Software giving access to Old French Manuscripts Online

    Full text online: http://www.lrec-conf.org/proceedings/lrec2012/workshops/13.ProceedingsCultHeritage.pdf (international audience). This paper presents the new TXM software platform, which gives online access to images of Old French text manuscripts and to tagged transcriptions for concordancing and text mining. The platform can import medieval sources encoded in XML according to the TEI Guidelines, link manuscript images to transcriptions, and encode several diplomatic levels of transcription, including abbreviations and word-level corrections. It includes a sophisticated tokenizer able to deal with TEI tags at different levels of the linguistic hierarchy. Words are tagged on the fly during the import process using the IMS TreeTagger tool with a specific language model. Synoptic editions displaying manuscript images and text transcriptions side by side are produced automatically during the import process. Texts are organized in a corpus with their own metadata (title, author, date, genre, etc.), and several word-property indexes are produced for the CQP search engine to allow efficient word-pattern searches for building different types of frequency lists or concordances. For syntactically annotated texts, special indexes are produced for the Tiger Search engine to allow efficient building of syntactic concordances. The platform has also been tested on classical Latin, ancient Greek, Old Slavonic and Old Hieroglyphic Egyptian corpora (including various types of encoding and annotations).
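To make the term "concordance" in the abstract above concrete, here is a minimal keyword-in-context (KWIC) sketch in Python. It is an illustration only and not how TXM or the CQP engine works: the `tokenize` and `kwic` helpers, the regular expression, and the sample sentence are all assumptions made up for this example, and a real pipeline would query a prebuilt index rather than scan tokens linearly.

```python
import re
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Naive tokenizer (assumption): runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def kwic(tokens: List[str], keyword: str, width: int = 3) -> List[Tuple[str, str, str]]:
    """Return (left context, keyword, right context) for each case-insensitive match."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


# Made-up sample text, used only to exercise the functions.
sample = "Li rois apele ses barons. Li rois lor dit que la guerre est finie."
tokens = tokenize(sample)
lines = kwic(tokens, "rois", width=2)

for left, kw, right in lines:
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>12} | {kw} | {right}")
```

Each row pairs one occurrence of the keyword with a fixed window of neighbouring tokens, which is the basic shape of a concordance line; frequency lists can be derived from the same token sequence by counting.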

    Final report : PATTON Alliance gazetteer evaluation project.


    Linked Data Methodologies in Gandhāran Buddhist Art and Texts:

    The Working Group “Linked Data Methodologies in Gandhāran Buddhist Art and Texts” explores potential uses of Linked Open Data principles in bridging different collections of Gandhāran Buddhist resources extant in both textual and visual media. The team includes art historians and archaeologists, philologists and historians of Buddhism, as well as experts in Digital Humanities from various research institutions across North America and Europe, who have decided to combine their expertise to propose means of fostering interoperability between repositories and, ultimately, of advancing our knowledge of Buddhism in Gandhāra. The result of the discussion, presented here, is a set of guidelines that should help in planning and implementing future collection management systems. These guidelines abide by the following basic principles: 1. bridging diverse collections; 2. a progressive enhancement approach; 3. incremental changes to existing databases. With this goal in mind, a set of four vocabularies – Places, Motifs, Narratives, and Persons – to which stable identifiers will be attributed has been defined. For each, we identify best-practice models that show how to model the information, and we list a very limited set of primary sources that provide information about what we want to model. The Working Group “Linked Data Methodologies in Gandhāran Buddhist Art and Texts” and the resulting publication are financed by a grant from the Pelagios Network (“Small grants”, 2019). Suggestion for citation: Elwert, F. & Pons, J. 2020. ‘Linked Data Methodologies in Gandhāran Buddhist Art and Texts. Pelagios Working Group Final Report’, doi:10.13154/rub.148.125

    The Future of Information Sciences : INFuture2009 : Digital Resources and Knowledge Sharing


    Crossing Experiences in Digital Epigraphy: From Practice to Discipline

    Although a significant number of projects digitizing inscriptions are under development or have recently been completed, Digital Epigraphy is not yet considered a proper discipline, and there are still no regular occasions to meet and discuss. By collecting contributions on nineteen projects (highly diverse in geographic and chronological context, in script and language, and in type of digital output), this volume intends to point out the methodological issues that are specific to the application of information technologies to epigraphy. The first part of the volume focuses on data modelling and encoding, which are conditioned by the specific features of different scripts and languages and which deeply influence the possibility of searching texts and the approach to the lexicographic study of such under-resourced languages. The second part of the volume is dedicated to initiatives aimed at fostering the aggregation, dissemination and reuse of epigraphic materials, and discusses issues of interoperability. The common theme of the volume is the relationship between, on the one hand, compliance with the theoretical tools and methodologies developed by each tradition of studies and, on the other, the necessity of adopting a common framework in order to produce commensurable and shareable results. The final question is whether the computational approach is changing the way epigraphy is studied, to the extent of renewing the discipline on the basis of new, unexplored questions.

    Bourgeois Letters: Language Planning as an Avenue of Social Engineering in Ukraine (1919-1938)

    At present, the Ukrainian population in Ukraine and Ukrainian émigrés abroad use two different orthographical systems. The question of which of the two codes can be considered legitimate standard Ukrainian is the subject of many emotionally charged debates in Ukraine and within the Ukrainian community in the West. This research focuses on the events, processes, and politics that led to the emergence of the two orthographic codes of Ukrainian, as well as on the social engineering efforts that accompanied each stage of language planning. Books and publications on the history of Ukraine, language planning and government policies in Ukraine, Imperial Russia and the Soviet Union have been examined alongside the theory of types of social engineering to reveal how government policies were reflected in the selection, codification, elaboration and securing-acceptance stages of language planning. The outcome of this study of political impact on standard Ukrainian may be of interest to scholars researching language planning in the context of bilingual societies and political power. It may be used to explain to students of Ukrainian how the differences between the orthographies used in the West and in Ukraine came into existence. Awareness and understanding of the historical roots and political context of the development of the existing standards of Ukrainian may help individuals involved in and affected by this polarizing issue to find shared concepts and begin appreciating the existing diversity of the language.

    Bibliographic Control in the Digital Ecosystem

    With the contributions of international experts, the book aims to explore the new boundaries of universal bibliographic control. Bibliographic control is radically changing because the bibliographic universe is radically changing: resources, agents, technologies, standards and practices. Among the main topics addressed: library cooperation networks; legal deposit; national bibliographies; new tools and standards (IFLA LRM, RDA, BIBFRAME); authority control and new alliances (Wikidata, Wikibase, Identifiers); new ways of indexing resources (artificial intelligence); institutional repositories; new book supply chain; “discoverability” in the IIIF digital ecosystem; role of thesauri and ontologies in the digital ecosystem; bibliographic control and search engines

    Rapid Generation of Pronunciation Dictionaries for new Domains and Languages

    This dissertation presents innovative strategies and methods for the rapid generation of pronunciation dictionaries for new domains and languages. Solutions are proposed and developed for a range of conditions, from the straightforward scenario, in which the target language is present in written form on the Internet and the mapping between speech and written language is close, to the difficult scenario, in which no written form of the target language exists.

    CyberResearch on the Ancient Near East and Eastern Mediterranean

    CyberResearch on the Ancient Near East and Neighboring Regions provides case studies on archaeology, objects, cuneiform texts, online publishing, digital archiving, and preservation. Eleven chapters present a rich array of material, spanning the fifth through the first millennium BCE, from Anatolia, the Levant, Mesopotamia, and Iran. Customized cyber- and general glossaries support readers who lack either a technical background or familiarity with the ancient cultures. Edited by Vanessa Bigot Juloux, Amy Rebecca Gansell, and Alessandro Di Ludovico, this volume is dedicated to broadening the understanding and accessibility of digital humanities tools, methodologies, and results in Ancient Near Eastern Studies. Ultimately, this book provides a model for introducing cyber-studies to the mainstream of humanities research.