17 research outputs found

    The Taming of the Shrew - non-standard text processing in the Digital Humanities

    Natural language processing (NLP) has focused on the automatic processing of newspaper texts for many years. With the growing importance of text analysis in various areas such as spoken language understanding, social media processing and the interpretation of text material from the humanities, techniques and methodologies have to be reviewed and redefined, since so-called non-standard texts pose challenges on the lexical and syntactic level, especially for machine-learning-based approaches. Automatic processing tools developed on the basis of newspaper texts show decreased performance on texts with divergent characteristics. Digital Humanities (DH), a field that has risen to prominence in recent decades, holds a variety of examples of this kind of text. For instance, the computational analysis of the relationships of Shakespeare’s dramatic characters requires the adjustment of processing tools to 16th-century English texts in dramatic form. Likewise, the investigation of narrative perspective in Goethe’s ballads calls for methods that can handle German verse from the 18th century. In this dissertation, we put forward a methodology for NLP in a DH environment. We investigate how an interdisciplinary context in combination with specific goals within projects influences the general NLP approach. We suggest thoughtful collaboration and increased attention to the easy applicability of resulting tools as a solution for differences in the store of knowledge between project partners. Projects in DH are not only constituted by the automatic processing of texts but are usually framed by the investigation of a research question from the humanities. As a consequence, time limitations complicate the successful implementation of analysis techniques, especially since the diversity of texts impairs the transferability and reusability of tools beyond a specific project. We respond to this with modular and thus easily adjustable project workflows and system architectures.
Several instances serve as examples for our methodology on different levels. We discuss modular architectures that balance time-saving solutions and problem-specific implementations, using the example of automatic postcorrection of the output text from an optical character recognition system. We address the problem of data diversity and low-resource situations by investigating different approaches to non-standard text processing. We examine two main techniques: text normalization and tool adjustment. Text normalization aims to transform non-standard text in order to assimilate it to the standard, whereas tool adjustment works in the opposite direction, enabling tools to successfully handle a specific kind of text. We focus on the task of part-of-speech tagging to illustrate various approaches to the processing of historical texts as an instance of non-standard texts. We discuss how the level of deviation from a standard form influences the performance of different methods. Our approaches shed light on the importance of data quality and quantity and emphasize the indispensability of annotations for effective machine learning. In addition, we highlight the advantages of problem-driven approaches where the purpose of a tool is clearly formulated through the research question. Another significant finding to emerge from this work is a summary of the experiences and increased knowledge gained through collaborative projects between computer scientists and humanists. We reflect on various aspects of the elaboration and formalization of research questions in DH and assess the limitations and possibilities of the computational modeling of humanistic research questions.
An emphasis is placed on the interplay between expert knowledge of a subject of investigation and the implementation of tools for that purpose, and on the resulting advantages, such as the targeted improvement of digital methods through purposeful manual correction and error analysis. We show obstacles and opportunities and give prospects and directions for future development in this realm of interdisciplinary research.

    Reader Response in the Digital Age. Letters to the editor vs. below-the-line comments. A synchronic comparison.

    Heralded by some as the biggest revolution of the Internet, with great egalitarian and democratic potential, web 2.0 and social media are frowned on by others as sites where users constantly compete to take centre stage, more often than not by sharing everyday banalities, thus flooding the web with “tedious piffle”. While it is true that it has never been so easy to put in your two cents’ worth, the concept of user-generated content – one of the buzzwords of today’s participatory web – can look back on a long tradition in newspapers, where letters to the editor have always been a highly popular way for readers to make their voices heard in public. In their move online, most newspapers added comment sections to their websites, thus taking readers’ letters to the digital level and providing the basis for the present synchronic study, which compares 1,000 below-the-line comments posted on the websites of the Guardian and the Times to 1,000 letters to the editor written to the same newspapers by addressing, one by one, four common claims about, or (mis-)conceptions of, this form of user-generated content. The analysis begins on the micro-linguistic level, comparing the data sets in terms of their orthographic, typographic, lexical and syntactic features and addressing the claim that the language used to communicate on the Internet differs substantially from the language used in other contexts. The focus then shifts to the interactional structures found in the two genres and the question of whether below-the-line comments, as a form of web 2.0, are really more interactive than traditional letters to the editor, which are commonly perceived as a means of ‘talking back’ to the newspaper or journalist rather than a forum for interactive debates among users. The discussion then moves on to matters of face, (im-)politeness and identity construction by first investigating the face-threatening act of criticising others as well as the act of providing positive feedback. 
This analysis was inspired by the fact that the two genres, although clearly related, are perceived very differently: while comment sections are often associated with aggressive and uninhibited verbal behaviour and numerous calls for their closure can be found, such concerns have not been voiced about letters pages in newspapers. Moreover, it has been claimed that via online comments, more and more private topics are entering the public sphere, thus leading to an increase in subjectivity and personalisation. This last claim is addressed by exploring strategies of personalisation and the moves used to construct an expert identity. The comparative analysis is thus concluded with a focus on the domain of social behaviour, investigating the different means contributors employ to create their own identity and that of the people talked about or addressed.

    Digital Classical Philology

    The buzzwords “Information Society” and “Age of Access” suggest that information is now universally accessible without any form of hindrance. Indeed, the German constitution calls for all citizens to have open access to information. Yet in reality, there are multifarious hurdles to information access – whether physical, economic, intellectual, linguistic, political, or technical. Thus, while new methods and practices for making information accessible arise on a daily basis, we are nevertheless confronted by limitations to information access in various domains. This new book series assembles academics and professionals in various fields in order to illuminate the various dimensions of information's inaccessibility. While the series discusses principles and techniques for transcending the hurdles to information access, it also addresses necessary boundaries to accessibility. This book describes the state of the art of digital philology with a focus on ancient Greek and Latin. It addresses problems such as accessibility of information about Greek and Latin sources, data entry, collection and analysis of Classical texts, and describes the fundamental role of libraries in building digital catalogs and developing machine-readable citation systems.

    AIUCD2016 - Book of Abstracts

    This volume collects the abstracts of the contributions accepted at the AIUCD 2016 conference, titled "Edizioni digitali: rappresentazione, interoperabilità, analisi del testo e infrastrutture" (Digital editions: representation, interoperability, text analysis and infrastructures). This was the fifth conference of the Associazione di Informatica Umanistica e Cultura Digitale (AIUCD), held in Venice from 7 to 9 September 2016, and it was dedicated to the representation and study of text from various points of view (resources, analysis, publication infrastructures), with the aim of bringing philologists, historians, digital humanists, computational linguists, logicians, computer scientists and computer engineers into dialogue around the text. The volume therefore contains only the abstracts of the contributions accepted at the conference, which received a favourable opinion from expert reviewers through an anonymous review process under the responsibility of the AIUCD 2016 Scientific Committee.

    7th International Conference on Higher Education Advances (HEAd'21)

    Information and communication technologies together with new teaching paradigms are reshaping the learning environment. The International Conference on Higher Education Advances (HEAd) aims to become a forum for researchers and practitioners to exchange ideas, experiences, opinions and research results relating to the preparation of students and the organization of educational systems. Doménech I De Soria, J.; Merello Giménez, P.; Poza Plaza, EDL. (2021). 7th International Conference on Higher Education Advances (HEAd'21). Editorial Universitat Politècnica de València. https://doi.org/10.4995/HEAD21.2021.13621

    The scribe as interpreter : a new look at New Testament textual criticism according to reader reception theory

    Philosophy, Practical and Systematic Theology. D.Litt. et Phil. (Theory of Literature

    Children's Folklore

    A collection of original essays by scholars from a variety of fields (including American studies, folklore, anthropology, psychology, sociology, and education), Children's Folklore: A Source Book moves beyond traditional social-science views of child development. It reveals the complexity and artistry of interactions among children, challenging stereotypes of simple childhood innocence and conventional explanations of development that privilege sober and sensible adult outcomes. Instead, the play and lore of children is shown to be often disruptive, wayward, and irrational. The contributors variably consider and demonstrate contextual and textual ways of studying the folklore of children. Avoiding a narrow definition of the subject, they examine a variety of resources and approaches for studying, researching, and teaching it. These range from surveys of the history and literature of children's folklore to methods of field research, studies of genres of lore, and attempts to capture children's play and games.