    Ways of Reading, Models for Text, and the Usefulness of Dead People

    The definition of text is still a live issue with important implications for emerging forms of digital textuality. This paper proposes that no single definition of text is sufficient to account for all manifestations of textuality. Medieval textuality is a test case: four different models for text are offered, corresponding to ways in which modern medievalists approach medieval texts. Studying medieval texts has value not only to support historically informed theories of reading and writing, but also to suggest alternative models of organizing, representing, and processing textual information.

    In search of comity: TEI for distant reading

    Any expansion of the TEI beyond its traditional user base involves a recognition that there are many differing answers to the traditional question “What is text, really?” We report on some work carried out in the context of the COST Action Distant Reading for European Literary History (CA16204), in particular on the TEI-conformant schemas developed for one of its principal deliverables: the European Literary Text Collection (ELTeC). The ELTeC will contain comparable corpora for each of at least a dozen European languages, each being a balanced sample of one hundred novels from the period 1840 to 1920, together with metadata concerning their production and reception. We hope that it will become a reliable basis for comparative work in data-driven textual analytics. The focus of the ELTeC encoding scheme is not to represent texts in all their original complexity, nor to duplicate the work of scholarly editors. Instead, we aim to facilitate a richer and better-informed distant reading than a transcription of lexical content alone would permit. At the same time, where the TEI encourages diversity, we enforce consistency by permitting representation of only a specific and quite small set of textual features, both structural and analytical. These constraints are expressed by a master TEI ODD, from which we derive three different schemas by ODD chaining, each associated with appropriate documentation.

    Konversion des kulturellen Erbes für die Forschung: Volltextbeschaffung und -bereitstellung als Aufgabe der Bibliotheken

    The transformation of the printed book into an electronic text has led to constitutive changes in the functional framework of the library. The basic concepts behind the ‘book’ have to be reconsidered from a digital perspective and evaluated for their practical consequences. Above all, the transitivity of electronic texts, their specific mode of writing, and their ability to be processed are properties that have considerable impact on several core tasks of the library. In view of the cultural heritage preserved in libraries, this paradigm shift requires an appropriate response: libraries have the task of ensuring that written and printed cultural heritage is also available in an adequate machine-readable form. After a good ten years of successful image digitization and the development of suitable techniques, the next step is to produce, enhance, and provide access to full text, in order to meet the new research requirements and questions arising from the digital turn, which are associated with methods such as stylometry, cluster analysis, and topic modeling. In taking on these new tasks, libraries can make an important contribution to building a research infrastructure for the digital humanities.

    Automatic Text Recognition Applied to Spanish Golden Age Gothic Script: Creation of an HTR Model Based on 16th Century Spanish Romances of Chivalry on the Transkribus Platform

    This study centres on the main aspects of the mass digitization of texts and the automatic recognition of digitized images by means of OCR/HTR software. We present an HTR experiment with sixteenth-century Spanish romances of chivalry and propose a model suitable for transcribing these texts in a semi-automatic and collaborative way.

    La edición académica digital. De las teorías del texto a la visualización de la información

    The aim of this paper is to compare and analyse several definitions of "text", since a large part of the editorial process depends on them; I will then try to bring the different editorial theories into dialogue, looking for common ground rather than stressing epistemological, geographical, and linguistic differences. The study of theories of text and of editorial theories, which are inseparable from print editing, should allow us to propose a theory of digital scholarly editing based on structuring information in layers and on the computer's ability to display these layers interactively at the user's request.

    Textual Assemblages and Transmission: Unified models for (Digital) Scholarly Editions and Text Digitisation

    Scholarly editing and textual digitisation are typically seen as two distinct, though related, fields. Scholarly editing is replete with traditions and codified practices, while the digitisation of text-bearing material is a recent enterprise, governed more by practice than theory. From the perspective of scholarly editing, the mere digitisation of text is a world away from the intellectual engagement and rigour on which textual scholarship is founded. Recent developments have led to a more open-minded perspective. As scholarly editing has made increasing use of the digital medium, and textual digitisation has begun to make use of scholarly editing tools and techniques, the more obvious distinctions dissolve. Such criteria as ‘critical engagement’ become insufficient grounds on which to base a clear distinction. However, this perspective is not without its risks either. It perpetuates the idea that a (digital) scholarly edition and a digitised text are interchangeable. This thesis argues that a real distinction can be drawn. It starts by considering scholarly editing and textual digitisation as textual transmissions. Starting from the ontological perspective of Deleuze and Guattari, it builds a framework capable of considering the processes behind scholarly editing and digitisation. In doing so, it uncovers a number of critical distinctions. Scholarly editing creates a regime of representation that is self-consistent and self-validating. Textual digitisation does not. In the final chapters, this thesis uses the crowd-sourced Letters of 1916 project as a test case for a new conceptualisation of a scholarly edition: one that is neither globally self-consistent nor self-validating, but which provides a conceptual model in which these absences might be mitigated and the function of a scholarly edition fulfilled.

    On the term 'text' in Digital Humanities

