5 research outputs found

    Impresso Text Reuse at Scale

    U-AGR-7251 - INTER/SNF/22/17498891/IMPRESSO2 (01/09/2023 - 28/02/2027) - DURING Marten; 4. Quality educatio

    Proceedings of the Digital Humanities in the Nordic Countries 4th Conference (DHN 2019), Copenhagen, Denmark, March 5-8, 2019

    This paper is based on a study of text reuse in the Finnish press from 1771 to 1920. In the Computational History and the Transformation of Public Discourse in Finland (COMHIS) project, we found 61 million occurrences of similarity, which formed 13.8 million clusters of reuse. This material also included strikingly slow processes of repetition, and the longest reuse cases were almost as long as the time span of the project. In sum, 2.03 million clusters, 15 per cent of the total, spanned more than 12 months, and 76,259 clusters spanned 20 years or more. The longest span was 146 years. The paper explores the volume and nature of this long-term text reuse in the Finnish press and analyses three distinctive features of slow repetition: newspapers as a site of memory, newspapers as an archive, and the political ramifications of reuse. The paper argues that the habit of reprinting old texts aimed to bridge the gap between past and present, emphasising the continuity between old and new. On the other hand, there were cases where past texts were activated for precisely the opposite purpose: to obscure the past and to show how different the bygone world was.

    Tekstien uudelleenkäyttö suomalaisessa sanoma- ja aikakauslehdistössä 1771–1920. Digitaalisten ihmistieteiden näkökulma

    This article explores Finnish newspapers and periodicals published between 1771 and 1920, with a focus on the reuse of texts. While the reprinting of particular texts in different contexts is an old and well-known practice in the press, it could not be examined systematically until these newspapers and periodicals were digitized. The primary research material is the digitized OCR corpus of newspapers and periodicals published by the National Library of Finland. In the COMHIS project, we developed text-mining software, based on NCBI BLAST, that detects and locates textual repetition. For the period 1771–1920, we found approximately 13.8 million clusters of text reuse. As well as introducing the BLAST-based method, the article explores the results it produced and what they reveal about the circulation of information in the Finnish press during this period. The article shows that the copying and reuse of texts was a significant part of the Finnish press. As a method, text reuse detection offers a new way to study the movements and routes of information.

    Infectious Media: Cholera and the Circulation of Texts in the Finnish Press, 1860–1920

    Cholera was the emblematic disease of nineteenth-century Europe. This article explores the cultural ramifications of cholera by concentrating on the ways in which public discourse participated in circulating information on the disease. It focuses on the reuse of texts about cholera in the Finnish press from 1860 to 1920. The most severe cholera epidemics in Finland were the first ones, in the 1830s and 1850s, and the number of casualties dropped significantly towards the end of the century. At the same time, however, cholera was discussed more than ever, and references to it rose steadily from the 1860s onwards. In Finland, the public discourse on cholera was also entangled with the rising nationalism of the late nineteenth century.