3 research outputs found

    Exploring data provenance in handwritten text recognition infrastructure: Sharing and reusing ground truth data, referencing models, and acknowledging contributions. Starting the conversation on how we could get it done

    This paper discusses best practices for sharing and reusing Ground Truth in Handwritten Text Recognition infrastructures, and ways to reference and acknowledge contributions to the creation and enrichment of data within these Machine Learning systems. We discuss how one can publish Ground Truth data in a repository and, subsequently, inform others. Furthermore, we suggest appropriate citation methods for HTR data, models, and contributions made by volunteers. Moreover, when using digitised sources (digital facsimiles), it becomes increasingly important to distinguish between the physical object and the digital collection. These topics all relate to the proper acknowledgement of labour put into digitising, transcribing, and sharing Ground Truth HTR data. This also points to broader issues surrounding the use of Machine Learning in archival and library contexts, and how the community should begin to acknowledge and record both contributions and data provenance

    Images of the Past. 7 years of Images for the Future

    Over a period of seven years Sound and Vision, EYE Film Institute, the National Archives and Kennisland preserved over 90,000 hours of video, 20,000 hours of film, some 100,000 hours of audio and 2,500,000 photos. The digitised material is now being reused for numerous purposes, from lesson material and Wikipedia to apps and services for the creative industry. Images for the Future has played a pioneering role in both the development of large-scale digitisation processes and the thinking on the role of heritage organisations in our digital society

    Beelden van het verleden. 7 jaar Beelden voor de Toekomst

    Over seven years, Beeld en Geluid, EYE Filmmuseum, the Nationaal Archief and Kennisland preserved and digitised more than 90,000 hours of video, 20,000 hours of film, some 100,000 hours of audio and 2.5 million photos. The digitised material is now being reused in countless applications, from teaching material and Wikipedia to apps and services for the creative industry. Beelden voor de Toekomst (Images for the Future) has played a pioneering role in both the development of large-scale digitisation processes and the thinking about the role of heritage institutions in the digital society