386 research outputs found

    To Relive the Web: A Framework for the Transformation and Archival Replay of Web Pages

    When replaying an archived web page (known as a memento), the fundamental expectation is that the page should be viewable and function exactly as it did at archival time. However, this expectation requires web archives to modify the page and its embedded resources, so that they no longer reference (link to) the original server(s) they were archived from but back to the archive. Although these modifications necessarily change the state of the representation, it is understood that without them the replay of mementos from the archive would not be possible. Unfortunately, because the replay of mementos and the modifications made to them by web archives in order to facilitate replay vary between archives, the terminology for describing replay and the modifications made to mementos for facilitating replay does not exist. In this thesis, we propose terminology for describing the existing styles of replay and the modifications made on the part of web archives to mementos in order to facilitate replay. This thesis also, in the process of defining terminology for the modifications made by client-side rewriting libraries to the JavaScript execution environment of the browser during replay, proposes a general framework for the auto-generation of client-side rewriting libraries. Finally, we evaluate the effectiveness of using a generated client-side rewriting library to augment the existing replay systems of web archives by crawling mementos replayed from the Internet Archive’s Wayback Machine with and without the generated client-side rewriter. By using the generated client-side rewriter, we were able to decrease the cumulative number of requests blocked by the content security policy of the Wayback Machine for 577 mementos by 87.5% and increase the cumulative number of requests made by 32.8%. Also, by using the generated client-side rewriter, we were able to replay mementos that were previously not replayable from the Internet Archive

    JISC Preservation of Web Resources (PoWR) Handbook

    Handbook of Web Preservation produced by the JISC-PoWR project, which ran from April to November 2008. The handbook specifically addresses digital preservation issues that are relevant to the UK HE/FE web management community. The project was undertaken jointly by UKOLN at the University of Bath and the ULCC Digital Archives department

    An Updated Portrait of the Portuguese Web

    This study presents an updated characterization of the Portuguese Web derived from a crawl of 48 million contents belonging to all media types (2.5 TB of data), performed in March 2008. The resulting data was analyzed to characterize contents, sites and domains. This study was performed within the scope of the Portuguese Web Archive. POSC/EU, UMI

    Web Archiving in the UK: Current Developments and Reflections for the Future

    This work presents a brief overview of the history of Web archiving projects in some English-speaking countries, paying particular attention to the development and main problems faced by the UK Web Archive Consortium (UKWAC) and UK Web Archive partnership in Britain. It highlights, in particular, the changeable nature of Web pages through constant content removal and/or alteration and the evolving technological innovations brought recently by Web 2.0 applications, discussing how these factors have an impact on Web archiving projects. It also examines different collecting approaches, harvesting software limitations and how the current copyright and deposit regulations in the UK covering digital contents are failing to support Web archive projects in the country. From the perspective of users’ access, this dissertation offers an analysis of UK Web archive interfaces, identifying their main drawbacks and suggesting how these could be further improved in order to better respond to users’ information needs and access to archived Web content

    Digital archives : comparative study and interoperability framework

    Internship carried out at ParadigmaXis and supervised by Eng. Filipe Correia. Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 200

    Building a New Infrastructure for Digital Media: Northwestern University Library

    The Northwestern University Library has been a pioneer in text and media digitization. From early efforts primarily focused on enhancing access to reserve material to current projects involving vast quantities of streaming media, in great part these projects have been the result of close collaboration between the library and other units on campus, particularly Academic Technologies. As the depth and breadth of digitization efforts have increased, so have the technological and organizational issues. This article examines the history of digitization efforts at Northwestern University as a context for exploring the emerging issues most libraries face as digitization enters a new era

    The Feminist Library: “History is Herstory, Too”

    The Feminist Library is not a typical public library; it is an organization with roots in the historical revolution. Its history, services, and classification system are unique; its collection is irreplaceable. The purpose of this study is to document the history, resources, and organization of the Feminist Library in London, England

    The development of a set of principles for the through-life management of engineering information

    Belgium Herbarium image of Meise Botanic Garden