1,882 research outputs found

    A Framework for Aggregating Private and Public Web Archives

    Full text link
    Personal and private Web archives are proliferating due to the increase in the tools to create them and the realization that Internet Archive and other public Web archives are unable to capture personalized (e.g., Facebook) and private (e.g., banking) Web pages. We introduce a framework to mitigate issues of aggregation in private, personal, and public Web archives without compromising potential sensitive information contained in private captures. We amend Memento syntax and semantics to allow TimeMap enrichment to account for additional attributes to be expressed inclusive of the requirements for dereferencing private Web archive captures. We provide a method to involve the user further in the negotiation of archival captures in dimensions beyond time. We introduce a model for archival querying precedence and short-circuiting, as needed when aggregating private and personal Web archive captures with those from public Web archives through Memento. Negotiation of this sort is novel to Web archiving and allows for the more seamless aggregation of various types of Web archives to convey a more accurate picture of the past Web.Comment: Preprint version of the ACM/IEEE Joint Conference on Digital Libraries (JCDL 2018) full paper, accessible at the DO

    Provenance : from long-term preservation to query federation and grid reasoning

    Get PDF

    Remembering today tomorrow: exploring the human-centred design of digital mementos

    Get PDF
    This paper describes two-part research exploring the context for and human-centred design of ‘digital mementos’, as an example of technology for reflection on personal experience(in this case, autobiographical memories). Field studies into families’ use of physical and digital objects for remembering provided a rich understanding of associated user needs and human values, and suggested properties for ‘digital mementos’ such as being ‘not like work’, discoverable and fun. In a subsequent design study, artefacts were devised to express these features and develop the understanding of needs and values further via discussion with groups of potential ‘users’. ‘Critical artefacts’(the products of Critical Design)were used to enable participants to envisage broader possibilities for social practices and applications of technology in the context of personal remembering, and thus to engage in the design of novel devices and systems relevant to their lives. Reflection was a common theme in the work, being what the digital mementos were designed to afford and the mechanism by which the design activity progressed. Ideas for digital mementos formed the output of this research and expressed the designer’s and researcher’s understanding of participants’ practices and needs, and the human values that underlie them and, in doing so, suggest devices and systems that go beyond usability to support a broader conception of human activity

    Web Archive Services Framework for Tighter Integration Between the Past and Present Web

    Get PDF
    Web archives have contained the cultural history of the web for many years, but they still have a limited capability for access. Most of the web archiving research has focused on crawling and preservation activities, with little focus on the delivery methods. The current access methods are tightly coupled with web archive infrastructure, hard to replicate or integrate with other web archives, and do not cover all the users\u27 needs. In this dissertation, we focus on the access methods for archived web data to enable users, third-party developers, researchers, and others to gain knowledge from the web archives. We build ArcSys, a new service framework that extracts, preserves, and exposes APIs for the web archive corpus. The dissertation introduces a novel categorization technique to divide the archived corpus into four levels. For each level, we will propose suitable services and APIs that enable both users and third-party developers to build new interfaces. The first level is the content level that extracts the content from the archived web data. We develop ArcContent to expose the web archive content processed through various filters. The second level is the metadata level; we extract the metadata from the archived web data and make it available to users. We implement two services, ArcLink for temporal web graph and ArcThumb for optimizing the thumbnail creation in the web archives. The third level is the URI level that focuses on using the URI HTTP redirection status to enhance the user query. Finally, the highest level in the web archiving service framework pyramid is the archive level. In this level, we define the web archive by the characteristics of its corpus and building Web Archive Profiles. The profiles are used by the Memento Aggregator for query optimization

    MementoMap: An Archive Profile Dissemination Framework

    Get PDF
    We introduce MementoMap, a framework to express and disseminate holdings of web archives (archive profiles) by themselves or third parties. The framework allows arbitrary, flexible, and dynamic levels of details in its entries that fit the needs of archives of different scales. This enables Memento aggregators to significantly reduce wasted traffic to web archives

    Aggregating Private and Public Web Archives Using the Mementity Framework

    Get PDF
    Web archives preserve the live Web for posterity, but the content on the Web one cares about may not be preserved. The ability to access this content in the future requires the assurance that those sites will continue to exist on the Web until the content is requested and that the content will remain accessible. It is ultimately the responsibility of the individual to preserve this content, but attempting to replay personally preserved pages segregates archived pages by individuals and organizations of personal, private, and public Web content. This is misrepresentative of the Web as it was. While the Memento Framework may be used for inter-archive aggregation, no dynamics exist for the special consideration needed for the contents of these personal and private captures. In this work we introduce a framework for aggregating private and public Web archives. We introduce three mementities that serve the roles of the aforementioned aggregation, access control to personal Web archives, and negotiation of Web archives in dimensions beyond time, inclusive of the dimension of privacy. These three mementities serve as the foundation of the Mementity Framework. We investigate the difficulties and dynamics of preserving, replaying, aggregating, propagating, and collaborating with live Web captures of personal and private content. We offer a systematic solution to these outstanding issues through the application of the framework. We ensure the framework\u27s applicability beyond the use cases we describe as well as the extensibility of reusing the mementities for currently unforeseen access patterns. We evaluate the framework by justifying the mementity design decisions, formulaically abstracting the anticipated temporal and spatial costs, and providing reference implementations, usage, and examples for the framework
    • …
    corecore