30 research outputs found

    ΠœΠΈΡ€ΠΎΠ²Ρ‹Π΅ Ρ‚Π΅Π½Π΄Π΅Π½Ρ†ΠΈΠΈ развития Π²Π΅Π±-Π°Ρ€Ρ…ΠΈΠ²ΠΎΠ² Π±ΠΈΠ±Π»ΠΈΠΎΡ‚Π΅ΠΊ [World trends in the development of library web archives]

    The paper substantiates the need to study and promote web archiving for long-term information preservation and future accessibility. It surveys the technologies currently used in web archiving and identifies the problems posed by the dynamic nature of the web, errors, and the complexity of content to be preserved. Successful web-archiving experience of libraries around the world is discussed (selection, search, and description technologies, access terms, etc.). The findings show that web archives are selected to supplement libraries' existing digital collections on topical subjects, such as COVID-19, or to meet the needs of specific user groups. For the purpose of cultural heritage preservation, national libraries most often focus on harvesting websites by the domains of their respective countries. University libraries concentrate on acquiring web archives that serve the research and educational needs of their institutions' users, while public libraries favor resources reflecting the life of their local communities. The experience reviewed may be used by other libraries worldwide in developing their digital collections.

    Avoiding Zombies in Archival Replay Using ServiceWorker

    [First paragraph] A Composite Memento is an archived representation of a web page together with all of its page requisites, such as images and stylesheets. Each embedded resource has its own URI and is therefore archived independently. For meaningful archival replay, it is important to load all page requisites from the archive within the temporal neighborhood of the base HTML page. To achieve this, archival replay systems try to rewrite every resource reference to the appropriate archived version before serving HTML, CSS, or JavaScript. However, effective server-side URL rewriting is difficult when URLs are generated dynamically by JavaScript. A failure to rewrite a URL correctly may yield an invalid or unintended URI, or resolve to a live resource. Such live resources, leaking into a composite memento, are called zombies.
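
    The following is a minimal, illustrative sketch of the client-side interception idea: a ServiceWorker re-routes every request issued by a replayed page back into the archive, so URLs composed dynamically by JavaScript cannot leak to the live web. The /web/<datetime>/<url> path layout and all identifiers are assumptions for illustration, not the interface of any particular replay system.

```ts
/// <reference lib="webworker" />
// sw.ts -- sketch of archive-side request interception in a ServiceWorker.
// Assumption: mementos are replayed under /web/<14-digit-datetime>/<original-url>.
declare const self: ServiceWorkerGlobalScope;
export {};

const ARCHIVE_ORIGIN = self.location.origin;   // the replay system's own origin
const REPLAY_PREFIX = /^\/web\/(\d{14})\//;    // e.g. /web/20240101120000/

self.addEventListener('fetch', (event: FetchEvent) => {
  const requestUrl = new URL(event.request.url);

  // Requests already addressed to the archive need no rewriting.
  if (requestUrl.origin === ARCHIVE_ORIGIN && REPLAY_PREFIX.test(requestUrl.pathname)) {
    return; // fall through to the default network fetch
  }

  // Recover the datetime of the page being replayed from its own URI-M.
  // (Using the referrer is a simplification; a fuller implementation would
  // resolve the requesting client via event.clientId.)
  const clientUrl = new URL(event.request.referrer || self.location.href);
  const match = clientUrl.pathname.match(REPLAY_PREFIX);
  if (!match) {
    return; // not a replay context; leave the request alone
  }

  // Re-root the live-web URL under the archive so the embedded resource is
  // served from the temporal neighborhood of the base page instead of
  // becoming a "zombie".
  const rewritten = `${ARCHIVE_ORIGIN}/web/${match[1]}/${requestUrl.href}`;
  event.respondWith(fetch(rewritten));
});
```

    Because interception happens in the browser at request time, even URLs assembled by JavaScript at runtime are caught, which server-side rewriting alone cannot guarantee.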

    Client-Assisted Memento Aggregation Using The Prefer Header

    [First paragraph] Preservation of the Web ensures that future generations have a picture of how the web was. Web archives such as the Internet Archive's Wayback Machine, WebCite, and archive.is allow individuals to submit URIs to be archived, but the captures they preserve then reside at those archives. Traversing these captures in time as preserved by multiple archive sources (using Memento [8]) provides a more comprehensive picture of the past Web than relying on a single archive. Some content on the Web, such as content behind authentication, may be unsuitable or inaccessible for preservation by these organizations. Furthermore, such content may be inappropriate for the organizations to preserve for reasons of privacy or exposure of personally identifiable information [4]. However, preserving this content would ensure an even more comprehensive picture of the web and may be useful to future historians who wish to analyze content beyond the capability or suitability of archives created to preserve the public Web.
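
    As a hedged sketch of how a client might take part in aggregation, the snippet below issues a standard Memento TimeGate request (Accept-Datetime, RFC 7089) and adds an RFC 7240 Prefer header hinting which archives the aggregator should consult. The aggregator URL and the preference token are illustrative assumptions, not the exact syntax defined in this paper.

```ts
// Sketch of a client-assisted TimeGate lookup. Accept-Datetime is standard
// Memento (RFC 7089); Prefer (RFC 7240) carries the client's hint about
// which archives to include. The aggregator URL and the "include" token
// below are illustrative assumptions, not this paper's exact syntax.
// Runs under Node 18+ (global fetch) or in a browser.

const AGGREGATOR_TIMEGATE = 'https://aggregator.example.org/timegate/'; // hypothetical

async function findMemento(uriR: string, when: Date): Promise<string | null> {
  // Memento TimeGates conventionally take the URI-R appended verbatim.
  const response = await fetch(AGGREGATOR_TIMEGATE + uriR, {
    headers: {
      'Accept-Datetime': when.toUTCString(),
      // Hypothetical preference: also consult a private, client-designated archive.
      'Prefer': 'include="https://private-archive.example.org"',
    },
  });
  // The TimeGate redirects to the closest memento; after the redirect is
  // followed, response.url is the URI-M of that capture.
  return response.ok ? response.url : null;
}

// Example: findMemento('http://example.com/', new Date('2020-03-15T00:00:00Z'))
//   .then((uriM) => console.log('closest memento:', uriM));
```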

    MementoEmbed and Raintale for Web Archive Storytelling

    For traditional library collections, archivists can select a representative sample from a collection and display it in a featured physical or digital library space. Web archive collections may consist of thousands of archived pages, or mementos. How should an archivist display this sample to drive visitors to their collection? Search engines and social media platforms often represent web pages as cards consisting of text snippets, titles, and images. Web storytelling is a popular method for grouping these cards in order to summarize a topic. Unfortunately, social media platforms are not archive-aware and fail to consistently create a good experience for mementos. They also allow no UI alterations for their cards. Thus, we created MementoEmbed to generate cards for individual mementos and Raintale for creating entire stories that archivists can export to a variety of formats
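
    The sketch below illustrates the kind of archive-aware card such tools assemble for a single memento, and how a story is simply an ordered set of those cards. The interface and rendering are hypothetical and for illustration only; they are not MementoEmbed's or Raintale's actual APIs or output formats.

```ts
// Hypothetical surrogate ("card") for a single memento, plus a trivial
// story renderer. Field and function names are illustrative assumptions.

interface MementoCard {
  urim: string;            // URI of the archived capture (URI-M)
  originalUri: string;     // URI of the original, live resource (URI-R)
  mementoDatetime: string; // capture time, as reported by the archive
  archiveName: string;     // which archive holds the capture
  title: string;           // title extracted from the archived HTML
  snippet: string;         // short text excerpt for the card body
  imageUrl?: string;       // best image found in the capture, if any
}

// Render one card as a self-contained HTML fragment that a story can embed.
function renderCard(card: MementoCard): string {
  const img = card.imageUrl ? `<img src="${card.imageUrl}" alt="">` : '';
  return `
    <article class="memento-card">
      ${img}
      <h3><a href="${card.urim}">${card.title}</a></h3>
      <p>${card.snippet}</p>
      <footer>${card.archiveName} &middot; ${card.mementoDatetime}
        &middot; <a href="${card.originalUri}">original</a></footer>
    </article>`;
}

// A story is an ordered sequence of cards joined into one page section.
const renderStory = (cards: MementoCard[]): string =>
  `<section class="story">${cards.map(renderCard).join('\n')}</section>`;
```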

    It is Hard to Compute Fixity on Archived Web Pages

    [Introduction] Checking fixity in web archives is performed to ensure that archived resources, or mementos (denoted by URI-M), have remained unaltered since they were captured. The final report of the PREMIS Working Group [2] defines fixity information as information used to verify whether an object has been altered in an undocumented or unauthorized way. The common technique for checking fixity is to generate a current hash value (i.e., a message digest or checksum) for a file using a cryptographic hash function (e.g., SHA-256) and compare it to the hash value generated originally. If the hash values differ, the file has been changed, whether maliciously or not. We implicitly trust content delivered by web archives, but with the current trend toward extended use of other public and private web archives, we should consider the question of the validity of archived web pages. Most web archives do not allow users to retrieve fixity information. More importantly, even when fixity information is accessible, it is provided by the same archive that delivers the content. Part of our research is dedicated to establishing and checking the fixity of archived resources under the following requirements: (1) any user, not only the archive, can generate fixity information; and (2) fixity information can be generated on the memento's playback.
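
    A minimal sketch of the basic fixity check described above: download a memento, hash its bytes with SHA-256, and compare the result against a digest recorded earlier. Where the reference digest comes from is left abstract; the paper's requirements (fixity generated by any party, and generated on the memento's playback) go beyond this simple client.

```ts
// Minimal fixity-check sketch (Node 18+: global fetch, node:crypto).
// Assumption: a trustworthy reference digest for the URI-M was recorded
// earlier and is supplied by the caller.

import { createHash } from 'node:crypto';

async function sha256OfMemento(urim: string): Promise<string> {
  const response = await fetch(urim);
  if (!response.ok) {
    throw new Error(`fetch failed (${response.status}) for ${urim}`);
  }
  const bytes = new Uint8Array(await response.arrayBuffer());
  return createHash('sha256').update(bytes).digest('hex');
}

async function checkFixity(urim: string, recordedDigest: string): Promise<boolean> {
  const currentDigest = await sha256OfMemento(urim);
  return currentDigest === recordedDigest; // false => changed since recording
}
```

    Note that replay systems rewrite links and inject banners, so the bytes returned for a URI-M can differ between requests even when the underlying capture is unchanged, which is part of why computing fixity on archived pages is hard.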

    SHARI: An Integration of Tools to Visualize the Story of the Day

    Tools such as Google News and Flipboard exist to convey daily news, but what about the news of the past? In this paper, we describe how to combine several existing tools and web archive holdings to convey the "biggest story" for a given date in the past. StoryGraph clusters news articles together to identify a common news story. Hypercane leverages ArchiveNow to store the URLs produced by StoryGraph in web archives. Hypercane then analyzes these URLs to identify the most common terms, entities, and highest-quality images for social media storytelling. Raintale takes the output of these tools to produce a visualization of the news story for a given day. We name this process SHARI (StoryGraph Hypercane ArchiveNow Raintale Integration). With SHARI, a user can visualize the articles belonging to a past date's news story.
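
    The data flow of that pipeline can be sketched as follows. Each stage function is a placeholder standing in for the real tool named in the comment; the types and signatures are assumptions for illustration and are not the actual interfaces of StoryGraph, ArchiveNow, Hypercane, or Raintale.

```ts
// Illustrative sketch of the SHARI data flow. The four stage functions are
// passed in as parameters because the real tools (StoryGraph, ArchiveNow,
// Hypercane, Raintale) are external programs; the types below are
// assumptions for illustration, not those tools' actual interfaces.

interface StorySummary {
  terms: string[];     // most common terms across the story's articles
  entities: string[];  // named entities mentioned in the articles
  imageUrls: string[]; // highest-quality images for social cards
}

interface ShariStages {
  largestStoryFor(date: Date): Promise<string[]>;                 // StoryGraph: cluster articles, return URI-Rs
  archive(uriRs: string[]): Promise<string[]>;                    // ArchiveNow (via Hypercane): URI-Rs -> URI-Ms
  summarize(uriMs: string[]): Promise<StorySummary>;              // Hypercane: terms, entities, images
  renderStory(uriMs: string[], s: StorySummary): Promise<string>; // Raintale: the final visualization
}

async function shari(stages: ShariStages, date: Date): Promise<string> {
  const uriRs = await stages.largestStoryFor(date);   // 1. find the day's biggest story
  const uriMs = await stages.archive(uriRs);          // 2. ensure every article is archived
  const summary = await stages.summarize(uriMs);      // 3. distill terms, entities, images
  return stages.renderStory(uriMs, summary);          // 4. render the story page
}
```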

    Provenance: from long-term preservation to query federation and grid reasoning
