30 research outputs found

    ΠœΠΈΡ€ΠΎΠ²Ρ‹Π΅ Ρ‚Π΅Π½Π΄Π΅Π½Ρ†ΠΈΠΈ развития Π²Π΅Π±-Π°Ρ€Ρ…ΠΈΠ²ΠΎΠ² Π±ΠΈΠ±Π»ΠΈΠΎΡ‚Π΅ΠΊ [World trends in the development of library web archives]

    The paper substantiates the need to study and promote web archiving for long-term information preservation and future accessibility. It surveys the technologies currently used in web archiving and identifies the problems posed by the dynamic nature of the web, errors, and the complexity of content to be preserved. Successful web-archiving experience of libraries around the world is discussed (selection, search, and description technologies, access terms, etc.). The findings show that web archives are selected to supplement libraries' existing digital collections on topical subjects, such as COVID-19, or to meet the needs of specific user groups. For the purpose of cultural heritage preservation, national libraries most often focus on harvesting websites by the domains of their respective countries. University libraries concentrate on acquiring web archives that serve the research and educational needs of their institutions' users, while public libraries favor resources reflecting the life of their local communities. The experience reviewed may be used by other libraries worldwide in developing their digital collections.

    Avoiding Zombies in Archival Replay Using ServiceWorker

    [First paragraph] A Composite Memento is an archived representation of a web page together with all of its page requisites, such as images and stylesheets. Each embedded resource has its own URI and is therefore archived independently. For meaningful archival replay, it is important to load all page requisites from the archive within the temporal neighborhood of the base HTML page. To achieve this, archival replay systems try to rewrite every resource reference to the appropriate archived version before serving HTML, CSS, or JavaScript. However, effective server-side URL rewriting is difficult when URLs are generated dynamically by JavaScript. A failure to rewrite a URL correctly may yield an invalid or unintended URI, or resolve to a live resource. Such live resources, leaking into a composite memento, are called zombies.
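
    The following is a minimal, illustrative sketch of the client-side interception idea: a ServiceWorker re-routes every request issued by a replayed page back into the archive, so URLs composed dynamically by JavaScript cannot leak to the live web. The /web/<datetime>/<url> path layout and all identifiers are assumptions for illustration, not the interface of any particular replay system.

```ts
/// <reference lib="webworker" />
// sw.ts -- sketch of archive-side request interception in a ServiceWorker.
// Assumption: mementos are replayed under /web/<14-digit-datetime>/<original-url>.
declare const self: ServiceWorkerGlobalScope;
export {};

const ARCHIVE_ORIGIN = self.location.origin;   // the replay system's own origin
const REPLAY_PREFIX = /^\/web\/(\d{14})\//;    // e.g. /web/20240101120000/

self.addEventListener('fetch', (event: FetchEvent) => {
  const requestUrl = new URL(event.request.url);

  // Requests already addressed to the archive need no rewriting.
  if (requestUrl.origin === ARCHIVE_ORIGIN && REPLAY_PREFIX.test(requestUrl.pathname)) {
    return; // fall through to the default network fetch
  }

  // Recover the datetime of the page being replayed from its own URI-M.
  // (Using the referrer is a simplification; a fuller implementation would
  // resolve the requesting client via event.clientId.)
  const clientUrl = new URL(event.request.referrer || self.location.href);
  const match = clientUrl.pathname.match(REPLAY_PREFIX);
  if (!match) {
    return; // not a replay context; leave the request alone
  }

  // Re-root the live-web URL under the archive so the embedded resource is
  // served from the temporal neighborhood of the base page instead of
  // becoming a "zombie".
  const rewritten = `${ARCHIVE_ORIGIN}/web/${match[1]}/${requestUrl.href}`;
  event.respondWith(fetch(rewritten));
});
```

    Because interception happens in the browser at request time, even URLs assembled by JavaScript at runtime are caught, which server-side rewriting alone cannot guarantee.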

    Client-Assisted Memento Aggregation Using The Prefer Header

    [First paragraph] Preservation of the Web ensures that future generations have a picture of how the web was. Web archives such as the Internet Archive's Wayback Machine, WebCite, and archive.is allow individuals to submit URIs to be archived, but the captures they preserve then reside at those archives. Traversing these captures in time as preserved by multiple archive sources (using Memento [8]) provides a more comprehensive picture of the past Web than relying on a single archive. Some content on the Web, such as content behind authentication, may be unsuitable or inaccessible for preservation by these organizations. Furthermore, such content may be inappropriate for the organizations to preserve for reasons of privacy or exposure of personally identifiable information [4]. However, preserving this content would ensure an even more comprehensive picture of the web and may be useful to future historians who wish to analyze content beyond the capability or suitability of archives created to preserve the public Web.
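
    As a hedged sketch of how a client might take part in aggregation, the snippet below issues a standard Memento TimeGate request (Accept-Datetime, RFC 7089) and adds an RFC 7240 Prefer header hinting which archives the aggregator should consult. The aggregator URL and the preference token are illustrative assumptions, not the exact syntax defined in this paper.

```ts
// Sketch of a client-assisted TimeGate lookup. Accept-Datetime is standard
// Memento (RFC 7089); Prefer (RFC 7240) carries the client's hint about
// which archives to include. The aggregator URL and the "include" token
// below are illustrative assumptions, not this paper's exact syntax.
// Runs under Node 18+ (global fetch) or in a browser.

const AGGREGATOR_TIMEGATE = 'https://aggregator.example.org/timegate/'; // hypothetical

async function findMemento(uriR: string, when: Date): Promise<string | null> {
  // Memento TimeGates conventionally take the URI-R appended verbatim.
  const response = await fetch(AGGREGATOR_TIMEGATE + uriR, {
    headers: {
      'Accept-Datetime': when.toUTCString(),
      // Hypothetical preference: also consult a private, client-designated archive.
      'Prefer': 'include="https://private-archive.example.org"',
    },
  });
  // The TimeGate redirects to the closest memento; after the redirect is
  // followed, response.url is the URI-M of that capture.
  return response.ok ? response.url : null;
}

// Example: findMemento('http://example.com/', new Date('2020-03-15T00:00:00Z'))
//   .then((uriM) => console.log('closest memento:', uriM));
```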

    MementoEmbed and Raintale for Web Archive Storytelling

    For traditional library collections, archivists can select a representative sample from a collection and display it in a featured physical or digital library space. Web archive collections may consist of thousands of archived pages, or mementos. How should an archivist display this sample to drive visitors to their collection? Search engines and social media platforms often represent web pages as cards consisting of text snippets, titles, and images. Web storytelling is a popular method for grouping these cards in order to summarize a topic. Unfortunately, social media platforms are not archive-aware and fail to consistently create a good experience for mementos. They also allow no UI alterations for their cards. Thus, we created MementoEmbed to generate cards for individual mementos and Raintale for creating entire stories that archivists can export to a variety of formats
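
    The sketch below illustrates the kind of archive-aware card such tools assemble for a single memento, and how a story is simply an ordered set of those cards. The interface and rendering are hypothetical and for illustration only; they are not MementoEmbed's or Raintale's actual APIs or output formats.

```ts
// Hypothetical surrogate ("card") for a single memento, plus a trivial
// story renderer. Field and function names are illustrative assumptions.

interface MementoCard {
  urim: string;            // URI of the archived capture (URI-M)
  originalUri: string;     // URI of the original, live resource (URI-R)
  mementoDatetime: string; // capture time, as reported by the archive
  archiveName: string;     // which archive holds the capture
  title: string;           // title extracted from the archived HTML
  snippet: string;         // short text excerpt for the card body
  imageUrl?: string;       // best image found in the capture, if any
}

// Render one card as a self-contained HTML fragment that a story can embed.
function renderCard(card: MementoCard): string {
  const img = card.imageUrl ? `<img src="${card.imageUrl}" alt="">` : '';
  return `
    <article class="memento-card">
      ${img}
      <h3><a href="${card.urim}">${card.title}</a></h3>
      <p>${card.snippet}</p>
      <footer>${card.archiveName} &middot; ${card.mementoDatetime}
        &middot; <a href="${card.originalUri}">original</a></footer>
    </article>`;
}

// A story is an ordered sequence of cards joined into one page section.
const renderStory = (cards: MementoCard[]): string =>
  `<section class="story">${cards.map(renderCard).join('\n')}</section>`;
```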

    It is Hard to Compute Fixity on Archived Web Pages

    [Introduction] Checking fixity in web archives is performed to ensure that archived resources, or mementos (denoted by URI-M), have remained unaltered since they were captured. The final report of the PREMIS Working Group [2] defines fixity information as information used to verify whether an object has been altered in an undocumented or unauthorized way. The common technique for checking fixity is to generate a current hash value (i.e., a message digest or checksum) for a file using a cryptographic hash function (e.g., SHA-256) and compare it to the hash value generated originally. If the hash values differ, the file has been changed, whether maliciously or not. We implicitly trust content delivered by web archives, but with the current trend toward extended use of other public and private web archives, we should consider the question of the validity of archived web pages. Most web archives do not allow users to retrieve fixity information. More importantly, even when fixity information is accessible, it is provided by the same archive that delivers the content. Part of our research is dedicated to establishing and checking the fixity of archived resources under the following requirements: (1) any user, not only the archive, can generate fixity information; and (2) fixity information can be generated on the memento's playback.
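
    A minimal sketch of the basic fixity check described above: download a memento, hash its bytes with SHA-256, and compare the result against a digest recorded earlier. Where the reference digest comes from is left abstract; the paper's requirements (fixity generated by any party, and generated on the memento's playback) go beyond this simple client.

```ts
// Minimal fixity-check sketch (Node 18+: global fetch, node:crypto).
// Assumption: a trustworthy reference digest for the URI-M was recorded
// earlier and is supplied by the caller.

import { createHash } from 'node:crypto';

async function sha256OfMemento(urim: string): Promise<string> {
  const response = await fetch(urim);
  if (!response.ok) {
    throw new Error(`fetch failed (${response.status}) for ${urim}`);
  }
  const bytes = new Uint8Array(await response.arrayBuffer());
  return createHash('sha256').update(bytes).digest('hex');
}

async function checkFixity(urim: string, recordedDigest: string): Promise<boolean> {
  const currentDigest = await sha256OfMemento(urim);
  return currentDigest === recordedDigest; // false => changed since recording
}
```

    Note that replay systems rewrite links and inject banners, so the bytes returned for a URI-M can differ between requests even when the underlying capture is unchanged, which is part of why computing fixity on archived pages is hard.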

    SHARI: An Integration of Tools to Visualize the Story of the Day

    Tools such as Google News and Flipboard exist to convey daily news, but what about the news of the past? In this paper, we describe how to combine several existing tools and web archive holdings to convey the "biggest story" for a given date in the past. StoryGraph clusters news articles together to identify a common news story. Hypercane leverages ArchiveNow to store the URLs produced by StoryGraph in web archives. Hypercane then analyzes these URLs to identify the most common terms, entities, and highest-quality images for social media storytelling. Raintale takes the output of these tools to produce a visualization of the news story for a given day. We name this process SHARI (StoryGraph Hypercane ArchiveNow Raintale Integration). With SHARI, a user can visualize the articles belonging to a past date's news story.
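
    The data flow of that pipeline can be sketched as follows. Each stage function is a placeholder standing in for the real tool named in the comment; the types and signatures are assumptions for illustration and are not the actual interfaces of StoryGraph, ArchiveNow, Hypercane, or Raintale.

```ts
// Illustrative sketch of the SHARI data flow. The four stage functions are
// passed in as parameters because the real tools (StoryGraph, ArchiveNow,
// Hypercane, Raintale) are external programs; the types below are
// assumptions for illustration, not those tools' actual interfaces.

interface StorySummary {
  terms: string[];     // most common terms across the story's articles
  entities: string[];  // named entities mentioned in the articles
  imageUrls: string[]; // highest-quality images for social cards
}

interface ShariStages {
  largestStoryFor(date: Date): Promise<string[]>;                 // StoryGraph: cluster articles, return URI-Rs
  archive(uriRs: string[]): Promise<string[]>;                    // ArchiveNow (via Hypercane): URI-Rs -> URI-Ms
  summarize(uriMs: string[]): Promise<StorySummary>;              // Hypercane: terms, entities, images
  renderStory(uriMs: string[], s: StorySummary): Promise<string>; // Raintale: the final visualization
}

async function shari(stages: ShariStages, date: Date): Promise<string> {
  const uriRs = await stages.largestStoryFor(date);   // 1. find the day's biggest story
  const uriMs = await stages.archive(uriRs);          // 2. ensure every article is archived
  const summary = await stages.summarize(uriMs);      // 3. distill terms, entities, images
  return stages.renderStory(uriMs, summary);          // 4. render the story page
}
```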

    Provenance: from long-term preservation to query federation and grid reasoning
