1,374 research outputs found

    A Guide to Distributed Digital Preservation

    Get PDF
    This volume is devoted to the broad topic of distributed digital preservation, a still-emerging field of practice for the cultural memory arena. Replication and distribution hold out the promise of indefinite preservation of materials without degradation, but establishing effective organizational and technical processes to enable this form of digital preservation is daunting. Institutions need practical examples of how this task can be accomplished in manageable, low-cost ways."--P. [4] of cove

    Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture

    Full text link
    We present the architecture behind Twitter's real-time related query suggestion and spelling correction service. Although these tasks have received much attention in the web search literature, the Twitter context introduces a real-time "twist": after significant breaking news events, we aim to provide relevant results within minutes. This paper provides a case study illustrating the challenges of real-time data processing in the era of "big data". We tell the story of how our system was built twice: our first implementation was built on a typical Hadoop-based analytics stack, but was later replaced because it did not meet the latency requirements necessary to generate meaningful real-time results. The second implementation, which is the system deployed in production, is a custom in-memory processing engine specifically designed for the task. This experience taught us that the current typical usage of Hadoop as a "big data" platform, while great for experimentation, is not well suited to low-latency processing, and points the way to future work on data analytics platforms that can handle "big" as well as "fast" data

    Recovering Residual Forensic Data from Smartphone Interactions with Cloud Storage Providers

    Full text link
    There is a growing demand for cloud storage services such as Dropbox, Box, Syncplicity and SugarSync. These public cloud storage services can store gigabytes of corporate and personal data in remote data centres around the world, which can then be synchronized to multiple devices. This creates an environment which is potentially conducive to security incidents, data breaches and other malicious activities. The forensic investigation of public cloud environments presents a number of new challenges for the digital forensics community. However, it is anticipated that end-devices such as smartphones, will retain data from these cloud storage services. This research investigates how forensic tools that are currently available to practitioners can be used to provide a practical solution for the problems related to investigating cloud storage environments. The research contribution is threefold. First, the findings from this research support the idea that end-devices which have been used to access cloud storage services can be used to provide a partial view of the evidence stored in the cloud service. Second, the research provides a comparison of the number of files which can be recovered from different versions of cloud storage applications. In doing so, it also supports the idea that amalgamating the files recovered from more than one device can result in the recovery of a more complete dataset. Third, the chapter contributes to the documentation and evidentiary discussion of the artefacts created from specific cloud storage applications and different versions of these applications on iOS and Android smartphones

    Mixed-mode multicore reliability

    Get PDF
    Future processors are expected to observe increasing rates of hardware faults. Using Dual-Modular Redundancy (DMR), two cores of a multicore can be loosely coupled to redundantly execute a single software thread, providing very high coverage from many difference sources of faults. This reliability, however, comes at a high price in terms of per-thread IPC and overall system throughput. We make the observation that a user may want to run both applications requiring high reliability, such as financial software, and more fault tolerant applications requiring high performance, such as media or web software, on the same machine at the same time. Yet a traditional DMR system must fully operate in redundant mode whenever any application requires high reliability. This paper proposes a Mixed-Mode Multicore (MMM), which enables most applications, including the system software, to run with high reliability in DMR mode, while applications that need high performance can avoid the penalty of DMR. Though conceptually simple, two key challenges arise: 1) care must be taken to protect reliable applications from any faults occurring to applications running in high performance mode, and 2) the desire to execute additional independent software threads for a performance application complicates the scheduling of computation to cores. After solving these issues, an MMM is shown to improve overall system performance, compared to a traditional DMR system, by approximately 2X when one reliable and one performance application are concurrently executing

    Performance comparison of cache coherence protocol on multi-core architecture

    Get PDF
    Number of cores in multi-core processors is steadily increased to make it faster and more reliable. Increasing the number of cores comes with a numerous issues that need to be addressed. In this dissertation we looked at the cache coherence issue, its importance and solution. Cache coherence is important as two or more cores sharing the same data must maintain the recent updated value to avoid reading of stale value. We have made an extensive study of existing cache coherence methods, such as Snoopy coherence technique and Directory coherence technique. Snoopy coherence technique is studied with the help of MOESI coherence protocol and Directory coherence technique is observed with the help of MI, MESI TWO LEVEL, MESI THREE LEVEL, MOESI, and MOESI TOKEN coherence protocol. We have used GEM5 simulator and Splash-2 benchmark to compare their performance. For simulation a precompiled program called MemTest, Ruby random tester, and Splash-2 suite is used. It is observed that the performance is improved as we move from MI, MESI TWO LEVEL, MESI THREE LEVEL, MOESI, and MOESI TOKEN in Directory coherence technique and for Snoopy coherence we observed the performance through varying parameters like, cache size, block size and associativity. It is also observed that that adding L3 level cache the performance of MESI Three Level is improved over MESI Two Level
    corecore