Search CORE

58,585 research outputs found

Technology and torts: memory costs, nondurable precautions and interference effects

Author: De Mot Jef
Depoorter Ben
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011
Field of study

Muppet: MapReduce-Style Processing of Fast Data

Author: Doan AnHai
Lam Wang
Liu Lu
Prasad STS
Rajaraman Anand
Vacheri Zoheb
Publication venue
Publication date: 01/01/2012
Field of study

MapReduce has emerged as a popular method to process big data. In the past few years, however, not just big data, but fast data has also exploded in volume and availability. Examples of such data include sensor data streams, the Twitter Firehose, and Facebook updates. Numerous applications must process fast data. Can we provide a MapReduce-style framework so that developers can quickly write such applications and execute them over a cluster of machines, to achieve low latency and high scalability? In this paper we report on our investigation of this question, as carried out at Kosmix and WalmartLabs. We describe MapUpdate, a framework like MapReduce, but specifically developed for fast data. We describe Muppet, our implementation of MapUpdate. Throughout the description we highlight the key challenges, argue why MapReduce is not well suited to address them, and briefly describe our current solutions. Finally, we describe our experience and lessons learned with Muppet, which has been used extensively at Kosmix and WalmartLabs to power a broad range of applications in social media and e-commerce.Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX

Fine-Grain Checkpointing with In-Cache-Line Logging

Author: Aksun David T.
Avni Hillel
Cohen Nachshon
Larus James R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 02/02/2019
Field of study

Non-Volatile Memory offers the possibility of implementing high-performance, durable data structures. However, achieving performance comparable to well-designed data structures in non-persistent (transient) memory is difficult, primarily because of the cost of ensuring the order in which memory writes reach NVM. Often, this requires flushing data to NVM and waiting a full memory round-trip time. In this paper, we introduce two new techniques: Fine-Grained Checkpointing, which ensures a consistent, quickly recoverable data structure in NVM after a system failure, and In-Cache-Line Logging, an undo-logging technique that enables recovery of earlier state without requiring cache-line flushes in the normal case. We implemented these techniques in the Masstree data structure, making it persistent and demonstrating the ease of applying them to a highly optimized system and their low (5.9-15.4\%) runtime overhead cost.Comment: In 2019 Architectural Support for Programming Languages and Operating Systems (ASPLOS 19), April 13, 2019, Providence, RI, US

arXiv.org e-Print Archive

Crossref