Search CORE

9,414 research outputs found

Flexible Sensor Network Reprogramming for Logistics

Author: Evers L.
Havinga P.J.M.
Kuper J.
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2007
Field of study

Besides the currently realized applications, Wireless Sensor Networks can be put to use in logistics processes. However, doing so requires a level of flexibility and safety not provided by the current WSN software platforms. This paper discusses a logistics scenario, and presents SensorScheme, a runtime environment used to realize this scenario, based on semantics of the Scheme programming language. SensorScheme is a general purpose WSN platform, providing dynamic reprogramming, memory safety (sandboxing), blocking I/O, marshalled communication, compact code transport. It improves on the state of the art by making better use of the little available memory, thereby providing greater capability in terms of program size and complexity. We illustrate the use of our platform with some application examples, and provide experimental results to show its compactness, speed of operation and energy efficiency

CiteSeerX

University of Twente Research Information

Prioritized Garbage Collection: Explicit GC Support for Software Caches

Author: Berger Emery D.
Guyer Samuel Z.
Nunez Diogenes
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 15/10/2016
Field of study

Programmers routinely trade space for time to increase performance, often in the form of caching or memoization. In managed languages like Java or JavaScript, however, this space-time tradeoff is complex. Using more space translates into higher garbage collection costs, especially at the limit of available memory. Existing runtime systems provide limited support for space-sensitive algorithms, forcing programmers into difficult and often brittle choices about provisioning. This paper presents prioritized garbage collection, a cooperative programming language and runtime solution to this problem. Prioritized GC provides an interface similar to soft references, called priority references, which identify objects that the collector can reclaim eagerly if necessary. The key difference is an API for defining the policy that governs when priority references are cleared and in what order. Application code specifies a priority value for each reference and a target memory bound. The collector reclaims references, lowest priority first, until the total memory footprint of the cache fits within the bound. We use this API to implement a space-aware least-recently-used (LRU) cache, called a Sache, that is a drop-in replacement for existing caches, such as Google's Guava library. The garbage collector automatically grows and shrinks the Sache in response to available memory and workload with minimal provisioning information from the programmer. Using a Sache, it is almost impossible for an application to experience a memory leak, memory pressure, or an out-of-memory crash caused by software caching.Comment: to appear in OOPSLA 201

arXiv.org e-Print Archive

Crossref

A Robust Fault-Tolerant and Scalable Cluster-wide Deduplication for Shared-Nothing Storage Systems

Author: Hamandawana Prince
Khan Awais
Kim Youngjae
Lee Chang-Gyu
Park Sungyong
Publication venue
Publication date: 20/03/2018
Field of study

Deduplication has been largely employed in distributed storage systems to improve space efficiency. Traditional deduplication research ignores the design specifications of shared-nothing distributed storage systems such as no central metadata bottleneck, scalability, and storage rebalancing. Further, deduplication introduces transactional changes, which are prone to errors in the event of a system failure, resulting in inconsistencies in data and deduplication metadata. In this paper, we propose a robust, fault-tolerant and scalable cluster-wide deduplication that can eliminate duplicate copies across the cluster. We design a distributed deduplication metadata shard which guarantees performance scalability while preserving the design constraints of shared- nothing storage systems. The placement of chunks and deduplication metadata is made cluster-wide based on the content fingerprint of chunks. To ensure transactional consistency and garbage identification, we employ a flag-based asynchronous consistency mechanism. We implement the proposed deduplication on Ceph. The evaluation shows high disk-space savings with minimal performance degradation as well as high robustness in the event of sudden server failure.Comment: 6 Pages including reference

arXiv.org e-Print Archive

Crossref

Incremental copying garbage collection for WAM-based Prolog systems

Author: Demoen Bart
Vandeginste Ruben
Publication venue
Publication date: 02/01/2006
Field of study

The design and implementation of an incremental copying heap garbage collector for WAM-based Prolog systems is presented. Its heap layout consists of a number of equal-sized blocks. Other changes to the standard WAM allow these blocks to be garbage collected independently. The independent collection of heap blocks forms the basis of an incremental collecting algorithm which employs copying without marking (contrary to the more frequently used mark&copy or mark&slide algorithms in the context of Prolog). Compared to standard semi-space copying collectors, this approach to heap garbage collection lowers in many cases the memory usage and reduces pause times. The algorithm also allows for a wide variety of garbage collection policies including generational ones. The algorithm is implemented and evaluated in the context of hProlog.Comment: 33 pages, 22 figures, 5 tables. To appear in Theory and Practice of Logic Programming (TPLP

arXiv.org e-Print Archive

Lirias

CiteSeerX

Large Scale Parallel Computations in R through Elemental

Author: Bientinesi Paolo
Canales Rodrigo
Peise Elmar
Publication venue
Publication date: 01/01/2016
Field of study

Even though in recent years the scale of statistical analysis problems has increased tremendously, many statistical software tools are still limited to single-node computations. However, statistical analyses are largely based on dense linear algebra operations, which have been deeply studied, optimized and parallelized in the high-performance-computing community. To make high-performance distributed computations available for statistical analysis, and thus enable large scale statistical computations, we introduce RElem, an open source package that integrates the distributed dense linear algebra library Elemental into R. While on the one hand, RElem provides direct wrappers of Elemental's routines, on the other hand, it overloads various operators and functions to provide an entirely native R experience for distributed computations. We showcase how simple it is to port existing R programs to Relem and demonstrate that Relem indeed allows to scale beyond the single-node limitation of R with the full performance of Elemental without any overhead.Comment: 16 pages, 5 figure

arXiv.org e-Print Archive

Publikationsserver der RWTH Aachen University

Persistent Memory Programming Abstractions in Context of Concurrent Applications

Author: Shapiro Marc
Singh Ajay
Thomas Gael
Publication venue
Publication date: 13/12/2017
Field of study

The advent of non-volatile memory (NVM) technologies like PCM, STT, memristors and Fe-RAM is believed to enhance the system performance by getting rid of the traditional memory hierarchy by reducing the gap between memory and storage. This memory technology is considered to have the performance like that of DRAM and persistence like that of disks. Thus, it would also provide significant performance benefits for big data applications by allowing in-memory processing of large data with the lowest latency to persistence. Leveraging the performance benefits of this memory-centric computing technology through traditional memory programming is not trivial and the challenges aggravate for parallel/concurrent applications. To this end, several programming abstractions have been proposed like NVthreads, Mnemosyne and intel's NVML. However, deciding upon a programming abstraction which is easier to program and at the same time ensures the consistency and balances various software and architectural trade-offs is openly debatable and active area of research for NVM community. We study the NVthreads, Mnemosyne and NVML libraries by building a concurrent and persistent set and open addressed hash-table data structure application. In this process, we explore and report various tradeoffs and hidden costs involved in building concurrent applications for persistence in terms of achieving efficiency, consistency and ease of programming with these NVM programming abstractions. Eventually, we evaluate the performance of the set and hash-table data structure applications. We observe that NVML is easiest to program with but is least efficient and Mnemosyne is most performance friendly but involves significant programming efforts to build concurrent and persistent applications.Comment: Accepted in HiPC SRS 201

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL-Rennes 1

Parallel processing and expert systems

Author: Lau Sonie
Yan Jerry C.
Publication venue
Publication date
Field of study

Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited

NASA Technical Reports Server