39,412 research outputs found

CRAFT: A library for easier application-level Checkpoint/Restart and Automatic Fault Tolerance

Author: Hager Georg
Kreutzer Moritz
Shahzad Faisal
Thies Jonas
Wellein Gerhard
Zeiser Thomas
Publication date: 07/08/2017

In order to efficiently use the future generations of supercomputers, fault tolerance and power consumption are two of the prime challenges anticipated by the High Performance Computing (HPC) community. Checkpoint/Restart (CR) has been and still is the most widely used technique to deal with hard failures. Application-level CR is the most efficient CR technique in terms of overhead, but it requires considerable implementation effort. This work presents the implementation of our C++ based library CRAFT (Checkpoint-Restart and Automatic Fault Tolerance), which serves two purposes. First, it provides an extendable library that significantly eases the implementation of application-level checkpointing. The most basic and frequently used checkpoint data types are already part of CRAFT and can be used directly out of the box. The library can be easily extended to add more data types. As a means of overhead reduction, the library offers a built-in asynchronous checkpointing mechanism and also supports the Scalable Checkpoint/Restart (SCR) library for node-level checkpointing. Second, CRAFT provides an easier interface for User-Level Failure Mitigation (ULFM) based dynamic process recovery, which significantly reduces the complexity and effort of failure detection and communication recovery. By utilizing both functionalities together, applications can write application-level checkpoints and recover dynamically from process failures with very limited programming effort. This work presents the design and use of our library in detail. The associated overheads are thoroughly analyzed using several benchmarks.
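
The general idea of application-level checkpoint/restart described in this abstract can be illustrated with a minimal sketch. It is written in Python for brevity and does not use or imitate the CRAFT API, which is a C++ library. The function names (`write_checkpoint`, `read_checkpoint`, `run`), the JSON file format, and the `fail_at` parameter used to simulate a failure are all assumptions made for illustration.

```python
import json
import os
import tempfile


def write_checkpoint(path, state):
    """Write the state to a temporary file, then atomically rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # a crash mid-write never leaves a partial checkpoint


def read_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


def run(path, n_iters, cp_freq, fail_at=None):
    """Sum 0..n_iters-1, checkpointing every cp_freq iterations.

    fail_at (hypothetical) raises an error at that iteration to mimic a failure.
    """
    state = read_checkpoint(path) or {"iteration": 0, "total": 0}
    for i in range(state["iteration"], n_iters):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated failure")
        state["total"] += i
        state["iteration"] = i + 1
        if state["iteration"] % cp_freq == 0:
            write_checkpoint(path, state)
    return state["total"]
```

Calling `run` a second time with the same path after a failure resumes from the last checkpointed iteration instead of from zero, so only the work done since that checkpoint is repeated.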

arXiv.org e-Print Archive

Institute of Transport Research:Publications

LogBase: A Scalable Log-structured Database System in the Cloud

Author: Agrawal Divyakant
Chen Gang
Ooi Beng Chin
Vo Hoang Tam
Wang Sheng
Publication date: 01/01/2012

Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. The shift from reads to writes in web applications has also been accelerating in recent years. Write-ahead-logging is a common approach for providing recovery capability while improving performance in most storage systems. However, the separation of log and application data incurs write overheads observed in write-heavy environments and hence adversely affects the write throughput and recovery time in the system. In this paper, we introduce LogBase - a scalable log-structured database system that adopts log-only storage for removing the write bottleneck and supporting fast system recovery. LogBase is designed to be dynamically deployed on commodity clusters to take advantage of the elastic scaling property of cloud environments. LogBase provides in-memory multiversion indexes for supporting efficient access to data maintained in the log. LogBase also supports transactions that bundle read and write operations spanning multiple records. We implemented the proposed system and compared it with HBase and a disk-based log-structured record-oriented system modeled after RAMCloud. The experimental results show that LogBase is able to provide sustained write throughput, efficient data access out of the cache, and effective system recovery. Comment: VLDB201
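
As a rough illustration of the two ideas named in this abstract, log-only storage and an in-memory multiversion index, the following Python sketch keeps every write in an append-only list and indexes it by key and version. It is not LogBase's implementation: the class name `LogOnlyStore`, the integer version counter, and the in-memory list standing in for a persistent log are all assumptions.

```python
class LogOnlyStore:
    """Toy log-only store: the log is the only copy of the data."""

    def __init__(self):
        self.log = []     # append-only list of (key, version, value) records
        self.index = {}   # key -> list of (version, log offset), ascending by version
        self.clock = 0    # hypothetical monotonically increasing version counter

    def put(self, key, value):
        """Append a new version of key to the log and index it."""
        self.clock += 1
        offset = len(self.log)
        self.log.append((key, self.clock, value))
        self.index.setdefault(key, []).append((self.clock, offset))
        return self.clock

    def get(self, key, as_of=None):
        """Return the newest value of key, or the newest one with version <= as_of."""
        for version, offset in reversed(self.index.get(key, [])):
            if as_of is None or version <= as_of:
                return self.log[offset][2]
        return None

    def rebuild_index(self):
        """Recovery: reconstruct the in-memory index with one scan of the log."""
        self.index = {}
        self.clock = 0
        for offset, (key, version, _value) in enumerate(self.log):
            self.index.setdefault(key, []).append((version, offset))
            self.clock = max(self.clock, version)
```

Because writes only append and never update in place, a write costs one log append plus one index update, and recovery in this sketch needs nothing beyond the log itself.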

arXiv.org e-Print Archive

CiteSeerX

ScholarBank@NUS

New Hampshire University Research and Industry Plan: A Roadmap for Collaboration and Innovation

Author: Keen Point Consulting
TEConomy Partners
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 01/01/2016

This University Research and Industry plan for New Hampshire is focused on accelerating innovation-led development in the state by partnering academia’s strengths with the state’s substantial base of existing and emerging advanced industries. These advanced industries are defined by their deep investment and connections to research and development and the high-quality jobs they generate across production, new product development and administrative positions involving skills in science, technology, engineering and math (STEM).

UNH Scholars' Repository

Edge vulnerability in neural and metabolic networks

Author: Hilgetag Claus C.
Kaiser Marcus
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/03/2004

Biological networks, such as cellular metabolic pathways or networks of corticocortical connections in the brain, are intricately organized, yet remarkably robust toward structural damage. Whereas many studies have investigated specific aspects of robustness, such as molecular mechanisms of repair, this article focuses more generally on how local structural features in networks may give rise to their global stability. In many networks the failure of single connections may be more likely than the extinction of entire nodes, yet no analysis of edge importance (edge vulnerability) has been provided so far for biological networks. We tested several measures for identifying vulnerable edges and compared their prediction performance in biological and artificial networks. Among the tested measures, edge frequency in all shortest paths of a network yielded a particularly high correlation with vulnerability, and identified inter-cluster connections in biological but not in random and scale-free benchmark networks. We discuss different local and global network patterns and the edge vulnerability resulting from them. Comment: 8 pages, 4 figures, to appear in Biological Cybernetic
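
The measure highlighted in this abstract, edge frequency in all shortest paths, can be sketched for a small unweighted, undirected graph as follows. This is an illustrative Python sketch and not the authors' code: it uses a breadth-first, Brandes-style accumulation, which is a swapped-in standard technique, and it splits credit evenly when a node pair has several shortest paths, which is an assumption about how ties are handled. The function name `edge_frequency` and the adjacency-dict input format are likewise assumptions.

```python
from collections import defaultdict, deque


def edge_frequency(adj):
    """For each undirected edge, sum its share of shortest paths over all node pairs.

    adj maps each node to a list of its neighbours (unweighted, undirected).
    """
    score = defaultdict(float)
    for s in adj:
        # Forward pass: BFS from s, counting shortest paths (sigma) to each node.
        dist = {s: 0}
        sigma = defaultdict(float)
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Backward pass: push path counts from the farthest nodes back toward s.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # Every unordered node pair was visited from both of its endpoints.
    return {edge: value / 2.0 for edge, value in score.items()}
```

For two triangles joined by a single bridge, `{0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}`, the bridge edge between nodes 2 and 3 scores 9.0, while the edge between nodes 0 and 1 inside a triangle scores 1.0. This matches the abstract's observation that the measure picks out inter-cluster connections.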

arXiv.org e-Print Archive

Crossref

Feedback-Aware Precoding for Millimeter Wave Massive MIMO Systems

Author: Burg Andreas
Ghanaatian Reza
Jamali Vahid
Schober Robert
Publication date: 28/06/2019

Millimeter wave (mmWave) communication is a promising solution for coping with the ever-increasing mobile data traffic because of its large bandwidth. To enable a sufficient link margin, a large antenna array employing directional beamforming, which is enabled by the availability of channel state information at the transmitter (CSIT), is required. However, CSIT acquisition for mmWave channels introduces a huge feedback overhead due to the typically large number of transmit and receive antennas. Leveraging properties of mmWave channels, this paper proposes a precoding strategy which enables a flexible adjustment of the feedback overhead. In particular, the optimal unconstrained precoder is approximated by selecting a variable number of elements from a basis that is constructed as a function of the transmitter array response, where the number of selected basis elements can be chosen according to the feedback constraint. Simulation results show that the proposed precoding scheme can provide a near-optimal solution if a higher feedback overhead can be afforded. For a low overhead, it can still provide a good approximation of the optimal precoder. Comment: 7 pages, 5 figures, to appear at the IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC) 201

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

A load-sharing architecture for high performance optimistic simulations on multi-core machines

Author: Pellegrini Alessandro
Quaglia Francesco
Vitali Roberto
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

In Parallel Discrete Event Simulation (PDES), the simulation model is partitioned into a set of distinct Logical Processes (LPs) which are allowed to concurrently execute simulation events. In this work we present an innovative approach to load-sharing on multi-core/multiprocessor machines, targeted at the optimistic PDES paradigm, where LPs are speculatively allowed to process simulation events with no preventive verification of causal consistency, and actual consistency violations (if any) are recovered via rollback techniques. In our approach, each simulation kernel instance, in charge of hosting and executing a specific set of LPs, runs a set of worker threads, which can be dynamically activated/deactivated on the basis of a distributed algorithm. The latter relies in turn on an analytical model that provides indications on how to reassign processor/core usage across the kernels in order to handle the simulation workload as efficiently as possible. We also present a real implementation of our load-sharing architecture within the ROme OpTimistic Simulator (ROOT-Sim), namely an open-source C-based simulation platform implemented according to the PDES paradigm and the optimistic synchronization approach. Experimental results for an assessment of the validity of our proposal are presented as well.

Crossref

ART

Archivio della ricerca- Università di Roma La Sapienza