CRAFT: A library for easier application-level Checkpoint/Restart and Automatic Fault Tolerance
In order to efficiently use the future generations of supercomputers, fault
tolerance and power consumption are two of the prime challenges anticipated by
the High Performance Computing (HPC) community. Checkpoint/Restart (CR) has
been and still is the most widely used technique to deal with hard failures.
Application-level CR is the most effective CR technique in terms of overhead
efficiency but it takes a lot of implementation effort. This work presents the
implementation of our C++ based library CRAFT (Checkpoint-Restart and Automatic
Fault Tolerance), which serves two purposes. First, it provides an extendable
library that significantly eases the implementation of application-level
checkpointing. The most basic and frequently used checkpoint data types are
already part of CRAFT and can be directly used out of the box. The library can
be easily extended to add more data types. As a means of overhead reduction, the
library offers a built-in asynchronous checkpointing mechanism and also
supports the Scalable Checkpoint/Restart (SCR) library for node-level
checkpointing. Second, CRAFT provides an easier interface for User-Level
Failure Mitigation (ULFM) based dynamic process recovery, which significantly
reduces the complexity and effort of failure detection and communication
recovery mechanisms. By utilizing both functionalities together, applications
can write application-level checkpoints and recover dynamically from process
failures with very limited programming effort. This work presents the design
and use of our library in detail. The associated overheads are thoroughly
analyzed using several benchmarks.
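As a rough illustration of what application-level checkpoint/restart means in general, the following is a minimal Python sketch. It is not CRAFT's C++ API: the function names (`write_checkpoint`, `read_checkpoint`, `run`), the JSON file format, and the `fail_at` parameter used to simulate a failure are all assumptions made for this example.

```python
import json
import os
import tempfile


def write_checkpoint(path, state):
    """Write state as JSON to a temp file, then atomically rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace is atomic, so a crash never leaves a half-written checkpoint.
    os.replace(tmp, path)


def read_checkpoint(path):
    """Return the last saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


def run(path, n_iters, cp_freq, fail_at=None):
    """Sum 0..n_iters-1, checkpointing every cp_freq iterations.

    fail_at (hypothetical) raises at the given iteration to simulate a failure.
    """
    state = read_checkpoint(path) or {"iteration": 0, "total": 0}
    while state["iteration"] < n_iters:
        if fail_at is not None and state["iteration"] == fail_at:
            raise RuntimeError("simulated failure")
        state["total"] += state["iteration"]
        state["iteration"] += 1
        if state["iteration"] % cp_freq == 0:
            write_checkpoint(path, state)
    return state["total"]
```

Calling `run` again with the same path after a failure resumes from the last checkpoint rather than from iteration 0, which is the property the abstract refers to as recovery with limited programming effort.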
LogBase: A Scalable Log-structured Database System in the Cloud
Numerous applications such as financial transactions (e.g., stock trading)
are write-heavy in nature. The shift from reads to writes in web applications
has also been accelerating in recent years. Write-ahead-logging is a common
approach for providing recovery capability while improving performance in most
storage systems. However, the separation of log and application data incurs
write overheads observed in write-heavy environments and hence adversely
affects the write throughput and recovery time in the system. In this paper, we
introduce LogBase - a scalable log-structured database system that adopts
log-only storage for removing the write bottleneck and supporting fast system
recovery. LogBase is designed to be dynamically deployed on commodity clusters
to take advantage of the elastic scaling property of cloud environments. LogBase
provides in-memory multiversion indexes for supporting efficient access to data
maintained in the log. LogBase also supports transactions that bundle read and
write operations spanning across multiple records. We implemented the proposed
system and compared it with HBase and a disk-based log-structured
record-oriented system modeled after RAMCloud. The experimental results show
that LogBase is able to provide sustained write throughput, efficient data
access out of the cache, and effective system recovery.
Comment: VLDB201
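The idea of log-only storage with an in-memory multiversion index can be sketched as follows. This is a toy Python model, not LogBase's implementation: the class `LogStore`, its methods, the logical clock, and the use of a Python list in place of an on-disk log are all assumptions made for illustration.

```python
import json


class LogStore:
    """Toy log-only store: the log is the only copy of the data."""

    def __init__(self):
        self.log = []    # stand-in for an append-only log file
        self.index = {}  # key -> list of (version, log offset), ascending
        self.clock = 0   # logical timestamp used as the version number

    def put(self, key, value):
        """Append a record to the log and index the new version."""
        self.clock += 1
        offset = len(self.log)
        self.log.append(json.dumps({"k": key, "v": value, "ts": self.clock}))
        self.index.setdefault(key, []).append((self.clock, offset))
        return self.clock

    def get(self, key, as_of=None):
        """Return the newest value with version <= as_of (latest if None)."""
        for ts, offset in reversed(self.index.get(key, [])):
            if as_of is None or ts <= as_of:
                return json.loads(self.log[offset])["v"]
        return None

    @classmethod
    def recover(cls, log):
        """Rebuild the in-memory index by scanning the log once."""
        store = cls()
        for offset, line in enumerate(log):
            rec = json.loads(line)
            store.log.append(line)
            store.index.setdefault(rec["k"], []).append((rec["ts"], offset))
            store.clock = max(store.clock, rec["ts"])
        return store
```

Because every write is a single append and the index lives in memory, there is no separate data file to update, and recovery in this sketch reduces to one scan of the log.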
New Hampshire University Research and Industry Plan: A Roadmap for Collaboration and Innovation
This University Research and Industry plan for New Hampshire is focused on accelerating innovation-led development in the state by partnering academia’s strengths with the state’s substantial base of existing and emerging advanced industries. These advanced industries are defined by their deep investment and connections to research and development and the high-quality jobs they generate across production, new product development and administrative positions involving skills in science, technology, engineering and math (STEM).
Edge vulnerability in neural and metabolic networks
Biological networks, such as cellular metabolic pathways or networks of
corticocortical connections in the brain, are intricately organized, yet
remarkably robust toward structural damage. Whereas many studies have
investigated specific aspects of robustness, such as molecular mechanisms of
repair, this article focuses more generally on how local structural features in
networks may give rise to their global stability. In many networks the failure
of single connections may be more likely than the extinction of entire nodes,
yet no analysis of edge importance (edge vulnerability) has been provided so
far for biological networks. We tested several measures for identifying
vulnerable edges and compared their prediction performance in biological and
artificial networks. Among the tested measures, edge frequency in all shortest
paths of a network yielded a particularly high correlation with vulnerability,
and identified inter-cluster connections in biological but not in random and
scale-free benchmark networks. We discuss different local and global network
patterns and the edge vulnerability resulting from them.
Comment: 8 pages, 4 figures, to appear in Biological Cybernetic
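One way to read "edge frequency in all shortest paths" is as a count of how many shortest paths, over all node pairs, pass through each edge. The Python sketch below implements that reading for a small undirected, unweighted graph. The function name `edge_frequency`, the adjacency-dict input, and the choice to count every tied shortest path once are assumptions; the paper's exact definition may differ.

```python
from collections import Counter, deque


def edge_frequency(adj):
    """Count, per undirected edge, the shortest paths that traverse it.

    adj maps each node to an iterable of neighbours. Nodes must be
    comparable (e.g. ints) so each unordered pair is visited once.
    """
    freq = Counter()
    nodes = sorted(adj)
    for s in nodes:
        # BFS from s, recording distances and all shortest-path predecessors.
        dist = {s: 0}
        preds = {s: []}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    preds[v] = [u]
                    queue.append(v)
                elif dist[v] == dist[u] + 1:
                    preds[v].append(u)
        # Walk back from each target t > s to enumerate every shortest path.
        for t in nodes:
            if t <= s or t not in dist:
                continue
            stack = [(t, [])]
            while stack:
                node, edges = stack.pop()
                if node == s:
                    freq.update(edges)
                    continue
                for p in preds[node]:
                    stack.append((p, edges + [frozenset((p, node))]))
    return freq


# Two triangles joined by a single bridge edge (2, 3).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
freq = edge_frequency(graph)
```

In this example graph the bridge between the two triangles receives the highest count, which matches the abstract's observation that the measure picks out inter-cluster connections. Enumerating every path is exponential in the worst case, so this is only suitable for small graphs.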
Feedback-Aware Precoding for Millimeter Wave Massive MIMO Systems
Millimeter wave (mmWave) communication is a promising solution for coping
with the ever-increasing mobile data traffic because of its large bandwidth. To
enable a sufficient link margin, a large antenna array employing directional
beamforming, which is enabled by the availability of channel state information
at the transmitter (CSIT), is required. However, CSIT acquisition for mmWave
channels introduces a huge feedback overhead due to the typically large number
of transmit and receive antennas. Leveraging properties of mmWave channels,
this paper proposes a precoding strategy which enables a flexible adjustment of
the feedback overhead. In particular, the optimal unconstrained precoder is
approximated by selecting a variable number of elements from a basis that is
constructed as a function of the transmitter array response, where the number
of selected basis elements can be chosen according to the feedback constraint.
Simulation results show that the proposed precoding scheme can provide a
near-optimal solution if a higher feedback overhead can be afforded. For a low
overhead, it can still provide a good approximation of the optimal precoder.
Comment: 7 pages, 5 figures, to appear at the IEEE International Symposium on
Personal, Indoor and Mobile Radio Communications (PIMRC) 201
A load-sharing architecture for high performance optimistic simulations on multi-core machines
In Parallel Discrete Event Simulation (PDES), the simulation model is partitioned into a set of distinct Logical Processes (LPs) which are allowed to concurrently execute simulation events. In this work we present an innovative approach to load-sharing on multi-core/multiprocessor machines, targeted at the optimistic PDES paradigm, where LPs are speculatively allowed to process simulation events with no preventive verification of causal consistency, and actual consistency violations (if any) are recovered via rollback techniques. In our approach, each simulation kernel instance, in charge of hosting and executing a specific set of LPs, runs a set of worker threads, which can be dynamically activated/deactivated on the basis of a distributed algorithm. The latter relies in turn on an analytical model that provides indications on how to reassign processor/core usage across the kernels in order to handle the simulation workload as efficiently as possible. We also present a real implementation of our load-sharing architecture within the ROme OpTimistic Simulator (ROOT-Sim), namely an open-source C-based simulation platform implemented according to the PDES paradigm and the optimistic synchronization approach. Experimental results for an assessment of the validity of our proposal are presented as well.