DeSyRe: on-Demand System Reliability
The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect- and fault-free system would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints.
Coz: Finding Code that Counts with Causal Profiling
Improving performance is a central concern for software developers. To locate
optimization opportunities, developers rely on software profilers. However,
these profilers only report where programs spent their time: optimizing that
code may have no impact on performance. Past profilers thus both waste
developer time and make it difficult for them to uncover significant
optimization opportunities.
This paper introduces causal profiling. Unlike past profiling approaches,
causal profiling indicates exactly where programmers should focus their
optimization efforts, and quantifies their potential impact. Causal profiling
works by running performance experiments during program execution. Each
experiment calculates the impact of any potential optimization by virtually
speeding up code: inserting pauses that slow down all other code running
concurrently. The key insight is that this slowdown has the same relative
effect as running that line faster, thus "virtually" speeding it up.
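The equivalence described above can be checked on a toy model. The sketch below is an illustration under stated assumptions, not Coz's implementation: it assumes two threads running in parallel, with total runtime equal to the slower thread, and all names (`actual_runtime`, `virtual_runtime`, and their parameters) are hypothetical.

```python
def actual_runtime(a_other, b_total, n, t, s):
    """Runtime if the target line really ran faster.

    Thread A runs the target line n times at cost t each, plus a_other
    of unrelated work. Thread B runs b_total of unrelated work.
    s is the fractional speedup of the target line (0.0 to 1.0).
    """
    return max(a_other + n * t * (1 - s), b_total)


def virtual_runtime(a_other, b_total, n, t, s):
    """Runtime estimated by virtually speeding up the target line.

    The line itself is left unchanged. Each time it runs, the other
    thread is paused for t * s. The total inserted delay is then
    subtracted from the measured runtime.
    """
    delay = n * t * s
    measured = max(a_other + n * t, b_total + delay)
    return measured - delay


if __name__ == "__main__":
    # Hypothetical workload: the two estimates agree at every speedup.
    for s in (0.0, 0.25, 0.5, 1.0):
        print(s, actual_runtime(4.0, 9.0, 10, 0.5, s),
              virtual_runtime(4.0, 9.0, 10, 0.5, s))
```

In this model the two functions agree because subtracting the same delay from both threads leaves their relative order unchanged: `max(A, B + d) - d` equals `max(A - d, B)`.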
We present Coz, a causal profiler, which we evaluate on a range of
highly-tuned applications: Memcached, SQLite, and the PARSEC benchmark suite.
Coz identifies previously unknown optimization opportunities that are both
significant and targeted. Guided by Coz, we improve the performance of
Memcached by 9%, SQLite by 25%, and accelerate six PARSEC applications by as
much as 68%; in most cases, these optimizations involve modifying under 10
lines of code.
Comment: Published at SOSP 2015 (Best Paper Award)
A Touch of Evil: High-Assurance Cryptographic Hardware from Untrusted Components
The semiconductor industry is fully globalized and integrated circuits (ICs)
are commonly defined, designed and fabricated in different premises across the
world. This reduces production costs, but also exposes ICs to supply chain
attacks, where insiders introduce malicious circuitry into the final products.
Additionally, despite extensive post-fabrication testing, it is not uncommon
for ICs with subtle fabrication errors to make it into production systems.
While many systems may be able to tolerate a few byzantine components, this is
not the case for cryptographic hardware, storing and computing on confidential
data. For this reason, many error and backdoor detection techniques have been
proposed over the years. So far all attempts have been either quickly
circumvented, or come with unrealistically high manufacturing costs and
complexity.
This paper proposes Myst, a practical high-assurance architecture, that uses
commercial off-the-shelf (COTS) hardware, and provides strong security
guarantees, even in the presence of multiple malicious or faulty components.
The key idea is to combine protective redundancy with modern threshold
cryptographic techniques to build a system tolerant to hardware trojans and
errors. To evaluate our design, we build a Hardware Security Module that
provides the highest level of assurance possible with COTS components.
Specifically, we employ more than a hundred COTS secure crypto-coprocessors,
verified to FIPS 140-2 Level 4 tamper-resistance standards, and use them to
realize high-confidentiality random number generation, key derivation, public
key decryption and signing. Our experiments show a reasonable computational
overhead (less than 1% for both Decryption and Signing) and an exponential
increase in backdoor-tolerance as more ICs are added.
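To give a feel for why redundancy can yield an exponential gain, here is a minimal sketch that splits a key across several components with XOR secret sharing. This is a textbook n-of-n scheme chosen for illustration, not the threshold protocol Myst uses. The function names and the independence assumption in `prob_all_compromised` are hypothetical.

```python
import secrets
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_secret(secret, n):
    """Split secret into n shares; all n are needed to recover it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are uniformly random; the last one makes the XOR
    # of all shares equal to the secret.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    last = reduce(xor_bytes, shares, secret)
    return shares + [last]


def combine(shares):
    """Recover the secret by XOR-ing every share together."""
    return reduce(xor_bytes, shares)


def prob_all_compromised(p, n):
    """Chance that all n components are compromised.

    Assumes each component is compromised independently with
    probability p, which is a simplification made for this sketch.
    """
    return p ** n


if __name__ == "__main__":
    key = secrets.token_bytes(16)
    shares = split_secret(key, 5)
    print(combine(shares) == key)
    print(prob_all_compromised(0.1, 5))
```

Because any subset of fewer than n shares is uniformly random, an attacker must control every component to learn the key. Under the independence assumption that probability shrinks geometrically with each added component.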
Development of an integrated low-power RF partial discharge detector
This paper presents the results from integrating a low-power partial discharge detector with a wireless sensor node designed for operating as part of an IEEE 802.15.4 sensor network, and applying an on-line classifier capable of classifying partial discharges in real-time. Such a system is of benefit to monitoring engineers as it provides a means to exploit the RF technique using a low-cost device while circumventing the need for any additional cabling associated with new condition monitoring systems. The detector uses a frequency-based technique to differentiate between multiple defects, and has been integrated with a SunSPOT wireless sensor node hosting an agent-based monitoring platform, which includes a data capture agent and rule induction agent trained using experimental data. The results of laboratory system verification are discussed, and the requirements for a fully robust and flexible system are outlined.
Runtime Verification in Context: Can Optimizing Error Detection Improve Fault Diagnosis?
Runtime verification has primarily been developed and evaluated as a means of enriching the software testing process. While many researchers have pointed to its potential applicability in online approaches to software fault tolerance, there has been a dearth of work exploring the details of how that might be accomplished. In this paper, we describe how a component-oriented approach to software health management exposes the connections between program execution, error detection, fault diagnosis, and recovery. We identify both research challenges and opportunities in exploiting those connections. Specifically, we describe how recent approaches to reducing the overhead of runtime monitoring aimed at error detection might be adapted to reduce the overhead and improve the effectiveness of fault diagnosis.
Practical Run-time Checking via Unobtrusive Property Caching
The use of annotations, referred to as assertions or contracts, to describe
program properties for which run-time tests are to be generated, has become
frequent in dynamic programming languages. However, the frameworks proposed to
support such run-time testing generally incur high time and/or space overheads
over standard program execution. We present an approach for reducing this
overhead that is based on the use of memoization to cache intermediate results
of check evaluation, avoiding repeated checking of previously verified
properties. Compared to approaches that reduce checking frequency, our proposal
has the advantage of being exhaustive (i.e., all tests are checked at all
points) while still being much more efficient than standard run-time checking.
Compared to the limited previous work on memoization, it performs the task
without requiring modifications to data structure representation or checking
code. While the approach is general and system-independent, we present it for
concreteness in the context of the Ciao run-time checking framework, which
allows us to provide an operational semantics with checks and caching. We also
report on a prototype implementation and provide experimental results
suggesting that a relatively small cache leads to significant decreases in
run-time checking overhead.
Comment: 30 pages, 1 table, 170 figures; added appendix with plots; To appear
in Theory and Practice of Logic Programming (TPLP), Proceedings of ICLP 201
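The caching idea in this abstract can be sketched in a few lines. The code below is a rough Python illustration, not the Ciao implementation: it assumes immutable terms (lists built from `(head, tail)` tuples ending in `None`), and the class `CheckCache` and helper `make_int_list_check` are hypothetical names.

```python
from collections import OrderedDict


class CheckCache:
    """Small bounded cache of property-check results, keyed by term identity."""

    def __init__(self, capacity=256):
        self.capacity = capacity
        # (property name, id(term)) -> (term, result). Storing the term
        # keeps it alive, so its id cannot be reused while cached.
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def check(self, prop, term):
        key = (prop.__name__, id(term))
        entry = self._entries.get(key)
        if entry is not None and entry[0] is term:
            self._entries.move_to_end(key)  # mark as recently used
            self.hits += 1
            return entry[1]
        self.misses += 1
        result = prop(term)
        self._entries[key] = (term, result)
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return result


def make_int_list_check(cache):
    """Build an 'is a list of ints' check that caches results for sub-lists."""

    def is_int_list(term):
        if term is None:  # empty list
            return True
        if not (isinstance(term, tuple) and len(term) == 2
                and isinstance(term[0], int)):
            return False
        # The tail goes through the cache, so shared suffixes are
        # verified only once.
        return cache.check(is_int_list, term[1])

    return is_int_list


if __name__ == "__main__":
    cache = CheckCache(capacity=8)
    is_int_list = make_int_list_check(cache)
    lst = (1, (2, (3, None)))
    print(cache.check(is_int_list, lst))
    longer = (0, lst)  # shares lst as its tail
    print(cache.check(is_int_list, longer))
    print(cache.hits, cache.misses)
```

Every check is still performed at every program point, so the scheme stays exhaustive. A repeated check on a term already in the cache costs a single lookup instead of a full traversal.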