4,741 research outputs found
Advancing Hardware Security Using Polymorphic and Stochastic Spin-Hall Effect Devices
Protecting intellectual property (IP) in electronic circuits has become a
serious challenge in recent years. Logic locking/encryption and layout
camouflaging are two prominent techniques for IP protection. Most existing
approaches, however, particularly those focused on CMOS integration, incur
excessive design overheads resulting from their need for additional circuit
structures or device-level modifications. This work leverages the innate
polymorphism of an emerging spin-based device, called the giant spin-Hall
effect (GSHE) switch, to simultaneously enable locking and camouflaging within
a single instance. Using the GSHE switch, we propose a powerful primitive that
enables cloaking all the 16 Boolean functions possible for two inputs. We
conduct a comprehensive study using state-of-the-art Boolean satisfiability
(SAT) attacks to demonstrate the superior resilience of the proposed primitive
in comparison to several others in the literature. While we tailor the
primitive for deterministic computation, it can readily support stochastic
computation; we argue that stochastic behavior can break most, if not all,
existing SAT attacks. Finally, we discuss the resilience of the primitive
against various side-channel attacks as well as invasive monitoring at runtime,
which are arguably even more concerning threats than SAT attacks.Comment: Published in Proc. Design, Automation and Test in Europe (DATE) 201
Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency
Persistent memory provides high-performance data persistence at main memory.
Memory writes need to be performed in strict order to satisfy storage
consistency requirements and enable correct recovery from system crashes.
Unfortunately, adhering to such a strict order significantly degrades system
performance and persistent memory endurance. This paper introduces a new
mechanism, Loose-Ordering Consistency (LOC), that satisfies the ordering
requirements at significantly lower performance and endurance loss. LOC
consists of two key techniques. First, Eager Commit eliminates the need to
perform a persistent commit record write within a transaction. We do so by
ensuring that we can determine the status of all committed transactions during
recovery by storing necessary metadata information statically with blocks of
data written to memory. Second, Speculative Persistence relaxes the write
ordering between transactions by allowing writes to be speculatively written to
persistent memory. A speculative write is made visible to software only after
its associated transaction commits. To enable this, our mechanism supports the
tracking of committed transaction ID and multi-versioning in the CPU cache. Our
evaluations show that LOC reduces the average performance overhead of memory
persistence from 66.9% to 34.9% and the memory write traffic overhead from
17.1% to 3.4% on a variety of workloads.Comment: This paper has been accepted by IEEE Transactions on Parallel and
Distributed System
Limits on Fundamental Limits to Computation
An indispensable part of our lives, computing has also become essential to
industries and governments. Steady improvements in computer hardware have been
supported by periodic doubling of transistor densities in integrated circuits
over the last fifty years. Such Moore scaling now requires increasingly heroic
efforts, stimulating research in alternative hardware and stirring controversy.
To help evaluate emerging technologies and enrich our understanding of
integrated-circuit scaling, we review fundamental limits to computation: in
manufacturing, energy, physical space, design and verification effort, and
algorithms. To outline what is achievable in principle and in practice, we
recall how some limits were circumvented, compare loose and tight limits. We
also point out that engineering difficulties encountered by emerging
technologies may indicate yet-unknown limits.Comment: 15 pages, 4 figures, 1 tabl
LightBox: Full-stack Protected Stateful Middlebox at Lightning Speed
Running off-site software middleboxes at third-party service providers has
been a popular practice. However, routing large volumes of raw traffic, which
may carry sensitive information, to a remote site for processing raises severe
security concerns. Prior solutions often abstract away important factors
pertinent to real-world deployment. In particular, they overlook the
significance of metadata protection and stateful processing. Unprotected
traffic metadata like low-level headers, size and count, can be exploited to
learn supposedly encrypted application contents. Meanwhile, tracking the states
of 100,000s of flows concurrently is often indispensable in production-level
middleboxes deployed at real networks.
We present LightBox, the first system that can drive off-site middleboxes at
near-native speed with stateful processing and the most comprehensive
protection to date. Built upon commodity trusted hardware, Intel SGX, LightBox
is the product of our systematic investigation of how to overcome the inherent
limitations of secure enclaves using domain knowledge and customization. First,
we introduce an elegant virtual network interface that allows convenient access
to fully protected packets at line rate without leaving the enclave, as if from
the trusted source network. Second, we provide complete flow state management
for efficient stateful processing, by tailoring a set of data structures and
algorithms optimized for the highly constrained enclave space. Extensive
evaluations demonstrate that LightBox, with all security benefits, can achieve
10Gbps packet I/O, and that with case studies on three stateful middleboxes, it
can operate at near-native speed.Comment: Accepted at ACM CCS 201
Autonomous Probabilistic Coprocessing with Petaflips per Second
In this paper we present a concrete design for a probabilistic (p-) computer
based on a network of p-bits, robust classical entities fluctuating between -1
and +1, with probabilities that are controlled through an input constructed
from the outputs of other p-bits. The architecture of this probabilistic
computer is similar to a stochastic neural network with the p-bit playing the
role of a binary stochastic neuron, but with one key difference: there is no
sequencer used to enforce an ordering of p-bit updates, as is typically
required. Instead, we explore \textit{sequencerless} designs where all p-bits
are allowed to flip autonomously and demonstrate that such designs can allow
ultrafast operation unconstrained by available clock speeds without
compromising the solution's fidelity. Based on experimental results from a
hardware benchmark of the autonomous design and benchmarked device models, we
project that a nanomagnetic implementation can scale to achieve petaflips per
second with millions of neurons. A key contribution of this paper is the focus
on a hardware metric flips per second as a problem and
substrate-independent figure-of-merit for an emerging class of hardware
annealers known as Ising Machines. Much like the shrinking feature sizes of
transistors that have continually driven Moore's Law, we believe that flips per
second can be continually improved in later technology generations of a wide
class of probabilistic, domain specific hardware.Comment: 13 pages, 8 figures, 1 tabl
- …