Hardware-based Security for Virtual Trusted Platform Modules
Virtual Trusted Platform Modules (TPMs) were proposed as a software-based
alternative to the hardware-based TPMs to allow the use of their cryptographic
functionalities in scenarios where multiple TPMs are required in a single
platform, such as in virtualized environments. However, virtualizing TPMs,
especially virtualizing the Platform Configuration Registers (PCRs), strikes
against one of the core principles of Trusted Computing, namely the need for a
hardware-based root of trust. In this paper we show how the strength of
hardware-based security can be gained in virtual PCRs by binding them to their
corresponding hardware PCRs. We propose two approaches for such a binding. For
this purpose, the first variant uses binary hash trees, whereas the other
variant uses incremental hashing. In addition, we present an FPGA-based
implementation of both variants and evaluate their performance.
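The abstract above does not spell out how the binding is constructed, so the following is only a minimal sketch of the binary hash tree idea: the values of several virtual PCRs are hashed pairwise up to a single root, and that root is extended into one hardware PCR. The function names, the SHA-256 choice, the number of virtual PCRs, and the padding rule for odd levels are all assumptions made for illustration, not the paper's design.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
    # TPM-style extend: new value = H(old value || measurement).
    return sha256(pcr + measurement)


def merkle_root(leaves: list[bytes]) -> bytes:
    # Hypothetical helper: fold leaf hashes pairwise until one root remains.
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pad an odd level by repeating its last node.
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Four virtual PCRs, all zero at reset (the count is an assumption).
virtual_pcrs = [bytes(32) for _ in range(4)]
virtual_pcrs[2] = pcr_extend(virtual_pcrs[2], sha256(b"guest kernel"))

# Bind the whole virtual PCR set to one hardware PCR through the tree root.
root = merkle_root(virtual_pcrs)
hardware_pcr = pcr_extend(bytes(32), root)
```

In this sketch, any change to a single virtual PCR changes the root, so a verifier who trusts the hardware PCR can detect tampering with the virtual ones.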
Virtualization for a Network Processor Runtime System
The continuing ossification of the Internet is slowing the pace of network innovation. Network diversification presents one solution to this problem, by virtualizing the network at multiple layers. Diversified networks consist of a shared physical substrate, virtual routers (metarouters), and virtual links (metalinks). Virtualizing routers enables smooth and incremental upgrades to new network services. Our current priority for a diversified router prototype is to enable reserved slices of the network for researchers to perform repeatable, high-speed network experiments. General-purpose processors have well-established techniques for virtualization, but do not scale efficiently to multi-gigabit speeds. To achieve these speeds, we employ network processors (NPs), typically consisting of multicore, multi-threaded processors with asymmetric, heterogeneous memories. The complexity and lack of hardware thread isolation in NPs, combined with a lack of simple programming models, create numerous challenges for effective sharing between metarouters. In this paper, we detail strategies for enabling NP virtualization at the link, memory, and processor levels, to better enable a research infrastructure for network innovation.
TALUS: Reinforcing TEE Confidentiality with Cryptographic Coprocessors (Technical Report)
Platforms are nowadays typically equipped with trusted execution environments
(TEEs), such as Intel SGX and ARM TrustZone. However, recent microarchitectural
attacks on TEEs repeatedly broke their confidentiality guarantees, including
the leakage of long-term cryptographic secrets. These systems are typically
also equipped with a cryptographic coprocessor, such as a TPM or Google Titan.
These coprocessors offer a unique set of security features focused on
safeguarding cryptographic secrets. Still, despite their simultaneous
availability, the integration between these technologies is practically
nonexistent, which prevents them from benefitting from each other's strengths.
In this paper, we propose TALUS, a general design and a set of three main
requirements for a secure symbiosis between TEEs and cryptographic
coprocessors. We implement a proof-of-concept of TALUS based on Intel SGX and a
hardware TPM. We show that with TALUS, the long-term secrets used in the SGX
life cycle can be moved to the TPM. We demonstrate that our design is robust
even in the presence of transient execution attacks, preventing an entire class
of attacks due to the reduced attack surface on the shared hardware.
Comment: In proceedings of Financial Cryptography 2023. This is the technical
report of the published paper.
Just-in-time binary translation of operating system kernels
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Includes bibliographical references (p. 57-58). This thesis presents a just-in-time binary translation scheme that dynamically switches between emulating the system with a slower but more memory-efficient instruction interpreter and a faster, more memory-intensive binary translator. In testing, this hybrid interpreter/translator scheme reduced the size of the binary translation cache by up to 99% with a slowdown of less than 5x in the worst case, and less than 2x in the best case, compared to a pure binary translation scheme. With only a 10% decrease in performance, upwards of 49% memory reduction is demonstrated. Additionally, a technique of guest kernel introspection and profiling using binary translation is presented. by Perry L. Hung. M.Eng.
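The abstract does not say how the thesis decides when to switch modes, so the following is only a minimal sketch of one common policy: interpret a block until it has run a fixed number of times, then translate it and cache the result. The class name, the toy two-instruction guest, and the `HOT_THRESHOLD` value are hypothetical, and a Python closure stands in for generated machine code.

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical: executions before a block counts as hot


class HybridEmulator:
    def __init__(self, blocks, hot_threshold=HOT_THRESHOLD):
        self.blocks = blocks              # block id -> list of (op, operand)
        self.hot_threshold = hot_threshold
        self.exec_counts = Counter()      # interpreted executions per block
        self.translation_cache = {}       # block id -> translated callable
        self.acc = 0                      # single guest register

    def _interpret(self, block_id):
        # Slow path: decode and execute one instruction at a time.
        for op, operand in self.blocks[block_id]:
            if op == "add":
                self.acc += operand
            elif op == "mul":
                self.acc *= operand
            else:
                raise ValueError(f"unknown op: {op}")

    def _translate(self, block_id):
        # Fast path stand-in: validate once, then return a reusable closure.
        ops = list(self.blocks[block_id])
        for op, _ in ops:
            if op not in ("add", "mul"):
                raise ValueError(f"unknown op: {op}")

        def translated(acc):
            for op, operand in ops:
                acc = acc + operand if op == "add" else acc * operand
            return acc

        return translated

    def run_block(self, block_id):
        translated = self.translation_cache.get(block_id)
        if translated is not None:
            self.acc = translated(self.acc)
            return "translated"
        self.exec_counts[block_id] += 1
        if self.exec_counts[block_id] >= self.hot_threshold:
            # Block just became hot: cache a translation for later runs.
            self.translation_cache[block_id] = self._translate(block_id)
        self._interpret(block_id)
        return "interpreted"
```

Cold blocks never enter the translation cache under this policy, which is where the memory saving in a hybrid scheme would come from; the cost is that the first few runs of every block take the slower interpreted path.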
Scalable and Configurable Tracking for Any Rowhammer Threshold
The Rowhammer vulnerability continues to get worse, with the Rowhammer
Threshold (TRH) reducing from 139K activations to 4.8K activations over the
last decade. Typical Rowhammer mitigations rely on tracking aggressor rows. The
number of possible aggressors increases with lowering thresholds, making it
difficult to reliably track such rows in a storage-efficient manner. At lower
thresholds, academic trackers such as Graphene require prohibitive SRAM
overheads (hundreds of KBs to MB). Recent in-DRAM trackers from industry, such
as DSAC-TRR, perform approximate tracking, sacrificing guaranteed protection
for reduced storage overheads, leaving DRAM vulnerable to Rowhammer attacks.
Ideally, we seek a scalable tracker that tracks securely and precisely, and
incurs negligible dedicated SRAM and performance overheads, while still being
able to track arbitrarily low thresholds.
To that end, we propose START - a Scalable Tracker for Any Rowhammer
Threshold. Rather than relying on dedicated SRAM structures, START dynamically
repurposes a small fraction of the Last-Level Cache (LLC) to store tracking
metadata. START is based on the observation that while the memory contains
millions of rows, typical workloads touch only a small subset of rows within a
refresh period of 64ms, so allocating tracking entries on demand significantly
reduces storage. If the application does not access many rows in memory, START
does not reserve any LLC capacity. Otherwise, START dynamically uses 1, 2, or 8
ways of the cache set based on demand. START consumes, on average,
9.4% of the LLC capacity to store metadata, which is 5X lower than
dedicating a counter in LLC for each row in memory. We also propose START-M, a
memory-mapped START for large-memory systems. Our designs require only 4KB SRAM
for newly added structures and perform within 1% of idealized tracking even at
a TRH of less than 100.
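The on-demand allocation idea can be sketched in a few lines. This is a minimal illustration only: it models the tracking metadata as a dictionary rather than as repurposed LLC ways, and the class name, the reset-on-mitigation policy, and the use of the threshold itself as the mitigation trigger are assumptions, not START's actual design.

```python
class OnDemandTracker:
    """Toy activation tracker that allocates a counter only for touched rows."""

    def __init__(self, threshold: int):
        self.threshold = threshold  # activations that trigger a mitigation
        self.counters = {}          # row id -> activations in this window

    def activate(self, row: int) -> bool:
        # Returns True when the row reaches the threshold and needs mitigation.
        count = self.counters.get(row, 0) + 1
        if count >= self.threshold:
            # Assumption: the counter restarts once the mitigation is issued.
            self.counters[row] = 0
            return True
        self.counters[row] = count
        return False

    def end_refresh_window(self) -> None:
        # All rows are refreshed, so every counter can be dropped.
        self.counters.clear()


tracker = OnDemandTracker(threshold=4)
results = [tracker.activate(7) for _ in range(4)]
```

Storage in this sketch grows with the number of distinct rows touched in a refresh window rather than with the number of rows in memory, which is the observation the abstract attributes to START.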