A Survey of Techniques for Architecting TLBs
A translation lookaside buffer (TLB) caches virtual-to-physical address translation information and is used
in systems ranging from embedded devices to high-end servers. Since the TLB is accessed very frequently
and a TLB miss is extremely costly, prudent management of the TLB is important for improving the performance
and energy efficiency of processors. In this paper, we present a survey of techniques for architecting and
managing TLBs. We characterize the techniques across several dimensions to highlight their similarities and
distinctions. We believe that this paper will be useful for chip designers, computer architects and system
engineers.
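To make the caching role described in this abstract concrete, the following is a minimal sketch of a TLB in front of a page table. It is not taken from the survey: the class name `TinyTLB`, the 4 KiB page size, the fully associative organisation and the LRU replacement policy are all assumptions chosen for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed 4 KiB pages


class TinyTLB:
    """Illustrative fully associative TLB with LRU replacement."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        # page_table: dict mapping virtual page number -> physical frame number
        self.page_table = page_table
        # Cached translations, ordered from least to most recently used
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)  # mark as most recently used
        else:
            self.misses += 1
            # The dict lookup stands in for the costly page-table walk
            pfn = self.page_table[vpn]
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict the LRU entry
            self.entries[vpn] = pfn
        return self.entries[vpn] * PAGE_SIZE + offset
```

With a capacity of two entries, touching three distinct pages in turn evicts the least recently used translation, so a later access to the first page misses again. This is the kind of behaviour that TLB management techniques aim to reduce.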
The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework
Computers continue to diversify with respect to system designs, emerging
memory technologies, and application memory demands. Unfortunately, continually
adapting the conventional virtual memory framework to each possible system
configuration is challenging, and often results in performance loss or requires
non-trivial workarounds. To address these challenges, we propose a new virtual
memory framework, the Virtual Block Interface (VBI). We design VBI based on the
key idea that delegating memory management duties to hardware can reduce the
overheads and software complexity associated with virtual memory. VBI
introduces a set of variable-sized virtual blocks (VBs) to applications. Each
VB is a contiguous region of the globally-visible VBI address space, and an
application can allocate each semantically meaningful unit of information
(e.g., a data structure) in a separate VB. VBI decouples access protection from
memory allocation and address translation. While the OS controls which programs
have access to which VBs, dedicated hardware in the memory controller manages
the physical memory allocation and address translation of the VBs. This
approach enables several architectural optimizations to (1) efficiently and
flexibly cater to different and increasingly diverse system configurations, and
(2) eliminate key inefficiencies of conventional virtual memory. We demonstrate
the benefits of VBI with two important use cases: (1) reducing the overheads of
address translation (for both native execution and virtual machine
environments), as VBI reduces the number of translation requests and associated
memory accesses; and (2) two heterogeneous main memory architectures, where VBI
increases the effectiveness of managing fast memory regions. For both cases,
VBI significantly improves performance over conventional virtual memory.
Thin Hypervisor-Based Security Architectures for Embedded Platforms
Virtualization has grown increasingly popular, thanks to its benefits of isolation, management, and utilization, supported by hardware advances. It is also receiving attention for its potential to support security, through hypervisor-based services and advanced protections supplied to guests. Today, virtualization is even making inroads in the embedded space, and embedded systems, with their security needs, have already started to benefit from virtualization’s security potential. In this thesis, we investigate the possibilities for thin hypervisor-based security on embedded platforms. In addition to significant background study, we present an implementation of a low-footprint, thin hypervisor capable of providing security protections to a single FreeRTOS guest kernel on ARM. Backed by performance test results, our hypervisor provides security to a formerly unsecured kernel with minimal performance overhead, and represents a first step in a greater research effort into the security advantages and possibilities of embedded thin hypervisors. Our results show that thin hypervisors are both possible and beneficial even on limited embedded systems, and set the stage for more advanced investigations, implementations, and security applications in the future.
Privacy Protection on Cloud Computing
Cloud computing is becoming the most popular computing infrastructure because its flexibility and cost-effectiveness attract more and more traditional companies. However, privacy concerns are the major issue that prevents users from deploying on public clouds. My research focuses on protecting users' privacy in cloud computing. I will present a hardware-based and a migration-based approach to protect users' privacy. The root cause of the privacy problem is that the current cloud privilege design gives too much power to cloud providers. Once the control virtual machine (installed by cloud providers) is compromised, external adversaries can breach users’ privacy. Malicious cloud administrators can also disclose users’ private data by abusing the privileges of cloud providers. Thus, I develop two cloud architectures, MyCloud and MyCloud SEP, to protect users’ privacy based on hardware virtualization technology. I eliminate the privilege of cloud providers by moving the control virtual machine (control VM) to the processor’s non-root mode and keep only the privacy-protection and performance-critical components in the Trusted Computing Base (TCB). In addition, the new cloud platform can provide rich functionality for resource management and allocation without greatly increasing the TCB size. Besides attacks on the control VM, many external adversaries compromise one guest VM or directly install a malicious guest VM, then target other legitimate guest VMs based on the connections. Thus, collocating with vulnerable virtual machines, or “bad neighbors”, on the same physical server introduces additional security risks. I develop a migration-based approach that quantifies the security risk of each VM and generates a virtual machine placement that minimizes the security risks considering the connections among virtual machines. According to the experiments, our approach can improve the survivability of most VMs.
Contribution of the idiothetic and the allothetic information to the hippocampal place code
Hippocampal cells exhibit a preference to be active at a specific place in a familiar environment, enabling them to encode the representation of space within the brain at the population level (J. O’Keefe and Dostrovsky 1971). These cells rely on external sensory inputs and self-motion cues; however, it is still not known how exactly these inputs interact to build a stable representation of a certain location (“place field”). Existing studies suggest that both proprioceptive and other idiothetic types of information are continuously integrated to update the self-position (e.g. implementing “path integration”) while other stable sensory cues provide references to update the allocentric position of self and correct it for the collected integration-related errors. It was shown that both allocentric and idiothetic types of information influence positional cell firing; however, in most of the studies these inputs were firmly coupled. The use of virtual reality setups (Thurley and Ayaz 2016) made it possible to separate the influence of vision and proprioception at the price of not keeping natural conditions: the animal is usually head- or body-fixed (Hölscher et al. 2005; Ravassard A. 2013; Jayakumar et al. 2018a; Haas et al. 2019), which introduces vestibular motor- and visual- conflicts, providing a bias for space encoding. Here we use the novel CAVE Virtual Reality system for freely-moving rodents (Del Grosso 2018) that allows us to investigate the effect of visual- and positional- (vestibular) manipulation on the hippocampal space code while keeping natural behaving conditions.
In this study, we focus on the dynamic representation of space when the visual-cue-defined and physical-boundary-defined reference frames are in conflict. We confirm the dominance of one reference frame over the other on the level of place fields, when the information about one reference frame is absent (Gothard et al. 2001). We show that the hippocampal cells form adjacent categories by their input preference: surprisingly, they are driven not only by either visual / allocentric information or the distance to the physical boundaries and path integration, but also by a specific combination of both. We found a large category of units integrating inputs from both allocentric and idiothetic pathways that are able to represent an intermediate position between two reference frames, when they are in conflict. This experimental evidence suggests that most of the place cells are involved in representing both reference frames using a weighted combination of sensory inputs. In line with the studies showing dominance of the more reliable sensory modality (Kathryn J. Jeffery and J. M. O’Keefe 1999; Gothard et al. 2001), our data are consistent with (although they do not prove) CA1 cells implementing an optimal Bayesian coding given the idiothetic and allocentric inputs with weights inversely proportional to the availability of the input, as proposed for other sensory systems (Kate J. Jeffery, Page, and Simon M. Stringer 2016). This mechanism of weighted sensory integration, consistent with recent dynamic loop models of the hippocampal-entorhinal network (Li, Arleo, and Sheynikhovich 2020), can contribute to the physiological explanation of Bayesian inference and optimal combination of spatial cues for localization (Cheng et al. 2007).
ACCELERATING STORAGE APPLICATIONS WITH EMERGING KEY VALUE STORAGE DEVICES
With the continuous data explosion in the big data era, traditional software and hardware stacks
are facing unprecedented challenges in operating at such data scale. Thus, designing new
architectures and efficient systems for data-oriented applications has become increasingly critical.
This motivates us to rethink conventional storage system design and re-architect both
software and hardware to meet the challenges of scale.
Besides the fast growth of data volume, the increasing demand from storage applications such
as video streaming and data analytics is pushing high-performance flash-based storage devices to
replace traditional spinning disks. This all-flash era increases data reliability concerns
due to the endurance problem of flash devices. Key-value stores (KVS) are important storage
infrastructure for handling fast-growing unstructured data and have been widely deployed in a
variety of scale-out enterprise applications such as online retail, big data analytics, social networks,
etc. How to efficiently manage data redundancy for key-value stores to provide data reliability, and how
to efficiently support range queries for key-value stores to accelerate analytics-oriented applications
under emerging key-value store system architectures, have become important research problems.
In this research, we focus on how to design new software and hardware architectures for key-value
store applications to provide reliability and improve query performance. In order to address
the different issues identified in this dissertation, we propose to employ a logical key management
layer, a thin layer above the KV devices that maps logical keys into physical keys on the devices.
We show how such a layer can enable multiple solutions to improve the performance and reliability
of KVSSD based storage systems. First, we present KVRAID, a high performance, write
efficient erasure coding management scheme on emerging key-value SSDs. The core innovation
of KVRAID is to propose a logical key management layer that maps logical keys to physical keys
to efficiently pack similar size KV objects and dynamically manage the membership of erasure
coding groups. Unlike existing schemes which manage erasure codes on the block level, KVRAID
manages the erasure codes on the KV object level. In order to achieve better storage efficiency for variable sized objects, KVRAID predefines multiple fixed sizes (slabs) according to the object size
distribution for the erasure code. KVRAID uses a logical to physical key conversion to pack the
KV objects of similar size into a parity group. KVRAID uses a lazy deletion mechanism with a
garbage collector for object updates. Our experiments show that in the 100% put case, KVRAID outperforms
software block RAID by 18x in throughput and reduces write amplification
(WAF) by 15x with only ~5% CPU utilization. In mixed update/get workloads, KVRAID achieves ~4x
better throughput with ~23% CPU utilization and reduces the storage overhead and WAF by 3.6x
and 11.3x on average, respectively.
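The logical key management layer and slab-based packing described above can be sketched roughly as follows. This is not KVRAID's implementation: the slab sizes, the group width, the in-memory `device` dictionary standing in for a KV SSD, and the single XOR parity (used here in place of a real erasure code) are all assumptions made for illustration, and updates and garbage collection are left out.

```python
from functools import reduce

SLAB_SIZES = [64, 256, 1024]  # hypothetical slab sizes in bytes
DATA_PER_GROUP = 3            # hypothetical number of data objects per parity group


def pick_slab(size):
    """Return the smallest predefined slab that fits an object of this size."""
    for slab in SLAB_SIZES:
        if size <= slab:
            return slab
    raise ValueError("object larger than the largest slab")


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks (a stand-in for an erasure code)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class LogicalKeyLayer:
    def __init__(self):
        self.device = {}   # physical key -> bytes; stand-in for the KV device
        self.mapping = {}  # logical key -> (physical key, original length)
        self.group_id = {slab: 0 for slab in SLAB_SIZES}
        self.pending = {slab: [] for slab in SLAB_SIZES}

    def put(self, logical_key, value):
        slab = pick_slab(len(value))
        padded = value.ljust(slab, b"\x00")  # pad to the slab size
        members = self.pending[slab]
        physical_key = (slab, self.group_id[slab], len(members))
        self.device[physical_key] = padded
        self.mapping[logical_key] = (physical_key, len(value))
        members.append(padded)
        if len(members) == DATA_PER_GROUP:
            # Group is full: write its parity and open a new group for this slab
            parity_key = (slab, self.group_id[slab], "parity")
            self.device[parity_key] = xor_blocks(members)
            self.group_id[slab] += 1
            self.pending[slab] = []

    def get(self, logical_key):
        physical_key, length = self.mapping[logical_key]
        return self.device[physical_key][:length]

    def recover(self, logical_key):
        """Rebuild one lost object; works only for a complete parity group."""
        (slab, group, index), length = self.mapping[logical_key]
        survivors = [self.device[(slab, group, i)]
                     for i in range(DATA_PER_GROUP) if i != index]
        survivors.append(self.device[(slab, group, "parity")])
        return xor_blocks(survivors)[:length]
```

Because objects of similar size share a slab, every member of a group has the same padded length, which is what lets parity be computed per object rather than per block.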
Second, we present KVRangeDB, an ordered log structure tree based key index that supports
range queries on a hash-based KVSSD. In addition, we propose to pack smaller application records
into a larger physical record on the device through the logical key management layer. We compared
the performance of KVRangeDB against a RocksDB implementation on KVSSD and the state-of-the-art
software KV store Wisckey on a block device, on three types of real-world applications:
cloud-serving workloads, the TABLEFS filesystem and time-series databases. For cloud-serving applications,
KVRangeDB achieves 8.3x and 1.7x better 99.9% write tail latency compared
to the RocksDB implementation on KV-SSD and Wisckey on block SSD, respectively. On the query side,
KVRangeDB performs worse only for very long scans, but provides fast point queries and
closed range queries. The experiments on TABLEFS demonstrate that using KVRangeDB for
metadata indexing can boost performance by a factor of ~6.3x on average and reduce
CPU cost by ~3.9x for four metadata-intensive workloads compared to the RocksDB implementation on KVSSD.
Compared to Wisckey, KVRangeDB improves performance by ~2.6x on average and reduces
CPU usage by ~1.7x.
Third, we propose a generic FPGA accelerator for emerging Minimum Storage Regenerating
(MSR) codes encoding/decoding which maximizes the computation parallelism and minimizes
the data movement between off-chip DRAM and the on-chip SRAM buffers. To demonstrate the
efficiency of our proposed accelerator, we implemented the encoding/decoding algorithms for a
specific MSR code called Zigzag code on a Xilinx VCU1525 acceleration card. Our evaluation shows that our proposed accelerator can achieve ~2.4-3.1x better throughput and ~4.2-5.7x better
power efficiency compared to the state-of-the-art multi-core CPU implementation and ~2.8-3.3x better
throughput and ~4.2-5.3x better power efficiency compared to a modern GPU accelerator.
Exploitation from Malicious PCI Express Peripherals
The thesis of this dissertation is that, despite widespread belief in the security community, systems are still vulnerable to attacks from malicious peripherals delivered over the PCI Express (PCIe) protocol.
Malicious peripherals can be plugged directly into internal PCIe slots, or connected via an external Thunderbolt connection.
To prove this thesis, we designed and built a new PCIe attack platform.
We discovered that a simple platform was insufficient to carry out complex attacks, so we created the first PCIe attack platform that runs a full, conventional OS.
To allow us to conduct attacks against higher-level OS functionality built on PCIe, we made the attack platform emulate in detail the behaviour of an Intel 82574L Network Interface Controller (NIC), by using a device model extracted from the QEMU emulator.
We discovered a number of vulnerabilities in the PCIe protocol itself, and in the way that the defence mechanisms it provides are used by modern OSs.
The principal defence mechanism provided is the Input/Output Memory Management Unit (IOMMU).
The IOMMU remaps the address space used by peripherals in 4KiB chunks, and can prevent access to areas of address space that a peripheral should not be able to access.
We found that, contrary to belief in the security community, the IOMMUs in modern systems were not designed to protect against attacks from malicious peripherals, but to allow virtual machines direct access to real hardware.
We discovered that use of the IOMMU is patchy even in modern operating systems.
Windows effectively does not use the IOMMU at all; macOS opens windows that are shared by all devices; Linux and FreeBSD map windows into host memory separately for each device, but only if poorly documented boot flags are used.
These OSs make no effort to ensure that only data that should be visible to the devices is in the mapped windows.
We created novel attacks that subverted control flow and read private data against systems running macOS, Linux and FreeBSD with the highest level of relevant protection enabled.
These represent the first use of the relevant exploits in each case.
In the final part of this thesis, we evaluate the suitability of a number of proposed general purpose and specific mitigations against DMA attacks, and make a number of recommendations about future directions in IOMMU software and hardware.
EPSRC and ARM iCASE Award
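The page-granular remapping that this abstract attributes to the IOMMU can be illustrated with a toy model. It does not reflect any real IOMMU's table format or any OS's policy: the class `ToyIOMMU`, the per-device dictionaries and the `writable` flag are hypothetical names introduced only for this sketch.

```python
PAGE_SIZE = 4096  # 4 KiB remapping granularity, as stated in the abstract


class IOMMUFault(Exception):
    """Raised when a device touches an address it has no mapping for."""


class ToyIOMMU:
    def __init__(self):
        # device id -> {I/O page number: (host frame number, writable)}
        self.tables = {}

    def map_page(self, device, io_page, host_frame, writable=False):
        self.tables.setdefault(device, {})[io_page] = (host_frame, writable)

    def translate(self, device, io_addr, write=False):
        page, offset = divmod(io_addr, PAGE_SIZE)
        entry = self.tables.get(device, {}).get(page)
        if entry is None:
            raise IOMMUFault(f"{device}: no mapping for I/O page {page}")
        host_frame, writable = entry
        if write and not writable:
            raise IOMMUFault(f"{device}: write to read-only I/O page {page}")
        # The offset within the page passes through unchanged
        return host_frame * PAGE_SIZE + offset
```

Since the mapping unit is a whole page, mapping a page that holds a small buffer makes every other byte in that 4 KiB page reachable by the device as well. This is one way to read the abstract's point about data in mapped windows that should not be visible to devices.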