184 research outputs found

    A Real-Time Error Detection (RTD) architecture and its use for reliability and post-silicon validation for F/F based memory arrays

    Get PDF
    This work proposes in-situ Real-Time Error Detection (RTD): embedding hardware in a memory array for detecting a fault in the array when it occurs, rather than when it is read. RTD breaks the serialization between data access and error-detection and, thus, it can speed-up the access-time of arrays that use in-line error-correction. The approach can also reduce the time needed to root-cause array related bugs during post-silicon validation and product testing. The paper introduces a two-dimensional error-correction scheme based on RTD and, also, presents a proactive error-correction method that combines RTD with demand-scrubbing. The work describes how to build RTD into a memory array with flip-flops to track in real-time the column-parity. A comparison of the proposed two-dimensional ECC scheme, as compared to single-error-correction-double-error-detection, shows that the RTD design has comparable error-detection-and-correction strength and, depending on the array dimensions and configuration, RTD reduces access time by 4% to 26% at an area and power overhead (negative is a reduction) between -7% to 33% and -42% to 86% respectively.Peer ReviewedPostprint (author's final draft

    Hardware Mechanisms for Efficient Memory System Security

    Full text link
    The security of a computer system hinges on the trustworthiness of the operating system and the hardware, as applications rely on them to protect code and data. As a result, multiple protections for safeguarding the hardware and OS from attacks are being continuously proposed and deployed. These defenses, however, are far from ideal as they only provide partial protection, require complex hardware and software stacks, or incur high overheads. This dissertation presents hardware mechanisms for efficiently providing strong protections against an array of attacks on the memory hardware and the operating system’s code and data. In the first part of this dissertation, we analyze and optimize protections targeted at defending memory hardware from physical attacks. We begin by showing that, contrary to popular belief, current DDR3 and DDR4 memory systems that employ memory scrambling are still susceptible to cold boot attacks (where the DRAM is frozen to give it sufficient retention time and is then re-read by an attacker after reboot to extract sensitive data). We then describe how memory scramblers in modern memory controllers can be transparently replaced by strong stream ciphers without impacting performance. We also demonstrate how the large storage overheads associated with authenticated memory encryption schemes (which enable tamper-proof storage in off-chip memories) can be reduced by leveraging compact integer encodings and error-correcting code (ECC) DRAMs – without forgoing the error detection and correction capabilities of ECC DRAMs. The second part of this dissertation presents Neverland: a low-overhead, hardware-assisted, memory protection scheme that safeguards the operating system from rootkits and kernel-mode malware. Once the system is done booting, Neverland’s hardware takes away the operating system’s ability to overwrite certain configuration registers, as well as portions of its own physical address space that contain kernel code and security-critical data. Furthermore, it prohibits the CPU from fetching privileged code from any memory region lying outside the physical addresses assigned to the OS kernel and drivers. This combination of protections makes it extremely hard for an attacker to tamper with the kernel or introduce new privileged code into the system – even in the presence of software vulnerabilities. Neverland enables operating systems to reduce their attack surface without having to rely on complex integrity monitoring software or hardware. The hardware mechanisms we present in this dissertation provide building blocks for constructing a secure computing base while incurring lower overheads than existing protections.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/147604/1/salessaf_1.pd

    Fault and Defect Tolerant Computer Architectures: Reliable Computing With Unreliable Devices

    Get PDF
    This research addresses design of a reliable computer from unreliable device technologies. A system architecture is developed for a fault and defect tolerant (FDT) computer. Trade-offs between different techniques are studied and yield and hardware cost models are developed. Fault and defect tolerant designs are created for the processor and the cache memory. Simulation results for the content-addressable memory (CAM)-based cache show 90% yield with device failure probabilities of 3 x 10(-6), three orders of magnitude better than non fault tolerant caches of the same size. The entire processor achieves 70% yield with device failure probabilities exceeding 10(-6). The required hardware redundancy is approximately 15 times that of a non-fault tolerant design. While larger than current FT designs, this architecture allows the use of devices much more likely to fail than silicon CMOS. As part of model development, an improved model is derived for NAND Multiplexing. The model is the first accurate model for small and medium amounts of redundancy. Previous models are extended to account for dependence between the inputs and produce more accurate results

    High-Performance Energy-Efficient and Reliable Design of Spin-Transfer Torque Magnetic Memory

    Get PDF
    In this dissertation new computing paradigms, architectures and design philosophy are proposed and evaluated for adopting the STT-MRAM technology as highly reliable, energy efficient and fast memory. For this purpose, a novel cross-layer framework from the cell-level all the way up to the system- and application-level has been developed. In these framework, the reliability issues are modeled accurately with appropriate fault models at different abstraction levels in order to analyze the overall failure rates of the entire memory and its Mean Time To Failure (MTTF) along with considering the temperature and process variation effects. Design-time, compile-time and run-time solutions have been provided to address the challenges associated with STT-MRAM. The effectiveness of the proposed solutions is demonstrated in extensive experiments that show significant improvements in comparison to state-of-the-art solutions, i.e. lower-power, higher-performance and more reliable STT-MRAM design

    Biblioteca de procesamiento de imágenes optimizada para Arm Cortex-M7

    Get PDF
    La mayoría de los vehículos en la actualidad están equipados con sistemas que asisten al conductor en tareas difíciles y repetitivas, como reducir la velocidad del vehículo en una zona escolar. Algunos de estos sistemas requieren una computadora a bordo capaz de realizar el procesamiento en tiempo real de las imágenes del camino obtenidas por una cámara. El objetivo de este proyecto es implementar una librería de procesamiento de imagen optimizada para la arquitectura ARM® Cortex®-M7. Esta librería provee rutinas para realizar filtrado espacial, resta, binarización y extracción de la información direccional de una imagen, así como el reconocimiento parametrizado de patrones de una figura predefinida utilizando la Transformada Generalizada de Hough. Estas rutinas están escritas en el lenguaje de programación C, para aprovechar las optimizaciones del compilador GNU ARM C, y obtener el máximo desempeño y el mínimo tamaño de objetos. El desempeño de las rutinas fue comparado con la implementación existente para otro microcontrolador, el Freescale® MPC5561. Para probar la funcionalidad de esta librería en una aplicación de tiempo real, se desarrolló un sistema de reconocimiento de señales de tráfico. Los resultados muestran que en promedio el tiempo de ejecución es 18% más rápido y el tamaño de objetos es 25% menor que en la implementación de referencia, lo que habilita a este sistema para procesar hasta 24 cuadros por segundo. En conclusión, estos resultados demuestran la funcionalidad de la librería de procesamiento de imágenes en sistemas de tiempo real.Most modern vehicles are equipped with systems that assist the driver by automating difficult and repetitive tasks, such as reducing the vehicle speed in a school zone. Some of these systems require an onboard computer capable of performing real-time processing of the road images captured by a camera. The goal of this project is to implement an optimized image processing library for the ARM® Cortex®-M7 architecture. This library includes the routines to perform image spatial filtering, subtraction, binarization, and extraction of the directional information along with the parameterized pattern recognition of a predefined template using the Generalized Hough Transform (GHT). These routines are written in the C programming language, leveraging GNU ARM C compiler optimizations to obtain maximum performance and minimum object size. The performance of the routines was benchmarked with an existing implementation for a different microcontroller, the Freescale® MPC5561. To prove the usability of this library in a real-time application, a Traffic Sign Recognition (TSR) system was implemented. The results show that in average the execution time is 18% faster and the binary object size is 25% smaller than in the reference implementation, enabling the TSR application to process up to 24 fps. In conclusion, these results demonstrate that the image processing library implemented in this project is suitable for real-time applications.ITESO, A. C.Consejo Nacional de Ciencia y TecnologíaContinental Automotiv
    • …
    corecore