1,992 research outputs found

    Dynamically Reconfigurable Systolic Array Accelerators: A Case Study with Extended Kalman Filter and Discrete Wavelet Transform Algorithms

    Get PDF
    Field programmable grid arrays (FPGA) are increasingly being adopted as the primary on-board computing system for autonomous deep space vehicles. There is a need to support several complex applications for navigation and image processing in a rapidly responsive on-board FPGA-based computer. This requires exploring and combining several design concepts such as systolic arrays, hardware-software partitioning, and partial dynamic reconfiguration. A microprocessor/co-processor design that can accelerate two single precision oating-point algorithms, extended Kalman lter and a discrete wavelet transform, is presented. This research makes three key contributions. (i) A polymorphic systolic array framework comprising of recofigurable partial region-based sockets to accelerate algorithms amenable to being mapped onto linear systolic arrays. When implemented on a low end Xilinx Virtex4 SX35 FPGA the design provides a speedup of at least 4.18x and 6.61x over a state of the art microprocessor used in spacecraft systems for the extended Kalman lter and discrete wavelet transform algorithms, respectively. (ii) Switchboxes to enable communication between static and partial reconfigurable regions and a simple protocol to enable schedule changes when a socket\u27s contents are dynamically reconfigured to alter the concurrency of the participating systolic arrays. (iii) A hybrid partial dynamic reconfiguration method that combines Xilinx early access partial reconfiguration, on-chip bitstream decompression, and bitstream relocation to enable fast scaling of systolic arrays on the PolySAF. This technique provided a 2.7x improvement in reconfiguration time compared to an o-chip partial reconfiguration technique that used a Flash card on the FPGA board, and a 44% improvement in BRAM usage compared to not using compression

    Extending the PCIe Interface with Parallel Compression/Decompression Hardware for Energy and Performance Optimization

    Get PDF
    PCIe is a high-performing interface used to move data from a central host PC to an accelerator such as Field Programmable Gate Arrays (FPGA). This interface allows a system to perform fast data transfers in High-Performance Computing (HPC) and provide a performance boost. However, HPC systems normally require large datasets, and in these situations PCIe can become a bottleneck. To address this issue, we propose an open-source hardware compression/decompression system that can be used to adapt with continuously-streamed data with low latency and high throughput. We implement a compressor and decompressor engines on FPGA, scale up with multiple engines working in parallel, and evaluate the energy reduction and performance with different numbers of multiple engines. To alleviate the performance bottleneck in the processor acting as a controller, we propose a hardware scheduler to fairly distribute the datasets among the engines. Our design reduces the transmission time in PCIe, and the results show an energy reduction of up to 48% in the PCIe transfers, thanks to the decrease in the number of bits that have to be transmitted. The overhead in terms of latency is maintained to a minimum and user selectable depending on the tolerances of the intended application

    Proposed data compression schemes for the Galileo S-band contingency mission

    Get PDF
    The Galileo spacecraft is currently on its way to Jupiter and its moons. In April 1991, the high gain antenna (HGA) failed to deploy as commanded. In case the current efforts to deploy the HGA fails, communications during the Jupiter encounters will be through one of two low gain antenna (LGA) on an S-band (2.3 GHz) carrier. A lot of effort has been and will be conducted to attempt to open the HGA. Also various options for improving Galileo's telemetry downlink performance are being evaluated in the event that the HGA will not open at Jupiter arrival. Among all viable options the most promising and powerful one is to perform image and non-image data compression in software onboard the spacecraft. This involves in-flight re-programming of the existing flight software of Galileo's Command and Data Subsystem processors and Attitude and Articulation Control System (AACS) processor, which have very limited computational and memory resources. In this article we describe the proposed data compression algorithms and give their respective compression performance. The planned image compression algorithm is a 4 x 4 or an 8 x 8 multiplication-free integer cosine transform (ICT) scheme, which can be viewed as an integer approximation of the popular discrete cosine transform (DCT) scheme. The implementation complexity of the ICT schemes is much lower than the DCT-based schemes, yet the performances of the two algorithms are indistinguishable. The proposed non-image compression algorith is a Lempel-Ziv-Welch (LZW) variant, which is a lossless universal compression algorithm based on a dynamic dictionary lookup table. We developed a simple and efficient hashing function to perform the string search

    TrusNet: Peer-to-Peer Cryptographic Authentication

    Get PDF
    Originally, the Internet was meant as a general purpose communication protocol, transferring primarily text documents between interested parties. Over time, documents expanded to include pictures, videos and even web pages. Increasingly, the Internet is being used to transfer a new kind of data which it was never designed for. In most ways, this new data type fits in naturally to the Internet, taking advantage of the near limit-less expanse of the protocol. Hardware protocols, unlike previous data types, provide a unique set security problem. Much like financial data, hardware protocols extended across the Internet must be protected with authentication. Currently, systems which do authenticate do so through a central server, utilizing a similar authentication model to the HTTPS protocol. This hierarchical model is often at odds with the needs of hardware protocols, particularly in ad-hoc networks where peer-to-peer communication is prioritized over a hierarchical model. Our project attempts to implement a peer-to-peer cryptographic authentication protocol to be used to protect hardware protocols extending over the Internet. The TrusNet project uses public-key cryptography to authenticate nodes on a distributed network, with each node locally managing a record of the public keys of nodes which it has encountered. These keys are used to secure data transmission between nodes and to authenticate the identities of nodes. TrusNet is designed to be used on multiple different types of network interfaces, but currently only has explicit hooks for Internet Protocol connections. As of June 2016, TrusNet has successfully achieved a basic authentication and communication protocol on Windows 7, OSX, Linux 14 and the Intel Edison. TrusNet uses RC-4 as its stream cipher and RSA as its public-key algorithm, although both of these are easily configurable. Along with the library, TrusNet also enables the building of a unit testing suite, a simple UI application designed to visualize the basics of the system and a build with hooks into the I/O pins of the Intel Edison allowing for a basic demonstration of the system

    Hybrid Enhanced Epidermal Spacesuit Design Approaches

    Get PDF
    A Space suit that does not rely on gas pressurization is a multi-faceted problem that requires major stability controls to be incorporated during design and construction.The concept of Hybrid Epidermal Enhancement space suit integrates evolved human anthropomorphic and physiological adaptations into its functionality, using commercially available bio-medical technologies to address shortcomings of conventional gas pressure suits, and the impracticalities of MCP suits. The prototype HEE Space Suit explored integumentary homeostasis, thermal control and mobility using advanced bio-medical materials technology and construction concepts. The goal was a space suit that functions as an enhanced, multi-functional bio-mimic of the human epidermal layer that works in attunement with the wearer rather than as a separate system. In addressing human physiological requirements for design and construction of the HEE suit, testing regimes were devised and integrated into the prototype which was then subject to a series of detailed tests using both anatomical reproduction methods and human subject

    CPCIe:A Compression-enabled PCIe Core for Energy and Performance Optimization

    Get PDF

    Robo-line storage: Low latency, high capacity storage systems over geographically distributed networks

    Get PDF
    Rapid advances in high performance computing are making possible more complete and accurate computer-based modeling of complex physical phenomena, such as weather front interactions, dynamics of chemical reactions, numerical aerodynamic analysis of airframes, and ocean-land-atmosphere interactions. Many of these 'grand challenge' applications are as demanding of the underlying storage system, in terms of their capacity and bandwidth requirements, as they are on the computational power of the processor. A global view of the Earth's ocean chlorophyll and land vegetation requires over 2 terabytes of raw satellite image data. In this paper, we describe our planned research program in high capacity, high bandwidth storage systems. The project has four overall goals. First, we will examine new methods for high capacity storage systems, made possible by low cost, small form factor magnetic and optical tape systems. Second, access to the storage system will be low latency and high bandwidth. To achieve this, we must interleave data transfer at all levels of the storage system, including devices, controllers, servers, and communications links. Latency will be reduced by extensive caching throughout the storage hierarchy. Third, we will provide effective management of a storage hierarchy, extending the techniques already developed for the Log Structured File System. Finally, we will construct a protototype high capacity file server, suitable for use on the National Research and Education Network (NREN). Such research must be a Cornerstone of any coherent program in high performance computing and communications

    The Design of a System Architecture for Mobile Multimedia Computers

    Get PDF
    This chapter discusses the system architecture of a portable computer, called Mobile Digital Companion, which provides support for handling multimedia applications energy efficiently. Because battery life is limited and battery weight is an important factor for the size and the weight of the Mobile Digital Companion, energy management plays a crucial role in the architecture. As the Companion must remain usable in a variety of environments, it has to be flexible and adaptable to various operating conditions. The Mobile Digital Companion has an unconventional architecture that saves energy by using system decomposition at different levels of the architecture and exploits locality of reference with dedicated, optimised modules. The approach is based on dedicated functionality and the extensive use of energy reduction techniques at all levels of system design. The system has an architecture with a general-purpose processor accompanied by a set of heterogeneous autonomous programmable modules, each providing an energy efficient implementation of dedicated tasks. A reconfigurable internal communication network switch exploits locality of reference and eliminates wasteful data copies
    • …
    corecore