6,205 research outputs found

    FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture

    Neural network (NN) accelerators based on emerging ReRAM (resistive random access memory) technologies have been investigated as a promising solution to the memory wall challenge, owing to the processing-in-memory capability of ReRAM-crossbar-based processing elements (PEs). However, the high efficiency and high density advantages of ReRAM have not been fully utilized due to the huge communication demands among PEs and the overhead of peripheral circuits. In this paper, we propose a full system stack solution, composed of a reconfigurable architecture design, Field Programmable Synapse Array (FPSA), and its software system, including a neural synthesizer, a temporal-to-spatial mapper, and placement & routing. We rely heavily on the software system to make the hardware design compact and efficient. To satisfy the high-performance communication demand, we optimize it with a reconfigurable routing architecture and the placement & routing tool. To improve the computational density, we greatly simplify the PE circuit with the spiking schema and then adopt the neural synthesizer to enable the high-density computation resources to support different kinds of NN operations. In addition, we provide spiking memory blocks (SMBs) and configurable logic blocks (CLBs) in hardware and leverage the temporal-to-spatial mapper to utilize them to balance the storage and computation requirements of NNs. Owing to the end-to-end software system, we can efficiently deploy existing deep neural networks to FPSA. Evaluations show that, compared to PRIME, one of the state-of-the-art ReRAM-based NN accelerators, the computational density of FPSA improves by 31x; for representative NNs, its inference performance can achieve up to 1000x speedup. Comment: Accepted by ASPLOS 201

    Deep Space Network information system architecture study

    The purpose of this article is to describe an architecture for the Deep Space Network (DSN) information system in the years 2000-2010 and to provide guidelines for its evolution during the 1990s. The study scope is defined to be from the front-end areas at the antennas to the end users (spacecraft teams, principal investigators, archival storage systems, and non-NASA partners). The architectural vision provides guidance for major DSN implementation efforts during the next decade. A strong motivation for the study is an expected dramatic improvement in information-systems technologies, such as the following: computer processing, automation technology (including knowledge-based systems), networking and data transport, software and hardware engineering, and human-interface technology. The proposed Ground Information System has the following major features: unified architecture from the front-end area to the end user; open-systems standards to achieve interoperability; DSN production of level 0 data; delivery of level 0 data from the Deep Space Communications Complex, if desired; dedicated telemetry processors for each receiver; security against unauthorized access and errors; and highly automated monitor and control.

    Technical note: TRACKFlow, a new versatile microscope system for fission track analysis

    We here present TRACKFlow, a new system with dedicated modules for the fission track (FT) laboratory. It is based on the motorised Nikon Eclipse Ni-E upright microscope with the Nikon DS-Ri2 full frame camera and is embedded within the Nikon NIS-Elements Advanced Research software package. TRACKFlow decouples image acquisition from analysis to reduce scheduling pressure on the microscope. The system is further intended to be versatile and adaptable to multiple preparation protocols and analysis approaches. It is suited to small-scale laboratories and is also ready for upscaling to high-throughput imaging. The versatility of the system, based on the operators’ full access to the NIS-Elements package, exceeds that of other systems for FT and extends to stepping away from the dedicated FT microscope towards a general microscope for Earth Sciences that includes dedicated modules for FT research. TRACKFlow consists of a number of user-friendly protocols based on the well plate design, which allows sequential scanning of multiple samples without the need to replace the slide on the stage. All protocols include a sub-protocol to scan a map of the mount for easy navigation through the samples on the stage. Two protocols are designed for the External Detector Method (EDM) and the LA–ICP–MS apatite fission track (LAFT) approach, with tools for repositioning and calibration to the external detector. Two other tools are designed for large crystals, such as the Durango age standard and U-doped glass external detectors. These protocols generate a regular grid of points and inspect whether each point is suitable for analysis. Both protocols also include an option to image each withheld point. One more protocol is included for the measurement of etch pit diameters, and a final protocol prepares a list of coordinates for correlative microscopy. In a following phase of development, TRACKFlow can be expanded towards fully autonomous calibration, grain detection and imaging.
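
The grid step described in the abstract (generate a regular grid of points, then check each point for suitability) can be pictured with a minimal sketch. This is not TRACKFlow's code and does not use the NIS-Elements API: the function names, the rectangular scan area, the spacing parameter and the suitability callback are all hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical sketch of a regular-grid protocol; names and parameters are
# assumptions for illustration, not part of TRACKFlow or NIS-Elements.

def regular_grid(x_min, y_min, x_max, y_max, spacing):
    """Return (x, y) coordinates on a regular grid inside a rectangle."""
    nx = int((x_max - x_min) // spacing) + 1  # points along x
    ny = int((y_max - y_min) // spacing) + 1  # points along y
    return [(x_min + i * spacing, y_min + j * spacing)
            for j in range(ny)
            for i in range(nx)]


def split_by_suitability(points, is_suitable):
    """Partition grid points into retained and rejected lists.

    `is_suitable` stands in for whatever check is applied at each point;
    here it is just a caller-supplied function.
    """
    retained = [p for p in points if is_suitable(p)]
    rejected = [p for p in points if not is_suitable(p)]
    return retained, rejected


if __name__ == "__main__":
    grid = regular_grid(0, 0, 2, 1, 1)
    # Example rule (assumed): reject points on the left edge of the area.
    kept, dropped = split_by_suitability(grid, lambda p: p[0] > 0)
    print(len(grid), len(kept), len(dropped))
```

The retained list is what a later imaging pass would iterate over.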

    High throughput spatial convolution filters on FPGAs

    Digital signal processing (DSP) on field-programmable gate arrays (FPGAs) has long been appealing because of the inherent parallelism in these computations, which can be easily exploited to accelerate such algorithms. FPGAs have evolved significantly to further enhance the mapping of these algorithms, including additional hard blocks, such as the DSP blocks found in modern FPGAs. Although these DSP blocks can offer more efficient mapping of DSP computations, they are primarily designed for 1-D filter structures. We present a study on spatial convolutional filter implementations on FPGAs, optimizing around the structure of the DSP blocks to offer high throughput while maintaining the coefficient flexibility that other published architectures usually sacrifice. We show that it is possible to implement large filters for large 4K resolution image frames at frame rates of 30–60 FPS, while maintaining functional flexibility.
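
To give a sense of the throughput involved, the following sketch computes the pixel rate a filter must sustain and shows a direct 2-D spatial convolution with coefficients supplied at run time. It is a software illustration only, not the FPGA architecture from the paper. The 3840x2160 frame size is an assumed reading of "4K", and the 7x7 kernel size in the usage example is hypothetical; the abstract specifies neither.

```python
# Illustrative sketch; frame size and kernel size are assumptions, and this
# is plain software, not the DSP-block mapping studied in the paper.

def pixel_rate(width, height, fps):
    """Pixels per second that must be filtered to sustain the frame rate."""
    return width * height * fps


def macs_per_second(width, height, fps, k):
    """Multiply-accumulates per second for a k x k filter, one output per pixel."""
    return pixel_rate(width, height, fps) * k * k


def convolve2d(image, kernel):
    """Direct 2-D convolution over the valid region (kernel is flipped).

    `image` and `kernel` are lists of rows. The coefficients are ordinary
    run-time data, so they can be changed without touching the loop structure.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    # Flipped kernel index distinguishes convolution from correlation.
                    acc += image[r + i][c + j] * kernel[kh - 1 - i][kw - 1 - j]
            row.append(acc)
        out.append(row)
    return out


if __name__ == "__main__":
    print(pixel_rate(3840, 2160, 30))          # assumed UHD frame at 30 FPS
    print(pixel_rate(3840, 2160, 60))          # assumed UHD frame at 60 FPS
    print(macs_per_second(3840, 2160, 60, 7))  # hypothetical 7x7 kernel
    print(convolve2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 0]]))
```

Under these assumptions the filter must process roughly 0.25 to 0.5 gigapixels per second, and the multiply-accumulate rate grows with the square of the kernel width, which is why mapping the inner loops onto parallel hardware multipliers matters.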