12,448 research outputs found

    Demonstration of Run-time Spatial Mapping of Streaming Applications to a Heterogeneous Multi-Processor System-on-Chip (MPSoC)

    Get PDF
    In this paper, the problem of spatial mapping is defined. Reasons are presented to show why performing spatial mappings at run-time is both necessary and desirable and criteria for the qualitative comparison of spatial mappings are introduced. An algorithm is described that implements a preliminary spatial mapper. The methods used in the algorithm are demonstrated with an illustrative example

    Time-Shared Execution of Realtime Computer Vision Pipelines by Dynamic Partial Reconfiguration

    Full text link
    This paper presents an FPGA runtime framework that demonstrates the feasibility of using dynamic partial reconfiguration (DPR) for time-sharing an FPGA by multiple realtime computer vision pipelines. The presented time-sharing runtime framework manages an FPGA fabric that can be round-robin time-shared by different pipelines at the time scale of individual frames. In this new use-case, the challenge is to achieve useful performance despite high reconfiguration time. The paper describes the basic runtime support as well as four optimizations necessary to achieve realtime performance given the limitations of DPR on today's FPGAs. The paper provides a characterization of a working runtime framework prototype on a Xilinx ZC706 development board. The paper also reports the performance of realtime computer vision pipelines when time-shared

    Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

    Get PDF
    In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201

    GraphR: Accelerating Graph Processing Using ReRAM

    Full text link
    This paper presents GRAPHR, the first ReRAM-based graph processing accelerator. GRAPHR follows the principle of near-data processing and explores the opportunity of performing massive parallel analog operations with low hardware and energy cost. The analog computation is suit- able for graph processing because: 1) The algorithms are iterative and could inherently tolerate the imprecision; 2) Both probability calculation (e.g., PageRank and Collaborative Filtering) and typical graph algorithms involving integers (e.g., BFS/SSSP) are resilient to errors. The key insight of GRAPHR is that if a vertex program of a graph algorithm can be expressed in sparse matrix vector multiplication (SpMV), it can be efficiently performed by ReRAM crossbar. We show that this assumption is generally true for a large set of graph algorithms. GRAPHR is a novel accelerator architecture consisting of two components: memory ReRAM and graph engine (GE). The core graph computations are performed in sparse matrix format in GEs (ReRAM crossbars). The vector/matrix-based graph computation is not new, but ReRAM offers the unique opportunity to realize the massive parallelism with unprecedented energy efficiency and low hardware cost. With small subgraphs processed by GEs, the gain of performing parallel operations overshadows the wastes due to sparsity. The experiment results show that GRAPHR achieves a 16.01x (up to 132.67x) speedup and a 33.82x energy saving on geometric mean compared to a CPU baseline system. Com- pared to GPU, GRAPHR achieves 1.69x to 2.19x speedup and consumes 4.77x to 8.91x less energy. GRAPHR gains a speedup of 1.16x to 4.12x, and is 3.67x to 10.96x more energy efficiency compared to PIM-based architecture.Comment: Accepted to HPCA 201

    The Road Ahead for Networking: A Survey on ICN-IP Coexistence Solutions

    Full text link
    In recent years, the current Internet has experienced an unexpected paradigm shift in the usage model, which has pushed researchers towards the design of the Information-Centric Networking (ICN) paradigm as a possible replacement of the existing architecture. Even though both Academia and Industry have investigated the feasibility and effectiveness of ICN, achieving the complete replacement of the Internet Protocol (IP) is a challenging task. Some research groups have already addressed the coexistence by designing their own architectures, but none of those is the final solution to move towards the future Internet considering the unaltered state of the networking. To design such architecture, the research community needs now a comprehensive overview of the existing solutions that have so far addressed the coexistence. The purpose of this paper is to reach this goal by providing the first comprehensive survey and classification of the coexistence architectures according to their features (i.e., deployment approach, deployment scenarios, addressed coexistence requirements and architecture or technology used) and evaluation parameters (i.e., challenges emerging during the deployment and the runtime behaviour of an architecture). We believe that this paper will finally fill the gap required for moving towards the design of the final coexistence architecture.Comment: 23 pages, 16 figures, 3 table

    Real-time optical manipulation of cardiac conduction in intact hearts

    Get PDF
    Optogenetics has provided new insights in cardiovascular research, leading to new methods for cardiac pacing, resynchronization therapy and cardioversion. Although these interventions have clearly demonstrated the feasibility of cardiac manipulation, current optical stimulation strategies do not take into account cardiac wave dynamics in real time. Here, we developed an all‐optical platform complemented by integrated, newly developed software to monitor and control electrical activity in intact mouse hearts. The system combined a wide‐field mesoscope with a digital projector for optogenetic activation. Cardiac functionality could be manipulated either in free‐run mode with submillisecond temporal resolution or in a closed‐loop fashion: a tailored hardware and software platform allowed real‐time intervention capable of reacting within 2 ms. The methodology was applied to restore normal electrical activity after atrioventricular block, by triggering the ventricle in response to optically mapped atrial activity with appropriate timing. Real‐time intraventricular manipulation of the propagating electrical wavefront was also demonstrated, opening the prospect for real‐time resynchronization therapy and cardiac defibrillation. Furthermore, the closed‐loop approach was applied to simulate a re‐entrant circuit across the ventricle demonstrating the capability of our system to manipulate heart conduction with high versatility even in arrhythmogenic conditions. The development of this innovative optical methodology provides the first proof‐of‐concept that a real‐time optically based stimulation can control cardiac rhythm in normal and abnormal conditions, promising a new approach for the investigation of the (patho)physiology of the heart

    On autonomic platform-as-a-service: characterisation and conceptual model

    Get PDF
    In this position paper, we envision a Platform-as-a-Service conceptual and architectural solution for large-scale and data intensive applications. Our architectural approach is based on autonomic principles, therefore, its ultimate goal is to reduce human intervention, the cost, and the perceived complexity by enabling the autonomic platform to manage such applications itself in accordance with highlevel policies. Such policies allow the platform to (i) interpret the application specifications; (ii) to map the specifications onto the target computing infrastructure, so that the applications are executed and their Quality of Service (QoS), as specified in their SLA, enforced; and, most importantly, (iii) to adapt automatically such previously established mappings when unexpected behaviours violate the expected. Such adaptations may involve modifications in the arrangement of the computational infrastructure, i.e. by re-designing a different communication network topology that dictates how computational resources interact, or even the live-migration to a different computational infrastructure. The ultimate goal of these challenges is to (de)provision computational machines, storage and networking links and their required topologies in order to supply for the application the virtualised infrastructure that better meets the SLAs. Generic architectural blueprints and principles have been provided for designing and implementing an autonomic computing system.We revisit them in order to provide a customised and specific viewfor PaaS platforms and integrate emerging paradigms such as DevOps for automate deployments, Monitoring as a Service for accurate and large-scale monitoring, or well-known formalisms such as Petri Nets for building performance models
    corecore