293 research outputs found

Time-Shared Execution of Realtime Computer Vision Pipelines by Dynamic Partial Reconfiguration

Author: Hoe James C.
Nguyen Marie
Publication date: 25/05/2018

This paper presents an FPGA runtime framework that demonstrates the feasibility of using dynamic partial reconfiguration (DPR) for time-sharing an FPGA by multiple realtime computer vision pipelines. The presented time-sharing runtime framework manages an FPGA fabric that can be round-robin time-shared by different pipelines at the time scale of individual frames. In this new use-case, the challenge is to achieve useful performance despite high reconfiguration time. The paper describes the basic runtime support as well as four optimizations necessary to achieve realtime performance given the limitations of DPR on today's FPGAs. The paper provides a characterization of a working runtime framework prototype on a Xilinx ZC706 development board. The paper also reports the performance of realtime computer vision pipelines when time-shared

arXiv.org e-Print Archive

Crossref

Scipedia

Effective parallel computation on workstation cluster with a user-level communication network

Author: Hoe James C. (James Chu-Yue)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1994

Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994. Includes bibliographical references (p. 118-119). By James C. Hoe. M.S.

DSpace@MIT

ProtoFlex: Co-simulation for Component-wise FPGA Emulator Development

Author: Chung Eric S.
Falsafi Babak
Hoe James C.
Publication date: 21/04/2009

Infoscience - École polytechnique fédérale de Lausanne

PAI: A lightweight mechanism for single-node memory recovery in DSM servers

Author: Falsafi Babak
Hoe James C.
Kim Jangwoo
Smolens Jared C.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 06/04/2009

Several recent studies identify the memory system as the most frequent source of hardware failures in commercial servers. Techniques to protect the memory system from failures must continue to service memory requests, despite hardware failures. Furthermore, to support existing OS's, the physical address space must be retained following reconfiguration. Existing techniques either suffer from a high performance overhead or require pervasive hardware changes to support transparent recovery. In this paper, we propose Physical Address Indirection (PAI), a lightweight, hardware-based mechanism for memory system failure recovery. PAI provides a simple hardware mapping to transparently reconstruct affected data in alternate locations, while maintaining high performance and avoiding physical address changes. With full-system simulation of commercial and scientific workloads on a 16-node distributed shared memory server, we show that prior techniques have an average degraded mode performance loss of 14% and 51% for commercial and scientific workloads, respectively. Using PAI's dataswap reconstruction, the same workloads have 1% and 32% average performance losses. © 2007 IEEE

Infoscience - École polytechnique fédérale de Lausanne

Extraction of radium from aqueous solutions of rare-earth element chlorides using barium sulfate nanoparticles

Author: Beres Stephen B.
Brandt Claudia M.
Haase Gerhard Michael
Hoe Nancy P.
Liu Mengyao
Matsumoto Masakado
Musser James M.
Sylva Gail L.
Publication date: 01/01/2003

Electronic archive of Tomsk Polytechnic University

Understanding the performance of concurrent error detecting superscalar microarchitectures

Author: Falsafi Babak
Hoe James C.
Kim Jangwoo
Smolens Jared C.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 06/04/2009

Superscalar out-of-order microarchitectures can be modified to support redundant execution of a program as two concurrent threads for soft-error detection. However, the extra workload from redundant execution incurs a performance penalty due to increased contention for resources throughout the datapath. We present four key parameters that affect performance of these designs, namely 1) issue and functional unit bandwidth, 2) issue queue and reorder buffer capacity, 3) decode and retirement bandwidth, and 4) coupling between redundant threads' instantaneous resource requirements. We then survey existing work in concurrent error detecting superscalar microarchitectures and evaluate these proposals with respect to the four factors. © 2005 IEEE

Infoscience - École polytechnique fédérale de Lausanne

TurboSMARTS: Accurate microarchitecture simulation sampling in minutes

Author: Falsafi Babak
Hoe James C.
Wenisch Thomas F.
Wunderlich Roland E.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 06/04/2009

Recent research proposes accelerating processor microarchitecture simulation through statistical sampling. Prior simulation sampling approaches construct accurate model state for each measurement by continuously warming large microarchitectural structures (e.g., caches and the branch predictor) while emulating the billions of instructions between measurements. This approach, called functional warming, occupies hours of runtime while the detailed simulation that is measured requires mere minutes. To eliminate the functional warming bottleneck, we propose TurboSMARTS, a simulation framework that stores functionally-warmed state in a library of small, reusable checkpoints. TurboSMARTS enables the creation of the thousands of checkpoints necessary for accurate sampling by storing only the subset of warmed state accessed during simulation of each brief execution window. TurboSMARTS matches the accuracy of prior simulation sampling techniques (i.e., ±3% error with 99.7% confidence), while estimating the performance of an 8-way out-of-order superscalar processor running SPEC CPU2000 in 91 seconds per benchmark, on average, using a 12 GB checkpoint library

Infoscience - École polytechnique fédérale de Lausanne