3 research outputs found

    2D PET backprojection acceleration through a 2D predictive cache

    Reducing image reconstruction time is key to the development and spread of PET scanning. This article therefore presents a hardware/software architecture that aims to accelerate 2D reconstruction on a SoPC (System on Programmable Chip) platform, the new generation of reconfigurable chips. The bottleneck posed by memory access latency is removed by the 2D Adaptive and Predictive cache (2D-AP cache).

    Efficiently mapping high-performance early vision algorithms onto multicore embedded platforms

    The combination of low-cost imaging chips and high-performance, multicore, embedded processors heralds a new era in portable vision systems. Early vision algorithms have the potential for highly data-parallel, integer execution. However, an implementation must operate within the constraints of embedded systems, including low clock rate, low-power operation, and limited memory. This dissertation explores new approaches to adapting novel pixel-based vision algorithms for tomorrow's multicore embedded processors. It presents:
    - An adaptive, multimodal background modeling technique called Multimodal Mean that achieves high accuracy and frame rate performance with limited memory and a slow-clock, energy-efficient, integer processing core.
    - A new workload partitioning technique to optimize the execution of early vision algorithms on multicore systems.
    - A novel data transfer technique called cat-tail DMA that provides globally ordered, non-blocking data transfers on a multicore system.
    By using efficient data representations, Multimodal Mean provides accuracy comparable to the widely used Mixture of Gaussians (MoG) multimodal method, yet it achieves a 6.2x improvement in performance and uses 18% less storage than MoG when executing on a representative embedded platform. When this algorithm is adapted to a multicore execution environment, the new workload partitioning technique improves execution times by 25% with only a 125 ms system reaction time. It also reduces the overall number of data transfers by 50%. Finally, the cat-tail buffering technique reduces the data-transfer latency between execution cores and main memory by 32.8% over the baseline technique when executing Multimodal Mean. This technique performs data transfers concurrently with code execution on individual cores, while maintaining global ordering through low-overhead scheduling to prevent collisions.
    Ph.D. Committee Chair: Wills, Scott; Committee Co-Chair: Wills, Linda; Committee Member: Bader, David; Committee Member: Davis, Jeff; Committee Member: Hamblen, James; Committee Member: Lanterman, Aaro