Hardware accelerator for ALICE ITS Cluster Finder

Author: Qazi Anisa Aziz
Publication date: 30/08/2018

An integral part of the upgrade to the Inner Tracking System (ITS) of the ALICE detector is support for the higher readout rates of charged particles that result from the increased interaction rate of 50 kHz in Pb-Pb collisions at the Large Hadron Collider (LHC). A major task of the ITS readout system is to compress the data and store it in the mass storage system for later analysis. The first step of data compression is cluster finding on the pixel data received from the ALPIDE sensors, followed by Huffman compression. In this thesis, we evaluate the resource requirements for implementing cluster finding on the Arria 10 FPGAs, which are an integral part of the ITS readout system, in an attempt to reduce the number of computing nodes needed on the First Level Processors (FLPs) and to speed up the processing. We present a hardware implementation of a single-pass Connected Component Labeling algorithm. We propose a special linked-list-based merger table that ensures a constant worst-case latency for chained label mergers, independent of their length. To retrieve the shapeIDs, pixels are segregated into clusters on the fly, without the need to store labeled pixels in memory. Verilog code implementing this design has been written, a testbench for functional verification has been developed, and the design has been synthesized. Electrical and Computer Engineering
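The single-pass idea described in this abstract can be illustrated in software. The following is a minimal Python sketch, not the thesis's Verilog design: it assumes 8-way connectivity, uses an ordinary union-find table in place of the linked-list merger table, and tracks a pixel count and bounding box per cluster as a stand-in for shape retrieval. The function name `find_clusters` and the statistics layout are hypothetical.

```python
def find_clusters(rows):
    """Single-pass, 8-connected cluster finding over a stream of pixel rows.

    Only the previous row's labels are kept; no labeled image is stored.
    Returns sorted tuples (count, min_row, min_col, max_row, max_col).
    """
    parent = {}  # merger table (plain union-find, an assumption): label -> parent
    stats = {}   # root label -> [count, min_row, min_col, max_row, max_col]

    def find(label):
        # Follow the merger chain to its root, then compress the path.
        root = label
        while parent[root] != root:
            root = parent[root]
        while parent[label] != root:
            parent[label], label = root, parent[label]
        return root

    def union(a, b):
        # Merge two clusters and fold their statistics together.
        ra, rb = find(a), find(b)
        if ra == rb:
            return ra
        if rb < ra:
            ra, rb = rb, ra
        parent[rb] = ra
        sa, sb = stats[ra], stats.pop(rb)
        stats[ra] = [sa[0] + sb[0], min(sa[1], sb[1]), min(sa[2], sb[2]),
                     max(sa[3], sb[3]), max(sa[4], sb[4])]
        return ra

    next_label = 1
    prev = []  # labels of the previous row only
    for r, row in enumerate(rows):
        cur = [0] * len(row)
        for c, pixel in enumerate(row):
            if not pixel:
                continue
            # Already-visited 8-neighbours: left, upper-left, up, upper-right.
            neighbours = [cur[c - 1]] if c > 0 else []
            for pc in (c - 1, c, c + 1):
                if 0 <= pc < len(prev):
                    neighbours.append(prev[pc])
            labels = [n for n in neighbours if n]
            if labels:
                root = find(labels[0])
                for other in labels[1:]:
                    root = union(root, other)
            else:
                root = next_label
                next_label += 1
                parent[root] = root
                stats[root] = [0, r, c, r, c]
            s = stats[root]
            s[0] += 1
            s[1], s[2] = min(s[1], r), min(s[2], c)
            s[3], s[4] = max(s[3], r), max(s[4], c)
            cur[c] = root
        prev = cur
    return sorted(tuple(s) for s in stats.values())
```

Because cluster statistics are merged whenever two labels meet, the result is available as soon as the last row has been consumed. Unlike the merger table proposed in the thesis, this union-find does not guarantee a constant worst-case latency for chained mergers.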

Texas ScholarWorks

Integrated Development and Parallelization of Automated Dicentric Chromosome Identification Software to Expedite Biodosimetry Analysis

Author: Li Yanxin
Publication venue: Scholarship@Western
Publication date: 16/04/2013

Manual cytogenetic biodosimetry cannot handle mass casualty events. We present automated dicentric chromosome identification (ADCI) software that uses parallel computing technology. A parallelization strategy combining data and task parallelism, together with optimization of I/O operations, has been designed, implemented, and incorporated into ADCI. Experiments on an eight-core desktop show that our algorithm speeds up ADCI at least fourfold. Experiments on Symmetric Computing, SHARCNET, and Blue Gene/Q multi-processor computers demonstrate that parallelized ADCI can process thousands of samples for cytogenetic biodosimetry in a few hours. This increase in speed underscores the effectiveness of parallelization in accelerating ADCI. Our software will be an important tool for handling the scale of mass casualty ionizing radiation events by expediting accurate detection of dicentric chromosomes.

CiteSeerX

Scholarship@Western

A State-of-the-Art Review with Code about Connected Components Labeling on GPUs

Author: Costantino Grana
Federico Bolelli
Luca Lumetti
Stefano Allegretti
Publication date: 01/01/2024

This article is about Connected Components Labeling (CCL) algorithms developed for GPU accelerators. The task is employed in many modern image-processing pipelines and is a fundamental step whenever object recognition is required. For this reason, considerable effort has gone into many different proposals for improving algorithm performance using different kinds of hardware accelerators. This paper focuses on GPU-based algorithmic solutions published in the last two decades, highlighting their distinctive traits and the improvements they leverage. The review is accompanied by source code, which allows all the algorithms to be reproduced straightforwardly in different experimental settings. A comprehensive evaluation on multiple environments is also provided, including different operating systems, compilers, and GPUs. Our assessments are performed by means of several tests, on both real-case and synthetically generated images, highlighting the strengths and weaknesses of each proposal. Overall, the experimental results revealed that block-based algorithms outperform all the other algorithmic solutions on both 2D images and 3D volumes, regardless of the selected environment.

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Optimizing GPU-Based Connected Components Labeling Algorithms

Author: Stefano Allegretti
Costantino Grana
Federico Bolelli
Michele Cancilla
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2018

Connected Components Labeling (CCL) is a fundamental image processing technique, widely used in various application areas. The computational throughput of Graphics Processing Units (GPUs) makes them well suited to this kind of algorithm. In the last decade, many approaches to computing CCL on GPUs have been proposed. Unfortunately, most of them have focused on 4-way connectivity, neglecting the importance of 8-way connectivity. This paper aims to extend state-of-the-art GPU-based algorithms from 4-way to 8-way connectivity and to improve them with additional optimizations. Experimental results revealed the effectiveness of the proposed strategies.
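The difference between 4-way and 8-way connectivity can be shown with a small example. The following is a minimal sequential Python sketch using a breadth-first flood fill; it is not one of the GPU algorithms the paper discusses, and the names `OFFSETS_4`, `OFFSETS_8`, and `count_components` are hypothetical.

```python
from collections import deque

# Neighbour offsets (row, column): 4-way uses edge neighbours only,
# 8-way adds the four diagonal neighbours.
OFFSETS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
OFFSETS_8 = OFFSETS_4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(image, offsets):
    """Count connected foreground components under the given connectivity."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Unvisited foreground pixel: start a new component and flood it.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


diagonal = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
print(count_components(diagonal, OFFSETS_4))  # 3: diagonal pixels are separate
print(count_components(diagonal, OFFSETS_8))  # 1: diagonal pixels are joined
```

Pixels that touch only at a corner form separate components under 4-way connectivity and a single component under 8-way connectivity, which is why the choice of connectivity changes the labeling result.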

Crossref

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

A Real-Time Algorithm for Connected Component Labeling in Images (Un algoritmo en tiempo real para etiquetado de componentes conectados en imágenes)

Author: Brox Jiménez Piedad
Calvo Gallego Elisa
Sánchez Solano Santiago
Publication venue: Instituto Nacional de Astrofísica, Óptica y Electrónica; Universidad de Sevilla
Publication date: 01/03/2012

This paper presents a two-pass algorithm for real-time labeling of the connected components in an image. The proposed algorithm is a good option compared with other two-pass and multi-pass alternatives because it was designed so that its implementation on FPGAs offers a good trade-off between resource usage and operating speed. Two hardware implementations of this algorithm are described, both developed with a design flow based on Xilinx's System Generator tool. Funding: European Economic Community, MOBY-DIC FP7-IST-248858; Ministerio de Ciencia e Innovación (Spain), TEC2008-04920; Junta de Andalucía, P08-TIC-03674; FEDER funds, P08- TIC-0367
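For readers unfamiliar with two-pass labeling, the general technique can be sketched in software. The following Python sketch shows the classic two-pass scheme with an equivalence table; it is not the authors' FPGA design, and the 4-way connectivity and the function name `two_pass_label` are assumptions made for illustration.

```python
def two_pass_label(image):
    """Classic two-pass connected component labeling (4-way connectivity).

    Returns a label image where background is 0 and components are
    numbered 1, 2, 3, ... in raster-scan order of first appearance.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # equivalence table; index 0 is reserved for background

    def find(x):
        # Follow equivalences until a representative label is reached.
        while parent[x] != x:
            x = parent[x]
        return x

    # First pass: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up and left:
                root_up, root_left = find(up), find(left)
                low, high = min(root_up, root_left), max(root_up, root_left)
                parent[high] = low  # the two labels name the same component
                labels[r][c] = low
            elif up or left:
                labels[r][c] = up or left
            else:
                parent.append(len(parent))  # new provisional label
                labels[r][c] = len(parent) - 1

    # Second pass: replace each provisional label with a final, consecutive one.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels
```

On a U-shaped input, the two arms receive different provisional labels in the first pass and are unified in the second once the equivalence recorded at the bottom row has been resolved.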

idUS. Depósito de Investigación Universidad de Sevilla

On the Hardware/Software Design and Implementation of a High Definition Multiview Video Surveillance System

Author: Chan SC
Hung YS
Ni JQ
Tan HJ
Wu J
Zhang S
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2013

HKU Scholars Hub

Fast and efficient FPGA implementation of connected operators

Author: Akil Mohamed
Contou-Carrère François
Dokladalova Eva
Ngan Nicolas
Publication venue: Elsevier BV
Publication date: 06/07/2011

Connected Component Tree (CCT)-based operators play a central role in the development of new algorithms for image processing applications such as pattern recognition, video surveillance, and motion extraction. Because CCT construction is time-consuming (about 80% of the application time), these applications remain out of reach for mobile embedded systems. This paper presents an efficient FPGA implementation of CCT construction suited to embedded systems. Three main contributions are discussed: an efficient data structure adapted to representing the CCT in embedded systems, a memory organization suited to FPGA implementation using on-chip memory, and a customizable hardware accelerator architecture for CCT-based applications.

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM