133 research outputs found

    Simulating a Pipelined Reconfigurable Mesh on a Linear Array with a Reconfigurable Pipelined Bus System

    Get PDF
    Due to the unidirectional nature of propagation and predictable delays, optically pipelined buses have been gaining more attention. There have been many models proposed over time that use reconfigurable optically pipelined buses. The reconfigurable nature of the models makes them capable of changing their component’s functionalities and structure that connects the components at every step of computation. There are both one dimensional as well as k –dimensional models that have been proposed in the literature. Though equivalence between various one dimensional models and equivalence between different two dimensional models had been established, so far there has not been any attempt to explore the relationship between a one dimensional model and a two dimensional model. In the proposed research work it is shown that a move from one to two or more dimensions does not cause any increase in the volume of communication between the processors as they communicate in a pipelined manner on the same optical bus. When moving from two dimensions to one dimension, the challenge is to map the processors so that those belonging to a two-dimensional bus segment are contiguous and in the same order on the one-dimensional model. This does not increase any increase in communication overhead as the processors instead of communicating on two dimensional buses now communicate on a linear one dimensional bus structure. To explore the relationship between one dimensional and two dimensional models a commonly used model Linear Array with a Reconfigurable Pipelined Bus System (LARPBS) and its two dimensional counterpart Pipelined Reconfigurable Mesh (PR-Mesh) are chosen Here an attempt has been made to present a simulation of a two dimensional PR-Mesh on a one dimensional LARPBS to establish complexity of the models with respect to one another, and to determine the efficiency with which the LARPBS can simulate the PR-Mesh

    Parameterized Implementation of K-means Clustering on Reconfigurable Systems

    Get PDF
    Processing power of pattern classification algorithms on conventional platforms has not been able to keep up with exponentially growing datasets. However, algorithms such as k-means clustering include significant potential parallelism that could be exploited to enhance processing speed on conventional platforms. A better and effective solution to speed-up the algorithm performance is the use of a hardware assist since parallel kernels can be partitioned and concurrently run on hardware as opposed to the sequential software flow. A parameterized hardware implementation of k-means clustering is presented as a proof of concept on the Pilchard Reconfigurable computing system. The hardware implementation is shown to have speedups of about 500 over conventional implementations on a general-purpose processor. A scalability analysis is done to provide a future direction to take the current implementation of 3 classes and scale it to over N classes

    Efficient mapping of EEG algorithms

    Get PDF

    Efficient reconfigurable architectures for 3D medical image compression

    Get PDF
    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Recently, the more widespread use of three-dimensional (3-D) imaging modalities, such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), and ultrasound (US) have generated a massive amount of volumetric data. These have provided an impetus to the development of other applications, in particular telemedicine and teleradiology. In these fields, medical image compression is important since both efficient storage and transmission of data through high-bandwidth digital communication lines are of crucial importance. Despite their advantages, most 3-D medical imaging algorithms are computationally intensive with matrix transformation as the most fundamental operation involved in the transform-based methods. Therefore, there is a real need for high-performance systems, whilst keeping architectures exible to allow for quick upgradeability with real-time applications. Moreover, in order to obtain efficient solutions for large medical volumes data, an efficient implementation of these operations is of significant importance. Reconfigurable hardware, in the form of field programmable gate arrays (FPGAs) has been proposed as viable system building block in the construction of high-performance systems at an economical price. Consequently, FPGAs seem an ideal candidate to harness and exploit their inherent advantages such as massive parallelism capabilities, multimillion gate counts, and special low-power packages. The key achievements of the work presented in this thesis are summarised as follows. Two architectures for 3-D Haar wavelet transform (HWT) have been proposed based on transpose-based computation and partial reconfiguration suitable for 3-D medical imaging applications. These applications require continuous hardware servicing, and as a result dynamic partial reconfiguration (DPR) has been introduced. Comparative study for both non-partial and partial reconfiguration implementation has shown that DPR offers many advantages and leads to a compelling solution for implementing computationally intensive applications such as 3-D medical image compression. Using DPR, several large systems are mapped to small hardware resources, and the area, power consumption as well as maximum frequency are optimised and improved. Moreover, an FPGA-based architecture of the finite Radon transform (FRAT)with three design strategies has been proposed: direct implementation of pseudo-code with a sequential or pipelined description, and block random access memory (BRAM)- based method. An analysis with various medical imaging modalities has been carried out. Results obtained for image de-noising implementation using FRAT exhibits promising results in reducing Gaussian white noise in medical images. In terms of hardware implementation, promising trade-offs on maximum frequency, throughput and area are also achieved. Furthermore, a novel hardware implementation of 3-D medical image compression system with context-based adaptive variable length coding (CAVLC) has been proposed. An evaluation of the 3-D integer transform (IT) and the discrete wavelet transform (DWT) with lifting scheme (LS) for transform blocks reveal that 3-D IT demonstrates better computational complexity than the 3-D DWT, whilst the 3-D DWT with LS exhibits a lossless compression that is significantly useful for medical image compression. Additionally, an architecture of CAVLC that is capable of compressing high-definition (HD) images in real-time without any buffer between the quantiser and the entropy coder is proposed. Through a judicious parallelisation, promising results have been obtained with limited resources. In summary, this research is tackling the issues of massive 3-D medical volumes data that requires compression as well as hardware implementation to accelerate the slowest operations in the system. Results obtained also reveal a significant achievement in terms of the architecture efficiency and applications performance.Ministry of Higher Education Malaysia (MOHE), Universiti Tun Hussein Onn Malaysia (UTHM) and the British Counci

    Optimization of a hardware/software coprocessing platform for EEG eyeblink detection and removal

    Get PDF
    The feasibility of implementing a real-time system for removing eyeblink artifacts from electroencephalogram (EEG) recordings utilizing a hardware/software coprocessing platform was investigated. A software based wavelet and independent component analysis (ICA) eyeblink detection and removal process was extended to enable variation in its processing parameters. Exploiting the efficiency of hardware and the reconfigurability of software, it was ported to a field programmable gate array (FPGA) development platform which was found to be capable of implementing the revised algorithm, although not in real-time. The implemented hardware and software solution was applied to a collection of both simulated and clinically acquired EEG data with known artifact and waveform characteristics to assess its speed and accuracy. Configured for optimal accuracy in terms of minimal false positives and negatives as well as maintaining the integrity of the underlying EEG, especially when encountering EEG waveform patterns with an appearance similar to eyeblink artifacts, the system was capable of processing a 10 second EEG epoch in an average of 123 seconds. Configured for efficiency, but with diminished accuracy, the system required an average of 34 seconds. Varying the ICA contrast function showed that the gaussian nonlinearity provided the best combination of reliability and accuracy, albeit with a long execution time. The cubic nonlinearity was fast, but unreliable, while the hyperbolic tangent contrast function frequently diverged. It is believed that the utilization of programmable logic with increased logic capacity and processing speed may enable this approach to achieve the objective of real-time operation

    Automated Nuclei Segmentation of Breast Cancer Histopathology

    Get PDF
    Automated detection and segmentation of cell nuclei is an essential step in breast cancer histopathology, so that there is improved accuracy, speed, level of automation and adaptability to new application. The goal of this paper is to develop efficient and accurate algorithms for detecting and segmenting cell nuclei in 2-D histological images. In this paper we will implement the utility of our nuclear segmentation algorithm in accurate extraction of nuclear features for automated grading of (a) breast cancer, and (b) distinguishing between cancerous and benign breast histology specimens. In order to address the issue the scheme integrates image information across three different scales: (1) low level information based on pixel values, (2) high-level information based on relationships between pixels for object detection, and(3)domain-specific information based on relationships between histological structures. Low-level information is utilized by a Bayesian Classifier to generate likelihood that each pixel belongs to an object of interest. High-level information is extracted in two ways: (i) by a level-set algorithm, where a contour is evolved in the likelihood scenes generated by the Bayesian classifier to identify object boundaries, and (ii) by a template matching algorithm, where shape models are used to identify glands and nuclei from the low-level likelihood scenes. Structural constraints are imposed via domain specific knowledge in order to verify whether the detected objects do indeed belong to structures of interest. The efficiency of our segmentation algorithm is evaluated by comparing breast cancer grading and benign vs. cancer discrimination accuracies with corresponding accuracies obtained via manual detection and segmentation of glands and nuclei

    Parametric Dense Stereovision Implementation on a System-on Chip (SoC)

    Get PDF
    This paper proposes a novel hardware implementation of a dense recovery of stereovision 3D measurements. Traditionally 3D stereo systems have imposed the maximum number of stereo correspondences, introducing a large restriction on artificial vision algorithms. The proposed system-on-chip (SoC) provides great performance and efficiency, with a scalable architecture available for many different situations, addressing real time processing of stereo image flow. Using double buffering techniques properly combined with pipelined processing, the use of reconfigurable hardware achieves a parametrisable SoC which gives the designer the opportunity to decide its right dimension and features. The proposed architecture does not need any external memory because the processing is done as image flow arrives. Our SoC provides 3D data directly without the storage of whole stereo images. Our goal is to obtain high processing speed while maintaining the accuracy of 3D data using minimum resources. Configurable parameters may be controlled by later/parallel stages of the vision algorithm executed on an embedded processor. Considering hardware FPGA clock of 100 MHz, image flows up to 50 frames per second (fps) of dense stereo maps of more than 30,000 depth points could be obtained considering 2 Mpix images, with a minimum initial latency. The implementation of computer vision algorithms on reconfigurable hardware, explicitly low level processing, opens up the prospect of its use in autonomous systems, and they can act as a coprocessor to reconstruct 3D images with high density information in real time

    Computer vision algorithms on reconfigurable logic arrays

    Full text link

    Domain specific high performance reconfigurable architecture for a communication platform

    Get PDF
    corecore