1,730 research outputs found

    Time efficient segmented technique for dynamic programming based algorithms with FPGA implementation

    © 2019 World Scientific Publishing Company. Although dynamic programming (DP) is an optimization approach used to solve complex problems quickly, its running time still grows polynomially with the size of the input. In this contribution, we improve the computation time of dynamic programming based algorithms by proposing a novel technique called SDP: Segmented Dynamic Programming. SDP finds the best way of splitting the compared sequences into segments and then applies the dynamic programming algorithm to each segment individually, which reduces the computation time dramatically. SDP may be applied to any dynamic programming based algorithm to improve its computation time. As case studies, we apply the SDP technique to two different dynamic programming based algorithms: Needleman-Wunsch (NW), the widely used algorithm for optimal sequence alignment, and the LCS algorithm, which finds the Longest Common Subsequence between two input strings. The results show that applying the SDP technique in conjunction with the DP based algorithms improves the computation time by up to 80% in comparison to the DP algorithms alone, with small or negligible degradation in the comparison results. This degradation is controllable and depends on the number of split segments, which is an input parameter. We also compare our results with the well-known heuristic FASTA sequence alignment algorithm GGSEARCH, and show that our results are much closer to the optimal results than those of GGSEARCH. The results hold independently of the sequence lengths and their level of similarity. To show the functionality of our technique in hardware and to verify the results, we implement it on the Xilinx Zynq-7000 FPGA.
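
The segmentation idea in this abstract can be illustrated with a minimal Python sketch for the LCS case. It rests on assumptions not taken from the paper: both strings are simply cut into `k` nearly equal pieces (the paper's SDP searches for the best split instead), and the names `lcs_length`, `split_even` and `segmented_lcs_length` are hypothetical. The per-segment sum can only underestimate the true LCS length, which mirrors the small degradation the abstract mentions, while the total table size drops from about n*m cells to about n*m/k.

```python
def lcs_length(a, b):
    """Classic dynamic-programming LCS length, O(len(a) * len(b)) time."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, other in enumerate(b, start=1):
            if ch == other:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def split_even(s, k):
    """Cut s into k nearly equal pieces (assumed stand-in for SDP's split search)."""
    bounds = [round(i * len(s) / k) for i in range(k + 1)]
    return [s[bounds[i]:bounds[i + 1]] for i in range(k)]


def segmented_lcs_length(a, b, k):
    """Sum of per-segment LCS lengths: a lower bound on the full LCS length."""
    pairs = zip(split_even(a, k), split_even(b, k))
    return sum(lcs_length(x, y) for x, y in pairs)


if __name__ == "__main__":
    a, b = "AGGTAB", "GXTXAYB"
    print(lcs_length(a, b), segmented_lcs_length(a, b, 2))
```

With `k = 1` the sketch reduces to the plain DP algorithm; larger `k` trades accuracy for speed.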

    Mobile Hardware Based Implementation of a Novel, Efficient, Fuzzy Logic Inspired Edge Detection Technique for Analysis of Malaria Infected Microscopic Thin Blood Images

    This paper proposes a novel, efficient, low complexity algorithm for edge detection, with a cheap, easily accessible, networkable hardware implementation, specifically focused on the analysis of malaria infected thin blood smears. The algorithm presents a new dynamic thresholding technique that eliminates inter-cell interference based on histogram analysis. Following this, binary image morphological processing is performed, which is shown to outperform the same operation on the much more complex greyscale images. Edge tracking is done via a simplified fuzzy logic inspired rule system. The entire system is implemented on multiple platforms to test widespread compatibility, but it was primarily developed for a battery powered standalone Raspberry Pi with a low power, low resolution touchscreen and hardware buttons. The algorithm was compared against the much more complex but still very well performing Canny algorithm, which, despite its age, is still one of the most comprehensive edge detection techniques available; modern variants were considered and reviewed, but ultimately, given the level of outperformance, they were not viable options.

    Domain-specific and reconfigurable instruction cells based architectures for low-power SoC


    FPGA implementations for parallel multidimensional filtering algorithms

    PhD Thesis. One- and multi-dimensional raw data collections introduce noise and artifacts, and the data must be recovered from these degradations by an automated filtering system before further machine analysis. The need for automating wide-ranging filtering applications necessitates the design of generic filtering architectures, together with the development of multidimensional and extensive convolution operators. Consequently, the aim of this thesis is to investigate the problem of automated construction of a generic parallel filtering system. To serve this goal, performance-efficient FPGA implementation architectures are developed to realize parallel one/multi-dimensional filtering algorithms. The proposed generic architectures provide a mechanism for fast FPGA prototyping of high performance computations to obtain efficiently implemented performance indices of area, speed, dynamic power, throughput and computation rates, as a complete package. These parallel filtering algorithms and their automated generic architectures tackle the major bottlenecks and limitations of existing multiprocessor systems in wordlength, input data segmentation, boundary conditions and inter-processor communications, in order to support high data throughput real-time applications of low-power architectures using a Xilinx Virtex-6 FPGA board. For the one-dimensional raw signal filtering case, the mathematical model and architectural development of the generalized parallel 1-D filtering algorithms are presented using the 1-D block filtering method. Five generic architectures are implemented on a Virtex-6 ML605 board, evaluated and compared. A complete set of results on area, speed, power, throughput and computation rates is obtained and discussed as performance indices for the 1-D convolution architectures. A successful application of parallel 1-D cross-correlation is demonstrated. 
For the two-dimensional greyscale/colour image processing cases, new parallel 2-D/3-D filtering algorithms are presented and mathematically modelled using input decimation and output image reconstruction by interpolation. Ten generic architectures are implemented on the Virtex-6 ML605 board, evaluated and compared. Key results on area, speed, power, throughput and computation rate are obtained and discussed as performance indices for the 2-D convolution architectures. 2-D image reconfigurable processors are developed and implemented using single, dual and quad MAC FIR units. 3-D colour image processors are devised to act as 3-D colour filtering engines. A 2-D cross-correlator parallel engine is successfully developed as a parallel 2-D matched filtering algorithm for locating any MRI slice within an MRI data stack library. Twelve 3-D MRI filtering operators are plugged in and adapted to be suitable for biomedical imaging, including 3-D edge operators and 3-D noise smoothing operators. Since three-dimensional greyscale/colour volumetric image applications are computationally intensive, a new parallel 3-D/4-D filtering algorithm is presented and mathematically modelled using volumetric data image segmentation by decimation and output reconstruction by interpolation, after simultaneously and independently performing 3-D filtering. Eight generic architectures are developed and implemented on the Virtex-6 board, including 3-D spatial and FFT convolution architectures. Fourteen 3-D MRI filtering operators are plugged in and adapted for this particular biomedical imaging application, including 3-D edge operators and 3-D noise smoothing operators. Three successful applications are presented: 4-D colour MRI (fMRI) filtering processors, a k-space MRI volume data filter and a 3-D cross-correlator. IRAQI Government
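
As a rough illustration of block-based 1-D filtering, the Python sketch below uses the overlap-add scheme, which is one common form of block filtering. This is an assumption: the abstract does not say which block method the thesis uses, and the function names `convolve` and `overlap_add` are hypothetical. The point it shows is that each input block can be convolved independently, which is the property that lets separate processing elements work in parallel.

```python
def convolve(x, h):
    """Direct full linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def overlap_add(x, h, block_len):
    """Filter x with FIR taps h by convolving fixed-size blocks independently."""
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(x), block_len):
        block = x[start:start + block_len]
        # Each block is independent, so it could go to its own processing element.
        for k, v in enumerate(convolve(block, h)):
            # The tail of each block's result overlaps the next block and is summed.
            y[start + k] += v
    return y


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 5.0]
    taps = [1.0, 1.0, 1.0]
    print(overlap_add(signal, taps, block_len=2))
```

The blockwise result matches the direct convolution of the whole signal; only the scheduling of the work differs.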

    Coarse-grained reconfigurable array architectures

    Coarse-Grained Reconfigurable Array (CGRA) architectures accelerate the same inner loops that benefit from the high ILP support in VLIW architectures. By executing non-loop code on other cores, however, CGRAs can focus on such loops to execute them more efficiently. This chapter discusses the basic principles of CGRAs, and the wide range of design options available to a CGRA designer, covering a large number of existing CGRA designs. The impact of different options on flexibility, performance, and power-efficiency is discussed, as well as the need for compiler support. The ADRES CGRA design template is studied in more detail as a use case to illustrate the need for design space exploration, for compiler support and for the manual fine-tuning of source code

    A Micro Power Hardware Fabric for Embedded Computing

    Field Programmable Gate Arrays (FPGAs) mitigate many of the problems encountered with the development of ASICs by offering flexibility, faster time-to-market, and amortized NRE costs, among other benefits. While FPGAs are increasingly being used for complex computational applications such as signal and image processing, networking, and cryptology, they are far from ideal for these tasks due to relatively high power consumption and silicon usage overheads compared to direct ASIC implementation. A reconfigurable device that exhibits ASIC-like power characteristics and FPGA-like costs and tool support is desirable to fill this void. In this research, a parameterized, reconfigurable fabric model named the domain specific fabric (DSF) is developed that exhibits ASIC-like power characteristics for Digital Signal Processing (DSP) style applications. Using this model, the impact of varying different design parameters on power and performance has been studied. Different optimization techniques, such as local search and simulated annealing, are used to determine the appropriate interconnect for a specific set of applications. A design space exploration tool has been developed to automate the generation of a tailored architectural instance of the fabric. The fabric has been synthesized on a 160 nm cell-based ASIC fabrication process from OKI and a 130 nm process from IBM. A detailed power-performance analysis has been completed using signal and image processing benchmarks from the MediaBench benchmark suite and elsewhere, with comparisons to other hardware and software implementations. The optimized fabric implemented using the 130 nm process yields energy within 3X of a direct ASIC implementation, 330X better than a Virtex-II Pro FPGA and 2016X better than an Intel XScale processor.

    Novel control approaches for the next generation computer numerical control (CNC) system for hybrid micro-machines

    It is well-recognised that micro-machining is a key enabling technology for manufacturing high value-added 3D micro-products, such as optics, moulds/dies and biomedical implants. These products are usually made of a wide range of engineering materials and possess complex freeform surfaces with tight tolerance on form accuracy and surface finish. In recent years, hybrid micro-machining technology has been developed to integrate several machining processes on one platform to tackle the manufacturing challenges for the aforementioned micro-products. However, the complexity of system integration and the ever increasing demand for further enhanced productivity impose great challenges on current CNC systems. This thesis develops, implements and evaluates three novel control approaches to overcome the three major challenges identified, i.e. system integration, parametric interpolation and toolpath smoothing. These new control approaches provide a solid foundation for the development of the next generation CNC system for hybrid micro-machines. There is a growing trend for hybrid micro-machines to integrate more functional modules. Machine developers tend to choose modules from different vendors to satisfy the performance and cost requirements. However, those modules often possess proprietary hardware and software interfaces, and the lack of plug-and-play solutions leads to tremendous difficulty in system integration. This thesis proposes a novel three-layer control architecture with a component-based approach for system integration. The interaction of hardware is encapsulated into software components, while the data flow among different components is standardised. This approach can therefore significantly enhance the system flexibility. It has been successfully verified through the integration of a six-axis hybrid micro-machine. 
Parametric curves have been proven to be the optimal toolpath representation method for machining 3D micro-products with freeform surfaces, as they can eliminate the high-frequency fluctuation of feedrate and acceleration caused by the discontinuity in the first derivatives along linear or circular segmented toolpaths. The interpolation for parametric curves is essentially an optimization problem, for which the time-optimal solution is extremely difficult to obtain. This thesis develops a novel real-time interpolator for parametric curves (RTIPC), which provides a near time-optimal solution. It limits the machine dynamics (axial velocities, axial accelerations and jerk) and contour error through feedrate lookahead and acceleration lookahead operations. Experiments show that the RTIPC can simplify the coding significantly, and achieve up to ten times the productivity of the industry standard linear interpolator. Furthermore, it is as efficient as the state-of-the-art Position-Velocity-Time (PVT) interpolator, while achieving much smoother motion profiles. Despite the fact that parametric curves have a huge advantage in toolpath continuity, the linear segmented toolpath is still dominantly used on the factory floor due to its straightforward coding and excellent compatibility with various CNC systems. This thesis presents a new real-time global toolpath smoothing algorithm, which bridges the gap in toolpath representation for CNC systems. This approach uses a cubic B-spline to approximate a sequence of linear segments. The approximation deviation is controlled by inserting and moving new control points on the control polygon. 
Experiments show that the proposed approach can increase productivity by more than three times compared with the standard toolpath traversing algorithm, and by 40% compared with the state-of-the-art corner blending algorithm, while achieving excellent surface finish. Finally, some further improvements for CNC systems, such as adaptive cutting force control and on-line machining parameter adjustment with metrology, are discussed in the future work section.
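
The B-spline approximation of a linear segmented toolpath mentioned in this abstract can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it uses the toolpath vertices directly as the control polygon of a uniform cubic B-spline and repeats the end points so the curve starts and ends on them. It does not implement the thesis's deviation control (inserting and moving control points), and the names `bspline_point` and `smooth_path` are hypothetical.

```python
def bspline_point(p0, p1, p2, p3, t):
    """Evaluate one uniform cubic B-spline span at t in [0, 1]."""
    b0 = (1 - t) ** 3 / 6
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    b3 = t**3 / 6
    return tuple(b0 * a + b1 * b + b2 * c + b3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))


def smooth_path(points, samples_per_span=4):
    """Sample a cubic B-spline whose control polygon is the linear toolpath."""
    # Assumption: repeat each end point so the curve begins and ends on it.
    ctrl = [points[0]] * 2 + list(points) + [points[-1]] * 2
    out = []
    for i in range(len(ctrl) - 3):
        for s in range(samples_per_span):
            out.append(bspline_point(*ctrl[i:i + 4], s / samples_per_span))
    out.append(points[-1])
    return out


if __name__ == "__main__":
    # An L-shaped toolpath with a sharp corner at (1, 0).
    for x, y in smooth_path([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]):
        print(f"{x:.3f} {y:.3f}")
```

The sampled curve rounds off the corner instead of passing through it, which is why a real smoothing algorithm needs a deviation bound such as the one described in the abstract.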

    Circuit design and analysis for on-FPGA communication systems

    On-chip communication has emerged as a prominently important subject in Very-Large-Scale-Integration (VLSI) design, as the trend of technology scaling favours logic more than interconnects. Interconnects often dictate the system performance, and, therefore, research into new methodologies and system architectures that deliver high-performance communication services across the chip is mandatory. The interconnect challenge is exacerbated in the Field-Programmable Gate Array (FPGA), a type of ASIC where the hardware can be programmed post-fabrication. Communication across an FPGA will deteriorate as a result of interconnect scaling. The programmable fabrics, switches and the specific routing architecture also introduce additional latency and bandwidth degradation, further hindering intra-chip communication performance. Past research efforts mainly focused on optimizing logic elements and functional units in FPGAs. Communication with programmable interconnect received little attention and is inadequately understood. This thesis is among the first to research on-chip communication systems that are built on top of programmable fabrics, and it proposes methodologies to maximize the interconnect throughput performance. There are three major contributions in this thesis: (i) an analysis of on-chip interconnect fringing, which degrades the bandwidth of communication channels due to routing congestion in reconfigurable architectures; (ii) a new analogue wave signalling scheme that significantly improves the interconnect throughput by exploiting the fundamental electrical characteristics of the reconfigurable interconnect structures, and that can potentially mitigate the interconnect scaling challenges; and (iii) a novel Dynamic Programming (DP)-network to provide adaptive routing in network-on-chip (NoC) systems. 
The DP-network architecture performs runtime optimization for route planning and dynamic routing, which effectively utilizes the in-silicon bandwidth. This thesis explores a new horizon in reconfigurable system design, in which new methodologies and concepts are proposed to enhance the on-FPGA communication throughput performance that is of vital importance in new technology processes.
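
A dynamic-programming view of routing can be sketched in Python as a Bellman-Ford-style relaxation: every node repeatedly sets its cost to the destination to the minimum over its neighbours of link cost plus the neighbour's cost. This is a generic illustration only; the abstract does not describe the thesis's DP-network at this level, and the function name `dp_route_costs`, the `links` dictionary layout and the example link costs are all assumptions. Link costs are assumed non-negative, and every neighbour must also appear as a key of `links`.

```python
def dp_route_costs(links, dest):
    """Return (cost to dest, next hop) per node for a graph given as
    {node: {neighbour: link_cost}} with non-negative link costs."""
    cost = {node: float("inf") for node in links}
    cost[dest] = 0
    next_hop = {}
    changed = True
    while changed:  # repeat until no node can improve its cost
        changed = False
        for node, neighbours in links.items():
            if node == dest:
                continue
            for nbr, link_cost in neighbours.items():
                candidate = link_cost + cost[nbr]
                if candidate < cost[node]:
                    cost[node] = candidate
                    next_hop[node] = nbr
                    changed = True
    return cost, next_hop


if __name__ == "__main__":
    # Hypothetical three-router network; a congested A-C link has cost 5.
    links = {
        "A": {"B": 1, "C": 5},
        "B": {"A": 1, "C": 1},
        "C": {"A": 5, "B": 1},
    }
    print(dp_route_costs(links, "C"))
```

Rerunning the relaxation after link costs change (for example, to reflect congestion) yields updated next hops, which is the sense in which such a scheme is adaptive.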