3,273 research outputs found

    Pattern-Based FPGA Logic Block and Clustering Algorithm

    Get PDF
    In classical FPGA, LUTs and DFFs are pre-packed into BLEs and then BLEs are grouped into logic blocks. We propose a novel logic block architecture with fast combinational paths between LUTs, called pattern-based logic blocks. A new clustering algorithm is developed to release the potential of pattern-based logic blocks. Experimental results show that the novel architecture and the associated clustering algorithm lead to a 14% performance gain and a 8% wirelength reduction with a 3% area overhead compared to conventional architecture in large control-instensive benchmarks

    Binary object recognition system on FPGA with bSOM

    Get PDF
    Tri-state Self Organizing Map (bSOM), which takes binary inputs and maintains tri-state weights, has been used for classification rather than clustering in this paper. The major contribution here is the demonstration of the potential use of the modified bSOM in security surveillance, as a recognition system on FPGA

    A binary self-organizing map and its FPGA implementation

    Get PDF
    A binary Self Organizing Map (SOM) has been designed and implemented on a Field Programmable Gate Array (FPGA) chip. A novel learning algorithm which takes binary inputs and maintains tri-state weights is presented. The binary SOM has the capability of recognizing binary input sequences after training. A novel tri-state rule is used in updating the network weights during the training phase. The rule implementation is highly suited to the FPGA architecture, and allows extremely rapid training. This architecture may be used in real-time for fast pattern clustering and classification of the binary features

    Seven strategies for tolerating highly defective fabrication

    Get PDF
    In this article we present an architecture that supports fine-grained sparing and resource matching. The base logic structure is a set of interconnected PLAs. The PLAs and their interconnections consist of large arrays of interchangeable nanowires, which serve as programmable product and sum terms and as programmable interconnect links. Each nanowire can have several defective programmable junctions. We can test nanowires for functionality and use only the subset that provides appropriate conductivity and electrical characteristics. We then perform a matching between nanowire junction programmability and application logic needs to use almost all the nanowires even though most of them have defective junctions. We employ seven high-level strategies to achieve this level of defect tolerance

    Accelerated hardware video object segmentation: From foreground detection to connected components labelling

    Get PDF
    This is the preprint version of the Article - Copyright @ 2010 ElsevierThis paper demonstrates the use of a single-chip FPGA for the segmentation of moving objects in a video sequence. The system maintains highly accurate background models, and integrates the detection of foreground pixels with the labelling of objects using a connected components algorithm. The background models are based on 24-bit RGB values and 8-bit gray scale intensity values. A multimodal background differencing algorithm is presented, using a single FPGA chip and four blocks of RAM. The real-time connected component labelling algorithm, also designed for FPGA implementation, run-length encodes the output of the background subtraction, and performs connected component analysis on this representation. The run-length encoding, together with other parts of the algorithm, is performed in parallel; sequential operations are minimized as the number of run-lengths are typically less than the number of pixels. The two algorithms are pipelined together for maximum efficiency

    FPGA-based Anomalous trajectory detection using SOFM

    Get PDF
    A system for automatically classifying the trajectory of a moving object in a scene as usual or suspicious is presented. The system uses an unsupervised neural network (Self Organising Feature Map) fully implemented on a reconfigurable hardware architecture (Field Programmable Gate Array) to cluster trajectories acquired over a period, in order to detect novel ones. First order motion information, including first order moving average smoothing, is generated from the 2D image coordinates (trajectories). The classification is dynamic and achieved in real-time. The dynamic classifier is achieved using a SOFM and a probabilistic model. Experimental results show less than 15\% classification error, showing the robustness of our approach over others in literature and the speed-up over the use of conventional microprocessor as compared to the use of an off-the-shelf FPGA prototyping board

    A FPGA-based architecture for real-time cluster finding in the LHCb silicon pixel detector

    Get PDF
    The data acquisition system of the LHCb experiment has been substantially upgraded for the LHC Run 3, with the unprecedented capability of reading out and fully reconstructing all proton–proton collisions in real time, occurring with an average rate of 30 MHz, for a total data flow of approximately 32 Tb/s. The high demand of computing power required by this task has motivated a transition to a hybrid heterogeneous computing architecture, where a farm of graphics cores, GPUs, is used in addition to general–purpose processors, CPUs, to speed up the execution of reconstruction algorithms. In a continuing effort to improve real–time processing capabilities of this new DAQ system, also with a view to further luminosity increases in the future, low–level, highly–parallelizable tasks are increasingly being addressed at the earliest stages of the data acquisition chain, using special–purpose computing accelerators. A promising solution is offered by custom–programmable FPGA devices, that are well suited to perform high–volume computations with high throughput and degree of parallelism, limited power consumption and latency. In this context, a two–dimensional FPGA–friendly cluster–finder algorithm has been developed to reconstruct hit positions in the new vertex pixel detector (VELO) of the LHCb Upgrade experiment. The associated firmware architecture, implemented in VHDL language, has been integrated within the VELO readout, without the need for extra cards, as a further enhancement of the DAQ system. This pre–processing allows the first level of the software trigger to accept a 11% higher rate of events, as the ready– made hit coordinates accelerate the track reconstruction, while leading to a drop in electrical power consumption, as the FPGA implementation requires O(50x) less power than the GPU one. The tracking performance of this novel system, being indistinguishable from a full–fledged software implementation, allows the raw pixel data to be dropped immediately at the readout level, yielding the additional benefit of a 14% reduction in data flow. The clustering architecture has been commissioned during the start of LHCb Run 3 and it currently runs in real time during physics data taking, reconstructing VELO hit coordinates on–the–fly at the LHC collision rate
    corecore