The Design and Implementation of a PCIe-based LESS Label Switch
With the explosion of the Internet of Things, the number of smart, embedded devices has grown exponentially in the last decade, with growth projected to continue at a commensurate rate. These devices strain the existing infrastructure of the Internet, creating challenges with scalability of routing tables and reliability of packet delivery. Various schemes based on Location-Based Forwarding and ID-based routing have been proposed to solve these problems, but thus far no complete solution has been achieved. This thesis seeks to improve currently proposed LORIF routers by designing, implementing, and testing a PCIe-based LESS switch to process unrouteable packets under the current LESS forwarding engine.
Hardware accelerator for ALICE ITS Cluster Finder
An integral part of the upgrade to the Inner Tracking System (ITS) of the ALICE detector is to support the increased readout rates of charged particles resulting from the increased interaction rate of 50 kHz in Pb-Pb collisions at the Large Hadron Collider (LHC). A major task of the ITS readout system is to compress the data and store it in the mass storage system for later analysis. The first step of data compression involves cluster finding on the pixel data received from ALPIDE sensors, followed by Huffman compression. In this thesis, we evaluate the resource requirements for implementing cluster finding on the Arria 10 FPGAs, which are an integral part of the ITS readout system, in an attempt to reduce the computing nodes needed on the First Level Processors (FLPs) and to speed up the processing. We present a hardware implementation of a single-pass Connected Component Labeling algorithm. We propose a special linked-list-based merger table that ensures a constant worst-case latency for chained label mergers, independent of their length. For retrieving the shapeIDs, pixels are segregated into clusters on-the-fly without the need to store labeled pixels in memory. Verilog code implementing this design has been written, a testbench for functional verification has been developed, and the design has been synthesized. Electrical and Computer Engineerin
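As a rough software illustration of the cluster-finding step described above, the following is a minimal sketch of connected component labeling. It is not the thesis's single-pass hardware design: it substitutes the classic two-pass approach with a union-find array in place of the linked-list merger table, and it assumes 8-connectivity. The function name `label_components` and the grid-of-0/1 input format are hypothetical choices made only for this example.

```python
def label_components(grid):
    """Label 8-connected foreground pixels; returns a grid of labels (0 = background).

    Illustrative two-pass sketch with union-find, not the single-pass
    hardware algorithm from the abstract.
    """
    rows = len(grid)
    cols = len(grid[0]) if grid else 0
    parent = [0]  # parent[i] is the union-find parent of label i; 0 is background

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    labels = [[0] * cols for _ in range(rows)]
    # First pass: assign provisional labels and record label mergers.
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            # Already-visited neighbours: west, north-west, north, north-east.
            neighbours = []
            for dr, dc in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc]:
                    neighbours.append(labels[rr][cc])
            if not neighbours:
                parent.append(len(parent))  # new label is its own root
                labels[r][c] = len(parent) - 1
            else:
                labels[r][c] = min(neighbours)
                for n in neighbours:
                    union(labels[r][c], n)
    # Second pass: resolve each label to its root and renumber compactly.
    remap = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = remap.setdefault(root, len(remap) + 1)
    return labels
```

In a U-shaped cluster the two arms first receive different provisional labels and are merged when the scan reaches the bottom row. Label mergers of this kind are the case the merger table in the abstract is designed to handle.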
Accelerated hardware video object segmentation: From foreground detection to connected components labelling
This is the preprint version of the article. Copyright © 2010 Elsevier. This paper demonstrates the use of a single-chip FPGA for the segmentation of moving objects in a video sequence. The system maintains highly accurate background models, and integrates the detection of foreground pixels with the labelling of objects using a connected components algorithm. The background models are based on 24-bit RGB values and 8-bit gray scale intensity values. A multimodal background differencing algorithm is presented, using a single FPGA chip and four blocks of RAM. The real-time connected component labelling algorithm, also designed for FPGA implementation, run-length encodes the output of the background subtraction, and performs connected component analysis on this representation. The run-length encoding, together with other parts of the algorithm, is performed in parallel; sequential operations are minimized, as the number of run-lengths is typically smaller than the number of pixels. The two algorithms are pipelined together for maximum efficiency.
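To illustrate why run-length encoding reduces the sequential work, here is a minimal sketch that encodes one row of a binary foreground mask into runs and tests whether runs on adjacent rows touch. The function names, the half-open `(start, end)` run format, and the choice of 4-connectivity are assumptions made for this example and are not taken from the paper.

```python
def run_length_encode(row):
    """Return (start, end) pairs, end exclusive, for each run of foreground pixels."""
    runs = []
    start = None
    for i, value in enumerate(row):
        if value and start is None:
            start = i  # a new run begins here
        elif not value and start is not None:
            runs.append((start, i))  # the run ended at the previous pixel
            start = None
    if start is not None:
        runs.append((start, len(row)))  # run extends to the end of the row
    return runs


def runs_touch(a, b):
    """True if runs a and b on adjacent rows share a column (4-connectivity)."""
    return a[0] < b[1] and b[0] < a[1]
```

Connected component analysis can then operate on a handful of runs per row rather than on every pixel, merging labels whenever `runs_touch` holds for runs on consecutive rows.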
Image Processing Using FPGAs
This book presents a selection of papers representing current research on using field programmable gate arrays (FPGAs) for realising image processing algorithms. These papers are reprints of papers selected for a Special Issue of the Journal of Imaging on image processing using FPGAs. A diverse range of topics is covered, including parallel soft processors, memory management, image filters, segmentation, clustering, image analysis, and image compression. Applications include traffic sign recognition for autonomous driving, cell detection for histopathology, and video compression. Collectively, they represent the current state of the art in image processing using FPGAs.
Video Sensor Architecture for Surveillance Applications
This paper introduces a flexible hardware and software architecture for a smart video sensor. The sensor has been applied in a video surveillance application in which several of these video sensors are deployed as the sensory nodes of a distributed surveillance system. In this system, a video sensor node processes images locally in order to extract objects of interest and classify them. The sensor node reports the processing results to other nodes in the cloud (a user or higher-level software) in the form of an XML description. The hardware architecture of each sensor node has been developed using two DSP processors and an FPGA that controls, in a flexible way, the interconnection among processors and the image data flow. The developed node software is based on pluggable components and runs on a provided execution run-time. Some basic and application-specific software components have been developed, in particular: acquisition, segmentation, labeling, tracking, classification and feature extraction. Preliminary results demonstrate that the system can achieve up to 7.5 frames per second in the worst case, and the true positive rates in the classification of objects are better than 80%. © 2012 by the authors; licensee MDPI, Basel, Switzerland. This work has been partially supported by the SENSE project (Specific Targeted Research Project within the thematic priority IST 2.5.3 of the 6th Framework Program of the European Commission: IST Project 033279), and has also been co-funded by the Spanish research projects SIDIRELI: DPI2008-06737-C02-01/02 and COBAMI: DPI2011-28507-C02-02, both partially supported with European FEDER funds. Sánchez Peñarroja, J.; Benet Gilabert, G.; Simó Ten, JE. (2012). Video Sensor Architecture for Surveillance Applications. Sensors. 12(2):1509-1528. https://doi.org/10.3390/s120201509
Accelerating LSTM-based High-Rate Dynamic System Models
In this paper, we evaluate the use of a trained Long Short-Term Memory (LSTM)
network as a surrogate for an Euler-Bernoulli beam model, and then we describe
and characterize an FPGA-based deployment of the model for use in real-time
structural health monitoring applications. The focus of our efforts is the
DROPBEAR (Dynamic Reproduction of Projectiles in Ballistic Environments for
Advanced Research) dataset, which was generated as a benchmark for the study of
real-time structural modeling applications. The purpose of DROPBEAR is to
evaluate models that take vibration data as input and give the initial
conditions of the cantilever beam on which the measurements were taken as
output. DROPBEAR is meant to serve as an exemplar for emerging high-rate "active
structures" that can be actively controlled with feedback latencies of less
than one microsecond. Although the Euler-Bernoulli beam model is a well-known
solution to this modeling problem, its computational cost is prohibitive for
the time scales of interest. It has been previously shown that a properly
structured LSTM network can achieve comparable accuracy with less workload, but
achieving sub-microsecond model latency remains a challenge. Our approach is to
deploy the LSTM optimized specifically for latency on an FPGA. We designed the
model using both high-level synthesis (HLS) and hardware description language
(HDL). The lowest latency of 1.42 μs and the highest throughput of 7.87
Gops/s were achieved on the Alveo U55C platform for the HDL design.
Comment: Accepted at 33rd International Conference on Field-Programmable Logic
and Applications (FPL
Fast, Accurate Thin-Structure Obstacle Detection for Autonomous Mobile Robots
Safety is paramount for mobile robotic platforms such as self-driving cars
and unmanned aerial vehicles. This work is devoted to a task that is
indispensable for safety yet was largely overlooked in the past -- detecting
obstacles with very thin structures, such as wires, cables and tree
branches. This is a challenging problem, as thin objects can be problematic for
active sensors such as lidar and sonar and even for stereo cameras. In this
work, we propose to use video sequences for thin obstacle detection. We
represent obstacles with edges in the video frames, and reconstruct them in 3D
using efficient edge-based visual odometry techniques. We provide both a
monocular camera solution and a stereo camera solution. The former incorporates
Inertial Measurement Unit (IMU) data to solve scale ambiguity, while the latter
enjoys a novel, purely vision-based solution. Experiments demonstrated that the
proposed methods are fast and able to detect thin obstacles robustly and
accurately under various conditions.
Comment: Appeared at IEEE CVPR 2017 Workshop on Embedded Visio
Brain-inspired self-organization with cellular neuromorphic computing for multimodal unsupervised learning
Cortical plasticity is one of the main features that enable our ability to
learn and adapt in our environment. Indeed, the cerebral cortex self-organizes
through structural and synaptic plasticity mechanisms that are very
likely at the basis of an extremely interesting characteristic of human
brain development: the multimodal association. In spite of the diversity of the
sensory modalities, like sight, sound and touch, the brain arrives at the same
concepts (convergence). Moreover, biological observations show that one
modality can activate the internal representation of another modality when both
are correlated (divergence). In this work, we propose the Reentrant
Self-Organizing Map (ReSOM), a brain-inspired neural system based on the
reentry theory using Self-Organizing Maps and Hebbian-like learning. We propose
and compare different computational methods for unsupervised learning and
inference, then quantify the gain of the ReSOM in a multimodal classification
task. The divergence mechanism is used to label one modality based on the
other, while the convergence mechanism is used to improve the overall accuracy
of the system. We perform our experiments on a constructed written/spoken
digits database and a DVS/EMG hand gestures database. The proposed model is
implemented on a cellular neuromorphic architecture that enables distributed
computing with local connectivity. We show the gain of the so-called hardware
plasticity induced by the ReSOM, where the system's topology is not fixed by
the user but learned along the system's experience through self-organization.
Comment: Preprin
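For readers unfamiliar with Self-Organizing Maps, the building block named in the abstract above, the following is a minimal sketch of one online update of a generic one-dimensional Kohonen SOM. It does not implement the ReSOM, its reentry mechanism, or its Hebbian-like multimodal links. The function names, the learning rate `lr`, and the neighbourhood width `sigma` are hypothetical values chosen only for illustration.

```python
import math


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )


def som_step(weights, x, lr=0.5, sigma=1.0):
    """Apply one online SOM update for input x; mutates and returns weights."""
    bmu = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        # Gaussian neighbourhood: neurons near the winner on the map move more.
        h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
        weights[i] = [wj + lr * h * (xj - wj) for wj, xj in zip(w, x)]
    return weights
```

Each step pulls the winning neuron and its map neighbours toward the input, which is how the map comes to reflect the structure of the data without labels.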