47 research outputs found
Floating-Point Matrix Product on FPGA
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder. Copyright IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
A general framework for efficient FPGA implementation of matrix product
Original article can be found at: http://www.medjcn.com/ Copyright Softmotor Limited. Developers require high-performance systems for fast processing of computationally intensive applications. Reconfigurable hardware devices in the form of Field-Programmable Gate Arrays (FPGAs) have been proposed as viable building blocks for constructing high-performance systems at an economical price. Given the importance and widespread use of matrix algorithms in scientific computing, they are ideal candidates for exploiting the advantages offered by FPGAs. In this paper, a system for generating matrix algorithm cores is described. The system provides a catalog of efficient, user-customizable cores designed for FPGA implementation and spanning three matrix algorithm categories: (i) matrix operations, (ii) matrix transforms and (iii) matrix decomposition. The generated core can be either a general-purpose or an application-specific core. The methodology used in the design and implementation of two specific image processing application cores is presented. The first core is a fully pipelined matrix multiplier for colour space conversion based on distributed arithmetic principles, while the second is a parallel floating-point matrix multiplier designed for 3D affine transformations. Peer reviewed.
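To illustrate why 3D affine transformations reduce to matrix products, the following is a minimal software sketch in plain Python, not the FPGA core described in the abstract. It assumes the common homogeneous-coordinate convention (4x4 matrices acting on column vectors); the helper names `mat_mul`, `mat_vec`, `translation` and `scaling` are hypothetical and chosen only for this example.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (flat list)."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def translation(tx, ty, tz):
    """4x4 homogeneous matrix that shifts a point by (tx, ty, tz)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def scaling(sx, sy, sz):
    """4x4 homogeneous matrix that scales each axis independently."""
    return [[sx, 0.0, 0.0, 0.0],
            [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


# Compose once: scale by 2 first, then translate by (1, 2, 3).
transform = mat_mul(translation(1.0, 2.0, 3.0), scaling(2.0, 2.0, 2.0))

# Apply the composed transform to the point (1, 1, 1) in homogeneous form.
point = [1.0, 1.0, 1.0, 1.0]
result = mat_vec(transform, point)
print(result)
```

Because the composed transform is a single 4x4 matrix, every point costs the same fixed set of multiply-accumulate operations, which is the kind of regular workload that suits a parallel hardware multiplier.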
Using thermochromism to simulate blood oxygenation in extracorporeal membrane oxygenation
Introduction: Extracorporeal membrane oxygenation (ECMO) training programs employ real ECMO components, causing them to be extremely expensive while offering little realism in terms of blood oxygenation and pressure. To overcome those limitations, we are developing a standalone modular ECMO simulator that reproduces ECMO’s visual, audio and haptic cues using affordable mechanisms. We present a central component of this simulator, capable of visually reproducing blood oxygenation color change using thermochromism. Methods: Our simulated ECMO circuit consists of two physically distant modules, responsible for adding and withdrawing heat from a thermochromic fluid. This manipulation of heat creates a temperature difference between the fluid in the drainage line and the fluid in the return line of the circuit and, hence, a color difference. Results: Thermochromic ink mixed with concentrated dyes was used to create a recipe for a realistic and affordable blood-colored fluid. The implemented “ECMO circuit” reproduced blood’s oxygenation and deoxygenation color difference or lack thereof. The heat control circuit costs 300 USD to build and the thermochromic fluid costs 40 USD/L. During a ten-hour in situ demonstration, nineteen ECMO specialists rated the fidelity of the oxygenated and deoxygenated “blood” and the color contrast between them as highly realistic. Conclusions: Using low-cost yet high-fidelity simulation mechanisms, we implemented the central subsystem of our modular ECMO simulator, which creates the look and feel of an ECMO circuit without using an actual one. Peer reviewed. Final accepted version.
Real-time ECG Monitoring using Compressive sensing on a Heterogeneous Multicore Edge-Device
The file attached to this record is the author's final peer-reviewed version. The publisher's final version can be found by following the DOI link. In typical ambulatory health monitoring systems, wearable medical sensors
are deployed on the human body to continuously collect and transmit physiological
signals to a nearby gateway that forwards the measured data to the
cloud-based healthcare platform. However, this model often fails to respect the
strict requirements of healthcare systems. Wearable medical sensors are very
limited in terms of battery lifetime; in addition, the system's reliance on the cloud
makes it vulnerable to connectivity and latency issues. Compressive sensing
(CS) theory has been widely deployed in electrocardiogram (ECG) monitoring
applications to optimize the power consumption of wearable sensors. The solution
proposed in this paper aims to tackle these limitations by enabling a gateway-centric
connected health solution, where the most power-consuming tasks are
performed locally on a multicore processor. This paper explores the efficiency
of real-time CS-based recovery of ECG signals on an IoT-gateway embedded
with ARM’s big.LITTLE™ multicore for different signal dimensions and allocated
computational resources. Experimental results show that the gateway is able
to reconstruct ECG signals in real time. Moreover, they demonstrate that using
a higher number of cores reduces the execution time and further optimizes
energy consumption. The paper identifies the resource allocation
configurations that provide the best performance. The paper concludes that
multicore processors have the computational capacity and energy efficiency to
promote gateway-centric solutions rather than cloud-centric platforms.
Robust event-based non-intrusive appliance recognition using multi-scale wavelet packet tree and ensemble bagging tree
Open access article. Providing the user with appliance-level consumption data is the core of any energy efficiency system. To that
end, non-intrusive load monitoring is employed for extracting appliance specific consumption data at a low cost
without the need of installing separate submeters for each electrical device. In this context, we propose in this
paper a novel non-intrusive appliance recognition system based on (i) detecting events in the aggregated power
signal using a novel and powerful scheme, (ii) applying multiscale wavelet packet tree to collect comprehensive
energy consumption features, and (iii) adopting an ensemble bagging tree classifier along with comparing its
performance with various machine learning schemes. Moreover, to validate the proposed model, an empirical
investigation is conducted on two real and public energy consumption datasets, namely, the GREEND and REDD,
in which consumption readings are collected at low frequencies. In addition, a comprehensive review of recent
non-intrusive load monitoring approaches has been conducted and presented, in which their characteristics,
performances and limitations are described. The proposed non-intrusive load monitoring system achieves high
appliance recognition performance in terms of accuracy and F1 score, with low time complexity, when
applied to different households from the GREEND and REDD repositories, in which every house includes various
domestic appliances. The results show, for example, that average accuracies of 97.01% and 96.36% are
reached on the GREEND and REDD datasets, respectively, outperforming almost all existing solutions
considered in this framework.
Intelligent co-operative processor-in-memory
Original article can be found at: http://www.medjec.com/ Copyright Softmotor Limited. Advances in VLSI technology are enabling processor-memory integration to bridge the processor-memory performance gap. This integration is also a key driver in the innovation of a new concept called Processor-In-Memory (PIM). The work described in this paper capitalises on the extensive work carried out on PIMs in general and develops a road map for an intelligent revision of a PIM architecture referred to as Co-operative Intelligent Memory (CIM). The route to a CIM passes through the development of a Cooperative Pseudo Intelligent Memory (CPIM), which serves as a proof of concept and a midpoint in establishing the intelligence needed for a full CIM implementation. Both architectures use a hierarchical two-level CPU structure referred to as major and minor CPUs. By partitioning the workload between major and minor CPUs in an intelligent manner, without any pre-processor compilation or kernel task scheduling, the PIM system can be made more efficient and co-operative for a class of tasks that rely heavily on memory-to-memory iterative processes. The proposed architectures exploit the key feature of the iterative process by using vectors that characterize the iteration. The process of intelligently identifying these vectors is described in this paper. In addition, the performance of the proposed architectures has been evaluated. Peer reviewed.
Dynamic Co-operative Intelligent Memory
xDSL Network Upgrade Employing FPGAs
DOI: 10.1109/DELTA.2008.84. This paper proposes an upgrade scenario for xDSL networks to provide broadband access for extended link lengths while demonstrating network grooming by means of more than one subscriber using a single network connection concurrently. This is achieved by applying Direct Spread Code Division Multiple Access (DS-CDMA) in a Fiber-to-the-Cabinet (FTTC) topology by means of a Field Programmable Gate Array (FPGA), used to demonstrate simultaneous user transmission and System-on-Chip (SoC) network element generation. Experimental results have displayed efficiency in ADSL link rates of 66% and 41% for back-to-back and 12 km reach fiber links, respectively.
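The idea of several subscribers sharing one connection concurrently through DS-CDMA can be sketched in a few lines of Python. This is an idealised, noise-free illustration and not the FPGA design from the paper: the length-4 orthogonal codes, the two user names and the helper functions `spread` and `despread` are all assumptions made for this example.

```python
# Two mutually orthogonal spreading codes (length-4 Walsh-style codes),
# one per subscriber. The code choice is an assumption for illustration.
CODES = {
    "user_a": [1, 1, -1, -1],
    "user_b": [1, -1, 1, -1],
}


def spread(bits, code):
    """Map bits {0, 1} to symbols {-1, +1} and multiply each by the chip sequence."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in code)
    return chips


def despread(signal, code):
    """Correlate each chip block with the code and decide the bit by the sign."""
    n = len(code)
    bits = []
    for i in range(0, len(signal), n):
        corr = sum(s * c for s, c in zip(signal[i:i + n], code))
        bits.append(1 if corr > 0 else 0)
    return bits


bits_a = [1, 0, 1, 1]
bits_b = [0, 0, 1, 0]

# Both users transmit at the same time; the shared link carries the chip-wise sum.
channel = [a + b for a, b in zip(spread(bits_a, CODES["user_a"]),
                                 spread(bits_b, CODES["user_b"]))]

recovered_a = despread(channel, CODES["user_a"])
recovered_b = despread(channel, CODES["user_b"])
print(recovered_a, recovered_b)
```

Because the two codes have zero cross-correlation, each receiver sees only its own user's contribution after correlation, so both bit streams are recovered from the single summed signal.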
License plate localisation based on morphological operations
Automatic Number Plate Recognition (ANPR) systems allow users to track, identify and monitor moving vehicles by automatically extracting their number plates. This paper presents an improved method to locate car plates in an ANPR system. The proposed method is based on morphological open and close operations, where different Structuring Elements (SE) are used to maximally eliminate non-plate regions and enhance the plate region. The method has been tested using a database of UK number plates, and the results show significant improvements in detection rate compared to other existing plate localisation systems.
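The morphological open and close operations mentioned above can be sketched on a small binary image in plain Python. This is a minimal illustration of the two operators, not the paper's localisation pipeline: the rectangular structuring element sizes (1x3 for closing, 1x5 for opening), the toy image and the convention that pixels outside the image count as background are all assumptions.

```python
def _morph(img, se_h, se_w, op):
    """Apply min (erosion) or max (dilation) over a centred se_h x se_w window.
    Pixels outside the image are treated as background (0)."""
    rows, cols = len(img), len(img[0])
    rh, rw = se_h // 2, se_w // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
                for rr in range(r - rh, r + rh + 1)
                for cc in range(c - rw, c + rw + 1)
            ]
            out[r][c] = op(window)
    return out


def erode(img, se_h, se_w):
    return _morph(img, se_h, se_w, min)


def dilate(img, se_h, se_w):
    return _morph(img, se_h, se_w, max)


def close_op(img, se_h, se_w):
    """Closing = dilation followed by erosion; fills small gaps."""
    return erode(dilate(img, se_h, se_w), se_h, se_w)


def open_op(img, se_h, se_w):
    """Opening = erosion followed by dilation; removes small specks."""
    return dilate(erode(img, se_h, se_w), se_h, se_w)


image = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],  # isolated speck (non-plate noise)
    [0, 0, 0, 0, 1, 1, 0, 1, 1, 0],  # plate-like band with a one-pixel gap
    [0, 0, 0, 0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

closed = close_op(image, 1, 3)   # fill the gap in the band
cleaned = open_op(closed, 1, 5)  # remove anything narrower than 5 pixels
for row in cleaned:
    print(row)
```

In this toy case the closing bridges the gap in the wide band, and the subsequent opening with a wider element removes the speck while leaving the band intact.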
Comparison of Real-Time DSP-Based Edge Detection Techniques for License Plate Detection
"This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder." “Copyright IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.”In this paper, edge detection techniques and their performance are compared when applied in license plate detection using an embedded digital signal processor. License plate detection remains to be the crucial part of a vehicle’s license plate recognition process. The edge detection algorithms compared in this work are those reported capable of delivering real-time performance. These are Canny-Deriche-FGL, Haar and Daubechies-4 wavelet transform and the classic Sobel. These particular algorithms are chosen and compared due to their good performance on digital signal processors. The comparison is drawn in terms of speed and detection success of a license plate. The results show Haar wavelet-based edge detector performs better on a DSP with LP detection speed of 7.32 ms and 98.6% success using 45,032 UK images containing license plates at 768X288 resolutions