7 research outputs found

    The DLMT hardware implementation. A comparative study with the DCT and the DWT

    Get PDF
    In recent years, with the growing popularity of image compression techniques, many architectures have been proposed, generally based on the Forward and Inverse Discrete Cosine Transform (FDCT, IDCT). Alternatively, compression schemes based on the Discrete Wavelet Transform (DWT), used both in the JPEG2000 coding standard and in the H.264-SVC (Scalable Video Coding) standard, do not need to divide the image into non-overlapping blocks or macroblocks. This paper discusses the hardware implementation of the DLMT (Discrete Lopez-Moreno Transform), a new scheme intermediate between the DCT and the DWT, and compares it against the most relevant architectures proposed in the literature. The DLMT can also be applied over a whole image without increasing computational complexity. FPGA implementation results show that the proposed DLMT offers significant performance benefits compared with the DCT and the DWT, making it very suitable for Wireless Sensor Network (WSN) applications.
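    A minimal sketch (not from the paper, whose DLMT definition is not given here) of the block partitioning this abstract contrasts with whole-image transforms: a JPEG-style 8x8 block DCT next to a single DCT over the full image. Image size, block size, and data are illustrative assumptions.

```python
# Illustrative sketch (not the paper's DLMT): the 8x8 block-wise DCT used by
# JPEG-style codecs versus one DCT over the whole image, i.e. the block
# partitioning this abstract contrasts with the DWT and the DLMT.
# Assumes image dimensions are multiples of the block size.
import numpy as np
from scipy.fftpack import dct

def dct2(a):
    """2-D type-II DCT from two orthonormal 1-D passes."""
    return dct(dct(a, axis=0, norm='ortho'), axis=1, norm='ortho')

def blockwise_dct(img, bs=8):
    """Apply dct2 independently to each non-overlapping bs x bs block."""
    out = np.empty_like(img, dtype=float)
    for i in range(0, img.shape[0], bs):
        for j in range(0, img.shape[1], bs):
            out[i:i+bs, j:j+bs] = dct2(img[i:i+bs, j:j+bs])
    return out

img = np.random.rand(64, 64)        # stand-in for a greyscale image
coeffs_blocks = blockwise_dct(img)  # JPEG-style: artefacts at block borders
coeffs_whole = dct2(img)            # whole-image: no block partitioning
```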

    The DLMT. An alternative to the DCT

    Full text link
    In recent years, with the growing popularity of image compression techniques, many architectures have been proposed, generally based on the Forward and Inverse Discrete Cosine Transform (FDCT, IDCT). Alternatively, compression schemes based on the Discrete Wavelet Transform (DWT), used both in the JPEG2000 coding standard and in the H.264-SVC (Scalable Video Coding) standard, do not need to divide the image into non-overlapping blocks or macroblocks. This paper presents the DLMT (Discrete Lopez-Moreno Transform), a new scheme intermediate between the DCT and the DWT. The DLMT is computationally very similar to the DCT but uses quasi-sinusoidal functions, so blocking artifacts and their effects are of relatively low importance. The quasi-sinusoidal functions allow a multiresolution control quite close to that obtained with a DWT, but without increasing the computational complexity of the transformation. The DLMT can also be applied over a whole image without increasing computational complexity. Simulation results in MATLAB show that the proposed DLMT offers significant performance benefits and improvements compared with the DCT.
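    The DLMT basis itself is not given in this abstract, so the multiresolution behaviour it is said to approximate is illustrated below with a standard Haar DWT (via the PyWavelets package); everything here is a stand-in for orientation only.

```python
# The DLMT basis is not published in this abstract, so a standard Haar DWT
# (PyWavelets) stands in here purely to illustrate the multiresolution
# control the abstract says the DLMT approximates.
import numpy as np
import pywt

img = np.random.rand(64, 64)               # stand-in greyscale image
cA, (cH, cV, cD) = pywt.dwt2(img, 'haar')  # one decomposition level
# cA is a half-resolution approximation; cH/cV/cD hold horizontal,
# vertical and diagonal detail. Recursing on cA gives ever coarser
# views -- the multiresolution control a block DCT lacks.
cA2, _ = pywt.dwt2(cA, 'haar')             # second, coarser level
print(img.shape, cA.shape, cA2.shape)      # (64, 64) (32, 32) (16, 16)
```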

    Distributed artificial neural network architecture over wireless sensor network

    Full text link
    The objective of this thesis is to study how to distribute an Artificial Neural Network (ANN) over a Wireless Sensor Network (WSN). On the one hand, ANNs are Artificial Intelligence (AI) algorithms inspired by the brain that exhibit a great amount of parallelism; their main advantage is that they can learn non-linear features and patterns from examples. On the other hand, WSNs are a technology of increasing interest due to the rise of the IoT (Internet of Things). The nodes of a WSN are usually unattended and battery-powered, in remote or difficult-to-access places, so their resources are limited in order to maximize their lifetime. However, more intelligent WSNs are in demand, so merging these two kinds of networks seems natural; this thesis is about how to do it efficiently.

    To achieve a fast and energy-efficient implementation of the ANNs, and because they exhibit a great amount of parallelism, implementation on FPGAs was chosen: microcontrollers and DSP processors may be too slow, and GPUs consume too much energy, whereas an FPGA is a flexible intermediate option. Although there is already a lot of research on the hardware implementation of ANNs on FPGAs, to the best of the authors' knowledge there was no freely available IP (Intellectual Property) at the beginning of this thesis. We therefore opted to design our own IP, making it highly configurable to allow its use in a great variety of applications.

    The methodology of this thesis was as follows. First, an application example was chosen in order to obtain measurable data from simulations and implementations. The initial example was image compression using auto-encoders, chosen because the authors had previous work on low-power compressors for WSNs to compare with. Auto-encoders are ANN architectures with as many outputs as inputs and a narrow hidden layer (i.e. with fewer neurons than the output layer); when an auto-encoder is trained with a supervised algorithm that takes the inputs as the desired outputs, the ANN is forced to learn a compressed representation of the inputs in the narrow hidden layer. However, finding a way to merge images from multiple sensors into a combined one to be learned is difficult, and no worthwhile real application was found to justify it, so this example was discarded. The second example was car and pedestrian detection with low-cost radars, chosen to allow comparison with the thesis of another colleague from the lab; however, the amount of collected examples was not enough to properly train the ANNs. Hence, the popular MNIST dataset of handwritten digit classification was finally chosen. To adapt MNIST to a multi-sensor scenario, each image was divided into 4 quadrants and each quadrant was assigned to a different node of the WSN, emulating 4 sensors capturing partial information, or different points of view, of the same event.

    Second, the algorithms and the intended designs were adjusted and optimized through high-level simulations: Matlab scripts were used for optimization and off-line training of the hardware implementations of the ANNs, and SystemC programs were used to simulate the WSNs. Then the ANN was designed, verified, and implemented on the FPGA. Two methodologies were used for this step: classical RTL (Register Transfer Level) design using VHDL, and HLS (High-Level Synthesis) using SystemC. Therefore two IPs were obtained: one in VHDL and another in SystemC.

    From designing and implementing the ANNs through these two methodologies, an analysis of the configurability of both was extracted. Its key finding is that, whereas high-level descriptions are more likely to be re-used, the parameterization mechanisms of SystemC are less robust and flexible than those of VHDL, and good parameterization is necessary to facilitate code re-use. The parameterization mechanism of VHDL is the generic; those of SystemC are macros, templates, and constructor arguments. Macros are limited to scalar values, constructor arguments are not synthesizable, and templates have some limitations detailed in the analysis. The challenge when parameterizing an ANN is to configure a configurable number of modules separately: each layer needs its own configuration (number of neurons, activation function, etc.), and the number of layers should be a parameter too. In SystemC it is easy to configure a fixed number of modules separately, or to instantiate a configurable number of exactly identical (i.e. identically configured) modules, but not both at the same time; complex macros were used to solve this problem, whereas in VHDL generic arrays in for-generate statements solve the challenge easily. Key points and hints for the improvement of future HLS tools are given.

    Finally, the main result of this thesis is the developed framework for distributing ANNs over WSNs. It consists of the Matlab scripts to optimize and train the entire ANN, the SystemC programs to estimate and balance the computation versus communication energy consumption of the WSN with the distributed ANN, and the IPs to implement the ANN fragments on the nodes with FPGAs.
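    A minimal sketch, under assumed shapes and hypothetical weights, of the quadrant-based distribution described above: each WSN node holds one 14x14 quadrant and the matching slice of the first layer's weight matrix, computes a partial product locally, and only the partial sums travel to the aggregating node.

```python
# A minimal sketch, under assumed shapes and hypothetical weights, of the
# quadrant-based distribution described above: each WSN node evaluates its
# slice of the first layer locally, so only partial sums (not raw pixels)
# travel to the aggregating node.
import numpy as np

def quadrants(img):
    """Split a 28x28 image into the 4 quadrants assigned to the 4 nodes."""
    h, w = img.shape
    return [img[:h//2, :w//2], img[:h//2, w//2:],
            img[h//2:, :w//2], img[h//2:, w//2:]]

rng = np.random.default_rng(0)
HIDDEN = 32  # hypothetical hidden-layer width

# Each node stores only the weight rows for its own 14x14 = 196 pixels.
node_w = [rng.standard_normal((14 * 14, HIDDEN)) for _ in range(4)]
bias = rng.standard_normal(HIDDEN)

img = rng.random((28, 28))  # stand-in for one MNIST digit
# Local computation on each node: a partial matrix-vector product.
partial = [q.reshape(-1) @ w for q, w in zip(quadrants(img), node_w)]
# Aggregation (e.g. at the sink): sum partials, add bias, apply activation.
hidden = np.tanh(np.sum(partial, axis=0) + bias)
print(hidden.shape)  # (32,) -- later layers continue from this vector
```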

    Application-specific behavioral synthesis design space exploration: artificial neural networks

    Full text link
    C-based VLSI design offers some distinct advantages over traditional RT-level VLSI design. One key advantage is the ability to automatically generate different types of micro-architectures from the original behavioral description, which allows exploration of the design space to obtain micro-architectures with unique characteristics. The main problem with previous work on High-Level Synthesis (HLS) Design Space Exploration (DSE) is that the given behavioral description is assumed to be stable, i.e. fully refined a priori. For many applications, e.g. Artificial Neural Networks (ANNs), refining the input description is not easy: choosing parameters such as the number of neurons and layers, together with the duration of the training phase, is non-trivial, and any change in these parameters significantly affects the resultant hardware circuit. This work therefore proposes a tightly integrated, two-tier, application-specific HLS DSE method that explores the input parameters of the behavioral description, performs a detailed HLS exploration of the best configurations that meet a set of specified input constraints, and then maps the optimal design onto a configurable SoC FPGA. A case study using different types of ANNs is presented. The design flow has been fully automated, and the experimental results show that the proposed flow is extremely effective.
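    A hedged sketch of the two-tier exploration loop the abstract describes: tier 1 sweeps ANN parameters (neurons, layers), tier 2 evaluates each candidate, and the survivors that meet the input constraints are Pareto-filtered. The cost model and constraint values below are invented placeholders; the paper's automated flow drives a real HLS tool instead.

```python
# Hypothetical stand-ins for the paper's flow: a toy cost model replaces the
# real HLS run, and the constraint values are invented for illustration.
from itertools import product

def mock_hls(neurons, layers):
    """Pretend HLS evaluation returning (area, latency) for one config."""
    area = neurons * layers * 10
    latency = 1000 // neurons + 5 * layers
    return area, latency

MAX_AREA, MAX_LATENCY = 2000, 120   # example input constraints

candidates = []
for neurons, layers in product([8, 16, 32, 64], [1, 2, 3]):  # tier 1 sweep
    area, latency = mock_hls(neurons, layers)                # tier 2 eval
    if area <= MAX_AREA and latency <= MAX_LATENCY:
        candidates.append(((neurons, layers), area, latency))

def dominates(a, b):
    """a dominates b if no worse in both metrics and better in one."""
    return a[1] <= b[1] and a[2] <= b[2] and (a[1] < b[1] or a[2] < b[2])

# Keep only Pareto-optimal designs; a real flow would report these.
pareto = [c for c in candidates
          if not any(dominates(o, c) for o in candidates)]
print(pareto)
```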

    On the adaptability of ensemble methods for distributed classification systems: A comparative analysis

    Full text link
    In this work, a two-stage architecture is used to analyze the information collected from several sensors. The first stage makes classifications from partial information about the target (i.e. from different points of view or from different kinds of measurements), using a simple artificial neural network as the classifier. The second stage then aggregates all the estimations given by the ensemble to obtain the final classification. Four ensemble methods are compared in the second stage: an artificial neural network, plurality majority, basic weighted majority, and stochastic weighted majority. However, reliability is not the only important factor: adaptation is critical when the ensemble operates in changing environments. Therefore, the artificial neural network and the plurality majority algorithm are compared against our two proposed adaptive algorithms; unlike the artificial neural network, the majority methods require no previous training. The effects of improving the first stage, and the behaviour of the system under different perturbations, have been measured. Results were obtained from two applications: a realistic one, and a simpler one with more training examples for a more accurate comparison. These results show that the artificial neural network is the most accurate proposal, whereas the proposed stochastic weighted voting is the most adaptive one.
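    A minimal sketch of the second-stage aggregators compared in this work: plurality majority, weighted majority, and stochastic weighted majority. The weights below stand in for whatever reliability estimate drives the paper's adaptive variants, which are its own contribution and are not reproduced here.

```python
# A minimal sketch of the second-stage aggregators compared here. The
# weights stand in for whatever reliability estimate drives the paper's
# adaptive variants, which are not reproduced.
import numpy as np

rng = np.random.default_rng(1)

def plurality(votes):
    """Most-voted class wins; needs no training or weights."""
    return int(np.bincount(votes).argmax())

def weighted_majority(votes, weights, n_classes):
    """Each classifier's vote counts proportionally to its weight."""
    score = np.zeros(n_classes)
    for v, w in zip(votes, weights):
        score[v] += w
    return int(score.argmax())

def stochastic_weighted(votes, weights, n_classes):
    """Sample the decision with probability proportional to support."""
    score = np.zeros(n_classes)
    for v, w in zip(votes, weights):
        score[v] += w
    return int(rng.choice(n_classes, p=score / score.sum()))

votes = np.array([2, 2, 7, 2])            # one vote per first-stage classifier
weights = np.array([0.9, 0.8, 0.4, 0.7])  # e.g. running accuracy estimates
print(plurality(votes),
      weighted_majority(votes, weights, 10),
      stochastic_weighted(votes, weights, 10))
```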

    Programmable Systems for Intelligence in Automobiles (PRYSTINE): Final results after Year 3

    No full text
    Autonomous driving is disrupting the automotive industry as we know it today. It requires fail-operational behavior in the sense, plan, and act stages of the automation chain, so that safety-critical situations can be handled autonomously, which state-of-the-art approaches do not yet achieve. The European ECSEL research project PRYSTINE realizes Fail-operational Urban Surround perceptION (FUSION), based on robust Radar and LiDAR sensor fusion and control functions, to enable safe automated driving in urban and rural environments. This paper showcases some of the key exploitable results (e.g. novel Radar sensors, innovative embedded control and E/E architectures, pioneering sensor-fusion approaches, and AI-controlled vehicle demonstrators) achieved up to its final year (Year 3).

    Evaluation of a quality improvement intervention to reduce anastomotic leak following right colectomy (EAGLE): pragmatic, batched stepped-wedge, cluster-randomized trial in 64 countries

    Get PDF
    Background: Anastomotic leak affects 8 per cent of patients after right colectomy, with a 10-fold increased risk of postoperative death. The EAGLE study aimed to develop and test whether an international, standardized quality improvement intervention could reduce anastomotic leaks.

    Methods: The intervention protocol, intended for international use and iteratively co-developed by a multistage Delphi process, comprised an online educational module introducing risk stratification, an intraoperative checklist, and harmonized surgical techniques. Clusters (hospital teams) were randomized to one of three arms with varied sequences of intervention/data collection by a derived stepped-wedge batch design (at least 18 hospital teams per batch). Patients were blinded to the study allocation. Enrolment from low- and middle-income countries was encouraged. The primary outcome (assessed by intention to treat) was the anastomotic leak rate; subgroup analyses by module completion (at least 80 per cent of surgeons, high engagement; less than 50 per cent, low engagement) were preplanned.

    Results: A total of 355 hospital teams registered, of which 332 from 64 countries (39.2 per cent low and middle income) were included in the final analysis. The online modules were completed by half of the surgeons (2143 of 4411). The primary analysis included 3039 of the 3268 patients recruited (206 patients had no anastomosis and 23 were lost to follow-up), with anastomotic leaks arising before and after the intervention in 10.1 and 9.6 per cent respectively (adjusted OR 0.87, 95 per cent c.i. 0.59 to 1.30; P = 0.498). The proportion of surgeons completing the educational modules was an influence: the leak rate decreased from 12.2 per cent (61 of 500) before the intervention to 5.1 per cent (24 of 473) after it in high-engagement centres (adjusted OR 0.36, 0.20 to 0.64; P < 0.001), but this was not observed in low-engagement hospitals (8.3 per cent (59 of 714) and 13.8 per cent (61 of 443) respectively; adjusted OR 2.09, 1.31 to 3.31).

    Conclusion: Completion of globally available digital training by engaged teams can alter anastomotic leak rates. Registration number: NCT04270721 (http://www.clinicaltrials.gov)
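    As a quick arithmetic check of the high-engagement result reported above, the unadjusted odds ratio implied by the raw counts (61 of 500 leaks before the intervention, 24 of 473 after) can be recomputed; the paper's adjusted OR of 0.36 additionally accounts for the stepped-wedge design and covariates, so the figures differ slightly.

```python
# Orientation only: recompute the unadjusted odds ratio implied by the raw
# high-engagement counts quoted above (61/500 leaks before, 24/473 after).
# The paper's adjusted OR of 0.36 also accounts for the stepped-wedge
# design and covariates, so the figures differ slightly.
def odds_ratio(before_events, before_total, after_events, after_total):
    odds_before = before_events / (before_total - before_events)
    odds_after = after_events / (after_total - after_events)
    return odds_after / odds_before

print(round(odds_ratio(61, 500, 24, 473), 2))  # ~0.38 unadjusted
```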