14 research outputs found

    Enhancement of Digital Photo Frame Capabilities With Dedicated Hardware

    Photo frames have come a long way from the conventional ones that required a photo to be printed and mounted. Today, in the digital era, we have a new concept, the digital photo frame, a modern counterpart of the conventional photo frame. A digital photo frame is essentially a picture frame that displays photos without the need to print them. These frames are available in a variety of sizes and configurations: a typical frame ranges from 7 inches to 20 inches, and key-chain-sized frames also exist. They support a variety of formats such as .jpeg, .tiff and .bmp. Most frames provide an option to run the photos as a slideshow, in sequential or random order, with an adjustable time interval. Photos can be loaded in several ways: directly from the camera's memory card, from memory devices such as USB drives, SD cards and MMC cards, or, increasingly, over Bluetooth. Another option that is becoming quite popular is to pull photos directly from the Internet, from sites such as Flickr and Picasa, or from e-mail. These frames also generally come with built-in speakers and remote controls. Our initial objective was to decide which features could be added to the digital photo frame that we design. For this purpose we conducted simulation exercises in MATLAB to prove their feasibility. The simulation exercise was divided into two parts: the first performed compression and decompression, and the second dealt with the various enhancements that can be added to the frame. For compression and decompression we considered the JPEG standard (Joint Photographic Experts Group), an ISO/ITU standard for compressing still images. The JPEG format is very popular owing to its variable compression range; its limitations include the fact that it is lossy and not well suited to displaying text. Its common extensions include *.jpg, *.jff, *.m-jpeg and *.mpeg. The enhancement features we tested for feasibility include the mean filter, median filter, image sharpening, negative image extraction, logarithmic transformations, power-law (gamma) correction, contrast stretching, grey-level slicing, bit-plane slicing and Laplace filtering. We then proceeded to the hardware implementation of the above features. Owing to the complexity of the design and lack of time, we implemented only a handful of them: the compression and decompression algorithm and two enhancement features, the Laplace filter and the median filter. For our implementation we used the Virtex 2 FPGA board.
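
    As a rough illustration of two of the enhancement features listed above (the ones later carried to hardware), the sketch below implements a 3×3 median filter and a 3×3 Laplace filter in Python/NumPy for a 2-D greyscale image. It is a behavioural sketch only, not the MATLAB or FPGA implementation described here; the kernel size, edge padding and 8-bit clipping range are assumptions.

```python
import numpy as np

def median_filter_3x3(img):
    """3x3 median filter: each output pixel is the median of its neighbourhood."""
    padded = np.pad(img, 1, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = np.median(padded[y:y + 3, x:x + 3])
    return out

def laplace_filter_3x3(img):
    """3x3 Laplacian: approximates the second derivative to highlight edges."""
    kernel = np.array([[0, 1, 0],
                       [1, -4, 1],
                       [0, 1, 0]], dtype=np.int32)
    padded = np.pad(img.astype(np.int32), 1, mode="edge")
    out = np.zeros_like(img, dtype=np.int32)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = np.sum(padded[y:y + 3, x:x + 3] * kernel)
    return np.clip(out, 0, 255).astype(np.uint8)  # clamp back to 8-bit range
```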

    Applicability of approximate multipliers in hardware neural networks

    In recent years there has been a growing interest in hardware neural networks, which offer many benefits over conventional software models, mainly in applications where speed, cost, reliability, or energy efficiency are of great importance. These hardware neural networks require many resource-, power- and time-consuming multiplication operations, so special care must be taken during their design. Since neural network processing can be performed in parallel, there is usually a requirement for designs with as many concurrent multiplication circuits as possible. One option to achieve this goal is to replace the complex exact multiplying circuits with simpler, approximate ones. The present work demonstrates the application of approximate multiplying circuits in the design of a feed-forward neural network model with on-chip learning ability. The experiments performed on the heterogeneous Proben1 benchmark dataset show that the adaptive nature of the neural network model successfully compensates for the calculation errors of the approximate multiplying circuits. At the same time, the proposed designs also profit from more computing power and increased energy efficiency.
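
    To make the idea concrete, the following Python sketch models one common class of approximate multiplier, operand truncation, inside a single neuron's multiply-accumulate and compares it with the exact result. This illustrates the general principle only, not the authors' circuits; the truncation width, bit widths and unsigned operands are assumptions.

```python
import numpy as np

def approx_mul(a, b, trunc_bits=4):
    """Approximate integer multiply: drop the low trunc_bits of each operand
    (a simple stand-in for truncated partial-product multipliers)."""
    return ((a >> trunc_bits) * (b >> trunc_bits)) << (2 * trunc_bits)

def neuron_output(weights, inputs, mul):
    """Multiply-accumulate for one neuron using the supplied multiplier."""
    return sum(mul(int(w), int(x)) for w, x in zip(weights, inputs))

rng = np.random.default_rng(0)
w = rng.integers(0, 256, size=16)   # assumed 8-bit unsigned weights
x = rng.integers(0, 256, size=16)   # assumed 8-bit unsigned activations

exact = neuron_output(w, x, lambda a, b: a * b)
approx = neuron_output(w, x, approx_mul)
print(f"exact={exact} approx={approx} relative error={abs(exact - approx) / exact:.3%}")
```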

    An FPGA Based Hardware Accelerator for Remote Surveillance Cameras

    The Blackeye II camera, produced by Kinopta, is used for remote security, conservation and traffic-flow surveillance. The camera uses an image sensor to acquire photographs, which undergo image processing and JPEG encoding on a microprocessor. Although the microprocessor performs other tasks, it is the processing and encoding of images that limits the frame rate of the camera to 2 frames per second (fps). Clients have requested an increase to 12.5 fps while adding more image processing to each photograph. The current microprocessor-based system is unable to achieve this. Custom digital logic systems perform well on processes that naturally form a pipeline, such as the Blackeye II image processing system. This project develops a digital logic system based on an FPGA to receive images from the image sensor, perform the required image processing operations, encode the images in JPEG format and send them on to the microprocessor. The objective is to implement a proof-of-concept device based upon the Blackeye II's existing hardware and an FPGA development board, implementing the proposed pipeline including one example of an image processing operation. A JPEG encoder is designed to process the 752 × 480 greyscale photographs from the image processor in real time. The JPEG encoder consists of four stages: discrete cosine transform (DCT), quantisation, zig-zag buffer and Huffman encoder. The DCT design is based upon the work of Woods et al. [1], which is improved upon. An analysis of the relationship between precision and accuracy in the DCT and quantisation stages is used to minimise the system's resource requirements. The JPEG encoder is successfully tested in simulation. Input and output stages are added to the design. The input stage receives data from the image sensor and removes breaks in the data stream. The output stage concatenates the data from the JPEG encoder and transmits it to the microprocessor via the microprocessor's ISI (image sensor interface) peripheral. An image sharpening filter is developed and inserted into the pipeline between the input stage and the JPEG encoder. Because remote surveillance cameras are battery powered, minimising power consumption is a key concern. To this end, a mechanism is introduced to track which modules in the pipeline are in use at any time; any module not in use is paused by gating its clock source. Once the system is complete and tested in simulation, it is loaded into hardware. The FPGA development board is attached to the image sensor board and microprocessor board of the Blackeye II camera by a purpose-built breakout board. Plugging the microprocessor board into a PC provides a live stream of images, proving the successful operation of the FPGA system. The project objectives were exceeded by increasing the frame rate of the Blackeye II to 20 fps, which will not decrease with additional image processing operations. The project was viewed as a success by Kinopta, who have committed to its further development.
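
    The encoder stages named above (DCT, quantisation and zig-zag reordering ahead of Huffman coding) can be illustrated with a short behavioural sketch in Python/NumPy for a single 8 × 8 block. This is a reference model only, not the FPGA design described here; the uniform quantisation step is an assumption (baseline JPEG uses per-frequency quantisation tables).

```python
import numpy as np

def dct_2d_8x8(block):
    """Reference 8x8 type-II DCT (the transform used in baseline JPEG)."""
    N = 8
    x = np.arange(N)
    # Orthonormal basis matrix: C[u, n] = c(u) * cos((2n+1) * u * pi / (2N))
    C = np.cos((2 * x[None, :] + 1) * x[:, None] * np.pi / (2 * N))
    C *= np.sqrt(2.0 / N)
    C[0, :] /= np.sqrt(2.0)
    return C @ block @ C.T

def zigzag_order(n=8):
    """JPEG zig-zag scan order for an n x n block."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else -p[0]))

block = np.random.default_rng(1).integers(0, 256, size=(8, 8)).astype(np.float64)
coeffs = dct_2d_8x8(block - 128.0)           # level shift, then DCT
step = 16.0                                  # assumed uniform quantisation step
quantised = np.round(coeffs / step).astype(int)
scanned = [quantised[r, c] for r, c in zigzag_order()]
print(scanned)                               # input to run-length / Huffman coding
```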

    Hardware Implementation of a Secured Digital Camera with Built In Watermarking and Encryption Facility

    The objective is to design an efficient hardware implementation of a secure digital camera for real-time digital rights management (DRM) in embedded systems, incorporating watermarking and encryption. This emerging field addresses issues related to the ownership and intellectual property rights of digital content. A novel invisible watermarking algorithm is proposed which uses the median of each image block to calculate the embedding factor. The performance of the proposed algorithm is compared with the earlier proposed permutation- and CRT-based algorithms. It is seen that the watermark is successfully embedded invisibly without distorting the image and that it is more robust to common image processing operations such as JPEG compression, filtering and tampering. The robustness is measured by different quality assessment metrics: Peak Signal to Noise Ratio (PSNR), Normalized Correlation (NC), and Tampering Assessment Function (TAF). The algorithm is simpler to implement in hardware because of its computational simplicity. The Advanced Encryption Standard (AES) is applied after quantization for increased security. The corresponding hardware architectures for invisible watermarking and AES encryption are presented and synthesized for a Field Programmable Gate Array (FPGA). The soft cores, in the form of a Hardware Description Language (HDL), are available as intellectual property cores and can be integrated with any multimedia-based electronic appliance, which are basically embedded systems built using System on Chip (SoC) technology.
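
    As a purely hypothetical sketch of how a block-median-derived embedding factor might be used (the abstract does not give the actual formula, so the scaling rule, block size and strength parameter below are illustrative assumptions, not the proposed algorithm):

```python
import numpy as np

def embed_block(block, bit, strength=0.02):
    """Hypothetical illustration: scale the embedding by the block median so the
    mark stays weak in dark or flat regions (NOT the paper's exact formula)."""
    alpha = strength * np.median(block)
    sign = 1.0 if bit else -1.0
    return np.clip(block + sign * alpha, 0, 255)

def embed_watermark(img, bits, block=8):
    """Embed one watermark bit per 8x8 block, scanning blocks in row-major order."""
    out = img.astype(np.float64).copy()
    h, w = img.shape
    idx = 0
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            if idx >= len(bits):
                return out.astype(np.uint8)
            out[r:r + block, c:c + block] = embed_block(out[r:r + block, c:c + block], bits[idx])
            idx += 1
    return out.astype(np.uint8)
```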

    Survey of FPGA applications in the period 2000 – 2015 (Technical Report)

    Romoth J, Porrmann M, Rückert U. Survey of FPGA applications in the period 2000 – 2015 (Technical Report); 2017. Since their introduction, FPGAs can be seen in more and more fields of application. Their key advantage is the combination of software-like flexibility with the performance otherwise common to hardware. Nevertheless, every application field introduces special requirements for the computational architecture used. This paper provides an overview of the different topics FPGAs have been used for in the last 15 years of research and why they have been chosen over other processing units such as CPUs.

    Simulation and Error Analysis of 16-Point Winograd FFT Computation Using Xilinx ISE 10.1i

    A weakness of data processing with analog processors is that it is less efficient: if there is an error in the design of a system using an analog processor, the hardware of the system must be redesigned. Processing analog signals with a digital processor has several advantages, such as efficiency and ease of modifying the system, without requiring a hardware redesign as in the design of analog systems. Errors in the design of digital circuits on the processor require only a program modification, without having to change the hardware, and modifications can be made anywhere, without requiring us to be in the laboratory. The Fast Fourier Transform (FFT) is a computational algorithm used for digital data processing in fields such as imaging, music, and satellites. The 16-point Winograd FFT contains six multipliers whose coefficients are fractions. Designing the FFT multipliers in Xilinx ISE 10.1i can be done by converting the fractions into integers and then representing them as binary (logic 1 and 0) numbers. Converting the fractions into integers gives computational results that differ from the original FFT computation with fractional multipliers, and large deviations in the computed results lead to a loss of the information conveyed by the processed data. To that end, the researchers simulated and analysed the computational error rate of a 16-point Winograd FFT processor using Xilinx ISE 10.1i. The results show that the average percentage error of the 16-point FFT simulated in Xilinx ISE 10.1i, compared with MATLAB, is 6.67% (first trial) and 4.48% (second trial). This error occurs because the FPGA processor cannot represent numbers as real (fractional) values. Keywords: FFT, Xilinx ISE 10.1i
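
    The kind of error studied here, fractional multiplier coefficients replaced by integer-scaled ones, can be mimicked with a short Python sketch that compares a 16-point DFT computed with fixed-point-rounded twiddle factors against a floating-point reference. This only illustrates coefficient quantisation error; it does not reproduce the Winograd factorisation or the Xilinx ISE simulation, and the fractional bit width is an assumption.

```python
import numpy as np

def dft_quantised(x, frac_bits=4):
    """16-point DFT with twiddle factors rounded to fixed point, mimicking the
    loss incurred when fractional multipliers become integer ones in hardware
    (illustration only, not the Winograd structure)."""
    N = len(x)
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)
    scale = 1 << frac_bits
    Wq = np.round(W * scale) / scale          # quantise real and imaginary parts
    return Wq @ x

x = np.random.default_rng(2).standard_normal(16)
ref = np.fft.fft(x)                           # floating-point reference (MATLAB-like)
approx = dft_quantised(x, frac_bits=4)
err = np.abs(ref - approx) / np.maximum(np.abs(ref), 1e-12)
print(f"mean percentage error: {100 * err.mean():.2f}%")
```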

    Efficient reconfigurable architectures for 3D medical image compression

    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Recently, the more widespread use of three-dimensional (3-D) imaging modalities, such as magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), and ultrasound (US), has generated a massive amount of volumetric data. This has provided an impetus to the development of other applications, in particular telemedicine and teleradiology. In these fields, medical image compression is important since both efficient storage and transmission of data through high-bandwidth digital communication lines are of crucial importance. Despite their advantages, most 3-D medical imaging algorithms are computationally intensive, with matrix transformation as the most fundamental operation involved in the transform-based methods. Therefore, there is a real need for high-performance systems, whilst keeping architectures flexible to allow for quick upgradeability with real-time applications. Moreover, in order to obtain efficient solutions for large volumes of medical data, an efficient implementation of these operations is of significant importance. Reconfigurable hardware, in the form of field programmable gate arrays (FPGAs), has been proposed as a viable system building block in the construction of high-performance systems at an economical price. Consequently, FPGAs seem an ideal candidate to harness and exploit their inherent advantages such as massive parallelism capabilities, multimillion gate counts, and special low-power packages. The key achievements of the work presented in this thesis are summarised as follows. Two architectures for the 3-D Haar wavelet transform (HWT) have been proposed, based on transpose-based computation and partial reconfiguration, suitable for 3-D medical imaging applications. These applications require continuous hardware servicing, and as a result dynamic partial reconfiguration (DPR) has been introduced. A comparative study of both non-partial and partial reconfiguration implementations has shown that DPR offers many advantages and leads to a compelling solution for implementing computationally intensive applications such as 3-D medical image compression. Using DPR, several large systems are mapped to small hardware resources, and the area, power consumption and maximum frequency are optimised and improved. Moreover, an FPGA-based architecture of the finite Radon transform (FRAT) with three design strategies has been proposed: direct implementation of pseudo-code with a sequential or pipelined description, and a block random access memory (BRAM)-based method. An analysis with various medical imaging modalities has been carried out. Results obtained for image de-noising using FRAT are promising in reducing Gaussian white noise in medical images. In terms of hardware implementation, promising trade-offs between maximum frequency, throughput and area are also achieved. Furthermore, a novel hardware implementation of a 3-D medical image compression system with context-based adaptive variable length coding (CAVLC) has been proposed. An evaluation of the 3-D integer transform (IT) and the discrete wavelet transform (DWT) with lifting scheme (LS) for the transform blocks reveals that the 3-D IT has better computational complexity than the 3-D DWT, whilst the 3-D DWT with LS provides lossless compression that is significantly useful for medical image compression.
Additionally, an architecture for CAVLC that is capable of compressing high-definition (HD) images in real time, without any buffer between the quantiser and the entropy coder, is proposed. Through judicious parallelisation, promising results have been obtained with limited resources. In summary, this research tackles the issues of massive 3-D medical data volumes that require compression, as well as hardware implementation to accelerate the slowest operations in the system. The results obtained also reveal significant achievements in terms of architecture efficiency and application performance. Ministry of Higher Education Malaysia (MOHE), Universiti Tun Hussein Onn Malaysia (UTHM) and the British Council.
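
    A one-level separable 3-D Haar wavelet transform, the transform targeted by the architectures above, can be sketched in a few lines of Python/NumPy. This is a behavioural reference only; the averaging/differencing normalisation is an assumed convention, and the transpose-based scheduling and dynamic partial reconfiguration of the hardware architectures are not modelled.

```python
import numpy as np

def haar_1d(a, axis):
    """One Haar level along one axis: pairwise averages in the first half of
    that axis, pairwise differences in the second half."""
    a = np.moveaxis(a, axis, 0)
    even, odd = a[0::2], a[1::2]
    out = np.concatenate([(even + odd) / 2.0, (even - odd) / 2.0], axis=0)
    return np.moveaxis(out, 0, axis)

def haar_3d(volume):
    """One decomposition level of the separable 3-D Haar wavelet transform."""
    out = volume.astype(np.float64)
    for axis in range(3):
        out = haar_1d(out, axis)
    return out

vol = np.random.default_rng(3).integers(0, 4096, size=(8, 8, 8))  # toy CT-like volume
coeffs = haar_3d(vol)          # low-pass approximation sits in the first octant
```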

    Efficient VLSI Architectures for Image Compression Algorithms

    An image, in its original form, contains a huge amount of data, which not only demands a large amount of memory for storage but also makes transmission over a limited-bandwidth channel inconvenient. Image compression reduces the image data in either a lossless or a lossy way. While lossless image compression recovers the original image data completely, it provides very little compression. Lossy compression techniques compress the image data by a variable amount depending on the image quality required by the particular application. Compression is performed in steps such as image transformation, quantization and entropy coding. JPEG is one of the most widely used image compression standards; it uses the discrete cosine transform (DCT) to transform the image from the spatial to the frequency domain. An image carries little visual information in its high frequencies, so these can be heavily quantized to reduce the size of the transformed representation. Entropy coding follows to further reduce the redundancy in the transformed and quantized image data. Real-time data processing requires high speed, which makes dedicated hardware implementation the most preferred choice. A hardware design is favored for its low-cost and low-power implementation; these two factors are also the most important requirements for portable battery-powered devices such as digital cameras. The image transform requires very heavy computation, and the complete image compression system is realized through various intermediate steps between the transform and the final bit-stream. Intermediate stages require memory to store intermediate results. The cost and power of the design can be reduced both by efficient implementation of the transforms and by the reduction or removal of intermediate stages using different techniques. The proposed research work is focused on the efficient hardware implementation of transform-based image compression algorithms by optimizing the architecture of the system. Distributed arithmetic (DA) is an efficient approach to implementing digital signal processing algorithms. DA can be realized in two different ways: one stores precomputed values in ROMs, and the other has no ROM requirement; ROM-free DA is more efficient. For the image transform, architectures of the one-dimensional discrete Hartley transform (1-D DHT) and the one-dimensional DCT (1-D DCT) have been optimized using the ROM-free DA technique. Further, 2-D separable DHT (SDHT) and 2-D DCT architectures have been implemented in a row-column approach using two 1-D DHTs and two 1-D DCTs, respectively. A finite state machine (FSM) based architecture from DCT to quantization has been proposed, using a modified quantization matrix for JPEG image compression, which requires no memory for storing the quantization table or the DCT coefficients. In addition, quantization is realized without the use of multipliers, which require more area and are power hungry. For entropy encoding, Huffman coding is more hardware-efficient than arithmetic coding, and the use of a Huffman code table further simplifies the implementation. Strategies have been used to significantly reduce the number of memory bits needed to store the Huffman code table, and the complete Huffman coding architecture encodes the transformed coefficients at one bit per clock cycle. A direct implementation algorithm for the DCT has the advantage that it is free of the transposition memory needed to store intermediate 1-D DCT results.
Although recursive algorithms have been the preferred method, they have low accuracy, resulting in image quality degradation. A non-recursive equation for the direct computation of DCT coefficients has been proposed and implemented in both a 0.18 µm ASIC library and an FPGA. It can compute DCT coefficients in any order, and all intermediate computations are free of fractions; hence very high image quality has been obtained in terms of PSNR. In addition, only one multiplier and one register bit-width need to be changed to increase the accuracy, resulting in very low hardware overhead. The architecture has been implemented to produce zig-zag ordered DCT coefficients. The comparison results show that this implementation has less area in terms of gate count and lower power consumption than existing DCT implementations. Using this architecture, the complete JPEG image compression system has been implemented, with the Huffman coding module, one multiplier and one register as the only additional modules. The intermediate stages (DCT to Huffman encoding) are free of memory, hence an efficient architecture is obtained.
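
    Distributed arithmetic evaluates a fixed-coefficient inner product without multipliers by precomputing a table over coefficient subsets and accumulating one input bit-plane per cycle. The Python sketch below shows the classic ROM-based form for unsigned inputs as an illustration; the thesis targets a ROM-free variant, and the coefficient values and bit width here are assumptions.

```python
def da_inner_product(coeffs, xs, bits=8):
    """Distributed-arithmetic inner product sum(c_k * x_k) for unsigned
    fixed-point inputs: precompute a LUT over all bit combinations, then
    shift-accumulate one input bit-plane per 'clock cycle'."""
    K = len(coeffs)
    lut = [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
           for addr in range(1 << K)]
    acc = 0
    for b in range(bits):                    # one bit position per cycle
        addr = sum(((x >> b) & 1) << k for k, x in enumerate(xs))
        acc += lut[addr] << b                # shift-and-add, no multipliers
    return acc

coeffs = [3, -5, 7, 2]                       # assumed fixed filter coefficients
xs = [17, 200, 33, 90]                       # assumed 8-bit unsigned samples
assert da_inner_product(coeffs, xs) == sum(c * x for c, x in zip(coeffs, xs))
```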

    Recent Advances in Embedded Computing, Intelligence and Applications

    The latest proliferation of Internet of Things deployments and edge computing, combined with artificial intelligence, has led to exciting new application scenarios in which embedded digital devices are essential enablers. Moreover, new powerful and efficient devices are appearing to cope with workloads formerly reserved for the cloud, such as deep learning. These devices allow processing close to where data are generated, avoiding bottlenecks due to communication limitations. The efficient integration of hardware, software and artificial intelligence capabilities deployed in real sensing contexts empowers the edge intelligence paradigm, which will ultimately contribute to fostering the offloading of processing functionalities to the edge. In this Special Issue, researchers have contributed nine peer-reviewed papers covering a wide range of topics in the area of edge intelligence, among them hardware-accelerated implementations of deep neural networks, IoT platforms for extreme edge computing, neuro-evolvable and neuromorphic machine learning, and embedded recommender systems.