
    DRASIC: Distributed Recurrent Autoencoder for Scalable Image Compression

    We propose a new architecture for distributed image compression from a group of distributed data sources. The work is motivated by practical needs of data-driven codec design, low power consumption, robustness, and data privacy. The proposed architecture, which we refer to as Distributed Recurrent Autoencoder for Scalable Image Compression (DRASIC), is able to train distributed encoders and one joint decoder on correlated data sources. Its compression capability is much better than that of training separate codecs. Meanwhile, the performance of our distributed system with 10 distributed sources is within only 2 dB peak signal-to-noise ratio (PSNR) of the performance of a single codec trained with all data sources. We experiment with distributed sources of different correlations and show how well our data-driven methodology matches the Slepian-Wolf Theorem in Distributed Source Coding (DSC). To the best of our knowledge, this is the first data-driven DSC framework for general distributed code design with deep learning.
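
    As a rough illustration of the setup this abstract describes, the sketch below trains several per-source encoders against a single shared decoder. The layer sizes, optimizer, and toy random data are assumptions for the example, not the authors' model (which is recurrent and operates on images).

```python
# Minimal sketch: one encoder per data source, a single joint decoder,
# all trained together on per-source batches. Sizes are illustrative.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, dim=64, code=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 32), nn.Tanh(), nn.Linear(32, code))

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self, code=16, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(code, 32), nn.Tanh(), nn.Linear(32, dim))

    def forward(self, z):
        return self.net(z)

num_sources = 10
encoders = [Encoder() for _ in range(num_sources)]  # distributed encoders
decoder = Decoder()                                 # one joint decoder
params = [p for e in encoders for p in e.parameters()] + list(decoder.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(100):
    batches = [torch.randn(8, 64) for _ in range(num_sources)]  # toy per-source data
    loss = sum(nn.functional.mse_loss(decoder(enc(x)), x)
               for enc, x in zip(encoders, batches)) / num_sources
    opt.zero_grad()
    loss.backward()
    opt.step()
```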

    Learning to detect dysarthria from raw speech

    Speech classifiers of paralinguistic traits traditionally learn from diverse hand-crafted low-level features, by selecting the relevant information for the task at hand. We explore an alternative to this selection by learning the classifier and the feature extraction jointly. Recent work on speech recognition has shown improved performance over speech features by learning from the waveform. We extend this approach to paralinguistic classification and propose a neural network that can learn a filterbank, a normalization factor, and a compression power from the raw speech, jointly with the rest of the architecture. We apply this model to dysarthria detection from sentence-level audio recordings. Starting from a strong attention-based baseline on which mel-filterbanks outperform standard low-level descriptors, we show that learning the filters or the normalization and compression improves over fixed features by 10% absolute accuracy. We also observe a gain over OpenSmile features by learning the feature extraction, the normalization, and the compression factor jointly with the architecture. This constitutes a first attempt at learning all these operations jointly from raw audio for a speech classification task.
    Comment: 5 pages, 3 figures, submitted to ICASSP
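
    The learnable front end can be pictured as a convolutional filterbank applied to the raw waveform, followed by a learnable per-channel normalization and a learnable compression exponent, all trained with the classifier. The sketch below uses assumed filter counts, kernel, and stride; the paper's exact parameterization may differ.

```python
# Hypothetical sketch of a learnable front end in the spirit described above.
import torch
import torch.nn as nn

class LearnableFrontEnd(nn.Module):
    def __init__(self, n_filters=40, kernel=400, stride=160):
        super().__init__()
        self.filters = nn.Conv1d(1, n_filters, kernel, stride=stride, bias=False)
        self.gain = nn.Parameter(torch.ones(n_filters))  # learnable normalization
        self.power = nn.Parameter(torch.tensor(0.3))     # learnable compression

    def forward(self, wav):                  # wav: (batch, samples)
        x = self.filters(wav.unsqueeze(1))   # (batch, n_filters, frames)
        x = x ** 2                           # filter energies, nonnegative
        x = x * self.gain.abs().unsqueeze(0).unsqueeze(-1)  # gain kept positive
        return (x + 1e-6) ** self.power.clamp(0.1, 1.0)    # compressed features

frontend = LearnableFrontEnd()
feats = frontend(torch.randn(2, 16000))      # 1 s at 16 kHz -> (2, 40, 98)
```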

    Design of multimedia processor based on metric computation

    Media-processing applications, such as signal processing, 2D and 3D graphics rendering, and image compression, are the dominant workloads in many embedded systems today. The real-time constraints of these media applications place taxing demands on today's processors, which must also meet low-cost, low-power, and short design-cycle targets. To satisfy those challenges, a fast and efficient strategy consists of upgrading a low-cost general-purpose processor core. This approach is based on the personalization of a general RISC processor core according to the requirements of the target multimedia application. Thus, if the extra cost is justified, the general-purpose processor (GPP) core can be augmented with instruction-level coprocessors, coarse-grain dedicated hardware, ad hoc memories, or additional GPP cores. In this way the final design solution is tailored to the application requirements. The proposed approach is based on three main steps: the first is the analysis of the targeted application using efficient metrics; the second is the selection of the appropriate architecture template according to the first step's results and recommendations; the third is the architecture generation. This approach is validated on various image and video algorithms, demonstrating its feasibility.
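
    To make the three-step flow concrete, here is a toy version of step 2, mapping step-1 application metrics to an architecture template. The metric names, thresholds, and template labels are invented for illustration and are not the paper's actual rules.

```python
# Illustrative only: pick an architecture template from application metrics.
def select_template(metrics: dict) -> str:
    """Step 2: choose a template based on step-1 analysis results."""
    if metrics["data_parallelism"] > 0.8:
        return "GPP + coarse-grain dedicated hardware"
    if metrics["custom_op_ratio"] > 0.3:
        return "GPP + instruction-level coprocessor"
    if metrics["memory_bandwidth_pressure"] > 0.5:
        return "GPP + ad hoc memories"
    return "multi-core GPP"

print(select_template({"data_parallelism": 0.9,
                       "custom_op_ratio": 0.1,
                       "memory_bandwidth_pressure": 0.2}))
```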

    Low energy HEVC and VVC video compression hardware

    Video compression standards compress a digital video by reducing and removing redundancy in the digital video using computationally complex algorithms. As the spatial and temporal resolutions of videos increase, the compression efficiencies of video compression algorithms are also increasing. However, increased compression efficiency comes with increased computational complexity. Therefore, it is necessary to reduce the computational complexity of video compression algorithms without reducing their visual quality, in order to reduce the area and energy consumption of their hardware implementations. In this thesis, we propose a novel technique for reducing the amount of computation performed by the HEVC intra prediction algorithm. We designed low energy, reconfigurable HEVC intra prediction hardware using the proposed technique. We also designed a low energy FPGA implementation of the HEVC intra prediction algorithm using the proposed technique and DSP blocks. We propose a reconfigurable VVC intra prediction hardware architecture. We also propose an efficient VVC intra prediction hardware architecture using DSP blocks. We designed low energy VVC fractional interpolation hardware. We propose a novel approximate absolute difference technique. We designed low energy approximate absolute difference hardware using the proposed technique. We propose a novel approximate constant multiplication technique. We designed approximate constant multiplication hardware using the proposed technique. We quantified the computation reductions achieved by the proposed techniques and the video quality loss caused by the proposed approximation techniques. The proposed approximate absolute difference technique and approximate constant multiplication technique cause very small PSNR loss. The other proposed techniques cause no PSNR loss. We implemented the proposed hardware architectures in Verilog HDL. We mapped the Verilog RTL codes to Xilinx Virtex 6 or Xilinx Virtex 7 FPGAs and estimated their power consumptions using the Xilinx XPower Analyzer tool. The proposed techniques significantly reduced the power and energy consumptions of these FPGA implementations.
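
    The abstract does not spell out the approximate absolute difference technique, so the following is a generic software model of the idea behind such approximations: drop low-order bits before the subtraction, trading a small error (hence a small PSNR loss) for simpler hardware. This is not the thesis's specific method.

```python
# Generic bit-truncation model of an approximate absolute difference.
def approx_abs_diff(a: int, b: int, truncate_bits: int = 2) -> int:
    return abs((a >> truncate_bits) - (b >> truncate_bits)) << truncate_bits

print(abs(200 - 57), approx_abs_diff(200, 57))  # exact 143 vs approximate 144
```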

    A 28 GHz 0.18-μm CMOS cascade power amplifier with reverse body bias technique

    A 28 GHz power amplifier (PA) using the Silterra 0.18 μm CMOS process technology is reported. The cascade configuration has been adopted to obtain high power-added efficiency (PAE). To achieve low power consumption, the input stage adopts a reverse body bias technique. Simulation results show that the proposed PA consumes 32.03 mW and achieves a power gain (S21) of 9.51 dB at 28 GHz. The PA achieves a saturated output power (Psat) of 11.10 dBm and a maximum PAE of 16.55%, with an output 1-dB compression point (OP1dB) of 8.44 dBm. These results demonstrate that the proposed power amplifier architecture is suitable for 5G applications.
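
    For readers converting the reported figures, the helpers below implement the standard dBm-to-milliwatt conversion and the textbook power-added efficiency definition PAE = (Pout - Pin) / Pdc. The input drive level is not given in the abstract, so no attempt is made to reproduce the reported PAE value.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert power in dBm to milliwatts."""
    return 10 ** (p_dbm / 10.0)

def pae(p_out_dbm: float, p_in_dbm: float, p_dc_mw: float) -> float:
    """Power-added efficiency: (Pout - Pin) / Pdc, powers in mW."""
    return (dbm_to_mw(p_out_dbm) - dbm_to_mw(p_in_dbm)) / p_dc_mw

print(f"Psat 11.10 dBm = {dbm_to_mw(11.10):.2f} mW")  # ~12.88 mW
```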

    Deploying GPU-based Real-time DXT compression for Networked Visual Sharing

    Networked visual sharing applications in multi-party collaboration environments need compression of video streams due to network bandwidth limitations. For interactive real-time sharing, real-time compression of high-quality video as well as audio echo cancellation are required, which commonly depend on the availability of high-cost, hard-to-set-up specialized compression and echo-cancellation hardware. In this paper, by leveraging the computing power of a GPU-accelerated PC (personal computer), we discuss how to support software-only real-time compression of HD (high-definition) video streams. The chosen lightweight scheme, DXT (i.e., S3 Texture Compression), is well matched to GPU-accelerated texture compression. By implementing GPU-accelerated DXT compression based on CUDA (Compute Unified Device Architecture) parallel computing, and by deploying a software-based echo controller alongside it, we enable a low-cost solution for efficient networked visual sharing in collaboration environments.
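
    To see why DXT maps well onto data-parallel hardware: each 4x4 block is compressed independently into two endpoint values plus a 2-bit index per pixel, so CUDA threads can each own a block. The grayscale model below is a simplification (real DXT1 uses RGB565 endpoints and fixed palette weights), intended only to show the per-block structure.

```python
# Simplified, CPU-only model of the per-block DXT/S3TC idea on grayscale data.
import numpy as np

def compress_block(block: np.ndarray):
    lo, hi = int(block.min()), int(block.max())
    palette = np.linspace(lo, hi, 4)                     # 4 representable levels
    idx = np.abs(block[..., None] - palette).argmin(-1)  # 2 bits per pixel
    return lo, hi, idx.astype(np.uint8)

def decompress_block(lo, hi, idx):
    palette = np.linspace(lo, hi, 4)
    return palette[idx]

block = np.random.randint(0, 256, (4, 4))
lo, hi, idx = compress_block(block)
print(np.abs(decompress_block(lo, hi, idx) - block).max())  # worst-case error
```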

    Fog Data: Enhancing Telehealth Big Data Through Fog Computing

    The size of multi-modal, heterogeneous data collected through various sensors is growing exponentially. It demands intelligent data reduction, data mining, and analytics at edge devices. Data compression can reduce the network bandwidth and transmission power consumed by edge devices. This paper proposes, validates, and evaluates Fog Data, a service-oriented architecture for Fog computing. The centerpiece of the proposed architecture is a low power embedded computer that carries out data mining and data analytics on raw data collected from various wearable sensors used for telehealth applications. The embedded computer collects the sensed data as time series, analyzes it, and finds the similar patterns present. Patterns are stored locally, and only unique patterns are transmitted. The embedded computer also extracts clinically relevant information that is sent to the cloud. A working prototype of the proposed architecture was built and used to carry out case studies on telehealth big data applications. Specifically, our case studies used data from sensors worn by patients with either speech motor disorders or cardiovascular problems. We implemented and evaluated both generic and application-specific data mining techniques to show orders-of-magnitude data reduction and hence transmission power savings. Quantitative evaluations were conducted comparing various data mining techniques and standard data compression techniques. The obtained results showed substantial improvement in system efficiency using the Fog Data architecture.
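
    A toy illustration of the pattern-based reduction described above: split a time series into fixed windows, keep only windows not seen before (after coarse quantization), and transmit just the unique ones. The window length and quantization step are invented for the example, not taken from the paper.

```python
# Transmit only windows whose coarse fingerprint has not been seen before.
import numpy as np

def unique_windows(signal: np.ndarray, win: int = 50, step: float = 0.1):
    seen, kept = set(), []
    for i in range(0, len(signal) - win + 1, win):
        w = signal[i:i + win]
        key = tuple(np.round(w / step).astype(int))  # coarse fingerprint
        if key not in seen:
            seen.add(key)
            kept.append(w)
    return kept

t = np.arange(5000)
sig = np.sin(2 * np.pi * t / 100)            # highly repetitive, like a steady ECG
kept = unique_windows(sig)
print(f"transmit {len(kept)} of {len(sig) // 50} windows")  # 2 of 100
```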

    Efficient Hardware Implementation Of Haar Wavelet Transform With Line-Based And Dual-Scan Image Memory Accesses

    Image compression is of great importance in multimedia systems and applications because it drastically reduces bandwidth requirements for transmission and memory requirements for storage. The JPEG2000 image compression standard is based on the Discrete Wavelet Transform (DWT). In hardware implementations of the DWT and inverse Discrete Wavelet Transform (IDWT), the main problems are storage memory, the internal processing buffer, and the limited FPGA resources. For the non-separable 2-D DWT, the method used to access the image memory has a direct impact on the internal buffer size, the power consumption, and the transformation speed. The internal buffer, in turn, reduces the image memory access time. The main objectives of this thesis are as follows: to implement a 2-D Haar wavelet transform for large gray-scale images; to reduce the number of image memory accesses by implementing the 2-D Haar wavelet transform with a suitable combination of external and internal memory; and to target a low-power, high-speed architecture based on a multi-level non-separable discrete Haar wavelet transform. The two proposed architectures reduce the number of image memory accesses. The line-based architecture reduces the internal buffer to 2 x 0.5 x N, where N denotes the image size: 0.5 x N for the low-pass coefficients and 0.5 x N for the high-pass coefficients. The dual-scan architecture does not use internal memory at all. Overall, both architectures work well on the Altera FPGA board at a frequency of 100 MHz.
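
    For reference, a software model of one non-separable 2-D Haar level: each 2x2 block of pixels yields one approximation and three detail coefficients. The /4 scaling below is one common hardware-friendly convention and is an assumption; with it, summing the four subbands recovers the top-left pixel of each block, as the check shows.

```python
# One non-separable 2-D Haar level over 2x2 blocks (software model only).
import numpy as np

def haar2d_level(img: np.ndarray):
    a = img[0::2, 0::2].astype(np.float64)  # top-left of each 2x2 block
    b = img[0::2, 1::2].astype(np.float64)  # top-right
    c = img[1::2, 0::2].astype(np.float64)  # bottom-left
    d = img[1::2, 1::2].astype(np.float64)  # bottom-right
    ll = (a + b + c + d) / 4                # approximation
    lh = (a - b + c - d) / 4                # horizontal detail
    hl = (a + b - c - d) / 4                # vertical detail
    hh = (a - b - c + d) / 4                # diagonal detail
    return ll, lh, hl, hh

img = np.random.randint(0, 256, (8, 8))     # dimensions must be even
ll, lh, hl, hh = haar2d_level(img)
assert np.allclose(ll + lh + hl + hh, img[0::2, 0::2])  # recovers pixel a
```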

    Low power context adaptive variable length encoder in H.264

    The adoption of digital TV, DVD video, and Internet streaming led to the development of video compression. H.264/AVC is the industry standard delivering highly efficient and reliable video compression. One of the technical developments adopted in H.264/AVC is context-adaptive entropy coding. This thesis developed a complete VHDL behavioral model of a variable length encoder. A synthesizable hardware description was then developed for the components of the variable length encoder using Synopsys tools. Many prior implementations focused on density and speed to reduce hardware cost and improve quality, but at the expense of higher power consumption. Low power consumption in an IC leads to lower heat dissipation and thereby reduces the need for bigger heat-sinking devices, which can provide significant advantages to manufacturers in terms of the cost and size of the end product. Focusing on smaller area at the expense of higher power consumption may not be appropriate for end products that need thin mechanical enclosures: even if the design has a smaller area, it needs a bigger heat sink, making the enclosure bigger. This thesis therefore aimed at low power consumption without compromising much on area. The designed architecture enables real-time processing of QCIF and CIF frames at 60 fps using a 100 MHz clock. The resulting hardware consumes 1.4 mW at 100 MHz in a 65 nm technology. The total logic gate count is 32K gates.
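
    CAVLC itself is table-driven, but H.264 also specifies the simpler exp-Golomb variable-length code ue(v) for many syntax elements; the few lines below show that code to give a concrete sense of variable-length encoding in the standard (this is not the thesis's CAVLC hardware).

```python
# Unsigned exp-Golomb code ue(v) as used by H.264 for many syntax elements:
# write (len(bin(v+1)) - 1) zeros, then the binary representation of v+1.
def exp_golomb_ue(v: int) -> str:
    code = bin(v + 1)[2:]
    return "0" * (len(code) - 1) + code

for v in range(5):
    print(v, exp_golomb_ue(v))  # 0->1, 1->010, 2->011, 3->00100, 4->00101
```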