Search CORE

158 research outputs found

Locally adaptive vector quantization: Data compression with feature preservation

Author: Cheung K. M.
Sayano M.
Publication venue
Publication date
Field of study

A study of a locally adaptive vector quantization (LAVQ) algorithm for data compression is presented. This algorithm provides high-speed one-pass compression and is fully adaptable to any data source and does not require a priori knowledge of the source statistics. Therefore, LAVQ is a universal data compression algorithm. The basic algorithm and several modifications to improve performance are discussed. These modifications are nonlinear quantization, coarse quantization of the codebook, and lossless compression of the output. Performance of LAVQ on various images using irreversible (lossy) coding is comparable to that of the Linde-Buzo-Gray algorithm, but LAVQ has a much higher speed; thus this algorithm has potential for real-time video compression. Unlike most other image compression algorithms, LAVQ preserves fine detail in images. LAVQ's performance as a lossless data compression algorithm is comparable to that of Lempel-Ziv-based algorithms, but LAVQ uses far less memory during the coding process

NASA Technical Reports Server

FPGA-Based Real-time Inference for Semantic Segmentation

Author: André Machado Campanhã
Publication venue
Publication date: 12/07/2023
Field of study

Repositório Aberto da Universidade do Porto

A 6 mW, 5,000-Word Real-Time Speech Recognizer Using WFST Models

Author: Chandrakasan Anantha P.
Glass James R.
Price Michael R.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2014
Field of study

We describe an IC that provides a local speech recognition capability for a variety of electronic devices. We start with a generic speech decoder architecture that is programmable with industry-standard WFST and GMM speech models. Algorithm and architectural enhancements are incorporated in order to achieve real-time performance amid system-level constraints on internal memory size and external memory bandwidth. A 2.5 × 2.5 mm test chip implementing this architecture was fabricated using a 65 nm process. The chip performs a 5,000 word recognition task in real-time with 13.0% word error rate, 6.0 mW core power consumption, and a search efficiency of approximately 16 nJ per hypothesis.Quanta Computer (Firm)Irwin Mark Jacobs and Joan Klein Jacobs Presidential Fellowshi

DSpace@MIT

Crossref

Some new developments in image compression

Author: Chen Keshi
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/1994
Field of study

This study is divided into two parts. The first part involves an investigation of near-lossless compression of digitized images using the entropy-coded DPCM method with a large number of quantization levels. Through the investigation, a new scheme that combines both lossy and lossless DPCM methods into a common framework is developed. This new scheme uses known results on the design of predictors and quantizers that incorporate properties of human visual perception. In order to enhance the compression performance of the scheme, an adaptively generated source model with multiple contexts is employed for the coding of the quantized prediction errors, rather than a memoryless model as in the conventional DPCM method. Experiments show that the scheme can provide compression in the range from 4 to 11 with a peak SNR of about 50 dB for 8-bit medical images. Also, the use of multiple contexts is found to improve compression performance by about 25% to 35%;The second part of the study is devoted to the problem of lossy image compression using tree-structured vector quantization. As a result of the study, a new design method for codebook generation is developed together with four different implementation algorithms. In the new method, an unbalanced tree-structured vector codebook is designed in a greedy fashion under the constraint of rate-distortion trade-off which can then be used to implement a variable-rate compression system. From experiments, it is found that the new method can achieve a very good rate-distortion performance while being computationally efficient. Also, due to the tree-structure of the codebook, the new method is amenable to progressive transmission applications

Digital Repository @ Iowa State University (ISU)

Recommended from our members

Design and automation techniques for hIgh-performance mixed-signal circuits

Author: Shi Wei (Ph. D. in electrical and computer engineering)
Publication venue
Publication date: 21/06/2024
Field of study

In the era of ubiquitous sensing environment, the modern electronic system expands our perception of the outside world. Analog/mixed-signal circuit has played a critical role to bridge the physical and digital worlds. The boom of Internet-of-Things (IoT), bio-sensing, and digital camera calls for versatile high-performance mixed-signal circuits and the corresponding automated design methodology. However, high-performance analog circuits are area or power hungry. Moreover, the design cost is prohibitively expensive. To address these challenges, this dissertation explores solutions from both the design and automation techniques. Analog-to-digital converter (ADC) is an important subset of analog/mixed-signal circuits. Continuous time Delta-Sigma modulator (CTDSM) is a popular design choice for high-speed and high-resolution designs. CTDSMs feature a higher power efficiency than their discrete-time (DT) counterpart. The first work presents a high-speed 4th-order DSM featuring the CT-DT hybridization and an efficient excess-loop-delay (ELD) compensation technique in the charge domain. Compared to prior high-order CTDSMs, the proposed hybrid DSM achieves 4th-order noise shaping with single operational trans-conductance amplifier (OTA). Minimized number of OTAs reduces power and enhances stability. On top of that, an efficient ELD compensation technique is implemented by utilizing the inherent capacitor digital-to-analog converter (CDAC) of SAR. Fabricated in 40 nm CMOS, the prototype ADC achieved a peak Schreier Figure-of-Merits (FoM) of 176.1 dB, marking 4 dB improvement over prior arts. The second project explores the techniques to reduce the area consumption of high-resolution CTDSMs. The performance of existing high-resolution CTDSMs is limited by the feedback DAC. The stringent non-linearity requirement leads to the large area of DAC. To address this limitation, a low-complexity hardware-based 2nd-order dynamic-element-matching (DEM) is proposed. The partial sorter applied to the DEM minimizes the hardware cost. Moreover, feedforward path assisted loop filter adapts the highly-linear integrator design to the low power supply voltage. With these techniques combined, the prototype shows a feasible design pattern to achieve compact-area, high-resolution design at advanced technology nodes. A prototype fabricated in 40 nm CMOS measured 95dB SNDR, occupying only 0.37 mm² area. After the exploration of pushing the ADC performance boundary, this dissertation also demonstrates the automated design methodology. The design cost of high-performance mixed-signal circuit grows exponentially with the technology scaling. Existing analog automation techniques cannot handle practical circuit design constraints (e.g. robustness against variations). The third work presents RobustAnalog, a variation-aware analog circuit optimization via multi-task reinforcement learning (RL) and task-space pruning. RobustAnalog is mainly designed to tackle the process-voltage-temperature (PVT) robustness in the analog design. Correlations between similar variations are modeled and conflicts between distinct variations are mitigated. With task pruning, a small-sized proxy training task set is formed. The pruning reduces the queries to the full task set. Compared with the popular blackbox optimization methods, RobustAnalog significantly reduces the simulation cost. Therefore, RobustAnalog shows the staggering progress towards analog automation techniques that can be applied to real silicon conditions.Electrical and Computer Engineerin

Texas ScholarWorks

Conjoint probabilistic subband modeling

Author: Popat Ashok Chhabedia
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1997
Field of study

Thesis (Ph. D.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1997.Includes bibliographical references (leaves 125-133).by Ashok Chhabedia Popat.Ph.D

CiteSeerX

DSpace@MIT

EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization

Author: Dong Peijie
Li Lujun
Niu Xin
Pan Hengyue
Tian Zhiliang
Wei Zimian
Publication venue
Publication date: 19/07/2023
Field of study

Mixed-Precision Quantization~(MQ) can achieve a competitive accuracy-complexity trade-off for models. Conventional training-based search methods require time-consuming candidate training to search optimized per-layer bit-width configurations in MQ. Recently, some training-free approaches have presented various MQ proxies and significantly improve search efficiency. However, the correlation between these proxies and quantization accuracy is poorly understood. To address the gap, we first build the MQ-Bench-101, which involves different bit configurations and quantization results. Then, we observe that the existing training-free proxies perform weak correlations on the MQ-Bench-101. To efficiently seek superior proxies, we develop an automatic search of proxies framework for MQ via evolving algorithms. In particular, we devise an elaborate search space involving the existing proxies and perform an evolution search to discover the best correlated MQ proxy. We proposed a diversity-prompting selection strategy and compatibility screening protocol to avoid premature convergence and improve search efficiency. In this way, our Evolving proxies for Mixed-precision Quantization~(EMQ) framework allows the auto-generation of proxies without heavy tuning and expert knowledge. Extensive experiments on ImageNet with various ResNet and MobileNet families demonstrate that our EMQ obtains superior performance than state-of-the-art mixed-precision methods at a significantly reduced cost. The code will be released.Comment: Accepted by ICCV202

arXiv.org e-Print Archive