145 research outputs found

    Theoretical Bound-Guided Hierarchical VAE for Neural Image Codecs

    Recent studies reveal a significant theoretical link between variational autoencoders (VAEs) and rate-distortion theory, notably in using VAEs to estimate the theoretical upper bound of the information rate-distortion function of images. These estimated theoretical bounds substantially exceed the performance of existing neural image codecs (NICs). To narrow this gap, we propose a theoretical bound-guided hierarchical VAE (BG-VAE) for NIC. The proposed BG-VAE leverages the theoretical bound to guide the NIC model towards enhanced performance. We implement BG-VAE using hierarchical VAEs and demonstrate its effectiveness through extensive experiments. Together with advanced neural network blocks, we provide a versatile, variable-rate NIC that outperforms existing methods when both rate-distortion performance and computational complexity are considered. The code is available at BG-VAE.
    Comment: 2024 IEEE International Conference on Multimedia and Expo (ICME 2024)
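
    What follows is a minimal, hypothetical sketch of how a bound-guided training objective could be formed: a frozen teacher VAE (the bound estimator) guides a student NIC alongside the usual rate-distortion terms. The dictionary keys, loss form, and weighting are illustrative assumptions, not the authors' released code.

        import torch.nn.functional as F

        def bound_guided_loss(student_out, teacher_out, x, lmbda=0.01):
            # Rate term: estimated bits spent on the student's latents.
            rate = student_out["bits"].mean()
            # Distortion term: reconstruction error of the student codec.
            distortion = F.mse_loss(student_out["x_hat"], x)
            # Guidance term: pull the student's reconstruction toward that of the
            # frozen teacher VAE used to estimate the theoretical R-D bound.
            guidance = F.mse_loss(student_out["x_hat"], teacher_out["x_hat"].detach())
            return rate + lmbda * (distortion + guidance)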

    Flexible Variable-Rate Image Feature Compression for Edge-Cloud Systems

    Feature compression is a promising direction for coding for machines. Existing methods have made substantial progress, but they require designing and training separate neural network models to meet different specifications of compression rate, task accuracy, and computational complexity. In this paper, a flexible variable-rate feature compression method is presented that can operate over a range of rates by introducing a rate-control parameter as an input to the neural network model. By compressing different intermediate features of a pre-trained vision task model, the proposed method can scale the encoding complexity without changing the overall size of the model. The proposed method is more flexible than existing baselines while also outperforming them in the three-way trade-off between feature compression rate, vision task accuracy, and encoding complexity. The source code is available at https://github.com/adnan-hossain/var_feat_comp.git.
    Comment: 6 pages, 7 figures, 1 table, International Conference on Multimedia and Expo Workshops 202
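
    The rate-conditioning idea can be illustrated with a short, hypothetical sketch: a scalar rate-control input is mapped to per-channel gains that scale the latent features, so a single model spans a range of rates when compressing an intermediate feature map of a pre-trained vision model. The class name, layer choices, and gain mechanism are assumptions, not the released implementation at the repository above.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class RateConditionedFeatureCodec(nn.Module):
            """One model for many rates: a scalar rate-control input beta is
            mapped to per-channel gains that scale the latent features."""

            def __init__(self, feat_channels=256, latent_channels=64):
                super().__init__()
                self.encode = nn.Conv2d(feat_channels, latent_channels, kernel_size=1)
                self.decode = nn.Conv2d(latent_channels, feat_channels, kernel_size=1)
                self.gain = nn.Sequential(nn.Linear(1, 64), nn.ReLU(),
                                          nn.Linear(64, latent_channels))

            def forward(self, feat, beta):
                # Positive per-channel gains conditioned on the rate parameter.
                g = F.softplus(self.gain(beta.view(-1, 1))).unsqueeze(-1).unsqueeze(-1)
                z = self.encode(feat) * g
                z_hat = torch.round(z)  # stand-in for quantization + entropy coding
                return self.decode(z_hat / g)

    For example, codec(features, torch.tensor([0.5])) and codec(features, torch.tensor([2.0])) would reconstruct the same feature map at two different operating rates.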

    QARV: Quantization-Aware ResNet VAE for Lossy Image Compression

    This paper addresses the problem of lossy image compression, a fundamental problem in image processing and information theory that arises in many real-world applications. We start by reviewing the framework of variational autoencoders (VAEs), a powerful class of generative probabilistic models with a deep connection to lossy compression. Based on VAEs, we develop a novel scheme for lossy image compression, which we name quantization-aware ResNet VAE (QARV). Our method incorporates a hierarchical VAE architecture integrated with test-time quantization and quantization-aware training, without which efficient entropy coding would not be possible. In addition, we design the neural network architecture of QARV specifically for fast decoding and propose an adaptive normalization operation for variable-rate compression. Extensive experiments show that QARV achieves variable-rate compression, high-speed decoding, and better rate-distortion performance than existing baseline methods. The code is publicly available at https://github.com/duanzhiihao/lossy-vae.
    Comment: Technical report
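
    As a generic illustration of quantization-aware training in learned compression (not QARV's exact implementation), the usual trick is to replace hard rounding with additive uniform noise while training so gradients can flow, and to round for real at test time so the latents can be entropy-coded:

        import torch

        def quantize(z, training):
            # Training: additive uniform noise in [-0.5, 0.5) as a differentiable
            # proxy for rounding. Test time: hard rounding for entropy coding.
            if training:
                return z + torch.empty_like(z).uniform_(-0.5, 0.5)
            return torch.round(z)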

    Probing Image Compression For Class-Incremental Learning

    Image compression is a pivotal tool for the efficient handling and transmission of digital images. Its ability to substantially reduce file size not only increases effective storage capacity but can also benefit continual machine learning (ML) systems, which learn new knowledge incrementally from sequential data. Continual ML systems often rely on storing representative samples, known as exemplars, within a limited memory budget to maintain performance on previously learned data. These memory replay-based algorithms have proven effective at mitigating the detrimental effects of catastrophic forgetting. Nonetheless, the limited buffer size often falls short of adequately representing the entire data distribution. In this paper, we explore the use of image compression as a strategy to enhance the buffer's capacity and thereby increase exemplar diversity. However, directly using compressed exemplars introduces domain shift during continual ML, marked by a discrepancy between compressed training data and uncompressed testing data. Additionally, it is essential to choose an appropriate compression algorithm and rate so that continual ML systems balance the trade-off between exemplar quality and quantity. To this end, we introduce a new framework that incorporates image compression into continual ML, including a pre-processing data compression step and an efficient compression rate/algorithm selection method. We conduct extensive experiments on the CIFAR-100 and ImageNet datasets and show that our method significantly improves image classification accuracy in continual ML settings.
    Comment: Picture Coding Symposium (PCS) 202
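
    A minimal sketch of the buffer-capacity idea is shown below: exemplars are kept as compressed bytes so that more samples fit in the same memory budget, and are decoded only when replayed. The JPEG codec and quality value are placeholders; choosing the codec and rate (and handling the resulting train/test domain shift) is precisely what the paper's framework addresses.

        import io
        from PIL import Image

        def store_exemplar(img: Image.Image, quality: int = 50) -> bytes:
            # Keep the exemplar as JPEG bytes so that, under the same memory
            # budget, many more (lower-fidelity) samples fit in the replay buffer.
            buf = io.BytesIO()
            img.convert("RGB").save(buf, format="JPEG", quality=quality)
            return buf.getvalue()

        def load_exemplar(data: bytes) -> Image.Image:
            # Decompress on the fly when the exemplar is replayed during training.
            return Image.open(io.BytesIO(data)).convert("RGB")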

    Rec4Ad: A Free Lunch to Mitigate Sample Selection Bias for Ads CTR Prediction in Taobao

    Click-Through Rate (CTR) prediction serves as a fundamental component in online advertising. A common practice is to train a CTR model on advertisement (ad) impressions with user feedback. Since ad impressions are purposely selected by the model itself, their distribution differs from the inference distribution and thus exhibits sample selection bias (SSB) that affects model performance. Existing studies on SSB mainly employ sample re-weighting techniques, which suffer from high variance and poor model calibration. Another line of work relies on costly uniform data that is inadequate for training industrial models. Mitigating SSB in industrial models with a uniform-data-free framework is therefore worth exploring. Fortunately, many platforms display mixed results of organic items (i.e., recommendations) and sponsored items (i.e., ads) to users, where impressions of ads and recommendations are selected by different systems but share the same user decision rationales. Based on these characteristics, we propose leveraging recommendation samples as a free lunch to mitigate SSB for the ads CTR model (Rec4Ad). After an elaborate data augmentation step, Rec4Ad learns disentangled representations with alignment and decorrelation modules for enhancement. When deployed in the Taobao display advertising system, Rec4Ad achieves substantial gains in key business metrics, with a lift of up to +6.6% CTR and +2.9% RPM.
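
    The abstract does not give the exact formulations, so the following is only one hypothetical reading of the alignment and decorrelation modules: the shared (decision-rationale) representations of ads and recommendations are pulled together, while cross-covariance between shared and source-specific factors is penalized. Function names and loss forms are assumptions.

        import torch.nn.functional as F

        def alignment_loss(ad_shared, rec_shared):
            # Align the shared representations learned from ad impressions and
            # recommendation impressions.
            return F.mse_loss(ad_shared, rec_shared)

        def decorrelation_loss(shared, specific):
            # Penalize cross-covariance between the shared part and the
            # source-specific part so the two factors stay disentangled.
            shared = shared - shared.mean(dim=0)
            specific = specific - specific.mean(dim=0)
            cov = shared.t() @ specific / shared.size(0)
            return (cov ** 2).sum()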

    Grid investment capability prediction based on path analysis and BP neural network

    As the investment environment of China's power grid becomes more complex, accurately predicting the investment capability of power grid enterprises has become an important prerequisite for managers to make sound investment decisions. This paper first selects the factors affecting the investment capacity of the power grid from the internal and external environment and establishes an index system of these influencing factors. Second, path analysis is used to explore the interactions among the indices and their degree of influence on investment capacity. Finally, the maximum investment capacity of the power grid is predicted with a BP neural network model. The results show that the BP neural network model achieves high prediction accuracy for power grid investment capability.
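
    Since the abstract does not report the network configuration, the snippet below is only a generic sketch of a BP (back-propagation) neural network regressor on a hypothetical indicator matrix, using scikit-learn's MLPRegressor as a stand-in.

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        # Hypothetical data: each row of X holds the internal/external indicators
        # retained by the path analysis; y is historical investment capacity.
        rng = np.random.default_rng(0)
        X = rng.random((200, 8))
        y = rng.random(200)

        # A BP neural network is a feed-forward MLP trained with back-propagation;
        # a single small hidden layer is a typical minimal configuration.
        model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
        model.fit(X, y)
        predicted_capacity = model.predict(X[:5])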

    A multicenter study of fetal chromosomal abnormalities in Chinese women of advanced maternal age

    Objective: This study aimed to determine the rates of different fetal chromosomal abnormalities among women of advanced maternal age in China and to discuss the possible misdiagnosis risks of newer molecular techniques, for the selection of appropriate prenatal screening and diagnostic technologies.
    Materials and Methods: Second-trimester amniocentesis and fetal karyotype results of 46,258 women were retrospectively reviewed. All women were ≥ 35 years old with singleton pregnancies. The rates of clinically significant chromosomal abnormalities (CSCAs), the incidence of chromosomal abnormalities, and their correlations with age were determined.
    Results: From 2001 to 2010, the proportion of women of advanced maternal age undergoing prenatal diagnosis increased from 20% to 46%. The mean age was 37.4 years (range, 35–46 years). A total of 708 cases of CSCAs were found, a rate of 1.53%. Trisomy 21 was the most common single chromosome abnormality, accounting for 55.9% of all CSCAs with an incidence of 0.86%. Trisomy 13, trisomy 18, and trisomy 21, the most common autosomal aneuploidies, together accounted for 73.6% of all CSCAs, with a rate of 1.13%. As a group, the most common chromosomal aneuploidies (13/18/21/X/Y) accounted for 93.9% of all abnormalities, with a rate of 1.44%. The incidence of trisomy 21, of trisomy 13/18/21 as a group, and of 13/18/21/X/Y as a group was significantly greater in women aged 39 years and older (p < 0.001), but did not differ among women aged 35, 36, 37, and 38 years.
    Conclusion: These findings may assist in the genetic counseling of pregnant women of advanced maternal age and provide a basis for the selection of prenatal screening and diagnostic technologies.