Theoretical Bound-Guided Hierarchical VAE for Neural Image Codecs
Recent studies reveal a significant theoretical link between variational
autoencoders (VAEs) and rate-distortion theory, notably in utilizing VAEs to
estimate the theoretical upper bound of the information rate-distortion
function of images. Such estimated theoretical bounds substantially exceed the
performance of existing neural image codecs (NICs). To narrow this gap, we
propose a theoretical bound-guided hierarchical VAE (BG-VAE) for NIC. The
proposed BG-VAE leverages the theoretical bound to guide the NIC model towards
enhanced performance. We implement the BG-VAE using Hierarchical VAEs and
demonstrate its effectiveness through extensive experiments. Along with
advanced neural network blocks, we provide a versatile, variable-rate NIC that
outperforms existing methods when considering both rate-distortion performance
and computational complexity. The code is available at BG-VAE.
Comment: 2024 IEEE International Conference on Multimedia and Expo (ICME2024)
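The link between VAEs and rate-distortion theory that this abstract relies on can be written out with standard definitions. The notation below (X for the image, Z for the latent, \hat{X} for the reconstruction, q for the posterior, p for the prior, d for the distortion measure) is assumed for illustration and is not taken from the paper; it is a sketch of why a trained VAE gives an upper bound, not the authors' derivation.

```latex
% Information rate-distortion function
R(D) = \min_{p(\hat{x}\mid x)\,:\ \mathbb{E}[d(X,\hat{X})] \le D} I(X;\hat{X})

% Rate term of a VAE with posterior q(z|x) and prior p(z),
% where \hat{X} is decoded from Z (Markov chain X -> Z -> \hat{X})
\mathbb{E}_{x}\,\mathrm{KL}\big(q(z\mid x)\,\|\,p(z)\big)
  = I(X;Z) + \mathrm{KL}\big(q(z)\,\|\,p(z)\big)
  \ \ge\ I(X;Z) \ \ge\ I(X;\hat{X}) \ \ge\ R(D),
\qquad D = \mathbb{E}[d(X,\hat{X})]
```

Each trained VAE therefore yields a (distortion, rate) point that lies on or above the R(D) curve, which is the sense in which the estimate is an upper bound.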
Flexible Variable-Rate Image Feature Compression for Edge-Cloud Systems
Feature compression is a promising direction for coding for machines.
Existing methods have made substantial progress, but they require designing and
training separate neural network models to meet different specifications of
compression rate, performance accuracy and computational complexity. In this
paper, a flexible variable-rate feature compression method is presented that
can operate on a range of rates by introducing a rate control parameter as an
input to the neural network model. By compressing different intermediate
features of a pre-trained vision task model, the proposed method can scale the
encoding complexity without changing the overall size of the model. The
proposed method is more flexible than existing baselines, at the same time
outperforming them in terms of the three-way trade-off between feature
compression rate, vision task accuracy, and encoding complexity. We have made
the source code available at
https://github.com/adnan-hossain/var_feat_comp.git.
Comment: 6 pages, 7 figures, 1 table, International Conference on Multimedia
and Expo Workshops 202
QARV: Quantization-Aware ResNet VAE for Lossy Image Compression
This paper addresses the problem of lossy image compression, a fundamental
problem in image processing and information theory that is involved in many
real-world applications. We start by reviewing the framework of variational
autoencoders (VAEs), a powerful class of generative probabilistic models that
has a deep connection to lossy compression. Based on VAEs, we develop a novel
scheme for lossy image compression, which we name quantization-aware ResNet VAE
(QARV). Our method incorporates a hierarchical VAE architecture integrated with
test-time quantization and quantization-aware training, without which efficient
entropy coding would not be possible. In addition, we design the neural network
architecture of QARV specifically for fast decoding and propose an adaptive
normalization operation for variable-rate compression. Extensive experiments
are conducted, and results show that QARV achieves variable-rate compression,
high-speed decoding, and a better rate-distortion performance than existing
baseline methods. The code of our method is publicly accessible at
https://github.com/duanzhiihao/lossy-vae
Comment: Technical report
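The abstract pairs test-time quantization with quantization-aware training. One common way to do this in learned compression, which may differ from what QARV actually uses, is to add uniform noise during training and round at test time. The sketch below uses only the standard library; the function names, the seed, and the Gaussian "latents" are hypothetical stand-ins, not part of the QARV code base.

```python
import random

def quantize_train(latents, rng):
    """Training-time surrogate: add uniform noise in [-0.5, 0.5] instead of rounding."""
    return [v + rng.uniform(-0.5, 0.5) for v in latents]

def quantize_test(latents):
    """Test-time quantization: round to the nearest integer so symbols can be entropy coded."""
    return [float(round(v)) for v in latents]

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
latents = [rng.gauss(0.0, 3.0) for _ in range(10_000)]  # hypothetical latent values

noisy = quantize_train(latents, rng)
rounded = quantize_test(latents)

mse_noise = mse(latents, noisy)
mse_round = mse(latents, rounded)
print(f"noise MSE: {mse_noise:.4f}, rounding MSE: {mse_round:.4f}, 1/12 = {1/12:.4f}")
```

Both errors come out close to 1/12, the variance of a unit-width uniform distribution, which is why the noisy surrogate is a reasonable differentiable proxy for rounding.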
Probing Image Compression For Class-Incremental Learning
Image compression emerges as a pivotal tool in the efficient handling and
transmission of digital images. Its ability to substantially reduce file size
not only facilitates enhanced data storage capacity but also potentially brings
advantages to the development of continual machine learning (ML) systems, which
learn new knowledge incrementally from sequential data. Continual ML systems
often rely on storing representative samples, also known as exemplars, within a
limited memory constraint to maintain the performance on previously learned
data. These methods are known as memory replay-based algorithms and have proven
effective at mitigating the detrimental effects of catastrophic forgetting.
Nonetheless, the limited memory buffer size often falls short of adequately
representing the entire data distribution. In this paper, we explore the use of
image compression as a strategy to enhance the buffer's capacity, thereby
increasing exemplar diversity. However, directly using compressed exemplars
introduces domain shift during continual ML, marked by a discrepancy between
compressed training data and uncompressed testing data. Additionally, it is
essential to determine the appropriate compression algorithm and select the
most effective rate for continual ML systems to balance the trade-off between
exemplar quality and quantity. To this end, we introduce a new framework to
incorporate image compression for continual ML including a pre-processing data
compression step and an efficient compression rate/algorithm selection method.
We conduct extensive experiments on CIFAR-100 and ImageNet datasets and show
that our method significantly improves image classification accuracy in
continual ML settings.
Comment: Picture Coding Symposium (PCS) 202
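The trade-off between exemplar quantity and quality described above comes down to simple arithmetic on a fixed memory budget. The following sketch assumes raw 32x32 RGB images (the CIFAR-100 resolution) at one byte per channel; the budget of 2,000 raw exemplars and the compression ratios are hypothetical values chosen for illustration, not settings reported in the paper.

```python
def exemplars_that_fit(budget_bytes, raw_image_bytes, compression_ratio=1.0):
    """Number of exemplars a fixed buffer holds at a given compression ratio."""
    compressed_bytes = raw_image_bytes / compression_ratio
    return int(budget_bytes // compressed_bytes)

RAW_IMAGE_BYTES = 32 * 32 * 3            # one uncompressed 32x32 RGB image
BUDGET_BYTES = 2000 * RAW_IMAGE_BYTES    # hypothetical buffer: 2,000 raw exemplars

for ratio in (1, 4, 8):                  # hypothetical compression ratios
    count = exemplars_that_fit(BUDGET_BYTES, RAW_IMAGE_BYTES, ratio)
    print(f"ratio {ratio}: {count} exemplars")
```

Higher ratios store more exemplars in the same buffer but at lower image quality, which is why the rate has to be selected rather than simply maximized.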
Rec4Ad: A Free Lunch to Mitigate Sample Selection Bias for Ads CTR Prediction in Taobao
Click-Through Rate (CTR) prediction serves as a fundamental component in
online advertising. A common practice is to train a CTR model on advertisement
(ad) impressions with user feedback. Since ad impressions are purposely
selected by the model itself, their distribution differs from the inference
distribution and thus exhibits sample selection bias (SSB) that affects model
performance. Existing studies on SSB mainly employ sample re-weighting
techniques which suffer from high variance and poor model calibration. Another
line of work relies on costly uniform data that is inadequate to train
industrial models. Thus mitigating SSB in industrial models with a
uniform-data-free framework is worth exploring. Fortunately, many platforms
display mixed results of organic items (i.e., recommendations) and sponsored
items (i.e., ads) to users, where impressions of ads and recommendations are
selected by different systems but share the same user decision rationales.
Based on these characteristics, we propose to leverage recommendation
samples as a free lunch to mitigate SSB in the ads CTR model (Rec4Ad). After
elaborating data augmentation, Rec4Ad learns disentangled representations with
alignment and decorrelation modules for enhancement. When deployed in the Taobao
display advertising system, Rec4Ad achieves substantial gains in key business
metrics, with a lift of up to +6.6% CTR and +2.9% RPM.
Grid investment capability prediction based on path analysis and BP neural network
As the investment environment of China’s power grid grows more complex, accurately predicting the investment capability of power grid enterprises has become an important prerequisite for managers making investment decisions. This paper first selects the factors that affect the investment capability of the power grid from the internal and external environment and establishes an index system of these factors. Second, path analysis is used to examine the interactions between the indices and the degree to which each one influences investment capability. Finally, the maximum investment capability of the power grid is predicted with a BP neural network prediction model. The results show that the BP neural network model achieves higher prediction accuracy when predicting power grid investment capability.
A multicenter study of fetal chromosomal abnormalities in Chinese women of advanced maternal age
Objective: This study aimed to determine the rates of different fetal chromosomal abnormalities among women of advanced maternal age in China and to discuss the possible misdiagnosis risks of newer molecular techniques, for selection of appropriate prenatal screening and diagnostic technologies.
Materials and Methods: Second trimester amniocentesis and fetal karyotype results of 46,258 women were retrospectively reviewed. All women were ≥ 35 years old with singleton pregnancies. The rates of clinically significant chromosomal abnormalities (CSCAs), incidence of chromosomal abnormalities, and correlations with age were determined.
Results: From 2001 to 2010, the proportion of women of advanced maternal age undergoing prenatal diagnosis increased from 20% to 46%. The mean age was 37.4 years (range, 35–46 years). A total of 708 cases of CSCAs, with a rate of 1.53%, were found. Trisomy 21 was the most common single chromosome abnormality and accounted for 55.9% of all CSCAs, with an incidence of 0.86%. Trisomy 13, trisomy 18, and trisomy 21, the most common chromosome autosomal aneuploidies, accounted for 73.6% of all CSCAs, with a rate of 1.13%. As a group, the most common chromosomal aneuploidies (13/18/21/X/Y) accounted for 93.9% of all abnormalities, with a rate of 1.44%. The incidence of trisomy 21, trisomy 13/18/21 as a group, and 13/18/21/X/Y as a group was significantly greater in women aged 39 years and older (p < 0.001), but was not different between women aged 35 years, 36 years, 37 years, and 38 years.
Conclusion: These findings may assist in genetic counseling of advanced maternal age pregnant women, and provide a basis for the selection of prenatal screening and diagnostic technologies.
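The percentages in the results can be cross-checked against each other using only the figures quoted in the abstract (46,258 women, 708 CSCA cases, and the three group shares). The sketch below recomputes each group's rate from its share of all CSCAs; the variable and function names are illustrative, and per-group case counts are not given in the abstract, so none are assumed.

```python
TOTAL_WOMEN = 46258   # women reviewed, from the abstract
CSCA_CASES = 708      # clinically significant chromosomal abnormalities, from the abstract

# Share of all CSCAs attributed to each group, in percent, from the abstract
SHARES = {
    "trisomy 21": 55.9,
    "trisomy 13/18/21": 73.6,
    "13/18/21/X/Y": 93.9,
}

def percent(numerator, denominator):
    """Percentage rounded to two decimals, matching the abstract's precision."""
    return round(100.0 * numerator / denominator, 2)

overall_rate = percent(CSCA_CASES, TOTAL_WOMEN)

# Rate implied by each share: (share of CSCAs) x (CSCA cases) / (women reviewed)
implied_rates = {
    name: percent(share / 100.0 * CSCA_CASES, TOTAL_WOMEN)
    for name, share in SHARES.items()
}

print("overall CSCA rate (%):", overall_rate)
for name, rate in implied_rates.items():
    print(f"{name}: implied rate {rate}%")
```

The recomputed values agree with the rates of 1.53%, 0.86%, 1.13%, and 1.44% stated in the abstract.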