Dynamic Low-Rank Instance Adaptation for Universal Neural Image Compression
The latest advancements in neural image compression show great potential in
surpassing the rate-distortion performance of conventional standard codecs.
Nevertheless, there exists an indelible domain gap between the datasets
utilized for training (i.e., natural images) and those utilized for inference
(e.g., artistic images). Our proposal involves a low-rank adaptation approach
aimed at addressing the rate-distortion drop observed in out-of-domain
datasets. Specifically, we perform low-rank matrix decomposition to update
certain adaptation parameters of the client's decoder. These updated
parameters, along with image latents, are encoded into a bitstream and
transmitted to the decoder in practical scenarios. Due to the low-rank
constraint imposed on the adaptation parameters, the resulting bit rate
overhead is small. Furthermore, the bit rate allocation of low-rank adaptation
is \emph{non-trivial}, considering that diverse inputs require varying
adaptation bitstreams. We thus introduce a dynamic gating network on top of the
low-rank adaptation method, in order to decide which decoder layer should
employ adaptation. The dynamic adaptation network is optimized end-to-end using
rate-distortion loss. Our proposed method exhibits universality across diverse
image datasets. Extensive results demonstrate that this paradigm significantly
mitigates the domain gap, surpassing non-adaptive methods with an average
BD-rate improvement of approximately across out-of-domain images.
Furthermore, it outperforms the most advanced instance adaptive methods by
roughly BD-rate. Ablation studies confirm our method's ability to
universally enhance various image compression architectures.
Comment: Accepted by ACM MM 2023, 13 pages, 12 figures
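The core low-rank idea can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's architecture: the layer size, rank, and variable names are hypothetical. A frozen decoder weight W is adapted per instance by a rank-r product A @ B, so only the two small factors need to be signaled in the bitstream.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, rank = 64, 64, 4  # hypothetical layer size and rank

W = rng.standard_normal((d_out, d_in))         # frozen pretrained decoder weight
A = rng.standard_normal((d_out, rank)) * 0.01  # low-rank factors sent in the bitstream
B = rng.standard_normal((rank, d_in)) * 0.01

W_adapted = W + A @ B  # per-instance adapted weight

# Overhead: transmit A and B instead of a full weight update.
full_params = d_out * d_in
lora_params = rank * (d_out + d_in)
print(lora_params / full_params)  # 0.125 here: an 8x reduction at rank 4
```

The low-rank constraint is what keeps the adaptation bit-rate overhead small: the signaled parameter count grows linearly in the rank rather than quadratically in the layer width.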
An Introduction to Neural Data Compression
Neural compression is the application of neural networks and other machine
learning methods to data compression. Recent advances in statistical machine
learning have opened up new possibilities for data compression, allowing
compression algorithms to be learned end-to-end from data using powerful
generative models such as normalizing flows, variational autoencoders,
diffusion probabilistic models, and generative adversarial networks. The
present article aims to introduce this field of research to a broader machine
learning audience by reviewing the necessary background in information theory
(e.g., entropy coding, rate-distortion theory) and computer vision (e.g., image
quality assessment, perceptual metrics), and providing a curated guide through
the essential ideas and methods in the literature thus far.
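Two of the information-theoretic ideas such a review covers, entropy as the lower bound on average code length and the rate-distortion trade-off, can be illustrated with a toy NumPy snippet (the distribution and the lambda value below are made up for illustration):

```python
import numpy as np

# Toy symbol distribution from a (hypothetical) learned entropy model.
p = np.array([0.5, 0.25, 0.125, 0.125])

# Shannon entropy: the best achievable average bits/symbol under this model.
entropy = -np.sum(p * np.log2(p))
print(entropy)  # 1.75 bits per symbol

# The usual neural-compression training objective: rate + lambda * distortion.
lam = 0.01
def rd_loss(bits_per_pixel, mse):
    return bits_per_pixel + lam * mse
```

An entropy coder driven by this model spends about -log2 p(x) bits on symbol x, so a sharper learned model directly lowers the rate term of the loss.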
Leveraging progressive model and overfitting for efficient learned image compression
Deep learning has been overwhelmingly dominant in the field of computer vision
and image/video processing for the last decade. However, for image and video
compression, it lags behind the traditional techniques based on discrete cosine
transform (DCT) and linear filters. Built on top of an autoencoder
architecture, learned image compression (LIC) systems have drawn enormous
attention in recent years. Nevertheless, the proposed LIC systems are still
inferior to the state-of-the-art traditional techniques, for example, the
Versatile Video Coding (VVC/H.266) standard, due to either their compression
performance or decoding complexity. Although claimed to outperform the
VVC/H.266 on a limited bit rate range, some proposed LIC systems take over 40
seconds to decode a 2K image on a GPU system. In this paper, we introduce a
powerful and flexible LIC framework with multi-scale progressive (MSP)
probability model and latent representation overfitting (LOF) technique. With
different predefined profiles, the proposed framework can achieve various
balance points between compression efficiency and computational complexity.
Experiments show that the proposed framework achieves 2.5%, 1.0%, and 1.3%
Bjontegaard delta bit rate (BD-rate) reduction over the VVC/H.266 standard on
three benchmark datasets on a wide bit rate range. More importantly, the
decoding complexity is reduced from O(n) to O(1) compared to many other LIC
systems, resulting in over 20 times speedup when decoding 2K images.
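The latent-overfitting idea can be sketched abstractly: at encode time, the latent for one image is refined by gradient descent on the distortion under a frozen decoder, and the refined latent is what gets entropy-coded. The toy below uses a linear "decoder" and plain NumPy as a stand-in for the paper's actual LOF procedure; all sizes and the step size are made up:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in: a frozen linear "decoder" and a target image vector.
D = rng.standard_normal((16, 8))
x = rng.standard_normal(16)

y = np.zeros(8)  # initial latent from the (hypothetical) encoder
lr = 0.005
for _ in range(1000):
    grad = 2 * D.T @ (D @ y - x)  # gradient of the distortion ||D y - x||^2
    y -= lr * grad                # refine the latent for this one image

# The refined latent y, not the raw encoder output, is entropy-coded and sent.
print(np.mean((D @ y - x) ** 2))
```

Because only the latent moves while the decoder stays fixed, the decoder side needs no extra computation; the per-image optimization cost is paid entirely at encode time.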
Universal Neural-Cracking-Machines: Self-Configurable Password Models from Auxiliary Data
We develop the first universal password model -- a password model that, once
pre-trained, can automatically adapt to any password distribution. To achieve
this result, the model does not need to access any plaintext passwords from the
target set. Instead, it exploits users' auxiliary information, such as email
addresses, as a proxy signal to predict the underlying target password
distribution. The model uses deep learning to capture the correlation between
the auxiliary data of a group of users (e.g., users of a web application) and
their passwords. It then exploits those patterns to create a tailored password
model for the target community at inference time. No further training steps,
targeted data collection, or prior knowledge of the community's password
distribution is required. Besides defining a new state-of-the-art for password
strength estimation, our model enables any end-user (e.g., system
administrators) to autonomously generate tailored password models for their
systems without the often unworkable requirement of collecting suitable
training data and fitting the underlying password model. Ultimately, our
framework enables the democratization of well-calibrated password models to the
community, addressing a major challenge in the deployment of password security
solutions on a large scale.
Comment: v0.0
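The use of auxiliary data as a proxy signal can be illustrated with a toy feature extractor over email addresses. The features below are invented for illustration and are not the paper's actual inputs; real systems would feed far richer signals into the pre-trained model:

```python
import re

def email_features(address: str) -> dict:
    """Toy proxy features from an email address (illustrative only)."""
    local, _, domain = address.partition("@")
    return {
        "local": local,
        "domain": domain,                      # e.g. hints at community/language
        "has_digits": bool(re.search(r"\d", local)),
        "tokens": re.split(r"[._\-]", local),  # name fragments users reuse in passwords
    }

feats = email_features("jane.doe42@example.org")
print(feats["tokens"])  # ['jane', 'doe42']
```

Signals like these are available to an administrator without collecting any plaintext passwords, which is what makes the self-configuration at inference time possible.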
Bayesian Modelling Approaches for Quantum States -- The Ultimate Gaussian Process States Handbook
Capturing the correlation emerging between constituents of many-body systems
accurately is one of the key challenges for the appropriate description of
various systems whose properties are underpinned by quantum mechanical
fundamentals. This thesis discusses novel tools and techniques for the
(classical) modelling of quantum many-body wavefunctions with the ultimate goal
to introduce a universal framework for finding accurate representations from
which system properties can be extracted efficiently. It is outlined how
synergies with standard machine learning approaches can be exploited to enable
an automated inference of the most relevant intrinsic characteristics through
rigorous Bayesian regression techniques. Based on the probabilistic framework
forming the foundation of the introduced ansatz, coined the Gaussian Process
State, different compression techniques are explored to extract numerically
feasible representations of relevant target states within stochastic schemes.
By following intuitively motivated design principles, the resulting model
carries a high degree of interpretability and offers an easily applicable tool
for the numerical study of quantum systems, including ones which are
notoriously difficult to simulate due to a strong intrinsic correlation. The
practical applicability of the Gaussian Process States framework is
demonstrated within several benchmark applications, in particular, ground state
approximations for prototypical quantum lattice models, including Fermi-Hubbard
models, as well as simple ab-initio quantum chemical systems.
Comment: PhD Thesis, King's College London, 202 pages
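The Bayesian regression machinery underlying such an ansatz can be illustrated with ordinary Gaussian process regression on a toy 1D function. This sketch shows only the generic posterior-mean computation, not the quantum-state ansatz itself; the kernel, data, and hyperparameters are chosen arbitrarily:

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # Squared-exponential kernel between two 1D point sets.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

# Toy regression task: infer a smooth function from noisy samples.
X = np.linspace(-3.0, 3.0, 20)
y = np.sin(X) + 0.05 * np.random.default_rng(2).standard_normal(20)

noise = 0.05 ** 2
K = rbf(X, X) + noise * np.eye(20)   # kernel matrix with noise regularization
alpha = np.linalg.solve(K, y)        # "weights" of the training points

X_star = np.array([0.0, 1.5])
mean = rbf(X_star, X) @ alpha        # GP posterior mean at test points
print(mean)                          # close to sin at the test points
```

The same logic, with training points replaced by configurations and the target function by (log-)wavefunction amplitudes, is what gives such probabilistic ansatzes their interpretability: predictions are explicit kernel-weighted combinations of reference data.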
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
Transformer-based Large Language Models (LLMs) have been applied in diverse
areas such as knowledge bases, human interfaces, and dynamic agents, marking a
stride towards achieving Artificial General Intelligence (AGI).
However, current LLMs are predominantly pretrained on short text snippets,
which compromises their effectiveness in processing the long-context prompts
that are frequently encountered in practical scenarios. This article offers a
comprehensive survey of the recent advancement in Transformer-based LLM
architectures aimed at enhancing the long-context capabilities of LLMs
throughout the entire model lifecycle, from pre-training through to inference.
We first delineate and analyze the problems of handling long-context input and
output with the current Transformer-based models. We then provide a taxonomy
and the landscape of upgrades on Transformer architecture to solve these
problems. Afterwards, we investigate widely used evaluation necessities
tailored for long-context LLMs, including datasets, metrics, and
baseline models, as well as optimization toolkits such as libraries,
frameworks, and compilers to boost the efficacy of LLMs across different stages
in runtime. Finally, we discuss the challenges and potential avenues for future
research. A curated repository of relevant literature, continuously updated, is
available at https://github.com/Strivin0311/long-llms-learning.
Comment: 40 pages, 3 figures, 4 tables
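A back-of-the-envelope count shows why long contexts strain the standard Transformer: self-attention materializes an n-by-n score matrix, so compute and memory grow quadratically in context length. The numbers below are an illustrative count only, ignoring heads, batch size, and element width:

```python
# Standard self-attention scores every token against every other token,
# so the score matrix has n * n entries for a context of n tokens.
def attention_scores(n_tokens: int) -> int:
    return n_tokens * n_tokens

print(attention_scores(2_048))    # 4194304 pairwise scores
print(attention_scores(128_000))  # ~1.6e10 -- why long contexts need new designs
```

This quadratic blow-up is the common motivation behind the architectural upgrades such surveys catalogue, from sparse and linearized attention to cache compression and positional-encoding extrapolation.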