Contrast, Attend and Diffuse to Decode High-Resolution Images from Brain Activities
Decoding visual stimuli from neural responses recorded by functional Magnetic
Resonance Imaging (fMRI) presents an intriguing intersection between cognitive
neuroscience and machine learning, promising advancements in understanding
human visual perception and building non-invasive brain-machine interfaces.
However, the task is challenging due to the noisy nature of fMRI signals and
the intricate pattern of brain visual representations. To mitigate these
challenges, we introduce a two-phase fMRI representation learning framework.
The first phase pre-trains an fMRI feature learner with a proposed
Double-contrastive Mask Auto-encoder to learn denoised representations. The
second phase tunes the feature learner to attend to neural activation patterns
most informative for visual reconstruction with guidance from an image
auto-encoder. The optimized fMRI feature learner then conditions a latent
diffusion model to reconstruct image stimuli from brain activities.
Experimental results demonstrate our model's superiority in generating
high-resolution and semantically accurate images, substantially exceeding
previous state-of-the-art methods by 39.34% in 50-way top-1 semantic
classification accuracy. Our research invites further exploration of the
decoding task's potential and contributes to the development of non-invasive
brain-machine interfaces.
Comment: 17 pages, 6 figures, conferenc
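The abstract above quotes a 50-way top-1 accuracy without defining the protocol. A minimal sketch of one common reading of an n-way top-1 metric follows: the true class competes against n-1 randomly drawn distractor classes, and a trial counts as a hit when the true class scores highest among those candidates. The function name `n_way_top1`, the `trials` parameter, and the use of classifier logits as input are assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def n_way_top1(logits, true_class, n_way=50, trials=100, rng=None):
    """Fraction of trials in which the true class outscores n_way - 1 random distractors.

    `logits` is a 1-D array of per-class scores for one reconstructed image,
    assumed to come from some pretrained classifier (not included here).
    """
    rng = np.random.default_rng(rng)
    # Every class except the true one is a possible distractor.
    others = np.delete(np.arange(logits.shape[0]), true_class)
    hits = 0
    for _ in range(trials):
        distractors = rng.choice(others, size=n_way - 1, replace=False)
        candidates = np.concatenate(([true_class], distractors))
        # Hit if the best-scoring candidate is the true class.
        hits += int(candidates[np.argmax(logits[candidates])] == true_class)
    return hits / trials


# Usage with synthetic scores for a hypothetical 1000-class classifier.
scores = np.random.default_rng(0).normal(size=1000)
accuracy = n_way_top1(scores, true_class=3, n_way=50, trials=200, rng=1)
```

Averaging over many random distractor draws reduces the variance that a single draw would introduce.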
Learning to Improve Image Compression without Changing the Standard Decoder
In recent years we have witnessed an increasing interest in applying Deep
Neural Networks (DNNs) to improve the rate-distortion performance in image
compression. However, the existing approaches either train a post-processing
DNN on the decoder side, or propose learning for image compression in an
end-to-end manner. Either way, the trained DNNs are required in the decoder,
making them incompatible with the standard image decoders (e.g., JPEG) in
personal computers and mobile devices. Therefore, we propose learning to improve the
encoding performance with the standard decoder. In this paper, we work on JPEG
as an example. Specifically, a frequency-domain pre-editing method is proposed
to optimize the distribution of DCT coefficients, aiming at facilitating the
JPEG compression. Moreover, we propose learning the JPEG quantization table
jointly with the pre-editing network. Most importantly, we do not modify the
JPEG decoder and therefore our approach is applicable when viewing images with
the widely used standard JPEG decoder. The experiments validate that our
approach successfully improves the rate-distortion performance of JPEG in terms
of various quality metrics, such as PSNR, MS-SSIM and LPIPS. Visually, this
translates to better overall color retention especially when strong compression
is applied. The code is available at
https://github.com/YannickStruempler/LearnedJPEG
Comment: Accepted to ECCV AIM Worksho
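The abstract above relies on two standard JPEG ingredients: block DCT coefficients and a quantization table. The sketch below shows only that standard encode/decode step on a single 8x8 block, to illustrate why the table controls the rate-distortion trade-off. It is not the paper's method: the learned pre-editing network and learned table are not reproduced, and the function names and flat example tables are assumptions for illustration.

```python
import numpy as np


def dct_matrix(n: int = 8) -> np.ndarray:
    """Orthonormal DCT-II matrix; row k is the k-th cosine basis vector."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return mat


def quantize_block(block: np.ndarray, qtable: np.ndarray) -> np.ndarray:
    """Level-shift, apply the 2-D DCT, then divide by the table and round."""
    d = dct_matrix(block.shape[0])
    coeffs = d @ (block - 128.0) @ d.T
    return np.round(coeffs / qtable).astype(np.int32)


def dequantize_block(quantized: np.ndarray, qtable: np.ndarray) -> np.ndarray:
    """What a standard decoder does: rescale by the table and invert the DCT."""
    d = dct_matrix(quantized.shape[0])
    return d.T @ (quantized * qtable) @ d + 128.0


# A smooth synthetic 8x8 block and two flat (hypothetical) quantization tables.
block = np.add.outer(np.arange(8), np.arange(8)) * 8.0 + 64.0
fine_table = np.full((8, 8), 16.0)
coarse_table = np.full((8, 8), 64.0)

fine = quantize_block(block, fine_table)
coarse = quantize_block(block, coarse_table)

# Larger table entries zero out more coefficients (lower rate) at the cost
# of a larger reconstruction error (higher distortion).
error_fine = np.abs(dequantize_block(fine, fine_table) - block).mean()
error_coarse = np.abs(dequantize_block(coarse, coarse_table) - block).mean()
```

Because the decoder only needs the table stored in the file header, changing the table or the pixel values before encoding leaves the decoding path untouched, which is the compatibility property the abstract emphasizes.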
Image Synthesis under Limited Data: A Survey and Taxonomy
Deep generative models, which target reproducing the given data distribution
to produce novel samples, have made unprecedented advancements in recent years.
Their technical breakthroughs have enabled unparalleled quality in the
synthesis of visual content. However, one critical prerequisite for their
tremendous success is the availability of a sufficient number of training
samples, which requires massive computation resources. When trained on limited
data, generative models tend to suffer from severe performance deterioration
due to overfitting and memorization. Accordingly, researchers have recently
devoted considerable attention to developing novel models that are capable of
generating plausible and diverse images from limited training data. Despite
numerous efforts to enhance training stability and synthesis quality in the
limited data scenarios, there is a lack of a systematic survey that provides 1)
a clear problem definition, critical challenges, and taxonomy of various tasks;
2) an in-depth analysis of the pros, cons, and remaining limitations of existing
literature; as well as 3) a thorough discussion on the potential applications
and future directions in the field of image synthesis under limited data. In
order to fill this gap and provide an informative introduction to researchers
who are new to this topic, this survey offers a comprehensive review and a
novel taxonomy on the development of image synthesis under limited data. In
particular, it covers the problem definition, requirements, main solutions,
popular benchmarks, and remaining challenges in a comprehensive manner.
Comment: 230 references, 25 pages. GitHub:
https://github.com/kobeshegu/awesome-few-shot-generatio