ShadowDiffusion: When Degradation Prior Meets Diffusion Model for Shadow Removal
Recent deep learning methods have achieved promising results in image shadow
removal. However, their restored images still suffer from unsatisfactory
boundary artifacts, due to the lack of an embedded degradation prior and
insufficient modeling capacity. Our work addresses these issues by proposing a
unified diffusion framework that integrates both the image and degradation
priors for highly effective shadow removal. In detail, we first propose a
shadow degradation model, which inspires us to build a novel unrolling
diffusion model, dubbed ShadowDiffusion. It substantially improves the model's
capacity for shadow removal by progressively refining the desired output with
both degradation prior and diffusive generative prior, which by nature can
serve as a new strong baseline for image restoration. Furthermore,
ShadowDiffusion progressively refines the estimated shadow mask as an auxiliary
task of the diffusion generator, which leads to more accurate and robust
shadow-free image generation. We conduct extensive experiments on three popular
public datasets, including ISTD, ISTD+, and SRD, to validate our method's
effectiveness. Compared to the state-of-the-art methods, our model achieves a
significant improvement in terms of PSNR, raising it from 31.69 dB to 34.73 dB
on the SRD dataset.
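
To make the unrolling idea concrete, here is a minimal sketch of one iteration, alternating a denoising step under the diffusive generative prior with a data-consistency step toward a toy linear shadow degradation model. The denoiser, the attenuation factor, and the step size are illustrative assumptions, not the authors' implementation.

    import torch

    def degrade(x, mask, atten=0.4):
        # Toy linear shadow model: pixels inside the soft mask are attenuated.
        # x: shadow-free image (B, C, H, W); mask: shadow mask in [0, 1].
        return (1.0 - mask * (1.0 - atten)) * x

    def unrolled_step(x_t, y, mask, denoiser, t, atten=0.4, step=0.5):
        # One unrolling iteration: apply the (assumed) diffusion denoiser, then
        # take a gradient step on 0.5 * ||w * x - y||^2, the data-consistency
        # term induced by the degradation model for the observed shadow image y.
        x = denoiser(x_t, t)
        w = 1.0 - mask * (1.0 - atten)
        return x - step * w * (w * x - y)

    # Toy usage with an identity "denoiser" and a random shadow image.
    mask = torch.zeros(1, 1, 8, 8); mask[..., 2:5, 2:5] = 1.0
    y = degrade(torch.rand(1, 3, 8, 8), mask)
    x_next = unrolled_step(torch.rand(1, 3, 8, 8), y, mask, lambda x, t: x, t=10)
    print(x_next.shape)  # torch.Size([1, 3, 8, 8])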
ExposureDiffusion: Learning to Expose for Low-light Image Enhancement
Previous raw image-based low-light image enhancement methods predominantly
relied on feed-forward neural networks to learn deterministic mappings from
low-light to normally-exposed images. However, they failed to capture critical
distribution information, leading to visually undesirable results. This work
addresses the issue by seamlessly integrating a diffusion model with a
physics-based exposure model. Unlike a vanilla diffusion model, which must
denoise from pure Gaussian noise, our restoration process, equipped with the
injected physics-based exposure model, can start directly from the noisy image
itself. As such, our method obtains significantly improved performance and
reduced inference time compared with vanilla diffusion models. To make full use
of the advantages of different intermediate steps, we further propose an
adaptive residual layer that effectively screens out side effects of the
iterative refinement once the intermediate results are already
well-exposed. The proposed framework is compatible with real-paired datasets,
real/synthetic noise models, and different backbone networks. We evaluate the
proposed method on various
public benchmarks, achieving promising results with consistent improvements
using different exposure models and backbones. Moreover, the proposed method
generalizes better to unseen amplification ratios and outperforms a larger
feed-forward neural model while using far fewer parameters.
Comment: accepted by ICCV 2023
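
The central inference idea, starting the reverse process from the noisy observation rather than from pure Gaussian noise, can be sketched as follows. The per-step noise schedule, the noise estimate, and the denoise_step helper are assumptions for illustration, not the paper's exposure model.

    import torch

    def matching_timestep(sigma_img, sigmas):
        # Pick the diffusion step whose noise level best matches the estimated
        # noise std of the (amplified) low-light image.
        return int(torch.argmin((sigmas - sigma_img).abs()))

    @torch.no_grad()
    def restore(noisy_img, denoise_step, sigmas, sigma_img):
        # Start from the noisy image itself and run only the remaining steps,
        # instead of denoising all the way down from pure noise.
        x = noisy_img
        for t in range(matching_timestep(sigma_img, sigmas), -1, -1):
            x = denoise_step(x, t)  # one reverse update (assumed given)
        return x

    sigmas = torch.linspace(0.01, 1.0, 100)  # assumed per-step noise levels
    out = restore(torch.rand(1, 3, 8, 8), lambda x, t: 0.99 * x, sigmas, 0.3)
    print(out.shape)  # torch.Size([1, 3, 8, 8])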
An Autonomous Large Language Model Agent for Chemical Literature Data Mining
Chemical synthesis, which is crucial for advancing material synthesis and
drug discovery, impacts various sectors including environmental science and
healthcare. The rise of technology in chemistry has generated extensive
chemical data, challenging researchers to discern patterns and refine synthesis
processes. Artificial intelligence (AI) helps by analyzing data to optimize
synthesis and increase yields. However, AI faces challenges in processing
literature data due to the unstructured formats and diverse writing styles of
chemical literature. To overcome these difficulties, we introduce an end-to-end
AI agent framework capable of high-fidelity extraction from extensive chemical
literature. This AI agent employs large language models (LLMs) for prompt
generation and iterative optimization. It functions as a chemistry assistant,
automating data collection and analysis, thereby saving manpower and enhancing
performance. Our framework's efficacy is evaluated using accuracy, recall, and
F1 score on reaction condition data, and we compare our method with human
experts in terms of content correctness and time efficiency. The proposed
approach marks a significant advancement in automating chemical literature
extraction and demonstrates the potential for AI to revolutionize data
management and utilization in chemistry.
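
As a rough illustration of how such extraction can be scored (the exact protocol is the paper's; this is a generic sketch), precision, recall, and F1 over extracted (field, value) pairs can be computed as follows:

    def prf1(predicted, gold):
        # Micro precision/recall/F1 over sets of (field, value) pairs.
        tp = len(predicted & gold)
        p = tp / len(predicted) if predicted else 0.0
        r = tp / len(gold) if gold else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        return p, r, f1

    # Hypothetical reaction-condition annotations for a single reaction.
    pred = {("solvent", "THF"), ("temperature", "80 C"), ("time", "2 h")}
    gold = {("solvent", "THF"), ("temperature", "80 C"), ("catalyst", "Pd/C")}
    print(prf1(pred, gold))  # (0.667, 0.667, 0.667) up to rounding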
SinSR: Diffusion-Based Image Super-Resolution in a Single Step
While super-resolution (SR) methods based on diffusion models exhibit
promising results, their practical application is hindered by the substantial
number of required inference steps. Recent methods utilize degraded images in
the initial state, thereby shortening the Markov chain. Nevertheless, these
solutions either rely on a precise formulation of the degradation process or
still necessitate a relatively lengthy generation path (e.g., 15 iterations).
To enhance inference speed, we propose a simple yet effective method for
achieving single-step SR generation, named SinSR. Specifically, we first derive
a deterministic sampling process from the most recent state-of-the-art (SOTA)
method for accelerating diffusion-based SR. This allows the mapping between the
input random noise and the generated high-resolution image to be obtained in a
reduced and acceptable number of inference steps during training. We show that
this deterministic mapping can be distilled into a student model that performs
SR within only one inference step. Additionally, we propose a novel
consistency-preserving loss to simultaneously leverage the ground-truth image
during the distillation process, ensuring that the performance of the student
model is not solely bound by the feature manifold of the teacher model,
resulting in further performance improvement. Extensive experiments conducted
on synthetic and real-world datasets demonstrate that the proposed method can
achieve comparable or even superior performance compared to both previous SOTA
methods and the teacher model, in just one sampling step, yielding up to a
10x inference speedup. Our code will be released at
https://github.com/wyf0912/SinSR.
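
A minimal sketch of the distillation objective, under assumed student and teacher_sample callables: the student regresses the teacher's deterministic noise-to-image mapping in a single step, while the consistency-preserving idea is approximated here by a plain ground-truth term (the paper's actual loss is more involved).

    import torch
    import torch.nn.functional as F

    def distill_loss(student, teacher_sample, z, lr_img, hr_gt, w=1.0):
        # Match the teacher's deterministic multi-step mapping in one step.
        with torch.no_grad():
            target = teacher_sample(z, lr_img)  # teacher's deterministic output
        pred = student(z, lr_img)               # single-step student prediction
        return F.l1_loss(pred, target) + w * F.l1_loss(pred, hr_gt)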
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
Diffusion models have proven to be highly effective in image and video
generation; however, they still face composition challenges when generating
images of varying sizes due to single-scale training data. Adapting large
pre-trained diffusion models for higher resolution demands substantial
computational and optimization resources, yet achieving a generation capability
comparable to low-resolution models remains elusive. This paper proposes a
novel self-cascade diffusion model that leverages the rich knowledge gained
from a well-trained low-resolution model for rapid adaptation to
higher-resolution image and video generation, employing either tuning-free or
cheap upsampler tuning paradigms. Integrating a sequence of multi-scale
upsampler modules, the self-cascade diffusion model can efficiently adapt to a
higher resolution, preserving the original composition and generation
capabilities. We further propose a pivot-guided noise re-schedule strategy to
speed up the inference process and improve local structural details. Compared
to full fine-tuning, our approach achieves a 5X training speed-up and requires
only an additional 0.002M tuning parameters. Extensive experiments demonstrate
that our approach can quickly adapt to higher resolution image and video
synthesis by fine-tuning for just 10k steps, with virtually no additional
inference time.
Comment: Project Page: https://guolanqing.github.io/Self-Cascade
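
The pivot-guided re-noising strategy can be sketched roughly as follows: upsample the low-resolution result, forward-diffuse it to a pivot timestep, and run only the remaining reverse steps at the higher resolution. The upsampler, denoise_step, and pivot choice are illustrative assumptions, not the released implementation.

    import torch

    @torch.no_grad()
    def pivot_upscale(x_low, upsampler, denoise_step, alphas_cumprod, t_pivot):
        x = upsampler(x_low)  # upsampler module (or plain interpolation)
        a = alphas_cumprod[t_pivot]
        # Forward-diffuse the upscaled image to the pivot noise level.
        x = a.sqrt() * x + (1 - a).sqrt() * torch.randn_like(x)
        for t in range(t_pivot, -1, -1):
            x = denoise_step(x, t)  # reverse update at high resolution (assumed)
        return x

    betas = torch.linspace(1e-4, 0.02, 100)
    abar = torch.cumprod(1 - betas, dim=0)
    up = lambda x: torch.nn.functional.interpolate(x, scale_factor=2)
    out = pivot_upscale(torch.rand(1, 3, 16, 16), up, lambda x, t: x, abar, 30)
    print(out.shape)  # torch.Size([1, 3, 32, 32])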
Exploiting non-local priors via self-convolution for highly-efficient image restoration
Constructing effective priors is critical to solving ill-posed inverse problems in image processing and computational imaging. Recent works focused on exploiting non-local similarity by grouping similar patches for image modeling, and demonstrated state-of-the-art results in many image restoration applications. However, compared to classic methods based on filtering or sparsity, non-local algorithms are more time-consuming, mainly due to the highly inefficient block matching step, i.e., the distance between every pair of overlapping patches needs to be computed. In this work, we propose a novel Self-Convolution operator to exploit image non-local properties in a unified framework. We prove that the proposed Self-Convolution-based formulation can generalize the commonly used non-local modeling methods, as well as produce results equivalent to standard methods, but with much cheaper computation. Furthermore, by applying Self-Convolution, we propose an effective multi-modality image restoration scheme, which is much more efficient than conventional block matching for non-local modeling. Experimental results demonstrate that (1) Self-Convolution with a fast Fourier transform implementation can significantly speed up most of the popular non-local image restoration algorithms, with two-fold to nine-fold faster block matching, and (2) the proposed online multi-modality image restoration scheme achieves superior denoising results over competing methods in both efficiency and effectiveness on RGB-NIR images. The code for this work is publicly available at https://github.com/GuoLanqing/Self-Convolution.
This work was supported in part by the Ministry of Education, Singapore, through its Academic Research Fund Tier 1 under Project RG137/20; in part by the Start Up Grant; and in part by the Rapid-Rich Object Search (ROSE) Laboratory, Nanyang Technological University, Singapore.
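
The computational trick behind FFT-based block matching follows from expanding the squared distance, ||w - p||^2 = ||w||^2 - 2<w, p> + ||p||^2, so both image-dependent terms become convolutions. A simplified single-channel sketch (not the released code):

    import numpy as np
    from scipy.signal import fftconvolve

    def patch_distances(img, patch):
        # Squared L2 distance between `patch` and every same-size window of
        # `img`, computed with FFT convolutions instead of explicit matching.
        ones = np.ones(patch.shape)
        win_sq = fftconvolve(img ** 2, ones, mode="valid")         # ||w||^2 terms
        cross = fftconvolve(img, patch[::-1, ::-1], mode="valid")  # <w, p> terms
        return win_sq - 2 * cross + np.sum(patch ** 2)

    rng = np.random.default_rng(0)
    img = rng.standard_normal((64, 64))
    ref = img[10:18, 20:28]  # an 8x8 reference patch cut from the image
    d = patch_distances(img, ref)
    print(np.unravel_index(np.argmin(d), d.shape))  # (10, 20): exact match found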
Enhancing Low-Light Images in Real World via Cross-Image Disentanglement
Images captured in low-light conditions suffer from low visibility and
various imaging artifacts, e.g., real noise. Existing supervised enhancement
algorithms require a large set of pixel-aligned training image pairs, which are
hard to prepare in practice. Though weakly-supervised or unsupervised methods
can alleviate such challenges without using paired training images, some
real-world artifacts inevitably get falsely amplified because of the lack of
corresponding supervision. In this paper, instead of using perfectly aligned
images for training, we employ misaligned real-world images as guidance,
which are considerably easier to collect. Specifically, we
propose a Cross-Image Disentanglement Network (CIDN) to separately extract
cross-image brightness and image-specific content features from
low/normal-light images. Based on that, CIDN can simultaneously correct the
brightness and suppress image artifacts in the feature domain, which largely
increases the robustness to the pixel shifts. Furthermore, we collect a new
low-light image enhancement dataset consisting of misaligned training images
with real-world corruptions. Experimental results show that our model achieves
state-of-the-art performance on both the newly proposed dataset and other
popular low-light datasets.
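
A toy sketch of the disentanglement idea (the tiny network below is hypothetical, not the CIDN architecture): one encoder extracts image-specific content from the low-light input, another pools a global brightness code from the misaligned normal-light guide, and a decoder recombines the two in the feature domain.

    import torch
    import torch.nn as nn

    class DisentangleSketch(nn.Module):
        def __init__(self, ch=16):
            super().__init__()
            self.content = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU())
            self.bright = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
                                        nn.AdaptiveAvgPool2d(1))  # global code
            self.decode = nn.Conv2d(ch, 3, 3, padding=1)

        def forward(self, low, normal):
            c = self.content(low)    # image-specific content (pixel-aligned)
            b = self.bright(normal)  # cross-image brightness (alignment-free)
            return self.decode(c * b)

    net = DisentangleSketch()
    out = net(torch.rand(1, 3, 32, 32), torch.rand(1, 3, 32, 32))
    print(out.shape)  # torch.Size([1, 3, 32, 32])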
The emerging progress on wound dressings and their application in clinic wound management
Background: In addition to its barrier function, the skin plays a crucial role in maintaining the stability of the body's internal environment and normal physiological functions. When the skin is damaged, it is important to select proper dressings as temporary barriers to cover the wound, which can exert significant effects on defence against microbial infection, maintaining normal tissue/cell functions, and coordinating the process of wound repair and regeneration. This now forms an important approach in clinical practice to facilitate wound repair.
Search strategies: We conducted a comprehensive literature search using online databases including PubMed, Web of Science, MEDLINE, ScienceDirect, Wiley Online Library, CNKI, and Wanfang Data. In addition, information was obtained from local and foreign books on biomaterials science and traumatology.
Results: This review focuses on the efficacy and principles of functional dressings for anti-bacterial, anti-infection, anti-inflammation, and anti-oxidation effects, hemostasis, and wound healing facilitation, and analyses the research progress of dressings carrying living cells such as fibroblasts, keratinocytes, skin appendage cells, and stem cells from different origins. We also summarize the recent advances in intelligent wound dressings with respect to real-time monitoring, automatic drug delivery, and precise adjustment according to the actual wound microenvironment. In addition, this review explores and compares the characteristics, advantages and disadvantages, mechanisms of action, and application scopes of dressings made from different materials.
Conclusion: The real-time and dynamic acquisition and analysis of wound conditions are crucial for wound management and prognostic evaluation. Therefore, the development of modern dressings that integrate multiple functions, have high similarity to the skin, and are highly intelligent will be the focus of future research, which could drive efficient wound management and personalized medicine, and ultimately facilitate the translation of health monitoring into clinical practice.
ShadowFormer: Global Context Helps Shadow Removal
Recent deep learning methods have achieved promising results in image shadow removal. However, most of the existing approaches focus on working locally within shadow and non-shadow regions, resulting in severe artifacts around the shadow boundaries as well as inconsistent illumination between shadow and non-shadow regions. It remains challenging for deep shadow removal models to exploit the global contextual correlation between shadow and non-shadow regions. In this work, we first propose a Retinex-based shadow model, from which we derive a novel transformer-based network, dubbed ShadowFormer, that exploits non-shadow regions to help restore shadow regions. A multi-scale channel attention framework is employed to hierarchically capture the global information. Based on that, we propose a Shadow-Interaction Module (SIM) with Shadow-Interaction Attention (SIA) in the bottleneck stage to effectively model the contextual correlation between shadow and non-shadow regions. We conduct extensive experiments on three popular public datasets, including ISTD, ISTD+, and SRD, to evaluate the proposed method. Our method achieves state-of-the-art performance while using up to 150X fewer model parameters.
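
One plausible reading of mask-modulated attention (a sketch under assumptions, not the authors' exact SIA formulation) biases the attention logits so that shadow-region queries attend more strongly to non-shadow keys:

    import torch
    import torch.nn.functional as F

    def shadow_interaction_attention(q, k, v, mask, gamma=2.0):
        # q, k, v: (B, N, C) tokens; mask: (B, N), 1 for shadow tokens.
        attn = (q @ k.transpose(-2, -1)) * q.shape[-1] ** -0.5
        # Interaction term: 1 where the query is shadow and the key is not.
        inter = mask.unsqueeze(-1) * (1.0 - mask).unsqueeze(1)
        return F.softmax(attn + gamma * inter, dim=-1) @ v

    q = k = v = torch.rand(1, 64, 32)
    m = (torch.rand(1, 64) > 0.5).float()
    print(shadow_interaction_attention(q, k, v, m).shape)  # torch.Size([1, 64, 32])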