403 research outputs found
Paste, Inpaint and Harmonize via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model
Text-to-image generative models have attracted rising attention for flexible
image editing via user-specified descriptions. However, text descriptions alone
are not enough to elaborate the details of subjects, often compromising the
subjects' identity or requiring additional per-subject fine-tuning. We
introduce a new framework called Paste, Inpaint and Harmonize via
Denoising (PhD), which leverages an exemplar image in addition to text
descriptions to specify user intentions. In the pasting step, an off-the-shelf
segmentation model identifies a user-specified subject within an
exemplar image; the subject is then inserted into a background image to serve
as an initialization that captures both scene context and subject identity.
To guarantee the visual coherence of the generated or edited image, we
introduce an inpainting and harmonizing module that guides the pre-trained
diffusion model to blend the inserted subject seamlessly into the scene.
As we keep the pre-trained diffusion model frozen, we preserve its
strong image synthesis ability and text-driven ability, thus achieving
high-quality results and flexible editing with diverse texts. In our
experiments, we apply PhD to subject-driven image editing tasks and
explore text-driven scene generation given a reference subject. Both
quantitative and qualitative comparisons with baseline methods demonstrate that
our approach achieves state-of-the-art performance in both tasks. More
qualitative results can be found at
https://sites.google.com/view/phd-demo-page.
Comment: 10 pages, 12 figures
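The pasting step described in the abstract above can be illustrated with a minimal sketch. It assumes the subject mask has already been produced by some segmentation model (not shown) and uses small grayscale grids in place of real images. The function name `paste_subject` and its parameters are hypothetical, chosen for illustration only, and are not taken from the paper. The inpainting and harmonizing module that follows this step is not covered.

```python
from typing import List

Image = List[List[int]]   # grayscale image as rows of pixel intensities
Mask = List[List[bool]]   # True where the segmented subject is (assumed given)


def paste_subject(background: Image, exemplar: Image, mask: Mask,
                  top: int, left: int) -> Image:
    """Copy the masked exemplar pixels onto a copy of the background.

    Hypothetical helper: (top, left) is where the exemplar's upper-left
    corner lands in the background.
    """
    result = [row[:] for row in background]  # leave the input untouched
    height, width = len(result), len(result[0])
    for r, (pixel_row, mask_row) in enumerate(zip(exemplar, mask)):
        for c, (pixel, is_subject) in enumerate(zip(pixel_row, mask_row)):
            y, x = top + r, left + c
            # Skip non-subject pixels and anything outside the background.
            if is_subject and 0 <= y < height and 0 <= x < width:
                result[y][x] = pixel
    return result


# Toy data: a 3x4 empty background and a 2x2 exemplar with an L-shaped subject.
background = [[0] * 4 for _ in range(3)]
exemplar = [[9, 9], [9, 9]]
mask = [[True, False], [True, True]]
initial = paste_subject(background, exemplar, mask, top=1, left=2)
```

The composite `initial` plays the role of the initialization mentioned in the abstract: it carries the scene context from the background and the subject's pixels from the exemplar, but it has hard seams that a later blending stage would need to resolve.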
Deep Learning Approaches for Data Augmentation in Medical Imaging: A Review
Deep learning has become a popular tool for medical image analysis, but the
limited availability of training data remains a major challenge, particularly
in the medical field where data acquisition can be costly and subject to
privacy regulations. Data augmentation techniques offer a solution by
artificially increasing the number of training samples, but these techniques
often produce limited and unconvincing results. To address this issue, a
growing number of studies have proposed the use of deep generative models to
generate more realistic and diverse data that conform to the true distribution
of the data. In this review, we focus on three types of deep generative models
for medical image augmentation: variational autoencoders, generative
adversarial networks, and diffusion models. We provide an overview of the
current state of the art in each of these models and discuss their potential
for use in different downstream tasks in medical imaging, including
classification, segmentation, and cross-modal translation. We also evaluate the
strengths and limitations of each model and suggest directions for future
research in this field. Our goal is to provide a comprehensive review of the
use of deep generative models for medical image augmentation and to highlight
the potential of these models for improving the performance of deep learning
algorithms in medical image analysis.
Generative adversarial networks review in earthquake-related engineering fields
Within seismology, geology, and civil and structural engineering, deep learning (DL), especially via generative adversarial networks (GANs), represents an innovative, engaging, and advantageous way to generate reliable synthetic data that represent the characteristics of actual samples, providing a handy data augmentation tool. Indeed, in many practical applications, obtaining a significant amount of high-quality information is demanding. Data augmentation is generally based on artificial intelligence (AI) and machine learning data-driven models. The DL GAN-based data augmentation approach for generating synthetic seismic signals has revolutionized the current data augmentation paradigm. This study delivers a critical state-of-the-art review, explaining recent research into AI-based GAN synthetic generation of ground motion signals or seismic events, together with comprehensive insight into seismic-related geophysical studies. This study may be especially relevant to earth and planetary science, geology and seismology, and oil and gas exploration, as well as to assessing the seismic response of buildings and infrastructure, seismic detection tasks, and general structural and civil engineering applications. Furthermore, highlighting the strengths and limitations of current studies on adversarial learning applied to seismology may help guide research efforts in the near future toward the most promising directions.
- …