Universal Approximation Property of Fully Convolutional Neural Networks with Zero Padding
The Convolutional Neural Network (CNN) is one of the most prominent neural
network architectures in deep learning. Despite its widespread adoption, our
understanding of its universal approximation properties has been limited due to
its intricate nature. CNNs inherently function as tensor-to-tensor mappings,
preserving the spatial structure of input data. However, limited research has
explored the universal approximation properties of fully convolutional neural
networks as arbitrary continuous tensor-to-tensor functions. In this study, we
demonstrate that CNNs, when utilizing zero padding, can approximate arbitrary
continuous functions in cases where both the input and output values exhibit
the same spatial shape. Additionally, we determine the minimum depth of the
neural network required for approximation and substantiate its optimality. We
also verify that deep, narrow CNNs possess the UAP as tensor-to-tensor
functions. The results encompass a wide range of activation functions, and our
research covers CNNs of all dimensions.
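As a rough illustration of the shape-preserving setting described in this abstract, the sketch below applies zero-padded convolutions to a 1-D signal so that the output has the same length as the input. It is a toy example, not the construction used in the paper; the function names `conv1d_zero_pad` and `relu`, the kernels, and the choice of ReLU are all assumptions made for the illustration.

```python
def conv1d_zero_pad(signal, kernel):
    """Cross-correlate `signal` with an odd-length `kernel`.

    The signal is padded with zeros on both sides so that the output
    has exactly the same length as the input.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def relu(values):
    """Element-wise ReLU activation (an illustrative choice)."""
    return [max(0.0, v) for v in values]


# Two stacked zero-padded layers: the spatial shape never changes.
x = [1.0, 2.0, 3.0, 4.0]
hidden = relu(conv1d_zero_pad(x, [1.0, 0.0, -1.0]))
output = conv1d_zero_pad(hidden, [0.5, 1.0, 0.5])
print(len(x), len(hidden), len(output))
```

Because every layer pads by half the kernel width, stacking more layers changes the depth of the network but never the shape of the tensor it maps.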
Generative Modeling through the Semi-dual Formulation of Unbalanced Optimal Transport
The Optimal Transport (OT) problem seeks a transport map that bridges two
distributions while minimizing a given cost function. Accordingly, OT
between a tractable prior distribution and the data has been used for generative
modeling tasks. However, OT-based methods are susceptible to outliers and face
optimization challenges during training. In this paper, we propose a novel
generative model based on the semi-dual formulation of Unbalanced Optimal
Transport (UOT). Unlike OT, UOT relaxes the hard constraint on distribution
matching. This approach provides better robustness against outliers, stability
during training, and faster convergence. We validate these properties
empirically through experiments. Moreover, we study a theoretical upper bound
on the divergence between distributions in UOT. Our model outperforms existing
OT-based generative models, achieving FID scores of 2.97 on CIFAR-10 and 5.80
on CelebA-HQ-256.
Comment: 23 pages, 15 figures
Analyzing and Improving Optimal-Transport-based Adversarial Networks
The Optimal Transport (OT) problem aims to find a transport plan that bridges two
distributions while minimizing a given cost function. OT theory has been widely
used in generative modeling. Initially, the OT distance was used as
a measure of the distance between the data and generated distributions.
More recently, the OT map between the data and prior distributions has been
used as a generative model. These OT-based generative models share a
similar adversarial training objective. In this paper, we begin by unifying
these OT-based adversarial methods within a single framework. Then, we
elucidate the role of each component in training dynamics through a
comprehensive analysis of this unified framework. Moreover, we suggest a simple
but novel method that improves the previously best-performing OT-based model.
Intuitively, our approach conducts a gradual refinement of the generated
distribution, progressively aligning it with the data distribution. Our
approach achieves FID scores of 2.51 on CIFAR-10 and 5.99 on CelebA-HQ-256,
outperforming unified OT-based adversarial approaches.
Comment: 27 pages, 17 figures
Spatial-and-Frequency-aware Restoration method for Images based on Diffusion Models
Diffusion models have recently emerged as a promising framework for Image
Restoration (IR), owing to their ability to produce high-quality
reconstructions and their compatibility with established methods. Existing
methods for solving noisy inverse problems in IR consider pixel-wise
data-fidelity. In this paper, we propose SaFaRI, a spatial-and-frequency-aware
diffusion model for IR with Gaussian noise. Our model encourages images to
preserve data-fidelity in both the spatial and frequency domains, resulting in
enhanced reconstruction quality. We comprehensively evaluate the performance of
our model on a variety of noisy inverse problems, including inpainting,
denoising, and super-resolution. Our thorough evaluation demonstrates that
SaFaRI achieves state-of-the-art performance on both the ImageNet and
FFHQ datasets, outperforming existing zero-shot IR methods in terms of LPIPS
and FID metrics.
Feature-aligned N-BEATS with Sinkhorn divergence
In this study, we propose Feature-aligned N-BEATS as a domain generalization
model for univariate time series forecasting problems. The proposed model is an
extension of the doubly residual stacking architecture of N-BEATS (Oreshkin et
al. [34]) into a representation learning framework. The model is a new
structure that involves marginal feature probability measures (i.e.,
pushforward measures of multiple source domains) induced by the intricate
composition of residual operators of N-BEATS in each stack and aligns them
stack-wise via an entropic regularized Wasserstein distance referred to as the
Sinkhorn divergence (Genevay et al. [14]). The loss function consists of a
typical forecasting loss for multiple source domains and an alignment loss
calculated with the Sinkhorn divergence, which allows the model to learn
invariant features stack-wise across multiple source data sequences while
retaining N-BEATS's interpretable design. We conduct a comprehensive
experimental evaluation of the proposed approach and the results demonstrate
the model's forecasting and generalization capabilities in comparison with
methods based on the original N-BEATS.
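To make the alignment term in this abstract more concrete, the following is a minimal sketch of a Sinkhorn divergence between two small 1-D samples with uniform weights. It assumes a squared-distance cost, takes the entropic OT value to be the transport cost of the regularized plan, and uses the debiased form OT(x, y) - 0.5 OT(x, x) - 0.5 OT(y, y). These choices, the function names, and the parameter values are assumptions for the illustration; the paper's stack-wise feature alignment is not reproduced here.

```python
import math


def entropic_ot_cost(xs, ys, eps=1.0, n_iters=200):
    """Transport cost of the entropic-regularized plan between two
    uniformly weighted 1-D samples, computed with Sinkhorn iterations."""
    n, m = len(xs), len(ys)
    a = [1.0 / n] * n
    b = [1.0 / m] * m
    cost = [[(x - y) ** 2 for y in ys] for x in xs]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # The plan is P[i][j] = u[i] * K[i][j] * v[j]; return its cost <P, C>.
    return sum(
        u[i] * kernel[i][j] * v[j] * cost[i][j]
        for i in range(n)
        for j in range(m)
    )


def sinkhorn_divergence(xs, ys, eps=1.0):
    """Debiased entropic OT: subtract the two self-transport terms."""
    return (
        entropic_ot_cost(xs, ys, eps)
        - 0.5 * entropic_ot_cost(xs, xs, eps)
        - 0.5 * entropic_ot_cost(ys, ys, eps)
    )


source_a = [0.0, 1.0]
source_b = [5.0, 6.0]
print(sinkhorn_divergence(source_a, source_b))
print(sinkhorn_divergence(source_a, source_a))
```

Subtracting the self-transport terms makes the divergence vanish when both samples coincide, which is the property an alignment loss relies on.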
MARA-Net: Single Image Deraining Network with Multi-level connections and Adaptive Regional Attentions
Removing rain streaks from single images is an important problem in various
computer vision tasks because rain streaks can degrade outdoor images and
reduce their visibility. While recent convolutional neural network-based
deraining models have succeeded in capturing rain streaks effectively,
difficulties in recovering the details in rain-free images still remain. In
this paper, we present a multi-level connection and adaptive regional attention
network (MARA-Net) to properly restore the original background textures in
rainy images. The first main idea is a multi-level connection design that
repeatedly connects multi-level features of the encoder network to the decoder
network. Multi-level connections encourage the decoding process to use the
feature information of all levels. Channel attention is considered in
multi-level connections to learn which level of features is important in the
decoding process of the current level. The second main idea is a wide regional
non-local block (WRNL). As rain streaks primarily exhibit a vertical
distribution, we divide the grid of the image into horizontally-wide patches
and apply a non-local operation to each region to explore the rich rain-free
background information. Experimental results on both synthetic and real-world
rainy datasets demonstrate that the proposed model significantly outperforms
existing state-of-the-art models. Furthermore, the results of the joint
deraining and segmentation experiment prove that our model contributes
effectively to other vision tasks.
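The region partitioning mentioned in this abstract can be pictured with a small sketch that splits an image grid into horizontally wide patches. Only the partitioning step is shown; the non-local operation applied within each region is omitted. The function name `split_into_wide_regions`, its parameters, and the region sizes are hypothetical and are not taken from the paper.

```python
def split_into_wide_regions(image, region_h, region_w):
    """Split a 2-D grid (list of rows) into non-overlapping regions of
    size region_h x region_w, scanning top to bottom, left to right."""
    height, width = len(image), len(image[0])
    if height % region_h or width % region_w:
        raise ValueError("image size must be divisible by the region size")
    regions = []
    for top in range(0, height, region_h):
        for left in range(0, width, region_w):
            regions.append(
                [row[left:left + region_w] for row in image[top:top + region_h]]
            )
    return regions


# A 4 x 8 grid split into regions that are wider than they are tall.
image = [[r * 8 + c for c in range(8)] for r in range(4)]
regions = split_into_wide_regions(image, region_h=2, region_w=8)
print(len(regions), len(regions[0]), len(regions[0][0]))
```

Each region spans the full width but only a few rows, so a mostly vertical streak occupies a small part of every region while the rest of the row content remains available as background context.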