Reconstruction of sparse wavelet signals from partial Fourier measurements
In this paper, we show that high-dimensional sparse wavelet signals of finite
levels can be reconstructed from their partial Fourier measurements on a
deterministic sampling set whose cardinality is about a multiple of the signal
sparsity.
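As a rough illustration of this kind of recovery problem (a sketch only; the paper's actual reconstruction procedure and deterministic sampling set may differ), recovery of wavelet-sparse signals from partial Fourier data is commonly posed as an $\ell_1$-minimization over wavelet coefficients:
$$
\min_{c}\ \|c\|_{1} \quad \text{subject to} \quad P_{\Omega}\,\mathcal{F}\,W^{*}c \;=\; P_{\Omega}\,\mathcal{F}f,
$$
where $W^{*}$ synthesizes the signal $f$ from its wavelet coefficients $c$, $\mathcal{F}$ is the Fourier transform, and $P_{\Omega}$ restricts measurements to the sampling set $\Omega$ whose cardinality scales with the sparsity of $c$.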
Unsupervised Monocular Depth Estimation in Highly Complex Environments
With the development of computational intelligence algorithms, unsupervised
monocular depth and pose estimation frameworks driven by warped photometric
consistency have shown great performance in daytime scenarios. However, in some
challenging environments, such as night and rainy-night scenes, the essential
photometric consistency hypothesis is untenable because of complex lighting and
reflections, so the above unsupervised framework cannot be directly applied to
these complex scenarios. In this paper, we
investigate the problem of unsupervised monocular depth estimation in highly
complex scenarios and address this challenging problem by adopting an image
transfer-based domain adaptation framework. We adapt the depth model trained on
daytime scenarios to night-time scenarios, and constraints on both the feature
space and the output space encourage the framework to learn the key features
for depth decoding. Meanwhile, to further tackle the effect of unstable image
transfer quality on domain adaptation, we propose an image adaptation approach
that evaluates the quality of transferred images and re-weights the
corresponding losses, so as to improve the performance of the adapted depth
model. Extensive experiments show the effectiveness of the
proposed unsupervised framework in estimating the dense depth map from highly
complex images.
Comment: Accepted by IEEE Transactions on Emerging Topics in Computational
Intelligence
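As a minimal sketch of such quality-aware loss re-weighting (the scoring function and weighting scheme below are assumptions for illustration, not the paper's actual formulation), each transferred image can be scored and its adaptation loss scaled accordingly:

    import torch

    def transfer_quality(day_img, transferred_img, eps=1e-6):
        # Hypothetical quality score: normalized cross-correlation between the
        # daytime image and its day-to-night transfer, clamped to [0, 1].
        d = (day_img - day_img.mean()) / (day_img.std() + eps)
        t = (transferred_img - transferred_img.mean()) / (transferred_img.std() + eps)
        return (d * t).mean().clamp(0.0, 1.0)

    def reweighted_adaptation_loss(per_image_losses, day_imgs, transferred_imgs):
        # Down-weight losses computed on poorly transferred images so that
        # unstable transfer quality does not dominate the adaptation gradient.
        weights = torch.stack([transfer_quality(d, t)
                               for d, t in zip(day_imgs, transferred_imgs)])
        return (weights * per_image_losses).sum() / (weights.sum() + 1e-6)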
Augment Features Beyond Color for Domain Generalized Segmentation
Domain generalized semantic segmentation (DGSS) is an essential but highly
challenging task, in which the model is trained only on source data and no
target data is available. Previous DGSS methods can be partitioned into
augmentation-based and normalization-based ones. The former either introduces
extra biased data or only conducts channel-wise adjustments for data
augmentation, and the latter may discard beneficial visual information, both of
which lead to limited performance in DGSS. In contrast, our method performs
inter-channel transformation and meanwhile evades domain-specific biases, thus
diversifying data and enhancing model generalization performance. Specifically,
our method consists of two modules: random image color augmentation (RICA) and
random feature distribution augmentation (RFDA). RICA converts images from RGB
to the CIELAB color model and randomizes color maps in a perception-based way
for image enhancement purposes. We then extend this augmentation beyond color
to the feature space using a CycleGAN-based generative network, which
complements RICA and further boosts generalization capability. We conduct
extensive experiments, and the generalization results from the synthetic GTAV
and SYNTHIA to the real Cityscapes, BDDS, and Mapillary datasets show that our
method achieves state-of-the-art performance in DGSS.
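As a rough sketch of a perception-based color randomization in CIELAB space (an illustration only; the per-channel gains and shifts below are assumptions, not RICA's actual randomization scheme), images can be perturbed after converting out of RGB:

    import numpy as np
    from skimage import color

    def lab_color_augment(rgb, rng=None):
        # Work in CIELAB so perturbations act on perceptual lightness (L) and
        # opponent-color channels (a, b) rather than on raw RGB intensities.
        rng = np.random.default_rng() if rng is None else rng
        lab = color.rgb2lab(rgb)                   # rgb: float array in [0, 1], shape (H, W, 3)
        gain = rng.uniform(0.9, 1.1, size=3)       # hypothetical per-channel gains
        shift = rng.uniform(-10.0, 10.0, size=3)   # hypothetical per-channel shifts
        lab = lab * gain + shift
        lab[..., 0] = np.clip(lab[..., 0], 0.0, 100.0)      # keep L in its valid range
        lab[..., 1:] = np.clip(lab[..., 1:], -128.0, 127.0)
        return np.clip(color.lab2rgb(lab), 0.0, 1.0)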
Molecular Joint Representation Learning via Multi-modal Information
In recent years, artificial intelligence has played an important role in
accelerating the whole process of drug discovery. Various molecular
representation schemes of different modalities (e.g. textual sequences or
graphs) have been developed. By digitally encoding them, different chemical
information can be learned through corresponding network structures. Molecular
graphs and the Simplified Molecular Input Line Entry System (SMILES) are
currently popular means for molecular representation learning. Previous works
have attempted to combine both of them to address the information loss of
single-modal representations on various tasks. To further fuse such multi-modal
information, the correspondence between chemical features learned from
different representations should be considered. To realize this, we propose
a novel framework of molecular joint representation learning via Multi-Modal
information of SMILES and molecular Graphs, called MMSG. We improve the
self-attention mechanism by introducing a bond-level graph representation as an
attention bias in the Transformer to reinforce feature correspondence between
multi-modal information. We further propose a Bidirectional Message
Communication Graph Neural Network (BMC GNN) to strengthen the information flow
aggregated from graphs for further combination. Numerous experiments on public
property prediction datasets have demonstrated the effectiveness of our model
- …
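As a minimal sketch of injecting graph-derived pairwise information into self-attention (assuming a simple additive bias on the attention logits; MMSG's exact construction of the bond-level bias may differ):

    import torch
    import torch.nn.functional as F

    def biased_self_attention(q, k, v, bond_bias):
        # q, k, v: (batch, heads, seq, dim); bond_bias: (batch, heads, seq, seq).
        # bond_bias is a hypothetical pairwise bias derived from bond-level
        # graph features, added to the attention logits before the softmax.
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        attn = F.softmax(scores + bond_bias, dim=-1)
        return attn @ v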