343 research outputs found
MixNet: Towards Effective and Efficient UHD Low-Light Image Enhancement
With the continuous advancement of imaging devices, the prevalence of
Ultra-High-Definition (UHD) images is rising. Although many image restoration
methods have achieved promising results, they are not directly applicable to
UHD images on devices with limited computational resources due to the
inherently high computational complexity of UHD images. In this paper, we focus
on the task of low-light image enhancement (LLIE) and propose a novel LLIE
method called MixNet, which is designed explicitly for UHD images. To capture
the long-range dependency of features without introducing excessive
computational complexity, we present the Global Feature Modulation Layer
(GFML). GFML associates features from different views by permuting the feature
maps, enabling efficient modeling of long-range dependency. In addition, we
also design the Local Feature Modulation Layer (LFML) and Feed-forward Layer
(FFL) to capture local features and transform features into a compact
representation. This way, our MixNet achieves effective LLIE with few model
parameters and low computational complexity. We conducted extensive experiments
on both synthetic and real-world datasets, and the comprehensive results
demonstrate that our proposed method surpasses the performance of current
state-of-the-art methods. The code will be available at
https://github.com/zzr-idam/MixNet
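The permutation idea behind GFML can be illustrated with a minimal sketch. It assumes plain linear mixing along each axis; the function name `mix_along_axis`, the weights, and the toy sizes are hypothetical, and this is not the authors' layer.

```python
import numpy as np

def mix_along_axis(x, weight, axis):
    """Apply a linear map across one axis of x by permuting it to the end."""
    moved = np.moveaxis(x, axis, -1)      # permute: chosen axis becomes last
    mixed = moved @ weight                # each position along that axis sees all others
    return np.moveaxis(mixed, -1, axis)   # permute back to the original layout

rng = np.random.default_rng(0)
C, H, W = 4, 8, 6                         # toy sizes, not taken from the paper
x = rng.standard_normal((C, H, W))

# Hypothetical mixing weights, one square matrix per view (channel, height, width).
w_c = rng.standard_normal((C, C))
w_h = rng.standard_normal((H, H))
w_w = rng.standard_normal((W, W))

y = mix_along_axis(x, w_c, axis=0)
y = mix_along_axis(y, w_h, axis=1)
y = mix_along_axis(y, w_w, axis=2)
```

After the three passes, every output element depends on every input element, yet each pass only multiplies by a small square matrix along one axis. That is one plausible reading of how permuting views can model long-range dependency cheaply.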
Arbitrary Order Total Variation for Deformable Image Registration
In this work, we investigate image registration in a variational framework and focus on regularization generality and solver efficiency. We first propose a variational model combining the state-of-the-art sum of absolute differences (SAD) and a new arbitrary order total variation regularization term. The main advantage is that this variational model preserves discontinuities in the resultant deformation while being robust to outlier noise. It is, however, non-trivial to optimize the model due to its non-convexity, non-differentiabilities, and generality in the derivative order. To tackle these, we propose to first apply linearization to the model to formulate a convex objective function and then break down the resultant convex optimization into several point-wise, closed-form subproblems using a fast, over-relaxed alternating direction method of multipliers (ADMM). With this proposed algorithm, we show that solving higher-order variational formulations is similar to solving their lower-order counterparts. Extensive experiments show that our ADMM is significantly more efficient than both the subgradient and primal-dual algorithms, particularly when higher-order derivatives are used, and that our new models outperform state-of-the-art methods based on deep learning and free-form deformation. Our code, implemented in both MATLAB and PyTorch, is publicly available at https://github.com/j-duan/AOTV
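As an illustration of what a point-wise, closed-form ADMM subproblem can look like, the sketch below shows soft-thresholding, the standard closed-form minimizer of tau*|z| + 0.5*(z - v)^2 for each element. Whether the paper's subproblems take exactly this form is an assumption; the function name and the sample values are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Element-wise minimizer of tau*|z| + 0.5*(z - v)**2."""
    # Shrink each value toward zero by tau; values within [-tau, tau] become zero.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

v = np.array([-2.0, -0.3, 0.0, 0.5, 3.0])   # illustrative inputs
z = soft_threshold(v, 1.0)
```

Because the update acts on each element independently, it vectorizes trivially, which is typically why splitting a problem into subproblems of this kind is efficient.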
Fourier-Net+: Leveraging Band-Limited Representation for Efficient 3D Medical Image Registration
U-Net style networks are commonly utilized in unsupervised image registration
to predict dense displacement fields, which for high-resolution volumetric
image data is a resource-intensive and time-consuming task. To tackle this
challenge, we first propose Fourier-Net, which replaces the costly U-Net style
expansive path with a parameter-free model-driven decoder. Instead of directly
predicting a full-resolution displacement field, our Fourier-Net learns a
low-dimensional representation of the displacement field in the band-limited
Fourier domain which our model-driven decoder converts to a full-resolution
displacement field in the spatial domain. Expanding upon Fourier-Net, we then
introduce Fourier-Net+, which additionally takes the band-limited spatial
representation of the images as input and further reduces the number of
convolutional layers in the U-Net style network's contracting path. Finally, to
enhance the registration performance, we propose a cascaded version of
Fourier-Net+. We evaluate our proposed methods on three datasets, on which our
proposed Fourier-Net and its variants achieve comparable results with current
state-of-the-art methods, while exhibiting faster inference speeds, lower
memory footprint, and fewer multiply-add operations. With such small
computational cost, our Fourier-Net+ enables the efficient training of
large-scale 3D registration on low-VRAM GPUs. Our code is publicly available at
https://github.com/xi-jia/Fourier-Net
Comment: Under review. arXiv admin note: text overlap with arXiv:2211.1634
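The parameter-free decoding step described above can be sketched as zero-padding a small block of centred low-frequency coefficients and applying an inverse FFT. This is a 2D toy under stated assumptions, not the authors' implementation; the helper names `crop_center` and `band_limited_decode`, the grid size, and the test field are hypothetical.

```python
import numpy as np

def crop_center(spectrum, small_shape):
    """Keep the central (low-frequency) block of an fftshift-ed spectrum."""
    starts = [F // 2 - f // 2 for F, f in zip(spectrum.shape, small_shape)]
    slices = tuple(slice(s, s + f) for s, f in zip(starts, small_shape))
    return spectrum[slices]

def band_limited_decode(low_freq, full_shape):
    """Zero-pad centred low-frequency coefficients, then inverse FFT to full size."""
    padded = np.zeros(full_shape, dtype=complex)
    starts = [F // 2 - f // 2 for F, f in zip(full_shape, low_freq.shape)]
    slices = tuple(slice(s, s + f) for s, f in zip(starts, low_freq.shape))
    padded[slices] = low_freq
    return np.fft.ifftn(np.fft.ifftshift(padded)).real

# A smooth toy "displacement" component on a 32x32 grid (illustrative only).
n = 32
yy, xx = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
field = np.sin(2 * np.pi * xx / n) + np.cos(2 * np.pi * 2 * yy / n)

# Stand-in for the network output: an 8x8 block of low-frequency coefficients.
coeffs = crop_center(np.fft.fftshift(np.fft.fftn(field)), (8, 8))
recon = band_limited_decode(coeffs, field.shape)
```

Here 64 coefficients recover a 1024-value field exactly because the toy field is band-limited. For real displacement fields the low-frequency block would only give an approximation.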
GWLZ: A Group-wise Learning-based Lossy Compression Framework for Scientific Data
The rapid expansion of computational capabilities and the ever-growing scale
of modern HPC systems present formidable challenges in managing exascale
scientific data. Faced with such vast datasets, traditional lossless
compression techniques prove insufficient in reducing data size to a manageable
level while preserving all information intact. In response, researchers have
turned to error-bounded lossy compression methods, which offer a balance
between data size reduction and information retention. However, despite their
utility, these compressors employing conventional techniques struggle with
limited reconstruction quality. To address this issue, we draw inspiration from
recent advancements in deep learning and propose GWLZ, a novel group-wise
learning-based lossy compression framework with multiple lightweight learnable
enhancer models. Leveraging a group of neural networks, GWLZ significantly
enhances the decompressed data reconstruction quality with negligible impact on
the compression efficiency. Experimental results on different fields from the
Nyx dataset demonstrate remarkable improvements by GWLZ, achieving up to 20%
quality enhancements with negligible overhead as low as 0.0003x
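For readers unfamiliar with the term, "error-bounded" means every reconstructed value stays within a user-chosen tolerance of the original. The sketch below shows one generic way to get that guarantee, uniform scalar quantization. It is an assumption-laden illustration, not the GWLZ pipeline or any specific compressor; the function names, the random data, and the bound are hypothetical.

```python
import numpy as np

def quantize(data, error_bound):
    """Map each value to the index of its nearest bin of width 2*error_bound."""
    return np.round(data / (2.0 * error_bound)).astype(np.int64)

def dequantize(codes, error_bound):
    """Reconstruct each value as the centre of its bin."""
    return codes * (2.0 * error_bound)

rng = np.random.default_rng(0)
data = rng.standard_normal(1000)      # stand-in for one scientific data field
eb = 1e-2                             # hypothetical absolute error bound

codes = quantize(data, eb)            # integer codes are what an encoder would store
recon = dequantize(codes, eb)
max_err = np.max(np.abs(data - recon))
```

Rounding to the nearest bin centre keeps the point-wise error at or below `eb`, up to floating-point rounding. A learned enhancer of the kind the abstract describes would then operate on `recon` to improve its quality.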
Decoder-Only Image Registration
In unsupervised medical image registration, encoder-decoder architectures are widely used to predict dense, full-resolution displacement fields from paired images. Despite their popularity, we question the necessity of making both the encoder and decoder learnable. To address this, we propose LessNet, a simplified network architecture with only a learnable decoder, while completely omitting a learnable encoder. Instead, LessNet replaces the encoder with simple, handcrafted features, eliminating the need to optimize encoder parameters. This results in a compact, efficient, and decoder-only architecture for 3D medical image registration. We evaluate our decoder-only LessNet on five registration tasks: 1) inter-subject brain registration using the OASIS-1 dataset, 2) atlas-based brain registration using the IXI dataset, 3) cardiac ES-ED registration using the ACDC dataset, 4) inter-subject abdominal MR registration using the CHAOS dataset, and 5) multi-study, multi-site brain registration using images from 13 public datasets. Our results demonstrate that LessNet can effectively and efficiently learn both dense displacement and diffeomorphic deformation fields. Furthermore, our decoder-only LessNet can achieve comparable registration performance to benchmarking methods such as VoxelMorph and TransMorph, while requiring significantly fewer computational resources. Our code and pre-trained models are available at https://github.com/xi-jia/LessNet
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective
In recent years, the thriving development of research related to egocentric
videos has provided a unique perspective for the study of conversational
interactions, where both visual and audio signals play a crucial role. While
most prior work focuses on learning about behaviors that directly involve the
camera wearer, we introduce the Ego-Exocentric Conversational Graph Prediction
problem, marking the first attempt to infer exocentric conversational
interactions from egocentric videos. We propose a unified multi-modal framework
-- Audio-Visual Conversational Attention (AV-CONV), for the joint prediction of
conversation behaviors -- speaking and listening -- for both the camera wearer
and all other social partners present in the egocentric video.
Specifically, we adopt the self-attention mechanism to model the
representations across time, across subjects, and across modalities. To
validate our method, we conduct experiments on a challenging egocentric video
dataset that includes multi-speaker and multi-conversation scenarios. Our
results demonstrate the superior performance of our method compared to a series
of baselines. We also present detailed ablation studies to assess the
contribution of each component in our model. Check our project page at
https://vjwq.github.io/AV-CONV/
Genome-wide identification, characterization, evolution and expression analysis of the DIR gene family in potato (Solanum tuberosum)
The dirigent (DIR) gene family is a key player in environmental stress response and has been identified in many vascular plant species. However, there are few studies on the StDIR genes in potato. In this study, we performed a genome-wide analysis and identified 31 StDIR genes in potato. The StDIR genes were distributed on 11 of the 12 potato chromosomes: the third chromosome had no family member, while the tenth chromosome had the most, with 11 members. Twenty-two of the 31 StDIRs had a classical DIR gene structure, with one exon and no intron. In 27 StDIRs, the conserved DIR domain accounts for most of the protein. Analysis of the StDIR gene structure detected ten different motifs. The StDIR genes were divided into three groups according to their phylogenetic relationships, and 22 duplicate genes were identified. In addition, four kinds of cis-acting elements were detected in all 31 StDIR promoter regions, most of which were associated with biotic and abiotic stress. The findings demonstrated that the StDIR genes exhibited specific responses to cold stress, salt stress, ABA, and drought stress. This study provides new candidate genes for improving potato’s resistance to stress.
General synthesis of 2D rare-earth oxide single crystals with tailorable facets
Two-dimensional (2D) rare-earth oxides (REOs) are a large family of materials with various intriguing applications, and precise facet control is essential for investigating new properties in the 2D limit. However, a bottleneck remains in obtaining their 2D single crystals with specific facets because of the intrinsic non-layered structure and disparate thermodynamic stability of different facets. Herein, for the first time, we achieve the synthesis of a wide variety of high-quality 2D REO single crystals with tailorable facets by designing a hard-soft-acid-base couple that controls the 2D nucleation of the predetermined facets and adjusts the growth mode and direction of the crystals. The facet-related magnetic properties of the 2D REO single crystals were also revealed. Our approach provides a foundation for further exploring other facet-dependent properties and various applications of 2D REOs, as well as inspiration for the precise growth of other non-layered 2D materials.
