343 research outputs found

    MixNet: Towards Effective and Efficient UHD Low-Light Image Enhancement

    With the continuous advancement of imaging devices, the prevalence of Ultra-High-Definition (UHD) images is rising. Although many image restoration methods have achieved promising results, they are not directly applicable to UHD images on devices with limited computational resources due to the inherently high computational complexity of UHD images. In this paper, we focus on the task of low-light image enhancement (LLIE) and propose a novel LLIE method called MixNet, which is designed explicitly for UHD images. To capture the long-range dependency of features without introducing excessive computational complexity, we present the Global Feature Modulation Layer (GFML). GFML associates features from different views by permuting the feature maps, enabling efficient modeling of long-range dependency. We also design the Local Feature Modulation Layer (LFML) and Feed-forward Layer (FFL) to capture local features and transform features into a compact representation. This way, our MixNet achieves effective LLIE with few model parameters and low computational complexity. We conducted extensive experiments on both synthetic and real-world datasets, and the comprehensive results demonstrate that our proposed method surpasses the performance of current state-of-the-art methods. The code will be available at https://github.com/zzr-idam/MixNet.

    Arbitrary Order Total Variation for Deformable Image Registration

    In this work, we investigate image registration in a variational framework and focus on regularization generality and solver efficiency. We first propose a variational model combining the state-of-the-art sum of absolute differences (SAD) and a new arbitrary order total variation regularization term. The main advantage is that this variational model preserves discontinuities in the resultant deformation while being robust to outlier noise. It is, however, non-trivial to optimize the model due to its non-convexity, non-differentiability, and generality in the derivative order. To tackle these, we propose to first apply linearization to the model to formulate a convex objective function and then break down the resultant convex optimization into several point-wise, closed-form subproblems using a fast, over-relaxed alternating direction method of multipliers (ADMM). With this proposed algorithm, we show that solving higher-order variational formulations is similar to solving their lower-order counterparts. Extensive experiments show that our ADMM is significantly more efficient than both the subgradient and primal-dual algorithms, particularly when higher-order derivatives are used, and that our new models outperform state-of-the-art methods based on deep learning and free-form deformation. Our code, implemented in both Matlab and PyTorch, is publicly available at https://github.com/j-duan/AOTV
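As a rough illustration of what "arbitrary order" means for a total variation term, the sketch below computes the sum of absolute n-th order finite differences of a 1D signal. It is a toy under stated assumptions, not the regularizer or the ADMM solver from the paper: the function names `finite_difference` and `total_variation` are hypothetical, and the actual model acts on 2D/3D deformation fields.

```python
def finite_difference(signal, order):
    """Apply the forward difference operator `order` times to a 1D signal."""
    diff = list(signal)
    for _ in range(order):
        diff = [b - a for a, b in zip(diff, diff[1:])]
    return diff


def total_variation(signal, order=1):
    """Sum of absolute values of the `order`-th finite difference."""
    return sum(abs(d) for d in finite_difference(signal, order))


# Illustrative inputs (assumed, not taken from the paper).
step = [0, 0, 0, 1, 1, 1]   # one sharp discontinuity
ramp = [0, 1, 2, 3, 4]      # smooth linear change

tv1_step = total_variation(step, order=1)
tv2_step = total_variation(step, order=2)
tv1_ramp = total_variation(ramp, order=1)
tv2_ramp = total_variation(ramp, order=2)
```

In this toy setting the first-order term charges the linear ramp as much as its total rise, while the second-order term leaves it unpenalized; both keep the cost of a sharp jump bounded, which hints at why such terms can preserve discontinuities.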

    Fourier-Net+: Leveraging Band-Limited Representation for Efficient 3D Medical Image Registration

    U-Net style networks are commonly utilized in unsupervised image registration to predict dense displacement fields, which for high-resolution volumetric image data is a resource-intensive and time-consuming task. To tackle this challenge, we first propose Fourier-Net, which replaces the costly U-Net style expansive path with a parameter-free model-driven decoder. Instead of directly predicting a full-resolution displacement field, our Fourier-Net learns a low-dimensional representation of the displacement field in the band-limited Fourier domain which our model-driven decoder converts to a full-resolution displacement field in the spatial domain. Expanding upon Fourier-Net, we then introduce Fourier-Net+, which additionally takes the band-limited spatial representation of the images as input and further reduces the number of convolutional layers in the U-Net style network's contracting path. Finally, to enhance the registration performance, we propose a cascaded version of Fourier-Net+. We evaluate our proposed methods on three datasets, on which our proposed Fourier-Net and its variants achieve comparable results with current state-of-the-art methods, while exhibiting faster inference speeds, lower memory footprint, and fewer multiply-add operations. With such small computational cost, our Fourier-Net+ enables the efficient training of large-scale 3D registration on low-VRAM GPUs. Our code is publicly available at https://github.com/xi-jia/Fourier-Net. Comment: Under review. arXiv admin note: text overlap with arXiv:2211.1634
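The idea of recovering a full-resolution field from a band-limited Fourier representation can be sketched in one dimension. The abstract does not spell out the decoder, so the zero-padding plus inverse DFT used here is an assumption, and the helper names (`dft`, `idft`, `encode_band_limited`, `decode_band_limited`) are hypothetical. No learning is involved; the sketch only shows that a smooth field is fully described by a few low-frequency coefficients.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]


def idft(coeffs):
    """Naive inverse discrete Fourier transform."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]


def encode_band_limited(field, band):
    """Keep the DC term and the `band` lowest positive/negative frequencies."""
    coeffs = dft(field)
    n = len(coeffs)
    return coeffs[:band + 1], coeffs[n - band:]


def decode_band_limited(low_pos, low_neg, full_length):
    """Zero-pad the stored coefficients to full length, then invert."""
    padding = [0j] * (full_length - len(low_pos) - len(low_neg))
    spectrum = list(low_pos) + padding + list(low_neg)
    return [value.real for value in idft(spectrum)]


# A smooth 16-sample "displacement field" with energy only at frequencies 1 and 2.
field = [math.sin(2 * math.pi * t / 16) + 0.5 * math.cos(4 * math.pi * t / 16)
         for t in range(16)]
low_pos, low_neg = encode_band_limited(field, band=2)
restored = decode_band_limited(low_pos, low_neg, len(field))
```

Here five complex coefficients stand in for sixteen samples. The decoding step has no trainable parameters, which is the property the abstract attributes to its model-driven decoder.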

    GWLZ: A Group-wise Learning-based Lossy Compression Framework for Scientific Data

    The rapid expansion of computational capabilities and the ever-growing scale of modern HPC systems present formidable challenges in managing exascale scientific data. Faced with such vast datasets, traditional lossless compression techniques prove insufficient in reducing data size to a manageable level while preserving all information intact. In response, researchers have turned to error-bounded lossy compression methods, which offer a balance between data size reduction and information retention. However, despite their utility, these compressors employing conventional techniques struggle with limited reconstruction quality. To address this issue, we draw inspiration from recent advancements in deep learning and propose GWLZ, a novel group-wise learning-based lossy compression framework with multiple lightweight learnable enhancer models. Leveraging a group of neural networks, GWLZ significantly enhances the decompressed data reconstruction quality with negligible impact on the compression efficiency. Experimental results on different fields from the Nyx dataset demonstrate remarkable improvements by GWLZ, achieving up to 20% quality enhancements with negligible overhead as low as 0.0003x
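For readers unfamiliar with the term, "error-bounded" lossy compression guarantees that every reconstructed value stays within a user-chosen tolerance of the original. The sketch below shows one generic way to obtain that guarantee, uniform quantization with bins of width twice the bound. It is an assumption for illustration only, not GWLZ and not the compressor GWLZ builds on; `quantize` and `dequantize` are hypothetical names, and no learned enhancer is included.

```python
def quantize(values, error_bound):
    """Map each value to the index of a bin of width 2 * error_bound."""
    step = 2.0 * error_bound
    return [round(v / step) for v in values]


def dequantize(codes, error_bound):
    """Reconstruct each value as the center of its bin."""
    step = 2.0 * error_bound
    return [c * step for c in codes]


# Illustrative data and tolerance (assumed, not from the paper).
error_bound = 0.01
samples = [0.0, 0.033, 0.12, -3.7, 2.5001, 100.3]

codes = quantize(samples, error_bound)
restored = dequantize(codes, error_bound)
max_error = max(abs(a - b) for a, b in zip(samples, restored))
```

The integer codes are what a real compressor would go on to entropy-code. A quality enhancer of the kind the abstract describes would operate on the reconstructed values afterwards.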

    Decoder-Only Image Registration

    In unsupervised medical image registration, encoder-decoder architectures are widely used to predict dense, full-resolution displacement fields from paired images. Despite their popularity, we question the necessity of making both the encoder and decoder learnable. To address this, we propose LessNet, a simplified network architecture with only a learnable decoder and no learnable encoder. Instead, LessNet replaces the encoder with simple, handcrafted features, eliminating the need to optimize encoder parameters. This results in a compact, efficient, decoder-only architecture for 3D medical image registration. We evaluate our decoder-only LessNet on five registration tasks: 1) inter-subject brain registration using the OASIS-1 dataset, 2) atlas-based brain registration using the IXI dataset, 3) cardiac ES-ED registration using the ACDC dataset, 4) inter-subject abdominal MR registration using the CHAOS dataset, and 5) multi-study, multi-site brain registration using images from 13 public datasets. Our results demonstrate that LessNet can effectively and efficiently learn both dense displacement and diffeomorphic deformation fields. Furthermore, our decoder-only LessNet can achieve comparable registration performance to benchmarking methods such as VoxelMorph and TransMorph, while requiring significantly fewer computational resources. Our code and pre-trained models are available at https://github.com/xi-jia/LessNet

    The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective

    In recent years, the thriving development of research related to egocentric videos has provided a unique perspective for the study of conversational interactions, where both visual and audio signals play a crucial role. While most prior work focuses on learning about behaviors that directly involve the camera wearer, we introduce the Ego-Exocentric Conversational Graph Prediction problem, marking the first attempt to infer exocentric conversational interactions from egocentric videos. We propose a unified multi-modal framework, Audio-Visual Conversational Attention (AV-CONV), for the joint prediction of conversation behaviors (speaking and listening) for both the camera wearer and all other social partners present in the egocentric video. Specifically, we adopt the self-attention mechanism to model the representations across time, across subjects, and across modalities. To validate our method, we conduct experiments on a challenging egocentric video dataset that includes multi-speaker and multi-conversation scenarios. Our results demonstrate the superior performance of our method compared to a series of baselines. We also present detailed ablation studies to assess the contribution of each component in our model. Check our project page at https://vjwq.github.io/AV-CONV/

    Genome-wide identification, characterization, evolution and expression analysis of the DIR gene family in potato (Solanum tuberosum)

    The dirigent (DIR) gene is a key player in environmental stress response and has been identified in many vascular plant species. However, there are few studies on the StDIR gene in potato. In this study, we used genome-wide identification to identify 31 StDIR genes in potato. The StDIR genes were distributed on 11 of the 12 potato chromosomes: the third chromosome had no family member, while the tenth chromosome had the most, with 11 members. Of the 31 StDIRs, 22 had a classical DIR gene structure, with one exon and no intron. The conserved DIR domain accounts for most of the proteins in the 27 StDIRs. The structure of the StDIR genes was analyzed and ten different motifs were detected. The StDIR genes were divided into three groups according to their phylogenetic relationships, and 22 duplicate genes were identified. In addition, four kinds of cis-acting elements were detected in all 31 StDIR promoter regions, most of which were associated with biotic and abiotic stress. The findings demonstrated that the StDIR genes exhibited specific responses to cold stress, salt stress, ABA, and drought stress. This study provides new candidate genes for improving potato's resistance to stress

    General synthesis of 2D rare-earth oxide single crystals with tailorable facets

    Two-dimensional (2D) rare-earth oxides (REOs) are a large family of materials with various intriguing applications and precise facet control is essential for investigating new properties in the 2D limit. However, a bottleneck remains with regard to obtaining their 2D single crystals with specific facets because of the intrinsic non-layered structure and disparate thermodynamic stability of different facets. Herein, for the first time, we achieve the synthesis of a wide variety of high-quality 2D REO single crystals with tailorable facets via designing a hard-soft-acid-base couple for controlling the 2D nucleation of the predetermined facets and adjusting the growth mode and direction of crystals. Also, the facet-related magnetic properties of 2D REO single crystals were revealed. Our approach provides a foundation for further exploring other facet-dependent properties and various applications of 2D REO, as well as inspiration for the precise growth of other non-layered 2D materials