MMF3: Neural Code Summarization Based on Multi-Modal Fine-Grained Feature Fusion
Background: Code summarization automatically generates natural language
descriptions of the input code. A comprehensive code representation is
critical to the code summarization task. However, most
existing approaches typically use coarse-grained fusion methods to integrate
multi-modal features. They generally represent different modalities of a piece
of code, such as an Abstract Syntax Tree (AST) and a token sequence, as two
embeddings and then fuse the two at the AST/code level. Such a coarse
integration makes it difficult to learn the correlations between fine-grained
code elements across modalities effectively. Aims: This study intends to
improve the model's prediction performance for high-quality code summarization
by accurately aligning and fully fusing semantic and syntactic structure
information of source code at node/token levels. Method: This paper proposes a
Multi-Modal Fine-grained Feature Fusion approach (MMF3) for neural code
summarization. We introduce a novel fine-grained fusion method, which allows
fine-grained fusion of multiple code modalities at the token and node levels.
Specifically, we use this method to fuse information from both token and AST
modalities and apply the fused features to code summarization. Results: We
conduct experiments on one Java dataset and one Python dataset, and evaluate the
generated summaries using four metrics. The results show that 1) our
model outperforms the current state-of-the-art models, and 2) the ablation
experiments show that our proposed fine-grained fusion method can effectively
improve the accuracy of generated summaries. Conclusion: MMF3 can mine the
relationships between cross-modal elements and perform accurate fine-grained
element-level alignment fusion accordingly. As a result, more clues can be
provided to improve the accuracy of the generated code summaries.
Comment: 12 pages, 5 figures
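The token-level and node-level fusion described in the abstract above can be illustrated with a small sketch. The abstract does not give MMF3's actual architecture, so this uses plain scaled dot-product cross-attention as a stand-in: each token embedding attends over all AST node embeddings and the attended summary is concatenated to the token. The function names and the toy vectors are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_tokens_with_nodes(token_vecs, node_vecs):
    """Hypothetical fine-grained fusion: every token attends over every AST node.

    Returns one fused vector per token: the token embedding followed by an
    attention-weighted average of the node embeddings.
    """
    dim = len(node_vecs[0])
    scale = math.sqrt(dim)
    fused = []
    for token in token_vecs:
        # Attention weights of this token over all nodes.
        weights = softmax([dot(token, node) / scale for node in node_vecs])
        # Weighted average of the node embeddings.
        context = [
            sum(w * node[i] for w, node in zip(weights, node_vecs))
            for i in range(dim)
        ]
        fused.append(list(token) + context)
    return fused


tokens = [[1.0, 0.0], [0.0, 1.0]]            # toy token embeddings
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy AST node embeddings
fused = fuse_tokens_with_nodes(tokens, nodes)
```

In contrast, a coarse-grained scheme would first pool all tokens and all nodes into one vector each and combine only those two vectors, so no per-token weighting over nodes would be available.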
DPFNet: A Dual-branch Dilated Network with Phase-aware Fourier Convolution for Low-light Image Enhancement
Low-light image enhancement is a classical computer vision problem aiming to
recover normal-exposure images from low-light images. However, convolutional
neural networks commonly used in this field are good at sampling low-frequency
local structural features in the spatial domain, which leads to unclear texture
details of the reconstructed images. To alleviate this problem, we propose a
novel module using the Fourier coefficients, which can recover high-quality
texture details under the constraint of semantics in the frequency phase and
supplement the spatial domain. In addition, we design a simple and efficient
module for the image spatial domain using dilated convolutions with different
receptive fields to alleviate the loss of detail caused by frequent
downsampling. We integrate the above parts into an end-to-end dual branch
network and design a novel loss committee and an adaptive fusion module to
guide the network to flexibly combine spatial and frequency domain features to
generate more pleasing visual effects. Finally, we evaluate the proposed
network on public benchmarks. Extensive experimental results show that our
method outperforms many existing state-of-the-art ones, showing outstanding
performance and potential.
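To make the role of Fourier amplitude and phase concrete, here is a one-dimensional toy sketch. It is not the DPFNet module, whose design the abstract does not specify. It only shows that a signal's Fourier coefficients can be split into amplitude and phase, that the amplitude can be rescaled while the phase is kept fixed, and that the signal can then be rebuilt. All function names are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(coeffs):
    """Inverse DFT, returning the real part of each reconstructed sample."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def split_amplitude_phase(coeffs):
    """Polar decomposition of each Fourier coefficient."""
    return [abs(c) for c in coeffs], [cmath.phase(c) for c in coeffs]


def merge_amplitude_phase(amplitudes, phases):
    """Rebuild complex coefficients from amplitude and phase."""
    return [cmath.rect(a, p) for a, p in zip(amplitudes, phases)]


def rescale_amplitude_keep_phase(signal, gain):
    """Scale every amplitude by `gain` while leaving every phase untouched."""
    amplitudes, phases = split_amplitude_phase(dft(signal))
    scaled = [gain * a for a in amplitudes]
    return idft(merge_amplitude_phase(scaled, phases))


dark_row = [0.1, 0.2, 0.05, 0.3]  # toy "low-light" pixel row
bright_row = rescale_amplitude_keep_phase(dark_row, 2.0)
```

With a uniform gain the result is simply the input scaled by the same factor, which shows that the shape of the signal is carried by the phase while the amplitude controls its overall intensity.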
E-CLIP: Towards Label-efficient Event-based Open-world Understanding by CLIP
Contrastive Language-Image Pre-training (CLIP) has recently shown promising
open-world and few-shot performance on 2D image-based recognition tasks.
However, the transferability of CLIP to novel event camera data
remains under-explored. In particular, due to the modality gap with the
image-text data and the lack of large-scale datasets, achieving this goal is
non-trivial and thus requires significant research innovation. In this paper,
we propose E-CLIP, a novel and effective framework that unleashes the potential
of CLIP for event-based recognition to compensate for the lack of large-scale
event-based datasets. Our work addresses two crucial challenges: 1) how to
generalize CLIP's visual encoder to event data while fully leveraging events'
unique properties, e.g., sparsity and high temporal resolution; 2) how to
effectively align the multi-modal embeddings, i.e., image, text, and events. To
this end, we first introduce a novel event encoder that subtly models the
temporal information from events and meanwhile generates event prompts to
promote the modality bridging. We then design a text encoder that generates
content prompts and utilizes hybrid text prompts to enhance E-CLIP's
generalization ability across diverse datasets. With the proposed event
encoder, text encoder, and original image encoder, a novel Hierarchical Triple
Contrastive Alignment (HTCA) module is introduced to jointly optimize the
correlation and enable efficient knowledge transfer among the three modalities.
We conduct extensive experiments on two recognition benchmarks, and the results
demonstrate that E-CLIP outperforms existing methods by large margins of
+3.94% and +4.62% on the N-Caltech dataset in the fine-tuning and few-shot
settings, respectively. Moreover, E-CLIP can be flexibly extended to the
event retrieval task using either text or image queries, showing plausible
performance.
Comment: Journal version with supplementary material
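The idea of aligning event, image, and text embeddings with a contrastive objective can be sketched as follows. The abstract does not describe the internals of the HTCA module, so this is not that module: it is a plain symmetric InfoNCE loss averaged over the three modality pairs, with no hierarchy. The function names, the temperature value, and the toy embeddings are assumptions.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(anchors, targets, temperature=0.07):
    """Mean cross-entropy of matching anchors[i] to targets[i] among all targets."""
    a = [normalize(v) for v in anchors]
    t = [normalize(v) for v in targets]
    loss = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity to every target, sharpened by the temperature.
        logits = [sum(x * y for x, y in zip(anchor, tgt)) / temperature for tgt in t]
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        loss += log_denominator - logits[i]
    return loss / len(a)


def triple_alignment_loss(event, image, text, temperature=0.07):
    """Hypothetical three-way alignment: symmetric InfoNCE over all modality pairs."""
    pairs = [(event, image), (event, text), (image, text)]
    total = sum(
        info_nce(x, y, temperature) + info_nce(y, x, temperature) for x, y in pairs
    )
    return total / (2 * len(pairs))


aligned = [[1.0, 0.0], [0.0, 1.0]]   # sample i matches sample i
shuffled = [[0.0, 1.0], [1.0, 0.0]]  # matches deliberately swapped
good = triple_alignment_loss(aligned, aligned, aligned)
bad = triple_alignment_loss(aligned, shuffled, aligned)
```

The loss is small when the i-th event, image, and text embeddings point in the same direction and grows when one modality is mismatched, which is the behavior a joint alignment objective needs.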
A PML method for signal-propagation problems in axon
This work is focused on the modelling of signal propagation in myelinated
axons to characterize the functions of the myelin sheath in the neural
structure. Based on reasonable assumptions on the medium properties, we derive
a two-dimensional neural-signaling model in cylindrical coordinates from the
time-harmonic Maxwell's equations. The well-posedness of the model is established
under Dirichlet boundary conditions at the two ends of the neural structure and
the radiative condition in the radial direction of the structure. Using the
perfectly matched layer (PML) method, we truncate the unbounded background
medium and propose an approximate problem on the truncated domain. The
well-posedness of the PML problem and the exponential convergence of the
approximate solution to the exact solution are established. Numerical
experiments based on finite element discretization are presented to demonstrate
the theoretical results and the efficiency of our method in simulating
signal propagation in axons.
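The exponential convergence mentioned above comes from the way a perfectly matched layer damps outgoing waves. The sketch below shows this in one dimension. It does not reproduce the paper's cylindrical model: it assumes a plane wave exp(ikr) and a constant absorption parameter, and the names `r_pml` and `sigma` are hypothetical. Inside the layer the coordinate is stretched into the complex plane, so the wave decays exponentially with depth.

```python
import cmath
import math


def stretched_coordinate(r, r_pml, sigma):
    """Complex-stretch r inside the layer r > r_pml (constant sigma is an assumption)."""
    if r <= r_pml:
        return complex(r, 0.0)
    return complex(r, sigma * (r - r_pml))


def outgoing_wave(r, k, r_pml, sigma):
    """Plane wave exp(i k r) evaluated on the stretched coordinate."""
    return cmath.exp(1j * k * stretched_coordinate(r, r_pml, sigma))


k, r_pml, sigma = 2.0, 1.0, 3.0
inside = abs(outgoing_wave(0.5, k, r_pml, sigma))  # physical region, no damping
in_layer = abs(outgoing_wave(2.0, k, r_pml, sigma))  # one unit into the layer
```

In the physical region the modulus stays at 1, while at depth d inside the layer it equals exp(-k * sigma * d). This is why truncating the layer at a finite thickness introduces only an exponentially small error in this simplified setting.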