DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and Authentication
Finger vein authentication, recognized for its high security and specificity,
has become a focal point in biometric research. Traditional methods
predominantly concentrate on vein feature extraction for discriminative
modeling, with a limited exploration of generative approaches. Suffering from
verification failure, existing methods often fail to obtain authentic vein
patterns by segmentation. To fill this gap, we introduce DiffVein, a unified
diffusion model-based framework which simultaneously addresses vein
segmentation and authentication tasks. DiffVein is composed of two dedicated
branches: one for segmentation and the other for denoising. For better feature
interaction between these two branches, we introduce two specialized modules to
improve their collective performance. The first, a mask condition module,
incorporates the semantic information of vein patterns from the segmentation
branch into the denoising process. We also propose a Semantic
Difference Transformer (SD-Former), which employs Fourier-space self-attention
and cross-attention modules to extract category embedding before feeding it to
the segmentation task. In this way, our framework allows for a dynamic
interplay between diffusion and segmentation embeddings, so that the vein
segmentation and authentication tasks can inform and enhance each other during
joint training. To further optimize our model, we introduce a Fourier-space
Structural Similarity (FourierSIM) loss function, which is tailored to improve
the denoising network's learning efficacy. Extensive experiments on the USM and
THU-MVFV3V datasets substantiate DiffVein's superior performance, setting new
benchmarks in both vein segmentation and authentication tasks.
CLIP-based Synergistic Knowledge Transfer for Text-based Person Retrieval
Text-based Person Retrieval (TPR) aims to retrieve the target person images
given a textual query. The primary challenge lies in bridging the substantial
gap between vision and language modalities, especially when dealing with
limited large-scale datasets. In this paper, we introduce a CLIP-based
Synergistic Knowledge Transfer (CSKT) approach for TPR. Specifically, to
explore CLIP's knowledge on the input side, we first propose a Bidirectional
Prompts Transferring (BPT) module constructed by text-to-image and
image-to-text bidirectional prompts and coupling projections. Secondly, Dual
Adapters Transferring (DAT) is designed to transfer knowledge on the output side of
Multi-Head Attention (MHA) in vision and language. This synergistic two-way
collaborative mechanism promotes the early-stage feature fusion and efficiently
exploits the existing knowledge of CLIP. CSKT outperforms the state-of-the-art
approaches across three benchmark datasets when the training parameters merely
account for 7.4% of the entire model, demonstrating its remarkable efficiency,
effectiveness and generalization.
Comment: ICASSP2024(accepted). minor typos revision compared to version 1 in
arxi
Meta-Learning based Degradation Representation for Blind Super-Resolution
Most CNN-based super-resolution (SR) methods assume that the
degradation is known (e.g., bicubic). These methods will suffer a severe
performance drop when the degradation is different from their assumption.
Therefore, some approaches attempt to train SR networks with the complex
combination of multiple degradations to cover the real degradation space. To
adapt to multiple unknown degradations, introducing an explicit degradation
estimator can actually facilitate SR performance. However, previous explicit
degradation estimation methods usually predict Gaussian blur with the
supervision of groundtruth blur kernels, and estimation errors may lead to SR
failure. Thus, it is necessary to design a method that can extract implicit
discriminative degradation representation. To this end, we propose a
Meta-Learning based Region Degradation Aware SR Network (MRDA), including
Meta-Learning Network (MLN), Degradation Extraction Network (DEN), and Region
Degradation Aware SR Network (RDAN). To handle the lack of groundtruth
degradation, we use the MLN to rapidly adapt to the specific complex
degradation after several iterations and extract implicit degradation
information. Subsequently, a teacher network MRDA is designed to further
utilize the degradation information extracted by MLN for SR. However, MLN
requires iterating on paired low-resolution (LR) and corresponding
high-resolution (HR) images, which are unavailable in the inference phase.
Therefore, we adopt knowledge distillation (KD) to make the student network
learn to directly extract the same implicit degradation representation (IDR) as
the teacher from LR images.
Comment: This paper is accepted by TIP 2023, and code will be released at
https://github.com/Zj-BinXia/MRD
Real-MFF: A Large Realistic Multi-focus Image Dataset with Ground Truth
Multi-focus image fusion, a technique to generate an all-in-focus image from
two or more partially-focused source images, can benefit many computer vision
tasks. However, currently there is no large and realistic dataset to perform
convincing evaluation and comparison of algorithms in multi-focus image fusion.
Moreover, it is difficult to train a deep neural network for multi-focus image
fusion without a suitable dataset. In this letter, we introduce a large and
realistic multi-focus dataset called Real-MFF, which contains 710 pairs of
source images with corresponding ground truth images. The dataset is generated
from light field images, and both the source images and the ground truth images
are realistic. To serve as both a well-established benchmark for existing
multi-focus image fusion algorithms and an appropriate training dataset for
future development of deep-learning-based methods, the dataset contains a
variety of scenes, including buildings, plants, humans, shopping malls, squares
and so on. We also evaluate 10 typical multi-focus algorithms on this dataset
for the purpose of illustration.
Single-Sample Finger Vein Recognition via Competitive and Progressive Sparse Representation
As an emerging biometric technology, finger vein recognition has attracted much attention in recent years. However, single-sample recognition is a practical and longstanding challenge in this field, referring to only one finger vein image per class in the training set. In single-sample finger vein recognition, the illumination variations under low contrast and the lack of information on intra-class variations severely affect the recognition performance. Despite its high robustness against noise and illumination variations, sparse representation has rarely been explored for single-sample finger vein recognition. Therefore, in this paper, we focus on developing a new approach called Progressive Sparse Representation Classification (PSRC) to address the challenging issue of single-sample finger vein recognition. Firstly, as the residual may become too large under the scenario of single-sample finger vein recognition, we propose a progressive strategy for representation refinement of SRC. Secondly, to adaptively optimize progressions, a progressive index called Max Energy Residual Index (MERI) is defined as the guidance. Furthermore, we extend PSRC to bimodal biometrics and propose a Competitive PSRC (C-PSRC) fusion approach. The C-PSRC creates a more discriminative fused sample and fusion dictionary by comparing residual errors of different modalities. By comparing with several state-of-the-art methods on three finger vein benchmarks, the superiority of the proposed PSRC and C-PSRC is clearly demonstrated.
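For readers unfamiliar with the baseline that the abstract above builds on, plain sparse representation classification (SRC) can be sketched as follows. This is a minimal illustration only, not the authors' PSRC or C-PSRC: the choice of ISTA as the l1 solver, the names `ista_lasso` and `src_classify`, the regularization weight `lam`, and the toy dictionary are all assumptions made for this example.

```python
import numpy as np

def ista_lasso(D, y, lam=0.01, n_iter=200):
    """Approximately solve min_x 0.5*||y - D x||_2^2 + lam*||x||_1 with ISTA."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - step * (D.T @ (D @ x - y))  # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft-thresholding
    return x

def src_classify(D, labels, y, lam=0.01):
    """Assign y to the class whose atoms reconstruct it with the smallest residual."""
    x = ista_lasso(D, y, lam)
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c  # keep only the coefficients belonging to class c
        residuals[int(c)] = float(np.linalg.norm(y - D[:, mask] @ x[mask]))
    predicted = min(residuals, key=residuals.get)
    return predicted, residuals

# Toy single-sample setting (hypothetical data): one unit-norm atom per class.
D = np.eye(4)[:, :3]            # columns are the training samples (dictionary atoms)
labels = np.array([0, 1, 2])    # class label of each atom
y = np.array([0.05, 1.0, -0.03, 0.02])  # noisy probe close to the class-1 atom

predicted, residuals = src_classify(D, labels, y)
print(predicted, residuals)
```

With a single atom per class, each class-wise residual depends on one coefficient only, which gives some intuition for why the abstract notes that the residual may become too large in the single-sample setting.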
Multidimensional features of sporadic Creutzfeldt-Jakob disease in the elderly: a case report and systematic review
Background: As a rare neurodegenerative disease, sporadic Creutzfeldt-Jakob disease (sCJD) is poorly understood in the elderly populace. This study aims to enunciate the multidimensional features of sCJD in this group. Methods: A case of probable sCJD was reported in a 90-year-old Chinese man with initial dizziness. Then, available English literature of the elderly sCJD cases (aged 80 years and over) was reviewed and analyzed. Patients (15 cases) were subdivided and compared geographically. Results: In the elderly sCJD cohort, the onset age was 84.9 ± 4.5 years and the median disease duration was 6.8 months, with respiratory infection/failure as the commonest death cause. Various clinical symptoms were identified, with cognitive disorder (86.7%) as the commonest typical symptom and speech impairment (66.7%) as the most atypical one. Restricted hyperintensities were reported in 60.0% cases on DWI, periodic sharp wave complexes in 73.3% cases on electroencephalogram, and cerebral hypoperfusion/hypometabolism in 26.7% cases on molecular imaging. The sensitive cerebrospinal fluid biomarkers were total tau (83.3%), 14-3-3 protein (75.0%), and PrP RT-QuIC (75.0%). Neuropathological profiles in the cerebral cortex revealed vacuolar spongiosis, neuronal loss, gliosis, and aging-related markers, with synaptic deposit as the commonest PrP pattern (60.0%). The polymorphic PRNP analysis at codon 129 was M/M (90.9%), with MM1 and MM2C as the primary molecular phenotypes. Latency to first clinic visit, hyperintense signals on DWI, and disease duration were significantly different between the patient subgroups. Conclusion: The characteristics of sCJD are multidimensional in the elderly, deepening our understanding of the disease and facilitating an earlier recognition and better care for this group.
LCSCNet: Linear Compressing Based Skip-Connecting Network for Image Super-Resolution
In this paper, we develop a concise but efficient network architecture called
linear compressing based skip-connecting network (LCSCNet) for image
super-resolution. Compared with two representative network architectures with
skip connections, ResNet and DenseNet, a linear compressing layer is designed
in LCSCNet for skip connection, which connects former feature maps and
distinguishes them from newly-explored feature maps. In this way, the proposed
LCSCNet enjoys the merits of the distinguishing feature treatment of DenseNet and
the parameter-economic form of ResNet. Moreover, to better exploit hierarchical
information from both low and high levels of various receptive fields in deep
models, inspired by gate units in LSTM, we also propose an adaptive
element-wise fusion strategy with multi-supervised training. Experimental
results in comparison with state-of-the-art algorithms validate the
effectiveness of LCSCNet.
Comment: Accepted by IEEE Transactions on Image Processing (IEEE-TIP
On modelling coolant penetration into the microchannels at the tool-workpiece interface
A network of microchannels is formed at the interface between the cutting tool and the workpiece during machining due to their rough surface structures. The penetration of coolant into these microchannels has a great effect on the machined surface quality by changing the frictional behaviour and heat transfer characteristics of the interfaces. Most present studies focus on the macroscopic coolant delivery process, but the mechanism of the microscopic penetration phenomena in the channels at the interface remains unclear. Thus, this study proposes a multi-scale and multi-phase comprehensive theoretical model that combines the machining process, coolant flow in microchannels and the lubrication process, to analyse the coolant transportation in the microchannels and investigate its lubrication effects and the friction mechanisms at the interface between the tool and the workpiece. Then, an orthogonal turning test of Inconel 718 is designed to verify the proposed model. The cooling and lubrication effect of coolant in the microchannels on the reduction of tool wear, contact length, cutting force and temperature has been demonstrated. The improvement of chip fragmentation and microstructure has also been revealed.