88 research outputs found

    DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and Authentication

    Finger vein authentication, recognized for its high security and specificity, has become a focal point in biometric research. Traditional methods predominantly concentrate on vein feature extraction for discriminative modeling, with limited exploration of generative approaches. Suffering from verification failure, existing methods often fail to obtain authentic vein patterns by segmentation. To fill this gap, we introduce DiffVein, a unified diffusion model-based framework which simultaneously addresses vein segmentation and authentication tasks. DiffVein is composed of two dedicated branches: one for segmentation and the other for denoising. For better feature interaction between these two branches, we introduce two specialized modules to improve their collective performance. The first, a mask condition module, incorporates the semantic information of vein patterns from the segmentation branch into the denoising process. We also propose a Semantic Difference Transformer (SD-Former), which employs Fourier-space self-attention and cross-attention modules to extract category embedding before feeding it to the segmentation task. In this way, our framework allows for a dynamic interplay between diffusion and segmentation embeddings, so that vein segmentation and authentication tasks can inform and enhance each other in the joint training. To further optimize our model, we introduce a Fourier-space Structural Similarity (FourierSIM) loss function, which is tailored to improve the denoising network's learning efficacy. Extensive experiments on the USM and THU-MVFV3V datasets substantiate DiffVein's superior performance, setting new benchmarks in both vein segmentation and authentication tasks.

    CLIP-based Synergistic Knowledge Transfer for Text-based Person Retrieval

    Text-based Person Retrieval (TPR) aims to retrieve the target person images given a textual query. The primary challenge lies in bridging the substantial gap between vision and language modalities, especially when dealing with limited large-scale datasets. In this paper, we introduce a CLIP-based Synergistic Knowledge Transfer (CSKT) approach for TPR. Specifically, to explore CLIP's knowledge on the input side, we first propose a Bidirectional Prompts Transferring (BPT) module constructed by text-to-image and image-to-text bidirectional prompts and coupling projections. Secondly, Dual Adapters Transferring (DAT) is designed to transfer knowledge on the output side of Multi-Head Attention (MHA) in vision and language. This synergistic two-way collaborative mechanism promotes early-stage feature fusion and efficiently exploits the existing knowledge of CLIP. CSKT outperforms the state-of-the-art approaches across three benchmark datasets when the training parameters merely account for 7.4% of the entire model, demonstrating its remarkable efficiency, effectiveness and generalization. Comment: ICASSP2024(accepted). minor typos revision compared to version 1 in arxi

    Meta-Learning based Degradation Representation for Blind Super-Resolution

    Most CNN-based super-resolution (SR) methods assume that the degradation is known (e.g., bicubic). These methods suffer a severe performance drop when the degradation is different from their assumption. Therefore, some approaches attempt to train SR networks with a complex combination of multiple degradations to cover the real degradation space. To adapt to multiple unknown degradations, introducing an explicit degradation estimator can actually facilitate SR performance. However, previous explicit degradation estimation methods usually predict Gaussian blur with the supervision of groundtruth blur kernels, and estimation errors may lead to SR failure. Thus, it is necessary to design a method that can extract implicit discriminative degradation representation. To this end, we propose a Meta-Learning based Region Degradation Aware SR Network (MRDA), including Meta-Learning Network (MLN), Degradation Extraction Network (DEN), and Region Degradation Aware SR Network (RDAN). To handle the lack of groundtruth degradation, we use the MLN to rapidly adapt to the specific complex degradation after several iterations and extract implicit degradation information. Subsequently, a teacher network MRDA_{T} is designed to further utilize the degradation information extracted by MLN for SR. However, MLN requires iterating on paired low-resolution (LR) and corresponding high-resolution (HR) images, which are unavailable in the inference phase. Therefore, we adopt knowledge distillation (KD) to make the student network learn to directly extract the same implicit degradation representation (IDR) as the teacher from LR images. Comment: This paper is accepted by TIP 2023, and code will be released at https://github.com/Zj-BinXia/MRD

    Real-MFF: A Large Realistic Multi-focus Image Dataset with Ground Truth

    Multi-focus image fusion, a technique to generate an all-in-focus image from two or more partially-focused source images, can benefit many computer vision tasks. However, there is currently no large and realistic dataset for convincing evaluation and comparison of multi-focus image fusion algorithms. Moreover, it is difficult to train a deep neural network for multi-focus image fusion without a suitable dataset. In this letter, we introduce a large and realistic multi-focus dataset called Real-MFF, which contains 710 pairs of source images with corresponding ground truth images. The dataset is generated from light field images, and both the source images and the ground truth images are realistic. To serve as both a well-established benchmark for existing multi-focus image fusion algorithms and an appropriate training dataset for future development of deep-learning-based methods, the dataset contains a variety of scenes, including buildings, plants, humans, shopping malls, squares and so on. We also evaluate 10 typical multi-focus algorithms on this dataset for the purpose of illustration.

    Single-Sample Finger Vein Recognition via Competitive and Progressive Sparse Representation

    As an emerging biometric technology, finger vein recognition has attracted much attention in recent years. However, single-sample recognition is a practical and longstanding challenge in this field, referring to only one finger vein image per class in the training set. In single-sample finger vein recognition, the illumination variations under low contrast and the lack of information on intra-class variations severely affect the recognition performance. Despite its high robustness against noise and illumination variations, sparse representation has rarely been explored for single-sample finger vein recognition. Therefore, in this paper, we focus on developing a new approach called Progressive Sparse Representation Classification (PSRC) to address the challenging issue of single-sample finger vein recognition. Firstly, as the residual may become too large under the scenario of single-sample finger vein recognition, we propose a progressive strategy for representation refinement of SRC. Secondly, to adaptively optimize progressions, a progressive index called Max Energy Residual Index (MERI) is defined as the guidance. Furthermore, we extend PSRC to bimodal biometrics and propose a Competitive PSRC (C-PSRC) fusion approach. The C-PSRC creates a more discriminative fused sample and fusion dictionary by comparing residual errors of different modalities. By comparing with several state-of-the-art methods on three finger vein benchmarks, the superiority of the proposed PSRC and C-PSRC is clearly demonstrated.
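For readers unfamiliar with the baseline that the abstract above builds on, the following is a minimal sketch of generic sparse representation classification (SRC) with one training sample per class. It is not the authors' PSRC or C-PSRC, and it does not implement the progressive strategy or MERI. As a simplifying assumption, plain matching pursuit stands in for the sparse solver, and the function names (`matching_pursuit`, `src_classify`) and the three-dimensional toy features are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def matching_pursuit(atoms, y, n_iter=10):
    """Greedy sparse coding of y over unit-norm dictionary atoms.

    Assumption: plain matching pursuit is used here only as a simple
    stand-in for the sparse solver in SRC.
    """
    coef = [0.0] * len(atoms)
    residual = list(y)
    for _ in range(n_iter):
        corr = [dot(a, residual) for a in atoms]
        j = max(range(len(atoms)), key=lambda k: abs(corr[k]))
        if abs(corr[j]) < 1e-12:  # nothing left to explain
            break
        coef[j] += corr[j]
        residual = [r - corr[j] * a for r, a in zip(residual, atoms[j])]
    return coef


def src_classify(atoms, labels, y, n_iter=10):
    """Assign y to the class whose atoms reconstruct it with the smallest residual."""
    coef = matching_pursuit(atoms, y, n_iter)
    residuals = {}
    for c in set(labels):
        recon = [0.0] * len(y)
        for a, lab, w in zip(atoms, labels, coef):
            if lab == c:  # keep only the coefficients belonging to class c
                recon = [r + w * x for r, x in zip(recon, a)]
        residuals[c] = math.sqrt(sum((p - q) ** 2 for p, q in zip(y, recon)))
    return min(residuals, key=residuals.get), residuals


# Hypothetical toy data: one (unit-norm) training feature vector per class.
atoms = [normalize([1.0, 0.0, 0.0]),
         normalize([0.0, 1.0, 0.0]),
         normalize([0.0, 0.0, 1.0])]
labels = ["finger_a", "finger_b", "finger_c"]
probe = [0.1, 0.9, 0.05]

label, residuals = src_classify(atoms, labels, probe)
print(label, residuals)
```

With a single atom per class, the class-wise residual is dominated by how much of the probe one training image can explain, which illustrates why the abstract notes that the residual may become too large in the single-sample setting.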

    Multidimensional features of sporadic Creutzfeldt-Jakob disease in the elderly: a case report and systematic review

    Background: As a rare neurodegenerative disease, sporadic Creutzfeldt-Jakob disease (sCJD) is poorly understood in the elderly population. This study aims to describe the multidimensional features of sCJD in this group. Methods: A case of probable sCJD was reported in a 90-year-old Chinese man with initial dizziness. Then, the available English literature on elderly sCJD cases (aged 80 years and over) was reviewed and analyzed. Patients (15 cases) were subdivided and compared geographically. Results: In the elderly sCJD cohort, the onset age was 84.9 ± 4.5 years and the median disease duration was 6.8 months, with respiratory infection/failure as the most common cause of death. Various clinical symptoms were identified, with cognitive disorder (86.7%) as the most common typical symptom and speech impairment (66.7%) as the most common atypical one. Restricted hyperintensities were reported in 60.0% of cases on DWI, periodic sharp wave complexes in 73.3% of cases on electroencephalogram, and cerebral hypoperfusion/hypometabolism in 26.7% of cases on molecular imaging. The sensitive cerebrospinal fluid biomarkers were total tau (83.3%), 14-3-3 protein (75.0%), and PrP RT-QuIC (75.0%). Neuropathological profiles in the cerebral cortex revealed vacuolar spongiosis, neuronal loss, gliosis, and aging-related markers, with synaptic deposit as the most common PrP pattern (60.0%). The polymorphic PRNP analysis at codon 129 was M/M (90.9%), with MM1 and MM2C as the primary molecular phenotypes. Latency to first clinic visit, hyperintense signals on DWI, and disease duration were significantly different between the patient subgroups. Conclusion: The characteristics of sCJD are multidimensional in the elderly, deepening our understanding of the disease and facilitating earlier recognition and better care for this group.

    LCSCNet: Linear Compressing Based Skip-Connecting Network for Image Super-Resolution

    In this paper, we develop a concise but efficient network architecture called linear compressing based skip-connecting network (LCSCNet) for image super-resolution. Compared with two representative network architectures with skip connections, ResNet and DenseNet, a linear compressing layer is designed in LCSCNet for skip connection, which connects former feature maps and distinguishes them from newly-explored feature maps. In this way, the proposed LCSCNet enjoys the merits of the distinguishing feature treatment of DenseNet and the parameter-economic form of ResNet. Moreover, to better exploit hierarchical information from both low and high levels of various receptive fields in deep models, inspired by gate units in LSTM, we also propose an adaptive element-wise fusion strategy with multi-supervised training. Experimental results in comparison with state-of-the-art algorithms validate the effectiveness of LCSCNet. Comment: Accepted by IEEE Transactions on Image Processing (IEEE-TIP

    On modelling coolant penetration into the microchannels at the tool-workpiece interface

    A network of microchannels is formed at the interface between the cutting tool and the workpiece during machining due to their rough surface structures. The penetration of coolant into these microchannels has a great effect on the machined surface quality by changing the frictional behaviour and heat transfer characteristics of the interfaces. Most present studies focus on the macroscopic coolant delivery process, but the mechanism of the microscopic penetration phenomena in the channels at the interface remains unclear. Thus, this study proposes a multi-scale and multi-phase comprehensive theoretical model that combines the machining process, coolant flow in microchannels and the lubrication process, to analyse the coolant transportation in the microchannels and investigate its lubrication effects and the friction mechanisms at the interface between the tool and the workpiece. Then, an orthogonal turning test of Inconel 718 is designed to verify the proposed model. The cooling and lubrication effect of coolant in the microchannels on the reduction of tool wear, contact length, cutting force and temperature has been demonstrated. The improvement of chip fragment and microstructure has also been revealed.