74 research outputs found

    Deep Joint Source-Channel Coding for Image Transmission With Visual Protection

    Get PDF
    Joint source-channel coding (JSCC) has achieved great success due to the introduction of deep learning (DL). Compared to traditional separate source-channel coding (SSCC) schemes, DL-based JSCC (DJSCC) offers high spectral efficiency, high reconstruction quality, and mitigation of the “cliff effect”. However, unlike traditional SSCC schemes, DJSCC is difficult to couple with existing secure communication mechanisms (e.g., encryption-decryption), which hinders the practical adoption of this emerging technology. To this end, our paper proposes a novel method called DL-based joint protection and source-channel coding (DJPSCC) for images, which protects the visual content of the plain image without significantly sacrificing reconstruction performance. The core idea is to use a neural network for visual protection, converting the plain image into a visually protected one while accounting for its interaction with DJSCC. During the training stage, the proposed DJPSCC method learns: 1) deep neural networks for image protection and image deprotection, and 2) an effective DJSCC network for image transmission in the protected domain. Compared to existing source protection methods applied with DJSCC transmission, the DJPSCC method achieves much better reconstruction performance.
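
    For intuition only, the pipeline described above can be sketched as four jointly trained networks: a protection network, a DJSCC encoder, a DJSCC decoder operating in the protected domain, and a deprotection network at the receiver. The minimal PyTorch sketch below is an assumption-laden illustration; the layer sizes, the AWGN channel, and the loss terms are placeholders rather than the paper's actual design.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ConvBlock(nn.Sequential):
        def __init__(self, c_in, c_out):
            super().__init__(nn.Conv2d(c_in, c_out, 3, padding=1), nn.PReLU())

    class Protector(nn.Module):          # plain image -> visually protected image
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(ConvBlock(3, 32), ConvBlock(32, 32),
                                     nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid())
        def forward(self, x):
            return self.net(x)

    class Deprotector(Protector):        # protected estimate -> plain image estimate
        pass                             # same shape of network, reused here for brevity

    class JSCCEncoder(nn.Module):        # protected image -> power-normalized channel symbols
        def __init__(self, c_latent=16):
            super().__init__()
            self.net = nn.Sequential(ConvBlock(3, 32),
                                     nn.Conv2d(32, c_latent, 3, stride=2, padding=1))
        def forward(self, x):
            z = self.net(x)
            return z / z.pow(2).mean().sqrt()        # unit average transmit power

    class JSCCDecoder(nn.Module):        # noisy symbols -> protected image estimate
        def __init__(self, c_latent=16):
            super().__init__()
            self.net = nn.Sequential(
                nn.ConvTranspose2d(c_latent, 32, 4, stride=2, padding=1), nn.PReLU(),
                nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid())
        def forward(self, z):
            return self.net(z)

    def awgn(z, snr_db):                 # additive white Gaussian noise channel
        sigma = (10 ** (-snr_db / 10)) ** 0.5
        return z + sigma * torch.randn_like(z)

    protect, enc, dec, deprotect = Protector(), JSCCEncoder(), JSCCDecoder(), Deprotector()
    params = [*protect.parameters(), *enc.parameters(), *dec.parameters(), *deprotect.parameters()]
    opt = torch.optim.Adam(params, lr=1e-4)

    x = torch.rand(8, 3, 64, 64)                       # stand-in batch of plain images
    x_prot = protect(x)                                # visual protection
    x_hat_prot = dec(awgn(enc(x_prot), snr_db=10))     # transmission in the protected domain
    x_hat = deprotect(x_hat_prot)                      # deprotection at the receiver

    # Illustrative objective: reconstruct the plain image while pushing the
    # protected image away from it; the paper's actual losses may differ.
    loss = F.mse_loss(x_hat, x) - 0.1 * F.mse_loss(x_prot, x)
    opt.zero_grad(); loss.backward(); opt.step()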

    CoMER: Modeling Coverage for Transformer-based Handwritten Mathematical Expression Recognition

    Full text link
    The Transformer-based encoder-decoder architecture has recently made significant advances in recognizing handwritten mathematical expressions. However, the Transformer model still suffers from a lack of coverage, making its expression recognition rate (ExpRate) inferior to its RNN counterpart. Coverage information, which records the alignment information of past steps, has proven effective in RNN models. In this paper, we propose CoMER, a model that adopts coverage information in the Transformer decoder. Specifically, we propose a novel Attention Refinement Module (ARM) that refines the attention weights with past alignment information without hurting parallelism. Furthermore, we take coverage information to the extreme by proposing self-coverage and cross-coverage, which utilize past alignment information from the current and previous layers. Experiments show that CoMER improves the ExpRate by 0.61%/2.09%/1.59% compared to the current state-of-the-art model, and reaches 59.33%/59.81%/62.97% on the CROHME 2014/2016/2019 test sets. Comment: Accepted by ECCV 2022.
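
    To make the coverage idea concrete, the toy sketch below accumulates past alignment with a causal cumulative sum (so decoding steps can still be processed in parallel) and subtracts it from the attention logits before the softmax. The plain subtraction and the refine_weight scalar are simplifying assumptions for illustration; the actual Attention Refinement Module learns its refinement term.

    import torch

    def coverage_refined_attention(scores, attn, refine_weight=1.0):
        """scores: (batch, steps, src_len) raw attention logits
           attn:   (batch, steps, src_len) attention weights supplying coverage
                   (same layer for self-coverage, previous layer for cross-coverage)."""
        # Coverage = cumulative alignment over all *previous* decoding steps,
        # computed with a cumulative sum to keep the decoder parallel.
        coverage = torch.cumsum(attn, dim=1) - attn
        refined = scores - refine_weight * coverage   # down-weight already-attended positions
        return torch.softmax(refined, dim=-1)

    scores = torch.randn(2, 5, 7)                 # toy logits: 2 sequences, 5 steps, 7 source positions
    attn = torch.softmax(scores, dim=-1)
    refined = coverage_refined_attention(scores, attn)   # self-coverage variant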

    Deep Learning based Fingerprint Presentation Attack Detection: A Comprehensive Survey

    Full text link
    The vulnerabilities of fingerprint authentication systems have raised security concerns when adapting them to highly secure access-control applications. Therefore, Fingerprint Presentation Attack Detection (FPAD) methods are essential for ensuring reliable fingerprint authentication. Owing to the limited generalization capacity of traditional handcrafted approaches, deep learning-based FPAD has become mainstream and has achieved remarkable performance over the past decade. Existing reviews have focused more on handcrafted than on deep learning-based methods and are therefore outdated. To stimulate future research, we concentrate only on recent deep learning-based FPAD methods. In this paper, we first briefly introduce the most common Presentation Attack Instruments (PAIs) and publicly available fingerprint Presentation Attack (PA) datasets. We then describe existing deep learning-based FPAD methods, categorizing them into contact, contactless, and smartphone-based approaches. Finally, we conclude the paper by discussing the open challenges at the current stage and highlighting potential future directions. Comment: 29 pages, submitted to the ACM Computing Surveys journal.

    Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition

    Full text link
    Handwritten mathematical expression recognition (HMER) has attracted extensive attention recently. However, current methods cannot explicitly model the interactions between different symbols, which may cause failures when faced with similar symbols. To alleviate this issue, we propose a simple but efficient method to enhance semantic interaction learning (SIL). Specifically, we first construct a semantic graph based on statistical symbol co-occurrence probabilities. We then design a semantic aware module (SAM) that projects the visual and classification features into a semantic space, where the cosine distance between projected vectors indicates the correlation between symbols. Jointly optimizing HMER and SIL explicitly enhances the model's understanding of symbol relationships. In addition, SAM can be easily plugged into existing attention-based HMER models and consistently brings improvements. Extensive experiments on public benchmark datasets demonstrate that the proposed module effectively enhances recognition performance, and our method outperforms prior arts on both the CROHME and HME100K datasets. Comment: 12 pages.
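
    The semantic-aware idea can be pictured as two learned projections into a shared semantic space followed by a cosine-similarity comparison between symbols, as sketched below. The feature dimensions and the notion of supervising the similarity matrix with co-occurrence statistics are assumptions for illustration, not the paper's exact formulation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SemanticAwareModule(nn.Module):
        def __init__(self, vis_dim=256, cls_dim=128, sem_dim=64):
            super().__init__()
            self.proj_vis = nn.Linear(vis_dim, sem_dim)   # visual features -> semantic space
            self.proj_cls = nn.Linear(cls_dim, sem_dim)   # classification features -> semantic space

        def forward(self, vis_feat, cls_feat):
            v = F.normalize(self.proj_vis(vis_feat), dim=-1)
            c = F.normalize(self.proj_cls(cls_feat), dim=-1)
            # Pairwise cosine similarity between symbols; larger values mean
            # the model treats the symbols as more strongly correlated.
            return v @ c.transpose(-1, -2)

    sam = SemanticAwareModule()
    vis = torch.randn(1, 20, 256)      # visual features for 20 decoded symbols
    cls = torch.randn(1, 20, 128)      # corresponding classification features
    sim = sam(vis, cls)                # (1, 20, 20) symbol-correlation matrix
    # A graph built from symbol co-occurrence probabilities could then supervise
    # `sim` alongside the usual HMER recognition loss (assumed here).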

    Data Augmentation using Random Image Cropping and Patching for Deep CNNs

    Get PDF
    Deep convolutional neural networks (CNNs) have achieved remarkable results in image processing tasks. However, their high expressive power carries a risk of overfitting. Consequently, data augmentation techniques have been proposed to prevent overfitting while enriching datasets. Recent CNN architectures with more parameters are rendering traditional data augmentation techniques insufficient. In this study, we propose a new data augmentation technique called random image cropping and patching (RICAP), which randomly crops four images and patches them together to create a new training image. Moreover, RICAP mixes the class labels of the four images, yielding an advantage similar to label smoothing. We evaluated RICAP with current state-of-the-art CNNs (e.g., the shake-shake regularization model) in comparison with competitive data augmentation techniques such as cutout and mixup. RICAP achieves a new state-of-the-art test error of 2.19% on CIFAR-10. We also confirmed that deep CNNs with RICAP achieve better results on classification tasks using CIFAR-100 and ImageNet and on an image-caption retrieval task using Microsoft COCO. Comment: accepted version, 16 pages.
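
    The augmentation itself is compact enough to sketch directly: sample a boundary point, crop four randomly chosen images to the four resulting patch sizes, tile them into one canvas, and weight each image's label by the area of its patch. The sketch below follows that recipe; the Beta-distributed boundary point mirrors the paper's formulation, while the helper's interface and hyperparameters are assumptions.

    import numpy as np
    import torch
    import torch.nn.functional as F

    def ricap(images, labels, num_classes, beta=0.3):
        """images: (N, C, H, W) float tensor, labels: (N,) long tensor."""
        n, _, h, w = images.shape
        # Boundary point splitting the canvas into four patches.
        wx = int(np.round(w * np.random.beta(beta, beta)))
        hy = int(np.round(h * np.random.beta(beta, beta)))
        widths = [wx, w - wx, wx, w - wx]
        heights = [hy, hy, h - hy, h - hy]
        slots = [(slice(0, hy), slice(0, wx)), (slice(0, hy), slice(wx, w)),
                 (slice(hy, h), slice(0, wx)), (slice(hy, h), slice(wx, w))]

        out = images.new_zeros(images.shape)
        mixed = torch.zeros(n, num_classes)
        for k in range(4):
            idx = torch.randperm(n)                       # a random image for each slot
            pw, ph = widths[k], heights[k]
            x0 = np.random.randint(0, w - pw + 1)         # random crop position
            y0 = np.random.randint(0, h - ph + 1)
            out[:, :, slots[k][0], slots[k][1]] = images[idx][:, :, y0:y0 + ph, x0:x0 + pw]
            # Label weight = fraction of the canvas covered by this patch.
            mixed += F.one_hot(labels[idx], num_classes).float() * (pw * ph) / (w * h)
        return out, mixed                                 # train with soft-target cross-entropy

    x = torch.rand(16, 3, 32, 32)                         # stand-in CIFAR-like batch
    y = torch.randint(0, 10, (16,))
    x_aug, y_soft = ricap(x, y, num_classes=10)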