Deep Joint Source-Channel Coding for Image Transmission With Visual Protection
Joint source-channel coding (JSCC) has achieved great success thanks to the introduction of deep learning (DL). Compared with traditional separate source-channel coding (SSCC) schemes, the advantages of DL-based JSCC (DJSCC) include high spectral efficiency, high reconstruction quality, and mitigation of the “cliff effect”. However, in contrast with traditional SSCC schemes, it is difficult to couple existing secure communication mechanisms (e.g., encryption-decryption mechanisms) with DJSCC, which hinders the practical use of this emerging technology. To this end, our paper proposes a novel method called DL-based joint protection and source-channel coding (DJPSCC) for images, which protects the visual content of the plain image without significantly sacrificing image reconstruction performance. The core idea is to use a neural network for visual protection, converting the plain image into a visually protected one while accounting for its interaction with DJSCC. During the training stage, the proposed DJPSCC method learns: 1) deep neural networks for image protection and image deprotection, and 2) an effective DJSCC network for image transmission in the protected domain. Compared to existing source protection methods applied with DJSCC transmission, the DJPSCC method achieves much better reconstruction performance.
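To make the transmission setting concrete, DJSCC-style systems are typically trained end-to-end through a differentiable noisy channel. The sketch below shows only that channel component, an additive white Gaussian noise (AWGN) model; the function name and the real-valued symbol model are assumptions of this sketch, not the paper's implementation, and the learned protection/encoding networks that would surround it are omitted.

```python
import numpy as np

def awgn_channel(symbols, snr_db, rng):
    """Pass real-valued channel symbols through an AWGN channel at a given
    SNR (dB). In DJSCC-style pipelines this sits between the learned encoder
    and decoder so both networks train end-to-end over the channel noise."""
    signal_power = np.mean(symbols ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=symbols.shape)
    return symbols + noise
```

Because the noise is a simple differentiable addition, gradients flow through it unchanged, which is what lets the protection and coding networks co-adapt during training.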
CoMER: Modeling Coverage for Transformer-based Handwritten Mathematical Expression Recognition
The Transformer-based encoder-decoder architecture has recently made
significant advances in recognizing handwritten mathematical expressions.
However, the transformer model still suffers from the lack of coverage problem,
making its expression recognition rate (ExpRate) inferior to its RNN
counterpart. Coverage information, which records the alignment information of
the past steps, has proven effective in the RNN models. In this paper, we
propose CoMER, a model that adopts the coverage information in the transformer
decoder. Specifically, we propose a novel Attention Refinement Module (ARM) to
refine the attention weights with past alignment information without hurting
its parallelism. Furthermore, we take coverage information to the extreme by
proposing self-coverage and cross-coverage, which utilize the past alignment
information from the current and previous layers. Experiments show that CoMER
improves the ExpRate by 0.61%/2.09%/1.59% compared to the current
state-of-the-art model, and reaches 59.33%/59.81%/62.97% on the CROHME
2014/2016/2019 test sets.
Comment: Accepted by ECCV 2022
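The coverage idea above can be illustrated with a toy refinement loop: at each decoding step, the attention logits are penalized in proportion to the attention mass already placed on each encoder position by earlier steps, discouraging the decoder from re-attending to finished regions. This is a minimal sketch only; CoMER's Attention Refinement Module is a learned function applied in parallel across steps, whereas the scalar weight `lam` and the sequential loop here are simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def coverage_refined_attention(logits, lam=1.0):
    """Toy coverage-style attention refinement.

    logits: (T, L) raw attention scores for T decoding steps over L
    encoder positions. For each step t, subtract lam times the cumulative
    attention placed on each position by steps < t, then renormalize."""
    T, L = logits.shape
    attn = np.zeros((T, L))
    coverage = np.zeros(L)  # cumulative past alignment
    for t in range(T):
        attn[t] = softmax(logits[t] - lam * coverage)
        coverage += attn[t]
    return attn
```

With identical logits at every step, the position attended most at step 0 receives strictly less attention at step 1, which is the qualitative behavior coverage is meant to produce.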
Deep Learning based Fingerprint Presentation Attack Detection: A Comprehensive Survey
The vulnerabilities of fingerprint authentication systems have raised
security concerns when adapting them to highly secure access-control
applications. Therefore, Fingerprint Presentation Attack Detection (FPAD)
methods are essential for ensuring reliable fingerprint authentication. Owing
to the limited generalization capacity of traditional handcrafted approaches,
deep learning-based FPAD has become mainstream and has achieved remarkable
performance over the past decade. Existing reviews have focused more on
handcrafted than on deep learning-based methods and are now outdated. To
stimulate future research, we concentrate only on recent deep learning-based
FPAD methods. In this paper, we first briefly introduce the most common
Presentation Attack Instruments (PAIs) and publicly available fingerprint
Presentation Attack (PA) datasets. We then describe existing deep
learning-based FPAD methods by categorizing them into contact-based,
contactless, and smartphone-based approaches. Finally, we conclude the paper
by discussing the open challenges at the current stage and highlighting
promising future directions.
Comment: 29 pages, submitted to the ACM Computing Surveys journal
Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition
Handwritten mathematical expression recognition (HMER) has attracted
extensive attention recently. However, current methods cannot explicitly model
the interactions between different symbols, which may cause failures when faced
with similar symbols. To alleviate this issue, we propose a simple but
efficient method to enhance semantic interaction learning (SIL). Specifically,
we first construct a semantic graph based on statistical symbol co-occurrence
probabilities. Then we design a semantic-aware module (SAM), which projects the
visual and classification features into a semantic space. The cosine distance
between different projected vectors indicates the correlation between symbols,
and jointly optimizing HMER and SIL explicitly enhances the model's
understanding of symbol relationships. In addition, SAM can be easily plugged
into existing attention-based HMER models and consistently brings
improvements. Extensive experiments on public benchmark datasets demonstrate
that the proposed module effectively enhances recognition performance, and our
method outperforms prior arts on both the CROHME and HME100K datasets.
Comment: 12 pages
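The two statistical ingredients described above, a co-occurrence graph over symbols and cosine correlation between projected vectors, can be sketched as follows. This is a minimal NumPy sketch: the normalization of the co-occurrence counts and the function names are assumptions, and the learned projection networks of SAM are replaced by an already-projected input matrix.

```python
import numpy as np
from itertools import combinations

def cooccurrence_graph(label_sequences, vocab):
    """Estimate symmetric symbol co-occurrence probabilities from training
    label sequences, returned as a row-normalized (V, V) matrix."""
    index = {s: i for i, s in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for seq in label_sequences:
        for a, b in combinations(set(seq), 2):  # each unordered pair once
            counts[index[a], index[b]] += 1
            counts[index[b], index[a]] += 1
    totals = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts),
                     where=totals > 0)

def cosine_correlation(projected):
    """Pairwise cosine similarity between projected symbol vectors,
    used here as the correlation measure in the semantic space."""
    norms = np.linalg.norm(projected, axis=1, keepdims=True)
    unit = projected / np.clip(norms, 1e-12, None)
    return unit @ unit.T
```

A joint loss would then pull the cosine correlations of projected features toward the graph's co-occurrence statistics, which is one plausible reading of optimizing HMER and SIL together.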
Data Augmentation using Random Image Cropping and Patching for Deep CNNs
Deep convolutional neural networks (CNNs) have achieved remarkable results in
image processing tasks. However, their high expression ability risks
overfitting. Consequently, data augmentation techniques have been proposed to
prevent overfitting while enriching datasets. Recent CNN architectures with
more parameters are rendering traditional data augmentation techniques
insufficient. In this study, we propose a new data augmentation technique
called random image cropping and patching (RICAP) which randomly crops four
images and patches them to create a new training image. Moreover, RICAP mixes
the class labels of the four images, resulting in an advantage similar to label
smoothing. We evaluated RICAP with current state-of-the-art CNNs (e.g., the
shake-shake regularization model) by comparison with competitive data
augmentation techniques such as cutout and mixup. RICAP achieves a new
state-of-the-art test error on CIFAR-10. We also confirmed that deep CNNs with
RICAP achieve better results on classification tasks using CIFAR-100 and
ImageNet and on an image-caption retrieval task using Microsoft COCO.
Comment: accepted version, 16 pages
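The cropping-and-patching operation with area-proportional label mixing can be sketched in a few lines of NumPy. This is a simplified sketch: the paper draws the patch boundary point from a beta distribution, which is replaced here by a uniform draw, and the channels-last image layout and function signature are assumptions.

```python
import numpy as np

def ricap(images, labels, num_classes, rng):
    """RICAP sketch: crop regions from four randomly chosen images, patch
    them into one new training image, and mix the four class labels in
    proportion to the area each patch occupies.

    images: (N, H, W, C) array, labels: (N,) integer class indices."""
    n, h, w, c = images.shape
    # Boundary point splitting the output into four patches
    # (uniform here; the paper samples it from a beta distribution).
    wx = int(rng.integers(1, w))
    hy = int(rng.integers(1, h))
    sizes = [(hy, wx), (hy, w - wx), (h - hy, wx), (h - hy, w - wx)]
    idx = rng.integers(0, n, size=4)
    patches, mixed = [], np.zeros(num_classes)
    for k, (ph, pw) in enumerate(sizes):
        i = idx[k]
        y0 = int(rng.integers(0, h - ph + 1))  # random crop position
        x0 = int(rng.integers(0, w - pw + 1))
        patches.append(images[i, y0:y0 + ph, x0:x0 + pw])
        mixed[labels[i]] += (ph * pw) / (h * w)  # area-proportional label
    top = np.concatenate([patches[0], patches[1]], axis=1)
    bottom = np.concatenate([patches[2], patches[3]], axis=1)
    return np.concatenate([top, bottom], axis=0), mixed
```

Because the four patch areas tile the output exactly, the mixed label vector always sums to one, which is what gives RICAP its label-smoothing-like effect.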