Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition
Handwritten mathematical expression recognition is a challenging problem due
to the complicated two-dimensional structures, ambiguous handwriting input and
varying scales of handwritten math symbols. To address this problem, we use
an attention-based encoder-decoder model that converts mathematical
expression images from two-dimensional layouts into one-dimensional LaTeX
strings. We improve the encoder by employing densely connected convolutional
networks as they can strengthen feature extraction and facilitate gradient
propagation especially on a small training set. We also present a novel
multi-scale attention model which is employed to deal with the recognition of
math symbols at different scales and preserve the fine-grained details that
would otherwise be dropped by pooling operations. Validated on the CROHME
competition task, the proposed method significantly outperforms
state-of-the-art methods, with an expression recognition accuracy of 52.8% on
CROHME 2014 and 50.1% on CROHME 2016, using only the official training dataset.
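
The dense connectivity mentioned in this abstract can be illustrated with a minimal sketch. It shows only the general pattern of densely connected networks, in which each layer receives the concatenation of all earlier outputs; it is not the paper's encoder. The names `dense_block` and `make_toy_layer` are hypothetical, and the toy layer (which emits the mean of its inputs) stands in for a real convolution.

```python
def dense_block(inputs, layers):
    """Apply layers with dense connectivity over a flat list of channel values."""
    features = list(inputs)
    for layer in layers:
        new_channels = layer(features)        # each layer sees every earlier output
        features = features + new_channels    # concatenate rather than add
    return features


def make_toy_layer(growth_rate):
    """Hypothetical stand-in for a conv layer: emits `growth_rate` new channels."""
    def layer(features):
        mean = sum(features) / len(features)
        return [mean] * growth_rate
    return layer


# Assumed toy sizes: 4 input channels, 3 layers, growth rate 2.
dense_out = dense_block([1.0, 2.0, 3.0, 4.0], [make_toy_layer(2) for _ in range(3)])
print(len(dense_out))  # 4 + 3 * 2 channels
```

Because earlier outputs are concatenated and carried forward unchanged, later layers and the loss have short paths to early features, which is the usual explanation for the improved gradient propagation the abstract refers to.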
DenseBAM-GI: Attention Augmented DeneseNet with momentum aided GRU for HMER
Handwritten Mathematical Expression Recognition (HMER) is
crucial in the fields of digital education and scholarly research. However, it
is difficult to accurately determine the length and complex spatial
relationships among symbols in handwritten mathematical expressions. In this
study, we present a novel encoder-decoder architecture (DenseBAM-GI) for HMER,
where the encoder has a Bottleneck Attention Module (BAM) to improve feature
representation and the decoder has a Gated Input-GRU (GI-GRU) unit with an
extra gate to make decoding long and complex expressions easier. The proposed
model is an efficient and lightweight architecture with performance equivalent
to state-of-the-art models in terms of Expression Recognition Rate (exprate).
It also performs better in terms of top 1, 2, and 3 error accuracy across the
CROHME 2014, 2016, and 2019 datasets. DenseBAM-GI achieves the best exprate
among all models on the CROHME 2019 dataset. Importantly, these results are
achieved with lower computational complexity and reduced GPU memory
requirements.
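
As a rough illustration of the Expression Recognition Rate (exprate) reported above, the sketch below counts an expression as correct when its predicted LaTeX token sequence is within a given number of edits of the reference. Treating the error tolerance as a token-level Levenshtein distance over whitespace-split LaTeX is an assumption made here, not something the abstract specifies, and the sample strings are invented.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def exprate(predictions, references, max_errors=0):
    """Percentage of expressions with at most `max_errors` token edits."""
    hits = sum(
        edit_distance(p.split(), r.split()) <= max_errors
        for p, r in zip(predictions, references)
    )
    return 100.0 * hits / len(references)


# Invented examples: an exact match, one substitution, and three edits.
preds = ["x ^ 2 + 1", r"\frac { a } { b }", "a + b"]
refs = ["x ^ 2 + 1", r"\frac { a } { c }", "a - b - c"]
for k in (0, 1, 3):
    print(k, round(exprate(preds, refs, max_errors=k), 2))
```

With `max_errors=0` this reduces to exact-match accuracy over whole expressions.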
Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition
Handwritten mathematical expression recognition (HMER) has attracted
extensive attention recently. However, current methods cannot explicitly model
the interactions between different symbols and may fail when faced with similar
symbols. To alleviate this issue, we propose a simple but efficient method to
enhance semantic interaction learning (SIL). Specifically, we first construct
a semantic graph based on the statistical symbol co-occurrence probabilities.
Then we design a semantic aware module (SAM), which projects the visual and
classification features into a semantic space. The cosine distance between
different projected vectors indicates the correlation between symbols.
Jointly optimizing HMER and SIL explicitly enhances the model's
understanding of symbol relationships. In addition, SAM can be easily plugged
into existing attention-based models for HMER and consistently brings
improvement. Extensive experiments on public benchmark datasets demonstrate
that our proposed module can effectively enhance the recognition performance.
Our method achieves better recognition performance than prior art on both
the CROHME and HME100K datasets.
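
The two ingredients named in this abstract, symbol co-occurrence statistics and cosine comparison of projected vectors, can be sketched as follows. The abstract does not define the graph or the projection, so the conditional form P(b | a) and the function names are assumptions made for illustration, and the three-expression corpus is invented.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_probs(expressions):
    """Estimate P(b | a): the share of expressions containing a that also contain b."""
    single = Counter()
    pair = Counter()
    for tokens in expressions:
        symbols = sorted(set(tokens))     # count each symbol once per expression
        single.update(symbols)
        for a, b in combinations(symbols, 2):
            pair[(a, b)] += 1
            pair[(b, a)] += 1
    return {(a, b): n / single[a] for (a, b), n in pair.items()}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


# Invented toy corpus of tokenized expressions.
corpus = [[r"\sum", "i", "="], [r"\sum", "i"], ["x", "="]]
graph = cooccurrence_probs(corpus)
print(graph[(r"\sum", "i")], graph[(r"\sum", "=")])
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

In a setup like this, edge weights from the co-occurrence table could serve as targets for the similarity between projected symbol vectors; how the paper actually couples the two is not stated in the abstract.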