Context Perception Parallel Decoder for Scene Text Recognition
Scene text recognition (STR) methods have struggled to attain both high accuracy
and fast inference speed. Autoregressive (AR)-based STR models use the
previously recognized characters to decode the next character iteratively. This
yields superior accuracy, but the iteration also makes inference slow.
Alternatively, parallel decoding (PD)-based STR models infer all the characters
in a single decoding pass. They are faster but less accurate, as it is difficult
to build a robust recognition context in such a pass. In this paper, we first present an
empirical study of AR decoding in STR. In addition to constructing a new AR
model with the top accuracy, we find that the success of the AR decoder lies
also in providing guidance on visual context perception, not only in language
modeling as claimed in existing studies. As a consequence, we propose the Context
Perception Parallel Decoder (CPPD) to decode the character sequence in a single
PD pass. CPPD devises a character counting module and a character ordering
module. Given a text instance, the former infers the occurrence count of each
character, while the latter deduces the character reading order and
placeholders. Together with the character prediction task, they construct a
context that robustly tells what the character sequence is and where the
characters appear, well mimicking the context conveyed by AR decoding.
Experiments on both English and Chinese benchmarks demonstrate that CPPD models
achieve highly competitive accuracy. Moreover, they run approximately 7x faster
than their AR counterparts, and are also among the fastest recognizers. The
code will be released soon.
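The two auxiliary tasks described above can be illustrated with a small sketch. The abstract does not specify label formats, so everything here is an assumption made for illustration: the helper names `counting_targets` and `ordering_targets`, the lowercase charset, and the fixed `max_len` are all hypothetical. The counting target records how often each character occurs, and the ordering target marks which reading-order slots hold a character and which are padding.

```python
from collections import Counter


def counting_targets(text, charset):
    """Occurrence count of each charset character in `text` (hypothetical label format)."""
    counts = Counter(text)
    return [counts.get(ch, 0) for ch in charset]


def ordering_targets(text, max_len):
    """1 for every reading-order slot that holds a character, 0 for padding slots."""
    if len(text) > max_len:
        raise ValueError("text is longer than max_len")
    return [1] * len(text) + [0] * (max_len - len(text))


# Illustrative inputs, not taken from the paper.
charset = "abcdefghijklmnopqrstuvwxyz"
word = "hello"

count_vec = counting_targets(word, charset)    # what characters occur, and how often
order_vec = ordering_targets(word, max_len=8)  # where characters appear

print(dict((ch, n) for ch, n in zip(charset, count_vec) if n))
print(order_vec)
```

Together, the two vectors say what the characters are and how many positions are occupied, which is the kind of context the abstract attributes to AR decoding.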
Unsupervised Cross-Domain Rumor Detection with Contrastive Learning and Cross-Attention
Massive rumors usually appear along with breaking news or trending topics, seriously obscuring the truth. Existing rumor detection methods mostly focus on a single domain and thus perform poorly in cross-domain scenarios due to domain shift. In this work, we propose an end-to-end instance-wise and prototype-wise contrastive learning model with a cross-attention mechanism for cross-domain rumor detection. The model not only performs cross-domain feature alignment, but also enforces target samples to align with the corresponding prototypes of a given source domain. Since labels in the target domain are unavailable, we use a clustering-based approach, with centers carefully initialized by a batch of source domain samples, to produce pseudo labels. Moreover, we use a cross-attention mechanism on a pair of source data and target data with the same labels to learn domain-invariant representations. Because the samples in a domain pair tend to express similar semantic patterns, especially in people's attitudes (e.g., supporting or denying) towards the same category of rumors, the discrepancy between a pair of source and target domains is decreased. We conduct experiments on four groups of cross-domain datasets and show that our proposed model achieves state-of-the-art performance.
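The pseudo-labeling step can be sketched as a k-means-style procedure whose centers start from per-class means of a labeled source batch. This is a minimal illustration under assumptions, not the authors' implementation: the abstract does not name the clustering algorithm, and the function names, the 2-D toy features, and the iteration count are all hypothetical.

```python
import math


def class_means(features, labels):
    """Mean feature vector per class, computed from a labeled source batch."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def nearest(x, centers):
    """Label of the center closest to x in Euclidean distance."""
    return min(centers, key=lambda y: math.dist(x, centers[y]))


def pseudo_label(target, init_centers, n_iter=10):
    """Assign target samples to centers, refining the centers on target data only."""
    centers = dict(init_centers)
    labels = []
    for _ in range(n_iter):
        labels = [nearest(x, centers) for x in target]
        for y in centers:
            members = [x for x, lab in zip(target, labels) if lab == y]
            if members:  # keep the old center if no sample was assigned to it
                centers[y] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy features, invented for illustration.
source_x = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
source_y = ["non-rumor", "non-rumor", "rumor", "rumor"]
target_x = [[0.3, 0.2], [1.2, 0.9], [0.1, 0.4], [0.8, 1.3]]

target_pseudo = pseudo_label(target_x, class_means(source_x, source_y))
print(target_pseudo)
```

Initializing from source class means ties each cluster to a known label, so the resulting cluster indices can be read directly as pseudo labels for the target samples.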
Deep Embedded Clustering with Distribution Consistency Preservation for Attributed Networks
Many complex systems in the real world can be characterized by attributed
networks. To mine the potential information in these networks, deep embedded
clustering, which obtains node representations and clusters simultaneously, has
received much attention in recent years. Under the assumption of consistency
for data in different views, the cluster structure of network topology and that
of node attributes should be consistent for an attributed network. However,
many existing methods ignore this property, even though they separately encode
node representations from network topology and node attributes while
clustering nodes on representation vectors learnt from one of the views.
Therefore, in this study, we propose an end-to-end deep embedded clustering
model for attributed networks. It utilizes a graph autoencoder and a node attribute
autoencoder to respectively learn node representations and cluster assignments.
In addition, a distribution consistency constraint is introduced to maintain
the latent consistency of cluster distributions of two views. Extensive
experiments on several datasets demonstrate that the proposed model achieves
significantly better or competitive performance compared with the
state-of-the-art methods. The source code can be found at
https://github.com/Zhengymm/DCP.
Comment: 28 pages, 5 figures
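One way to picture a distribution consistency constraint is as a divergence between the soft cluster assignments produced by the two views. The abstract does not give the exact form of the constraint, so the sketch below rests on assumptions: it uses the Student's t soft assignment from the original DEC formulation (with one degree of freedom) and a mean KL divergence, shares one set of cluster centers across views for simplicity, and all function names and toy embeddings are hypothetical.

```python
import math


def soft_assign(z, centers):
    """Soft cluster assignment of embedding z via a Student's t kernel (alpha = 1)."""
    weights = [
        1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(z, c)))
        for c in centers
    ]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(p_topology, p_attribute):
    """Mean per-node KL divergence between the two views' cluster distributions."""
    pairs = zip(p_topology, p_attribute)
    return sum(kl_divergence(p, q) for p, q in pairs) / len(p_topology)


# Toy embeddings of two nodes in each view, invented for illustration.
centers = [[0.0, 0.0], [2.0, 2.0]]
z_topology = [[0.1, 0.0], [1.9, 2.1]]
z_attribute = [[0.0, 0.2], [2.2, 1.8]]

P_topo = [soft_assign(z, centers) for z in z_topology]
P_attr = [soft_assign(z, centers) for z in z_attribute]

loss = consistency_loss(P_topo, P_attr)
print(loss)
```

The loss is zero when both views assign every node the same cluster distribution and grows as the views disagree, so minimizing it alongside the reconstruction objectives would pull the two cluster structures together.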
A Generative Node-attribute Network Model for Detecting Generalized Structure
Exploring meaningful structural regularities embedded in networks is a key to
understanding and analyzing the structure and function of a network. The
node-attribute information can help improve such understanding and analysis.
However, most of the existing methods focus on detecting traditional
communities, i.e., groupings of nodes with dense internal connections and
sparse external ones. In this paper, based on the connectivity behavior of
nodes and homogeneity of attributes, we propose a principled model (named GNAN),
which can generate both topology information and attribute information. The new
model can detect not only community structure, but also a range of other types
of structure in networks, such as bipartite structure, core-periphery
structure, and their mixture structure, which are collectively referred to as
generalized structure. The proposed model that combines topological information
and node-attribute information can detect communities more accurately than the
model that only uses topology information. The dependency between attributes
and communities can be automatically learned by our model and thus we can
ignore the attributes that do not contain useful information. The model
parameters are inferred using the expectation-maximization algorithm. A
case study is provided to show that our model yields semantically
interpretable communities. Experiments on both synthetic and real-world
networks show that the new model is competitive with other state-of-the-art
models.