Tensorized Self-Attention: Efficiently Modeling Pairwise and Global Dependencies Together
Neural networks equipped with self-attention have parallelizable computation,
a lightweight structure, and the ability to capture both long-range and local
dependencies. Further, their expressive power and performance can be boosted by
using a vector to measure pairwise dependency, but this requires expanding the
alignment matrix into a tensor, which results in memory and computation
bottlenecks. In this paper, we propose a novel attention mechanism called
"Multi-mask Tensorized Self-Attention" (MTSA), which is as fast and as
memory-efficient as a CNN, but significantly outperforms previous
CNN-/RNN-/attention-based models. MTSA 1) captures both pairwise (token2token)
and global (source2token) dependencies by a novel compatibility function
composed of dot-product and additive attentions, 2) uses a tensor to represent
the feature-wise alignment scores for better expressive power but only requires
parallelizable matrix multiplications, and 3) combines multi-head with
multi-dimensional attentions and applies a distinct positional mask to each
head (subspace), so that memory and computation are distributed across multiple
heads, each of which encodes sequential information independently. The experiments
show that a CNN/RNN-free model based on MTSA achieves state-of-the-art or
competitive performance on nine NLP benchmarks with compelling memory and
time efficiency.
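To make point 2) concrete, here is a minimal single-head sketch in PyTorch of how such a factorized score tensor can be applied without ever being materialized. The projection shapes, the tanh-based source2token scoring, and the unstabilized softmax are illustrative assumptions, not the authors' implementation; the point it demonstrates is that the softmax over the implicit (n, n, d) scores reduces to ordinary matrix products.

```python
import torch

def mtsa_head(x, pos_mask, Wq, Wk, Wv, W1, W2):
    # x: (n, d) token features for one head; pos_mask: (n, n) additive mask (0 / -inf).
    n, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv      # token2token projections, (n, d) each
    p = (q @ k.T) / d ** 0.5 + pos_mask   # scalar pairwise dot-product scores, (n, n)
    s = torch.tanh(x @ W1) @ W2           # feature-wise (multi-dim) additive-style
                                          # source2token scores, (n, d)
    # The implicit score tensor e[i, j, :] = p[i, j] + s[j, :] has shape (n, n, d),
    # but its softmax over j factorizes, so only matrix products are needed
    # (a real implementation would subtract row maxima before exp for stability):
    ep, es = p.exp(), s.exp()
    return (ep @ (es * v)) / (ep @ es)    # (n, d), per-feature attention output

# A distinct mask per head encodes order, e.g. a forward (causal) mask:
n, d = 8, 16
x = torch.randn(n, d)
fwd_mask = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
Ws = [torch.randn(d, d) / d ** 0.5 for _ in range(5)]
out = mtsa_head(x, fwd_mask, *Ws)         # (8, 16)
```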
ReadNet: A Hierarchical Transformer Framework for Web Article Readability Analysis
Analyzing the readability of articles has been an important sociolinguistic
task. Addressing it is necessary for the automatic recommendation of
appropriate articles to readers with different comprehension abilities, and it
further benefits education systems, web information systems, and digital
libraries. Current methods for assessing readability employ empirical measures
or statistical learning techniques that are limited in their ability to
characterize complex patterns such as article structures and semantic meanings
of sentences. In this paper, we propose a new and comprehensive framework which
uses a hierarchical self-attention model to analyze document readability. In
this model, measurements of sentence-level difficulty are captured along with
the semantic meaning of each sentence. Additionally, the sentence-level
features are incorporated to characterize the overall readability of an article,
taking article structure into account. We evaluate our proposed approach on
three widely-used benchmark datasets against several strong baseline
approaches. Experimental results show that our proposed method achieves
state-of-the-art performance in estimating the readability of various web
articles and literature.
Comment: ECIR 2020
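As a rough illustration of this hierarchy, the sketch below stacks a sentence-level Transformer encoder over words and a document-level encoder over sentence representations, concatenating handcrafted sentence-difficulty features in between. The layer sizes, mean pooling, and use of PyTorch's nn.TransformerEncoder are assumptions made for illustration, not the paper's exact ReadNet architecture.

```python
import torch
import torch.nn as nn

class HierarchicalReadability(nn.Module):
    def __init__(self, vocab_size, d=128, n_feats=8, n_levels=5):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d)
        # Sentence encoder: self-attention over the words of each sentence.
        self.sent_enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d, nhead=4, batch_first=True), num_layers=2)
        # Document encoder: self-attention over sentence representations,
        # augmented with handcrafted sentence-difficulty features.
        self.doc_enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d + n_feats, nhead=4, batch_first=True),
            num_layers=2)
        self.head = nn.Linear(d + n_feats, n_levels)   # readability-level logits

    def forward(self, docs, feats):
        # docs:  (batch, n_sents, n_words) token ids
        # feats: (batch, n_sents, n_feats) sentence-level difficulty measures
        b, s, w = docs.shape
        words = self.emb(docs.view(b * s, w))          # embed words per sentence
        sents = self.sent_enc(words).mean(dim=1)       # pool words -> (b*s, d)
        sents = torch.cat([sents.view(b, s, -1), feats], dim=-1)
        doc = self.doc_enc(sents).mean(dim=1)          # pool sentences -> (b, d+n_feats)
        return self.head(doc)

model = HierarchicalReadability(vocab_size=30000)
logits = model(torch.randint(0, 30000, (2, 6, 20)), torch.randn(2, 6, 8))  # (2, 5)
```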