Energy-based Self-attentive Learning of Abstractive Communities for Spoken Language Understanding
Abstractive community detection is an important spoken language understanding
task, whose goal is to group utterances in a conversation according to whether
they can be jointly summarized by a common abstractive sentence. This paper
provides a novel approach to this task. We first introduce a neural contextual
utterance encoder featuring three types of self-attention mechanisms. We then
train it using the siamese and triplet energy-based meta-architectures.
Experiments on the AMI corpus show that our system outperforms multiple
energy-based and non-energy-based state-of-the-art baselines. Code and
data are publicly available.
Comment: Update baseline
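As a rough illustration of the triplet setup mentioned in the abstract above, the sketch below computes a margin-based triplet loss over plain Python vectors. Using Euclidean distance as the energy, the margin value, and the function names are all assumptions made for illustration; the abstract does not state the paper's actual energy function or loss.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed energy)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)


# Hypothetical utterance embeddings: the positive shares the anchor's
# community, the negative does not.
anchor = [0.0, 0.0]
positive = [0.0, 0.0]
negative = [3.0, 4.0]
loss = triplet_loss(anchor, positive, negative)
```

In this toy case the negative is already far from the anchor, so the loss is zero; swapping the positive and negative vectors yields a positive loss that training would push down.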
GASCOM: Graph-based Attentive Semantic Context Modeling for Online Conversation Understanding
Online conversation understanding is an important yet challenging NLP problem
which has many useful applications (e.g., hate speech detection). However,
online conversations typically unfold over a series of posts and replies to
those posts, forming a tree structure within which individual posts may refer
to semantic context from higher up the tree. Such semantic cross-referencing
makes it difficult to understand a single post by itself; yet considering the
entire conversation tree is not only difficult to scale but can also be
misleading as a single conversation may have several distinct threads or
points, not all of which are relevant to the post being considered. In this
paper, we propose a Graph-based Attentive Semantic COntext Modeling (GASCOM)
framework for online conversation understanding. Specifically, we design two
novel algorithms that utilise both the graph structure of the online
conversation as well as the semantic information from individual posts for
retrieving relevant context nodes from the whole conversation. We further
design a token-level multi-head graph attention mechanism that attends
differently to different tokens from different selected context utterances for
fine-grained conversation context modeling. Using this semantic conversational
context, we re-examine two well-studied problems: polarity prediction and hate
speech detection. Our proposed framework significantly outperforms
state-of-the-art methods on both tasks, improving macro-F1 scores by 4.5% for
polarity prediction and by 5% for hate speech detection. The GASCOM context
weights also enhance interpretability
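For readers unfamiliar with the macro-F1 metric reported in the abstract above, here is a minimal sketch of how it is commonly computed: F1 is calculated per class and then averaged without weighting by class size. The function name and the toy labels are assumptions for illustration and are not taken from the paper's evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted instances contributes 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy binary example (hypothetical labels, e.g. 1 = hateful, 0 = not).
score = macro_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

Because every class counts equally, macro-F1 penalizes poor performance on rare classes more than accuracy does, which is why it is often preferred on imbalanced tasks such as hate speech detection.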
Radar Instance Transformer: Reliable Moving Instance Segmentation in Sparse Radar Point Clouds
The perception of moving objects is crucial for autonomous robots performing
collision avoidance in dynamic environments. LiDARs and cameras tremendously
enhance scene interpretation but do not provide direct motion information and
face limitations under adverse weather. Radar sensors overcome these
limitations and provide Doppler velocities, delivering direct information on
dynamic objects. In this paper, we address the problem of moving instance
segmentation in radar point clouds to enhance scene interpretation for
safety-critical tasks. Our Radar Instance Transformer enriches the current
radar scan with temporal information without passing aggregated scans through a
neural network. We propose a full-resolution backbone to prevent information
loss in sparse point cloud processing. Our instance transformer head
incorporates essential information to enhance segmentation and also enables
reliable, class-agnostic instance assignments. In sum, our approach shows
superior performance on the new moving instance segmentation benchmarks,
including diverse environments, and provides model-agnostic modules to enhance
scene interpretation. The benchmark is based on the RadarScenes dataset and
will be made available upon acceptance.
Comment: Under review
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision
community because it plays an important role in video surveillance. Many
algorithms have been proposed to handle this task. The goal of this paper is to
review existing works using traditional methods or based on deep learning
networks. Firstly, we introduce the background of pedestrian attribute
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criteria. Thirdly, we
analyse the concepts of multi-task learning and multi-label learning, and also
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures which
have been widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attribute group-based methods,
part-based methods, etc. Fifthly, we show some applications which take
pedestrian attributes into consideration and achieve better performance.
Finally, we summarize this paper and give several possible research directions
for pedestrian attribute recognition. The project page of this paper can be
found at the following website:
https://sites.google.com/view/ahu-pedestrianattributes/
Comment: Check our project page for High Resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes