Semantic Relation Classification via Convolutional Neural Networks with Simple Negative Sampling
Syntactic features play an essential role in identifying the relationship expressed in a sentence. Previous neural network models often suffer from irrelevant information introduced when the subject and object are far apart. In this paper, we propose to learn more robust relation representations from the shortest dependency path through a convolutional neural network. We further propose a straightforward negative sampling strategy to improve the assignment of subjects and objects. Experimental results show that our method outperforms state-of-the-art methods on the SemEval-2010 Task 8 dataset.
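As a rough illustration of the architecture described above, here is a minimal PyTorch sketch: a convolution over the token embeddings along the shortest dependency path, max-pooled into a relation representation. The layer sizes, vocabulary, and path extraction are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch: a CNN over the shortest dependency path between
# subject and object, pooled into a relation representation.
# Sizes and vocabulary are illustrative, not the paper's configuration.
import torch
import torch.nn as nn

class PathCNN(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=50, n_filters=200,
                 window=3, n_relations=19):  # 19 classes in SemEval-2010 Task 8
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=window, padding=1)
        self.out = nn.Linear(n_filters, n_relations)

    def forward(self, path_ids):                 # (batch, path_len) token ids
        x = self.emb(path_ids).transpose(1, 2)   # (batch, emb_dim, path_len)
        h = torch.relu(self.conv(x))             # convolve along the path
        h, _ = h.max(dim=2)                      # max-pool over positions
        return self.out(h)                       # relation scores

# Toy usage: score a 5-token dependency path for a batch of one sentence.
model = PathCNN()
scores = model(torch.randint(0, 10000, (1, 5)))
print(scores.shape)  # torch.Size([1, 19])
```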
Motion Generation from Fine-grained Textual Descriptions
The task of text2motion is to generate human motion sequences from given
textual descriptions, where the model explores diverse mappings from natural
language instructions to human body movements. While most existing works are
confined to coarse-grained motion descriptions, e.g., "A man squats.",
fine-grained descriptions specifying movements of relevant body parts are
barely explored. Models trained with coarse-grained texts may not be able to
learn mappings from fine-grained motion-related words to motion primitives,
resulting in the failure to generate motions from unseen descriptions. In this
paper, we build FineHumanML3D, a large-scale language-motion dataset specialized in fine-grained textual descriptions, by feeding GPT-3.5-turbo step-by-step instructions with compulsory pseudo-code checks. Accordingly, we design a new text2motion model, FineMotionDiffuse, that makes full use of fine-grained textual information. Our quantitative evaluation shows that FineMotionDiffuse trained on FineHumanML3D improves FID by a large margin of 0.38 compared with competitive baselines. According to the qualitative evaluation and case study, our model outperforms MotionDiffuse in generating spatially or chronologically composite motions by learning the implicit mappings from fine-grained descriptions to the corresponding basic motions. We release our data at https://github.com/KunhangL/finemotiondiffuse.
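To make the dataset-construction step concrete, the following is a hedged sketch of prompting GPT-3.5-turbo to expand a coarse caption into body-part-level steps, using the OpenAI Python client. The prompt wording and the refine_caption helper are assumptions for illustration; the paper's actual instructions and pseudo-code checks are more elaborate.

```python
# Hypothetical sketch of the dataset-construction step: asking
# GPT-3.5-turbo to expand a coarse motion caption into fine-grained,
# body-part-level steps. The prompt text is an assumption, not the
# paper's actual instructions or pseudo-code checks.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Rewrite the motion description below as numbered, fine-grained steps, "
    "one per relevant body part (arms, legs, torso, head), keeping the "
    "original chronological order.\n\nDescription: {caption}"
)

def refine_caption(caption: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": PROMPT.format(caption=caption)}],
        temperature=0.2,
    )
    return resp.choices[0].message.content

print(refine_caption("A man squats."))
```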
Learning to Predict Charges for Criminal Cases with Legal Basis
The charge prediction task is to determine appropriate charges for a given case, which is helpful for legal assistant systems where the user input is a fact description. We argue that relevant law articles play an important role in this task, and therefore propose an attention-based neural network method to jointly model the charge prediction task and the relevant article extraction task in a unified framework. The experimental results show that, besides providing a legal basis, the relevant articles can also clearly improve the charge prediction results, and our full model can effectively predict appropriate charges for cases with different expression styles.

Comment: 10 pages, accepted by EMNLP 2017
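A minimal sketch of the joint modelling idea, assuming a simple dot-product attention over learned article embeddings; the dimensions, class counts, and fusion scheme are illustrative stand-ins for the paper's architecture.

```python
# Hypothetical sketch of joint charge prediction with attention over
# law articles: the encoded fact attends to article embeddings, and the
# fused vector feeds both tasks. Dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class JointChargeModel(nn.Module):
    def __init__(self, dim=128, n_articles=321, n_charges=202):
        super().__init__()
        self.article_emb = nn.Embedding(n_articles, dim)   # article representations
        self.charge_head = nn.Linear(2 * dim, n_charges)   # charge prediction task
        self.article_head = nn.Linear(dim, n_articles)     # article extraction task

    def forward(self, fact_vec):                  # (batch, dim) encoded fact description
        art = self.article_emb.weight             # (n_articles, dim)
        attn = torch.softmax(fact_vec @ art.T, dim=-1)     # attend over articles
        art_ctx = attn @ art                      # (batch, dim) article context
        charge_logits = self.charge_head(torch.cat([fact_vec, art_ctx], dim=-1))
        article_logits = self.article_head(fact_vec)       # supervised extraction
        return charge_logits, article_logits

model = JointChargeModel()
charges, articles = model(torch.randn(4, 128))
print(charges.shape, articles.shape)  # torch.Size([4, 202]) torch.Size([4, 321])
```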
Neighborhood Matching Network for Entity Alignment
Structural heterogeneity between knowledge graphs is an outstanding challenge
for entity alignment. This paper presents Neighborhood Matching Network (NMN),
a novel entity alignment framework for tackling the structural heterogeneity
challenge. NMN estimates the similarities between entities to capture both the
topological structure and the neighborhood difference. It provides two
innovative components for better learning representations for entity alignment.
It first uses a novel graph sampling method to distill a discriminative
neighborhood for each entity. It then adopts a cross-graph neighborhood
matching module to jointly encode the neighborhood difference for a given
entity pair. Such strategies allow NMN to effectively construct
matching-oriented entity representations while ignoring noisy neighbors that
have a negative impact on the alignment task. Extensive experiments on three entity alignment datasets show that NMN estimates neighborhood similarity well even in difficult cases and significantly outperforms 12 previous state-of-the-art methods.

Comment: 11 pages, accepted by ACL 2020
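The cross-graph matching idea can be sketched as follows, with cosine similarity and max-pooling standing in for NMN's learned sampling and matching modules; everything here is an illustrative simplification, not the paper's implementation.

```python
# Hypothetical sketch of cross-graph neighborhood matching: compare the
# sampled neighbor embeddings of two candidate entities and aggregate
# the best matches into one similarity score.
import torch

def neighborhood_similarity(neigh_a, neigh_b):
    """neigh_a: (n, d) neighbor embeddings of an entity in KG1,
       neigh_b: (m, d) neighbor embeddings of an entity in KG2."""
    a = torch.nn.functional.normalize(neigh_a, dim=-1)
    b = torch.nn.functional.normalize(neigh_b, dim=-1)
    sim = a @ b.T                      # (n, m) cross-graph cosine similarities
    # each neighbor of the first entity votes with its best match
    return sim.max(dim=1).values.mean().item()

score = neighborhood_similarity(torch.randn(5, 64), torch.randn(7, 64))
print(round(score, 3))
```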
Generalized Implicit Factorization Problem
The Implicit Factorization Problem was first introduced by May and Ritzenhofen at PKC'09. This problem aims to factorize two RSA moduli $N_1 = p_1 q_1$ and $N_2 = p_2 q_2$ when their prime factors share a certain number of least significant bits (LSBs). They proposed a lattice-based algorithm to tackle this problem and extended it to cover $k > 2$ RSA moduli. Since then, several variations of the Implicit Factorization Problem have been studied, including the cases where $p_1$ and $p_2$ share some most significant bits (MSBs), middle bits, or both MSBs and LSBs at the same position.
In this paper, we explore a more general case of the Implicit Factorization
Problem, where the shared bits are located at different and unknown positions
for different primes. We propose a lattice-based algorithm and analyze its
efficiency under certain conditions. We also present experimental results to
support our analysis.
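To make the setting concrete, here is a toy demonstration of the original shared-LSB case (not this paper's generalization): when $p_1$ and $p_2$ share their $t$ least significant bits and $q_1, q_2$ are small enough, $(q_1, q_2)$ appears as the shortest vector of a two-dimensional lattice, and Lagrange reduction recovers it. Parameter sizes are illustrative, far below cryptographic scale.

```python
# Hypothetical toy demo of the shared-LSB Implicit Factorization setting
# (May-Ritzenhofen, PKC'09). Parameter sizes are illustrative assumptions.
import random
from sympy import isprime, randprime

t, alpha, high = 150, 60, 100        # shared LSBs, size of q_i, top bits of p_i

def prime_with_lsbs(shared):
    while True:                       # fix the t low bits, randomize the rest
        cand = (random.getrandbits(high) << t) + shared
        if cand.bit_length() > t and isprime(cand):
            return cand

shared = random.getrandbits(t) | 1            # shared odd low bits of p1, p2
p1, p2 = prime_with_lsbs(shared), prime_with_lsbs(shared)
q1, q2 = randprime(2**(alpha-1), 2**alpha), randprime(2**(alpha-1), 2**alpha)
N1, N2 = p1 * q1, p2 * q2

# N1*q2 - N2*q1 = q1*q2*(p1 - p2) = 0 mod 2^t, so (q1, q2) lies in the
# lattice spanned by (1, N2 * N1^{-1} mod 2^t) and (0, 2^t).
y0 = (N2 * pow(N1, -1, 2**t)) % 2**t
u, v = (1, y0), (0, 2**t)

def reduce2d(u, v):                   # Lagrange (Gauss) reduction in dimension 2
    n2 = lambda w: w[0] * w[0] + w[1] * w[1]
    if n2(u) > n2(v):
        u, v = v, u
    while True:
        dot, n = u[0] * v[0] + u[1] * v[1], n2(u)
        m = (2 * dot + n) // (2 * n)  # nearest integer to dot / n, exactly
        v = (v[0] - m * u[0], v[1] - m * u[1])
        if n2(v) >= n2(u):
            return u                  # u is now a shortest nonzero vector
        u, v = v, u

g1, g2 = map(abs, reduce2d(u, v))
assert N1 % g1 == 0 and N2 % g2 == 0  # the short vector reveals q1 and q2
print("factored:", g1 == q1 and g2 == q2)
```

With these sizes the condition $t \ge 2(\alpha + 1)$ holds ($150 \ge 122$), so the short vector is the desired one with overwhelming probability.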
Automatic caption generation for news images
This thesis is concerned with the task of automatically generating captions for images,
which is important for many image-related applications. Automatic description generation
for video frames would help security authorities manage and utilize large volumes of
monitoring data more efficiently. Image search engines could potentially benefit
from image description in supporting more accurate and targeted queries for end
users. Importantly, generating image descriptions would aid blind or partially sighted
people who cannot access visual information in the same way as sighted people can.
However, previous work has relied on fine-grained resources, manually created for specific
domains and applications. In this thesis, we explore the feasibility of automatic
caption generation for news images in a knowledge-lean way. We depart from previous
work, as we learn a model of caption generation from publicly available data that
has not been explicitly labelled for our task. The model consists of two components,
namely extracting image content and rendering it in natural language.
Specifically, we exploit data resources where images and their textual descriptions
co-occur naturally. We present a new dataset consisting of news articles, images, and
their captions that we acquired from the BBC News website. Rather than laboriously
annotating images with keywords, we simply treat the captions as the labels. We show
that it is possible to learn the visual and textual correspondence under such noisy conditions
by extending an existing generative annotation model (Lavrenko et al., 2003).
We also find that the accompanying news documents substantially complement the
extraction of the image content. In order to provide better modelling and representation
of image content, we propose a probabilistic image annotation model that exploits
the synergy between visual and textual modalities under the assumption that images
and their textual descriptions are generated by a shared set of latent variables (topics).
Using Latent Dirichlet Allocation (Blei and Jordan, 2003), we represent visual and
textual modalities jointly as a probability distribution over a set of topics. Our model
takes these topic distributions into account while finding the most likely keywords for
an image and its associated document.
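The joint topic idea can be sketched with off-the-shelf LDA: treat an image's quantized visual terms and its document's words as one bag over a mixed vocabulary, then rank candidate keywords by p(word | document) marginalized over topics. The toy data and the use of gensim here are assumptions for illustration, not the thesis's BBC dataset or exact model.

```python
# Hypothetical sketch of a joint visual-textual topic model: fake "vis_*"
# tokens stand in for quantized visual terms, mixed with document words.
from gensim import corpora, models

docs = [
    ["vis_3", "vis_7", "minister", "election", "vote"],
    ["vis_7", "vis_1", "parliament", "election", "debate"],
    ["vis_5", "vis_2", "storm", "coast", "flood"],
    ["vis_2", "vis_5", "rain", "flood", "weather"],
]
dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=2,
                      passes=50, random_state=0)

def keyword_scores(tokens, candidates):
    bow = dictionary.doc2bow(tokens)
    theta = dict(lda.get_document_topics(bow, minimum_probability=0.0))
    phi = lda.get_topics()                     # (num_topics, vocab) p(word | topic)
    score = lambda w: sum(theta[k] * phi[k, dictionary.token2id[w]]
                          for k in theta)      # p(w | doc) = sum_k p(w|k) p(k|doc)
    return sorted(candidates, key=score, reverse=True)

# rank textual keywords for a new image+article pair
print(keyword_scores(["vis_7", "election"], ["flood", "election", "storm"]))
```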
The availability of news documents in our dataset allows us to perform the caption
generation task in a fashion akin to text summarization, save for one important difference:
our model is not based solely on text but uses the image to select content
from the document that should be present in the caption. We propose both extractive
and abstractive caption generation models to render the extracted image content
in natural language without relying on rich knowledge resources, sentence templates, or grammars. The backbone of both approaches is our topic-based image annotation
model. Our extractive models examine how best to select sentences that overlap in
content with the output of our image annotation model. We adapt an existing abstractive headline
generation model to our scenario by incorporating visual information. Our own
model operates over image description keywords and document phrases, taking dependency
and word-order constraints into account. Experimental results show that both
approaches can generate human-readable captions for news images. Our phrase-based
abstractive model manages to yield captions as informative as those written by the
BBC journalists.
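A minimal sketch of the extractive step, assuming the annotation keywords have already been predicted by the topic model: score each document sentence by its normalized overlap with the keywords and emit the best-scoring sentence as the caption. The scoring function is a deliberately simple stand-in for the thesis's selection models.

```python
# Hypothetical sketch of extractive caption generation: pick the document
# sentence that best overlaps with the image's predicted keywords.
import string

def extract_caption(sentences, keywords):
    kw = set(keywords)
    def overlap(sent):
        words = {w.strip(string.punctuation) for w in sent.lower().split()}
        return len(words & kw) / (len(words) or 1)   # normalized overlap
    return max(sentences, key=overlap)

article = [
    "Heavy rain battered the coast overnight.",
    "Officials said the flood damaged dozens of homes.",
    "The prime minister is due to visit next week.",
]
print(extract_caption(article, ["flood", "homes", "rain"]))
```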