On the Relevance of Cross-project Learning with Nearest Neighbours for Commit Message Generation
Commit messages play an important role in software maintenance and evolution.
Nonetheless, developers often do not produce high-quality messages. A number of
commit message generation methods have been proposed in recent years to address
this problem. Some of these methods are based on neural machine translation
(NMT) techniques. Studies show that the nearest neighbor algorithm (NNGen)
outperforms existing NMT-based methods, although NNGen is simpler and faster
than NMT. In this paper, we show that NNGen does not take advantage of
cross-project learning in the majority of cases. We also show that there is an even simpler and faster variation of the existing NNGen method that outperforms it in terms of the BLEU-4 score without using cross-project learning.
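As a rough illustration of the retrieval idea behind NNGen, the sketch below returns the commit message of the most similar training diff under bag-of-words cosine similarity. The whitespace tokenization and the omission of NNGen's BLEU-based re-ranking of top candidates are simplifying assumptions, not the paper's exact procedure.

```python
# Minimal nearest-neighbour commit message generation sketch.
# Assumes whitespace tokenization and plain cosine similarity; the
# published NNGen additionally re-ranks top-k candidates by BLEU.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def nngen(test_diff: str, train_diffs: list[str], train_msgs: list[str]) -> str:
    """Return the message of the training diff most similar to test_diff."""
    query = Counter(test_diff.split())
    scores = [cosine(query, Counter(d.split())) for d in train_diffs]
    return train_msgs[max(range(len(scores)), key=scores.__getitem__)]

# Example: reuse the message of the closest past diff.
diffs = ["+ add null check in parser", "+ bump version to 1.2"]
msgs = ["Fix NPE in parser", "Bump version to 1.2"]
print(nngen("+ add null check in lexer", diffs, msgs))  # -> "Fix NPE in parser"
```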
Don't hide in the frames: Note- and pattern-based evaluation of automated melody extraction algorithms
Ranking Enhanced Dialogue Generation
How to effectively utilize the dialogue history is a crucial problem in
multi-turn dialogue generation. Previous works usually employ various neural
network architectures (e.g., recurrent neural networks, attention mechanisms,
and hierarchical structures) to model the history. However, a recent empirical
study by Sankar et al. has shown that these architectures lack the ability of
understanding and modeling the dynamics of the dialogue history. For example,
the widely used architectures are insensitive to perturbations of the dialogue
history, such as word shuffling, missing utterances, and utterance reordering. To tackle this problem, we propose a Ranking Enhanced Dialogue Generation framework in this paper. In addition to the traditional representation encoder and response generation modules, a ranking module is introduced to model the ranking relation between the former utterance and its consecutive utterances. Specifically, the former utterance and consecutive
utterances are treated as a query and its corresponding documents, and both local and
global ranking losses are designed in the learning process. In this way, the
dynamics in the dialogue history can be explicitly captured. To evaluate our
proposed models, we conduct extensive experiments on three public datasets,
i.e., bAbI, PersonaChat, and JDC. Experimental results show that our models
produce better responses in terms of both quantitative measures and human
judgments, as compared with the state-of-the-art dialogue generation models.
Furthermore, we provide detailed experimental analysis to show where and how the improvements come from.
Comment: Accepted at CIKM 2020.
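For intuition, here is a hedged PyTorch sketch of a pairwise ranking loss in the spirit of the ranking module described above, where the former utterance acts as the query and consecutive utterances as documents; the paper's exact local and global loss formulations may differ.

```python
# Margin-based ranking loss sketch: score true consecutive utterances
# above mismatched ones relative to the former-utterance query.
# Illustrative only; not the paper's exact local/global losses.
import torch
import torch.nn.functional as F

def ranking_loss(query: torch.Tensor, positives: torch.Tensor,
                 negatives: torch.Tensor, margin: float = 0.2) -> torch.Tensor:
    """query: (d,) encoding of the former utterance;
    positives/negatives: (n, d) encodings of candidate next utterances."""
    pos = F.cosine_similarity(query.unsqueeze(0), positives)  # (n,)
    neg = F.cosine_similarity(query.unsqueeze(0), negatives)  # (n,)
    return F.relu(margin - pos + neg).mean()

q = torch.randn(64, requires_grad=True)
loss = ranking_loss(q, torch.randn(5, 64), torch.randn(5, 64))
loss.backward()  # would be added to the generation loss during training
```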
Asking Questions the Human Way: Scalable Question-Answer Generation from Text Corpus
The ability to ask questions is important in both human and machine
intelligence. Learning to ask questions helps knowledge acquisition, improves
question-answering and machine reading comprehension tasks, and helps a chatbot
to keep the conversation flowing with a human. Existing question generation
models are ineffective at generating a large number of high-quality question-answer pairs from unstructured text, since, given an answer and an input passage, question generation is inherently a one-to-many mapping. In this
paper, we propose Answer-Clue-Style-aware Question Generation (ACS-QG), which
aims at automatically generating high-quality and diverse question-answer pairs
from an unlabeled text corpus at scale by imitating the way a human asks
questions. Our system consists of: i) an information extractor, which samples
from the text multiple types of assistive information to guide question
generation; ii) neural question generators, which generate diverse and
controllable questions, leveraging the extracted assistive information; and
iii) a neural quality controller, which removes low-quality generated data
based on text entailment. We compare our question generation models with
existing approaches and resort to voluntary human evaluation to assess the
quality of the generated question-answer pairs. The evaluation results suggest
that our system dramatically outperforms state-of-the-art neural question
generation models in terms of generation quality while remaining scalable. With models trained on a relatively small amount of data, we can generate 2.8 million quality-assured question-answer pairs from a million sentences found in Wikipedia.
Comment: Accepted by The Web Conference 2020 (WWW 2020) as a full paper (oral presentation).
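The three-component pipeline can be pictured with the toy skeleton below; the component names follow the paper, but each stub (template generation, substring-based entailment check) is a deliberately crude stand-in for the neural models.

```python
# Structural sketch of the ACS-QG pipeline: extract assistive info,
# generate candidate QA pairs, then filter by an entailment-style check.
# All three stubs are illustrative placeholders, not the paper's models.
def extract_info(sentence: str) -> list[tuple[str, str, str]]:
    # i) information extractor: sample (answer, clue, style) tuples
    return [("Paris", "capital of France", "what")]

def generate_questions(info) -> list[tuple[str, str]]:
    # ii) question generator: here a trivial template stand-in
    return [(f"What is the {clue}?", answer) for answer, clue, _style in info]

def passes_quality_control(sentence: str, answer: str) -> bool:
    # iii) quality controller: keep pairs whose answer the text entails
    return answer in sentence  # crude proxy for a textual-entailment model

def acs_qg(sentence: str) -> list[tuple[str, str]]:
    pairs = generate_questions(extract_info(sentence))
    return [(q, a) for q, a in pairs if passes_quality_control(sentence, a)]

print(acs_qg("Paris is the capital of France."))
# [('What is the capital of France?', 'Paris')]
```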
ChoreoNet: Towards Music to Dance Synthesis with Choreographic Action Unit
Dance and music are two highly correlated artistic forms. Synthesizing dance
motions has attracted much attention recently. Most previous works conduct music-to-dance synthesis by directly mapping music to human skeleton keypoints. In contrast, human choreographers design dance motions from music in a two-stage manner: they first devise multiple choreographic action units (CAUs), each with a series of dance motions, and then arrange the CAU sequence according to the rhythm, melody and emotion of the music. Inspired by this, we systematically study this two-stage choreography approach and construct a
dataset to incorporate such choreography knowledge. Based on the constructed
dataset, we design a two-stage music-to-dance synthesis framework ChoreoNet to
imitate the human choreography procedure. Our framework first devises a CAU
prediction model to learn the mapping relationship between music and CAU
sequences. Afterwards, we devise a spatial-temporal inpainting model to convert
the CAU sequence into continuous dance motions. Experimental results
demonstrate that the proposed ChoreoNet outperforms baseline methods (0.622 in
terms of CAU BLEU score and 1.59 in terms of user study score).
Comment: 10 pages, 5 figures, accepted by ACM MM 2020.
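The two-stage procedure can be summarized with the placeholder sketch below: stage one maps music features to a CAU sequence, and stage two stitches each unit's motion clip together while filling the transitions, which is the role of the spatial-temporal inpainting model. The CAU vocabulary, the motion bank, and the linear interpolation here are illustrative assumptions, not the paper's learned models.

```python
# Two-stage music-to-dance skeleton in the spirit of ChoreoNet.
# predict_cau_sequence stands in for the learned music-to-CAU model;
# linear interpolation stands in for the spatial-temporal inpainting model.
def predict_cau_sequence(music_features) -> list[str]:
    # Stage 1: map rhythm/melody/emotion features to CAU ids (placeholder)
    return ["step_left", "spin", "arm_wave"]

def interpolate(last_pose, next_pose, steps: int = 4):
    # Placeholder transition: linearly blend between boundary poses
    return [[a + (b - a) * t / (steps + 1) for a, b in zip(last_pose, next_pose)]
            for t in range(1, steps + 1)]

def synthesize_motion(caus: list[str], motion_bank: dict) -> list:
    # Stage 2: concatenate each CAU's keypoint clip, filling transitions
    motion = list(motion_bank[caus[0]])
    for prev, cur in zip(caus, caus[1:]):
        motion += interpolate(motion_bank[prev][-1], motion_bank[cur][0])
        motion += motion_bank[cur]
    return motion

bank = {c: [[0.0, 0.0], [1.0, 1.0]] for c in ["step_left", "spin", "arm_wave"]}
print(len(synthesize_motion(predict_cau_sequence(None), bank)))  # 14 frames
```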
Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning
According to the World Health Organization(WHO), it is estimated that
approximately 1.3 billion people live with some form of vision impairment globally, of whom 36 million are blind. Due to their disability, integrating this minority group into society is a challenging problem. The recent rise of smart mobile phones provides a new solution by giving blind users convenient access to information and services for understanding the world. Users with
vision impairment can adopt the screen reader embedded in the mobile operating
systems to read the content of each screen within the app, and use gestures to
interact with the phone. However, the prerequisite of using screen readers is
that developers have to add natural-language labels to the image-based
components when they are developing the app. Unfortunately, more than 77% of apps have missing-label issues, according to our analysis of 10,408 Android apps. Most of these issues are caused by developers' lack of awareness of and knowledge about this minority group. And even if developers want to add labels to UI components, they may not come up with concise and clear descriptions, as most of them have no visual impairment themselves. To overcome these
challenges, we develop a deep-learning based model, called LabelDroid, to
automatically predict the labels of image-based buttons by learning from
large-scale commercial apps in Google Play. The experimental results show that our model can make accurate predictions and that the generated labels are of higher quality than those from real Android developers.
Comment: Accepted to the 42nd International Conference on Software Engineering.
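The accessibility gap being measured is concrete: an image-based widget whose android:contentDescription is absent gives a screen reader nothing to announce. As a small illustration (a file-level check, whereas the paper analyses compiled apps at scale), the sketch below scans a layout XML for such unlabeled widgets.

```python
# Scan an Android layout file for image-based widgets that screen readers
# cannot announce because android:contentDescription is missing.
# Illustrative only; LabelDroid itself predicts the missing labels.
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
IMAGE_WIDGETS = {"ImageButton", "ImageView"}

def unlabeled_widgets(layout_path: str) -> list[str]:
    missing = []
    for elem in ET.parse(layout_path).iter():
        if elem.tag in IMAGE_WIDGETS \
                and ANDROID_NS + "contentDescription" not in elem.attrib:
            missing.append(elem.tag)
    return missing

# e.g. unlabeled_widgets("res/layout/activity_main.xml")
# -> ["ImageButton"] for every image button a blind user cannot identify
```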
Unsupervised Paraphrasing via Deep Reinforcement Learning
Paraphrasing is expressing the meaning of an input sentence in different
wording while maintaining fluency (i.e., grammatical and syntactic correctness). Most existing work on paraphrasing uses supervised models that are
limited to specific domains (e.g., image captions). Such models can neither be
straightforwardly transferred to other domains nor generalize well, and
creating labeled training data for new domains is expensive and laborious. The
need for paraphrasing across different domains and the scarcity of labeled
training data in many such domains call for exploring unsupervised paraphrase
generation methods. We propose Progressive Unsupervised Paraphrasing (PUP): a
novel unsupervised paraphrase generation method based on deep reinforcement
learning (DRL). PUP uses a variational autoencoder (trained using a
non-parallel corpus) to generate a seed paraphrase that warm-starts the DRL
model. Then, PUP progressively tunes the seed paraphrase guided by our novel
reward function which combines semantic adequacy, language fluency, and
expression diversity measures to quantify the quality of the generated
paraphrases in each iteration without needing parallel sentences. Our extensive
experimental evaluation shows that PUP outperforms unsupervised
state-of-the-art paraphrasing techniques in terms of both automatic metrics and
user studies on four real datasets. We also show that PUP outperforms
domain-adapted supervised algorithms on several datasets. Our evaluation also
shows that PUP achieves a good trade-off between semantic similarity and diversity of expression.
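A hedged sketch of how such a combined reward can be assembled is shown below; the weights and the toy component scorers are illustrative assumptions, not the paper's exact formulation.

```python
# Combined reward sketch in the spirit of PUP: weight semantic adequacy,
# language fluency, and expression diversity into one scalar signal for DRL.
# Weights and the toy diversity scorer are illustrative assumptions.
def token_diversity(source: str, paraphrase: str) -> float:
    # Toy diversity: 1 - Jaccard overlap of token sets
    s, p = set(source.split()), set(paraphrase.split())
    return 1.0 - len(s & p) / max(len(s | p), 1)

def pup_reward(source: str, paraphrase: str, adequacy, fluency,
               w_a: float = 0.4, w_f: float = 0.3, w_d: float = 0.3) -> float:
    """adequacy/fluency: callables mapping (source, paraphrase) to [0, 1]."""
    return (w_a * adequacy(source, paraphrase)
            + w_f * fluency(source, paraphrase)
            + w_d * token_diversity(source, paraphrase))

# e.g. with stub scorers standing in for trained models:
reward = pup_reward("the cat sat", "a cat was sitting",
                    adequacy=lambda s, p: 0.9, fluency=lambda s, p: 0.8)
print(round(reward, 3))
```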
Automatic Testing and Improvement of Machine Translation
This paper presents TransRepair, a fully automatic approach for testing and repairing the consistency of machine translation systems. TransRepair combines mutation with metamorphic testing to detect inconsistency bugs (without access to human oracles). It then adopts probability-reference or cross-reference to post-process the translations, in a grey-box or black-box manner, to repair the inconsistencies. Our evaluation on two state-of-the-art translators, Google Translate and Transformer, indicates that TransRepair has a high precision (99%) in generating input pairs with consistent translations. With these tests, using automatic consistency metrics and manual assessment, we find that Google Translate and Transformer have approximately 36% and 40% inconsistency bugs, respectively. Black-box repair fixes 28% and 19% of bugs on average for Google Translate and Transformer. Grey-box repair fixes 30% of bugs on average for Transformer. Manual inspection indicates that the translations repaired by our approach improve consistency in 87% of cases (degrading it in 2%), and that our repairs have better translation acceptability in 27% of cases (worse in 8%).
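A minimal sketch of the metamorphic idea follows, assuming a caller-supplied translate function and a one-word similar-word substitution; the real tool draws substitutes from similarity resources and uses richer consistency metrics than this token-overlap stand-in.

```python
# Metamorphic consistency test sketch: translations of two inputs that
# differ only in one similar word should remain largely consistent.
# `translate` is caller-supplied; the overlap metric is a crude stand-in.
def consistency(t1: str, t2: str) -> float:
    a, b = set(t1.split()), set(t2.split())
    return len(a & b) / max(len(a | b), 1)

def metamorphic_test(sentence: str, word: str, substitute: str,
                     translate, threshold: float = 0.8) -> bool:
    """Return True if the translation pair looks consistent."""
    mutant = sentence.replace(word, substitute)
    return consistency(translate(sentence), translate(mutant)) >= threshold

# A failing test flags an inconsistency bug that repair can post-process.
fake_translate = {"he bought a car": "er kaufte ein Auto",
                  "he bought a bike": "Fahrrad gekauft hat er"}.get
print(metamorphic_test("he bought a car", "car", "bike", fake_translate))
# False: the two translations diverge far beyond the one-word change
```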
Dynamic Context-guided Capsule Network for Multimodal Machine Translation
Multimodal machine translation (MMT), which mainly focuses on enhancing
text-only translation with visual features, has attracted considerable
attention from both computer vision and natural language processing
communities. Most current MMT models resort to the attention mechanism, global context modeling, or multimodal joint representation learning to utilize visual features. However, the attention mechanism lacks sufficient semantic interactions between modalities, while the other two provide a fixed visual context, which is unsuitable for modeling the variability observed when generating translations. To address these issues, in this paper, we propose
a novel Dynamic Context-guided Capsule Network (DCCN) for MMT. Specifically, at
each timestep of decoding, we first employ the conventional source-target
attention to produce a timestep-specific source-side context vector. Next, DCCN
takes this vector as input and uses it to guide the iterative extraction of
related visual features via a context-guided dynamic routing mechanism.
In particular, since we represent the input image with both global and regional visual features, we introduce two parallel DCCNs to model multimodal context vectors
with visual features at different granularities. Finally, we obtain two
multimodal context vectors, which are fused and incorporated into the decoder
for the prediction of the target word. Experimental results on the Multi30K
dataset of English-to-German and English-to-French translation demonstrate the
superiority of DCCN. Our code is available at https://github.com/DeepLearnXMU/MM-DCCN.
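To make the routing idea concrete, here is a rough PyTorch sketch of context-guided dynamic routing, under the assumption that routing logits are updated by the agreement between each visual capsule and the sum of the pooled output and the guiding context; DCCN's actual routing equations and capsule transformations may differ.

```python
# Context-guided dynamic routing sketch: iteratively re-weight visual
# capsules by their agreement with the source-side context vector.
# An illustrative reading of the mechanism, not DCCN's exact equations.
import torch

def context_guided_routing(visual_caps: torch.Tensor,
                           context: torch.Tensor, iters: int = 3):
    """visual_caps: (n, d) visual feature capsules; context: (d,)
    timestep-specific source-side context vector. Returns a (d,)
    multimodal context vector for the decoder."""
    logits = torch.zeros(visual_caps.size(0))
    pooled = visual_caps.mean(dim=0)
    for _ in range(iters):
        weights = torch.softmax(logits, dim=0)                 # coefficients
        pooled = (weights.unsqueeze(1) * visual_caps).sum(dim=0)
        logits = logits + visual_caps @ (pooled + context)     # agreement
    return pooled

# e.g. one decoding timestep with 49 regional capsules of width 256:
mm_ctx = context_guided_routing(torch.randn(49, 256), torch.randn(256))
```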