On the Relevance of Cross-project Learning with Nearest Neighbours for Commit Message Generation
Commit messages play an important role in software maintenance and evolution.
Nonetheless, developers often do not produce high-quality messages. A number of
commit message generation methods have been proposed in recent years to address
this problem. Some of these methods are based on neural machine translation
(NMT) techniques. Studies show that the nearest neighbor algorithm (NNGen)
outperforms existing NMT-based methods, although NNGen is simpler and faster
than NMT. In this paper, we show that NNGen does not take advantage of
cross-project learning in the majority of cases. We also show that there is an even simpler and faster variation of the existing NNGen method that outperforms it in terms of the BLEU-4 score without using cross-project learning.
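As a rough illustration of the retrieval idea behind NNGen, the sketch below returns the commit message of the most similar training diff under bag-of-words cosine similarity. The whitespace tokenization and the omission of NNGen's BLEU-based re-ranking of top candidates are simplifying assumptions, not the paper's exact procedure.

```python
# Minimal nearest-neighbour commit message generation sketch.
# Assumes whitespace tokenization and plain cosine similarity; the
# published NNGen additionally re-ranks top-k candidates by BLEU.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def nngen(test_diff: str, train_diffs: list[str], train_msgs: list[str]) -> str:
    """Return the message of the training diff most similar to test_diff."""
    query = Counter(test_diff.split())
    scores = [cosine(query, Counter(d.split())) for d in train_diffs]
    return train_msgs[max(range(len(scores)), key=scores.__getitem__)]

# Example: reuse the message of the closest past diff.
diffs = ["+ add null check in parser", "+ bump version to 1.2"]
msgs = ["Fix NPE in parser", "Bump version to 1.2"]
print(nngen("+ add null check in lexer", diffs, msgs))  # -> "Fix NPE in parser"
```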
Don't hide in the frames: Note- and pattern-based evaluation of automated melody extraction algorithms
Ranking Enhanced Dialogue Generation
How to effectively utilize the dialogue history is a crucial problem in
multi-turn dialogue generation. Previous works usually employ various neural
network architectures (e.g., recurrent neural networks, attention mechanisms,
and hierarchical structures) to model the history. However, a recent empirical
study by Sankar et al. has shown that these architectures lack the ability of
understanding and modeling the dynamics of the dialogue history. For example,
the widely used architectures are insensitive to perturbations of the dialogue
history, such as word shuffling, missing utterances, and utterance reordering. To tackle this problem, we propose a Ranking Enhanced Dialogue Generation framework in this paper. In addition to the traditional representation encoder and response generation modules, a ranking module is introduced to model the ranking relation between the former utterance and its consecutive utterances. Specifically, the former utterance and consecutive
utterances are treated as a query and its corresponding documents, and both local and
global ranking losses are designed in the learning process. In this way, the
dynamics in the dialogue history can be explicitly captured. To evaluate our
proposed models, we conduct extensive experiments on three public datasets,
i.e., bAbI, PersonaChat, and JDC. Experimental results show that our models
produce better responses in terms of both quantitative measures and human
judgments, as compared with the state-of-the-art dialogue generation models.
Furthermore, we provide detailed experimental analysis to show where and how the improvements come from.
Comment: Accepted at CIKM 2020.
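For intuition, here is a hedged PyTorch sketch of a pairwise ranking loss in the spirit of the ranking module described above, where the former utterance acts as the query and consecutive utterances as documents; the paper's exact local and global loss formulations may differ.

```python
# Margin-based ranking loss sketch: score true consecutive utterances
# above mismatched ones relative to the former-utterance query.
# Illustrative only; not the paper's exact local/global losses.
import torch
import torch.nn.functional as F

def ranking_loss(query: torch.Tensor, positives: torch.Tensor,
                 negatives: torch.Tensor, margin: float = 0.2) -> torch.Tensor:
    """query: (d,) encoding of the former utterance;
    positives/negatives: (n, d) encodings of candidate next utterances."""
    pos = F.cosine_similarity(query.unsqueeze(0), positives)  # (n,)
    neg = F.cosine_similarity(query.unsqueeze(0), negatives)  # (n,)
    return F.relu(margin - pos + neg).mean()

q = torch.randn(64, requires_grad=True)
loss = ranking_loss(q, torch.randn(5, 64), torch.randn(5, 64))
loss.backward()  # would be added to the generation loss during training
```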
Asking Questions the Human Way: Scalable Question-Answer Generation from Text Corpus
The ability to ask questions is important in both human and machine
intelligence. Learning to ask questions helps knowledge acquisition, improves
question-answering and machine reading comprehension tasks, and helps a chatbot
to keep the conversation flowing with a human. Existing question generation
models are ineffective at generating a large number of high-quality question-answer pairs from unstructured text, since, given an answer and an input passage, question generation is inherently a one-to-many mapping. In this
paper, we propose Answer-Clue-Style-aware Question Generation (ACS-QG), which
aims at automatically generating high-quality and diverse question-answer pairs
from an unlabeled text corpus at scale by imitating the way a human asks
questions. Our system consists of: i) an information extractor, which samples
from the text multiple types of assistive information to guide question
generation; ii) neural question generators, which generate diverse and
controllable questions, leveraging the extracted assistive information; and
iii) a neural quality controller, which removes low-quality generated data
based on text entailment. We compare our question generation models with
existing approaches and resort to voluntary human evaluation to assess the
quality of the generated question-answer pairs. The evaluation results suggest
that our system dramatically outperforms state-of-the-art neural question
generation models in terms of generation quality while remaining scalable. With models trained on a relatively small amount of data, we can generate 2.8 million quality-assured question-answer pairs from a million sentences found in Wikipedia.
Comment: Accepted by The Web Conference 2020 (WWW 2020) as a full paper (oral presentation).
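The three-component pipeline can be pictured with the toy skeleton below; the component names follow the paper, but each stub (template generation, substring-based entailment check) is a deliberately crude stand-in for the neural models.

```python
# Structural sketch of the ACS-QG pipeline: extract assistive info,
# generate candidate QA pairs, then filter by an entailment-style check.
# All three stubs are illustrative placeholders, not the paper's models.
def extract_info(sentence: str) -> list[tuple[str, str, str]]:
    # i) information extractor: sample (answer, clue, style) tuples
    return [("Paris", "capital of France", "what")]

def generate_questions(info) -> list[tuple[str, str]]:
    # ii) question generator: here a trivial template stand-in
    return [(f"What is the {clue}?", answer) for answer, clue, _style in info]

def passes_quality_control(sentence: str, answer: str) -> bool:
    # iii) quality controller: keep pairs whose answer the text entails
    return answer in sentence  # crude proxy for a textual-entailment model

def acs_qg(sentence: str) -> list[tuple[str, str]]:
    pairs = generate_questions(extract_info(sentence))
    return [(q, a) for q, a in pairs if passes_quality_control(sentence, a)]

print(acs_qg("Paris is the capital of France."))
# [('What is the capital of France?', 'Paris')]
```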
ChoreoNet: Towards Music to Dance Synthesis with Choreographic Action Unit
Dance and music are two highly correlated artistic forms. Synthesizing dance
motions has attracted much attention recently. Most previous works conduct music-to-dance synthesis by directly mapping music to human skeleton keypoints. In contrast, human choreographers design dance motions from music in a two-stage manner: they first devise multiple choreographic action units (CAUs), each with a series of dance motions, and then arrange the CAU sequence according to the rhythm, melody and emotion of the music. Inspired by this, we systematically study this two-stage choreography approach and construct a
dataset to incorporate such choreography knowledge. Based on the constructed
dataset, we design a two-stage music-to-dance synthesis framework ChoreoNet to
imitate the human choreography procedure. Our framework first devises a CAU
prediction model to learn the mapping relationship between music and CAU
sequences. Afterwards, we devise a spatial-temporal inpainting model to convert
the CAU sequence into continuous dance motions. Experimental results
demonstrate that the proposed ChoreoNet outperforms baseline methods (0.622 in
terms of CAU BLEU score and 1.59 in terms of user study score).
Comment: 10 pages, 5 figures, accepted by ACM MM 2020.
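The two-stage procedure can be summarized with the placeholder sketch below: stage one maps music features to a CAU sequence, and stage two stitches each unit's motion clip together while filling the transitions, which is the role of the spatial-temporal inpainting model. The CAU vocabulary, the motion bank, and the linear interpolation here are illustrative assumptions, not the paper's learned models.

```python
# Two-stage music-to-dance skeleton in the spirit of ChoreoNet.
# predict_cau_sequence stands in for the learned music-to-CAU model;
# linear interpolation stands in for the spatial-temporal inpainting model.
def predict_cau_sequence(music_features) -> list[str]:
    # Stage 1: map rhythm/melody/emotion features to CAU ids (placeholder)
    return ["step_left", "spin", "arm_wave"]

def interpolate(last_pose, next_pose, steps: int = 4):
    # Placeholder transition: linearly blend between boundary poses
    return [[a + (b - a) * t / (steps + 1) for a, b in zip(last_pose, next_pose)]
            for t in range(1, steps + 1)]

def synthesize_motion(caus: list[str], motion_bank: dict) -> list:
    # Stage 2: concatenate each CAU's keypoint clip, filling transitions
    motion = list(motion_bank[caus[0]])
    for prev, cur in zip(caus, caus[1:]):
        motion += interpolate(motion_bank[prev][-1], motion_bank[cur][0])
        motion += motion_bank[cur]
    return motion

bank = {c: [[0.0, 0.0], [1.0, 1.0]] for c in ["step_left", "spin", "arm_wave"]}
print(len(synthesize_motion(predict_cau_sequence(None), bank)))  # 14 frames
```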
Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning
According to the World Health Organization(WHO), it is estimated that
approximately 1.3 billion people live with some form of vision impairment globally, of whom 36 million are blind. Due to their disability, integrating this minority group into society is a challenging problem. The recent rise of smart mobile phones provides a new solution by giving blind users convenient access to information and services for understanding the world. Users with
vision impairment can adopt the screen reader embedded in the mobile operating
systems to read the content of each screen within the app, and use gestures to
interact with the phone. However, the prerequisite of using screen readers is
that developers have to add natural-language labels to the image-based
components when they are developing the app. Unfortunately, more than 77% of apps have missing-label issues, according to our analysis of 10,408 Android apps. Most of these issues are caused by developers' lack of awareness of and knowledge about this minority group. And even if developers want to add labels to UI components, they may not come up with concise and clear descriptions, as most of them have no visual impairment themselves. To overcome these
challenges, we develop a deep-learning based model, called LabelDroid, to
automatically predict the labels of image-based buttons by learning from
large-scale commercial apps in Google Play. The experimental results show that our model can make accurate predictions and that the generated labels are of higher quality than those from real Android developers.
Comment: Accepted to the 42nd International Conference on Software Engineering.
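The accessibility gap being measured is concrete: an image-based widget whose android:contentDescription is absent gives a screen reader nothing to announce. As a small illustration (a file-level check, whereas the paper analyses compiled apps at scale), the sketch below scans a layout XML for such unlabeled widgets.

```python
# Scan an Android layout file for image-based widgets that screen readers
# cannot announce because android:contentDescription is missing.
# Illustrative only; LabelDroid itself predicts the missing labels.
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
IMAGE_WIDGETS = {"ImageButton", "ImageView"}

def unlabeled_widgets(layout_path: str) -> list[str]:
    missing = []
    for elem in ET.parse(layout_path).iter():
        if elem.tag in IMAGE_WIDGETS \
                and ANDROID_NS + "contentDescription" not in elem.attrib:
            missing.append(elem.tag)
    return missing

# e.g. unlabeled_widgets("res/layout/activity_main.xml")
# -> ["ImageButton"] for every image button a blind user cannot identify
```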
Unsupervised Paraphrasing via Deep Reinforcement Learning
Paraphrasing is expressing the meaning of an input sentence in different
wording while maintaining fluency (i.e., grammatical and syntactic correctness). Most existing work on paraphrasing uses supervised models that are
limited to specific domains (e.g., image captions). Such models can neither be
straightforwardly transferred to other domains nor generalize well, and
creating labeled training data for new domains is expensive and laborious. The
need for paraphrasing across different domains and the scarcity of labeled
training data in many such domains call for exploring unsupervised paraphrase
generation methods. We propose Progressive Unsupervised Paraphrasing (PUP): a
novel unsupervised paraphrase generation method based on deep reinforcement
learning (DRL). PUP uses a variational autoencoder (trained using a
non-parallel corpus) to generate a seed paraphrase that warm-starts the DRL
model. Then, PUP progressively tunes the seed paraphrase guided by our novel
reward function which combines semantic adequacy, language fluency, and
expression diversity measures to quantify the quality of the generated
paraphrases in each iteration without needing parallel sentences. Our extensive
experimental evaluation shows that PUP outperforms unsupervised
state-of-the-art paraphrasing techniques in terms of both automatic metrics and
user studies on four real datasets. We also show that PUP outperforms
domain-adapted supervised algorithms on several datasets. Our evaluation also
shows that PUP achieves a good trade-off between semantic similarity and diversity of expression.
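A hedged sketch of how such a combined reward can be assembled is shown below; the weights and the toy component scorers are illustrative assumptions, not the paper's exact formulation.

```python
# Combined reward sketch in the spirit of PUP: weight semantic adequacy,
# language fluency, and expression diversity into one scalar signal for DRL.
# Weights and the toy diversity scorer are illustrative assumptions.
def token_diversity(source: str, paraphrase: str) -> float:
    # Toy diversity: 1 - Jaccard overlap of token sets
    s, p = set(source.split()), set(paraphrase.split())
    return 1.0 - len(s & p) / max(len(s | p), 1)

def pup_reward(source: str, paraphrase: str, adequacy, fluency,
               w_a: float = 0.4, w_f: float = 0.3, w_d: float = 0.3) -> float:
    """adequacy/fluency: callables mapping (source, paraphrase) to [0, 1]."""
    return (w_a * adequacy(source, paraphrase)
            + w_f * fluency(source, paraphrase)
            + w_d * token_diversity(source, paraphrase))

# e.g. with stub scorers standing in for trained models:
reward = pup_reward("the cat sat", "a cat was sitting",
                    adequacy=lambda s, p: 0.9, fluency=lambda s, p: 0.8)
print(round(reward, 3))
```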
Automatic Testing and Improvement of Machine Translation
This paper presents TransRepair, a fully automatic approach for testing and repairing the consistency of machine translation systems. TransRepair combines mutation with metamorphic testing to detect inconsistency bugs (without access to human oracles). It then adopts probability-reference or cross-reference to post-process the translations, in a grey-box or black-box manner, to repair the inconsistencies. Our evaluation on two state-of-the-art translators, Google Translate and Transformer, indicates that TransRepair has a high precision (99%) in generating input pairs with consistent translations. With these tests, using automatic consistency metrics and manual assessment, we find that Google Translate and Transformer have approximately 36% and 40% inconsistency bugs, respectively. Black-box repair fixes 28% and 19% of bugs on average for Google Translate and Transformer. Grey-box repair fixes 30% of bugs on average for Transformer. Manual inspection indicates that the translations repaired by our approach improve consistency in 87% of cases (degrading it in 2%), and that our repairs have better translation acceptability in 27% of cases (worse in 8%).
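A minimal sketch of the metamorphic idea follows, assuming a caller-supplied translate function and a one-word similar-word substitution; the real tool draws substitutes from similarity resources and uses richer consistency metrics than this token-overlap stand-in.

```python
# Metamorphic consistency test sketch: translations of two inputs that
# differ only in one similar word should remain largely consistent.
# `translate` is caller-supplied; the overlap metric is a crude stand-in.
def consistency(t1: str, t2: str) -> float:
    a, b = set(t1.split()), set(t2.split())
    return len(a & b) / max(len(a | b), 1)

def metamorphic_test(sentence: str, word: str, substitute: str,
                     translate, threshold: float = 0.8) -> bool:
    """Return True if the translation pair looks consistent."""
    mutant = sentence.replace(word, substitute)
    return consistency(translate(sentence), translate(mutant)) >= threshold

# A failing test flags an inconsistency bug that repair can post-process.
fake_translate = {"he bought a car": "er kaufte ein Auto",
                  "he bought a bike": "Fahrrad gekauft hat er"}.get
print(metamorphic_test("he bought a car", "car", "bike", fake_translate))
# False: the two translations diverge far beyond the one-word change
```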
Dynamic Context-guided Capsule Network for Multimodal Machine Translation
Multimodal machine translation (MMT), which mainly focuses on enhancing
text-only translation with visual features, has attracted considerable
attention from both computer vision and natural language processing
communities. Most current MMT models resort to the attention mechanism, global context modeling, or multimodal joint representation learning to utilize visual features. However, the attention mechanism lacks sufficient semantic interactions between modalities, while the other two provide a fixed visual context, which is unsuitable for modeling the variability observed when generating translations. To address these issues, in this paper, we propose
a novel Dynamic Context-guided Capsule Network (DCCN) for MMT. Specifically, at
each timestep of decoding, we first employ the conventional source-target
attention to produce a timestep-specific source-side context vector. Next, DCCN
takes this vector as input and uses it to guide the iterative extraction of
related visual features via a context-guided dynamic routing mechanism.
In particular, since we represent the input image with both global and regional visual features, we introduce two parallel DCCNs to model multimodal context vectors
with visual features at different granularities. Finally, we obtain two
multimodal context vectors, which are fused and incorporated into the decoder
for the prediction of the target word. Experimental results on the Multi30K
dataset of English-to-German and English-to-French translation demonstrate the
superiority of DCCN. Our code is available at https://github.com/DeepLearnXMU/MM-DCCN.
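To make the routing idea concrete, here is a rough PyTorch sketch of context-guided dynamic routing, under the assumption that routing logits are updated by the agreement between each visual capsule and the sum of the pooled output and the guiding context; DCCN's actual routing equations and capsule transformations may differ.

```python
# Context-guided dynamic routing sketch: iteratively re-weight visual
# capsules by their agreement with the source-side context vector.
# An illustrative reading of the mechanism, not DCCN's exact equations.
import torch

def context_guided_routing(visual_caps: torch.Tensor,
                           context: torch.Tensor, iters: int = 3):
    """visual_caps: (n, d) visual feature capsules; context: (d,)
    timestep-specific source-side context vector. Returns a (d,)
    multimodal context vector for the decoder."""
    logits = torch.zeros(visual_caps.size(0))
    pooled = visual_caps.mean(dim=0)
    for _ in range(iters):
        weights = torch.softmax(logits, dim=0)                 # coefficients
        pooled = (weights.unsqueeze(1) * visual_caps).sum(dim=0)
        logits = logits + visual_caps @ (pooled + context)     # agreement
    return pooled

# e.g. one decoding timestep with 49 regional capsules of width 256:
mm_ctx = context_guided_routing(torch.randn(49, 256), torch.randn(256))
```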