NLP2Code: Code Snippet Content Assist via Natural Language Tasks
Developers increasingly take to the Internet for code snippets to integrate
into their programs. To save developers the time required to switch from their
development environments to a web browser in the quest for a suitable code
snippet, we introduce NLP2Code, a content assist for code snippets. Unlike
related tools, NLP2Code integrates directly into the source code editor and
provides developers with a content assist feature to close the vocabulary gap
between developers' needs and code snippet metadata. Our preliminary
evaluation of NLP2Code shows that the majority of invocations lead to code
snippets rated as helpful by users and that the tool is able to support a wide
range of tasks.
Comment: tool demo video available at https://www.youtube.com/watch?v=h-gaVYtCznI; to appear as a tool demo paper at ICSME 2017 (https://icsme2017.github.io/).
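The abstract does not describe NLP2Code's matching algorithm here; as a rough, hypothetical sketch of the general idea behind a natural-language content assist (the task-to-snippet table and the overlap scoring below are illustrative only, not the tool's actual method), task phrases can be mapped to stored snippets and ranked by word overlap:

```python
# Hypothetical sketch of a natural-language-to-snippet content assist.
# The SNIPPETS table and the scoring are made up for illustration;
# they are not NLP2Code's actual data or algorithm.

SNIPPETS = {
    "read a text file line by line": "with open(path) as f:\n    for line in f: ...",
    "parse json from a string": "import json\ndata = json.loads(text)",
    "sort a list of dicts by key": "items.sort(key=lambda d: d['key'])",
}

def suggest(task, top_k=2):
    """Rank stored task phrases by word overlap with the typed task."""
    query = set(task.lower().split())
    scored = []
    for known_task, snippet in SNIPPETS.items():
        overlap = len(query & set(known_task.split()))
        if overlap:
            scored.append((overlap, known_task, snippet))
    scored.sort(reverse=True)
    return [(t, s) for _, t, s in scored[:top_k]]

print(suggest("parse a json string")[0][0])  # best-matching task phrase
```

A real content assist would use richer matching than bag-of-words overlap, but the lookup-and-rank loop is the core shape of the feature.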
Is Stack Overflow Overflowing With Questions and Tags
Programming question and answer (Q&A) websites, such as Quora, Stack
Overflow, and Yahoo! Answers, help us to understand programming concepts
easily and quickly, in a way that has been tested and applied by many
software developers. Stack Overflow is one of the most frequently used
programming Q&A websites, where the questions and answers posted are presently
analyzed manually, which requires a huge amount of time and resources. To save
this effort, we present a topic-modeling-based technique to analyze the words of
the original texts to discover the themes that run through them. We also
propose a method to automate the process of reviewing the quality of questions
in the Stack Overflow dataset, in order to avoid ballooning Stack Overflow with
insignificant questions. The proposed method also recommends appropriate
tags for a new post, which averts the creation of unnecessary tags on Stack
Overflow.
Comment: 11 pages, 7 figures, 3 tables. Presented at the Third International Symposium on Women in Computing and Informatics (WCI-2015).
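The abstract does not spell out the tag-recommendation step here. As a hedged sketch of the general idea (the tag vocabularies and scoring below are invented for illustration, not the paper's model), tags can be ranked by how strongly a new post's words overlap with words from posts previously given each tag:

```python
from collections import Counter

# Illustrative-only data: word counts drawn from posts previously
# labelled with each tag. A real system would learn these from the
# Stack Overflow dataset (e.g. via topic modeling).
TAG_WORDS = {
    "python": Counter(["list", "dict", "python", "loop", "import"]),
    "sql": Counter(["query", "join", "table", "select", "sql"]),
    "regex": Counter(["pattern", "match", "regex", "string", "group"]),
}

def recommend_tags(post, top_k=2):
    """Score each tag by how often the post's words appear in its vocabulary."""
    words = post.lower().split()
    scores = {tag: sum(vocab[w] for w in words) for tag, vocab in TAG_WORDS.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [t for t in ranked if scores[t] > 0][:top_k]

print(recommend_tags("how to join two table rows in a select query"))
```

Recommending from an existing tag vocabulary, rather than letting askers free-type tags, is what prevents near-duplicate tags from accumulating.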
On Using Active Learning and Self-Training when Mining Performance Discussions on Stack Overflow
Abundant data is the key to successful machine learning. However, supervised
learning requires annotated data that are often hard to obtain. In a
classification task with limited resources, Active Learning (AL) promises to
guide annotators to examples that bring the most value for a classifier. AL can
be successfully combined with self-training, i.e., extending a training set
with the unlabelled examples for which a classifier is the most certain. We
report our experiences on using AL in a systematic manner to train an SVM
classifier for Stack Overflow posts discussing performance of software
components. We show that the training examples deemed as the most valuable to
the classifier are also the most difficult for humans to annotate. Despite
carefully evolved annotation criteria, we report low inter-rater agreement, but
we also propose mitigation strategies. Finally, based on one annotator's work,
we show that self-training can improve the classification accuracy. We conclude
the paper by discussing implications for future text miners aspiring to use AL and self-training.
Comment: Preprint of paper accepted for the Proc. of the 21st International Conference on Evaluation and Assessment in Software Engineering, 201
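The paper trains an SVM; as a generic, self-contained sketch of the self-training loop it describes (the nearest-centroid stand-in classifier, the toy data, and the margin-based certainty measure are all assumptions for illustration), the key step is repeatedly pseudo-labelling the unlabelled example the model is most certain about and adding it to the training set:

```python
import math

def centroid(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def self_train(labelled, unlabelled, rounds=3):
    """Generic self-training loop with a nearest-centroid stand-in
    for the classifier (the paper itself uses an SVM)."""
    labelled = {c: list(pts) for c, pts in labelled.items()}
    pool = list(unlabelled)
    for _ in range(rounds):
        if not pool:
            break
        cents = {c: centroid(pts) for c, pts in labelled.items()}
        best = None
        for p in pool:
            d = sorted((dist(p, cent), c) for c, cent in cents.items())
            # Certainty proxy: gap between the two nearest centroids.
            margin = d[1][0] - d[0][0]
            if best is None or margin > best[0]:
                best = (margin, p, d[0][1])
        _, point, label = best
        labelled[label].append(point)  # extend the training set with the pseudo-label
        pool.remove(point)
    return labelled

# Toy example: two seed classes and three unlabelled points.
seed = {"perf": [(0.0, 0.0), (1.0, 0.0)], "other": [(9.0, 9.0), (10.0, 9.0)]}
grown = self_train(seed, [(0.5, 1.0), (9.5, 8.0), (5.0, 5.0)])
print({c: len(pts) for c, pts in grown.items()})
```

The loop absorbs the most confidently classified points first, which is exactly why, as the abstract notes, the examples AL deems most valuable (the uncertain ones near the boundary) are the ones self-training leaves for human annotators.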