
74 research outputs found

Detecting Similar Applications with Collaborative Tagging

Author: JIANG Lingxiao, LO David, THUNG Ferdian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

Detecting similar applications is useful for various purposes, including program comprehension, rapid prototyping, plagiarism detection, and many more. McMillan et al. have proposed a solution to detect similar applications based on common Java API usage patterns. Recently, collaborative tagging has impacted software development practices: various sites allow users to assign tags to software systems. In this study, we complement the work of McMillan et al. by leveraging another source of information aside from API usage patterns, namely software tags. We have performed a user study involving several participants, and the results show that collaborative tagging is a promising source of information for detecting similar software applications.

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University

Interactive fault localization leveraging simple user feedback

Author: GONG Liang, JIANG Lingxiao, LO David, ZHANG Hongyu
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2012

NSF

Crossref

Institutional Knowledge at Singapore Management University

When and Why Your Code Starts to Smell Bad

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

A Tutorial on Software Engineering Intelligence: Case Studies on Model-Driven Engineering

Author: Kessentini Marouane
Publication date: 15/11/2019

Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/153783/1/MODELS_Tutorial__SEI___Copy_.pd

Deep Blue Documents at the University of Michigan

Stack Overflow in Github: Any Snippets There?

Author: Lopes Cristina, Martins Pedro, Saini Vaibhav, Yang Di
Publication date: 02/05/2017

When programmers look for how to achieve certain programming tasks, Stack Overflow is a popular destination in search engine results. Over the years, Stack Overflow has accumulated an impressive knowledge base of amply documented code snippets. We are interested in studying how programmers use these snippets in their projects. Can we find Stack Overflow snippets in real projects? When snippets are used, is the copy literal or does it undergo adaptations? And are these adaptations specializations required by the idiosyncrasies of the target artifact, or are they motivated by specific requirements of the programmer? The large-scale study presented in this paper analyzes 909k non-fork Python projects hosted on GitHub, which contain 290M function definitions, and 1.9M Python snippets captured from Stack Overflow. Results are presented as a quantitative analysis of block-level code cloning within and between Stack Overflow and GitHub, and as an analysis of programming behaviors through the qualitative analysis of our findings.
Comment: 14th International Conference on Mining Software Repositories, 11 page

arXiv.org e-Print Archive

Crossref

Contemporary Approach for Technical Reckoning Code Smells Detection using Textual Analysis

Author: Dr. P. Sengottuvelan, M. Sangeetha
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/05/2017

Software designers should be aware of and address design smells that can emerge as a result of design decisions. In a software project, technical debt needs to be repaid regularly to avoid its accumulation. Large technical debt significantly degrades the quality of the software system and affects the productivity of the development team. In extreme cases, when the accumulated technical debt becomes so large that it can no longer be paid off, the product has to be abandoned. In this paper, we bridge the gap by analyzing to what extent abstract information, extracted using textual analysis techniques, can be used to identify smells in source code. The proposed textual-based approach for detecting smells in source code, named TACO (Textual Analysis for Code smell detection), has been instantiated for detecting the long parameter list smell and has been evaluated on three sample Java open source projects. The results show that TACO is able to identify between 50% and 77% of the smell instances with a precision ranging between 63% and 67%. In addition, the results show that TACO identifies smells that are not recognized by approaches based exclusively on structural information.

International Journal on Recent and Innovation Trends in Computing and Communication

Using Word Embedding and Convolution Neural Network for Bug Triaging by Considering Design Flaws

Author: Akbari Reza, Boushehrian Omid, Hashemi Sattar, Jamasb Behnaz, Sepahvand Reza
Publication date: 20/09/2022

Resolving bugs in the maintenance phase of software is a complicated task, and bug assignment is one of its main steps. Some bugs cannot be fixed properly without making design decisions and have to be assigned to designers, rather than programmers, to avoid introducing bad smells that may cause subsequent bug reports. Hence, it is important to refer some bugs to a designer to check for possible design flaws. To the best of our knowledge, few works have considered referring bugs to designers, so this issue is addressed in this work. In this paper, a dataset is created, and a CNN-based model is proposed to predict the need for assigning a bug to a designer by learning the characteristics of bug reports that lead to bad smells in the code. The features of each bug are extracted by the CNN from its textual fields, such as the summary and description. The number of bad smells added to the code during the fixing process, as measured with the PMD tool, determines the bug tag. The summary and description of a new bug are given to the model, and the model predicts the need to refer it to a designer. An accuracy of 75% (or more) was achieved for datasets with a sufficient number of samples for training a deep learning-based model. The efficiency of the model in predicting referrals to the designer at the time the bug report is received was demonstrated by testing the model on 10 projects.

arXiv.org e-Print Archive

Identification-method research for open-source software ecosystems

Author: Liao Zhifang, Liu Hui, Liu Shengzong, Wang Ningwei, Zhang Qi, Zhang Yan
Publication venue: 'MDPI AG'
Publication date: 01/02/2019

In recent years, open-source software (OSS) development has grown, with many developers around the world working on different OSS projects. A variety of open-source software ecosystems have emerged, for instance, GitHub, StackOverflow, and SourceForge. One of the most typical social-programming and code-hosting sites, GitHub, has amassed numerous open-source-software projects and developers in the same virtual collaboration platform. Since GitHub itself is a large open-source community, it hosts a collection of software projects that are developed together and coevolve. The great challenge here is how to identify the relationship between these projects, i.e., project relevance. Software-ecosystem identification is the basis of other studies in the ecosystem. Therefore, how to extract useful information in GitHub and identify software ecosystems is particularly important, and it is also a research area in symmetry. In this paper, a Topic-based Project Knowledge Metrics Framework (TPKMF) is proposed. By collecting the multisource dataset of an open-source ecosystem, project-relevance analysis of the open-source software is carried out on the basis of software-ecosystem identification. Then, we used our Spectral Clustering algorithm based on Core Project (CP-SC) to identify software-ecosystem projects and further identify software ecosystems. We verified that most software ecosystems usually contain a core software project, and most other projects are associated with it. Furthermore, we analyzed the characteristics of the ecosystem, and we also found that interactive information has greater impact on project relevance. Finally, we summarize the Topic-based Project Knowledge Metrics Framework

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

ResearchOnline@GCU

MuDelta: Delta-Oriented Mutation Testing at Commit Time

Author: Chekam TT, Harman M, Ma W, Papadakis M
Publication venue: 43rd IEEE/ACM International Conference on Software Engineering - Software Engineering in Practice (ICSE-SEIP) / 43rd ACM/IEEE International Conference on Software Engineering - New Ideas and Emerging Results (ICSE-NIER)
Publication date: 07/05/2021

To effectively test program changes using mutation testing, one needs to use mutants that are relevant to the altered program behaviours. In view of this, we introduce MuDelta, an approach that identifies commit-relevant mutants, i.e., mutants that affect and are affected by the changed program behaviours. Our approach uses machine learning applied on a combined scheme of graph and vector-based representations of static code features. Our results, from 50 commits in 21 Coreutils programs, demonstrate a strong prediction ability of our approach, yielding 0.80 (ROC) and 0.50 (PR Curve) AUC values with 0.63 and 0.32 precision and recall values. These predictions are significantly higher than random guesses, 0.20 (PR-Curve) AUC, 0.21 and 0.21 precision and recall, and subsequently lead to strong relevant tests that kill 45% more relevant mutants than randomly sampled mutants (either sampled from those residing on the changed component(s) or from the changed lines). Our results also show that MuDelta selects mutants with 27% higher fault revealing ability in fault introducing commits. Taken together, our results corroborate the conclusion that commit-based mutation testing is suitable and promising for evolving software.

UCL Discovery

When would this bug get reported?

Author: Devanbu Premkumar, JIANG Lingxiao, LO David, Lucia Lucia, Rahman Foyzur, THUNG Ferdian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2012

Not all bugs in software are experienced and reported by end users right away: some bugs manifest themselves quickly and may be reported by users a few days after they get into the code base; others manifest many months or even years later, and may only be experienced and reported by a small number of users. We refer to the period between the time when a bug is introduced into code and the time when it is reported by a user as bug reporting latency. Knowledge of bug reporting latencies has implications for the prioritization of bug fixing activities: bugs with low reporting latencies may be fixed earlier than those with high latencies, to shift debugging resources towards bugs that most concern users. To investigate bug reporting latencies, we analyze bugs from three Java software systems: AspectJ, Rhino, and Lucene. We extract bug reporting data from their version control repositories and bug tracking systems, identify bug locations based on bug fixes, and back-trace bug introducing time based on change histories of the buggy code. We also remove nonessential changes and, most importantly, recover root causes of bugs from their treatments/fixes. We then calculate the bug reporting latencies, and find that bugs have diverse reporting latencies. Based on the calculated reporting latencies and features we extract from bugs, we build classification models that can predict whether a bug would be reported early (within 30 days) or later, which may be helpful for prioritizing bug fixing activities. Our evaluation on the three software systems shows that our bug reporting latency prediction models could achieve an AUC (Area Under the Receiver Operating Characteristic Curve) of 70.869%.

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University