Software Development Analytics in Practice: A Systematic Literature Review
Context: Software Development Analytics is a research area concerned with providing insights to improve product deliveries and processes. Many types of studies, data sources, and mining methods have been used for that purpose. Objective: This systematic literature review aims to provide an aggregate view of the relevant studies on Software Development Analytics in the past decade (2010-2019), with an emphasis on its application in practical settings. Method: Definition and execution of a search string upon several digital libraries, followed by the application of quality assessment criteria to identify the most relevant papers. From those, we extracted a set of characteristics (study type, data source, study perspective, development life-cycle activities covered, stakeholders, mining methods, and analytics scope) and classified their impact against a taxonomy. Results: Source code repositories, experimental case studies, and developers are the most common data sources, study types, and stakeholders, respectively. Product and project managers are also often present, but less than expected. Mining methods are evolving rapidly, and that is reflected in the long list identified. Descriptive statistics are the most common method, followed by correlation analysis. Since software development is an important process in every organization, it was unexpected to find that process mining was present in only one study. Most contributions to the software development life cycle were made in the quality dimension. Time management and cost control were only lightly addressed. The analysis of security aspects suggests it is a topic of increasing concern for practitioners. Risk management contributions are scarce. Conclusions: There is a wide margin for improvement in software development analytics in practice. For instance, mining and analyzing the activities performed by software developers in their actual workbench, the IDE.
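The review identifies descriptive statistics and correlation analysis as the most common mining methods. As a rough illustration only, the sketch below shows what those two methods look like when applied to repository data; the dataset, column names, and metrics are hypothetical and not drawn from any reviewed study.

```python
# Minimal sketch of the two mining methods the review finds most common:
# descriptive statistics and correlation analysis over repository data.
# The per-file metrics below are made up for demonstration purposes.
import pandas as pd

files = pd.DataFrame({
    "commits":           [12, 3, 45, 7, 21, 9],
    "authors":           [4, 1, 9, 2, 6, 3],
    "loc_changed":       [340, 25, 1200, 88, 510, 140],
    "post_release_bugs": [2, 0, 7, 1, 3, 1],
})

# Descriptive statistics: central tendency and spread of each metric.
print(files.describe())

# Correlation analysis: e.g. does churn relate to post-release defects?
print(files.corr(method="spearman")["post_release_bugs"])
```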
Characterizing and Predicting Blocking Bugs in Open Source Projects
Software engineering researchers have studied specific types of issues such as reopened bugs, performance bugs, dormant bugs, etc. However, one particularly severe type is the blocking bug. Blocking bugs are software bugs that prevent other bugs from being fixed. These bugs may increase maintenance costs, reduce overall quality, and delay the release of software systems. In this paper, we study blocking bugs in eight open source projects and propose a model to predict them early on. We extract 14 different factors (from the bug repositories) that are available within 24 hours after the initial submission of the bug reports. Then, we build decision trees to predict whether a bug will be a blocking bug or not. Our results show that our prediction models achieve F-measures of 21%-54%, which is a two-fold improvement over the baseline predictors. We also analyze the fixes of these blocking bugs to understand their negative impact. We find that fixing blocking bugs requires more lines of code to be touched compared to non-blocking bugs. In addition, our file-level analysis shows that files affected by blocking bugs are more negatively impacted in terms of cohesion, coupling, complexity, and size than files affected by non-blocking bugs.
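To make the prediction setup concrete, here is a minimal sketch of the kind of classifier the abstract describes: a decision tree trained on report-level factors available shortly after submission, evaluated with the F-measure. The feature names and data are illustrative assumptions; the paper's 14 factors and its datasets are not reproduced here.

```python
# Hedged sketch: decision-tree prediction of blocking bugs from early factors.
# Features and values below are hypothetical, for demonstration only.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

reports = pd.DataFrame({
    "description_length":  [120, 45, 300, 80, 500, 60, 220, 90],
    "num_comments_24h":    [5, 0, 12, 1, 20, 0, 7, 2],
    "reporter_experience": [30, 2, 100, 5, 250, 1, 60, 10],
    "severity_level":      [3, 1, 4, 2, 5, 1, 3, 2],
    "is_blocking":         [1, 0, 1, 0, 1, 0, 1, 0],
})

X = reports.drop(columns=["is_blocking"])
y = reports["is_blocking"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

model = DecisionTreeClassifier(max_depth=3, class_weight="balanced",
                               random_state=0).fit(X_train, y_train)
print("F-measure:", f1_score(y_test, model.predict(X_test)))
```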
Machine Learning And Deep Learning Based Approaches For Detecting Duplicate Bug Reports With Stack Traces
Many large software systems rely on bug tracking systems to record submitted bug reports and to track and manage bugs. Handling bug reports is known to be a challenging task, especially in software organizations with a large client base, which tend to receive a considerably large number of bug reports a day. Fortunately, not all reported bugs are new; many are similar or identical to previously reported bugs, also called duplicate bug reports.
Automatic detection of duplicate bug reports is an important research topic that helps reduce the time and effort spent by triaging and development teams on sorting and fixing bugs. This explains the recent increase in attention to this topic, as evidenced by the number of tools and algorithms that have been proposed in academia and industry. The objective is to automatically detect duplicate bug reports as soon as they arrive in the system. To do so, existing techniques rely heavily on the nature of the bug report data they operate on. This includes both structured information, such as OS, product version, time and date of the crash, and stack traces, as well as unstructured information, such as bug report summaries and descriptions written in natural language by end users and developers.
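As a simple illustration of combining structured and unstructured report data (not the paper's ML/DL models), the sketch below flags a likely duplicate by mixing textual similarity of summaries with overlap of stack-trace frames. The report contents, the equal weighting, and the threshold are assumptions made for the example.

```python
# Illustrative duplicate-report check: TF-IDF cosine similarity on summaries
# plus Jaccard overlap of stack-trace frames. All values are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

new_report = {
    "summary": "App crashes on startup when loading user settings",
    "frames": ["SettingsLoader.load", "ConfigParser.parse", "Main.start"],
}
known_report = {
    "summary": "Crash at startup while user settings are loaded",
    "frames": ["SettingsLoader.load", "ConfigParser.parse", "Main.run"],
}

# Unstructured part: cosine similarity over TF-IDF vectors of the summaries.
tfidf = TfidfVectorizer().fit_transform(
    [new_report["summary"], known_report["summary"]])
text_sim = cosine_similarity(tfidf[0], tfidf[1])[0, 0]

# Structured part: Jaccard overlap of stack-trace frames.
a, b = set(new_report["frames"]), set(known_report["frames"])
trace_sim = len(a & b) / len(a | b)

score = 0.5 * text_sim + 0.5 * trace_sim   # equal weighting is an assumption
print(f"text={text_sim:.2f} trace={trace_sim:.2f} combined={score:.2f}")
if score > 0.6:                            # threshold is an assumption
    print("likely duplicate")
```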
Issue Creator
In the constant development of the technological industry, product development, and software development in particular, is increasingly driven by the analysis of user feedback gathered in issue tracking systems. This is because the ultimate success of any software, and consequently of any technology-driven company, depends on whether the developed solutions fulfill the expectations of the end users. E-goi is a company that provides a platform for multi-channel marketing automation, allowing the integration of multiple channels from SMS and voice messages to e-mail and webpush. For SaaS companies such as E-goi, user feedback becomes extremely important in order to improve their products and create value for both the user and the company. When managing user feedback, it is often important how it is delivered to the development teams, so that the problem at hand is easily understood with as much information as possible, making it possible to replicate bugs and create new features for the product. This, of course, must be achieved with minimal impact on the analysis of the issues and consequent development. However, the gathering and delivery of this feedback to the product development teams at E-goi can bring problems both in information standardization and in duplicate prevention, as well as extra costs generated by the tools used when the goal is to allow the entire company to provide such feedback. To address this, E-goi decided to create a tool that allows all collaborators to submit issues - the Issue Creator. Nevertheless, the other described problems still need to be solved. This is where this project comes into play, by developing a revamp of this platform and enabling the creation of standardized issue reports, issue duplication prevention, and other features that integrate different platforms to simplify the actions that are essential to the product development teams. This report introduces the identified problem, along with the objectives and methodology followed. It then provides a full contextualization of how the E-goi organizational departments are distributed, with an emphasis on the product development department and its software development processes. Subsequently, it analyzes the value of the solution and the requirements gathered through the elicitation phase as part of the requirements engineering practice, followed by a detailed view of the proposed design for the platform. Finally, the developed platform was evaluated both from the technical aspect, through tests, and from the quality aspects perceived by the users, taking advantage of stakeholder answers gathered from the inquiries performed.
Leveraging the Power of Crowds: Automated Test Report Processing for The Maintenance of Mobile Applications
Crowdsourcing is an emerging distributed problem-solving model combining human and machine computation. It collects intelligence and knowledge from a large and diverse workforce to complete complex tasks. In the software engineering domain, crowdsourced techniques have been adopted to facilitate various tasks, such as design, testing, debugging, and development. Specifically, in crowdsourced testing, crowd workers are given testing tasks to perform and submit their feedback in the form of test reports. One of the key advantages of crowdsourced testing is that it is capable of providing software engineers with domain knowledge and feedback from a large number of real users. Based on the diverse software and hardware settings of these users, engineers can find bugs that are not caught by traditional quality assurance techniques. Such benefits are particularly well suited to mobile application testing, which requires rapid development-and-deployment iterations and must support diverse execution environments. However, crowdsourced testing naturally generates an overwhelming number of crowdsourced test reports, and inspecting such a large number of reports becomes a time-consuming yet inevitable task. This dissertation presents a series of techniques, tools, and experiments to assist in crowdsourced report processing. These techniques are designed to improve this task in multiple aspects: (1) prioritizing crowdsourced reports to assist engineers in finding as many unique bugs as possible, as quickly as possible; (2) grouping crowdsourced reports to assist engineers in identifying the representative ones in a short time; and (3) summarizing duplicate reports to provide engineers with a concise and accurate understanding of a group of reports. In the first step, I present a text-analysis-based technique to prioritize test reports for manual inspection. This technique leverages two key strategies: (1) a diversity strategy to help developers inspect a wide variety of test reports and to avoid duplicates and wasted effort on falsely classified faulty behavior, and (2) a risk-assessment strategy to help developers identify test reports that may be more likely to be fault-revealing based on past observations. Together, these two strategies form our technique to prioritize test reports in crowdsourced testing. Moreover, in the mobile testing domain, test reports often consist of more screenshots and shorter descriptive text, and thus text-analysis-based techniques may be ineffective or inapplicable. The shortage and ambiguity of natural-language text information and the well-defined screenshots of activity views within mobile applications motivate me to propose a novel technique based on image understanding for multi-objective test-report prioritization. This technique employs Spatial Pyramid Matching (SPM) to measure the similarity of screenshots and applies natural-language processing techniques to measure the distance between the text of test reports. Next, I design and implement CTRAS: a novel approach to leveraging duplicates to enrich the content of bug descriptions and improve the efficiency of inspecting these reports. CTRAS is capable of automatically aggregating duplicates based on both textual information and screenshots, and further summarizes the duplicate test reports into a comprehensive and comprehensible report. I validate all of these techniques on industrial data by collaborating with several companies.
The results show that my techniques can improve both the efficiency and effectiveness of crowdsourced test report processing. I also suggest settings for different usage scenarios and discuss future research directions.
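To give a flavor of the diversity strategy described above, the sketch below greedily orders test reports so that each newly inspected report is textually as different as possible from those already seen, helping unique bugs surface early. The reports and the plain TF-IDF distance are illustrative assumptions; the dissertation's risk-assessment model and SPM screenshot similarity are not reimplemented here.

```python
# Minimal diversity-based prioritization sketch over hypothetical test reports:
# greedy max-min selection on pairwise TF-IDF cosine distances.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_distances

reports = [
    "login button unresponsive after update",
    "app freezes when opening the photo gallery",
    "login fails with valid credentials after update",
    "gallery thumbnails render as black squares",
]

dist = cosine_distances(TfidfVectorizer().fit_transform(reports))

order = [0]                          # start with the first submitted report
remaining = set(range(1, len(reports)))
while remaining:
    # pick the remaining report farthest from its nearest already-chosen one
    nxt = max(remaining, key=lambda r: min(dist[r][c] for c in order))
    order.append(nxt)
    remaining.remove(nxt)

for rank, idx in enumerate(order, 1):
    print(rank, reports[idx])
```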
Use and misuse of the term "Experiment" in mining software repositories research
The significant momentum and importance of Mining Software Repositories (MSR) in Software Engineering (SE) has fostered new opportunities and challenges for extensive empirical research. However, MSR researchers seem to struggle to fit the empirical methods they use into the existing empirical SE body of knowledge. This is especially the case for MSR experiments. To provide evidence on the special characteristics of MSR experiments and their differences from experiments traditionally acknowledged in SE so far, we elicited the hallmarks that differentiate an experiment from other types of empirical studies and characterized the hallmarks and types of experiments in MSR. We analyzed MSR literature obtained from a small-scale systematic mapping study to assess the use of the term experiment in MSR. We found that 19% of the papers claiming to be experiments are in fact not experiments at all but rather observational studies, so they use the term in a misleading way. Of the remaining 81% of the papers, only one refers to a genuine controlled experiment, while the others correspond to experiments with limited control. MSR researchers tend to overlook such limitations, compromising the interpretation of the results of their studies. We provide recommendations and insights to support the improvement of MSR experiments. This work has been partially supported by the Spanish project MCI PID2020-117191RB-I00.