Locating bugs without looking back
Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, e.g. via a bug report, where is it located in the source code? Information retrieval (IR) approaches see the bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not requiring expensive static or dynamic analysis of the code. However, current state-of-the-art IR approaches rely on project history, in particular previously fixed bugs or previous versions of the source code. We present a novel approach that directly scores each current file against the given report, thus not requiring past code and reports. The scoring method is based on heuristics identified through manual inspection of a small sample of bug reports. We compare our approach to eight others, using their own five metrics on their own six open source projects. Out of 30 performance indicators, we improve 27 and equal 2. Over the projects analysed, on average we find one or more affected files in the top 10 ranked files for 76% of the bug reports. These results show the applicability of our approach to software projects without history.
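To make the "top 10 ranked files" figure above concrete, here is a minimal sketch of ranking files against a report and computing a top-N hit rate. The token-overlap score is a placeholder assumption, not the heuristics the abstract refers to, and the names `tokens`, `rank_files` and `top_n_hit_rate` are hypothetical.

```python
import re
from typing import Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lower-cased alphabetic tokens; camelCase identifiers are split first."""
    split = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return {t.lower() for t in re.findall(r"[A-Za-z]+", split)}


def rank_files(report: str, files: Dict[str, str]) -> List[str]:
    """Rank file names by shared tokens with the report (illustrative score only)."""
    query = tokens(report)
    scores = {
        name: len(query & tokens(name + " " + body))
        for name, body in files.items()
    }
    # Sort by descending score, then by name so ties are deterministic.
    return sorted(files, key=lambda name: (-scores[name], name))


def top_n_hit_rate(
    rankings: List[List[str]], affected: List[Set[str]], n: int
) -> float:
    """Fraction of reports with at least one affected file in the first n ranks."""
    hits = sum(
        1 for ranked, fixed in zip(rankings, affected) if set(ranked[:n]) & fixed
    )
    return hits / len(rankings)


if __name__ == "__main__":
    # Hypothetical two-file project and bug report.
    files = {
        "LoginDialog.java": "class LoginDialog { void showError() {} }",
        "Parser.java": "class Parser { void parse() {} }",
    }
    ranking = rank_files("Login dialog shows no error", files)
    print(ranking, top_n_hit_rate([ranking], [{"LoginDialog.java"}], n=1))
```

The metric only needs the current files and the report, which is the property the abstract emphasises; any history-free scoring function could replace the overlap count.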
An Empirical Study on Android-related Vulnerabilities
Mobile devices are used more and more in everyday life. They are our cameras,
wallets, and keys. Basically, they embed most of our private information in our
pocket. For this and other reasons, mobile devices, and in particular the
software that runs on them, are considered first-class citizens in the
software-vulnerabilities landscape. Several studies investigated the
software-vulnerabilities phenomenon in the context of mobile apps and, more in
general, mobile devices. Most of these studies focused on vulnerabilities that
could affect mobile apps, while just a few investigated vulnerabilities affecting
the underlying platform on which mobile apps run: the Operating System (OS).
Also, these studies have been run on a very limited set of vulnerabilities.
In this paper we present the largest study to date investigating
Android-related vulnerabilities, with a specific focus on the ones affecting
the Android OS. In particular, we (i) define a detailed taxonomy of the types
of Android-related vulnerability; (ii) investigate the layers and subsystems
from the Android OS affected by vulnerabilities; and (iii) study the
survivability of vulnerabilities (i.e., the number of days between the
vulnerability introduction and its fixing). Our findings could help OS and app
developers in focusing their verification & validation activities, and
researchers in building vulnerability detection tools tailored for the mobile
world.
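The survivability measure defined above (days between a vulnerability's introduction and its fix) can be sketched as follows. The function name and the dates are hypothetical; the abstract does not say how introduction dates were determined.

```python
from datetime import date


def survivability_days(introduced: date, fixed: date) -> int:
    """Number of days a vulnerability survived between introduction and fix."""
    if fixed < introduced:
        raise ValueError("fix date precedes introduction date")
    return (fixed - introduced).days


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(survivability_days(date(2015, 1, 1), date(2015, 3, 1)))
```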
How Shallow is a Bug? Why Open Source Communities Shorten the Repair Time of Software Defects
A central tenet of the open source software development methodology is that the community of users and developers is instrumental in improving the quality of software. Using a 10-year longitudinal dataset from the Firefox community, I investigate how the size of a community in terms of bug reporters and software developers, the social networks of developers and the quality of user contributions influence the time needed to repair software defects. The results show that a large open source community in terms of bug reporters reduces the time needed to resolve a defect, while the addition of new software developers to an open source community takes resources away from fixing bugs and increases the time needed to resolve a defect. In addition, software developers occupying dense network positions need less time to solve a bug. Finally, user contributions are beneficial when bugs are actively discussed, but there is no support for the prediction that the experience of the bug reporter or the quality of the bug report reduces the time needed to solve a software defect.
How do software practitioners perceive human-centric defects?
Context: Human-centric software design and development focuses on how users
want to carry out their tasks rather than making users accommodate their
software. Software users can have different genders, ages, cultures, languages,
disabilities, socioeconomic statuses, and educational backgrounds, among many
other differences. Due to the inherently varied nature of these differences and
their impact on software usage, preferences and issues of users can vary,
resulting in user-specific defects that we term 'human-centric defects'
(HCDs).
Objective: This research aims to understand the perception and current
management practices of such human-centric defects by software practitioners,
identify key challenges in reporting, understanding and fixing them, and
provide recommendations to improve HCD management in software engineering.
Method: We conducted a survey and interviews with software engineering
practitioners to gauge their knowledge and experience on HCDs and the defect
tracking process.
Results: We analysed fifty (50) survey and ten (10) interview responses
from SE practitioners and identified that there are multiple gaps in the
current management of HCDs in software engineering practice. There is a lack of
awareness regarding human-centric aspects, causing them to be lost or
under-appreciated during software development. Our results revealed that
handling HCDs could be improved by following a better feedback process with
end-users, a more descriptive taxonomy, and suitable automation.
Conclusion: HCDs present a major challenge to software practitioners, given
their diverse end-user base. In the software engineering domain, research on
HCDs has been limited and requires effort from the research and practice
communities to create better awareness and support regarding human-centric
aspects.
Improving Information Retrieval Bug Localisation Using Contextual Heuristics
Software developers working on unfamiliar systems are challenged to identify where and how high-level concepts are implemented in the source code prior to performing maintenance tasks. Bug localisation is a core program comprehension activity in software maintenance: given the observation of a bug, e.g. via a bug report, where is it located in the source code?
Information retrieval (IR) approaches see the bug report as the query, and the source files as the documents to be retrieved, ranked by relevance. Current approaches rely on project history, in particular previously fixed bugs and versions of the source code. Existing IR techniques fall short of providing adequate solutions in finding all the source code files relevant for a bug. Without additional help, bug localisation can become a tedious, time-consuming and error-prone task.
My research contributes a novel algorithm that, given a bug report and the application’s source files, uses a combination of lexical and structural information to suggest, in a ranked order, files that may have to be changed to resolve the reported bug without requiring past code and similar reports.
I study eight applications for which I had access to the user guide, the source code, and some bug reports. I compare the relative importance and the occurrence of the domain concepts in the project artefacts and measure the effectiveness of using only concept key words to locate files relevant for a bug compared to using all the words of a bug report.
Measuring my approach against six others, using their five metrics and eight projects, I position an affected file in the top-1, top-5 and top-10 ranks on average for 44%, 69% and 76% of the bug reports respectively. This is an improvement of 23%, 16% and 11% respectively over the best performing current state-of-the-art tool.
Finally, I evaluate my algorithm with a range of industrial applications in user studies, and find that it is superior to simple string search, as often performed by developers. These results show the applicability of my approach to software projects without history and offer a simpler, lightweight solution.
Learning Code Transformations via Neural Machine Translation
Source code evolves – inevitably – to remain useful, secure, correct, readable, and efficient. Developers perform software evolution and maintenance activities by transforming existing source code via corrective, adaptive, perfective, and preventive changes. These code changes are usually managed and stored by a variety of tools and infrastructures such as version control, issue trackers, and code review systems. Software Evolution and Maintenance researchers have been mining these code archives in order to distill useful insights on the nature of such developers’ activities. One of the long-lasting goals of Software Engineering research is to better support and automate different types of code changes performed by developers. In this thesis we depart from classic manually crafted rule- or heuristic-based approaches, and propose a novel technique to learn code transformations by leveraging the vast amount of publicly available code changes performed by developers. We rely on Deep Learning, and in particular on Neural Machine Translation (NMT), to train models able to learn code change patterns and apply them to novel, unseen, source code. First, we tackle the problem of generating source code mutants for Mutation Testing. In contrast with classic approaches, which rely on handcrafted mutation operators, we propose to automatically learn how to mutate source code by observing real faults. We mine millions of bug-fixing commits from GitHub, then process and abstract their source code. This data is used to train and evaluate an NMT model to translate fixed code into buggy code (i.e., the mutated code). In the second project, we rely on the same dataset of bug-fixes to learn code transformations for the purpose of Automated Program Repair (APR). This represents one of the most challenging research problems in Software Engineering, whose goal is to automatically fix bugs without developers’ intervention.
We train a model to translate buggy code into fixed code (i.e., learning patches) and, in conjunction with Beam Search, generate many different potential patches for a given buggy method. In our empirical investigation we found that such a model is able to fix thousands of unique buggy methods in the wild. Finally, in our third project we push our novel technique to the limits and enlarge the scope to consider not only bug-fixing activities, but any type of meaningful code changes performed by developers. We focus on accepted and merged code changes that underwent a Pull Request (PR) process. We quantitatively and qualitatively investigate the code transformations learned by the model to build a taxonomy. The taxonomy shows that NMT can replicate a wide variety of meaningful code changes, especially refactorings and bug-fixing activities. In this dissertation we illustrate and evaluate the proposed techniques, which represent a significant departure from earlier approaches in the literature. The promising results corroborate the potential applicability of learning techniques, such as NMT, to a variety of Software Engineering tasks.
Effort Estimation Factors for Corrective Software Maintenance Projects: A Qualitative Analysis of Estimation Criteria
In this paper, we identify factors that impact software maintenance effort by exploring expert software maintenance estimators’ knowledge about corrective maintenance projects. We use a qualitative approach to identify the issues important to these experts to derive their effort estimates. We find seventeen factors (rated and rank ordered by importance) that affect corrective maintenance effort and include constructs related to developers, code, defects, and environment. Several of these factors that have a comparably strong influence on corrective maintenance estimation are unique to corrective maintenance and are not generally observed in established software estimation models. The results enhance organizations’ ability to effectively manage maintenance environments by focusing attention on the identified areas. For future research, these results represent an important step toward developing a comprehensive and accurate corrective maintenance effort estimation model.
Analysing the Resolution of Security Bugs in Software Maintenance
Security bugs in software systems are often reported after incidents of malicious attacks. Developers often need to resolve these bugs quickly in order to maintain the security of such systems. Bug resolution includes two kinds of activities: triaging confirms that the bugs are indeed security problems, after which fixing involves making changes to the code.
It is reported in the literature that, statistically, security bugs are reopened more often compared to others, which poses two new research questions: (a) Are developers “rushing” to triage security bugs too soon under the pressure of deadlines? (b) Do developers need to spend more time fixing security bugs to avoid frequent reopening?
This thesis explores these questions in order to determine whether security bug fixing should take a higher priority than fixing other bugs, to avoid malicious attackers exploiting vulnerabilities before the problems are fixed.
In this thesis a quantitative approach has been adopted by conducting statistical empirical studies to observe the behaviour of software developers engaged in dealing with security bugs.
Firstly, the concept of "rush" has been borrowed from the time management literature to refer to the behaviour of people delivering work under the pressure of deadlines. By observing how developers deliver bug resolution before the deadline of releases, the degree of rush has been measured as the ratio between the actual time spent by developers during triaging and the theoretical time the developers have by delaying the fixes until the next regular release.
This thesis suggests that delaying bug assignment helps find the right developer and gives the developer more time to prepare for the same workload with more relaxed planning constraints. Secondly, to analyse the complexity of security bug fixes, the fan-in complexity of functions relevant to security bugs has been measured, rather than simply measuring the time spent by the software developers on the fixing of such bugs.
The first null hypothesis is tested using a Mann-Whitney method on five software case studies: Samba, MozillaFirefox, RedHat, FreeBSD and Mozilla. The second null hypothesis is tested by comparing the results of fixing security and non-security bugs from the Samba and MozillaFirefox case studies.
Statistically significant results suggest that security bugs are triaged in a rush compared to non-security bugs for RedHat, FreeBSD and Mozilla.
In terms of fan-in, the results of the Samba and MozillaFirefox case studies suggest that security bugs are more complex to fix compared to non-security bugs.
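The degree of rush described above is a ratio of two durations. The sketch below is one possible reading, assuming the actual triage time runs from report to triage and the theoretical time runs from report to the next regular release; the abstract does not pin down these endpoints, and the function name and dates are hypothetical.

```python
from datetime import date


def degree_of_rush(reported: date, triaged: date, next_release: date) -> float:
    """Ratio of actual triage time to the time available until the next release.

    Assumption (not from the source): both durations are measured from the
    date the bug was reported.
    """
    actual = (triaged - reported).days
    available = (next_release - reported).days
    if available <= 0:
        raise ValueError("next release must fall after the report date")
    return actual / available


if __name__ == "__main__":
    # Hypothetical dates: triaged after 10 of the 20 days available.
    print(degree_of_rush(date(2020, 1, 1), date(2020, 1, 11), date(2020, 1, 21)))
```

Under this reading, a smaller ratio means the bug was triaged early relative to the time the release schedule allowed.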
Design Architecture, Developer Networks and Performance of Open Source Software Projects
In this study we seek to understand the factors differentiating successful from unsuccessful software projects. This article develops and tests a model measuring the impact on software project performance of (1) software products’ design architectures and (2) developers’ positions within collaborative networks. Two indicators of project success are used: product quality and project velocity. Two dimensions of design architecture – degree of decomposition and coupling – and one characteristic of developer network structures – degree centrality – are investigated for their impact on project performance. Using data gathered from SourceForge.net and its monthly dumps, we empirically test hypotheses on the top 100 projects according to project rankings. These rankings are generated from the traffic, communication, and development statistics collected for each project hosted on SourceForge.net. Besides the top 100 projects, we also randomly choose another 100 projects to form the data sample. The main findings are that (1) the degree of decomposition has an inverted U-shaped relationship with project performance, (2) when tested on the sample of top 100 projects, average degree centrality of a project team has a positive and significant effect on project performance and (3) the effects of network metrics o
- …