385 research outputs found
We Don't Need Another Hero? The Impact of "Heroes" on Software Development
A software project has "Hero Developers" when 80% of contributions are
delivered by 20% of the developers. Are such heroes a good idea? Are too many
heroes bad for software quality? Is it better to have more or fewer heroes for
different kinds of projects? To answer these questions, we studied 661 open
source projects from Public open source software (OSS) GitHub and 171 projects
from an Enterprise GitHub.
We find that hero projects are very common. In fact, as projects grow in
size, nearly all projects become hero projects. These findings motivated us to
look more closely at the effects of heroes on software development. Analysis
shows that the frequency of closing issues and bugs is not significantly
affected by project type (Public or Enterprise). Similarly, the
time needed to resolve an issue/bug/enhancement is not affected by heroes or
project type. This is a surprising result since, before looking at the data, we
expected that increasing heroes on a project would slow down how fast that
project reacts to change. However, we do find a statistically significant
association between heroes, project types, and enhancement resolution rates.
Heroes do not affect enhancement resolution rates in Public projects. However,
in Enterprise projects, more heroes increase the rate at which projects
complete enhancements.
In summary, our empirical results call for a revision of a long-held truism
in software engineering. Software heroes are far more common and valuable than
suggested by the literature, particularly for medium to large Enterprise
developments. Organizations should reflect on better ways to find and retain
more of these heroes.
Comment: 8 pages + 1 references, accepted to International Conference on
Software Engineering - Software Engineering in Practice, 201
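The 80/20 hero definition in the abstract above can be sketched as a small check. This is only an illustration under stated assumptions: contributions are taken to be per-developer commit counts (the abstract does not say how contributions are measured), and the function name is_hero_project and its parameters are hypothetical, not taken from the paper.

```python
def is_hero_project(commits_per_dev, dev_share=0.2, contribution_share=0.8):
    """Return True if the top `dev_share` of developers deliver at least
    `contribution_share` of all contributions.

    commits_per_dev maps a developer name to a commit count
    (assumption: commits stand in for "contributions").
    """
    counts = sorted(commits_per_dev.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return False  # no contributions, so no heroes
    # Size of the "top 20%" group, rounded to a whole developer, at least one.
    top_n = max(1, round(dev_share * len(counts)))
    return sum(counts[:top_n]) / total >= contribution_share


# Hypothetical projects with five developers each.
skewed = {"ann": 90, "bob": 4, "cat": 3, "dan": 2, "eve": 1}
even = {"ann": 20, "bob": 20, "cat": 20, "dan": 20, "eve": 20}
print(is_hero_project(skewed), is_hero_project(even))
```

The thresholds are exposed as parameters so that other splits can be tried on the same data.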
SmartUnit: Empirical Evaluations for Automated Unit Testing of Embedded Software in Industry
In this paper, we aim at automated coverage-based unit testing for
embedded software. To achieve this goal, by analyzing industrial
requirements and our previous work on the automated unit testing tool CAUT, we
build a new tool, SmartUnit, to meet the engineering requirements of
our partner companies. SmartUnit is a dynamic symbolic execution
implementation, which supports statement, branch, boundary value and MC/DC
coverage. SmartUnit has been used to test more than one million lines of code
in real projects. For confidentiality reasons, we select three in-house real
projects for the empirical evaluations. We also carry out our evaluations on
two open source database projects, SQLite and PostgreSQL, to test the
scalability of our tool, since embedded software projects are mostly
not large, 5K-50K lines of code on average. From our experimental
results, in general, more than 90% of functions in commercial embedded software
achieve 100% statement, branch, and MC/DC coverage, more than 80% of functions in
SQLite achieve 100% MC/DC coverage, and more than 60% of functions in
PostgreSQL achieve 100% MC/DC coverage. Moreover, SmartUnit is able to find
runtime exceptions at the unit testing level. We have also reported exceptions
such as array index out of bounds and divide-by-zero in SQLite. Furthermore, we
analyze the reasons for low coverage in automated unit testing in our setting
and give a survey on the situation of manual unit testing with respect to
automated unit testing in industry.
Comment: In Proceedings of the 40th International Conference on Software
Engineering: Software Engineering in Practice Track, Gothenburg, Sweden, May
27-June 3, 2018 (ICSE-SEIP '18), 10 page
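MC/DC, one of the coverage criteria named in the abstract above, requires showing that each condition in a decision can independently change the decision's outcome. The sketch below enumerates such independence pairs by brute force for a small Boolean decision. It illustrates the criterion only (in its unique-cause form) and is not SmartUnit's dynamic symbolic execution; the function name mcdc_pairs and the example decision are hypothetical.

```python
from itertools import product


def mcdc_pairs(decision, n_conditions):
    """For each condition index, list the pairs of input vectors that differ
    only in that condition and produce different decision outcomes."""
    pairs = {i: [] for i in range(n_conditions)}
    for i in range(n_conditions):
        for values in product([False, True], repeat=n_conditions):
            if values[i]:
                continue  # visit each pair once, starting from the False side
            flipped = values[:i] + (True,) + values[i + 1:]
            if decision(*values) != decision(*flipped):
                pairs[i].append((values, flipped))
    return pairs


# Hypothetical decision with three conditions: a and (b or c)
def guard(a, b, c):
    return a and (b or c)


for index, found in mcdc_pairs(guard, 3).items():
    print(index, found)
```

A test suite satisfies this form of MC/DC for the decision if it contains at least one pair from each condition's list.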
Combining Spreadsheet Smells for Improved Fault Prediction
Spreadsheets are commonly used in organizations as a programming tool for
business-related calculations and decision making. Since faults in spreadsheets
can have severe business impacts, a number of approaches from general software
engineering have been applied to spreadsheets in recent years, among them the
concept of code smells. Smells can in particular be used for the task of fault
prediction. An analysis of existing spreadsheet smells, however, revealed that
the predictive power of individual smells can be limited. In this work we
therefore propose a machine learning based approach which combines the
predictions of individual smells by using an AdaBoost ensemble classifier.
Experiments on two public datasets containing real-world spreadsheet faults
show significant improvements in terms of fault prediction accuracy.
Comment: 4 pages, 1 figure, to be published in 40th International Conference
on Software Engineering: New Ideas and Emerging Results Trac
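The idea of combining individual smell predictions with AdaBoost, as described in the abstract above, can be sketched in plain Python. This is a toy version under loud assumptions: each smell is treated as a weak classifier that votes +1 (faulty) or -1 (not faulty) per cell, the data is synthetic, and the names train_adaboost and predict are hypothetical. The paper's actual features, weak learners, and datasets are not given in the abstract.

```python
import math


def train_adaboost(smell_votes, labels, rounds=3):
    """smell_votes[i][j] is +1 if smell j flags cell i, else -1.
    labels[i] is +1 for a faulty cell, -1 otherwise.
    Returns a list of (smell_index, polarity, alpha) weak classifiers."""
    n = len(labels)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        best = None
        # Pick the smell (or its negation) with the lowest weighted error.
        for j in range(len(smell_votes[0])):
            for polarity in (1, -1):
                err = sum(w for w, v, y in zip(weights, smell_votes, labels)
                          if polarity * v[j] != y)
                if best is None or err < best[0]:
                    best = (err, j, polarity)
        err, j, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        if err >= 0.5:
            break  # no weak classifier is better than chance
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((j, polarity, alpha))
        # Increase the weight of misclassified cells, then renormalize.
        weights = [w * math.exp(-alpha * y * polarity * v[j])
                   for w, v, y in zip(weights, smell_votes, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, votes):
    """Weighted vote of the selected smells for one cell."""
    score = sum(alpha * polarity * votes[j] for j, polarity, alpha in ensemble)
    return 1 if score > 0 else -1


# Synthetic data: three smells, a cell is faulty when at least two smells fire.
cells = [(a, b, c) for a in (1, -1) for b in (1, -1) for c in (1, -1)]
labels = [1 if sum(v) > 0 else -1 for v in cells]
model = train_adaboost(cells, labels, rounds=3)
print(sum(predict(model, v) == y for v, y in zip(cells, labels)), "of", len(cells))
```

On this synthetic data each smell alone misclassifies two of the eight cells, while the weighted combination classifies all of them correctly, which is the effect the abstract attributes to combining smells.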
Machine Learning for Software Engineering: Models, Methods, and Applications
Machine Learning (ML) is the discipline that studies methods for automatically inferring models from data. Machine learning has been successfully applied in many areas of software engineering ranging from behaviour extraction, to testing, to bug fixing. Many more applications are yet to be defined. However, a better understanding of ML methods, their assumptions and guarantees would help software engineers adopt and identify the appropriate methods for their desired applications. We argue that this choice can be guided by the models one seeks to infer. In this technical briefing, we review and reflect on the applications of ML for software engineering organised according to the models they produce and the methods they use. We introduce the principles of ML, give an overview of some key methods, and present examples of areas of software engineering benefiting from ML. We also discuss the open challenges for reaching the full potential of ML for software engineering and how ML can benefit from software engineering methods.
Towards a unified conceptual model for surveillance theories
The erosion of values such as privacy can be a critical factor in preventing the acceptance of new innovative technology especially in challenging environments such as the criminal justice system. Erosion of privacy happens through either deliberate or inadvertent surveillance. Since Bentham’s original liberal project in the 1900s, a literature and a whole study area around theories of surveillance has developed. Increasingly this general body of work has focussed on the role of information technology as a vehicle for surveillance activity. Despite an abundance of knowledge, a unified view of key surveillance concepts that is useful to designers of information systems in preventing or reducing unintended surveillance remains elusive. This paper contributes a conceptual model that synthesises the gamut of surveillance theories as a first step to a theory building effort for use by Information Systems professionals. The model is evaluated using a design science research paradigm using data from both examples of surveillance and a recently completed research project that developed technology for the UK youth justice system.
Opinion Mining for Software Development: A Systematic Literature Review
Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies.
SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in
code comments and extracting users’ criticisms of mobile apps. Given the large number of relevant studies available, it can take
considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils
these approaches entail.
We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion
mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in
other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4)
concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques.
The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide
critical insights for the further development of opinion mining techniques in the SE domain.