385 research outputs found

    We Don't Need Another Hero? The Impact of "Heroes" on Software Development

    Full text link
    A software project has "Hero Developers" when 80% of contributions are delivered by 20% of the developers. Are such heroes a good idea? Are too many heroes bad for software quality? Is it better to have more/less heroes for different kinds of projects? To answer these questions, we studied 661 open source projects from Public open source software (OSS) Github and 171 projects from an Enterprise Github. We find that hero projects are very common. In fact, as projects grow in size, nearly all project become hero projects. These findings motivated us to look more closely at the effects of heroes on software development. Analysis shows that the frequency to close issues and bugs are not significantly affected by the presence of project type (Public or Enterprise). Similarly, the time needed to resolve an issue/bug/enhancement is not affected by heroes or project type. This is a surprising result since, before looking at the data, we expected that increasing heroes on a project will slow down howfast that project reacts to change. However, we do find a statistically significant association between heroes, project types, and enhancement resolution rates. Heroes do not affect enhancement resolution rates in Public projects. However, in Enterprise projects, the more heroes increase the rate at which project complete enhancements. In summary, our empirical results call for a revision of a long-held truism in software engineering. Software heroes are far more common and valuable than suggested by the literature, particularly for medium to large Enterprise developments. Organizations should reflect on better ways to find and retain more of these heroesComment: 8 pages + 1 references, Accepted to International conference on Software Engineering - Software Engineering in Practice, 201

    SmartUnit: Empirical Evaluations for Automated Unit Testing of Embedded Software in Industry

    Full text link
    In this paper, we aim at the automated unit coverage-based testing for embedded software. To achieve the goal, by analyzing the industrial requirements and our previous work on automated unit testing tool CAUT, we rebuild a new tool, SmartUnit, to solve the engineering requirements that take place in our partner companies. SmartUnit is a dynamic symbolic execution implementation, which supports statement, branch, boundary value and MC/DC coverage. SmartUnit has been used to test more than one million lines of code in real projects. For confidentiality motives, we select three in-house real projects for the empirical evaluations. We also carry out our evaluations on two open source database projects, SQLite and PostgreSQL, to test the scalability of our tool since the scale of the embedded software project is mostly not large, 5K-50K lines of code on average. From our experimental results, in general, more than 90% of functions in commercial embedded software achieve 100% statement, branch, MC/DC coverage, more than 80% of functions in SQLite achieve 100% MC/DC coverage, and more than 60% of functions in PostgreSQL achieve 100% MC/DC coverage. Moreover, SmartUnit is able to find the runtime exceptions at the unit testing level. We also have reported exceptions like array index out of bounds and divided-by-zero in SQLite. Furthermore, we analyze the reasons of low coverage in automated unit testing in our setting and give a survey on the situation of manual unit testing with respect to automated unit testing in industry.Comment: In Proceedings of 40th International Conference on Software Engineering: Software Engineering in Practice Track, Gothenburg, Sweden, May 27-June 3, 2018 (ICSE-SEIP '18), 10 page

    Combining Spreadsheet Smells for Improved Fault Prediction

    Full text link
    Spreadsheets are commonly used in organizations as a programming tool for business-related calculations and decision making. Since faults in spreadsheets can have severe business impacts, a number of approaches from general software engineering have been applied to spreadsheets in recent years, among them the concept of code smells. Smells can in particular be used for the task of fault prediction. An analysis of existing spreadsheet smells, however, revealed that the predictive power of individual smells can be limited. In this work we therefore propose a machine learning based approach which combines the predictions of individual smells by using an AdaBoost ensemble classifier. Experiments on two public datasets containing real-world spreadsheet faults show significant improvements in terms of fault prediction accuracy.Comment: 4 pages, 1 figure, to be published in 40th International Conference on Software Engineering: New Ideas and Emerging Results Trac

    Machine Learning for Software Engineering: Models, Methods, and Applications

    Get PDF
    Machine Learning (ML) is the discipline that studies methods for automatically inferring models from data. Machine learning has been successfully applied in many areas of software engineering ranging from behaviour extraction, to testing, to bug fixing. Many more applications are yet be defined. However, a better understanding of ML methods, their assumptions and guarantees would help software engineers adopt and identify the appropriate methods for their desired applications. We argue that this choice can be guided by the models one seeks to infer. In this technical briefing, we review and reflect on the applications of ML for software engineering organised according to the models they produce and the methods they use. We introduce the principles of ML, give an overview of some key methods, and present examples of areas of software engineering benefiting from ML. We also discuss the open challenges for reaching the full potential of ML for software engineering and how ML can benefit from software engineering methods

    Towards a unified conceptual model for surveillance theories

    Get PDF
    The erosion of values such as privacy can be a critical factor in preventing the acceptance of new innovative technology especially in challenging environments such as the criminal justice system. Erosion of privacy happens through either deliberate or inadvertent surveillance. Since Bentham’s original liberal project in the 1900s, a literature and a whole study area around theories of surveillance has developed. Increasingly this general body of work has focussed on the role of information technology as a vehicle for surveillance activity. Despite an abundance of knowledge, a uni!ed view of key surveillance concepts that is useful to designers of information systems in preventing or reducing unintended surveillance remains elusive. This paper contributes a conceptual model that synthesises the gamut of surveillance theories as a !rst step to a theory building effort for use by Information Systems professionals. The model is evaluated using a design science research paradigm using data from both examples of surveillance and a recently completed research project that developed technology for the UK youth justice system

    Opinion Mining for Software Development: A Systematic Literature Review

    Get PDF
    Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies. SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in code comments and extracting users’ critics toward mobile apps. Given the large amount of relevant studies available, it can take considerable time for researchers and developers to figure out which approaches they can adopt in their own studies and what perils these approaches entail. We conducted a systematic literature review involving 185 papers. More specifically, we present 1) well-defined categories of opinion mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4) concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques. The results of our study serve as references to choose suitable opinion mining tools for software development activities, and provide critical insights for the further development of opinion mining techniques in the SE domain
    • …
    corecore