    Knowledge Unchained or Strategically Overseen? Knowledge Management in Open Source Software Projects

    The term “open source software” was formally introduced in the early 2000s to describe source code that is available to the public to be used and modified by anyone. Like any innovative idea attaining a certain maturity level, open source communities have reached a degree of formalization in their structures and practices. This also holds for knowledge management and its related measures in open source communities. Therefore, we investigate the patterns and structures in communication and collaboration of the currently most successful open source software projects through a case study approach. In doing so, we reveal how the different knowledge management aspects are practiced in these internet communities. Given the projects’ success, we identify similarities as good practices and derive practical recommendations for action for other open source communities as well as research opportunities regarding knowledge management in open source software projects.

    An Investigation into quality assurance of the Open Source Software Development model

    A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy. The Open Source Software Development (OSSD) model has launched products in rapid succession and with high quality, without following traditional quality practices of accepted software development models (Raymond 1999). Some OSSD projects challenge established quality assurance approaches, claiming to be successful through techniques that partly run contrary to standard software development. However, empirical studies of quality assurance practices for Open Source Software (OSS) are rare (Glass 2001). Therefore, further research is required to evaluate the quality assurance processes and methods within the OSSD model. The aim of this research is to improve the understanding of quality assurance practices under the OSSD model. The OSSD model is characterised by a collaborative, distributed development approach with public communication, free participation, free entry to the project for newcomers and unlimited access to the source code. The research examines applied quality assurance practices from a process view rather than from a product view. The research follows ideographic and nomothetic methodologies and adopts an antipositivist epistemological approach. Empirical research on applied quality assurance practices in OSS projects is conducted through literature research. The survey research method is used to gain empirical evidence about applied practices. The findings are used to validate the theoretical knowledge and to obtain further expertise about practical approaches. The findings contribute to the development of a quality assurance framework for standard OSSD approaches. The result is an appropriate quality model with metrics that support the requirements of the OSSD. An ideographic approach with case studies is used to extend the body of knowledge and to assess the feasibility and applicability of the quality assurance framework. In conclusion, the study provides further understanding of the applied quality assurance processes under the OSSD model and shows how a quality assurance framework can support the development processes with guidelines and measurements.

    Achieving Quality through Software Maintenance and Evolution: on the role of Agile Methodologies and Open Source Software

    Agile methodologies, open source software development, and emerging new technologies are at the base of disruptive changes in software engineering. Since effort estimation is pivotal for effective project management in the agile context, in the first part of the thesis we contribute to improving effort estimation by devising a real-time story point classifier, designed in collaboration with an industrial partner and by exploiting publicly available data on open source projects. We demonstrate that, after an initial training on at least 300 issue reports, the classifier estimates a new issue in less than 15 seconds with a mean magnitude of relative error between 0.16 and 0.61. In addition, issue type, summary, description, and related components prove to be project-dependent features pivotal for story point estimation. Since story points are the most popular effort estimation metric in the agile context, in the second study presented in the thesis we investigate the role of agile methodologies in software maintenance and evolution, and show their influence on the refactoring research field over the last 15 years. In the later part of the thesis, we focus on recent technologies to understand their impact on software engineering. We start by proposing a specialized blockchain-oriented software engineering, based on the peculiar challenges the blockchain sector must confront and on statistical data retrieved from a corpus of open source blockchain-oriented software repositories, identified using the 2016 Moody’s Blockchain Report. We advocate the need for new professional roles, enhanced security and reliability, novel modeling languages, and specialized metrics, along with new research directions focusing on collaboration among large teams, testing, and specialized tools for the creation of smart contracts. Along with the blockchain, in the later part of this work we also study the growing mobile sector. More specifically, we focus on the relationships between software defects and the use of the underlying system API, finding that our results are aligned with those in the literature, namely, that the applications which are more connected to API classes are also more defect-prone. Finally, in the last work presented in the dissertation, we conducted a statistical analysis of 20 open source object-oriented systems, 10 written in the highly popular language Java and 10 in the rising language Python. We leveraged two statistical distribution functions, the log-normal and the double Pareto distributions, to provide good fits, both in Java and Python, for three metrics, namely, the NOLM, NOM, and NOS metrics. The study, among other findings, revealed that the variability of the number of methods used in Python classes is lower than in Java classes, and that Java classes, on average, feature fewer lines of code than Python classes.
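
The mean magnitude of relative error (MMRE) quoted in the abstract above is commonly defined as the average of |actual - estimated| / actual over all estimated items. The following is a minimal Python sketch under that standard definition; the function name `mmre` and the sample story-point values are illustrative assumptions and are not taken from the thesis.

```python
from typing import Sequence


def mmre(actual: Sequence[float], estimated: Sequence[float]) -> float:
    """Mean magnitude of relative error: mean(|actual - estimated| / actual).

    Assumes every actual value is non-zero (story points normally are).
    """
    if len(actual) != len(estimated) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Relative error of each estimate, measured against the true value.
    errors = [abs(a - e) / a for a, e in zip(actual, estimated)]
    return sum(errors) / len(errors)


# Hypothetical story points for three issues (made-up values).
actual_points = [2, 4, 8]
estimated_points = [1, 4, 10]
print(mmre(actual_points, estimated_points))
```

Lower values indicate estimates closer to the actual effort; a value of 0 means every estimate matched exactly.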

    Software Development Analytics in Practice: A Systematic Literature Review

    Context: Software Development Analytics is a research area concerned with providing insights to improve product deliveries and processes. Many types of studies, data sources and mining methods have been used for that purpose. Objective: This systematic literature review aims at providing an aggregate view of the relevant studies on Software Development Analytics in the past decade (2010-2019), with an emphasis on its application in practical settings. Method: Definition and execution of a search string upon several digital libraries, followed by quality assessment criteria to identify the most relevant papers. From those, we extracted a set of characteristics (study type, data source, study perspective, development life-cycle activities covered, stakeholders, mining methods, and analytics scope) and classified their impact against a taxonomy. Results: Source code repositories, experimental case studies, and developers are the most common data sources, study types, and stakeholders, respectively. Product and project managers are also often present, but less than expected. Mining methods are evolving rapidly, and that is reflected in the long list identified. Descriptive statistics are the most usual method, followed by correlation analysis. Since software development is an important process in every organization, it was unexpected to find that process mining was present in only one study. Most contributions to the software development life cycle were given in the quality dimension. Time management and cost control were lightly debated. The analysis of security aspects suggests it is an increasing topic of concern for practitioners. Risk management contributions are scarce. Conclusions: There is a wide improvement margin for software development analytics in practice, for instance, mining and analyzing the activities performed by software developers in their actual workbench, the IDE.

    Modeling User-Affected Software Properties for Open Source Software Supply Chains

    Background: The Open Source Software development community relies heavily on users of the software and contributors outside of the core developers to produce top-quality software and provide long-term support. However, the relationship between a piece of software and its contributors, in terms of exactly how they are related through dependencies and how the users of the software affect many of its properties, is not very well understood. Aim: My research covers a number of aspects related to answering the overarching question of modeling the software properties affected by users and the supply chain structure of software ecosystems, viz. 1) Understanding how software usage affects its perceived quality; 2) Estimating the effects of indirect usage (e.g. dependent packages) on software popularity; 3) Investigating the patch submission and issue creation patterns of external contributors; 4) Examining how the patch acceptance probability is related to the contributors' characteristics. 5) A related topic, the identification of bots that commit code, aimed at improving the accuracy of these and other similar studies, was also investigated. Methodology: Most of the Research Questions are addressed by studying the NPM ecosystem, with data from various sources like the World of Code, GHTorrent, and the GitHub API. Different supervised and unsupervised machine learning models, including Regression, Random Forest, Bayesian Networks, and clustering, were used to answer appropriate questions. Results: 1) Software usage affects its perceived quality even after accounting for code complexity measures. 2) The number of dependents and dependencies of a piece of software were observed to be able to predict the change in its popularity with good accuracy. 3) Users interact (contribute issues or patches) primarily with their direct dependencies, and rarely with transitive dependencies. 4) A user's earlier interaction with the repository to which they are contributing a patch, and their familiarity with related topics, were important predictors impacting the chance of a pull request getting accepted. 5) Developed BIMAN, a systematic methodology for identifying bots. Conclusion: Different aspects of how users and their characteristics affect different software properties were analyzed, which should lead to a better understanding of the complex interaction between software developers and users/contributors.

    Understanding, Analysis, and Handling of Software Architecture Erosion

    Architecture erosion occurs when a software system's implemented architecture diverges from the intended architecture over time. Studies show erosion impacts development, maintenance, and evolution since it accumulates imperceptibly. Identifying early symptoms like architectural smells enables managing erosion through refactoring. However, research lacks a comprehensive understanding of erosion, it is unclear which symptoms are most common, and detection methods are lacking. This thesis establishes an erosion landscape, investigates symptoms, and proposes identification approaches. A mapping study covers erosion definitions, symptoms, causes, and consequences. Key findings: 1) "Architecture erosion" is the most used term, with four perspectives on definitions and respective symptom types. 2) Technical and non-technical reasons contribute to erosion, negatively impacting quality attributes. Practitioners can advocate addressing erosion to prevent failures. 3) Detection and correction approaches are categorized, with consistency and evolution-based approaches commonly mentioned. An empirical study explores practitioner perspectives through communities, surveys, and interviews. Findings reveal associated practices like code review and tools identify symptoms, while collected measures address erosion during implementation. Studying code review comments analyzes erosion in practice. One study reveals architectural violations, duplicate functionality, and cyclic dependencies are most frequent. Symptoms decreased over time, indicating increased stability. Most were addressed after review. A second study explores violation symptoms in four projects, identifying 10 categories. Refactoring and removing code address most violations, while some are disregarded. Machine learning classifiers using pre-trained word embeddings identify violation symptoms from code reviews. Key findings: 1) SVM with word2vec achieved highest performance. 2) fastText embeddings worked well. 3) 200-dimensional embeddings outperformed 100/300-dimensional. 4) Ensemble classifier improved performance. 5) Practitioners found results valuable, confirming potential. An automated recommendation system identifies qualified reviewers for violations using similarity detection on file paths and comments. Experiments show common methods perform well, outperforming a baseline approach. Sampling techniques impact recommendation performance.

    Understanding the current trends in mobile crowdsensing - a business model perspective: case MyGeo Trust

    Crowdsensing and the personal data markets that have emerged around it have rapidly gained momentum in parallel with the appearance of mobile devices. When information is collected via mobile sensors and the applications relying on these, the privacy of mobile users can be threatened, especially in the case of location-related data. In 2015, a research project called MyGeoTrust was initiated to investigate this issue. One aim of the project was to study the potential business models for a trusted, open-source crowdsourcing platform. This study, carried out within the MyGeoTrust project, reviews existing literature about business models, location-based services, and open-source software development. It then investigates the relationship between these topics and mobile crowdsensing. As a whole, this thesis provides an overview of the development of location-based services, as well as the current trends and business models in crowdsensing. The empirical part of the thesis employs embedded case study methodology, acquiring empirical data from several sources. The analyzed case is the MyGeoTrust project itself, and other empirical data is collected via market analysis, interim reports, a user survey, and semi-structured interviews. This material forms the baseline for the empirical study and project-specific recommendations. The findings suggest that creating a two- or multisided platform is the most robust business model for mobile crowdsensing. The identified benefits of platform-based business models include facilitating the value exchange between self-governing groups and possibilities to build positive network effects. This is especially the case with open-source software and open data since the key value for users - or “the crowd” in other terms - is created through network effects. In the context of open business models, strategic planning, principally licensing, plays a central role. Also, for a differentiated platform like MyGeoTrust, finding the critical mass of users is crucial in order to create an appealing alternative to current market leaders. Lastly, this study examines how transformational political or legal factors may shape the scene and create requirements for novel, privacy-perceiving solutions. In the present case study, the upcoming European Union (EU) General Data Protection Regulation (GDPR) legislation is a central example of such a factor.

    Essential properties of open development communities : supporting growth, collaboration, and learning

    Open development has emerged as a method for creating versatile and complex products through free collaboration of individuals. This free collaboration forms globally distributed teams. Similarly, it is common today to view business and other human organizations as ecosystems, where several participating companies and organizations co-operate and compete together. For example, open source software development is one area where community driven development provides a plausible platform for both development of products and establishing a software ecosystem where a set of businesses contribute their own innovations. Equally, open learning environments and open innovation platforms are also gaining ground. While such initiatives are not limited to any specific area, they typically offer a technological, legal, social, and economic framework for development. Moreover, they always rely on the associated community, the people. Open development would not exist without the active participation of keen developers. However, people are fickle. Firstly, as one of the main driving forces for participation is self-interest, "scratching your own itch", the question of how to grow and support open development rises to the forefront. It further leads one to ask what contributes to making open development successful. This is especially crucial when the product has business value. Secondly, as open development has its own governance methods and development guidelines, one is led to ask how learning these could be facilitated, and how community participation could be supported. This doctoral dissertation gives insight into tools and techniques that help in dealing with the multi-faceted challenge of working with and growing an open development community. It discusses these through a framework covering the five key aspects of open development: the people in and the purpose of the community, the product developed by the community, and the policies and the platform the community needs to function. The thesis presents work on establishing and monitoring an open development community in two different settings: a Free/Libre/Open Source Software (FLOSS) business environment and open education. The research covers going ahead with open development within the FLOSS ecosystem both from the point of view of the product and the business environment. Additionally, this thesis offers research on how developers can learn open development methods. It introduces academic open development communities through which the developers can adopt collaborative development skills. The research presented paves the way for gaining further knowledge in growing thriving open development communities.

    An Assessment of DevOps Maturity in a Software Project

    DevOps is a software development method, which aims at decreasing conflict between software developers and system operators. Conflicts can occur because the developers’ goal is to release the new features of the software to production, whereas the operators’ goal is to keep the software as stable and available as possible. In traditional software development models, the typical amount of time between deployments can be long and the changes in software can become rather complex and big in size. The DevOps approach seeks to solve this contradiction by bringing software developers and system operators together from the very beginning of a development project. In the DevOps model, changes deployed to production are small and frequent. Automated deployments decrease human errors that sometimes occur in manual deployments. Testing is at least partly automated and tests are run after each individual software change. However, technical means are only one part of the DevOps approach. The model also emphasizes changes in organizational culture, which are ideally based on openness, continuous learning, and experimentation. Employees possess the freedom of decision-making while carrying the responsibility that follows. In addition to individual or team-based goals, each employee is encouraged to pursue the common goals. The aim of this thesis is two-fold. Firstly, the goal is to understand and define the DevOps model through a literature review. Secondly, the thesis analyzes the factors that contribute to the successful adoption of DevOps in an organization, including those with the possibility of slowing down or hindering the process. A qualitative case study was carried out on a system development project in a large Finnish technology company. The data consists of semi-structured open-ended interviews with key personnel, and the findings are analyzed and compared to factors introduced in previous DevOps literature, including the DevOps maturity model. 
The case project is also assessed in terms of its DevOps maturity. Finally, impediments and problems regarding DevOps adoption are discussed. Based on the case study, major challenges in the project include the large size and complexity of the project, problems in project management, occasional communication problems between the vendor and the client, poor overall quality of the software, and defects in the software development process of the vendor. Despite the challenges, the company demonstrated progress in some aspects, such as partly automating the deployment process, creating basic monitoring for the software, and negotiating development and testing guidelines with the vendor.