257 research outputs found

    Security considerations in the open source software ecosystem

    Get PDF
    Open source software plays an important role in the software supply chain, allowing stakeholders to utilize open source components as building blocks in their software, tooling, and infrastructure. But relying on the open source ecosystem introduces unique challenges, both in terms of security and trust, as well as in terms of supply chain reliability. In this dissertation, I investigate approaches, considerations, and encountered challenges of stakeholders in the context of security, privacy, and trustworthiness of the open source software supply chain. Overall, my research aims to empower and support software experts with the knowledge and resources necessary to achieve a more secure and trustworthy open source software ecosystem. In the first part of this dissertation, I describe a research study investigating the security and trust practices in open source projects by interviewing 27 owners, maintainers, and contributors from a diverse set of projects to explore their behind-the-scenes processes, guidance and policies, incident handling, and encountered challenges, finding that participants’ projects are highly diverse in terms of their deployed security measures and trust processes, as well as their underlying motivations. More on the consumer side of the open source software supply chain, I investigated the use of open source components in industry projects by interviewing 25 software developers, architects, and engineers to understand their projects’ processes, decisions, and considerations in the context of external open source code, finding that open source components play an important role in many of the industry projects, and that most projects have some form of company policy or best practice for including external code. On the side of end-user focused software, I present a study investigating the use of software obfuscation in Android applications, which is a recommended practice to protect against plagiarism and repackaging. The study leveraged a multi-pronged approach including a large-scale measurement, a developer survey, and a programming experiment, finding that only 24.92% of apps are obfuscated by their developer, that developers do not fear theft of their own apps, and have difficulties obfuscating their own apps. Lastly, to involve end users themselves, I describe a survey with 200 users of cloud office suites to investigate their security and privacy perceptions and expectations, with findings suggesting that users are generally aware of basic security implications, but lack technical knowledge for envisioning some threat models. The key findings of this dissertation include that open source projects have highly diverse security measures, trust processes, and underlying motivations. That the projects’ security and trust needs are likely best met in ways that consider their individual strengths, limitations, and project stage, especially for smaller projects with limited access to resources. That open source components play an important role in industry projects, and that those projects often have some form of company policy or best practice for including external code, but developers wish for more resources to better audit included components. This dissertation emphasizes the importance of collaboration and shared responsibility in building and maintaining the open source software ecosystem, with developers, maintainers, end users, researchers, and other stakeholders alike ensuring that the ecosystem remains a secure, trustworthy, and healthy resource for everyone to rely on

    A survey of the use of crowdsourcing in software engineering

    Get PDF
    The term 'crowdsourcing' was initially introduced in 2006 to describe an emerging distributed problem-solving model by online workers. Since then it has been widely studied and practiced to support software engineering. In this paper we provide a comprehensive survey of the use of crowdsourcing in software engineering, seeking to cover all literature on this topic. We first review the definitions of crowdsourcing and derive our definition of Crowdsourcing Software Engineering together with its taxonomy. Then we summarise industrial crowdsourcing practice in software engineering and corresponding case studies. We further analyse the software engineering domains, tasks and applications for crowdsourcing and the platforms and stakeholders involved in realising Crowdsourced Software Engineering solutions. We conclude by exposing trends, open issues and opportunities for future research on Crowdsourced Software Engineering

    Multi-objective Search-based Mobile Testing

    Get PDF
    Despite the tremendous popularity of mobile applications, mobile testing still relies heavily on manual testing. This thesis presents mobile test automation approaches based on multi-objective search. We introduce three approaches: Sapienz (for native Android app testing), Octopuz (for hybrid/web JavaScript app testing) and Polariz (for using crowdsourcing to support search-based mobile testing). These three approaches represent the primary scientific and technical contributions of the thesis. Since crowdsourcing is, itself, an emerging research area, and less well understood than search-based software engineering, the thesis also provides the first comprehensive survey on the use of crowdsourcing in software testing (in particular) and in software engineering (more generally). This survey represents a secondary contribution. Sapienz is an approach to Android testing that uses multi-objective search-based testing to automatically explore and optimise test sequences, minimising their length, while simultaneously maximising their coverage and fault revelation. The results of empirical studies demonstrate that Sapienz significantly outperforms both the state-of-the-art technique Dynodroid and the widely-used tool, Android Monkey, on all three objectives. When applied to the top 1,000 Google Play apps, Sapienz found 558 unique, previously unknown crashes. Octopuz reuses the Sapienz multi-objective search approach for automated JavaScript testing, aiming to investigate whether it replicates the Sapienz’ success on JavaScript testing. Experimental results on 10 real-world JavaScript apps provide evidence that Octopuz significantly outperforms the state of the art (and current state of practice) in automated JavaScript testing. Polariz is an approach that combines human (crowd) intelligence with machine (computational search) intelligence for mobile testing. It uses a platform that enables crowdsourced mobile testing from any source of app, via any terminal client, and by any crowd of workers. It generates replicable test scripts based on manual test traces produced by the crowd workforce, and automatically extracts from these test traces, motif events that can be used to improve search-based mobile testing approaches such as Sapienz

    The State of Practice for Security Unit Testing: Towards Data Driven Strategies to Shift Security into Developer\u27s Automated Testing Workflows

    Get PDF
    The pressing need to “shift security left” in the software development lifecycle has motivated efforts to adapt the iterative and continuous process models used in practice today. Security unit testing is praised by practitioners and recommended by expert groups, usually in the context of DevSecOps and achieving “continuous security”. In addition to vulnerability testing and standards adherence, this technique can help developers verify that security controls are implemented correctly, i.e. functional security testing. Further, the means by which security unit testing can be integrated into developer workflows is unique from other standalone tools as it is an adaptation of practices and infrastructure developers are already familiar with. Yet, software engineering researchers have so far failed to include this technique in their empirical studies on secure development and little is known about the state of practice for security unit testing. This dissertation is motivated by the disconnect between promotion of security unit testing and the lack of empirical evidence on how it is and can be applied. The goal of this work was to address the disconnect towards identifying actionable strategies to promote wider adoption and mitigate observed challenges. Three mixed-method empirical studies were conducted wherein practitioner-authored unit test code, Q&A posts, and grey literature were analyzed through three lenses: Practices (what they do), Perspectives and Guidelines (what and how they think it should be done), and Pain Points (what challenges they face) to incorporate both technical and human factors of this phenomena. Accordingly, this work contributes novel and important insights into how developers write functional unit tests for at least nine security controls, including a taxonomy of 53 authentication unit test cases derived from real code and a detailed analysis of seven unique pain points that developers seek help with from peers on Q&A sites. Recommendations given herein for conducting and adopting security unit testing, including mitigating challenges and addressing gaps between available and needed support, are grounded in the guidelines and perspectives on the benefits, limitations, use cases, and integration strategies shared in grey literature authored by practitioners

    Achieving Quality through Software Maintenance and Evolution: on the role of Agile Methodologies and Open Source Software

    Get PDF
    Agile methodologies, open source software development, and emerging new technologies are at the base of disruptive changes in software engineering. Being effort estimation pivotal for effective project management in the agile context, in the first part of the thesis we contribute to improve effort estimation by devising a real-time story point classifier, designed with the collaboration of an industrial partner and by exploiting publicly available data on open source projects. We demonstrate that, after an initial training on at least 300 issue reports, the classifier estimates a new issue in less than 15 seconds with a mean magnitude of relative error between 0.16 and 0.61. In addition, issue type, summary, description, and related components prove to be project-dependent features pivotal for story point estimation. Since story points are the most popular effort estimation metric in the agile context, in the second study presented in the thesis we investigate the role of agile methodologies in software maintenance and evolution, and prove its undoubted influence on the refactoring research field over the last 15 years. In the later part of the thesis, we focus on recent technologies to understand their impact on software engineering. We start by proposing a specialized blockchain-oriented software engineering, on the basis of the peculiar challenges the blockchain sector must confront with and statistical data retrieved from a corpus of open source blockchain-oriented software repositories, identified relying upon the 2016 Moody’s Blockchain Report. We advocate the need for new professional roles, enhanced security and reliability, novel modeling languages, and specialized metrics, along with new research directions focusing on collaboration among large teams, testing, and specialized tools for the creation of smart contracts. Along with the blockchain, in the later part of this work we also study the growing mobile sector. More specifically, we focus on the relationships between software defects and the use of the underlying system API, proving that our findings are aligned with those in the literature, namely, that the applications which are more connected to API classes are also more defect-prone. Finally, in the last work presented in the dissertation, we conducted a statistical analysis of 20 open source object-oriented systems, 10 written in the highly popular language Java and 10 in the rising language Python. We leveraged two statistical distribution functions–the log-normal and the double Pareto distributions–to provide good fits, both in Java and Python, for three metrics, namely, the NOLM, NOM, and NOS metrics. The study, among other findings, revealed that the variability of the number of methods used in Python classes is lower than in Java classes, and that Java classes, on average, feature fewer lines of code than Python classes

    ICSEA 2022: the seventeenth international conference on software engineering advances

    Get PDF
    The Seventeenth International Conference on Software Engineering Advances (ICSEA 2022), held between October 16th and October 20th, 2022, continued a series of events covering a broad spectrum of software-related topics. The conference covered fundamentals on designing, implementing, testing, validating and maintaining various kinds of software. Several tracks were proposed to treat the topics from theory to practice, in terms of methodologies, design, implementation, testing, use cases, tools, and lessons learned. The conference topics covered classical and advanced methodologies, open source, agile software, as well as software deployment and software economics and education. Other advanced aspects are related to on-time practical aspects, such as run-time vulnerability checking, rejuvenation process, updates partial or temporary feature deprecation, software deployment and configuration, and on-line software updates. These aspects trigger implications related to patenting, licensing, engineering education, new ways for software adoption and improvement, and ultimately, to software knowledge management. There are many advanced applications requiring robust, safe, and secure software: disaster recovery applications, vehicular systems, biomedical-related software, biometrics related software, mission critical software, E-health related software, crisis-situation software. These applications require appropriate software engineering techniques, metrics and formalisms, such as, software reuse, appropriate software quality metrics, composition and integration, consistency checking, model checking, provers and reasoning. The nature of research in software varies slightly with the specific discipline researchers work in, yet there is much common ground and room for a sharing of best practice, frameworks, tools, languages and methodologies. Despite the number of experts we have available, little work is done at the meta level, that is examining how we go about our research, and how this process can be improved. There are questions related to the choice of programming language, IDEs and documentation styles and standard. Reuse can be of great benefit to research projects yet reuse of prior research projects introduces special problems that need to be mitigated. The research environment is a mix of creativity and systematic approach which leads to a creative tension that needs to be managed or at least monitored. Much of the coding in any university is undertaken by research students or young researchers. Issues of skills training, development and quality control can have significant effects on an entire department. In an industrial research setting, the environment is not quite that of industry as a whole, nor does it follow the pattern set by the university. The unique approaches and issues of industrial research may hold lessons for researchers in other domains. We take here the opportunity to warmly thank all the members of the ICSEA 2022 technical program committee, as well as all the reviewers. The creation of such a high-quality conference program would not have been possible without their involvement. We also kindly thank all the authors who dedicated much of their time and effort to contribute to ICSEA 2022. We truly believe that, thanks to all these efforts, the final conference program consisted of top-quality contributions. We also thank the members of the ICSEA 2022 organizing committee for their help in handling the logistics of this event. We hope that ICSEA 2022 was a successful international forum for the exchange of ideas and results between academia and industry and for the promotion of progress in software engineering advances

    Software Maintenance At Commit-Time

    Get PDF
    Software maintenance activities such as debugging and feature enhancement are known to be challenging and costly, which explains an ever growing line of research in software maintenance areas including mining software repository, default prevention, clone detection, and bug reproduction. The main goal is to improve the productivity of software developers as they undertake maintenance tasks. Existing tools, however, operate in an offline fashion, i.e., after the changes to the systems have been made. Studies have shown that software developers tend to be reluctant to use these tools as part of a continuous development process. This is because they require installation and training, hindering their integration with developers’ workflow, which in turn limits their adoption. In this thesis, we propose novel approaches to support software developers at commit-time. As part of the developer’s workflow, a commit marks the end of a given task. We show how commits can be used to catch unwanted modifications to the system, and prevent the introduction of clones and bugs, before these modifications reach the central code repository. We also propose a bug reproduction technique that is based on model checking and crash traces. Furthermore, we propose a new way for classifying bugs based on the location of fixes that can serve as the basis for future research in this field of study. The techniques proposed in this thesis have been tested on over 400 open and closed (industrial) systems, resulting in high levels of precision and recall. They are also scalable and non-intrusive

    Towards an Improved Understanding of Software Vulnerability Assessment Using Data-Driven Approaches

    Get PDF
    Software Vulnerabilities (SVs) can expose software systems to cyber-attacks, potentially causing enormous financial and reputational damage for organizations. There have been significant research efforts to detect these SVs so that developers can promptly fix them. However, fixing SVs is complex and time-consuming in practice, and thus developers usually do not have sufficient time and resources to fix all SVs at once. As a result, developers often need SV information, such as exploitability, impact, and overall severity, to prioritize fixing more critical SVs. Such information required for fixing planning and prioritization is typically provided in the SV assessment step of the SV lifecycle. Recently, data-driven methods have been increasingly proposed to automate SV assessment tasks. However, there are still numerous shortcomings with the existing studies on data-driven SV assessment that would hinder their application in practice. This PhD thesis aims to contribute to the growing literature in data-driven SV assessment by investigating and addressing the constant changes in SV data as well as the lacking considerations of source code and developers’ needs for SV assessment that impede the practical applicability of the field. Particularly, we have made the following five contributions in this thesis. (1) We systematize the knowledge of data-driven SV assessment to reveal the best practices of the field and the main challenges affecting its application in practice. Subsequently, we propose various solutions to tackle these challenges to better support the real-world applications of data-driven SV assessment. (2) We first demonstrate the existence of the concept drift (changing data) issue in descriptions of SV reports that current studies have mostly used for predicting the Common Vulnerability Scoring System (CVSS) metrics. We augment report-level SV assessment models with subwords of terms extracted from SV descriptions to help the models more effectively capture the semantics of ever-increasing SVs. (3) We also identify that SV reports are usually released after SV fixing. Thus, we propose using vulnerable code to enable earlier SV assessment without waiting for SV reports. We are the first to use Machine Learning techniques to predict CVSS metrics on the function level leveraging vulnerable statements directly causing SVs and their context in code functions. The performance of our function-level SV assessment models is promising, opening up research opportunities in this new direction. (4) To facilitate continuous integration of software code nowadays, we present a novel deep multi-task learning model, DeepCVA, to simultaneously and efficiently predict multiple CVSS assessment metrics on the commit level, specifically using vulnerability-contributing commits. DeepCVA is the first work that enables practitioners to perform SV assessment as soon as vulnerable changes are added to a codebase, supporting just-in-time prioritization of SV fixing. (5) Besides code artifacts produced from a software project of interest, SV assessment tasks can also benefit from SV crowdsourcing information on developer Question and Answer (Q&A) websites. We automatically retrieve large-scale security/SVrelated posts from these Q&A websites. We then apply a topic modeling technique on these posts to distill developers’ real-world SV concerns that can be used for data-driven SV assessment. Overall, we believe that this thesis has provided evidence-based knowledge and useful guidelines for researchers and practitioners to automate SV assessment using data-driven approaches.Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 202

    Rethinking Productivity in Software Engineering

    Get PDF
    Get the most out of this foundational reference and improve the productivity of your software teams. This open access book collects the wisdom of the 2017 "Dagstuhl" seminar on productivity in software engineering, a meeting of community leaders, who came together with the goal of rethinking traditional definitions and measures of productivity. The results of their work, Rethinking Productivity in Software Engineering, includes chapters covering definitions and core concepts related to productivity, guidelines for measuring productivity in specific contexts, best practices and pitfalls, and theories and open questions on productivity. You'll benefit from the many short chapters, each offering a focused discussion on one aspect of productivity in software engineering. Readers in many fields and industries will benefit from their collected work. Developers wanting to improve their personal productivity, will learn effective strategies for overcoming common issues that interfere with progress. Organizations thinking about building internal programs for measuring productivity of programmers and teams will learn best practices from industry and researchers in measuring productivity. And researchers can leverage the conceptual frameworks and rich body of literature in the book to effectively pursue new research directions. What You'll Learn Review the definitions and dimensions of software productivity See how time management is having the opposite of the intended effect Develop valuable dashboards Understand the impact of sensors on productivity Avoid software development waste Work with human-centered methods to measure productivity Look at the intersection of neuroscience and productivity Manage interruptions and context-switching Who Book Is For Industry developers and those responsible for seminar-style courses that include a segment on software developer productivity. Chapters are written for a generalist audience, without excessive use of technical terminology. ; Collects the wisdom of software engineering thought leaders in a form digestible for any developer Shares hard-won best practices and pitfalls to avoid An up to date look at current practices in software engineering productivit
    • …
    corecore