2,760 research outputs found

    Why Modern Open Source Projects Fail

    Full text link
    Open source is experiencing a renaissance period, due to the appearance of modern platforms and workflows for developing and maintaining public code. As a result, developers are creating open source software at speeds never seen before. Consequently, these projects are also facing unprecedented mortality rates. To better understand the reasons for the failure of modern open source projects, this paper describes the results of a survey with the maintainers of 104 popular GitHub systems that have been deprecated. We provide a set of nine reasons for the failure of these open source projects. We also show that some maintenance practices -- specifically the adoption of contributing guidelines and continuous integration -- have an important association with a project failure or success. Finally, we discuss and reveal the principal strategies developers have tried to overcome the failure of the studied projects.Comment: Paper accepted at 25th International Symposium on the Foundations of Software Engineering (FSE), pages 1-11, 201

    Towards understanding the challenges faced by machine learning software developers and enabling automated solutions

    Get PDF
    Modern software systems are increasingly including machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To fill that gap this thesis reports on a detailed (manual) examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely Tensorflow, Keras, scikitlearn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. Our findings reveal the urgent need for software engineering (SE) research in this area. The second part of the thesis particularly focuses on understanding the Deep Neural Network (DNN) bug characteristics. We study 2,716 high-quality posts from Stack Overflow and 500 bug fix commits from Github about five popular deep learning libraries Caffe, Keras, Tensorflow, Theano, and Torch to understand the types of bugs, their root causes and impacts, bug-prone stage of deep learning pipeline as well as whether there are some common antipatterns found in this buggy software. While exploring the bug characteristics, our findings imply that repairing software that uses DNNs is one such unmistakable SE need where automated tools could be beneficial; however, we do not fully understand challenges to repairing and patterns that are utilized when manually repairing DNNs. So, the third part of this thesis presents a comprehensive study of bug fix patterns to address these questions. We have studied 415 repairs from Stack Overflow and 555 repairs from Github for five popular deep learning libraries Caffe, Keras, Tensorflow, Theano, and Torch to understand challenges in repairs and bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinctive compared to traditional bug fix patterns and the most common bug fix patterns are fixing data dimension and neural network connectivity. Finally, we propose an automatic technique to detect ML Application Programming Interface (API) misuses. We started with an empirical study to understand ML API misuses. Our study shows that ML API misuse is prevalent and distinct compared to non-ML API misuses. Inspired by these findings, we contributed Amimla (Api Misuse In Machine Learning Apis) an approach and a tool for ML API misuse detection. Amimla relies on several technical innovations. First, we proposed an abstract representation of ML pipelines to use in misuse detection. Second, we proposed an abstract representation of neural networks for deep learning related APIs. Third, we have developed a representation strategy for constraints on ML APIs. Finally, we have developed a misuse detection strategy for both single and multi-APIs. Our experimental evaluation shows that Amimla achieves a high average accuracy of ∼80% on two benchmarks of misuses from Stack Overflow and Github

    Human Factors in Secure Software Development

    Get PDF
    While security research has made significant progress in the development of theoretically secure methods, software and algorithms, software still comes with many possible exploits, many of those using the human factor. The human factor is often called ``the weakest link'' in software security. To solve this, human factors research in security and privacy focus on the users of technology and consider their security needs. The research then asks how technology can serve users while minimizing risks and empowering them to retain control over their own data. However, these concepts have to be implemented by developers whose security errors may proliferate to all of their software's users. For example, software that stores data in an insecure way, does not secure network traffic correctly, or otherwise fails to adhere to secure programming best practices puts all of the software's users at risk. It is therefore critical that software developers implement security correctly. However, in addition to security rarely being a primary concern while producing software, developers may also not have extensive awareness, knowledge, training or experience in secure development. A lack of focus on usability in libraries, documentation, and tools that they have to use for security-critical components may exacerbate the problem by blowing up the investment of time and effort needed to "get security right". This dissertation's focus is how to support developers throughout the process of implementing software securely. This research aims to understand developers' use of resources, their mindsets as they develop, and how their background impacts code security outcomes. Qualitative, quantitative and mixed methods were employed online and in the laboratory, and large scale datasets were analyzed to conduct this research. This research found that the information sources developers use can contribute to code (in)security: copying and pasting code from online forums leads to achieving functional code quickly compared to using official documentation resources, but may introduce vulnerable code. We also compared the usability of cryptographic APIs, finding that poor usability, unsafe (possibly obsolete) defaults and unhelpful documentation also lead to insecure code. On the flip side, well-thought out documentation and abstraction levels can help improve an API's usability and may contribute to secure API usage. We found that developer experience can contribute to better security outcomes, and that studying students in lieu of professional developers can produce meaningful insights into developers' experiences with secure programming. We found that there is a multitude of online secure development advice, but that these advice sources are incomplete and may be insufficient for developers to retrieve help, which may cause them to choose un-vetted and potentially insecure resources. This dissertation supports that (a) secure development is subject to human factor challenges and (b) security can be improved by addressing these challenges and supporting developers. The work presented in this dissertation has been seminal in establishing human factors in secure development research within the security and privacy community and has advanced the dialogue about the rigorous use of empirical methods in security and privacy research. In these research projects, we repeatedly found that usability issues of security and privacy mechanisms, development practices, and operation routines are what leads to the majority of security and privacy failures that affect millions of end users

    Testing And Verification For The Open Source Release Of The Horizon Simulation Framework

    Get PDF
    Modeling and simulation tools are exceptionally useful for designing aerospace systems because they allow engineers to test and iterate designs before committing the massive resources required for system realization. The Horizon Simulation Framework (HSF) is a time-driven modeling and simulation tool which attempts to optimize how a modeled system could perform a mission profile. After 15 years of development, the HSF team aims to achieve a wider user and developer base by releasing the software open source. To ensure a successful release, the software required extensive testing, and the main scheduling algorithm required protections against new code breaking old functionality. The goal of the work presented in this thesis is to satisfy these requirements and officially release the software open source. The software was tested with \u3e 80% coverage and a continuous integration pipeline which runs build and unit/integration tests on every new commit was set up. Finally, supporting documentation and user resources were created and organized to promote community adoption of the software, making Horizon ready for an open source release
    corecore