4 research outputs found

    Towards understanding the challenges faced by machine learning software developers and enabling automated solutions

    Get PDF
    Modern software systems are increasingly including machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To fill that gap this thesis reports on a detailed (manual) examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely Tensorflow, Keras, scikitlearn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. Our findings reveal the urgent need for software engineering (SE) research in this area. The second part of the thesis particularly focuses on understanding the Deep Neural Network (DNN) bug characteristics. We study 2,716 high-quality posts from Stack Overflow and 500 bug fix commits from Github about five popular deep learning libraries Caffe, Keras, Tensorflow, Theano, and Torch to understand the types of bugs, their root causes and impacts, bug-prone stage of deep learning pipeline as well as whether there are some common antipatterns found in this buggy software. While exploring the bug characteristics, our findings imply that repairing software that uses DNNs is one such unmistakable SE need where automated tools could be beneficial; however, we do not fully understand challenges to repairing and patterns that are utilized when manually repairing DNNs. So, the third part of this thesis presents a comprehensive study of bug fix patterns to address these questions. We have studied 415 repairs from Stack Overflow and 555 repairs from Github for five popular deep learning libraries Caffe, Keras, Tensorflow, Theano, and Torch to understand challenges in repairs and bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinctive compared to traditional bug fix patterns and the most common bug fix patterns are fixing data dimension and neural network connectivity. Finally, we propose an automatic technique to detect ML Application Programming Interface (API) misuses. We started with an empirical study to understand ML API misuses. Our study shows that ML API misuse is prevalent and distinct compared to non-ML API misuses. Inspired by these findings, we contributed Amimla (Api Misuse In Machine Learning Apis) an approach and a tool for ML API misuse detection. Amimla relies on several technical innovations. First, we proposed an abstract representation of ML pipelines to use in misuse detection. Second, we proposed an abstract representation of neural networks for deep learning related APIs. Third, we have developed a representation strategy for constraints on ML APIs. Finally, we have developed a misuse detection strategy for both single and multi-APIs. Our experimental evaluation shows that Amimla achieves a high average accuracy of ∼80% on two benchmarks of misuses from Stack Overflow and Github

    An empirical study on the usage of the swift programming language

    No full text
    Recently, Apple released Swift, a modern programming language built to be the successor of Objective-C. In less than a year and a half after its first release, Swift became one of the most popular programming languages in the world, considering different popularity measures. A significant part of this success is due to Apple's strict control over its ecosystem, and the clear message that it will replace Objective-C in a near future. According to Apple, Swift is a powerful and intuitive programming language[...]. Writing Swift code is interactive and fun, the syntax is concise yet expressive. However, little is known about how Swift developers perceive these benefits. In this paper, we conducted two studies aimed at uncovering the questions and strains that arise from this early adoption. First, we perform a thorough analysis on 59,156 questions asked about Swift on StackOverflow. Second, we interviewed 12 Swift developers to cross-validate the initial results. Our study reveals that developers do seem to find the language easy to understand and adopt, although 17.5% of the questions are about basic elements of the language. Still, there are many questions about problems in the toolset (compiler, Xcode, libraries). Some of our interviewees reinforced these problems

    An empirical study on the usage of the swift programming language

    No full text
    Recently, Apple released Swift, a modern programming language built to be the successor of Objective-C. In less than a year and a half after its first release, Swift became one of the most popular programming languages in the world, considering different popularity measures. A significant part of this success is due to Apple's strict control over its ecosystem, and the clear message that it will replace Objective-C in a near future. According to Apple, "Swift is a powerful and intuitive programming language[...]. Writing Swift code is interactive and fun, the syntax is concise yet expressive." However, little is known about how Swift developers perceive these benefits. In this paper, we conducted two studies aimed at uncovering the questions and strains that arise from this early adoption. First, we perform a thorough analysis on 59,156 questions asked about Swift on StackOverflow. Second, we interviewed 12 Swift developers to cross-validate the initial results. Our study reveals that developers do seem to find the language easy to understand and adopt, although 17.5% of the questions are about basic elements of the language. Still, there are many questions about problems in the toolset (compiler, Xcode, libraries). Some of our interviewees reinforced these problems
    corecore