Towards understanding the challenges faced by machine learning software developers and enabling automated solutions
Modern software systems increasingly include machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To fill that gap, this thesis reports on a detailed (manual) examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely TensorFlow, Keras, scikit-learn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. Our findings reveal the urgent need for software engineering (SE) research in this area. The second part of the thesis focuses on understanding the characteristics of Deep Neural Network (DNN) bugs. We study 2,716 high-quality posts from Stack Overflow and 500 bug fix commits from GitHub about five popular deep learning libraries (Caffe, Keras, TensorFlow, Theano, and Torch) to understand the types of bugs, their root causes and impacts, the bug-prone stages of the deep learning pipeline, and whether common antipatterns appear in this buggy software. While exploring the bug characteristics, our findings imply that repairing software that uses DNNs is one such unmistakable SE need where automated tools could be beneficial; however, we do not fully understand the challenges of repairing DNNs or the patterns that are used when repairing them manually. So, the third part of this thesis presents a comprehensive study of bug fix patterns to address these questions. We have studied 415 repairs from Stack Overflow and 555 repairs from GitHub for the same five deep learning libraries to understand challenges in repairs and bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinctive compared to traditional bug fix patterns and that the most common bug fix patterns are fixing data dimension and neural network connectivity.
Finally, we propose an automatic technique to detect ML Application Programming Interface (API) misuses. We started with an empirical study to understand ML API misuses. Our study shows that ML API misuse is prevalent and distinct compared to non-ML API misuses. Inspired by these findings, we contributed Amimla (Api Misuse In Machine Learning Apis), an approach and a tool for ML API misuse detection. Amimla relies on several technical innovations. First, we proposed an abstract representation of ML pipelines to use in misuse detection. Second, we proposed an abstract representation of neural networks for deep learning related APIs. Third, we have developed a representation strategy for constraints on ML APIs. Finally, we have developed a misuse detection strategy for both single and multi-APIs. Our experimental evaluation shows that Amimla achieves a high average accuracy of ∼80% on two benchmarks of misuses from Stack Overflow and GitHub.
Repairing Deep Neural Networks: Fix Patterns and Challenges
Significant interest in applying Deep Neural Networks (DNNs) has fueled the
need to support engineering of software that uses DNNs. Repairing software that
uses DNNs is one such unmistakable SE need where automated tools could be
beneficial; however, we do not fully understand challenges to repairing and
patterns that are utilized when manually repairing DNNs. What challenges should
automated repair tools address? What are the repair patterns whose automation
could help developers? Which repair patterns should be assigned a higher
priority for building automated bug repair tools? This work presents a
comprehensive study of bug fix patterns to address these questions. We have
studied 415 repairs from Stack Overflow and 555 repairs from GitHub for five
popular deep learning libraries Caffe, Keras, TensorFlow, Theano, and Torch to
understand challenges in repairs and bug repair patterns. Our key findings
reveal that DNN bug fix patterns are distinctive compared to traditional bug
fix patterns; the most common bug fix patterns are fixing data dimension and
neural network connectivity; DNN bug fixes have the potential to introduce
adversarial vulnerabilities; DNN bug fixes frequently introduce new bugs; and
DNN bug localization, reuse of trained models, and coping with frequent releases
are major challenges faced by developers when fixing bugs. We also contribute a
benchmark of 667 DNN (bug, repair) instances.
Mining Fix Patterns for FindBugs Violations
In this paper, we first collect and track a large number of fixed and unfixed
violations across revisions of software.
The empirical analyses reveal that there are discrepancies in the
distributions of violations that are detected and those that are fixed, in
terms of occurrences, spread and categories, which can provide insights into
prioritizing violations.
To automatically identify patterns in violations and their fixes, we propose
an approach that utilizes convolutional neural networks to learn features and
clustering to regroup similar instances. We then evaluate the usefulness of the
identified fix patterns by applying them to unfixed violations.
The results show that developers will accept and merge a majority (69/116) of
fixes generated from the inferred fix patterns. It is also noteworthy that the
yielded patterns are applicable to four real bugs in the Defects4J major
benchmark for software testing and automated repair.
Comment: Accepted for IEEE Transactions on Software Engineering
A Comprehensive Empirical Study of Bugs in Open-Source Federated Learning Frameworks
Federated learning (FL) is a distributed machine learning (ML) paradigm that
allows multiple clients to collaboratively train shared ML
models without compromising clients' data privacy. It has gained substantial
popularity in recent years, especially since the enforcement of data protection
laws and regulations in many countries. To foster the application of FL, a
variety of FL frameworks have been proposed, allowing non-experts to easily
train ML models. As a result, understanding bugs in FL frameworks is critical
for facilitating the development of better FL frameworks and potentially
encouraging the development of bug detection, localization and repair tools.
Thus, we conduct the first empirical study to comprehensively collect,
taxonomize, and characterize bugs in FL frameworks. Specifically, we manually
collect and classify 1,119 bugs from all the 676 closed issues and 514 merged
pull requests in 17 popular and representative open-source FL frameworks on
GitHub. We propose a classification of those bugs into 12 bug symptoms, 12 root
causes, and 18 fix patterns. We also study their correlations and distributions
on 23 functionalities. We identify nine major findings from our study, discuss
their implications and future research directions based on our findings.
Analysis and Detection of Information Types of Open Source Software Issue Discussions
Most modern Issue Tracking Systems (ITSs) for open source software (OSS)
projects allow users to add comments to issues. Over time, these comments
accumulate into discussion threads embedded with rich information about the
software project, which can potentially satisfy the diverse needs of OSS
stakeholders. However, discovering and retrieving relevant information from the
discussion threads is a challenging task, especially when the discussions are
lengthy and the number of issues in ITSs is vast. In this paper, we address
this challenge by identifying the information types presented in OSS issue
discussions. Through qualitative content analysis of 15 complex issue threads
across three projects hosted on GitHub, we uncovered 16 information types and
created a labeled corpus containing 4656 sentences. Our investigation of
supervised, automated classification techniques indicated that, when prior
knowledge about the issue is available, Random Forest can effectively detect
most sentence types using conversational features such as the sentence length
and its position. When classifying sentences from new issues, Logistic
Regression can yield satisfactory performance using textual features for
certain information types, while falling short on others. Our work represents a
nontrivial first step towards tools and techniques for identifying and
obtaining the rich information recorded in the ITSs to support various software
engineering activities and to satisfy the diverse needs of OSS stakeholders.
Comment: 41st ACM/IEEE International Conference on Software Engineering (ICSE 2019)