7 research outputs found
What Do Developers Ask About ML Libraries? A Large-scale Study Using Stack Overflow
Modern software systems are increasingly including machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To that end, this work reports on a detailed (manual) examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely Tensorflow, Keras, scikit-learn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. We classify these questions into seven typical stages of an ML pipeline to understand the correlation between the library and the stage. Then we study the questions and perform statistical analysis to explore the answer to four research objectives (finding the most difficult stage, understanding the nature of problems, nature of libraries and studying whether the difficulties stayed consistent over time). Our findings reveal the urgent need for software engineering (SE) research in this area. Both static and dynamic analyses are mostly absent and badly needed to help developers find errors earlier. While there has been some early research on debugging, much more work is needed. API misuses are prevalent and API design improvements are sorely needed. Last and somewhat surprisingly, a tug of war between providing higher levels of abstractions and the need to understand the behavior of the trained model is prevalent
Documentation of Machine Learning Software
Machine Learning software documentation is different from most of the
documentations that were studied in software engineering research. Often, the
users of these documentations are not software experts. The increasing interest
in using data science and in particular, machine learning in different fields
attracted scientists and engineers with various levels of knowledge about
programming and software engineering. Our ultimate goal is automated generation
and adaptation of machine learning software documents for users with different
levels of expertise. We are interested in understanding the nature and triggers
of the problems and the impact of the users' levels of expertise in the process
of documentation evolution. We will investigate the Stack Overflow Q/As and
classify the documentation related Q/As within the machine learning domain to
understand the types and triggers of the problems as well as the potential
change requests to the documentation. We intend to use the results for building
on top of the state of the art techniques for automatic documentation
generation and extending on the adoption, summarization, and explanation of
software functionalities.Comment: The paper is accepted for publication in 27th IEEE International
Conference on Software Analysis, Evolution and Reengineering (SANER 2020
What Do Developers Ask About ML Libraries? A Large-scale Study Using Stack Overflow
Modern software systems are increasingly including machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To that end, this work reports on a detailed (manual) examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely Tensorflow, Keras, scikit-learn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. We classify these questions into seven typical stages of an ML pipeline to understand the correlation between the library and the stage. Then we study the questions and perform statistical analysis to explore the answer to four research objectives (finding the most difficult stage, understanding the nature of problems, nature of libraries and studying whether the difficulties stayed consistent over time). Our findings reveal the urgent need for software engineering (SE) research in this area. Both static and dynamic analyses are mostly absent and badly needed to help developers find errors earlier. While there has been some early research on debugging, much more work is needed. API misuses are prevalent and API design improvements are sorely needed. Last and somewhat surprisingly, a tug of war between providing higher levels of abstractions and the need to understand the behavior of the trained model is prevalent.This pre-print is made available through arxiv: https://arxiv.org/abs/1906.11940.</p
Repairing Deep Neural Networks: Fix Patterns and Challenges
Significant interest in applying Deep Neural Network (DNN) has fueled the
need to support engineering of software that uses DNNs. Repairing software that
uses DNNs is one such unmistakable SE need where automated tools could be
beneficial; however, we do not fully understand challenges to repairing and
patterns that are utilized when manually repairing DNNs. What challenges should
automated repair tools address? What are the repair patterns whose automation
could help developers? Which repair patterns should be assigned a higher
priority for building automated bug repair tools? This work presents a
comprehensive study of bug fix patterns to address these questions. We have
studied 415 repairs from Stack overflow and 555 repairs from Github for five
popular deep learning libraries Caffe, Keras, Tensorflow, Theano, and Torch to
understand challenges in repairs and bug repair patterns. Our key findings
reveal that DNN bug fix patterns are distinctive compared to traditional bug
fix patterns; the most common bug fix patterns are fixing data dimension and
neural network connectivity; DNN bug fixes have the potential to introduce
adversarial vulnerabilities; DNN bug fixes frequently introduce new bugs; and
DNN bug localization, reuse of trained model, and coping with frequent releases
are major challenges faced by developers when fixing bugs. We also contribute a
benchmark of 667 DNN (bug, repair) instances
Towards understanding the challenges faced by machine learning software developers and enabling automated solutions
Modern software systems are increasingly including machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To fill that gap this thesis reports on a detailed (manual) examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely Tensorflow, Keras, scikitlearn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. Our findings reveal the urgent need for software engineering (SE) research in this area. The second part of the thesis particularly focuses on understanding the Deep Neural Network (DNN) bug characteristics. We study 2,716 high-quality posts from Stack Overflow and 500 bug fix commits from Github about five popular deep learning libraries Caffe, Keras, Tensorflow, Theano, and Torch to understand the types of bugs, their root causes and impacts, bug-prone stage of deep learning pipeline as well as whether there are some common antipatterns found in this buggy software. While exploring the bug characteristics, our findings imply that repairing software that uses DNNs is one such unmistakable SE need where automated tools could be beneficial; however, we do not fully understand challenges to repairing and patterns that are utilized when manually repairing DNNs. So, the third part of this thesis presents a comprehensive study of bug fix patterns to address these questions. We have studied 415 repairs from Stack Overflow and 555 repairs from Github for five popular deep learning libraries Caffe, Keras, Tensorflow, Theano, and Torch to understand challenges in repairs and bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinctive compared to traditional bug fix patterns and the most common bug fix patterns are fixing data dimension and neural network connectivity. Finally, we propose an automatic technique to detect ML Application Programming Interface (API) misuses. We started with an empirical study to understand ML API misuses. Our study shows that ML API misuse is prevalent and distinct compared to non-ML API misuses. Inspired by these findings, we contributed Amimla (Api Misuse In Machine Learning Apis) an approach and a tool for ML API misuse detection. Amimla relies on several technical innovations. First, we proposed an abstract representation of ML pipelines to use in misuse detection. Second, we proposed an abstract representation of neural networks for deep learning related APIs. Third, we have developed a representation strategy for constraints on ML APIs. Finally, we have developed a misuse detection strategy for both single and multi-APIs. Our experimental evaluation shows that Amimla achieves a high average accuracy of ∼80% on two benchmarks of misuses from Stack Overflow and Github