72 research outputs found

    Modularity in artificial neural networks

    Artificial neural networks are deep machine learning models that excel at complex artificial intelligence tasks by abstracting concepts through multiple layers of feature extraction. Modular neural networks are artificial neural networks composed of multiple subnetworks called modules. The study of modularity has a long history in the field, and many actively studied neural network models have modular aspects. In this work, we aim to formalize the study of modularity in artificial neural networks and outline how modularity can be used to improve certain neural network performance measures. We conduct an extensive review of current modularity practices in the literature. Based on that review, we build a framework that captures the essential properties characterizing the modularization process. Using this modularization framework as an anchor, we investigate the use of modularity to solve three different problems in artificial neural networks: balancing latency and accuracy, reducing model complexity, and increasing robustness to noise and adversarial attacks. Artificial neural networks are high-capacity models with high data and computational demands, which is a serious obstacle to using them in environments with limited computational resources. Using a differential architectural search technique, we guide the modularization of a fully-connected network into a modular multi-path network. By evaluating sampled architectures, we can establish a relation between latency and accuracy that can be used to meet a required soft balance between these conflicting measures. A related problem is reducing the complexity of neural network models while minimizing accuracy loss. CapsNet is a neural network architecture that builds on the ideas of convolutional neural networks. However, the original architecture is shallow and has wide layers that contribute significantly to its complexity. By replacing the early wide layers with parallel deep independent paths, we can significantly reduce the complexity of the model. Combining this modular architecture with max-pooling, DropCircuit regularization and a modified variant of the routing algorithm, we can achieve lower model latency with the same or better accuracy compared to the baseline. The last problem we address is the sensitivity of neural network models to random noise and to adversarial attacks, a highly disruptive form of engineered noise. Convolutional layers are the basis of state-of-the-art computer vision models and, much like other neural network layers, they are sensitive to noise and adversarial attacks. We introduce the weight map layer, a modular layer based on the convolutional layer, that can increase model robustness to noise and adversarial attacks. We conclude with a general discussion of the investigated relation between modularity and the addressed problems, and of potential future research directions.
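The complexity argument in this abstract (wide layers replaced by parallel independent paths) can be illustrated with simple parameter arithmetic. The sketch below is not the thesis's CapsNet architecture: it assumes plain fully connected layers, and the sizes (784 inputs, width 512, 4 paths) and the helper names `dense_params` and `mlp_params` are hypothetical choices made only for illustration.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def mlp_params(sizes: list[int]) -> int:
    """Total parameters of a stack of dense layers with the given sizes."""
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))


# Assumed sizes, for illustration only.
n_in, width, paths = 784, 512, 4

# One wide block: 784 -> 512 -> 512.
wide = mlp_params([n_in, width, width])

# Four independent paths, each 784 -> 128 -> 128, so the combined
# output width is still 512 but paths do not share hidden connections.
path_width = width // paths
modular = paths * mlp_params([n_in, path_width, path_width])

print(wide)     # 664576
print(modular)  # 467968
```

Under these assumptions the input-to-hidden cost is unchanged, while the hidden-to-hidden weights shrink by roughly the number of paths, because connections between different paths are removed.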

    Towards understanding the challenges faced by machine learning software developers and enabling automated solutions

    Modern software systems increasingly include machine learning (ML) as an integral component. However, we do not yet understand the difficulties software developers face when learning about ML libraries and using them within their systems. To fill that gap, this thesis reports on a detailed manual examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely TensorFlow, Keras, scikit-learn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. Our findings reveal the urgent need for software engineering (SE) research in this area. The second part of the thesis focuses on understanding the characteristics of Deep Neural Network (DNN) bugs. We study 2,716 high-quality posts from Stack Overflow and 500 bug fix commits from GitHub about five popular deep learning libraries (Caffe, Keras, TensorFlow, Theano, and Torch) to understand the types of bugs, their root causes and impacts, the bug-prone stages of the deep learning pipeline, and whether common antipatterns appear in this buggy software. Our findings on bug characteristics imply that repairing software that uses DNNs is one SE need where automated tools could be beneficial; however, we do not fully understand the challenges of repairing DNNs or the patterns used when repairing them manually. The third part of this thesis therefore presents a comprehensive study of bug fix patterns to address these questions. We studied 415 repairs from Stack Overflow and 555 repairs from GitHub for the same five deep learning libraries to understand challenges in repairs and bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinct from traditional bug fix patterns, and that the most common bug fix patterns are fixing data dimension and neural network connectivity. Finally, we propose an automatic technique to detect ML Application Programming Interface (API) misuses. We started with an empirical study to understand ML API misuses. Our study shows that ML API misuse is prevalent and distinct from non-ML API misuse. Inspired by these findings, we contributed Amimla (Api Misuse In Machine Learning Apis), an approach and a tool for ML API misuse detection. Amimla relies on several technical innovations. First, we proposed an abstract representation of ML pipelines to use in misuse detection. Second, we proposed an abstract representation of neural networks for deep learning related APIs. Third, we developed a representation strategy for constraints on ML APIs. Finally, we developed a misuse detection strategy for both single and multi-APIs. Our experimental evaluation shows that Amimla achieves a high average accuracy of ~80% on two benchmarks of misuses from Stack Overflow and GitHub.
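The abstract names data dimension and network connectivity as the most common DNN bug fix targets. As a rough illustration of that bug class, the sketch below checks a simplified description of a network for mismatched layer sizes. It is not Amimla and does not reproduce its representations: the `Layer` record and `find_dimension_mismatches` function are hypothetical names invented for this example.

```python
from typing import List, NamedTuple


class Layer(NamedTuple):
    """Hypothetical, simplified description of one layer."""
    name: str
    in_dim: int
    out_dim: int


def find_dimension_mismatches(layers: List[Layer]) -> List[str]:
    """Report every adjacent pair whose output and input sizes disagree."""
    problems = []
    for prev, cur in zip(layers, layers[1:]):
        if prev.out_dim != cur.in_dim:
            problems.append(
                f"{prev.name} outputs {prev.out_dim} "
                f"but {cur.name} expects {cur.in_dim}"
            )
    return problems


# Assumed example network with one connectivity bug.
model = [
    Layer("dense_1", 784, 128),
    Layer("dense_2", 128, 64),
    Layer("output", 32, 10),  # bug: should accept 64 features
]

for message in find_dimension_mismatches(model):
    print(message)  # dense_2 outputs 64 but output expects 32
```

A static check of this kind only covers sequential stacks with known sizes; real misuse detection has to handle far more API constraints than this example does.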

    Big data-driven multimodal traffic management: trends and challenges

    Explainable-by-Design Deep Learning

    Machine learning, and more specifically deep learning, has attracted the attention of the media and the broader public in the last decade due to its potential to revolutionize industries, public services, and society. Deep learning has matched or even surpassed human experts’ accuracy on challenging problems such as image recognition, speech, and language translation. However, deep learning models are often characterized as a “black box” because they are composed of many millions of parameters, which are extremely difficult for specialists to interpret. Complex “black box” models can easily fool users who are unable to inspect the algorithm’s decision, which can lead to dangerous or catastrophic events. Therefore, auditable explainable AI approaches are crucial for developing safe systems, complying with regulations, and gaining acceptance of this new technology within society. This thesis tries to answer the following research question: Is it possible to provide an approach whose performance is comparable to deep learning and that at the same time has a transparent (non-black-box) structure? To this end, it introduces a novel framework of explainable-by-design deep learning architectures that offers transparency and high accuracy, helping humans understand why a particular machine decision has been reached and whether or not it is trustworthy. Moreover, the proposed prototype-based framework has a flexible structure that allows the unsupervised detection of new classes and situations. The approaches proposed in this thesis have been applied to multiple use cases, including image classification, fairness, deep recursive learning interpretation, and novelty detection.
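The general idea behind prototype-based classification with novelty detection can be sketched in a few lines. This is a generic nearest-prototype illustration, not the framework proposed in the thesis: the function name `classify`, the Euclidean distance, the fixed `novelty_threshold`, and the two-dimensional prototypes are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Prototype = Tuple[str, Sequence[float]]


def classify(
    x: Sequence[float],
    prototypes: List[Prototype],
    novelty_threshold: float,
) -> Tuple[str, str, float]:
    """Return (decision, nearest prototype label, distance to it).

    The decision is the nearest prototype's label, or "novel" when even
    the nearest prototype is farther away than the threshold.
    """
    best_label, best_dist = "", math.inf
    for label, point in prototypes:
        dist = math.dist(x, point)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    decision = best_label if best_dist <= novelty_threshold else "novel"
    return decision, best_label, best_dist


# Assumed toy prototypes in a 2-D feature space.
protos = [("cat", (0.0, 0.0)), ("dog", (4.0, 0.0))]

print(classify((1.0, 0.0), protos, novelty_threshold=3.0))
# ('cat', 'cat', 1.0)
print(classify((0.0, 10.0), protos, novelty_threshold=3.0))
# ('novel', 'cat', 10.0)
```

The explanation for each decision is the prototype itself: a user can inspect which stored example the input was closest to and how far away it was.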

    Flexible Automation and Intelligent Manufacturing: The Human-Data-Technology Nexus

    This is an open access book. It gathers the first volume of the proceedings of the 31st edition of the International Conference on Flexible Automation and Intelligent Manufacturing, FAIM 2022, held on June 19 – 23, 2022, in Detroit, Michigan, USA. Covering four thematic areas (Manufacturing Processes, Machine Tools, Manufacturing Systems, and Enabling Technologies), it reports on advanced manufacturing processes and innovative materials for 3D printing; applications of machine learning, artificial intelligence and mixed reality in various production sectors; and important issues in human-robot collaboration, including methods for improving safety. Contributions also cover strategies to improve quality control, supply chain management and training in the manufacturing industry, and methods supporting circular supply chains and sustainable manufacturing. This book provides academics, engineers and professionals with extensive information on both scientific and industrial advances in the converging fields of manufacturing, production, and automation.
