DeepEvolution: A Search-Based Testing Approach for Deep Neural Networks
The increasing inclusion of Deep Learning (DL) models in safety-critical
systems such as autonomous vehicles has led to the development of multiple
model-based DL testing techniques. One common denominator of these testing
techniques is the automated generation of test cases, e.g., new inputs
transformed from the original training data with the aim of optimizing some
test adequacy criteria. So far, the effectiveness of these approaches has been
hindered by their reliance on random fuzzing or transformations that do not
always produce diverse test cases. To overcome these limitations, we propose
DeepEvolution, a novel search-based approach for testing DL models that relies
on metaheuristics to maximize the diversity of generated test cases. We
assessed the effectiveness of DeepEvolution in testing computer-vision DL
models and found that it significantly increases the neuron coverage of
generated test cases. Moreover, using DeepEvolution, we successfully found
several corner-case behaviors. Finally, DeepEvolution outperformed TensorFuzz
(a coverage-guided fuzzing tool developed at Google Brain) in detecting latent
defects introduced during the quantization of the models. These results suggest
that search-based approaches can help build effective testing tools for DL
systems.
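The search loop the abstract describes can be pictured with a small sketch. Everything below is a hypothetical stand-in, not DeepEvolution's code: the "model" is a fixed linear layer with ReLU, "neuron coverage" is the fraction of its units driven above a threshold, and the metaheuristic is a simple (mu + lambda)-style evolutionary loop that mutates inputs and keeps high-coverage candidates.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_model_activations(x):
    """Stand-in for a DNN layer: fixed linear projection + ReLU."""
    W = np.linspace(-1.0, 1.0, x.size * 8).reshape(8, x.size)
    return np.maximum(0.0, W @ x)

def neuron_coverage(batch, threshold=0.1):
    """Fraction of toy 'neurons' driven above threshold by any input."""
    covered = np.zeros(8, dtype=bool)
    for x in batch:
        covered |= toy_model_activations(x) > threshold
    return covered.mean()

def evolve_tests(seed_input, generations=20, pop_size=10):
    """(mu + lambda)-style loop: mutate inputs, keep high-coverage ones."""
    population = [seed_input + 0.05 * rng.standard_normal(seed_input.shape)
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Each mutation plays the role of an input transformation.
        children = [x + 0.05 * rng.standard_normal(x.shape)
                    for x in population]
        pool = population + children
        # Greedy survivor selection by per-input coverage.
        pool.sort(key=lambda x: neuron_coverage([x]), reverse=True)
        population = pool[:pop_size]
    return population

tests = evolve_tests(np.zeros(4))
```

A real tool would replace the toy layer with the model under test, use domain-preserving image transformations as mutations, and combine coverage with a behavioral-disagreement term in the fitness.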
Traffic signal settings optimization using gradient descent
We investigate the performance of gradient descent (GD) optimization applied to the traffic signal setting problem and compare it to genetic algorithms. We used neural networks as metamodels to evaluate the quality of signal settings and discovered that both optimization methods produce similar results; in both cases, the accuracy of neural networks close to local optima depends on the activation function (e.g., TANH activation makes the optimization process converge to different minima than ReLU activation).
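As an illustration of the setup above (our own toy reconstruction, not the paper's code), a tiny neural "metamodel" scores a vector of signal settings, and plain gradient steps climb that score using finite-difference gradients; the weights below are arbitrary.

```python
import numpy as np

# Arbitrary toy weights for a two-layer metamodel of settings quality.
W1 = np.array([[0.5, -0.3], [0.2, 0.8]])
w2 = np.array([1.0, -0.5])

def metamodel(settings, act=np.tanh):
    """Toy two-layer metamodel: predicted quality of the settings."""
    return float(w2 @ act(W1 @ settings))

def optimize(settings, steps=200, lr=0.05, eps=1e-5):
    """Gradient ascent on predicted quality via finite differences
    (equivalently, gradient descent on the negated quality)."""
    s = settings.astype(float)
    for _ in range(steps):
        grad = np.array([(metamodel(s + eps * e) - metamodel(s - eps * e))
                         / (2 * eps) for e in np.eye(s.size)])
        s += lr * grad
    return s

start = np.array([0.1, 0.1])
best = optimize(start)
```

Swapping `act=np.tanh` for a ReLU (e.g. `lambda z: np.maximum(z, 0.0)`) typically steers the search to a different optimum, which is the activation dependence the abstract reports.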
DiverGet: A Search-Based Software Testing Approach for Deep Neural Network Quantization Assessment
Quantization is one of the most applied Deep Neural Network (DNN) compression
strategies, when deploying a trained DNN model on an embedded system or a cell
phone. This is owing to its simplicity and adaptability to a wide range of
applications and circumstances, as opposed to specific Artificial Intelligence
(AI) accelerators and compilers that are often designed only for certain
specific hardware (e.g., Google Coral Edge TPU). With the growing demand for
quantization, ensuring the reliability of this strategy is becoming a critical
challenge. Traditional testing methods, which gather more and more genuine data
for better assessment, are often not practical because of the large size of the
input space and the high similarity between the original DNN and its quantized
counterpart. As a result, advanced assessment strategies have become of
paramount importance. In this paper, we present DiverGet, a search-based
testing framework for quantization assessment. DiverGet defines a space of
metamorphic relations that simulate naturally-occurring distortions on the
inputs. Then, it optimally explores these relations to reveal the disagreements
among DNNs of different arithmetic precision. We evaluate the performance of
DiverGet on state-of-the-art DNNs applied to hyperspectral remote sensing
images. We chose the remote sensing DNNs as they're being increasingly deployed
at the edge (e.g., high-lift drones) in critical domains like climate change
research and astronomy. Our results show that DiverGet successfully challenges
the robustness of established quantization techniques against
naturally-occurring shifted data, and outperforms its most recent competitor,
DiffChaser, with a success rate that is (on average) four times higher.
Comment: Accepted for publication in the Empirical Software Engineering
Journal (EMSE).
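The core disagreement check behind this kind of quantization assessment can be sketched in a few lines. The model, quantizer, and metamorphic transformation below are simplified stand-ins (not DiverGet itself): a linear classifier, uniform weight quantization, and additive noise standing in for naturally-occurring distortions.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((3, 5))          # toy full-precision classifier

def predict(weights, x):
    return int(np.argmax(weights @ x))

def quantize(weights, bits=2):
    """Uniform quantization of the weights to 2**bits levels (a stand-in)."""
    lo, hi = weights.min(), weights.max()
    step = (hi - lo) / (2 ** bits - 1)
    return lo + step * np.round((weights - lo) / step)

def disagreement_rate(inputs, transform):
    """Share of transformed inputs on which the two precisions disagree."""
    Wq = quantize(W)
    return sum(predict(W, transform(x)) != predict(Wq, transform(x))
               for x in inputs) / len(inputs)

# Metamorphic relation: mild additive noise, standing in for the
# naturally-occurring distortions the framework searches over.
noise = lambda x: x + 0.05 * rng.standard_normal(x.shape)
inputs = [rng.standard_normal(5) for _ in range(100)]
rate = disagreement_rate(inputs, noise)
```

The search-based part would then optimize the transformation's parameters to maximize this disagreement rate rather than sampling them at random.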
An Improved Bees Algorithm for Training Deep Recurrent Networks for Sentiment Classification
Recurrent neural networks (RNNs) are powerful tools for learning information from
temporal sequences. Designing an optimum deep RNN is difficult due to configuration and training
issues, such as vanishing and exploding gradients. In this paper, a novel metaheuristic optimisation
approach is proposed for training deep RNNs for the sentiment classification task. The approach
employs an enhanced Ternary Bees Algorithm (BA-3+), which operates for large dataset classification
problems by considering only three individual solutions in each iteration. BA-3+ combines the
collaborative search of three bees to find the optimal set of trainable parameters of the proposed deep
recurrent learning architecture. Local learning with exploitative search utilises the greedy selection
strategy. Stochastic gradient descent (SGD) learning with singular value decomposition (SVD) aims to
handle vanishing and exploding gradients of the decision parameters with the stabilisation strategy
of SVD. Global learning with explorative search achieves faster convergence without getting trapped
at local optima to find the optimal set of trainable parameters of the proposed deep recurrent learning
architecture. BA-3+ has been tested on the sentiment classification task to classify symmetric and
asymmetric distribution of the datasets from different domains, including Twitter, product reviews,
and movie reviews. Comparative results have been obtained for advanced deep language models and
Differential Evolution (DE) and Particle Swarm Optimization (PSO) algorithms. BA-3+ converged
to the global minimum faster than the DE and PSO algorithms, and it outperformed the SGD, DE,
and PSO algorithms for the Turkish and English datasets. The accuracy and F1
measure improved by at least 30–40% over the standard SGD algorithm for all
classification datasets. Accuracy rates of the RNN model trained with BA-3+
ranged from 80% to 90%, while the RNN trained with SGD achieved between 50% and
60% for most datasets. The performance of the RNN model with BA-3+ was as good
as that of the Tree-LSTM and Recursive Neural Tensor Network (RNTN) language
models, which achieved accuracy results of up to 90% for some datasets. The
improved accuracy and convergence results show that BA-3+ is an efficient,
stable algorithm for this complex classification task, and that it can handle
the vanishing and exploding gradients problem of deep RNNs.
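The three-candidate iteration described above can be sketched on a toy objective. This is our own hedged reconstruction, not the authors' implementation: the loss is a simple quadratic instead of an RNN's training loss, and the "SGD with SVD stabilisation" bee is replaced by a plain analytic gradient step.

```python
import numpy as np

rng = np.random.default_rng(2)

def loss(w):
    """Toy quadratic stand-in for the RNN training loss (optimum at 3)."""
    return float(np.sum((w - 3.0) ** 2))

def ba3_step(best, step=0.1):
    """One iteration with exactly three candidate solutions ('bees')."""
    # Bee 1: exploitative local search with greedy selection.
    local = best + step * rng.standard_normal(best.shape)
    if loss(local) >= loss(best):
        local = best
    # Bee 2: gradient-style move (analytic gradient of the toy loss,
    # standing in for the paper's SGD-with-SVD update).
    descended = best - 0.05 * (2.0 * (best - 3.0))
    # Bee 3: explorative global scout to escape local optima.
    scout = rng.uniform(-10.0, 10.0, size=best.shape)
    # Keep the best of the three candidates.
    return min((local, descended, scout), key=loss)

w = np.zeros(2)
for _ in range(100):
    w = ba3_step(w)
```

Keeping the population at three solutions per iteration is what makes the scheme cheap enough for large-dataset training, which is the point the abstract emphasizes.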
Towards Debugging and Testing Deep Learning Systems
Over the past few years, Deep Learning (DL) has made tremendous progress, achieving and sometimes even surpassing human-level performance on different tasks such as image classification and speech recognition. Thanks to these advances, we are witnessing a wide adoption of DL in safety-critical applications such as autonomous driving, crime prevention and detection, and medical treatment. However, despite their spectacular progress, DL systems, just like traditional software systems, often exhibit erroneous corner-case behaviors due to the existence of
latent defects or inefficiencies, which can lead to catastrophic accidents. Thus, software quality assurance (SQA), including reliability and robustness, for DL systems becomes a major concern. Traditional testing of DL models consists of measuring their performance on manually collected data; it therefore depends heavily on the quality of the test data, which often fails to include rare inputs, as evidenced by recent autonomous-driving car accidents (e.g., Tesla/Uber). Advanced testing techniques are in high demand to improve the trustworthiness of DL systems. Nevertheless, DL testing poses significant challenges stemming from the non-deterministic nature of DL systems (since they follow a data-driven paradigm: the target task is learned
statistically) and their lack of an oracle (since they are designed principally to provide the answer). Recently, software researchers have started adapting concepts from the software testing domain, such as test coverage and pseudo-oracles, to tackle these difficulties. Despite some
promising results obtained from adapting existing software testing methods, current software testing techniques for DL systems are still quite immature. In this thesis, we examine existing testing techniques for DL systems and propose some new ones. We achieve this by following a systematic approach consisting of: (1) investigating DL software issues and testing challenges; (2) outlining the strengths and weaknesses of the software testing techniques adapted for DL systems; and (3) proposing novel testing solutions to fill some of the identified literature gaps and potentially help improve the SQA of DL systems.
CUDA-bigPSF: An optimized version of bigPSF accelerated with Graphics Processing Unit
Accurate and fast short-term load forecasting is crucial for efficiently managing energy production and distribution. As such, many different algorithms have been proposed to address this topic, including hybrid models that combine clustering with other forecasting techniques. One of these algorithms is bigPSF, an algorithm that combines K-means clustering with a similarity search, optimized for use in distributed environments. The work presented in this paper aims to reduce the time required to execute the algorithm, with two main contributions. First, some issues in the original proposal that limited the number of cores used simultaneously are studied and highlighted. Second, a version of the algorithm optimized for the Graphics Processing Unit (GPU) is proposed, solving the aforementioned issues while taking into account the GPU architecture and memory structure. Experimentation was done with seven years of real-world electric demand data from Uruguay. Results show that the proposed algorithm executed consistently faster than the original version, achieving speedups of up to 500x during the training phase. Funding for open access charge: Universidad de Granada / CBUA. Grant PID2020-112495RB-C21 funded by MCIN/AEI/10.13039/501100011033 and by the I+D+i FEDER 2020 project B-TIC-42-UGR2.
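The Pattern Sequence Forecasting (PSF) idea underlying bigPSF can be sketched in NumPy. This is a minimal sequential illustration under our own assumptions (synthetic data, plain k-means), not the distributed or CUDA implementation: cluster daily load curves, then forecast the next day by matching the recent sequence of cluster labels against history and averaging the days that followed such matches.

```python
import numpy as np

rng = np.random.default_rng(3)

def kmeans_labels(X, k=3, iters=20):
    """Plain k-means; returns a cluster label per row of X."""
    centers = X[rng.choice(len(X), k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def psf_forecast(days, window=2, k=3):
    """Forecast the next day from days that followed the same label pattern."""
    labels = kmeans_labels(days, k)
    pattern = labels[-window:]
    followers = [days[i + window] for i in range(len(days) - window)
                 if np.array_equal(labels[i:i + window], pattern)]
    # Fall back to persistence if the pattern never occurred before.
    return np.mean(followers, axis=0) if followers else days[-1]

days = rng.random((60, 24))      # 60 synthetic days of hourly demand
forecast = psf_forecast(days)
```

The similarity search over label sequences is the part that parallelizes naturally, since each candidate window can be compared against the query independently.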
Enhanced Deep Network Designs Using Mitochondrial DNA Based Genetic Algorithm And Importance Sampling
Machine learning (ML) is playing an increasingly important role in our lives. It has already made huge impact in areas such as cancer diagnosis, precision medicine, self-driving cars, natural disasters predictions, speech recognition, etc. The painstakingly handcrafted feature extractors used in the traditional learning, classification and pattern recognition systems are not scalable for large-sized datasets or adaptable to different classes of problems or domains. Machine learning resurgence in the form of Deep Learning (DL) in the last decade after multiple AI (artificial intelligence) winters and hype cycles is a result of the convergence of advancements in training algorithms, availability of massive data (big data) and innovation in compute resources (GPUs and cloud). If we want to solve more complex problems with machine learning, we need to optimize all three of these areas, i.e., algorithms, dataset and compute. Our dissertation research work presents the original application of nature-inspired idea of mitochondrial DNA (mtDNA) to improve deep learning network design. Additional fine-tuning is provided with Monte Carlo based method called importance sampling (IS). The primary performance indicators for machine learning are model accuracy, loss and training time. The goal of our dissertation is to provide a framework to address all these areas by optimizing network designs (in the form of hyperparameter optimization) and dataset using enhanced Genetic Algorithm (GA) and importance sampling. Algorithms are by far the most important aspect of machine learning. We demonstrate the application of mitochondrial DNA to complement the standard genetic algorithm for architecture optimization of deep Convolution Neural Network (CNN). We use importance sampling to reduce the dataset variance and sample more often from the instances that add greater value from the training outcome perspective. 
And finally, we leverage massively parallel and distributed processing on GPUs in the cloud to speed up training. Thus, our multi-approach method for enhancing deep learning combines architecture optimization, dataset optimization, and the power of the cloud to drive better model accuracy and reduce training time.
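The dataset-side idea above, sampling more often from instances that contribute more to training, can be illustrated on a toy problem. This is a hedged sketch under our own assumptions, not the dissertation's code: a loss-proportional (prioritized) sampling scheme on a synthetic linear-regression task, with the unbiasedness correction weights omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic noiseless regression task standing in for a training set.
X = rng.standard_normal((200, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

def per_example_loss(w):
    return (X @ w - y) ** 2

w = np.zeros(3)
start_loss = per_example_loss(w).mean()
for _ in range(4000):
    losses = per_example_loss(w) + 1e-8
    p = losses / losses.sum()        # higher-loss examples drawn more often
    i = rng.choice(len(X), p=p)
    grad = 2.0 * (X[i] @ w - y[i]) * X[i]
    w -= 0.02 * grad                 # plain SGD step on the sampled example
```

In a full importance-sampling scheme, each step would also be reweighted by the inverse sampling probability to keep the gradient estimate unbiased; the prioritized variant here just shows the sampling mechanism.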