611 research outputs found

    Evolutionary design of deep neural networks

    International Mention in the doctoral degree.
    For three decades, neuroevolution has applied evolutionary computation to the optimization of the topology of artificial neural networks, with most works focusing on very simple architectures. However, times have changed, and nowadays convolutional neural networks are the industry and academia standard for solving a variety of problems, many of which remained unsolved before the discovery of this kind of network. Convolutional neural networks involve complex topologies, and the manual design of these topologies for solving a problem at hand is expensive and inefficient. In this thesis, our aim is to use neuroevolution in order to evolve the architecture of convolutional neural networks. To do so, we have decided to try two different techniques: genetic algorithms and grammatical evolution. We have implemented a niching scheme for preserving the genetic diversity, in order to ease the construction of ensembles of neural networks. These techniques have been validated against the MNIST database for handwritten digit recognition, achieving a test error rate of 0.28%, and the OPPORTUNITY data set for human activity recognition, attaining an F1 score of 0.9275. Both results have proven very competitive when compared with the state of the art. Also, in all cases, ensembles have proven to perform better than individual models. Later, the topologies learned for MNIST were tested on EMNIST, a database introduced in 2017, which includes more samples and a set of letters for character recognition. Results have shown that the topologies optimized for MNIST perform well on EMNIST, proving that architectures can be reused across domains with similar characteristics. In summary, neuroevolution is an effective approach for automatically designing topologies for convolutional neural networks. However, it remains an unexplored field due to hardware limitations. Current advances, however, should constitute the fuel that empowers the emergence of this field, and further research should start as of today.
    This Ph.D. dissertation has been partially supported by the Spanish Ministry of Education, Culture and Sports under FPU fellowship with identifier FPU13/03917. This research stay has been partially co-funded by the Spanish Ministry of Education, Culture and Sports under FPU short stay grant with identifier EST15/00260.
    Official Doctoral Program in Computer Science and Technology. Chair: María Araceli Sanchís de Miguel. Secretary: Francisco Javier Segovia Pérez. Member: Simon Luca

    A survey of handwritten character recognition with MNIST and EMNIST

    This article belongs to the Special Issue Computer Vision and Pattern Recognition in the Era of Deep Learning.
    This paper summarizes the top state-of-the-art contributions reported on the MNIST dataset for handwritten digit recognition. This dataset has been extensively used to validate novel techniques in computer vision, and in recent years, many authors have explored the performance of convolutional neural networks (CNNs) and other deep learning techniques over this dataset. To the best of our knowledge, this paper is the first exhaustive and updated review of this dataset; there are some online rankings, but they are outdated, and most published papers survey only closely related works, omitting most of the literature. This paper makes a distinction between those works using some kind of data augmentation and works using the original dataset out-of-the-box. Also, works using CNNs are reported separately, as they are becoming the state-of-the-art approach for solving this problem. Nowadays, a significant number of works have attained a test error rate smaller than 1% on this dataset, which is becoming non-challenging. By mid-2017, a new dataset was introduced: EMNIST, which involves both digits and letters, with a larger amount of data acquired from a database different from MNIST's. In this paper, EMNIST is explained and some results are surveyed.

    Evolutionary Design of Convolutional Neural Networks for Human Activity Recognition in Sensor-Rich Environments

    Human activity recognition is a challenging problem for context-aware systems and applications. It is gaining interest due to the ubiquity of different sensor sources, wearable smart objects, ambient sensors, etc. This task is usually approached as a supervised machine learning problem, where a label is to be predicted given some input data, such as the signals retrieved from different sensors. To tackle the human activity recognition problem in sensor network environments, in this paper we propose the use of deep learning (convolutional neural networks) to perform activity recognition using the publicly available OPPORTUNITY dataset. Instead of manually choosing a suitable topology, we will let an evolutionary algorithm design the optimal topology in order to maximize the classification F1 score. After that, we will also explore the performance of committees of the models resulting from the evolutionary process. Analysis of the results indicates that the proposed model was able to perform activity recognition within a heterogeneous sensor network environment, achieving very high accuracies when tested with new sensor data. Based on all conducted experiments, the proposed neuroevolutionary system has proved able to systematically find a classification model capable of outperforming previous results reported in the state of the art, showing that this approach is useful and improves upon previous manually designed architectures.
    This research is partially supported by the Spanish Ministry of Education, Culture and Sports under FPU fellowship with identifier FPU13/03917.
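    The abstract above does not say how the committees combine their members' outputs. As a minimal sketch, assuming plain majority voting over predicted labels (an assumption, not necessarily the paper's rule), a committee could be formed as follows. The function name `committee_vote` and the activity labels are hypothetical.

```python
from collections import Counter


def committee_vote(member_predictions):
    """Combine per-model label predictions by majority vote.

    member_predictions: one list of predicted labels per committee member,
    all of the same length. Ties go to the label seen first among members.
    Majority voting is an assumed combination rule, used for illustration.
    """
    voted = []
    for sample_votes in zip(*member_predictions):
        # most_common(1) returns [(label, count)] for the top label
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three evolved models on three samples
members = [
    ["walk", "sit", "stand"],
    ["walk", "stand", "stand"],
    ["sit", "sit", "stand"],
]
committee = committee_vote(members)
```

    With an odd number of members and two candidate labels a tie cannot occur; with more labels, the tie-breaking rule is a design choice that should be stated explicitly.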

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.
    Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing

    The diversity-accuracy duality in ensembles of classifiers

    Horizontal scaling of Machine Learning algorithms has the potential to tackle concerns over the scalability and sustainability of Deep Learning methods, viz. their consumption of energy and computational resources, as well as their increasing inaccessibility to researchers. One way to enact horizontal scaling is by employing ensemble learning methods, since they enable distribution. There is a consensus on the point that diversity between individual learners leads to better performance, which is why we have focused on it as the criterion for distributing the base models of an ensemble. However, there is no standard agreement on how diversity should be defined and thus how to exploit it to construct a high-performing classifier. Therefore, we have proposed different definitions of diversity and innovative algorithms which promote it in a systematic way. We have first considered architectural diversity with an algorithm called WILDA: Wide Learning of Diverse Architectures. In a distributed fashion, this algorithm evolves a set of neural networks that are pretrained on the target task and diverse w.r.t. architectural feature descriptors. We have then generalised this notion by defining behavioural diversity on the basis of the divergence between the errors made by different models on a dataset. We have defined several diversity metrics and used them to guide a novelty search algorithm which builds an ensemble of behaviourally diverse classifiers. The algorithm promotes diversity in ensembles by explicitly searching for it, without selecting for accuracy. We have then extended this approach with a surrogate diversity model, which reduces the computational burden of this search by eliminating the need to train each network in the population with stochastic gradient descent at each step. These methods have enabled us to investigate the role that both architectural and behavioural diversity play in contributing to the performance of an ensemble. 
In order to study the relationship between diversity and accuracy in classifier ensembles, we have then proposed several methods that extend the novelty search with accuracy objectives. Surprisingly, we have observed that, with the highest-performing diversity metrics, there is an equivalence between searching for diversity objectives and searching for accuracy objectives. This contradicts widespread assumptions that a trade-off must be found by balancing diversity and accuracy objectives. We therefore posit the existence of a diversity-accuracy duality in ensembles of classifiers. An implication of this is the possibility of evolving diverse ensembles without detriment to their accuracy, since it is implicitly ensured.
    Open Access
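    The abstract above defines behavioural diversity through the divergence between the errors made by different models, but does not give the metrics themselves. As a rough illustration only, the sketch below uses one simple candidate: the mean pairwise disagreement between binary error vectors. This metric, the function names, and the toy data are assumptions and are not taken from the thesis.

```python
from itertools import combinations


def error_vector(predictions, labels):
    """Return 1 where the model is wrong and 0 where it is right."""
    return [int(p != y) for p, y in zip(predictions, labels)]


def error_disagreement(errors_a, errors_b):
    """Fraction of samples on which exactly one of the two models errs."""
    differing = sum(a != b for a, b in zip(errors_a, errors_b))
    return differing / len(errors_a)


def ensemble_diversity(all_predictions, labels):
    """Mean pairwise error disagreement over all pairs of models.

    An assumed, illustrative metric; requires at least two models.
    """
    errors = [error_vector(p, labels) for p in all_predictions]
    pairs = list(combinations(errors, 2))
    return sum(error_disagreement(a, b) for a, b in pairs) / len(pairs)


# Toy data: three hypothetical classifiers evaluated on five samples
labels = [0, 1, 1, 0, 1]
model_predictions = [
    [0, 1, 1, 0, 0],  # wrong on the last sample
    [0, 1, 0, 0, 1],  # wrong on the third sample
    [0, 1, 1, 0, 0],  # wrong on the last sample, same as the first model
]
diversity = ensemble_diversity(model_predictions, labels)
```

    Under this measure, two models that fail on exactly the same samples contribute zero diversity, whereas models whose errors fall on different samples score higher. A search guided by such a score rewards complementary error patterns rather than individual accuracy.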

    Scientometric Analysis of Optimisation and Machine Learning Publications

    Introduction: Optimisation is an important aspect of machine learning because it helps improve accuracy and reduce errors in the model's predictions. Purpose: The purpose of this research is to identify the global structure of optimization and machine learning. The work specifically looks at the collaborative network of countries in these fields, the top 20 authors in terms of production from 2015–2021, and the co-citation network of articles. Methodology: In this study, co-word analysis and social network analysis were used to conduct a descriptive study based on the scientometric approach and the content analysis method. Around 17,500 articles on optimization and machine learning published between 2015 and 2021 were extracted. An ANOVA was performed to evaluate whether there was a significant difference between betweenness, closeness, and PageRank. The Dimensions database was utilised for the investigation without language constraints. Moreover, Bibliometrix was used for calculation and visualization. Findings: The results revealed a substantial difference between betweenness, closeness, and PageRank, indicating that this research has the potential to bring vital insights into future optimization and machine learning research.