Assessing Architectural Similarity in Populations of Deep Neural Networks
Evolutionary deep intelligence has recently shown great promise for producing
small, powerful deep neural network models via the synthesis of increasingly
efficient architectures over successive generations. Despite recent research
showing the efficacy of multi-parent evolutionary synthesis, little has been
done to directly assess architectural similarity between networks during the
synthesis process for improved parent network selection. In this work, we
present a preliminary study into quantifying architectural similarity via the
percentage overlap of architectural clusters. Results show that networks
synthesized using architectural alignment (via gene tagging) maintain higher
architectural similarities within each generation, potentially restricting the
search space of highly efficient network architectures.

Comment: 3 pages. arXiv admin note: text overlap with arXiv:1811.0796
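The similarity measure described above can be illustrated with a small sketch. This is a hypothetical implementation, assuming each architectural cluster is represented as a set of gene-tagged unit identifiers; the actual clustering and tagging scheme used in the study may differ.

```python
# Hypothetical sketch: percentage overlap between two networks' architectural
# clusters, assuming clusters are sets of gene-tagged unit identifiers.

def cluster_overlap(clusters_a, clusters_b):
    """Return the percentage of tagged units shared between two
    cluster partitions (each a list of sets of unit tags)."""
    units_a = set().union(*clusters_a)
    units_b = set().union(*clusters_b)
    if not units_a and not units_b:
        return 100.0
    shared = units_a & units_b
    return 100.0 * len(shared) / len(units_a | units_b)

# Two toy networks sharing two of four tagged filters.
a = [{"conv1/f0", "conv1/f1"}, {"conv2/f0"}]
b = [{"conv1/f0"}, {"conv2/f0", "conv2/f1"}]
print(cluster_overlap(a, b))  # 50.0
```

A higher overlap percentage within a generation would indicate the tighter architectural similarity that the abstract reports for gene-tagged synthesis.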
An emergentist perspective on the origin of number sense
open2noopenZorzi, Marco; Testolin, AlbertoZorzi, Marco; Testolin, Albert
Scaling of a large-scale simulation of synchronous slow-wave and asynchronous awake-like activity of a cortical model with long-range interconnections
Cortical synapse organization supports a range of dynamic states on multiple
spatial and temporal scales, from synchronous slow wave activity (SWA),
characteristic of deep sleep or anesthesia, to fluctuating, asynchronous
activity during wakefulness (AW). Such dynamic diversity poses a challenge for
producing efficient large-scale simulations that embody realistic metaphors of
short- and long-range synaptic connectivity. In fact, during SWA and AW
different spatial extents of the cortical tissue are active in a given timespan
and at different firing rates, which implies a wide variety of loads of local
computation and communication. A balanced evaluation of simulation performance
and robustness should therefore include tests of a variety of cortical dynamic
states. Here, we demonstrate performance scaling of our proprietary Distributed
and Plastic Spiking Neural Networks (DPSNN) simulation engine in both SWA and
AW for bidimensional grids of neural populations, which reflects the modular
organization of the cortex. We explored networks up to 192x192 modules, each
composed of 1250 integrate-and-fire neurons with spike-frequency adaptation,
and exponentially decaying inter-modular synaptic connectivity with varying
spatial decay constant. For the largest networks the total number of synapses
was over 70 billion. The execution platform included up to 64 dual-socket
nodes, each socket mounting 8 Intel Xeon Haswell processor cores @ 2.40GHz
clock rates. Network initialization time, memory usage, and execution time
showed good scaling performances from 1 to 1024 processes, implemented using
the standard Message Passing Interface (MPI) protocol. We achieved simulation
speeds of between 2.3x10^9 and 4.1x10^9 synaptic events per second for both
cortical states in the explored range of inter-modular interconnections.

Comment: 22 pages, 9 figures, 4 tables
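The model neuron named above, an integrate-and-fire unit with spike-frequency adaptation, can be sketched in a few lines. This is an illustrative toy, not the DPSNN engine's code; all parameter values are arbitrary assumptions.

```python
# Illustrative sketch (not DPSNN code): a leaky integrate-and-fire neuron with
# spike-frequency adaptation. Each spike increments an adaptation current w,
# which hyperpolarizes the membrane and lengthens subsequent inter-spike
# intervals. Parameter values (mV, ms, arbitrary current units) are assumptions.

def simulate_lif_sfa(i_ext, dt=0.1, tau_m=20.0, tau_w=144.0,
                     v_rest=-70.0, v_thresh=-50.0, v_reset=-60.0, b=0.5):
    """Euler integration of membrane potential v with adaptation current w.
    Returns the list of spike times (ms) for the input-current trace i_ext."""
    v, w = v_rest, 0.0
    spikes = []
    for step, i in enumerate(i_ext):
        dv = (-(v - v_rest) - w + i) / tau_m   # leak, adaptation, drive
        dw = -w / tau_w                        # adaptation current decays
        v += dt * dv
        w += dt * dw
        if v >= v_thresh:
            spikes.append(step * dt)
            v = v_reset
            w += b   # spike-triggered adaptation increment
    return spikes

# Constant drive for 1 s: firing slows as the adaptation current accumulates.
times = simulate_lif_sfa([25.0] * 10000)
```

Under constant drive, the inter-spike intervals lengthen over the run, which is the adaptation behavior that shapes the SWA and AW regimes discussed in the abstract.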
Highly Efficient Deep Intelligence via Multi-Parent Evolutionary Synthesis of Deep Neural Networks
Machine learning methods, and particularly deep neural networks, are a rapidly growing field and are currently being employed in domains such as science, business, and government. However, the significant success of neural networks has largely been due to the increasingly large model sizes and enormous amounts of required training data. As a result, powerful neural networks are accompanied by growing storage and memory requirements, making these powerful models infeasible for practical scenarios that use small embedded devices without access to cloud computing.
As such, methods for significantly reducing the memory and computational requirements of high-performing deep neural networks via sparsification and/or compression have been developed. More recently, the concept of evolutionary deep intelligence was proposed, and takes inspiration from nature and allows highly-efficient deep neural networks to organically synthesize over successive generations. However, current work in evolutionary deep intelligence has been limited to the use of asexual evolutionary synthesis where a newly synthesized offspring network is solely dependent on a single parent network from the preceding generation.
In this thesis, we introduce a general framework for synthesizing efficient neural network architectures via multi-parent evolutionary synthesis. Generalized from the asexual evolutionary synthesis approach, the framework allows for a newly synthesized network to be dependent on a subset of all previously synthesized networks. By imposing constraints on this general framework, the cases of asexual evolutionary synthesis, 2-parent sexual evolutionary synthesis, and m-parent evolutionary synthesis can all be realized.
We explore the computational construct used to mimic heredity, and generalize it beyond the asexual evolutionary synthesis used in current evolutionary deep intelligence works. The efficacy of incorporating multiple parent networks during evolutionary synthesis was examined first in the context of 2-parent sexual evolutionary synthesis, then generalized to m-parent evolutionary synthesis in the context of varying generational population sizes. Both experiments show that the use of multiple parent networks during evolutionary synthesis allows for increased network diversity as well as steeper trends in increasing network efficiency over generations.
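The m-parent synthesis idea described above can be sketched as a stochastic synapse-retention rule. This is a hedged illustration, assuming each offspring synapse is sampled from a probability combining the corresponding synaptic strengths of its m parents; the combination rule and the environmental factor `gamma` are illustrative assumptions, not the thesis's exact synthesis operator.

```python
import random

# Hedged sketch of m-parent evolutionary synthesis: an offspring synapse is
# retained with probability proportional to the mean strength of the matching
# synapse across m parent networks, scaled by an assumed environmental
# factor gamma that mimics resource constraints.

def synthesize_offspring(parent_weights, gamma=0.7):
    """parent_weights: list of m dicts mapping synapse id -> |weight| in [0, 1].
    Returns the set of synapse ids retained in the offspring network."""
    synapse_ids = set().union(*parent_weights)
    offspring = set()
    for s in synapse_ids:
        # Mean normalized strength across parents (absent synapse counts as 0),
        # so a synapse present in more parents is more likely to be inherited.
        strengths = [p.get(s, 0.0) for p in parent_weights]
        p_keep = gamma * sum(strengths) / len(parent_weights)
        if random.random() < min(1.0, p_keep):
            offspring.add(s)
    return offspring
```

Setting the parent list to a single network recovers asexual synthesis, two parents recovers the sexual case, and larger lists give the general m-parent case, mirroring the constrained specializations of the framework.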
We also introduce the concept of gene tagging within the evolutionary deep intelligence framework as a means to enforce a like-with-like mating policy during the multi-parent evolutionary synthesis process, and evaluate the effect of architectural alignment during multi-parent evolutionary synthesis. We present an experiment exploring the quantification of network architectural similarity in populations of networks.
In addition, we investigate the computational construct used to mimic natural selection. The impact of various environmental resource models used to mimic the constraint of available computational and storage resources on network synthesis over successive generations is explored, and results clearly demonstrate the trade-off between computation time and optimal model performance.
The results of m-parent evolutionary synthesis are promising, and indicate the potential benefits of incorporating multiple parent networks during evolutionary synthesis for highly-efficient evolutionary deep intelligence. Future work includes studying the effects of inheriting weight values (as opposed to random initialization) on total training time and further investigation of potential structural similarity metrics, with the goal of developing a deeper understanding of the underlying effects of network architecture on performance.
A Hybrid Neuroevolutionary Approach to the Design of Convolutional Neural Networks for 2D and 3D Medical Image Segmentation
This thesis highlights the development and evaluation of a hybrid neuroevolutionary approach for designing Convolutional Neural Networks (CNNs) for image classification and segmentation tasks. We integrate Cartesian Genetic Programming (CGP) with Novelty Search and Simulated Annealing algorithms to optimize the CNN architectures efficiently.
The challenge lies in reducing the computational demands and inefficiencies of traditional Neural Architecture Search (NAS) techniques. To address this, a flexible framework based on CGP is utilized for evolving network architectures. Novelty Search facilitates the exploration of varied architectural landscapes, promoting diversity of solutions. Simulated
Annealing further refines these solutions, optimizing the balance between exploring new possibilities and exploiting known good solutions within the search space.
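The explore/exploit balance attributed to Simulated Annealing above can be made concrete with a minimal sketch. This is a generic annealing loop under assumed fitness values and a geometric cooling schedule, not the thesis's actual implementation or its CGP genotype representation.

```python
import math
import random

# Minimal sketch of the simulated-annealing refinement described above: worse
# candidates are accepted with a probability that decays with temperature,
# so the search explores early and exploits later. Cooling schedule and
# fitness function are illustrative assumptions.

def accept(current_fitness, candidate_fitness, temperature):
    """Always accept improvements; accept regressions with probability
    exp(-delta / temperature), where higher fitness is better."""
    delta = current_fitness - candidate_fitness
    if delta <= 0:          # candidate is at least as fit
        return True
    if temperature <= 0:
        return False
    return random.random() < math.exp(-delta / temperature)

def anneal(fitness, neighbor, start, t0=1.0, cooling=0.95, steps=200):
    """Generic annealing loop over any state space with a neighbor operator."""
    state, t = start, t0
    for _ in range(steps):
        candidate = neighbor(state)
        if accept(fitness(state), fitness(candidate), t):
            state = candidate
        t *= cooling        # geometric cooling: exploration fades over time
    return state
```

In the hybrid approach, the state would be a CGP-encoded architecture and the neighbor operator a genotype mutation; here a one-dimensional toy objective suffices to show the mechanism.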
Our experiments, conducted on benchmark datasets such as DRIVE and MSD, demonstrate the method’s effectiveness in generating competitive segmentation models. On the DRIVE dataset, our models achieved Dice Similarity Coefficients (DSC) of 0.828 and 0.814, and Intersection over Union (IoU) scores of 0.716 and 0.736, respectively. For the MSD dataset, our models exhibited DSC scores up to 0.924 for the Heart task, showcasing the potential of our method in handling complex segmentation challenges across different medical imaging modalities.
The significance of this research lies in its hybrid approach that efficiently navigates the search space for CNN architectures, thus reducing the number of fitness evaluations while achieving near state-of-the-art performance. Future work will explore enhancing the algorithm’s effectiveness through advanced data preprocessing techniques, and the exploration of more complex network layers. Our findings highlight the potential of evolutionary algorithms and local search in advancing automated CNN design for medical image segmentation, offering a promising direction for future research in the field.