    Automatically Designing CNN Architectures for Medical Image Segmentation

    Deep neural network architectures have traditionally been designed and explored with human expertise in a long-lasting trial-and-error process. This process requires a huge amount of time, expertise, and resources. To address this tedious problem, we propose a novel algorithm to optimally find hyperparameters of a deep network architecture automatically. We specifically focus on designing neural architectures for the medical image segmentation task. Our proposed method is based on policy gradient reinforcement learning, for which the reward function is assigned a segmentation evaluation utility (i.e., the Dice index). We show the efficacy of the proposed method with its low computational cost in comparison with the state-of-the-art medical image segmentation networks. We also present a new architecture design, a densely connected encoder-decoder CNN, as a strong baseline architecture to which the proposed hyperparameter search algorithm is applied. We apply the proposed algorithm to each layer of the baseline architectures. As an application, we train the proposed system on cine cardiac MR images from the Automated Cardiac Diagnosis Challenge (ACDC) MICCAI 2017. Starting from a baseline segmentation architecture, the resulting network architecture obtains state-of-the-art accuracy without any trial-and-error based architecture design or close supervision of the hyperparameter changes. Comment: Accepted to Machine Learning in Medical Imaging (MLMI 2018)
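The abstract above uses the Dice index as the reward signal. As a rough illustration of that quantity only (not the authors' reward implementation), here is a minimal sketch for two flat binary masks. The function name `dice_index` and the convention that two empty masks score 1.0 are assumptions made for this example.

```python
from typing import Sequence


def dice_index(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice index 2|A∩B| / (|A| + |B|) for two flat binary (0/1) masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Overlap: positions where both masks are 1.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    print(dice_index([1, 1, 0, 0], [1, 0, 0, 0]))
```

A score of 1.0 means the predicted and reference masks coincide, and 0.0 means they do not overlap at all, which is why the measure can serve as a scalar reward.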

    Dirichlet Process Hidden Markov Multiple Change-point Model

    This paper proposes a new Bayesian multiple change-point model which is based on the hidden Markov approach. The Dirichlet process hidden Markov model does not require the specification of the number of change-points a priori. Hence, our model is robust to model specification, in contrast to the fully parametric Bayesian model. We propose a general Markov chain Monte Carlo algorithm which only needs to sample the states around change-points. Simulations for a normal mean-shift model with known and unknown variance demonstrate the advantages of our approach. Two applications, namely the coal-mining disaster data and the real United States Gross Domestic Product growth, are provided. We detect a single change-point for both the disaster data and US GDP growth. All the change-point locations and posterior inferences of the two applications are in line with those of existing methods. Comment: Published at http://dx.doi.org/10.1214/14-BA910 in the Bayesian Analysis (http://projecteuclid.org/euclid.ba) by the International Society of Bayesian Analysis (http://bayesian.org/)

    Optimizing Convolutional Neural Networks for Embedded Systems by Means of Neuroevolution

    Automated design methods for convolutional neural networks (CNNs) have recently been developed to increase design productivity. We propose a neuroevolution method capable of evolving and optimizing CNNs with respect to the classification error and CNN complexity (expressed as the number of tunable CNN parameters), in which the inference phase can partly be executed using fixed-point operations to further reduce power consumption. Experimental results are obtained with the TinyDNN framework and presented using two common image classification benchmark problems: MNIST and CIFAR-10. Comment: TPNC 2019, LNCS 11934, pp. 1-13, 201
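The abstract above mentions running part of inference with fixed-point operations. As a generic illustration of what that means (not the TinyDNN implementation or the authors' number format), here is a minimal sketch of signed fixed-point arithmetic. The 16-bit word with 8 fractional bits and all function names are assumptions chosen for this example.

```python
def to_fixed_point(x: float, frac_bits: int = 8, total_bits: int = 16) -> int:
    """Quantize x to a signed fixed-point integer, saturating on overflow."""
    scale = 1 << frac_bits
    q = round(x * scale)  # Python rounds ties to the nearest even integer
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, q))


def from_fixed_point(q: int, frac_bits: int = 8) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_mul(a: int, b: int, frac_bits: int = 8) -> int:
    """Multiply two fixed-point values using integer operations only.

    The raw product carries 2 * frac_bits fractional bits, so it is shifted
    back down; the right shift rounds toward negative infinity.
    """
    return (a * b) >> frac_bits


if __name__ == "__main__":
    w = to_fixed_point(0.5)
    x = to_fixed_point(0.25)
    print(from_fixed_point(fixed_mul(w, x)))
```

Only integer multiplies and shifts are needed at inference time, which is the usual source of the power savings on embedded hardware.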

    Coevolution of Generative Adversarial Networks

    Generative adversarial networks (GANs) have become a hot topic, presenting impressive results in the field of computer vision. However, there are still open problems with the GAN model, such as training stability and the hand-design of architectures. Neuroevolution is a technique that can be used to provide the automatic design of network architectures even in large search spaces, as in deep neural networks. Therefore, this project proposes COEGAN, a model that combines neuroevolution and coevolution in the coordination of the GAN training algorithm. The proposal uses the adversarial characteristic between the generator and discriminator components to design an algorithm using coevolution techniques. Our proposal was evaluated on the MNIST dataset. The results suggest improved training stability and the automatic discovery of efficient network architectures for GANs. Our model also partially solves the mode collapse problem. Comment: Published in EvoApplications 201

    Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian Processes

    We propose a novel method for neural network quantization that casts the neural architecture search problem as one of hyperparameter search to find non-uniform bit distributions throughout the layers of a CNN. We perform the search assuming a Multi-Task Gaussian Processes prior, which splits the problem into multiple tasks, each corresponding to a different number of training epochs, and explore the space by sampling those configurations that yield maximum information. We then show that with significantly lower precision in the last layers we achieve a minimal loss of accuracy with appreciable memory savings. We test our findings on the CIFAR10 and ImageNet datasets using the VGG, ResNet and GoogLeNet architectures. Comment: Accepted for publication at ECCV 2020. Code available at https://code.active.vision . Updated for typ

    Using neuroevolution for predicting mobile marketing conversion

    This paper addresses user Conversion Rate (CVR) prediction within the context of Mobile Performance Marketing. Specifically, we adapt two main neuroevolution methods: Neuroevolution of Augmenting Topologies (NEAT) and Hypercube-based NEAT (HyperNEAT). First, we discuss two mechanisms for increasing execution speed (parallelism and data sampling); a strategy for preventing excessive network complexity with NEAT; and a rolling window scheme for performing online learning. Then, we present experimental results, using distinct datasets and testing both offline and online learning environments. This article is a result of the project NORTE-01-0247-FEDER-017497, supported by Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF). This work was also supported by FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2019

    Spatial patterns and intensity of the surface storm tracks in CMIP5 models

    Author Posting. © American Meteorological Society, 2017. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Climate 30 (2017): 4965-4981, doi:10.1175/JCLI-D-16-0228.1. To improve the understanding of storm tracks and western boundary current (WBC) interactions, surface storm tracks in 12 CMIP5 models are examined against ERA-Interim. All models capture an equatorward displacement toward the WBCs in the locations of the surface storm tracks’ maxima relative to those at 850 hPa. An estimated storm-track metric is developed to analyze the location of the surface storm track. It shows that the equatorward shift is influenced by both the lower-tropospheric instability and the baroclinicity. Basin-scale spatial correlations between models and ERA-Interim for the storm tracks, near-surface stability, SST gradient, and baroclinicity are calculated to test the ability of the GCMs to match the reanalysis. An intermodel comparison of the spatial correlations suggests that differences (relative to ERA-Interim) in the position of the storm track aloft have the strongest influence on differences in the surface storm-track position. However, in the North Atlantic, biases in the surface storm track north of the Gulf Stream are related to biases in the SST. An analysis of the strength of the storm tracks shows that most models generate a weaker storm track at the surface than at 850 hPa, consistent with observations, although some outliers are found. A linear relationship exists among the models between storm-track amplitudes at 500 and 850 hPa, but not between 850 hPa and the surface. In total, the work reveals a dual role in forcing the surface storm track from aloft and from the ocean surface in CMIP5 models, with the atmosphere having the larger relative influence. JFB was partially supported by the NOAA Climate Program Office’s Modeling, Analysis, Predictions, and Projections program (Grant NA15OAR4310094). Y-OK was supported by NSF Division of Atmospheric and Geospace Science Climate and Large-scale Dynamics Program (AGS-1355339), NASA Physical Oceanography Program (NNX13AM59G), and DOE Office of Biological and Environmental Research Regional and Global Climate Modeling Program (DE-SC0014433). RJS was supported by DOE Office of Biological and Environmental Research (DE-SC0006743) and NSF Directorate for Geosciences Division of Ocean Sciences (1419584), 2017-10-0

    Ant-based Neural Topology Search (ANTS) for Optimizing Recurrent Networks

    Hand-crafting effective and efficient structures for recurrent neural networks (RNNs) is a difficult, expensive, and time-consuming process. To address this challenge, we propose a novel neuro-evolution algorithm based on ant colony optimization (ACO), called Ant-based Neural Topology Search (ANTS), for directly optimizing RNN topologies. The procedure selects from multiple modern recurrent cell types such as ∆-RNN, GRU, LSTM, MGU and UGRNN cells, as well as recurrent connections which may span multiple layers and/or steps of time. In order to introduce an inductive bias that encourages the formation of sparser synaptic connectivity patterns, we investigate several variations of the core algorithm. We do so primarily by formulating different functions that drive the underlying pheromone simulation process (which mimic L1 and L2 regularization in standard machine learning) as well as by introducing ant agents with specialized roles (inspired by how real ant colonies operate), i.e., explorer ants that construct the initial feed-forward structure and social ants which select nodes from the feed-forward connections to subsequently craft recurrent memory structures. We also incorporate communal intelligence, where the best weights are shared by the ant colony for weight initialization, reducing the number of backpropagation epochs required to locally train candidate RNNs and speeding up the neuro-evolution process. Our results demonstrate that the sparser RNNs evolved by ANTS significantly outperform traditional one- and two-layer architectures consisting of modern memory cells, as well as the well-known NEAT algorithm. Furthermore, we improve upon prior state-of-the-art results on the time series dataset utilized in our experiments.
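The abstract above builds on a pheromone simulation process. As background only, here is a minimal sketch of the textbook ant colony optimization loop: pheromone-weighted edge selection followed by evaporation and deposit. It is not the ANTS update rule, and the function names, the evaporation rate, the deposit amount and the pheromone floor are all assumptions made for this example.

```python
import random
from typing import Dict, Set


def choose_edge(pheromones: Dict[str, float], rng: random.Random) -> str:
    """Pick one edge with probability proportional to its pheromone level."""
    edges = list(pheromones)
    weights = [pheromones[e] for e in edges]
    return rng.choices(edges, weights=weights, k=1)[0]


def update_pheromones(
    pheromones: Dict[str, float],
    rewarded_edges: Set[str],
    evaporation: float = 0.1,
    deposit: float = 1.0,
    floor: float = 0.01,
) -> Dict[str, float]:
    """Evaporate every edge, then reinforce edges used by a good candidate."""
    updated = {}
    for edge, tau in pheromones.items():
        tau = (1.0 - evaporation) * tau  # uniform decay on every edge
        if edge in rewarded_edges:
            tau += deposit  # reinforcement for edges in the rewarded candidate
        updated[edge] = max(tau, floor)  # floor keeps every edge selectable
    return updated


if __name__ == "__main__":
    rng = random.Random(0)
    tau = {"a->b": 1.0, "a->c": 1.0}
    tau = update_pheromones(tau, rewarded_edges={"a->b"})
    print(tau, choose_edge(tau, rng))
```

Edges that are rarely rewarded decay toward the floor and are sampled less often, which is one way such a process can favour sparser structures.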

    Automatic Inference of Cross-modal Connection Topologies for X-CNNs

    This paper introduces a way to learn cross-modal convolutional neural network (X-CNN) architectures from a base convolutional network (CNN) and the training data to reduce the design cost and enable applying cross-modal networks in sparse data environments. Two approaches for building X-CNNs are presented. The base approach learns the topology in a data-driven manner, by using measurements performed on the base CNN and supplied data. The iterative approach performs further optimisation of the topology through a combined learning procedure, simultaneously learning the topology and training the network. The approaches were evaluated against examples of hand-designed X-CNNs and their base variants, showing superior performance and, in some cases, gaining an additional 9% of accuracy. From further considerations, we conclude that the presented methodology takes less time than any manual approach would, whilst also significantly reducing the design complexity. The application of the methods is fully automated and implemented in the Xsertion library. Comment: 10 pages, 3 figures, 2 tables, to appear in ISNN 201