
2,446 research outputs found

Automatically Designing CNN Architectures for Medical Image Segmentation

Author: A Mortazi
KO Stanley
O Ronneberger
Publication date: 01/01/2018

Deep neural network architectures have traditionally been designed and explored with human expertise in a long-lasting trial-and-error process. This process requires a huge amount of time, expertise, and resources. To address this tedious problem, we propose a novel algorithm to optimally find hyperparameters of a deep network architecture automatically. We specifically focus on designing neural architectures for the medical image segmentation task. Our proposed method is based on policy gradient reinforcement learning, for which the reward function is assigned a segmentation evaluation utility (i.e., the Dice index). We show the efficacy of the proposed method with its low computational cost in comparison with the state-of-the-art medical image segmentation networks. We also present a new architecture design, a densely connected encoder-decoder CNN, as a strong baseline architecture to apply the proposed hyperparameter search algorithm. We apply the proposed algorithm to each layer of the baseline architectures. As an application, we train the proposed system on cine cardiac MR images from the Automated Cardiac Diagnosis Challenge (ACDC) MICCAI 2017. Starting from a baseline segmentation architecture, the resulting network architecture obtains state-of-the-art results in accuracy without performing any trial-and-error based architecture design approaches or close supervision of the hyperparameter changes. Comment: Accepted to Machine Learning in Medical Imaging (MLMI 2018)

arXiv.org e-Print Archive

Crossref

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Dirichlet Process Hidden Markov Multiple Change-point Model

Author: Chong Terence T. L.
Ghosh Pulak
Ko Stanley I. M.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 07/08/2014

This paper proposes a new Bayesian multiple change-point model which is based on the hidden Markov approach. The Dirichlet process hidden Markov model does not require the specification of the number of change-points a priori. Hence, our model is robust to model specification, in contrast to the fully parametric Bayesian model. We propose a general Markov chain Monte Carlo algorithm which only needs to sample the states around change-points. Simulations for a normal mean-shift model with known and unknown variance demonstrate the advantages of our approach. Two applications, namely the coal-mining disaster data and the real United States Gross Domestic Product growth, are provided. We detect a single change-point for both the disaster data and US GDP growth. All the change-point locations and posterior inferences of the two applications are in line with existing methods. Comment: Published at http://dx.doi.org/10.1214/14-BA910 in the Bayesian Analysis (http://projecteuclid.org/euclid.ba) by the International Society of Bayesian Analysis (http://bayesian.org/)

arXiv.org e-Print Archive

Munich RePEc Personal Archive

Crossref

Optimizing Convolutional Neural Networks for Embedded Systems by Means of Neuroevolution

Author: J-D Dong
KO Stanley
V Sze
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/10/2019

Automated design methods for convolutional neural networks (CNNs) have recently been developed in order to increase design productivity. We propose a neuroevolution method capable of evolving and optimizing CNNs with respect to the classification error and CNN complexity (expressed as the number of tunable CNN parameters), in which the inference phase can partly be executed using fixed-point operations to further reduce power consumption. Experimental results are obtained with the TinyDNN framework and presented using two common image classification benchmark problems, MNIST and CIFAR-10. Comment: TPNC 2019, LNCS 11934, pp. 1-13, 201

arXiv.org e-Print Archive

Crossref

Coevolution of Generative Adversarial Networks

Author: F Assunção
F Gomez
K Sims
KO Stanley
N García-Pedrajas
O Russakovsky
WD Hillis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 12/12/2019

Generative adversarial networks (GANs) have become a hot topic, presenting impressive results in the field of computer vision. However, there are still open problems with the GAN model, such as training stability and the hand-design of architectures. Neuroevolution is a technique that can be used to provide the automatic design of network architectures even in large search spaces, as in deep neural networks. Therefore, this project proposes COEGAN, a model that combines neuroevolution and coevolution in the coordination of the GAN training algorithm. The proposal uses the adversarial characteristic between the generator and discriminator components to design an algorithm using coevolution techniques. Our proposal was evaluated on the MNIST dataset. The results suggest improved training stability and the automatic discovery of efficient network architectures for GANs. Our model also partially solves the mode collapse problem. Comment: Published in EvoApplications 201

arXiv.org e-Print Archive

Crossref

Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian Processes

Author: H Kitano
I Hubara
KO Stanley
O Russakovsky
Publication date: 20/07/2020

We propose a novel method for neural network quantization that casts the neural architecture search problem as one of hyperparameter search to find non-uniform bit distributions throughout the layers of a CNN. We perform the search assuming a Multi-Task Gaussian Processes prior, which splits the problem into multiple tasks, each corresponding to a different number of training epochs, and explore the space by sampling those configurations that yield maximum information. We then show that with significantly lower precision in the last layers we achieve a minimal loss of accuracy with appreciable memory savings. We test our findings on the CIFAR10 and ImageNet datasets using the VGG, ResNet and GoogLeNet architectures. Comment: Accepted for publication at ECCV 2020. Code available at https://code.active.vision. Updated for typ

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Using neuroevolution for predicting mobile marketing conversion

Author: D Floreano
GO Campos
KO Stanley
L Tashman
M Bereta
M López-Ibáñez
P Cortez
T Fawcett
W Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

This paper addresses user Conversion Rate (CVR) prediction within the context of Mobile Performance Marketing. Specifically, we adapt two main neuroevolution methods: Neuroevolution of Augmenting Topologies (NEAT) and Hypercube-based NEAT (HyperNEAT). First, we discuss two mechanisms for increasing execution speed (parallelism and data sampling); a strategy for preventing excessive network complexity with NEAT; and a rolling window scheme for performing online learning. Then, we present experimental results, using distinct datasets and testing both offline and online learning environments. This article is a result of the project NORTE-01-0247-FEDER-017497, supported by Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF). This work was also supported by FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2019

Universidade do Minho: RepositoriUM

Crossref

Spatial patterns and intensity of the surface storm tracks in CMIP5 models

Author: Booth James F.
Ko Stanley
Kwon Young-Oh
Msadek Rym
Small R. Justin
Publication venue: 'American Meteorological Society'
Publication date: 03/04/2017

Author Posting. © American Meteorological Society, 2017. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Climate 30 (2017): 4965-4981, doi:10.1175/JCLI-D-16-0228.1. To improve the understanding of storm tracks and western boundary current (WBC) interactions, surface storm tracks in 12 CMIP5 models are examined against ERA-Interim. All models capture an equatorward displacement toward the WBCs in the locations of the surface storm tracks’ maxima relative to those at 850 hPa. An estimated storm-track metric is developed to analyze the location of the surface storm track. It shows that the equatorward shift is influenced by both the lower-tropospheric instability and the baroclinicity. Basin-scale spatial correlations between models and ERA-Interim for the storm tracks, near-surface stability, SST gradient, and baroclinicity are calculated to test the ability of the GCMs to match the reanalysis. An intermodel comparison of the spatial correlations suggests that differences (relative to ERA-Interim) in the position of the storm track aloft have the strongest influence on differences in the surface storm-track position. However, in the North Atlantic, biases in the surface storm track north of the Gulf Stream are related to biases in the SST. An analysis of the strength of the storm tracks shows that most models generate a weaker storm track at the surface than at 850 hPa, consistent with observations, although some outliers are found. A linear relationship exists among the models between storm-track amplitudes at 500 and 850 hPa, but not between 850 hPa and the surface.
In total, the work reveals a dual role in forcing the surface storm track from aloft and from the ocean surface in CMIP5 models, with the atmosphere having the larger relative influence. JFB was partially supported by the NOAA Climate Program Office’s Modeling, Analysis, Predictions, and Projections program (Grant NA15OAR4310094). Y-OK was supported by NSF Division of Atmospheric and Geospace Science Climate and Large-scale Dynamics Program (AGS-1355339), NASA Physical Oceanography Program (NNX13AM59G), and DOE Office of Biological and Environmental Research Regional and Global Climate Modeling Program (DE-SC0014433). RJS was supported by DOE Office of Biological and Environmental Research (DE-SC0006743) and NSF Directorate for Geosciences Division of Ocean Sciences (1419584), 2017-10-0

Woods Hole Open Access Server

Ant-based Neural Topology Search (ANTS) for Optimizing Recurrent Networks

Author: A ElSaid
AER ElSaid
AG Ororbia II
G-B Zhou
JH Holland
JL Elman
KO Stanley
N Srivastava
RK Sivagaminathan
S Hochreiter
S O’Donnell
T Desell
X Yao
XS Yang
Y-P Liu
Publication venue: RIT Scholar Works
Publication date: 09/04/2020

Hand-crafting effective and efficient structures for recurrent neural networks (RNNs) is a difficult, expensive, and time-consuming process. To address this challenge, we propose a novel neuro-evolution algorithm based on ant colony optimization (ACO), called Ant-based Neural Topology Search (ANTS), for directly optimizing RNN topologies. The procedure selects from multiple modern recurrent cell types such as ∆-RNN, GRU, LSTM, MGU and UGRNN cells, as well as recurrent connections which may span multiple layers and/or steps of time. In order to introduce an inductive bias that encourages the formation of sparser synaptic connectivity patterns, we investigate several variations of the core algorithm. We do so primarily by formulating different functions that drive the underlying pheromone simulation process (which mimic L1 and L2 regularization in standard machine learning) as well as by introducing ant agents with specialized roles (inspired by how real ant colonies operate), i.e., explorer ants that construct the initial feed-forward structure and social ants which select nodes from the feed-forward connections to subsequently craft recurrent memory structures. We also incorporate communal intelligence, where the best weights are shared by the ant colony for weight initialization, reducing the number of backpropagation epochs required to locally train candidate RNNs and speeding up the neuro-evolution process. Our results demonstrate that the sparser RNNs evolved by ANTS significantly outperform traditional one- and two-layer architectures consisting of modern memory cells, as well as the well-known NEAT algorithm. Furthermore, we improve upon prior state-of-the-art results on the time series dataset utilized in our experiments.

Crossref

RIT Scholar Works

Continual and One-Shot Learning Through Neural Networks with Dynamic External Memory

Author: A Graves
D Floreano
D Foster
D Kumaran
DO Hebb
F Silva
J Blynel
KO Ellefsen
KO Stanley
S Risi
X Yao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Crossref

The IT University of Copenhagen's Repository

Automatic Inference of Cross-modal Connection Topologies for X-CNNs

Author: BT Zhang
G Hinton
KO Stanley
MD Zeiler
V Mnih
X Yao
Y Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/05/2018

This paper introduces a way to learn cross-modal convolutional neural network (X-CNN) architectures from a base convolutional network (CNN) and the training data to reduce the design cost and enable applying cross-modal networks in sparse data environments. Two approaches for building X-CNNs are presented. The base approach learns the topology in a data-driven manner, by using measurements performed on the base CNN and supplied data. The iterative approach performs further optimisation of the topology through a combined learning procedure, simultaneously learning the topology and training the network. The approaches were evaluated against examples of hand-designed X-CNNs and their base variants, showing superior performance and, in some cases, gaining an additional 9% of accuracy. From further considerations, we conclude that the presented methodology takes less time than any manual approach would, whilst also significantly reducing the design complexity. The application of the methods is fully automated and implemented in the Xsertion library. Comment: 10 pages, 3 figures, 2 tables, to appear in ISNN 201

arXiv.org e-Print Archive

Crossref