
    Growing Efficient Deep Networks by Structured Continuous Sparsification

    Full text link
    We develop an approach to training deep networks while dynamically adjusting their architecture, driven by a principled combination of accuracy and sparsity objectives. Unlike conventional pruning approaches, our method adopts a gradual continuous relaxation of discrete network structure optimization and then samples sparse subnetworks, enabling efficient deep networks to be trained by growing and pruning. Extensive experiments across CIFAR-10, ImageNet, PASCAL VOC, and Penn Treebank, with convolutional models for image classification and semantic segmentation, and recurrent models for language modeling, show that our training scheme yields efficient networks that are smaller and more accurate than those produced by competing pruning methods.
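
    To make the core mechanism concrete, below is a minimal PyTorch sketch of a continuously relaxed channel gate. It illustrates the general idea (sigmoid gates with an annealed temperature plus a sparsity penalty), not the paper's exact parameterization; the gating granularity, temperature schedule, and penalty weight are assumptions.

```python
# Illustrative sketch only: a per-channel gate that continuously relaxes a
# binary keep/prune decision. Temperature schedule, penalty weight, and
# gating granularity are assumptions, not the paper's exact recipe.
import torch
import torch.nn as nn

class GatedConv(nn.Module):
    def __init__(self, in_ch, out_ch, temperature=1.0):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.logits = nn.Parameter(torch.zeros(out_ch))  # one gate per output channel
        self.temperature = temperature

    def forward(self, x):
        # Sigmoid relaxation of a discrete 0/1 channel mask; as temperature -> 0
        # the gates saturate and the architecture becomes effectively discrete.
        gates = torch.sigmoid(self.logits / self.temperature)
        return self.conv(x) * gates.view(1, -1, 1, 1)

    def sparsity_penalty(self):
        # Pushes gates toward zero, i.e. toward pruning channels.
        return torch.sigmoid(self.logits / self.temperature).sum()

# Usage: combine the task loss with the sparsity objective and anneal temperature over training.
layer = GatedConv(3, 16)
x = torch.randn(2, 3, 32, 32)
loss = layer(x).pow(2).mean() + 1e-3 * layer.sparsity_penalty()
loss.backward()
```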

    Application of deep learning algorithm for estimating stand volume in South Korea

    Get PDF
    Current estimates of stand volume for South Korean forests are mostly derived from expensive field data. Techniques that reduce the amount of ground data required while maintaining reliable accuracy would decrease cost and time. The fifth National Forest Inventory (NFI) was conducted annually across all forest areas in South Korea from 2006 to 2010, and these data can be used to build a model for estimating forest stand volume. The purpose of this study is to test whether deep learning can estimate stand volume from satellite imagery and geospatial information. The spatial distribution of the stand volume of South Korean forests was predicted with a convolutional neural network (CNN) algorithm. NFI data were randomly sampled for training at rates from 90% down to 10%, in 10% decrements, and the stand volume of the remaining area was estimated using satellite imagery and geospatial information. Consequently, we found that the error rate of total stand volume was below 5% when using over 17% of the NFI data for training (R2 = 0.96). We conclude that a CNN model based on satellite imagery and geospatial information is suitable for estimating stand volume at the national level. This study is meaningful in that we (1) estimated stand volume using a deep learning algorithm with high accuracy compared with previous studies, (2) identified the minimum training rate of the CNN model needed to estimate the stand volume of South Korean forests, and (3) identified the effect of diameter class on error hotspots in stand volume estimates through clustering analysis.
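
    A hedged sketch of the setup described above: a small CNN regressing stand volume from stacked satellite bands and geospatial layers, with a helper that samples a training fraction of NFI plots. The channel count, patch size, and architecture are assumptions for illustration, not the study's actual model.

```python
# Minimal sketch of the experimental idea, not the study's actual model:
# a small CNN regresses stand volume (m^3/ha) from stacked satellite bands
# and geospatial layers, trained on a varying fraction of NFI plots.
import torch
import torch.nn as nn

class VolumeCNN(nn.Module):
    def __init__(self, in_channels=8):            # assumed: spectral bands + terrain layers
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 1)               # scalar stand-volume estimate

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def split_by_fraction(n_plots, train_frac, seed=0):
    # Randomly sample a training subset, mirroring the 90% -> 10% sweep.
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(n_plots, generator=g)
    n_train = int(train_frac * n_plots)
    return perm[:n_train], perm[n_train:]

train_idx, test_idx = split_by_fraction(n_plots=4000, train_frac=0.17)
model = VolumeCNN()
pred = model(torch.randn(4, 8, 16, 16))            # dummy 16x16 patches around plots
```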

    Neural networks with late-phase weights

    Full text link
    The largely successful method of training neural networks is to learn their weights using some variant of stochastic gradient descent (SGD). Here, we show that the solutions found by SGD can be further improved by ensembling a subset of the weights in late stages of learning. At the end of learning, we obtain back a single model by taking a spatial average in weight space. To avoid incurring increased computational costs, we investigate a family of low-dimensional late-phase weight models which interact multiplicatively with the remaining parameters. Our results show that augmenting standard models with late-phase weights improves generalization in established benchmarks such as CIFAR-10/100, ImageNet, and enwik8. These findings are complemented with a theoretical analysis of a noisy quadratic problem which provides a simplified picture of the late phases of neural network learning.
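
    The following PyTorch sketch illustrates the general flavor of low-dimensional multiplicative late-phase weights: a linear layer whose weight matrix is modulated by a small ensemble of rank-1 factors that are trained late in learning and then averaged into a single model. The rank-1 parameterization, member count, and averaging step are assumptions, not the paper's exact construction.

```python
# Hedged sketch: K low-dimensional multiplicative "late-phase" factors applied
# to one base weight matrix, collapsed to a single model by weight-space averaging.
import torch
import torch.nn as nn

class LatePhaseLinear(nn.Module):
    def __init__(self, in_f, out_f, n_members=4):
        super().__init__()
        self.base = nn.Linear(in_f, out_f)
        # K rank-1 multiplicative factors: W_k = W * (u_k v_k^T)  (assumed parameterization)
        self.u = nn.Parameter(torch.ones(n_members, out_f))
        self.v = nn.Parameter(torch.ones(n_members, in_f))

    def forward(self, x, member):
        w = self.base.weight * torch.outer(self.u[member], self.v[member])
        return nn.functional.linear(x, w, self.base.bias)

    def average_members(self):
        # Collapse the ensemble: fold the mean multiplicative mask into the base weight.
        masks = [torch.outer(self.u[k], self.v[k]) for k in range(self.u.shape[0])]
        self.base.weight.data *= torch.stack(masks).mean(0)
        self.u.data.fill_(1.0)
        self.v.data.fill_(1.0)

layer = LatePhaseLinear(10, 5)
out = layer(torch.randn(3, 10), member=2)   # each late-phase step trains a sampled member
layer.average_members()                     # at the end: a single averaged model
```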

    CHAMMI: A benchmark for channel-adaptive models in microscopy imaging

    Full text link
    Most neural networks assume that input images have a fixed number of channels (three for RGB images). However, there are many settings where the number of channels may vary, such as microscopy images, where the number of channels changes depending on instruments and experimental goals. Yet, there has not been a systematic attempt to create and evaluate neural networks that are invariant to the number and type of channels. As a result, trained models remain specific to individual studies and are hardly reusable for other microscopy settings. In this paper, we present a benchmark for investigating channel-adaptive models in microscopy imaging, which consists of 1) a dataset of varied-channel single-cell images, and 2) a biologically relevant evaluation framework. In addition, we adapted several existing techniques to create channel-adaptive models and compared their performance on this benchmark to fixed-channel, baseline models. We find that channel-adaptive models can generalize better to out-of-domain tasks and can be computationally efficient. We contribute a curated dataset (https://doi.org/10.5281/zenodo.7988357) and an evaluation API (https://github.com/broadinstitute/MorphEm.git) to facilitate objective comparisons in future research and applications. Accepted at the NeurIPS Track on Datasets and Benchmarks, 2023.
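
    One simple way to make a network channel-adaptive, sketched below as an assumption rather than as one of the benchmark's specific baselines, is to apply a shared single-channel stem to each input channel independently and then aggregate the per-channel feature maps, so the same parameters accept images with any number of channels.

```python
# Sketch of a simple channel-adaptive strategy (an illustration, not the
# benchmark's baselines): a shared 1-channel stem applied per channel, then
# averaged, so channel count can vary between datasets.
import torch
import torch.nn as nn

class ChannelAdaptiveStem(nn.Module):
    def __init__(self, out_channels=64):
        super().__init__()
        self.stem = nn.Conv2d(1, out_channels, kernel_size=7, stride=2, padding=3)

    def forward(self, x):                        # x: (B, C, H, W), C may vary per dataset
        b, c, h, w = x.shape
        feats = self.stem(x.reshape(b * c, 1, h, w))     # same weights for every channel
        feats = feats.reshape(b, c, *feats.shape[1:])
        return feats.mean(dim=1)                 # aggregate over channels -> fixed shape

stem = ChannelAdaptiveStem()
out3 = stem(torch.randn(2, 3, 64, 64))   # RGB input
out5 = stem(torch.randn(2, 5, 64, 64))   # 5-channel microscopy input, same parameters
```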

    Can neural networks benefit from objectives that encourage iterative convergent computations? A case study of ResNets and object classification

    Get PDF
    Recent work has suggested that feedforward residual neural networks (ResNets) approximate iterative recurrent computations. Iterative computations are useful in many domains, so they might provide good solutions for neural networks to learn. However, principled methods for measuring and manipulating iterative convergence in neural networks remain lacking. Here we address this gap by 1) quantifying the degree to which ResNets learn iterative solutions and 2) introducing a regularization approach that encourages the learning of iterative solutions. Iterative methods are characterized by two properties: iteration and convergence. To quantify these properties, we define three indices of iterative convergence. Consistent with previous work, we show that, even though ResNets can express iterative solutions, they do not learn them when trained conventionally on computer-vision tasks. We then introduce regularizations to encourage iterative convergent computation and test whether this provides a useful inductive bias. To make the networks more iterative, we manipulate the degree of weight sharing across layers using soft gradient coupling. This new method provides a form of recurrence regularization and can interpolate smoothly between an ordinary ResNet and a “recurrent” ResNet (i.e., one that uses identical weights across layers and thus could be physically implemented with a recurrent network computing the successive stages iteratively across time). To make the networks more convergent, we impose a Lipschitz constraint on the residual functions using spectral normalization. The three indices of iterative convergence reveal that the gradient coupling and the Lipschitz constraint succeed at making the networks iterative and convergent, respectively. To showcase the practicality of our approach, we study how iterative convergence impacts generalization on standard visual recognition tasks (MNIST, CIFAR-10, CIFAR-100) and on challenging recognition tasks with partial occlusions (Digitclutter). We find that iterative convergent computation, in these tasks, does not provide a useful inductive bias for ResNets. Importantly, our approach may be useful for investigating other network architectures and tasks as well, and we hope that our study provides a useful starting point for investigating the broader question of whether iterative convergence can help neural networks generalize.
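
    The sketch below illustrates the convergence side of this recipe: residual blocks whose residual functions are Lipschitz-constrained via spectral normalization, together with one plausible convergence index (the relative change between successive block outputs). The index definition and block design are illustrative assumptions, not necessarily the paper's three indices, and soft gradient coupling is omitted for brevity.

```python
# Hedged sketch: spectrally normalized residual functions plus a simple
# convergence measure over the sequence of block outputs.
import torch
import torch.nn as nn

def lipschitz_residual_block(channels):
    # Spectral normalization bounds each conv's largest singular value,
    # constraining the Lipschitz constant of the residual update.
    return nn.Sequential(
        nn.utils.spectral_norm(nn.Conv2d(channels, channels, 3, padding=1)),
        nn.ReLU(),
        nn.utils.spectral_norm(nn.Conv2d(channels, channels, 3, padding=1)),
    )

class IterativeResNet(nn.Module):
    def __init__(self, channels=32, depth=8):
        super().__init__()
        self.blocks = nn.ModuleList(lipschitz_residual_block(channels) for _ in range(depth))

    def forward(self, x):
        states = [x]
        for block in self.blocks:
            x = x + block(x)          # residual update, viewed as one "iteration"
            states.append(x)
        return x, states

def convergence_index(states):
    # Relative size of successive updates; values shrinking toward 0 suggest
    # the stacked blocks behave like a convergent iterative map.
    deltas = [(b - a).norm() / (a.norm() + 1e-8) for a, b in zip(states[:-1], states[1:])]
    return torch.stack(deltas)

net = IterativeResNet()
_, states = net(torch.randn(1, 32, 16, 16))
print(convergence_index(states))
```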