Modular Networks: Learning to Decompose Neural Computation
Scaling model capacity has been vital in the success of deep learning. For a
typical network, necessary compute resources and training time grow
dramatically with model size. Conditional computation is a promising way to
increase the number of parameters with a relatively small increase in
resources. We propose a training algorithm that flexibly chooses neural modules
based on the data to be processed. Both the decomposition and modules are
learned end-to-end. In contrast to existing approaches, training does not rely
on regularization to enforce diversity in module use. We apply modular networks
both to image recognition and language modeling tasks, where we achieve
superior performance compared to several baselines. Introspection reveals that
modules specialize in interpretable contexts.
Comment: NIPS 201
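The conditional-computation idea in the abstract above can be illustrated with a minimal sketch (this is a hypothetical illustration, not the paper's training algorithm: the module and controller weights here are random rather than learned end-to-end). A controller scores K modules per input and only the top-scoring module is executed, so the parameter count grows with K while per-example compute stays roughly constant.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d_in, d_out = 4, 8, 8  # illustrative sizes, not from the paper

# K independent linear "modules" plus a controller that scores them per input
modules = [rng.standard_normal((d_in, d_out)) * 0.1 for _ in range(K)]
controller = rng.standard_normal((d_in, K)) * 0.1

def forward(x):
    scores = x @ controller        # (K,) module scores for this input
    k = int(np.argmax(scores))     # hard selection: execute one module only
    return k, x @ modules[k]       # output computed by the chosen module

x = rng.standard_normal(d_in)
k, y = forward(x)                  # k: chosen module index, y: module output
```

In the paper the decomposition and modules are learned jointly; here the hard argmax merely shows why inference cost depends on the chosen module rather than on all K modules.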
A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed.
Comment: Revised survey includes expanded discussion section and reworked
introductory section on common deep architectures. Added missed papers from
before Feb 1st 201
Knowing what you know in brain segmentation using Bayesian deep neural networks
In this paper, we describe a Bayesian deep neural network (DNN) for
predicting FreeSurfer segmentations of structural MRI volumes, in minutes
rather than hours. The network was trained and evaluated on a large dataset (n
= 11,480), obtained by combining data from more than a hundred different sites,
and also evaluated on another completely held-out dataset (n = 418). The
network was trained using a novel spike-and-slab dropout-based variational
inference approach. We show that, on these datasets, the proposed Bayesian DNN
outperforms previously proposed methods, in terms of the similarity between the
segmentation predictions and the FreeSurfer labels, and the usefulness of the
estimated uncertainty of these predictions. In particular, we demonstrate that
the prediction uncertainty of this network at each voxel is a good indicator of
whether the network has made an error and that the uncertainty across the whole
brain can predict the manual quality control ratings of a scan. The proposed
Bayesian DNN method should be applicable to any new network architecture for
addressing the segmentation problem.
Comment: Submitted to Frontiers in Neuroinformatics
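The per-voxel uncertainty discussed in this abstract can be sketched generically with Monte-Carlo sampling (this is not the paper's spike-and-slab dropout variational inference; the logits below are synthetic stand-ins for a stochastic segmentation network's outputs). Averaging sampled class probabilities per voxel and taking the entropy of that average gives an uncertainty map of the kind the authors use to flag likely errors.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vox, n_cls, T = 1000, 3, 32  # voxels, classes, stochastic forward passes

# Simulate T stochastic forward passes (e.g. sampling dropout masks at test
# time); in practice these would come from the trained Bayesian network.
logits = rng.standard_normal((T, n_vox, n_cls))
probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)  # softmax

mean_p = probs.mean(0)                                 # (n_vox, n_cls)
pred = mean_p.argmax(-1)                               # hard segmentation
entropy = -(mean_p * np.log(mean_p + 1e-12)).sum(-1)   # per-voxel uncertainty
```

High-entropy voxels are those where the sampled predictions disagree; aggregating such a map over the whole brain is one way a scan-level quality score, as described in the abstract, could be derived.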
Modeling biological face recognition with deep convolutional neural networks
Deep convolutional neural networks (DCNNs) have become the state-of-the-art
computational models of biological object recognition. Their remarkable success
has helped vision science break new ground and recent efforts have started to
transfer this achievement to research on biological face recognition. In this
regard, face detection can be investigated by comparing face-selective
biological neurons and brain areas to artificial neurons and model layers.
Similarly, face identification can be examined by comparing in vivo and in
silico multidimensional "face spaces". In this review, we summarize the first
studies that use DCNNs to model biological face recognition. On the basis of a
broad spectrum of behavioral and computational evidence, we conclude that DCNNs
are useful models that closely resemble the general hierarchical organization
of face recognition in the ventral visual pathway and the core face network. In
two exemplary spotlights, we emphasize the unique scientific contributions of
these models. First, studies on face detection in DCNNs indicate that
elementary face selectivity emerges automatically through feedforward
processing even in the absence of visual experience. Second, studies on face
identification in DCNNs suggest that identity-specific experience and
generative mechanisms facilitate this particular challenge. Taken together, as
this novel modeling approach enables close control of predisposition (i.e.,
architecture) and experience (i.e., training data), it may be suited to inform
long-standing debates on the substrates of biological face recognition.
Comment: 41 pages, 2 figures, 1 table
Predicting brain activation maps for arbitrary tasks with cognitive encoding models
A deep understanding of the neural architecture of mental function should enable the accurate prediction of a specific pattern of brain activity for any psychological task, based only on the cognitive functions known to be engaged by that task. Encoding models (EMs), which predict neural responses from known features (e.g., stimulus properties), have succeeded in circumscribed domains (e.g., visual neuroscience), but implementing domain-general EMs that predict brain-wide activity for arbitrary tasks has been limited mainly by the availability of datasets that 1) sufficiently span a large space of psychological functions, and 2) are sufficiently annotated with such functions to allow robust EM specification. We examine the use of EMs based on a formal specification of psychological function to predict cortical activation patterns across a broad range of tasks. We utilized the Multi-Domain Task Battery, a dataset in which 24 subjects completed 32 ten-minute fMRI scans, switching tasks every 35 s and engaging in 44 total conditions of diverse psychological manipulations. Conditions were annotated by a group of experts using the Cognitive Atlas ontology to identify putatively engaged functions, and region-wise cognitive EMs (CEMs) were fit, for individual subjects, on neocortical responses. We found that CEMs predicted cortical activation maps of held-out tasks with high accuracy, outperforming a permutation-based null model while approaching the noise ceiling of the data, without being driven solely by either cognitive or perceptual-motor features. Hierarchical clustering on the similarity structure of CEM generalization errors revealed relationships amongst psychological functions. Spatial distributions of feature importances systematically overlapped with large-scale resting-state functional networks (RSNs), supporting the hypothesis of functional specialization within RSNs while grounding their function in an interpretable data-driven manner.
Our implementation and validation of CEMs provides a proof of principle for the utility of formal ontologies in cognitive neuroscience and motivates the use of CEMs in the further testing of cognitive theories.