VPNet: Variable Projection Networks
In this paper, we introduce VPNet, a novel model-driven neural network
architecture based on variable projections (VP). The application of VP
operators in neural networks implies learnable features, interpretable
parameters, and compact network structures. This paper discusses the motivation
and mathematical background of VPNet as well as experiments. The concept was
evaluated in the context of signal processing. We performed classification
tasks on a synthetic dataset and on real electrocardiogram (ECG) signals.
Compared to fully-connected and 1D convolutional networks, VPNet offers fast
learning and good accuracy at a low computational cost in both training and
inference. Based on these promising results and the advantages above, we
expect broader impact in signal processing, including classification,
regression, and even clustering problems.
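A minimal sketch of the variable-projection idea in PyTorch is given below. The Gaussian atom basis, the ridge term, and all names are illustrative assumptions standing in for the adaptive function systems used in the paper; only the nonlinear basis parameters are learned, while the linear coefficients are eliminated by a closed-form least-squares solve.

import torch
import torch.nn as nn

class VPLayer(nn.Module):
    # Hypothetical VP layer: a bank of Gaussian atoms with learnable
    # centers and widths stands in for the paper's function system.
    def __init__(self, n_samples: int, n_atoms: int):
        super().__init__()
        self.centers = nn.Parameter(torch.linspace(-1.0, 1.0, n_atoms))
        self.log_width = nn.Parameter(torch.zeros(n_atoms))
        self.register_buffer("t", torch.linspace(-1.0, 1.0, n_samples))

    def basis(self) -> torch.Tensor:
        # Phi(theta): (n_samples, n_atoms) matrix of Gaussian atoms.
        w = self.log_width.exp()
        return torch.exp(-((self.t[:, None] - self.centers[None, :]) / w) ** 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Variable projection: solve for the linear coefficients in closed
        # form, so backprop only updates the nonlinear parameters theta.
        phi = self.basis()                                   # (n, k)
        gram = phi.T @ phi + 1e-6 * torch.eye(phi.shape[1])  # ridge for stability
        coeffs = torch.linalg.solve(gram, phi.T @ x.T).T     # (batch, k)
        return coeffs                                        # compact learned features

# Example: VP features followed by a tiny classifier head.
model = nn.Sequential(VPLayer(n_samples=128, n_atoms=8), nn.ReLU(), nn.Linear(8, 2))
logits = model(torch.randn(4, 128))  # (4, 2)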
Projectron -- A Shallow and Interpretable Network for Classifying Medical Images
This paper introduces the 'Projectron' as a new neural network architecture
that uses Radon projections to both classify and represent medical images. The
motivation is to build shallow networks that are more interpretable in the
medical imaging domain. The Radon transform is an established technique that
can reconstruct images from parallel projections. The Projectron first applies
a global Radon transform to each image using equidistant angles and then feeds
these projections to a single layer of neurons for encoding, followed by a
layer of suitable kernels to facilitate a linear separation of the projections.
Finally, the Projectron provides the output of the encoding as an input to two
more layers for final classification. We validate the Projectron on five
publicly available datasets: a general dataset (namely MNIST) and four medical
datasets (namely Emphysema, IDC, IRMA, and Pneumonia). The results are
encouraging: we compared the Projectron's performance against MLPs taking raw
images and Radon projections as inputs, respectively. The experiments clearly
demonstrate the potential of the proposed Projectron for representing and
classifying medical images.
Comment: Accepted for publication in the 2019 International Joint Conference
on Neural Networks (IJCNN), Budapest, Hungary
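A rough sketch of the Projectron's front end, under the assumption that equidistant-angle Radon projections simply replace raw pixels as the input to a shallow classifier; the scikit-learn MLP and the angle count are stand-ins for the paper's encoding and kernel layers, and the random arrays stand in for actual MNIST digits.

import numpy as np
from skimage.transform import radon
from sklearn.neural_network import MLPClassifier

def radon_features(images: np.ndarray, n_angles: int = 16) -> np.ndarray:
    # Map (N, H, W) images to flattened sinograms at equidistant angles.
    theta = np.linspace(0.0, 180.0, n_angles, endpoint=False)
    sinos = [radon(img, theta=theta, circle=False) for img in images]
    return np.stack(sinos).reshape(len(images), -1)

rng = np.random.default_rng(0)
X = rng.random((100, 28, 28))            # placeholder images
y = rng.integers(0, 2, size=100)         # placeholder labels
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=200)
clf.fit(radon_features(X), y)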
Spatially-Adaptive Reconstruction in Computed Tomography Based on Statistical Learning
We propose a direct reconstruction algorithm for Computed Tomography, based
on a local fusion of a few preliminary image estimates by means of a non-linear
fusion rule. One such rule is based on a signal denoising technique which is
spatially adaptive to the unknown local smoothness. Another, more powerful
fusion rule is based on a neural network trained off-line with a high-quality
training set of images. Two types of linear reconstruction algorithms for the
preliminary images are employed for two different reconstruction tasks. For an
entire image reconstruction from full projection data, the proposed scheme uses
a sequence of Filtered Back-Projection algorithms with a gradually growing
cut-off frequency. To recover a Region Of Interest only from local projections,
statistically-trained linear reconstruction algorithms are employed. Numerical
experiments display the improvement in reconstruction quality when compared to
linear reconstruction algorithms.
Comment: Submitted to IEEE Transactions on Image Processing
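A minimal sketch of the fusion idea: form a few preliminary FBP estimates (scikit-image's built-in reconstruction filters stand in here for a sequence of gradually growing cut-off frequencies) and fuse them pixel-wise with a small network trained against a high-quality reference. The filter list and network size are illustrative assumptions.

import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon
from sklearn.neural_network import MLPRegressor

phantom = shepp_logan_phantom()[::4, ::4]            # small reference image
theta = np.linspace(0.0, 180.0, 60, endpoint=False)
sino = radon(phantom, theta=theta)

# Preliminary linear estimates: FBP with increasingly smooth filters.
filters = ["ramp", "cosine", "hann"]
estimates = np.stack([iradon(sino, theta=theta, filter_name=f) for f in filters])

# Non-linear pixel-wise fusion: each pixel's input is its vector of
# preliminary values; the target is the reference pixel value.
X = estimates.reshape(len(filters), -1).T   # (n_pixels, n_estimates)
y = phantom.ravel()
fusion = MLPRegressor(hidden_layer_sizes=(16,), max_iter=300).fit(X, y)
fused = fusion.predict(X).reshape(phantom.shape)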
A black-box model for neurons
We explore the identification of neuronal voltage traces by artificial neural networks based on wavelets (Wavenets). More precisely, we apply a modification to the representation of dynamical systems by Wavenets which decreases the number of functions used; this approach combines localized and global-scope functions (unlike the standard Wavenet, which uses localized functions only). As a proof of concept, we focus on the identification of voltage traces obtained by simulating a paradigmatic neuron model, the Morris-Lecar model. We show that, after training our artificial network with biologically plausible input currents, the network is able to identify the neuron's behaviour with high accuracy, thus obtaining a black box that can then be used for predictive goals. Interestingly, the interval of input currents used for training, ranging from stimuli for which the neuron is quiescent to stimuli that elicit spikes, shows the ability of our network to identify abrupt changes in the bifurcation diagram, from almost linear input-output relationships to highly nonlinear ones. These findings open new avenues to investigate the identification of other neuron models and to provide heuristic models for real neurons by stimulating them in closed-loop experiments, that is, using the dynamic clamp, a well-known electrophysiology technique.
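As a concrete starting point for the identification task above, here is a minimal sketch that simulates Morris-Lecar voltage traces under constant input currents with SciPy; the parameter values follow a common textbook setting and are illustrative assumptions rather than the paper's exact choices.

import numpy as np
from scipy.integrate import solve_ivp

# Typical Morris-Lecar parameters (capacitance, conductances, reversal
# potentials, and gating constants).
C, g_L, g_Ca, g_K = 20.0, 2.0, 4.0, 8.0
V_L, V_Ca, V_K = -60.0, 120.0, -84.0
V1, V2, V3, V4, phi = -1.2, 18.0, 12.0, 17.4, 0.067

def morris_lecar(t, y, I_ext):
    V, w = y
    m_inf = 0.5 * (1.0 + np.tanh((V - V1) / V2))
    w_inf = 0.5 * (1.0 + np.tanh((V - V3) / V4))
    tau_w = 1.0 / np.cosh((V - V3) / (2.0 * V4))
    dV = (I_ext - g_L * (V - V_L) - g_Ca * m_inf * (V - V_Ca)
          - g_K * w * (V - V_K)) / C
    dw = phi * (w_inf - w) / tau_w
    return [dV, dw]

# One trace per input current; spanning quiescence to spiking exposes the
# identification network to both sides of the bifurcation.
t_eval = np.linspace(0.0, 500.0, 5000)
traces = {I: solve_ivp(morris_lecar, (0.0, 500.0), [-60.0, 0.0],
                       args=(I,), t_eval=t_eval).y[0]
          for I in (30.0, 60.0, 90.0)}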
Recurrent Generative Adversarial Networks for Proximal Learning and Automated Compressive Image Recovery
Recovering images from undersampled linear measurements typically leads to an
ill-posed linear inverse problem, which calls for proper statistical priors.
Building effective priors is, however, challenged by the low training and
testing overhead dictated by real-time tasks, and by the need for retrieving
visually "plausible" and physically "feasible" images with minimal
hallucination. To cope with these challenges, we design a cascaded network
architecture that unrolls the proximal gradient iterations, bringing the
benefits of generative residual networks (ResNets) to modeling the proximal
operator. A mixture of pixel-wise and perceptual costs is then deployed to
train the proximals. The
overall architecture resembles back-and-forth projection onto the intersection
of feasible and plausible images. Extensive computational experiments are
conducted for a global task of reconstructing MR images of pediatric patients
and a more local task of super-resolving CelebA faces, which offer insight
into designing efficient architectures. Our observations indicate that for MRI
reconstruction, a recurrent ResNet with a single residual block effectively
learns the proximal. This simple architecture appears to significantly
outperform the alternative deep ResNet architecture by 2 dB SNR, and the
conventional compressed-sensing MRI by 4 dB SNR with 100x faster inference.
For image super-resolution, our preliminary results indicate that modeling the
denoising proximal demands deep ResNets.
Comment: 11 pages, 11 figures
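A minimal sketch of the unrolled architecture the abstract describes: alternate a gradient step on the data-fit term with a learned residual-block proximal whose weights are shared across iterations. The masking operator, sizes, and step size are illustrative assumptions.

import torch
import torch.nn as nn

class ResBlockProx(nn.Module):
    # One residual block acting as the learned proximal operator.
    def __init__(self, channels: int = 1, width: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, channels, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)   # residual correction of the estimate

class UnrolledPGD(nn.Module):
    def __init__(self, A, At, n_iters: int = 5, step: float = 1.0):
        super().__init__()
        self.A, self.At = A, At    # forward operator and its adjoint
        self.prox = ResBlockProx() # weights shared across iterations
        self.n_iters, self.step = n_iters, step

    def forward(self, y):
        x = self.At(y)             # adjoint-based initialization
        for _ in range(self.n_iters):
            x = x - self.step * self.At(self.A(x) - y)  # gradient step
            x = self.prox(x)                            # learned proximal
        return x

# Toy usage: a random mask stands in for undersampled measurements.
mask = (torch.rand(1, 1, 32, 32) > 0.5).float()
A = At = lambda z: mask * z
x_hat = UnrolledPGD(A, At)(A(torch.randn(1, 1, 32, 32)))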
From neural PCA to deep unsupervised learning
A network supporting deep unsupervised learning is presented. The network is
an autoencoder with lateral shortcut connections from the encoder to decoder at
each level of the hierarchy. The lateral shortcut connections allow the higher
levels of the hierarchy to focus on abstract invariant features. While standard
autoencoders are analogous to latent variable models with a single layer of
stochastic variables, the proposed network is analogous to hierarchical latent
variable models. Learning combines the denoising autoencoder and denoising
source separation frameworks. Each layer of the network contributes to the cost
function a term which measures the distance of the representations produced by
the encoder and the decoder. Since training signals originate from all levels
of the network, all layers can learn efficiently even in deep networks. The
speedup offered by cost terms from higher levels of the hierarchy and the
ability to learn invariant features are demonstrated in experiments.
Comment: A revised version of an article that has been accepted for
publication in Advances in Independent Component Analysis and Learning
Machines (2015), edited by Ella Bingham, Samuel Kaski, Jorma Laaksonen and
Jouko Lampinen
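A minimal sketch of the idea in PyTorch: an autoencoder whose decoder receives a lateral shortcut from each encoder level, with a denoising cost at every level rather than only at the input, so training signals originate from all layers. The simple additive combinator, layer sizes, and noise level are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LadderAE(nn.Module):
    def __init__(self, dims=(784, 256, 64)):
        super().__init__()
        self.enc = nn.ModuleList(nn.Linear(a, b) for a, b in zip(dims, dims[1:]))
        self.dec = nn.ModuleList(nn.Linear(b, a) for a, b in zip(dims, dims[1:]))

    def forward(self, x, noise_std=0.3):
        clean = [x]                              # clean pass: per-level targets
        for layer in self.enc:
            clean.append(torch.relu(layer(clean[-1])))
        noisy = [x + noise_std * torch.randn_like(x)]   # corrupted pass
        for layer in self.enc:
            noisy.append(torch.relu(layer(noisy[-1])))
        cost, z = 0.0, noisy[-1]
        for level in reversed(range(len(self.dec))):
            z = self.dec[level](z) + noisy[level]      # lateral shortcut
            cost = cost + F.mse_loss(z, clean[level])  # per-level denoising cost
        return cost

loss = LadderAE()(torch.randn(8, 784))
loss.backward()   # training signals flow from every level of the hierarchy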
Proximal Mean-field for Neural Network Quantization
Compressing large Neural Networks (NN) by quantizing their parameters while
maintaining performance is highly desirable due to reduced memory and time
complexity. In this work, we cast NN quantization as a discrete labelling
problem, and by examining relaxations, we design an efficient iterative
optimization procedure that involves stochastic gradient descent followed by a
projection. We prove that our simple projected gradient descent approach is, in
fact, equivalent to a proximal version of the well-known mean-field method.
These findings would allow the decades-old and theoretically grounded research
on MRF optimization to be used to design better network quantization schemes.
Our experiments on standard classification datasets (MNIST, CIFAR10/100,
TinyImageNet) with convolutional and residual architectures show that our
algorithm obtains fully-quantized networks with accuracies very close to the
floating-point reference networks.
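A minimal sketch of the iterative scheme the abstract outlines: a stochastic-gradient step on continuous auxiliary weights followed by a projection, with a final rounding to the discrete label set {-1, +1}. The toy least-squares objective, learning rate, and hard box projection are illustrative assumptions, not the paper's proximal mean-field update.

import torch

d = 64
A = torch.randn(256, d)
b = A @ torch.sign(torch.randn(d))          # targets from binary ground truth
x = torch.zeros(d, requires_grad=True)      # continuous auxiliary weights
opt = torch.optim.SGD([x], lr=0.05)

for step in range(200):
    opt.zero_grad()
    loss = ((A @ x - b) ** 2).mean()        # stand-in for the training loss
    loss.backward()
    opt.step()                              # stochastic gradient step
    with torch.no_grad():
        x.clamp_(-1.0, 1.0)                 # projection onto the relaxed set

x_quant = torch.sign(x.detach())            # final fully-quantized weights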
An exploration of parameter redundancy in deep networks with circulant projections
We explore the redundancy of parameters in deep neural networks by replacing
the conventional linear projection in fully-connected layers with the circulant
projection. The circulant structure substantially reduces memory footprint and
enables the use of the Fast Fourier Transform to speed up the computation.
Considering a fully-connected neural network layer with d input nodes and d
output nodes, this method improves the time complexity from O(d^2) to
O(d log d) and the space complexity from O(d^2) to O(d). The space savings are
particularly
important for modern deep convolutional neural network architectures, where
fully-connected layers typically contain more than 90% of the network
parameters. We further show that the gradient computation and optimization of
the circulant projections can be performed very efficiently. Our experiments on
three standard datasets show that the proposed approach achieves significant
gains in storage and efficiency with a minimal increase in error rate compared
to neural networks with unstructured projections.
Comment: International Conference on Computer Vision (ICCV) 2015
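The complexity claim can be illustrated directly: a d x d circulant matrix is diagonalized by the DFT, so the projection y = Cx costs O(d log d) time and O(d) memory via the FFT. The sketch below checks the FFT product against an explicit dense circulant matrix; the random sign flipping the paper combines with the circulant structure is omitted as a simplification.

import numpy as np

def circulant_project(w: np.ndarray, x: np.ndarray) -> np.ndarray:
    # Multiply circ(w) @ x via the convolution theorem; w is the first column.
    return np.real(np.fft.ifft(np.fft.fft(w) * np.fft.fft(x)))

d = 8
w, x = np.random.randn(d), np.random.randn(d)
C = np.stack([np.roll(w, k) for k in range(d)], axis=1)  # dense circulant matrix
assert np.allclose(C @ x, circulant_project(w, x))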
Recurrent Inference Machines for Solving Inverse Problems
Much of the recent research on solving iterative inference problems focuses
on moving away from hand-chosen inference algorithms and towards learned
inference. In the latter, the inference process is unrolled in time and
interpreted as a recurrent neural network (RNN), which allows for joint learning
of model and inference parameters with back-propagation through time. In this
framework, the RNN architecture is directly derived from a hand-chosen
inference algorithm, effectively limiting its capabilities. We propose a
learning framework, called Recurrent Inference Machines (RIM), in which we turn
algorithm construction the other way round: Given data and a task, train an RNN
to learn an inference algorithm. Because RNNs are Turing complete [1, 2], they
are capable of implementing any inference algorithm. The framework allows for an
abstraction which removes the need for domain knowledge. We demonstrate in
several image restoration experiments that this abstraction is effective,
allowing us to achieve state-of-the-art performance on image denoising and
super-resolution tasks and superior across-task generalization.
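A minimal sketch of the RIM recipe for a denoising task: a recurrent cell receives the current estimate together with the gradient of the data-fit term and emits an additive update, sharing weights across inference steps. The GRU cell, Gaussian likelihood, and sizes are illustrative assumptions.

import torch
import torch.nn as nn

class RIM(nn.Module):
    def __init__(self, dim: int, hidden: int = 64, n_steps: int = 8):
        super().__init__()
        self.cell = nn.GRUCell(2 * dim, hidden)  # input: [estimate, gradient]
        self.out = nn.Linear(hidden, dim)        # maps hidden state to an update
        self.n_steps = n_steps

    def forward(self, y):
        x = y.clone()                            # initialize at the observation
        h = torch.zeros(y.shape[0], self.cell.hidden_size)
        for _ in range(self.n_steps):
            grad = x - y                         # gradient of 0.5 * ||x - y||^2
            h = self.cell(torch.cat([x, grad], dim=1), h)
            x = x + self.out(h)                  # learned iterative update
        return x

# Training teaches the RNN the inference algorithm itself.
clean = torch.randn(16, 32)
noisy = clean + 0.3 * torch.randn_like(clean)
loss = ((RIM(dim=32)(noisy) - clean) ** 2).mean()
loss.backward()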
Statistical mechanics of complex neural systems and high dimensional data
Recent experimental advances in neuroscience have opened new vistas into the
immense complexity of neuronal networks. This proliferation of data challenges
us on two parallel fronts. First, how can we form adequate theoretical
frameworks for understanding how dynamical network processes cooperate across
widely disparate spatiotemporal scales to solve important computational
problems? And second, how can we extract meaningful models of neuronal systems
from high dimensional datasets? To aid in these challenges, we give a
pedagogical review of a collection of ideas and theoretical methods arising at
the intersection of statistical physics, computer science and neurobiology. We
introduce the interrelated replica and cavity methods, which originated in
statistical physics as powerful ways to quantitatively analyze large highly
heterogeneous systems of many interacting degrees of freedom. We also introduce
the closely related notion of message passing in graphical models, which
originated in computer science as a distributed algorithm capable of solving
large inference and optimization problems involving many coupled variables. We
then show how both the statistical physics and computer science perspectives
can be applied in a wide diversity of contexts to problems arising in
theoretical neuroscience and data analysis. Along the way we discuss spin
glasses, learning theory, illusions of structure in noise, random matrices,
dimensionality reduction, and compressed sensing, all within the unified
formalism of the replica method. Moreover, we review recent conceptual
connections between message passing in graphical models, and neural computation
and learning. Overall, these ideas illustrate how statistical physics and
computer science might provide a lens through which we can uncover emergent
computational functions buried deep within the dynamical complexities of
neuronal networks.
Comment: 72 pages, 8 figures, iopart.cls, to appear in JSTAT
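As a small worked example of the message-passing theme, here is sum-product belief propagation on an Ising chain, where each node passes its neighbor a message summarizing the rest of the chain; on a tree (and hence on a chain) the resulting marginals are exact. The coupling, field, and chain length are illustrative assumptions.

import numpy as np

J, h, n = 0.8, 0.3, 6                        # coupling, field, chain length
states = np.array([-1.0, 1.0])
psi = np.exp(J * np.outer(states, states))   # pairwise factor
phi = np.exp(h * states)                     # unary factor

fwd = [np.ones(2)]                           # messages passed left to right
for _ in range(n - 1):
    m = psi @ (phi * fwd[-1])
    fwd.append(m / m.sum())
bwd = [np.ones(2)]                           # messages passed right to left
for _ in range(n - 1):
    m = psi @ (phi * bwd[-1])
    bwd.append(m / m.sum())
bwd = bwd[::-1]

marginals = np.array([phi * f * b for f, b in zip(fwd, bwd)])
marginals /= marginals.sum(axis=1, keepdims=True)
print(marginals)                             # exact single-spin marginals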