VPNet: Variable Projection Networks
In this paper, we introduce VPNet, a novel model-driven neural network
architecture based on variable projections (VP). The application of VP
operators in neural networks implies learnable features, interpretable
parameters, and compact network structures. This paper discusses the motivation
and mathematical background of VPNet as well as experiments. The concept was
evaluated in the context of signal processing. We performed classification
tasks on a synthetic dataset and on real electrocardiogram (ECG) signals.
Compared to fully-connected and 1D convolutional networks, VPNet offers fast
learning and good accuracy at a low computational cost in both training and
inference. Based on these promising results and the advantages above, we
expect broader impact in signal processing, including classification,
regression, and even clustering problems.
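A minimal sketch of the variable-projection idea in PyTorch is given below. The Gaussian atom basis, the ridge term, and all names are illustrative assumptions standing in for the adaptive function systems used in the paper; only the nonlinear basis parameters are learned, while the linear coefficients are eliminated by a closed-form least-squares solve.

import torch
import torch.nn as nn

class VPLayer(nn.Module):
    # Hypothetical VP layer: a bank of Gaussian atoms with learnable
    # centers and widths stands in for the paper's function system.
    def __init__(self, n_samples: int, n_atoms: int):
        super().__init__()
        self.centers = nn.Parameter(torch.linspace(-1.0, 1.0, n_atoms))
        self.log_width = nn.Parameter(torch.zeros(n_atoms))
        self.register_buffer("t", torch.linspace(-1.0, 1.0, n_samples))

    def basis(self) -> torch.Tensor:
        # Phi(theta): (n_samples, n_atoms) matrix of Gaussian atoms.
        w = self.log_width.exp()
        return torch.exp(-((self.t[:, None] - self.centers[None, :]) / w) ** 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Variable projection: solve for the linear coefficients in closed
        # form, so backprop only updates the nonlinear parameters theta.
        phi = self.basis()                                   # (n, k)
        gram = phi.T @ phi + 1e-6 * torch.eye(phi.shape[1])  # ridge for stability
        coeffs = torch.linalg.solve(gram, phi.T @ x.T).T     # (batch, k)
        return coeffs                                        # compact learned features

# Example: VP features followed by a tiny classifier head.
model = nn.Sequential(VPLayer(n_samples=128, n_atoms=8), nn.ReLU(), nn.Linear(8, 2))
logits = model(torch.randn(4, 128))  # (4, 2)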
Projectron -- A Shallow and Interpretable Network for Classifying Medical Images
This paper introduces the 'Projectron' as a new neural network architecture
that uses Radon projections to both classify and represent medical images. The
motivation is to build shallow networks that are more interpretable in the
medical imaging domain. The Radon transform is an established technique that
can reconstruct images from parallel projections. The Projectron first applies
a global Radon transform to each image using equidistant angles and then feeds
these projections to a single layer of neurons for encoding, followed by a
layer of suitable kernels to facilitate a linear separation of the projections.
Finally, the Projectron provides the output of the encoding as an input to two
more layers for final classification. We validate the Projectron on five
publicly available datasets: a general dataset (namely MNIST) and four medical
datasets (namely Emphysema, IDC, IRMA, and Pneumonia). The results are
encouraging: we compared the Projectron's performance against MLPs taking raw
images and Radon projections as inputs, respectively. The experiments clearly
demonstrate the potential of the proposed Projectron for representing and
classifying medical images.
Comment: Accepted for publication in the 2019 International Joint Conference
on Neural Networks (IJCNN), Budapest, Hungary
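A rough sketch of the Projectron's front end, under the assumption that equidistant-angle Radon projections simply replace raw pixels as the input to a shallow classifier; the scikit-learn MLP and the angle count are stand-ins for the paper's encoding and kernel layers, and the random arrays stand in for actual MNIST digits.

import numpy as np
from skimage.transform import radon
from sklearn.neural_network import MLPClassifier

def radon_features(images: np.ndarray, n_angles: int = 16) -> np.ndarray:
    # Map (N, H, W) images to flattened sinograms at equidistant angles.
    theta = np.linspace(0.0, 180.0, n_angles, endpoint=False)
    sinos = [radon(img, theta=theta, circle=False) for img in images]
    return np.stack(sinos).reshape(len(images), -1)

rng = np.random.default_rng(0)
X = rng.random((100, 28, 28))            # placeholder images
y = rng.integers(0, 2, size=100)         # placeholder labels
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=200)
clf.fit(radon_features(X), y)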
Spatially-Adaptive Reconstruction in Computed Tomography Based on Statistical Learning
We propose a direct reconstruction algorithm for Computed Tomography, based
on a local fusion of a few preliminary image estimates by means of a non-linear
fusion rule. One such rule is based on a signal denoising technique which is
spatially adaptive to the unknown local smoothness. Another, more powerful
fusion rule is based on a neural network trained off-line with a high-quality
training set of images. Two types of linear reconstruction algorithms for the
preliminary images are employed for two different reconstruction tasks. For an
entire image reconstruction from full projection data, the proposed scheme uses
a sequence of Filtered Back-Projection algorithms with a gradually growing
cut-off frequency. To recover a Region Of Interest only from local projections,
statistically-trained linear reconstruction algorithms are employed. Numerical
experiments display the improvement in reconstruction quality when compared to
linear reconstruction algorithms.
Comment: Submitted to IEEE Transactions on Image Processing
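A minimal sketch of the fusion idea: form a few preliminary FBP estimates (scikit-image's built-in reconstruction filters stand in here for a sequence of gradually growing cut-off frequencies) and fuse them pixel-wise with a small network trained against a high-quality reference. The filter list and network size are illustrative assumptions.

import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon
from sklearn.neural_network import MLPRegressor

phantom = shepp_logan_phantom()[::4, ::4]            # small reference image
theta = np.linspace(0.0, 180.0, 60, endpoint=False)
sino = radon(phantom, theta=theta)

# Preliminary linear estimates: FBP with increasingly smooth filters.
filters = ["ramp", "cosine", "hann"]
estimates = np.stack([iradon(sino, theta=theta, filter_name=f) for f in filters])

# Non-linear pixel-wise fusion: each pixel's input is its vector of
# preliminary values; the target is the reference pixel value.
X = estimates.reshape(len(filters), -1).T   # (n_pixels, n_estimates)
y = phantom.ravel()
fusion = MLPRegressor(hidden_layer_sizes=(16,), max_iter=300).fit(X, y)
fused = fusion.predict(X).reshape(phantom.shape)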
A black-box model for neurons
We explore the identification of neuronal voltage traces by artificial neural networks based on wavelets (Wavenets). More precisely, we apply a modification to the representation of dynamical systems by Wavenets which decreases the number of functions used; this approach combines localized and global-scope functions (unlike the standard Wavenet, which uses localized functions only). As a proof of concept, we focus on the identification of voltage traces obtained by simulating a paradigmatic neuron model, the Morris-Lecar model. We show that, after training our artificial network with biologically plausible input currents, the network is able to identify the neuron's behaviour with high accuracy, thus obtaining a black box that can then be used for predictive goals. Interestingly, the interval of input currents used for training, ranging from stimuli for which the neuron is quiescent to stimuli that elicit spikes, shows the ability of our network to identify abrupt changes in the bifurcation diagram, from almost linear input-output relationships to highly nonlinear ones. These findings open new avenues to investigate the identification of other neuron models and to provide heuristic models for real neurons by stimulating them in closed-loop experiments, that is, using the dynamic clamp, a well-known electrophysiology technique.
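As a concrete starting point for the identification task above, here is a minimal sketch that simulates Morris-Lecar voltage traces under constant input currents with SciPy; the parameter values follow a common textbook setting and are illustrative assumptions rather than the paper's exact choices.

import numpy as np
from scipy.integrate import solve_ivp

# Typical Morris-Lecar parameters (capacitance, conductances, reversal
# potentials, and gating constants).
C, g_L, g_Ca, g_K = 20.0, 2.0, 4.0, 8.0
V_L, V_Ca, V_K = -60.0, 120.0, -84.0
V1, V2, V3, V4, phi = -1.2, 18.0, 12.0, 17.4, 0.067

def morris_lecar(t, y, I_ext):
    V, w = y
    m_inf = 0.5 * (1.0 + np.tanh((V - V1) / V2))
    w_inf = 0.5 * (1.0 + np.tanh((V - V3) / V4))
    tau_w = 1.0 / np.cosh((V - V3) / (2.0 * V4))
    dV = (I_ext - g_L * (V - V_L) - g_Ca * m_inf * (V - V_Ca)
          - g_K * w * (V - V_K)) / C
    dw = phi * (w_inf - w) / tau_w
    return [dV, dw]

# One trace per input current; spanning quiescence to spiking exposes the
# identification network to both sides of the bifurcation.
t_eval = np.linspace(0.0, 500.0, 5000)
traces = {I: solve_ivp(morris_lecar, (0.0, 500.0), [-60.0, 0.0],
                       args=(I,), t_eval=t_eval).y[0]
          for I in (30.0, 60.0, 90.0)}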
Recurrent Generative Adversarial Networks for Proximal Learning and Automated Compressive Image Recovery
Recovering images from undersampled linear measurements typically leads to an
ill-posed linear inverse problem, which calls for proper statistical priors.
Building effective priors is, however, challenged by the low training and
testing overhead dictated by real-time tasks, and by the need for retrieving
visually "plausible" and physically "feasible" images with minimal
hallucination. To cope with these challenges, we design a cascaded network
architecture that unrolls the proximal gradient iterations, bringing the
benefits of generative residual networks (ResNets) to modeling the proximal
operator. A mixture of pixel-wise and perceptual costs is then deployed to
train the proximals. The
overall architecture resembles back-and-forth projection onto the intersection
of feasible and plausible images. Extensive computational experiments are
conducted for a global task of reconstructing MR images of pediatric patients
and a more local task of super-resolving CelebA faces, which offer insight
into designing efficient architectures. Our observations indicate that for MRI
reconstruction, a recurrent ResNet with a single residual block effectively
learns the proximal. This simple architecture appears to significantly
outperform the alternative deep ResNet architecture by 2 dB SNR, and the
conventional compressed-sensing MRI by 4 dB SNR with 100x faster inference.
For image super-resolution, our preliminary results indicate that modeling the
denoising proximal demands deep ResNets.
Comment: 11 pages, 11 figures
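A minimal sketch of the unrolled architecture the abstract describes: alternate a gradient step on the data-fit term with a learned residual-block proximal whose weights are shared across iterations. The masking operator, sizes, and step size are illustrative assumptions.

import torch
import torch.nn as nn

class ResBlockProx(nn.Module):
    # One residual block acting as the learned proximal operator.
    def __init__(self, channels: int = 1, width: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, channels, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)   # residual correction of the estimate

class UnrolledPGD(nn.Module):
    def __init__(self, A, At, n_iters: int = 5, step: float = 1.0):
        super().__init__()
        self.A, self.At = A, At    # forward operator and its adjoint
        self.prox = ResBlockProx() # weights shared across iterations
        self.n_iters, self.step = n_iters, step

    def forward(self, y):
        x = self.At(y)             # adjoint-based initialization
        for _ in range(self.n_iters):
            x = x - self.step * self.At(self.A(x) - y)  # gradient step
            x = self.prox(x)                            # learned proximal
        return x

# Toy usage: a random mask stands in for undersampled measurements.
mask = (torch.rand(1, 1, 32, 32) > 0.5).float()
A = At = lambda z: mask * z
x_hat = UnrolledPGD(A, At)(A(torch.randn(1, 1, 32, 32)))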
From neural PCA to deep unsupervised learning
A network supporting deep unsupervised learning is presented. The network is
an autoencoder with lateral shortcut connections from the encoder to decoder at
each level of the hierarchy. The lateral shortcut connections allow the higher
levels of the hierarchy to focus on abstract invariant features. While standard
autoencoders are analogous to latent variable models with a single layer of
stochastic variables, the proposed network is analogous to hierarchical latent
variable models. Learning combines the denoising autoencoder and denoising
source separation frameworks. Each layer of the network contributes to the cost
function a term which measures the distance of the representations produced by
the encoder and the decoder. Since training signals originate from all levels
of the network, all layers can learn efficiently even in deep networks. The
speedup offered by cost terms from higher levels of the hierarchy and the
ability to learn invariant features are demonstrated in experiments.
Comment: A revised version of an article that has been accepted for
publication in Advances in Independent Component Analysis and Learning
Machines (2015), edited by Ella Bingham, Samuel Kaski, Jorma Laaksonen and
Jouko Lampinen
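A minimal sketch of the idea in PyTorch: an autoencoder whose decoder receives a lateral shortcut from each encoder level, with a denoising cost at every level rather than only at the input, so training signals originate from all layers. The simple additive combinator, layer sizes, and noise level are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LadderAE(nn.Module):
    def __init__(self, dims=(784, 256, 64)):
        super().__init__()
        self.enc = nn.ModuleList(nn.Linear(a, b) for a, b in zip(dims, dims[1:]))
        self.dec = nn.ModuleList(nn.Linear(b, a) for a, b in zip(dims, dims[1:]))

    def forward(self, x, noise_std=0.3):
        clean = [x]                              # clean pass: per-level targets
        for layer in self.enc:
            clean.append(torch.relu(layer(clean[-1])))
        noisy = [x + noise_std * torch.randn_like(x)]   # corrupted pass
        for layer in self.enc:
            noisy.append(torch.relu(layer(noisy[-1])))
        cost, z = 0.0, noisy[-1]
        for level in reversed(range(len(self.dec))):
            z = self.dec[level](z) + noisy[level]      # lateral shortcut
            cost = cost + F.mse_loss(z, clean[level])  # per-level denoising cost
        return cost

loss = LadderAE()(torch.randn(8, 784))
loss.backward()   # training signals flow from every level of the hierarchy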
Proximal Mean-field for Neural Network Quantization
Compressing large Neural Networks (NN) by quantizing their parameters while
maintaining performance is highly desirable due to reduced memory and time
complexity. In this work, we cast NN quantization as a discrete labelling
problem, and by examining relaxations, we design an efficient iterative
optimization procedure that involves stochastic gradient descent followed by a
projection. We prove that our simple projected gradient descent approach is, in
fact, equivalent to a proximal version of the well-known mean-field method.
These findings would allow the decades-old and theoretically grounded research
on MRF optimization to be used to design better network quantization schemes.
Our experiments on standard classification datasets (MNIST, CIFAR10/100,
TinyImageNet) with convolutional and residual architectures show that our
algorithm obtains fully-quantized networks with accuracies very close to the
floating-point reference networks.
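A minimal sketch of the iterative scheme the abstract outlines: a stochastic-gradient step on continuous auxiliary weights followed by a projection, with a final rounding to the discrete label set {-1, +1}. The toy least-squares objective, learning rate, and hard box projection are illustrative assumptions, not the paper's proximal mean-field update.

import torch

d = 64
A = torch.randn(256, d)
b = A @ torch.sign(torch.randn(d))          # targets from binary ground truth
x = torch.zeros(d, requires_grad=True)      # continuous auxiliary weights
opt = torch.optim.SGD([x], lr=0.05)

for step in range(200):
    opt.zero_grad()
    loss = ((A @ x - b) ** 2).mean()        # stand-in for the training loss
    loss.backward()
    opt.step()                              # stochastic gradient step
    with torch.no_grad():
        x.clamp_(-1.0, 1.0)                 # projection onto the relaxed set

x_quant = torch.sign(x.detach())            # final fully-quantized weights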
An exploration of parameter redundancy in deep networks with circulant projections
We explore the redundancy of parameters in deep neural networks by replacing
the conventional linear projection in fully-connected layers with the circulant
projection. The circulant structure substantially reduces memory footprint and
enables the use of the Fast Fourier Transform to speed up the computation.
Considering a fully-connected neural network layer with d input nodes and d
output nodes, this method improves the time complexity from O(d^2) to
O(d log d) and the space complexity from O(d^2) to O(d). The space savings are
particularly
important for modern deep convolutional neural network architectures, where
fully-connected layers typically contain more than 90% of the network
parameters. We further show that the gradient computation and optimization of
the circulant projections can be performed very efficiently. Our experiments on
three standard datasets show that the proposed approach achieves significant
gains in storage and efficiency with a minimal increase in error rate compared
to neural networks with unstructured projections.
Comment: International Conference on Computer Vision (ICCV) 2015
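The complexity claim can be illustrated directly: a d x d circulant matrix is diagonalized by the DFT, so the projection y = Cx costs O(d log d) time and O(d) memory via the FFT. The sketch below checks the FFT product against an explicit dense circulant matrix; the random sign flipping the paper combines with the circulant structure is omitted as a simplification.

import numpy as np

def circulant_project(w: np.ndarray, x: np.ndarray) -> np.ndarray:
    # Multiply circ(w) @ x via the convolution theorem; w is the first column.
    return np.real(np.fft.ifft(np.fft.fft(w) * np.fft.fft(x)))

d = 8
w, x = np.random.randn(d), np.random.randn(d)
C = np.stack([np.roll(w, k) for k in range(d)], axis=1)  # dense circulant matrix
assert np.allclose(C @ x, circulant_project(w, x))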
Recurrent Inference Machines for Solving Inverse Problems
Much of the recent research on solving iterative inference problems focuses
on moving away from hand-chosen inference algorithms and towards learned
inference. In the latter, the inference process is unrolled in time and
interpreted as a recurrent neural network (RNN), which allows for joint learning
of model and inference parameters with back-propagation through time. In this
framework, the RNN architecture is directly derived from a hand-chosen
inference algorithm, effectively limiting its capabilities. We propose a
learning framework, called Recurrent Inference Machines (RIM), in which we turn
algorithm construction the other way round: Given data and a task, train an RNN
to learn an inference algorithm. Because RNNs are Turing complete [1, 2], they
are capable of implementing any inference algorithm. The framework allows for an
abstraction which removes the need for domain knowledge. We demonstrate in
several image restoration experiments that this abstraction is effective,
allowing us to achieve state-of-the-art performance on image denoising and
super-resolution tasks and superior across-task generalization.
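A minimal sketch of the RIM recipe for a denoising task: a recurrent cell receives the current estimate together with the gradient of the data-fit term and emits an additive update, sharing weights across inference steps. The GRU cell, Gaussian likelihood, and sizes are illustrative assumptions.

import torch
import torch.nn as nn

class RIM(nn.Module):
    def __init__(self, dim: int, hidden: int = 64, n_steps: int = 8):
        super().__init__()
        self.cell = nn.GRUCell(2 * dim, hidden)  # input: [estimate, gradient]
        self.out = nn.Linear(hidden, dim)        # maps hidden state to an update
        self.n_steps = n_steps

    def forward(self, y):
        x = y.clone()                            # initialize at the observation
        h = torch.zeros(y.shape[0], self.cell.hidden_size)
        for _ in range(self.n_steps):
            grad = x - y                         # gradient of 0.5 * ||x - y||^2
            h = self.cell(torch.cat([x, grad], dim=1), h)
            x = x + self.out(h)                  # learned iterative update
        return x

# Training teaches the RNN the inference algorithm itself.
clean = torch.randn(16, 32)
noisy = clean + 0.3 * torch.randn_like(clean)
loss = ((RIM(dim=32)(noisy) - clean) ** 2).mean()
loss.backward()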
Statistical mechanics of complex neural systems and high dimensional data
Recent experimental advances in neuroscience have opened new vistas into the
immense complexity of neuronal networks. This proliferation of data challenges
us on two parallel fronts. First, how can we form adequate theoretical
frameworks for understanding how dynamical network processes cooperate across
widely disparate spatiotemporal scales to solve important computational
problems? And second, how can we extract meaningful models of neuronal systems
from high dimensional datasets? To aid in these challenges, we give a
pedagogical review of a collection of ideas and theoretical methods arising at
the intersection of statistical physics, computer science and neurobiology. We
introduce the interrelated replica and cavity methods, which originated in
statistical physics as powerful ways to quantitatively analyze large highly
heterogeneous systems of many interacting degrees of freedom. We also introduce
the closely related notion of message passing in graphical models, which
originated in computer science as a distributed algorithm capable of solving
large inference and optimization problems involving many coupled variables. We
then show how both the statistical physics and computer science perspectives
can be applied in a wide diversity of contexts to problems arising in
theoretical neuroscience and data analysis. Along the way we discuss spin
glasses, learning theory, illusions of structure in noise, random matrices,
dimensionality reduction, and compressed sensing, all within the unified
formalism of the replica method. Moreover, we review recent conceptual
connections between message passing in graphical models, and neural computation
and learning. Overall, these ideas illustrate how statistical physics and
computer science might provide a lens through which we can uncover emergent
computational functions buried deep within the dynamical complexities of
neuronal networks.
Comment: 72 pages, 8 figures, iopart.cls, to appear in JSTAT
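As a small worked example of the message-passing theme, here is sum-product belief propagation on an Ising chain, where each node passes its neighbor a message summarizing the rest of the chain; on a tree (and hence on a chain) the resulting marginals are exact. The coupling, field, and chain length are illustrative assumptions.

import numpy as np

J, h, n = 0.8, 0.3, 6                        # coupling, field, chain length
states = np.array([-1.0, 1.0])
psi = np.exp(J * np.outer(states, states))   # pairwise factor
phi = np.exp(h * states)                     # unary factor

fwd = [np.ones(2)]                           # messages passed left to right
for _ in range(n - 1):
    m = psi @ (phi * fwd[-1])
    fwd.append(m / m.sum())
bwd = [np.ones(2)]                           # messages passed right to left
for _ in range(n - 1):
    m = psi @ (phi * bwd[-1])
    bwd.append(m / m.sum())
bwd = bwd[::-1]

marginals = np.array([phi * f * b for f, b in zip(fwd, bwd)])
marginals /= marginals.sum(axis=1, keepdims=True)
print(marginals)                             # exact single-spin marginals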