
    Optimal Brain Surgeon and general network pruning

    The use of information from all second-order derivatives of the error function to perform network pruning (i.e., removing unimportant weights from a trained network) in order to improve generalization, simplify networks, reduce hardware or storage requirements, increase the speed of further training, and, in some cases, enable rule extraction, is investigated. The method, Optimal Brain Surgeon (OBS), is significantly better than magnitude-based methods and Optimal Brain Damage, which often remove the wrong weights. OBS permits pruning of more weights than other methods (for the same error on the training set), and thus yields better generalization on test data. Crucial to OBS is a recursion relation for calculating the inverse Hessian matrix H^-1 from training data and structural information of the net. OBS permits a 76%, a 62%, and a 90% reduction in weights over backpropagation with weight decay on three benchmark MONK's problems. Of OBS, Optimal Brain Damage, and a magnitude-based method, only OBS deletes the correct weights from a trained XOR network in every case. Finally, whereas Sejnowski and Rosenberg used 18,000 weights in their NETtalk network, we used OBS to prune a network to just 1,560 weights, yielding better generalization.
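
    The saliency and weight-update formulas that drive OBS are compact enough to sketch in code. Below is a minimal NumPy illustration of a single pruning step, assuming `w` is the trained network's flattened weight vector and `H_inv` a precomputed inverse Hessian; the function name and interface are ours, not the paper's.

```python
import numpy as np

def obs_prune_step(w, H_inv):
    """One Optimal Brain Surgeon step (sketch).

    Saliency of weight q:      L_q = w_q^2 / (2 * [H^-1]_qq)
    Update after deleting q:   dw  = -(w_q / [H^-1]_qq) * H^-1[:, q]
    """
    saliencies = w ** 2 / (2.0 * np.diag(H_inv))
    q = int(np.argmin(saliencies))               # weight whose removal hurts least
    w = w - (w[q] / H_inv[q, q]) * H_inv[:, q]   # compensate the other weights
    w[q] = 0.0                                   # force an exact zero
    return w, q
```

    Unlike magnitude pruning, the update touches every remaining weight, which is why OBS can delete a weight and simultaneously repair most of the damage.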

    Second Order Derivatives for Network Pruning: Optimal Brain Surgeon

    We investigate the use of information from all second order derivatives of the error function to perform network pruning (i.e., removing unimportant weights from a trained network) in order to improve generalization, simplify networks, reduce hardware or storage requirements, increase the speed of further training, and in some cases enable rule extraction. Our method, Optimal Brain Surgeon (OBS), is significantly better than magnitude-based methods and Optimal Brain Damage [Le Cun, Denker and Solla, 1990], which often remove the wrong weights. OBS permits the pruning of more weights than other methods (for the same error on the training set), and thus yields better generalization on test data. Crucial to OBS is a recursion relation for calculating the inverse Hessian matrix H^-1 from training data and structural information of the net. OBS permits a 90%, a 76%, and a 62% reduction in weights over backpropagation with weight decay on three benchmark MONK's problems [Thrun et al., 1991]. Of OBS, Optimal Brain Damage, and magnitude-based methods, only OBS deletes the correct weights from a trained XOR network in every case. Finally, whereas Sejnowski and Rosenberg [1987] used 18,000 weights in their NETtalk network, we used OBS to prune a network to just 1,560 weights, yielding better generalization.
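
    The recursion relation mentioned above is, in essence, a sequence of rank-one Sherman-Morrison updates over the training patterns. A hedged NumPy sketch, assuming (as in the paper's approximation) that the Hessian is built from outer products of per-pattern output gradients `X_k`, with a small `alpha` added for numerical stability:

```python
import numpy as np

def inverse_hessian(xs, alpha=1e-8):
    """Recursive estimate of H^-1 for OBS (sketch).

    Approximates H = alpha*I + (1/P) * sum_k X_k X_k^T and inverts it
    incrementally: each pattern contributes one Sherman-Morrison
    rank-one update, so no explicit matrix inversion is ever needed.
    """
    P = len(xs)                                  # number of training patterns
    H_inv = np.eye(xs[0].shape[0]) / alpha       # H_0^-1 = (alpha * I)^-1
    for X in xs:
        HX = H_inv @ X
        H_inv -= np.outer(HX, HX) / (P + X @ HX)
    return H_inv
```

    The cost per pattern is one matrix-vector product, which is what makes computing H^-1 feasible for networks of nontrivial size.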

    Modeling Brain Cooling Helmets for Ischemia Patients

    In this project, we modeled the effectiveness of a cooling cap designed to lower brain temperature by approximately 3°C in order to induce temporary hypothermia in the brain and thereby prevent further injury caused by cerebral ischemia. Cerebral ischemia is a condition in which blood flows preferentially through certain blood vessels in the brain and not through others, leaving some regions of the brain with insufficient blood flow for nutrient uptake and waste removal, which can lead to a stroke. Using the modeling program COMSOL Multiphysics, we simulated the brain temperature that results from the use of a cooling helmet consisting of a cap containing flowing coolant. The model incorporates convective flow of the coolant in the cap and heat conduction through the modeled layers of the head. Our model showed that cooling occurred through the predicted conduction and convection mechanisms, and our results closely matched those reported in the literature, validating the model. Initial results showed appropriate brain cooling but also damage to the scalp. The time of application and the coolant temperature were then optimized to eliminate scalp damage while maintaining effective brain cooling. Even cooling the brain by just a few degrees, as achieved in our model, reduces the extent of brain damage following cerebral ischemia. The model thus provides insight into the optimal treatment conditions for using the cooling cap in a clinical setting.
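
    The conduction-plus-convection mechanism the abstract describes can be illustrated with a toy 1-D explicit finite-difference model. This is only a sketch with made-up layer thicknesses and material properties, not the paper's COMSOL model, and it omits blood perfusion and metabolic heat generation:

```python
import numpy as np

def cool_head_1d(T_coolant=10.0, T_body=37.0, h=50.0, t_end=600.0, dt=0.01):
    """Toy 1-D transient conduction through scalp/skull/brain layers,
    cooled by convection at the outer surface. All parameter values
    are illustrative assumptions."""
    dx = 1e-3                                    # 1 mm grid spacing
    # per-node properties: 5 mm scalp, 7 mm skull, 30 mm brain
    k   = np.r_[np.full(5, 0.34),  np.full(7, 0.65),  np.full(30, 0.51)]
    rho = np.r_[np.full(5, 1100.), np.full(7, 1500.), np.full(30, 1050.)]
    cp  = np.r_[np.full(5, 3500.), np.full(7, 1300.), np.full(30, 3600.)]
    T = np.full(k.size, T_body)
    for _ in range(int(t_end / dt)):             # explicit Euler in time
        cond = k[:-1] * np.diff(T) / dx          # conductive flux between nodes
        dT = np.empty_like(T)
        dT[0] = h * (T_coolant - T[0]) + cond[0]   # convective surface node
        dT[1:-1] = cond[1:] - cond[:-1]            # interior energy balance
        dT[-1] = -cond[-1]                         # insulated inner boundary
        T += dt * dT / (rho * cp * dx)
    return T                                     # temperature profile, degC
```

    Even this crude model reproduces the qualitative finding: the scalp nodes nearest the coolant cool fastest, which is why coolant temperature and application time had to be tuned against scalp damage.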

    Fast ConvNets Using Group-wise Brain Damage

    We revisit the idea of brain damage, i.e., the pruning of the coefficients of a neural network, and suggest how brain damage can be modified and used to speed up convolutional layers. The approach exploits the fact that many efficient implementations reduce generalized convolutions to matrix multiplications. The suggested brain damage process prunes the convolutional kernel tensor in a group-wise fashion by adding group-sparsity regularization to the standard training process. After such group-wise pruning, convolutions can be reduced to multiplications of thinned dense matrices, which leads to a speedup. In a comparison on AlexNet, the method achieves very competitive performance.
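
    The group structure is the key design choice: each group gathers the kernel entries that map to a single column of the lowered (im2col) matrix, so zeroing a whole group thins that matrix rather than just scattering zeros. A minimal NumPy sketch of such a group-lasso penalty, under our reading of the abstract (the function and shapes are illustrative assumptions):

```python
import numpy as np

def group_sparsity_penalty(W, lam=1e-4):
    """Group-lasso regularizer for a conv kernel tensor W of shape
    (out_channels, in_channels, kh, kw). Each group is the vector of
    weights at one (input channel, spatial offset) position across all
    output filters; driving a group to zero removes one column of the
    im2col matrix multiplication, leaving a smaller dense product."""
    group_norms = np.sqrt((W ** 2).sum(axis=0))   # (in_channels, kh, kw)
    return lam * group_norms.sum()                # l2 within, l1 across groups

# example: added to the training loss of an AlexNet-like first layer
W = np.random.randn(96, 3, 11, 11)
reg = group_sparsity_penalty(W)
```

    The l2-within / l1-across structure is what pushes entire groups to exactly zero during training, in contrast to plain weight decay, which merely shrinks individual coefficients.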