
    Optimal Brain Surgeon and general network pruning

    The use of information from all second-order derivatives of the error function to perform network pruning (i.e., removing unimportant weights from a trained network) in order to improve generalization, simplify networks, reduce hardware or storage requirements, increase the speed of further training, and, in some cases, enable rule extraction, is investigated. The method, Optimal Brain Surgeon (OBS), is significantly better than magnitude-based methods and Optimal Brain Damage, which often remove the wrong weights. OBS permits pruning of more weights than other methods (for the same error on the training set), and thus yields better generalization on test data. Crucial to OBS is a recursion relation for calculating the inverse Hessian matrix H^-1 from training data and structural information of the net. OBS permits a 76%, a 62%, and a 90% reduction in weights over backpropagation with weight decay on three benchmark MONK's problems. Of OBS, Optimal Brain Damage, and a magnitude-based method, only OBS deletes the correct weights from a trained XOR network in every case. Finally, whereas Sejnowski and Rosenberg used 18,000 weights in their NETtalk network, we used OBS to prune a network to just 1,560 weights, yielding better generalization.
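
    The saliency and weight-update formulas that drive OBS are compact enough to sketch in code. Below is a minimal NumPy illustration of a single pruning step, assuming `w` is the trained network's flattened weight vector and `H_inv` a precomputed inverse Hessian; the function name and interface are ours, not the paper's.

```python
import numpy as np

def obs_prune_step(w, H_inv):
    """One Optimal Brain Surgeon step (sketch).

    Saliency of weight q:      L_q = w_q^2 / (2 * [H^-1]_qq)
    Update after deleting q:   dw  = -(w_q / [H^-1]_qq) * H^-1[:, q]
    """
    saliencies = w ** 2 / (2.0 * np.diag(H_inv))
    q = int(np.argmin(saliencies))               # weight whose removal hurts least
    w = w - (w[q] / H_inv[q, q]) * H_inv[:, q]   # compensate the other weights
    w[q] = 0.0                                   # force an exact zero
    return w, q
```

    Unlike magnitude pruning, the update touches every remaining weight, which is why OBS can delete a weight and simultaneously repair most of the damage.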

    Second Order Derivatives for Network Pruning: Optimal Brain Surgeon

    We investigate the use of information from all second order derivatives of the error function to perform network pruning (i.e., removing unimportant weights from a trained network) in order to improve generalization, simplify networks, reduce hardware or storage requirements, increase the speed of further training, and in some cases enable rule extraction. Our method, Optimal Brain Surgeon (OBS), is significantly better than magnitude-based methods and Optimal Brain Damage [Le Cun, Denker and Solla, 1990], which often remove the wrong weights. OBS permits the pruning of more weights than other methods (for the same error on the training set), and thus yields better generalization on test data. Crucial to OBS is a recursion relation for calculating the inverse Hessian matrix H^-1 from training data and structural information of the net. OBS permits a 90%, a 76%, and a 62% reduction in weights over backpropagation with weight decay on three benchmark MONK's problems [Thrun et al., 1991]. Of OBS, Optimal Brain Damage, and magnitude-based methods, only OBS deletes the correct weights from a trained XOR network in every case. Finally, whereas Sejnowski and Rosenberg [1987] used 18,000 weights in their NETtalk network, we used OBS to prune a network to just 1,560 weights, yielding better generalization.
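
    The recursion relation mentioned above is, in essence, a sequence of rank-one Sherman-Morrison updates over the training patterns. A hedged NumPy sketch, assuming (as in the paper's approximation) that the Hessian is built from outer products of per-pattern output gradients `X_k`, with a small `alpha` added for numerical stability:

```python
import numpy as np

def inverse_hessian(xs, alpha=1e-8):
    """Recursive estimate of H^-1 for OBS (sketch).

    Approximates H = alpha*I + (1/P) * sum_k X_k X_k^T and inverts it
    incrementally: each pattern contributes one Sherman-Morrison
    rank-one update, so no explicit matrix inversion is ever needed.
    """
    P = len(xs)                                  # number of training patterns
    H_inv = np.eye(xs[0].shape[0]) / alpha       # H_0^-1 = (alpha * I)^-1
    for X in xs:
        HX = H_inv @ X
        H_inv -= np.outer(HX, HX) / (P + X @ HX)
    return H_inv
```

    The cost per pattern is one matrix-vector product, which is what makes computing H^-1 feasible for networks of nontrivial size.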

    Modeling Brain Cooling Helmets for Ischemia Patients

    In this project, we modeled the effectiveness of a cooling cap designed to lower brain temperature by approximately 3°C in order to induce temporary hypothermia in the brain and thereby prevent further injury caused by cerebral ischemia. Cerebral ischemia is a condition in which blood flows preferentially through certain blood vessels in the brain and not through others, leaving some regions of the brain with insufficient blood flow for nutrient uptake and waste removal, which can lead to a stroke. Using the modeling program COMSOL Multiphysics, we simulated the brain temperature that results from the use of a cooling helmet consisting of a cap containing flowing coolant. The model incorporates convective flow of the coolant in the cap and heat conduction through the modeled layers of the head. Our model showed that cooling occurred through the predicted conduction and convection mechanisms, and our results closely matched those reported in the literature, validating the model. Initial results showed appropriate brain cooling but also damage to the scalp. The time of application and the coolant temperature were then optimized to eliminate scalp damage while maintaining effective brain cooling. Even cooling the brain by just a few degrees, as achieved in our model, reduces the extent of brain damage following cerebral ischemia. The model thus provides insight into the optimal treatment conditions for using the cooling cap in a clinical setting.
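
    The conduction-plus-convection mechanism the abstract describes can be illustrated with a toy 1-D explicit finite-difference model. This is only a sketch with made-up layer thicknesses and material properties, not the paper's COMSOL model, and it omits blood perfusion and metabolic heat generation:

```python
import numpy as np

def cool_head_1d(T_coolant=10.0, T_body=37.0, h=50.0, t_end=600.0, dt=0.01):
    """Toy 1-D transient conduction through scalp/skull/brain layers,
    cooled by convection at the outer surface. All parameter values
    are illustrative assumptions."""
    dx = 1e-3                                    # 1 mm grid spacing
    # per-node properties: 5 mm scalp, 7 mm skull, 30 mm brain
    k   = np.r_[np.full(5, 0.34),  np.full(7, 0.65),  np.full(30, 0.51)]
    rho = np.r_[np.full(5, 1100.), np.full(7, 1500.), np.full(30, 1050.)]
    cp  = np.r_[np.full(5, 3500.), np.full(7, 1300.), np.full(30, 3600.)]
    T = np.full(k.size, T_body)
    for _ in range(int(t_end / dt)):             # explicit Euler in time
        cond = k[:-1] * np.diff(T) / dx          # conductive flux between nodes
        dT = np.empty_like(T)
        dT[0] = h * (T_coolant - T[0]) + cond[0]   # convective surface node
        dT[1:-1] = cond[1:] - cond[:-1]            # interior energy balance
        dT[-1] = -cond[-1]                         # insulated inner boundary
        T += dt * dT / (rho * cp * dx)
    return T                                     # temperature profile, degC
```

    Even this crude model reproduces the qualitative finding: the scalp nodes nearest the coolant cool fastest, which is why coolant temperature and application time had to be tuned against scalp damage.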

    Fast ConvNets Using Group-wise Brain Damage

    We revisit the idea of brain damage, i.e., the pruning of the coefficients of a neural network, and suggest how brain damage can be modified and used to speed up convolutional layers. The approach exploits the fact that many efficient implementations reduce generalized convolutions to matrix multiplications. The suggested brain damage process prunes the convolutional kernel tensor in a group-wise fashion by adding group-sparsity regularization to the standard training process. After such group-wise pruning, convolutions can be reduced to multiplications of thinned dense matrices, which leads to a speedup. In a comparison on AlexNet, the method achieves very competitive performance.
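
    The group structure is the key design choice: each group gathers the kernel entries that map to a single column of the lowered (im2col) matrix, so zeroing a whole group thins that matrix rather than just scattering zeros. A minimal NumPy sketch of such a group-lasso penalty, under our reading of the abstract (the function and shapes are illustrative assumptions):

```python
import numpy as np

def group_sparsity_penalty(W, lam=1e-4):
    """Group-lasso regularizer for a conv kernel tensor W of shape
    (out_channels, in_channels, kh, kw). Each group is the vector of
    weights at one (input channel, spatial offset) position across all
    output filters; driving a group to zero removes one column of the
    im2col matrix multiplication, leaving a smaller dense product."""
    group_norms = np.sqrt((W ** 2).sum(axis=0))   # (in_channels, kh, kw)
    return lam * group_norms.sum()                # l2 within, l1 across groups

# example: added to the training loss of an AlexNet-like first layer
W = np.random.randn(96, 3, 11, 11)
reg = group_sparsity_penalty(W)
```

    The l2-within / l1-across structure is what pushes entire groups to exactly zero during training, in contrast to plain weight decay, which merely shrinks individual coefficients.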