21 research outputs found

    Tensor Contraction Layers for Parsimonious Deep Nets

    Tensors offer a natural representation for many kinds of data frequently encountered in machine learning. Images, for example, are naturally represented as third-order tensors, where the modes correspond to height, width, and channels. Tensor methods are noted for their ability to discover multi-dimensional dependencies, and tensor decompositions in particular have been used to produce compact low-rank approximations of data. In this paper, we explore the use of tensor contractions as neural network layers and investigate several ways to apply them to activation tensors. Specifically, we propose the Tensor Contraction Layer (TCL), the first attempt to incorporate tensor contractions as end-to-end trainable neural network layers. Applied to existing networks, TCLs reduce the dimensionality of the activation tensors and thus the number of model parameters. We evaluate the TCL on the task of image recognition, augmenting two popular networks (AlexNet, VGG) on the CIFAR100 and ImageNet datasets; the resulting models remain trainable end-to-end, and we measure the effect of parameter reduction via tensor contraction on performance. We demonstrate significant model compression without significant impact on accuracy and, in some cases, improved performance.
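
    As a rough illustration of the operation described above, the sketch below contracts each non-batch mode of an activation tensor with a small factor matrix, shrinking the activations and, with them, the number of downstream parameters. This is a minimal NumPy sketch: the function name, factor shapes, and random data are illustrative assumptions rather than the paper's implementation, and in a real TCL the factors would be trained end-to-end with the rest of the network.

```python
import numpy as np

def tensor_contraction_layer(x, factors):
    """Contract each non-batch mode of x with a small factor matrix.

    x       : activation tensor of shape (batch, d1, d2, d3)
    factors : [V1 of shape (r1, d1), V2 of (r2, d2), V3 of (r3, d3)]
    returns : contracted tensor of shape (batch, r1, r2, r3)
    """
    v1, v2, v3 = factors
    # All three mode-wise contractions expressed as a single einsum.
    return np.einsum('bijk,ai,cj,dk->bacd', x, v1, v2, v3)

# Toy example: shrink a (32, 32, 64) activation map to (8, 8, 16).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 32, 32, 64))          # batch of 4
factors = [rng.standard_normal((8, 32)),
           rng.standard_normal((8, 32)),
           rng.standard_normal((16, 64))]
print(tensor_contraction_layer(x, factors).shape)  # (4, 8, 8, 16)
```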

    StrassenNets: Deep Learning with a Multiplication Budget

    A large fraction of the arithmetic operations required to evaluate deep neural networks (DNNs) consists of matrix multiplications, in both convolution and fully connected layers. We perform end-to-end learning of low-cost approximations of matrix multiplications in DNN layers by casting matrix multiplications as 2-layer sum-product networks (SPNs) (arithmetic circuits) and learning their (ternary) edge weights from data. The SPNs disentangle multiplication and addition operations and enable us to impose a budget on the number of multiplication operations. Combining our method with knowledge distillation and applying it to image classification DNNs (trained on ImageNet) and language modeling DNNs (using LSTMs), we obtain a first-of-a-kind reduction in the number of multiplications (over 99.5%) while maintaining the predictive performance of the full-precision models. Finally, we demonstrate that the proposed framework is able to rediscover Strassen’s matrix multiplication algorithm, learning to multiply 2×2 matrices using only 7 multiplications instead of 8.
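
    The 2-layer sum-product form used above can be written as vec(C) ≈ Wc((Wa vec(A)) ⊙ (Wb vec(B))), where the width of the hidden layer is the multiplication budget. The sketch below is a hand-coded instance of that form for 2×2 matrices using Strassen's known ternary weights (7 multiplications); in StrassenNets the ternary weights would be learned from data rather than fixed as they are here.

```python
import numpy as np

# Ternary SPN weights encoding Strassen's 2x2 algorithm (7 multiplications).
# Rows of Wa/Wb form the factors of the products M1..M7 from the row-major
# flattenings of A and B; Wc recombines M1..M7 into vec(C).
Wa = np.array([[ 1, 0, 0, 1],
               [ 0, 0, 1, 1],
               [ 1, 0, 0, 0],
               [ 0, 0, 0, 1],
               [ 1, 1, 0, 0],
               [-1, 0, 1, 0],
               [ 0, 1, 0, -1]])
Wb = np.array([[ 1, 0, 0, 1],
               [ 1, 0, 0, 0],
               [ 0, 1, 0, -1],
               [-1, 0, 1, 0],
               [ 0, 0, 0, 1],
               [ 1, 1, 0, 0],
               [ 0, 0, 1, 1]])
Wc = np.array([[ 1, 0, 0, 1, -1, 0, 1],
               [ 0, 0, 1, 0, 1, 0, 0],
               [ 0, 1, 0, 1, 0, 0, 0],
               [ 1, -1, 1, 0, 0, 1, 0]])

def spn_matmul(A, B):
    """vec(C) = Wc @ ((Wa @ vec(A)) * (Wb @ vec(B))), 7 true multiplications."""
    m = (Wa @ A.reshape(4)) * (Wb @ B.reshape(4))   # the only multiplications
    return (Wc @ m).reshape(2, 2)

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
assert np.allclose(spn_matmul(A, B), A @ B)
```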

    Tensor Regression Networks

    Convolutional neural networks typically consist of many convolutional layers followed by one or more fully connected layers. While convolutional layers map between high-order activation tensors, the fully connected layers operate on flattened activation vectors. Despite empirical success, this approach has notable drawbacks. Flattening followed by fully connected layers discards multilinear structure in the activations and requires many parameters. We address these problems by incorporating tensor algebraic operations that preserve multilinear structure at every layer. First, we introduce Tensor Contraction Layers (TCLs) that reduce the dimensionality of their input while preserving their multilinear structure using tensor contraction. Next, we introduce Tensor Regression Layers (TRLs), which express outputs through a low-rank multilinear mapping from a high-order activation tensor to an output tensor of arbitrary order. We learn the contraction and regression factors end-to-end, and produce accurate nets with fewer parameters. Additionally, our layers regularize networks by imposing low-rank constraints on the activations (TCL) and regression weights (TRL). Experiments on ImageNet show that, applied to VGG and ResNet architectures, TCLs and TRLs reduce the number of parameters compared to fully connected layers by more than 65% while maintaining or increasing accuracy. In addition to the space savings, our approach's ability to leverage topological structure can be crucial for structured data such as MRI. In particular, we demonstrate significant performance improvements over comparable architectures on three tasks associated with the UK Biobank dataset.
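
    Below is a minimal sketch of the kind of low-rank multilinear regression a TRL performs: the regression weight tensor is kept in Tucker-factorised form and contracted directly with the activation tensor, so the full-size weight tensor is never materialised. The shapes, names, and ranks are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def tensor_regression_layer(x, core, factors, w_out, bias):
    """Map an activation tensor to class scores through a Tucker-factorised
    regression weight tensor.

    x       : (batch, d1, d2, d3) activation tensor
    core    : (r1, r2, r3, r_out) Tucker core
    factors : [U1 of (d1, r1), U2 of (d2, r2), U3 of (d3, r3)]
    w_out   : (r_out, n_classes), bias : (n_classes,)
    """
    u1, u2, u3 = factors
    # Project each mode of x onto its low-rank subspace ...
    z = np.einsum('bijk,ia,jc,kd->bacd', x, u1, u2, u3)
    # ... then contract with the core and map into the output space.
    return np.einsum('bacd,acde,ef->bf', z, core, w_out) + bias

# Toy example: 8x8x32 activations regressed onto 10 classes, ranks (3, 3, 4, 5).
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 8, 32))
factors = [rng.standard_normal((8, 3)),
           rng.standard_normal((8, 3)),
           rng.standard_normal((32, 4))]
core = rng.standard_normal((3, 3, 4, 5))
scores = tensor_regression_layer(x, core, factors,
                                 rng.standard_normal((5, 10)), np.zeros(10))
print(scores.shape)   # (2, 10)
```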

    Stochastic Activation Pruning for Robust Adversarial Defense

    Neural networks are known to be vulnerable to adversarial examples. Carefully chosen perturbations to real images, while imperceptible to humans, induce misclassification and threaten the reliability of deep learning systems in the wild. To guard against adversarial examples, we take inspiration from game theory and cast the problem as a minimax zero-sum game between the adversary and the model. In general, for such games, the optimal strategy for both players requires a stochastic policy, also known as a mixed strategy. In this light, we propose Stochastic Activation Pruning (SAP), a mixed strategy for adversarial defense. SAP prunes a random subset of activations (preferentially pruning those with smaller magnitude) and scales up the survivors to compensate. We can apply SAP to pretrained networks, including adversarially trained models, without fine-tuning, providing robustness against adversarial examples. Experiments demonstrate that SAP confers robustness against attacks, increasing accuracy and preserving calibration.
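
    A minimal sketch of the sampling scheme described above follows, under the assumption that activations are drawn with replacement in proportion to their magnitude and that survivors are rescaled by the inverse of their probability of being kept, so the pruned layer matches the original in expectation. The function name and parameters are illustrative, not the paper's code.

```python
import numpy as np

def stochastic_activation_pruning(a, num_samples, rng):
    """Prune activations stochastically, preferentially dropping small ones.

    a           : 1-D activation vector (e.g. a flattened layer output)
    num_samples : number of draws with replacement; fewer draws => more pruning
    """
    probs = np.abs(a) / np.abs(a).sum()      # sample proportionally to magnitude
    drawn = rng.choice(a.size, size=num_samples, replace=True, p=probs)
    keep = np.zeros(a.size, dtype=bool)
    keep[drawn] = True
    # Rescale survivors by the inverse of their keep probability so that the
    # pruned activations equal the originals in expectation.
    p_keep = 1.0 - (1.0 - probs[keep]) ** num_samples
    out = np.zeros_like(a)
    out[keep] = a[keep] / p_keep
    return out

rng = np.random.default_rng(0)
a = rng.standard_normal(1000)
pruned = stochastic_activation_pruning(a, num_samples=300, rng=rng)
print((pruned != 0).mean())   # fraction of activations that survive
```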

    PANC Study (Pancreatitis: A National Cohort Study): national cohort study examining the first 30 days from presentation of acute pancreatitis in the UK

    Background: Acute pancreatitis is a common, yet complex, emergency surgical presentation. Multiple guidelines exist and management can vary significantly. The aim of this first UK, multicentre, prospective cohort study was to assess the variation in management of acute pancreatitis to guide resource planning and optimize treatment.
    Methods: All patients aged 18 years or over presenting with acute pancreatitis, as per the Atlanta criteria, from March to April 2021 were eligible for inclusion and followed up for 30 days. Anonymized data were uploaded to a secure electronic database in line with local governance approvals.
    Results: A total of 113 hospitals contributed data on 2580 patients, with an equal sex distribution and a mean age of 57 years. The aetiology was gallstones in 50.6 per cent, with idiopathic the next most common (22.4 per cent). In addition to the 7.6 per cent with a diagnosis of chronic pancreatitis, 20.1 per cent of patients had a previous episode of acute pancreatitis. One in 20 patients was classed as having severe pancreatitis, as per the Atlanta criteria. The overall mortality rate was 2.3 per cent at 30 days, but rose to one in three in the severe group. Predictors of death included male sex, increased age, and frailty; previous acute pancreatitis and gallstones as aetiologies were protective. Smoking status and body mass index did not affect mortality.
    Conclusion: Most patients presenting with acute pancreatitis have a mild, self-limiting disease. Rates of idiopathic pancreatitis are high. Recurrent attacks of pancreatitis are common, but are likely to carry a reduced risk of death on subsequent admissions.

    Replication Data for: Facebook's Privacy Incident Response, a study of geolocation sharing on Facebook Messenger

    This dataset was used for a paper published on 8/11/2015 in Technology Science: http://techscience.org/a/2015081104