334 research outputs found

    Tensor Contraction Layers for Parsimonious Deep Nets

    Tensors offer a natural representation for many kinds of data frequently encountered in machine learning. Images, for example, are naturally represented as third-order tensors, where the modes correspond to height, width, and channels. Tensor methods are noted for their ability to discover multi-dimensional dependencies, and tensor decompositions, in particular, have been used to produce compact low-rank approximations of data. In this paper, we explore the use of tensor contractions as neural network layers and investigate several ways to apply them to activation tensors. Specifically, we propose the Tensor Contraction Layer (TCL), the first attempt to incorporate tensor contractions as end-to-end trainable neural network layers. Applied to existing networks, TCLs reduce the dimensionality of the activation tensors and thus the number of model parameters. We evaluate the TCL on the task of image recognition, augmenting two popular networks (AlexNet, VGG). The resulting models are trainable end-to-end. Using the CIFAR100 and ImageNet datasets, we evaluate the effect of parameter reduction via tensor contraction on performance. We demonstrate significant model compression without significant impact on the accuracy and, in some cases, improved performance.
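
    The contraction described in this abstract can be illustrated with a minimal NumPy sketch. It assumes the layer multiplies each non-batch mode of a fourth-order activation tensor by its own factor matrix; the function name, shapes, and ranks below are hypothetical and are not taken from the paper.

```python
import numpy as np

def tensor_contraction_layer(x, factors):
    """Contract each non-batch mode of x with one factor matrix.

    x:       activations of shape (batch, I1, I2, I3)
    factors: three matrices of shape (R_k, I_k), one per mode
    returns: tensor of shape (batch, R1, R2, R3)
    """
    v1, v2, v3 = factors
    # Each index i, j, k of the input is summed against one factor matrix.
    return np.einsum("nijk,ai,bj,ck->nabc", x, v1, v2, v3)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 7, 7))      # hypothetical activation shape
mode_sizes = [(8, 16), (4, 7), (4, 7)]      # hypothetical (rank, input size) pairs
factors = [rng.standard_normal(shape) for shape in mode_sizes]

y = tensor_contraction_layer(x, factors)
n_params = sum(f.size for f in factors)     # 8*16 + 4*7 + 4*7
print(y.shape, n_params)
```

    In this toy setting each sample shrinks from 16 x 7 x 7 = 784 activations to 8 x 4 x 4 = 128, so any layer that follows sees a much smaller input, while the contraction itself adds only the entries of the three factor matrices.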

    Tensor Regression Networks

    Convolutional neural networks typically consist of many convolutional layers followed by one or more fully connected layers. While convolutional layers map between high-order activation tensors, the fully connected layers operate on flattened activation vectors. Despite empirical success, this approach has notable drawbacks. Flattening followed by fully connected layers discards multilinear structure in the activations and requires many parameters. We address these problems by incorporating tensor algebraic operations that preserve multilinear structure at every layer. First, we introduce Tensor Contraction Layers (TCLs) that reduce the dimensionality of their input while preserving its multilinear structure using tensor contraction. Next, we introduce Tensor Regression Layers (TRLs), which express outputs through a low-rank multilinear mapping from a high-order activation tensor to an output tensor of arbitrary order. We learn the contraction and regression factors end-to-end, and produce accurate nets with fewer parameters. Additionally, our layers regularize networks by imposing low-rank constraints on the activations (TCL) and regression weights (TRL). Experiments on ImageNet show that, applied to VGG and ResNet architectures, TCLs and TRLs reduce the number of parameters compared to fully connected layers by more than 65% while maintaining or increasing accuracy. In addition to the space savings, our approach's ability to leverage topological structure can be crucial for structured data such as MRI. In particular, we demonstrate significant performance improvements over comparable architectures on three tasks associated with the UK Biobank dataset.
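
    The low-rank regression idea can be sketched as follows. This is an illustration only: it assumes a Tucker-style factorization of the regression weights (a small core tensor plus one factor matrix per mode), which the abstract does not spell out, and every name, shape, and rank is hypothetical.

```python
import numpy as np

def tensor_regression_layer(x, core, factors, out_factor, bias):
    """Map activations (batch, I, J, K) to outputs (batch, O) without flattening.

    The full weight tensor of shape (I, J, K, O) is never stored; it is
    implied by a core of shape (a, b, c, d) and four factor matrices.
    """
    u1, u2, u3 = factors
    return np.einsum("nijk,abcd,ia,jb,kc,od->no",
                     x, core, u1, u2, u3, out_factor, optimize=True) + bias

rng = np.random.default_rng(0)
I, J, K, O = 16, 7, 7, 10                   # hypothetical input modes and output size
ranks = (4, 3, 3, 5)                        # hypothetical ranks
x = rng.standard_normal((8, I, J, K))
core = rng.standard_normal(ranks)
factors = [rng.standard_normal((dim, r)) for dim, r in zip((I, J, K), ranks[:3])]
out_factor = rng.standard_normal((O, ranks[3]))
bias = np.zeros(O)

y = tensor_regression_layer(x, core, factors, out_factor, bias)

full_params = I * J * K * O                 # dense weights of a flattened layer
low_rank_params = core.size + sum(f.size for f in factors) + out_factor.size
print(y.shape, full_params, low_rank_params)
```

    With these toy sizes the dense layer would need 7840 weights against 336 for the factorized form. The ratio depends entirely on the chosen ranks and is not meant to reproduce the savings reported in the abstract.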

    Learning Causal State Representations of Partially Observable Environments

    Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose mechanisms to approximate causal states, which optimally compress the joint history of actions and observations in partially-observable Markov decision processes. Our proposed algorithm extracts causal state representations from RNNs that are trained to predict subsequent observations given the history. We demonstrate that these learned task-agnostic state abstractions can be used to efficiently learn policies for reinforcement learning problems with rich observation spaces. We evaluate agents using multiple partially observable navigation tasks with both discrete (GridWorld) and continuous (VizDoom, ALE) observation processes that cannot be solved by traditional memory-limited methods. Our experiments demonstrate systematic improvement of the DQN and tabular models using approximate causal state representations with respect to recurrent-DQN baselines trained with raw inputs.

    Algebraic Comparison of Partial Lists in Bioinformatics

    The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling biological phenotype in terms of a classification or regression model. Due to resampling protocols or just within a meta-analysis comparison, instead of one list it is often the case that sets of alternative feature lists (possibly of different lengths) are obtained. Here we introduce a method, based on the algebraic theory of symmetric groups, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated first on synthetic data in a gene filtering task and then for finding gene profiles on a recent prostate cancer dataset.

    Acute febrile illness is associated with Rickettsia spp infection in dogs

    BACKGROUND: Rickettsia conorii is transmitted by Rhipicephalus sanguineus ticks and causes Mediterranean Spotted Fever (MSF) in humans. Although dogs are considered the natural host of the vector, the clinical and epidemiological significance of R. conorii infection in dogs remains unclear. The aim of this prospective study was to investigate whether Rickettsia infection causes febrile illness in dogs living in areas endemic for human MSF. METHODS: Dogs from southern Italy with acute fever (n = 99) were compared with case–control dogs with normal body temperatures (n = 72). Serology and real-time PCR were performed for Rickettsia spp., Ehrlichia canis, Anaplasma phagocytophilum/A. platys and Leishmania infantum. Conventional PCR was performed for Babesia spp. and Hepatozoon spp. Acute and convalescent antibodies to R. conorii, E. canis and A. phagocytophilum were determined. RESULTS: The seroprevalence rates at first visit for R. conorii, E. canis, A. phagocytophilum and L. infantum were 44.8%, 48.5%, 37.8% and 17.6%, respectively. The seroconversion rates for R. conorii, E. canis and A. phagocytophilum were 20.7%, 14.3% and 8.8%, respectively. The molecular positive rates at first visit for Rickettsia spp., E. canis, A. phagocytophilum, A. platys, L. infantum, Babesia spp. and Hepatozoon spp. were 1.8%, 4.1%, 0%, 2.3%, 11.1%, 2.3% and 0.6%, respectively. Positive PCR for E. canis (7%), Rickettsia spp. (3%), Babesia spp. (4.0%) and Hepatozoon spp. (1.0%) were found only in febrile dogs. The DNA sequences obtained from Rickettsia and Babesia PCR-positive samples were 100% identical to the R. conorii and Babesia vogeli sequences in GenBank®, respectively. Febrile illness was statistically associated with acute and convalescent positive R. conorii antibodies, seroconversion to R. conorii, E. canis positive PCR, and positivity to any tick pathogen PCRs. Fourteen febrile dogs (31.8%) were diagnosed with Rickettsia spp. infection based on seroconversion and/or PCR while only six afebrile dogs (12.5%) seroconverted (P = 0.0248). The most common clinical findings of dogs with Rickettsia infection diagnosed by seroconversion and/or PCR were fever, myalgia, lameness, elevation of C-reactive protein, thrombocytopenia and hypoalbuminemia. CONCLUSIONS: This study demonstrates acute febrile illness associated with Rickettsia infection in dogs living in endemic areas of human MSF based on seroconversion alone or in combination with PCR.
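
    The febrile versus afebrile comparison can be checked with a short worked example. Two things here are assumptions rather than statements from the abstract: the group sizes of 44 and 48 dogs are inferred from the reported percentages (14 is 31.8% of 44, and 6 is 12.5% of 48), and the test is taken to be a Pearson chi-square test without continuity correction, since the abstract does not name the test used.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    The table is [[a, b], [c, d]], with rows as groups and columns as outcomes.
    Returns the test statistic and its p-value.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Group totals are inferred from the percentages, not given in the abstract.
febrile_positive, febrile_total = 14, 44
afebrile_positive, afebrile_total = 6, 48

stat, p_value = chi_square_2x2(
    febrile_positive, febrile_total - febrile_positive,
    afebrile_positive, afebrile_total - afebrile_positive,
)
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
```

    Under these assumptions the statistic is about 5.04 and the p-value lands close to the reported P = 0.0248, which suggests, but does not confirm, that the inferred table and test match the ones the authors used.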

    Born Again Neural Networks

    Knowledge distillation (KD) consists of transferring knowledge from one machine learning model (the teacher) to another (the student). Commonly, the teacher is a high-capacity model with formidable performance, while the student is more compact. By transferring knowledge, one hopes to benefit from the student's compactness. We study KD from a new perspective: rather than compressing models, we train students parameterized identically to their teachers. Surprisingly, these Born-Again Networks (BANs) outperform their teachers significantly, both on computer vision and language modeling tasks. Our experiments with BANs based on DenseNets demonstrate state-of-the-art performance on the CIFAR-10 (3.5%) and CIFAR-100 (15.5%) datasets, by validation error. Additional experiments explore two distillation objectives: (i) Confidence-Weighted by Teacher Max (CWTM) and (ii) Dark Knowledge with Permuted Predictions (DKPP). Both methods elucidate the essential components of KD, demonstrating a role of the teacher outputs on both predicted and non-predicted classes. We present experiments with students of various capacities, focusing on the under-explored case where students overpower teachers. Our experiments show significant advantages from transferring knowledge between DenseNets and ResNets in either direction. Comment: Published @ICML 201
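
    The basic distillation objective mentioned in this abstract can be sketched in plain Python. This assumes the common formulation in which the student is trained on the cross-entropy between the teacher's output distribution and its own; it does not implement the CWTM or DKPP variants, and the logits are made-up values.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits):
    """Cross-entropy of the student's distribution against the teacher's."""
    p_teacher = softmax(teacher_logits)
    p_student = softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))

teacher_logits = [2.0, 0.5, -1.0]           # hypothetical teacher outputs
matched = distillation_loss(teacher_logits, teacher_logits)
mismatched = distillation_loss([-1.0, 0.5, 2.0], teacher_logits)
print(matched, mismatched)
```

    The loss is smallest when the student reproduces the teacher's full distribution, including the probabilities on non-predicted classes, which is the signal the abstract refers to when it discusses the role of teacher outputs on those classes.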

    Detection of Leishmania infantum DNA mainly in Rhipicephalus sanguineus male ticks removed from dogs living in endemic areas of canine leishmaniosis

    Background: Sand flies are the only biologically adapted vectors of Leishmania parasites; however, a possible role in the transmission of Leishmania has been proposed for other hematophagous ectoparasites such as ticks. In order to evaluate natural infection by Leishmania infantum in Rhipicephalus sanguineus ticks, taking into account its close association with dogs, 128 adult R. sanguineus ticks removed from 41 dogs living in endemic areas of canine leishmaniosis were studied. Methods: Individual DNA extraction was performed from each tick and from whole blood taken from dogs. Dog sera were tested for IgG antibodies to L. infantum antigen by ELISA, and L. infantum real-time PCR was performed on canine whole blood samples and ticks. Results: Leishmania infantum PCR was positive in 13 ticks (10.1%), including one female (2.0%) and 12 males (15.2%), and in only five dogs (12.2%). Male ticks had a significantly higher infection rate than female R. sanguineus ticks. The percentage of L. infantum seroreactive dogs was 19.5%. All but two PCR positive dogs were seroreactive. Leishmania infantum PCR positive ticks were removed from seropositive and seronegative dogs with a variety of PCR results. Conclusions: This study demonstrates high prevalence of L. infantum DNA in R. sanguineus ticks removed from L. infantum seropositive and seronegative dogs. The presence of L. infantum DNA was detected mainly in male ticks, possibly due to their ability to move between canine hosts and feed on several canine hosts during the adult life stage. Additional studies are needed to further explore the role of R. sanguineus ticks and, in particular, male adults, in both the epidemiology and immunology of L. infantum infection in dogs in endemic areas.