Tensor Contraction Layers for Parsimonious Deep Nets
Tensors offer a natural representation for many kinds of data frequently
encountered in machine learning. Images, for example, are naturally represented
as third order tensors, where the modes correspond to height, width, and
channels. Tensor methods are noted for their ability to discover
multi-dimensional dependencies, and tensor decompositions, in particular, have
been used to produce compact low-rank approximations of data. In this paper, we
explore the use of tensor contractions as neural network layers and investigate
several ways to apply them to activation tensors. Specifically, we propose the
Tensor Contraction Layer (TCL), the first attempt to incorporate tensor
contractions as end-to-end trainable neural network layers. Applied to existing
networks, TCLs reduce the dimensionality of the activation tensors and thus the
number of model parameters. We evaluate the TCL on the task of image
recognition, augmenting two popular networks (AlexNet, VGG). The resulting
models are trainable end-to-end. Applying the TCL to the task of image
recognition, using the CIFAR100 and ImageNet datasets, we evaluate the effect
of parameter reduction via tensor contraction on performance. We demonstrate
significant model compression without significant impact on the accuracy and,
in some cases, improved performance
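
To make the contraction described in this abstract concrete, the following minimal NumPy sketch contracts each non-batch mode of an activation tensor with a factor matrix. The function name, the shapes, and the use of `einsum` are assumptions chosen for illustration, not details taken from the paper. In a trained network the factors would be learned end-to-end rather than sampled at random.

```python
import numpy as np

def tensor_contraction_layer(x, factors):
    """Contract each non-batch mode of x with one factor matrix.

    x:       activations of shape (batch, d1, d2, d3)
    factors: three matrices with shapes (r1, d1), (r2, d2), (r3, d3)
    returns: array of shape (batch, r1, r2, r3)
    """
    v1, v2, v3 = factors
    # Mode-wise products: sum over i, j, k; keep the batch index b.
    return np.einsum("bijk,pi,qj,rk->bpqr", x, v1, v2, v3)

# Hypothetical sizes: a batch of two 8 x 8 x 16 activation tensors.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 8, 16))
factors = [rng.standard_normal((r, d)) for r, d in [(4, 8), (4, 8), (6, 16)]]

y = tensor_contraction_layer(x, factors)

# Parameters of the contraction versus a dense layer that maps the
# flattened input to a flattened output of the same size.
tcl_params = sum(f.size for f in factors)
dense_params = x[0].size * y[0].size
print(y.shape, tcl_params, dense_params)
```

The comparison at the end shows where the parameter saving comes from: the contraction stores one small matrix per mode, while a dense layer on the flattened tensor needs one weight per input-output pair.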
Tensor Regression Networks
Convolutional neural networks typically consist of many convolutional layers
followed by one or more fully connected layers. While convolutional layers map
between high-order activation tensors, the fully connected layers operate on
flattened activation vectors. Despite empirical success, this approach has
notable drawbacks. Flattening followed by fully connected layers discards
multilinear structure in the activations and requires many parameters. We
address these problems by incorporating tensor algebraic operations that
preserve multilinear structure at every layer. First, we introduce Tensor
Contraction Layers (TCLs) that reduce the dimensionality of their input while
preserving their multilinear structure using tensor contraction. Next, we
introduce Tensor Regression Layers (TRLs), which express outputs through a
low-rank multilinear mapping from a high-order activation tensor to an output
tensor of arbitrary order. We learn the contraction and regression factors
end-to-end, and produce accurate nets with fewer parameters. Additionally, our
layers regularize networks by imposing low-rank constraints on the activations
(TCL) and regression weights (TRL). Experiments on ImageNet show that, applied
to VGG and ResNet architectures, TCLs and TRLs reduce the number of parameters
compared to fully connected layers by more than 65% while maintaining or
increasing accuracy. In addition to the space savings, our approach's ability
to leverage topological structure can be crucial for structured data such as
MRI. In particular, we demonstrate significant performance improvements over
comparable architectures on three tasks associated with the UK Biobank dataset
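
The low-rank regression idea in this abstract can be sketched in NumPy as follows. The sketch assumes a Tucker-structured weight tensor (a small core plus one factor matrix per mode); the function name, the ranks, and the shapes are hypothetical and chosen only for illustration, and the abstract itself does not fix the form of the low-rank constraint.

```python
import numpy as np

def tensor_regression_layer(x, core, factors, bias):
    """Map activations (batch, d1, d2, d3) to outputs (batch, n_out).

    The full weight tensor of shape (d1, d2, d3, n_out) is never stored
    as a parameter; it is rebuilt from a core and four factor matrices.
    """
    u1, u2, u3, u_out = factors
    # Rebuild the weights from the assumed Tucker factorization.
    w = np.einsum("pqrs,ip,jq,kr,os->ijko", core, u1, u2, u3, u_out)
    # Inner product between each activation tensor and the weights.
    return np.einsum("bijk,ijko->bo", x, w) + bias

rng = np.random.default_rng(0)
dims, n_out = (8, 8, 16), 10   # hypothetical activation and output sizes
ranks = (3, 3, 4, 5)           # hypothetical ranks, one per mode

x = rng.standard_normal((2, *dims))
core = rng.standard_normal(ranks)
factors = [rng.standard_normal((d, r)) for d, r in zip((*dims, n_out), ranks)]
bias = np.zeros(n_out)

y = tensor_regression_layer(x, core, factors, bias)

# Weights of a dense layer on the flattened input versus the factorized form.
full_params = int(np.prod(dims)) * n_out
low_rank_params = core.size + sum(f.size for f in factors)
print(y.shape, full_params, low_rank_params)
```

Lower ranks shrink the parameter count further but also restrict the mappings the layer can express, which is the regularizing effect the abstract refers to.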
Learning Causal State Representations of Partially Observable Environments
Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose mechanisms to approximate causal states, which optimally compress the joint history of actions and observations in partially-observable Markov decision processes. Our proposed algorithm extracts causal state representations from RNNs that are trained to predict subsequent observations given the history. We demonstrate that these learned task-agnostic state abstractions can be used to efficiently learn policies for reinforcement learning problems with rich observation spaces. We evaluate agents using multiple partially observable navigation tasks with both discrete (GridWorld) and continuous (VizDoom, ALE) observation processes that cannot be solved by traditional memory-limited methods. Our experiments demonstrate systematic improvement of the DQN and tabular models using approximate causal state representations with respect to recurrent-DQN baselines trained with raw inputs
Algebraic Comparison of Partial Lists in Bioinformatics
The outcome of a functional genomics pipeline is usually a partial list of
genomic features, ranked by their relevance in modelling biological phenotype
in terms of a classification or regression model. Due to resampling protocols
or just within a meta-analysis comparison, instead of one list it is often the
case that sets of alternative feature lists (possibly of different lengths) are
obtained. Here we introduce a method, based on the algebraic theory of
symmetric groups, for studying the variability between lists ("list stability")
in the case of lists of unequal length. We provide algorithms evaluating
stability for lists embedded in the full feature set or just limited to the
features occurring in the partial lists. The method is demonstrated first on
synthetic data in a gene filtering task and then for finding gene profiles on a
recent prostate cancer dataset
Acute febrile illness is associated with Rickettsia spp infection in dogs
BACKGROUND: Rickettsia conorii is transmitted by Rhipicephalus sanguineus ticks and causes Mediterranean Spotted Fever (MSF) in humans. Although dogs are considered the natural host of the vector, the clinical and epidemiological significance of R. conorii infection in dogs remains unclear. The aim of this prospective study was to investigate whether Rickettsia infection causes febrile illness in dogs living in areas endemic for human MSF. METHODS: Dogs from southern Italy with acute fever (n = 99) were compared with case–control dogs with normal body temperatures (n = 72). Serology and real-time PCR were performed for Rickettsia spp., Ehrlichia canis, Anaplasma phagocytophilum/A. platys and Leishmania infantum. Conventional PCR was performed for Babesia spp. and Hepatozoon spp. Acute and convalescent antibodies to R. conorii, E. canis and A. phagocytophilum were determined. RESULTS: The seroprevalence rates at first visit for R. conorii, E. canis, A. phagocytophilum and L. infantum were 44.8%, 48.5%, 37.8% and 17.6%, respectively. The seroconversion rates for R. conorii, E. canis and A. phagocytophilum were 20.7%, 14.3% and 8.8%, respectively. The molecular positive rates at first visit for Rickettsia spp., E. canis, A. phagocytophilum, A. platys, L. infantum, Babesia spp. and Hepatozoon spp. were 1.8%, 4.1%, 0%, 2.3%, 11.1%, 2.3% and 0.6%, respectively. Positive PCR results for E. canis (7%), Rickettsia spp. (3%), Babesia spp. (4.0%) and Hepatozoon spp. (1.0%) were found only in febrile dogs. The DNA sequences obtained from Rickettsia and Babesia PCR-positive samples were 100% identical to the R. conorii and Babesia vogeli sequences in GenBank®, respectively. Febrile illness was statistically associated with acute and convalescent positive R. conorii antibodies, seroconversion to R. conorii, E. canis positive PCR, and positivity to any tick pathogen PCR. Fourteen febrile dogs (31.8%) were diagnosed with Rickettsia spp. infection based on seroconversion and/or PCR, while only six afebrile dogs (12.5%) seroconverted (P = 0.0248). The most common clinical findings in dogs with Rickettsia infection diagnosed by seroconversion and/or PCR were fever, myalgia, lameness, elevation of C-reactive protein, thrombocytopenia and hypoalbuminemia. CONCLUSIONS: This study demonstrates acute febrile illness associated with Rickettsia infection in dogs living in endemic areas of human MSF based on seroconversion alone or in combination with PCR
Born Again Neural Networks
Knowledge distillation (KD) consists of transferring knowledge from one
machine learning model (the teacher) to another (the student). Commonly, the
teacher is a high-capacity model with formidable performance, while the student
is more compact. By transferring knowledge, one hopes to benefit from the
student's compactness. We study KD from a new perspective: rather than
compressing models,
we train students parameterized identically to their teachers. Surprisingly,
these Born-Again Networks (BANs) outperform their teachers significantly,
both on computer vision and language modeling tasks. Our experiments with BANs
based on DenseNets demonstrate state-of-the-art performance on the CIFAR-10
(3.5%) and CIFAR-100 (15.5%) datasets, by validation error. Additional
experiments explore two distillation objectives: (i) Confidence-Weighted by
Teacher Max (CWTM) and (ii) Dark Knowledge with Permuted Predictions (DKPP).
Both methods elucidate the essential components of KD, demonstrating a role of
the teacher outputs on both predicted and non-predicted classes. We present
experiments with students of various capacities, focusing on the under-explored
case where students overpower teachers. Our experiments show significant
advantages from transferring knowledge between DenseNets and ResNets in either
direction. Comment: Published @ICML 201
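
The knowledge transfer described in this abstract is commonly implemented as a cross-entropy between the teacher's and the student's output distributions. The NumPy sketch below shows that standard soft-target loss; it is an assumption about the general technique, not the paper's exact objective, and the function names, the `temperature` parameter, and the example logits are hypothetical. In the born-again setting the student would simply share the teacher's architecture.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax; subtracting the row maximum avoids overflow."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Mean cross-entropy of student predictions against teacher soft targets."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature))
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())

# Hypothetical logits for one example with three classes.
teacher_logits = np.array([[2.0, 1.0, 0.1]])
student_logits = np.array([[0.5, 0.5, 0.5]])

loss_mismatch = distillation_loss(student_logits, teacher_logits)
loss_match = distillation_loss(teacher_logits, teacher_logits)
print(loss_mismatch, loss_match)
```

Because the targets are full distributions rather than one-hot labels, the loss depends on the teacher's outputs for the non-predicted classes as well as the predicted one. It is smallest when the student reproduces the teacher's distribution exactly, in which case it equals the entropy of that distribution.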
Detection of Leishmania infantum DNA mainly in Rhipicephalus sanguineus male ticks removed from dogs living in endemic areas of canine leishmaniosis
Background: Sand flies are the only biologically adapted vectors of Leishmania parasites; however, a possible role in the transmission of Leishmania has been proposed for other hematophagous ectoparasites such as ticks. In order to evaluate natural infection by Leishmania infantum in Rhipicephalus sanguineus ticks, taking into account its close association with dogs, 128 adult R. sanguineus ticks removed from 41 dogs living in endemic areas of canine leishmaniosis were studied. Methods: Individual DNA extraction was performed from each tick and from whole blood taken from dogs. Dog sera were tested for IgG antibodies to L. infantum antigen by ELISA, and L. infantum real-time PCR was performed on canine whole blood samples and ticks. Results: Leishmania infantum PCR was positive in 13 ticks (10.1%), including one female (2.0%) and 12 males (15.2%), and in only five dogs (12.2%). Male ticks had a significantly higher infection rate than female R. sanguineus ticks. The percentage of L. infantum seroreactive dogs was 19.5%. All but two PCR positive dogs were seroreactive. Leishmania infantum PCR positive ticks were removed from seropositive and seronegative dogs with a variety of PCR results. Conclusions: This study demonstrates a high prevalence of L. infantum DNA in R. sanguineus ticks removed from L. infantum seropositive and seronegative dogs. L. infantum DNA was detected mainly in male ticks, possibly due to their ability to move between canine hosts and feed on several canine hosts during the adult life stage. Additional studies are needed to further explore the role of R. sanguineus ticks, and in particular male adults, in both the epidemiology and immunology of L. infantum infection in dogs in endemic areas
