Pruning artificial neural networks: a way to find well-generalizing, high-entropy sharp minima
Recently, a race towards the simplification of deep networks has begun,
showing that it is effectively possible to reduce the size of these models with
minimal or no performance loss. However, there is a general lack of
understanding of why these pruning strategies are effective. In this work, we
compare and analyze solutions obtained with two different pruning
approaches, one-shot and gradual, showing the higher effectiveness of the
latter. In particular, we find that gradual pruning allows access to narrow,
well-generalizing minima, which are typically ignored when using one-shot
approaches. In this work we also propose PSP-entropy, a measure to understand
how a given neuron correlates to some specific learned classes. Interestingly,
we observe that the features extracted by iteratively-pruned models are less
correlated to specific classes, potentially making these models a better fit in
transfer learning approaches.
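The two strategies compared above can be illustrated with a minimal NumPy sketch: one-shot magnitude pruning removes the target fraction of weights in a single step, while gradual pruning ratchets the sparsity up over several steps (with fine-tuning between steps in a real training loop). The function names and the plain magnitude criterion are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """One-shot pruning: zero the smallest-magnitude entries so that
    a `sparsity` fraction of the matrix becomes zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # Threshold at the k-th smallest magnitude; keep strictly larger entries.
    threshold = np.partition(flat, k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)

def gradual_prune(weights, final_sparsity, steps):
    """Gradual pruning: approach the target sparsity over several rounds.
    In practice, a few epochs of fine-tuning would run between rounds."""
    w = weights.copy()
    for s in range(1, steps + 1):
        w = magnitude_prune(w, final_sparsity * s / steps)
        # ... fine-tune the surviving weights here ...
    return w
```

Both routines reach the same final sparsity; the paper's point is that the intermediate fine-tuning in the gradual schedule steers optimization toward different (narrower, better-generalizing) minima.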
Deploying AI Frameworks on Secure HPC Systems with Containers
The increasing interest from the research community and industry in using
Artificial Intelligence (AI) techniques to tackle "real world" problems
requires High Performance Computing (HPC) resources to efficiently compute and
scale complex algorithms across thousands of nodes. Unfortunately, typical data
scientists are not familiar with the unique requirements and characteristics of
HPC environments. They usually develop their applications with high-level
scripting languages or frameworks such as TensorFlow and the installation
process often requires connection to external systems to download open source
software during the build. HPC environments, on the other hand, are often based
on closed source applications that incorporate parallel and distributed
computing APIs such as MPI and OpenMP, while users have restricted
administrator privileges and face security restrictions such as being denied
access to external systems. In this paper we discuss the issues associated with
the deployment of AI frameworks in a secure HPC environment and how we
successfully deploy AI frameworks on SuperMUC-NG with Charliecloud.
Comment: 6 pages, 2 figures, 2019 IEEE High Performance Extreme Computing Conference
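The deployment pattern described above splits into two phases: build the container image on a host that has internet access, then transfer it and run it unprivileged on the secure HPC system. The sketch below lays out the command sequences following Charliecloud's ch-image/ch-run CLI; the image name, paths, and exact invocations are illustrative assumptions, not the paper's deployment recipe.

```python
def build_commands(tag="tensorflow", context="."):
    """Commands for a build host that *does* have internet access
    (downloading open source software is allowed here)."""
    return [
        ["ch-image", "build", "-t", tag, context],   # build from a Dockerfile
        ["ch-convert", tag, f"{tag}.tar.gz"],        # pack the image for transfer
    ]

def run_commands(image_dir="/scratch/tensorflow", script="train.py"):
    """Commands for the secure HPC system: unpack the transferred image
    and run it without root privileges or external network access."""
    return [
        ["ch-convert", "tensorflow.tar.gz", image_dir],  # unpack the image
        ["ch-run", image_dir, "--", "python3", script],  # unprivileged execution
    ]
```

The key property is that no step on the HPC side needs administrator privileges or an outbound connection, which is what makes the workflow compatible with the security restrictions listed above.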
Borrow from Anywhere: Pseudo Multi-modal Object Detection in Thermal Imagery
Can we improve detection in the thermal domain by borrowing features from
rich domains like visual RGB? In this paper, we propose a pseudo-multimodal
object detector trained on natural image domain data to help improve the
performance of object detection in thermal images. We assume access to a
large-scale dataset in the visual RGB domain and relatively smaller dataset (in
terms of instances) in the thermal domain, as is common today. We propose the
use of well-known image-to-image translation frameworks to generate pseudo-RGB
equivalents of a given thermal image and then use a multi-modal architecture
for object detection in the thermal image. We show that our framework
outperforms existing benchmarks without the explicit need for paired training
examples from the two domains. We also show that our framework is able to
learn from less thermal-domain data. Our code
and pre-trained models are made available at
https://github.com/tdchaitanya/MMTOD
Comment: Accepted at Perception Beyond Visible Spectrum Workshop, CVPR 2019
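The input pipeline implied above can be sketched as follows, assuming a single-channel thermal image. In the actual framework a trained image-to-image translation network (e.g. a CycleGAN-style generator) produces the pseudo-RGB; here a trivial channel-replication stub stands in for it, and the function names are hypothetical.

```python
import numpy as np

def translate_to_pseudo_rgb(thermal):
    """Stand-in for an image-to-image translation network. A trained
    generator would map the thermal image to a realistic pseudo-RGB image;
    this stub merely replicates the single channel three times."""
    return np.repeat(thermal[..., np.newaxis], 3, axis=-1)

def multimodal_input(thermal):
    """Pair the thermal image with its pseudo-RGB equivalent, producing the
    4-channel input that a two-branch multimodal detector could consume."""
    pseudo_rgb = translate_to_pseudo_rgb(thermal)
    return np.concatenate([thermal[..., np.newaxis], pseudo_rgb], axis=-1)
```

Because the pseudo-RGB branch is generated from the thermal image itself, no paired RGB/thermal training examples are required, which is the property the abstract highlights.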