170 research outputs found

In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

Author: Rota Bulò Samuel
Kontschieder Peter
Porzi Lorenzo
Publication date: 26/10/2018

In this work we present In-Place Activated Batch Normalization (InPlace-ABN), a novel approach to drastically reduce the training memory footprint of modern deep neural networks in a computationally efficient way. Our solution substitutes the conventionally used succession of BatchNorm + Activation layers with a single plugin layer, hence avoiding invasive framework surgery while providing straightforward applicability for existing deep learning frameworks. We obtain memory savings of up to 50% by dropping intermediate results and by recovering required information during the backward pass through the inversion of stored forward results, with only a minor increase (0.8-2%) in computation time. Also, we demonstrate how frequently used checkpointing approaches can be made computationally as efficient as InPlace-ABN. In our experiments on image classification, we demonstrate on-par results on ImageNet-1k with state-of-the-art approaches. On the memory-demanding task of semantic segmentation, we report results for COCO-Stuff, Cityscapes and Mapillary Vistas, obtaining new state-of-the-art results on the latter without additional training data but in a single-scale and -model scenario. Code can be found at https://github.com/mapillary/inplace_abn
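The inversion idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes an invertible activation (a leaky ReLU with a hypothetical slope of 0.01) and a non-zero scale `gamma`, and all function names are made up for illustration. Only the layer output `z` is kept, and the normalized input is recovered from it when needed.

```python
def leaky_relu(y, slope=0.01):
    """Invertible activation (assumed here; a plain ReLU could not be inverted)."""
    return y if y >= 0 else slope * y


def leaky_relu_inverse(z, slope=0.01):
    """Undo leaky_relu: negative outputs were scaled by `slope`."""
    return z if z >= 0 else z / slope


def forward(x_hat, gamma, beta, slope=0.01):
    """Scale-and-shift of normalized inputs followed by the activation.
    Only the returned list needs to be stored."""
    return [leaky_relu(gamma * v + beta, slope) for v in x_hat]


def recover_normalized_input(z, gamma, beta, slope=0.01):
    """Recover x_hat from the stored output z (requires gamma != 0)."""
    return [(leaky_relu_inverse(v, slope) - beta) / gamma for v in z]


# Hypothetical normalized activations and affine parameters.
x_hat = [-1.5, -0.5, 0.0, 0.5, 1.5]
gamma, beta = 2.0, 0.5

z = forward(x_hat, gamma, beta)
x_rec = recover_normalized_input(z, gamma, beta)
max_error = max(abs(a - b) for a, b in zip(x_hat, x_rec))
```

In this toy setting `x_rec` matches `x_hat` up to floating-point rounding, which is why the intermediate values do not have to be stored separately.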

arXiv.org e-Print Archive

Crossref

AutoDIAL: Automatic DomaIn Alignment Layers

Author: Rota Bulò Samuel
Caputo Barbara
Carlucci Fabio Maria
Porzi Lorenzo
Ricci Elisa
Publication date: 01/01/2017

Classifiers trained on given databases perform poorly when tested on data acquired in different settings. In domain adaptation, this is explained by a shift between the distributions of the source and target domains. Attempts to align them have traditionally reduced the domain shift by adding appropriate loss terms to the objective function that measure the discrepancies between source and target distributions. Here we take a different route, proposing to align the learned representations by embedding in any given network specific Domain Alignment Layers, designed to match the source and target feature distributions to a reference one. Unlike previous works, which define a priori in which layers adaptation should be performed, our method is able to automatically learn the degree of feature alignment required at different levels of the deep network. Thorough experiments on different public benchmarks, in the unsupervised setting, confirm the power of our approach.

Comment: arXiv admin note: substantial text overlap with arXiv:1702.06332. Added supplementary material.

arXiv.org e-Print Archive

Archivio della ricerca - Università di Roma La Sapienza

KALYPSO, a novel detector system for high-repetition rate and real-time beam diagnostics

Author: Rota Lorenzo
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2018

KITopen

Towards Generalization Across Depth for Monocular 3D Object Detection

Author: Rota Bulò Samuel
Kontschieder Peter
Porzi Lorenzo
Ricci Elisa
Simonelli Andrea
Publication date: 01/01/2020

While expensive LiDAR and stereo camera rigs have enabled the development of successful 3D object detection methods, monocular RGB-only approaches lag far behind. This work advances the state of the art by introducing MoVi-3D, a novel, single-stage deep architecture for monocular 3D object detection. MoVi-3D builds upon a novel approach which leverages geometrical information to generate, both at training and test time, virtual views where the object appearance is normalized with respect to distance. These virtually generated views facilitate the detection task as they significantly reduce the visual appearance variability associated with objects placed at different distances from the camera. As a consequence, the deep model is relieved of learning depth-specific representations and its complexity can be significantly reduced. In particular, in this work we show that, thanks to our virtual view generation process, a lightweight, single-stage architecture suffices to set new state-of-the-art results on the popular KITTI3D benchmark.

arXiv.org e-Print Archive

Archivio della ricerca - Fondazione Bruno Kessler

3D CNNs on distance matrices for human action recognition

Author: Hernández Ruiz Alejandro José
Moreno-Noguer Francesc
Porzi Lorenzo
Rota Bulò Samuel
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2017

In this paper we are interested in recognizing human actions from sequences of 3D skeleton data. For this purpose we combine a 3D Convolutional Neural Network with body representations based on Euclidean Distance Matrices (EDMs), which have recently been shown to be very effective at capturing the geometric structure of the human pose. One inherent limitation of EDMs, however, is that they are defined up to a permutation of the skeleton joints, i.e., randomly shuffling the ordering of the joints yields many different representations. In order to address this issue we introduce a novel architecture that simultaneously, and in an end-to-end manner, learns an optimal transformation of the joints, while optimizing the rest of the parameters of the convolutional network. The proposed approach achieves state-of-the-art results on 3 benchmarks, including the recent NTU RGB-D dataset, for which we improve on previous LSTM-based methods by more than 10 percentage points, also surpassing other CNN-based methods while using almost 1000 times fewer parameters.
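The permutation ambiguity of EDMs described in the abstract can be seen in a small, self-contained sketch. The three joint coordinates and the reordering below are hypothetical and are not taken from the paper. The same pose yields a different matrix once the joints are listed in another order, and the new matrix is the old one with its rows and columns permuted.

```python
import math


def edm(joints):
    """Euclidean distance matrix: entry [i][j] is the distance between joints i and j."""
    return [[math.dist(a, b) for b in joints] for a in joints]


def permute(joints, order):
    """Return the same joints listed in a different order."""
    return [joints[i] for i in order]


# Three hypothetical joints forming a 3-4-5 right triangle.
skeleton = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
order = [2, 0, 1]  # an arbitrary reordering of the joints

d = edm(skeleton)
d_perm = edm(permute(skeleton, order))

# Same pose, but the two matrices are not equal entry by entry.
matrices_differ = d != d_perm

# The permuted matrix is the original with rows and columns reordered.
consistent = all(
    d_perm[i][j] == d[order[i]][order[j]]
    for i in range(len(order))
    for j in range(len(order))
)
```

A network that consumes the raw matrix therefore sees different inputs for the same pose depending on joint order, which is the issue the learned joint transformation is meant to address.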

UPCommons. Portal del coneixement obert de la UPC

Digital.CSIC

Iatrogenic Hypoglycemia Induced by Valproic Acid in an Adult Patient

Author: Arena Luciano
Celli Lorenzo
Morelli Nicola
Pappalardo Irene
Rota Eugenia
Varese Paola
Publication venue: Seed SRL
Publication date: 23/04/2021

Literature on antiepileptic-induced iatrogenic hypoglycemia is scanty. Due to its broad spectrum of activity and mechanisms of action, valproic acid (VPA), a fatty acid, is the most widely prescribed epilepsy treatment worldwide. Herein, we describe an adult epileptic patient in whom persistent, otherwise unexplained hypoglycemia was most likely induced by VPA, as suggested by the VPA time course and glucose blood levels. Indeed, no further hypoglycemic episodes occurred after VPA discontinuation and the diagnostic work-up ruled out other possible causes of hypoglycemia. This case supports the hypothesis that VPA may induce hypoglycemia through metabolic mechanisms of action that are still not well defined. Moreover, it emphasizes that an iatrogenic pathogenesis should be considered if an apparently unexplained hypoglycemia occurs in a patient on chronic therapy with antiepileptics, even at a therapeutic dosage.

SEEd's Journals Collection

Learning depth-aware deep representations for robotic perception

Author: Moreno-Noguer Francesc
Peñate Sánchez Adrián
Porzi Lorenzo
Ricci Elisa
Rota Bulò Samuel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2017

Exploiting RGB-D data by means of Convolutional Neural Networks (CNNs) is at the core of a number of robotics applications, including object detection, scene semantic segmentation and grasping. Most existing approaches, however, exploit RGB-D data by simply considering depth as an additional input channel for the network. In this paper we show that the performance of deep architectures can be boosted by introducing DaConv, a novel, general-purpose CNN block which exploits depth to learn scale-aware feature representations. We demonstrate the benefits of DaConv on a variety of robotics-oriented tasks, involving affordance detection, object coordinate regression and contour detection in RGB-D images. In each of these experiments we show the potential of the proposed block and how it can be readily integrated into existing CNN architectures.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Archivio della ricerca - Fondazione Bruno Kessler

Digital.CSIC