
    NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations

    This paper introduces the Non-Autonomous Input-Output Stable Network (NAIS-Net), a very deep architecture where each stacked processing block is derived from a time-invariant non-autonomous dynamical system. Non-autonomy is implemented by skip connections from the block input to each of the unrolled processing stages and allows stability to be enforced, so that blocks can be unrolled adaptively to a pattern-dependent processing depth. NAIS-Net induces non-trivial, Lipschitz input-output maps, even for an infinite unroll length. We prove that the network is globally asymptotically stable, so that for every initial condition there is exactly one input-dependent equilibrium with tanh units, and multiple stable equilibria with ReLU units. An efficient implementation that enforces stability under the derived conditions for both fully-connected and convolutional layers is also presented. Experimental results show how NAIS-Net exhibits stability in practice, yielding a significant reduction in generalization gap compared to ResNets. Comment: NIPS 2018
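The unrolled block dynamics can be sketched in a few lines. This is an illustrative toy, not the paper's exact parameterization: constructing a negative-definite `A` is just one simple way to make the unroll stable, and all names are hypothetical.

```python
import numpy as np

def nais_net_block(u, steps=400, h=0.05, seed=0):
    """One unrolled block: the state x repeatedly passes through the same
    time-invariant update, with a skip connection from the block input u
    at every step (the "non-autonomous" part)."""
    rng = np.random.default_rng(seed)
    d = u.shape[0]
    R = rng.standard_normal((d, d)) / np.sqrt(d)
    A = -(R.T @ R) - 0.5 * np.eye(d)    # negative-definite A keeps the unroll stable
    B = rng.standard_normal((d, d)) / np.sqrt(d)
    x = np.zeros(d)
    deltas = []                          # per-step movement of the state
    for _ in range(steps):
        x_new = x + h * np.tanh(A @ x + B @ u)
        deltas.append(np.linalg.norm(x_new - x))
        x = x_new
    return x, deltas

u = np.array([1.0, -0.5, 0.3, 0.8])
x_star, deltas = nais_net_block(u)
# the step sizes shrink as x approaches its input-dependent equilibrium
```

Unrolling "adaptively to a pattern-dependent processing depth" then amounts to stopping once the per-step movement falls below a threshold.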

    Multi-View Stereo with Single-View Semantic Mesh Refinement

    While 3D reconstruction is a well-established and widely explored research topic, semantic 3D reconstruction has only recently witnessed an increasing share of attention from the Computer Vision community. Semantic annotations in fact allow enforcing strong class-dependent priors, such as planarity for ground and walls, which can be exploited to refine the reconstruction, often resulting in non-trivial performance improvements. State-of-the-art methods propose volumetric approaches to fuse RGB image data with semantic labels; even if successful, they do not scale well and fail to output high-resolution meshes. In this paper we propose a novel method to refine both the geometry and the semantic labeling of a given mesh. We refine the mesh geometry by applying a variational method that optimizes a composite energy made of a state-of-the-art pairwise photometric term and a single-view term that models the semantic consistency between the labels of the 3D mesh and those of the segmented images. We also update the semantic labeling through a novel Markov Random Field (MRF) formulation that, together with the classical data and smoothness terms, takes into account class-specific priors estimated directly from the annotated mesh. This is in contrast to state-of-the-art methods, which are typically based on handcrafted or learned priors. We are the first, jointly with the very recent and seminal work of [M. Blaha et al., arXiv:1706.08336, 2017], to propose the use of semantics inside a mesh refinement framework. Differently from [M. Blaha et al., arXiv:1706.08336, 2017], which adopts a more classical pairwise comparison to estimate the flow of the mesh, we apply a single-view comparison between the semantically annotated image and the current 3D mesh labels; this improves the robustness in case of noisy segmentations. Comment: 3D Reconstruction Meets Semantics, ICCV workshop
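The MRF labeling step can be illustrated with a toy energy of this form. The sketch below uses generic unary (data) costs plus a Potts smoothness term and minimizes greedily with Iterated Conditional Modes; the paper's class-specific priors estimated from the mesh are omitted, and all names are hypothetical.

```python
import numpy as np

def icm_labeling(unary, edges, lam=0.5, iters=10):
    """Greedy MRF labeling: 'unary' is an (n_nodes, n_labels) data-cost
    matrix, 'edges' a list of (i, j) neighbour pairs, and lam the Potts
    penalty for neighbouring nodes that disagree."""
    n, k = unary.shape
    labels = unary.argmin(axis=1)            # initialize from the data term alone
    nbrs = [[] for _ in range(n)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    for _ in range(iters):
        changed = False
        for i in range(n):
            costs = unary[i].copy()
            for j in nbrs[i]:
                for l in range(k):
                    if l != labels[j]:
                        costs[l] += lam      # Potts penalty for disagreeing with j
            best = costs.argmin()
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break                            # local minimum reached
    return labels

unary = np.array([[0.0, 1.0],
                  [0.6, 0.4],
                  [0.0, 1.0]])
labels = icm_labeling(unary, edges=[(0, 1), (1, 2)])
# smoothness flips the noisy middle node to agree with its neighbours
```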

    Attention Mechanisms for Object Recognition with Event-Based Cameras

    Event-based cameras are neuromorphic sensors capable of efficiently encoding visual information in the form of sparse sequences of events. Being biologically inspired, they are commonly used to exploit some of the computational and power-consumption benefits of biological vision. In this paper we focus on a specific feature of vision: visual attention. We propose two attentive models for event-based vision: an algorithm that tracks event activity within the field of view to locate regions of interest, and a fully-differentiable attention procedure based on the DRAW neural model. We highlight the strengths and weaknesses of the proposed methods on four datasets, the Shifted N-MNIST, Shifted MNIST-DVS, CIFAR10-DVS and N-Caltech101 collections, using the Phased LSTM recognition network as a baseline reference model, obtaining improvements in terms of both translation and scale invariance. Comment: WACV 2019 camera-ready submission
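The first, activity-tracking model can be caricatured as follows; the brute-force sliding-window scan over an event-count map is a deliberately naive stand-in for the paper's tracking algorithm, and all names are hypothetical.

```python
import numpy as np

def locate_roi(events, shape, win=8):
    """Accumulate events into a 2D count map and return the top-left
    corner of the win x win window with the most activity.
    'events' is an array of (x, y) event coordinates."""
    h, w = shape
    counts = np.zeros((h, w))
    for x, y in events:
        counts[y, x] += 1                    # one vote per event
    best, best_pos = -1.0, (0, 0)
    for r in range(h - win + 1):             # exhaustive window scan
        for c in range(w - win + 1):
            s = counts[r:r + win, c:c + win].sum()
            if s > best:
                best, best_pos = s, (r, c)
    return best_pos

# a burst of events around (20, 20) plus one stray event
events = np.array([(20, 20)] * 5 + [(21, 21)] * 5 + [(2, 2)])
r, c = locate_roi(events, (32, 32))
# the returned window covers the active cluster
```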

    ReConvNet: Video Object Segmentation with Spatio-Temporal Features Modulation

    We introduce ReConvNet, a recurrent convolutional architecture for semi-supervised video object segmentation that can quickly adapt its features to focus on any specific object of interest at inference time. Generalization to new objects never observed during training is known to be a hard task for supervised approaches, which would need to be retrained. To tackle this problem, we propose a more efficient solution that learns spatio-temporal features self-adapting to the object of interest via conditional affine transformations. This approach is simple, can be trained end-to-end and does not necessarily require extra training steps at inference time. Our method shows competitive results on DAVIS2016 with respect to state-of-the-art approaches that use online fine-tuning, and outperforms them on DAVIS2017. ReConvNet also shows promising results on the DAVIS Challenge 2018, winning the 10th position. Comment: CVPR Workshop - DAVIS Challenge 2018
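The conditional affine transformations can be sketched as FiLM-style per-channel modulation; the mapping from the conditioning vector to scales and shifts (a single random linear layer here) is illustrative, not ReConvNet's actual network.

```python
import numpy as np

def modulate(features, cond, seed=0):
    """Map a conditioning vector to per-channel scale (gamma) and shift
    (beta) parameters that adapt a (C, H, W) feature map to a specific
    object of interest. Weight shapes are illustrative only."""
    rng = np.random.default_rng(seed)
    c = features.shape[0]
    W = rng.standard_normal((2 * c, cond.shape[0])) * 0.1
    gamma_beta = W @ cond
    gamma = 1.0 + gamma_beta[:c]             # scales, initialized around identity
    beta = gamma_beta[c:]                    # shifts, initialized around zero
    return features * gamma[:, None, None] + beta[:, None, None]

f = np.ones((4, 3, 3))
out = modulate(f, np.array([0.5, -0.2]))
# a zero conditioning vector leaves the features unchanged (identity modulation)
```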

    ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation

    We propose a structured prediction architecture, which exploits the local generic features extracted by Convolutional Neural Networks and the capacity of Recurrent Neural Networks (RNNs) to retrieve distant dependencies. The proposed architecture, called ReSeg, is based on the recently introduced ReNet model for image classification. We modify and extend it to perform the more challenging task of semantic segmentation. Each ReNet layer is composed of four RNNs that sweep the image horizontally and vertically in both directions, encoding patches or activations and providing relevant global information. Moreover, ReNet layers are stacked on top of pre-trained convolutional layers, benefiting from generic local features. Upsampling layers follow the ReNet layers to recover the original image resolution in the final predictions. The proposed ReSeg architecture is efficient, flexible and suitable for a variety of semantic segmentation tasks. We evaluate ReSeg on several widely-used semantic segmentation datasets: Weizmann Horse, Oxford Flower, and CamVid, achieving state-of-the-art performance. Results show that ReSeg can act as a suitable architecture for semantic segmentation tasks, and may have further applications in other structured prediction problems. The source code and model hyperparameters are available at https://github.com/fvisin/reseg. Comment: In CVPR Deep Vision Workshop, 2016
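The four-sweep idea can be sketched as a toy ReNet layer: two RNNs sweep each row in opposite directions, then two more sweep each column of the result, so every output position sees context from the whole image. The 1x1 "patches" and plain tanh RNNs are simplifications of the original gated units, and all names are hypothetical.

```python
import numpy as np

def rnn_sweep(seq, Wx, Wh):
    # simple tanh RNN over a sequence of vectors; returns all hidden states
    h = np.zeros(Wh.shape[0])
    out = []
    for x in seq:
        h = np.tanh(Wx @ x + Wh @ h)
        out.append(h)
    return np.stack(out)

def renet_layer(img, hidden=4, seed=0):
    """img is (H, W, C); returns a (H, W, 2*hidden) feature map."""
    rng = np.random.default_rng(seed)
    H, W, C = img.shape
    def weights(c_in):
        return (rng.standard_normal((hidden, c_in)) * 0.1,
                rng.standard_normal((hidden, hidden)) * 0.1)
    Wx1, Wh1 = weights(C)
    Wx2, Wh2 = weights(C)
    rows = np.stack([                        # horizontal sweeps, both directions
        np.concatenate([rnn_sweep(img[i], Wx1, Wh1),
                        rnn_sweep(img[i][::-1], Wx2, Wh2)[::-1]], axis=1)
        for i in range(H)])                  # (H, W, 2*hidden)
    Wx3, Wh3 = weights(2 * hidden)
    Wx4, Wh4 = weights(2 * hidden)
    cols = np.stack([                        # vertical sweeps over the row features
        np.concatenate([rnn_sweep(rows[:, j], Wx3, Wh3),
                        rnn_sweep(rows[:, j][::-1], Wx4, Wh4)[::-1]], axis=1)
        for j in range(W)], axis=1)          # (H, W, 2*hidden)
    return cols

out = renet_layer(np.random.rand(5, 6, 3))
```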

    Endothelial Function in Pre-diabetes, Diabetes and Diabetic Cardiomyopathy: A Review

    Diabetes mellitus worsens the cardiovascular risk profile of affected individuals. Its increasing worldwide prevalence and its negative effects on vascular wall morphology and function promote the development of several comorbidities that worsen patients' clinical condition and reduce survival. Although overt diabetes increases mortality through its pathogenesis, little information is available in the literature on the role of pre-diabetes and family history of diabetes mellitus in the outcome of the general population. This emphasizes the importance of early detection of vascular impairment in subjects at risk of developing diabetes. Identifying the early stages of atherosclerotic disease in diabetic patients is a fundamental step in the risk-stratification protocols that physicians follow in order to obtain a complete overview of the clinical status of such individuals. Common carotid intima-media thickness, flow-mediated vasodilatation and pulse wave velocity are instrumental tools able to detect early impairment of the cardiovascular system and stratify individuals' cardiovascular risk. The aim of this review is to provide a general perspective on the complex relationship between the onset of cardiovascular disease, pre-diabetes and family history of diabetes. Furthermore, it points out the influence of diabetes on heart function, up to the expression of the so-called diabetic cardiomyopathy.

    On iterative and conditional computation for visual representation learning

    Learning effective representations is crucial for scaling the performance of machine learning methods. Deep Neural Networks are flexible models that can learn powerful hierarchical representations by stacking several layers of computation. However, once learned, adapting the representation to new data or behaviours is nontrivial. In this thesis, we take a step in the direction of learning adaptive representations for visual data, addressing the problem from both a practical and a theoretical perspective. First, we study Residual Networks from a dynamical-system perspective and augment them with a mechanism to automatically adapt the number of processing steps based on the characteristics of the data. Then, we focus on the problem of learning effective asynchronous representations for event-based data. We propose a recurrent mechanism that automatically learns how to incrementally build a two-dimensional representation from events, which can be used as input to convolutional frame-based architectures to improve their performance on optical flow prediction and image recognition tasks with respect to hand-designed features. Finally, we focus on the challenging problem of One-Shot Video Object Segmentation, where the model is asked to segment specific objects in unseen videos after observing a single annotated frame. We tackle the problem from a Meta-Learning perspective by showing that it is possible to adapt a generic meta-representation to specific task representations, by modulating the activations of a segmentation network conditioned on the given instance. (Ph.D. thesis, Dipartimento di Elettronica, Informazione e Bioingegneria, Computer Science and Engineering, cycle 32; SILVANO, CRISTINA; PERNICI, BARBARA)

    Asynchronous Convolutional Networks for Object Detection in Neuromorphic Cameras

    Event-based cameras, also known as neuromorphic cameras, are bio-inspired sensors able to perceive changes in the scene at high frequency with low power consumption. Since they became available only very recently, a limited amount of work addresses object detection on these devices. In this paper we propose two neural network architectures for object detection: YOLE, which integrates the events into surfaces and uses a frame-based model to process them, and fcYOLE, an asynchronous event-based fully convolutional network which uses a novel and general formalization of the convolutional and max-pooling layers to exploit the sparsity of camera events. We evaluate the algorithms on different extensions of publicly available datasets and on a novel synthetic dataset. Comment: accepted at the CVPR 2019 Event-based Vision Workshop
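Integrating events into a surface can be done in many ways; an exponentially decayed time surface is one common choice, shown below as a sketch (the surfaces used by YOLE may differ, and all names are hypothetical).

```python
import numpy as np

def time_surface(events, shape, tau=0.05):
    """Each pixel stores an exponentially decayed trace of its most recent
    event. 'events' is a time-ordered list of (x, y, t) tuples; the surface
    is evaluated at the timestamp of the last event."""
    h, w = shape
    last_t = np.full((h, w), -np.inf)        # -inf marks pixels with no events
    for x, y, t in events:
        last_t[y, x] = t                     # keep the most recent timestamp
    t_end = events[-1][2]
    # 1.0 at the newest event, decaying toward 0 for older events; exactly
    # 0 for pixels that never fired (exp(-inf) underflows to zero)
    return np.exp((last_t - t_end) / tau)

surface = time_surface([(1, 1, 0.0), (2, 2, 0.1)], (4, 4))
# pixel (2, 2) fired last and has value 1.0; (1, 1) has decayed; others are 0
```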

    Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum

    Federated Learning (FL) is the state-of-the-art approach for learning from decentralized data in privacy-constrained scenarios. As the current literature reports, the main problems associated with FL are system and statistical challenges: the former demand efficient learning from edge devices, including lowering communication bandwidth and frequency, while the latter require algorithms robust to non-iidness. State-of-the-art approaches either guarantee convergence at increased communication cost or are not sufficiently robust to handle extremely heterogeneous local distributions. In this work we propose a novel generalization of the heavy-ball momentum, and present FedHBM to effectively address statistical heterogeneity in FL without introducing any communication overhead. We conduct extensive experimentation on common FL vision and NLP datasets, showing that our FedHBM algorithm empirically yields better model quality and higher convergence speed w.r.t. the state of the art, especially in pathological non-iid scenarios. While being designed for cross-silo settings, we show how FedHBM is applicable in moderate-to-high cross-device scenarios, and how good model initializations (e.g. pre-training) can be exploited for prompt acceleration. Extended experimentation on large-scale real-world federated datasets further corroborates the effectiveness of our approach for real-world FL applications.
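As a baseline picture of momentum in FL, the sketch below applies classical heavy-ball momentum on the server of a toy FedAvg loop with two heterogeneous quadratic clients. It is not FedHBM's generalized momentum, only the standard idea it generalizes; all names are hypothetical.

```python
import numpy as np

def fedavg_with_server_momentum(w0, client_grads_fn, rounds=200, lr=0.1, beta=0.9):
    """One FL round: collect a gradient from each client at the current
    global model w, average them, and apply a heavy-ball update on the
    server."""
    w, m = w0.copy(), np.zeros_like(w0)
    for _ in range(rounds):
        g = np.mean(client_grads_fn(w), axis=0)  # aggregate per-client gradients
        m = beta * m - lr * g                    # heavy-ball momentum buffer
        w = w + m
    return w

# two "clients" with heterogeneous quadratic losses f_i(w) = 0.5*||w - c_i||^2
centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
client_grads = lambda w: [w - c for c in centers]  # gradient of each client's loss
w_star = fedavg_with_server_momentum(np.zeros(2), client_grads)
# the average gradient vanishes at the mean of the client optima, (0.5, 0.5)
```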