18 research outputs found

Classifying Signals on Irregular Domains via Convolutional Cluster Pooling

Author: Angelo Porrello
Davide Abati
Rita Cucchiara
Simone Calderara
Publication date: 01/01/2019

We present a novel, hierarchical approach for the supervised classification of signals defined over a fixed graph that reflects shared properties of the dataset. To this end, we introduce a Convolutional Cluster Pooling layer that exploits multi-scale clustering to highlight, at different resolutions, locally connected regions of the input graph. Our proposal generalises well-established neural models such as Convolutional Neural Networks (CNNs) to irregular and complex domains by exploiting the weight-sharing property in a graph-oriented architecture. In this work, weight sharing is based on the centrality of each vertex within its soft-assigned cluster. Extensive experiments on NTU RGB+D, CIFAR-10 and 20NEWS demonstrate the effectiveness of the proposed technique in capturing both local and global patterns in graph-structured data from different domains.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Latent Space Autoregression for Novelty Detection

Author: Abati Davide
Calderara Simone
Cucchiara Rita
Porrello Angelo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2019

Novelty detection is commonly defined as the discrimination of observations that do not conform to a learned model of regularity. Despite its importance in different application settings, designing a novelty detector is highly complex because novelties are unpredictable and inaccessible during training, factors which expose the unsupervised nature of the problem. In our proposal, we design a general framework in which we equip a deep autoencoder with a parametric density estimator that learns the probability distribution underlying its latent representations through an autoregressive procedure. We show that a maximum likelihood objective, optimized in conjunction with the reconstruction of normal samples, effectively acts as a regularizer for the task at hand by minimizing the differential entropy of the distribution spanned by latent vectors. In addition to providing a very general formulation, our model delivers on-par or superior performance compared with state-of-the-art methods in extensive experiments on publicly available datasets, in both one-class and video anomaly detection settings. Unlike prior work, our proposal makes no assumption about the nature of the novelties, which makes it readily applicable to diverse contexts.
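The joint objective described in this abstract (reconstruction error plus a likelihood term on the latent code) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the form of the estimator, so the sketch assumes a simple linear-Gaussian autoregressive model, and the names `autoregressive_nll`, `joint_loss`, `weights`, `log_sigma` and the trade-off weight `lam` are hypothetical.

```python
import math


def autoregressive_nll(z, weights, log_sigma):
    """Negative log-likelihood of one latent vector z.

    Assumption (not from the abstract): each z[i] is Gaussian with mean
    given by a linear function of the preceding entries z[0..i-1] and a
    standard deviation exp(log_sigma[i]). Only weights[i][j] with j < i
    are read, which is what makes the model autoregressive.
    """
    nll = 0.0
    for i, z_i in enumerate(z):
        mu_i = sum(weights[i][j] * z[j] for j in range(i))
        var_i = math.exp(2.0 * log_sigma[i])
        nll += 0.5 * ((z_i - mu_i) ** 2 / var_i + math.log(2.0 * math.pi))
        nll += log_sigma[i]
    return nll


def joint_loss(x, x_hat, z, weights, log_sigma, lam=1.0):
    """Squared reconstruction error plus lam times the latent NLL."""
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return reconstruction + lam * autoregressive_nll(z, weights, log_sigma)
```

Under these assumptions, a sample would receive a high loss either because it is poorly reconstructed or because its latent code is unlikely under the estimator, which is the intuition behind using such a quantity as a novelty score.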

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Conditional Channel Gated Networks for Task-Aware Continual Learning

Author: Abati Davide
Bejnordi Babak Ehteshami
Blankevoort Tijmen
Calderara Simone
Cucchiara Rita
Tomczak Jakub
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2020

Convolutional Neural Networks experience catastrophic forgetting when optimized on a sequence of learning problems: as they meet the objective of the current training examples, their performance on previous tasks drops drastically. In this work, we introduce a novel framework to tackle this problem with conditional computation. We equip each convolutional layer with task-specific gating modules that select which filters to apply to the given input. This way, we achieve two appealing properties. Firstly, the execution patterns of the gates make it possible to identify and protect important filters, ensuring no loss in the performance of the model on previously learned tasks. Secondly, by using a sparsity objective, we can promote the selection of a limited set of kernels, which retains sufficient model capacity to digest new tasks. Existing solutions require, at test time, awareness of the task to which each example belongs. This knowledge, however, may not be available in many practical scenarios. Therefore, we additionally introduce a task classifier that predicts the task label of each example, to deal with settings in which a task oracle is not available. We validate our proposal on four continual learning datasets. Results show that our model consistently outperforms existing methods both in the presence and in the absence of a task oracle. Notably, on the Split SVHN and Imagenet-50 datasets, our model yields up to 23.98% and 17.42% improvement in accuracy w.r.t. competing methods.

Comment: CVPR 2020 (oral)
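The two properties named in this abstract (protecting the filters a task relies on, and keeping capacity free through sparsity) can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's architecture: gates are taken to be fixed binary vectors per task, whereas the abstract describes gating modules that act on the given input, and the helper names `apply_gates`, `sparsity_penalty` and `update_protected` are hypothetical.

```python
def apply_gates(feature_maps, gates):
    """Keep the channels whose gate is 1 and zero out the others.

    feature_maps: one list of activations per channel.
    gates: one 0/1 value per channel (assumed fixed here for simplicity).
    """
    return [[value * gate for value in channel]
            for channel, gate in zip(feature_maps, gates)]


def sparsity_penalty(gates):
    """Fraction of open gates; minimizing it favours using few filters."""
    return sum(gates) / len(gates)


def update_protected(protected, gates):
    """Mark every filter used by the finished task as protected (frozen)."""
    return [already or bool(gate) for already, gate in zip(protected, gates)]
```

In this sketch, filters that remain unprotected after a task are the capacity still available for later tasks, so a lower sparsity penalty on one task leaves more filters free for the next.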

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Region-of-Interest Based Neural Video Compression

Author: Abati Davide
Cohen Taco S
Habibian Amirhossein
Perugachi-Diaz Yura
Sautière Guillaume
Yang Yang
Publication date: 02/11/2022

Humans do not perceive all parts of a scene with the same resolution, but rather focus on a few regions of interest (ROIs). Traditional object-based codecs take advantage of this biological intuition and can allocate bits non-uniformly in favor of salient regions, at the expense of increased distortion in the remaining areas: such a strategy boosts perceptual quality under low rate constraints. Recently, several neural codecs have been introduced for video compression, yet they operate uniformly over all spatial locations and lack the capability of ROI-based processing. In this paper, we introduce two models for ROI-based neural video coding. First, we propose an implicit model that is fed with a binary ROI mask and is trained by de-emphasizing the distortion of the background. Secondly, we design an explicit latent scaling method that allows control over the quantization binwidth for different spatial regions of the latent variables, conditioned on the ROI mask. Through extensive experiments, we show that our methods outperform all our baselines in terms of Rate-Distortion (R-D) performance in the ROI. Moreover, they can generalize to different datasets and to any arbitrary ROI at inference time. Finally, they do not require expensive pixel-level annotations during training, as synthetic ROI masks can be used with little to no degradation in performance. To the best of our knowledge, our proposals are the first solutions that integrate ROI-based capabilities into neural video compression models.

Comment: Updated arxiv version to the camera-ready version after acceptance at British Machine Vision Conference (BMVC) 202
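The two ideas in this abstract (a distortion term that de-emphasizes the background, and a quantization binwidth that depends on the ROI mask) can be sketched on flat lists of values. This is a simplified illustration, not the paper's models: the background weight `bg_weight`, the step sizes `roi_step` and `bg_step`, and the function names are assumptions chosen for the example.

```python
def roi_weighted_mse(x, x_hat, mask, bg_weight=0.1):
    """Mean squared error where background errors count bg_weight times.

    mask[i] is 1 inside the region of interest and 0 in the background.
    """
    total = 0.0
    for target, recon, in_roi in zip(x, x_hat, mask):
        weight = 1.0 if in_roi else bg_weight
        total += weight * (target - recon) ** 2
    return total / len(x)


def quantize_with_scaling(latents, mask, roi_step=1.0, bg_step=4.0):
    """Round each latent to a grid whose step depends on the ROI mask.

    A coarser step in the background means fewer distinct symbols to
    encode there, at the price of a larger quantization error.
    """
    quantized = []
    for value, in_roi in zip(latents, mask):
        step = roi_step if in_roi else bg_step
        quantized.append(round(value / step) * step)
    return quantized
```

For instance, with the default steps a latent value of 2.6 is stored as 3.0 inside the ROI and as 4.0 in the background. Python's built-in `round` rounds exact halves to the nearest even integer, which a real codec might handle differently.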

arXiv.org e-Print Archive

Normal and pathogenic variation of RFC1 repeat expansions: implications for clinical diagnosis

Cerebellar Ataxia, Neuropathy and Vestibular Areflexia Syndrome (CANVAS) is an autosomal recessive neurodegenerative disease, usually caused by biallelic AAGGG repeat expansions in RFC1. In this study, we leveraged whole genome sequencing (WGS) data from nearly 10,000 individuals recruited within the Genomics England sequencing project to investigate the normal and pathogenic variation of the RFC1 repeat. We identified three novel repeat motifs, AGGGC (n=6 from 5 families), AAGGC (n=2 from 1 family) and AGAGG (n=1), associated with CANVAS in the homozygous or compound heterozygous state with the common pathogenic AAGGG expansion. While AAAAG, AAAGGG and AAGAG expansions appear to be benign, here we show a pathogenic role for large AAAGG repeat configuration expansions (n=5). Long read sequencing was used to fully characterise the entire repeat sequence and revealed a pure AGGGC expansion in six patients, whereas the other patients presented complex motifs with AAGGG or AAAGG interruptions. All pathogenic motifs seem to have arisen from a common haplotype and are predicted to form highly stable G quadruplexes, which have been previously demonstrated to affect gene transcription in other conditions. The assessment of these novel configurations is warranted in CANVAS patients with negative or inconclusive genetic testing. Particular attention should be paid to carriers of compound AAGGG/AAAGG expansions, since the AAAGG motif may be pathogenic when very large (>500 repeats) or in the presence of AAGGG interruptions. Accurate sizing and full sequencing of the satellite repeat with long read sequencing is recommended in clinically selected cases, in order to achieve an accurate molecular diagnosis and counsel patients and their families.
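To make the notion of repeat sizing and interruptions concrete, the following toy sketch counts the longest uninterrupted run of a motif in a sequence string. It is only an illustration of the concept on an error-free string; it is not the study's pipeline, and the function name `longest_motif_run` is hypothetical.

```python
def longest_motif_run(sequence, motif):
    """Length, in motif copies, of the longest uninterrupted tandem run.

    runs[end] holds the number of consecutive motif copies that finish
    exactly at position end of the sequence.
    """
    k = len(motif)
    runs = [0] * (len(sequence) + 1)
    best = 0
    for end in range(k, len(sequence) + 1):
        if sequence[end - k:end] == motif:
            runs[end] = runs[end - k] + 1
            best = max(best, runs[end])
    return best
```

On a sequence made of three AAGGG copies, one AAAGG copy and two more AAGGG copies, the function reports a longest AAGGG run of 3, showing how a single interruption splits an expansion into shorter pure stretches.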

UCL Discovery

Identification of anomalies in driver attention and in people's behavior.

Author: Abati Davide
Publication venue: Università degli studi di Modena e Reggio Emilia
Publication date: 09/03/2020

As sensors and computing devices become ever more pervasive, the world grows more connected and digitized by the day, and new opportunities appear for artificial intelligence. In particular, public monitoring emerges as a critical theme, and computer vision has the potential to become the leading technology for building a safer world. In this thesis, we present solutions for public safety in two different application areas. First, we consider vehicle-based safety by developing a system capable of predicting where a driver is likely to focus their attention while driving. Such prediction has vast potential to improve driving safety. Nevertheless, it is highly complex, since driving a car is a complicated task and is highly subjective from an attentional perspective. To handle attention prediction, we collect and release DR(eye)VE, a dataset consisting of driver-centric and car-centric clips, annotated with the driver's fixation points on the outer urban scene. Next, we inspect these data in depth to establish which factors most influence a driver's gaze, in terms of both motion and semantics. Guided by this evidence, we finally develop a deep neural network that, given a car-centric urban scene, identifies which regions are likely to capture the driver's attention. Secondly, we address surveillance-based safety by introducing an anomaly detection model capable of learning the traits that characterize normal (safe) situations and, therefore, of raising an alert whenever unexpected events appear. Learning such models without examples of abnormal conditions is the aim of anomaly detection (a.k.a. novelty detection) research. Despite its importance and a plethora of prior work, the unpredictable nature of anomalous events and their inaccessibility during training severely degrade the effectiveness of existing systems. In this context, we propose a general model consisting of a deep autoencoder equipped with a parametric density estimator, which learns the distribution of its latent representations through an autoregressive procedure. We show that a maximum likelihood objective in latent space effectively regularizes the optimization of the autoencoder's reconstruction error and minimizes the differential entropy of the distribution spanned by latent vectors. Intuitively, such joint optimization forces the model to describe (and reconstruct) each example in terms of features that appear frequently in the training set (and are therefore more representative of normality). Extensive experiments and comparisons with the state of the art show the effectiveness of both our proposals.

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia