Maxmin convolutional neural networks for image classification
Convolutional neural networks (CNNs) are widely used in computer vision,
especially for image classification. However, the way in which information and
invariance properties are encoded in deep CNN architectures is still an open
question. In this paper, we propose to modify the standard convolutional block
of CNNs in order to transfer more information layer after layer while keeping
some invariance within the network. Our main idea is to exploit both positive
and negative high scores obtained in the convolution maps. This behavior is
obtained by modifying the traditional activation function step before pooling:
we double the maps with specific activation functions, a strategy we call
MaxMin. Extensive experiments on two classical datasets, MNIST and CIFAR-10,
show that our deep MaxMin convolutional network outperforms the standard CNN.
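The doubling idea described above can be sketched in a few lines. This is a minimal NumPy illustration (the function name `maxmin_activation` and the toy shapes are mine, not the paper's): both the positive and the negated-then-rectified responses are kept, so the channel count doubles before pooling instead of discarding strong negative scores.

```python
import numpy as np

def maxmin_activation(conv_maps):
    """MaxMin-style activation sketch: keep both positive and negative high
    scores by doubling the feature maps -- ReLU applied to the maps and to
    their negation, concatenated along the channel axis (C x H x W here)."""
    positive = np.maximum(conv_maps, 0.0)   # standard ReLU half
    negative = np.maximum(-conv_maps, 0.0)  # rectified negated maps
    return np.concatenate([positive, negative], axis=0)

# Toy convolution maps: 2 channels, 2x2 spatial
maps = np.array([[[1.0, -2.0], [0.5, -0.5]],
                 [[-3.0, 4.0], [0.0, 1.5]]])
doubled = maxmin_activation(maps)
print(doubled.shape)  # (4, 2, 2): channel count doubled before pooling
```

A plain ReLU would zero out the -2.0 and -3.0 responses entirely; here they survive as positive scores in the second half of the channels, which is the extra information the block passes to the next layer.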
Deformable Part-based Fully Convolutional Network for Object Detection
Existing region-based object detectors are limited to regions with fixed box
geometry to represent objects, even when those objects are highly
non-rectangular. In
this paper we introduce DP-FCN, a deep model for object detection which
explicitly adapts to shapes of objects with deformable parts. Without
additional annotations, it learns to focus on discriminative elements and to
align them, and simultaneously brings more invariance for classification and
geometric information to refine localization. DP-FCN is composed of three main
modules: a Fully Convolutional Network to efficiently maintain spatial
resolution, a deformable part-based RoI pooling layer to optimize positions of
parts and build invariance, and a deformation-aware localization module
explicitly exploiting displacements of parts to improve accuracy of bounding
box regression. We experimentally validate our model and show significant
gains. DP-FCN achieves state-of-the-art performances of 83.1% and 80.9% on
PASCAL VOC 2007 and 2012 with VOC data only. Comment: Accepted to BMVC 2017 (oral).
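To make the deformable part-based pooling idea concrete, here is a toy NumPy sketch, not the paper's actual layer (the function name, grid size, and shift range are illustrative assumptions): the RoI is split into a grid of parts, and each part is allowed to shift by a few pixels toward the position that maximises its pooled score. The chosen displacements are exactly the kind of geometric signal the deformation-aware localization module could reuse.

```python
import numpy as np

def deformable_part_pooling(score_map, roi, parts=2, max_shift=1):
    """Toy deformable part-based RoI pooling. The RoI (x0, y0, x1, y1) is
    split into a parts x parts grid; each part may shift by up to max_shift
    pixels to the offset maximising its max-pooled score. Returns the pooled
    scores and the chosen per-part displacements."""
    x0, y0, x1, y1 = roi
    h, w = (y1 - y0) // parts, (x1 - x0) // parts
    H, W = score_map.shape
    pooled = np.zeros((parts, parts))
    shifts = np.zeros((parts, parts, 2), dtype=int)
    for i in range(parts):
        for j in range(parts):
            best, best_d = -np.inf, (0, 0)
            for dy in range(-max_shift, max_shift + 1):
                for dx in range(-max_shift, max_shift + 1):
                    ys, xs = y0 + i * h + dy, x0 + j * w + dx
                    if ys < 0 or xs < 0 or ys + h > H or xs + w > W:
                        continue  # shifted part would leave the feature map
                    score = score_map[ys:ys + h, xs:xs + w].max()
                    if score > best:
                        best, best_d = score, (dy, dx)
            pooled[i, j] = best
            shifts[i, j] = best_d
    return pooled, shifts

# A response peak at (1, 1), just outside the rigid part grid of the RoI:
score_map = np.zeros((8, 8))
score_map[1, 1] = 5.0
pooled, shifts = deformable_part_pooling(score_map, roi=(2, 2, 6, 6))
print(pooled[0, 0], shifts[0, 0])  # the top-left part shifts to catch the peak
```

A rigid RoI pooling over the same box would miss the peak entirely; the learned version in the paper optimises these part positions jointly with the network.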
BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection
Multimodal representation learning is gaining more and more interest within
the deep learning community. While bilinear models provide an interesting
framework to find subtle combinations of modalities, their number of parameters
grows quadratically with the input dimensions, making their practical
implementation within classical deep learning pipelines challenging. In this
paper, we introduce BLOCK, a new multimodal fusion based on the
block-superdiagonal tensor decomposition. It leverages the notion of block-term
ranks, which generalizes both concepts of rank and mode ranks for tensors,
already used for multimodal fusion. It makes it possible to define new ways of
optimizing the tradeoff between the expressiveness and complexity of the fusion model, and
is able to represent very fine interactions between modalities while
maintaining powerful mono-modal representations. We demonstrate the practical
interest of our fusion model by using BLOCK for two challenging tasks: Visual
Question Answering (VQA) and Visual Relationship Detection (VRD), where we
design end-to-end learnable architectures for representing relevant
interactions between modalities. Through extensive experiments, we show that
BLOCK compares favorably with respect to state-of-the-art multimodal fusion
models for both VQA and VRD tasks. Our code is available at
https://github.com/Cadene/block.bootstrap.pytorch
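The parameter-saving structure of the fusion can be sketched as follows. This is a minimal NumPy illustration with made-up dimensions, not the released BLOCK code: each input is projected and split into R chunks, and each chunk pair interacts through its own small core tensor, so the bilinear interaction is block-superdiagonal rather than one full (and quadratically large) tensor.

```python
import numpy as np

rng = np.random.default_rng(0)

def block_fusion(x, y, cores, Wx, Wy):
    """Block-superdiagonal bilinear fusion sketch (shapes are illustrative).
    Inputs are projected, split into R chunks, and each chunk pair interacts
    through its own small core tensor D_r of shape (d1, d2, d3) -- R small
    cores instead of one dense bilinear tensor over the full dimensions."""
    R = len(cores)
    xp = np.split(Wx @ x, R)          # R chunks of size d1
    yp = np.split(Wy @ y, R)          # R chunks of size d2
    outs = [np.einsum('i,j,ijk->k', xp[r], yp[r], cores[r]) for r in range(R)]
    return np.concatenate(outs)       # fused vector of size R * d3

# Toy dimensions: inputs of size 10 and 12, R = 3 blocks, cores (4, 4, 5)
d1, d2, d3, R = 4, 4, 5, 3
x, y = rng.normal(size=10), rng.normal(size=12)
Wx = rng.normal(size=(R * d1, 10))
Wy = rng.normal(size=(R * d2, 12))
cores = [rng.normal(size=(d1, d2, d3)) for _ in range(R)]
z = block_fusion(x, y, cores, Wx, Wy)
print(z.shape)  # (15,)
```

Here the bilinear part costs R * d1 * d2 * d3 = 240 core parameters; a single dense bilinear tensor over the projected dimensions (12 x 12 x 15) would need 2160. The block-term ranks (R, d1, d2, d3) are the knobs the paper tunes to trade expressiveness against complexity.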
Shape and Time Distortion Loss for Training Deep Time Series Forecasting Models
This paper addresses the problem of time series forecasting for non-stationary signals and multiple future steps prediction. To handle this challenging task, we introduce DILATE (DIstortion Loss including shApe and TimE), a new objective function for training deep neural networks. DILATE aims at accurately predicting sudden changes, and explicitly incorporates two terms supporting precise shape and temporal change detection. We introduce a differentiable loss function suitable for training deep neural nets, and provide a custom back-prop implementation for speeding up optimization. We also introduce a variant of DILATE, which provides a smooth generalization of temporally-constrained Dynamic Time Warping (DTW). Experiments carried out on various non-stationary datasets reveal the very good behaviour of DILATE compared to models trained with the standard Mean Squared Error (MSE) loss function, and also to DTW and variants. DILATE is also agnostic to the choice of the model, and we highlight its benefit for training fully connected networks as well as specialized recurrent architectures, showing its capacity to improve over state-of-the-art trajectory forecasting approaches.
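The alignment idea behind the shape term can be illustrated with the classic hard-DTW recursion that DILATE smoothly relaxes (the actual loss replaces the `min` below with a differentiable soft-min; this sketch shows only the underlying alignment cost):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic DTW between two 1-D series via dynamic programming.
    D[i, j] is the cost of the best monotone alignment of a[:i] with b[:j];
    DILATE's shape term relaxes the min into a soft-min to make it
    differentiable for gradient-based training."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# A step function and the same step delayed by two samples: MSE penalises
# the delay heavily, while DTW aligns the change points.
target  = np.array([0., 0., 0., 1., 1., 1.])
delayed = np.array([0., 0., 0., 0., 0., 1.])
print(dtw_distance(target, delayed))            # 0.0: pure delay, same shape
print(float(np.mean((target - delayed) ** 2)))  # > 0: MSE sees a large error
```

This also shows why DTW alone is not enough, and why DILATE adds a second, temporal term: the delayed forecast gets zero shape cost even though its sudden change arrives two steps late.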
Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction
Leveraging physical knowledge described by partial differential equations
(PDEs) is an appealing way to improve unsupervised video prediction methods.
Since physics is too restrictive for describing the full visual content of
generic videos, we introduce PhyDNet, a two-branch deep architecture, which
explicitly disentangles PDE dynamics from unknown complementary information. A
second contribution is to propose a new recurrent physical cell (PhyCell),
inspired from data assimilation techniques, for performing PDE-constrained
prediction in latent space. Extensive experiments conducted on four diverse
datasets show the ability of PhyDNet to outperform state-of-the-art methods.
Ablation studies also highlight the important gains brought by both
disentanglement and PDE-constrained prediction. Finally, we show that PhyDNet
presents interesting features for dealing with missing data and long-term
forecasting.
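The prediction-correction structure that PhyCell borrows from data assimilation can be sketched on a toy 1-D signal. This is an illustrative analogue, not PhyDNet's learned latent-space cell (the function name, the diffusion PDE, and the gain `K` are my assumptions): a physical prediction step applies a finite-difference PDE update, and a correction step nudges the prediction toward the observation when one is available.

```python
import numpy as np

def phycell_step(h, observation, nu=0.2, K=0.5):
    """Toy prediction-correction step in the spirit of data assimilation.
    Prediction: one explicit step of the diffusion PDE du/dt = nu * u_xx,
    with the Laplacian from periodic finite differences.
    Correction: a Kalman-like update pulling the prediction toward the
    observation with gain K; with K = 0 the physical branch runs alone."""
    laplacian = np.roll(h, -1) - 2 * h + np.roll(h, 1)  # periodic u_xx
    h_pred = h + nu * laplacian                          # physical prediction
    return h_pred + K * (observation - h_pred)           # correction step

h = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
obs = np.array([0.0, 0.1, 0.8, 0.1, 0.0])
h_next = phycell_step(h, obs)
print(h_next)
```

The missing-data behaviour mentioned in the abstract fits naturally here: when no observation is available, the correction is simply skipped (K = 0) and the state evolves under the PDE dynamics alone, which is what makes long-term forecasting with gaps tractable.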