Fast object detection in compressed JPEG Images
Object detection in still images has drawn a lot of attention over the past
few years, and with the advent of deep learning, impressive performance has
been achieved in numerous industrial applications. Most of these deep learning
models rely on RGB images to localize and identify objects in the image.
However, in some application scenarios, images are compressed, either for
storage savings or for fast transmission. A time-consuming image decompression
step is therefore compulsory before the aforementioned deep models can be
applied. To alleviate this drawback, we propose a fast deep architecture for
object detection in JPEG images, one of the most widespread compression
formats. We train a neural network to detect objects based on the blockwise
DCT (discrete cosine transform) coefficients produced by the JPEG compression
algorithm. We modify the well-known Single Shot MultiBox Detector (SSD) by
replacing its first layers with a single convolutional layer dedicated to
processing the DCT inputs. Experimental evaluations on PASCAL VOC and on an
industrial dataset of road traffic surveillance images show that the model is
faster than regular SSD with promising detection performance. To the best of
our knowledge, this paper is the first to address detection in compressed JPEG
images.
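The input representation the abstract describes is the blockwise 8x8 DCT that
JPEG itself applies. A minimal sketch of extracting such coefficients from a
grayscale image is below; the function names are illustrative, and the
SSD-layer replacement itself is not reproduced here.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix, the transform JPEG applies to 8x8 blocks."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0] /= np.sqrt(2.0)  # rescale the DC row so that C is orthonormal
    return C

def blockwise_dct(gray, block=8):
    """Split a grayscale image into 8x8 blocks and DCT-transform each one."""
    C = dct_matrix(block)
    h, w = gray.shape
    h8, w8 = h - h % block, w - w % block  # crop to a multiple of the block size
    out = np.empty((h8 // block, w8 // block, block, block))
    for i in range(0, h8, block):
        for j in range(0, w8, block):
            # 2D DCT of one block: C @ X @ C^T
            out[i // block, j // block] = C @ gray[i:i + block, j:j + block] @ C.T
    return out

img = np.random.default_rng(0).random((32, 48))
coeffs = blockwise_dct(img)
print(coeffs.shape)  # one 8x8 coefficient block per image block: (4, 6, 8, 8)
```

A detector operating on such inputs skips the inverse DCT and the upsampling
work of full JPEG decoding, which is where the speedup comes from.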
From temporal network data to the dynamics of social relationships
Networks are well-established representations of social systems, and temporal
networks are widely used to study their dynamics. Temporal network data often
consist of a succession of static networks over consecutive time windows whose
length, however, is arbitrary, not necessarily corresponding to any intrinsic
timescale of the system. Moreover, the resulting view of social network
evolution is unsatisfactory: short time windows contain little information,
whereas aggregating over large time windows blurs the dynamics. Going from a
temporal network to a meaningful evolving representation of a social network
therefore remains a challenge. Here we introduce a framework for this purpose:
transforming temporal network data into an evolving weighted network where the
weights of the links between individuals are updated at every interaction. Most
importantly, this transformation takes into account the interdependence of
social relationships due to the finite attention capacities of individuals:
each interaction between two individuals not only reinforces their mutual
relationship but also weakens their relationships with others. We study a
concrete example of such a transformation and apply it to several data sets of
social interactions. Using temporal contact data collected in schools, we show
how our framework highlights specificities in their structure and temporal
organization. We then introduce a synthetic perturbation into a data set of
interactions in a group of baboons to show that it is possible to detect a
perturbation in a social group on a wide range of timescales and parameters.
Our framework brings new perspectives to the analysis of temporal social
networks.
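The core update rule the abstract describes can be sketched as follows. This
is a generic illustration of the finite-attention idea, not the paper's exact
transformation; the parameters `alpha` (reinforcement) and `beta` (decay) and
the function name are assumptions.

```python
def update(weights, i, j, alpha=1.0, beta=0.1):
    """Process one interaction between individuals i and j.

    weights maps frozenset({a, b}) -> tie strength. The interacting pair is
    reinforced, while every other tie involving i or j is weakened, modeling
    the finite attention capacity of individuals.
    """
    pair_ij = frozenset((i, j))
    for pair in weights:
        if pair != pair_ij and (i in pair or j in pair):
            weights[pair] *= 1.0 - beta  # weaken competing relationships
    weights[pair_ij] = weights.get(pair_ij, 0.0) + alpha  # reinforce i-j
    return weights

# Feed a stream of timestamped interactions in order of occurrence:
w = {}
for a, b in [("A", "B"), ("A", "C"), ("A", "B")]:
    update(w, a, b)
```

After these three interactions, the A-B tie has been reinforced twice and
weakened once (by the A-C event), so it ends up stronger than A-C, which is
the interdependence effect the framework is built around.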
Low-latency compression of mocap data using learned spatial decorrelation transform
Due to the growing needs of human motion capture (mocap) in movies, video
games, sports, etc., it is highly desirable to compress mocap data for efficient
storage and transmission. This paper presents two efficient frameworks for
compressing human mocap data with low latency. The first framework processes
the data in a frame-by-frame manner so that it is ideal for mocap data
streaming and time critical applications. The second one is clip-based and
provides a flexible tradeoff between latency and compression performance. Since
mocap data exhibits some unique spatial characteristics, we propose a very
effective transform, namely learned orthogonal transform (LOT), for reducing
the spatial redundancy. The LOT problem is formulated as minimizing the squared
error regularized by orthogonality and sparsity constraints, and is solved via
alternating iteration. We also adopt predictive coding and a temporal DCT for
temporal decorrelation in the frame- and clip-based frameworks, respectively.
Experimental results show that the proposed frameworks can produce higher
compression performance at lower computational cost and latency than the
state-of-the-art methods.Comment: 15 pages, 9 figure
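The alternating iteration the abstract mentions can be sketched with the
standard recipe for learning an orthogonal sparsifying transform: a
soft-thresholding step for the sparse coefficients and an orthogonal Procrustes
step for the transform. This is a generic illustration under the stated
objective, not necessarily the paper's exact LOT solver; all names and the
regularization weight `lam` are assumptions.

```python
import numpy as np

def learn_orthogonal_transform(X, lam=0.1, iters=20, seed=0):
    """Approximately solve  min_{D, S} ||X - D S||^2 + lam * ||S||_1
    subject to D being orthogonal, by alternating minimization."""
    d = X.shape[0]
    rng = np.random.default_rng(seed)
    # Initialize D as a random orthogonal matrix (QR of a Gaussian matrix).
    D, _ = np.linalg.qr(rng.standard_normal((d, d)))
    for _ in range(iters):
        # Sparse-coding step: with D orthogonal, ||X - D S||^2 = ||D^T X - S||^2,
        # so soft-thresholding D^T X solves the l1 subproblem exactly.
        C = D.T @ X
        S = np.sign(C) * np.maximum(np.abs(C) - lam / 2.0, 0.0)
        # Transform step: orthogonal Procrustes, D = U V^T from the SVD of X S^T.
        U, _, Vt = np.linalg.svd(X @ S.T)
        D = U @ Vt
    return D, S

X = np.random.default_rng(1).standard_normal((8, 100))  # toy per-frame vectors
D, S = learn_orthogonal_transform(X)
```

The orthogonality constraint keeps both the sparse-coding and the
reconstruction steps cheap (a matrix multiply each), which is what makes a
transform like this suitable for the low-latency setting the paper targets.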