Deep Compact Person Re-Identification with Distractor Synthesis via Guided DC-GANs
We present a dual-stream CNN that learns both appearance and facial features in tandem from still images and, after feature fusion, infers person identities. We then describe an alternative architecture, a single lightweight ID-CondenseNet, where a face detector-guided DC-GAN is used to generate distractor person images for enhanced training. For evaluation, we test both architectures on FLIMA, a new extension of an existing person re-identification dataset with added frame-by-frame annotations of face presence. Although the dual-stream CNN can outperform the CondenseNet approach on FLIMA, we show that the latter surpasses all state-of-the-art architectures in top-1 ranking performance when applied to the largest existing person re-identification dataset, MSMT17. We conclude that, whilst re-identification performance is highly sensitive to the structure of datasets, distractor augmentation and network compression have a role to play in enhancing performance characteristics for larger-scale applications.
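To make the dual-stream idea concrete, the following is a minimal, hypothetical PyTorch sketch of a two-branch network whose pooled appearance and face features are fused before an identity classifier. The branch architectures, feature dimensions, input sizes, and number of identities are all assumptions for illustration; this is not the architecture evaluated above.

```python
import torch
import torch.nn as nn

class DualStreamReID(nn.Module):
    """Hypothetical two-branch network: one branch sees the whole-person crop,
    the other a face crop; pooled features are concatenated and classified."""
    def __init__(self, num_identities=751, feat_dim=256):
        super().__init__()
        self.appearance = self._branch(in_ch=3, out_dim=feat_dim)
        self.face = self._branch(in_ch=3, out_dim=feat_dim)
        self.classifier = nn.Linear(2 * feat_dim, num_identities)

    @staticmethod
    def _branch(in_ch, out_dim):
        # Small convolutional stack standing in for a full backbone.
        return nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, out_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )

    def forward(self, person_crop, face_crop):
        # Feature fusion by concatenation, then identity prediction.
        fused = torch.cat([self.appearance(person_crop), self.face(face_crop)], dim=1)
        return self.classifier(fused)  # identity logits

# Toy usage: a batch of 4 person crops with matching face crops.
model = DualStreamReID()
logits = model(torch.randn(4, 3, 256, 128), torch.randn(4, 3, 64, 64))
print(logits.shape)  # torch.Size([4, 751])
```

In a real system each branch would be a full backbone and the face branch would be fed by a face detector; the sketch only shows the fuse-then-classify structure.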
Semantically selective augmentation for deep compact person re-identification
We present a deep person re-identification approach that combines semantically selective, deep data augmentation with clustering-based network compression to generate high-performance, light, and fast inference networks. In particular, we propose to augment limited training data via sampling from a deep convolutional generative adversarial network (DCGAN), whose discriminator is constrained by a semantic classifier to explicitly control the domain specificity of the generation process. Thereby, we encode information in the classifier network which can be utilized to steer adversarial synthesis, and which fuels our CondenseNet ID-network training. We provide a quantitative and qualitative analysis of the approach and its variants on a number of datasets, obtaining results that outperform the state of the art on the LIMA dataset for long-term monitoring in indoor living spaces.
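The coupling between adversarial synthesis and a semantic classifier can be sketched loosely as follows. This is an assumption-heavy PyTorch toy, not the training code behind the results above: the tiny linear generator, discriminator, and frozen classifier stand in for real DCGAN modules, and here the semantic term is simply added to the generator objective as one plausible way of steering synthesis toward a target domain.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed stand-ins for DCGAN modules: a generator G, a discriminator D, and a
# frozen semantic classifier C pretrained on the target domain (e.g. person vs.
# non-person crops). Architectures and sizes are illustrative only.
G = nn.Sequential(nn.Linear(100, 3 * 64 * 64), nn.Tanh())   # noise -> flattened fake image
D = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1))  # image -> real/fake logit
C = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 2))  # image -> semantic class logits
for p in C.parameters():
    p.requires_grad_(False)  # the semantic classifier stays fixed

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real_images, target_class=0, semantic_weight=1.0):
    batch = real_images.size(0)
    fake = G(torch.randn(batch, 100)).view(batch, 3, 64, 64)

    # Discriminator update: separate real from generated samples.
    d_loss = (F.binary_cross_entropy_with_logits(D(real_images), torch.ones(batch, 1))
              + F.binary_cross_entropy_with_logits(D(fake.detach()), torch.zeros(batch, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: fool the discriminator *and* satisfy the semantic
    # classifier, which pulls synthesis toward the desired domain.
    adv = F.binary_cross_entropy_with_logits(D(fake), torch.ones(batch, 1))
    sem = F.cross_entropy(C(fake), torch.full((batch,), target_class, dtype=torch.long))
    g_loss = adv + semantic_weight * sem
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

print(train_step(torch.randn(8, 3, 64, 64)))
```

The invented `semantic_weight` parameter controls how strongly generation is pulled toward the classifier's notion of the target domain.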
PanAf20K: a large video dataset for wild ape detection and behaviour recognition
The work that allowed for the collection of the dataset was funded by the Max Planck Society, Max Planck Society Innovation Fund, and Heinz L. Krekeler. This work was supported by the UKRI CDT in Interactive AI under grant EP/S022937/1. We present the PanAf20K dataset, the largest and most diverse open-access annotated video dataset of great apes in their natural environment. It comprises more than 7 million frames across ∼20,000 camera trap videos of chimpanzees and gorillas collected at 18 field sites in tropical Africa as part of the Pan African Programme: The Cultured Chimpanzee. The footage is accompanied by a rich set of annotations and benchmarks, making it suitable for training and testing a variety of challenging and ecologically important computer vision tasks, including ape detection and behaviour recognition. Furthering AI analysis of camera trap information is critical given that the International Union for Conservation of Nature now lists all species in the great ape family as either Endangered or Critically Endangered. We hope the dataset can form a solid basis for engagement of the AI community to improve performance, efficiency, and result interpretation in order to support assessments of great ape presence, abundance, distribution, and behaviour, and thereby aid conservation efforts. The dataset and code are available from the PanAf20K project website.
PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition
We present the PanAf20K dataset, the largest and most diverse open-access annotated video dataset of great apes in their natural environment. It comprises more than 7 million frames across ~20,000 camera trap videos of chimpanzees and gorillas collected at 14 field sites in tropical Africa as part of the Pan African Programme: The Cultured Chimpanzee. The footage is accompanied by a rich set of annotations and benchmarks, making it suitable for training and testing a variety of challenging and ecologically important computer vision tasks, including ape detection and behaviour recognition. Furthering AI analysis of camera trap information is critical given that the International Union for Conservation of Nature now lists all species in the great ape family as either Endangered or Critically Endangered. We hope the dataset can form a solid basis for engagement of the AI community to improve performance, efficiency, and result interpretation in order to support assessments of great ape presence, abundance, distribution, and behaviour, and thereby aid conservation efforts.
DS-KCF: a real-time tracker for RGB-D data
We propose an RGB-D single-object tracker, built upon the extremely fast RGB-only KCF tracker, that exploits depth information to handle scale changes, occlusions, and shape changes. Despite the computational demands of the extra functionalities, we still achieve real-time performance rates of 35–43 fps in MATLAB and 187 fps in our C++ implementation. Our proposed method includes fast depth-based target object segmentation that enables (1) efficient scale-change handling within the KCF core functionality in the Fourier domain, (2) the detection of occlusions by temporal analysis of the target's depth distribution, and (3) the estimation of a target's change of shape through the temporal evolution of its segmented silhouette. Finally, we provide an in-depth analysis of the factors affecting the throughput and precision of our proposed tracker and perform extensive comparative analysis. Both the MATLAB and C++ versions of our software are available in the public domain.
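The occlusion-detection idea, analysing the depth distribution inside the track box over time, can be illustrated with a toy NumPy sketch. Everything here is assumed for illustration (the histogram binning, the 2-sigma margin, and the synthetic depth patches); it is not the DS-KCF implementation.

```python
import numpy as np

def target_depth_model(depth_patch):
    """Fit a simple depth model (histogram mode and spread) to the pixels inside
    the current track box; a stand-in for the tracker's fast depth segmentation."""
    valid = depth_patch[np.isfinite(depth_patch) & (depth_patch > 0)]
    hist, edges = np.histogram(valid, bins=50)
    mode = 0.5 * (edges[np.argmax(hist)] + edges[np.argmax(hist) + 1])
    return mode, valid.std()

def occlusion_score(depth_patch, mode, std, margin=2.0):
    """Fraction of box pixels clearly in front of the modelled target depth.
    A sudden jump in this fraction between frames suggests an occlusion."""
    valid = depth_patch[np.isfinite(depth_patch) & (depth_patch > 0)]
    return float(np.mean(valid < mode - margin * std))

# Toy usage: a synthetic 80x40 depth patch at ~2.0 m, then half covered at ~0.8 m.
rng = np.random.default_rng(0)
patch = rng.normal(2.0, 0.05, size=(80, 40))
mode, std = target_depth_model(patch)
occluded = patch.copy()
occluded[:, :20] = rng.normal(0.8, 0.05, size=(80, 20))
print(occlusion_score(patch, mode, std), occlusion_score(occluded, mode, std))
```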
Imaging of subsurface lineaments in the southwestern part of the Thrace Basin from gravity data
Linear anomalies, as indicators of the structural features of some geological bodies, are very important for the interpretation of gravity and magnetic data. In this study, an image processing technique known as the Hough transform (HT) algorithm is described for determining invisible boundaries and extensions in gravity anomaly maps. The HT is used to extract straight lines or circles from two-dimensional potential-field images and operates between two domains: the image space and the Hough (parameter) space. Each nonzero point in the image domain is transformed into a sinusoid in the Hough domain, and each point in the Hough domain corresponds to a straight line or circle in the image domain. Lineaments are then delineated from the straight lines obtained by mapping accumulator peaks back into the image domain. An application of the Hough transform to the Bouguer anomaly map of the southwestern part of the Thrace Basin, NW Turkey, shows the effectiveness of the proposed approach. Based on geological and gravity data, the structural features in the southwestern part of the Thrace Basin are investigated by applying the proposed approach and the Blakely and Simpson method. Lineaments identified by these approaches are generally in good accordance with previously mapped surface faults.
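The image-to-parameter-space mapping described above can be demonstrated with a minimal sketch using scikit-image's straight-line Hough transform on a synthetic binary edge map. The binarisation of real gravity anomalies and the peak-selection settings below are assumptions, not the study's processing chain.

```python
import numpy as np
from skimage.transform import hough_line, hough_line_peaks

# Toy stand-in for a binarised edge map derived from a Bouguer anomaly grid
# (in practice this would come from gradient maxima of the real anomaly data).
edges = np.zeros((200, 200), dtype=bool)
rr = np.arange(200)
edges[rr, rr] = True          # a synthetic diagonal "lineament"
edges[50, 20:180] = True      # and a horizontal one

# Accumulate votes: each edge pixel maps to a sinusoid in (angle, distance) space.
tested_angles = np.linspace(-np.pi / 2, np.pi / 2, 180, endpoint=False)
accumulator, angles, dists = hough_line(edges, theta=tested_angles)

# Peaks in the accumulator correspond to straight lines back in the image domain.
for _, angle, dist in zip(*hough_line_peaks(accumulator, angles, dists, num_peaks=2)):
    print(f"lineament: normal angle {np.degrees(angle):.1f} deg, distance {dist:.1f} px")
```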
A shortest path representation for video summarisation
A novel approach is presented to select multiple key frames within an isolated video shot where camera motion causes significant scene change. This is achieved by determining the dominant motion between frame pairs, whose similarities are represented using a directed weighted graph. The shortest path in the graph, found using the A* search algorithm, designates the key frames. The overall method can be applied to extract a set of key frames which portray both the video content and camera motions, all of which are useful features for video indexing and retrieval.
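A minimal sketch of the graph formulation follows, assuming simple feature-vector dissimilarities, an invented per-hop penalty, and a forward-edge window; the paper's actual similarity measure, edge construction, and A* heuristic differ, so this only illustrates the shortest-path-as-key-frame-selection idea.

```python
import numpy as np
import networkx as nx

def select_key_frames(features, hop_cost=1.0, max_skip=10):
    """Hypothetical shortest-path key-frame selection: nodes are frames, and a
    directed edge i -> j (j > i) costs a fixed hop penalty plus the dissimilarity
    between the two frames, so the cheapest path from the first to the last frame
    passes through a small set of mutually representative frames."""
    n = len(features)
    graph = nx.DiGraph()
    for i in range(n):
        for j in range(i + 1, min(i + 1 + max_skip, n)):
            dissim = float(np.linalg.norm(features[i] - features[j]))
            graph.add_edge(i, j, weight=hop_cost + dissim)
    # A*-style search; with no heuristic supplied this reduces to Dijkstra.
    return nx.astar_path(graph, 0, n - 1, weight="weight")

# Toy usage: 30 frames whose features drift, with an abrupt change at frame 15.
rng = np.random.default_rng(1)
feats = np.cumsum(rng.normal(0, 0.1, size=(30, 8)), axis=0)
feats[15:] += 5.0
print(select_key_frames(feats))  # indices of the selected key frames
```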