
    Deep Compact Person Re-Identification with Distractor Synthesis via Guided DC-GANs

    We present a dual-stream CNN that learns both appearance and facial features in tandem from still images and, after feature fusion, infers person identities. We then describe an alternative architecture, a single lightweight ID-CondenseNet in which a face detector-guided DC-GAN is used to generate distractor person images for enhanced training. For evaluation, we test both architectures on FLIMA, a new extension of an existing person re-identification dataset with added frame-by-frame annotations of face presence. Although the dual-stream CNN can outperform the CondenseNet approach on FLIMA, we show that the latter surpasses all state-of-the-art architectures in top-1 ranking performance when applied to the largest existing person re-identification dataset, MSMT17. We conclude that, whilst re-identification performance is highly sensitive to the structure of datasets, distractor augmentation and network compression have a role to play in enhancing performance characteristics for larger-scale applications.
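
    A minimal PyTorch-style sketch of the dual-stream idea, assuming fusion by feature concatenation; the branch layouts, feature dimension, and identity count below are illustrative placeholders rather than the authors' architecture:

    ```python
    # Minimal dual-stream sketch (illustrative; layer sizes are assumptions).
    import torch
    import torch.nn as nn

    class DualStreamReID(nn.Module):
        def __init__(self, num_identities: int, feat_dim: int = 256):
            super().__init__()
            # Appearance stream: full-body crop -> feature vector.
            self.appearance = self._branch(feat_dim)
            # Face stream: detected face crop -> feature vector.
            self.face = self._branch(feat_dim)
            # Fusion by concatenation, then identity classification.
            self.classifier = nn.Linear(2 * feat_dim, num_identities)

        @staticmethod
        def _branch(feat_dim: int) -> nn.Sequential:
            return nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, feat_dim),
            )

        def forward(self, body: torch.Tensor, face: torch.Tensor) -> torch.Tensor:
            fused = torch.cat([self.appearance(body), self.face(face)], dim=1)
            return self.classifier(fused)  # identity logits
    ```

    Training such a model would minimise a cross-entropy loss over identity labels on the fused logits.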

    Semantically selective augmentation for deep compact person re-identification

    We present a deep person re-identification approach that combines semantically selective, deep data augmentation with clustering-based network compression to generate high-performance, light, and fast inference networks. In particular, we propose to augment limited training data via sampling from a deep convolutional generative adversarial network (DCGAN), whose discriminator is constrained by a semantic classifier to explicitly control the domain specificity of the generation process. Thereby, we encode information in the classifier network which can be utilized to steer adversarial synthesis, and which fuels our CondenseNet ID-network training. We provide a quantitative and qualitative analysis of the approach and its variants on a number of datasets, obtaining results that outperform the state-of-the-art on the LIMA dataset for long-term monitoring in indoor living spaces.
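
    The abstract constrains the discriminator with a semantic classifier; one plausible sketch of how such a classifier can steer synthesis is to fold its loss into the generator update alongside the adversarial term. Everything below (the networks G, D, and semantic_clf, the target class, and the 0.5 weighting) is a hypothetical reading, not the authors' formulation:

    ```python
    # Sketch of semantically steered DCGAN synthesis (illustrative assumptions
    # throughout; D is assumed to emit one real/fake logit per image).
    import torch
    import torch.nn.functional as F

    def generator_step(G, D, semantic_clf, opt_G, z, target_class, weight=0.5):
        """One generator update: fool the discriminator AND satisfy the
        semantic classifier, keeping samples within the desired domain."""
        opt_G.zero_grad()
        fake = G(z)
        # Adversarial objective: discriminator should score fakes as real.
        adv_loss = F.binary_cross_entropy_with_logits(
            D(fake), torch.ones(z.size(0), 1, device=z.device))
        # Semantic constraint: generated images should be classified as the
        # target domain (e.g. "person"), controlling domain specificity.
        sem_loss = F.cross_entropy(
            semantic_clf(fake),
            torch.full((z.size(0),), target_class, device=z.device))
        loss = adv_loss + weight * sem_loss
        loss.backward()
        opt_G.step()
        return loss.item()
    ```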

    Energy expenditure estimation using visual and inertial sensors

    Deriving a person's energy expenditure accurately forms the foundation for tracking physical activity levels across many health and lifestyle monitoring tasks. In this study, the authors present a method for estimating calorific expenditure from combined visual and accelerometer sensors by way of an RGB-Depth camera and a wearable inertial sensor. The proposed individual-independent framework fuses information from both modalities, which leads to estimates that improve on the accuracy of single-modality and manual metabolic equivalents of task (MET) lookup-table methods. For evaluation, the authors introduce a new dataset called SPHERE_RGBD + Inertial_calorie, for which visual and inertial data are simultaneously obtained with indirect calorimetry ground truth measurements based on gas exchange. Experiments show that the fusion of visual and inertial data reduces the estimation error by 8% and 18% compared with the use of visual-only and inertial-only sensing, respectively, and by 33% compared with a MET-based approach. The authors conclude from their results that the proposed approach is suitable for home monitoring in a controlled environment.
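
    A minimal sketch of the feature-level fusion idea, assuming precomputed per-window features from each modality; the random-forest regressor and all names below are illustrative stand-ins for the authors' pipeline:

    ```python
    # Sketch of visual + inertial fusion for calorie regression (illustrative).
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def fuse_and_train(visual_feats, inertial_feats, calories):
        """visual_feats: (N, Dv) per-window features from the RGB-D camera,
        inertial_feats: (N, Di) per-window features from the accelerometer,
        calories: (N,) indirect-calorimetry ground truth."""
        X = np.hstack([visual_feats, inertial_feats])  # feature-level fusion
        model = RandomForestRegressor(n_estimators=200, random_state=0)
        model.fit(X, calories)
        return model
    ```

    Dropping either block of columns from X recovers the single-modality baselines the abstract compares against.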

    PanAf20K: a large video dataset for wild ape detection and behaviour recognition

    The work that allowed for the collection of the dataset was funded by the Max Planck Society, the Max Planck Society Innovation Fund, and Heinz L. Krekeler. This work was supported by the UKRI CDT in Interactive AI under grant EP/S022937/1. We present the PanAf20K dataset, the largest and most diverse open-access annotated video dataset of great apes in their natural environment. It comprises more than 7 million frames across ∼20,000 camera trap videos of chimpanzees and gorillas collected at 18 field sites in tropical Africa as part of the Pan African Programme: The Cultured Chimpanzee. The footage is accompanied by a rich set of annotations and benchmarks, making it suitable for training and testing a variety of challenging and ecologically important computer vision tasks, including ape detection and behaviour recognition. Furthering AI analysis of camera trap information is critical given that the International Union for Conservation of Nature now lists all species in the great ape family as either Endangered or Critically Endangered. We hope the dataset can form a solid basis for engagement of the AI community to improve performance, efficiency, and result interpretation in order to support assessments of great ape presence, abundance, distribution, and behaviour, and thereby aid conservation efforts. The dataset and code are available from the PanAf20K project website.

    DS-KCF: a real-time tracker for RGB-D data

    We propose an RGB-D single-object tracker, built upon the extremely fast RGB-only KCF tracker, that is able to exploit depth information to handle scale changes, occlusions, and shape changes. Despite the computational demands of the extra functionalities, we still achieve real-time performance rates of 35–43 fps in MATLAB and 187 fps in our C++ implementation. Our proposed method includes fast depth-based target object segmentation that enables (1) efficient scale-change handling within the KCF core functionality in the Fourier domain, (2) the detection of occlusions by temporal analysis of the target's depth distribution, and (3) the estimation of a target's change of shape through the temporal evolution of its segmented silhouette. Finally, we provide an in-depth analysis of the factors affecting the throughput and precision of our proposed tracker and perform extensive comparative analysis. Both the MATLAB and C++ versions of our software are available in the public domain.
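
    For illustration of point (2), a hedged sketch of occlusion detection from the target's depth distribution, assuming the target depth is modelled by a running mean and spread; the thresholds k and occluder_fraction are invented for the example:

    ```python
    # Depth-based occlusion test in the spirit of DS-KCF (illustrative).
    import numpy as np

    def occlusion_detected(depth_patch, target_mean, target_std,
                           k=2.0, occluder_fraction=0.35):
        """Flag an occlusion when a large share of pixels in the tracked
        region sit significantly closer to the camera than the target's
        modelled depth distribution."""
        valid = depth_patch[np.isfinite(depth_patch) & (depth_patch > 0)]
        if valid.size == 0:
            return False
        closer = valid < (target_mean - k * target_std)
        return closer.mean() > occluder_fraction
    ```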

    Imaging of subsurface lineaments in the southwestern part of the Thrace Basin from gravity data

    Linear anomalies, as indicators of the structural features of some geological bodies, are very important for the interpretation of gravity and magnetic data. In this study, an image processing technique known as the Hough transform (HT) algorithm is described for determining invisible boundaries and extensions in gravity anomaly maps. The Hough function implements the Hough transform used to extract straight lines or circles within two-dimensional potential field images, and operates between two domains: image space and Hough (parameter) space. Each nonzero point in the image domain is transformed to a sinusoid in the Hough domain, and each point in the Hough domain corresponds to a straight line or circle in the image domain. Lineaments are depicted from the straight lines recovered in the image domain. An application of the Hough transform to the Bouguer anomaly map of the southwestern part of the Thrace Basin, NW Turkey, shows the effectiveness of the proposed approach. Based on geological data and gravity data, the structural features in the southwestern part of the Thrace Basin are investigated by applying the proposed approach and the Blakely and Simpson method. Lineaments identified by these approaches are generally in good accordance with previously mapped surface faults.
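
    A minimal accumulator sketch of the line-detecting Hough transform just described, using the standard mapping rho = x·cos(theta) + y·sin(theta); the binning and resolution choices are assumptions:

    ```python
    # Minimal Hough accumulator for straight lines (illustrative sketch).
    import numpy as np

    def hough_lines(binary_img, n_theta=180):
        """Accumulate line votes for the nonzero points of a binary image."""
        h, w = binary_img.shape
        diag = int(np.ceil(np.hypot(h, w)))  # largest possible |rho|
        thetas = np.linspace(-np.pi / 2, np.pi / 2, n_theta, endpoint=False)
        acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int64)
        ys, xs = np.nonzero(binary_img)
        for x, y in zip(xs, ys):
            # Each nonzero image point traces one sinusoid in (rho, theta) space.
            rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
            acc[rhos + diag, np.arange(n_theta)] += 1
        return acc, thetas, diag
    ```

    Peaks in the accumulator correspond to straight lines (candidate lineaments): a peak at row r and column t maps back to the image-space line with rho = r - diag and angle thetas[t].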

    Book review: The Changing Perspectives and ‘New’ Geopolitics of the Caucasus in the 21st Century / ed. by S. Yilmaz, M. Yorulmaz. Ankara: Astana Yayınları, 2021. 304 p.


    A shortest path representation for video summarisation

    A novel approach is presented to select multiple key frames within an isolated video shot where there is camera motion causing significant scene change. This is achieved by determining the dominant motion between frame pairs, whose similarities are represented using a directed weighted graph. The shortest path in the graph, found using the A* search algorithm, designates the key frames. The overall method can be applied to extract a set of key frames which portray both the video content and camera motions, all of which are useful features for video indexing and retrieval.
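
    A sketch of the shortest-path formulation, assuming one descriptor per frame and dissimilarity-weighted edges between nearby frames; the networkx A* call and the max_skip parameter are illustrative choices, not the paper's exact construction:

    ```python
    # Key-frame selection as a shortest path over a frame graph (illustrative).
    import networkx as nx
    import numpy as np

    def select_key_frames(features, max_skip=10):
        """features: (N, D) array, one descriptor per frame. Nodes are frames;
        an edge i -> j (i < j <= i + max_skip) is weighted by frame
        dissimilarity, so the cheapest first-to-last path visits
        representative frames."""
        n = len(features)
        G = nx.DiGraph()
        for i in range(n):
            for j in range(i + 1, min(i + 1 + max_skip, n)):
                G.add_edge(i, j,
                           weight=float(np.linalg.norm(features[i] - features[j])))
        # A* with the default (zero) heuristic reduces to Dijkstra here; the
        # nodes on the returned path are the selected key frames.
        return nx.astar_path(G, 0, n - 1, weight="weight")
    ```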