38 research outputs found

    Capsule Networks for Object Detection in UAV Imagery

    Recent advances in Convolutional Neural Networks (CNNs) have attracted great attention in remote sensing due to their high capability to model the high-level semantic content of Remote Sensing (RS) images. However, CNNs do not explicitly retain the relative position of objects in an image, and thus the effectiveness of the obtained features is limited for complex object detection problems. To address this problem, in this paper we introduce Capsule Networks (CapsNets) for object detection in images acquired by Unmanned Aerial Vehicles (UAVs). Unlike CNNs, CapsNets extract and exploit information about objects' relative positions across several layers, which enables parsing crowded scenes with overlapping objects. Experimental results obtained on two datasets for car and solar panel detection show that CapsNets provide object detection accuracies similar to state-of-the-art deep models with significantly reduced computational time. This is because CapsNets emphasize dynamic routing instead of depth.
    EC/H2020/759764/EU/Accurate and Scalable Processing of Big Data in Earth Observation/BigEart
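The routing-by-agreement mechanism that distinguishes CapsNets from depth-based CNNs can be sketched as follows. This is a minimal NumPy illustration of dynamic routing between two capsule layers; the shapes, iteration count, and random inputs are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squash nonlinearity: preserves direction, maps the norm into [0, 1).
    sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def dynamic_routing(u_hat, n_iters=3):
    # u_hat: predictions from lower capsules, shape (n_in, n_out, dim_out).
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                               # routing logits
    for _ in range(n_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coeffs
        s = (c[:, :, None] * u_hat).sum(axis=0)               # weighted sum
        v = squash(s)                                         # output capsules
        b = b + np.einsum('ijk,jk->ij', u_hat, v)             # agreement update
    return v

rng = np.random.default_rng(0)
u_hat = rng.normal(size=(8, 3, 4))   # 8 input capsules routed to 3 output capsules
v = dynamic_routing(u_hat)
print(v.shape)                       # each output row has norm below 1
```

The coupling coefficients are recomputed from agreement at every iteration rather than learned, which is why routing adds little depth to the network.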

    Encoding Motion Cues for Pedestrian Path Prediction in Dense Crowd Scenarios

    Pedestrian path prediction is an emerging topic in the crowd visual analysis domain, despite its practical importance in many respects. To date, the few contributions in the literature have proposed fairly straightforward approaches, and only a few of them take into account the interaction between pedestrians as a paramount cue for forecasting their potential walking preferences in a given scene. Moreover, the typical trend has been to evaluate the proposed algorithms on sparse scenarios. To cope with more realistic cases, in this paper we present an efficient approach for pedestrian path prediction in densely crowded scenes. The proposed approach first extracts motion features related to the target pedestrian and his/her neighbors. Second, to further increase the representativeness of the extracted motion cues, an autoencoder feature learning model is applied, whose outcome finally feeds a Gaussian process regression model to infer the potential future trajectories of the target pedestrians given their walking records in the scene. Experimental results demonstrate that our framework achieves plausible results and outperforms traditional methods in the literature.
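The prediction stage can be sketched with scikit-learn as follows. The sketch stands in raw motion cues for the learned autoencoder representation, and the data (a window of past positions per pedestrian plus a mean neighbor offset, mapped to the next position) are synthetic illustrations rather than the paper's setup:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical toy data: each sample is a flattened window of past (x, y)
# positions of a pedestrian plus a mean neighbor offset (the "motion cues");
# the target is the pedestrian's next position.
rng = np.random.default_rng(42)
n, window = 200, 4
past = np.cumsum(rng.normal(0.1, 0.05, size=(n, window, 2)), axis=1)
neigh = rng.normal(0, 0.2, size=(n, 2))             # mean neighbor offset
X = np.hstack([past.reshape(n, -1), neigh])         # motion-cue feature vector
y = past[:, -1, :] + rng.normal(0.1, 0.02, (n, 2))  # "next" position

gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gpr.fit(X[:150], y[:150])
pred = gpr.predict(X[150:])   # predicted next positions for held-out pedestrians
print(pred.shape)
```

A Gaussian process is a natural fit here because it also yields predictive uncertainty, which matters when several future paths are plausible in a dense crowd.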

    Recovering the Sight to blind People in indoor Environments with smart Technologies

    The methodologies presented in this thesis address the problem of blind people's rehabilitation through assistive technologies. In overall terms, the basic and principal needs of a blind individual can be confined to two components, namely (i) navigation/obstacle avoidance and (ii) object recognition. A close look at the literature makes clear that the former category has received far more attention than the latter. Moreover, the few contributions addressing the second need tend to approach the recognition task with a single predefined class of objects. Furthermore, both needs, to the best of our knowledge, have not been embedded into a single prototype. In this respect, we put forth two main contributions in this thesis. The first and main one tackles the issue of object recognition for the blind, for which we propose a 'coarse recognition' approach that detects objects in bulk rather than focusing on a single class. The underlying insight of coarse recognition is to list the objects that likely exist in a camera-shot image (acquired by the blind individual through a suitable interface, e.g., voice recognition/synthesis-based support), regardless of their position in the scene. It thus trades object detail for computational time so as to lessen the processing constraints. As for the second contribution, we integrate the recognition algorithm into an implemented navigation system supplied with a laser-based obstacle avoidance module. Evaluated on image datasets acquired in indoor environments, the recognition schemes exhibit, with little to mild disparities among one another, interesting results in terms of both recognition rate and processing time. On the other hand, the navigation system has been assessed in an indoor site and has revealed plausible performance and flexibility with respect to the usual mobility speed of blind people. A thorough experimental analysis is provided, alongside laying the foundations for potential future research lines, including object recognition in outdoor environments.
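The coarse-recognition idea of listing which objects are present, irrespective of position, can be viewed as multi-label classification. The toy sketch below uses scikit-learn with one binary present/absent classifier per object class; everything in it, from the features to the object "signatures", is an illustrative assumption rather than the thesis's actual method:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Synthetic stand-in: each "image" is a feature vector mixing the signatures
# of the objects it contains. Sizes and data are illustrative only.
rng = np.random.default_rng(0)
n, d, n_obj = 300, 20, 4
sigs = rng.normal(size=(n_obj, d))                # one signature per object class
Y = (rng.random((n, n_obj)) < 0.4).astype(int)    # which objects are present
X = Y @ sigs + 0.2 * rng.normal(size=(n, d))

# One binary "present / absent" classifier per object class.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X[:250], Y[:250])
pred = clf.predict(X[250:])                       # object list per held-out image
print((pred == Y[250:]).mean())                   # per-label accuracy
```

Because each label is decided independently and no localization is attempted, the cost grows only linearly with the number of object classes, which matches the trade-off the thesis describes.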

    Sparse modeling of the land use classification problem

    No full text
    In this paper, we present a fusion method contextualized within a land use classification framework. First, feature vectors are extracted from all the color channels of a given test image. Then, the generated vectors are reconstructed over a set of training feature vectors extracted from training images. The resulting reconstruction residuals feed a fusion mechanism that composes a final residual, which serves to infer the class of the test image. Validated on a benchmark dataset, the presented method yields drastic improvements over using a single spectral channel. Furthermore, encouraging gains have been recorded with respect to reference works.
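One way to read the described pipeline is as sparse-representation classification with residual fusion across channels. The toy sketch below (using scikit-learn's OMP solver; the dictionary sizes, the three "channels", and the summation fusion are illustrative assumptions) codes a test vector over a per-channel training dictionary, computes class-wise reconstruction residuals, and fuses them by summation:

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def class_residuals(D, labels, y, k=10):
    # Sparse-code y over dictionary D (columns = training vectors), then
    # measure the reconstruction residual using each class's atoms alone.
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
    omp.fit(D, y)
    w = omp.coef_
    return np.array([np.linalg.norm(y - D @ np.where(labels == c, w, 0.0))
                     for c in np.unique(labels)])

rng = np.random.default_rng(0)
d, per_class, n_classes, n_ch = 30, 10, 3, 3    # toy sizes, 3 "color channels"
labels = np.repeat(np.arange(n_classes), per_class)
# One dictionary per channel; the test image is drawn near class 1's atoms.
dicts = [rng.normal(size=(d, per_class * n_classes)) for _ in range(n_ch)]
tests = [D[:, labels == 1] @ rng.uniform(0.5, 1.0, per_class)
         + 0.05 * rng.normal(size=d) for D in dicts]

fused = sum(class_residuals(D, labels, y) for D, y in zip(dicts, tests))
print(int(np.argmin(fused)))   # the fused residual should favour class 1
```

Summing residuals across channels lets an ambiguous channel be outvoted by the others, which is the intuition behind fusing rather than picking one spectral channel.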

    Sparse coding joint decision rule for ear print recognition

    No full text
    Human ear recognition has been promoted as a profitable biometric over the past few years. Compared to other modalities, such as the face and iris, which have undergone significant investigation in the literature, the ear pattern remains relatively under-explored. We put forth a sparse coding-induced decision rule for ear recognition. It jointly involves the reconstruction residuals and the respective reconstruction coefficients pertaining to the input features (co-occurrence of adjacent local binary patterns) in a further fusion. We show in particular that combining both components (i.e., the residuals as well as the coefficients) yields better outcomes than when either of them is used alone. The proposed method has been evaluated on two benchmark datasets, namely IITD1 (125 subjects) and IITD2 (221 subjects). The recognition rates of the suggested scheme amount to 99.5% and 98.95% on the two datasets, respectively, which suggests that our method stands out decently against reference state-of-the-art methodologies. Furthermore, experiments show that the presented scheme manifests promising robustness under large-scale occlusion scenarios.
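A minimal sketch of such a joint rule, under the assumption that "joint" means a convex combination of normalized class residuals (lower is better) and class coefficient energies (higher is better); the weighting alpha, the solver, and the toy data are all illustrative, not the paper's actual formulation:

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def joint_scores(D, labels, y, k=8, alpha=0.5):
    # Sparse-code y over the training dictionary, then fuse two cues per class:
    # the reconstruction residual and the share of coefficient energy that
    # falls on that class's atoms.
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
    omp.fit(D, y)
    w = omp.coef_
    classes = np.unique(labels)
    res = np.array([np.linalg.norm(y - D @ np.where(labels == c, w, 0.0))
                    for c in classes])
    coef = np.array([np.abs(w[labels == c]).sum() for c in classes])
    res = res / res.sum()                           # normalise both cues
    coef = coef / (coef.sum() + 1e-12)
    return alpha * res + (1 - alpha) * (1 - coef)   # lower = more likely class

rng = np.random.default_rng(1)
d, per_class, n_classes = 40, 8, 4
labels = np.repeat(np.arange(n_classes), per_class)
D = rng.normal(size=(d, per_class * n_classes))
y = D[:, labels == 2] @ rng.uniform(0.5, 1.0, per_class) + 0.05 * rng.normal(size=d)
print(int(np.argmin(joint_scores(D, labels, y))))
```

Using both cues hedges against cases where the residuals of two classes are close but the coefficient mass clearly concentrates on one of them.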

    Distance penalization and fusion for person re-identification

    No full text
    This paper presents a novel person re-identification framework based on data fusion. The pipeline of the proposed method is composed of two stages. First, a metric learning paradigm is applied to a set of distinct feature extractors to produce an ensemble of estimated distance measures, which are subsequently penalized according to their confidence in distinguishing correct matches from false ones, and averaged to draw a final decision. Second, the closest persons from the gallery are selected based on the previously fused distance estimates and used to build a dictionary for reconstructing a given probe pattern. Evaluated on benchmark datasets, the proposed framework advances the state of the art by interesting margins. In particular, Rank-1 gains of about 12%, 1%, 6%, and 12% were scored on VIPeR, CAVIAR4REID, iLIDS, and 3DPeS, respectively.
    Behzad Mirmahboub; Mohamed Lamine Mekhalfi; Vittorio Murino
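The penalize-and-average stage can be sketched as follows. The confidence score used here (the mean margin between each probe's two smallest distances) is an assumption standing in for the paper's actual confidence measure, and the distance matrices are synthetic:

```python
import numpy as np

def fuse_distances(dist_list):
    # Each matrix: rows = probes, cols = gallery. Weight each descriptor's
    # distances by a confidence score, then take the weighted average.
    fused = np.zeros_like(dist_list[0], dtype=float)
    wsum = 0.0
    for D in dist_list:
        Dn = (D - D.min()) / (D.max() - D.min() + 1e-12)  # normalise scale
        part = np.partition(Dn, 1, axis=1)
        conf = float(np.mean(part[:, 1] - part[:, 0]))    # mean top-2 margin
        fused += conf * Dn
        wsum += conf
    return fused / wsum

rng = np.random.default_rng(0)
n = 6
true = np.arange(n)
# Two informative descriptors (small distance on the diagonal, i.e. the true
# match) and one uninformative, noisy one. All values are hypothetical.
good1 = rng.random((n, n)) + 1.0 - np.eye(n)
good2 = rng.random((n, n)) + 1.0 - np.eye(n)
noisy = rng.random((n, n))
fused = fuse_distances([good1, good2, noisy])
print((fused.argmin(axis=1) == true).mean())  # rank-1 accuracy after fusion
```

The noisy descriptor earns a small top-2 margin and is therefore down-weighted, so the fused ranking is dominated by the informative descriptors.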

    Person re-identification by order-induced metric fusion

    No full text
    This paper presents a novel two-pronged framework for person re-identification. Its idea builds on the fact that distinct descriptors produce different ranking scores for the same probe pattern. Thus, if conveniently fused, the descriptors at hand ought to compensate for each other, leading to significant improvements. In this respect, this paper proposes a learning-free weighting method that penalizes and averages the re-identification estimates (e.g., distances) produced by different descriptors according to their confidence in evidencing the correct match, for a given probe person, within a given gallery. We show in particular that tangible improvements can be attained with respect to using each descriptor individually. Moreover, we consider a confidence measure mechanism that exploits the mutual pairwise distances within the gallery in order to raise the scores obtained at the fusion stage, and we show that interesting improvements can be achieved. We evaluate the proposed framework on four benchmark datasets and advance recent works by large margins.

    Heart sounds analysis using wavelets responses and support vector machines

    No full text
    Over the last decade, computerized heart screening techniques have received increasing attention. In general, such techniques can be categorized as either with or without the so-called Electrocardiogram (ECG) signal. Following the latter strategy, this paper designs an algorithm that analyzes heart sound recordings, known as phonocardiograms (PCGs), to identify the present pathology, if any. A novel algorithm for heart sound segmentation is also presented. Decision making is accomplished by means of a support vector machine (SVM) classifier fed with characteristic features extracted from the PCGs based on wavelet filter bank coefficients, so that PCG signals are classified into five classes: normal heart sound (NHS), aortic stenosis (AS), aortic insufficiency (AI), mitral stenosis (MS), and mitral insufficiency (MI). The SVM was trained on a low-dimensional feature space and tested on a relatively large dataset in order to show its generalization capability.
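The feature-and-classify pipeline can be sketched as below, assuming log-energies of a simple Haar filter-bank decomposition as the low-dimensional feature vector and synthetic sinusoids standing in for PCG recordings (the paper's actual wavelets, classes, and data differ):

```python
import numpy as np
from sklearn.svm import SVC

def haar_step(x):
    # One level of the Haar filter bank: approximation and detail halves.
    x = x[: len(x) // 2 * 2]
    return (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)

def wavelet_features(signal, level=4):
    # Log-energy of each detail subband plus the final approximation band.
    feats, a = [], np.asarray(signal, dtype=float)
    for _ in range(level):
        a, d = haar_step(a)
        feats.append(np.log(np.sum(d ** 2) + 1e-12))
    feats.append(np.log(np.sum(a ** 2) + 1e-12))
    return np.array(feats)

# Two synthetic "classes" dominated by different frequencies (toy stand-ins
# for, e.g., normal vs. pathological heart sounds).
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024, endpoint=False)
def make(freq, n):
    return [np.sin(2 * np.pi * freq * t) + 0.3 * rng.normal(size=t.size)
            for _ in range(n)]
X = np.array([wavelet_features(s) for s in make(20, 40) + make(80, 40)])
y = np.array([0] * 40 + [1] * 40)

idx = rng.permutation(80)
clf = SVC(kernel="rbf", gamma="scale").fit(X[idx[:60]], y[idx[:60]])
print(clf.score(X[idx[60:]], y[idx[60:]]))  # held-out accuracy
```

Because each decomposition level halves the frequency band, the subband energies form a compact spectral summary, which is why a low-dimensional feature space suffices for the SVM.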