
    A feature-based structural measure: an image similarity measure for face recognition

    Facial recognition is one of the most challenging and interesting problems in the field of computer vision and pattern recognition. During the last few years, it has gained special attention due to its importance for current issues such as security, surveillance systems and forensic analysis. Despite this high level of attention, the success of facial recognition is still limited by certain conditions; no method gives reliable results in all situations. In this paper, we propose an efficient similarity index that resolves the shortcomings of existing measures of feature and structural similarity. This measure, called the Feature-Based Structural Measure (FSM), combines the best features of the well-known SSIM (structural similarity index measure) and FSIM (feature similarity index measure) approaches, striking a balance between performance for similar and dissimilar images of human faces. In addition to the statistical structural properties provided by SSIM, edge detection is incorporated in FSM as a distinctive structural feature. Its performance is tested over a wide range of PSNR (peak signal-to-noise ratio) values, using the ORL (Olivetti Research Laboratory, now AT&T Laboratory Cambridge) and FEI (Faculty of Industrial Engineering, São Bernardo do Campo, São Paulo, Brazil) databases. The proposed measure is tested under conditions of Gaussian noise; simulation results show that the proposed FSM outperforms the well-known SSIM and FSIM approaches in its efficiency of similarity detection and recognition of human faces.
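
    The kind of measure described above can be pictured with a rough sketch: a global SSIM-style statistical term combined with a Sobel edge-agreement term on two grayscale images. The constants, the global (non-windowed) pooling and the mixing weight alpha below are illustrative assumptions, not the published FSM formulation.

```python
# Illustrative sketch only: a simplified similarity index mixing an
# SSIM-style statistical term with an edge-agreement term, in the spirit
# of the FSM described above. Pooling and weights are assumptions.
import numpy as np
from scipy import ndimage

def ssim_term(x, y, c1=1e-4, c2=9e-4):
    """Global (non-windowed) SSIM-style term for two grayscale images in [0, 1]."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

def edge_term(x, y, eps=1e-8):
    """Agreement between Sobel gradient-magnitude maps, used as a structural feature."""
    gx = np.hypot(ndimage.sobel(x, axis=0), ndimage.sobel(x, axis=1))
    gy = np.hypot(ndimage.sobel(y, axis=0), ndimage.sobel(y, axis=1))
    return (2 * gx * gy + eps).sum() / (gx**2 + gy**2 + eps).sum()

def fsm_like(x, y, alpha=0.5):
    """Convex combination of the two terms; alpha is an assumed tuning weight."""
    return alpha * ssim_term(x, y) + (1 - alpha) * edge_term(x, y)

# Usage: compare a face-sized image against a Gaussian-noise-corrupted copy.
rng = np.random.default_rng(0)
face = rng.random((112, 92))                  # ORL-sized stand-in image
noisy = np.clip(face + rng.normal(0, 0.05, face.shape), 0, 1)
print(fsm_like(face, noisy))
```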

    Automated detection in benthic images for megafauna classification and marine resource exploration: supervised and unsupervised methods for classification and regression tasks in benthic images with efficient integration of expert knowledge

    Schoening T. Automated detection in benthic images for megafauna classification and marine resource exploration: supervised and unsupervised methods for classification and regression tasks in benthic images with efficient integration of expert knowledge. Bielefeld: Universitätsbibliothek Bielefeld; 2015.
    Image acquisition of deep-sea floors allows us to cast a glance at an extraordinary environment. Exploring the largely unknown geology and biology of the deep sea regularly challenges the scientific understanding of the conditions, processes and changes occurring there. Increasing sampling efforts, through both more frequent image acquisition and widespread monitoring of large areas, are currently refining the scientific models of this environment. Alongside these sampling efforts, novel challenges emerge for image-based marine research. These include growing data volume, growing data variety and the increasing velocity at which data are acquired. Apart from the technical challenges involved, the fundamental problem is to add semantics to the acquired data in order to extract further meaning and derive knowledge. Manual analysis of the data, i.e. manually annotating images (e.g. annotating occurring species to gain knowledge about species interactions), is an intricate task and has become infeasible due to the huge data volumes. The combination of data and interpretation challenges calls for automated approaches based on pattern recognition and especially computer vision methods. These methods have been applied in other fields to add meaning to visual data but have rarely been applied to the peculiar case of marine imaging. First, the physical factors of the environment constitute a unique computer vision challenge and require special attention when adapting the methods. Second, the impossibility of creating a reliable reference gold standard from multiple field-expert annotations complicates the development and evaluation of automated, pattern-recognition-based approaches. In this thesis, novel automated methods to add semantics to benthic images are presented that are based on common pattern recognition techniques. Three major benthic computer vision scenarios are addressed: the detection of laser points for scale quantification, the detection and classification of benthic megafauna for habitat composition assessments, and the detection and quantity estimation of benthic mineral resources for deep-sea mining. All approaches to these scenarios are fitted to the peculiarities of the marine environment. The primary paradigm that guided the development of all methods was to design systems that can be operated by field experts without knowledge of the applied pattern recognition methods. The systems therefore have to be generally applicable to arbitrary image-based detection scenarios, which in turn makes them applicable in other computer vision fields outside the marine environment as well. By tuning system parameters automatically from field-expert annotations and applying methods that cope with errors in those annotations, the limitations of inaccurate gold standards can be bypassed. This allows the developed systems to be used to further refine the scientific models through automated image analysis.
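
    As a small illustration of the first scenario, laser-point detection for scale quantification, the sketch below thresholds bright red pixels, finds connected components and converts the known physical spacing of the laser points into a centimetres-per-pixel factor. The colour thresholds and the 10 cm spacing are illustrative assumptions, not the detection system developed in the thesis.

```python
# Minimal sketch, not the thesis' actual system: find red laser dots in a
# benthic RGB image by colour thresholding and connected components, then
# derive an image scale from the known spacing between the lasers.
import numpy as np
from scipy import ndimage

def detect_laser_points(rgb, red_min=200, others_max=120):
    """Return (row, col) centroids of red laser dots in an HxWx3 uint8 image."""
    mask = (rgb[..., 0] >= red_min) & (rgb[..., 1] <= others_max) & (rgb[..., 2] <= others_max)
    labels, n = ndimage.label(mask)
    return ndimage.center_of_mass(mask, labels, list(range(1, n + 1)))

def cm_per_pixel(points, laser_spacing_cm=10.0):
    """Estimate scale from the pixel distance between the first two detected points."""
    (r1, c1), (r2, c2) = points[:2]
    return laser_spacing_cm / np.hypot(r1 - r2, c1 - c2)

# Usage on a synthetic frame with two red dots 200 pixels apart.
img = np.zeros((480, 640, 3), dtype=np.uint8)
img[100:103, 100:103, 0] = 255
img[100:103, 300:303, 0] = 255
pts = detect_laser_points(img)
print(pts, cm_per_pixel(pts))     # scale of roughly 0.05 cm per pixel
```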

    GII Representation-Based Cross-View Gait Recognition by Discriminative Projection With List-Wise Constraints

    Remote person identification by gait is one of the most important topics in the field of computer vision and pattern recognition. However, gait recognition suffers severely from the appearance variance caused by view changes. Gait recognition typically achieves high performance when the view is fixed, but performance drops sharply when the view variance becomes significant. Existing approaches have tried strategies such as tensor analysis or view-transformation models to mitigate this performance decrease, but there is still room for improvement. In this paper, a discriminative projection with list-wise constraints (DPLC) is proposed to deal with view variance in cross-view gait recognition; it is further refined by introducing a rectification term to automatically capture the principal discriminative information. The DPLC with rectification (DPLCR) embeds list-wise relative similarity measurements among intra-class and inter-class individuals, which allows a more discriminative and robust projection to be learned. Building on the original DPLCR, we introduce the kernel trick to exploit nonlinear cross-view correlations and extend DPLCR to the problem of multi-view gait recognition. Moreover, a simple yet efficient gait representation, namely the gait individuality image (GII), based on the gait energy image, is proposed, which better captures the discriminative information for cross-view gait recognition. Experiments have been conducted on the CASIA-B database, and the results demonstrate the outstanding performance of both the DPLCR framework and the new GII representation. DPLCR-based cross-view gait recognition outperforms state-of-the-art approaches in almost all cases under large view variance, and the combination of the GII representation and DPLCR further enhances performance, setting a new benchmark for cross-view gait recognition.
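
    As a rough stand-in for the projection-learning idea, the sketch below computes a plain Fisher-style discriminative projection on gait feature vectors, pulling same-subject samples together and pushing different subjects apart. It deliberately omits the list-wise constraints, the rectification term and the kernel extension that define DPLC/DPLCR; the regularizer and dimensions are illustrative.

```python
# Hedged sketch: a Fisher-style discriminative projection on gait-energy-image
# style feature vectors. This stands in for the list-wise constrained projection
# (DPLC/DPLCR) described above; only the general "learn W so same-subject
# features project close together" idea is kept.
import numpy as np
from scipy.linalg import eigh

def fisher_projection(X, y, dim, reg=1e-3):
    """X: (n_samples, n_features), y: subject labels. Returns an (n_features, dim) projection."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))   # within-class scatter
    Sb = np.zeros((d, d))   # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    # Generalized eigenproblem Sb w = lambda (Sw + reg I) w; keep the top eigenvectors.
    vals, vecs = eigh(Sb, Sw + reg * np.eye(d))
    return vecs[:, np.argsort(vals)[::-1][:dim]]

# Usage with random stand-in features: 5 subjects, 8 sequences each, 64-D features.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 64))
y = np.repeat(np.arange(5), 8)
W = fisher_projection(X, y, dim=4)
print((X @ W).shape)   # projected features: (40, 4)
```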

    Fused Text Segmentation Networks for Multi-oriented Scene Text Detection

    In this paper, we introduce a novel end-to-end framework for multi-oriented scene text detection from an instance-aware semantic segmentation perspective. We present Fused Text Segmentation Networks, which combine multi-level features during feature extraction, since text instances may rely on finer feature expression than general objects. The framework detects and segments text instances jointly and simultaneously, leveraging merits from both the semantic segmentation task and region-proposal-based object detection. Without involving any extra pipelines, our approach surpasses the current state of the art on the multi-oriented scene text detection benchmarks ICDAR2015 Incidental Scene Text and MSRA-TD500, reaching Hmean scores of 84.1% and 82.0%, respectively. Moreover, we report a baseline on Total-Text, which contains curved text, suggesting the effectiveness of the proposed approach.
    Comment: Accepted by ICPR2018
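
    The multi-level feature fusion ingredient can be pictured with a toy PyTorch module: a coarser, deeper feature map is upsampled and concatenated with a finer one before a per-pixel text/non-text prediction. This is only a sketch of the fusion idea under assumed layer sizes, not the FTSN architecture or its instance-aware segmentation head.

```python
# Toy sketch of multi-level feature fusion for text segmentation (not FTSN itself).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFusedSegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16 + 32, 2, 1)   # two classes: background / text

    def forward(self, x):
        f1 = self.stage1(x)                                   # fine, high-resolution features
        f2 = self.stage2(f1)                                  # coarse, deeper features
        f2_up = F.interpolate(f2, size=f1.shape[2:], mode="bilinear", align_corners=False)
        fused = torch.cat([f1, f2_up], dim=1)                 # multi-level fusion
        return self.head(fused)                               # per-pixel logits

# Usage: one 3x128x128 image in, a 2-channel 128x128 segmentation map out.
net = TinyFusedSegNet()
print(net(torch.randn(1, 3, 128, 128)).shape)   # torch.Size([1, 2, 128, 128])
```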

    Efficient and effective human action recognition in video through motion boundary description with a compact set of trajectories

    Human action recognition (HAR) is at the core of human-computer interaction and video scene understanding. However, achieving effective HAR in an unconstrained environment is still a challenging task. To that end, trajectory-based video representations are currently widely used. Despite the promising levels of effectiveness achieved by these approaches, problems regarding computational complexity and the presence of redundant trajectories still need to be addressed in a satisfactory way. In this paper, we propose a method for trajectory rejection that reduces the number of redundant trajectories without degrading the effectiveness of HAR. Furthermore, to realize efficient optical flow estimation prior to trajectory extraction, we integrate a method for dynamic frame skipping. Experiments with four publicly available human action datasets show that the proposed approach outperforms state-of-the-art HAR approaches in terms of effectiveness, while simultaneously mitigating the computational complexity.
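
    A minimal sketch of the two ideas, trajectory rejection and dynamic frame skipping, could look as follows. The displacement threshold and the mapping from mean optical-flow magnitude to a skip count are illustrative assumptions, not the parameters or implementation of the paper.

```python
# Illustrative sketch: (1) reject trajectories whose total motion is too small
# to be informative, and (2) choose a frame-skip interval from the average
# motion magnitude so that nearly static segments get less optical-flow work.
import numpy as np

def reject_trajectories(trajs, min_total_disp=5.0):
    """trajs: list of (L, 2) arrays of tracked point positions over L frames.
    Keep only trajectories whose summed frame-to-frame displacement is large enough."""
    kept = []
    for t in trajs:
        disp = np.linalg.norm(np.diff(t, axis=0), axis=1).sum()
        if disp >= min_total_disp:
            kept.append(t)
    return kept

def dynamic_skip(mean_flow_mag, low=0.2, high=1.0, max_skip=4):
    """Map the mean optical-flow magnitude of the current frame to a skip count:
    large motion -> process every frame, small motion -> skip more frames."""
    if mean_flow_mag >= high:
        return 1
    if mean_flow_mag <= low:
        return max_skip
    frac = (high - mean_flow_mag) / (high - low)
    return 1 + int(frac * (max_skip - 1))

# Usage with synthetic trajectories: one static point and one moving point.
static = np.zeros((15, 2))
moving = np.cumsum(np.ones((15, 2)), axis=0)
print(len(reject_trajectories([static, moving])))                # 1 (static one rejected)
print(dynamic_skip(0.1), dynamic_skip(0.6), dynamic_skip(2.0))   # 4 2 1
```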