4,336 research outputs found

    Localization and recognition of the scoreboard in sports video based on SIFT point matching

    Get PDF
    In broadcast sports video, the scoreboard is attached at a fixed location in the video and generally the scoreboard always exists in all video frames in order to help viewers to understand the match’s progression quickly. Based on these observations, we present a new localization and recognition method for scoreboard text in sport videos in this paper. The method first matches the Scale Invariant Feature Transform (SIFT) points using a modified matching technique between two frames extracted from a video clip and then localizes the scoreboard by computing a robust estimate of the matched point cloud in a two-stage non-scoreboard filter process based on some domain rules. Next some enhancement operations are performed on the localized scoreboard, and a Multi-frame Voting Decision is used. Both aim to increasing the OCR rate. Experimental results demonstrate the effectiveness and efficiency of our proposed method

    In-the-wild Facial Expression Recognition in Extreme Poses

    Full text link
    In the computer research area, facial expression recognition is a hot research problem. Recent years, the research has moved from the lab environment to in-the-wild circumstances. It is challenging, especially under extreme poses. But current expression detection systems are trying to avoid the pose effects and gain the general applicable ability. In this work, we solve the problem in the opposite approach. We consider the head poses and detect the expressions within special head poses. Our work includes two parts: detect the head pose and group it into one pre-defined head pose class; do facial expression recognize within each pose class. Our experiments show that the recognition results with pose class grouping are much better than that of direct recognition without considering poses. We combine the hand-crafted features, SIFT, LBP and geometric feature, with deep learning feature as the representation of the expressions. The handcrafted features are added into the deep learning framework along with the high level deep learning features. As a comparison, we implement SVM and random forest to as the prediction models. To train and test our methodology, we labeled the face dataset with 6 basic expressions.Comment: Published on ICGIP201

    Place recognition: An Overview of Vision Perspective

    Full text link
    Place recognition is one of the most fundamental topics in computer vision and robotics communities, where the task is to accurately and efficiently recognize the location of a given query image. Despite years of wisdom accumulated in this field, place recognition still remains an open problem due to the various ways in which the appearance of real-world places may differ. This paper presents an overview of the place recognition literature. Since condition invariant and viewpoint invariant features are essential factors to long-term robust visual place recognition system, We start with traditional image description methodology developed in the past, which exploit techniques from image retrieval field. Recently, the rapid advances of related fields such as object detection and image classification have inspired a new technique to improve visual place recognition system, i.e., convolutional neural networks (CNNs). Thus we then introduce recent progress of visual place recognition system based on CNNs to automatically learn better image representations for places. Eventually, we close with discussions and future work of place recognition.Comment: Applied Sciences (2018

    Stratified decision forests for accurate anatomical landmark localization in cardiac images

    Get PDF
    Accurate localization of anatomical landmarks is an important step in medical imaging, as it provides useful prior information for subsequent image analysis and acquisition methods. It is particularly useful for initialization of automatic image analysis tools (e.g. segmentation and registration) and detection of scan planes for automated image acquisition. Landmark localization has been commonly performed using learning based approaches, such as classifier and/or regressor models. However, trained models may not generalize well in heterogeneous datasets when the images contain large differences due to size, pose and shape variations of organs. To learn more data-adaptive and patient specific models, we propose a novel stratification based training model, and demonstrate its use in a decision forest. The proposed approach does not require any additional training information compared to the standard model training procedure and can be easily integrated into any decision tree framework. The proposed method is evaluated on 1080 3D highresolution and 90 multi-stack 2D cardiac cine MR images. The experiments show that the proposed method achieves state-of-theart landmark localization accuracy and outperforms standard regression and classification based approaches. Additionally, the proposed method is used in a multi-atlas segmentation to create a fully automatic segmentation pipeline, and the results show that it achieves state-of-the-art segmentation accuracy

    A comparative evaluation of interest point detectors and local descriptors for visual SLAM

    Get PDF
    Abstract In this paper we compare the behavior of different interest points detectors and descriptors under the conditions needed to be used as landmarks in vision-based simultaneous localization and mapping (SLAM). We evaluate the repeatability of the detectors, as well as the invariance and distinctiveness of the descriptors, under different perceptual conditions using sequences of images representing planar objects as well as 3D scenes. We believe that this information will be useful when selecting an appropriat

    Action Recognition in Videos: from Motion Capture Labs to the Web

    Full text link
    This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework which puts in evidence the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypothesis assumed and thus, the constraints imposed on the type of video that each technique is able to address. Expliciting the hypothesis and constraints makes the framework particularly useful to select a method, given an application. Another advantage of the proposed organization is that it allows categorizing newest approaches seamlessly with traditional ones, while providing an insightful perspective of the evolution of the action recognition task up to now. That perspective is the basis for the discussion in the end of the paper, where we also present the main open issues in the area.Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 table
    corecore