
    Perceptual-based textures for scene labeling: a bottom-up and a top-down approach

    Due to the semantic gap, the automatic interpretation of digital images is a very challenging task. Both segmentation and classification are difficult because of the high variation in the data, so the choice of appropriate features is of utmost importance. This paper presents biologically inspired texture features for material classification and for interpreting outdoor scenery images. Experiments show that the presented texture features obtain the best classification results for material recognition compared to other well-known texture features, with an average classification rate of 93.0%. For scene analysis, both a bottom-up and a top-down strategy are employed to bridge the semantic gap. First, images are segmented into regions based on perceptual texture, and a semantic label is then computed for each region. Since this initial interpretation is still error prone, domain knowledge is incorporated to achieve a more accurate description of the depicted scene. By applying both strategies, 91.9% of the pixels in outdoor scenery images obtain a correct label.
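
    The abstract does not spell out the feature computation or the segmentation step. As a rough illustration of a texture-driven bottom-up pipeline (not the paper's perceptual features), the sketch below clusters pixels by Gabor filter-bank responses to form candidate regions; the filter parameters and number of clusters are arbitrary assumptions.

```python
# Sketch of a texture-based bottom-up segmentation step (illustrative only;
# the paper's biologically inspired features are not reproduced here).
import cv2
import numpy as np
from sklearn.cluster import KMeans

def gabor_texture_features(gray, thetas=(0, 45, 90, 135), sigmas=(2.0, 4.0)):
    """Stack Gabor filter responses as a per-pixel texture descriptor."""
    responses = []
    for theta in thetas:
        for sigma in sigmas:
            kernel = cv2.getGaborKernel(ksize=(31, 31), sigma=sigma,
                                        theta=np.deg2rad(theta),
                                        lambd=10.0, gamma=0.5)
            responses.append(cv2.filter2D(gray.astype(np.float32), -1, kernel))
    return np.stack(responses, axis=-1)           # H x W x (len(thetas)*len(sigmas))

def segment_by_texture(gray, n_regions=5):
    """Cluster pixels by their texture descriptor to form candidate regions."""
    feats = gabor_texture_features(gray)
    h, w, d = feats.shape
    labels = KMeans(n_clusters=n_regions, n_init=10).fit_predict(feats.reshape(-1, d))
    return labels.reshape(h, w)                   # region label per pixel

# Usage: regions = segment_by_texture(cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE))
```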

    Deep filter banks for texture recognition, description, and segmentation

    Visual textures have played a key role in image understanding because they convey important semantics of images, and because texture representations that pool local image descriptors in an orderless manner have had a tremendous impact in diverse applications. In this paper we make several contributions to texture understanding. First, instead of focusing on texture instance and material category recognition, we propose a human-interpretable vocabulary of texture attributes to describe common texture patterns, complemented by a new describable texture dataset for benchmarking. Second, we look at the problem of recognizing materials and texture attributes in realistic imaging conditions, including when textures appear in clutter, developing corresponding benchmarks on top of the recently proposed OpenSurfaces dataset. Third, we revisit classic texture representations, including bag-of-visual-words and Fisher vectors, in the context of deep learning and show that these have excellent efficiency and generalization properties if the convolutional layers of a deep model are used as filter banks. In this manner we obtain state-of-the-art performance on numerous datasets well beyond textures, an efficient method to apply deep features to image regions, and benefits in transferring features from one domain to another.
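
    As a hedged sketch of the core idea (convolutional layers of a pretrained network used as a filter bank whose local responses are pooled in an orderless manner), the code below extracts conv feature maps with torchvision's VGG-16 and pools them with a plain bag-of-visual-words histogram rather than the Fisher vectors used in the paper. The layer choice, vocabulary size, and the `IMAGENET1K_V1` weight identifier (recent torchvision versions) are assumptions.

```python
# Sketch: CNN convolutional layers as a filter bank with orderless pooling.
# A bag-of-visual-words histogram stands in for Fisher-vector pooling.
import numpy as np
import torch
import torchvision
from sklearn.cluster import KMeans

cnn = torchvision.models.vgg16(weights="IMAGENET1K_V1").features.eval()

def local_descriptors(image_tensor):
    """Treat each spatial position of the last conv map as a local descriptor."""
    with torch.no_grad():
        fmap = cnn(image_tensor.unsqueeze(0))             # 1 x C x H x W
    c = fmap.shape[1]
    return fmap.squeeze(0).reshape(c, -1).T.numpy()       # (H*W) x C

def bovw_encode(descriptors, codebook):
    """Orderless pooling: histogram of nearest visual words, L1-normalized."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(np.float64)
    return hist / max(hist.sum(), 1.0)

# Codebook learned offline from descriptors of many training images:
# codebook = KMeans(n_clusters=64, n_init=10).fit(np.vstack(all_train_descriptors))
```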

    Edge Detection for Object Recognition in Aerial Photographs

    An important objective in computer vision research is the automatic understanding of aerial photographs of urban and suburban locations. Several systems have been developed to begin to recognize man-made objects in these scenes; a brief review of these systems is presented. This paper introduces the Pennsylvania LandScan recognition system, which performs recognition on a scale model of the University of Pennsylvania campus. The LandScan recognition system uses features such as shape and height to identify objects such as sidewalks and buildings. This work also includes an extensive study of edge detection for object recognition. Two statistics, edge pixel density and average edge extent, are developed to differentiate between object border edges, texture edges, and noise edges. The Quantizer Votes edge detection algorithm is developed to find high-intensity, high-frequency edges. Future research directions concerning recognition system development and edge qualities and statistics are motivated by the results of this research.
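
    The abstract does not define the two statistics precisely; a plausible reading is sketched below, where edge pixel density is the fraction of edge pixels in the image and average edge extent is the mean size of connected edge components. Canny is used purely as a stand-in for the Quantizer Votes detector, which is not reproduced here.

```python
# Sketch of the two edge statistics described in the abstract, under assumed
# definitions: density = edge pixels / image area, extent = mean connected-
# component size. Canny substitutes for the Quantizer Votes detector.
import cv2
import numpy as np

def edge_statistics(gray, low=50, high=150):
    edges = cv2.Canny(gray, low, high)                    # binary edge map (0/255)
    density = float(np.count_nonzero(edges)) / edges.size

    # Average edge extent: mean pixel count of 8-connected edge components.
    n_labels, _, stats, _ = cv2.connectedComponentsWithStats(edges, connectivity=8)
    if n_labels <= 1:                                     # label 0 is background
        return density, 0.0
    avg_extent = float(stats[1:, cv2.CC_STAT_AREA].mean())
    return density, avg_extent

# Intuition: object borders tend to give long components (high extent), texture
# gives many short components (high density, low extent), and noise gives
# sparse, tiny components.
```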

    DCTNet : A Simple Learning-free Approach for Face Recognition

    PCANet was proposed as a lightweight deep learning network that mainly leverages Principal Component Analysis (PCA) to learn multistage filter banks, followed by binarization and block-wise histogramming. PCANet was shown to work surprisingly well in various image classification tasks. However, PCANet is data-dependent and hence inflexible. In this paper, we propose a data-independent network, dubbed DCTNet, for face recognition, in which we adopt the Discrete Cosine Transform (DCT) as the filter bank in place of PCA. This is motivated by the fact that the 2D DCT basis is a good approximation of the high-ranked eigenvectors of PCA. Both the 2D DCT and PCA bases resemble modulated sine-wave patterns, which can be perceived as a bandpass filter bank. DCTNet is free from learning, as the 2D DCT bases can be computed in advance. Besides that, we also propose an effective method to regulate the block-wise histogram feature vector of DCTNet for robustness, which is shown to provide a surprising performance boost when the probe image differs considerably in appearance from the gallery image. We evaluate the performance of DCTNet extensively on a number of benchmark face databases and show that it achieves accuracy on par with, or often better than, PCANet.
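
    A minimal sketch of the DCT-as-filter-bank idea follows: it builds 2D DCT basis filters from outer products of 1D DCT-II vectors, convolves, binarizes the responses into a per-pixel hash code, and forms block-wise histograms. Filter size, number of filters, and block size are illustrative assumptions, not the paper's settings.

```python
# Sketch of a single DCTNet-style stage: 2D DCT basis filters -> convolution ->
# binary hashing -> block-wise histograms. Parameters are illustrative only.
import numpy as np
from scipy.fftpack import dct
from scipy.signal import convolve2d

def dct2_filters(k=8, n_filters=8):
    """2D DCT basis filters as outer products of 1D DCT-II basis vectors."""
    dct_mat = dct(np.eye(k), norm='ortho', axis=0)   # row u = u-th 1D DCT basis vector
    filters = [np.outer(dct_mat[u], dct_mat[v])
               for u in range(k) for v in range(k)]
    return filters[1:n_filters + 1]                  # drop the constant (DC) filter

def dctnet_stage(gray, block=16):
    """Convolve, binarize, hash, and pool block-wise histograms."""
    responses = [convolve2d(gray, f, mode='same') for f in dct2_filters()]
    bits = [(r > 0).astype(np.int64) for r in responses]
    code = np.zeros_like(bits[0])
    for i, b in enumerate(bits):                     # per-pixel binary hash code
        code += b << i
    h, w = code.shape
    hists = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            patch = code[y:y + block, x:x + block].ravel()
            hists.append(np.bincount(patch, minlength=2 ** len(bits)))
    return np.concatenate(hists)                     # block-wise histogram feature
```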

    Integral Channel Features

    We study the performance of ‘integral channel features’ for image classification tasks, focusing in particular on pedestrian detection. The general idea behind integral channel features is that multiple registered image channels are computed using linear and non-linear transformations of the input image, and then features such as local sums, histograms, and Haar features and their various generalizations are efficiently computed using integral images. Such features have been used in the recent literature for a variety of tasks; indeed, variations appear to have been invented independently multiple times. Although integral channel features have proven effective, little effort has been devoted to analyzing or optimizing the features themselves. In this work we present a unified view of the relevant work in this area and perform a detailed experimental evaluation. We demonstrate that when designed properly, integral channel features not only outperform other features, including histograms of oriented gradients (HOG), but also (1) naturally integrate heterogeneous sources of information, (2) have few parameters and are insensitive to exact parameter settings, (3) allow for more accurate spatial localization during detection, and (4) result in fast detectors when coupled with cascade classifiers.
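
    As a brief illustration of the mechanism the abstract describes (transformed channels plus integral images for constant-time rectangular sums), the sketch below computes a gradient-magnitude channel and queries local sums from its summed-area table; the channel choice is just an example, not the paper's full channel set.

```python
# Sketch: one image channel + integral image for O(1) rectangular sums,
# the basic mechanism behind integral channel features.
import numpy as np

def gradient_magnitude_channel(gray):
    """A simple non-linear channel: gradient magnitude of the input image."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.hypot(gx, gy)

def integral_image(channel):
    """Summed-area table with a zero row/column prepended for easy indexing."""
    ii = channel.cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def rect_sum(ii, top, left, height, width):
    """Sum of channel values inside a rectangle, using four lookups."""
    b, r = top + height, left + width
    return ii[b, r] - ii[top, r] - ii[b, left] + ii[top, left]

# Usage:
# chan = gradient_magnitude_channel(gray)
# ii = integral_image(chan)
# feature = rect_sum(ii, top=10, left=20, height=16, width=8)
```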

    A framework for cardio-pulmonary resuscitation (CPR) scene retrieval from medical simulation videos based on object and activity detection.

    In this thesis, we propose a framework to detect and retrieve CPR activity scenes from medical simulation videos. Medical simulation is a modern training method for medical students, where an emergency patient condition is simulated on human-like mannequins and the students act upon it. These simulation sessions are recorded by the physician for later debriefing. With the increasing number of simulation videos, automatic detection and retrieval of specific scenes has become necessary. The proposed framework for CPR scene retrieval eliminates the conventional approach of using shot detection and frame segmentation techniques. First, our work explores the application of Histograms of Oriented Gradients in three dimensions (HOG3D) to retrieve the scenes containing CPR activity. Second, we investigate the use of Local Binary Patterns in Three Orthogonal Planes (LBP-TOP), the three-dimensional extension of the popular Local Binary Patterns; this robust feature can detect specific activities in scenes containing multiple actors and activities. Third, we propose an improvement to the above-mentioned methods through a combination of HOG3D and LBP-TOP, using decision-level fusion techniques to combine the features. We show experimentally that the proposed techniques and their combination outperform the existing system for CPR scene retrieval. Finally, we devise a method to detect and retrieve the scenes containing breathing-bag activity from the medical simulation videos. The proposed framework is tested and validated using eight medical simulation videos, and the results are presented.
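
    The abstract mentions decision-level fusion of HOG3D and LBP-TOP without detailing the rule. A common choice, shown below as an assumption rather than the thesis's exact scheme, is to average the per-class probability scores of two independently trained classifiers and take the highest fused score.

```python
# Sketch of decision-level (score) fusion of two descriptor-specific classifiers.
# Averaging probabilities is an assumed fusion rule, not necessarily the thesis's.
import numpy as np
from sklearn.svm import SVC

clf_hog3d = SVC(probability=True)   # trained on HOG3D descriptors
clf_lbptop = SVC(probability=True)  # trained on LBP-TOP descriptors

def fit(hog3d_feats, lbptop_feats, labels):
    """Train each classifier on its own descriptor, same labels for both."""
    clf_hog3d.fit(hog3d_feats, labels)
    clf_lbptop.fit(lbptop_feats, labels)

def predict_fused(hog3d_feats, lbptop_feats, weight=0.5):
    """Fuse at the decision level: weighted average of class probabilities."""
    p1 = clf_hog3d.predict_proba(hog3d_feats)
    p2 = clf_lbptop.predict_proba(lbptop_feats)
    fused = weight * p1 + (1.0 - weight) * p2
    return clf_hog3d.classes_[np.argmax(fused, axis=1)]   # fused class decision
```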