    Region-based Skin Color Detection.

    Skin color provides a powerful cue for complex computer vision applications. Although skin color detection has been an active research area for decades, the mainstream technology is based on individual pixels. This paper presents a new region-based technique for skin color detection which outperforms the current state-of-the-art pixel-based skin color detection method on the popular Compaq dataset (Jones and Rehg, 2002). A clustering technique based on color and spatial distance is used to extract regions, also known as superpixels, from the images. In the first step, our technique uses the state-of-the-art non-parametric pixel-based skin color classifier (Jones and Rehg, 2002), which we call the basic skin color classifier. The pixel-based skin color evidence is then aggregated to classify the superpixels. Finally, a Conditional Random Field (CRF) is applied to further improve the results. As the CRF operates over superpixels, the computational overhead is minimal. Our technique achieves a 91.17% true positive rate with a 13.12% false positive rate on the Compaq dataset, tested over approximately 14,000 web images.
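The aggregation step described in this abstract can be illustrated with a minimal sketch. It assumes that the pixel-level evidence is a per-pixel skin probability and that aggregation is a plain average over each superpixel; the abstract does not state the actual aggregation rule, and the function name and the `threshold` parameter are hypothetical.

```python
from collections import defaultdict


def classify_superpixels(pixel_probs, labels, threshold=0.5):
    """Average per-pixel skin probabilities inside each superpixel.

    pixel_probs: 2D list of floats in [0, 1] from a pixel-based classifier.
    labels: 2D list of ints with the same shape, giving each pixel's superpixel id.
    threshold: hypothetical decision cutoff (an assumption, not from the paper).
    Returns {superpixel_id: (mean_probability, is_skin)}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for prob_row, label_row in zip(pixel_probs, labels):
        for prob, sp_id in zip(prob_row, label_row):
            sums[sp_id] += prob
            counts[sp_id] += 1

    result = {}
    for sp_id in sums:
        mean_prob = sums[sp_id] / counts[sp_id]
        result[sp_id] = (mean_prob, mean_prob >= threshold)
    return result


if __name__ == "__main__":
    probs = [[0.9, 0.8, 0.1],
             [0.7, 0.2, 0.3]]
    superpixels = [[0, 0, 1],
                   [0, 1, 1]]
    for sp_id, (mean_prob, is_skin) in classify_superpixels(probs, superpixels).items():
        print(sp_id, round(mean_prob, 3), is_skin)
```

Working per region rather than per pixel means any later smoothing model, such as the CRF mentioned in the abstract, has far fewer nodes to handle, which matches the abstract's remark about minimal overhead.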

    Benchmark of machine learning methods for classification of a Sentinel-2 image

    Thanks mainly to ESA and USGS, a large bulk of free images of the Earth is readily available nowadays. One of the main goals of remote sensing is to label images according to a set of semantic categories, i.e. image classification. This is a very challenging issue since land cover of a specific class may present a large spatial and spectral variability and objects may appear at different scales and orientations. In this study, we report the results of benchmarking 9 machine learning algorithms tested for accuracy and speed in training and classification of land-cover classes in a Sentinel-2 dataset. The following machine learning methods (MLM) have been tested: linear discriminant analysis, k-nearest neighbour, random forests, support vector machines, multi layered perceptron, multi layered perceptron ensemble, ctree, boosting, logarithmic regression. The validation is carried out using a control dataset which consists of an independent classification in 11 land-cover classes of an area of about 60 km², obtained by manual visual interpretation of high resolution images (20 cm ground sampling distance) by experts. In this study five out of the eleven classes are used since the others have too few samples (pixels) for testing and validating subsets. The classes used are the following: (i) urban (ii) sowable areas (iii) water (iv) tree plantations (v) grasslands. Validation is carried out using three different approaches: (i) using pixels from the training dataset (train), (ii) using pixels from the training dataset and applying cross-validation with the k-fold method (kfold) and (iii) using all pixels from the control dataset. Five accuracy indices are calculated for the comparison between the values predicted with each model and control values over three sets of data: the training dataset (train), the whole control dataset (full) and with k-fold cross-validation (kfold) with ten folds.
Results from validation of predictions of the whole dataset (full) show the random forests method with the highest values, with a kappa index ranging from 0.55 to 0.42 for the largest and smallest number of training pixels, respectively. The two neural networks (multi layered perceptron and its ensemble) and the support vector machines (with the default radial basis function kernel) follow closely with comparable performanc
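For readers unfamiliar with the kappa index mentioned in this abstract, the sketch below computes it from a confusion matrix. It assumes the index is Cohen's kappa, the usual choice in land-cover accuracy assessment; the abstract does not spell out the formula, and the function name and example matrix are illustrative only.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i
    that the model assigned to class j.
    Undefined (division by zero) when chance agreement equals 1,
    i.e. when every sample falls in a single class.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement: product of row and column marginals per class.
    expected = 0.0
    for i in range(n_classes):
        row_total = sum(confusion[i])
        col_total = sum(row[i] for row in confusion)
        expected += row_total * col_total
    expected /= total ** 2

    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical two-class example, not data from the study.
    print(cohen_kappa([[20, 5], [10, 15]]))
```

Unlike overall accuracy, kappa discounts the agreement expected by chance, so values such as 0.42 to 0.55 should be read on a scale where 0 means chance-level agreement and 1 means perfect agreement.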

    Rotation-invariant features for multi-oriented text detection in natural images.

    Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human-computer interaction. However, most existing systems are designed to detect and recognize horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations from natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate the proposed method and compare it with the competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol, which is more suitable for benchmarking algorithms for detecting texts in varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on variant texts in complex natural scenes.

    Object Segmentation in Images using EEG Signals

    This paper explores the potential of brain-computer interfaces in segmenting objects from images. Our approach is centered around designing an effective method for displaying the image parts to the users such that they generate measurable brain reactions. When an image region, specifically a block of pixels, is displayed we estimate the probability of the block containing the object of interest using a score based on EEG activity. After several such blocks are displayed, the resulting probability map is binarized and combined with the GrabCut algorithm to segment the image into object and background regions. This study shows that BCI and simple EEG analysis are useful in locating object boundaries in images. Comment: This is a preprint version prior to submission for peer-review of the paper accepted to the 22nd ACM International Conference on Multimedia (November 3-7, 2014, Orlando, Florida, USA) for the High Risk High Reward session. 10 page
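The binarization step in this abstract can be sketched as follows. The sketch assumes the probability map is stored as one score per square block and that a fixed cutoff is used; the block layout, the `threshold` value, and the function name are hypothetical, and the EEG scoring itself is not modeled.

```python
def binarize_block_scores(block_scores, block_size, threshold=0.5):
    """Expand per-block object scores into a per-pixel binary mask.

    block_scores: 2D list of floats in [0, 1], one score per image block.
    block_size: side length in pixels of each square block.
    threshold: hypothetical cutoff (an assumption, not from the paper).
    Returns a 2D list of 0/1 values covering the whole image.
    """
    mask = []
    for score_row in block_scores:
        # Build one pixel row for this row of blocks.
        pixel_row = []
        for score in score_row:
            value = 1 if score >= threshold else 0
            pixel_row.extend([value] * block_size)
        # Repeat it for every pixel row the blocks span.
        for _ in range(block_size):
            mask.append(list(pixel_row))
    return mask


if __name__ == "__main__":
    scores = [[0.9, 0.1],
              [0.4, 0.6]]
    for row in binarize_block_scores(scores, block_size=2):
        print(row)
```

A mask like this could then seed a GrabCut implementation that accepts an initial foreground/background labeling, for example OpenCV's `cv2.grabCut` in mask-initialization mode; how the authors combine the two is not described in the abstract.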

    An iterative inference procedure applying conditional random fields for simultaneous classification of land cover and land use

    Land cover and land use exhibit strong contextual dependencies. We propose a novel approach for the simultaneous classification of land cover and land use, where semantic and spatial context is considered. The image sites for land cover and land use classification form a hierarchy consisting of two layers: a land cover layer and a land use layer. We apply Conditional Random Fields (CRF) at both layers. The layers differ with respect to the image entities corresponding to the nodes, the employed features and the classes to be distinguished. In the land cover layer, the nodes represent super-pixels; in the land use layer, the nodes correspond to objects from a geospatial database. Both CRFs model spatial dependencies between neighbouring image sites. The complex semantic relations between land cover and land use are integrated in the classification process by using contextual features. We propose a new iterative inference procedure for the simultaneous classification of land cover and land use, in which the two classification tasks mutually influence each other. This helps to improve the classification accuracy for certain classes. The main idea of this approach is that semantic context helps to refine the class predictions, which, in turn, leads to more expressive context information. Thus, potentially wrong decisions can be reversed at later stages. The approach is designed for input data based on aerial images. Experiments are carried out on a test site to evaluate the performance of the proposed method. We show the effectiveness of the iterative inference procedure and demonstrate that a smaller size of the super-pixels has a positive influence on the classification result

    Fuzzy spectral and spatial feature integration for classification of nonferrous materials in hyperspectral data

    Hyperspectral data allows the construction of more elaborate models to sample the properties of the nonferrous materials than the standard RGB color representation. In this paper, the nonferrous waste materials are studied as they cannot be sorted by classical procedures due to their color, weight and shape similarities. The experimental results presented in this paper reveal that factors such as the various levels of oxidization of the waste materials and the slight differences in their chemical composition preclude the use of the spectral features in a simplistic manner for robust material classification. To address these problems, the proposed FUSSER (fuzzy spectral and spatial classifier) algorithm detailed in this paper merges the spectral and spatial features to obtain a combined feature vector that is able to better sample the properties of the nonferrous materials than the single pixel spectral features when applied to the construction of multivariate Gaussian distributions. This approach allows the implementation of statistical region merging techniques in order to increase the performance of the classification process. To achieve an efficient implementation, the dimensionality of the hyperspectral data is reduced by constructing bio-inspired spectral fuzzy sets that minimize the amount of redundant information contained in adjacent hyperspectral bands. The experimental results indicate that the proposed algorithm increased the overall classification rate from 44% using RGB data up to 98% when the spectral-spatial features are used for nonferrous material classification

    Finding Temporally Consistent Occlusion Boundaries in Videos using Geometric Context

    We present an algorithm for finding temporally consistent occlusion boundaries in videos to support segmentation of dynamic scenes. We learn occlusion boundaries in a pairwise Markov random field (MRF) framework. We first estimate the probability of a spatio-temporal edge being an occlusion boundary by using appearance, flow, and geometric features. Next, we enforce occlusion boundary continuity in an MRF model by learning pairwise occlusion probabilities using a random forest. Then, we temporally smooth boundaries to remove temporal inconsistencies in occlusion boundary estimation. Our proposed framework provides an efficient approach for finding temporally consistent occlusion boundaries in video by utilizing causality, redundancy in videos, and semantic layout of the scene. We have developed a dataset with fully annotated ground-truth occlusion boundaries of over 30 videos (5000 frames). This dataset is used to evaluate temporal occlusion boundaries and provides a much needed baseline for future studies. We perform experiments to demonstrate the role of scene layout and temporal information for occlusion reasoning in dynamic scenes. Comment: Applications of Computer Vision (WACV), 2015 IEEE Winter Conference o

    Immunochromatographic diagnostic test analysis using Google Glass.

    We demonstrate a Google Glass-based rapid diagnostic test (RDT) reader platform capable of qualitative and quantitative measurements of various lateral flow immunochromatographic assays and similar biomedical diagnostics tests. Using a custom-written Glass application and without any external hardware attachments, one or more RDTs labeled with Quick Response (QR) code identifiers are simultaneously imaged using the built-in camera of the Google Glass that is based on a hands-free and voice-controlled interface and digitally transmitted to a server for digital processing. The acquired JPEG images are automatically processed to locate all the RDTs and, for each RDT, to produce a quantitative diagnostic result, which is returned to the Google Glass (i.e., the user) and also stored on a central server along with the RDT image, QR code, and other related information (e.g., demographic data). The same server also provides a dynamic spatiotemporal map and real-time statistics for uploaded RDT results accessible through Internet browsers. We tested this Google Glass-based diagnostic platform using qualitative (i.e., yes/no) human immunodeficiency virus (HIV) and quantitative prostate-specific antigen (PSA) tests. For the quantitative RDTs, we measured activated tests at various concentrations ranging from 0 to 200 ng/mL for free and total PSA. This wearable RDT reader platform running on Google Glass combines a hands-free sensing and image capture interface with powerful servers running our custom image processing codes, and it can be quite useful for real-time spatiotemporal tracking of various diseases and personal medical conditions, providing a valuable tool for epidemiology and mobile health