196,829 research outputs found

    Fair comparison of skin detection approaches on publicly available datasets

    Skin detection is the process of discriminating skin and non-skin regions in a digital image, and it is widely used in applications ranging from hand gesture analysis to body-part tracking and face detection. Skin detection is a challenging problem that has drawn extensive attention from the research community; nevertheless, a fair comparison among approaches is very difficult due to the lack of a common benchmark and a unified testing protocol. In this work, we investigate the most recent research in this field and propose a fair comparison among approaches using several different datasets. The major contributions of this work are an exhaustive literature review of skin color detection approaches; a framework to evaluate and combine different skin detector approaches, whose source code is made freely available for future research; and an extensive experimental comparison among several recent methods, which have also been used to define an ensemble that works well in many different problems. Experiments are carried out on 10 different datasets including more than 10000 labelled images: experimental results confirm that the best method proposed here performs very well with respect to other stand-alone approaches, without requiring ad hoc parameter tuning. A MATLAB version of the framework for testing and of the methods proposed in this paper will be freely available from https://github.com/LorisNann
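
    To make the idea of evaluating and combining skin detectors concrete, here is a minimal Python sketch. It is not the MATLAB framework mentioned in the abstract: the function names `fuse_scores` and `mask_scores`, the plain averaging rule, and the 0.5 threshold are all illustrative assumptions. It averages per-pixel skin scores from several detectors and then scores the resulting mask against a ground-truth mask with pixel-level precision, recall and F1.

```python
def fuse_scores(score_maps, threshold=0.5):
    """Average per-pixel skin scores from several detectors, then threshold.

    score_maps: list of 2D lists with values in [0, 1], all the same shape.
    The averaging rule and the threshold are assumptions for illustration.
    """
    n = len(score_maps)
    rows, cols = len(score_maps[0]), len(score_maps[0][0])
    return [[sum(m[r][c] for m in score_maps) / n >= threshold
             for c in range(cols)]
            for r in range(rows)]


def mask_scores(predicted, truth):
    """Pixel-level precision, recall and F1 of a predicted skin mask."""
    tp = fp = fn = 0
    for p_row, t_row in zip(predicted, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1          # skin pixel correctly detected
            elif p and not t:
                fp += 1          # non-skin pixel flagged as skin
            elif t and not p:
                fn += 1          # skin pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

    Reporting the same pixel-level scores for every detector on every dataset is one simple way to keep a comparison consistent; the protocol used in the paper itself may differ.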

    Detection of Mines in Acoustic Images using Higher Order Spectral Features

    A new pattern-recognition algorithm detects approximately 90% of the mines hidden in the Coastal Systems Station Sonar0, 1, and 3 databases of cluttered acoustic images, with about 10% false alarms. Similar to other approaches, the algorithm presented here includes processing the images with an adaptive Wiener filter (the degree of smoothing depends on the signal strength in a local neighborhood) to remove noise without destroying the structural information in the mine shapes, followed by a two-dimensional FIR filter designed to suppress noise and clutter, while enhancing the target signature. A double peak pattern is produced as the FIR filter passes over mine highlight and shadow regions. Although the location, size, and orientation of this pattern within a region of the image can vary, features derived from higher order spectra (HOS) are invariant to translation, rotation, and scaling, while capturing the spatial correlations of mine-like objects. Classification accuracy is improved by combining features based on geometrical properties of the filter output with features based on HOS. The highest accuracy is obtained by fusing classification based on bispectral features with classification based on trispectral features
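
    The adaptive Wiener step described above, where the degree of smoothing depends on the local signal strength, can be sketched in a few lines of Python. This is the textbook local-statistics form of the filter, not necessarily the exact variant used in the paper; the window radius and the fallback noise estimate (the average local variance) are assumptions.

```python
def local_stats(img, r, c, radius):
    """Mean and variance of the window centred at (r, c), clipped at borders."""
    rows, cols = len(img), len(img[0])
    vals = [img[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, var


def adaptive_wiener(img, radius=1, noise_var=None):
    """Smooth strongly where local variance is low, weakly where it is high."""
    rows, cols = len(img), len(img[0])
    stats = [[local_stats(img, r, c, radius) for c in range(cols)]
             for r in range(rows)]
    if noise_var is None:
        # Assumption: estimate noise power as the average local variance.
        noise_var = sum(v for row in stats for _, v in row) / (rows * cols)
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            mean, var = stats[r][c]
            if var <= noise_var:
                # Flat region: the pixel is treated as noise around the mean.
                out_row.append(mean)
            else:
                # Structured region: keep most of the deviation from the mean.
                gain = (var - noise_var) / var
                out_row.append(mean + gain * (img[r][c] - mean))
        out.append(out_row)
    return out
```

    In flat background the output collapses to the local mean, while around strong highlight and shadow edges the gain approaches 1 and the structure is preserved.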

    Development of retinal blood vessel segmentation methodology using wavelet transforms for assessment of diabetic retinopathy

    Automated image processing has the potential to assist in the early detection of diabetes, by detecting changes in blood vessel diameter and patterns in the retina. This paper describes the development of segmentation methodology in the processing of retinal blood vessel images obtained using non-mydriatic colour photography. The methods used include wavelet analysis, supervised classifier probabilities and adaptive threshold procedures, as well as morphology-based techniques. We show highly accurate identification of blood vessels for the purpose of studying changes in the vessel network that can be utilized for detecting blood vessel diameter changes associated with the pathophysiology of diabetes. In conjunction with suitable feature extraction and automated classification methods, our segmentation method could form the basis of a quick and accurate test for diabetic retinopathy, which would have huge benefits in terms of improved access to screening people for risk or presence of diabetes

    Robust Adaptive Median Binary Pattern for noisy texture classification and retrieval

    Texture is an important cue for different computer vision tasks and applications. Local Binary Pattern (LBP) is considered one of the best yet most efficient texture descriptors. However, LBP has some notable limitations, chiefly its sensitivity to noise. In this paper, we address these limitations by introducing a novel texture descriptor, Robust Adaptive Median Binary Pattern (RAMBP). RAMBP is based on a classification process for noisy pixels, an adaptive analysis window, scale analysis and comparison of image region medians. The proposed method handles highly noisy texture images and increases the discriminative properties by capturing microstructure and macrostructure texture information. The proposed method has been evaluated on popular texture datasets for classification and retrieval tasks, and under different high-noise conditions. Without any training or prior knowledge of the noise type, RAMBP achieved the best classification compared to state-of-the-art techniques. It scored more than 90% under 50% impulse noise densities, more than 95% under Gaussian noised textures with standard deviation σ = 5, and more than 99% under Gaussian blurred textures with standard deviation σ = 1.25. The proposed method yielded competitive results and high performance as one of the best descriptors in noise-free texture classification. Furthermore, RAMBP also showed high performance on the problem of noisy texture retrieval, providing high recall and precision scores for textures with high levels of noise
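
    To illustrate why thresholding against a median can help with impulse noise, the sketch below computes the classic 8-neighbour LBP code for one pixel and a simplified median-thresholded variant. This is not RAMBP itself, which adds noisy-pixel classification, adaptive windows and scale analysis; the function names and the 3x3 window are assumptions for illustration.

```python
from statistics import median

# Clockwise 8-neighbour offsets, starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def binary_pattern(img, r, c, use_median=False):
    """8-bit code for the interior pixel (r, c).

    use_median=False: classic LBP, neighbours are compared to the centre.
    use_median=True:  neighbours are compared to the median of the 3x3 patch,
                      so a corrupted centre pixel no longer sets the threshold.
    """
    neighbours = [img[r + dr][c + dc] for dr, dc in OFFSETS]
    if use_median:
        threshold = median(neighbours + [img[r][c]])
    else:
        threshold = img[r][c]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= threshold:
            code |= 1 << bit
    return code


def pattern_histogram(img, use_median=False):
    """256-bin histogram of codes over all interior pixels (the descriptor)."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[binary_pattern(img, r, c, use_median)] += 1
    return hist
```

    For the patch `[[1, 2, 3], [8, 5, 4], [7, 6, 5]]` both variants give the code 240. If an impulse replaces the centre value 5 with 255, classic LBP drops to 0 while the median-thresholded code stays at 240.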

    An agent-driven semantical identifier using radial basis neural networks and reinforcement learning

    Due to the huge availability of documents in digital form, and the increased potential for deception inherent in digital documents and the way they are spread, the authorship attribution problem has steadily grown in relevance. Nowadays, authorship attribution, for both information retrieval and analysis, has gained great importance in the context of security, trust and copyright preservation. This work proposes an innovative multi-agent driven machine learning technique that has been developed for authorship attribution. By means of preprocessing for word grouping and time-period related analysis of the common lexicon, we determine a bias reference level for the recurrence frequency of the words within the analysed texts, and then train a Radial Basis Neural Networks (RBPNN)-based classifier to identify the correct author. The main advantage of the proposed approach lies in the generality of the semantic analysis, which can be applied to different contexts and lexical domains without requiring any modification. Moreover, the proposed system is able to incorporate an external input, meant to tune the classifier, and then self-adjust by means of continuous reinforcement learning. Comment: Published in: Proceedings of the XV Workshop "Dagli Oggetti agli Agenti" (WOA 2014), Catania, Italy, September 25-26, 201

    Multi-View Region Adaptive Multi-temporal DMM and RGB Action Recognition

    Human action recognition remains an important yet challenging task. This work proposes a novel action recognition system. It uses a novel Multiple View Region Adaptive Multi-resolution in time Depth Motion Map (MV-RAMDMM) formulation combined with appearance information. Multiple stream 3D Convolutional Neural Networks (CNNs) are trained on the different views and time resolutions of the region adaptive Depth Motion Maps. Multiple views are synthesised to enhance the view invariance. The region adaptive weights, based on localised motion, accentuate and differentiate parts of actions possessing faster motion. Dedicated 3D CNN streams for multi-time resolution appearance information (RGB) are also included. These help to identify and differentiate between small object interactions. A pre-trained 3D CNN is used here with fine-tuning for each stream, along with multi-class Support Vector Machines (SVMs). Average score fusion is used on the output. The developed approach is capable of recognising both human action and human-object interaction. Three public domain datasets, MSR 3D Action, Northwestern UCLA multi-view actions and MSR 3D daily activity, are used to evaluate the proposed solution. The experimental results demonstrate the robustness of this approach compared with state-of-the-art algorithms. Comment: 14 pages, 6 figures, 13 tables. Submitte
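
    The average score fusion step mentioned above can be sketched as follows. This is a generic illustration, not the authors' code: the function name `average_score_fusion` and the assumption that every stream outputs one comparable score per class are hypothetical.

```python
def average_score_fusion(stream_scores):
    """Fuse per-class scores from several streams by simple averaging.

    stream_scores: list of per-stream score lists, one score per class.
    Returns (index of the winning class, list of fused scores).
    Assumes the scores of all streams are on a comparable scale.
    """
    n_streams = len(stream_scores)
    n_classes = len(stream_scores[0])
    if any(len(s) != n_classes for s in stream_scores):
        raise ValueError("every stream must score the same set of classes")
    fused = [sum(s[k] for s in stream_scores) / n_streams
             for k in range(n_classes)]
    best = max(range(n_classes), key=fused.__getitem__)
    return best, fused


# Example: three streams (e.g. two depth views and one RGB stream), three classes.
label, fused = average_score_fusion([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
])
```

    Here the first stream alone would pick class 0, but the fused scores favour class 1, which shows how averaging lets the other streams outvote a single mistaken stream.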