Accelerating object extraction and detection using a hierarchical approach with shape descriptors

Author: Arshad Bassam Syed
Publication venue: ScholarWorks @ UTRGV
Publication date: 01/08/2016

Automatic object recognition is a fundamental problem in computer vision and machine learning that has received considerable research attention in recent years. The miniaturization and affordability of both high-resolution digital cameras and advanced computing hardware have further broadened the scope and applications of object recognition methods. While various methods build object models from different low-level features, this work explores and implements the use of closed contours as powerful object features. A hierarchical technique is employed to extract the contours, exploiting the inherent spatial relationships between the parent and child contours of an object, which are later described as part of the query feature vector. Fourier descriptors are used to describe the extracted contours effectively and invariantly. A diverse database of shapes is created and later used to train standard classification algorithms for shape labeling. A simple hierarchical method that matches shape labels and spatial descriptors is implemented to find the nearest object model in a collection of stored templates. A multi-threaded architecture and GPU-efficient image-processing functions are adopted wherever possible, reducing the running time of the proposed technique and making it efficient for use in real-world applications. The technique is successfully tested on common traffic signs in real-world images, with good overall performance and robustness.
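The abstract above does not spell out how its Fourier descriptors are computed. The following is a minimal sketch of one common construction, not the author's implementation: the function name `fourier_descriptors`, the parameter `n_coeffs`, and the normalisation by the dominant first harmonic are all assumptions made for illustration. The contour is treated as a sequence of complex numbers, the zero-frequency term is dropped (translation invariance), only magnitudes are kept (rotation and start-point invariance), and the result is divided by the first-harmonic magnitude (scale invariance).

```python
import cmath
import math


def fourier_descriptors(contour, n_coeffs=8):
    """Return invariant Fourier descriptors of a closed contour (illustrative sketch).

    contour:  list of (x, y) points sampled in order along the closed boundary.
    n_coeffs: number of positive/negative frequency pairs to keep (assumed value).
    """
    z = [complex(x, y) for x, y in contour]
    n = len(z)

    def dft(k):
        # Direct O(n) evaluation of one DFT coefficient; fine for short contours.
        return sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))

    # Skip k = 0 (the centroid term) so the result ignores translation.
    # Keeping only magnitudes discards the phase introduced by rotation
    # and by the choice of starting point.
    mags = []
    for k in range(1, n_coeffs + 1):
        mags.append(abs(dft(k)))
        mags.append(abs(dft(-k)))

    # Divide by the larger first harmonic so the result ignores scale and
    # works for both clockwise and counter-clockwise traversal.
    scale = max(mags[0], mags[1])
    return [m / scale for m in mags]
```

Two contours of the same shape that differ only in position, size, orientation, or starting point should then yield nearly identical descriptor vectors, which is what makes such vectors usable as classifier input.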

Scholarworks@UTRGV Univ. of Texas RioGrande Valley

An Accelerated Hierarchical Approach for Object Shape Extraction and Recognition

Author: Arshad Bassam
Khan Fitratullah
Lei Hansheng
Quweider Mahmoud K.
Zhang Liyu
Publication venue: ScholarWorks @ UTRGV
Publication date: 01/10/2019

We present a novel automatic supervised object recognition algorithm based on scale- and rotation-invariant Fourier descriptors. The algorithm is hierarchical in nature so as to capture the inherent intra-contour spatial relationships between the parent and child contours of an object. A set of distance metrics is introduced to accompany the hierarchical model. To test the algorithm, a diverse database of shapes is created and used to train standard classification algorithms for shape labeling. The implemented algorithm takes advantage of the multi-threaded architecture and GPU-efficient image-processing functions present in OpenCV wherever possible, reducing the running time and making it efficient for use in real-time applications. The technique is successfully tested on common traffic and road signs in real-world images, with excellent overall performance that is robust to moderate noise levels.

Crossref

Scholarworks@UTRGV Univ. of Texas RioGrande Valley

Image analysis using visual saliency with applications in hazmat sign detection and recognition

Author: Zhao Bin
Publication venue: Purdue University (bepress)
Publication date: 01/01/2014

Visual saliency is the perceptual process that makes attractive objects stand out from their surroundings in the low-level human visual system. Visual saliency has been modeled as a preprocessing step of the human visual system for selecting the important visual information from a scene. We investigate bottom-up visual saliency using spectral analysis approaches. We present separate and composite model families that generalize existing frequency domain visual saliency models. We propose several frequency domain visual saliency models to generate saliency maps using new spectrum processing methods and an entropy-based saliency map selection approach. A group of saliency map candidates is obtained by inverse transform, and the final saliency map is the candidate with the minimum entropy. The proposed models based on the separate and composite model families are also extended to various color spaces. We develop an evaluation tool for benchmarking visual saliency models. Experimental results show that the proposed models are more accurate and efficient than most state-of-the-art visual saliency models in predicting eye fixation.
We use the above visual saliency models to detect the location of hazardous material (hazmat) signs in complex scenes. We develop a hazmat sign location detection and content recognition system using visual saliency. Saliency maps are employed to extract salient regions that are likely to contain hazmat sign candidates, and a Fourier descriptor based contour matching method is then used to locate the border of hazmat signs in these regions. This visual saliency based approach increases the accuracy of sign location detection, reduces the number of false positive objects, and speeds up the overall image analysis process. We also propose a color recognition method to interpret the color inside the detected hazmat sign. Experimental results show that our proposed hazmat sign location detection method is capable of detecting and recognizing projectively distorted, blurred, and shaded hazmat signs at various distances.
In other work we investigate error concealment for scalable video coding (SVC). When video compressed with SVC is transmitted over loss-prone networks, the decompressed video can suffer severe visual degradation across multiple frames. To enhance the visual quality, we propose an inter-layer error concealment method using motion vector averaging and slice interleaving to deal with burst packet losses and error propagation. Experimental results show that the proposed error concealment methods outperform two existing methods.
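The entropy-based selection step mentioned in the saliency part of this abstract can be sketched as follows. The abstract does not say how the entropy of a candidate map is measured, so this sketch assumes the Shannon entropy of a fixed-bin histogram of map values in [0, 1]. The names `map_entropy` and `select_min_entropy` and the bin count are hypothetical.

```python
import math


def map_entropy(saliency_map, bins=16):
    """Shannon entropy (in bits) of the value histogram of a 2-D map.

    saliency_map: list of rows with values assumed to lie in [0, 1].
    bins:         number of histogram bins (an illustrative choice).
    """
    counts = [0] * bins
    total = 0
    for row in saliency_map:
        for v in row:
            # Clamp v == 1.0 into the last bin.
            idx = min(int(v * bins), bins - 1)
            counts[idx] += 1
            total += 1
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def select_min_entropy(candidates):
    """Return the index of the candidate map with the lowest histogram entropy."""
    entropies = [map_entropy(m) for m in candidates]
    return min(range(len(candidates)), key=entropies.__getitem__)
```

Under this assumption, a map with a few strong peaks on a flat background has low entropy and is preferred over a map whose values are spread evenly across the range.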

Purdue E-Pubs

Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields

Author: Jansson Ylva
Lindeberg Tony
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatio-temporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to the state of the art. In particular, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives, either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition, and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives.
Comment: 29 pages, 16 figures

arXiv.org e-Print Archive

Publikationer från KTH

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Hand gesture recognition for human computer interaction: a comparative study of different image features

Author: Reis L. P.
Ribeiro A. Fernando
Trigueiros Paulo
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Hand gesture recognition, being a natural means of human-computer interaction, is an area of active research in computer vision and machine learning. It has many possible applications, giving users a simpler and more natural way to communicate with robots and system interfaces without the need for extra devices. The primary goal of gesture recognition research is therefore to create systems that can identify specific human gestures and use them to convey information or to control devices. To that end, vision-based hand gesture interfaces require fast and extremely robust hand detection and gesture recognition in real time. In this study we try to identify hand features that, in isolation, respond better in various human-computer interaction situations. The extracted features are used to train a set of classifiers with the help of RapidMiner in order to find the best learner. A dataset with our own gesture vocabulary, consisting of 10 gestures recorded from 20 users, was created for later processing. Experimental results show that the radial signature and the centroid distance are the features that obtain the best results when used separately, with accuracies of 91% and 90.1% respectively, obtained with a neural network classifier. These two methods also have the advantage of being simple in terms of computational complexity, which makes them good candidates for real-time hand gesture recognition.
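To illustrate why the centroid distance feature named above is computationally cheap, here is a minimal sketch under stated assumptions; it is not the authors' implementation. The function name `centroid_distance_signature`, the sample count, the use of the mean of the boundary points as the centroid, and the normalisation by the maximum distance are all illustrative choices.

```python
import math


def centroid_distance_signature(contour, n_samples=16):
    """Distances from the centroid to evenly spaced boundary points.

    contour:   list of (x, y) points in order along the hand outline.
    n_samples: length of the resulting feature vector (assumed value).
    The centroid is approximated by the mean of the boundary points, and
    distances are divided by their maximum so the signature ignores scale.
    """
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n

    # Pick n_samples boundary points at equal index spacing.
    picked = [contour[(i * n) // n_samples] for i in range(n_samples)]
    dists = [math.hypot(x - cx, y - cy) for x, y in picked]

    peak = max(dists)
    return [d / peak for d in dists]
```

The whole computation is a single pass over the contour plus a handful of distance evaluations, which is consistent with the abstract's remark about low computational complexity.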

Universidade do Minho: RepositoriUM

VSSA-NET: Vertical Spatial Sequence Attention Network for Traffic Sign Detection

Author: Wang Qi
Xiong Zhitong
Yuan Yuan
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 04/05/2019

Although traffic sign detection has been studied for years and great progress has been made with the rise of deep learning techniques, many problems remain to be addressed. For complicated real-world traffic scenes, there are two main challenges. First, traffic signs are usually small objects, which makes them more difficult to detect than large ones. Second, without context information it is hard to distinguish false targets that resemble real traffic signs in complex street scenes. To handle these problems, we propose a novel end-to-end deep learning method for traffic sign detection in complex environments. Our contributions are as follows: 1) We propose a multi-resolution feature fusion network architecture which exploits densely connected deconvolution layers with skip connections and can learn more effective features for small objects; 2) We frame traffic sign detection as a spatial sequence classification and regression task, and propose a vertical spatial sequence attention (VSSA) module to gain more context information for better detection performance. To comprehensively evaluate the proposed method, we conduct experiments on several traffic sign datasets as well as a general object detection dataset, and the results show the effectiveness of the proposed method.

arXiv.org e-Print Archive

A comparative study of different image features for hand gesture machine learning

Author: Reis L. P.
Ribeiro A. Fernando
Trigueiros Paulo
Publication venue: Scitepress
Publication date: 18/10/2013

Vision-based hand gesture interfaces require fast and extremely robust hand detection and gesture recognition. Hand gesture recognition for human-computer interaction is an area of active research in computer vision and machine learning. The primary goal of gesture recognition research is to create a system that can identify specific human gestures and use them to convey information or to control devices. In this paper we present a comparative study of seven different algorithms for hand feature extraction for static hand gesture classification, analysed with RapidMiner in order to find the best learner. We defined our own gesture vocabulary of 10 gestures and recorded videos of 20 persons performing the gestures for later processing. Our goal in the present study is to identify features that, in isolation, respond better in various human-computer interaction situations. Results show that the radial signature and the centroid distance are the features that obtain the best results when used separately, while also being simple in terms of computational complexity.

Universidade do Minho: RepositoriUM