43,550 research outputs found

    Ear detection with convolutional neural networks

    Get PDF
    Object detection is still considered a difficult task in the field of computer vision. Specifically, earlobe detection has become a popular application as the interest in human identification using earlobe biometry has increased. So far earlobe detection problem has been solved using a combination of skin detection, edge detection, segmentation by fusion of histogram-based k-means, and template matching algorithms. In this work we present a method of earlobe detection without template matching by using a convolutional neural network, performing image segmentation. With this method, which is invariant to angle at which the photo was taken, earlobe shape, skin color, illumination, occlusions, and earlobe accessories, we were able to accurately detect the area of the image, where an earlobe is present. Moreover, detection time was significantly improved when compared to other methods for solving the same task. We expect our method to be used in Annotated Web Ears Toolbox

    Ear detection with convolutional neural networks

    Get PDF
    Object detection is still considered a difficult task in the field of computer vision. Specifically, earlobe detection has become a popular application as the interest in human identification using earlobe biometry has increased. So far earlobe detection problem has been solved using a combination of skin detection, edge detection, segmentation by fusion of histogram-based k-means, and template matching algorithms. In this work we present a method of earlobe detection without template matching by using a convolutional neural network, performing image segmentation. With this method, which is invariant to angle at which the photo was taken, earlobe shape, skin color, illumination, occlusions, and earlobe accessories, we were able to accurately detect the area of the image, where an earlobe is present. Moreover, detection time was significantly improved when compared to other methods for solving the same task. We expect our method to be used in Annotated Web Ears Toolbox

    STV-based Video Feature Processing for Action Recognition

    Get PDF
    In comparison to still image-based processes, video features can provide rich and intuitive information about dynamic events occurred over a period of time, such as human actions, crowd behaviours, and other subject pattern changes. Although substantial progresses have been made in the last decade on image processing and seen its successful applications in face matching and object recognition, video-based event detection still remains one of the most difficult challenges in computer vision research due to its complex continuous or discrete input signals, arbitrary dynamic feature definitions, and the often ambiguous analytical methods. In this paper, a Spatio-Temporal Volume (STV) and region intersection (RI) based 3D shape-matching method has been proposed to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and the performance gain of the devised approach stemmed from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. This paper also reported the investigation into techniques for efficient STV data filtering to reduce the amount of voxels (volumetric-pixels) that need to be processed in each operational cycle in the implemented system. The encouraging features and improvements on the operational performance registered in the experiments have been discussed at the end

    Visual Object Recognition and Tracking of Tools

    Get PDF
    A method has been created to automatically build an algorithm off-line, using computer-aided design (CAD) models, and to apply this at runtime. The object type is discriminated, and the position and orientation are identified. This system can work with a single image and can provide improved performance using multiple images provided from videos. The spatial processing unit uses three stages: (1) segmentation; (2) initial type, pose, and geometry (ITPG) estimation; and (3) refined type, pose, and geometry (RTPG) calculation. The image segmentation module files all the tools in an image and isolates them from the background. For this, the system uses edge-detection and thresholding to find the pixels that are part of a tool. After the pixels are identified, nearby pixels are grouped into blobs. These blobs represent the potential tools in the image and are the product of the segmentation algorithm. The second module uses matched filtering (or template matching). This approach is used for condensing synthetic images using an image subspace that captures key information. Three degrees of orientation, three degrees of position, and any number of degrees of freedom in geometry change are included. To do this, a template-matching framework is applied. This framework uses an off-line system for calculating template images, measurement images, and the measurements of the template images. These results are used online to match segmented tools against the templates. The final module is the RTPG processor. Its role is to find the exact states of the tools given initial conditions provided by the ITPG module. The requirement that the initial conditions exist allows this module to make use of a local search (whereas the ITPG module had global scope). To perform the local search, 3D model matching is used, where a synthetic image of the object is created and compared to the sensed data. The availability of low-cost PC graphics hardware allows rapid creation of synthetic images. In this approach, a function of orientation, distance, and articulation is defined as a metric on the difference between the captured image and a synthetic image with an object in the given orientation, distance, and articulation. The synthetic image is created using a model that is looked up in an object-model database. A composable software architecture is used for implementation. Video is first preprocessed to remove sensor anomalies (like dead pixels), and then is processed sequentially by a prioritized list of tracker-identifiers

    Planogram Compliance Checking Based on Detection of Recurring Patterns

    Get PDF
    In this paper, a novel method for automatic planogram compliance checking in retail chains is proposed without requiring product template images for training. Product layout is extracted from an input image by means of unsupervised recurring pattern detection and matched via graph matching with the expected product layout specified by a planogram to measure the level of compliance. A divide and conquer strategy is employed to improve the speed. Specifically, the input image is divided into several regions based on the planogram. Recurring patterns are detected in each region respectively and then merged together to estimate the product layout. Experimental results on real data have verified the efficacy of the proposed method. Compared with a template-based method, higher accuracies are achieved by the proposed method over a wide range of products.Comment: Accepted by MM (IEEE Multimedia Magazine) 201

    Real-Time RGB-D based Template Matching Pedestrian Detection

    Full text link
    Pedestrian detection is one of the most popular topics in computer vision and robotics. Considering challenging issues in multiple pedestrian detection, we present a real-time depth-based template matching people detector. In this paper, we propose different approaches for training the depth-based template. We train multiple templates for handling issues due to various upper-body orientations of the pedestrians and different levels of detail in depth-map of the pedestrians with various distances from the camera. And, we take into account the degree of reliability for different regions of sliding window by proposing the weighted template approach. Furthermore, we combine the depth-detector with an appearance based detector as a verifier to take advantage of the appearance cues for dealing with the limitations of depth data. We evaluate our method on the challenging ETH dataset sequence. We show that our method outperforms the state-of-the-art approaches.Comment: published in ICRA 201
    • …
    corecore