41,602 research outputs found

    Deep learning for 3D ear detection: A complete pipeline from data generation to segmentation

    Get PDF
    The human ear has distinguishing features that can be used for identification. Automated ear detection from 3D profile face images plays a vital role in ear-based human recognition. This work proposes a complete pipeline including synthetic data generation and ground-truth data labeling for ear detection in 3D point clouds. The ear detection problem is formulated as a semantic part segmentation problem that detects the ear directly in 3D point clouds of profile face data. We introduce EarNet, a modified version of the PointNet++ architecture, and apply rotation augmentation to handle different pose variations in the real data. We demonstrate that PointNet and PointNet++ cannot manage the rotation of a given object without such augmentation. The synthetic 3D profile face data is generated using statistical shape models. In addition, an automatic tool has been developed and is made publicly available to create ground-truth labels of any 3D public data set that includes co-registered 2D images. The experimental results on the real data demonstrate higher localization as compared to existing state-of-the-art approaches

    Active Object Localization in Visual Situations

    Get PDF
    We describe a method for performing active localization of objects in instances of visual situations. A visual situation is an abstract concept---e.g., "a boxing match", "a birthday party", "walking the dog", "waiting for a bus"---whose image instantiations are linked more by their common spatial and semantic structure than by low-level visual similarity. Our system combines given and learned knowledge of the structure of a particular situation, and adapts that knowledge to a new situation instance as it actively searches for objects. More specifically, the system learns a set of probability distributions describing spatial and other relationships among relevant objects. The system uses those distributions to iteratively sample object proposals on a test image, but also continually uses information from those object proposals to adaptively modify the distributions based on what the system has detected. We test our approach's ability to efficiently localize objects, using a situation-specific image dataset created by our group. We compare the results with several baselines and variations on our method, and demonstrate the strong benefit of using situation knowledge and active context-driven localization. Finally, we contrast our method with several other approaches that use context as well as active search for object localization in images.Comment: 14 page

    Deep Learning for Semantic Part Segmentation with High-Level Guidance

    Full text link
    In this work we address the task of segmenting an object into its parts, or semantic part segmentation. We start by adapting a state-of-the-art semantic segmentation system to this task, and show that a combination of a fully-convolutional Deep CNN system coupled with Dense CRF labelling provides excellent results for a broad range of object categories. Still, this approach remains agnostic to high-level constraints between object parts. We introduce such prior information by means of the Restricted Boltzmann Machine, adapted to our task and train our model in an discriminative fashion, as a hidden CRF, demonstrating that prior information can yield additional improvements. We also investigate the performance of our approach ``in the wild'', without information concerning the objects' bounding boxes, using an object detector to guide a multi-scale segmentation scheme. We evaluate the performance of our approach on the Penn-Fudan and LFW datasets for the tasks of pedestrian parsing and face labelling respectively. We show superior performance with respect to competitive methods that have been extensively engineered on these benchmarks, as well as realistic qualitative results on part segmentation, even for occluded or deformable objects. We also provide quantitative and extensive qualitative results on three classes from the PASCAL Parts dataset. Finally, we show that our multi-scale segmentation scheme can boost accuracy, recovering segmentations for finer parts.Comment: 11 pages (including references), 3 figures, 2 table

    Dense 3D Face Correspondence

    Full text link
    We present an algorithm that automatically establishes dense correspondences between a large number of 3D faces. Starting from automatically detected sparse correspondences on the outer boundary of 3D faces, the algorithm triangulates existing correspondences and expands them iteratively by matching points of distinctive surface curvature along the triangle edges. After exhausting keypoint matches, further correspondences are established by generating evenly distributed points within triangles by evolving level set geodesic curves from the centroids of large triangles. A deformable model (K3DM) is constructed from the dense corresponded faces and an algorithm is proposed for morphing the K3DM to fit unseen faces. This algorithm iterates between rigid alignment of an unseen face followed by regularized morphing of the deformable model. We have extensively evaluated the proposed algorithms on synthetic data and real 3D faces from the FRGCv2, Bosphorus, BU3DFE and UND Ear databases using quantitative and qualitative benchmarks. Our algorithm achieved dense correspondences with a mean localisation error of 1.28mm on synthetic faces and detected 1414 anthropometric landmarks on unseen real faces from the FRGCv2 database with 3mm precision. Furthermore, our deformable model fitting algorithm achieved 98.5% face recognition accuracy on the FRGCv2 and 98.6% on Bosphorus database. Our dense model is also able to generalize to unseen datasets.Comment: 24 Pages, 12 Figures, 6 Tables and 3 Algorithm

    Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition

    Full text link
    A key challenge in fine-grained recognition is how to find and represent discriminative local regions. Recent attention models are capable of learning discriminative region localizers only from category labels with reinforcement learning. However, not utilizing any explicit part information, they are not able to accurately find multiple distinctive regions. In this work, we introduce an attribute-guided attention localization scheme where the local region localizers are learned under the guidance of part attribute descriptions. By designing a novel reward strategy, we are able to learn to locate regions that are spatially and semantically distinctive with reinforcement learning algorithm. The attribute labeling requirement of the scheme is more amenable than the accurate part location annotation required by traditional part-based fine-grained recognition methods. Experimental results on the CUB-200-2011 dataset demonstrate the superiority of the proposed scheme on both fine-grained recognition and attribute recognition

    3D Object Class Detection in the Wild

    Full text link
    Object class detection has been a synonym for 2D bounding box localization for the longest time, fueled by the success of powerful statistical learning techniques, combined with robust image representations. Only recently, there has been a growing interest in revisiting the promise of computer vision from the early days: to precisely delineate the contents of a visual scene, object by object, in 3D. In this paper, we draw from recent advances in object detection and 2D-3D object lifting in order to design an object class detector that is particularly tailored towards 3D object class detection. Our 3D object class detection method consists of several stages gradually enriching the object detection output with object viewpoint, keypoints and 3D shape estimates. Following careful design, in each stage it constantly improves the performance and achieves state-ofthe-art performance in simultaneous 2D bounding box and viewpoint estimation on the challenging Pascal3D+ dataset
    • …
    corecore