137 research outputs found

    Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval

    Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise of three closely linked problems, i.e., image tag assignment, refinement, and tag-based image retrieval, is presented. While existing works vary in terms of their targeted tasks and methodology, they rely on the key functionality of tag relevance, i.e., estimating the relevance of a specific tag with respect to the visual content of a given image and its social context. By analyzing what information a specific method exploits to construct its tag relevance function and how such information is exploited, this paper introduces a taxonomy to structure the growing literature, understand the ingredients of the main works, clarify their connections and differences, and recognize their merits and limitations. For a head-to-head comparison of the state of the art, a new experimental protocol is presented, with training sets containing 10k, 100k and 1m images and an evaluation on three test sets, contributed by various research groups. Eleven representative works are implemented and evaluated. Putting all this together, the survey aims to provide an overview of the past and foster progress for the near future. Comment: to appear in ACM Computing Surveys

    Automatic Landmarking for Non-cooperative 3D Face Recognition

    This thesis describes a new framework for 3D surface landmarking and evaluates its performance for feature localisation on human faces. The framework has two main parts that can be designed and optimised independently. The first is a keypoint detection system that returns positions of interest for a given mesh surface by using a learnt dictionary of local shapes. The second is a labelling system that uses model fitting approaches to establish a one-to-one correspondence between the set of unlabelled input points and a learnt representation of the class of object to detect. Our keypoint detection system returns local maxima over score maps that are generated from an arbitrarily large set of local shape descriptors. The distributions of these descriptors (scalars or histograms) are learnt for known landmark positions on a training dataset in order to generate a model. The similarity between the input descriptor value for a given vertex and a model shape is used as a descriptor-related score. Our labelling system can make use of both hypergraph matching techniques and rigid registration techniques to reduce the ambiguity attached to unlabelled input keypoints for which a list of model landmark candidates has been seeded. The soft matching techniques use multi-attributed hyperedges to reduce ambiguity, while the registration techniques use a scale-adapted rigid transformation computed from 3 or more points in order to obtain one-to-one correspondences. Our final system achieves results that are better than or comparable to the state of the art (depending on the metric) while being more generic. It does not require pre-processing such as cropping, spike removal and hole filling, and is more robust to occlusion of salient local regions, such as those near the nose tip and inner eye corners. It is also fully pose invariant and can be used with kinds of objects other than faces, provided that labelled training data is available.
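
The keypoint step described above (local maxima over a score map defined on mesh vertices) can be illustrated with a minimal sketch. It assumes a single, already-computed score per vertex and a plain adjacency dictionary; the function name `local_maxima` and both data structures are hypothetical and are not taken from the thesis, which combines many descriptor-related scores.

```python
def local_maxima(scores, neighbours):
    """Return vertices whose score strictly exceeds that of every neighbour.

    scores:     dict mapping vertex id -> score (hypothetical score map)
    neighbours: dict mapping vertex id -> iterable of adjacent vertex ids
    """
    keypoints = []
    for vertex, score in scores.items():
        adjacent = neighbours.get(vertex, ())
        # A vertex with no neighbours is trivially kept by all().
        if all(score > scores[u] for u in adjacent):
            keypoints.append(vertex)
    return keypoints


# Toy "mesh": four vertices connected in a chain 0-1-2-3.
scores = {0: 0.1, 1: 0.9, 2: 0.3, 3: 0.8}
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(local_maxima(scores, neighbours))  # [1, 3]
```

The strict comparison means that plateaus (equal neighbouring scores) yield no keypoint; a real system would likely need a tie-breaking rule and a larger neighbourhood radius.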

    2014 Summer Research Symposium Abstract Book

    2014 Summer volume of abstracts for science research projects conducted by students at Trinity College

    Deep Learning for Medical Image Segmentation using Prior Knowledge and Topology

    Image segmentation refers to the division of a digital image into distinct segments or groups of pixels/voxels. However, most existing deep learning approaches do not exploit prior knowledge, such as shape information, which could improve segmentation accuracy. In addition, conventional image segmentation frequently falls short in preserving intricate spatial details, motivating strategies for multi-scale feature integration. Furthermore, traditional image segmentation methods primarily concentrate on pixel-level or region-level analysis. However, given the inherent morphological similarities among various image objects, topology information is more significant than pixel-level data in medical image semantic segmentation, so incorporating topology information into image segmentation is important. The first aim of this dissertation is to incorporate shape priors into medical image segmentation. A shape-prior-V-Net (SP-V-Net) is proposed, which contains a shape transformation module to refine the segmentation results according to the shape prior. SP-V-Net has been applied to lung segmentation and proximal femur segmentation. The second aim is to improve image segmentation by leveraging hierarchical features. Two approaches are proposed: the feature pyramid U-Net++ (FP-U-Net++), which dynamically aggregates the feature pyramid in the decoder of U-Net++, and the multi-input multi-scale U-Net (MIMS U-Net), which integrates the features in the encoder of the U-Net. The third aim explores topology-based image semantic segmentation using graph neural networks. Three graph-matching networks have been developed: association graph-based, edge attention graph matching, and hyper-association graph matching networks. The proposed graph-matching networks convert the graph-matching problem into a vertex classification problem using an association graph, where a positive vertex indicates that the corresponding nodes from the two individual graphs are matched. These models were applied to coronary artery semantic labeling on invasive coronary angiograms. Moreover, this study presents a pioneering approach for topology-based image semantic labeling using graph matching. The successful completion of these aims contributes technically accurate and clinically applicable algorithms and techniques for medical image segmentation. The outcomes of this dissertation provide valuable tools for the medical imaging and computer vision communities, advancing the field and improving patient care through accurate and efficient medical image segmentation.
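
The reduction from graph matching to vertex classification mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the edge rule used here (two candidate matches are joined when both underlying node pairs are edges in their own graphs) is the common textbook construction of an association graph, not necessarily the one used in the dissertation, and the names `build_association_graph` and `vertex_labels` are hypothetical.

```python
from itertools import product


def build_association_graph(edges1, n1, edges2, n2):
    """Build an association graph from two undirected graphs.

    Each vertex is a candidate match (i, a) between node i of graph 1 and
    node a of graph 2. Vertices (i, a) and (j, b) are joined when (i, j) is
    an edge of graph 1 and (a, b) is an edge of graph 2 (assumed rule).
    """
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    vertices = list(product(range(n1), range(n2)))
    edges = set()
    for (i, a), (j, b) in product(vertices, repeat=2):
        if i != j and a != b and frozenset((i, j)) in e1 and frozenset((a, b)) in e2:
            edges.add(frozenset(((i, a), (j, b))))
    return vertices, edges


def vertex_labels(vertices, matching):
    """Label vertex (i, a) as 1 if node i is matched to node a, else 0."""
    return [1 if matching.get(i) == a else 0 for (i, a) in vertices]


# Two identical 3-node path graphs 0-1-2, with the identity as ground truth.
path = [(0, 1), (1, 2)]
vertices, edges = build_association_graph(path, 3, path, 3)
labels = vertex_labels(vertices, {0: 0, 1: 1, 2: 2})
print(len(vertices), len(edges), labels)
```

A classifier trained on such labels predicts, for every candidate pair, whether it belongs to the matching; the positive vertices then give the node correspondence between the two graphs.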