6 research outputs found

    On Symbiosis of Attribute Prediction and Semantic Segmentation

    In this paper, we propose to employ semantic segmentation to improve person-related attribute prediction. The core idea is that the probability of an attribute appearing in an image is far from uniform across the spatial domain. We build our attribute prediction model jointly with a deep semantic segmentation network, harnessing the localization cues learned by semantic segmentation to guide the attention of the attribute predictor to the regions where different attributes naturally show up. As a result, in addition to prediction, we are able to localize the attributes despite having access only to image-level labels (weak supervision) during training. We first propose semantic segmentation-based pooling and gating, denoted SSP and SSG respectively. In the former, the estimated segmentation masks are used to pool the final activations of the attribute prediction network from multiple semantically homogeneous regions. In SSG, the same idea is applied to the intermediate layers of the network. SSP and SSG, while effective, incur heavy memory usage, since each channel of the activations is pooled/gated with all of the semantic segmentation masks. To circumvent this, we propose Symbiotic Augmentation (SA), in which we learn only one mask per activation channel. SA allows the model to either pick one semantic map or combine several (by weighted superposition) to generate the proper mask for each channel. SA simultaneously applies the same mechanism to the reverse problem, leveraging the output logits of attribute prediction to guide the semantic segmentation task. We evaluate our proposed methods on the CelebA and LFWA datasets for facial attributes, and on WIDER Attribute and Berkeley Attributes of People for whole-body attributes. Our proposed methods achieve superior results compared to previous work. Comment: Accepted for publication in PAMI. arXiv admin note: substantial text overlap with arXiv:1704.0874
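    As a rough illustration of the pooling and mask-mixing ideas described in the abstract, the following is a minimal NumPy sketch (hypothetical shapes and function names, not the authors' implementation): SSP average-pools each activation channel within every soft segmentation mask, and SA instead builds a single mask per channel as a softmax-weighted superposition of the semantic maps.

```python
import numpy as np

def ssp_pool(activations, masks):
    """Sketch of Semantic Segmentation-based Pooling (SSP):
    pool each activation channel within every semantic region,
    producing one pooled value per (region, channel) pair.

    activations: (C, H, W) final feature maps of the attribute network
    masks:       (K, H, W) soft segmentation masks, one per semantic region
    returns:     (K, C) region-wise average-pooled features
    """
    C = activations.shape[0]
    K = masks.shape[0]
    flat_act = activations.reshape(C, -1)               # (C, H*W)
    flat_msk = masks.reshape(K, -1)                     # (K, H*W)
    # normalize each mask so pooling is a weighted average over its region
    weights = flat_msk / (flat_msk.sum(axis=1, keepdims=True) + 1e-8)
    return weights @ flat_act.T                         # (K, C)

def sa_mask(masks, channel_logits):
    """Sketch of the Symbiotic Augmentation (SA) mask construction:
    one mask per channel, formed as a weighted superposition of the
    K semantic maps (softmax over learned mixing logits).

    masks:          (K, H, W) semantic maps
    channel_logits: (C, K) learned mixing logits, one row per channel
    returns:        (C, H, W) one combined mask per activation channel
    """
    w = np.exp(channel_logits - channel_logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                   # softmax over the K maps
    return np.einsum('ck,khw->chw', w, masks)           # mix maps per channel
```

    The memory saving of SA is visible in the shapes: SSP keeps K pooled copies of every channel, whereas SA reduces each channel to exactly one mask regardless of K.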

    Describing Images by Semantic Modeling using Attributes and Tags

    This dissertation addresses the problem of describing images using visual attributes and textual tags, a fundamental task that narrows the semantic gap between the visual reasoning of humans and machines. Automatic image annotation assigns relevant textual tags to images. In this dissertation, we propose a query-specific formulation based on Weighted Multi-view Non-negative Matrix Factorization to perform automatic image annotation. Our proposed technique seamlessly adapts to changes in the training data, naturally solves the problem of feature fusion, and handles the challenge of rare tags. Unlike tags, attributes are category-agnostic, hence their combinations model an exponential number of semantic labels. Motivated by the fact that most attributes describe local properties, we propose exploiting localization cues, through semantic parsing of the human face and body, to improve person-related attribute prediction. We also demonstrate that image-level attribute labels can be effectively used as weak supervision for the task of semantic segmentation. Next, we analyze selfie images by utilizing tags and attributes. We collect the first large-scale selfie dataset and annotate it with attributes covering characteristics such as gender, age, race, facial gestures, and hairstyle. We then study the popularity and sentiments of the selfies given the estimated appearance of various semantic concepts. In brief, we automatically infer what makes a good selfie. Despite its extensive usage, the deep learning literature falls short in understanding the characteristics and behavior of Batch Normalization. We conclude this dissertation by providing a fresh view, in light of information geometry and Fisher kernels, on why batch normalization works. We propose Mixture Normalization, which disentangles the modes of variation in the underlying distribution of the layer outputs, and confirm that it effectively accelerates training of different batch-normalized architectures, including Inception-V3, Densely Connected Networks, and Deep Convolutional Generative Adversarial Networks, while achieving better generalization error.
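    To make the Mixture Normalization idea concrete, here is a minimal NumPy sketch (hypothetical names and a 1-D batch for brevity, not the dissertation's implementation), assuming the mixture parameters have already been estimated, e.g. by EM. Where batch normalization standardizes every activation with a single mean and variance, this sketch normalizes each sample against every mixture component and blends the results by posterior responsibility, so each mode of variation is whitened by its own statistics.

```python
import numpy as np

def mixture_normalize(x, means, variances, priors, eps=1e-5):
    """Sketch of mixture-based normalization for a 1-D batch of activations.

    x:         (N,) activations
    means:     (K,) component means
    variances: (K,) component variances
    priors:    (K,) component mixing weights
    returns:   (N,) normalized activations
    """
    # responsibilities r[n, k] ∝ prior_k * N(x_n | mu_k, var_k),
    # computed in log space for numerical stability
    diff = x[:, None] - means[None, :]                       # (N, K)
    log_p = (-0.5 * diff**2 / (variances + eps)
             - 0.5 * np.log(2 * np.pi * (variances + eps))
             + np.log(priors + eps))
    log_p -= log_p.max(axis=1, keepdims=True)
    r = np.exp(log_p)
    r /= r.sum(axis=1, keepdims=True)                        # rows sum to 1
    # normalize against each component, then mix by responsibility
    x_hat = diff / np.sqrt(variances + eps)                  # (N, K)
    return (r * x_hat).sum(axis=1)
```

    With a single component (K = 1) the responsibilities are all one and this reduces exactly to standard batch-norm-style standardization; the benefit appears when the activation distribution is multi-modal.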

    A Facial Feature Extraction Method for Person Authentication

    "Within computer vision there are different research areas, such as identification, recognition, and authentication. Person authentication has become a way of maintaining security in many settings. There are various ways to authenticate people, such as by their voice, fingerprint, facial features, or iris. In recent years, countless security algorithms have been developed that can authenticate people through their facial features, but these algorithms and security systems perform well only under certain conditions, such as the person facing the surveillance camera directly, good lighting, or a good camera. Taking into account the specifications of existing systems and their small inefficiencies, this work presents the design of a facial feature extraction method which, owing to its characteristics and precision, serves the person authentication process. The method was designed to address the characteristics of existing algorithms and, as a contribution, to allow a person to be authenticated not only when facing the camera, but also when seen in profile."

    Landmark Free Face Attribute Prediction

    No full text