32 research outputs found

    Good Practice in CNN Feature Transfer

    The objective of this paper is the effective transfer of Convolutional Neural Network (CNN) features in image search and classification. We systematically study three aspects of CNN transfer. 1) We demonstrate the advantage of feeding the CNN suitably large input images instead of conventionally resized ones. 2) We benchmark the performance of different CNN layers when average/max pooling is applied to their feature maps. Our observations suggest that the Conv5 feature yields very competitive accuracy under such a pooling step. 3) We find that a simple combination of pooled features extracted from various CNN layers is effective in collecting evidence from both low-level and high-level descriptors. Following these good practices, we improve the state of the art on a number of benchmarks by a large margin.
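    The pooling and combination steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy activations, the `conv4` layer name, and the L2 normalization applied before concatenation are all assumptions made for illustration. The abstract itself only states that pooled features from several layers are combined.

```python
import math

def global_pool(feature_map, mode="avg"):
    """Pool a (channels, height, width) nested list to one value per channel."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(max(values) if mode == "max" else sum(values) / len(values))
    return pooled

def l2_normalize(vector):
    """Scale a vector to unit length (assumed step, not stated in the abstract)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

def combine_layers(feature_maps, mode="avg"):
    """Concatenate the normalized pooled descriptors of several layers."""
    descriptor = []
    for fmap in feature_maps:
        descriptor.extend(l2_normalize(global_pool(fmap, mode)))
    return descriptor

# Toy stand-ins for layer activations (hypothetical values and layer names).
conv4 = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [0.0, 0.0]]]  # 2 channels, 2x2
conv5 = [[[2.0, 4.0], [6.0, 8.0]]]                            # 1 channel, 2x2

descriptor = combine_layers([conv4, conv5], mode="avg")
```

    The resulting descriptor has one entry per channel across all layers, so its length is independent of the input image size, which is what allows larger inputs to be used without changing the descriptor dimensionality.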

    A deep representation for depth images from synthetic data

    Convolutional Neural Networks (CNNs) trained on large-scale RGB databases have become the secret sauce in the majority of recent approaches for object categorization from RGB-D data. Thanks to colorization techniques, these methods exploit the filters learned from 2D images to extract meaningful representations in 2.5D. Still, the perceptual signatures of these two kinds of images are very different, with the first usually strongly characterized by textures, and the second mostly by silhouettes of objects. Ideally, one would like to have two CNNs, one for RGB and one for depth, each trained on a suitable data collection and able to capture the perceptual properties of its channel for the task at hand. This has not been possible so far, due to the lack of a suitable depth database. This paper addresses the issue by proposing to use synthetically generated images rather than collecting a large-scale 2.5D database by hand. While clearly a proxy for real data, synthetic images allow one to trade quality for quantity, making it possible to generate a virtually infinite amount of data. We show that training on such a data collection, using the very same architecture typically used on visual data, learns very different filters, resulting in depth features that (a) better characterize the different facets of depth images, and (b) are complementary to those derived from CNNs pre-trained on 2D datasets. Experiments on two publicly available databases show the power of our approach.

    Component-based Attention for Large-scale Trademark Retrieval

    The demand for large-scale trademark retrieval (TR) systems has significantly increased to combat the rise in international trademark infringement. Unfortunately, the ranking accuracy of current approaches using either hand-crafted or pre-trained deep convolutional neural network (DCNN) features is inadequate for large-scale deployments. We show in this paper that the ranking accuracy of TR systems can be significantly improved by incorporating hard and soft attention mechanisms, which direct attention to critical information such as figurative elements and reduce attention given to distracting and uninformative elements such as text and background. Our proposed approach achieves state-of-the-art results on a challenging large-scale trademark dataset.

    Evolving Convolutional Neural Networks for Glaucoma Diagnosis / Redes neurais convolucionais em evolução para diagnóstico de glaucoma

    Glaucoma is an eye disease that damages the optic nerve and progressively narrows the visual field of affected patients, which can lead to blindness at an advanced stage. This work presents a study on the use of Convolutional Neural Networks (CNNs) for automatic diagnosis from fundus images. However, building a CNN capable of achieving satisfactory results for glaucoma diagnosis involves considerable effort, and in many situations that effort does not yield such results. The objective of this work is to use a genetic algorithm (GA) to optimize CNN architectures through evolutionary search, in order to improve glaucoma diagnosis on fundus images from the RIM-ONE-r2 dataset. Our paper demonstrates satisfactory results after training the best individual selected by the GA, achieving an accuracy of 91%.