3 research outputs found

    Embedding based on function approximation for large scale image search

    The objective of this paper is to design an embedding method that maps local features describing an image (e.g. SIFT) to a higher dimensional representation useful for the image retrieval problem. First, motivated by the relationship between the linear approximation of a nonlinear function in high dimensional space and the state-of-the-art feature representation used in image retrieval, i.e., VLAD, we propose a new approach for the approximation. The embedded vectors resulting from the function approximation process are then aggregated to form a single representation for image retrieval. Second, in order to make the proposed embedding method applicable to large-scale problems, we further derive a fast version in which the embedded vectors can be computed efficiently, i.e., in closed form. We compare the proposed embedding methods with the state of the art in the context of image search under various settings: when the images are represented by medium length vectors, short vectors, or binary vectors. The experimental results show that the proposed embedding methods outperform the existing state of the art on the standard public image retrieval benchmarks. Comment: Accepted to TPAMI 2017. The implementation and precomputed features of the proposed F-FAemb are released at the following link: http://tinyurl.com/F-FAem
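    For orientation, the sketch below illustrates standard VLAD aggregation, the baseline representation the abstract refers to. It is not the proposed FAemb or F-FAemb method. The function name `vlad`, the toy centroids, and the toy descriptors are assumptions made for illustration; a real pipeline would learn the centroids with k-means on SIFT descriptors.

```python
import math

def vlad(descriptors, centroids):
    """Aggregate local descriptors into a single L2-normalised VLAD vector.

    Each descriptor is assigned to its nearest centroid, and the residuals
    (descriptor minus centroid) are summed per centroid, then concatenated.
    """
    k = len(centroids)
    d = len(centroids[0])
    residual_sums = [[0.0] * d for _ in range(k)]

    for x in descriptors:
        # Nearest centroid by squared Euclidean distance.
        nearest = min(
            range(k),
            key=lambda j: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[j])),
        )
        for i in range(d):
            residual_sums[nearest][i] += x[i] - centroids[nearest][i]

    # Concatenate the per-centroid residual sums and L2-normalise.
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy data (hypothetical): two centroids and three 2-D "descriptors".
centroids = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
image_vector = vlad(descriptors, centroids)
```

    The output has length k × d (here 4), so every image maps to a vector of the same size regardless of how many local descriptors it contains.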

    Active Object Classification from 3D Range Data with Mobile Robots

    This thesis addresses the problem of how to improve the acquisition of 3D range data with a mobile robot for the task of object classification. Establishing the identities of objects in unknown environments is fundamental for robotic systems and helps enable many abilities such as grasping, manipulation, or semantic mapping. Objects are recognised from data obtained through sensor observations; however, these data are highly dependent on viewpoint, since variation in the position and orientation of the sensor relative to an object can result in large variation in perception quality. Additionally, cluttered environments present a further challenge because key data may be missing. These issues are not always solved by traditional passive systems, where data are collected from a fixed navigation process and then fed into a perception pipeline. This thesis considers an active approach to data collection by deciding where it is most appropriate to make observations for the perception task. The core contributions of this thesis are a non-myopic planning strategy to collect data efficiently under resource constraints, and supporting viewpoint prediction and evaluation methods for object classification. Our approach to planning uses Monte Carlo methods coupled with a classifier based on non-parametric Bayesian regression. We present a novel anytime and non-myopic planning algorithm, Monte Carlo active perception, that extends Monte Carlo tree search to partially observable environments and the active perception problem. This is combined with a particle-based estimation process and a learned observation likelihood model that uses Gaussian process regression. To support planning, we present 3D point cloud prediction algorithms and utility functions that measure the quality of viewpoints by their discriminatory ability and effectiveness under occlusion.
    The utility of viewpoints is quantified by information-theoretic metrics, such as mutual information, and an alternative utility function that exploits learned data is developed for special cases. The algorithms in this thesis are demonstrated in a variety of scenarios. We extensively test our online planning and classification methods in simulation as well as with indoor and outdoor datasets. Furthermore, we perform hardware experiments with different mobile platforms equipped with different types of sensors. Most significantly, our hardware experiments with an outdoor robot are, to our knowledge, the first demonstrations of online active perception in a real outdoor environment. Active perception has broad significance in many applications. This thesis emphasises the advantages of an active approach to object classification and presents its integration with a wide range of robotic systems, sensors, and perception algorithms. By demonstrating performance enhancements and diversity, our hope is that the concept of considering perception and planning in an integrated manner will be of benefit in improving current systems that rely on passive data collection.
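    As a rough illustration of scoring viewpoints by mutual information, the sketch below computes I(C; Z) between an object class C and a discrete observation Z for each candidate viewpoint and picks the highest-scoring one. Everything here is an assumption made for illustration: the discrete observation model, the viewpoint names, and the likelihood tables are hypothetical, and the selection shown is greedy (one step ahead), whereas the thesis describes a non-myopic planner with a likelihood model learned by Gaussian process regression.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def mutual_information(prior, likelihood):
    """I(C; Z) = H(C) - sum_z p(z) H(C | Z = z).

    prior[c] is p(c); likelihood[c][z] is p(z | c) at one viewpoint.
    """
    n_classes = len(prior)
    n_obs = len(likelihood[0])
    info = entropy(prior)
    for z in range(n_obs):
        p_z = sum(prior[c] * likelihood[c][z] for c in range(n_classes))
        if p_z == 0:
            continue  # this observation never occurs; it contributes nothing
        posterior = [prior[c] * likelihood[c][z] / p_z for c in range(n_classes)]
        info -= p_z * entropy(posterior)
    return info

# Hypothetical two-class problem with two candidate viewpoints.
prior = [0.5, 0.5]
viewpoints = {
    "front": [[0.5, 0.5], [0.5, 0.5]],  # both classes look the same from here
    "side": [[0.9, 0.1], [0.1, 0.9]],   # classes are easy to tell apart
}
scores = {name: mutual_information(prior, lik) for name, lik in viewpoints.items()}
best = max(scores, key=scores.get)
```

    A viewpoint from which all classes produce the same observations scores zero, so the greedy choice favours the viewpoint whose observations best separate the classes.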

    Identificación de la fuente de adquisición de ficheros multimedia de dispositivos móviles mediante Deep Learning (Identifying the acquisition source of multimedia files from mobile devices using Deep Learning)

    Today, society lives surrounded by multimedia content such as images and videos. Electronic devices capable of taking photographs or recording videos are part of everyday life, and their number keeps growing. The vast majority of people carry a mobile phone in their pocket and use it to take photos or videos. Alongside this, techniques for forging and manipulating multimedia content have appeared over the years, making it difficult to know whether a piece of content is authentic and where it comes from, which makes forensic analysis techniques a present-day necessity. This work proposes a convolutional neural network capable of identifying the acquisition source of videos recorded with a mobile device. The results of the experiments carried out in this work demonstrate the efficiency of the proposed methods. To evaluate the proposed methods, experiments were carried out with a public dataset widely used in the literature and a generated dataset.