165,855 research outputs found

    Analyzing First-Person Stories Based on Socializing, Eating and Sedentary Patterns

    Full text link
    First-person stories can be analyzed by means of egocentric pictures acquired throughout the whole active day with wearable cameras. This manuscript presents an egocentric dataset with more than 45,000 pictures from four people in different environments such as working or studying. All the images were manually labeled to identify three patterns of interest regarding people's lifestyle: socializing, eating and sedentary. Additionally, two different approaches are proposed to classify egocentric images into one of the 12 target categories defined to characterize these three patterns. The approaches are based on machine learning and deep learning techniques, including traditional classifiers and state-of-art convolutional neural networks. The experimental results obtained when applying these methods to the egocentric dataset demonstrated their adequacy for the problem at hand.Comment: Accepted at First International Workshop on Social Signal Processing and Beyond, 19th International Conference on Image Analysis and Processing (ICIAP), September 201

    Analyzing First-Person Stories Based on Socializing, Eating and Sedentary Patterns

    Full text link
    First-person stories can be analyzed by means of egocentric pictures acquired throughout the whole active day with wearable cameras. This manuscript presents an egocentric dataset with more than 45,000 pictures from four people in different environments such as working or studying. All the images were manually labeled to identify three patterns of interest regarding people's lifestyle: socializing, eating and sedentary. Additionally, two different approaches are proposed to classify egocentric images into one of the 12 target categories defined to characterize these three patterns. The approaches are based on machine learning and deep learning techniques, including traditional classifiers and state-of-art convolutional neural networks. The experimental results obtained when applying these methods to the egocentric dataset demonstrated their adequacy for the problem at hand.Comment: Accepted at First International Workshop on Social Signal Processing and Beyond, 19th International Conference on Image Analysis and Processing (ICIAP), September 201

    Deep learning architectures for Computer Vision

    Get PDF
    Deep learning has become part of many state-of-the-art systems in multiple disciplines (specially in computer vision and speech processing). In this thesis Convolutional Neural Networks are used to solve the problem of recognizing people in images, both for verification and identification. Two different architectures, AlexNet and VGG19, both winners of the ILSVRC, have been fine-tuned and tested with four datasets: Labeled Faces in the Wild, FaceScrub, YouTubeFaces and Google UPC, a dataset generated at the UPC. Finally, with the features extracted from these fine-tuned networks, some verifications algorithms have been tested including Support Vector Machines, Joint Bayesian and Advanced Joint Bayesian formulation. The results of this work show that an Area Under the Receiver Operating Characteristic curve of 99.6% can be obtained, close to the state-of-the-art performance.El aprendizaje profundo se ha convertido en parte de muchos sistemas en el estado del arte de múltiples ámbitos (especialmente en visión por computador y procesamiento de voz). En esta tesis se utilizan las Redes Neuronales Convolucionales para resolver el problema de reconocer a personas en imágenes, tanto para verificación como para identificación. Dos arquitecturas diferentes, AlexNet y VGG19, ambas ganadores del ILSVRC, han sido afinadas y probadas con cuatro conjuntos de datos: Labeled Faces in the Wild, FaceScrub, YouTubeFaces y Google UPC, un conjunto generado en la UPC. Finalmente con las características extraídas de las redes afinadas, se han probado diferentes algoritmos de verificación, incluyendo Maquinas de Soporte Vectorial, Joint Bayesian y Advanced Joint Bayesian. Los resultados de este trabajo muestran que el Área Bajo la Curva de la Característica Operativa del Receptor puede llegar a ser del 99.6%, cercana al valor del estado del arte.L’aprenentatge profund s’ha convertit en una part importat de molts sistemes a l’estat de l’art de múltiples àmbits (especialment de la visió per computador i el processament de veu). A aquesta tesi s’utilitzen les Xarxes Neuronals Convolucionals per a resoldre el problema de reconèixer persones a imatges, tant per verificació com per identificatió. Dos arquitectures diferents, AlexNet i VGG19, les dues guanyadores del ILSVRC, han sigut afinades i provades amb quatre bases de dades: Labeled Faces in the Wild, FaceScrub, YouTubeFaces i Google UPC, un conjunt generat a la UPC. Finalment, amb les característiques extretes de les xarxes afinades, s’han provat diferents algoritmes de verificació, incloent Màquines de Suport Vectorial, Joint Bayesian i Advanced Joint Bayesian. Els resultats d’aquest treball mostres que un Àrea Baix la Curva de la Característica Operativa del Receptor por arribar a ser del 99.6%, propera al valor de l’estat de l’art

    Machine Analysis of Facial Expressions

    Get PDF
    No abstract

    Incremental Training of a Detector Using Online Sparse Eigen-decomposition

    Full text link
    The ability to efficiently and accurately detect objects plays a very crucial role for many computer vision tasks. Recently, offline object detectors have shown a tremendous success. However, one major drawback of offline techniques is that a complete set of training data has to be collected beforehand. In addition, once learned, an offline detector can not make use of newly arriving data. To alleviate these drawbacks, online learning has been adopted with the following objectives: (1) the technique should be computationally and storage efficient; (2) the updated classifier must maintain its high classification accuracy. In this paper, we propose an effective and efficient framework for learning an adaptive online greedy sparse linear discriminant analysis (GSLDA) model. Unlike many existing online boosting detectors, which usually apply exponential or logistic loss, our online algorithm makes use of LDA's learning criterion that not only aims to maximize the class-separation criterion but also incorporates the asymmetrical property of training data distributions. We provide a better alternative for online boosting algorithms in the context of training a visual object detector. We demonstrate the robustness and efficiency of our methods on handwriting digit and face data sets. Our results confirm that object detection tasks benefit significantly when trained in an online manner.Comment: 14 page
    corecore