199 research outputs found

    Vision-Based 2D and 3D Human Activity Recognition

    Less is More: Micro-expression Recognition from Video using Apex Frame

    Despite recent interest and advances in facial micro-expression research, there is still plenty of room for improvement in micro-expression recognition. Conventional feature extraction approaches for micro-expression video use either the whole video sequence or a part of it for representation. However, with the high-speed video capture of micro-expressions (100-200 fps), are all frames necessary to provide a sufficiently meaningful representation? Is the luxury of data a bane to accurate recognition? A novel proposition is presented in this paper, whereby we utilize only two images per video: the apex frame and the onset frame. The apex frame of a video contains the highest intensity of expression changes among all frames, while the onset frame is the natural choice of reference frame because it shows a neutral expression. A new feature extractor, Bi-Weighted Oriented Optical Flow (Bi-WOOF), is proposed to encode the essential expressiveness of the apex frame. We evaluated the proposed method on five micro-expression databases: CAS(ME)^2, CASME II, SMIC-HS, SMIC-NIR and SMIC-VIS. Our experiments lend credence to our hypothesis, with our proposed technique achieving state-of-the-art F1-score recognition performance of 61% and 62% on the high frame rate CASME II and SMIC-HS databases respectively.
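
    To make the idea of an oriented optical-flow feature concrete, here is a minimal sketch of a magnitude-weighted orientation histogram. It is not the paper's Bi-WOOF descriptor: it applies a single weighting (flow magnitude), assumes the onset-to-apex flow field has already been computed by some optical-flow method, and the function name, bin count and toy data are all hypothetical.

```python
import numpy as np

def weighted_orientation_histogram(flow, n_bins=8):
    """Histogram of flow orientations; each pixel votes with its flow magnitude.

    flow: array of shape (H, W, 2) holding horizontal (u) and vertical (v)
    displacement between a reference frame and a target frame.
    """
    u = flow[..., 0].ravel()
    v = flow[..., 1].ravel()
    magnitude = np.hypot(u, v)
    angle = np.arctan2(v, u)  # orientation in (-pi, pi]
    # Map each angle to one of n_bins equal-width orientation bins.
    bins = np.floor((angle + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins, weights=magnitude, minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

# Toy (hypothetical) flow field: every pixel moves one unit to the right.
toy_flow = np.zeros((4, 4, 2))
toy_flow[..., 0] = 1.0
toy_hist = weighted_orientation_histogram(toy_flow)
```

    With only two frames per video, a descriptor of this kind is computed once per clip rather than once per frame pair, which is where the "less is more" saving would come from.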

    Biometric Systems

    Because of the accelerating progress in biometrics research and the latest nation-state threats to security, this book's publication is not only timely but also much needed. This volume contains seventeen peer-reviewed chapters reporting the state of the art in biometrics research: security issues, signature verification, fingerprint identification, wrist vascular biometrics, ear detection, face detection and identification (including a new survey of face recognition), person re-identification, electrocardiogram (ECG) recognition, and several multi-modal systems. This book will be a valuable resource for graduate students, engineers, and researchers interested in understanding and investigating this important field of study.

    Irish Machine Vision and Image Processing Conference Proceedings 2017

    Machine learning approaches to video activity recognition: from computer vision to signal processing

    244 pp. The research presented focuses on classification techniques for two different but related tasks, such that the second can be considered part of the first: human action recognition in videos and sign language recognition. In the first part, the starting hypothesis is that transforming the signals of a video with the Common Spatial Patterns (CSP) algorithm, commonly used in electroencephalography systems, can yield new features that are useful for the subsequent classification of the videos by supervised classifiers. Different experiments were carried out on several databases, including one created during this research from the point of view of a humanoid robot, with the aim of deploying the developed recognition system to improve human-robot interaction. In the second part, the techniques developed earlier were applied to sign language recognition; in addition, a method based on decomposing signs is proposed for recognising them, which adds the possibility of better explainability. The ultimate goal is to develop a sign language tutor capable of guiding users through the learning process, making them aware of the errors they make and the reasons for those errors.

    Robust density modelling using the student's t-distribution for human action recognition

    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model, since it is highly sensitive to outliers. The Gaussian distribution is also often used as the base component of graphical models for recognising human actions in videos (hidden Markov models and others), and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show that experiments on two well-known datasets (Weizmann, MuHAVi) yield a remarkable improvement in classification accuracy. © 2011 IEEE
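
    The robustness claim can be illustrated with the two univariate log-densities. The sketch below is not the paper's HMM or mixture model; it only compares how strongly each density penalises a far-out observation, and the degrees of freedom (nu = 3) and the test point (x = 8) are arbitrary choices for illustration.

```python
import math

def gaussian_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a univariate Gaussian."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)

def student_t_logpdf(x, nu=3.0, mu=0.0, sigma=1.0):
    """Log-density of a location-scale Student's t with nu degrees of freedom."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1) / 2 * math.log1p(z * z / nu))

# An observation eight scale units from the centre (illustrative outlier).
outlier = 8.0
gauss_penalty = gaussian_logpdf(outlier)
t_penalty = student_t_logpdf(outlier)
```

    The Gaussian log-density falls quadratically in the distance from the mean, whereas the t log-density falls only logarithmically, so a single outlier pulls a maximum-likelihood fit far less under the t model.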

    Sistema biométrico para detección y reconocimiento de orejas basado en algoritmos de procesamiento de imágenes y redes neuronales profundas

    The ear is an emerging biometric feature that has held the attention of the scientific community for more than a decade. Its unique structure has long stood out among forensic scientists, and it has been used to identify suspects in many cases. The logical step towards a broader application of ear biometrics is to create a recognition system. To carry out this process, this work focuses on the use of data from images (2D). The present study covers techniques such as the Hausdorff distance, which adds robustness and increases performance by filtering the subjects to use in the testing process. It also includes the image ray transform (IRT) in the detection step. The ear is a fickle biometric feature when working with photographic images under varying conditions. This is largely due to the camera's focus, the irregular shapes of the captures, the lighting conditions and the ever-changing shape of the projection when it is photographed. Therefore, to identify the presence and location of an ear in an image, we propose an ear detection system with multiple convolutional neural networks (CNN) and an algorithm that clusters the detections. The proposed method matches the performance of other techniques on clean photographs, that is to say, captures in ideal conditions (purposeshot), reaching an accuracy of more than 98 %. When the system is subjected to natural images in real-world conditions, where the subject appears in a multitude of orientations and photographic conditions in an uncontrolled environment, our system maintains the same precision, clearly exceeding the average result (83 %) obtained in previous research. Finally, the algorithms used to complete the recognition steps are presented, using convolutional structures, feature extraction techniques and geometric approximations in order to increase the accuracy of the process.
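
    For readers unfamiliar with the Hausdorff distance mentioned above, a minimal sketch of the standard symmetric definition on two point sets follows. How the thesis applies it to filter subjects is not described in the abstract, so the contour data and function names here are purely hypothetical.

```python
import numpy as np

def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest neighbour in b."""
    pairwise = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return pairwise.min(axis=1).max()

def hausdorff(a, b):
    """Symmetric Hausdorff distance between point sets a and b."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Two toy (hypothetical) 2D point sets, e.g. sampled contour points.
contour_a = np.array([[0.0, 0.0], [1.0, 0.0]])
contour_b = np.array([[0.0, 0.0], [4.0, 0.0]])
distance = hausdorff(contour_a, contour_b)
```

    Because the measure is driven by the single worst-matched point, a small value means every point of one shape lies close to the other shape, which is what makes it usable as a screening criterion.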

    Sparse Shape Modelling for 3D Face Analysis

    This thesis describes a new method for localising anthropometric landmark points on 3D face scans. The points are localised by fitting a sparse shape model to a set of candidate landmarks. The candidates are found using a feature detector that is designed using a data-driven methodology; this approach also informs the choice of landmarks for the shape model. The fitting procedure is developed to be robust to missing landmark data and spurious candidates. The feature detector and landmark choice are determined by the performance of different local surface descriptions on the face. A number of criteria are defined for a good landmark point and a good feature detector. These inform a framework for measuring the performance of various surface descriptions and the choice of parameter values in the surface description generation. Two types of surface description are tested: curvature and spin images. Between them, these descriptions represent many aspects of the two most common approaches to local surface description. Using the data-driven design process for surface description and landmark choice, a feature detector is developed using spin images. As spin images are a rich surface description, we are able to perform detection and candidate landmark labelling in a single step. A feature detector is developed based on linear discriminant analysis (LDA). This is compared to a simpler detector used in the landmark and surface description selection process. A sparse shape model is constructed using ground truth landmark data. This sparse shape model contains only the landmark point locations and relative positional variation. To localise landmarks, this model is fitted to the candidate landmarks using a RANSAC-style algorithm and a novel model fitting algorithm. The results of landmark localisation show that the shape model approach is beneficial over template alignment approaches. Even with heavily contaminated candidate data, we are able to achieve good localisation for most landmarks.
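
    The RANSAC-style fitting step can be sketched as follows, under strong simplifying assumptions that are not taken from the thesis: the model is a fixed set of labelled 3D points, the only unknown is a translation, and each candidate already carries a landmark label. The thesis model also encodes positional variation and uses its own fitting algorithm; every name, parameter and data value below is hypothetical.

```python
import numpy as np

def ransac_translation(model, candidates, n_iters=100, tol=0.5, seed=0):
    """Fit a translation of labelled model points to noisy labelled candidates.

    model: (K, 3) array of landmark positions in the model frame.
    candidates: list of (label, point) pairs; some may be spurious.
    Returns the best translation found and its inlier count.
    """
    rng = np.random.default_rng(seed)
    labels = np.array([label for label, _ in candidates])
    points = np.array([point for _, point in candidates], dtype=float)
    best_t, best_inliers = None, -1
    for _ in range(n_iters):
        # Hypothesise a translation from one randomly chosen candidate.
        i = rng.integers(len(points))
        t = points[i] - model[labels[i]]
        # Count candidates that agree with the hypothesis within tol.
        residual = np.linalg.norm(points - (model[labels] + t), axis=1)
        n_inliers = int((residual < tol).sum())
        if n_inliers > best_inliers:
            best_t, best_inliers = t, n_inliers
    return best_t, best_inliers

# Toy data: four model landmarks shifted by (2, 3, 4), plus two spurious candidates.
toy_model = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
true_shift = np.array([2.0, 3.0, 4.0])
toy_candidates = [(k, toy_model[k] + true_shift) for k in range(4)]
toy_candidates += [(0, np.array([10.0, 10.0, 10.0])), (2, np.array([-5.0, 0.0, 1.0]))]
best_t, best_inliers = ransac_translation(toy_model, toy_candidates)
```

    Hypotheses generated from a spurious candidate gather almost no support, so contaminated candidates are rejected by consensus rather than by a per-point threshold on detector score.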

    Framework for proximal personified interfaces
