Search CORE


Kernelized Multiview Projection for Robust Action Recognition

Author: A Gilbert
AF Bobick
C Schuldt
C Xu
D Tao
DJ Berndt
GH Hardy
L Liu
L Shao
Li Liu
Ling Shao
M Gönen
M Vrigkas
Mengyang Yu
O Kihl
R Bhatia
S Ji
T Xia
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Conventional action recognition algorithms adopt a single type of feature or a simple concatenation of multiple features. In this paper, we propose to better fuse and embed different feature representations for action recognition using a novel spectral coding algorithm called Kernelized Multiview Projection (KMP). Computing the kernel matrices from different features/views via time-sequential distance learning, KMP can encode different features with different weights to achieve a low-dimensional and semantically meaningful subspace where the distribution of each view is sufficiently smooth and discriminative. More crucially, KMP is linear for the reproducing kernel Hilbert space, which allows it to be competent for various practical applications. We demonstrate KMP’s performance for action recognition on five popular action datasets and the results are consistently superior to state-of-the-art techniques.
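As a rough illustration of the "different features with different weights" idea, the sketch below builds one kernel (Gram) matrix per view and combines them as a convex sum. It is not the KMP algorithm: the RBF kernel, the hand-fixed weights, the toy data and the names `rbf_kernel` and `fuse_kernels` are all assumptions made for the example, and the abstract's time-sequential distance learning and projection steps are left out.

```python
import math

def rbf_kernel(view, gamma=1.0):
    """Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2) for one view."""
    n = len(view)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(view[i], view[j])))
             for j in range(n)] for i in range(n)]

def fuse_kernels(kernels, weights):
    """Convex combination of per-view Gram matrices (weights fixed by hand here)."""
    if len(kernels) != len(weights):
        raise ValueError("one weight per kernel is required")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(kernels[0])
    return [[sum(w * k[i][j] for w, k in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

# Hypothetical toy data: three samples described by two different views.
view_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
view_b = [[0.1], [0.2], [0.9]]

fused = fuse_kernels([rbf_kernel(view_a), rbf_kernel(view_b, gamma=2.0)],
                     [0.7, 0.3])
```

Because each RBF kernel has ones on its diagonal and the weights sum to one, the fused matrix keeps a unit diagonal and stays symmetric, so it can be passed to any kernel method that expects a single Gram matrix.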

Northumbria Research Link

Crossref

Springer - Publisher Connector

University of East Anglia digital repository

Improving acoustic vehicle classification by information fusion

Author: Damarla Thyagaraju
Guo Baofeng
Nixon Mark
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/03/2011

We present an information fusion approach for ground vehicle classification based on the emitted acoustic signal. Many acoustic factors can contribute to the classification accuracy of working ground vehicles. Classification relying on a single feature set may lose some useful information if its underlying sound production model is not comprehensive. To improve classification accuracy, we consider an information fusion diagram, in which various aspects of an acoustic signature are taken into account and emphasized separately by two different feature extraction methods. The first set of features aims to represent internal sound production, and a number of harmonic components are extracted to characterize the factors related to the vehicle’s resonance. The second set of features is extracted based on a computationally effective discriminatory analysis, and a group of key frequency components are selected by mutual information, accounting for the sound production from the vehicle’s exterior parts. In correspondence with this structure, we further put forward a modified Bayesian fusion algorithm, which takes advantage of matching each specific feature set with its favored classifier. To assess the proposed approach, experiments are carried out based on a data set containing acoustic signals from different types of vehicles. Results indicate that the fusion approach can effectively increase classification accuracy compared to that achieved using each individual feature set alone. The Bayesian-based decision-level fusion is found to be better than a feature-level fusion approach.
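Decision-level fusion of two classifiers can be sketched with the textbook Bayesian product rule, which assumes the feature sets are conditionally independent given the class. This is not the modified Bayesian algorithm the abstract refers to, whose details are not given here. The function name `bayes_product_fusion` and the posterior values are hypothetical.

```python
def bayes_product_fusion(posteriors, priors):
    """Fuse per-classifier class posteriors with the product rule.

    posteriors: list of lists, posteriors[m][c] = P(class c | features of classifier m)
    priors:     list, priors[c] = P(class c), all strictly positive
    Assumes conditional independence of the feature sets given the class.
    """
    if any(p <= 0 for p in priors):
        raise ValueError("priors must be strictly positive")
    m = len(posteriors)
    fused = []
    for c, prior in enumerate(priors):
        score = prior ** (1 - m)          # divide out the prior counted m-1 extra times
        for post in posteriors:
            score *= post[c]
        fused.append(score)
    total = sum(fused)
    if total == 0:
        raise ValueError("all fused scores are zero")
    return [s / total for s in fused]

# Hypothetical outputs for two vehicle classes from two classifiers,
# one fed harmonic features and one fed selected frequency components.
harmonic_posterior = [0.6, 0.4]
frequency_posterior = [0.7, 0.3]
fused = bayes_product_fusion([harmonic_posterior, frequency_posterior],
                             priors=[0.5, 0.5])
predicted_class = max(range(len(fused)), key=lambda c: fused[c])
```

When both classifiers lean toward the same class, the fused posterior is more confident than either input, which is the usual motivation for combining at the decision level.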

Southampton (e-Prints Soton)

A multilevel paradigm for deep convolutional neural network features selection with an application to human gait recognition

Author: Arshad H
João Manuel R. S. Tavares
Khan MA
Satapathy SC
Sharif MI
Yasmin M
Zhang YD
Publication venue: 'Wiley'
Publication date: 01/01/2020

Human gait recognition (HGR) is of high importance in video surveillance because of remote access and security threats. HGR is a technique commonly used for the identification of human walking style in daily life. However, many typical situations, such as changes in clothing and variations in view angle, degrade system performance. Lately, different machine learning (ML) techniques have been introduced for video surveillance and give promising results, among which deep learning (DL) shows the best performance in complex scenarios. In this article, an integrated framework is proposed for HGR using a deep neural network and a fuzzy entropy controlled skewness (FEcS) approach. The proposed technique works in two phases. In the first phase, deep convolutional neural network (DCNN) features are extracted by pre-trained CNN models (VGG19 and AlexNet) and their information is combined by a parallel fusion approach. In the second phase, entropy and skewness vectors are calculated from the fused feature vector (FV) to select the best subsets of features by the suggested FEcS approach. The best subsets of selected features are finally fed to multiple classifiers, and the best one is chosen on the basis of accuracy. The experiments were carried out on four well-known datasets, namely AVAMVG gait and CASIA A, B and C. The accuracy achieved on these datasets was 99.8, 99.7, 93.3 and 92.2%, respectively. The overall recognition results therefore lead to the conclusion that the proposed system is very promising.
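The second phase computes an entropy vector and a skewness vector over the fused features. The sketch below shows what per-feature versions of those two statistics could look like. The abstract does not specify the FEcS selection rule, so the histogram-based entropy, the bin count, the entropy-only ranking in `top_k` and the toy matrix are assumptions for illustration only.

```python
import math

def skewness(values):
    """Population skewness m3 / m2^1.5; returns 0.0 for a constant feature."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        return 0.0
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def histogram_entropy(values, bins=4):
    """Shannon entropy (bits) of an equal-width histogram of one feature."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)

def top_k(scores, k):
    """Indices of the k highest-scoring features (placeholder selection rule)."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Hypothetical fused feature matrix: rows are samples, columns are features.
fused_features = [
    [0.0, 5.0, 0.0],
    [1.0, 5.0, 0.0],
    [2.0, 5.0, 0.0],
    [3.0, 5.0, 10.0],
]
columns = [[row[j] for row in fused_features] for j in range(3)]
entropy_vector = [histogram_entropy(col) for col in columns]
skewness_vector = [skewness(col) for col in columns]
selected = top_k(entropy_vector, k=1)
```

A constant feature scores zero on both statistics, which matches the intuition that it carries no information for the classifiers.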

Repositório Aberto da Universidade do Porto

Deep Adaptive Feature Embedding with Local Sample Distributions for Person Re-identification

Author: Gao Junbin
Li Xue
Wang Yang
Wu Lin
Publication venue: 'Elsevier BV'
Publication date: 06/09/2017

Person re-identification (re-id) aims to match pedestrians observed by disjoint camera views. It attracts increasing attention in computer vision due to its importance to surveillance systems. To combat the major challenge of cross-view visual variations, deep embedding approaches have been proposed that learn a compact feature space from images such that the Euclidean distances correspond to their cross-view similarity metric. However, the global Euclidean distance cannot faithfully characterize the ideal similarity in a complex visual feature space, because features of pedestrian images exhibit unknown distributions due to large variations in pose, illumination and occlusion. Moreover, intra-personal training samples within a local range are robust enough to guide deep embedding against uncontrolled variations, but they cannot be captured by a global Euclidean distance. In this paper, we study the problem of person re-id by proposing a novel sampling to mine suitable positives (i.e. intra-class) within a local range to improve the deep embedding in the context of large intra-class variations. Our method is capable of learning a deep similarity metric adaptive to local sample structure by minimizing each sample's local distances while propagating through the relationship between samples to attain the whole intra-class minimization. To this end, a novel objective function is proposed to jointly optimize similarity metric learning, local positive mining and robust deep embedding. This yields local discriminations by selecting local-ranged positive samples, and the learned features are robust to dramatic intra-class variations. Experiments on benchmarks show state-of-the-art results achieved by our method. Comment: Published in Pattern Recognition.
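The idea of mining positives "within a local range" can be pictured as keeping only same-identity samples that lie close to the anchor in the current embedding. The sketch below is a minimal reading of that idea, not the paper's sampling scheme or objective: the fixed `radius`, the Euclidean distance and the function name `mine_local_positives` are assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mine_local_positives(embeddings, labels, anchor, radius):
    """Indices of same-identity samples within `radius` of the anchor, nearest first."""
    hits = []
    for i, (emb, label) in enumerate(zip(embeddings, labels)):
        if i == anchor or label != labels[anchor]:
            continue
        dist = euclidean(embeddings[anchor], emb)
        if dist <= radius:
            hits.append((dist, i))
    return [i for _, i in sorted(hits)]

# Hypothetical 2-D embeddings; identity 1 has one nearby and one distant sample.
embeddings = [[0.0, 0.0], [0.5, 0.0], [3.0, 0.0], [0.2, 0.0]]
labels = [1, 1, 1, 2]
local_positives = mine_local_positives(embeddings, labels, anchor=0, radius=1.0)
```

Here the distant same-identity sample (index 2) is skipped for this anchor, so a loss built on the mined pairs would only pull together samples that are already locally close.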

arXiv.org e-Print Archive

University of Queensland eSpace

Gait recognition from multiple view-points

Author: Castro Payan Francisco Manuel
Publication venue: UMA Editorial
Publication date: 01/01/2018

On completing the thesis, the main conclusion drawn is that gait allows people to be identified with good accuracy (above 90 percent and reaching 99 percent in certain cases). Regarding the different approaches developed, the method based on hand-crafted features is especially suited to databases with few samples, since it achieves good accuracy while needing little training data. The deep learning approach, on the other hand, obtains good results on large databases, with the advantage that the input size can be very small, allowing very fast execution. The incremental approach is especially suited to environments where new subjects must be added to the system without retraining the method, given the high time and energy costs of retraining. Finally, the energy consumption study allowed us to define a series of recommendations for minimizing energy consumption during the training of deep networks without penalizing their accuracy. Date of doctoral thesis defense: 14 December 2018. Computer Architecture. Thesis summary: Automatic identification of people has become very important in recent years, since it can be applied in environments that must be secure (airports, nuclear power plants, etc.) to speed up all access processes. Most solutions developed for this problem rely on a wide range of physical characteristics of the subjects, such as the iris, the fingerprint or the face. However, these techniques have a number of limitations, since they either require the cooperation of the subject being identified or are very sensitive to changes in appearance.
Gait recognition, in contrast, is a non-invasive way to implement these security controls and, in addition, does not require the subject's cooperation. It is also robust to changes in the individual's appearance, since it focuses on motion. The main goal of this thesis is to develop a new method for identifying people from the way they walk in multi-view environments. As input we use optical flow, which provides very rich information about the subject's motion while walking. To meet this goal, two different techniques have been developed: one based on a traditional computer vision approach, where features describing the subject are extracted by hand, and a second based on deep learning, where the method itself extracts its features and classifies them automatically. In addition, for the latter approach, an implementation based on incremental learning has been developed to add new classes without training the model from scratch, along with an energy study to optimize energy consumption during training.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional Universidad de Málaga

Efficient Human Activity Recognition in Large Image and Video Databases

Author: Cheema Muhammad Shahzad
Publication venue: Universitäts- und Landesbibliothek Bonn
Publication date

Vision-based human action recognition has attracted considerable interest in recent research for its applications to video surveillance, content-based search, healthcare, and interactive games. Most existing research deals with building informative feature descriptors, designing efficient and robust algorithms, proposing versatile and challenging datasets, and fusing multiple modalities. Often, these approaches build on certain conventions such as the use of motion cues to determine video descriptors, application of off-the-shelf classifiers, and single-factor classification of videos. In this thesis, we deal with important but overlooked issues such as efficiency, simplicity, and scalability of human activity recognition in different application scenarios: controlled video environments (e.g. indoor surveillance), unconstrained videos (e.g. YouTube), depth or skeletal data (e.g. captured by Kinect), and person images (e.g. Flickr). In particular, we are interested in answering questions like (a) is it possible to efficiently recognize human actions in controlled videos without temporal cues? (b) given that large-scale unconstrained video data are often of a high dimension, low sample size (HDLSS) nature, how can human actions be recognized efficiently in such data? (c) considering the rich 3D motion information available from depth or motion capture sensors, is it possible to recognize both the actions and the actors using only the motion dynamics of underlying activities? and (d) can motion information from monocular videos be used for automatically determining saliency regions for recognizing actions in still images?

bonndoc – Der Publikationsserver der Universität Bonn


View-invariant gait person re-identification with spatial and temporal attention

Author: Rahi Babak
Publication venue: Brunel University London
Publication date: 01/01/2021

This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Person re-identification at a distance across multiple non-overlapping cameras has been an active research area for years. In the past ten years, short-term person Re-Id techniques have made great strides in accuracy using only appearance features in limited environments. However, massive intra-class variations and inter-class confusion limit their use in practical applications. Moreover, appearance consistency can only be assumed over a short time span from one camera to the other. Since the holistic appearance will change drastically over days and weeks, such techniques will be ineffective. Practical applications usually require a long-term solution in which the subject's appearance and clothing might have changed after a significant period has elapsed. Facing these problems, soft biometric features such as gait have been proposed in the past. Nevertheless, even gait can vary with illness, ageing, changes in emotional state, walking surface, shoe type, clothes type, objects carried by the subject and even clutter in the scene. Therefore, gait is considered a temporal cue that could provide biometric motion information. On the other hand, the shape of the human body could be viewed as a spatial signal which can produce valuable information. So, extracting discriminative features from both spatial and temporal domains would be very beneficial to this research. Therefore, this thesis focuses on finding the best and most robust method to tackle the gait-based person re-identification problem and solve it for practical applications. In real-world surveillance scenarios, the human gait cycle is primarily abnormal. These abnormalities include, but are not limited to, changes in temporal and spatial characteristics such as walking speed, broken gait phase and, most importantly, varied camera angles.
Our work performed an extensive literature study on spatial and temporal gait feature extraction methods with a focus on deep learning. Next, we conducted a comparative study and proposed a spatial-temporal approach for gait feature extraction using the fusion of multiple modalities, including optical flow, raw silhouettes and RGB images. This approach was tested on two of the most challenging publicly available datasets for gait recognition, TUM-GAID and CASIA-B, with excellent results presented in chapter 3. Furthermore, a modern spatial-temporal attention mechanism was proposed and tested on the CASIA-B and OULP datasets; it learns salient features independent of the gait cycle and view variations. The spatial attention layer in the proposed method extracts spatial feature maps using a two-layered architecture, and the maps are fused using late fusion. It can attend discriminatively to the identity-related salient regions in silhouette sequences using the spatial feature maps. The temporal attention layer consists of an LSTM that encodes the temporal motion of silhouette sequences. It uses the encoded output vectors in the temporal attention architecture to focus on the most critical timesteps in the gait cycle and discard the rest. Furthermore, we improved the performance of our method by mapping our extracted spatial-temporal gait features to a discriminative null space for use in our Siamese architecture for cross-matching. We also conducted an element removal experiment on each segment of our spatial-temporal attentional network to gain insight into each component’s contribution to the performance. Our method showed outstanding robustness against abnormal gait cycles as well as viewpoint variations on both benchmark datasets.
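Focusing on "the most critical timesteps" of a sequence of encoded vectors is commonly done by scoring each timestep, normalizing the scores with a softmax, and taking the weighted sum. The sketch below shows that generic pattern only. It does not reproduce the thesis's architecture: the dot-product scoring, the `query` vector and the toy hidden states standing in for LSTM outputs are assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_attention(hidden_states, query):
    """Weight each timestep by softmax(h_t . query) and return (weights, context)."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context

# Hypothetical per-timestep encoder outputs (three timesteps, two dimensions).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
weights, context = temporal_attention(hidden_states, query=[1.0, 0.0])
```

Timesteps with low scores receive weights near zero, so they contribute little to the pooled vector. This is a soft version of discarding the rest of the gait cycle.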

Brunel University Research Archive