Evaluation of Deep Learning based Pose Estimation for Sign Language Recognition
Human body pose estimation and hand detection are two important tasks for
systems that perform computer vision-based sign language recognition (SLR).
However, both tasks are challenging, especially when the input consists of
color videos with no depth information. Many algorithms have been proposed in
the literature for these tasks, and some of the most successful recent
algorithms are based on deep learning. In this paper, we introduce a dataset
for human pose estimation in the SLR domain. We evaluate the performance of two
deep-learning-based pose estimation methods by performing user-independent
experiments on our dataset. We also perform transfer learning, and we obtain
results demonstrating that transfer learning can improve pose estimation
accuracy. The dataset and results from these methods can serve as a useful
baseline for future work.
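The transfer-learning step the abstract describes can be illustrated with a minimal linear-probe sketch: a "pretrained" feature extractor is frozen and only a new output head is fitted on target-domain data. The random-projection backbone, all shapes, and the synthetic targets below are hypothetical stand-ins, not the paper's actual networks or dataset.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
n, d_in, d_feat, d_out = 200, 64, 32, 14   # 14 = 7 joints x (x, y), hypothetical

# Frozen "pretrained" backbone: a fixed random projection standing in
# for a deep pose-estimation network trained on a source domain.
W_pretrained = rng.normal(size=(d_in, d_feat))

def features(x):
    # Backbone features are computed but never updated (frozen weights).
    return np.tanh(x @ W_pretrained)

X = rng.normal(size=(n, d_in))    # target-domain (SLR) input frames
y = rng.normal(size=(n, d_out))   # synthetic joint-coordinate targets

# Transfer learning here = fitting only the new output head.
head = Ridge(alpha=1.0).fit(features(X), y)
pred = head.predict(features(X[:3]))
print(pred.shape)  # (3, 14)
```

In a full fine-tuning setup the backbone weights would also be updated with a small learning rate; the frozen-backbone variant above is just the simplest form of domain transfer.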
Sign language recognition with transformer networks
Sign languages are complex languages. Research into them is ongoing, supported by large video corpora of which only small parts are annotated. Sign language recognition can be used to speed up the annotation process of these corpora, in order to aid research into sign languages and sign language recognition. Previous research has approached sign language recognition in various ways, using feature extraction techniques or end-to-end deep learning. In this work, we apply a combination of feature extraction using OpenPose for human keypoint estimation and end-to-end feature learning with Convolutional Neural Networks. The proven multi-head attention mechanism used in transformers is applied to recognize isolated signs in the Flemish Sign Language corpus. Our proposed method significantly outperforms the previous state of the art in sign language recognition on the Flemish Sign Language corpus: we obtain an accuracy of 74.7% on a vocabulary of 100 classes. Our results will be implemented as a suggestion system for sign language corpus annotation.
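The core mechanism this abstract relies on, multi-head self-attention over a sequence of per-frame keypoint features, can be sketched in plain numpy. The feature dimensions, head count, and random weights below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, n_heads):
    """Scaled dot-product self-attention over a keypoint sequence.
    X: (T, d) frame-wise pose features (e.g. OpenPose keypoints)."""
    T, d = X.shape
    dh = d // n_heads                       # per-head feature width
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    out = np.empty_like(X)
    for h in range(n_heads):
        s = slice(h * dh, (h + 1) * dh)
        scores = Q[:, s] @ K[:, s].T / np.sqrt(dh)   # (T, T) frame-to-frame
        out[:, s] = softmax(scores, axis=-1) @ V[:, s]
    return out

rng = np.random.default_rng(0)
T, d, H = 16, 8, 2                          # 16 frames, 8-dim features, 2 heads
X = rng.normal(size=(T, d))
W = [rng.normal(size=(d, d)) for _ in range(3)]
Y = multi_head_self_attention(X, *W, n_heads=H)
print(Y.shape)  # (16, 8)
```

In a real recognizer, `Y` would be pooled over time and fed to a classifier over the sign vocabulary; the attention weights let each frame attend to the most informative frames elsewhere in the sign.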
GENDER CLASSIFICATION VIA HUMAN JOINTS USING CONVOLUTIONAL NEURAL NETWORK
With the growing demand for gender-related data in diverse applications, including security systems for ascertaining an individual’s identity at border crossings, as well as marketing purposes such as identifying potential customers and tailoring special discounts for them, gender classification has become an essential task within the field of computer vision and deep learning. There has been extensive research on classifying human gender using facial expression, exterior appearance (e.g., hair, clothes), or gait movement. However, within the scope of our research, none has focused specifically on gender classification from two-dimensional body joints. Knowing this, we believe that a new prediction pipeline is required to improve the accuracy of gender classification on purely joint images. In this paper, we propose novel yet simple methods for gender recognition. We conducted our experiments on the BBC Pose and Short BBC Pose datasets. We preprocess the raw images by filtering out frames with missing human figures, removing background noise by cropping the images, and labeling the joints via the pre-trained C5 model (obtained by applying transfer learning to ResNet-152). We implemented both machine learning (SVM) and deep learning (Convolutional Neural Network) methods to classify the images into binary genders. The deep learning method outperformed the classic machine learning method with an accuracy of 66.5%.
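The SVM baseline in this pipeline reduces to fitting a classifier on flattened joint coordinates. The sketch below uses synthetic data (7 hypothetical joints, labels derived from one coordinate) purely to show the shape of the approach; it is not the paper's dataset or result.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical data: 2-D coordinates of 7 body joints per image,
# flattened to a 14-dim vector; labels are a binary class (0/1).
rng = np.random.default_rng(42)
n, n_joints = 400, 7
X = rng.normal(size=(n, n_joints * 2))
y = (X[:, 0] > 0).astype(int)   # synthetic labels tied to one joint coordinate

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="linear").fit(X_tr, y_tr)   # SVM on raw joint vectors
acc = clf.score(X_te, y_te)
print(f"held-out accuracy: {acc:.3f}")
```

The CNN variant from the abstract would instead consume the cropped joint images directly; the SVM path shown here is the simpler of the two baselines.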
Single camera pose estimation using Bayesian filtering and Kinect motion priors
Traditional approaches to upper body pose estimation using monocular vision
rely on complex body models and a large variety of geometric constraints. We
argue that this is not ideal and somewhat inelegant as it results in large
processing burdens, and instead attempt to incorporate these constraints
through priors obtained directly from training data. A prior distribution
covering the probability of a human pose occurring is used to incorporate
likely human poses. This distribution is obtained offline, by fitting a
Gaussian mixture model to a large dataset of recorded human body poses, tracked
using a Kinect sensor. We combine this prior information with a random walk
transition model to obtain an upper body model, suitable for use within a
recursive Bayesian filtering framework. Our model can be viewed as a mixture of
discrete Ornstein-Uhlenbeck processes, in that states behave as random walks,
but drift towards a set of typically observed poses. This model is combined
with measurements of the human head and hand positions, using recursive
Bayesian estimation to incorporate temporal information. Measurements are
obtained using face detection and a simple skin colour hand detector, trained
using the detected face. The suggested model is designed with analytical
tractability in mind and we show that the pose tracking can be
Rao-Blackwellised using the mixture Kalman filter, allowing for computational
efficiency while still incorporating bio-mechanical properties of the upper
body. In addition, the use of the proposed upper body model allows reliable
three-dimensional pose estimates to be obtained indirectly for a number of
joints that are often difficult to detect using traditional object recognition
strategies. Comparisons with Kinect sensor results and the state of the art in
2D pose estimation highlight the efficacy of the proposed approach.
Comment: 25 pages, technical report, related to Burke and Lasenby, AMDO 2014
conference paper. Code sample: https://github.com/mgb45/SignerBodyPose Video:
https://www.youtube.com/watch?v=dJMTSo7-uF
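The two offline ingredients this abstract describes, a Gaussian-mixture prior over recorded poses and a random-walk transition that drifts towards typical poses (a discrete Ornstein-Uhlenbeck step), can be sketched as follows. The 6-D pose vectors, component count, and step parameters are illustrative assumptions, not the paper's Kinect data or tuned model.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical stand-in for Kinect-tracked poses: 6-D upper-body pose
# vectors clustered around two "typical" poses.
rng = np.random.default_rng(1)
poses = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(500, 6)),
    rng.normal(loc=1.0, scale=0.1, size=(500, 6)),
])

# Offline prior: Gaussian mixture fitted to the recorded pose dataset.
prior = GaussianMixture(n_components=2, random_state=0).fit(poses)

def ou_step(x, mu, theta=0.2, sigma=0.05, rng=rng):
    """One discrete Ornstein-Uhlenbeck step: a random walk whose state
    drifts towards a typically observed pose mu."""
    return x + theta * (mu - x) + sigma * rng.normal(size=x.shape)

# Propagate the current state towards its most likely mixture component.
x = rng.normal(size=6)
k = prior.predict(x.reshape(1, -1))[0]
x_next = ou_step(x, prior.means_[k])
print(x_next.shape)  # (6,)
```

In the full method this transition model sits inside a recursive Bayesian filter (Rao-Blackwellised via a mixture Kalman filter) and is corrected at each frame by the head and hand measurements.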
Multi-channel Transformers for Multi-articulatory Sign Language Translation
Sign languages use multiple asynchronous information channels (articulators),
not just the hands but also the face and body, which computational approaches
often ignore. In this paper we tackle the multi-articulatory sign language
translation task and propose a novel multi-channel transformer architecture.
The proposed architecture allows both the inter and intra contextual
relationships between different sign articulators to be modelled within the
transformer network itself, while also maintaining channel specific
information. We evaluate our approach on the RWTH-PHOENIX-Weather-2014T dataset
and report competitive translation performance. Importantly, we overcome the
reliance on gloss annotations that underpins other state-of-the-art
approaches, thereby removing the future need for expensive curated datasets.
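The inter-channel modelling this abstract describes amounts to letting one articulator channel attend to another while keeping its own feature stream. The sketch below is a hedged numpy illustration with made-up channel names and dimensions, not the paper's multi-channel transformer.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax along the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_channel_attention(q_chan, kv_chan):
    """Attend from one articulator channel (e.g. hands) to another
    (e.g. face); both are (T, d) frame-wise feature sequences."""
    d = q_chan.shape[-1]
    scores = q_chan @ kv_chan.T / np.sqrt(d)   # (T, T) cross-channel scores
    return softmax(scores) @ kv_chan

rng = np.random.default_rng(0)
T, d = 12, 16                                  # 12 frames, 16-dim features
hands, face, body = (rng.normal(size=(T, d)) for _ in range(3))

# Inter-channel context: hands attend to face and body; intra-channel
# information is preserved by adding back the original hand stream.
hands_ctx = hands + cross_channel_attention(hands, face) \
                  + cross_channel_attention(hands, body)
print(hands_ctx.shape)  # (12, 16)
```

Repeating this for every channel pair, with learned projections and stacked layers, gives the inter/intra contextual modelling the abstract refers to, while each channel retains its own representation.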
Automatic and efficient human pose estimation for sign language videos
We present a fully automatic arm and hand tracker that detects joint positions over continuous sign language video sequences of more than an hour in length. To achieve this, we make contributions in four areas: (i) we show that the overlaid signer can be separated from the background TV broadcast using co-segmentation over all frames with a layered model; (ii) we show that joint positions (shoulders, elbows, wrists) can be predicted per-frame using a random forest regressor given only this segmentation and a colour model; (iii) we show that the random forest can be trained from an existing semi-automatic, but computationally expensive, tracker; and (iv) we introduce an evaluator to assess whether the predicted joint positions are correct for each frame. The method is applied to 20 signing footage videos with changing backgrounds, challenging imaging conditions, and different signers. Our framework outperforms the state-of-the-art long-term tracker by Buehler et al. (International Journal of Computer Vision 95:180–197, 2011), does not require the manual annotation of that work, and, after automatic initialisation, performs tracking in real time. We also achieve superior joint localisation results to those obtained using the pose estimation method of Yang and Ramanan (Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2011).
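Contribution (ii), a random forest regressor mapping per-frame segmentation and colour features to joint positions, can be sketched with scikit-learn. The feature and target dimensions below are hypothetical (synthetic features standing in for the segmentation/colour cues, six joints giving twelve outputs); this is not the paper's trained model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical per-frame setup: features derived from a signer
# segmentation + colour model; targets are (x, y) positions of
# shoulders, elbows and wrists (6 joints -> 12 outputs).
rng = np.random.default_rng(7)
n_frames, n_feat, n_out = 300, 20, 12
X = rng.normal(size=(n_frames, n_feat))
W = rng.normal(size=(n_feat, n_out))
y = X @ W + 0.1 * rng.normal(size=(n_frames, n_out))  # synthetic joint targets

# Multi-output regression: one forest predicts all joint coordinates.
reg = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
pred = reg.predict(X[:5])
print(pred.shape)  # (5, 12)
```

In the paper's setting, the training targets come from the slower semi-automatic tracker (contribution iii), so the forest learns to reproduce its output at real-time speed.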
Characterizing Unstructured Motor Behaviors in the Epilepsy Monitoring Unit
Key advancements in recording hardware, data computation, clinical care, and cognitive science continue to drive new possibilities in how humans and machines can interact directly through thought. Neural data analyses building on these advancements have progressed neuroscience research in functional brain mapping and brain-computer interfaces (BCIs). Much of our knowledge about BCIs is informed by data collected through carefully controlled experiments. Constraining BCI experiments with structured paradigms allows researchers to collect a large amount of consistent data in a short time, while also controlling for external confounds. Very little is currently known about how well these task-based relationships extend to daily life, in part because collecting data outside of the lab is challenging. To further understand natural brain activity, we must study more complex behaviors in more environmentally relevant settings. The results of this dissertation address three general challenges to studying neural correlates of unstructured behaviors. First, we continuously monitored unstructured human movements in the epilepsy monitoring unit using a video sensor synchronized to clinical intracortical electrodes. Second, we annotated unstructured behaviors from these videos using both manual and computer vision methods. Finally, we analyzed neural features with respect to unstructured human movements and evaluated the performance of features identified in previous task-based studies. Given the preliminary nature of this work, the majority of our demonstrations concern whether the continuous paradigm can be leveraged, how one might go about leveraging it, and evaluations that tie our results back to earlier task-based studies. Our advances here motivate future work that focuses more intently on what types of behaviors and neural signal features to explore.
Signal Processing and Machine Learning Techniques Towards Various Real-World Applications
Machine learning (ML) has played an important role in several modern technological innovations and has become an important tool for researchers in various fields of interest. Besides engineering, ML techniques have started to spread across various fields of study, such as health care, medicine, diagnostics, social science, finance, and economics. These techniques require data to train the algorithms, model a complex system, and make predictions based on that model. Due to the development of sophisticated sensors, it has become easier to collect large volumes of data, which are used to form the necessary hypotheses using ML. The promising results obtained using ML have opened up new opportunities for research across various fields, and this dissertation is a manifestation of that. Here, some unique studies are presented, from which valuable inferences have been drawn for real-world complex systems. Each study has its own unique motivation and relevance to the real world. An ensemble of signal processing (SP) and ML techniques has been explored in each study. This dissertation provides a detailed, systematic approach and discusses the results achieved in each study. The valuable inferences drawn from each study play a vital role in areas of science and technology and are worth further investigation. This dissertation also provides a set of useful SP and ML tools for researchers in various fields of interest.
Doctoral Dissertation, Electrical Engineering, 201