
    Exploiting 3D Hand Pose Estimation in Deep Learning-Based Sign Language Recognition from RGB Videos

    In this paper, we investigate the benefit of 3D hand skeletal information to the task of sign language (SL) recognition from RGB videos, within a state-of-the-art, multiple-stream, deep-learning recognition system. As most SL datasets are available in traditional RGB-only video lacking depth information, we propose to infer 3D coordinates of the hand joints from RGB data via a powerful architecture that has been primarily introduced in the literature for the task of 3D human pose estimation. We then fuse these estimates with additional SL-informative streams, namely 2D skeletal data, as well as convolutional neural network-based hand- and mouth-region representations, and employ an attention-based encoder-decoder for recognition. We evaluate our proposed approach on a corpus of isolated signs of Greek SL and a dataset of continuous finger-spelling in American SL, reporting significant gains from the inclusion of 3D hand pose information, while also outperforming the state-of-the-art on both databases. Further, we evaluate the 3D hand pose estimation technique as a standalone module. © 2020, Springer Nature Switzerland AG
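The frame-level fusion of the multiple feature streams described above can be sketched as follows; all function names, stream contents, and dimensions here are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: concatenating per-frame features from temporally
# aligned streams (2D skeleton, regressed 3D hand joints, hand- and
# mouth-region CNN embeddings) before the attention-based encoder-decoder.

def fuse_streams(frames_2d, frames_3d, hand_cnn, mouth_cnn):
    """Concatenate per-frame feature vectors from aligned streams."""
    n = len(frames_2d)
    assert all(len(s) == n for s in (frames_3d, hand_cnn, mouth_cnn)), \
        "streams must be temporally aligned"
    return [f2 + f3 + fh + fm
            for f2, f3, fh, fm in zip(frames_2d, frames_3d, hand_cnn, mouth_cnn)]

# Two frames, toy feature vectors per stream:
fused = fuse_streams([[0.1, 0.2], [0.3, 0.4]],   # 2D skeletal features
                     [[1.0], [2.0]],             # regressed 3D hand joints
                     [[5.0], [6.0]],             # hand-region CNN embedding
                     [[9.0], [8.0]])             # mouth-region CNN embedding
print(fused[0])  # → [0.1, 0.2, 1.0, 5.0, 9.0]
```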

    Spatio-Temporal Graph Convolutional Networks for Continuous Sign Language Recognition

    We address the challenging problem of continuous sign language recognition (CSLR) from RGB videos, proposing a novel deep-learning framework that employs spatio-temporal graph convolutional networks (ST-GCNs), which operate on multiple, appropriately fused feature streams, capturing the signer's pose, shape, appearance, and motion information. In addition to introducing such networks to the continuous recognition problem, our model's novelty lies in: (i) the feature streams considered and their blending into three ST-GCN modules; (ii) the combination of such modules with bi-directional long short-term memory networks, thus capturing both short-term embedded signing dynamics and long-range feature dependencies; and (iii) the fusion scheme, where the resulting modules operate in parallel, their posteriors aligned via a guiding connectionist temporal classification method, and fused for sign gloss prediction. Notably, concerning (i), in addition to traditional CSLR features, we investigate the utility of 3D human pose and shape parameterization via the “ExPose” approach, as well as 3D skeletal joint information that is regressed from detected 2D joints. We evaluate the proposed system on two well-known CSLR benchmarks, conducting extensive ablations on its modules. We achieve the new state-of-the-art on one of the two datasets, while reaching very competitive performance on the other. © 2022 IEEE

    Multimodal fusion and sequence learning for cued speech recognition from videos

    Cued Speech (CS) constitutes a non-vocal mode of communication that relies on lip movements in conjunction with hand positional and gestural cues, in order to disambiguate phonetic information and make it accessible to the speech and hearing impaired. In this study, we address the automatic recognition of CS from videos, employing deep learning techniques and extending our earlier work on this topic as follows: First, for visual feature extraction, in addition to hand positioning embeddings and convolutional neural network-based appearance features of the mouth region and signing hand, we consider structural information of the hand and mouth articulators. Specifically, we utilize the OpenPose framework to extract 2D lip keypoints and hand skeletal coordinates of the signer, and we also infer 3D hand skeletal coordinates from the latter, exploiting our own earlier work on 2D-to-3D hand-pose regression. Second, we modify the sequence learning model, by considering a time-depth separable (TDS) convolution block structure that encodes the fused visual features, in conjunction with a decoder that is based on connectionist temporal classification for phonetic sequence prediction. We investigate the contribution of the above to CS recognition, evaluating our model on a French and a British English CS video dataset, and we report significant gains over the state-of-the-art on both sets. © Springer Nature Switzerland AG 2021
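The core idea behind the time-depth separable (TDS) convolution the abstract mentions can be sketched in two stages: a depthwise convolution along time applied to each feature channel independently, followed by a pointwise (1x1) mixing across channels. The kernel sizes and weights below are illustrative assumptions, not the model's parameters.

```python
# Minimal pure-Python sketch of a separable temporal convolution:
# depthwise over time, then pointwise across channels.

def depthwise_temporal_conv(seq, kernel):
    """Convolve each channel independently along time (valid padding)."""
    k = len(kernel)
    n_frames, n_ch = len(seq), len(seq[0])
    return [[sum(kernel[j] * seq[t + j][c] for j in range(k))
             for c in range(n_ch)]
            for t in range(n_frames - k + 1)]

def pointwise_mix(seq, weights):
    """Mix channels at each time step (equivalent to a 1x1 convolution)."""
    return [[sum(w * frame[c] for c, w in enumerate(row)) for row in weights]
            for frame in seq]

seq = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]        # 3 frames, 2 channels
t_out = depthwise_temporal_conv(seq, [0.5, 0.5])  # smooth along time
y = pointwise_mix(t_out, [[1.0, 1.0], [1.0, -1.0]])
print(t_out)  # → [[1.5, 0.5], [2.5, 0.5]]
print(y)      # → [[2.0, 1.0], [3.0, 2.0]]
```

Separating the temporal and channel-mixing steps keeps the parameter count low relative to a full spatio-temporal convolution, which is the usual motivation for TDS-style blocks.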

    An analysis of equine round pen training videos posted online: Differences between amateur and professional trainers

    Natural Horsemanship is popular among many amateur and professional trainers and as such, has been the subject of recent scientific enquiry. One method commonly adopted by Natural Horsemanship (NH) trainers is that of round pen training (RPT). RPT sessions are usually split into a series of bouts; each including two phases: chasing/flight and chasing offset/flight offset. However, NH training styles are heterogeneous. This study investigated online videos of RPT to explore the characteristics of RPT sessions and test for differences in techniques and outcomes between amateurs and professionals (the latter being defined as those with accompanying online materials that promote clinics, merchandise or a service to the public). From more than 300 candidate videos, we selected sample files for individual amateur (n = 24) and professional (n = 21) trainers. Inclusion criteria were: training at liberty in a Round Pen; more than one bout and good quality video. Sessions or portions of sessions were excluded if the trainer attached equipment, such as a lunge line, directly to the horse or the horse was saddled, mounted or ridden. The number of bouts and duration of each chasing and non-chasing phase were recorded, and the duration of each RPT session was calculated. General weighted regression analysis revealed that, when compared with amateurs, professionals showed fewer arm movements per bout (p<0.05). Poisson regression analysis showed that professionals spent more time looking up at their horses, when transitioning between gaits, than amateurs did (p<0.05). The probability of horses following the trainer was not significantly associated with amount of chasing, regardless of category. Given that, according to some practitioners, the following response is a goal of RPT, this result may prompt caution in those inclined to give chase. The horses handled by professionals showed fewer conflict behaviours (e.g. kicking, biting, stomping, head-tossing, defecating, bucking and attempting to escape), and fewer oral and head movements (e.g. head-lowering, licking and chewing) than those horses handled by amateurs. Overall, these findings highlight the need for selectivity when using the internet as an educational source and the importance of trainer skill and excellent timing when using negative reinforcement in horse training.
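The session measures the study analyses (bout counts, phase durations, session length) could be derived from per-video annotations along these lines; the (start_seconds, phase) annotation format below is an assumption for illustration only, not the study's coding scheme.

```python
# Hypothetical derivation of RPT session measures from a sorted timeline of
# (start_seconds, phase) annotations.

def phase_durations(events, session_end):
    """Return (phase, duration) pairs from a sorted (start, phase) timeline."""
    out = []
    for (start, phase), nxt in zip(events, events[1:] + [(session_end, None)]):
        out.append((phase, nxt[0] - start))
    return out

def count_bouts(events):
    """A bout = one chasing phase followed by its chasing-offset phase."""
    return sum(1 for _, phase in events if phase == "chasing")

timeline = [(0, "chasing"), (40, "offset"), (70, "chasing"), (95, "offset")]
durations = phase_durations(timeline, session_end=120)
print(durations)
# → [('chasing', 40), ('offset', 30), ('chasing', 25), ('offset', 25)]
print(count_bouts(timeline))  # → 2
```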

    TNF-308G > A Single Nucleotide Polymorphism Is Associated With Leprosy Among Brazilians: A Genetic Epidemiology Assessment, Meta-Analysis, and Functional Study

    Leprosy is an infectious disease caused by Mycobacterium leprae. Tumor necrosis factor (TNF) plays a key role in the host response. Some association studies have implicated the single nucleotide polymorphism TNF -308G > A in leprosy susceptibility, but these results are still controversial. We first conducted 4 association studies (2639 individuals) that showed a protective effect of the -308A allele (odds ratio [OR] = 0.77; P = .005). Next, results of a meta-analysis reinforced this association after inclusion of our new data (OR = 0.74; P = .04). Furthermore, a subgroup analysis including only Brazilian studies suggested that the association is specific to this population (OR = 0.63; P = .005). Finally, functional analyses using whole blood cultures showed that patients carrying the -308A allele produced higher TNF levels after lipopolysaccharide (LPS) (6 hours) and M. leprae (3 hours) stimulation. These results reinforce the association between TNF and leprosy and suggest the -308A allele as a marker of disease resistance, especially among Brazilians. Funding: Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP).
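An allele-level odds ratio like those reported above comes from a 2x2 table of allele counts in cases versus controls. The counts below are made-up illustrations, not the study's data.

```python
# Odds ratio from a 2x2 allele-count table, plus the Woolf standard error
# of ln(OR) commonly used to weight studies in a meta-analysis.
import math

def odds_ratio(a_cases, g_cases, a_controls, g_controls):
    """OR for carrying the -308A allele in cases vs controls."""
    return (a_cases / g_cases) / (a_controls / g_controls)

def log_or_se(a, b, c, d):
    """Standard error of ln(OR) via the Woolf method."""
    return math.sqrt(1/a + 1/b + 1/c + 1/d)

# Hypothetical counts: 80 A / 920 G alleles in cases, 100 A / 900 G in controls
or_ = odds_ratio(80, 920, 100, 900)
se = log_or_se(80, 920, 100, 900)
print(round(or_, 3))  # → 0.783 (OR < 1: allele appears protective)
print(round(se, 3))   # → 0.157
```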