Deep HMResNet Model for Human Activity-Aware Robotic Systems
Endowing robotic systems with cognitive capabilities for recognizing
daily activities of humans is an important challenge, which requires
sophisticated and novel approaches. Most of the proposed approaches explore
pattern recognition techniques which are generally based on hand-crafted
features or learned features. In this paper, a novel Hierarchical Multichannel
Deep Residual Network (HMResNet) model is proposed for robotic systems to
recognize daily human activities in ambient environments. The introduced
model comprises multilevel fusion layers. The proposed Multichannel 1D
Deep Residual Network model is, at the features level, combined with a
Bottleneck MLP neural network to automatically extract robust features
regardless of the hardware configuration and, at the decision level, is fully
connected with an MLP neural network to recognize daily human activities.
Empirical experiments on real-world datasets and an online demonstration are
used for validating the proposed model. Results demonstrated that the proposed
model outperforms the baseline models in daily human activity recognition.
Comment: Presented at AI-HRI AAAI-FSS, 2018 (arXiv:1809.06606)
DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation
There is an undeniable communication barrier between deaf people and people
with normal hearing ability. Although innovations in sign language translation
technology aim to tear down this communication barrier, the majority of
existing sign language translation systems are either intrusive or constrained
by resolution or ambient lighting conditions. Moreover, these existing systems
can only perform single-sign ASL translation rather than sentence-level
translation, making them much less useful in daily-life communication
scenarios. In this work, we fill this critical gap by presenting DeepASL, a
transformative deep learning-based sign language translation technology that
enables ubiquitous and non-intrusive American Sign Language (ASL) translation
at both word and sentence levels. DeepASL uses infrared light as its sensing
mechanism to non-intrusively capture the ASL signs. It incorporates a novel
hierarchical bidirectional deep recurrent neural network (HB-RNN) and a
probabilistic framework based on Connectionist Temporal Classification (CTC)
for word-level and sentence-level ASL translation respectively. To evaluate its
performance, we have collected 7,306 samples from 11 participants, covering 56
commonly used ASL words and 100 ASL sentences. DeepASL achieves an average
94.5% word-level translation accuracy and an average 8.2% word error rate on
translating unseen ASL sentences. Given its promising performance, we believe
DeepASL represents a significant step towards breaking the communication
barrier between deaf people and the hearing majority, and thus has significant
potential to fundamentally change deaf people's lives.
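
The abstract reports sentence-level performance as a word error rate (WER). It does not say how the authors compute it, so the following is only a minimal sketch of the conventional definition: the word-level edit distance (substitutions, deletions, insertions) between a reference and a hypothesis, divided by the number of reference words. The function name and the example sentences are hypothetical and are not taken from the DeepASL dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level edit distance / number of reference words.

    Assumption: whitespace tokenization and exact string match per word.
    The paper may use a different tokenization or normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of four reference words is dropped.
print(word_error_rate("i want a drink", "i want drink"))  # 0.25
```

Under this definition WER can exceed 1.0 when the hypothesis contains many inserted words, which is why it is reported as an error rate and not as one minus accuracy.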
- …