Understanding face and eye visibility in front-facing cameras of smartphones used in the wild
Commodity mobile devices are now equipped with high-resolution front-facing cameras, allowing applications in biometrics (e.g., FaceID in the iPhone X), facial expression analysis, or gaze interaction. However, it is unknown how often users hold devices in a way that allows capturing their face or eyes, and how this impacts detection accuracy. We collected 25,726 in-the-wild photos taken from the front-facing cameras of smartphones, as well as associated application usage logs. We found that the full face is visible about 29% of the time, and that in most cases the face is only partially visible. Furthermore, we identified an influence of users' current activity; for example, when watching videos, the eyes but not the entire face are visible 75% of the time in our dataset. We found that a state-of-the-art face detection algorithm performs poorly against photos taken from front-facing cameras. We discuss how these findings impact mobile applications that leverage face and eye detection, and derive practical implications to address the limitations of the state of the art.
DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation
There is an undeniable communication barrier between deaf people and people
with normal hearing ability. Although innovations in sign language translation
technology aim to tear down this communication barrier, the majority of
existing sign language translation systems are either intrusive or constrained
by resolution or ambient lighting conditions. Moreover, these existing systems
can only perform single-sign ASL translation rather than sentence-level
translation, making them much less useful in daily-life communication
scenarios. In this work, we fill this critical gap by presenting DeepASL, a
transformative deep learning-based sign language translation technology that
enables ubiquitous and non-intrusive American Sign Language (ASL) translation
at both word and sentence levels. DeepASL uses infrared light as its sensing
mechanism to non-intrusively capture the ASL signs. It incorporates a novel
hierarchical bidirectional deep recurrent neural network (HB-RNN) and a
probabilistic framework based on Connectionist Temporal Classification (CTC)
for word-level and sentence-level ASL translation respectively. To evaluate its
performance, we have collected 7,306 samples from 11 participants, covering 56
commonly used ASL words and 100 ASL sentences. DeepASL achieves an average
94.5% word-level translation accuracy and an average 8.2% word error rate on
translating unseen ASL sentences. Given its promising performance, we believe
DeepASL represents a significant step towards breaking the communication
barrier between deaf people and the hearing majority, and thus has significant
potential to fundamentally change deaf people's lives.
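The abstract reports sentence-level quality as a word error rate (WER) but does not say how it was computed. The sketch below uses the conventional definition, which is assumed here rather than taken from the paper: the word-level edit distance (substitutions, deletions, and insertions) between the recognized sentence and the reference, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical and are not drawn from the DeepASL dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumes the conventional WER definition; the abstract does not state
    which variant the authors used.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of four reference words is dropped.
print(word_error_rate("i want a drink", "i want drink"))
```

Because insertions count as errors, this measure can exceed 1.0 when the hypothesis is much longer than the reference.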
Rhythmic Micro-Gestures: Discreet Interaction On-the-Go
We present rhythmic micro-gestures, micro-movements of the hand that are repeated in time with a rhythm. We present a user study that investigated how well users can perform rhythmic micro-gestures and whether they can use them eyes-free with non-visual feedback. We found that users could successfully use our interaction technique (97% success rate across all gestures) with short interaction times, and that they rated the gestures as low in difficulty. Simple audio cues that only convey the rhythm outperformed animations showing the hand movements, supporting rhythmic micro-gestures as an eyes-free input technique.
Exploring user-defined gestures for alternate interaction space for smartphones and smartwatches
2016 Spring. Includes bibliographical references. In smartphones and smartwatches, the input space is limited due to their small form factor. Although many studies have highlighted the possibility of expanding the interaction space for these devices, limited work has been conducted on exploring end-user preferences for gestures in the proposed interaction spaces. In this dissertation, I present the results of two elicitation studies that explore end-user preferences for creating gestures in the proposed alternate interaction spaces for smartphones and smartwatches. Using the data collected from the two elicitation studies, I present gestures preferred by end-users for common tasks that can be performed using smartphones and smartwatches. I also present the end-user mental models for interaction in proposed interaction spaces for these devices, and highlight common user motivations and preferences for suggested gestures. Based on the findings, I present design implications for incorporating the proposed alternate interaction spaces for smartphones and smartwatches.
28 frames later: predicting screen touches from back-of-device grip changes
We demonstrate that front-of-screen targeting on mobile phones can be predicted from back-of-device grip manipulations. Using simple, low-resolution capacitive touch sensors placed around a standard phone, we outline a machine learning approach to modelling the grip modulation and inferring front-of-screen touch targets. We experimentally demonstrate that grip is a remarkably good predictor of touch, and we can predict touch position 200 ms before contact with an accuracy of 18 mm.