86 research outputs found

    An Empirical Study Comparing Unobtrusive Physiological Sensors for Stress Detection in Computer Work.

    Several unobtrusive sensors have been tested in studies to capture physiological reactions to stress in workplace settings. Lab studies tend to focus on assessing sensors during a specific computer task, while in situ studies tend to offer a generalized view of sensors' efficacy for workplace stress monitoring, without discriminating different tasks. Given the variation in workplace computer activities, this study investigates the efficacy of unobtrusive sensors for stress measurement across a variety of tasks. We present a comparison of five physiological measurements obtained in a lab experiment, where participants completed six different computer tasks, while we measured their stress levels using a chest-band (ECG, respiration), a wristband (PPG and EDA), and an emerging thermal imaging method (perinasal perspiration). We found that thermal imaging can detect increased stress for most participants across all tasks, while wrist and chest sensors were less generalizable across tasks and participants. We summarize the costs and benefits of each sensor stream, and show how some computer use scenarios present usability and reliability challenges for stress monitoring with certain physiological sensors. We provide recommendations for researchers and system builders for measuring stress with physiological sensors during workplace computer use

    Relying on critical articulators to estimate vocal tract spectra in an articulatory-acoustic database

    We present a new phone-dependent feature weighting scheme that can be used to map articulatory configurations (e.g. EMA) onto vocal tract spectra (e.g. MFCC) through table lookup. The approach consists of assigning feature weights according to a feature's ability to predict the acoustic distance between frames. Since an articulator's predictive accuracy is phone-dependent (e.g., lip location is a better predictor for bilabial sounds than for palatal sounds), a unique weight vector is found for each phone. Inspection of the weights reveals a correspondence with the expected critical articulators for many phones. The proposed method reduces overall cepstral error by 6% when compared to a uniform weighting scheme. Vowels show the greatest benefit, though improvements occur for 80% of the tested phones
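The table lookup with phone-dependent weights described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weighted Euclidean distance, the function names, the two-feature articulatory vectors, and the toy weights and spectra are all assumptions chosen for illustration, and the sketch does not show how the weights are learned.

```python
import math

# Hypothetical toy data: each entry pairs an articulatory vector
# (lip aperture, tongue height) with a one-dimensional "spectrum".
TABLE = [
    ([0.0, 1.0], [10.0]),
    ([1.0, 0.0], [20.0]),
]

# Hypothetical per-phone weight vectors: a bilabial phone trusts the lip
# feature, a palatal phone trusts the tongue feature.
PHONE_WEIGHTS = {
    "p": [1.0, 0.0],
    "j": [0.0, 1.0],
}


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two articulatory vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def lookup_spectrum(query, phone, table, phone_weights):
    """Return the spectrum of the table entry closest to the query,
    measuring closeness with the weight vector of the given phone."""
    weights = phone_weights[phone]
    best = min(table, key=lambda entry: weighted_distance(query, entry[0], weights))
    return best[1]


query = [0.1, 0.2]
print(lookup_spectrum(query, "p", TABLE, PHONE_WEIGHTS))
print(lookup_spectrum(query, "j", TABLE, PHONE_WEIGHTS))
```

In this toy example the same articulatory query retrieves a different table entry depending on the phone, because each weight vector emphasizes a different articulator.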

    Corrective feedback accuracy and pronunciation improvement: Feedback that is ‘good enough’

    It is unclear whether corrective feedback (CF) provided by L2 computer-assisted pronunciation training (CAPT) tools must be 100% accurate to promote an acceptable level of improvement in pronunciation. Using a web-based interface, 30 native speakers of Chinese completed a pretest, a computer-based training session to produce nine sound contrasts in English, and a posttest. The study manipulated feedback accuracy using a modified “Wizard of Oz” protocol in which a phonetically-trained human listener in a separate room provided CF on the trainees’ productions, but the trainees thought that the computer-based system provided the CF. The computer system presented a set of three sound contrasts with 100% accuracy, three with 66% accuracy (with one of three human responses changed randomly), and three with 33% accuracy (with two of three human feedback responses being changed). The trainees’ pre- and posttest productions were rated for accuracy by native speakers of English. For trained items, productions were not significantly different when the trainees received CF with 100% or 66% accuracy, but both resulted in greater improvement than feedback with 33% accuracy. An important implication for L2 pronunciation training software is that machine feedback can be beneficial even when it is ‘good enough’ (i.e., not 100% accurate)
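The accuracy manipulation described in this abstract (changing one or two of every three human feedback responses) can be sketched as follows. This is a hedged illustration only: the abstract does not say how responses were grouped or encoded, so the binary correct/incorrect judgments, the grouping into consecutive triples, and the function name `degrade_feedback` are assumptions.

```python
import random


def degrade_feedback(judgments, flips_per_triple, rng):
    """Flip a fixed number of binary judgments in each consecutive triple.

    judgments: list of booleans from the human listener (assumed encoding).
    flips_per_triple: 0, 1, or 2, giving 100%, about 66%, or about 33% accuracy.
    rng: a random.Random instance, so the choice of flipped items is reproducible.
    """
    out = list(judgments)
    full_triples_end = len(out) - len(out) % 3  # ignore an incomplete final triple
    for start in range(0, full_triples_end, 3):
        for i in rng.sample(range(start, start + 3), flips_per_triple):
            out[i] = not out[i]
    return out


def agreement(original, shown):
    """Fraction of presented feedback items that match the human judgment."""
    return sum(a == b for a, b in zip(original, shown)) / len(original)


human = [True, False, True, True, True, False]
shown = degrade_feedback(human, 1, random.Random(0))
print(agreement(human, shown))
```

Flipping exactly one item per triple keeps the delivered accuracy fixed at two in three regardless of which items the random generator picks, which is why the sketch samples positions rather than flipping each item with some probability.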

    Web GIS in practice X: a Microsoft Kinect natural user interface for Google Earth navigation

    This paper covers the use of depth sensors such as Microsoft Kinect and ASUS Xtion to provide a natural user interface (NUI) for controlling 3-D (three-dimensional) virtual globes such as Google Earth (including its Street View mode), Bing Maps 3D, and NASA World Wind. The paper introduces the Microsoft Kinect device, briefly describing how it works (the underlying technology by PrimeSense), as well as its market uptake and application potential beyond its original intended purpose as a home entertainment and video game controller. The different software drivers available for connecting the Kinect device to a PC (Personal Computer) are also covered, and their comparative pros and cons briefly discussed. We survey a number of approaches and application examples for controlling 3-D virtual globes using the Kinect sensor, then describe Kinoogle, a Kinect interface for natural interaction with Google Earth, developed by students at Texas A&M University. Readers interested in trying out the application on their own hardware can download a Zip archive (included with the manuscript as additional files 1, 2, & 3) that contains a 'Kinoogle installation package for Windows PCs'. Finally, we discuss some usability aspects of Kinoogle and similar NUIs for controlling 3-D virtual globes (including possible future improvements), and propose a number of unique, practical 'use scenarios' where such NUIs could prove useful in navigating a 3-D virtual globe, compared to conventional mouse/3-D mouse and keyboard-based interfaces

    L2-ARCTIC: A Non-Native English Speech Corpus

    In this paper, we introduce L2-ARCTIC, a speech corpus of non-native English that is intended for research in voice conversion, accent conversion, and mispronunciation detection. This initial release includes recordings from ten non-native speakers of English whose first languages (L1s) are Hindi, Korean, Mandarin, Spanish, and Arabic, each L1 containing recordings from one male and one female speaker. Each speaker recorded approximately one hour of read speech from the Carnegie Mellon University ARCTIC prompts, from which we generated orthographic and forced-aligned phonetic transcriptions. In addition, we manually annotated 150 utterances per speaker to identify three types of mispronunciation errors: substitutions, deletions, and additions, making it a valuable resource not only for research in voice conversion and accent conversion but also in computer-assisted pronunciation training. The corpus is publicly accessible at https://psi.engr.tamu.edu/l2-arctic-corpus/