miMic: The microphone as a pencil
miMic, a sonic analogue of paper and pencil, is proposed: an augmented microphone for vocal and gestural sonic sketching. Vocalizations are classified and interpreted as instances of sound models, which the user can play with through vocal and gestural control. The physical device is based on a modified microphone with embedded inertial sensors and buttons. Sound models can be selected by vocal imitations that are automatically classified, and each model is mapped to vocal and gestural features for real-time control. With miMic, the sound designer can explore a vast sonic space and quickly produce expressive sonic sketches, which may be turned into sound prototypes by further adjustment of model parameters.
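A minimal sketch of the control flow implied by the abstract, assuming hypothetical model names, feature names, and parameter mappings (none of which appear in the paper): a classified vocal imitation selects a sound model, and each frame of vocal/gestural features is mapped to that model's control parameters.

```python
# Sketch only, not the authors' code: hypothetical sound models and mappings.
from dataclasses import dataclass

@dataclass
class Frame:
    """One analysis frame of sensor data from the augmented microphone."""
    pitch_hz: float   # estimated pitch of the vocalization
    loudness: float   # normalized RMS level, 0..1
    tilt: float       # inertial-sensor tilt, radians
    shake: float      # normalized motion energy, 0..1

# Hypothetical mapping: each sound model exposes named parameters driven by
# a mix of vocal and gestural features.
SOUND_MODELS = {
    "friction": lambda f: {"pressure": f.loudness, "speed": f.shake},
    "liquid":   lambda f: {"flow": f.loudness, "bubble_rate": f.pitch_hz / 1000.0},
    "impact":   lambda f: {"force": f.loudness, "resonance": 0.5 + 0.5 * f.tilt},
}

def classify_imitation(features) -> str:
    """Placeholder for the vocal-imitation classifier described in the abstract."""
    return "friction"  # e.g. most probable model label

def control_step(model_name: str, frame: Frame) -> dict:
    """Map the current vocal/gestural frame to the selected model's parameters."""
    return SOUND_MODELS[model_name](frame)

# Usage: select a model from an imitation, then drive it frame by frame.
model = classify_imitation(features=None)
params = control_step(model, Frame(pitch_hz=220.0, loudness=0.7, tilt=0.2, shake=0.4))
print(model, params)
```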
Levitating Particle Displays with Interactive Voxels
Levitating objects can be used as the primitives of a new type of display. We present levitating particle displays and show how research into object levitation is enabling a new way of presenting and interacting with information. We identify novel properties of levitating particle displays and give examples of the interaction techniques and applications they allow. We then discuss design challenges for these displays, potential solutions, and promising areas for future research.
Deep Room Recognition Using Inaudible Echos
Recent years have seen an increasing need for location awareness in mobile applications. This paper presents a room-level indoor localization approach based on a room's measured echoes in response to a two-millisecond single-tone inaudible chirp emitted by a smartphone's loudspeaker. Unlike other acoustics-based room recognition systems that record full-spectrum audio for up to ten seconds, our approach records audio in a narrow inaudible band for only 0.1 seconds to preserve the user's privacy. However, the short-time, narrowband audio signal carries limited information about the room's characteristics, presenting challenges to accurate room recognition. This paper applies deep learning to effectively capture the subtle fingerprints in the rooms' acoustic responses. Our extensive experiments show that a two-layer convolutional neural network fed with the spectrogram of the inaudible echoes achieves the best performance compared with alternative designs using other raw data formats and deep models. Based on this result, we design a RoomRecognize cloud service and its mobile client library, which enable mobile application developers to readily implement room recognition functionality without relying on any existing infrastructure or add-on hardware. Extensive evaluation shows that RoomRecognize achieves 99.7%, 97.7%, 99%, and 89% accuracy in differentiating 22 and 50 residential/office rooms, 19 spots in a quiet museum, and 15 spots in a crowded museum, respectively. Compared with state-of-the-art approaches based on support vector machines, RoomRecognize significantly improves the Pareto frontier of recognition accuracy versus robustness against interfering sounds (e.g., ambient music).
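The abstract specifies only that a two-layer convolutional neural network over the echo spectrogram performed best; the sketch below is one plausible reading of that architecture in PyTorch, with the input spectrogram size, channel counts, and number of rooms chosen purely for illustration.

```python
# Minimal sketch of a two-layer CNN over echo spectrograms (illustrative sizes).
import torch
import torch.nn as nn

class RoomCNN(nn.Module):
    def __init__(self, n_rooms: int = 22):
        super().__init__()
        # Two convolutional layers ("two-layer CNN"), each followed by pooling.
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Classifier over the flattened feature map; 32 x 16 x 16 assumes a
        # hypothetical 64 x 64 input spectrogram from the 0.1 s narrowband recording.
        self.classifier = nn.Linear(32 * 16 * 16, n_rooms)

    def forward(self, spectrogram: torch.Tensor) -> torch.Tensor:
        x = self.features(spectrogram)
        return self.classifier(x.flatten(start_dim=1))

# Usage: a batch of 8 single-channel 64x64 spectrograms -> per-room logits.
model = RoomCNN(n_rooms=22)
logits = model(torch.randn(8, 1, 64, 64))
print(logits.shape)  # torch.Size([8, 22])
```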
Neural correlates of the processing of co-speech gestures
In communicative situations, speech is often accompanied by gestures. For example, speakers tend to illustrate certain contents of speech by means of iconic gestures, which are hand movements that bear a formal relationship to the contents of speech. The meaning of an iconic gesture is determined both by its form and by the speech context in which it is performed. Thus, gesture and speech interact in comprehension. Using fMRI, the present study investigated which brain areas are involved in this interaction process. Participants watched videos in which sentences containing an ambiguous word (e.g. "She touched the mouse") were accompanied by either a meaningless grooming movement, a gesture supporting the more frequent dominant meaning (e.g. animal), or a gesture supporting the less frequent subordinate meaning (e.g. computer device). We hypothesized that brain areas involved in the interaction of gesture and speech would show greater activation to gesture-supported sentences than to sentences accompanied by a meaningless grooming movement. The main results are that, when contrasted with grooming, both types of gestures (dominant and subordinate) activated an array of brain regions consisting of the left posterior superior temporal sulcus (STS), the inferior parietal lobule bilaterally, and the ventral precentral sulcus bilaterally. Given the crucial role of the STS in audiovisual integration processes, this activation might reflect the interaction between the meaning of gesture and the ambiguous sentence. The activations in inferior frontal and inferior parietal regions may reflect a mechanism of determining the goal of co-speech hand movements through an observation-execution matching process.
Correlating Visual Speaker Gestures with Measures of Audience Engagement to Aid Video Browsing
In this thesis, we argue that in the domains of educational lectures and political debates, speaker gestures can be a source of semantic cues for video browsing. We hypothesize that certain human gestures, which can be automatically identified through techniques of computer vision, convey significant information that is correlated with audience engagement.

We present a joint-angle descriptor derived from an automatic upper-body pose estimation framework to train an SVM that identifies point and spread poses in extracted video frames of an instructor giving a lecture. Ground truth is collected in the form of 2500 manually annotated frames covering 20 minutes of a video lecture. Cross-validation on the ground-truth data showed classifier F-scores of 0.54 and 0.39 for point and spread poses, respectively. We also derive a gesture attribute from this system that measures the angular variance of arm movements (analogous to arm waving).

We present a method for tracking hands that succeeds even when the left and right hands are clasping and occluding each other. We evaluate it on a ground-truth dataset of 698 images with 1301 annotated left and right hands, mostly clasped. Our method performs better than the baseline on recall (0.66 vs. 0.53) without sacrificing precision (0.65 for both) toward the goal of recognizing clasped hands. For tracking, it improves over a baseline method with an F-score of 0.59 vs. 0.48. From this, we derive hand-motion-based gesture attributes such as velocity, direction change, and extremal pose.

In ground-truth studies, we manually annotate and analyze the gestures of two instructors, each in a 75-minute computer science lecture, using a 14-bit pose vector. We observe "pedagogical" gestures of punctuation and encouragement in addition to traditional classes of gestures such as deictic and metaphoric. We also introduce a tool to facilitate the manual annotation of gestures in video and present results on their frequencies and co-occurrences. In particular, we find that 5 poses represent 80% of the variation in the annotated ground truth.

We demonstrate a correlation between the angular variance of arm movements and the presence of conjunctions used to contrast connected clauses ("but", "neither", etc.) in the accompanying speech. We do this by training an AdaBoost-based binary classifier using decision trees as weak learners. On a ground-truth database of 4243 video clips totaling 3.83 hours, each with subtitles, training on sets of conjunctions indicating contrast produces classifiers capable of achieving 55% accuracy on a balanced test set.

We study two different presentation methods: an attribute graph, which shows a normalized measure of the visual attributes across an entire video, and emphasized subtitles, where individual words are emphasized (resized) based on their accompanying gestures. Results from 12 subjects show supportive ratings for the browsing aids in the task of providing keywords for a video under time constraints. Subjects' keywords are also compared to independent ground truth, resulting in precisions from 0.50 to 0.55, even when subjects are given less than half real time to view the video.

We demonstrate a correlation between gesture attributes and a rigorous method of measuring audience engagement: electroencephalography (EEG). Our 20 subjects watch 61 minutes of video of the 2012 U.S. Presidential Debates while under observation through EEG. After discarding corrupted recordings, we retain 47 minutes of EEG data for each subject. The subjects are examined in aggregate and in subgroups according to gender and political affiliation. We find statistically significant correlations between gesture attributes (particularly extremal pose) and our feature of engagement derived from EEG. For all subjects watching all videos, we see a statistically significant correlation between gesture and engagement with a Spearman rank correlation of rho = 0.098 with p < 0.05, Bonferroni corrected. For some stratifications, correlations reach as high as rho = 0.297. From these results, we conclude that gestures can be used to measure engagement.
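As a reading of the final result, the snippet below sketches the kind of test described: a Spearman rank correlation between a gesture attribute and an EEG-derived engagement feature, with a Bonferroni correction over the number of attributes tested. The data are random placeholders and the attribute count is an assumption, not the thesis's data.

```python
# Illustrative statistical test only; not the thesis code or data.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_clips = 200          # hypothetical number of time-aligned clips
n_attributes = 5       # e.g. velocity, direction change, extremal pose, ...

engagement = rng.normal(size=n_clips)                     # EEG-derived feature
gesture_attrs = rng.normal(size=(n_attributes, n_clips))  # per-attribute values

alpha = 0.05
for i, attr in enumerate(gesture_attrs):
    rho, p = spearmanr(attr, engagement)
    # Bonferroni: scale the raw p-value by the number of comparisons.
    p_corrected = min(p * n_attributes, 1.0)
    print(f"attribute {i}: rho = {rho:+.3f}, corrected p = {p_corrected:.3f}, "
          f"significant = {p_corrected < alpha}")
```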
Echoes of the spoken past: how auditory cortex hears context during speech perception.
What do we hear when someone speaks, and what does auditory cortex (AC) do with that sound? Given how meaningful speech is, it might be hypothesized that AC is most active when other people talk so that their productions get decoded. Here, neuroimaging meta-analyses show the opposite: AC is least active, and sometimes deactivated, when participants listen to meaningful speech compared to less meaningful sounds. Results are explained by an active hypothesis-and-test mechanism in which speech production (SP) regions are neurally re-used to predict auditory objects associated with the available context. By this model, more AC activity for less meaningful sounds occurs because predictions from context are less successful, requiring further hypotheses to be tested. This also explains the large overlap of AC co-activity for less meaningful sounds with meta-analyses of SP. An experiment showed a similar pattern of results for non-verbal context. Specifically, words produced less activity in AC and SP regions when preceded by co-speech gestures that visually described those words, compared to the same words without gestures. Results collectively suggest that what we 'hear' during real-world speech perception may come more from the brain than from our ears, and that the function of AC is to confirm or deny internal predictions about the identity of sounds.
Using Multimodal Analysis to Investigate the Role of the Interpreter
Recent research in Interpreting Studies has favoured the argument that, in practice, the interpreter plays an active role rather than the prescribed role stipulated in professional codes of conduct. Cutting-edge studies utilising multimodal research methods have taken a more comprehensive approach to investigating this argument, searching for evidence of the interpreter's active involvement not only through textual analysis, but also by examining a range of non-verbal communicative means. Studies using multimodal analysis, such as those by Pasquandrea (2011) and Davitti (2012), have succeeded in offering new insights into the interpreter's role in interaction. This research presents further investigation into the interpreter's role through multimodal analysis by focusing on the use of gesture movements, gaze and body orientation in interpreter-mediated communication; it also examines the impact of knowledge asymmetry on the interpreter's role. This thesis presents findings from six simulated face-to-face dialogue interpreting cases featuring three different groups of participants and interpreters, representing different interpreting settings (e.g. parent-teacher meeting, business meeting, doctor-patient meeting). By adopting a multimodal approach, the findings of this study (a) contribute to our understanding of the active role of the interpreter in Interpreting Studies by exploring new insights from a multimodal approach, and (b) offer new empirical findings from interpreter-mediated interactions to the technical analysis of multimodal communication.
Timbral Hauntings: An Interactive System Re-interpreting the Present in Echoes of the Past
(Abstract to follow)