Harnessing AI for Speech Reconstruction using Multi-view Silent Video Feed
Speechreading or lipreading is the technique of understanding and getting
phonetic features from a speaker's visual features such as movement of lips,
face, teeth and tongue. It has a wide range of multimedia applications such as
in surveillance, Internet telephony, and as an aid to people with hearing
impairments. However, most work in speechreading has been limited to
text generation from silent videos. Recently, research has started venturing
into generating (audio) speech from silent video sequences but there have been
no work thus far on handling divergent views and poses of a speaker. Thus,
although multiple camera feeds of a speaker are often available, these feeds
have not been used to deal with different poses. To this end, this paper
presents the world's first multi-view speechreading and reconstruction system.
This work pushes the boundaries of multimedia research by putting forth a model
which leverages silent video feeds from multiple cameras recording the same
subject to generate intelligible speech for a speaker. Initial results confirm
the usefulness of
exploiting multiple camera views in building an efficient speech reading and
reconstruction system. The paper further identifies the optimal placement of
cameras for maximum speech intelligibility. Finally, it lays out various
innovative applications of the proposed system, focusing on its potentially
prodigious impact not just in the security arena but in many other multimedia
analytics problems.

Comment: 2018 ACM Multimedia Conference (MM '18), October 22--26, 2018, Seoul,
Republic of Korea
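The multi-view pipeline described above can be sketched at a very high level. Everything below is illustrative, not the paper's method: the feature dimensions, the late-fusion-by-averaging strategy, and the linear decoder `W` are assumptions standing in for the learned components of the actual system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-view lip-region feature sequences (frames x feature_dim)
# extracted from three cameras recording the same utterance.
views = [rng.standard_normal((75, 32)) for _ in range(3)]

def fuse_views(feature_seqs):
    """Late fusion: average per-frame visual features across camera views."""
    return np.stack(feature_seqs, axis=0).mean(axis=0)

fused = fuse_views(views)           # shape (75, 32)

# Stand-in linear decoder mapping fused visual features to per-frame audio
# features (e.g. mel-spectrogram frames); a real system would learn this map.
W = rng.standard_normal((32, 26))
audio_features = fused @ W          # shape (75, 26)

print(fused.shape, audio_features.shape)
```

A real reconstruction model would replace the averaging and the linear map with learned networks, but the sketch shows where the multiple camera feeds enter the pipeline.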
Continuous Authentication for Voice Assistants
Voice has become an increasingly popular User Interaction (UI) channel,
mainly contributing to the ongoing trend of wearables, smart vehicles, and home
automation systems. Voice assistants such as Siri, Google Now and Cortana, have
become our everyday fixtures, especially in scenarios where touch interfaces
are inconvenient or even dangerous to use, such as driving or exercising.
Nevertheless, the open nature of the voice channel makes voice assistants
difficult to secure and exposed to various attacks as demonstrated by security
researchers. In this paper, we present VAuth, the first system that provides
continuous and usable authentication for voice assistants. We design VAuth to
fit in various widely-adopted wearable devices, such as eyeglasses,
earphones/buds and necklaces, where it collects the body-surface vibrations of
the user and matches them with the speech signal received by the voice
assistant's microphone. VAuth guarantees that the voice assistant executes only
the commands that originate from the voice of the owner. We evaluated
VAuth with 18 users and 30 voice commands and found it to achieve almost
perfect matching accuracy with less than 0.1% false positive rate, regardless
of VAuth's position on the body and the user's language, accent or mobility.
VAuth successfully thwarts various practical attacks, such as replay
attacks, mangled-voice attacks, and impersonation attacks. It also has low
energy and latency overheads and is compatible with most existing voice
assistants.
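The core check VAuth performs, comparing body-surface vibrations against the microphone signal, can be illustrated with a toy similarity test. This is a simplification rather than the paper's actual algorithm: the synthetic signals, the Pearson-correlation measure, and the 0.8 threshold below are all assumptions made for the sketch.

```python
import numpy as np

def normalized_correlation(a, b):
    """Pearson correlation between two equal-length 1-D signals."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float(np.dot(a, b) / len(a))

def accept_command(vibration, microphone, threshold=0.8):
    """Accept a voice command only if the wearable's vibration pickup
    is strongly correlated with what the assistant's microphone heard."""
    return normalized_correlation(vibration, microphone) >= threshold

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 8000)
# Toy voiced signal: a 180 Hz carrier with a slow amplitude envelope.
speech = np.sin(2 * np.pi * 180 * t) * np.abs(np.sin(2 * np.pi * 3 * t))
owner_vibration = speech + 0.1 * rng.standard_normal(t.size)  # correlated pickup
attacker_audio = rng.standard_normal(t.size)                  # unrelated audio

print(accept_command(owner_vibration, speech))         # owner speaking: accepted
print(accept_command(owner_vibration, attacker_audio)) # injected audio: rejected
```

The real system must also cope with the very different frequency responses of an accelerometer and a microphone, which is why this correlation test is only a conceptual stand-in.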
Integrating user-centred design in the development of a silent speech interface based on permanent magnetic articulography
Abstract: A new wearable silent speech interface (SSI) based on Permanent Magnetic Articulography (PMA) was developed with the involvement of end users in the design process. Hence, desirable features such as appearance, portability, ease of use and light weight were integrated into the prototype. The aim of this paper is to address the challenges faced and the design considerations addressed during the development. Evaluations of both hardware and speech recognition performance are presented here. The new prototype shows a comparable performance with its predecessor in terms of speech recognition accuracy (i.e. ~95% word accuracy and ~75% sequence accuracy), but significantly improved appearance, portability and hardware features in terms of miniaturization and cost.
We need to talk about silence: Re-examining silence in International Relations theory
The critique of silence in International Relations theory has been long-standing and sustained. However, despite the lasting popularity of the term, little effort has been made to unpack the implications of existing definitions and their uses, and of attempts to rid the worlds of theory and practice of silences. This article seeks to fill this vacuum by conducting a twofold exercise: a review and revision of the conceptualisation of silence current in the literature; and a review of the implications of attempts to eliminate silence from the worlds of theory and practice. Through the discussion, the article suggests that we deepen and broaden our understanding of silence while simultaneously accepting that a degree of silence will be a permanent feature of theory and practice in international politics. Finally, the conclusion illustrates the possibilities for analysis and theory opened by these arguments through an exploration of how they may be used to interpret and address recent events in Yemen.
Reverse production effect: Children recognize novel words better when they are heard rather than produced
This is the peer reviewed version of the following article: Tania S. Zamuner, Stephanie Strahm, Elizabeth Morin-Lessard, and Michael P. A. Page, 'Reverse production effect: children recognize novel words better when they are heard rather than produced', Developmental Science, which has been published in final form at DOI 10.1111/desc.12636. Under embargo until 15 November 2018. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.

This research investigates the effect of production on 4.5- to 6-year-old children's recognition of newly learned words. In Experiment 1, children were taught four novel words in a produced or heard training condition during a brief training phase. In Experiment 2, children were taught eight novel words, and this time training condition was in a blocked design. Immediately after training, children were tested on their recognition of the trained novel words using a preferential looking paradigm. In both experiments, children recognized novel words that were produced and heard during training, but demonstrated better recognition for items that were heard. These findings are opposite to previous results reported in the literature with adults and children. Our results show that the benefits of speech production for word learning depend on factors such as task complexity and the developmental stage of the learner.