1,565 research outputs found
Study to determine potential flight applications and human factors design guidelines for voice recognition and synthesis systems
A study was conducted to determine potential commercial aircraft flight deck applications and implementation guidelines for voice recognition and synthesis. At first, a survey of voice recognition and synthesis technology was undertaken to develop a working knowledge base. Then, numerous potential aircraft and simulator flight deck voice applications were identified and each proposed application was rated on a number of criteria in order to achieve an overall payoff rating. The potential voice recognition applications fell into five general categories: programming, interrogation, data entry, switch and mode selection, and continuous/time-critical action control. The ratings of the first three categories showed the most promise of being beneficial to flight deck operations. Possible applications of voice synthesis systems were categorized as automatic or pilot selectable and many were rated as being potentially beneficial. In addition, voice system implementation guidelines and pertinent performance criteria are proposed. Finally, the findings of this study are compared with those made in a recent NASA study of a 1995 transport concept
On Distant Speech Recognition for Home Automation
The official version of this draft is available at Springer via http://dx.doi.org/10.1007/978-3-319-16226-3_7. In the framework of Ambient Assisted Living, home automation may be a solution for helping elderly people living alone at home. This study is part of the Sweet-Home project, which aims at developing a new home automation system based on voice command to improve the support and well-being of people with loss of autonomy. The goal of the study is vocal order recognition with a focus on two aspects: distant speech recognition and sentence spotting. Several ASR techniques were evaluated on a realistic corpus acquired in a 4-room flat equipped with microphones set in the ceiling. This distant speech French corpus was recorded with 21 speakers who acted out scenarios of activities of daily living. Techniques acting at the decoding stage, such as our novel approach called Driven Decoding Algorithm (DDA), gave better speech recognition results than the baseline and other approaches. This solution, which uses the two best SNR channels and a priori knowledge (voice commands and distress sentences), has demonstrated an increase in recognition rate without introducing false alarms
Distant speech recognition for home automation: Preliminary experimental results in a smart home
This paper presents a study that is part of the Sweet-Home project, which aims at developing a new home automation system based on voice command. The study focused on two tasks: distant speech recognition and sentence spotting (e.g., recognition of domotic orders). Regarding the first task, different combinations of ASR systems, language and acoustic models were tested. Fusion of ASR outputs by consensus and with a triggered language model (using a priori knowledge) were investigated. For the sentence spotting task, an algorithm based on distance evaluation between the current ASR hypotheses and the predefined set of keyword patterns was introduced in order to retrieve the correct sentences in spite of the ASR errors. The techniques were assessed on real daily living data collected in a 4-room smart home that was fully equipped with standard tactile commands and with 7 wireless microphones set in the ceiling. Thanks to Driven Decoding Algorithm techniques, a classical ASR system reached 7.9% WER against 35% WER in standard configuration and 15% with MLLR adaptation only. The best keyword pattern classification result obtained in distant speech conditions was 7.5% CER
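As a rough illustration of the two measures mentioned in the abstract above, the following minimal sketch computes a word error rate (WER) and matches an ASR hypothesis against a set of keyword patterns by edit distance. It is not the authors' algorithm: the word-level Levenshtein distance, the function names, the `max_ratio` threshold, and the English example commands are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hypothesis.split()) / len(ref)


def spot_sentence(hypothesis: str,
                  patterns: Sequence[str],
                  max_ratio: float = 0.34) -> Optional[Tuple[str, float]]:
    """Return the closest pattern and its normalized distance,
    or None if no pattern is close enough (hypothetical threshold)."""
    hyp = hypothesis.split()
    best: Optional[Tuple[str, float]] = None
    for pattern in patterns:
        tokens = pattern.split()
        ratio = edit_distance(tokens, hyp) / max(len(tokens), 1)
        if best is None or ratio < best[1]:
            best = (pattern, ratio)
    if best is not None and best[1] <= max_ratio:
        return best
    return None


# Hypothetical command set; the corpus described above is in French.
PATTERNS = ["turn on the light", "close the blinds"]
```

With these assumptions, `spot_sentence("turn on a light", PATTERNS)` recovers the intended command despite one substituted word, while an unrelated utterance falls above the threshold and is rejected.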
ZOE: A cloud-less dialog-enabled continuous sensing wearable exploiting heterogeneous computation
The wearable revolution, as a mass-market phenomenon, has finally
arrived. As a result, the question of how wearables should evolve
over the next 5 to 10 years is assuming an increasing level of societal
and commercial importance. A range of open design and
system questions are emerging, for instance: How can wearables
shift from being largely health and fitness focused to tracking a
wider range of life events? What will become the dominant methods
through which users interact with wearables and consume the
data collected? Are wearables destined to be cloud and/or smartphone
dependent for their operation?
Towards building the critical mass of understanding and experience
necessary to tackle such questions, we have designed and
implemented ZOE – a match-box sized (49g) collar- or lapel-worn
sensor that pushes the boundary of wearables in an important set of
new directions. First, ZOE aims to perform multiple deep sensor
inferences that span key aspects of everyday life (viz. personal, social
and place information) on continuously sensed data; while also
offering this data not only within conventional analytics but also
through a speech dialog system that is able to answer impromptu
casual questions from users. (Am I more stressed this week than
normal?) Crucially, and unlike other rich-sensing or dialog supporting
wearables, ZOE achieves this without cloud or smartphone
support – this has important side-effects for privacy since all user
information can remain on the device. Second, ZOE incorporates
the latest innovations in system-on-a-chip technology together with
a custom daughter-board to realize a three-tier low-power processor
hierarchy. We pair this hardware design with software techniques
that manage system latency while still allowing ZOE to remain energy
efficient (with a typical lifespan of 30 hours), despite its high
sensing workload, small form-factor, and need to remain responsive to user dialog requests.
This work was supported by Microsoft Research through its PhD
Scholarship Program. We would also like to thank the anonymous
reviewers and our shepherd, Jeremy Gummeson, for helping us improve
the paper.
This is the author accepted manuscript. The final version is available from ACM at http://dl.acm.org/citation.cfm?doid=2742647.2742672
Segmentation of British Sign Language (BSL): Mind the gap!
This study asks how users of British Sign Language (BSL) recognize individual signs in connected sign sequences. We examined whether this is achieved through modality-specific or modality-general segmentation procedures. A modality-specific feature of signed languages is that, during continuous signing, there are salient transitions between sign locations. We used the sign-spotting task to ask if and how BSL signers use these transitions in segmentation. A total of 96 real BSL signs were preceded by nonsense signs which were produced in either the target location or another location (with a small or large transition). Half of the transitions were within the same major body area (e.g., head) and half were across body areas (e.g., chest to hand). Deaf adult BSL users (a group of natives and early learners, and a group of late learners) spotted target signs best when there was a minimal transition and worst when there was a large transition. When location changes were present, both groups performed better when transitions were to a different body area than when they were within the same area. These findings suggest that transitions do not provide explicit sign-boundary cues in a modality-specific fashion. Instead, we argue that smaller transitions help recognition in a modality-general way by limiting lexical search to signs within location neighbourhoods, and that transitions across body areas also aid segmentation in a modality-general way, by providing a phonotactic cue to a sign boundary. We propose that sign segmentation is based on modality-general procedures which are core language-processing mechanisms
Proceedings: Voice Technology for Interactive Real-Time Command/Control Systems Application
Speech understanding among researchers and managers, current developments in voice technology, and an exchange of information concerning government voice technology efforts are discussed
Applications of Machine Learning for Fake News Detection in Social Networks
The value of online media for getting news is questionable. People seek out and devour news from online media because it is convenient, inexpensive, and widely disseminated. In contrast, it facilitates the widespread distribution of "counterfeit news," or news of lower quality that includes fabricated data. Many people and institutions are negatively impacted by the widespread circulation of false information. As a result, detecting fake news via social media has emerged as a topic of interest for academics. Searching for and reading the news is becoming increasingly convenient as a result of the widespread availability, quick expansion, and widespread dissemination of traditional news outlets and social media. Nowadays, there is a plethora of information that can be found on social media, and it can be difficult to tell what is real and what is not. The distribution costs of releasing news via social media are inexpensive, and anyone can do it. The widespread circulation of false information could have devastating effects on both individuals and communities. Developing a reliable machine learning method for spotting fake news is the focus of this work