The Ambient Horn: Designing a novel audio-based learning experience
The Ambient Horn is a novel handheld device designed to support children learning about habitat distributions and interdependencies in an outdoor woodland environment. The horn was designed to emit non-speech sounds representing ecological processes. Both symbolic and arbitrary mappings were used to represent the processes. The sounds are triggered in response to the children’s location in certain parts of the woodland. A main objective was to provoke children into interpreting and reflecting upon the significance of the sounds in the context in which they occur. Our study of the horn in use showed the sounds to be provocative, generating much discussion about what they signified in relation to what the children saw in the woodland. In addition, the children appropriated the horn in creative ways, trying to ‘scoop’ up new sounds as they walked in different parts of the woodland.
Omnidirectional Bats, Point-to-Plane Distances, and the Price of Uniqueness
We study simultaneous localization and mapping with a device that uses
reflections to measure its distance from walls. Such a device can be realized
acoustically with a synchronized collocated source and receiver; it behaves
like a bat with no capacity for directional hearing or vocalizing. In this
paper we generalize our previous work in 2D, and show that the 3D case is not
just a simple extension, but rather a fundamentally different inverse problem.
While generically the 2D problem has a unique solution, in 3D uniqueness is
always absent in rooms with fewer than nine walls. In addition to the complete
characterization of ambiguities which arise due to this non-uniqueness, we
propose a robust solution for inexact measurements similar to analogous results
for Euclidean Distance Matrices. Our theoretical results have important
consequences for the design of collocated range-only SLAM systems, and we
support them with an array of computer experiments. Comment: 5 pages, 8 figures, submitted to ICASSP 201
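The point-to-plane measurement described in this abstract can be illustrated with a minimal sketch. It assumes each wall is the plane of points x with n·x = q, and that the device reports the perpendicular distance from its position to each wall. The function name and this parametrization are illustrative assumptions, not the authors' formulation, and the sketch covers only the forward measurement model, not the localization or uniqueness analysis.

```python
import numpy as np

def point_to_plane_distances(positions, normals, offsets):
    """Distances from each device position to each wall plane.

    Hypothetical parametrization: wall k is {x : normals[k] . x = offsets[k]}.
    Returns an array of shape (num_positions, num_walls).
    """
    positions = np.asarray(positions, dtype=float)
    normals = np.asarray(normals, dtype=float)
    offsets = np.asarray(offsets, dtype=float)

    # Normalize so that each normal has unit length; rescale offsets to match.
    lengths = np.linalg.norm(normals, axis=1)
    unit_normals = normals / lengths[:, None]
    unit_offsets = offsets / lengths

    # Perpendicular distance is |q_k - n_k . r| for unit normal n_k.
    return np.abs(unit_offsets[None, :] - positions @ unit_normals.T)

# Example: a unit cube with six walls and one device position inside it.
cube_normals = [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]]
cube_offsets = [1, 0, 1, 0, 1, 0]
cube_distances = point_to_plane_distances([[0.25, 0.5, 0.75]], cube_normals, cube_offsets)
```

Each row of the result is the set of wall distances seen from one position. A range-only device of the kind described would observe these values without knowing which wall produced which distance.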
Integrating user-centred design in the development of a silent speech interface based on permanent magnetic articulography
Abstract: A new wearable silent speech interface (SSI) based on Permanent Magnetic Articulography (PMA) was developed with the involvement of end users in the design process. As a result, desirable features such as appearance, portability, ease of use and light weight were integrated into the prototype. The aim of this paper is to describe the challenges faced and the design considerations addressed during development. Evaluations of both hardware and speech recognition performance are presented. The new prototype shows performance comparable to that of its predecessor in terms of speech recognition accuracy (i.e. ~95% word accuracy and ~75% sequence accuracy), but significantly improved appearance, portability and hardware features in terms of miniaturization and cost.
Efficient Implementation of the Room Simulator for Training Deep Neural Network Acoustic Models
In this paper, we describe how to efficiently implement an acoustic room
simulator to generate large-scale simulated data for training deep neural
networks. Even though Google Room Simulator in [1] was shown to be quite
effective in reducing the Word Error Rates (WERs) for far-field applications by
generating simulated far-field training sets, it requires a very large number
of Fast Fourier Transforms (FFTs) of large size. Room Simulator in [1] used
approximately 80 percent of Central Processing Unit (CPU) usage in our CPU +
Graphics Processing Unit (GPU) training architecture [2]. In this work, we
implement an efficient OverLap Addition (OLA) based filtering using the
open-source FFTW3 library. Further, we investigate the effects of the Room
Impulse Response (RIR) lengths. Experimentally, we conclude that we can cut the
tail portions of RIRs whose power is less than 20 dB below the maximum power
without sacrificing the speech recognition accuracy. However, we observe that
cutting the RIR tail beyond this threshold harms the speech recognition accuracy
for rerecorded test sets. Using these approaches, we were able to reduce CPU
usage for the room simulator portion down to 9.69 percent in our CPU/GPU training
architecture. Profiling results show that we obtain a 22.4 times speed-up on a
single machine and a 37.3 times speed-up on Google's distributed training
infrastructure. Comment: Published at INTERSPEECH 2018
(https://www.isca-speech.org/archive/Interspeech_2018/abstracts/2566.html)
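The two ideas in this abstract, overlap-add filtering and RIR tail truncation, can be sketched briefly. This is an illustration only: it uses NumPy's FFT routines rather than the FFTW3 library the abstract mentions. The function names, the block size, and the reading of the 20 dB rule as "keep everything up to the last sample whose power is within 20 dB of the peak" are all assumptions.

```python
import numpy as np

def truncate_rir(rir, threshold_db=20.0):
    """Cut the RIR after the last sample whose power is within
    threshold_db of the peak power (one assumed reading of the cutoff)."""
    power = rir ** 2
    cutoff = power.max() * 10.0 ** (-threshold_db / 10.0)
    keep = np.nonzero(power >= cutoff)[0]
    return rir[: keep[-1] + 1]

def overlap_add(signal, rir, block_size=4096):
    """Convolve signal with rir block by block using FFT-based overlap-add."""
    # FFT size: next power of two that holds one filtered block without wrap-around.
    n_fft = 1
    while n_fft < block_size + len(rir) - 1:
        n_fft *= 2
    rir_spectrum = np.fft.rfft(rir, n_fft)  # computed once, reused for every block

    out = np.zeros(len(signal) + len(rir) - 1)
    for start in range(0, len(signal), block_size):
        block = signal[start:start + block_size]
        filtered = np.fft.irfft(np.fft.rfft(block, n_fft) * rir_spectrum, n_fft)
        n_valid = len(block) + len(rir) - 1
        # Tails of consecutive blocks overlap and are summed.
        out[start:start + n_valid] += filtered[:n_valid]
    return out
```

A shorter RIR reduces the FFT size needed per block, so truncation and overlap-add filtering both lower the cost of the convolution.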
Generating multimedia presentations: from plain text to screenplay
In many Natural Language Generation (NLG) applications, the output is limited to plain text – i.e., a string of words with punctuation and paragraph breaks, but no indications for layout, or pictures, or dialogue. In several projects, we have begun to explore NLG applications in which these extra media are brought into play. This paper gives an informal account of what we have learned. For coherence, we focus on the domain of patient information leaflets, and follow an example in which the same content is expressed first in plain text, then in formatted text, then in text with pictures, and finally in a dialogue script that can be performed by two animated agents. We show how the same meaning can be mapped to realisation patterns in different media, and how the expanded options for expressing meaning are related to the perceived style and tone of the presentation. Throughout, we stress that the extra media are not simply added to plain text, but integrated with it: thus the use of formatting, or pictures, or dialogue, may require radical rewording of the text itself.
A real-time statistical time-series analyzer
Device extracts average frequency of human speech and produces second, third, and fourth moments of instantaneous frequency about this average. It operates on electrical time representation of input signal, performs statistical analysis on zero crossings of almost any signal, and does not require specialized personnel to operate it.
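The statistics this device produces can be approximated in software. The sketch below is a hypothetical digital analogue, not a description of the device: it assumes instantaneous frequency is estimated from the time between successive zero crossings (two crossings per cycle), and the function name is illustrative.

```python
import numpy as np

def zero_crossing_moments(signal, sample_rate):
    """Return the mean zero-crossing frequency in Hz and the second, third,
    and fourth central moments of the instantaneous frequency about it."""
    signs = np.signbit(signal)
    # Index i marks a sign change between samples i and i + 1.
    crossings = np.nonzero(signs[1:] != signs[:-1])[0]
    if len(crossings) < 2:
        raise ValueError("need at least two zero crossings")

    intervals = np.diff(crossings) / sample_rate   # seconds between crossings
    inst_freq = 1.0 / (2.0 * intervals)            # two crossings per cycle

    mean = inst_freq.mean()
    centered = inst_freq - mean
    moments = [float(np.mean(centered ** k)) for k in (2, 3, 4)]
    return float(mean), moments
```

For a pure tone, every interval is the same, so the mean equals the tone frequency and all three moments are close to zero. Speech yields a spread of intervals and hence nonzero moments.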
Voice scrambling for radio, cellular and telephone systems
An overview of the requirements of a scrambler for use with mobile radio equipment is presented. Details are given of a scrambler implementation that satisfies these requirements. The scrambler is realized using general-purpose DSP technology, giving the benefits of low-cost, high-volume production with the flexibility of customization and enhancement through software configuration
Investigating the success factors of expert users to inform device development
Objective: Expert user testing is a well-recognised tool within user experience and human-computer interaction design. Within the domain of assistive technology device design, however, this technique seems to be little used. It is suggested that studying the success factors of expert assistive technology device users may provide a valuable source of data to inform development of assistive technology devices. This paper presents an example of this technique, within the context of a number of studies carried out by the authors, using preliminary data from a study informing the development of an innovative Augmentative and Alternative Communication (AAC) device.
Main Content: The paper presents a qualitative study whose objective was to influence the design and further development of an innovative voice-input voice-output communication aid (Vivoca) which has previously reached proof-of-concept stage. The Vivoca device is designed for people with dysarthria and this dictates a number of specific constraints and considerations. In order to understand how Vivoca could be designed to be used successfully by people with dysarthria, this study aimed to identify the factors associated with expert users' successful use of current AAC devices. In order to allow comparison, the study included users with some understandable speech and also those with no understandable speech. The study procedure was designed to provide a profile of participants' communication methods and to identify the factors that participants felt made their communication successful.
Results: Preliminary results from the study (currently underway) are presented, including a qualitative analysis of interview data, and data profiling participants' communication methods and context. Initial data has highlighted the very specific requirements for a communication aid design for people with some understandable speech.
Conclusion: Study of expert users may provide an effective tool to help inform assistive technology device development.
