Audio for Virtual, Augmented and Mixed Realities: Proceedings of ICSA 2019 ; 5th International Conference on Spatial Audio ; September 26th to 28th, 2019, Ilmenau, Germany
ICSA 2019 brings together, in a multidisciplinary setting, developers, scientists, users, and content creators of and for spatial audio systems and services. A special focus is audio for so-called virtual, augmented, and mixed realities.
The fields of ICSA 2019 are:
- Development and scientific investigation of technical systems and services for spatial audio recording, processing, and reproduction
- Creation of content for reproduction via spatial audio systems and services
- Use and application of spatial audio systems and content presentation services
- Media impact of content and spatial audio systems and services from the point of view of media science.
ICSA 2019 is organized by the VDT and TU Ilmenau with the support of the Fraunhofer Institute for Digital Media Technology IDMT.
Inaudible acoustics: Techniques and applications
This dissertation focuses on developing a sub-area of acoustics that we call inaudible acoustics. We have developed two core capabilities, (1) BackDoor and (2) Ripple, and demonstrated their use in various mobile and IoT applications. In BackDoor, we synthesize ultrasound signals that are inaudible to humans yet naturally recordable by all microphones. Importantly, the microphone does not require any modification, enabling billions of microphone-enabled devices, including phones, laptops, voice assistants, and IoT devices, to leverage the capability. Example applications include acoustic data beacons, acoustic watermarking, and spy-microphone jamming. In Ripple, we develop modulation and sensing techniques for vibratory signals that traverse solid surfaces, enabling a new form of secure proximal communication. Applications of the vibratory communication system include on-body communication through imperceptible physical vibrations and device-to-device secure data transfer through physical contact. Our prototypes include an inaudible jammer that secures private conversations from electronic eavesdropping, acoustic beacons for location-based information sharing, and vibratory communication in a smart ring that sends a password through a finger touch. Our research also uncovers new security threats to acoustic devices. While simple abuse of the inaudible jammer can disable hearing aids and cell phones, our work shows that voice interfaces, such as Amazon Echo, Google Home, and Siri, can be compromised through carefully designed inaudible voice commands. The contributions of this dissertation can be summarized in three primitives: (1) exploiting inherent hardware nonlinearity for sensing out-of-band signals, (2) developing the vibratory communication system for secure touch-based data exchange, and (3) structured information reconstruction from noisy acoustic signals.
In developing these primitives, we draw on principles from wireless networking, digital communications, signal processing, and embedded design, and translate them into fully functional systems.
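The first primitive, sensing out-of-band signals through hardware nonlinearity, can be illustrated with a small simulation. The tone frequencies and nonlinearity coefficient below are illustrative choices, not the dissertation's actual parameters:

```python
import numpy as np

# Two inaudible ultrasound tones at 38 kHz and 40 kHz pass through a
# microphone front end with a mild quadratic nonlinearity, modeled here
# as y = x + 0.1*x^2. The quadratic term creates an intermodulation
# product at the 2 kHz difference frequency, squarely in the audible band.
fs = 192_000                      # sample rate high enough for ultrasound
t = np.arange(fs) / fs            # one second of signal
f1, f2 = 38_000, 40_000
x = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
y = x + 0.1 * x**2                # mild quadratic nonlinearity

spectrum = np.abs(np.fft.rfft(y))
freqs = np.fft.rfftfreq(len(y), 1 / fs)

# Strongest component in the audible band sits at f2 - f1 = 2 kHz.
audible = (freqs > 100) & (freqs < 20_000)
peak = freqs[audible][np.argmax(spectrum[audible])]
print(peak)  # -> 2000.0
```

The linear term and the squared self-products all land at DC or above 38 kHz, so the only energy a band-limited recording chain retains is the difference-frequency component, which is why an unmodified microphone "hears" the hidden signal.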
KAVUAKA: a low-power application-specific processor architecture for digital hearing aids
The power consumption of digital hearing aids is very restricted due to their small physical size, and the available hardware resources for signal processing are limited. However, there is a demand for more processing performance to make future hearing aids more useful and smarter. Future hearing aids should be able to detect, localize, and recognize target speakers in complex acoustic environments to further improve the speech intelligibility of the individual hearing aid user. Computationally intensive algorithms are required for this task. To maintain acceptable battery life, the hearing aid processing architecture must be highly optimized for extremely low power consumption and high processing performance. The integration of application-specific instruction-set processors (ASIPs) into hearing aids enables a wide range of architectural customizations to meet the stringent power consumption and performance requirements. In this thesis, the application-specific hearing aid processor KAVUAKA is presented, which is customized and optimized for state-of-the-art hearing aid algorithms such as speaker localization, noise reduction, beamforming, and speech recognition. Specialized and application-specific instructions are designed and added to the baseline instruction set architecture (ISA). Among the major contributions are a multiply-accumulate (MAC) unit for real- and complex-valued numbers, architectures for power reduction during register accesses, co-processors, and a low-latency audio interface. With the proposed MAC architecture, the KAVUAKA processor requires 16% fewer cycles for the computation of a 128-point fast Fourier transform (FFT) compared to related programmable digital signal processors. The power consumption during register file accesses is decreased by 6% to 17% with isolation and bypass techniques.
The hardware-induced audio latency is 34% lower compared to related audio interfaces for a frame size of 64 samples. The final hearing aid system-on-chip (SoC) with four KAVUAKA processor cores and ten co-processors is integrated as an application-specific integrated circuit (ASIC) in a 40 nm low-power technology. The die size is 3.6 mm². Each of the processors and co-processors contains individual customizations and hardware features, with datapath widths varying from 24 bit to 64 bit. The core area of the 64-bit processor configuration is 0.134 mm². The processors are organized in two clusters that share memory, an audio interface, co-processors, and serial interfaces. The average power consumption at a clock speed of 10 MHz is 2.4 mW for the SoC and 0.6 mW for the 64-bit processor. Case studies with four reference hearing aid algorithms are used to present and evaluate the proposed hardware architectures and optimizations. The program code for each processor and co-processor is generated and optimized with evolutionary algorithms for operation merging, instruction scheduling, and register allocation. The KAVUAKA processor architecture is compared to related processor architectures in terms of processing performance, average power consumption, and silicon area requirements.
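As a back-of-envelope check on the figures above, the average power and clock rate translate into energy per clock cycle; this is simple arithmetic on the quoted numbers only:

```python
# Energy per cycle = average power / clock frequency, using the figures
# quoted in the abstract (2.4 mW SoC and 0.6 mW 64-bit core at 10 MHz).
f_clk = 10e6            # 10 MHz clock
p_soc = 2.4e-3          # 2.4 mW, whole SoC
p_core = 0.6e-3         # 0.6 mW, 64-bit processor

e_soc = p_soc / f_clk   # joules per cycle, SoC
e_core = p_core / f_clk # joules per cycle, 64-bit core

print(e_soc * 1e12, e_core * 1e12)  # pJ per cycle: 240 for the SoC, 60 for the core
```

At roughly 240 pJ per SoC cycle, a coin-cell-class energy budget of a few hundred joules supports continuous operation for weeks, which is why per-cycle energy is the figure of merit optimized throughout the thesis.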
Multi-Mobile Computing
With mobile systems evermore ubiquitous, individual users often own multiple mobile systems and groups of users often have many mobile systems at their disposal. As a result, there is a growing demand for multi-mobile computing, the ability to combine the functionality of multiple mobile systems into a more capable one. However, there are several key challenges. First, mobile systems are highly heterogeneous with different software and hardware, each with their own interfaces and data formats. Second, there are no effective ways to allow users to easily and dynamically compose together multiple mobile systems for the quick interactions that typically take place with mobile systems. Finally, there is a lack of system infrastructure to allow existing apps to make use of multiple mobile systems, or to enable developers to write new multi-mobile aware apps. My thesis is that higher-level abstractions of mobile operating systems can be reused to combine heterogeneous mobile systems into a more capable one and enable existing and new apps to provide new functionality across multiple mobile systems.
First, we present M2, a system for multi-mobile computing that enables existing unmodified mobile apps to share and combine multiple devices, including cameras, displays, speakers, microphones, sensors, GPS, and input. To support heterogeneous devices, M2 introduces a new data-centric approach that leverages higher-level device abstractions and hardware acceleration to efficiently share device data, not API calls. M2 introduces device transformation, a new technique to mix and match heterogeneous devices, enabling, for example, existing apps to leverage a single larger display fused from multiple displays for better viewing, or use a Nintendo Wii-like gaming experience by translating accelerometer to touchscreen input. We have implemented M2 and show that it operates across heterogeneous systems, including multiple versions of Android and iOS, and can run existing apps across mobile systems with modest overhead and qualitative performance indistinguishable from using local device hardware.
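The accelerometer-to-touchscreen transformation mentioned above can be sketched roughly as follows. The function name, screen size, and mapping gain are hypothetical stand-ins for illustration, not M2's actual implementation:

```python
# Illustrative sketch of device transformation: translating accelerometer
# tilt into touchscreen coordinates so an unmodified touch-driven app can
# be steered by motion. Resolution and normalisation are assumptions.

WIDTH, HEIGHT = 1080, 1920   # assumed screen resolution in pixels

def tilt_to_touch(ax, ay, g=9.81):
    """Map accelerometer x/y readings (m/s^2) to a synthetic touch point.

    Zero tilt lands at the screen centre; tilting toward an edge moves
    the touch toward that edge, clamped to the screen bounds.
    """
    nx = max(-1.0, min(1.0, ax / g))   # normalise to [-1, 1]
    ny = max(-1.0, min(1.0, ay / g))
    x = int((nx + 1) / 2 * (WIDTH - 1))
    y = int((ny + 1) / 2 * (HEIGHT - 1))
    return x, y

print(tilt_to_touch(0.0, 0.0))   # device held flat: touch at screen centre
```

Because the transformation operates on device data (sensor readings in, input events out) rather than on API calls, it fits naturally into M2's data-centric sharing model described above.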
Second, we present Tap, a framework that leverages M2's data-centric architecture to make it easy for users to dynamically compose collections of mobile systems and for developers to write new multi-mobile apps that make use of those impromptu collections. Tap allows users to simply tap systems together to compose them into a collection, without the need for users to register or connect to any cloud infrastructure. Tap makes it possible for apps to use existing mobile platform APIs across multiple mobile systems by virtualizing data sources so that local and remote data sources can be combined upon tapping. Virtualized data sources can be hardware or software features, including media, clipboard, calendar events, and devices such as cameras and microphones. Leveraging existing mobile platform APIs makes it easy for developers to write apps that use hardware and software features across dynamically composed collections of mobile systems. We have implemented Tap and show that it provides good usability for dynamically composing multiple mobile systems and good performance for sharing hardware devices and software features across multiple mobile systems.
Finally, using M2 and Tap, we present various apps that show how existing apps can provide useful functionality across multiple mobile systems and how new apps can be easily developed to provide new multi-mobile functionality. Examples include panoramic video recording using cameras from multiple mobile systems, a surround-sound music player app that configures itself by automatically detecting the locations of multiple mobile systems, and an added feature to the Snapchat app that allows multiple users to share a live Snap using their own cameras and filters. Our user studies with these apps show that multi-mobile computing offers a richer and more enhanced experience for users and a much simpler development effort for developers.
Modulated Backscatter for Low-Power High-Bandwidth Communication
This thesis re-examines the physical layer of a communication link in order to increase the energy efficiency of a remote device or sensor. Backscatter modulation allows a remote device to wirelessly telemeter information without operating a traditional transceiver. Instead, a backscatter device leverages a carrier transmitted by an access point or base station.
A low-power multi-state vector backscatter modulation technique is presented in which quadrature amplitude modulation (QAM) signalling is generated without running a traditional transceiver. Backscatter QAM allows for significant power savings compared to traditional wireless communication schemes. For example, a device presented in this thesis that implements 16-QAM backscatter modulation is capable of streaming data at 96 Mbps with a radio communication efficiency of 15.5 pJ/bit. This is over 100x lower energy per bit than WiFi (IEEE 802.11).
This work could lead to a new class of high-bandwidth sensors or implantables with power consumption far lower than that of traditional radios.
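A minimal sketch of the two quantities combined in that claim: a 16-QAM symbol mapping and the rate-times-energy power budget. The constellation layout here is a textbook grid, not necessarily the modulation map used in the thesis:

```python
# Generic 16-QAM: 4 bits per symbol, amplitude levels -3, -1, 1, 3 on
# each axis. The radio power budget follows from data rate x energy/bit.

LEVELS = [-3, -1, 1, 3]

def qam16_point(nibble):
    """Map a 4-bit value (0..15) onto a 16-QAM constellation point."""
    i = LEVELS[nibble >> 2]      # high two bits pick the in-phase level
    q = LEVELS[nibble & 0b11]    # low two bits pick the quadrature level
    return complex(i, q)

bits_per_symbol = 4
rate = 96e6            # 96 Mbps data rate reported in the abstract
e_bit = 15.5e-12       # 15.5 pJ/bit reported efficiency

symbol_rate = rate / bits_per_symbol   # symbols per second
radio_power = rate * e_bit             # watts spent on radio communication

print(symbol_rate, radio_power)
```

At 15.5 pJ/bit, streaming 96 Mbps costs about 1.5 mW of radio power, which is the arithmetic behind the "over 100x lower than WiFi" comparison.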
Outils de spatialisation sonore pour terminaux mobiles : microphone 3D pour une utilisation nomade
Mobile technologies such as smartphones and tablets are now common consumer devices. In this PhD we use these technologies as a vehicle for bringing tools of sound spatialization to the mass market. Today, the size and number of transducers used to capture and render a spatial sound scene are the main factors limiting the portability of such devices. As a first step, a listening test based on a spatial audio recording of an opera allowed us to evaluate the 3D audio technologies available today for headphone rendering. The results of this test show that, using an appropriate binaural decoding, a good binaural rendering can be achieved using only the four sensors of the Soundfield microphone. The development of a 3D sound pick-up system is then described, and several configurations are evaluated and compared. The device, composed of three cardioid microphones, was developed following an approach inspired by sound source localization and by the concept of "object format" encoding. From the microphone signals and an adapted post-processing, it is possible to determine the directions of the sources and to extract a sound signal representative of the sound scene. In this way the sound scene can be completely described and the audio information compressed. This method offers the advantage of being cross-platform: the sound scene encoded with it can be rendered over any reproduction system. A second method for extracting the spatial information is proposed, which uses the real in-situ characteristics of the microphone array to perform the sound scene analysis. Finally, propositions are made to complete the 3D audio chain, allowing the result of the sound scene encoding to be rendered over a binaural system or any kind of loudspeaker array using the full capabilities of mobile devices.
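The localization idea, inferring source direction from the relative levels of a few cardioid capsules, can be sketched as follows. The equilateral three-capsule layout and ideal cardioid pattern are simplifying assumptions for illustration, not the thesis's measured configuration:

```python
import cmath
import math

# Three ideal cardioid capsules aimed at 0, 120 and 240 degrees. A
# cardioid's gain toward a source at azimuth theta is
# 0.5 * (1 + cos(theta - theta_i)). Summing each capsule's amplitude
# times a unit vector along its axis yields a vector whose angle is the
# source azimuth (the second-harmonic terms cancel for this layout).

MIC_ANGLES = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]

def cardioid_gain(theta, mic_angle):
    return 0.5 * (1 + math.cos(theta - mic_angle))

def estimate_doa(amplitudes):
    """Estimate source azimuth (radians) from the capsule amplitudes."""
    v = sum(a * cmath.exp(1j * m) for a, m in zip(amplitudes, MIC_ANGLES))
    return cmath.phase(v) % (2 * math.pi)

source = math.radians(75)                            # true direction
amps = [cardioid_gain(source, m) for m in MIC_ANGLES]
print(round(math.degrees(estimate_doa(amps)), 1))    # -> 75.0
```

Once the direction is known, the capsule signals can be combined into a single "object" signal plus metadata, which is the compression-and-flexibility property the abstract describes.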
Non-Intrusive Subscriber Authentication for Next Generation Mobile Communication Systems
The last decade has witnessed massive growth in both the technological development and
the consumer adoption of mobile devices such as mobile handsets and PDAs. The recent
introduction of wideband mobile networks has enabled the deployment of new services
with access to traditionally well protected personal data, such as banking details or
medical records. Secure user access to this data has however remained a function of the
mobile device's authentication system, which is only protected from masquerade abuse by
the traditional PIN, originally designed to protect against telephony abuse.
This thesis presents novel research in relation to advanced subscriber authentication for
mobile devices. The research began by assessing the threat of masquerade attacks on
such devices by way of a survey of end users. This revealed that the current methods of
mobile authentication remain extensively unused, leaving terminals highly vulnerable to
masquerade attack. Further investigation revealed that, in the context of the more
advanced wideband enabled services, users are receptive to many advanced
authentication techniques and principles, including the discipline of biometrics which
naturally lends itself to the area of advanced subscriber based authentication.
To address the requirement for a more personal authentication capable of being applied
in a continuous context, a novel non-intrusive biometric authentication technique was
conceived, drawn from the discrete disciplines of biometrics and Auditory Evoked
Responses. The technique forms a hybrid multi-modal biometric where variations in the
behavioural stimulus of the human voice (due to the propagation effects of acoustic
waves within the human head) are used to verify the identity of a user. The resulting
approach is known as the Head Authentication Technique (HAT).
Evaluation of the HAT authentication process is realised in two stages. Firstly, the
generic authentication procedures of registration and verification are automated within a
prototype implementation. Secondly, a HAT demonstrator is used to evaluate the
authentication process through a series of experimental trials involving a representative
user community. The results from the trials confirm that multiple HAT samples from
the same user exhibit a high degree of correlation, yet samples between users exhibit a
high degree of discrepancy. Statistical analysis of the prototype's performance realised
early system error rates of FNMR = 6% and FMR = 0.025%. The results clearly
demonstrate the authentication capabilities of this novel biometric approach and the
contribution this new work can make to the protection of subscriber data in next
generation mobile networks.
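For reference, the two quoted error rates can be computed from match-score samples as follows; the scores below are made up for illustration, not the HAT trial data:

```python
# FNMR (false non-match rate) is the fraction of genuine attempts
# rejected; FMR (false match rate) is the fraction of impostor attempts
# accepted, both for a given decision threshold on the match score.

def error_rates(genuine, impostor, threshold):
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    return fnmr, fmr

genuine = [0.91, 0.84, 0.78, 0.95, 0.66, 0.88]   # same-user match scores
impostor = [0.12, 0.31, 0.05, 0.44, 0.27, 0.19]  # cross-user match scores

print(error_rates(genuine, impostor, threshold=0.7))
```

Raising the threshold trades FMR for FNMR; the reported HAT figures (FNMR = 6%, FMR = 0.025%) correspond to one such operating point on that trade-off curve.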
Toujou radyo: the digital extensions of Haitian music broadcasting
"Radio" is a hard word to pin down. It refers to a physical device with certain properties, perhaps a dial and antenna, but also to a whole world of sonic communication: newscasts, call-in shows, police dispatches, music streaming services, media that can be transmitted in any number of ways, on any number of devices. This dissertation focuses on Haitian music broadcasting as a social practice, meaning the programming formats, aesthetic conventions and listener experiences that make this particular medium function. I then trace those practices from Haiti's terrestrial airwaves out to the many digital media platforms that are ubiquitous in Haiti and Haitian-American communities today, arguing that there is a clear line of continuity across these various forms of radio. In doing so, this research offers a possible model for understanding the so-called 'digital revolution' of which we are all part, with the distinction that this model is rooted in media practices of the Global South. The four case studies that form the body of this dissertation are each framed around the specificity of Haitian radio, for both its producers and listeners, and progress outwards from the medium's terrestrial roots to its farthest digital extensions. Chapter One introduces some of the program formats and hosting techniques heard commonly on Haiti's airwaves, with a special focus on the art of "animation," by which broadcasters bring music to life through verbal commentary. Chapter Two provides a deeper history of broadcasting in Haiti and the Caribbean, arguing that radio in the region has been transnational in scope from its earliest applications. In Chapter Three the focus shifts to the United States, and to the specific regulatory, social and geographic dynamics that have informed the development of radio broadcasting among Haitian immigrants there.
Finally, Chapter Four tells the stories of two musicians who have themselves taken on many of the roles and practices of broadcasters, as an opportunity for comparative analysis. Each case study considers how a diverse range of practices and technologies can be understood as still radio, toujou radyo, while investigating the musical consequences and social impact of these digital extensions.