1,126 research outputs found

    Configurable EBEN: Extreme Bandwidth Extension Network to enhance body-conducted speech capture

    Full text link
    This paper presents a configurable version of Extreme Bandwidth Extension Network (EBEN), a Generative Adversarial Network (GAN) designed to improve audio captured with body-conduction microphones. We show that although these microphones significantly reduce environmental noise, this insensitivity to ambient noise happens at the expense of the bandwidth of the speech signal acquired by the wearer of the devices. The obtained captured signals therefore require the use of signal enhancement techniques to recover the full-bandwidth speech. EBEN leverages a configurable multiband decomposition of the raw captured signal. This decomposition allows the data time domain dimensions to be reduced and the full band signal to be better controlled. The multiband representation of the captured signal is processed through a U-Net-like model, which combines feature and adversarial losses to generate an enhanced speech signal. We also benefit from this original representation in the proposed configurable discriminators architecture. The configurable EBEN approach can achieve state-of-the-art enhancement results on synthetic data with a lightweight generator that allows real-time processing.Comment: Accepted in IEEE/ACM Transactions on Audio, Speech and Language Processing on 14/08/202

    Amplitude and phase sonar calibration and the use of target phase for enhanced acoustic target characterisation

    Get PDF
    This thesis investigates the incorporation of target phase into sonar signal processing, for enhanced information in the context of acoustical oceanography. A sonar system phase calibration method, which includes both the amplitude and phase response is proposed. The technique is an extension of the widespread standard-target sonar calibration method, based on the use of metallic spheres as standard targets. Frequency domain data processing is used, with target phase measured as a phase angle difference between two frequency components. This approach minimizes the impact of range uncertainties in the calibration process. Calibration accuracy is examined by comparison to theoretical full-wave modal solutions. The system complex response is obtained for an operating frequency of 50 to 150 kHz, and sources of ambiguity are examined. The calibrated broadband sonar system is then used to study the complex scattering of objects important for the modelling of marine organism echoes, such as elastic spheres, fluid-filled shells, cylinders and prolate spheroids. Underlying echo formation mechanisms and their interaction are explored. Phase-sensitive sonar systems could be important for the acquisition of increased levels of information, crucial for the development of automated species identification. Studies of sonar system phase calibration and complex scattering from fundamental shapes are necessary in order to incorporate this type of fully-coherent processing into scientific acoustic instruments

    A Survey on Reservoir Computing and its Interdisciplinary Applications Beyond Traditional Machine Learning

    Full text link
    Reservoir computing (RC), first applied to temporal signal processing, is a recurrent neural network in which neurons are randomly connected. Once initialized, the connection strengths remain unchanged. Such a simple structure turns RC into a non-linear dynamical system that maps low-dimensional inputs into a high-dimensional space. The model's rich dynamics, linear separability, and memory capacity then enable a simple linear readout to generate adequate responses for various applications. RC spans areas far beyond machine learning, since it has been shown that the complex dynamics can be realized in various physical hardware implementations and biological devices. This yields greater flexibility and shorter computation time. Moreover, the neuronal responses triggered by the model's dynamics shed light on understanding brain mechanisms that also exploit similar dynamical processes. While the literature on RC is vast and fragmented, here we conduct a unified review of RC's recent developments from machine learning to physics, biology, and neuroscience. We first review the early RC models, and then survey the state-of-the-art models and their applications. We further introduce studies on modeling the brain's mechanisms by RC. Finally, we offer new perspectives on RC development, including reservoir design, coding frameworks unification, physical RC implementations, and interaction between RC, cognitive neuroscience and evolution.Comment: 51 pages, 19 figures, IEEE Acces

    MediaSync: Handbook on Multimedia Synchronization

    Get PDF
    This book provides an approachable overview of the most recent advances in the fascinating field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users' perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is getting renewed attention to overcome remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance and the multiple disciplines it involves, the availability of a reference book on mediasync becomes necessary. This book fills the gap in this context. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space, from different perspectives. Mediasync: Handbook on Multimedia Synchronization is the perfect companion for scholars and practitioners that want to acquire strong knowledge about this research area, and also approach the challenges behind ensuring the best mediated experiences, by providing the adequate synchronization between the media elements that constitute these experiences

    Aesthetically driven design of network based multi-user instruments.

    Get PDF
    Digital networking technologies open up a new world of possibilities for music making, allowing performers to collaborate in ways not possible before. Network based Multi-User Instruments (NMIs) are one novel method of musical collaboration that take advantage of networking technology. NMIs are digital musical instruments that exist as a single entity instantiated over several nodes in a network and are performed simultaneously by multiple musicians in realtime. This new avenue is exciting, but it begs the question of how does one design instruments for this new medium? This research explores the use of an aesthetically driven design process to guide the design, construction, rehearsal, and performance of a series of NMIs. This is an iterative process that makes use of a regularly rehearsing and performing ensemble which serves as a test-bed for new instruments, from conception, to design, to implementation, to performance. This research includes details of several NMIs constructed in accordance with this design process. These NMIs have been quantitatively analysed and empirically tested for the presence of interconnectivity and group influence during performance as a method for measuring group collaboration. Furthermore qualitative analyses are applied which test for the perceived e ectiveness of these instruments during real-world performances in front of live audiences. The results of these analyses show that an aesthetically driven method of designing NMIs produces instruments that are interactive and collaborative. Furthermore results show that audiences perceive a measurable impression of interconnectivity and liveness in the ensemble even though most of the performers in the ensemble are not physically present

    The Translocal Event and the Polyrhythmic Diagram

    Get PDF
    This thesis identifies and analyses the key creative protocols in translocal performance practice, and ends with suggestions for new forms of transversal live and mediated performance practice, informed by theory. It argues that ontologies of emergence in dynamic systems nourish contemporary practice in the digital arts. Feedback in self-organised, recursive systems and organisms elicit change, and change transforms. The arguments trace concepts from chaos and complexity theory to virtual multiplicity, relationality, intuition and individuation (in the work of Bergson, Deleuze, Guattari, Simondon, Massumi, and other process theorists). It then examines the intersection of methodologies in philosophy, science and art and the radical contingencies implicit in the technicity of real-time, collaborative composition. Simultaneous forces or tendencies such as perception/memory, content/ expression and instinct/intellect produce composites (experience, meaning, and intuition- respectively) that affect the sensation of interplay. The translocal event is itself a diagram - an interstice between the forces of the local and the global, between the tendencies of the individual and the collective. The translocal is a point of reference for exploring the distribution of affect, parameters of control and emergent aesthetics. Translocal interplay, enabled by digital technologies and network protocols, is ontogenetic and autopoietic; diagrammatic and synaesthetic; intuitive and transductive. KeyWorx is a software application developed for realtime, distributed, multimodal media processing. As a technological tool created by artists, KeyWorx supports this intuitive type of creative experience: a real-time, translocal “jamming” that transduces the lived experience of a “biogram,” a synaesthetic hinge-dimension. The emerging aesthetics are processual – intuitive, diagrammatic and transversal

    Multimedia

    Get PDF
    The nowadays ubiquitous and effortless digital data capture and processing capabilities offered by the majority of devices, lead to an unprecedented penetration of multimedia content in our everyday life. To make the most of this phenomenon, the rapidly increasing volume and usage of digitised content requires constant re-evaluation and adaptation of multimedia methodologies, in order to meet the relentless change of requirements from both the user and system perspectives. Advances in Multimedia provides readers with an overview of the ever-growing field of multimedia by bringing together various research studies and surveys from different subfields that point out such important aspects. Some of the main topics that this book deals with include: multimedia management in peer-to-peer structures & wireless networks, security characteristics in multimedia, semantic gap bridging for multimedia content and novel multimedia applications
    • …
    corecore