10 research outputs found
Listening to Distances and Hearing Shapes:Inverse Problems in Room Acoustics and Beyond
A central theme of this thesis is using echoes to achieve useful, interesting, and sometimes surprising results. One should have no doubts about the echoes' constructive potential; it is, after all, demonstrated masterfully by Nature. Just think about the bat's intriguing ability to navigate in unknown spaces and hunt for insects by listening to echoes of its calls, or about similar (albeit less well-known) abilities of toothed whales, some birds, shrews, and ultimately people. We show that, perhaps contrary to conventional wisdom, multipath propagation resulting from echoes is our friend. When we think about it the right way, it reveals essential geometric information about the sources--channel--receivers system. The key idea is to think of echoes as being more than just delayed and attenuated peaks in 1D impulse responses; they are actually additional sources with their corresponding 3D locations. This transformation allows us to forget about the abstract \emph{room}, and to replace it by more familiar \emph{point sets}. We can then engage the powerful machinery of Euclidean distance geometry. A problem that always arises is that we do not know \emph{a priori} the matching between the peaks and the points in space, and solving the inverse problem is achieved by \emph{echo sorting}---a tool we developed for learning correct labelings of echoes. This has applications beyond acoustics, whenever one deals with waves and reflections, or more generally, time-of-flight measurements. Equipped with this perspective, we first address the ``Can one hear the shape of a room?'' question, and we answer it with a qualified ``yes''. Even a single impulse response uniquely describes a convex polyhedral room, whereas a more practical algorithm to reconstruct the room's geometry uses only first-order echoes and a few microphones. Next, we show how different problems of localization benefit from echoes. The first one is multiple indoor sound source localization. Assuming the room is known, we show that discretizing the Helmholtz equation yields a system of sparse reconstruction problems linked by the common sparsity pattern. By exploiting the full bandwidth of the sources, we show that it is possible to localize multiple unknown sound sources using only a single microphone. We then look at indoor localization with known pulses from the geometric echo perspective introduced previously. Echo sorting enables localization in non-convex rooms without a line-of-sight path, and localization with a single omni-directional sensor, which is impossible without echoes. A closely related problem is microphone position calibration; we show that echoes can help even without assuming that the room is known. Using echoes, we can localize arbitrary numbers of microphones at unknown locations in an unknown room using only one source at an unknown location---for example a finger snap---and get the room's geometry as a byproduct. Our study of source localization outgrew the initial form factor when we looked at source localization with spherical microphone arrays. Spherical signals appear well beyond spherical microphone arrays; for example, any signal defined on Earth's surface lives on a sphere. This resulted in the first slight departure from the main theme: We develop the theory and algorithms for sampling sparse signals on the sphere using finite rate-of-innovation principles and apply it to various signal processing problems on the sphere
Proceedings of the EAA Joint Symposium on Auralization and Ambisonics 2014
In consideration of the remarkable intensity of research in the field of Virtual Acoustics, including different areas such as sound field analysis and synthesis, spatial audio technologies, and room acoustical modeling and auralization, it seemed about time to organize a second international symposium following the model of the first EAA Auralization Symposium initiated in 2009 by the acoustics group of the former Helsinki University of Technology (now Aalto University). Additionally, research communities which are focused on different approaches to sound field synthesis such as Ambisonics or Wave Field Synthesis have, in the meantime, moved closer together by using increasingly consistent theoretical frameworks. Finally, the quality of virtual acoustic environments is often considered as a result of all processing stages mentioned above, increasing the need for discussions on consistent strategies for evaluation. Thus, it seemed appropriate to integrate two of the most relevant communities, i.e. to combine the 2nd International Auralization Symposium with the 5th International Symposium on Ambisonics and Spherical Acoustics. The Symposia on Ambisonics, initiated in 2009 by the Institute of Electronic Music and Acoustics of the University of Music and Performing Arts in Graz, were traditionally dedicated to problems of spherical sound field analysis and re-synthesis, strategies for the exchange of ambisonics-encoded audio material, and – more than other conferences in this area – the artistic application of spatial audio systems.
This publication contains the official conference proceedings. It includes 29 manuscripts which have passed a 3-stage peer-review with a board of about 70 international reviewers involved in the process. Each contribution has already been published individually with a unique DOI on the DepositOnce digital repository of TU Berlin. Some conference contributions have been recommended for resubmission to Acta Acustica united with Acustica, to possibly appear in a Special Issue on Virtual Acoustics in late 2014. These are not published in this collection.European Acoustics Associatio
Assessment of a hybrid numerical approach to estimate sound wave propagation in an enclosure and application of auralizations to evaluate acoustical conditions of a classroom to establish the impact of acoustic variables on cognitive processes
In this research, the concept of auralization is explored taking into account a hybrid numerical approach to establish good options for calculating sound wave propagation and the application of virtual sound environments to evaluate acoustical conditions of a classroom, in order to determine the impact of acoustic variables on cognitive processes. The hybrid approach considers the combination of well-established Geometrical Acoustic (GA) techniques and the Finite Element Method (FEM), contemplating for the latter the definition of a real valued impedance boundary condition related to absorption coefficients available in GA databases. The realised virtual sound environments are verified against real environment measurements by means of objective and subjective methods. The former is based on acoustic measurements according to international standards, in order to evaluate the numerical approaches used with established acoustic indicators to assess sound propagation in rooms. The latter comprises a subjective test comparing the virtual auralizations to the reference ones, which are obtained by means of binaural impulse response measurements. The first application of the auralizations contemplates an intelligibility and listening difficulty subjective test, considering different acoustic conditions of reverberation time and background noise levels. The second application studies the impact of acoustic variables on the cognitive processes of attention, memory and executive function, by means of psychological tests
Bimodal Audiovisual Perception in Interactive Application Systems of Moderate Complexity
The dissertation at hand deals with aspects of quality perception of
interactive audiovisual application systems of moderate complexity as e.g.
defined in the MPEG-4 standard. Because in these systems the available
computing power is limited, it is decisive to know which factors influence
the perceived quality. Only then can the available computing power be
distributed in the most effective and efficient way for the simulation and
display of audiovisual 3D scenes. Whereas quality factors for the unimodal
auditory and visual stimuli are well known and respective models of
perception have been successfully devised based on this knowledge, this is
not true for bimodal audiovisual perception. For the latter, it is only
known that some kind of interdependency between auditory and visual
perception does exist. The exact mechanisms of human audiovisual perception
have not been described. It is assumed that interaction with an application
or scene has a major influence upon the perceived overall quality.
The goal of this work was to devise a system capable of performing
subjective audiovisual assessments in the given context in a largely
automated way. By applying the system, first evidence regarding audiovisual
interdependency and influence of interaction upon perception should be
collected. Therefore this work was composed of three fields of activities:
the creation of a test bench based on the available but (regarding the
audio functionality) somewhat restricted MPEG-4 player, the preoccupation
with methods and framework requirements that ensure comparability and
reproducibility of audiovisual assessments and results, and the performance
of a series of coordinated experiments including the analysis and
interpretation of the collected data. An object-based modular audio
rendering engine was co-designed and -implemented which allows to perform
simple room-acoustic simulations based on the MPEG-4 scene description
paradigm in real-time. Apart from the MPEG-4 player, the test bench
consists of a haptic Input Device used by test subjects to enter their
quality ratings and a logging tool that allows to journalize all relevant
events during an assessment session. The collected data can be exported
comfortably for further analysis using appropriate statistic tools.
A thorough analysis of the well established test methods and
recommendations for unimodal subjective assessments was performed to find
out whether a transfer to the audiovisual bimodal case is easily possible.
It became evident that - due to the limited knowledge about the underlying
perceptual processes - a novel categorization of experiments according to
their goals could be helpful to organize the research in the field.
Furthermore, a number of influencing factors could be identified that
exercise control over bimodal perception in the given context.
By performing the perceptual experiments using the devised system, its
functionality and ease of use was verified. Apart from that, some first
indications for the role of interaction in perceived overall quality have
been collected: interaction in the auditory modality reduces a human's
ability of correctly rating the audio quality, whereas visually based
(cross-modal) interaction does not necessarily generate this effect.Die vorliegende Dissertation beschäftigt sich mit Aspekten der
Qualitätswahrnehmung von interaktiven audiovisuellen Anwendungssystemen
moderater Komplexität, wie sie z.B. durch den MPEG-4 Standard definiert
sind. Die Frage, welche Faktoren Einfluss auf die wahrgenommene Qualität
von audiovisuellen Anwendungssystemen haben ist entscheidend dafĂĽr, wie die
nur begrenzt zur VerfĂĽgung stehende Rechenleistung fĂĽr die
Echtzeit-Simulation von 3D Szenen und deren Darbietung sinnvoll verteilt
werden soll. Während Qualitätsfaktoren für unimodale auditive als auch
visuelle Stimuli seit langem bekannt sind und entsprechende Modelle
existieren, mĂĽssen diese fĂĽr die bimodale audiovisuelle Wahrnehmung noch
hergeleitet werden. Dabei ist bekannt, dass eine Wechselwirkung zwischen
auditiver und visueller Qualität besteht, nicht jedoch, wie die Mechanismen
menschlicher audiovisueller Wahrnehmung genau arbeiten. Es wird auch
angenommen, dass der Faktor Interaktion einen wesentlichen Einfluss auf
wahrgenommene Qualität hat.
Das Ziel dieser Arbeit war, ein System fĂĽr die zeitsparende und weitgehend
automatisierte DurchfĂĽhrung von subjektiven audiovisuellen
Wahrnehmungstests im gegebenen Kontext zu erstellen und es fĂĽr einige
exemplarische Experimente einzusetzen, welche erste Aussagen ĂĽber
audiovisuelleWechselwirkungen und den Einfluss von Interaktion auf die
Wahrnehmung erlauben sollten. Demzufolge gliederte sich die Arbeit in drei
Aufgabenbereiche: die Erstellung eines geeigneten Testsystems auf der
Grundlage eines vorhandenen, jedoch in seiner Audiofunktionalität noch
eingeschränkten MPEG-4 Players, das Sicherstellen von Vergleichbarkeit und
Wiederholbarkeit von audiovisuellen Wahrnehmungstests durch definierte
Testmethoden und -bedingungen, und die eigentliche DurchfĂĽhrung der
aufeinander abgestimmten Experimente mit anschlieĂżender Auswertung und
Interpretation der gewonnenen Daten. Dazu wurde eine objektbasierte,
modulare Audio-Engine mitentworfen und -implementiert, welche basierend auf
den Möglichkeiten der MPEG-4 Szenenbeschreibung alle Fähigkeiten zur
Echtzeitberechnung von Raumakustik bietet. Innerhalb des entwickelten
Testsystems kommuniziert der MPEG-4 Player mit einem hardwaregestĂĽtzten
Benutzerinterface zur Eingabe der Qualitätsbewertungen durch die
Testpersonen. Sämtliche relevanten Ereignisse, die während einer
Testsession auftreten, können mit Hilfe eines Logging-Tools aufgezeichnet
und fĂĽr die weitere Datenanalyse mit Statistikprogrammen exportiert werden.
Eine Analyse der existierenden Testmethoden und -empfehlungen fĂĽr unimodale
Wahrnehmungstests sollte zeigen, ob deren Ăśbertragung auf den
audiovisuellen Fall möglich ist. Dabei wurde deutlich, dass bedingt durch
die fehlende Kenntnis der zugrundeliegenden Wahrnehmungsprozesse zunächst
eine Unterteilung nach den Zielen der durchgefĂĽhrten Experimente sinnvoll
erscheint. Weiterhin konnten Einflussfaktoren identifiziert werden, die die
bimodale Wahrnehmung im gegebenen Kontext steuern.
Bei der DurchfĂĽhrung der Wahrnehmungsexperimente wurde die
Funktionsfähigkeit des erstellten Testsystems verifiziert. Darüber hinaus
ergaben sich erste Anhaltspunkte fĂĽr den Einfluss von Interaktion auf die
wahrgenommene Gesamtqualität: Interaktion in der auditiven Modalität
verringert die Fähigkeit, Audioqualität korrekt beurteilen zu können,
während visuell gestützte Interaktion (cross-modal) diesen Effekt nicht
zwingend generiert
Rake, Peel, Sketch:The Signal Processing Pipeline Revisited
The prototypical signal processing pipeline can be divided into four blocks. Representation of the signal in a basis suitable for processing. Enhancement of the meaningful part of the signal and noise reduction. Estimation of important statistical properties of the signal. Adaptive processing to track and adapt to changes in the signal statistics. This thesis revisits each of these blocks and proposes new algorithms, borrowing ideas from information theory, theoretical computer science, or communications. First, we revisit the Walsh-Hadamard transform (WHT) for the case of a signal sparse in the transformed domain, namely that has only K †N non-zero coefficients. We show that an efficient algorithm exists that can compute these coefficients in O(K log2(K) log2(N/K)) and using only O(K log2(N/K)) samples. This algorithm relies on a fast hashing procedure that computes small linear combinations of transformed domain coefficients. A bipartite graph is formed with linear combinations on one side, and non-zero coefficients on the other. A peeling decoder is then used to recover the non-zero coefficients one by one. A detailed analysis of the algorithm based on error correcting codes over the binary erasure channel is given. The second chapter is about beamforming. Inspired by the rake receiver from wireless communications, we recognize that echoes in a room are an important source of extra signal diversity. We extend several classic beamforming algorithms to take advantage of echoes and also propose new optimal formulations. We explore formulations both in time and frequency domains. We show theoretically and in numerical simulations that the signal-to-interference-and-noise ratio increases proportionally to the number of echoes used. Finally, beyond objective measures, we show that echoes also directly improve speech intelligibility as measured by the perceptual evaluation of speech quality (PESQ) metric. Next, we attack the problem of direction of arrival of acoustic sources, to which we apply a robust finite rate of innovation reconstruction framework. FRIDA â the resulting algorithm â exploits wideband information coherently, works at very low signal-to-noise ratio, and can resolve very close sources. The algorithm can use either raw microphone signals or their cross- correlations. While the former lets us work with correlated sources, the latter creates a quadratic number of measurements that allows to locate many sources with few microphones. Thorough experiments on simulated and recorded data shows that FRIDA compares favorably with the state-of-the-art. We continue by revisiting the classic recursive least squares (RLS) adaptive filter with ideas borrowed from recent results on sketching least squares problems. The exact update of RLS is replaced by a few steps of conjugate gradient descent. We propose then two different precondi- tioners, obtained by sketching the data, to accelerate the convergence of the gradient descent. Experiments on artificial as well as natural signals show that the proposed algorithm has a performance very close to that of RLS at a lower computational burden. The fifth and final chapter is dedicated to the software and hardware tools developed for this thesis. We describe the pyroomacoustics Python package that contains routines for the evaluation of audio processing algorithms and reference implementations of popular algorithms. We then give an overview of the microphone arrays developed
The Future of Humanoid Robots
This book provides state of the art scientific and engineering research findings and developments in the field of humanoid robotics and its applications. It is expected that humanoids will change the way we interact with machines, and will have the ability to blend perfectly into an environment already designed for humans. The book contains chapters that aim to discover the future abilities of humanoid robots by presenting a variety of integrated research in various scientific and engineering fields, such as locomotion, perception, adaptive behavior, human-robot interaction, neuroscience and machine learning. The book is designed to be accessible and practical, with an emphasis on useful information to those working in the fields of robotics, cognitive science, artificial intelligence, computational methods and other fields of science directly or indirectly related to the development and usage of future humanoid robots. The editor of the book has extensive R&D experience, patents, and publications in the area of humanoid robotics, and his experience is reflected in editing the content of the book
Abstracts on Radio Direction Finding (1899 - 1995)
The files on this record represent the various databases that originally composed the CD-ROM issue of "Abstracts on Radio Direction Finding" database, which is now part of the Dudley Knox Library's Abstracts and Selected Full Text Documents on Radio Direction Finding (1899 - 1995) Collection. (See Calhoun record https://calhoun.nps.edu/handle/10945/57364 for further information on this collection and the bibliography).
Due to issues of technological obsolescence preventing current and future audiences from accessing the bibliography, DKL exported and converted into the three files on this record the various databases contained in the CD-ROM.
The contents of these files are:
1) RDFA_CompleteBibliography_xls.zip [RDFA_CompleteBibliography.xls: Metadata for the complete bibliography, in Excel 97-2003 Workbook format; RDFA_Glossary.xls: Glossary of terms, in Excel 97-2003 Workbookformat; RDFA_Biographies.xls: Biographies of leading figures, in Excel 97-2003 Workbook format];
2) RDFA_CompleteBibliography_csv.zip [RDFA_CompleteBibliography.TXT: Metadata for the complete bibliography, in CSV format; RDFA_Glossary.TXT: Glossary of terms, in CSV format; RDFA_Biographies.TXT: Biographies of leading figures, in CSV format];
3) RDFA_CompleteBibliography.pdf: A human readable display of the bibliographic data, as a means of double-checking any possible deviations due to conversion
Visual-auditory visualisation of dynamic multi-scale heterogeneous objects.
The multi-scale phenomena analysis is an area of active research that is connecting simulations with experiments to get a correct insight into the compound dynamic structure. Visualisation is a challenging task due to a large amount of data and a wide range of complex data representations. The analysis of dynamic multi-scale phenomena requires a combination of geometric modelling and rendering techniques for the analysis of the changes in the internal structure in the case of data coming from different sources of various nature. Moreover, the area often addresses the limitations of solely visual data representation and considers the introduction of other sensory stimuli as a well-known tool to enhance visual analysis. However, there is a lack of software tools allowing perform an advanced real-time analysis of heterogeneous phenomena properties. The hardware-accelerated volume rendering allows getting insight into the internal structure of complex multi-scale phenomena. The technique is convenient for detailed visual analysis and highlights the features of interest in complex structures and is an area of active research. However, the conventional volume visualisation is limited to the use of transfer functions that operate on homogeneous material and, as a result, does not provide flexibility in geometry and material distribution modelling that is crucial for the analysis of heterogeneous objects. Moreover, the extension to visual-auditory analysis emphasises the necessity to review the entire conventional volume visualisation pipeline. The multi-sensory feedback highly depends on the use of modern hardware and software advances for real-time modelling and evaluation. In this work, we explore the aspects of the design of visual-auditory pipelines for the analysis of dynamic multi-scale properties of heterogeneous objects that can allow overcoming well-known problems of complex representations solely visual analysis. We consider the similarities between light and sound propagation as a solution to the problem. The approach benefits from a combination of GPU accelerated ray-casting, geometry, optical and auditory properties modelling. We discuss how the modern GPU techniques application in those areas allows introducing a unified approach to the visual-auditory analysis of dynamic multi-scale heterogeneous objects. Similarly to the conventional volume rendering technique based on light propagation, we model auditory feedback as a result of initial impulse propagation through 3D space and its digital representation as a sampled sound wave obtained with the ray-casting procedure. The auditory stimuli can complement visual ones in the analysis of the dynamic multi-scale heterogeneous object. We propose a framework that facilitates the design of dynamic multi-scale heterogeneous objects visual-auditory pipeline and discuss the framework application for two case studies. The first is a molecular phenomena study that is a result of molecular dynamics simulation and quantum simulation. The second explores microstructures in digital fabrication with an arbitrary irregular lattice structure. For considered case studies, the visual-auditory techniques facilitate the interactive analysis of both spatial structure and internal multi-scale properties of volume nature in complex heterogeneous objects. A GPU-accelerated framework for visual-auditory analysis of heterogeneous objects can be applied and extend beyond this research. Thus, to specify the main direction of such extension from the point of view of the potential users, strengthen the value of this research as well as to evaluate the vision of the application of the techniques described above, we carry out a preliminary evaluation. The user study aims to compare our expectations on the visual-auditory approach with the views of the potential users of this system if it is implemented as a software product. A preliminary evaluation study was carried out with limitations imposed by 2020/2021 restrictions. However, it confirms that the main direction for the visual-auditory analysis of heterogeneous objects has been identified correctly and visual and auditory stimuli can complement each other in the analysis of both volume and spatial distribution properties of heterogeneous phenomena. The user reviews also highlight the necessary enhancements that should be introduced to the approach in terms of the design of more complex user interfaces and consideration of additional application cases. To provide a more detailed picture on evaluation results and recommendations introduced, we also identify the key factors that define the user vision of the approach further enhancement and its possible application areas, such as users experience in the area of complex physical phenomena analysis or multi-sensory area. The discussed in this work aspects of heterogeneous objects analysis task, theoretical and practical solutions allow considering the application, further development and enhancement of the results in multidisciplinary areas of GPU accelerated High-performance visualisation pipelines design and multi-sensory analysis
Cumulative index to NASA Tech Briefs, 1986-1990, volumes 10-14
Tech Briefs are short announcements of new technology derived from the R&D activities of the National Aeronautics and Space Administration. These briefs emphasize information considered likely to be transferrable across industrial, regional, or disciplinary lines and are issued to encourage commercial application. This cumulative index of Tech Briefs contains abstracts and four indexes (subject, personal author, originating center, and Tech Brief number) and covers the period 1986 to 1990. The abstract section is organized by the following subject categories: electronic components and circuits, electronic systems, physical sciences, materials, computer programs, life sciences, mechanics, machinery, fabrication technology, and mathematics and information sciences
New Directions for Contact Integrators
Contact integrators are a family of geometric numerical schemes which
guarantee the conservation of the contact structure. In this work we review the
construction of both the variational and Hamiltonian versions of these methods.
We illustrate some of the advantages of geometric integration in the
dissipative setting by focusing on models inspired by recent studies in
celestial mechanics and cosmology.Comment: To appear as Chapter 24 in GSI 2021, Springer LNCS 1282