
    Raking echoes in the time domain

    The geometry of room acoustics is such that the reverberant signal can be seen as the same waveform emitted from multiple locations. In analogy with the rake receiver from wireless communications, we propose several beamforming strategies that exploit, rather than suppress, this additional spatio-temporal diversity. Unlike earlier work in the frequency domain, time-domain designs allow us to shape the impulse response of the beamformer. In particular, we can control perceptually relevant parameters, such as the amount of early echoes or the length of the beamformer response. Relying on knowledge of the image source positions, we derive different optimal beamformers. Leveraging perceptual cues, we show how to improve interference and noise reduction without degrading the perceptual quality. The designs are validated through simulation. Using early echoes is shown to strictly improve the signal-to-interference-and-noise ratio. Code and speech samples are available online at http://lcav.epfl.ch/Robin_Scheibler

    Raking the Cocktail Party

    We present the concept of an acoustic rake receiver, a microphone beamformer that uses echoes to improve noise and interference suppression. The rake idea is well-known in wireless communications; it involves constructively combining different multipath components that arrive at the receiver antennas. Unlike spread-spectrum signals used in wireless communications, speech signals are not orthogonal to their shifts. Therefore, we focus on the spatial structure rather than the temporal one. Instead of explicitly estimating the channel, we create correspondences between early echoes in time and image sources in space. These multiple sources of the desired and the interfering signal offer additional spatial diversity that we can exploit in the beamformer design. We present several "intuitive" and optimal formulations of acoustic rake receivers, and show theoretically and numerically that the rake formulation of the maximum signal-to-interference-and-noise beamformer offers significant performance boosts in terms of noise and interference suppression. Beyond signal-to-noise ratio, we observe gains in terms of the perceptual evaluation of speech quality (PESQ) metric for the speech quality. We accompany the paper with the complete simulation and processing chain written in Python. The code and the sound samples are available online at http://lcav.github.io/AcousticRakeReceiver/. Comment: 12 pages, 11 figures, accepted for publication in IEEE Journal on Selected Topics in Signal Processing (Special Issue on Spatial Audio)
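
    The correspondence between early echoes and image sources described in this abstract can be illustrated with a minimal sketch. It assumes an axis-aligned rectangular room with one corner at the origin and a nominal speed of sound; the function names are hypothetical and this is not the code released with the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed nominal value at room temperature


def first_order_images(source, room_dims):
    """Mirror the source across each wall of an axis-aligned box room.

    The room is assumed to span [0, room_dims[d]] along every axis d.
    Each returned point is a first-order image source.
    """
    images = []
    for axis, length in enumerate(room_dims):
        for wall in (0.0, length):
            image = list(source)
            # Reflection of a coordinate x across the plane x = wall.
            image[axis] = 2.0 * wall - source[axis]
            images.append(tuple(image))
    return images


def propagation_delay(point, mic):
    """Time of flight in seconds from a (real or image) source to a microphone."""
    return math.dist(point, mic) / SPEED_OF_SOUND


source = (1.0, 2.0)
room = (4.0, 3.0)
mic = (3.0, 1.0)
images = first_order_images(source, room)
direct = propagation_delay(source, mic)
echo_delays = [propagation_delay(img, mic) for img in images]
```

    Each entry of `echo_delays` is the arrival time of one first-order echo, so every echo peak in time is paired with one image source position in space.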

    Spherical microphone array acoustic rake receivers

    Several signal-independent acoustic rake receivers are proposed for speech dereverberation using spherical microphone arrays. The proposed rake designs take advantage of multipath propagation by separately capturing and combining early reflections with the direct path. We investigate several approaches to combining reflections with the direct-path source signal, including the development of beam patterns that point nulls at all preceding reflections. The proposed designs are tested in experimental simulations and their dereverberation performance is evaluated using objective measures. For the tested configuration, the proposed designs achieve higher levels of dereverberation than conventional signal-independent beamforming systems, with up to 3.6 dB improvement in the direct-to-reverberant ratio over the plane-wave decomposition beamformer

    Terahertz frequency-wavelet domain deconvolution for stratigraphic and subsurface investigation of art painting

    Terahertz frequency-wavelet deconvolution is utilized specifically for the stratigraphic and subsurface investigation of art paintings with terahertz reflective imaging. In order to resolve the optically thin paint layers, a deconvolution technique is enhanced by the combination of frequency-domain filtering and stationary wavelet shrinkage, and applied to investigate a mid-20th century Italian oil painting on paperboard, After Fishing, by Ausonio Tanda. Based on the deconvolved terahertz data, the stratigraphy of the painting including the paint layers is reconstructed and subsurface features are clearly revealed, demonstrating that terahertz frequency-wavelet deconvolution can be an effective tool to characterize stratified systems with optically thin layers

    Pyroomacoustics: A Python package for audio room simulations and array processing algorithms

    We present pyroomacoustics, a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the package can be divided into three main components: an intuitive Python object-oriented interface to quickly construct different simulation scenarios involving multiple sound sources and microphones in 2D and 3D rooms; a fast C implementation of the image source model for general polyhedral rooms to efficiently generate room impulse responses and simulate the propagation between sources and receivers; and finally, reference implementations of popular algorithms for beamforming, direction finding, and adaptive filtering. Together, they form a package with the potential to speed up the time to market of new algorithms by significantly reducing the implementation overhead in the performance evaluation step. Comment: 5 pages, 5 figures, describes a software package

    Rake, Peel, Sketch: The Signal Processing Pipeline Revisited

    The prototypical signal processing pipeline can be divided into four blocks: representation of the signal in a basis suitable for processing; enhancement of the meaningful part of the signal and noise reduction; estimation of important statistical properties of the signal; and adaptive processing to track and adapt to changes in the signal statistics. This thesis revisits each of these blocks and proposes new algorithms, borrowing ideas from information theory, theoretical computer science, or communications. First, we revisit the Walsh-Hadamard transform (WHT) for the case of a signal that is sparse in the transformed domain, that is, one with only K ≪ N non-zero coefficients. We show that an efficient algorithm exists that can compute these coefficients in O(K log2(K) log2(N/K)) time using only O(K log2(N/K)) samples. This algorithm relies on a fast hashing procedure that computes small linear combinations of transformed domain coefficients. A bipartite graph is formed with linear combinations on one side, and non-zero coefficients on the other. A peeling decoder is then used to recover the non-zero coefficients one by one. A detailed analysis of the algorithm based on error correcting codes over the binary erasure channel is given. The second chapter is about beamforming. Inspired by the rake receiver from wireless communications, we recognize that echoes in a room are an important source of extra signal diversity. We extend several classic beamforming algorithms to take advantage of echoes and also propose new optimal formulations. We explore formulations both in time and frequency domains. We show theoretically and in numerical simulations that the signal-to-interference-and-noise ratio increases proportionally to the number of echoes used. Finally, beyond objective measures, we show that echoes also directly improve speech intelligibility as measured by the perceptual evaluation of speech quality (PESQ) metric. 
Next, we attack the problem of direction of arrival of acoustic sources, to which we apply a robust finite rate of innovation reconstruction framework. FRIDA, the resulting algorithm, exploits wideband information coherently, works at very low signal-to-noise ratio, and can resolve very close sources. The algorithm can use either raw microphone signals or their cross-correlations. While the former lets us work with correlated sources, the latter creates a quadratic number of measurements that allows us to locate many sources with few microphones. Thorough experiments on simulated and recorded data show that FRIDA compares favorably with the state-of-the-art. We continue by revisiting the classic recursive least squares (RLS) adaptive filter with ideas borrowed from recent results on sketching least squares problems. The exact update of RLS is replaced by a few steps of conjugate gradient descent. We then propose two different preconditioners, obtained by sketching the data, to accelerate the convergence of the gradient descent. Experiments on artificial as well as natural signals show that the proposed algorithm has a performance very close to that of RLS at a lower computational burden. The fifth and final chapter is dedicated to the software and hardware tools developed for this thesis. We describe the pyroomacoustics Python package that contains routines for the evaluation of audio processing algorithms and reference implementations of popular algorithms. We then give an overview of the microphone arrays developed
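
    For readers unfamiliar with the Walsh-Hadamard transform mentioned in the first chapter of this thesis, the following is a minimal sketch of the classic dense fast transform, which costs O(N log2(N)) operations. It is not the sparse hashing-and-peeling algorithm the abstract describes; the function name `fwht` is an illustrative choice.

```python
def fwht(values):
    """Return the unnormalized Walsh-Hadamard transform in natural (Hadamard) order.

    The input length must be a power of two. Applying the transform twice
    returns the input scaled by its length.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine entries that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


x = [1, 2, 3, 4]
spectrum = fwht(x)
roundtrip = [v // len(x) for v in fwht(spectrum)]
```

    Because the unnormalized transform is its own inverse up to a factor of N, `roundtrip` recovers `x`.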

    Listening to Distances and Hearing Shapes: Inverse Problems in Room Acoustics and Beyond

    A central theme of this thesis is using echoes to achieve useful, interesting, and sometimes surprising results. One should have no doubts about the echoes' constructive potential; it is, after all, demonstrated masterfully by Nature. Just think about the bat's intriguing ability to navigate in unknown spaces and hunt for insects by listening to echoes of its calls, or about similar (albeit less well-known) abilities of toothed whales, some birds, shrews, and ultimately people. We show that, perhaps contrary to conventional wisdom, multipath propagation resulting from echoes is our friend. When we think about it the right way, it reveals essential geometric information about the sources-channel-receivers system. The key idea is to think of echoes as being more than just delayed and attenuated peaks in 1D impulse responses; they are actually additional sources with their corresponding 3D locations. This transformation allows us to forget about the abstract room, and to replace it by more familiar point sets. We can then engage the powerful machinery of Euclidean distance geometry. A problem that always arises is that we do not know a priori the matching between the peaks and the points in space, and solving the inverse problem is achieved by echo sorting, a tool we developed for learning correct labelings of echoes. This has applications beyond acoustics, whenever one deals with waves and reflections, or more generally, time-of-flight measurements. Equipped with this perspective, we first address the "Can one hear the shape of a room?" question, and we answer it with a qualified "yes". Even a single impulse response uniquely describes a convex polyhedral room, whereas a more practical algorithm to reconstruct the room's geometry uses only first-order echoes and a few microphones. Next, we show how different problems of localization benefit from echoes. The first one is multiple indoor sound source localization. 
Assuming the room is known, we show that discretizing the Helmholtz equation yields a system of sparse reconstruction problems linked by the common sparsity pattern. By exploiting the full bandwidth of the sources, we show that it is possible to localize multiple unknown sound sources using only a single microphone. We then look at indoor localization with known pulses from the geometric echo perspective introduced previously. Echo sorting enables localization in non-convex rooms without a line-of-sight path, and localization with a single omni-directional sensor, which is impossible without echoes. A closely related problem is microphone position calibration; we show that echoes can help even without assuming that the room is known. Using echoes, we can localize arbitrary numbers of microphones at unknown locations in an unknown room using only one source at an unknown location---for example a finger snap---and get the room's geometry as a byproduct. Our study of source localization outgrew the initial form factor when we looked at source localization with spherical microphone arrays. Spherical signals appear well beyond spherical microphone arrays; for example, any signal defined on Earth's surface lives on a sphere. This resulted in the first slight departure from the main theme: We develop the theory and algorithms for sampling sparse signals on the sphere using finite rate-of-innovation principles and apply it to various signal processing problems on the sphere
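
    The idea of treating echoes as additional sources with their own locations rests on a simple geometric fact: a first-order image source is the mirror image of the true source across a wall, so the wall lies on the perpendicular bisector plane between the two points. The sketch below illustrates only that fact, assuming both positions are already known; it is not the thesis's reconstruction algorithm, and `wall_from_image` is a hypothetical name.

```python
import math


def wall_from_image(source, image):
    """Return (point_on_wall, unit_normal) for the wall that maps source to image.

    The wall plane passes through the midpoint of the segment joining the
    source and its first-order image, and its normal points along that segment.
    """
    diff = [i - s for s, i in zip(source, image)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("source and image must be distinct points")
    normal = tuple(d / norm for d in diff)
    midpoint = tuple((s + i) / 2.0 for s, i in zip(source, image))
    return midpoint, normal


# A source at x = 1 mirrored to x = -1 implies a wall in the plane x = 0.
point, normal = wall_from_image((1.0, 2.0, 1.0), (-1.0, 2.0, 1.0))
```

    Repeating this for each first-order image yields one plane per wall, which is why knowing the image source positions is enough to describe the room as a point set.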

    Historical geography III: hope persists

    The final report in this series focuses on the emerging intersections between historical geography, archaeology and the law. Whilst staying attuned to the darkest of geographies emerging from the sub-field, this report turns its attention to the creative and critical ways in which the dead are being used to reveal past lives and worlds that have been destroyed and forgotten. Using soil and the archaeological imagination as a pivot, this report centres on the interweaving themes of fragile environments, resurfacing and legal worlds in order to suggest the emerging possibilities for a hopeful excavation of new historical geographies

    ‘It’s hard to define good writing, but I recognise it when I see it’: can consensus-based assessment evaluate the teaching of writing?

    In a Higher Education environment where evidence-based practice and accountability are highly valued, most writing practitioners will be familiar with direct requests or less tangible pressures to demonstrate that their teaching has a positive impact on students’ writing skills. Although such evaluations are not devoid of risk and the need for them is contested, it can be argued that it is better to engage with them, as this can avoid the danger of overly simplistic forms of measurements being imposed. The current paper engages with this question by proposing the conceptual basis for a new measurement tool. Based on Amabile’s Consensual Assessment Technique (CAT), developed to assess creativity, the tool develops the idea of consensual assessment of writing as a methodology that can provide robust data through systematic measurement. At the same time, I argue that consensual assessment reflects the evaluation of writing in real-life situations more closely than many of the methodologies for writing assessment used in other contexts, primarily large-scale tests. As such, it would allow writing practitioners to go beyond ethnographic methods, or self-reporting, in order to obtain greater insight into the ways in which their teaching helps change students’ actual writing, without sacrificing the complexity of writing as social interaction, which is fundamental to an academic literacies approach

    Her Anticipation

    My thesis paper will explore my artistic practice by analyzing my thesis video project, Her Anticipation. I will accomplish this by examining three main topics: The essential elements of the video, the video’s relationship to my earlier work, and a discussion of the video and its structure with representative examples. The first essential element of the video springs from my relationship with my husband, especially the aspect of the relationship that is tied up with our nearly forty-year age difference. The second element is my personal experience filtered through a deep reading of Edmund Spenser’s The Faerie Queene. The third element that I discuss is the thinking behind my cultivation of a personal voice in the medium of video. The thesis’ second main topic recounts my earlier work and its connection to my current video project. I began my artistic career with analog photography. Photography gave me training in composition and camera work. My early video work prepared me for composing moving images, camera motion, and editing. Most importantly, my shorter videos allowed me to experiment with creating nontraditional, uncoerced narratives. In the third section of my thesis I guide the reader through each of the four chapters of my video. For each chapter, I summarize the relevant passage in The Faerie Queene and then unpack that chapter of video. For each chapter, I discuss individual shots, sequence of shots, composition, and content. More importantly, I elaborate on the underlying emotional structure of the video (which the musical selections and literary excerpts help to articulate). I conclude my thesis by outlining a few of the implications that this project has for my work as an artist. My own artistic personality has been greatly clarified by working on Her Anticipation. Furthermore, the project has helped me imagine the trajectory of my future production. 
I will continue to create video works in this pattern, and they will be much advanced because of the experience that I have had creating this video