7,052 research outputs found

Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds

Author: Deleforge Antoine
Forbes Florence
Horaud Radu
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 20/03/2014

In this paper we address the problems of modeling the acoustic space generated by a full-spectrum sound source and of using the learned model for the localization and separation of multiple sources that simultaneously emit sparse-spectrum sounds. We lay theoretical and methodological grounds in order to introduce the binaural manifold paradigm. We perform an in-depth study of the latent low-dimensional structure of the high-dimensional interaural spectral data, based on a corpus recorded with a human-like audiomotor robot head. A non-linear dimensionality reduction technique is used to show that these data lie on a two-dimensional (2D) smooth manifold parameterized by the motor states of the listener, or equivalently, the sound source directions. We propose a probabilistic piecewise affine mapping model (PPAM) specifically designed to deal with high-dimensional data exhibiting an intrinsic piecewise linear structure. We derive a closed-form expectation-maximization (EM) procedure for estimating the model parameters, followed by Bayes inversion for obtaining the full posterior density function of a sound source direction. We extend this solution to deal with missing data and redundancy in real world spectrograms, and hence for 2D localization of natural sound sources such as speech. We further generalize the model to the challenging case of multiple sound sources and we propose a variational EM framework. The associated algorithm, referred to as variational EM for source separation and localization (VESSL), yields a Bayesian estimation of the 2D locations and time-frequency masks of all the sources. Comparisons of the proposed approach with several existing methods reveal that the combination of acoustic-space learning with Bayesian inference enables our method to outperform state-of-the-art methods. Comment: 19 pages, 9 figures, 3 table

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Access to recorded interviews: A research agenda

Author: Heeren W.F.L.
Jong F.M.G. de
Oard D.W.
Ordelman R.J.F.
Publication venue: ACM
Publication date: 01/01/2008

Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.

University of Twente Research Information

ToScA North America (6 – 8 June 2017, The University of Texas, Austin, TX) Program

Author: Ahmed Farah
Maisano Jessie
Publication venue: Jackson School of Geosciences; The University of Texas at Austin
Publication date: 01/06/2017

ToScA North America will address key areas of science, including Multi-modal Imaging, Geosciences, Forensics, Increasing Contrast, Educational Outreach, Data, Materials Science, and Medical and Biological Science. University of Texas High-Resolution X-ray CT Facility (UTCT); Jackson School of Geosciences, The University of Texas at Austin; Natural History Museum (London); Royal Microscopical Society (Oxford, UK). Geological Science

Texas ScholarWorks

Expressive Body Capture: 3D Hands, Face, and Body from a Single Image

Author: Black Michael J.
Bolkart Timo
Choutas Vasileios
Ghorbani Nima
Osman Ahmed A. A.
Pavlakos Georgios
Tzionas Dimitrios
Publication date: 01/01/2019

To facilitate the analysis of human actions, interactions and emotions, we compute a 3D model of human body pose, hand pose, and facial expression from a single monocular image. To achieve this, we use thousands of 3D scans to train a new, unified, 3D model of the human body, SMPL-X, that extends SMPL with fully articulated hands and an expressive face. Learning to regress the parameters of SMPL-X directly from images is challenging without paired images and 3D ground truth. Consequently, we follow the approach of SMPLify, which estimates 2D features and then optimizes model parameters to fit the features. We improve on SMPLify in several significant ways: (1) we detect 2D features corresponding to the face, hands, and feet and fit the full SMPL-X model to these; (2) we train a new neural network pose prior using a large MoCap dataset; (3) we define a new interpenetration penalty that is both fast and accurate; (4) we automatically detect gender and the appropriate body models (male, female, or neutral); (5) our PyTorch implementation achieves a speedup of more than 8x over Chumpy. We use the new method, SMPLify-X, to fit SMPL-X to both controlled images and images in the wild. We evaluate 3D accuracy on a new curated dataset comprising 100 images with pseudo ground-truth. This is a step towards automatic expressive human capture from monocular RGB data. The models, code, and data are available for research purposes at https://smpl-x.is.tue.mpg.de. Comment: To appear in CVPR 201

arXiv.org e-Print Archive

Accelerated High-Resolution Photoacoustic Tomography via Compressed Sensing

Author: Arridge Simon
Beard Paul
Betcke Marta
Cox Ben
Huynh Nam
Lucka Felix
Ogunlade Olumide
Zhang Edward
Publication venue: 'IOP Publishing'
Publication date: 28/09/2016

Current 3D photoacoustic tomography (PAT) systems offer either high image quality or high frame rates but are not able to deliver high spatial and temporal resolution simultaneously, which limits their ability to image dynamic processes in living tissue. A particular example is the planar Fabry-Perot (FP) scanner, which yields high-resolution images but takes several minutes to sequentially map the photoacoustic field on the sensor plane, point-by-point. However, as the spatio-temporal complexity of many absorbing tissue structures is rather low, the data recorded in such a conventional, regularly sampled fashion is often highly redundant. We demonstrate that combining variational image reconstruction methods using spatial sparsity constraints with the development of novel PAT acquisition systems capable of sub-sampling the acoustic wave field can dramatically increase the acquisition speed while maintaining a good spatial resolution. First, we describe and model two general spatial sub-sampling schemes. Then, we discuss how to implement them using the FP scanner and demonstrate the potential of these novel compressed sensing PAT devices through simulated data from a realistic numerical phantom and through measured data from a dynamic experimental phantom as well as from in-vivo experiments. Our results show that images with good spatial resolution and contrast can be obtained from highly sub-sampled PAT data if variational image reconstruction methods that describe the tissue structures with suitable sparsity constraints are used. In particular, we examine the use of total variation regularization enhanced by Bregman iterations. These novel reconstruction strategies offer new opportunities to dramatically increase the acquisition speed of PAT scanners that employ point-by-point sequential scanning as well as reducing the channel count of parallelized schemes that use detector arrays. Comment: submitted to "Physics in Medicine and Biology"

arXiv.org e-Print Archive

TR-2005006: Integration of Laser Vibrometry with Infrared Video for Multimedia Surveillance Display

Author: Li Weihong
Zhu Zhigang
Publication venue: CUNY Academic Works
Publication date: 01/01/2005

Image-based Lagrangian Particle Tracking in bed-load experiments

Author: A. Radice
F. Ballio
S. Sarkar
Publication venue: 'MyJove Corporation'
Publication date: 01/01/2017

Image analysis has been increasingly used for the measurement of river flows because it can furnish detailed quantitative depictions at a relatively low cost. This manuscript describes an application of particle tracking velocimetry (PTV) to a bed-load experiment with lightweight sediment. The key characteristics of the investigated sediment transport conditions were the presence of a covered flow and of a fixed rough bed above which particles were released in limited number at the flume inlet. Under the applied flow conditions, the motion of the individual bed-load particles was intermittent, with alternating periods of movement and stillness. The flow pattern was preliminarily characterized by acoustic measurements of vertical profiles of the stream-wise velocity. During process visualization, a large field of view was obtained using two action cameras placed at different locations along the flume. The experimental protocol is described in terms of channel calibration, experiment realization, image pre-processing, automatic particle tracking, and post-processing of particle track data from the two cameras. The presented proof-of-concept results include probability distributions of the particle hop length and duration. The achievements of this work are compared to those of existing literature to demonstrate the validity of the protocol.
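As an illustration of how hop length and duration might be extracted from a single particle track, the following is a minimal sketch and not the protocol described in the abstract. It assumes a track is a list of (x, y) positions sampled at a fixed frame interval, and that a particle counts as moving whenever its frame-to-frame speed exceeds a threshold. The function name `hop_statistics`, the speed threshold criterion, and the use of summed path length as the hop length are all assumptions made for this example.

```python
import math
from typing import List, Tuple


def hop_statistics(
    track: List[Tuple[float, float]],
    dt: float,
    speed_threshold: float,
) -> List[Tuple[float, float]]:
    """Return (hop_length, hop_duration) for each hop in one particle track.

    Hypothetical criterion: a frame-to-frame step belongs to a hop when
    step / dt exceeds speed_threshold. Consecutive moving steps are merged
    into one hop; any step at or below the threshold ends the hop.
    """
    hops = []
    length = 0.0   # path length accumulated in the current hop
    frames = 0     # number of moving steps in the current hop
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        if step / dt > speed_threshold:
            length += step
            frames += 1
        elif frames > 0:
            # Particle came to rest: close the current hop.
            hops.append((length, frames * dt))
            length, frames = 0.0, 0
    if frames > 0:
        # Track ended while the particle was still moving.
        hops.append((length, frames * dt))
    return hops


# Usage with made-up positions: two hops separated by a rest period.
example_track = [(0, 0), (0, 0), (1, 0), (3, 0), (3, 0), (3, 0), (3, 2), (3, 2)]
example_hops = hop_statistics(example_track, dt=0.5, speed_threshold=0.1)
```

Collecting the returned pairs over many tracks would give the samples from which empirical distributions of hop length and duration could be built. The threshold value would need to be chosen against the measurement noise of the tracking.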

Archivio istituzionale della ricerca - Politecnico di Milano