6 research outputs found
Development of an immersive multi-party conferencing system with 3D audio for mobile devices
This work develops a multi-party conferencing system with spatial audio for mobile terminals. The system improves conversation intelligibility using HRTF-based binaural sound processing, and it provides a graphical touch interface for placing the participants in a virtual space on the terminal's screen.
Aguilera Martí, E. (2011). Desarrollo de un sistema de multiconferencia inmersiva con audio 3D para móviles. http://hdl.handle.net/10251/15357
Speaker Localization and Detection in Videoconferencing Environments Using a Modified SRP-PHAT Algorithm
[EN] The Steered Response Power - Phase Transform (SRP-PHAT) algorithm has been shown to be one of the most robust sound source localization approaches operating in noisy and reverberant environments. However, its practical implementation is usually based on a costly fine grid-search procedure, making the computational cost of the method a real issue. In this paper, we introduce an effective strategy which performs a full exploration of the sampled space rather than computing the SRP at discrete spatial positions, increasing its robustness and allowing for a coarser spatial grid that reduces the computational cost required in a practical implementation. The modified SRP-PHAT functional has been successfully implemented in a real-time speaker localization system for multiparticipant videoconferencing environments. Moreover, a localization-based speech/non-speech frame discriminator is presented.
This work was supported by the Ministry of Education and Science under the project TEC2009-14414-C03-01.
Martí Guerola, A.; Cobos Serrano, M.; Aguilera Martí, E.; López Monfort, JJ. (2011). Speaker Localization and Detection in Videoconferencing Environments Using a Modified SRP-PHAT Algorithm. Waves. 3:40-47. http://hdl.handle.net/10251/57648
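As a rough illustration of the building block behind the abstract above, the following sketch computes a PHAT-weighted generalized cross-correlation (GCC-PHAT) for one microphone pair and compares a point evaluation of the steered response with an accumulation over a range of lags. It is a minimal sketch, not the authors' implementation: the function names (`dft`, `gcc_phat`, `srp_point`, `srp_interval`), the circularly shifted test signal, and the reading of "full exploration of the sampled space" as a sum over a lag interval are all assumptions made for illustration.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for short demo frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat(sig_a, sig_b):
    """PHAT-weighted circular cross-correlation; index = lag of sig_a behind sig_b."""
    cross = []
    for a, b in zip(dft(sig_a), dft(sig_b)):
        c = a * b.conjugate()
        mag = abs(c)
        # PHAT weighting keeps only the phase of the cross-spectrum.
        cross.append(c / mag if mag > 1e-12 else 0j)
    return [v.real for v in dft(cross, inverse=True)]


def srp_point(gcc, lag):
    """Conventional contribution of one pair: GCC sampled at a single lag."""
    return gcc[lag % len(gcc)]


def srp_interval(gcc, lo, hi):
    """Assumed 'modified' contribution: GCC accumulated over lags lo..hi."""
    return sum(gcc[lag % len(gcc)] for lag in range(lo, hi + 1))


# Hypothetical test frame: noise at microphone B, delayed by 5 samples at A.
rng = random.Random(0)
n = 64
mic_b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
mic_a = [mic_b[(t - 5) % n] for t in range(n)]

gcc = gcc_phat(mic_a, mic_b)
best_lag = max(range(n), key=gcc.__getitem__)
```

With a coarse grid whose nearest candidate corresponds to a lag of 4 samples, `srp_point(gcc, 4)` misses the correlation peak at lag 5, whereas `srp_interval(gcc, 3, 6)` still captures it. This is the intuition for why accumulating over the lags covered by each grid cell can tolerate a coarser grid.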
On the distance perception in spatial audio system: a comparison between Wave-Field Synthesis and Panning Systems
Creating a realistic distance perception by means of spatial audio reproduction systems is not an easy task. Cues such as the ratio between the direct signal and the level of reverberation have been traditionally employed in stereo and surround systems. With the introduction of advanced spatial audio systems such as Wave Field Synthesis (WFS), it is possible to synthesize within the whole listening area the correct wavefront curvature produced by a virtual source located at a given distance. Some previous studies suggest that this curvature can be an additional cue for the listener to extrapolate distance. In this work, a subjective perceptual test has been carried out to compare the capabilities of WFS and Vector Base Amplitude Panning (VBAP) to accurately reproduce sound source distances. Different variables were studied: type of sound, listening angle, and reverberation at different distances. The analysis of the collected data suggests that WFS is better at reproducing distances than panning systems.
Gutiérrez Parera, P.; López Monfort, JJ.; Aguilera Martí, E. (2014). On the distance perception in spatial audio system: a comparison between Wave-Field Synthesis and Panning Systems. Waves. 6:51-59. http://hdl.handle.net/10251/57870
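For readers unfamiliar with VBAP, the panning side of the comparison above can be sketched for a single loudspeaker pair in two dimensions. The gains are chosen so that the gain-weighted sum of the loudspeaker direction vectors points at the virtual source, and are then normalized to constant power. The function name `vbap_2d` and the example angles are assumptions for illustration; the sketch does not reproduce the test setup of the paper and carries no distance cue by itself.

```python
import math


def unit_vector(angle_deg):
    """Unit direction vector for an azimuth given in degrees."""
    rad = math.radians(angle_deg)
    return math.cos(rad), math.sin(rad)


def vbap_2d(source_deg, spk1_deg, spk2_deg):
    """Constant-power gains (g1, g2) such that g1*l1 + g2*l2 points at the source."""
    p = unit_vector(source_deg)
    l1 = unit_vector(spk1_deg)
    l2 = unit_vector(spk2_deg)
    det = l1[0] * l2[1] - l1[1] * l2[0]
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker directions must not be collinear")
    # Cramer's rule for the 2x2 system [l1 l2] @ [g1, g2] = p.
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (l1[0] * p[1] - l1[1] * p[0]) / det
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm


# Hypothetical stereo pair at +30 and -30 degrees.
centre_gains = vbap_2d(0.0, 30.0, -30.0)
edge_gains = vbap_2d(30.0, 30.0, -30.0)
```

A source midway between the loudspeakers receives equal gains, and a source located at one loudspeaker is reproduced by that loudspeaker alone. Because only amplitudes change, the wavefront curvature stays that of the physical loudspeakers, which is the difference from WFS that the study investigates.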
Computer-based detection and classification of flaws in citrus fruits
[EN] In this paper, a system for quality control in citrus fruits is presented. In current citrus manufacturing industries, size (calibre) and color are successfully used for the automatic classification of fruits using vision systems. However, the detection of flaws in the citrus surface is carried out by means of human inspection. In this work, a computer vision system capable of detecting defects in the citrus peel and also classifying the type of flaw is presented. First, a review of citrus diseases has been carried out in order to build a database of digitized oranges classified by the kind of fault, which is used as a training set. The segmentation of faulty zones is performed by applying the Sobel gradient to the image. Afterwards, color and texture features of the flaw are extracted considering different color spaces, some of them related to high-order statistics. Several techniques have been employed for classification purposes: Euclidean distance to a prototype, nearest neighbor, and k-nearest neighbors. Additionally, a three-layer neural network has been tested and compared, obtaining promising results.
López Monfort, JJ.; Cobos Serrano, M.; Aguilera Martí, E. (2011). Computer-based detection and classification of flaws in citrus fruits. Neural Computing and Applications. 20(7):975-981. doi:10.1007/s00521-010-0396-2
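Two of the steps named in the abstract above, Sobel-gradient segmentation and k-nearest-neighbor classification, can be sketched in a few lines. This is an illustrative sketch only: the tiny grayscale image, the two-dimensional feature vectors, and the labels "sound" and "flaw" are hypothetical and do not come from the paper, which extracts color and texture features from several color spaces.

```python
import math
from collections import Counter

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(img):
    """Gradient magnitude of a grayscale image (list of rows); borders stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    pixel = img[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * pixel
                    gy += SOBEL_Y[j][i] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


def knn_classify(sample, training, k=3):
    """Majority label among the k training vectors closest in Euclidean distance."""
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 4x4 image with a vertical intensity step (a candidate flaw edge).
image = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_magnitude(image)

# Hypothetical normalized (color, texture) feature vectors with labels.
training_set = [
    ((0.10, 0.20), "sound"),
    ((0.20, 0.10), "sound"),
    ((0.90, 0.80), "flaw"),
    ((0.80, 0.90), "flaw"),
    ((0.85, 0.85), "flaw"),
]
predicted = knn_classify((0.80, 0.80), training_set, k=3)
```

Thresholding `edges` would give a rough mask of candidate flaw boundaries. Setting `k=1` in `knn_classify` reduces it to the nearest-neighbor rule also mentioned in the abstract.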
An Immersive Multi-Party Conferencing System for Mobile Devices Using 3D Binaural Audio
[EN] The use of mobile telephony, along with the widespread adoption of smartphones in the consumer market, is gradually displacing traditional telephony. Fixed-line telephone conference calls have been widely employed for carrying out distributed meetings around the world in the last decades. However, the powerful characteristics brought by modern mobile devices and data networks allow for new conferencing schemes based on immersive communication, one of the fields of major commercial and technical interest within the telecommunications industry today. In this context, adding spatial audio features into conventional conferencing systems is a natural way of creating a realistic communication environment. In fact, the human auditory system takes advantage of spatial audio cues to locate, separate and understand multiple speakers when they talk simultaneously. As a result, speech intelligibility is significantly improved if the speakers are simulated to be spatially distributed. This paper describes the development of a new immersive multi-party conference call service for mobile devices (smartphones and tablets) that substantially improves the identification and intelligibility of the participants. Headphone-based audio reproduction and binaural sound processing algorithms allow the user to locate the different speakers within a virtual meeting room. Moreover, the use of a large touch screen helps the user to identify and remember the participants taking part in the conference, with the possibility of changing their spatial location in an interactive way.
This work has been partially supported by the government of Spain grant TEC-2009-14414-C03-01 and by the new technologies department of Telefónica.
Aguilera Martí, E.; López Monfort, JJ.; Cobos Serrano, M.; Macià Pina, L.; Martí Guerola, A. (2012). An Immersive Multi-Party Conferencing System for Mobile Devices Using 3D Binaural Audio. Waves. 4:5-14. http://hdl.handle.net/10251/57918
Spatial Audio for Audioconferencing in Mobile Devices: Investigating the Importance of Virtual Mobility and Private Communication and Optimizations
Audioconferencing systems are becoming increasingly sophisticated, seeking to improve immersion, intelligibility, and sense of presence. In parallel, mobile devices are gaining traction for such applications, often supplanting desktops as the platform of choice, especially when the communication does not require video. This article describes our design of a mobile multiparty audioconference application and our research into the influence of spatial audio and interactivity on user experience with the application. In particular, we consider implementation tradeoffs and investigate whether the full potential of spatial audio is realized simply by distributing the virtual locations of the participants according to some predetermined configuration. In addition, we analyze the utility of "whisper" mode functionality in which a subset of participants can engage in an ad-hoc sidebar conversation privately from the remaining participants. Our results provide interesting guidelines of relevance to the development of future audioconferencing systems.
The Spanish Ministry of Economy and Competitiveness supported this work under the project TEC2012-37945-C01. The experiments were carried out with the support of the Transatlantic Partnership for Excellence in Engineering (TEE). This support is gratefully acknowledged.
Aguilera Martí, E.; López Monfort, JJ.; Cooperstock, JR. (2016). Spatial Audio for Audioconferencing in Mobile Devices: Investigating the Importance of Virtual Mobility and Private Communication and Optimizations. Journal of the Audio Engineering Society. 64(5):332-341. doi:10.17743/jaes.2016.0009