    Proceedings of the 35th WIC Symposium on Information Theory in the Benelux and the 4th joint WIC/IEEE Symposium on Information Theory and Signal Processing in the Benelux, Eindhoven, the Netherlands May 12-13, 2014

    Compressive sensing (CS) has recently received much attention as an approach to data acquisition. In CS, recovering the signal from the observed data requires solving an underdetermined system of equations for a sparse vector. The underlying sparse signal recovery problem is quite general, has many applications, and is the focus of this talk. The main emphasis will be on Bayesian approaches to sparse signal recovery. We will examine sparse priors such as the super-Gaussian and Student-t priors and appropriate MAP estimation methods. In particular, the re-weighted l2 and re-weighted l1 methods developed to solve the resulting optimization problem will be discussed. The talk will also examine a hierarchical Bayesian framework and then study in detail an empirical Bayesian method, Sparse Bayesian Learning (SBL). If time permits, we will also discuss Bayesian methods for sparse recovery problems with structure: intra-vector correlation in the context of the block sparse model and inter-vector correlation in the context of the multiple measurement vector problem.
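
The re-weighted l2 idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes a FOCUSS-style update, x = W (A W)^+ y with W = diag(|x| + eps), which is one well-known re-weighted l2 scheme and not necessarily the variant presented in the talk. The function name `reweighted_l2`, the parameters `n_iter` and `eps`, and the toy system are hypothetical.

```python
import numpy as np


def reweighted_l2(A, y, n_iter=30, eps=1e-8):
    """FOCUSS-style re-weighted l2 iteration for a sparse solution of A x = y.

    Assumption: weights w = |x| + eps (hypothetical choice for illustration).
    """
    x = np.linalg.pinv(A) @ y          # minimum-l2-norm starting point
    for _ in range(n_iter):
        w = np.abs(x) + eps            # small entries get small weights
        q = np.linalg.pinv(A * w) @ y  # A * w scales column j by w[j], i.e. A @ diag(w)
        x = w * q                      # map back: x = W q
    return x


# Toy underdetermined system: 2 equations, 3 unknowns.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
y = np.array([1.0, 1.0])
x_hat = reweighted_l2(A, y)
print(np.round(x_hat, 6))
```

On this toy system the minimum-norm solution is (1/3, 1/3, 2/3), while the iteration concentrates the energy on the third coefficient and approaches the sparsest solution (0, 0, 1).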

    Acoustic compression in Zoom audio does not compromise voice recognition performance

    Human voice recognition over telephone channels typically yields lower accuracy when compared to audio recorded in a studio environment with higher quality. Here, we investigated the extent to which audio in video conferencing, subject to various lossy compression mechanisms, affects human voice recognition performance. Voice recognition performance was tested in an old–new recognition task under three audio conditions (telephone, Zoom, studio) across all matched (familiarization and test with same audio condition) and mismatched combinations (familiarization and test with different audio conditions). Participants were familiarized with female voices presented in either studio-quality (N = 22), Zoom-quality (N = 21), or telephone-quality (N = 20) stimuli. Subsequently, all listeners performed an identical voice recognition test containing a balanced stimulus set from all three conditions. Results revealed that voice recognition performance (dʹ) in Zoom audio was not significantly different to studio audio but both in Zoom and studio audio listeners performed significantly better compared to telephone audio. This suggests that signal processing of the speech codec used by Zoom provides equally relevant information in terms of voice recognition compared to studio audio. Interestingly, listeners familiarized with voices via Zoom audio showed a trend towards a better recognition performance in the test (p = 0.056) compared to listeners familiarized with studio audio. We discuss future directions according to which a possible advantage of Zoom audio for voice recognition might be related to some of the speech coding mechanisms used by Zoom
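
The abstract reports d′ but does not say how it was computed. A minimal sketch, assuming the standard signal-detection definition d′ = z(hit rate) − z(false-alarm rate) for an old–new task, might look as follows. The function name `d_prime` and the example rates are hypothetical and are not taken from the study.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; any correction for
    rates of exactly 0 or 1 is left to the caller (assumption).
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical listener: 80% of old voices recognised, 20% of new voices
# wrongly judged as old.
print(round(d_prime(0.8, 0.2), 3))
```

A listener who responds at chance (equal hit and false-alarm rates) gets d′ = 0, and larger values indicate better discrimination between old and new voices.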

    The Effect Of Acoustic Variability On Automatic Speaker Recognition Systems

    This thesis examines the influence of acoustic variability on automatic speaker recognition systems (ASRs) with three aims. i. To measure ASR performance under 5 commonly encountered acoustic conditions; ii. To contribute towards ASR system development with the provision of new research data; iii. To assess ASR suitability for forensic speaker comparison (FSC) application and investigative/pre-forensic use. The thesis begins with a literature review and explanation of relevant technical terms. Five categories of research experiments then examine ASR performance, reflective of conditions influencing speech quantity (inhibitors) and speech quality (contaminants), acknowledging quality often influences quantity. Experiments pertain to: net speech duration, signal to noise ratio (SNR), reverberation, frequency bandwidth and transcoding (codecs). The ASR system is placed under scrutiny with examination of settings and optimum conditions (e.g. matched/unmatched test audio and speaker models). Output is examined in relation to baseline performance and metrics assist in informing if ASRs should be applied to suboptimal audio recordings. Results indicate that modern ASRs are relatively resilient to low and moderate levels of the acoustic contaminants and inhibitors examined, whilst remaining sensitive to higher levels. The thesis provides discussion on issues such as the complexity and fragility of the speech signal path, speaker variability, difficulty in measuring conditions and mitigation (thresholds and settings). The application of ASRs to casework is discussed with recommendations, acknowledging the different modes of operation (e.g. investigative usage) and current UK limitations regarding presenting ASR output as evidence in criminal trials. 
In summary, and in the context of acoustic variability, the thesis recommends that ASRs could be applied to pre-forensic cases, accepting extraneous issues endure which require governance such as validation of method (ASR standardisation) and population data selection. However, ASRs remain unsuitable for broad forensic application with many acoustic conditions causing irrecoverable speech data loss contributing to high error rates

    Resource management in sensing services with audio applications

    Middleware abstractions, or services, that can bridge the gap between the increasingly pervasive sensors and the sophisticated inference applications exist, but they lack the necessary resource-awareness to support high data-rate sensing modalities such as audio/video. This work therefore investigates the resource management problem in sensing services, with application in audio sensing. First, a modular, data-centric architecture is proposed as the framework within which optimal resource management is studied. Next, the guided-processing principle is proposed to achieve optimized trade-off between resource (energy) and (inference) performance. On cascade-based systems, empirical results show that the proposed approach significantly improves the detection performance (up to 1.7x and 4x reduction in false-alarm and miss rate, respectively) for the same energy consumption, when compared to the duty-cycling approach. Furthermore, the guided-processing approach is also generalizable to graph-based systems. Resource-efficiency in the multiple-application setting is achieved through the feature-sharing principle. Once applied, the method results in a system that can achieve 9x resource saving and 1.43x improvement in detection performance in an example application. Based on the encouraging results above, a prototype audio sensing service is built for demonstration. An interference-robust audio classification technique with limited training data would prove valuable within the service, so a novel algorithm with the desired properties is proposed. 
The technique combines AI-gram time-frequency representation and multidimensional dynamic time warping, and it outperforms the state-of-the-art using the prominent-region-based approach across a wide range of (synthetic, both stationary and transient) interference types and signal-to-interference ratios, and also on field recordings (with areas under the receiver operating characteristic and precision-recall curves being 91% and 87%, respectively)

    Automated Writing Evaluation for non-native speaker English academic writing: The case of IADE and its formative feedback

    This dissertation presents an innovative approach to the development and empirical evaluation of Automated Writing Evaluation (AWE) technology used for teaching and learning. It introduces IADE (Intelligent Academic Discourse Evaluator), a new web-based AWE program that analyzes research article Introduction sections and generates immediate, individualized, discipline-specific feedback. The major purpose of the dissertation was to implement IADE as a formative assessment tool complementing L2 graduate-level academic writing instruction and to investigate the effectiveness and appropriateness of its automated evaluation and feedback. To achieve this goal, the study sought evidence of IADE's Language Learning Potential, Meaning Focus, Learner Fit, and Impact qualities outlined in Chapelle's (2001) CALL evaluation conceptual framework. A mixed-methods approach with a concurrent transformative strategy was employed. Quantitative data consisted of Likert-scale, yes/no, and open-ended survey responses; automated and human scores for first and last drafts; pre-/post test scores; and frequency counts for draft submission and for access to IADE's Help Options. Qualitative data contained students' first and last drafts as well as transcripts of think-aloud protocols and Camtasia computer screen recordings, observations, and semi-structured interviews. The findings indicate that IADE can be considered an effective formative assessment tool suitable for implementation in the targeted instructional context. Its effectiveness was a result of combined strengths of its Language Learning Potential, Meaning Focus, Learner Fit, and Impact qualities, which were all enhanced by the program's automated feedback. The strength of Language Learning Potential was supported by evidence of noticing of and focus on discourse form, improved rhetorical quality of writing, increased learning gains, and relative helpfulness of practice and modified interaction. 
Learners' focus on the functional meaning of discourse and construction of such meaning served as evidence of strong Meaning Focus. IADE's automated feedback characteristics and Help Options were appropriate for targeted learners, which speaks of adequate Learner Fit. Finally, despite some negative effects caused by IADE's numerical feedback, overall Impact, exerted at affective, intrinsic, pragmatic, and cognitive levels, was found to be positive due to the color-coded type of feedback. The results of this study provide valuable empirical knowledge to the areas of L2 academic writing, AWE, formative assessment, and I/CALL. They have important practical and theoretical implications and are informative for future research as well as for the design and application of new learning technologies

    Quality assessment of spherical microphone array auralizations

    The thesis documents a scientific study on quality assessment and quality prediction in Virtual Acoustic Environments (VAEs) based on spherical microphone array data, using binaural synthesis for reproduction. In the experiments, predictive modeling is applied to estimate the influence of the array on the reproduction quality by relating the data derived in perceptual experiments to the output of an auditory model. The experiments address various aspects of the array considered relevant in auralization applications: the influence of system errors as well as the influence of the array configuration employed. The system errors comprise spatial aliasing, measurement noise, and microphone positioning errors while the array configuration is represented by the sound field order in terms of spherical harmonics, defining the spatial resolution of the array. Based on array simulations, the experimental data comprise free-field sound fields and two shoe-box shaped rooms, one with weak and another with strong reverberation. Ten audio signals served as test material, e.g., orchestral/pop music, male/female singing voice or single instruments such as castanets. In the perceptual experiments, quantitative methods are used to evaluate the impact of system errors while a descriptive analysis assesses the array configuration using two quality factors for attribution: Apparent Source Width (ASW) and Listener Envelopment (LEV). Both are quality measures commonly used in concert hall acoustics to describe the spaciousness of a room. The results from the perceptual experiments are subsequently related to the technical data derived from the auditory model in order to build, train, and evaluate a variety of predictive models. Based on classification and regression approaches, these models are applied and investigated for automated quality assessment in order to identify and categorize system errors as well as to estimate their perceptual strength. 
Moreover, the models make it possible to predict the array’s influence on ASW and LEV perception and enable the classification of further sound field characteristics, like the reflection properties of the simulated room or the sound field order used. The applied prediction models comprise simple linear regression and decision trees, or more complex models such as support vector machines or artificial neural networks. The results show that the developed prediction models perform well in their classification and regression tasks. Although their functionality is limited to the conditions underlying the conducted experiments, they can still provide a useful tool to assess basic quality-related aspects which are important when developing spherical microphone arrays for auralization applications.
The present work deals with quality assessment and quality prediction in virtual acoustic environments, in particular in room simulations based on spherical array data that are auralized by means of binaural synthesis. Various prediction methods are applied to predict the influence of the array on reproduction quality automatically, by relating the data from listening experiments to those of an auditory model. The experiments focus on different, practically relevant aspects of the measurement system that influence reproduction quality. Specifically, these are measurement errors, such as spatial aliasing, noise, or microphone positioning errors, and the configuration of the array. The latter defines the spatial resolution and corresponds to the chosen order of the spherical harmonic decomposition. The experiments are based on spherical array simulations under free-field conditions and in simple simulated rectangular rooms with different reflection properties, one room being dry and the other strongly reflective. Ten test signals that appear relevant in practical applications serve as audio material, e.g., orchestral or pop music, male and female singing, or castanets. In perceptual experiments, the influence of measurement errors is evaluated in a quantitative analysis, and the quality of the synthesis is assessed descriptively using the attributes Apparent Source Width (ASW) and Listener Envelopment (LEV). The resulting data form the basis for quality prediction, with the listening test results serving as observations and the output data of the auditory model as predictors. Different prediction models are trained on these data and their prediction accuracy is subsequently evaluated. The developed models make it possible both to identify and classify measurement errors and to estimate their magnitude. In addition, they make it possible to predict the influence of the array configuration on the perception of ASW and LEV and to identify the order of the sound field decomposition used, as well as the reflection properties of the simulated room. Both simple regression models and decision trees and more complex models, such as support vector machines or neural networks, are used. The developed models generally show high accuracy in quality prediction and thus allow basic array properties to be analysed without having to conduct elaborate listening experiments. Although the applicability of the models is limited to the cases investigated here, they can prove to be helpful tools in the development of spherical arrays for auralization applications

    Modelling, Dimensioning and Optimization of 5G Communication Networks, Resources and Services

    This reprint collects state-of-the-art research contributions that address challenges in the design, dimensioning and optimization of emerging 5G networks. The design, dimensioning and optimization of communication network resources and services have been an inseparable part of telecom network development. Such networks must convey a large volume of traffic, serving traffic streams with highly differentiated requirements in terms of bit rate, service time, and required quality of service and quality of experience parameters. Such a communication infrastructure presents many important challenges, such as the study of necessary multi-layer cooperation, new protocols, performance evaluation of different network parts, low-layer network design, network management and security issues, and new technologies in general, which will be discussed in this book.