189 research outputs found

    Pitches of simultaneous complex tones

    Speech Denoising Using Non-Negative Matrix Factorization with Kullback-Leibler Divergence and Sparseness Constraints

    Proceedings of: IberSPEECH 2012 Conference, Madrid, Spain, November 21-23, 2012. A speech denoising method based on Non-Negative Matrix Factorization (NMF) is presented in this paper. With respect to previous related works, this paper makes two contributions. First, our method does not assume a priori knowledge about the nature of the noise. Second, it combines the use of the Kullback-Leibler divergence with sparseness constraints on the activation matrix, improving on the performance of similar techniques that minimize the Euclidean distance and/or do not consider any sparsification. We evaluate the proposed method for both speech enhancement and automatic speech recognition tasks, and compare it to conventional spectral subtraction, showing improvements in speech quality and recognition accuracy, respectively, for different noisy conditions. This work has been partially supported by the Spanish Government grants TSI-020110-2009-103 and TEC2011-26807.
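
    The combination described in this abstract (a Kullback-Leibler objective plus a sparseness term on the activation matrix) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard multiplicative-update rules for KL-NMF and an L1 penalty on the activations as the sparseness constraint, which the abstract does not specify. The names `sparse_kl_nmf`, `kl_divergence`, and `sparsity` are hypothetical.

```python
import numpy as np


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalised Kullback-Leibler divergence D(V || V_hat)."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def sparse_kl_nmf(V, rank, sparsity=0.1, n_iter=200, seed=0, eps=1e-12):
    """Approximate a non-negative matrix V (e.g. a magnitude spectrogram,
    frequency x frames) as W @ H.

    Assumed objective (illustrative, not taken from the paper):
        D_KL(V || W @ H) + sparsity * sum(H)
    W holds the basis spectra, H the activations.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    ones = np.ones_like(V)

    for _ in range(n_iter):
        # Activation update: the L1 penalty shows up as an extra
        # `sparsity` term in the denominator, shrinking H towards zero.
        ratio = V / (W @ H + eps)
        H *= (W.T @ ratio) / (W.T @ ones + sparsity + eps)

        # Basis update (plain KL multiplicative rule).
        ratio = V / (W @ H + eps)
        W *= (ratio @ H.T) / (ones @ H.T + eps)

        # Normalise each basis column to unit sum and move the scale into H,
        # so the penalty cannot be sidestepped by growing W and shrinking H.
        # This rescaling is a common heuristic; W @ H is left unchanged.
        scale = W.sum(axis=0) + eps
        W /= scale
        H *= scale[:, None]

    return W, H
```

    In a denoising setting one would typically factor the noisy spectrogram and rebuild the speech estimate from the speech-related components only; how those components are identified without prior knowledge of the noise is specific to the paper and is not reproduced here.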

    The Perceptibility of Pitches in Two Simultaneous Complex Tones (De waarneembaarheid van toonhoogten in twee simultane complexe tonen)

    The Calogero-Sutherland Model and Polynomials with Prescribed Symmetry

    The Schrödinger operators with exchange terms for certain Calogero-Sutherland quantum many-body systems have eigenfunctions which factor into the symmetric ground state and a multivariable polynomial. The polynomial can be chosen to have a prescribed symmetry (i.e. be symmetric or antisymmetric) with respect to the interchange of some specified variables. For four particular Calogero-Sutherland systems we construct an eigenoperator for these polynomials which separates the eigenvalues and establishes orthogonality. In two of the cases this involves identifying new operators which commute with the corresponding Schrödinger operators. In each case we express a particular class of the polynomials with prescribed symmetry in a factored form involving the corresponding symmetric polynomials. Comment: LaTeX 2.09, 31 pages

    Harnessing AI for Speech Reconstruction using Multi-view Silent Video Feed

    Speechreading, or lipreading, is the technique of understanding and extracting phonetic features from a speaker's visual features, such as the movement of the lips, face, teeth and tongue. It has a wide range of multimedia applications, such as surveillance, Internet telephony, and aids for people with hearing impairments. However, most of the work in speechreading has been limited to text generation from silent videos. Recently, research has started venturing into generating (audio) speech from silent video sequences, but there have been no developments thus far in dealing with divergent views and poses of a speaker. Thus, although multiple camera feeds of a speaker are often available, they have not yet been used to deal with different poses. To this end, this paper presents the world's first ever multi-view speech reading and reconstruction system. This work encompasses the boundaries of multimedia research by putting forth a model which leverages silent video feeds from multiple cameras recording the same subject to generate intelligent speech for a speaker. Initial results confirm the usefulness of exploiting multiple camera views in building an efficient speech reading and reconstruction system. It further shows the optimal placement of cameras which would lead to the maximum intelligibility of speech. Next, it lays out various innovative applications for the proposed system, focusing on its potential prodigious impact not just in the security arena but in many other multimedia analytics problems. Comment: 2018 ACM Multimedia Conference (MM '18), October 22-26, 2018, Seoul, Republic of Korea

    Moralizing Postcolonial Consumer Society: Fair Trade in the Netherlands, 1964-1997

    Decolonization challenged people across the globe to define their place in a new postcolonial order. This challenge was felt in international political and economic affairs, but it also affected daily lives across the globe. The history of fair trade activism as seen from the Netherlands highlights how citizens in the North struggled to position themselves in a postcolonial consumer society. Interventions by fair trade activists connected debates about the morals of their society to the consequences of decolonization. They reacted to the imbalances of the global market in the wake of decolonization, joining critics from the South in demanding more equitable global relations. It was around this issue of "fair trade" that a transnational coalition of moderate and more radical activists emerged after the 1960s. This coalition held widely dissimilar views regarding the politics of the left and the use of consumer activism. The analysis of their interventions demonstrates that during the postwar era attempts at transforming the global market were inextricably interwoven with visions of a postcolonial order.

    New single-ended objective measure for non-intrusive speech quality evaluation

    This article proposes a new output-based method for non-intrusive assessment of speech quality of voice communication systems and evaluates its performance. The method requires access to the processed (degraded) speech only, and is based on measuring perception-motivated objective auditory distances between the voiced parts of the output speech and appropriately matching references extracted from a pre-formulated codebook. The codebook is formed by optimally clustering a large number of parametric speech vectors extracted from a database of clean speech records. The auditory distances are then mapped into objective Mean Opinion listening quality scores. An efficient data-mining tool known as the self-organizing map (SOM) achieves the required clustering and mapping/reference matching processes. In order to obtain a perception-based, speaker-independent parametric representation of the speech, three domain transformation techniques have been investigated. The first technique is based on a perceptual linear prediction (PLP) model, the second utilises a Bark spectrum (BS) analysis and the third utilises mel-frequency cepstrum coefficients (MFCC). Reported evaluation results show that the proposed method provides high correlation with subjective listening quality scores, yielding accuracy similar to that of the ITU-T P.563 while maintaining a relatively low computational complexity. Results also demonstrate that the method outperforms the PESQ in a number of distortion conditions, such as those of speech degraded by channel impairments.

    Formal Analysis of Linear Control Systems using Theorem Proving

    Control systems are an integral part of almost every engineering and physical system, and thus their accurate analysis is of utmost importance. Traditionally, control systems are analyzed using paper-and-pencil proof and computer simulation methods; however, neither of these methods can provide accurate analysis due to their inherent limitations. Model checking has been widely used to analyze control systems, but the continuous nature of their environment and physical components cannot be truly captured by a state-transition system in this technique. To overcome these limitations, we propose to use higher-order-logic theorem proving for analyzing linear control systems based on a formalized theory of the Laplace transform method. For this purpose, we have formalized the foundations of linear control system analysis in higher-order logic so that a linear control system can be readily modeled and analyzed. The paper presents a new formalization of the Laplace transform and the formal verification of its properties that are frequently used in the transfer function based analysis to judge the frequency response, gain margin and phase margin, and stability of a linear control system. We also formalize the active realizations of various controllers, like Proportional-Integral-Derivative (PID), Proportional-Integral (PI), Proportional-Derivative (PD), and various active and passive compensators, like lead, lag and lag-lead. For illustration, we present a formal analysis of an unmanned free-swimming submersible vehicle using the HOL Light theorem prover. Comment: International Conference on Formal Engineering Methods

    A Brave New Internet: Hacking the Narrative of Mark Zuckerberg’s 2021 Introduction of the Metaverse

    We are entering an era of ‘techlash’: increasing unease with the hold of large technology companies over our lives, driven by fatalistic feelings of loss of agency. Neither attempts by these companies to address such concerns, such as appointing ethical committees and ombudsmen, nor grassroots initiatives aimed at user empowerment, seem effective in addressing this. This context remains unacknowledged in Mark Zuckerberg’s introduction of the Metaverse on 28 October 2021. We will show, however, that it is still implicitly addressed through its narrative. A far-reaching transformation of the way in which we use the internet is presented as desirable and inescapable, employing an epic narrative mode which values constancy of the individual and their mastery over their surroundings. However, this future is shaped by Zuckerberg and his company: promising agency for all, it is remarkable how little agency is given to the user. We juxtapose this smooth future vision with a counternarrative using the same narrative building stones, but told in a narrative mode distributing agency more equally. Thus, we engage in strategic analysis, exploring how to resist narratives such as the Metaverse’s. We call this method hacking the narrative.