Combination strategy based on relative performance monitoring for multi-stream reverberant speech recognition
A multi-stream framework with deep neural network (DNN) classifiers is applied to improve automatic speech recognition (ASR) in environments with different reverberation characteristics. We propose a room parameter estimation model to establish a reliable combination strategy which operates on either DNN posterior probabilities or word lattices. The model is implemented by training a multilayer perceptron incorporating auditory-inspired features in order to distinguish between and generalize to various reverberant conditions, and the model output is shown to be highly correlated with ASR performance between multiple streams, i.e., relative performance monitoring, in contrast to conventional mean temporal distance based performance monitoring for a single stream. Compared to traditional multi-condition training, average relative word error rate improvements of 7.7% and 9.4% have been achieved by the proposed combination strategies operating on posteriors and lattices, respectively, when the multi-stream ASR is tested in known and unknown simulated reverberant environments as well as realistically recorded conditions taken from the REVERB Challenge evaluation set.
Do current-density nonlinearities cut off the glass transition?
Extended mode coupling theories for dense fluids predict that nonlinear
current-density couplings cut off the singular `ideal glass transition',
present in the standard mode coupling theory where such couplings are ignored.
We suggest here that, rather than allowing for activated processes as sometimes
supposed, contributions from current-density couplings are always negligible
close to a glass transition. We discuss in schematic terms how activated
processes can nonetheless cut off the transition, by causing the memory
function to become linear in correlators at late times.
Instrumental and perceptual evaluation of dereverberation techniques based on robust acoustic multichannel equalization
Speech signals recorded in an enclosed space by microphones at a distance from the speaker are often corrupted by reverberation, which arises from the superposition of many delayed and attenuated copies of the source signal. Because reverberation degrades the signal, removing reverberation would enhance quality. Dereverberation techniques based on acoustic multichannel equalization are known to be sensitive to room impulse response perturbations. In order to increase robustness, several methods have been proposed, for example, using a shorter reshaping filter length, incorporating regularization, or applying a sparsity-promoting penalty function. This paper focuses on evaluating the performance of these methods for single-source multi-microphone scenarios, using instrumental performance measures as well as subjective listening tests. By analyzing the correlation between the instrumental and the perceptual results, it is shown that signal-based performance measures are better suited than channel-based performance measures for evaluating the perceptual speech quality of signals that were dereverberated by equalization techniques. Furthermore, this analysis also demonstrates the need to develop more reliable instrumental performance measures.
Measuring, modelling and predicting perceived reverberation
This paper investigates the relationship between the perceived level of reverberation and parameters measured from the room impulse response (RIR), as well as the design of an instrumental measure that predicts this perceived level. We first present the results of an experimental listening test conducted to assess the level of perceived reverberation in speech captured by a single microphone, before analysing the gathered data to assess the influence of parameters such as the reverberation time (T60) or the direct-to-reverberant ratio (DRR). Secondly, we use the results of this analysis to improve the signal-based reverberation decay tail (RDT) measure, previously proposed by the authors to predict the perceived level of reverberation. The accuracy of the proposed measure is evaluated in terms of correlation with the subjective scores and compared to the performance of predictors using parameters extracted from the RIR. Results show that the proposed modifications to the RDT do improve its accuracy. Though still slightly outperformed by measures based on parameters of the RIR, we believe the proposed measure to be useful in scenarios in which the RIR or its parameters are unknown.
Exploring auditory-inspired acoustic features for room acoustic parameter estimation from monaural speech
Room acoustic parameters that characterize acoustic environments can help to improve signal enhancement algorithms, such as dereverberation, or automatic speech recognition by adapting models to the current parameter set. The reverberation time (RT) and the early-to-late reverberation ratio (ELR) are two key parameters. In this paper, we propose a blind ROom Parameter Estimator (ROPE) based on an artificial neural network that learns the mapping to discrete ranges of the RT and the ELR from single-microphone speech signals. Auditory-inspired acoustic features are used as neural network input, which are generated by a temporal modulation filter bank applied to the speech time-frequency representation. ROPE performance is analyzed in various reverberant environments in both clean and noisy conditions for both fullband and subband RT and ELR estimations. The importance of specific temporal modulation frequencies is analyzed by evaluating the contribution of individual filters to the ROPE performance. Experimental results show that ROPE is robust against different variations caused by room impulse responses (measured versus simulated), mismatched noise levels, and speech variability reflected through different corpora. Compared to state-of-the-art algorithms that were tested in the acoustic characterisation of environments (ACE) challenge, the ROPE model is the only one that is among the best for all individual tasks (RT and ELR estimation from fullband and subband signals). Improved fullband estimations are even obtained by ROPE when integrating speech-related frequency subbands. Furthermore, the model requires the least computational resources, with a real-time factor that is at least two times faster than competing algorithms. Results are achieved with an average observation window of 3 s, which is important for real-time applications.
Joint estimation of reverberation time and early-to-late reverberation ratio from single-channel speech signals
The reverberation time (RT) and the early-to-late reverberation ratio (ELR) are two key parameters commonly used to characterize acoustic room environments. In contrast to conventional blind estimation methods that process the two parameters separately, we propose a model for joint estimation to predict the RT and the ELR simultaneously from single-channel speech signals from either full-band or sub-band frequency data, which is referred to as joint room parameter estimator (jROPE). An artificial neural network is employed to learn the mapping from acoustic observations to the RT and the ELR classes. Auditory-inspired acoustic features obtained by temporal modulation filtering of the speech time-frequency representations are used as input for the neural network. Based on an in-depth analysis of the dependency between the RT and the ELR, a two-dimensional (RT, ELR) distribution with constrained boundaries is derived, which is then exploited to evaluate four different configurations for jROPE. Experimental results show that, in comparison to the single-task ROPE system, which individually estimates the RT or the ELR, jROPE provides improved results for both tasks in various reverberant and (diffuse) noisy environments. Among the four proposed joint types, the one incorporating multi-task learning with shared input and hidden layers yields the best estimation accuracies on average. When encountering extreme reverberant conditions with RTs and ELRs lying beyond the derived (RT, ELR) distribution, the type considering RT and ELR as a joint parameter performs particularly robustly. Among state-of-the-art algorithms that were tested in the acoustic characterization of environments challenge, jROPE achieves comparable results among the best for all individual tasks (RT and ELR estimation from full-band and sub-band signals).
Mind the Gap: Autonomous Systems, the Responsibility Gap, and Moral Entanglement
When a computer system causes harm, who is responsible? This question has renewed significance given the proliferation of autonomous systems enabled by modern artificial intelligence techniques. At the root of this problem is a philosophical difficulty known in the literature as the responsibility gap. That is to say, because of the causal distance between the designers of autonomous systems and the eventual outcomes of those systems, the dilution of agency within the large and complex teams that design autonomous systems, and the impossibility of fully predicting how autonomous systems will behave once deployed, it is unclear at a conceptual level who is morally responsible for harms caused by autonomous systems. I review past work on this topic, criticizing prior works for suggesting workarounds rather than philosophical answers to the conceptual problem presented by the responsibility gap. The view I develop, drawing on my earlier work on vicarious moral responsibility, explains why computing professionals are ethically required to take responsibility for the systems they design, despite not being blameworthy for the harms these systems may cause.
The Concept of a University: Theory, Practice, and Society
Current disputes over the nature and purpose of the university are rooted in a philosophical divide between theory and practice. Academics often defend the concept of a university devoted to purely theoretical activities. Politicians and wider society tend to argue that the university should take on more practical concerns. I critique two typical defenses of the theoretical concept—one historical and one based on the value of pure research—and show that neither the theoretical nor the practical concept of a university accommodates all the important goals expected of university research and teaching. Using the classical pragmatist argument against a sharp division between theory and practice, I show how we can move beyond the debate between the theoretical and practical concepts of a university, while maintaining a place for pure and applied research, liberal and vocational education, and social impact through both economic applications and criticism aimed at promoting social justice
Conceptual Responsibility
This thesis concerns our moral and epistemic responsibilities regarding our concepts. I argue that certain concepts can be morally, epistemically, or socially problematic. This is particularly concerning with regard to our concepts of social kinds, which may have both descriptive and evaluative aspects. Being ignorant of certain concepts, or possessing mistaken conceptions, can be problematic for similar reasons, and contributes to various forms of epistemic injustice. I defend an expanded view of a type of epistemic injustice known as ‘hermeneutical injustice’, where widespread conceptual ignorance puts members of marginalized groups at risk of their distinctive and important experiences lacking intelligible interpretations. Together, I call the use of problematic concepts or the ignorance of appropriate concepts ‘conceptual incapacities’. I discuss the conditions under which we may be responsible for our conceptual incapacities on several major theories of responsibility, developing my own account of responsibility in the process, according to which we are responsible for something just in case it was caused by one of our reasons-responsive constitutive psychological traits. However, I argue that regardless of whether we are responsible for something, we may still be required to take responsibility for it. Whether or not we are responsible for our conceptual incapacities, we are required to reflect critically upon them in a variety of scenarios that throw our use of those concepts into question. I consider the method of conceptual engineering (the philosophical critique and revision of concepts) as one way we might take responsibility for our concepts, or at least, defer that duty to experts. But this top-down model of conceptual revision is insufficient. Using a pragmatist model of the social epistemology of morality, I argue that conceptual inquiry is a social endeavour in which we are all required to participate, to some degree.