641 research outputs found

    Statistical models for natural sounds

    Get PDF
    It is important to understand the rich structure of natural sounds in order to solve important tasks, like automatic speech recognition, and to understand auditory processing in the brain. This thesis takes a step in this direction by characterising the statistics of simple natural sounds. We focus on the statistics because perception often appears to depend on them, rather than on the raw waveform. For example the perception of auditory textures, like running water, wind, fire and rain, depends on summary-statistics, like the rate of falling rain droplets, rather than on the exact details of the physical source. In order to analyse the statistics of sounds accurately it is necessary to improve a number of traditional signal processing methods, including those for amplitude demodulation, time-frequency analysis, and sub-band demodulation. These estimation tasks are ill-posed and therefore it is natural to treat them as Bayesian inference problems. The new probabilistic versions of these methods have several advantages. For example, they perform more accurately on natural signals and are more robust to noise, they can also fill-in missing sections of data, and provide error-bars. Furthermore, free-parameters can be learned from the signal. Using these new algorithms we demonstrate that the energy, sparsity, modulation depth and modulation time-scale in each sub-band of a signal are critical statistics, together with the dependencies between the sub-band modulators. In order to validate this claim, a model containing co-modulated coloured noise carriers is shown to be capable of generating a range of realistic sounding auditory textures. Finally, we explored the connection between the statistics of natural sounds and perception. We demonstrate that inference in the model for auditory textures qualitatively replicates the primitive grouping rules that listeners use to understand simple acoustic scenes. This suggests that the auditory system is optimised for the statistics of natural sounds

    A generative model for natural sounds based on latent force modelling

    Get PDF
    Generative models based on subband amplitude envelopes of natural sounds have resulted in convincing synthesis, showing subband amplitude modulation to be a crucial component of auditory perception. Probabilistic latent variable analysis can be particularly insightful, but existing approaches don’t incorporate prior knowledge about the physical behaviour of amplitude envelopes, such as exponential decay or feedback. We use latent force modelling, a probabilistic learning paradigm that encodes physical knowledge into Gaussian process regression, to model correlation across spectral subband envelopes. We augment the standard latent force model approach by explicitly modelling dependencies across multiple time steps. Incorporating this prior knowledge strengthens the interpretation of the latent functions as the source that generated the signal. We examine this interpretation via an experiment showing that sounds generated by sampling from our probabilistic model are perceived to be more realistic than those generated by comparative models based on nonnegative matrix factorisation, even in cases where our model is outperformed from a reconstruction error perspective

    Probabilistic generative modeling of speech

    Get PDF
    Speech processing refers to a set of tasks that involve speech analysis and synthesis. Most speech processing algorithms model a subset of speech parameters of interest and blur the rest using signal processing techniques and feature extraction. However, evidence shows that many speech parameters can be more accurately estimated if they are modeled jointly; speech synthesis also benefits from joint modeling. This thesis proposes a probabilistic generative model for speech called the Probabilistic Acoustic Tube (PAT). The highlights of the model are threefold. First, it is among the very first works to build a complete probabilistic model for speech. Second, it has a well-designed model for the phase spectrum of speech, which has been hard to model and often neglected. Third, it models the AM-FM effects in speech, which are perceptually significant but often ignored in frame-based speech processing algorithms. Experiment shows that the proposed model has good potential for a number of speech processing tasks

    Impaired extraction of speech rhythm from temporal modulation patterns in speech in developmental dyslexia

    Get PDF
    Dyslexia is associated with impaired neural representation of the sound structure of words (phonology). The “phonological deficit” in dyslexia may arise in part from impaired speech rhythm perception, thought to depend on neural oscillatory phase-locking to slow amplitude modulation (AM) patterns in the speech envelope. Speech contains AM patterns at multiple temporal rates, and these different AM rates are associated with phonological units of different grain sizes, e.g., related to stress, syllables or phonemes. Here, we assess the ability of adults with dyslexia to use speech AMs to identify rhythm patterns (RPs). We study 3 important temporal rates: “Stress” (~2 Hz), “Syllable” (~4 Hz) and “Sub-beat” (reduced syllables, ~14 Hz). 21 dyslexics and 21 controls listened to nursery rhyme sentences that had been tone-vocoded using either single AM rates from the speech envelope (Stress only, Syllable only, Sub-beat only) or pairs of AM rates (Stress + Syllable, Syllable + Sub-beat). They were asked to use the acoustic rhythm of the stimulus to identity the original nursery rhyme sentence. The data showed that dyslexics were significantly poorer at detecting rhythm compared to controls when they had to utilize multi-rate temporal information from pairs of AMs (Stress + Syllable or Syllable + Sub-beat). These data suggest that dyslexia is associated with a reduced ability to utilize AMs <20 Hz for rhythm recognition. This perceptual deficit in utilizing AM patterns in speech could be underpinned by less efficient neuronal phase alignment and cross-frequency neuronal oscillatory synchronization in dyslexia. Dyslexics' perceptual difficulties in capturing the full spectro-temporal complexity of speech over multiple timescales could contribute to the development of impaired phonological representations for words, the cognitive hallmark of dyslexia across languages

    Doppler Radar Techniques for Distinct Respiratory Pattern Recognition and Subject Identification.

    Get PDF
    Ph.D. Thesis. University of Hawaiʻi at Mānoa 2017

    On-line health monitoring of passive electronic components using digitally controlled power converter

    Get PDF
    This thesis presents System Identification based On-Line Health Monitoring to analyse the dynamic behaviour of the Switch-Mode Power Converter (SMPC), detect, and diagnose anomalies in passive electronic components. The anomaly detection in this research is determined by examining the change in passive component values due to degradation. Degradation, which is a long-term process, however, is characterised by inserting different component values in the power converter. The novel health-monitoring capability enables accurate detection of passive electronic components despite component variations and uncertainties and is valid for different topologies of the switch-mode power converter. The need for a novel on-line health-monitoring capability is driven by the need to improve unscheduled in-service, logistics, and engineering costs, including the requirement of Integrated Vehicle Health Management (IVHM) for electronic systems and components. The detection and diagnosis of degradations and failures within power converters is of great importance for aircraft electronic manufacturers, such as Thales, where component failures result in equipment downtime and large maintenance costs. The fact that existing techniques, including built-in-self test, use of dedicated sensors, physics-of-failure, and data-driven based health-monitoring, have yet to deliver extensive application in IVHM, provides the motivation for this research ... [cont.]

    An Overview on Application of Machine Learning Techniques in Optical Networks

    Get PDF
    Today's telecommunication networks have become sources of enormous amounts of widely heterogeneous data. This information can be retrieved from network traffic traces, network alarms, signal quality indicators, users' behavioral data, etc. Advanced mathematical tools are required to extract meaningful information from these data and take decisions pertaining to the proper functioning of the networks from the network-generated data. Among these mathematical tools, Machine Learning (ML) is regarded as one of the most promising methodological approaches to perform network-data analysis and enable automated network self-configuration and fault management. The adoption of ML techniques in the field of optical communication networks is motivated by the unprecedented growth of network complexity faced by optical networks in the last few years. Such complexity increase is due to the introduction of a huge number of adjustable and interdependent system parameters (e.g., routing configurations, modulation format, symbol rate, coding schemes, etc.) that are enabled by the usage of coherent transmission/reception technologies, advanced digital signal processing and compensation of nonlinear effects in optical fiber propagation. In this paper we provide an overview of the application of ML to optical communications and networking. We classify and survey relevant literature dealing with the topic, and we also provide an introductory tutorial on ML for researchers and practitioners interested in this field. Although a good number of research papers have recently appeared, the application of ML to optical networks is still in its infancy: to stimulate further work in this area, we conclude the paper proposing new possible research directions

    Extending Critical Infrastructure Element Longevity using Constellation-based ID Verification

    Get PDF
    This work supports a technical cradle-to-grave protection strategy aimed at extending the useful lifespan of Critical Infrastructure (CI) elements. This is done by improving mid-life operational protection measures through integration of reliable physical (PHY) layer security mechanisms. The goal is to improve existing protection that is heavily reliant on higher-layer mechanisms that are commonly targeted by cyberattack. Relative to prior device ID discrimination works, results herein reinforce the exploitability of constellation-based PHY layer features and the ability for those features to be practically implemented to enhance CI security. Prior work is extended by formalizing a device ID verification process that enables rogue device detection demonstration under physical access attack conditions that include unauthorized devices mimicking bit-level credentials of authorized network devices. The work transitions from distance-based to probability-based measures of similarity derived from empirical Multivariate Normal Probability Density Function (MVNPDF) statistics of multiple discriminant analysis radio frequency fingerprint projections. Demonstration results for Constellation-Based Distinct Native Attribute (CB-DNA) fingerprinting of WirelessHART adapters from two manufacturers includes 1) average cross-class percent correct classification of %C \u3e 90% across 28 different networks comprised of six authorized devices, and 2) average rogue rejection rate of 83.4% ≤ RRR ≤ 99.9% based on two held-out devices serving as attacking rogue devices for each network (a total of 120 individual rogue attacks). Using the MVNPDF measure proved most effective and yielded nearly 12% RRR improvement over a Euclidean distance measure
    corecore