1,361 research outputs found

    Frequency-warped autoregressive modeling and filtering

    Get PDF
    This thesis consists of an introduction and nine articles. The articles are related to the application of frequency-warping techniques to audio signal processing, and in particular, predictive coding of wideband audio signals. The introduction reviews the literature and summarizes the results of the articles. Frequency-warping, or simply warping techniques are based on a modification of a conventional signal processing system so that the inherent frequency representation in the system is changed. It is demonstrated that this may be done for basically all traditional signal processing algorithms. In audio applications it is beneficial to modify the system so that the new frequency representation is close to that of human hearing. One of the articles is a tutorial paper on the use of warping techniques in audio applications. Majority of the articles studies warped linear prediction, WLP, and its use in wideband audio coding. It is proposed that warped linear prediction would be particularly attractive method for low-delay wideband audio coding. Warping techniques are also applied to various modifications of classical linear predictive coding techniques. This was made possible partly by the introduction of a class of new implementation techniques for recursive filters in one of the articles. The proposed implementation algorithm for recursive filters having delay-free loops is a generic technique. This inspired to write an article which introduces a generalized warped linear predictive coding scheme. One example of the generalized approach is a linear predictive algorithm using almost logarithmic frequency representation.reviewe

    Graph Spectral Image Processing

    Full text link
    Recent advent of graph signal processing (GSP) has spurred intensive studies of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Though a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph, and apply GSP tools for processing and analysis of the signal in graph spectral domain. In this article, we overview recent graph spectral techniques in GSP specifically for image / video processing. The topics covered include image compression, image restoration, image filtering and image segmentation

    Vocal Tract Length Perturbation for Text-Dependent Speaker Verification with Autoregressive Prediction Coding

    Full text link
    In this letter, we propose a vocal tract length (VTL) perturbation method for text-dependent speaker verification (TD-SV), in which a set of TD-SV systems are trained, one for each VTL factor, and score-level fusion is applied to make a final decision. Next, we explore the bottleneck (BN) feature extracted by training deep neural networks with a self-supervised objective, autoregressive predictive coding (APC), for TD-SV and compare it with the well-studied speaker-discriminant BN feature. The proposed VTL method is then applied to APC and speaker-discriminant BN features. In the end, we combine the VTL perturbation systems trained on MFCC and the two BN features in the score domain. Experiments are performed on the RedDots challenge 2016 database of TD-SV using short utterances with Gaussian mixture model-universal background model and i-vector techniques. Results show the proposed methods significantly outperform the baselines.Comment: Copyright (c) 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other work

    Stereo linear predictive coding of audio

    Get PDF

    Comparative study of voice print Based acoustic features: MFCC and LPCC

    Full text link
    Voice is the best biometric feature for investigation and authentication. It has both biological and behavioural features. The acoustic features are related to the voice. The Speaker Recognition System is designed for the automatic authentication of speaker's identity which is truly based on the human's voice. Mel Frequency Cepstrum coefficient (MFCC) and Linear Prediction Cepstrum coefficient (LPCC) are taken in use for feature extraction from the provided voice sample. This paper provides a comparative study of MFCC and LPCC based on the accuracy of results and their working methodology. The results are better if MFCC is used for feature extraction

    Exploring space situational awareness using neuromorphic event-based cameras

    Get PDF
    The orbits around earth are a limited natural resource and one that hosts a vast range of vital space-based systems that support international systems use by both commercial industries, civil organisations, and national defence. The availability of this space resource is rapidly depleting due to the ever-growing presence of space debris and rampant overcrowding, especially in the limited and highly desirable slots in geosynchronous orbit. The field of Space Situational Awareness encompasses tasks aimed at mitigating these hazards to on-orbit systems through the monitoring of satellite traffic. Essential to this task is the collection of accurate and timely observation data. This thesis explores the use of a novel sensor paradigm to optically collect and process sensor data to enhance and improve space situational awareness tasks. Solving this issue is critical to ensure that we can continue to utilise the space environment in a sustainable way. However, these tasks pose significant engineering challenges that involve the detection and characterisation of faint, highly distant, and high-speed targets. Recent advances in neuromorphic engineering have led to the availability of high-quality neuromorphic event-based cameras that provide a promising alternative to the conventional cameras used in space imaging. These cameras offer the potential to improve the capabilities of existing space tracking systems and have been shown to detect and track satellites or ‘Resident Space Objects’ at low data rates, high temporal resolutions, and in conditions typically unsuitable for conventional optical cameras. This thesis presents a thorough exploration of neuromorphic event-based cameras for space situational awareness tasks and establishes a rigorous foundation for event-based space imaging. The work conducted in this project demonstrates how to enable event-based space imaging systems that serve the goals of space situational awareness by providing accurate and timely information on the space domain. By developing and implementing event-based processing techniques, the asynchronous operation, high temporal resolution, and dynamic range of these novel sensors are leveraged to provide low latency target acquisition and rapid reaction to challenging satellite tracking scenarios. The algorithms and experiments developed in this thesis successfully study the properties and trade-offs of event-based space imaging and provide comparisons with traditional observing methods and conventional frame-based sensors. The outcomes of this thesis demonstrate the viability of event-based cameras for use in tracking and space imaging tasks and therefore contribute to the growing efforts of the international space situational awareness community and the development of the event-based technology in astronomy and space science applications
    • …
    corecore