    Methods for Focal Plane Array Resolution Estimation Using Random Laser Speckle in Non-paraxial Geometries

    The infrared (IR) imaging community has a need for direct IR detector evaluation due to the continued demand for small pixel pitch detectors, the emergence of strained-layer superlattice devices, and the associated lateral carrier diffusion issues. Conventional laser speckle-based modulation transfer function (MTF) estimation depends on Fresnel propagation and a wide-sense-stationary (WSS) input random process, which limits its use for wavelength-scale (lambda-scale) IR devices. This dissertation develops two alternative methodologies for speckle-based resolution evaluation of IR focal plane arrays (FPAs). Both techniques are formulated using Rayleigh-Sommerfeld electric field propagation, making them valid in the non-paraxial geometries dictated for resolution estimation of lambda-scale devices. The generalized FPA MTF estimation approach numerically evaluates Rayleigh-Sommerfeld speckle irradiance autocorrelation functions (ACFs) to indirectly compute the power spectral density (PSD) of a non-WSS speckle irradiance random process. The experimental error incurred by making WSS assumptions about the associated laser speckle random process is quantified using the Wigner distribution function. This method is experimentally demonstrated on a lambda-scale longwave IR FPA, showing a 27% spatial frequency range improvement over established estimation methodology. Additionally, a resolution estimation technique that uses iterative maximum likelihood estimation and speckle irradiance ACFs to solve for a system impulse response is developed and demonstrated with simulated speckle imagery.

    HMM-Based Speech Synthesis Utilizing Glottal Inverse Filtering


    A versatile pitch tracking algorithm : from human speech to killer whale vocalizations

    Author Posting. © Acoustical Society of America, 2009. This article is posted here by permission of Acoustical Society of America for personal use, not for redistribution. The definitive version was published in Journal of the Acoustical Society of America 126 (2009): 451-459, doi:10.1121/1.3132525. In this article, a pitch tracking algorithm [named discrete logarithmic Fourier transformation-pitch detection algorithm (DLFT-PDA)], originally designed for human telephone speech, was modified for killer whale vocalizations. The multiple frequency components of some of these vocalizations demand a spectral (rather than temporal) approach to pitch tracking. The DLFT-PDA algorithm derives reliable estimations of pitch and the temporal change of pitch from the harmonic structure of the vocal signal. Scores from both estimations are combined in a dynamic programming search to find a smooth pitch track. The algorithm is capable of tracking killer whale calls that contain simultaneous low and high frequency components and compares favorably across most signal-to-noise ratio ranges to the peak-picking and sidewinder algorithms that have been used for tracking killer whale vocalizations previously. C.W. was supported by DARPA under Contract No. N66001-96-C-8526, monitored through Naval Command, Control, and Ocean Surveillance Center and by the National Science Foundation under Grant No. IRI-9618731. A.D.S. was supported by a National Defense Science and Engineering Graduate Fellowship.
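The dynamic programming search mentioned in this abstract can be illustrated with a minimal sketch. This is not the DLFT-PDA implementation: the function name `smooth_pitch_track`, the `(pitch_hz, score)` candidate format, and the `jump_penalty` term (a simple stand-in for the abstract's pitch-change score) are all assumptions made for illustration.

```python
import math

def smooth_pitch_track(candidates, jump_penalty=2.0):
    """Pick one pitch per frame by dynamic programming (Viterbi-style).

    candidates: list of frames; each frame is a non-empty list of
                (pitch_hz, score) pairs. Hypothetical input format.
    jump_penalty: assumed cost per octave of pitch change between frames.
    """
    if not candidates:
        return []
    # best[t][j]: best total score of any path ending at candidate j of frame t
    best = [[score for _, score in candidates[0]]]
    # back[t][j]: index of the predecessor candidate in frame t-1
    back = [[None] * len(candidates[0])]
    for t in range(1, len(candidates)):
        row, ptr = [], []
        for pitch, score in candidates[t]:
            # Score of arriving from each previous candidate, penalising
            # jumps measured on a logarithmic (octave) frequency scale.
            options = [
                best[t - 1][k] - jump_penalty * abs(math.log2(pitch / prev_pitch))
                for k, (prev_pitch, _) in enumerate(candidates[t - 1])
            ]
            k_best = max(range(len(options)), key=options.__getitem__)
            row.append(options[k_best] + score)
            ptr.append(k_best)
        best.append(row)
        back.append(ptr)
    # Backtrack from the best final candidate.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    track = []
    for t in range(len(candidates) - 1, -1, -1):
        track.append(candidates[t][j][0])
        if t > 0:
            j = back[t][j]
    track.reverse()
    return track

# Made-up candidates: the middle frame's top-scoring candidate is an octave error.
frames = [
    [(200, 1.0), (400, 0.9)],
    [(205, 0.8), (410, 1.0)],
    [(210, 1.0), (420, 0.7)],
]
```

With the penalty enabled, the search prefers the continuous track near 200 Hz over the locally higher-scoring octave jump; with `jump_penalty=0` it degenerates to picking the best candidate in each frame independently.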

    Informed algorithms for sound source separation in enclosed reverberant environments

    While humans can separate a sound of interest amidst a cacophony of contending sounds in an echoic environment, machine-based methods lag behind in solving this task. This thesis thus aims at improving the performance of audio separation algorithms when they are informed, i.e., have access to source location information. These locations are assumed to be known a priori in this work, for example by video processing. Initially, a multi-microphone array-based method combined with binary time-frequency masking is proposed. A robust least squares frequency-invariant data-independent beamformer designed with the location information is utilized to estimate the sources. To further enhance the estimated sources, binary time-frequency masking-based post-processing is used, but cepstral domain smoothing is required to mitigate musical noise. To tackle the under-determined case and further improve separation performance at higher reverberation times, a two-microphone method that is inspired by human auditory processing and generates soft time-frequency masks is described. In this approach, interaural level difference, interaural phase difference and mixing vectors are probabilistically modeled in the time-frequency domain, and the model parameters are learned through the expectation-maximization (EM) algorithm. A direction vector is estimated for each source, using the location information, and is used as the mean parameter of the mixing vector model. Soft time-frequency masks are used to reconstruct the sources. A spatial covariance model is then integrated into the probabilistic model framework; it encodes the spatial characteristics of the enclosure and further improves the separation performance in challenging scenarios, i.e., when sources are in close proximity and when the level of reverberation is high.
    Finally, new dereverberation-based pre-processing is proposed, built as a cascade of three dereverberation stages where each enhances the two-microphone reverberant mixture. The dereverberation stages are based on amplitude spectral subtraction, where the late reverberation is estimated and suppressed. The combination of such dereverberation-based pre-processing and soft mask separation yields the best separation performance. All methods are evaluated with real and synthetic mixtures formed, for example, from speech signals from the TIMIT database and measured room impulse responses.
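The difference between the binary and soft time-frequency masks named in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the thesis learns its soft masks with an EM-trained probabilistic model, whereas the sketch uses a plain power-ratio mask, and the function names and the toy 2x2 magnitude grids are hypothetical.

```python
def binary_mask(target_mag, interferer_mag):
    """1.0 where the target estimate dominates a time-frequency bin, else 0.0."""
    return [
        [1.0 if t > i else 0.0 for t, i in zip(t_row, i_row)]
        for t_row, i_row in zip(target_mag, interferer_mag)
    ]

def soft_mask(target_mag, interferer_mag, eps=1e-12):
    """Power-ratio mask in [0, 1]; an assumed stand-in for a learned soft mask."""
    return [
        [t * t / (t * t + i * i + eps) for t, i in zip(t_row, i_row)]
        for t_row, i_row in zip(target_mag, interferer_mag)
    ]

def apply_mask(mask, mixture_mag):
    """Element-wise product of a mask with the mixture's magnitudes."""
    return [
        [m * x for m, x in zip(m_row, x_row)]
        for m_row, x_row in zip(mask, mixture_mag)
    ]

# Toy magnitude grids (rows = frequency bins, columns = time frames); made-up values.
target = [[3.0, 1.0], [0.5, 2.0]]
interferer = [[1.0, 2.0], [0.5, 1.0]]
mixture = [[4.0, 3.0], [1.0, 3.0]]

hard = binary_mask(target, interferer)
soft = soft_mask(target, interferer)
separated = apply_mask(hard, mixture)
```

The binary mask makes an all-or-nothing decision per bin, and that abrupt switching is one common source of musical noise; the soft mask instead attenuates each bin in proportion to the estimated target share.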