149 research outputs found

    Speech Analysis with Bessel Functions

    A signal may be expressed as a linear combination of other functions, called the basis set. This is essentially a model of the waveform or signal. Infinitely many choices are possible for the basis set, the most common being the set of trigonometric functions. This study uses Bessel functions of the first kind as the basis set. The original goal of the research was to develop an automatic speaker recognition scheme based upon the Fourier-Bessel series. However, the difficulty of collecting a high-quality database and certain hardware deficiencies precluded the completion of that goal. It was also found that the theoretical foundation for the use of the Fourier-Bessel series for signal analysis was practically nonexistent. For these reasons, the study was confined to general-purpose speech analysis and to investigation of the computational algorithms required. Electrical Engineerin

    Compression methods for mechanical vibration signals: Application to the plane engines

    International audience. A novel approach for the compression of mechanical vibration signals is presented in this paper. The method relies on a simple and flexible decomposition into a large number of subbands, implemented by an orthogonal transform. Compression is achieved by a uniform adaptive quantization of each subband. The method is tested on a large number of real vibration signals recorded from plane engines. High compression ratios can be achieved while keeping a good quality of the reconstructed signal. It is also shown that compression has little impact on the detection of some commonly encountered defects of the plane engine
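The transform-then-quantize idea in this abstract can be illustrated with a small sketch. Everything specific here is an assumption rather than the paper's method: an orthonormal DCT-II stands in for the unspecified orthogonal transform, subbands are fixed-size groups of coefficients, and the "adaptive" step is simply derived from each band's peak magnitude. The function names are hypothetical.

```python
import math

def dct_orthonormal(x):
    """Orthonormal DCT-II (assumed stand-in for the paper's orthogonal transform)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n)
                               for t in range(n)))
    return out

def idct_orthonormal(c):
    """Inverse of dct_orthonormal (the transpose, since the transform is orthonormal)."""
    n = len(c)
    out = []
    for t in range(n):
        acc = c[0] * math.sqrt(1.0 / n)
        acc += sum(math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (t + 0.5) * k / n)
                   for k in range(1, n))
        out.append(acc)
    return out

def quantize_subbands(coeffs, band_size, levels):
    """Uniformly quantize each subband; the step adapts to the band's peak.

    Assumption: step = 2 * peak / levels, i.e. about `levels` steps
    spanning [-peak, peak]. The paper's adaptation rule is not given.
    """
    quantized = []
    for start in range(0, len(coeffs), band_size):
        band = coeffs[start:start + band_size]
        peak = max(abs(v) for v in band)
        step = 2.0 * peak / levels if peak > 0 else 1.0
        quantized.extend(round(v / step) * step for v in band)
    return quantized

# Toy "vibration" signal: two sinusoids over 32 samples.
signal = [math.sin(2 * math.pi * 3 * t / 32) + 0.3 * math.sin(2 * math.pi * 7 * t / 32)
          for t in range(32)]
coeffs = dct_orthonormal(signal)
quantized = quantize_subbands(coeffs, band_size=8, levels=16)
reconstructed = idct_orthonormal(quantized)
print(max(abs(a - b) for a, b in zip(signal, reconstructed)))
```

Because the transform is orthonormal, the squared quantization error in the coefficient domain equals the squared reconstruction error in the time domain, which is what makes per-subband step sizes a direct control on reconstruction quality.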

    Optimal Representation of Anuran Call Spectrum in Environmental Monitoring Systems Using Wireless Sensor Networks

    The analysis and classification of the sounds produced by certain animal species, notably anurans, have revealed these amphibians to be a potentially strong indicator of temperature fluctuations and therefore of the existence of climate change. Environmental monitoring systems using Wireless Sensor Networks are therefore of interest for obtaining indicators of global warming. For the automatic classification of the sounds recorded on such systems, the proper representation of the sound spectrum is essential, since it contains the information required for cataloguing anuran calls. The present paper focuses on this process of feature extraction by exploring three alternatives: the standardized MPEG-7, the Filter Bank Energy (FBE), and the Mel Frequency Cepstral Coefficients (MFCC). Moreover, various values for every option in the extraction of spectrum features have been considered. Throughout the paper, it is shown that representing the frame spectrum with pure FBE offers slightly worse results than using the MPEG-7 features. This performance can easily be increased, however, by rescaling the FBE along two dimensions: vertically, by taking the logarithm of the energies; and horizontally, by applying mel scaling in the filter banks. On the other hand, representing the spectrum in the cepstral domain, as in MFCC, has shown additional marginal improvements in classification performance. University of Seville: Telefónica Chair "Intelligence Networks
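The two rescalings named in this abstract (logarithm of the energies, mel spacing of the filter bank) can be sketched as follows. This is only an illustration under stated assumptions: the mel formula is the common 2595·log10(1 + f/700) variant, the filters are triangular, and the function names, band count, and frequency range are hypothetical rather than taken from the paper.

```python
import math

def hz_to_mel(f_hz):
    """Common mel formula (an assumption; the paper's variant is not stated)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(n_bands, f_min, f_max):
    """Return n_bands + 2 edge frequencies in Hz, equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]

def log_mel_energies(power_spectrum, bin_freqs, n_bands, f_min, f_max, floor=1e-12):
    """Log filter-bank energies using triangular, mel-spaced filters."""
    edges = mel_band_edges(n_bands, f_min, f_max)
    energies = []
    for b in range(n_bands):
        left, centre, right = edges[b], edges[b + 1], edges[b + 2]
        total = 0.0
        for p, f in zip(power_spectrum, bin_freqs):
            if left < f <= centre:
                total += p * (f - left) / (centre - left)    # rising slope
            elif centre < f < right:
                total += p * (right - f) / (right - centre)  # falling slope
        # Vertical rescaling: log of the band energy (floored to avoid log(0)).
        energies.append(math.log(max(total, floor)))
    return energies

# Toy frame: flat power spectrum sampled every 62.5 Hz up to 4 kHz.
bin_freqs = [62.5 * i for i in range(65)]
power_spectrum = [1.0] * len(bin_freqs)
edges = mel_band_edges(8, 0.0, 4000.0)
features = log_mel_energies(power_spectrum, bin_freqs, 8, 0.0, 4000.0)
print([round(e) for e in edges])
```

Applying a discrete cosine transform to `features` would move the representation into the cepstral domain, which is the extra step that distinguishes MFCC from log mel FBE.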

    Blind deconvolution of medical ultrasound images: parametric inverse filtering approach

    ©2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or distribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder. DOI: 10.1109/TIP.2007.910179. The problem of reconstruction of ultrasound images by means of blind deconvolution has long been recognized as one of the central problems in medical ultrasound imaging. In this paper, this problem is addressed by proposing a blind deconvolution method which is innovative in several ways. In particular, the method is based on parametric inverse filtering, whose parameters are optimized using two-stage processing. At the first stage, some partial information on the point spread function is recovered. Subsequently, this information is used to explicitly constrain the spectral shape of the inverse filter. From this perspective, the proposed methodology can be viewed as a "hybridization" of two standard strategies in blind deconvolution, which are based on either concurrent or successive estimation of the point spread function and the image of interest. Moreover, evidence is provided that the "hybrid" approach can outperform the standard ones in a number of important practical cases. Additionally, the present study introduces a different approach to parameterizing the inverse filter. 
Specifically, we propose to model the inverse transfer function as a member of a principal shift-invariant subspace. It is shown that such a parameterization results in considerably more stable reconstructions as compared to standard parameterization methods. Finally, it is shown how the inverse filters designed in this way can be used to deconvolve the images in a nonblind manner so as to further improve their quality. The usefulness and practicability of all the introduced innovations are proven in a series of both in silico and in vivo experiments. Finally, it is shown that the proposed deconvolution algorithms are capable of improving the resolution of ultrasound images by factors of 2.24 or 6.52 (as judged by the autocorrelation criterion) depending on the type of regularization method used

    Time and Frequency Independent Manipulation of Audio in Real Time

    Analog audio implies time-frequency dependence. With digitally sampled audio, this time-frequency dependence can be broken and either variable can be manipulated independently of the other, in real time. This paper will mostly focus on the frequency domain algorithm called the Phase Vocoder which breaks this time-frequency dependence. We will start by looking at Fourier Theory and the effect of discrete sampling. Then we will look at the Phase Vocoder's theory of operation, as well as improvements made by Puckette, Laroche, and Dolson, to name a few. Through all of this, simple examples will be presented in order to gain intuition into the principles at hand. Towards the end, a time domain approach for time-frequency independence called Granular Synthesis will be explored. We will compare it to the Phase Vocoder, and see how our understanding of one changes how we think and make decisions for the other. Finally we will propose some ideas for further improvement to real-time time-frequency independent manipulation of audio

    Cepstral Processing for GPS Multipath Detection and Mitigation

    This work presents a novel approach to code phase multipath mitigation for Global Positioning System (GPS) receivers. It uses the power and complex cepstra for multipath detection and mitigation prior to code phase tracking by a standard non-coherent delay lock loop. Cepstral theory is presented to demonstrate how multipath reflection delays can be detected through the use of the power cepstrum. Filtering can then be performed on the complex cepstrum to remove multipath effects in the cepstral domain. Finally, an inverse complex cepstrum is calculated, yielding a theoretically multipath-free direct path estimate in the time domain. Simulations are presented to verify the applicability of cepstral techniques to the problem of GPS multipath mitigation. Results show that, under noiseless conditions, cepstral processing prior to code tracking by a standard non-coherent delay lock loop leads to lower code tracking biases than direct tracking of the composite multipath signal by a narrow correlator receiver. Finally, this work shows that cepstral processing is highly sensitive to additive white Gaussian noise effects, leading to the conclusion that methods of limiting noise effects must be developed before this technique will be applicable in actual GPS receivers
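The way a power cepstrum exposes a reflection delay can be shown with a toy example. This is a noiseless sketch under strong assumptions, not GPS receiver code: the direct path is a unit impulse, a single echo arrives 10 samples later at half amplitude, and a plain O(N²) DFT is used to keep the block dependency-free. All names and values are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(N^2)); adequate for a short demo."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def power_cepstrum(x):
    """|IDFT(log |DFT(x)|^2)|^2, the usual power cepstrum definition."""
    log_power = [math.log(abs(v) ** 2) for v in dft(x)]
    return [abs(c) ** 2 for c in dft(log_power, inverse=True)]

def estimate_echo_delay(x, min_lag=1):
    """Return the quefrency (in samples) of the largest cepstral peak.

    Only lags in [min_lag, N/2) are searched, because the power
    cepstrum of a real signal is symmetric about N/2.
    """
    ceps = power_cepstrum(x)
    return max(range(min_lag, len(x) // 2), key=lambda q: ceps[q])

# Toy composite signal: direct path plus one delayed, attenuated reflection.
n, delay, gain = 64, 10, 0.5
signal = [0.0] * n
signal[0] = 1.0
signal[delay] = gain

estimated = estimate_echo_delay(signal)
print("estimated reflection delay (samples):", estimated)
```

The echo multiplies the spectrum by 1 + gain·e^{-jωd}, so the log power spectrum gains a ripple with period tied to the delay d, and that ripple becomes a peak at quefrency d. With additive noise the log step amplifies low-power bins, which is consistent with the noise sensitivity the abstract reports.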

    Glottal-synchronous speech processing

    Glottal-synchronous speech processing is a field of speech science where the pseudoperiodicity of voiced speech is exploited. Traditionally, speech processing involves segmenting and processing short speech frames of predefined length; this may fail to exploit the inherent periodic structure of voiced speech which glottal-synchronous speech frames have the potential to harness. Glottal-synchronous frames are often derived from the glottal closure instants (GCIs) and glottal opening instants (GOIs). The SIGMA algorithm was developed for the detection of GCIs and GOIs from the Electroglottograph signal with a measured accuracy of up to 99.59%. For GCI and GOI detection from speech signals, the YAGA algorithm provides a measured accuracy of up to 99.84%. Multichannel speech-based approaches are shown to be more robust to reverberation than single-channel algorithms. The GCIs are applied to real-world applications including speech dereverberation, where SNR is improved by up to 5 dB, and to prosodic manipulation where the importance of voicing detection in glottal-synchronous algorithms is demonstrated by subjective testing. The GCIs are further exploited in a new area of data-driven speech modelling, providing new insights into speech production and a set of tools to aid deployment into real-world applications. The technique is shown to be applicable in areas of speech coding, identification and artificial bandwidth extension of telephone speec

    Interactive speech-driven facial animation

    One of the fastest developing areas in the entertainment industry is digital animation. Television programmes and movies frequently use 3D animations to enhance or replace actors and scenery. With the increase in computing power, research is also being done to apply these animations in an interactive manner. Two of the biggest obstacles to the success of these undertakings are control (manipulating the models) and realism. This text describes many of the ways to improve control and realism, in such a way that interactive animation becomes possible. Specifically, lip-synchronisation (driven by human speech) and various modeling and rendering techniques are discussed. A prototype showing that interactive animation is feasible is also described. Mr. A. Hardy, Prof. S. von Solm

    Continuity of object tracking

    2022 Spring. Includes bibliographical references. The demand for object tracking (OT) applications has been increasing for the past few decades in many areas of interest: security, surveillance, intelligence gathering, and reconnaissance. Lately, newly defined requirements for unmanned vehicles have enhanced the interest in OT. Advancements in machine learning, data analytics, and deep learning have facilitated the recognition and tracking of objects of interest; however, continuous tracking is currently a problem of interest to many research projects. This dissertation presents a system implementing a means to continuously track an object and predict its trajectory based on its previous pathway, even when the object is partially or fully concealed for a period of time. The system is divided into two phases: the first phase exploits a single fixed camera system and the second phase is composed of a mesh of multiple fixed cameras. The first phase system is composed of six main subsystems: Image Processing, Detection Algorithm, Image Subtractor, Image Tracking, Tracking Predictor, and the Feedback Analyzer. The second phase of the system adds two main subsystems: Coordination Manager and Camera Controller Manager. Combined, these systems allow for reasonable object continuity in the face of object concealment