1,888 research outputs found

    Image information restoration based on long-range correlation

    2001-2002 > Academic research: refereed > Publication in refereed journal. Version of Record. Published.

    Audio Inpainting

    (c) 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Published version: IEEE Transactions on Audio, Speech and Language Processing 20(3): 922-932, Mar 2012. DOI: 10.1109/TASL.2011.2168211

    New methods for robust speech recognition

    Ankara: Department of Electrical and Electronics Engineering and the Institute of Engineering and Science of Bilkent University, 1995. Thesis (Ph.D.) -- Bilkent University, 1995. Includes bibliographical references (leaves 86-92). New methods of feature extraction, end-point detection, and speech enhancement are developed for a robust speech recognition system. The methods of feature extraction and end-point detection are based on wavelet analysis or subband analysis of the speech signal. Two new sets of speech feature parameters, SUBLSFs and SUBCEPs, are introduced. Both parameter sets are based on subband analysis. The SUBLSF feature parameters are obtained via linear predictive analysis on subbands. These speech feature parameters can produce better results than the full-band parameters when the noise is colored. The SUBCEP parameters are based on wavelet analysis or, equivalently, the multirate subband analysis of the speech signal. The SUBCEP parameters also provide robust recognition performance by appropriately deemphasizing the frequency bands corrupted by noise. It is experimentally observed that the feature parameters based on subband analysis are more robust than the commonly used parameters based on full-band analysis in the presence of car noise. The α-stable random processes can be used to model the impulsive nature of public network telecommunication noise. Adaptive filtering methods are developed for α-stable random processes. Adaptive noise cancellation techniques are used to reduce the mismatch between training and testing conditions of the recognition system over telephone lines. Another important problem in isolated speech recognition is to determine the boundaries of the speech utterances or words. Precise boundary detection of utterances improves the performance of speech recognition systems. A new distance measure based on the subband energy levels is introduced for endpoint detection. Erzin, Engin. Ph.D.

    Decision-Based Marginal Total Variation Diffusion for Impulsive Noise Removal in Color Images

    Impulsive noise removal for color images usually employs the vector median filter, the switching median filter, the total variation L1 method, and variants. These approaches, however, often introduce excessive smoothing, can blur visual features extensively, and are thus suitable only for images with low-density noise. This paper proposes a marginal method for reducing impulsive noise that overcomes this limitation. It is based on the following facts: (i) each channel in a color image is contaminated independently, and the contaminating components are independent and identically distributed; (ii) in a natural image the gradients of different components of a pixel are similar to one another. The method divides components into categories according to the noise characteristics. If an image is corrupted by salt-and-pepper noise, the components are divided into corrupted and noise-free components; if the image is corrupted by random-valued impulses, the components are divided into corrupted, noise-free, and possibly corrupted components. Components in different categories are processed differently. If a component is corrupted, modified total variation diffusion is applied; if it is possibly corrupted, scaled total variation diffusion is applied; otherwise, the component is left unchanged. Simulation results demonstrate its effectiveness.
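The marginal, decision-based idea for the salt-and-pepper case can be sketched as follows. This is only an illustration under stated assumptions: impulses are assumed to take the extreme values 0 or 255, and a median of noise-free neighbors stands in for the modified total variation diffusion named in the abstract, whose details are not given here. The function names are hypothetical.

```python
from statistics import median_low


def is_corrupted(value, low=0, high=255):
    """Assumption: salt-and-pepper impulses sit exactly at the range extremes."""
    return value == low or value == high


def restore_channel(channel):
    """Process one color component (a 2D list of ints) independently of the others."""
    rows, cols = len(channel), len(channel[0])
    out = [row[:] for row in channel]
    for r in range(rows):
        for c in range(cols):
            if not is_corrupted(channel[r][c]):
                continue  # noise-free component: left unchanged
            # Collect noise-free values from the 3x3 neighborhood.
            clean = [
                channel[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if not is_corrupted(channel[rr][cc])
            ]
            if clean:
                # Stand-in for the paper's diffusion step (NOT the paper's method).
                out[r][c] = median_low(clean)
    return out


def restore_image(image):
    """image is a list of channels, e.g. [R, G, B]; each is handled marginally."""
    return [restore_channel(channel) for channel in image]
```

Only the components flagged as corrupted are modified, which is what keeps the smoothing from spreading to clean components.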

    GENETIC FUZZY FILTER BASED ON MAD AND ROAD TO REMOVE MIXED IMPULSE NOISE

    In this thesis, a genetic fuzzy image filter based on rank-ordered absolute differences (ROAD) and the median of the absolute deviations from the median (MAD) is proposed. The proposed method consists of three components: a fuzzy noise detection system, a fuzzy switching filtering scheme, and optimization of the fuzzy parameters using genetic algorithms (GA), which together perform efficient and effective noise removal. The idea is to use MAD and ROAD as measures of the probability that a pixel is noisy. A fuzzy inference system is used to determine the degree to which a pixel can be categorized as noisy. Based on the fuzzy inference result, a fuzzy switching scheme that adopts the median filter as the main estimator is applied to the filtering. The GA training aims to find the best parameters for the fuzzy sets in the fuzzy noise detection. In the experiments, the proposed method successfully removed mixed impulse noise at low to medium probabilities while leaving the uncorrupted pixels less affected by the median filtering. It also surpasses the other methods, both classical and soft-computing-based approaches to impulse noise removal, in MAE and PSNR evaluations. It can also remove salt-and-pepper and uniform impulse noise well.
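The two statistics named in the abstract can be computed as in the sketch below. It assumes a 3x3 neighborhood, ROAD summed over the m = 4 smallest absolute differences, and MAD taken over the local window including the center pixel; the thesis may use different window sizes or conventions, and the function names are hypothetical.

```python
from statistics import median


def neighbors(img, r, c):
    """Values of the (up to 8) pixels surrounding (r, c) in a 2D list."""
    rows, cols = len(img), len(img[0])
    return [
        img[rr][cc]
        for rr in range(max(0, r - 1), min(rows, r + 2))
        for cc in range(max(0, c - 1), min(cols, c + 2))
        if (rr, cc) != (r, c)
    ]


def road(img, r, c, m=4):
    """Rank-ordered absolute differences: sum of the m smallest |center - neighbor|."""
    diffs = sorted(abs(img[r][c] - v) for v in neighbors(img, r, c))
    return sum(diffs[:m])


def mad(img, r, c):
    """Median of the absolute deviations from the median over the local window."""
    window = neighbors(img, r, c) + [img[r][c]]
    med = median(window)
    return median([abs(v - med) for v in window])
```

An impulse tends to differ strongly from all of its neighbors, so its ROAD value is large, while a pixel on an edge still has several similar neighbors and keeps a small ROAD value.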

    A class of adaptive directional image smoothing filters

    Cataloged from PDF version of article. The gray level distribution around a pixel of an image usually tends to be more coherent in some directions than in others. The idea of adaptive directional filtering is to estimate the direction of higher coherence around each pixel location and then to employ a window that approximates a line segment in that direction. Hence, the details of the image may be preserved while maintaining a satisfactory level of noise suppression. In this paper we describe a class of adaptive directional image smoothing filters based on generalized Gaussian distributions. We propose a measure of spread for the pixel values based on the maximum likelihood estimate of a scale parameter involved in the generalized Gaussian distribution. Several experimental results indicate a significant improvement compared to some standard filters. Copyright (C) 1996 Pattern Recognition Society
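A minimal sketch of the directional idea follows. It makes several assumptions that the abstract does not confirm: four candidate directions, line windows of length 5, a known shape parameter p, the window mean as the location estimate (which is the maximum likelihood choice only for p = 2), and the window mean as the smoothed output. For a generalized Gaussian density proportional to exp(-(|x - mu| / a)^p), the maximum likelihood scale with p and mu fixed is a = ((p / N) * sum |x_i - mu|^p)^(1/p), which is what `gg_scale` computes.

```python
# Offsets for four line-segment windows of length 5 through the center pixel.
DIRECTIONS = {
    "horizontal": [(0, k) for k in range(-2, 3)],
    "vertical": [(k, 0) for k in range(-2, 3)],
    "diagonal": [(k, k) for k in range(-2, 3)],
    "anti_diagonal": [(k, -k) for k in range(-2, 3)],
}


def gg_scale(values, p=2.0):
    """ML scale estimate of a generalized Gaussian with fixed shape p."""
    mu = sum(values) / len(values)  # assumption: mean used as location
    return (p / len(values) * sum(abs(v - mu) ** p for v in values)) ** (1.0 / p)


def directional_smooth(img, r, c, p=2.0):
    """Return (direction name, smoothed value) for pixel (r, c) of a 2D list."""
    rows, cols = len(img), len(img[0])
    best = None
    for name, offsets in DIRECTIONS.items():
        values = [
            img[r + dr][c + dc]
            for dr, dc in offsets
            if 0 <= r + dr < rows and 0 <= c + dc < cols
        ]
        spread = gg_scale(values, p)
        # Keep the direction with the smallest spread (highest coherence).
        if best is None or spread < best[0]:
            best = (spread, name, sum(values) / len(values))
    return best[1], best[2]
```

For a pixel lying on a vertical edge, the vertical window has the smallest spread, so averaging along it suppresses noise without mixing values from both sides of the edge.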

    Speech recognition in noise using weighted matching algorithms


    Nonlinear smoothing filters and their realization


    Noise cancelling in acoustic voice signals with spectral subtraction

    The main purpose of this End of Degree Project is noise removal in speech signals, focusing on several algorithms for the spectral subtraction method. An application has been designed and built in Matlab whose goal is to remove anything that disturbs the perception of a voice, that is, anything considered noise. Noise removal is the basis for any voice processing the user may want to apply afterwards, such as speech recognition, voice analysis, or saving the clean recording. Four algorithms for performing spectral subtraction have been studied: Boll, Berouti, Lockwood & Boudy, and Multiband. This document presents both the theoretical study and its implementation. So that the application can be used by an end user, a simple and intuitive interface has been designed; it shows how the different algorithms behave on different voices and with various types of noise. Some of the noises are ideal ones, used in official noise measurements because of their specific mathematical characteristics, while others are common in daily life, such as the noise of a bus. Applying spectral subtraction requires a Voice Activity Detector (VAD) that recognizes at which moments of the audio there is voice. Two detectors have been studied and implemented: the first decides what is voice according to a threshold suited to the recording, and the second combines the Zero Crossing Rate (ZCR) and the energy. Finally, once the application was implemented, its performance was evaluated both objectively and subjectively; listeners were asked for their opinions in order to characterize the behavior of the application across different types of noise, voices, variables, and algorithms. Ingeniería de Sistemas Audiovisuales (Audiovisual Systems Engineering).
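The core of spectral subtraction can be sketched per frequency bin as below. This is a generic form under stated assumptions, not the project's Matlab code: the noise spectrum is averaged over the frames a voice activity detector marked as non-speech, and the subtraction uses an over-subtraction factor `alpha`, a spectral floor `beta`, and an exponent `p`. With `p=1`, `alpha=1`, `beta=0` it reduces to plain magnitude subtraction with half-wave rectification; `p=2` with `alpha > 1` and `beta > 0` gives a power-domain variant with a floor. All names and default values are hypothetical.

```python
def estimate_noise(frame_mags, speech_flags):
    """Average the magnitude spectra of the frames flagged as non-speech."""
    noise_frames = [mag for mag, is_speech in zip(frame_mags, speech_flags) if not is_speech]
    if not noise_frames:
        raise ValueError("no non-speech frames available for the noise estimate")
    bins = len(noise_frames[0])
    return [sum(frame[k] for frame in noise_frames) / len(noise_frames) for k in range(bins)]


def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, beta=0.0, p=1.0):
    """Subtract the noise estimate from one frame's magnitude spectrum."""
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        diff = y ** p - alpha * n ** p   # subtract (scaled) noise estimate
        floor = beta * n ** p            # never go below this fraction of the noise
        cleaned.append(max(diff, floor) ** (1.0 / p))
    return cleaned
```

The cleaned magnitudes would then be recombined with the phase of the noisy frame before the inverse transform, a step omitted here for brevity.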

    The Anatomy of Bangla OCR System for Printed Texts using Back Propagation Neural Network

    This paper describes an Optical Character Recognition process for printed text in Bangla (the national language of Bangladesh), and its steps, using a Back Propagation Neural Network. Bangla character recognition is an important field of research because Bangla is one of the most popular languages in the Indian subcontinent. The pre-processing steps are image acquisition, binarization, background removal, noise elimination, skew angle detection and correction, noise removal, and line, word, and character segmentation. In the post-processing steps, various features are extracted from the segmented characters by applying the DCT (Discrete Cosine Transform). The segmented characters are then fed into a three-layer feed-forward Back Propagation Neural Network for training. Finally, this network is used to recognize printed Bangla scripts.
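The DCT feature extraction step can be illustrated with the sketch below. It assumes an orthonormal 2D DCT-II over a segmented character image given as a 2D list, and keeps the top-left k x k low-frequency coefficients as the feature vector; the paper's actual image size and number of retained coefficients are not stated, so `k` and the function names are hypothetical. The naive quadruple loop is meant for clarity, not speed.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a 2D list of numbers."""
    rows, cols = len(block), len(block[0])
    out = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0.0
            for x in range(rows):
                for y in range(cols):
                    total += (
                        block[x][y]
                        * math.cos(math.pi * (2 * x + 1) * u / (2 * rows))
                        * math.cos(math.pi * (2 * y + 1) * v / (2 * cols))
                    )
            cu = math.sqrt(1.0 / rows) if u == 0 else math.sqrt(2.0 / rows)
            cv = math.sqrt(1.0 / cols) if v == 0 else math.sqrt(2.0 / cols)
            out[u][v] = cu * cv * total
    return out


def dct_features(block, k=4):
    """Flatten the k x k low-frequency corner into a feature vector (k <= image size)."""
    coeffs = dct_2d(block)
    return [coeffs[u][v] for u in range(k) for v in range(k)]
```

The resulting vector would serve as the input layer of the neural network; a uniform image puts all of its energy into the first (DC) coefficient.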