90 research outputs found
New methods for robust speech recognition
Ankara : Department of Electrical and Electronics Engineering and the Institute of Engineering and Science of Bilkent University, 1995.
Thesis (Ph.D.) -- Bilkent University, 1995.
Includes bibliographical references leaves 86-92.
New methods of feature extraction, end-point detection and speech enhancement
are developed for a robust speech recognition system.
The methods of feature extraction and end-point detection are based on
wavelet analysis or subband analysis of the speech signal. Two new sets of speech
feature parameters, SUBLSF’s and SUBCEP’s, are introduced. Both parameter
sets are based on subband analysis. The SUBLSF feature parameters are obtained
via linear predictive analysis on subbands. These speech feature parameters
can produce better results than the full-band parameters when the noise is
colored. The SUBCEP parameters are based on wavelet analysis or equivalently
the multirate subband analysis of the speech signal. The SUBCEP parameters
also provide robust recognition performance by appropriately deemphasizing the
frequency bands corrupted by noise. It is experimentally observed that the
subband analysis based feature parameters are more robust than the commonly
used full-band analysis based parameters in the presence of car noise.
α-stable random processes can be used to model the impulsive nature of public network telecommunication noise. Adaptive filtering methods are developed
for α-stable random processes. Adaptive noise cancellation techniques are used to
reduce the mismatch between training and testing conditions of the recognition
system over telephone lines.
Another important problem in isolated speech recognition is to determine
the boundaries of the speech utterances or words. Precise boundary detection
of utterances improves the performance of speech recognition systems. A new
distance measure based on the subband energy levels is introduced for endpoint
detection.
Erzin, Engin
Ph.D
Efficient Multiband Algorithms for Blind Source Separation
The problem of blind separation refers to recovering original signals, called source signals, from mixed signals, called observation signals, in a reverberant environment. The mixture is a function of a sequence of original speech signals mixed in a reverberant room. The objective is to separate the mixed signals to obtain the original signals without degradation and without prior information about the features of the sources. The strategy used to achieve this objective is to use multiple bands that work at a lower rate, with lower computational cost and quicker convergence than the conventional scheme. Our motivation is the competitive results of unequal-passbands scheme applications in terms of convergence speed. The objective of this research is to improve unequal-passbands schemes by improving the speed of convergence and reducing the computational cost. The first proposed work is a novel maximally decimated unequal-passbands scheme. This scheme uses multiple bands, which lets it work at a reduced sampling rate and low computational cost. An adaptation approach is derived with an adaptation step that improves the convergence speed. The performance of the proposed scheme was measured in different ways. First, the mean square errors of various bands are measured and the results are compared to a maximally decimated equal-passbands scheme, which is currently the best performing method. The results show that the proposed scheme has a faster convergence rate than the maximally decimated equal-passbands scheme. Second, when the scheme is tested for white and coloured inputs using a low number of bands, it does not yield good results; but when the number of bands is increased, the speed of convergence is enhanced. Third, the scheme is tested for quick changes. It is shown that the performance of the proposed scheme is similar to that of the equal-passbands scheme. Fourth, the scheme is also tested in a stationary state.
The experimental results confirm the theoretical work. For more challenging scenarios, an unequal-passbands scheme with over-sampled decimation is proposed; the greater the number of bands, the more efficient the separation. The results are compared to the currently best performing method. Second, an experimental comparison is made between the proposed multiband scheme and the conventional scheme. The results show that the convergence speed and the signal-to-interference ratio of the proposed scheme are higher than those of the conventional scheme, and the computational cost is lower than that of the conventional scheme.
Doctor of Philosophy
dissertation
Hearing aids suffer from the problem of acoustic feedback, which limits the gain they can provide. Moreover, the output sound quality of hearing aids may be compromised in the presence of background acoustic noise. Digital hearing aids use advanced signal processing to reduce acoustic feedback and background noise to improve the output sound quality. However, it is known that the output sound quality of digital hearing aids deteriorates as the hearing aid gain is increased. Furthermore, popular subband or transform domain digital signal processing in modern hearing aids introduces analysis-synthesis delays in the forward path. Long forward-path delays are not desirable because the processed sound combines with the unprocessed sound that arrives at the cochlea through the vent and changes the sound quality. In this dissertation, we employ a variable, frequency-dependent gain function that is lower at frequencies of the incoming signal where the information is perceptually insignificant. In addition, the method of this dissertation automatically identifies and suppresses residual acoustical feedback components at frequencies that have the potential to drive the system to instability. The suppressed frequency components are monitored and the suppression is removed when such frequencies no longer threaten to drive the hearing aid system into instability. Together, these measures provide more stable gain than traditional methods by reducing acoustical coupling between the microphone and the loudspeaker of a hearing aid. In addition, the method of this dissertation performs the necessary hearing aid signal processing with low-delay characteristics. The central idea for the low-delay hearing aid signal processing is a spectral gain shaping method (SGSM) that employs parallel parametric equalization (EQ) filters.
Parameters of the parametric EQ filters and associated gain values are selected using a least-squares approach to obtain the desired spectral response. Finally, the method of this dissertation switches to a least-squares adaptation scheme with linear complexity at the onset of howling. The method adapts to the altered feedback path quickly, so that the patient does not lose perceivable information. The complexity of the least-squares estimate is reduced by reformulating the least-squares estimate into a Toeplitz system and solving it with a direct Toeplitz solver. The increase in stable gain over traditional methods and the output sound quality were evaluated with psychoacoustic experiments on normal-hearing listeners with speech and music signals. The results indicate that the method of this dissertation provides 8 to 12 dB more hearing aid gain than feedback cancelers with traditional fixed gain functions. Furthermore, experimental results obtained with real world hearing aid gain profiles indicate that the method of this dissertation provides less distortion in the output sound quality than classical feedback cancelers, enabling the use of more comfortable style hearing aids for patients with moderate to profound hearing loss. Extensive MATLAB simulations and subjective evaluations of the results indicate that the method of this dissertation exhibits much smaller forward-path delays with superior howling suppression capability.
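The abstract does not say which direct Toeplitz solver is used. As one well-known possibility, the sketch below applies the Levinson recursion, which solves an n-by-n symmetric positive-definite Toeplitz system in O(n^2) operations rather than the O(n^3) of general elimination. The function names and the example values are hypothetical and chosen only for illustration; the symmetric positive-definite structure (as in an autocorrelation matrix) is also an assumption.

```python
def solve_symmetric_toeplitz(t, b):
    """Solve T x = b, where T is symmetric Toeplitz with first column t.

    Levinson recursion; assumes T is positive definite (illustrative sketch,
    not the dissertation's own solver).
    """
    n = len(t)
    t0 = t[0]
    r = [v / t0 for v in t[1:]]      # normalized off-diagonal entries r_1..r_{n-1}
    rhs = [v / t0 for v in b]        # right-hand side scaled the same way
    x = [rhs[0]]
    if n == 1:
        return x
    y = [-r[0]]                      # auxiliary solution of the Yule-Walker system
    alpha = -r[0]
    beta = 1.0
    for k in range(1, n):
        beta *= 1.0 - alpha * alpha
        # Extend the solution of the leading k-by-k system to order k + 1.
        mu = (rhs[k] - sum(r[i] * x[k - 1 - i] for i in range(k))) / beta
        x = [x[i] + mu * y[k - 1 - i] for i in range(k)] + [mu]
        if k < n - 1:
            alpha = -(r[k] + sum(r[i] * y[k - 1 - i] for i in range(k))) / beta
            y = [y[i] + alpha * y[k - 1 - i] for i in range(k)] + [alpha]
    return x


def toeplitz_matvec(t, x):
    """Multiply the symmetric Toeplitz matrix with first column t by x."""
    n = len(t)
    return [sum(t[abs(i - j)] * x[j] for j in range(n)) for i in range(n)]


# Hypothetical example: a 3-by-3 system with a known solution.
t = [4.0, 2.0, 1.0]
b = toeplitz_matvec(t, [1.0, 2.0, 3.0])
x = solve_symmetric_toeplitz(t, b)
```

Here `x` should reproduce the vector used to build `b`, up to rounding. In a least-squares setting, `t` would hold autocorrelation estimates and `b` cross-correlation estimates.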
Computer aided diagnosis in radiology
Ankara : The Department of Electrical and Electronics Engineering and Institute of Engineering and Sciences, Bilkent Univ., 1999.
Thesis (Ph.D.) -- Bilkent University, 1999.
Includes bibliographical references leaves 117-124.
Breast cancer is one of the most deadly diseases for middle-aged women. In this thesis, computer-aided diagnosis tools are developed for the detection of breast cancer on mammograms. These tools include a detection scheme for microcalcification clusters, which are an early sign of breast cancer, and a method to detect the boundaries of mass lesions. In the first microcalcification detection method we propose, a subband decomposition structure is employed. Contrary to previous work, the detection is carried out in the subband domain. The mammogram image is first processed by a subband decomposition filter bank. The resulting subimage is analyzed to detect microcalcification clusters. In regions corresponding to healthy breast tissue the pixel distribution is almost Gaussian. Since microcalcifications are small, isolated bright spots, they produce outliers in the subimages and the distribution of pixels deviates from Gaussian. The subimages are divided into overlapping square regions. In each square region, skewness and kurtosis values are estimated. Because skewness and kurtosis, the third and fourth order correlation parameters, are measures of the asymmetry and impulsiveness of the distribution, they can be used to find the locations of microcalcification clusters. If the values of these parameters are higher than experimentally determined thresholds, then the region is marked as a potential cancer area. Experimental studies indicate that this method successfully detects regions containing microcalcifications.
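The region test described above can be sketched as follows. This is a minimal illustration on a synthetic Gaussian "subimage" with a few bright spots added; the window size, step, threshold values, the requirement that both statistics exceed their thresholds, and all function names are assumptions made for the example and are not taken from the thesis.

```python
import random


def skewness_kurtosis(values):
    """Return (skewness, excess kurtosis) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0.0:
        return 0.0, 0.0  # flat region: no asymmetry, no impulsiveness
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def flag_regions(image, size=16, step=8, skew_thr=1.0, kurt_thr=3.0):
    """Slide overlapping size-by-size windows over the image and return the
    top-left corners of windows whose skewness and excess kurtosis both
    exceed the (assumed) thresholds."""
    flagged = []
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - size + 1, step):
        for c in range(0, cols - size + 1, step):
            block = [image[i][j]
                     for i in range(r, r + size)
                     for j in range(c, c + size)]
            skew, kurt = skewness_kurtosis(block)
            if skew > skew_thr and kurt > kurt_thr:
                flagged.append((r, c))
    return flagged


# Synthetic stand-in for a subband image: Gaussian background plus a small
# cluster of isolated bright outliers (hypothetical data, not a mammogram).
rng = random.Random(0)
subimage = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(64)]
for i, j in [(20, 20), (22, 25), (25, 21)]:
    subimage[i][j] += 25.0

flagged = flag_regions(subimage)
```

Windows that cover the bright spots show strongly positive skewness and kurtosis, while windows containing only Gaussian background stay near zero on both statistics. In practice the thresholds would be determined experimentally, as the abstract notes.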
We also propose another microcalcification detection method which uses two-dimensional (2-D) adaptive filtering and a higher order statistics based Gaussianity test. In this method, statistics of the prediction errors are computed to determine whether the samples are from a Gaussian distribution. The prediction error sequence deviates from Gaussianity around microcalcification locations because prediction of microcalcification pixels is more difficult than prediction of the pixels corresponding to healthy breast tissue. We then develop a new Gaussianity test which has higher sensitivity to outliers. The scheme which uses this test gives better detection performance than the previously proposed methods. Within the detected regions it is possible to segment individual microcalcifications. An outlier labeling and nonlinear subband decomposition based microcalcification segmentation method is also investigated.
Two types of lesions, namely mass and stellate lesions, might be indicators of breast cancer. Finally, we propose a snake algorithm based scheme to detect the boundaries of mass lesions on mammograms. This scheme is compared with a recently developed region growing based boundary detection method. It is observed that the snake algorithm results in a smoother boundary, which is consistent with the morphological structure of mass lesions.
Gürcan, Metin Nafi
Ph.D
High Dynamic Range Visual Content Compression
This thesis addresses research questions in High Dynamic Range (HDR) visual content compression. HDR representations are intended to represent the actual physical value of the light rather than the exposed value. Current HDR compression schemes are extensions of legacy Low Dynamic Range (LDR) compression that use Tone-Mapping Operators (TMO) to reduce the dynamic range of the HDR content. However, introducing a TMO increases the overall computational complexity and causes temporal artifacts. Furthermore, these compression schemes fail to compress non-salient regions differently from salient regions, even though the Human Visual System (HVS) perceives them differently. The main contribution of this thesis is a novel mapping-free, visual saliency-guided HDR content compression scheme. Firstly, the relationship between Discrete Wavelet Transform (DWT) lifting steps and TMO is explored. A novel approach is proposed that compresses HDR images with the Joint Photographic Experts Group (JPEG) 2000 codec while remaining backward compatible with LDR. This approach exploits the reversibility of tone mapping and the scalability of DWT. Secondly, the importance of the TMO in HDR compression is evaluated. A mapping-free post HDR image compression based on the JPEG and JPEG2000 standard codecs for current HDR image formats is proposed. This approach exploits the structure of HDR formats. It has equivalent compression performance and the lowest computational complexity compared to existing HDR lossy compression schemes (50% lower than the state-of-the-art). Finally, the shortcomings of current HDR visual saliency models and of HDR visual saliency-guided compression are explored. A spatial saliency model for HDR visual content is proposed that outperforms others by 10% on the spatial visual prediction task with 70% lower computational complexity. Furthermore, the experiments suggest that more than 90% of temporal saliency is predicted by the proposed spatial model. Moreover, the proposed saliency model can be used to guide HDR compression by applying different quantization factors according to the intensity of the predicted saliency map.
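The idea of varying the quantization factor with the predicted saliency can be illustrated with a small sketch. The linear mapping from saliency to step size, the step range, and the function names are all assumptions made for this example; the abstract does not state how the thesis maps saliency to quantization.

```python
def saliency_to_steps(saliency, q_min=4.0, q_max=32.0):
    """Map saliency values in [0, 1] to quantization steps.

    Assumed rule: high saliency gets a small step (fine quantization),
    low saliency gets a large step (coarse quantization).
    """
    return [[q_max - s * (q_max - q_min) for s in row] for row in saliency]


def quantize(values, steps):
    """Uniformly quantize each value with its own step size."""
    return [[round(v / q) * q for v, q in zip(v_row, q_row)]
            for v_row, q_row in zip(values, steps)]


# Hypothetical 1x2 example: the same coefficient value in a salient and a
# non-salient position.
saliency = [[1.0, 0.0]]
coeffs = [[37.0, 37.0]]
steps = saliency_to_steps(saliency)
reconstructed = quantize(coeffs, steps)
```

With these assumed settings the salient coefficient is quantized with step 4 and the non-salient one with step 32, so the salient position keeps a smaller reconstruction error at the cost of more bits.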
Cognitive Radio Systems
Cognitive radio has been an active research area for future wireless communications in recent years. In order to increase spectrum utilization, cognitive radio makes it possible for unlicensed users to access spectrum unoccupied by licensed users. Cognitive radio lets equipment communicate more intelligently in a spectrum-aware manner and provides a new approach to the co-existence of multiple wireless systems. The goal of this book is to provide highlights of the current research topics in the field of cognitive radio systems. The book consists of 17 chapters, addressing various problems in cognitive radio systems.
- …