25 research outputs found
A Method of Vector Delta Quantization of Speech Signal Parameters
The article proposes a method for vector quantization of the LSF parameters of a speech signal with prediction of the next value. The main idea of the method is that, instead of quantizing the actual LSF vector, the difference between the actual and predicted values is encoded. This approach reduces the dynamic range of the input values and, accordingly, the quantization error. An iterative procedure for building the codebooks needed to implement the proposed method has been developed. Experimental results of testing the method at various speech coding rates are presented
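The idea of coding the difference between the actual and the predicted vector can be illustrated with a minimal sketch. It is not the article's implementation: the predictor here is assumed to be simply the previously reconstructed vector, and the function names, the tiny two-dimensional codebook, and the `initial` vector are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def nearest_index(codebook: Sequence[Sequence[float]], v: Sequence[float]) -> int:
    """Index of the codeword closest to v in squared Euclidean distance."""
    def sq_dist(c: Sequence[float]) -> float:
        return sum((ci - vi) ** 2 for ci, vi in zip(c, v))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def delta_vq_encode(frames: Sequence[Sequence[float]],
                    codebook: Sequence[Sequence[float]],
                    initial: Sequence[float]) -> Tuple[List[int], List[Vector]]:
    """Quantize the residual (actual minus predicted) of each frame.

    Assumption: the prediction is the previously *reconstructed* vector,
    so the encoder tracks exactly what the decoder will see.
    """
    prediction = list(initial)
    indices: List[int] = []
    reconstructed: List[Vector] = []
    for frame in frames:
        residual = [x - p for x, p in zip(frame, prediction)]
        idx = nearest_index(codebook, residual)
        # Reconstruction = prediction + quantized residual.
        prediction = [p + c for p, c in zip(prediction, codebook[idx])]
        indices.append(idx)
        reconstructed.append(prediction)
    return indices, reconstructed


def delta_vq_decode(indices: Sequence[int],
                    codebook: Sequence[Sequence[float]],
                    initial: Sequence[float]) -> List[Vector]:
    """Rebuild the vectors from codeword indices with the same predictor."""
    prediction = list(initial)
    out: List[Vector] = []
    for idx in indices:
        prediction = [p + c for p, c in zip(prediction, codebook[idx])]
        out.append(prediction)
    return out


if __name__ == "__main__":
    codebook = [[0.0, 0.0], [0.1, 0.1], [-0.1, -0.1]]  # hypothetical residual codebook
    frames = [[0.3, 0.6], [0.4, 0.7], [0.32, 0.61]]    # made-up LSF-like vectors
    indices, recon = delta_vq_encode(frames, codebook, initial=[0.3, 0.6])
    print(indices)
    print(delta_vq_decode(indices, codebook, initial=[0.3, 0.6]) == recon)
```

Because consecutive vectors change slowly, the residuals occupy a much smaller range than the vectors themselves, which is why a small residual codebook can suffice in this toy case.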
Blockwise Transform Image Coding Enhancement and Edge Detection
The goal of this thesis is high quality image coding, enhancement and edge detection. A unified approach using novel fast transforms is developed to achieve all three objectives. Requirements are low bit rate, low complexity of implementation and parallel processing. The last requirement is achieved by processing the image in small blocks such that all blocks can be processed simultaneously. This is similar to biological vision. A major issue is to minimize the resulting block effects. This is done by using proper transforms and possibly an overlap-save technique. The bit rate in image coding is minimized by developing new results in optimal adaptive multistage transform coding. Newly developed fast trigonometric transforms are also utilized and compared for transform coding, image enhancement and edge detection. Both image enhancement and edge detection involve generalised bandpass filtering with fast transforms. The algorithms have been developed with special attention to the properties of biological vision systems
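Blockwise transform coding can be sketched in one dimension as follows. The thesis develops its own fast transforms, which the abstract does not specify; this sketch swaps in the standard orthonormal DCT-II as a stand-in, and the function names and the `keep` parameter are hypothetical.

```python
import math
from typing import List, Sequence


def _scale(k: int, n: int) -> float:
    """Normalization factor that makes the DCT-II orthonormal."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct2(block: Sequence[float]) -> List[float]:
    """Orthonormal DCT-II of one block (direct O(n^2) form, not a fast algorithm)."""
    n = len(block)
    return [
        _scale(k, n) * sum(block[i] * math.cos(math.pi * (i + 0.5) * k / n)
                           for i in range(n))
        for k in range(n)
    ]


def idct2(coeffs: Sequence[float]) -> List[float]:
    """Inverse of dct2."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(n))
        for i in range(n)
    ]


def code_row_blockwise(row: Sequence[float], block_size: int, keep: int) -> List[float]:
    """Transform each block independently, keep the first `keep` coefficients,
    zero the rest, and invert. Blocks do not depend on one another, so they
    could be processed in parallel."""
    out: List[float] = []
    for start in range(0, len(row), block_size):
        coeffs = dct2(row[start:start + block_size])
        coeffs = [c if k < keep else 0.0 for k, c in enumerate(coeffs)]
        out.extend(idct2(coeffs))
    return out


if __name__ == "__main__":
    row = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    # Keeping only the first coefficient replaces each block by its mean,
    # which makes the discontinuity at the block boundary visible.
    print(code_row_blockwise(row, block_size=4, keep=1))
```

With `keep=1` the output jumps abruptly between the two blocks, a small-scale version of the block effects the thesis sets out to minimize.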
Use of principal component analysis with linear predictive features in developing a blind SNR estimation system
Signal-to-noise ratio is an important concept in electrical communications, as it is a measurable ratio between a given transmitted signal and the inherent background noise of a transmission channel. Currently signal-to-noise ratio testing is primarily performed by using an intrusive method of comparing a corrupted signal to the original signal and giving it a score based on the comparison. However, this technique is inefficient and often impossible for practical use because it requires the original signal for comparison. A speech signal's characteristics and properties could be used to develop a non-intrusive method for determining SNR, or a method that does not require the presence of the original clean signal.
In this thesis, several extracted features were investigated to determine whether a neural network trained with data from corrupt speech signals could accurately estimate the SNR of a speech signal. A MultiLayer Perceptron (MLP) was trained on extracted features for each decibel level from 0dB to 30dB, in an attempt to create 'expert classifiers' for each SNR level. This type of architecture would then have 31 independent classifiers operating together to accurately estimate the signal-to-noise ratio of an unknown speech signal. Principal component analysis was also implemented to reduce dimensionality and increase class discrimination. The performance of several neural network classifier structures is examined, and the overall results are analyzed to determine the optimal feature for estimating the signal-to-noise ratio of an unknown speech signal. Decision-level fusion was the final procedure, which combined the outputs of several classifier systems in an effort to reduce the estimation error
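For contrast with the blind approach, the intrusive measurement that the abstract describes (comparing the corrupted signal with the original) can be sketched as below. This is only the textbook SNR definition in decibels, not the thesis's estimator; the function name `snr_db` and the sample values are hypothetical.

```python
import math
from typing import Sequence


def snr_db(clean: Sequence[float], noisy: Sequence[float]) -> float:
    """Intrusive SNR in dB: needs the clean reference signal.

    The noise is taken to be the sample-wise difference noisy - clean.
    """
    signal_power = sum(s * s for s in clean)
    noise_power = sum((n - s) ** 2 for s, n in zip(clean, noisy))
    if noise_power == 0.0:
        return math.inf  # identical signals: no measurable noise
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    noisy = [1.1, -0.9, 1.1, -0.9]  # clean signal plus a constant 0.1 offset
    print(round(snr_db(clean, noisy), 3))
```

The dependence on `clean` is exactly what makes this method impractical in the field, and what a non-intrusive estimator trained on speech features is meant to remove.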
Speech coding
Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage that the speech signal was corrupted by noise, cross-talk and distortion. Long-haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. Digital transmission, on the other hand, is relatively immune to noise, cross-talk and distortion, primarily because the digital signal can be faithfully regenerated at each repeater purely on the basis of a binary decision. Hence the end-to-end performance of the digital link becomes essentially independent of the length and operating frequency bands of the link, and from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term speech coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters obtained by analyzing the speech signal. In either case, the codes are transmitted to the distant end, where speech is reconstructed or synthesized using the received set of codes. A more generic term that is often used interchangeably with speech coding is voice coding. This term is more generic in the sense that the coding techniques are equally applicable to any voice signal, whether or not it carries intelligible information, as the term speech implies. Other commonly used terms are speech compression and voice compression, since the fundamental idea behind speech coding is to reduce (compress) the transmission rate (or, equivalently, the bandwidth) and/or reduce storage requirements. In this document the terms speech and voice shall be used interchangeably
A human visual system based image coder
Over the years, society has changed considerably due to technological changes, and digital images have become part and parcel of our everyday lives. Irrespective of applications (e.g., digital cameras) and services (information sharing, e.g., YouTube, archive / storage), there is a need for high image quality with high compression ratios. Hence, considerable efforts have been invested in the area of image compression. Traditional image compression systems take into account the statistical redundancies inherent in the image data. However, the development and adaptation of vision models, which take into account the properties of the human visual system (HVS), into picture coders have since shown promising results. The objective of the thesis is to propose the implementation of a vision model in two different manners in the JPEG2000 coding system: (a) a Perceptual Colour Distortion Measure (PCDM) for colour images in the encoding stage, and (b) a Perceptual Post Filtering (PPF) algorithm for colour images in the decoding stage. Both implementations are embedded into the JPEG2000 coder. The vision model here exploits the contrast sensitivity, the inter-orientation masking and intra-band masking visual properties of the HVS. Extensive calibration work has been undertaken to fine-tune the 42 model parameters of the PCDM and Just-Noticeable-Difference thresholds of the PPF for colour images. Evaluation with subjective assessments of PCDM based coder has shown perceived quality improvement over the JPEG2000 benchmark with the MSE (mean square error) and CVIS criteria. For the PPF adapted JPEG2000 decoder, performance evaluation has also shown promising results against the JPEG2000 benchmarks. Based on subjective evaluation, when both PCDM and PPF are used in the JPEG2000 coding system, the overall perceived image quality is superior to the stand-alone JPEG2000 with the PCDM
Applications of MATLAB in Science and Engineering
The book consists of 24 chapters illustrating a wide range of areas where MATLAB tools are applied. These areas include mathematics, physics, chemistry and chemical engineering, mechanical engineering, biological (molecular biology) and medical sciences, communication and control systems, digital signal, image and video processing, system modeling and simulation. Many interesting problems have been included throughout the book, and its contents will be beneficial for students and professionals across a wide range of fields