135 research outputs found

    Quantization and classification

    The problem of quantizer design for detection or classification has a long history, with classical contributions by Kassam, Poor, Picinbono, Bucklew and others. The goal was to design a quantizer such that a detection rule based on the quantized information was optimized. In recent years an alternative approach has been developed which seeks to jointly optimize quantization and classification; the joint optimization is carried out by minimizing a Lagrangian criterion that combines the mean squared error (quantization) with the Bayes risk (classification) resulting from the quantizer. In this paper the general classical approach of Picinbono and Duvaut is compared and contrasted with the joint approach and illustrated by a simple example.
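    In notation of my own choosing (the abstract does not give the paper's exact cost matrix or multiplier), the joint criterion described above is a Lagrangian trade-off between distortion and Bayes risk:

        J(Q) = \mathbb{E}\,\lVert X - Q(X) \rVert^2 + \lambda \, R_B(Q),
        \qquad
        R_B(Q) = \sum_{i,j} C_{ij} \, P(\hat{\omega} = \omega_i \mid \omega_j) \, P(\omega_j),

    where the class decision \hat{\omega} is based on the quantized value Q(X); small \lambda favors reconstruction fidelity, large \lambda favors classification accuracy.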

    Effects of discrete wavelet compression on automated mammographic shape recognition

    At present, early detection is critical for the cure of breast cancer. Mammography is a breast screening technique which can detect breast cancer at the earliest possible stage. Mammographic lesions are typically classified into three shape classes, namely round, nodular and stellate. Currently this classification is done by experienced radiologists; to increase the speed and decrease the cost of diagnosis, automated recognition systems are being developed. This study analyses an automated classification procedure and its sensitivity to wavelet-based image compression. In this study, the mammographic shape images are compressed using discrete wavelet compression and then classified using statistical classification methods. First, one-dimensional compression is done on the radial distance measure and the shape features are extracted. Second, linear discriminant analysis is used to compute the weightings of the features. Third, a minimum-distance Euclidean classifier and the leave-one-out test method are used for classification. Lastly, a two-dimensional compression is performed on the images, and the above process of feature extraction and classification is repeated. The results are compared with those obtained with uncompressed mammographic images.
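    A minimal sketch of the four-step procedure, assuming pywt and scikit-learn as tooling and an illustrative feature set (neither is specified in the abstract):

        import numpy as np
        import pywt
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
        from sklearn.neighbors import NearestCentroid
        from sklearn.model_selection import LeaveOneOut

        def compress_radial_distance(r, wavelet="db4", level=3, keep=0.1):
            """1-D DWT compression: zero all but the largest `keep` fraction of coefficients."""
            coeffs = pywt.wavedec(r, wavelet, level=level)
            thresh = np.quantile(np.abs(np.concatenate(coeffs)), 1.0 - keep)
            coeffs = [pywt.threshold(c, thresh, mode="hard") for c in coeffs]
            return pywt.waverec(coeffs, wavelet)[: len(r)]

        def shape_features(r):
            """Illustrative features on the radial distance signal (not the study's exact list)."""
            return np.array([r.mean(), r.std(), np.abs(np.diff(r)).mean()])

        def loo_accuracy(X, y):
            """LDA feature weighting + minimum-distance (nearest class mean) Euclidean
            classifier, scored with the leave-one-out protocol. X, y are numpy arrays."""
            hits = 0
            for train, test in LeaveOneOut().split(X):
                lda = LinearDiscriminantAnalysis().fit(X[train], y[train])
                clf = NearestCentroid().fit(lda.transform(X[train]), y[train])
                hits += clf.predict(lda.transform(X[test]))[0] == y[test][0]
            return hits / len(y)

    Running loo_accuracy on features extracted from compressed versus uncompressed signals gives the sensitivity comparison the study reports.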

    Automatic speaker recognition

    The full text has been made openly accessible in accordance with the "Law Amending the Higher Education Law and Certain Laws and Statutory Decrees" published in Official Gazette No. 30352 of 06.03.2018, and the directive of 18.06.2018 on "Collecting, Organizing, and Opening to Access Graduate Theses in Electronic Format".

    Multi-image classification and compression using vector quantization

    Vector Quantization (VQ) is an image processing technique based on statistical clustering, designed originally for image compression. In this dissertation, several methods for multi-image classification and compression based on a VQ design are presented. It is demonstrated that VQ can perform joint multi-image classification and compression by associating a class identifier with each multi-spectral signature codevector. We extend the Weighted Bayes Risk VQ (WBRVQ) method, previously used for single-component images, which explicitly incorporates a Bayes risk component into the distortion measure used in quantizer design and thereby permits a flexible trade-off between classification and compression priorities. In the specific case of multi-spectral images, we investigate the application of the Multi-scale Retinex algorithm as a preprocessing stage, before classification and compression, that performs dynamic range compression, reduces the dependence on lighting conditions, and generally enhances apparent spatial resolution. The goals of this research are four-fold: (1) to study the interrelationship between statistical clustering, classification and compression in a multi-image VQ context; (2) to study mixed-pixel classification and combined classification and compression for simulated and actual, multispectral and hyperspectral multi-images; (3) to study the effects of multi-image enhancement on class spectral signatures; and (4) to study the preservation of scientific data integrity as a function of compression. In this research, a key issue is not just the subjective quality of the resulting images after classification and compression but also the effect of multi-image dimensionality on the complexity of the optimal coder design.
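    A minimal sketch of the Bayes-risk-weighted distortion idea, assuming a 0/1 misclassification cost and class posteriors from an upstream model (the dissertation's WBRVQ design is more elaborate):

        import numpy as np

        def wbr_distortion(x, post, codevector, code_label, lam=0.5):
            """Squared error plus lambda times the expected misclassification cost.
            `post[j]` approximates P(class j | x); a 0/1 cost matrix is assumed."""
            mse = np.sum((x - codevector) ** 2)
            bayes_risk = 1.0 - post[code_label]  # expected 0/1 cost of deciding code_label
            return mse + lam * bayes_risk

        def encode(x, post, codebook, code_labels, lam=0.5):
            """Pick the codevector minimizing the joint distortion; its attached class
            label is the decision, so one lookup both compresses and classifies."""
            d = [wbr_distortion(x, post, c, lbl, lam)
                 for c, lbl in zip(codebook, code_labels)]
            return int(np.argmin(d))

    Sweeping lam realizes the flexible trade-off between compression fidelity and classification accuracy mentioned above.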

    Automatic speaker recognition: modelling, feature extraction and effects of clinical environment

    Speaker recognition is the task of establishing the identity of an individual based on his/her voice. It has significant potential as a convenient biometric method for telephony applications and does not require sophisticated or dedicated hardware. The speaker recognition task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker-specific feature parameters from the speech. The features are used to generate statistical models of different speakers. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Current state-of-the-art speaker recognition systems use the Gaussian mixture model (GMM) technique in combination with the Expectation Maximization (EM) algorithm to build the speaker models. The most frequently used features are the Mel Frequency Cepstral Coefficients (MFCC). This thesis investigated areas of possible improvement in the field of speaker recognition. The identified drawbacks of current speaker recognition systems included slow convergence rates of the modelling techniques and the features' sensitivity to changes due to aging of speakers, use of alcohol and drugs, changing health conditions and mental state. The thesis proposed a new method of deriving the Gaussian mixture model (GMM) parameters called the EM-ITVQ algorithm. The EM-ITVQ showed a significant improvement in equal error rates and higher convergence rates when compared to the classical GMM based on the expectation maximization (EM) method. It was demonstrated that features based on the nonlinear model of speech production (TEO-based features) provided better performance compared to the conventional MFCC features. For the first time, the effect of clinical depression on speaker verification rates was tested. It was demonstrated that speaker verification results deteriorate if the speakers are clinically depressed. The deterioration process was demonstrated using conventional (MFCC) features. The thesis also showed that when replacing the MFCC features with features based on the nonlinear model of speech production (TEO-based features), the detrimental effect of clinical depression on speaker verification rates can be reduced.
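    A minimal sketch of the conventional GMM-EM baseline described above, assuming librosa and scikit-learn as tooling (the thesis's EM-ITVQ algorithm and TEO-based features are not reproduced here):

        import numpy as np
        import librosa
        from sklearn.mixture import GaussianMixture

        def mfcc_features(wav_path, n_mfcc=13):
            """Frame-level MFCC vectors, one row per frame."""
            y, sr = librosa.load(wav_path, sr=None)
            return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

        def train_speaker_model(wav_paths, n_components=32):
            """Fit one GMM per speaker on pooled training frames (classical EM)."""
            X = np.vstack([mfcc_features(p) for p in wav_paths])
            return GaussianMixture(n_components=n_components,
                                   covariance_type="diag").fit(X)

        def identify(wav_path, models):
            """Return the speaker whose GMM gives the highest average log-likelihood
            over the test frames. `models` maps speaker id -> fitted GMM."""
            X = mfcc_features(wav_path)
            return max(models, key=lambda spk: models[spk].score(X))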

    Bayesian models for visual information retrieval

    Thesis (Ph.D.), Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2000, by Nuno Miguel Borges de Pinho Cruz de Vasconcelos; includes bibliographical references (leaves 192-208). This thesis presents a unified solution to visual recognition and learning in the context of visual information retrieval. Realizing that the design of an effective recognition architecture requires careful consideration of the interplay between feature selection, feature representation, and similarity function, we start by searching for a performance criterion that can simultaneously guide the design of all three components. A natural solution is to formulate visual recognition as a decision-theoretic problem, where the goal is to minimize the probability of retrieval error. This leads to a Bayesian architecture that is shown to generalize a significant number of previous recognition approaches, solving some of the most challenging problems faced by these: joint modeling of color and texture, objective guidelines for controlling the trade-off between feature transformation and feature representation, and unified support for local and global queries without requiring image segmentation. The new architecture is shown to perform well on color, texture, and generic image databases, providing a good trade-off between retrieval accuracy, invariance, perceptual relevance of similarity judgments, and complexity. Because all that is needed to perform optimal Bayesian decisions is the ability to evaluate beliefs on the different hypotheses under consideration, a Bayesian architecture is not restricted to visual recognition. On the contrary, it establishes a universal recognition language (the language of probabilities) that provides a computational basis for the integration of information from multiple content sources and modalities. As a result, it becomes possible to build retrieval systems that can simultaneously account for text, audio, video, or any other content modalities. Since the ability to learn follows from the ability to integrate information over time, this language is also conducive to the design of learning algorithms. We show that learning is, indeed, an important asset for visual information retrieval by designing both short- and long-term learning mechanisms. Over short time scales (within a retrieval session), learning is shown to assure faster convergence to the desired target images. Over long time scales (between retrieval sessions), it allows the retrieval system to tailor itself to the preferences of particular users. In both cases, all the necessary computations are carried out through Bayesian belief propagation algorithms that, although optimal in a decision-theoretic sense, are extremely simple, intuitive, and easy to implement.
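    A minimal sketch of retrieval as a minimum-probability-of-error decision, assuming each database image is indexed by a Gaussian mixture density over its local feature vectors (the feature choice and scikit-learn tooling are my assumptions, not the thesis's):

        import numpy as np
        from sklearn.mixture import GaussianMixture

        def index_image(features, n_components=8):
            """Model p(x | image) with a GMM fitted to the image's feature vectors."""
            return GaussianMixture(n_components=n_components,
                                   covariance_type="diag").fit(features)

        def retrieve(query_features, index, priors):
            """MAP rule: argmax_i  sum_x log p(x | image_i) + log P(image_i).
            `index` maps image id -> fitted GMM; `priors` maps image id -> prior."""
            scores = {i: gmm.score_samples(query_features).sum() + np.log(priors[i])
                      for i, gmm in index.items()}
            return max(scores, key=scores.get)

    Under the modeled densities, this MAP decision is exactly the rule that minimizes the probability of retrieval error.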

    Multimodal Biometric Analysis for Monitoring of Wellness

    Biometric data can provide useful information about a person's overall wellness. The focus of this dissertation is wellness monitoring and diagnostics based on behavioral and physiological traits. The research comprises three studies: passive non-intrusive biometric monitoring, active monitoring using a wearable computer, and diagnostics of early stages of Parkinson's disease. In the first study, a biometric analysis system for collecting voice and gait data from a target individual has been constructed. A central issue in that problem is the filtering of data collected from non-target subjects. A novel approach to gait analysis using floor vibrations has been introduced. A Naive Bayes model has been used for gait analysis, and a Gaussian Mixture Model has been implemented for voice analysis. It has been shown that the designed biometric system can provide a sufficiently accurate data stream for health monitoring purposes. In the second study, a universal wellness monitoring algorithm based on a binary classification model has been developed. It has been tested on data collected with the wearable body monitor SenseWear®PRO, with Support Vector Machines acting as the underlying binary classification model. The obtained results demonstrate that the wellness score produced by the algorithm can successfully discriminate anomalous data. The focus of the final part of this thesis is an ongoing project which aims to develop an automated tool for diagnostics of early stages of Parkinson's disease. A spectral measure of balance impairment is introduced, and it is shown that this measure can separate patients with Parkinson's disease from control subjects.
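    A minimal sketch of a wellness score derived from a binary classifier, assuming labeled normal/anomalous sensor feature vectors and scikit-learn's SVM (the dissertation's exact SenseWear feature set is not reproduced here):

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.preprocessing import StandardScaler
        from sklearn.pipeline import make_pipeline

        def fit_wellness_model(X, y):
            """X: per-record sensor feature vectors; y: 1 = normal, 0 = anomalous."""
            return make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y)

        def wellness_score(model, x):
            """Signed distance to the decision boundary: higher means more 'normal';
            negative values flag anomalous records."""
            return float(model.decision_function(x.reshape(1, -1))[0])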