143 research outputs found
A detection-based pattern recognition framework and its applications
The objective of this dissertation is to present a detection-based pattern recognition framework and demonstrate its applications in automatic speech recognition and broadcast news video story segmentation.
Inspired by the studies of modern cognitive psychology and real-world pattern recognition systems, a detection-based pattern recognition framework is proposed to provide an alternative solution for some complicated pattern recognition problems. The primitive features are first detected and the task-specific knowledge hierarchy is constructed level by level; then a variety of heterogeneous information sources are combined together and the high-level context is incorporated as additional information at certain stages.
A detection-based framework is a â divide-and-conquerâ design paradigm for pattern recognition problems, which will decompose a conceptually difficult problem into many elementary sub-problems that can be handled directly and reliably. Some information fusion strategies will be employed to integrate the evidence from a lower level to form the evidence at a higher level. Such a fusion procedure continues until reaching the top level. Generally, a detection-based framework has many advantages: (1) more flexibility in both detector design and fusion strategies, as these two parts
can be optimized separately; (2) parallel and distributed computational components in primitive feature detection. In such a component-based framework, any primitive component can be replaced by a new one while other components remain unchanged; (3) incremental information integration; (4) high level context information as additional information sources, which can be combined with bottom-up processing at any stage.
This dissertation presents the basic principles, criteria, and techniques for detector design and hypothesis verification based on the statistical detection and decision theory. In addition, evidence fusion strategies were investigated in this dissertation. Several novel detection algorithms and evidence fusion methods were proposed and their effectiveness was justified in automatic speech recognition and broadcast news video segmentation system. We believe such a detection-based framework can be employed
in more applications in the future.Ph.D.Committee Chair: Lee, Chin-Hui; Committee Member: Clements, Mark; Committee Member: Ghovanloo, Maysam; Committee Member: Romberg, Justin; Committee Member: Yuan, Min
OBTAINING ACCURATE PROBABILITIES USING CLASSIFIER CALIBRATION
Learning probabilistic classification and prediction models that generate accurate probabilities is essential in many prediction and decision-making tasks in machine learning and data mining. One way to achieve this goal is to post-process the output of classification models to obtain more accurate probabilities. These post-processing methods are often referred to as calibration methods in the machine learning literature.
This thesis describes a suite of parametric and non-parametric methods for calibrating the output of classification and prediction models. In order to evaluate the calibration performance of a classifier, we introduce two new calibration measures that are intuitive statistics of the calibration
curves. We present extensive experimental results on both simulated and real datasets to evaluate the performance of the proposed methods compared with commonly used calibration methods in the literature. In particular, in terms of binary classifier calibration, our experimental results
show that the proposed methods are able to improve the calibration power of classifiers while retaining their discrimination performance. Our theoretical findings show that by using a simple non-parametric calibration method, it is possible to improve the calibration performance of a classifier
without sacrificing discrimination capability. The methods are also computationally tractable for large-scale datasets as they run in O(N log N) time, where N is the number of samples.
In this thesis we also introduce a novel framework to derive calibrated probabilities of causal relationships from observational data. The framework consists of three main components: (1) an approximate method for generating initial probability estimates of the edge types for each pair
of variables, (2) the availability of a relatively small number of the causal relationships in the network for which the truth status is known, which we call a calibration training set, and (3) a calibration method for using the approximate probability estimates and the calibration training set
to generate calibrated probabilities for the many remaining pairs of variables. Our experiments on a range of simulated data support that the proposed approach improves the calibration of edge predictions. The results also support that the approach often improves the precision and recall of those predictions
Reconnaissance de l'écriture manuscrite en-ligne par approche combinant systèmes à vastes marges et modèles de Markov cachés
Handwriting recognition is one of the leading applications of pattern recognition and machine learning. Despite having some limitations, handwriting recognition systems have been used as an input method of many electronic devices and helps in the automation of many manual tasks requiring processing of handwriting images. In general, a handwriting recognition system comprises three functional components; preprocessing, recognition and post-processing. There have been improvements made within each component in the system. However, to further open the avenues of expanding its applications, specific improvements need to be made in the recognition capability of the system. Hidden Markov Model (HMM) has been the dominant methods of recognition in handwriting recognition in offline and online systems. However, the use of Gaussian observation densities in HMM and representational model for word modeling often does not lead to good classification. Hybrid of Neural Network (NN) and HMM later improves word recognition by taking advantage of NN discriminative property and HMM representational capability. However, the use of NN does not optimize recognition capability as the use of Empirical Risk minimization (ERM) principle in its training leads to poor generalization. In this thesis, we focus on improving the recognition capability of a cursive online handwritten word recognition system by using an emerging method in machine learning, the support vector machine (SVM). We first evaluated SVM in isolated character recognition environment using IRONOFF and UNIPEN character databases. SVM, by its use of principle of structural risk minimization (SRM) have allowed simultaneous optimization of representational and discriminative capability of the character recognizer. We finally demonstrate the various practical issues in using SVM within a hybrid setting with HMM. In addition, we tested the hybrid system on the IRONOFF word database and obtained favourable results.Nos travaux concernent la reconnaissance de l'écriture manuscrite qui est l'un des domaines de prédilection pour la reconnaissance des formes et les algorithmes d'apprentissage. Dans le domaine de l'écriture en-ligne, les applications concernent tous les dispositifs de saisie permettant à un usager de communiquer de façon transparente avec les systèmes d'information. Dans ce cadre, nos travaux apportent une contribution pour proposer une nouvelle architecture de reconnaissance de mots manuscrits sans contrainte de style. Celle-ci se situe dans la famille des approches hybrides locale/globale où le paradigme de la segmentation/reconnaissance va se trouver résolu par la complémentarité d'un système de reconnaissance de type discriminant agissant au niveau caractère et d'un système par approche modèle pour superviser le niveau global. Nos choix se sont portés sur des Séparateurs à Vastes Marges (SVM) pour le classifieur de caractères et sur des algorithmes de programmation dynamique, issus d'une modélisation par Modèles de Markov Cachés (HMM). Cette combinaison SVM/HMM est unique dans le domaine de la reconnaissance de l'écriture manuscrite. Des expérimentations ont été menées, d'abord dans un cadre de reconnaissance de caractères isolés puis sur la base IRONOFF de mots cursifs. Elles ont montré la supériorité des approches SVM par rapport aux solutions à bases de réseaux de neurones à convolutions (Time Delay Neural Network) que nous avions développées précédemment, et leur bon comportement en situation de reconnaissance de mots
PHONOTACTIC AND ACOUSTIC LANGUAGE RECOGNITION
Práce pojednává o fonotaktickĂ©m a akustickĂ©m pĹ™Ăstupu pro automatickĂ© rozpoznávánĂ jazyka. Prvnà část práce pojednává o fonotaktickĂ©m pĹ™Ăstupu zaloĹľenĂ©m na vĂ˝skytu fonĂ©movĂ˝ch sekvenci v Ĺ™eÄŤi. NejdĹ™Ăve je prezentován popis vĂ˝voje fonĂ©movĂ©ho rozpoznávaÄŤe jako techniky pro pĹ™epis Ĺ™eÄŤi do sekvence smysluplnĂ˝ch symbolĹŻ. HlavnĂ dĹŻraz je kladen na dobrĂ© natrĂ©novánĂ fonĂ©movĂ©ho rozpoznávaÄŤe a kombinaci vĂ˝sledkĹŻ z nÄ›kolika fonĂ©movĂ˝ch rozpoznávaÄŤĹŻ trĂ©novanĂ˝ch na rĹŻznĂ˝ch jazycĂch (ParalelnĂ fonĂ©movĂ© rozpoznávánĂ následovanĂ© jazykovĂ˝mi modely (PPRLM)). Práce takĂ© pojednává o novĂ© technice anti-modely v PPRLM a studuje pouĹľitĂ fonĂ©movĂ˝ch grafĹŻ mĂsto nejlepšĂho pĹ™episu. Na závÄ›r práce jsou porovnány dva pĹ™Ăstupy modelovánĂ vĂ˝stupu fonĂ©movĂ©ho rozpoznávaÄŤe -- standardnĂ n-gramovĂ© jazykovĂ© modely a binárnĂ rozhodovacĂ stromy. HlavnĂ pĹ™Ănos v akustickĂ©m pĹ™Ăstupu je diskriminativnĂ modelovánĂ cĂlovĂ˝ch modelĹŻ jazykĹŻ a prvnĂ experimenty s kombinacĂ diskriminativnĂho trĂ©novánĂ a na pĹ™ĂznacĂch, kde byl odstranÄ›n vliv kanálu. Práce dále zkoumá rĹŻznĂ© druhy technik fĂşzi akustickĂ©ho a fonotaktickĂ©ho pĹ™Ăstupu. Všechny experimenty jsou provedeny na standardnĂch datech z NIST evaluaci konanĂ© v letech 2003, 2005 a 2007, takĹľe jsou pĹ™Ămo porovnatelnĂ© s vĂ˝sledky ostatnĂch skupin zabĂ˝vajĂcĂch se automatickĂ˝m rozpoznávánĂm jazyka. S fĂşzĂ uvedenĂ˝ch technik jsme posunuli state-of-the-art vĂ˝sledky a dosáhli vynikajĂcĂch vĂ˝sledkĹŻ ve dvou NIST evaluacĂch.This thesis deals with phonotactic and acoustic techniques for automatic language recognition (LRE). The first part of the thesis deals with the phonotactic language recognition based on co-occurrences of phone sequences in speech. A thorough study of phone recognition as tokenization technique for LRE is done, with focus on the amounts of training data for phone recognizer and on the combination of phone recognizers trained on several language (Parallel Phone Recognition followed by Language Model - PPRLM). The thesis also deals with novel technique of anti-models in PPRLM and investigates into using phone lattices instead of strings. The work on phonotactic approach is concluded by a comparison of classical n-gram modeling techniques and binary decision trees. The acoustic LRE was addressed too, with the main focus on discriminative techniques for training target language acoustic models and on initial (but successful) experiments with removing channel dependencies. We have also investigated into the fusion of phonotactic and acoustic approaches. All experiments were performed on standard data from NIST 2003, 2005 and 2007 evaluations so that the results are directly comparable to other laboratories in the LRE community. With the above mentioned techniques, the fused systems defined the state-of-the-art in the LRE field and reached excellent results in NIST evaluations.
Decision Modelling Driven by Twitter Data: A Case Study of the 2017 Presidential Election in Ecuador
Deep Learning Models For Biomedical Data Analysis
The field of biomedical data analysis is a vibrant area of research dedicated to extracting valuable insights from a wide range of biomedical data sources, including biomedical images and genomics data. The emergence of deep learning, an artificial intelligence approach, presents significant prospects for enhancing biomedical data analysis and knowledge discovery. This dissertation focused on exploring innovative deep-learning methods for biomedical image processing and gene data analysis.
During the COVID-19 pandemic, biomedical imaging data, including CT scans and chest x-rays, played a pivotal role in identifying COVID-19 cases by categorizing patient chest x-ray outcomes as COVID-19-positive or negative. While supervised deep learning methods have effectively recognized COVID-19 patterns in chest x-ray datasets, the availability of annotated training data remains limited. To address this challenge, the thesis introduced a semi-supervised deep learning model named ssResNet, built upon the Residual Neural Network (ResNet) architecture. The model combines supervised and unsupervised paths, incorporating a weighted supervised loss function to manage data imbalance. The strategies to diminish prediction uncertainty in deep learning models for critical applications like medical image processing is explore. It achieves this through an ensemble deep learning model, integrating bagging deep learning and model calibration techniques. This ensemble model not only boosts biomedical image segmentation accuracy but also reduces prediction uncertainty, as validated on a comprehensive chest x-ray image segmentation dataset.
Furthermore, the thesis introduced an ensemble model integrating Proformer and ensemble learning methodologies. This model constructs multiple independent Proformers for predicting gene expression, their predictions are combined through weighted averaging to generate final predictions. Experimental outcomes underscore the efficacy of this ensemble model in enhancing prediction performance across various metrics.
In conclusion, this dissertation advances biomedical data analysis by harnessing the potential of deep learning techniques. It devises innovative approaches for processing biomedical images and gene data. By leveraging deep learning\u27s capabilities, this work paves the way for further progress in biomedical data analytics and its applications within clinical contexts.
Index Terms- biomedical data analysis, COVID-19, deep learning, ensemble learning, gene data analytics, medical image segmentation, prediction uncertainty, Proformer, Residual Neural Network (ResNet), semi-supervised learning
- …