Unlabeled pattern management through Semi-Supervised classification techniques
The goal of this project is to analyse the performance of several semi-supervised learning algorithms proposed in recent years. In particular, a feature selection algorithm based on self-training was used to determine the optimal set of features for each dataset. Several semi-supervised learning algorithms were then applied to classify the data. These algorithms were tested using SVM and SMC, respectively, as the base classifier.
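As a rough illustration of the self-training idea this abstract mentions, the sketch below repeatedly fits a base classifier on the labelled points, pseudo-labels the unlabelled point it is most confident about, and adds that point to the training set. The nearest-centroid base classifier, the 1-D toy data, and all function names are assumptions made for illustration; the project itself reports using SVM and SMC as base classifiers, and its feature selection step is not shown.

```python
def centroids(labelled):
    """Mean feature value per class, computed from (x, label) pairs."""
    sums, counts = {}, {}
    for x, y in labelled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_with_margin(cents, x):
    """Return (label, margin); margin is the distance gap between the two nearest centroids."""
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, float("inf")
    margin = abs(x - cents[ranked[1]]) - abs(x - cents[best])
    return best, margin


def self_train(labelled, unlabelled):
    """Self-training loop: pseudo-label the most confident unlabelled point each round."""
    labelled = list(labelled)
    pool = list(unlabelled)
    while pool:
        cents = centroids(labelled)  # refit the base classifier
        scored = [(predict_with_margin(cents, x), x) for x in pool]
        (label, _), x = max(scored, key=lambda item: item[0][1])
        labelled.append((x, label))  # accept the pseudo-label
        pool.remove(x)
    return labelled


# Hypothetical toy data: two labelled seeds and four unlabelled points.
seed = [(0.0, "a"), (10.0, "b")]
result = self_train(seed, [1.0, 9.0, 4.0, 6.5])
```

Adding one point per round keeps the sketch short; practical variants usually add a batch of points above a confidence threshold and stop when none qualify.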
Master of Science thesis
Speech recognition is gaining worldwide popularity in applications such as Google Voice, speech-to-text reporting (speech-to-text transcription, video captioning, real-time transcription), hands-free computing, and video games. Research has been under way for several years and many speech recognizers have been built. However, most speech recognizers fail to recognize speech accurately. Consider the well-known application Google Voice, which helps users search the web by voice. Although Google Voice does a good job of transcribing spoken words, it does not accurately recognize words spoken with different accents. Because several accents are evolving around the world, it is essential to train the speech recognizer to recognize accented speech. Accent classification is defined as the problem of classifying the accents in a given language. This thesis explores various methods to identify accents. We introduce a new concept of clustering windows of a speech signal and learn a distance metric, using a specific distance measure over phonetic strings, to classify the accents. A language structure is incorporated to learn this distance metric. We also show how kernel approximation algorithms help in learning a distance metric.
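The abstract does not name the distance measure it applies to phonetic strings. As one plausible stand-in, the sketch below computes the Levenshtein (edit) distance between two phoneme sequences. The choice of Levenshtein distance, the function name, and the phoneme transcriptions are assumptions for illustration, not the thesis's method.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, sym_a in enumerate(a, start=1):
        curr = [i]
        for j, sym_b in enumerate(b, start=1):
            cost = 0 if sym_a == sym_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical phoneme sequences for the same word in two accents.
accent_1 = ["w", "ao", "t", "er"]
accent_2 = ["w", "ao", "d", "ah"]
distance = edit_distance(accent_1, accent_2)
```

A learned metric would typically replace the fixed substitution cost of 1 with weights estimated from data, so that confusable phonemes count as closer than unrelated ones.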
A detection-based pattern recognition framework and its applications
The objective of this dissertation is to present a detection-based pattern recognition framework and demonstrate its applications in automatic speech recognition and broadcast news video story segmentation.
Inspired by the studies of modern cognitive psychology and real-world pattern recognition systems, a detection-based pattern recognition framework is proposed to provide an alternative solution for some complicated pattern recognition problems. The primitive features are first detected and the task-specific knowledge hierarchy is constructed level by level; then a variety of heterogeneous information sources are combined together and the high-level context is incorporated as additional information at certain stages.
A detection-based framework is a "divide-and-conquer" design paradigm for pattern recognition problems, which decomposes a conceptually difficult problem into many elementary sub-problems that can be handled directly and reliably. Information fusion strategies are employed to integrate the evidence from a lower level to form the evidence at a higher level. Such a fusion procedure continues until the top level is reached. Generally, a detection-based framework has many advantages: (1) more flexibility in both detector design and fusion strategies, as these two parts can be optimized separately; (2) parallel and distributed computational components in primitive feature detection. In such a component-based framework, any primitive component can be replaced by a new one while other components remain unchanged; (3) incremental information integration; (4) high-level context information as additional information sources, which can be combined with bottom-up processing at any stage.
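The abstract does not specify its fusion rules. As a hedged sketch of what integrating lower-level evidence into higher-level evidence can look like, the example below fuses detector posteriors by summing log-odds, which is exact only if the detectors' evidence is conditionally independent given the event. The function name, the independence assumption, and the toy probabilities are illustrative and not taken from the dissertation.

```python
import math


def fuse_log_odds(probabilities, prior=0.5):
    """Fuse detector posteriors P(event | evidence_i) into a single posterior.

    Assumes conditionally independent evidence and inputs strictly between 0 and 1.
    """
    prior_logit = math.log(prior / (1.0 - prior))
    total = prior_logit
    for p in probabilities:
        # Each detector contributes its log-odds relative to the shared prior.
        total += math.log(p / (1.0 - p)) - prior_logit
    return 1.0 / (1.0 + math.exp(-total))


# Hypothetical two-level hierarchy: primitive detectors feed a higher-level score.
low_level = fuse_log_odds([0.8, 0.8])       # two agreeing primitive detectors
top_level = fuse_log_odds([low_level, 0.5])  # an uninformative context source
```

Because each detector enters only through its own term, a detector can be swapped out without touching the fusion code, which mirrors the component-based flexibility described above.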
This dissertation presents the basic principles, criteria, and techniques for detector design and hypothesis verification based on statistical detection and decision theory. In addition, evidence fusion strategies were investigated. Several novel detection algorithms and evidence fusion methods were proposed, and their effectiveness was demonstrated in automatic speech recognition and broadcast news video segmentation systems. We believe such a detection-based framework can be employed in more applications in the future.
Ph.D.
Committee Chair: Lee, Chin-Hui; Committee Member: Clements, Mark; Committee Member: Ghovanloo, Maysam; Committee Member: Romberg, Justin; Committee Member: Yuan, Min
Data structure > labels? Unsupervised heuristics for SVM hyperparameter estimation
Classification is one of the main areas of pattern recognition research, and within it, the Support Vector Machine (SVM) is one of the most popular methods outside the field of deep learning, and a de facto reference for many machine learning approaches. Its performance is determined by parameter selection, which is usually achieved by a time-consuming grid search cross-validation procedure (GSCV). That method, however, relies on the availability and quality of labelled examples and thus can be hindered when those are limited. To address that problem, there exist several unsupervised heuristics that take advantage of the characteristics of the dataset for selecting parameters instead of using class label information. While an order of magnitude faster, they are scarcely used under the assumption that their results are significantly worse than those of grid search. To challenge that assumption, we have proposed improved heuristics for SVM parameter selection and tested them against GSCV and state-of-the-art heuristics on over 30 standard classification datasets. The results show not only their advantage over state-of-the-art heuristics but also that they are statistically no worse than GSCV.
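To make the idea of a label-free heuristic concrete, the sketch below implements the widely used median heuristic for the RBF kernel width: it sets sigma to the median pairwise distance in the data and derives gamma from it, without looking at class labels. This is a generic baseline shown as an assumption, not the improved heuristic proposed in the paper; the function name and toy data are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def median_heuristic_gamma(points):
    """Return gamma = 1 / (2 * sigma^2), with sigma the median pairwise Euclidean distance.

    The value is meant for an RBF kernel of the form exp(-gamma * ||x - y||^2).
    """
    distances = [math.dist(p, q) for p, q in combinations(points, 2)]
    sigma = median(distances)
    return 1.0 / (2.0 * sigma ** 2)


# Hypothetical unlabelled 2-D data; pairwise distances are 5, 10 and 5.
data = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
gamma = median_heuristic_gamma(data)
```

The cost is a single pass over the pairwise distances, whereas grid search cross-validation trains one model per parameter combination and fold.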
The Impact of Emotion Focused Features on SVM and MLR Models for Depression Detection
Major depressive disorder (MDD) is a common mental health diagnosis, with estimates that upwards of 25% of the United States population remains undiagnosed. Psychomotor symptoms of MDD impact the speed of control of the vocal tract, glottal source features, and the rhythm of speech. Speech enables people to perceive the emotion of the speaker, and MDD decreases the mood magnitudes expressed by an individual. This study asks the question: "If high-level features designed to combine acoustic features related to emotion detection are added to glottal source features and mean response time in support vector machines and multivariate logistic regression models, would that improve the recall of the MDD class?" To answer this question, a literature review goes through common features in MDD detection, especially features related to emotion recognition. Using feature transformation, emotion recognition composite features are produced and added to glottal source features for model evaluation.
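The study's evaluation metric is the recall of the MDD class, that is, the fraction of true MDD cases the model identifies. The sketch below shows that computation on made-up labels; the function name, the label strings, and the predictions are assumptions for illustration and do not reproduce any result from the study.

```python
def recall(y_true, y_pred, positive):
    """Recall of one class: true positives / (true positives + false negatives)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no examples of the positive class in y_true")
    return true_pos / (true_pos + false_neg)


# Hypothetical ground truth and model predictions.
y_true = ["MDD", "MDD", "MDD", "control", "control", "MDD"]
y_pred = ["MDD", "control", "MDD", "control", "MDD", "MDD"]
mdd_recall = recall(y_true, y_pred, positive="MDD")
```

Comparing this value for a model with and without the added emotion features is the kind of before-and-after check the research question implies.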
- …