    Classification Among Hidden Markov Models

    An important task in AI is classifying an observation as belonging to one of several classes (e.g. image classification). We revisit this problem in a verification context: given k partially observable systems modeled as Hidden Markov Models (also called labeled Markov chains), and an execution of one of them, can we eventually classify which system performed this execution, just by looking at its observations? Interestingly, this problem generalizes several problems in verification and control, such as fault diagnosis and opacity. Classification also has strong connections with different notions of distance between stochastic models. In this paper, we study a general and practical notion of classifiers, namely limit-sure classifiers, which allow misclassification as long as the probability of misclassification tends to 0 as the length of the observation grows. To study the complexity of several notions of classification, we develop techniques based on a simple but powerful notion of stationary distributions for HMMs. We prove that one cannot classify among HMMs iff there is a finite separating word from their stationary distributions. This provides a direct proof that classifiability can be checked in PTIME, as an alternative to existing proofs using separating events (i.e. sets of infinite separating words) for the total variation distance. Our approach also allows us to introduce and tackle new notions of classifiability which are applicable in a security context.
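To make the classification setting above concrete, the following is a minimal sketch of a maximum-likelihood classifier over observation sequences. It is not the paper's construction: it assumes state-emitting HMMs stored as plain dictionaries (the keys `init`, `trans` and `emit` and the function names are hypothetical), scores each candidate model with the scaled forward algorithm, and returns the index of the best-scoring model.

```python
import math


def log_likelihood(hmm, observations):
    """Scaled forward algorithm: log P(observations | hmm).

    Assumed layout (hypothetical): hmm["init"][s] is the initial probability
    of state s, hmm["trans"][r][s] the probability of moving from r to s, and
    hmm["emit"][s] a dict mapping each symbol to its emission probability.
    """
    if not observations:
        return 0.0  # the empty observation has probability 1
    init, trans, emit = hmm["init"], hmm["trans"], hmm["emit"]
    n = len(init)
    # Forward weights for the first observation, before normalisation.
    alpha = [init[s] * emit[s].get(observations[0], 0.0) for s in range(n)]
    log_prob = 0.0
    for t in range(len(observations)):
        if t > 0:
            obs = observations[t]
            # One forward step: propagate through transitions, then emit.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n))
                * emit[s].get(obs, 0.0)
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # this model cannot produce the sequence
        # Accumulate the log of the scaling factor and renormalise,
        # which avoids numerical underflow on long observations.
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


def classify(hmms, observations):
    """Return the index of the HMM with the highest log-likelihood."""
    scores = [log_likelihood(h, observations) for h in hmms]
    return max(range(len(hmms)), key=scores.__getitem__)
```

Calling `classify([model_a, model_b], observed_word)` on longer and longer prefixes of an execution gives a rough empirical feel for whether the error rate shrinks with the observation length; it does not decide classifiability, which is the question the abstract addresses.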

    Vector Autoregressive Hierarchical Hidden Markov Models for Extracting Finger Movements Using Multichannel Surface EMG Signals

    We present a novel computational technique intended for the robust and adaptable control of a multifunctional prosthetic hand using multichannel surface electromyography (EMG). The initial processing of the input data was oriented towards extracting relevant time-domain features of the EMG signal. Following the feature calculation, a piecewise modeling of the multidimensional EMG feature dynamics using vector autoregressive models was performed. The next step included the implementation of hierarchical hidden semi-Markov models to capture transitions between piecewise segments of movements and between different movements. Lastly, inversion of the model using an approximate Bayesian inference scheme served as the classifier. The effectiveness of the novel algorithms was assessed against methods commonly used for real-time classification of EMGs in a prosthesis control application. The obtained results show that using hidden semi-Markov models as the top layer, instead of hidden Markov models, ranks top in all the relevant metrics among the tested combinations. The choice of the presented methodology for the control of a prosthetic hand is also supported by its equal or lower computational complexity compared to other algorithms, which enables implementation on low-power microcontrollers, and by its ability to adapt to user preferences in executing individual movements during activities of daily living.

    Malware Classification Based on Hidden Markov Model and Word2Vec Features

    Malware classification is an important and challenging problem in information security. Modern malware classification techniques rely on machine learning models that can be trained on a wide variety of features, including opcode sequences, API calls, and byte n-grams, among many others. In this research, we implement hybrid machine learning techniques, where we train hidden Markov models (HMM) and compute Word2Vec encodings based on opcode sequences. The resulting trained HMMs and Word2Vec embedding vectors are then used as features for classification algorithms. Specifically, we consider support vector machine (SVM), k-nearest neighbor (k-NN), random forest (RF), and deep neural network (DNN) classifiers. We conduct substantial experiments over a variety of malware families. Our results surpass those of comparable classification experiments.

    Music genre classification based on dynamical models

    This paper studies several alternatives for extracting dynamical features from hidden Markov Models (HMMs) that are meaningful for supervised music genre classification. Songs are modelled using a three-scale approach: a first stage of short-term (milliseconds) features, followed by two layers of dynamical models: a multivariate AR model that provides mid-term (seconds) features for each song, followed by an HMM stage that captures long-term (song) features shared among similar songs. We study from an empirical point of view which features are relevant for the genre classification task. Experiments on a database including pieces of heavy metal, punk, classical and reggae music illustrate the advantages of each set of features.

    Performance of the supervised generative classifiers of spatio-temporal areal data using various spatial autocorrelation indexes

    This article is concerned with a generative approach to supervised classification of spatio-temporal data collected at fixed areal units and modeled by a Gaussian Markov random field. We focus on classifiers based on Bayes discriminant functions formed by the log-ratio of the class conditional likelihoods. As a novel modeling contribution, we propose to use decision threshold values induced by three popular spatial autocorrelation indexes, i.e., Moran’s I, Geary’s C and Getis–Ord G. The goal of this study is to extend recent investigations in the context of geostatistical and hidden Markov Gaussian models to the context of areal Gaussian Markov models. The classifier performance measures are the average accuracy rate, which shows the percentage of correctly classified test data; the balanced accuracy rate, given by the average of sensitivity and specificity; and the geometric mean of sensitivity and specificity. The proposed methodology is illustrated using annual death rate data collected by the Institute of Hygiene of the Republic of Lithuania from the 60 municipalities in the period from 2001 to 2019. The classification model selection procedure is illustrated on three data sets with class labels specified by a threshold on the mortality index due to acute cardiovascular events, malignant neoplasms and diseases of the circulatory system. The presented critical comparison between the proposed classifiers with various spatial autocorrelation indexes (decision threshold values) and a classifier based on a hidden Markov model can aid in the selection of proper classification techniques for spatio-temporal areal data.
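The three performance measures named in the abstract above can be sketched as follows. This is an illustrative sketch for the two-class case only, not the authors' code; the function name, the `positive` parameter and the dictionary keys are hypothetical.

```python
import math


def performance_measures(y_true, y_pred, positive=1):
    """Accuracy, balanced accuracy and geometric mean for two classes.

    Assumption: every label equal to `positive` is the positive class and
    every other label is the negative class.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must occur in y_true")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return {
        # Share of correctly classified test items.
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        # Average of sensitivity and specificity.
        "balanced_accuracy": (sensitivity + specificity) / 2,
        # Geometric mean of sensitivity and specificity.
        "geometric_mean": math.sqrt(sensitivity * specificity),
    }
```

Balanced accuracy and the geometric mean both penalise a classifier that does well on only one class, which plain accuracy can hide when the classes are unevenly represented.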

    Malware Classification with Word Embedding Features

    Malware classification is an important and challenging problem in information security. Modern malware classification techniques rely on machine learning models that can be trained on features such as opcode sequences, API calls, and byte n-grams, among many others. In this research, we consider opcode features. We implement hybrid machine learning techniques, where we engineer feature vectors by training hidden Markov models -- a technique that we refer to as HMM2Vec -- and Word2Vec embeddings on these opcode sequences. The resulting HMM2Vec and Word2Vec embedding vectors are then used as features for classification algorithms. Specifically, we consider support vector machine (SVM), k-nearest neighbor (k-NN), random forest (RF), and convolutional neural network (CNN) classifiers. We conduct substantial experiments over a variety of malware families. Our experiments extend well beyond any previous work in this field.

    Hidden Markov Models for Gene Sequence Classification: Classifying the VSG genes in the Trypanosoma brucei Genome

    The article presents an application of Hidden Markov Models (HMMs) for pattern recognition on genome sequences. We apply HMMs to identify genes encoding the Variant Surface Glycoprotein (VSG) in the genomes of Trypanosoma brucei (T. brucei) and other African trypanosomes. These parasitic protozoa are the causative agents of sleeping sickness and several diseases in domestic and wild animals. They have a peculiar strategy to evade the host's immune system that consists in periodically changing their predominant cellular surface protein (VSG). The motivation for using pattern recognition methods to identify these genes, instead of traditional homology-based ones, is that the level of sequence identity (amino acid and DNA sequence) amongst these genes is often below what is considered reliable for those methods. Among pattern recognition approaches, HMMs are particularly suitable for this problem because they handle the determination of gene edges more naturally. We evaluate the performance of the model using different numbers of states in the Markov model, as well as several performance metrics. The model is applied using public genomic data. Our empirical results show that the VSG genes of T. brucei can be safely identified (high sensitivity and low rate of false positives) using HMMs. Comment: Accepted in July 2015 in Pattern Analysis and Applications, Springer. The article contains 23 pages, 4 figures, 8 tables and 51 references.