12 research outputs found

Improved estimation of hidden Markov model parameters from multiple observation sequences

Author: Caelli Terry
Davis Richard I. A.
Lovell Brian C.
Publication venue: The Institute of Electrical and Electronics Engineers
Publication date: 01/01/2002

The huge popularity of Hidden Markov models in pattern recognition is due to the ability to 'learn' model parameters from an observation sequence through Baum-Welch and other re-estimation procedures. In the case of HMM parameter estimation from an ensemble of observation sequences, rather than a single sequence, we require techniques for finding the parameters which maximize the likelihood of the estimated model given the entire set of observation sequences. The importance of this study is that HMMs with parameters estimated from multiple observations are shown to be many orders of magnitude more probable than HMM models learned from any single observation sequence - thus the effectiveness of HMM 'learning' is greatly enhanced. In this paper, we present techniques that usually find models significantly more likely than Rabiner's well-known method on both seen and unseen sequences
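The quantity this abstract optimizes, the likelihood of one model given an entire set of observation sequences, can be illustrated with a minimal sketch. It assumes a discrete-output HMM and sequences that are independent given the model, so the ensemble log-likelihood is the sum of per-sequence log-likelihoods. The function names and the parameter layout (`pi`, `A`, `B` as plain lists) are illustrative assumptions, not taken from the paper.

```python
import math


def sequence_log_likelihood(pi, A, B, obs):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    pi[i]   : initial probability of state i
    A[i][j] : probability of moving from state i to state j
    B[i][k] : probability that state i emits symbol k
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        # Rescale to avoid underflow; the log of each scale adds up to log P.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def ensemble_log_likelihood(pi, A, B, sequences):
    """Log-likelihood of a set of sequences, assumed independent given the model."""
    return sum(sequence_log_likelihood(pi, A, B, obs) for obs in sequences)
```

A score of this kind is one way to compare candidate models on seen and unseen sequences; the training techniques themselves are described in the paper, not here.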

Deakin Research Online

University of Queensland eSpace

An efficient hidden Markov model training scheme for anomaly intrusion detection of server applications based on system calls

Author: Hoang X
Hu J
Publication venue: IEEE (Piscataway, USA)
Publication date: 01/01/2004

Recently, the hidden Markov model (HMM) has proved to be a good tool for modelling the normal behaviour of privileged processes for anomaly intrusion detection based on system calls. However, one major problem with this approach is that it demands excessive computing resources in the HMM training process, which makes it inefficient for practical intrusion detection systems. In this paper a simple and efficient HMM training scheme is proposed by the innovative integration of multiple-observations training and incremental HMM training. The proposed scheme first divides the long observation sequence into multiple subsets of sequences. Next, each subset of data is used to infer one sub-model, and then this sub-model is incrementally merged into the final HMM model. Our experimental results show that our HMM training scheme can reduce the training time by about 60% compared to that of conventional batch training. The results also show that our HMM-based detection model is able to detect all denial-of-service attacks embedded in the testing traces
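The divide, train, and merge loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the merge rule, so a length-weighted running average is assumed here, and the sub-model trainer is a simple count-based transition estimator standing in for Baum-Welch. All function names are hypothetical.

```python
def split_sequence(sequence, chunk_len):
    """Divide one long observation sequence into consecutive chunks."""
    return [sequence[i:i + chunk_len] for i in range(0, len(sequence), chunk_len)]


def count_transitions(chunk, n_symbols):
    """Stand-in sub-model trainer (NOT Baum-Welch): row-normalised
    symbol-to-symbol transition counts with add-one smoothing."""
    counts = [[1.0] * n_symbols for _ in range(n_symbols)]
    for a, b in zip(chunk, chunk[1:]):
        counts[a][b] += 1.0
    return [[c / sum(row) for c in row] for row in counts]


def merge_matrices(merged, n_merged, sub, n_sub):
    """Assumed merge rule: average weighted by the amount of data seen so far."""
    total = n_merged + n_sub
    return [
        [(m * n_merged + s * n_sub) / total for m, s in zip(m_row, s_row)]
        for m_row, s_row in zip(merged, sub)
    ]


def incremental_train(sequence, chunk_len, n_symbols, train=count_transitions):
    """Train one sub-model per chunk and fold it into the running model."""
    merged, n_merged = None, 0
    for chunk in split_sequence(sequence, chunk_len):
        sub = train(chunk, n_symbols)
        if merged is None:
            merged, n_merged = sub, len(chunk)
        else:
            merged = merge_matrices(merged, n_merged, sub, len(chunk))
            n_merged += len(chunk)
    return merged
```

Because each chunk is processed on its own, only one subset needs to be held in memory at a time, which is the kind of saving the abstract attributes to incremental training.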

RMIT Research Repository

Comparing and Evaluating HMM Ensemble Training Algorithms Using Train and Test and Condition Number Criteria

Author: Davis Richard I. A.
Lovell Brian C.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004

Hidden Markov Models have many applications in signal processing and pattern recognition, but their convergence-based training algorithms are known to suffer from over-sensitivity to the initial random model choice. This paper describes the boundary between regions in which ensemble learning is superior to Rabiner's multiple-sequence Baum-Welch training method, and proposes techniques for determining the best method in any arbitrary situation. It also studies the suitability of the training methods using the condition number, a recently proposed diagnostic tool for testing the quality of the model. A new method for training Hidden Markov Models called the Viterbi Path counting algorithm is introduced and is found to produce significantly better performance than current methods in a range of trials

CiteSeerX

Crossref

University of Queensland eSpace

Investigation of Training Algorithms for Hidden Markov Models Applied to Automatic Speech Recognition

Author: Fang Eric
Publication venue: Clemson University Libraries
Publication date: 01/05/2009

The work presented in this thesis focuses on simulating a speech recognizer which is trained by different people with different speaking styles and investigates how sensitive the training and recognition processes are to the variations in the training data. There are four main parts to this work. The first involves an experiment of weighting methods for training with multiple observation sequences. The second involves the testing of different initial parameters. The third part includes the first experiment involving training with multiple observation sequences. The model's sensitivity to variations in training data was evaluated by comparing the cases of different values of variation. The final part varied the observation vectors with the variation restricted to only one of the eight positions in the sequence. The experiment was repeated for each of eight positions in the observation sequence, and the effect on recognition was evaluated

Clemson University: TigerPrints

Learning discrete Hidden Markov Models from state distribution vectors

Author: Moscovich Luis G.
Publication venue: LSU Digital Commons
Publication date: 01/01/2005

Hidden Markov Models (HMMs) are probabilistic models that have been widely applied to a number of fields since their inception in the late 1960’s. Computational Biology, Image Processing, and Signal Processing, are but a few of the application areas of HMMs. In this dissertation, we develop several new efficient learning algorithms for learning HMM parameters. First, we propose a new polynomial-time algorithm for supervised learning of the parameters of a first order HMM from a state probability distribution (SD) oracle. The SD oracle provides the learner with the state distribution vector corresponding to a query string. We prove the correctness of the algorithm and establish the conditions under which it is guaranteed to construct a model that exactly matches the oracle’s target HMM. We also conduct a simulation experiment to test the viability of the algorithm. Furthermore, the SD oracle is proven to be necessary for polynomial-time learning in the sense that the consistency problem for HMMs, where a training set of state distribution vectors such as those provided by the SD oracle is used but without the ability to query on arbitrary strings, is NP-complete. Next, we define helpful distributions on an instance set of strings for which polynomial-time HMM learning from state distribution vectors is feasible in the absence of an SD oracle and propose a new PAC-learning algorithm under helpful distribution for HMM parameters. The PAC-learning algorithm ensures with high probability that HMM parameters can be learned from training examples without asking queries. Furthermore, we propose a hybrid learning algorithm for approximating HMM parameters from a dataset composed of strings and their corresponding state distribution vectors, and provide supporting experimental data, which indicates our hybrid algorithm produces more accurate approximations than the existing method

Louisiana State University

Beyond output voting: Detecting compromised replicas using HMM-based behavioral distance

Author: GAO Debin
Reiter Michael K.
SONG Dawn
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2009

Crossref

Institutional Knowledge at Singapore Management University

Uncertainty quantification and its properties for hidden Markov models with application to condition based maintenance

Author: Zhang Deyi, Ph. D.
Publication date: 22/02/2018

Condition-based maintenance (CBM) can be viewed as a transformation of data gathered from a piece of equipment into information about its condition, and further into decisions on what to do with the equipment. Hidden Markov model (HMM) is a useful framework to probabilistically model the condition of complex engineering systems with partial observability of the underlying states. Condition monitoring and prediction of such type of system requires accurate knowledge of HMM that describes the degradation of such a system with data collected from the sensors mounted on it, as well as understanding of the uncertainty of the HMMs identified from the available data. To that end, this thesis proposes a novel HMM estimation scheme based on the principles of Bayes theorem. The newly proposed Bayesian estimation approach for estimating HMM parameters naturally yields information about model parametric uncertainties via posterior distributions of HMM parameters emanating from the estimation process. In addition, a novel condition monitoring scheme based on uncertain HMMs of the degradation process is proposed and demonstrated on a large dataset obtained from a semiconductor manufacturing facility. Portion of the data was used to build operating mode specific HMMs of machine degradation via the newly proposed Bayesian estimation process, while the remainder of the data was used for monitoring of machine condition using the uncertain degradation HMMs yielded by Bayesian estimation. Comparison with a traditional signature-based statistical monitoring method showed that the newly proposed approach effectively utilizes the fact that its parameters are uncertain themselves, leading to orders of magnitude fewer false alarms. This methodology is further extended to address the practical issue that maintenance interventions are usually imperfect. 
We propose both a novel non-ergodic and non-homogeneous HMM that assumes imperfect maintenances and a novel process monitoring method capable of monitoring the hidden states considering model uncertainty. Significant improvements in both the log-likelihood of estimated HMM parameters and monitoring performance were observed, compared to those obtained using degradation HMMs that always assumed perfect maintenance. Finally, the behavior of the posterior distribution of parameters of the unidirectional non-ergodic HMMs used in this thesis for degradation modeling was theoretically analyzed in terms of its evolution as more data become available in the estimation process. The convergence problem is formulated as a Bernstein-von Mises theorem (BvMT), and under certain regularity conditions, the sequence of posterior distributions is proven to converge to a Gaussian distribution with variance matrix being the inverse of the Fisher information matrix. An example of a unidirectional HMM is presented for which the regularity conditions are verified, and illustrations of expected theoretical results are given using simulation. The understanding of such convergence of posterior distributions enables one to determine when Bayesian estimation of degradation HMMs is justified and converges toward true model parameters, as well as how much data one then needs to achieve desired accuracy of the resulting model. Understanding of these issues is of utmost importance if HMMs are to be used for degradation modeling and monitoring. Operations Research and Industrial Engineerin
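For orientation, the generic textbook form of a Bernstein-von Mises statement is given below. It is not the thesis's exact theorem or its regularity conditions, and all symbols are assumptions introduced here: $\Pi(\cdot \mid X_{1:n})$ is the posterior after $n$ observations, $\hat{\theta}_n$ an efficient estimator such as the maximum likelihood estimate, $\theta_0$ the true parameter, and $I(\theta_0)$ the Fisher information matrix.

```latex
\[
\bigl\| \, \Pi(\cdot \mid X_{1:n})
  - \mathcal{N}\bigl(\hat{\theta}_n,\; (n\, I(\theta_0))^{-1}\bigr) \bigr\|_{\mathrm{TV}}
\;\xrightarrow{\;P\;}\; 0
\qquad \text{as } n \to \infty .
\]
```

Read this way, the posterior spread shrinks at the rate $1/\sqrt{n}$, which is what links the amount of data to the attainable accuracy of the estimated model.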

Texas ScholarWorks

Improved Estimation of Hidden Markov Model Parameters from Multiple Observation Sequences

Author: Observation Sequenc Es

The huge popularity of Hidden Markov models in pattern recognition is due to the ability to "learn" model parameters from an observation sequence through Baum-Welch and other re-estimation procedures. In the case of HMM parameter estimation from an ensemble of observation sequences, rather than a single sequence, we require techniques for finding the parameters which maximize the likelihood of the estimated model given the entire set of observation sequences. The importance of this study is that HMMs with parameters estimated from multiple observations are shown to be many orders of magnitude more probable than HMM models learned from any single observation sequence --- thus the effectiveness of HMM "learning" is greatly enhanced. In this paper, we present techniques that usually find models significantly more likely than Rabiner's well-known method on both seen and unseen sequences

CiteSeerX

A review and application of hidden Markov models and double chain Markov models

Author: Hoff Michael Ryan
Publication date: 01/01/2016

A Dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science. Johannesburg, 2016. Hidden Markov models (HMMs) and double chain Markov models (DCMMs) are classical Markov model extensions used in a range of applications in the literature. This dissertation provides a comprehensive review of these models with focus on i) providing detailed mathematical derivations of key results - some of which, at the time of writing, were not found elsewhere in the literature, ii) discussing estimation techniques for unknown model parameters and the hidden state sequence, and iii) discussing considerations which practitioners of these models would typically take into account. Simulation studies are performed to measure statistical properties of estimated model parameters and the estimated hidden state path - derived using the Baum-Welch algorithm (BWA) and the Viterbi Algorithm (VA) respectively. The effectiveness of the BWA and the VA is also compared between the HMM and DCMM. Selected HMM and DCMM applications are reviewed and assessed in light of the conclusions drawn from the simulation study. Attention is given to application in the field of Credit Risk.

Wits Institutional Repository on DSPACE

E-commerce security enhancement and anomaly intrusion detection using machine learning techniques

Author: Hoang X
Publication venue: RMIT University
Publication date: 01/01/2006

With the fast growth of the Internet and the World Wide Web, security has become a major concern of many organizations, enterprises and users. Criminal attacks and intrusions into computer and information systems are spreading quickly and they can come from anywhere on the globe. Intrusion prevention measures, such as user authentication, firewalls and cryptography have been used as the first line of defence to protect computer and information systems from intrusions. As intrusion prevention alone may not be sufficient in a highly dynamic environment, such as the Internet, intrusion detection has been used as the second line of defence against intrusions. However, existing cryptography-based intrusion prevention measures implemented in software, have problems with the protection of long-term private keys and the degradation of system performance. Moreover, the security of these software-based intrusion prevention measures depends on the security of the underlying operating system, and therefore they are vulnerable to threats caused by security flaws of the underlying operating system. On the other hand, existing anomaly intrusion detection approaches usually produce excessive false alarms. They also lack in efficiency due to high construction and maintenance costs. In our approach, we employ the "defence in depth" principle to develop a solution to solve these problems. Our solution consists of two lines of defence: preventing intrusions at the first line and detecting intrusions at the second line if the prevention measures of the first line have been penetrated. At the first line of defence, our goal is to develop an encryption model that enhances communication and end-system security, and improves the performance of web-based E-commerce systems. We have developed a hardware-based RSA encryption model to address the above-mentioned problems of existing software-based intrusion prevention measures.
The proposed hardware-based encryption model is based on the integration of an existing web-based client/server model and embedded hardware-based RSA encryption modules. DSP embedded hardware is selected to develop the proposed encryption model because of its advanced security features and high processing capability. The experimental results showed that the proposed DSP hardware-based RSA encryption model outperformed the software-based RSA implementation running on Pentium 4 machines that have almost double clock speed of the DSP's clock speed at large RSA encryption keys. At the second line of defence, our goal is to develop an anomaly intrusion detection model that improves the detection accuracy, efficiency and adaptability of existing anomaly detection approaches. Existing anomaly detection systems are not effective as they usually produce excessive false alarms. In addition, several anomaly detection approaches suffer a serious efficiency problem due to high construction costs of the detection profiles. High construction costs will eventually reduce the applicability of these approaches in practice. Furthermore, existing anomaly detection systems lack in adaptability because no mechanisms are provided to update their detection profiles dynamically, in order to adapt to the changes of the behaviour of monitored objects. We have developed a model for program anomaly intrusion detection to address these problems. The proposed detection model uses a hidden Markov model (HMM) to characterize normal program behaviour using system calls. In order to increase the detection rate and to reduce the false alarm rate, we propose two detection schemes: a two-layer detection scheme and a fuzzy-based detection scheme. The two-layer detection scheme aims at reducing false alarms by applying a double-layer test on each sequence of test traces of system calls. 
On the other hand, the fuzzy-based detection scheme focuses on further improving the detection rate, as well as reducing false alarms. It employs the fuzzy inference to combine multiple sequence information to correctly determine the sequence status. The experimental results showed that the proposed detection schemes reduced false alarms by approximately 48%, compared to the normal database scheme. In addition, our detection schemes generated strong anomaly signals for all tested traces, which in turn improve the detection rate. We propose an HMM incremental training scheme with optimal initialization to address the efficiency problem by reducing the construction costs, in terms of model training time and storage demand. Unlike the HMM batch training scheme, which updates the HMM model using the complete training set, our HMM incremental training scheme incrementally updates the HMM model using one training subset at a time, until convergence. The experimental results showed that the proposed HMM incremental training scheme reduced training time four-fold, compared to the HMM batch training, based on the well-known Baum-Welch algorithm. The proposed training scheme also reduced storage demand substantially, as the size of each training subset is significantly smaller than the size of the complete training set. We also describe our complete model for program anomaly detection using system calls in chapter 8. The complete model consists of two development stages: training stage and testing stage. In the training stage, an HMM model and a normal database are constructed to represent normal program behaviour. In addition, fuzzy sets and rules are defined to represent the space and combined conditions of the sequence parameters. In the testing stage, the HMM model and the normal database, are used to generate the sequence parameters which are used as the input for the fuzzy inference engine to evaluate each sequence of system calls for anomalies and possible intrusions. 
The proposed detection model also provides a mechanism to update its detection profile (the HMM model and the normal database) using online training data. This makes the proposed detection model up-to-date, and therefore, maintains the detection accuracy

RMIT Research Repository